diff options
Diffstat (limited to 'manual/xmlformat/BUGS')
-rw-r--r-- | manual/xmlformat/BUGS | 55 |
1 files changed, 55 insertions, 0 deletions
diff --git a/manual/xmlformat/BUGS b/manual/xmlformat/BUGS new file mode 100644 index 0000000000..d1e81954fa --- /dev/null +++ b/manual/xmlformat/BUGS @@ -0,0 +1,55 @@ +xmlformat bugs + +Ruby version is slower than the Perl version, a difference that shows up +particularly for larger documents. I haven't done any profiling to determine +which parts of the program account for the differences. + +---------- +Within normalized elements, the line-wrapping algorithm preserves inline +tags, but it doesn't take any internal line-breaks within those tags into +account in its line-length calculations. Consider this element: + +<para> +This is text with an inline <inline-element attr="This is +an attribute +value">element</inline-element> in the middle. +</para> + +The opening <inline-element> tag is considered to have a length +equal to its total number of characters. The line-wrapping algorithm +could be made more complex to take the line-breaks into account. +I haven't bothered, and may never bother. If such a change is made, no +indenting should be applied that would change the attribute value. + +---------- +Line-wrapping length calculations don't take into account the possibility +that text from a different element may occur on the same line if break +values are set to zero. For example, with a wrap-length of 15, you could +end up with output like this: + +<listitem><para>This is a line +of text.</para></listitem><listitem><para>This is a line +of text.</para></listitem> + +The middle line has more than 15 characters of text. + +This also shows that wrap-length doesn't take into account the length of +tags of enclosing blocks on the same line, which can also be considered a +bug. + +Fix: Set all your break values > 0. + +---------- +Normalization converts runs of spaces to single spaces. This means that +if you write two spaces after periods in text, normalization will +convert them to single spaces. Even if normalization didn't do that, +the two spaces would be lost if line-wrapping is enabled and they occur +at a line break. + +I don't know if this is really a bug, but it's something to be aware of. + +---------- +Doesn't recognize multi-byte files. In some cases, you can work around this. +For example, an editor might save a file as Unicode even when the document +contains only ASCII characters. Re-save the file as an ASCII file (or +some single-byte encoding such as ISO-8859-1) and it should work. |