Formerly RST blocks were formatted without a trailing newline, which
wasn't particularly helpful. Now everything that comes back from the
formatter has a trailing newline so remove all the extra ones added by
users.
This standalone mode no longer works due to the mechanics of import
and util. When run not as part of a package, the import of util causes
an import of the built-in posix module, which doesn't contain the
expected symbols. This is difficult to fix in Py2.4.
This adds a subset of the 'simple table' support from RST to allow
formatting of options lists through RST. Table columns are
automatically sized based on contents, with line wrapping in the last
column.
These leaks may occur in environments that don't employ a reference
counting GC, i.e. PyPy.
This implies:
- changing opener(...).read() calls to opener.read(...)
- changing opener(...).write() calls to opener.write(...)
- changing open(...).read(...) to util.readfile(...)
- changing open(...).write(...) to util.writefile(...)
This enables minirst to parse and print option lists which have both
long and short options. Before, we could only parse option lists with
long options.
You can now split a list with a comment:
* foo
.. separator
* bar
and the two list items will no longer be run together, that is the
output is
* foo
* bar
instead of
* foo
* bar
Some character encodings use ASCII characters other than
control/alphabet/digit as a part of multi-bytes characters, so direct
replacing with such characters on strings in local encoding causes
invalid byte sequences.
Mercurial has problem around text wrapping/filling in MBCS encoding
environment, because standard 'textwrap' module of Python can not
treat it correctly. It splits byte sequence for one character into two
lines.
According to unicode specification, "east asian width" classifies
characters into:
W(ide), N(arrow), F(ull-width), H(alf-width), A(mbiguous)
W/N/F/H can be always recognized as 2/1/2/1 bytes in byte sequence,
but 'A' can not. Size of 'A' depends on language in which it is used.
Unicode specification says:
If the context(= language) cannot be established reliably they
should be treated as narrow characters by default
but many of class 'A' characters are full-width, at least, in Japanese
environment.
So, this patch treats class 'A' characters as full-width always for
safety wrapping.
This patch focuses only on MBCS safe-ness, not on writing/printing
rule strict wrapping for each languages
MBCS sensitive textwrap class is originally implemented
by ITO Nobuaki <daydream.trippers@gmail.com>.
Text can be grouped into generic containers in reStructuredText:
.. container:: foo
This is text inside a "foo" container.
.. container:: bar
This is nested inside two containers.
The minirst parser now recognizes these containers. The containers are
either pruned completely from the output (included all nested blocks)
or they are simply un-indented. So if 'foo' and 'bar' containers are
kept, the above example will result in:
This is text inside a "foo" container.
This is nested inside two containers.
If only 'foo' containers are kept, we get:
This is text inside a "foo" container.
No output is made if only 'bar' containers are kept.
This feature will come in handy for implementing different levels of
help output (e.g., verbose and debug level help texts).
Bullet, option, field, and definition lists were parsed very similar
code. They are now parsed by a single function (splitparagraphs).
Some logic from the old parsing functions has been moved down to
formatblock. This simplifies the parsing while putting the logic where
it's really needed.
The help topics are reused in the HTML documentation, and there it
looks odd that whole sections are indented. We now only indent it for
output on the terminal.
The vast majority* of them are formatted like this in the source, so
this basically reverts the output to how it looked before we got the
minirst parser.
*: the help on templating use four spaces for some examples and will
now shown with an indentation of just two spaces.
Before, we used the padding following the key to compute where to wrap
the text. Long keys would thus give a big indentation. It also
required careful alignment of the source text, making it cumbersome to
items to the list.
We now compute the maximum key length and use that for all items in
the list. We also put a cap on the indentation: keys longer than 10
characters are put on their own line. This is similar to how rst2html
handles large keys: it uses 14 as the cutoff point, but I felt that 10
was better for monospaced text in the console.