FIXED: Clean up Quick Start guide

Fletcher T. Penney 2018-11-08 12:33:02 -05:00
parent a047f0eed9
commit 9717314f2a
4 changed files with 341 additions and 946 deletions

Binary file not shown.


@ -323,89 +323,34 @@ office:mimetype="application/vnd.oasis.opendocument.text">
<text:p text:style-name="Standard">Version: 6.4.0</text:p>
<text:p text:style-name="Standard">This document serves as a description of MultiMarkdown (MMD) v6, as well as a sample
document to demonstrate the various features. Specifically, differences from
MMD v5 will be pointed out.</text:p>
<text:p text:style-name="Standard">This document serves as a description of MultiMarkdown (MMD) v6, as well as a sample document to demonstrate the various features. Specifically, differences from MMD v5 will be pointed out.</text:p>
<text:h text:outline-level="3"><text:bookmark text:name="performance"/>Performance</text:h>
<text:p text:style-name="Standard">A big motivating factor leading to the development of MMD v6 was
performance. When MMD first migrated from Perl to C (based on <text:a xlink:type="simple" xlink:href="https://github.com/jgm/peg-markdown">peg-
markdown</text:a>), it was among the fastest
Markdown parsers available. That was many years ago, and the &#8220;competition&#8221;
has made a great deal of progress since that time.</text:p>
<text:p text:style-name="Standard">A big motivating factor leading to the development of MMD v6 was performance. When MMD first migrated from Perl to C (based on <text:a xlink:type="simple" xlink:href="https://github.com/jgm/peg-markdown">peg-markdown</text:a>), it was among the fastest Markdown parsers available. That was many years ago, and the &#8220;competition&#8221; has made a great deal of progress since that time.</text:p>
<text:p text:style-name="Standard">When developing MMD v6, one of my goals was to keep MMD at least in the
ballpark of the fastest processors. Of course, being <text:span text:style-name="MMD-Italic">the</text:span> fastest would be
fantastic, but I was more concerned with ensuring that the code was easily
understood, and easily updated with new features in the future.</text:p>
<text:p text:style-name="Standard">When developing MMD v6, one of my goals was to keep MMD at least in the ballpark of the fastest processors. Of course, being <text:span text:style-name="MMD-Italic">the</text:span> fastest would be fantastic, but I was more concerned with ensuring that the code was easily understood, and easily updated with new features in the future.</text:p>
<text:p text:style-name="Standard">MMD v3 &#8211; v5 used a PEG<text:note text:id="gn1" text:note-class="glossary"><text:note-body><text:p text:style-name="Footnote">Parsing Expression Grammar <text:a xlink:type="simple" xlink:href="https://en.wikipedia.org/wiki/Parsing_expression_grammar">https://en.wikipedia.org/wiki/Parsing_expression_grammar</text:a></text:p></text:note-body></text:note> to handle the parsing. This made it easy to
understand the relationship between the MMD grammar and the parsing code,
since they were one and the same. However, the parsing code generated by
the parsers was not particularly fast, and was prone to troublesome edge
cases with terrible performance characteristics.</text:p>
<text:p text:style-name="Standard">MMD v3 &#8211; v5 used a PEG<text:note text:id="gn1" text:note-class="glossary"><text:note-body><text:p text:style-name="Footnote">Parsing Expression Grammar <text:a xlink:type="simple" xlink:href="https://en.wikipedia.org/wiki/Parsing_expression_grammar">https://en.wikipedia.org/wiki/Parsing_expression_grammar</text:a></text:p></text:note-body></text:note> to handle the parsing. This made it easy to understand the relationship between the MMD grammar and the parsing code, since they were one and the same. However, the parsing code generated by the parsers was not particularly fast, and was prone to troublesome edge cases with terrible performance characteristics.</text:p>
<text:p text:style-name="Standard">The first step in MMD v6 parsing is to break the source text into a series
of tokens, which may consist of plain text, whitespace, or special characters
such as &#8216;*&#8217;, &#8216;[&#8217;, etc. This chain of tokens is then used to perform the
actual parsing.</text:p>
<text:p text:style-name="Standard">The first step in MMD v6 parsing is to break the source text into a series of tokens, which may consist of plain text, whitespace, or special characters such as &#8216;*&#8217;, &#8216;[&#8217;, etc. This chain of tokens is then used to perform the actual parsing.</text:p>
<text:p text:style-name="Standard">MMD v6 divides the parsing into two separate phases, which actually fits
more with Markdown&#8217;s design philosophically.</text:p>
<text:p text:style-name="Standard">MMD v6 divides the parsing into two separate phases, which actually fits more with Markdown&#8217;s design philosophically.</text:p>
<text:list text:style-name="L2">
<text:list-item>
<text:p text:style-name="Standard">Block parsing consists of identifying the &#8220;type&#8221; of each line of the
source text, and grouping the lines into blocks (e.g. paragraphs, lists,
blockquotes, etc.) Some blocks are a single line (e.g. ATX headers), and
others can be many lines long. The block parsing in MMD v6 is handled
by a parser generated by <text:a xlink:type="simple" xlink:href="http://www.hwaci.com/sw/lemon/">lemon</text:a>. This
parser allows the block structure to be more readily understood by
non-programmers, but the generated parser is still fast.</text:p></text:list-item>
<text:p text:style-name="Standard">Block parsing consists of identifying the &#8220;type&#8221; of each line of the source text, and grouping the lines into blocks (e.g. paragraphs, lists, blockquotes, etc.) Some blocks are a single line (e.g. ATX headers), and others can be many lines long. The block parsing in MMD v6 is handled by a parser generated by <text:a xlink:type="simple" xlink:href="http://www.hwaci.com/sw/lemon/">lemon</text:a>. This parser allows the block structure to be more readily understood by non-programmers, but the generated parser is still fast.</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard">Span parsing consists of identifying Markdown/MMD structures that occur
inside of blocks, such as links, images, strong, emph, etc. Most of these
structures require matching pairs of tokens to specify where the span starts
and where it ends. Most of these spans allow arbitrary levels of nesting as
well. This made parsing them correctly in the PEG-based code difficult and
slow. MMD v6 uses a different approach that is accurate and has good
performance characteristics even with edge cases. Basically, it keeps a stack
of each &#8220;opening&#8221; token as it steps through the token chain. When a &#8220;closing&#8221;
token is found, it is paired with the most recent appropriate opener on the
stack. Any tokens in between the opener and closer are removed, as they are
not able to be matched any more. To avoid unnecessary searches for non-
existent openers, the parser keeps track of which opening tokens have been
discovered. This allows the parser to continue moving forwards without having
to go backwards and re-parse any previously visited tokens.</text:p></text:list-item>
<text:p text:style-name="Standard">Span parsing consists of identifying Markdown/MMD structures that occur inside of blocks, such as links, images, strong, emph, etc. Most of these structures require matching pairs of tokens to specify where the span starts and where it ends. Most of these spans allow arbitrary levels of nesting as well. This made parsing them correctly in the PEG-based code difficult and slow. MMD v6 uses a different approach that is accurate and has good performance characteristics even with edge cases. Basically, it keeps a stack of each &#8220;opening&#8221; token as it steps through the token chain. When a &#8220;closing&#8221; token is found, it is paired with the most recent appropriate opener on the stack. Any tokens in between the opener and closer are removed, as they are not able to be matched any more. To avoid unnecessary searches for non-existent openers, the parser keeps track of which opening tokens have been discovered. This allows the parser to continue moving forwards without having to go backwards and re-parse any previously visited tokens.</text:p></text:list-item>
</text:list>
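The span-pairing idea described in the list above can be sketched roughly as follows. This is an illustrative Python sketch, not the MMD v6 implementation (which is written in C): the function name pair_spans, the bracket-only token kinds, and the "seen" counter are assumptions made for the example, and real Markdown emphasis pairing has additional rules.

```python
# Hypothetical token kinds for illustration; real MMD tokens are richer.
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def pair_spans(tokens):
    """Return (open_index, close_index) pairs found in one forward pass."""
    stack = []   # indices of openers still eligible for matching
    seen = {}    # opener kind -> how many are currently on the stack
    pairs = []
    for i, tok in enumerate(tokens):
        if tok in OPENERS:
            stack.append(i)
            seen[tok] = seen.get(tok, 0) + 1
        elif tok in PAIRS:
            want = PAIRS[tok]
            if not seen.get(want):
                # No such opener is on the stack, so skip the search entirely.
                continue
            # Pop back to the most recent matching opener. Openers popped
            # on the way can no longer be matched and are dropped.
            while stack:
                j = stack.pop()
                seen[tokens[j]] -= 1
                if tokens[j] == want:
                    pairs.append((j, i))
                    break
    return pairs
```

Calling pair_spans(list("[a(b]c)")) pairs the two square brackets and discards the stranded parenthesis between them, so the later closing parenthesis is skipped without any backward search.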
<text:p text:style-name="Standard">The result of this redesigned MMD parser is that it can parse short
documents more quickly than <text:a xlink:type="simple" xlink:href="http://commonmark.org/">CommonMark</text:a>, and takes
only 15% &#8211; 20% longer to parse long documents. I have not delved too deeply
into this, but I presume that CommonMark has a bit more &#8220;set-up&#8221; time that
becomes expensive when parsing a short document (e.g. a paragraph or two). But
this cost becomes negligible when parsing longer documents (e.g. file sizes of
1 MB). So depending on your use case, CommonMark may well be faster than
MMD, but we&#8217;re talking about splitting hairs here&#8230;. Recent comparisons
show MMD v6 taking approximately 4.37 seconds to parse a 108 MB file
(approximately 24.8 MB/second), and CommonMark took 3.72 seconds for the same
file (29.2 MB/second). For comparison, MMD v5.4 took approximately 94
second for the same file (1.15 MB/second).</text:p>
<text:p text:style-name="Standard">The result of this redesigned MMD parser is that it can parse short documents more quickly than <text:a xlink:type="simple" xlink:href="http://commonmark.org/">CommonMark</text:a>, and takes only 15% &#8211; 20% longer to parse long documents. I have not delved too deeply into this, but I presume that CommonMark has a bit more &#8220;set-up&#8221; time that becomes expensive when parsing a short document (e.g. a paragraph or two). But this cost becomes negligible when parsing longer documents (e.g. file sizes of 1 MB). So depending on your use case, CommonMark may well be faster than MMD, but we&#8217;re talking about splitting hairs here&#8230;. Recent comparisons show MMD v6 taking approximately 4.37 seconds to parse a 108 MB file (approximately 24.8 MB/second), and CommonMark took 3.72 seconds for the same file (29.2 MB/second). For comparison, MMD v5.4 took approximately 94 seconds for the same file (1.15 MB/second).</text:p>
<text:p text:style-name="Standard">For a more realistic file of approx 28 kb (the source of the Markdown Syntax
web page), both MMD and CommonMark parse it too quickly to accurately
measure. In fact, it requires a file consisting of the original file copied
32 times over (0.85 MB) before <text:span text:style-name="Source_20_Text">/usr/bin/env time</text:span> reports a time over the
minimum threshold of 0.01 seconds for either program.</text:p>
<text:p text:style-name="Standard">For a more realistic file of approx 28 kb (the source of the Markdown Syntax web page), both MMD and CommonMark parse it too quickly to accurately measure. In fact, it requires a file consisting of the original file copied 32 times over (0.85 MB) before <text:span text:style-name="Source_20_Text">/usr/bin/env time</text:span> reports a time over the minimum threshold of 0.01 seconds for either program.</text:p>
<text:p text:style-name="Standard">There is still potentially room for additional optimization in MMD.
However, even if I can&#8217;t close the performance gap with CommonMark on longer
files, the additional features of MMD compared with Markdown in addition to
the increased legibility of the source code of MMD (in my biased opinion
anyway) make this project worthwhile.</text:p>
<text:p text:style-name="Standard">There is still potentially room for additional optimization in MMD. However, even if I can&#8217;t close the performance gap with CommonMark on longer files, the additional features of MMD compared with Markdown in addition to the increased legibility of the source code of MMD (in my biased opinion anyway) make this project worthwhile.</text:p>
<text:h text:outline-level="3"><text:bookmark text:name="parsetree"/>Parse Tree</text:h>
@ -422,12 +367,10 @@ anyway) make this project worthwhile.</text:p>
<text:p text:style-name="Standard">Parse token chain into blocks</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard">Parse tokens within each block into span level structures (e.g. strong,
emph, etc.)</text:p></text:list-item>
<text:p text:style-name="Standard">Parse tokens within each block into span level structures (e.g. strong, emph, etc.)</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard">Export the token tree into the desired output format (e.g. HTML, LaTeX,
etc.) and return the resulting C style string</text:p>
<text:p text:style-name="Standard">Export the token tree into the desired output format (e.g. HTML, LaTeX, etc.) and return the resulting C style string</text:p>
<text:p text:style-name="Standard"><text:span text:style-name="MMD-Bold">OR</text:span></text:p></text:list-item>
@ -436,70 +379,37 @@ etc.) and return the resulting C style string</text:p>
</text:list>
<text:p text:style-name="Standard">The token tree (AST<text:note text:id="gn2" text:note-class="glossary"><text:note-body><text:p text:style-name="Footnote">Abstract Syntax Tree <text:a xlink:type="simple" xlink:href="https://en.wikipedia.org/wiki/Abstract_syntax_tree">https://en.wikipedia.org/wiki/Abstract_syntax_tree</text:a></text:p></text:note-body></text:note>) includes starting offsets and length of each token,
allowing you to use MMD as part of a syntax highlighter. MMD v5 did not
have this functionality in the public version, in part because the PEG parsers
used did not provide reliable offset positions, requiring a great deal of
effort when I adapted MMD for use in <text:a xlink:type="simple" xlink:href="http://multimarkdown.com/">MultiMarkdown
Composer</text:a>.</text:p>
<text:p text:style-name="Standard">The token tree (AST<text:note text:id="gn2" text:note-class="glossary"><text:note-body><text:p text:style-name="Footnote">Abstract Syntax Tree <text:a xlink:type="simple" xlink:href="https://en.wikipedia.org/wiki/Abstract_syntax_tree">https://en.wikipedia.org/wiki/Abstract_syntax_tree</text:a></text:p></text:note-body></text:note>) includes starting offsets and length of each token, allowing you to use MMD as part of a syntax highlighter. MMD v5 did not have this functionality in the public version, in part because the PEG parsers used did not provide reliable offset positions, requiring a great deal of effort when I adapted MMD for use in <text:a xlink:type="simple" xlink:href="http://multimarkdown.com/">MultiMarkdown Composer</text:a>.</text:p>
<text:p text:style-name="Standard">These steps are managed using the <text:span text:style-name="Source_20_Text">mmd_engine</text:span> &#8220;object&#8221;. An individual
<text:span text:style-name="Source_20_Text">mmd_engine</text:span> cannot be used by multiple threads simultaneously, so if
libMultiMarkdown is to be used in a multithreaded program, a separate
<text:span text:style-name="Source_20_Text">mmd_engine</text:span> should be created for each thread. Alternatively, just use the
slightly more abstracted <text:span text:style-name="Source_20_Text">mmd_convert_string()</text:span> function that handles creating
and destroying the <text:span text:style-name="Source_20_Text">mmd_engine</text:span> automatically.</text:p>
<text:p text:style-name="Standard">These steps are managed using the <text:span text:style-name="Source_20_Text">mmd_engine</text:span> &#8220;object&#8221;. An individual <text:span text:style-name="Source_20_Text">mmd_engine</text:span> cannot be used by multiple threads simultaneously, so if libMultiMarkdown is to be used in a multithreaded program, a separate <text:span text:style-name="Source_20_Text">mmd_engine</text:span> should be created for each thread. Alternatively, just use the slightly more abstracted <text:span text:style-name="Source_20_Text">mmd_convert_string()</text:span> function that handles creating and destroying the <text:span text:style-name="Source_20_Text">mmd_engine</text:span> automatically.</text:p>
<text:h text:outline-level="3"><text:bookmark text:name="features"/>Features</text:h>
<text:h text:outline-level="4"><text:bookmark text:name="abbreviationsoracronyms"/>Abbreviations (Or Acronyms)</text:h>
<text:p text:style-name="Standard">This file includes the use of MMD as an abbreviation for MultiMarkdown. The
abbreviation will be expanded on the first use, and the shortened form will be
used on subsequent occurrences.</text:p>
<text:p text:style-name="Standard">This file includes the use of MMD as an abbreviation for MultiMarkdown. The abbreviation will be expanded on the first use, and the shortened form will be used on subsequent occurrences.</text:p>
<text:p text:style-name="Standard">Abbreviations can be specified using inline or reference syntax. The inline
variant requires that the abbreviation be wrapped in parentheses and
immediately follows the <text:span text:style-name="Source_20_Text">&gt;</text:span>.</text:p>
<text:p text:style-name="Standard">Abbreviations can be specified using inline or reference syntax. The inline variant requires that the abbreviation be wrapped in parentheses and immediately follow the <text:span text:style-name="Source_20_Text">&gt;</text:span>.</text:p>
<text:p text:style-name="Preformatted Text">[>MMD] is an abbreviation. So is [>(MD) Markdown].<text:line-break/><text:line-break/>[>MMD]: MultiMarkdown<text:line-break/></text:p>
<text:p text:style-name="Standard">There is also a &#8220;shortcut&#8221; method for abbreviations that is similar to the
approach used in prior versions of MMD. You specify the definition for the
abbreviation in the usual manner, but MMD will automatically identify each
instance where the abbreviation is used and substitute it automatically. In
this case, the abbreviation is limited to a more basic character set which
includes letters, numbers, periods, and hyphens, but not much else. For more
complex abbreviations, you must explicitly mark uses of the abbreviation.</text:p>
<text:p text:style-name="Standard">There is also a &#8220;shortcut&#8221; method for abbreviations that is similar to the approach used in prior versions of MMD. You specify the definition for the abbreviation in the usual manner, but MMD will automatically identify each instance where the abbreviation is used and substitute it automatically. In this case, the abbreviation is limited to a more basic character set which includes letters, numbers, periods, and hyphens, but not much else. For more complex abbreviations, you must explicitly mark uses of the abbreviation.</text:p>
<text:h text:outline-level="4"><text:bookmark text:name="citations"/>Citations</text:h>
<text:p text:style-name="Standard">Citations can be specified using an inline syntax, just like inline footnotes.
If you wish to use BibTeX, then configure the <text:span text:style-name="Source_20_Text">bibtex</text:span> metadata (required) and
the <text:span text:style-name="Source_20_Text">biblio style</text:span> metadata (optional).</text:p>
<text:p text:style-name="Standard">Citations can be specified using an inline syntax, just like inline footnotes. If you wish to use BibTeX, then configure the <text:span text:style-name="Source_20_Text">bibtex</text:span> metadata (required) and the <text:span text:style-name="Source_20_Text">biblio style</text:span> metadata (optional).</text:p>
<text:p text:style-name="Standard">The HTML output for citations now uses parentheses instead of brackets, e.g.
<text:span text:style-name="Source_20_Text">(1)</text:span> instead of <text:span text:style-name="Source_20_Text">[1]</text:span>.</text:p>
<text:p text:style-name="Standard">The HTML output for citations now uses parentheses instead of brackets, e.g. <text:span text:style-name="Source_20_Text">(1)</text:span> instead of <text:span text:style-name="Source_20_Text">[1]</text:span>.</text:p>
<text:h text:outline-level="4"><text:bookmark text:name="criticmarkup"/>CriticMarkup</text:h>
<text:p text:style-name="Standard">MMD v6 has improved support for <text:a xlink:type="simple" xlink:href="http://criticmarkup.com/">CriticMarkup</text:a>, both in terms of parsing, and
in terms of support for each output format. You can <text:span text:style-name="Underline">insert text</text:span>,
<text:span text:style-name="Strike">delete text</text:span>, substitute <text:span text:style-name="Strike">one thing</text:span><text:span text:style-name="Underline">for another</text:span>, <text:span text:style-name="Highlight">highlight text</text:span>,
and <text:span text:style-name="Comment">leave comments</text:span> in the text.</text:p>
<text:p text:style-name="Standard">MMD v6 has improved support for <text:a xlink:type="simple" xlink:href="http://criticmarkup.com/">CriticMarkup</text:a>, both in terms of parsing, and in terms of support for each output format. You can <text:span text:style-name="Underline">insert text</text:span>, <text:span text:style-name="Strike">delete text</text:span>, substitute <text:span text:style-name="Strike">one thing</text:span><text:span text:style-name="Underline">for another</text:span>, <text:span text:style-name="Highlight">highlight text</text:span>, and <text:span text:style-name="Comment">leave comments</text:span> in the text.</text:p>
<text:p text:style-name="Standard">If you don&#8217;t specify any command line options, then MMD will apply special
formatting to the CriticMarkup formatting as in the preceding paragraph.
Alternatively, you can use the <text:span text:style-name="Source_20_Text">-a\--accept</text:span> or <text:span text:style-name="Source_20_Text">-r\--reject</text:span> options to cause
MMD to accept or reject, respectively, the proposed changes within the CM
markup. When doing this, CM will work across blank lines. Without either of
these two options, then CriticMarkup that spans a blank line is not recognized
as such. I working on options for this for the future.</text:p>
<text:p text:style-name="Standard">If you don&#8217;t specify any command line options, then MMD will apply special formatting to the CriticMarkup formatting as in the preceding paragraph. Alternatively, you can use the <text:span text:style-name="Source_20_Text">-a\--accept</text:span> or <text:span text:style-name="Source_20_Text">-r\--reject</text:span> options to cause MMD to accept or reject, respectively, the proposed changes within the CM markup. When doing this, CM will work across blank lines. Without either of these two options, then CriticMarkup that spans a blank line is not recognized as such. I am working on options for this for the future.</text:p>
<text:h text:outline-level="4"><text:bookmark text:name="embeddedimages"/>Embedded Images</text:h>
<text:p text:style-name="Standard">Supported export formats (<text:span text:style-name="Source_20_Text">odt</text:span>, <text:span text:style-name="Source_20_Text">epub</text:span>, <text:span text:style-name="Source_20_Text">bundle</text:span>, <text:span text:style-name="Source_20_Text">bundlezip</text:span>) include
images inside the export document:</text:p>
<text:p text:style-name="Standard">Supported export formats (<text:span text:style-name="Source_20_Text">odt</text:span>, <text:span text:style-name="Source_20_Text">epub</text:span>, <text:span text:style-name="Source_20_Text">bundle</text:span>, <text:span text:style-name="Source_20_Text">bundlezip</text:span>) include images inside the export document:</text:p>
<text:list text:style-name="L1">
<text:list-item>
@ -507,39 +417,23 @@ images inside the export document:</text:p>
Local images are embedded automatically</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard">Images stored on remote servers are embedded <text:span text:style-name="MMD-Italic">if</text:span> <text:a xlink:type="simple" xlink:href="https://curl.haxx.se/libcurl/">libCurl</text:a> is
properly installed when MMD is compiled. This is true for macOS builds.</text:p></text:list-item>
<text:p text:style-name="Standard">Images stored on remote servers are embedded <text:span text:style-name="MMD-Italic">if</text:span> <text:a xlink:type="simple" xlink:href="https://curl.haxx.se/libcurl/">libCurl</text:a> is properly installed when MMD is compiled. This is true for macOS builds.</text:p></text:list-item>
</text:list>
<text:h text:outline-level="4"><text:bookmark text:name="emphandstrong"/>Emph and Strong</text:h>
<text:p text:style-name="Standard">The basics of emphasis and strong emphasis are unchanged, but the parsing
engine has been improved to be more accurate, particularly in various edge
cases where proper parsing can be difficult.</text:p>
<text:p text:style-name="Standard">The basics of emphasis and strong emphasis are unchanged, but the parsing engine has been improved to be more accurate, particularly in various edge cases where proper parsing can be difficult.</text:p>
<text:h text:outline-level="4"><text:bookmark text:name="epub3support"/>EPUB 3 Support</text:h>
<text:p text:style-name="Standard">MMD v6 now provides support for direct creation of <text:a xlink:type="simple" xlink:href="https://en.wikipedia.org/wiki/EPUB">EPUB 3</text:a> files. Previously
a separate tool was required to create EPUB files from MMD. It&#8217;s now built-
in. Currently, EPUB 3 files are built using the usual HTML 5 output. No
extra CSS is applied, so the default from the reader will be used. Images are
not yet supported, but are planned for the future.</text:p>
<text:p text:style-name="Standard">MMD v6 now provides support for direct creation of <text:a xlink:type="simple" xlink:href="https://en.wikipedia.org/wiki/EPUB">EPUB 3</text:a> files. Previously a separate tool was required to create EPUB files from MMD. It&#8217;s now built-in. Currently, EPUB 3 files are built using the usual HTML 5 output. No extra CSS is applied, so the default from the reader will be used. Images are not yet supported, but are planned for the future.</text:p>
<text:p text:style-name="Standard">EPUB files can be highly customized with other tools, and I recommend doing
that for production quality files. For example, apparently performance is
improved when the content is divided into multiple files (e.g. one file per
chapter). MMD creates EPUB 3 files using a single file. Tools like <text:a xlink:type="simple" xlink:href="https://sigil-ebook.com/">Sigil</text:a>
are useful for improving your EPUB files, and I recommend doing that.</text:p>
<text:p text:style-name="Standard">EPUB files can be highly customized with other tools, and I recommend doing that for production quality files. For example, apparently performance is improved when the content is divided into multiple files (e.g. one file per chapter). MMD creates EPUB 3 files using a single file. Tools like <text:a xlink:type="simple" xlink:href="https://sigil-ebook.com/">Sigil</text:a> are useful for improving your EPUB files.</text:p>
<text:p text:style-name="Standard">Not all EPUB readers support v3 files. I don&#8217;t plan on adding support for
older versions of the EPUB format, but other tools can convert to other
document formats you need. Same goes for Amazon&#8217;s ebook formats &#8211; the
<text:a xlink:type="simple" xlink:href="https://calibre-ebook.com/">Calibre</text:a> program can also be used to interconvert between formats.</text:p>
<text:p text:style-name="Standard">Not all EPUB readers support v3 files. I don&#8217;t plan on adding support for older versions of the EPUB format, but other tools can convert to other document formats you need. Same goes for Amazon&#8217;s ebook formats &#8211; the <text:a xlink:type="simple" xlink:href="https://calibre-ebook.com/">Calibre</text:a> program can also be used to interconvert between formats.</text:p>
<text:p text:style-name="Standard"><text:span text:style-name="MMD-Bold">NOTE</text:span>: Because EPUB documents are binary files, MMD only creates them when
run in batch mode (using the <text:span text:style-name="Source_20_Text">-b\--batch</text:span> options). Otherwise, it simply
outputs the HTML 5 file that would serve as the primary content for the EPUB.</text:p>
<text:p text:style-name="Standard"><text:span text:style-name="MMD-Bold">NOTE</text:span>: Because EPUB documents are binary files, MMD only creates them when run in batch mode (using the <text:span text:style-name="Source_20_Text">-b\--batch</text:span> options). Otherwise, it simply outputs the HTML 5 file that would serve as the primary content for the EPUB.</text:p>
<text:h text:outline-level="4"><text:bookmark text:name="fencedcodeblocks"/>Fenced Code Blocks</text:h>
@ -547,80 +441,55 @@ outputs the HTML 5 file that would serve as the primary content for the EPUB.</t
<text:list text:style-name="L2">
<text:list-item>
<text:p text:style-name="Standard">The leading and trailing fences can be 3, 4, or 5 backticks in length. That
should be sufficient to account for complex documents without requiring a more
complex parser.</text:p></text:list-item>
<text:p text:style-name="Standard">The leading and trailing fences can be 3, 4, or 5 backticks in length. That should be sufficient to account for complex documents without requiring a more complex parser.</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard">If there is no trailing fence, then everything after the leading fence is
considered to be part of the code block.</text:p></text:list-item>
<text:p text:style-name="Standard">If there is no trailing fence, then everything after the leading fence is considered to be part of the code block.</text:p></text:list-item>
</text:list>
<text:h text:outline-level="4"><text:bookmark text:name="footnotes"/>Footnotes</text:h>
<text:p text:style-name="Standard">The HTML output for footnotes now uses superscripts instead of brackets, e.g.
<text:span text:style-name="Source_20_Text">&lt;sup&gt;1&lt;/sup&gt;</text:span> instead of <text:span text:style-name="Source_20_Text">[1]</text:span>.</text:p>
<text:p text:style-name="Standard">The HTML output for footnotes now uses superscripts instead of brackets, e.g. <text:span text:style-name="Source_20_Text">&lt;sup&gt;1&lt;/sup&gt;</text:span> instead of <text:span text:style-name="Source_20_Text">[1]</text:span>.</text:p>
<text:h text:outline-level="4"><text:bookmark text:name="glossaryterms"/>Glossary Terms</text:h>
<text:p text:style-name="Standard">If there are terms in your document you wish to define in a glossary<text:note text:id="gn3" text:note-class="glossary"><text:note-body><text:p text:style-name="Footnote">The
glossary collects information about important terms used in your document</text:p></text:note-body></text:note> at
the end of your document, you can define them using the glossary syntax.</text:p>
<text:p text:style-name="Standard">If there are terms in your document you wish to define in a glossary<text:note text:id="gn3" text:note-class="glossary"><text:note-body><text:p text:style-name="Footnote">The glossary collects information about important terms used in your document</text:p></text:note-body></text:note> at the end of your document, you can define them using the glossary syntax.</text:p>
<text:p text:style-name="Standard">Glossary terms can be specified using inline or reference syntax. The inline
variant requires that the abbreviation be wrapped in parentheses and
immediately follows the <text:span text:style-name="Source_20_Text">?</text:span>.</text:p>
<text:p text:style-name="Standard">Glossary terms can be specified using inline or reference syntax. The inline variant requires that the term be wrapped in parentheses and immediately follow the <text:span text:style-name="Source_20_Text">?</text:span>.</text:p>
<text:p text:style-name="Preformatted Text">[?(glossary) The glossary collects information about important<text:line-break/>terms used in your document] is a glossary term.<text:line-break/><text:line-break/>[?glossary] is also a glossary term.<text:line-break/><text:line-break/>[?glossary]: The glossary collects information about important<text:line-break/>terms used in your document<text:line-break/></text:p>
<text:p text:style-name="Standard">Much like abbreviations, there is also a &#8220;shortcut&#8221; method that is similar to
the approach used in prior versions of MMD. You specify the definition for
the glossary term in the usual manner, but MMD will automatically identify
each instance where the term is used and substitute it automatically. In this
case, the term is limited to a more basic character set which includes
letters, numbers, periods, and hyphens, but not much else. For more complex
glossary terms, you must explicitly mark uses of the term.</text:p>
<text:p text:style-name="Standard">Much like abbreviations, there is also a &#8220;shortcut&#8221; method that is similar to the approach used in prior versions of MMD. You specify the definition for the glossary term in the usual manner, but MMD will automatically identify each instance where the term is used and substitute it automatically. In this case, the term is limited to a more basic character set which includes letters, numbers, periods, and hyphens, but not much else. For more complex glossary terms, you must explicitly mark uses of the term.</text:p>
<text:h text:outline-level="4"><text:bookmark text:name="htmlcomments"/>HTML Comments</text:h>
<text:p text:style-name="Standard">Previously, HTML Comments were used by MultiMarkdown to pass raw text through
to the output file. This was useful, but limited, as it could only
work for one output format at a time.</text:p>
<text:p text:style-name="Standard">Previously, HTML Comments were used by MultiMarkdown to pass raw text through to the output file. This was useful, but limited, as it could only work for one output format at a time.</text:p>
<text:p text:style-name="Standard">HTML Comments are now only included in HTML output, but not in any other
format since they would cause errors.</text:p>
<text:p text:style-name="Standard">HTML Comments are now only included in HTML output, but not in any other format since they would cause errors.</text:p>
<text:p text:style-name="Standard">Take a look at the <text:span text:style-name="Source_20_Text">HTML Comments.text</text:span> file in the test suite for a better
understanding of comment blocks vs comment spans, and how they are parsed.</text:p>
<text:p text:style-name="Standard">Take a look at the <text:span text:style-name="Source_20_Text">HTML Comments.text</text:span> file in the test suite for a better understanding of comment blocks vs comment spans, and how they are parsed.</text:p>
<text:h text:outline-level="4"><text:bookmark text:name="internationalization"/>Internationalization</text:h>
<text:p text:style-name="Standard">MMD v6 includes support for substituting certain text phrases in other
languages. This only affects the HTML format.</text:p>
<text:p text:style-name="Standard">MMD v6 includes support for substituting certain text phrases in other languages. This only affects the HTML format.</text:p>
<text:h text:outline-level="4"><text:bookmark text:name="latexchanges"/>LaTeX Changes</text:h>
<text:p text:style-name="Standard">LaTeX support is slightly different than in prior versions of MMD. It is
designed to be a bit more consistent, and easier for basic use.</text:p>
<text:p text:style-name="Standard">LaTeX support is slightly different than in prior versions of MMD. It is designed to be a bit more consistent, and easier for basic use.</text:p>
<text:p text:style-name="Standard">The previous approach used two types of metadata:</text:p>
<text:list text:style-name="L1">
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">latex input</text:span> &#8211; this uses the name of a latex file that will be used in a
<text:span text:style-name="Source_20_Text">\input{file}</text:span> command. This key can be used multiple times (the only
metadata key that worked this way), and all the basic metadata is written to
the LaTeX file in order.</text:p></text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">latex input</text:span> &#8211; this uses the name of a latex file that will be used in a <text:span text:style-name="Source_20_Text">\input{file}</text:span> command. This key can be used multiple times (the only metadata key that worked this way), and all the basic metadata is written to the LaTeX file in order.</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">latex footer</text:span> &#8211; this file worked the same way as <text:span text:style-name="Source_20_Text">latex input</text:span>, but was
inserted at the end of the file</text:p></text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">latex footer</text:span> &#8211; this file worked the same way as <text:span text:style-name="Source_20_Text">latex input</text:span>, but was inserted at the end of the file</text:p></text:list-item>
</text:list>
<text:p text:style-name="Standard">In practice, one typically needs to be able to insert <text:span text:style-name="Source_20_Text">\input</text:span> commands at
only a few key places in the final document:</text:p>
<text:p text:style-name="Standard">In practice, one typically needs to be able to insert <text:span text:style-name="Source_20_Text">\input</text:span> commands at only a few key places in the final document:</text:p>
<text:list text:style-name="L2">
<text:list-item>
@ -640,43 +509,47 @@ After metadata, and before the body of the document</text:p></text:list-item>
<text:list text:style-name="L2">
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">latex leader</text:span> &#8211; this specifies a file that will be used at the very
beginning of the document.</text:p></text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">latex leader</text:span> &#8211; this specifies a file that will be used at the very beginning of the document.</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">latex begin</text:span> &#8211; this comes after metadata, and before the body of the
document. This will usually include the <text:span text:style-name="Source_20_Text">\begin{document}</text:span> command, hence the
name.</text:p></text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">latex begin</text:span> &#8211; this comes after metadata, and before the body of the document. This will usually include the <text:span text:style-name="Source_20_Text">\begin{document}</text:span> command, hence the name.</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">latex footer</text:span> &#8211; this comes after the body of the document.</text:p></text:list-item>
</text:list>
<text:p text:style-name="Standard">You can use these 3 keys to replace the old <text:span text:style-name="Source_20_Text">latex input</text:span> metadata keys, as
long as you pay attention as to which is which. If you used more than three
include statements, you may have to combine your latex files to fit into the
new system.</text:p>
<text:p text:style-name="Standard">You can use these 3 keys to replace the old <text:span text:style-name="Source_20_Text">latex input</text:span> metadata keys, as long as you pay attention as to which is which. If you used more than three include statements, you may have to combine your latex files to fit into the new system.</text:p>
<text:p text:style-name="Standard"><text:span text:style-name="MMD-Bold"><text:span text:style-name="MMD-Italic">In addition</text:span></text:span>, there is a new shortcut key &#8211; <text:span text:style-name="Source_20_Text">latex config</text:span>. This allows
you to specify a &#8220;document name&#8221; that is used to automatically identify the
corresponding <text:span text:style-name="Source_20_Text">latex leader</text:span>, <text:span text:style-name="Source_20_Text">latex begin</text:span>, and <text:span text:style-name="Source_20_Text">latex footer</text:span> files. For
example, using <text:span text:style-name="Source_20_Text">latex config: article</text:span> is the same as using:</text:p>
<text:p text:style-name="Standard"><text:span text:style-name="MMD-Bold"><text:span text:style-name="MMD-Italic">In addition</text:span></text:span>, there is a new shortcut key &#8211; <text:span text:style-name="Source_20_Text">latex config</text:span>. This allows you to specify a &#8220;document name&#8221; that is used to automatically identify the corresponding <text:span text:style-name="Source_20_Text">latex leader</text:span>, <text:span text:style-name="Source_20_Text">latex begin</text:span>, and <text:span text:style-name="Source_20_Text">latex footer</text:span> files. For example, using <text:span text:style-name="Source_20_Text">latex config: article</text:span> is the same as using:</text:p>
<text:p text:style-name="Preformatted Text">latex leader:<text:tab/>mmd6-article-leader<text:line-break/>latex begin:<text:tab/>mmd6-article-begin<text:line-break/>latex footer:<text:tab/>mmd6-article-footer<text:line-break/></text:p>
<text:p text:style-name="Standard">Using the new system will require migrating your old configuration to the new
naming convention, but once done I believe it should be much more intuitive to
use.</text:p>
<text:p text:style-name="Standard">Using the new system will require migrating your old configuration to the new naming convention, but once done I believe it should be much more intuitive to use.</text:p>
<text:p text:style-name="Standard">The LaTeX support files included with the MMD v6 repository support the use of
the following <text:span text:style-name="Source_20_Text">latex config</text:span> values by default:</text:p>
<text:p text:style-name="Standard">The LaTeX support files included with the MMD v6 repository support the use of the following <text:span text:style-name="Source_20_Text">latex config</text:span> values by default:</text:p>
<text:list text:style-name="L1">
<text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">article</text:span></text:p></text:list-item>
<text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">beamer</text:span></text:p></text:list-item>
<text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">letterhead</text:span></text:p></text:list-item>
<text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">manuscript</text:span></text:p></text:list-item>
<text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">memoir-book</text:span></text:p></text:list-item>
<text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">tufte-book</text:span></text:p></text:list-item>
@ -686,46 +559,30 @@ the following <text:span text:style-name="Source_20_Text">latex config</text:spa
</text:list>
<text:p text:style-name="Standard"><text:span text:style-name="MMD-Bold">NOTE</text:span>: You do have to install the MMD support files into the proper
location for your system. I would like to make this easier, but haven&#8217;t found
the best configuration yet.</text:p>
<text:p text:style-name="Standard"><text:span text:style-name="MMD-Bold">NOTE</text:span>: You do have to install the MMD support files into the proper location for your system. I would like to make this easier, but haven&#8217;t found the best configuration yet.</text:p>
<text:h text:outline-level="4"><text:bookmark text:name="metadata"/>Metadata</text:h>
<text:p text:style-name="Standard">Metadata in MMD v6 includes new support for LaTeX &#8211; the <text:span text:style-name="Source_20_Text">latex config</text:span> key
allows you to automatically set up multiple <text:span text:style-name="Source_20_Text">latex include</text:span> files at once.
The default setups that I use would typically consist of one LaTeX file to be
included at the top of the file, one to be included right at the beginning of
the document, and one to be included at the end of the document. If you want
to specify the latex files separately, you can use <text:span text:style-name="Source_20_Text">latex leader</text:span>, <text:span text:style-name="Source_20_Text">latex<text:line-break/>begin</text:span>, and <text:span text:style-name="Source_20_Text">latex footer</text:span>.</text:p>
<text:p text:style-name="Standard">Metadata in MMD v6 includes new support for LaTeX &#8211; the <text:span text:style-name="Source_20_Text">latex config</text:span> key allows you to automatically set up multiple <text:span text:style-name="Source_20_Text">latex include</text:span> files at once. The default setups that I use would typically consist of one LaTeX file to be included at the top of the file, one to be included right at the beginning of the document, and one to be included at the end of the document. If you want to specify the latex files separately, you can use <text:span text:style-name="Source_20_Text">latex leader</text:span>, <text:span text:style-name="Source_20_Text">latex begin</text:span>, and <text:span text:style-name="Source_20_Text">latex footer</text:span>.</text:p>
<text:p text:style-name="Standard">There are new metadata keys for controlling internationalization:</text:p>
<text:list text:style-name="L1">
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">language</text:span> &#8211; specify the content language for a document, using the two
letter code for the language (e.g. <text:span text:style-name="Source_20_Text">en</text:span> for English). Where possible, this
will also set the default <text:span text:style-name="Source_20_Text">quotes language</text:span>.</text:p></text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">language</text:span> &#8211; specify the content language for a document, using the two letter code for the language (e.g. <text:span text:style-name="Source_20_Text">en</text:span> for English). Where possible, this will also set the default <text:span text:style-name="Source_20_Text">quotes language</text:span>.</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">quotes language</text:span> &#8211; specify which variant of smart quotes to use. Valid
options are <text:span text:style-name="Source_20_Text">dutch</text:span>, <text:span text:style-name="Source_20_Text">french</text:span>, <text:span text:style-name="Source_20_Text">german</text:span>, <text:span text:style-name="Source_20_Text">germanguillemets</text:span>, <text:span text:style-name="Source_20_Text">swedish</text:span>, <text:span text:style-name="Source_20_Text">nl</text:span>,
<text:span text:style-name="Source_20_Text">fr</text:span>, <text:span text:style-name="Source_20_Text">de</text:span>, <text:span text:style-name="Source_20_Text">sv</text:span>. Anything else defaults to English.</text:p></text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">quotes language</text:span> &#8211; specify which variant of smart quotes to use. Valid options are <text:span text:style-name="Source_20_Text">dutch</text:span>, <text:span text:style-name="Source_20_Text">french</text:span>, <text:span text:style-name="Source_20_Text">german</text:span>, <text:span text:style-name="Source_20_Text">germanguillemets</text:span>, <text:span text:style-name="Source_20_Text">swedish</text:span>, <text:span text:style-name="Source_20_Text">nl</text:span>, <text:span text:style-name="Source_20_Text">fr</text:span>, <text:span text:style-name="Source_20_Text">de</text:span>, <text:span text:style-name="Source_20_Text">sv</text:span>. Anything else defaults to English.</text:p></text:list-item>
</text:list>
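
<text:p text:style-name="Standard">As a minimal sketch of the two keys above, a French-language document might begin with metadata like the following (the title is a placeholder chosen for illustration):</text:p>

<text:p text:style-name="Preformatted Text">Title:<text:tab/>Un exemple<text:line-break/>language:<text:tab/>fr<text:line-break/>quotes language:<text:tab/>french<text:line-break/></text:p>

<text:p text:style-name="Standard">Since <text:span text:style-name="Source_20_Text">language</text:span> is described as setting the default <text:span text:style-name="Source_20_Text">quotes language</text:span> where possible, the second key is probably redundant here; it is shown only to illustrate the syntax.</text:p>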
<text:p text:style-name="Standard">Additionally, the <text:span text:style-name="Source_20_Text">MMD Header</text:span> and <text:span text:style-name="Source_20_Text">MMD Footer</text:span> metadata work slightly
differently. In v5, these fields were used to list names of files that should
be transcluded before and after the main body. In v6, these fields represent
the actual text to be inserted. If you want them to reference separate files,
use the transclusion functionality:</text:p>
<text:p text:style-name="Standard">Additionally, the <text:span text:style-name="Source_20_Text">MMD Header</text:span> and <text:span text:style-name="Source_20_Text">MMD Footer</text:span> metadata work slightly differently. In v5, these fields were used to list names of files that should be transcluded before and after the main body. In v6, these fields represent the actual text to be inserted. If you want them to reference separate files, use the transclusion functionality:</text:p>
<text:p text:style-name="Preformatted Text">Title:<text:tab/>Some Title<text:line-break/>MMD Header:<text:tab/>This is *MMD* text.<text:line-break/>MMD Footer:<text:tab/>{{footer.txt}}<text:line-break/></text:p>
<text:h text:outline-level="4"><text:bookmark text:name="outputformats"/>Output Formats</text:h>
<text:p text:style-name="Standard">MultiMarkdown 6 supports the following output formats, using the <text:span text:style-name="Source_20_Text">-t</text:span>
command-line argument:</text:p>
<text:p text:style-name="Standard">MultiMarkdown 6 supports the following output formats, using the <text:span text:style-name="Source_20_Text">-t</text:span> command-line argument:</text:p>
<text:list text:style-name="L1">
<text:list-item>
@ -733,50 +590,51 @@ command-line argument:</text:p>
<text:span text:style-name="Source_20_Text">html</text:span> &#8211; (Default) create HTML 5</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">latex</text:span> &#8211; create <text:a xlink:type="simple" xlink:href="https://en.wikipedia.org/wiki/LaTeX">LaTeX</text:a> for conversion to PDF using high quality
typography</text:p></text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">latex</text:span> &#8211; create <text:a xlink:type="simple" xlink:href="https://en.wikipedia.org/wiki/LaTeX">LaTeX</text:a> for conversion to PDF using high quality typography</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">beamer</text:span> and <text:span text:style-name="Source_20_Text">memoir</text:span> &#8211; two additional LaTeX variants for creating
slide presentations and longer documents, respectively</text:p></text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">beamer</text:span> and <text:span text:style-name="Source_20_Text">memoir</text:span> &#8211; two additional LaTeX variants for creating slide presentations and longer documents, respectively</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">mmd</text:span> &#8211; output the MMD text before converting to another format,
but after performing transclusion. This format is not generally needed.</text:p></text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">mmd</text:span> &#8211; output the MMD text before converting to another format, but after performing transclusion. This format is not generally needed.</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">odt</text:span> &#8211; OpenDocument text file, used by OpenOffice and compatible
word processors. Images are embedded inside the file package.</text:p></text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">odt</text:span> &#8211; OpenDocument text file, used by OpenOffice and compatible word processors. Images are embedded inside the file package.</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">fodt</text:span> &#8211; OpenDocument text variant using a single text (XML) file
instead of a compressed zip file. Images are not embedded in this format.</text:p></text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">fodt</text:span> &#8211; OpenDocument text variant using a single text (XML) file instead of a compressed zip file. Images are not embedded in this format.</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">epub</text:span> &#8211; EPUB 3 ebook format. Images and CSS are embedded in the
file package.</text:p></text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">epub</text:span> &#8211; EPUB 3 ebook format. Images and CSS are embedded in the file package.</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">bundle</text:span> &#8211; [TextBundle] format consisting of Markdown/MultiMarkdown
text file and embedded images and CSS. Useful for sharing Markdown files
and images between applications (on any OS, but especially on iOS)</text:p></text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">opml</text:span> &#8211; <text:a xlink:type="simple" xlink:href="http://en.wikipedia.org/wiki/OPML">OPML</text:a> is a standard file format used for a wide range of outlining programs. This allows you to use a single file for editing MultiMarkdown text and for outlining longer documents. <text:a xlink:type="simple" xlink:href="https://multimarkdown.com/">MultiMarkdown Composer</text:a> can read/write the OPML format, making it easy to share documents with other programs.</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">bundlezip</text:span> &#8211; TextPack variant of the TextBundle format &#8211; the file
package is compressed to a single zip file (similar to EPUB and ODT
formats).</text:p></text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">itmz</text:span> &#8211; ITMZ is the file format used for the <text:a xlink:type="simple" xlink:href="http://www.ithoughts.co.uk/">iThoughts</text:a> mind mapping software (macOS, iOS, Windows). Much like OPML, this format allows you to use a single file for your outlining/brainstorming and final production. <text:a xlink:type="simple" xlink:href="https://multimarkdown.com/">MultiMarkdown Composer</text:a> can read/write this format as well, giving you additional flexibility.</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">bundle</text:span> &#8211; [TextBundle] format consisting of Markdown/MultiMarkdown text file and embedded images and CSS. Useful for sharing Markdown files and images between applications (on any OS, but especially on iOS)</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Source_20_Text">bundlezip</text:span> &#8211; TextPack variant of the TextBundle format &#8211; the file package is compressed to a single zip file (similar to EPUB and ODT formats).</text:p></text:list-item>
</text:list>
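
<text:p text:style-name="Standard">For example, assuming the program is installed as <text:span text:style-name="Source_20_Text">multimarkdown</text:span> and that <text:span text:style-name="Source_20_Text">document.txt</text:span> is a hypothetical input file, selecting an output format might look like the following sketch, with the shell redirecting the result to a file:</text:p>

<text:p text:style-name="Preformatted Text">multimarkdown -t latex document.txt &gt; document.tex<text:line-break/>multimarkdown -t fodt document.txt &gt; document.fodt<text:line-break/></text:p>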
<text:h text:outline-level="4"><text:bookmark text:name="rawsource"/>Raw Source</text:h>
<text:p text:style-name="Standard">In older versions of MultiMarkdown you could use an HTML comment to pass raw
LaTeX or other content to the final document. This worked reasonably well,
but was limited and didn&#8217;t work well when exporting to multiple formats. It
was time for something new.</text:p>
<text:p text:style-name="Standard">In older versions of MultiMarkdown you could use an HTML comment to pass raw LaTeX or other content to the final document. This worked reasonably well, but was limited and didn&#8217;t work well when exporting to multiple formats. It was time for something new.</text:p>
<text:p text:style-name="Standard">MMD v6 offers a new feature to handle this. Code spans and code blocks can
be flagged as representing raw source:</text:p>
<text:p text:style-name="Standard">MMD v6 offers a new feature to handle this. Code spans and code blocks can be flagged as representing raw source:</text:p>
<text:p text:style-name="Preformatted Text">foo `*bar*`{=html}<text:line-break/><text:line-break/>```{=latex}<text:line-break/>*foo*<text:line-break/>```<text:line-break/></text:p>
@ -791,7 +649,7 @@ be flagged as representing raw source:</text:p>
<text:list-item>
<text:p text:style-name="P1">
<text:span text:style-name="Source_20_Text">odt</text:span></text:p></text:list-item>
<text:span text:style-name="Source_20_Text">odt</text:span> &#8211; for ODT and FODT</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="P1">
@ -808,25 +666,17 @@ be flagged as representing raw source:</text:p>
<text:h text:outline-level="4"><text:bookmark text:name="tableofcontents"/>Table of Contents</text:h>
<text:p text:style-name="Standard">By placing <text:span text:style-name="Source_20_Text">{{TOC}}</text:span> in your document, you can insert an automatically
generated Table of Contents in your document. As of MMD v6, the native
Table of Contents functionality is used when exporting to LaTeX or
OpenDocument formats.</text:p>
<text:p text:style-name="Standard">By placing <text:span text:style-name="Source_20_Text">{{TOC}}</text:span> in your document, you can insert an automatically generated Table of Contents in your document. As of MMD v6, the native Table of Contents functionality is used when exporting to LaTeX or OpenDocument formats.</text:p>
<text:h text:outline-level="4"><text:bookmark text:name="tables"/>Tables</text:h>
<text:p text:style-name="Standard">Tables in MultiMarkdown-6 work basically the same as before, but a caption, if
present, must come <text:span text:style-name="MMD-Italic">after</text:span> the body of the table, not <text:span text:style-name="MMD-Italic">before</text:span>.</text:p>
<text:p text:style-name="Standard">Tables in MultiMarkdown-6 work basically the same as before, but a caption, if present, must come <text:span text:style-name="MMD-Italic">after</text:span> the body of the table, not <text:span text:style-name="MMD-Italic">before</text:span>.</text:p>
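
<text:p text:style-name="Standard">A minimal sketch of the caption placement described above (the table contents and caption text are placeholders):</text:p>

<text:p text:style-name="Preformatted Text">| Item | Count |<text:line-break/>| ---- | ----- |<text:line-break/>| foo | 1 |<text:line-break/>[Caption placed after the table body]<text:line-break/></text:p>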
<text:h text:outline-level="4"><text:bookmark text:name="transclusion"/>Transclusion</text:h>
<text:p text:style-name="Standard">File transclusion works basically the same way &#8211; <text:span text:style-name="Source_20_Text">{{file}}</text:span> is used to
indicate a file that needs to be transcluded. <text:span text:style-name="Source_20_Text">{{file.*}}</text:span> allows for
wildcard transclusion. What&#8217;s different is that the way search paths are
handled is more flexible, though it may take a moment to understand.</text:p>
<text:p text:style-name="Standard">File transclusion works basically the same way &#8211; <text:span text:style-name="Source_20_Text">{{file}}</text:span> is used to indicate a file that needs to be transcluded. <text:span text:style-name="Source_20_Text">{{file.*}}</text:span> allows for wildcard transclusion. What&#8217;s different is that the way search paths are handled is more flexible, though it may take a moment to understand.</text:p>
<text:p text:style-name="Standard">When you process a file with MMD, it uses that file&#8217;s directory as the search
path for included files. For example:</text:p>
<text:p text:style-name="Standard">When you process a file with MMD, it uses that file&#8217;s directory as the search path for included files. For example:</text:p>
<table:table>
<table:table-column/>
@ -882,17 +732,13 @@ path for included files. For example:</text:p>
</table:table>
<text:p text:style-name="Standard">This is the same as MMD v5. What&#8217;s different is that when you transclude a
file, the search path stays the same as the &#8220;parent&#8221; file, <text:span text:style-name="MMD-Bold">UNLESS</text:span> you use
the <text:span text:style-name="Source_20_Text">transclude base</text:span> metadata to override it. The simplest override is:</text:p>
<text:p text:style-name="Standard">This is the same as MMD v5. What&#8217;s different is that when you transclude a file, the search path stays the same as the &#8220;parent&#8221; file, <text:span text:style-name="MMD-Bold">UNLESS</text:span> you use the <text:span text:style-name="Source_20_Text">transclude base</text:span> metadata to override it. The simplest override is:</text:p>
<text:p text:style-name="Preformatted Text">transclude base: .<text:line-break/></text:p>
<text:p text:style-name="Standard">This means that any transclusions within the file will be calculated relative
to the file, regardless of the original search path.</text:p>
<text:p text:style-name="Standard">This means that any transclusions within the file will be calculated relative to the file, regardless of the original search path.</text:p>
<text:p text:style-name="Standard">Alternatively you could specify that any transclusion happens inside a
subfolder:</text:p>
<text:p text:style-name="Standard">Alternatively you could specify that any transclusion happens inside a subfolder:</text:p>
<text:p text:style-name="Preformatted Text">transclude base: folder/<text:line-break/></text:p>
@ -900,45 +746,32 @@ subfolder:</text:p>
<text:p text:style-name="Preformatted Text">transclude base: /some/path<text:line-break/></text:p>
<text:p text:style-name="Standard">This flexibility means that you can transclude different files based on
whether a file is being processed by itself or as part of a &#8220;parent&#8221; file.
This can be useful when a particular file can either be a standalone document,
or a chapter inside a larger document.</text:p>
<text:p text:style-name="Standard">This flexibility means that you can transclude different files based on whether a file is being processed by itself or as part of a &#8220;parent&#8221; file. This can be useful when a particular file can either be a standalone document, or a chapter inside a larger document.</text:p>
<text:h text:outline-level="3"><text:bookmark text:name="developernotes"/>Developer Notes</text:h>
<text:p text:style-name="Standard">If you&#8217;re using MMD as a library in another application, there are a few
things to be aware of.</text:p>
<text:p text:style-name="Standard">If you&#8217;re using MMD as a library in another application, there are a few things to be aware of.</text:p>
<text:h text:outline-level="4"><text:bookmark text:name="objectpools"/>Object Pools</text:h>
<text:p text:style-name="Standard">To improve performance, MMD has the option to allocate the memory for the
tokens used in parsing in large chunks (&#8220;object pools&#8221;). Allocating a single
large chunk of memory is more efficient than allocating many smaller chunks.
However, this does complicate memory management.</text:p>
<text:p text:style-name="Standard">To improve performance, MMD has the option to allocate the memory for the tokens used in parsing in large chunks (&#8220;object pools&#8221;). Allocating a single large chunk of memory is more efficient than allocating many smaller chunks. However, this does complicate memory management.</text:p>
<text:p text:style-name="Standard">By default <text:span text:style-name="Source_20_Text">token.h</text:span> defines <text:span text:style-name="Source_20_Text">kUseObjectPool</text:span> which enables this performance
improvement. This does require more caution with the way that memory is
managed. (See <text:span text:style-name="Source_20_Text">main.c</text:span> for an example of how the object pool is allocated and
drained.) I recommend disabling object pools unless you really understand C
memory management, and understand MultiMarkdown&#8217;s program flow. Failure to
properly manage the object pool can lead to massive memory leaks, freeing
memory before that is still in use, or other potential problems.</text:p>
<text:p text:style-name="Standard">By default <text:span text:style-name="Source_20_Text">token.h</text:span> defines <text:span text:style-name="Source_20_Text">kUseObjectPool</text:span> which enables this performance improvement. This does require more caution with the way that memory is managed. (See <text:span text:style-name="Source_20_Text">main.c</text:span> for an example of how the object pool is allocated and drained.) I recommend disabling object pools unless you really understand C memory management, and understand MultiMarkdown&#8217;s program flow. Failure to properly manage the object pool can lead to massive memory leaks, freeing memory that is still in use, or other potential problems.</text:p>
<text:h text:outline-level="4"><text:bookmark text:name="htmlbooleanattributes"/>HTML Boolean Attributes</text:h>
<text:p text:style-name="Standard">Most HTML attributes are of the key-value type (e.g. <text:span text:style-name="Source_20_Text">key=&quot;value&quot;</text:span>). But some
less frequently used attributes are boolean attributes (e.g. <text:span text:style-name="Source_20_Text">&lt;video<text:line-break/>controls&gt;</text:span>). Properly distinguishing HTML from other uses of the <text:span text:style-name="Source_20_Text">&lt;</text:span>
character requires matching both types under certain circumstances.</text:p>
<text:p text:style-name="Standard">Most HTML attributes are of the key-value type (e.g. <text:span text:style-name="Source_20_Text">key=&quot;value&quot;</text:span>). But some less frequently used attributes are boolean attributes (e.g. <text:span text:style-name="Source_20_Text">&lt;video controls&gt;</text:span>). Properly distinguishing HTML from other uses of the <text:span text:style-name="Source_20_Text">&lt;</text:span> character requires matching both types under certain circumstances.</text:p>
<text:p text:style-name="Standard">There are some trade-offs to be made:</text:p>
<text:list text:style-name="L1">
<text:list-item>
<text:p text:style-name="Standard">Performance when compiling MultiMarkdown</text:p></text:list-item>
<text:p text:style-name="P1">
Performance when compiling MultiMarkdown</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard">Performance when processing parts of documents that are <text:span text:style-name="MMD-Italic">not</text:span> HTML</text:p></text:list-item>
<text:p text:style-name="P1">
Performance when processing parts of documents that are <text:span text:style-name="MMD-Italic">not</text:span> HTML</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard">Accuracy when matching HTML</text:p></text:list-item>
@ -949,28 +782,16 @@ character requires matching both types under certain circumstances.</text:p>
<text:list text:style-name="L1">
<text:list-item>
<text:p text:style-name="Standard">Ignore boolean attributes &#8211; this is how MMD-6 started. This is fast, but
not accurate for some users. Several users found issues with the <text:span text:style-name="Source_20_Text">&lt;video&gt;</text:span> tag
when MMD was used in HTML heavy documents.</text:p></text:list-item>
<text:p text:style-name="Standard">Ignore boolean attributes &#8211; this is how MMD-6 started. This is fast, but not accurate for some users. Several users found issues with the <text:span text:style-name="Source_20_Text">&lt;video&gt;</text:span> tag when MMD was used in HTML heavy documents.</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard">Use regexp to match all boolean attributes. This is fast to compile, but
adds roughly 5&#8211;8% processing time (probably due to false positive HTML
matches). This <text:span text:style-name="MMD-Italic">may</text:span> cause some text to be classified as HTML when it
shouldn&#8217;t.</text:p></text:list-item>
<text:p text:style-name="Standard">Use regexp to match all boolean attributes. This is fast to compile, but adds roughly 5&#8211;8% processing time (probably due to false positive HTML matches). This <text:span text:style-name="MMD-Italic">may</text:span> cause some text to be classified as HTML when it shouldn&#8217;t.</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard">Explicitly match all possible boolean attributes &#8211; This would presumably be
relatively fast when processing (due to the nature of re2c lexers), but it may
be prohibitively slow to compile for some users. As someone who compiles MMD
frequently, it is too slow to compile be useful for me during development.</text:p></text:list-item>
<text:p text:style-name="Standard">Explicitly match all possible boolean attributes &#8211; This would presumably be relatively fast when processing (due to the nature of re2c lexers), but it may be prohibitively slow to compile for some users. Because I compile MMD frequently, this option compiles too slowly to be usable for me during development.</text:p></text:list-item>
<text:list-item>
<text:p text:style-name="Standard">Use a hand-curated list of boolean attributes that are most commonly used &#8211;
this does not incur much of a performance hit when parsing, and compiles
faster than the complete list of all boolean attributes. For now, this is the
option I have chosen as default for MMD &#8211; it seems to be a reasonable trade-
off. I will continue to research additional options.</text:p></text:list-item>
<text:p text:style-name="Standard">Use a hand-curated list of boolean attributes that are most commonly used &#8211; this does not incur much of a performance hit when parsing, and compiles faster than the complete list of all boolean attributes. For now, this is the option I have chosen as default for MMD &#8211; it seems to be a reasonable trade-off. I will continue to research additional options.</text:p></text:list-item>
</text:list>
@ -980,11 +801,7 @@ off. I will continue to research additional options.</text:p></text:list-item>
<text:list text:style-name="L2">
<text:list-item>
<text:p text:style-name="Standard">OPML export support is not available in v6. I plan on adding improved
support for this at some point. I was hoping to be able to re-use the
existing v6 parser but it might be simpler to use the approach from v5 and
earlier, which was to have a separate parser tuned to only identify headers
and &#8220;stuff between headers&#8221;.</text:p></text:list-item>
<text:p text:style-name="Standard"><text:span text:style-name="Strike">OPML export support is not available in v6. I plan on adding improved support for this at some point. I was hoping to be able to re-use the existing v6 parser but it might be simpler to use the approach from v5 and earlier, which was to have a separate parser tuned to only identify headers and &#8220;stuff between headers&#8221;.</text:span><text:span text:style-name="Comment">OPML read/write support implemented.</text:span></text:p></text:list-item>
</text:list>
</office:text>

View File

@ -51,85 +51,30 @@
<p>Version: 6.4.0</p>
<p>This document serves as a description of MultiMarkdown (<abbr title="MultiMarkdown">MMD</abbr>) v6, as well as a sample
document to demonstrate the various features. Specifically, differences from
<abbr title="MultiMarkdown">MMD</abbr> v5 will be pointed out.</p>
<p>This document serves as a description of MultiMarkdown (<abbr title="MultiMarkdown">MMD</abbr>) v6, as well as a sample document to demonstrate the various features. Specifically, differences from <abbr title="MultiMarkdown">MMD</abbr> v5 will be pointed out.</p>
<h3 id="performance">Performance</h3>
<p>A big motivating factor leading to the development of <abbr title="MultiMarkdown">MMD</abbr> v6 was
performance. When <abbr title="MultiMarkdown">MMD</abbr> first migrated from Perl to C (based on <a href="https://github.com/jgm/peg-markdown">peg-
markdown</a>), it was among the fastest
Markdown parsers available. That was many years ago, and the &#8220;competition&#8221;
has made a great deal of progress since that time.</p>
<p>A big motivating factor leading to the development of <abbr title="MultiMarkdown">MMD</abbr> v6 was performance. When <abbr title="MultiMarkdown">MMD</abbr> first migrated from Perl to C (based on <a href="https://github.com/jgm/peg-markdown">peg-markdown</a>), it was among the fastest Markdown parsers available. That was many years ago, and the &#8220;competition&#8221; has made a great deal of progress since that time.</p>
<p>When developing <abbr title="MultiMarkdown">MMD</abbr> v6, one of my goals was to keep <abbr title="MultiMarkdown">MMD</abbr> at least in the
ballpark of the fastest processors. Of course, being <em>the</em> fastest would be
fantastic, but I was more concerned with ensuring that the code was easily
understood, and easily updated with new features in the future.</p>
<p>When developing <abbr title="MultiMarkdown">MMD</abbr> v6, one of my goals was to keep <abbr title="MultiMarkdown">MMD</abbr> at least in the ballpark of the fastest processors. Of course, being <em>the</em> fastest would be fantastic, but I was more concerned with ensuring that the code was easily understood, and easily updated with new features in the future.</p>
<p><abbr title="MultiMarkdown">MMD</abbr> v3 &#8211; v5 used a <a href="#gn:1" id="gnref:1" title="see glossary" class="glossary">PEG</a> to handle the parsing. This made it easy to
understand the relationship between the <abbr title="MultiMarkdown">MMD</abbr> grammar and the parsing code,
since they were one and the same. However, the parsing code generated by
the parsers was not particularly fast, and was prone to troublesome edge
cases with terrible performance characteristics.</p>
<p><abbr title="MultiMarkdown">MMD</abbr> v3 &#8211; v5 used a <a href="#gn:1" id="gnref:1" title="see glossary" class="glossary">PEG</a> to handle the parsing. This made it easy to understand the relationship between the <abbr title="MultiMarkdown">MMD</abbr> grammar and the parsing code, since they were one and the same. However, the parsing code generated by the parsers was not particularly fast, and was prone to troublesome edge cases with terrible performance characteristics.</p>
<p>The first step in <abbr title="MultiMarkdown">MMD</abbr> v6 parsing is to break the source text into a series
of tokens, which may consist of plain text, whitespace, or special characters
such as &#8216;*&#8217;, &#8216;[&#8217;, etc. This chain of tokens is then used to perform the
actual parsing.</p>
<p>The first step in <abbr title="MultiMarkdown">MMD</abbr> v6 parsing is to break the source text into a series of tokens, which may consist of plain text, whitespace, or special characters such as &#8216;*&#8217;, &#8216;[&#8217;, etc. This chain of tokens is then used to perform the actual parsing.</p>
<p><abbr title="MultiMarkdown">MMD</abbr> v6 divides the parsing into two separate phases, which actually fits
more with Markdown&#8217;s design philosophically.</p>
<p><abbr title="MultiMarkdown">MMD</abbr> v6 divides the parsing into two separate phases, which actually fits more with Markdown&#8217;s design philosophically.</p>
<ol>
<li><p>Block parsing consists of identifying the &#8220;type&#8221; of each line of the
source text, and grouping the lines into blocks (e.g. paragraphs, lists,
blockquotes, etc.) Some blocks are a single line (e.g. ATX headers), and
others can be many lines long. The block parsing in <abbr title="MultiMarkdown">MMD</abbr> v6 is handled
by a parser generated by <a href="http://www.hwaci.com/sw/lemon/">lemon</a>. This
parser allows the block structure to be more readily understood by
non-programmers, but the generated parser is still fast.</p></li>
<li><p>Span parsing consists of identifying Markdown/<abbr title="MultiMarkdown">MMD</abbr> structures that occur
inside of blocks, such as links, images, strong, emph, etc. Most of these
structures require matching pairs of tokens to specify where the span starts
and where it ends. Most of these spans allow arbitrary levels of nesting as
well. This made parsing them correctly in the <a href="#gn:1" title="see glossary" class="glossary">PEG</a>-based code difficult and
slow. <abbr title="MultiMarkdown">MMD</abbr> v6 uses a different approach that is accurate and has good
performance characteristics even with edge cases. Basically, it keeps a stack
of each &#8220;opening&#8221; token as it steps through the token chain. When a &#8220;closing&#8221;
token is found, it is paired with the most recent appropriate opener on the
stack. Any tokens in between the opener and closer are removed, as they are
not able to be matched any more. To avoid unnecessary searches for non-
existent openers, the parser keeps track of which opening tokens have been
discovered. This allows the parser to continue moving forwards without having
to go backwards and re-parse any previously visited tokens.</p></li>
<li><p>Block parsing consists of identifying the &#8220;type&#8221; of each line of the source text, and grouping the lines into blocks (e.g. paragraphs, lists, blockquotes, etc.) Some blocks are a single line (e.g. ATX headers), and others can be many lines long. The block parsing in <abbr title="MultiMarkdown">MMD</abbr> v6 is handled by a parser generated by <a href="http://www.hwaci.com/sw/lemon/">lemon</a>. This parser allows the block structure to be more readily understood by non-programmers, but the generated parser is still fast.</p></li>
<li><p>Span parsing consists of identifying Markdown/<abbr title="MultiMarkdown">MMD</abbr> structures that occur inside of blocks, such as links, images, strong, emph, etc. Most of these structures require matching pairs of tokens to specify where the span starts and where it ends. Most of these spans allow arbitrary levels of nesting as well. This made parsing them correctly in the <a href="#gn:1" title="see glossary" class="glossary">PEG</a>-based code difficult and slow. <abbr title="MultiMarkdown">MMD</abbr> v6 uses a different approach that is accurate and has good performance characteristics even with edge cases. Basically, it keeps a stack of each &#8220;opening&#8221; token as it steps through the token chain. When a &#8220;closing&#8221; token is found, it is paired with the most recent appropriate opener on the stack. Any tokens in between the opener and closer are removed, as they are not able to be matched any more. To avoid unnecessary searches for non-existent openers, the parser keeps track of which opening tokens have been discovered. This allows the parser to continue moving forwards without having to go backwards and re-parse any previously visited tokens.</p></li>
</ol>
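<p>The span-pairing approach described above can be sketched in C. This is a minimal illustration, not MultiMarkdown's actual code: the bracket characters standing in for tokens, the <code>pair_tokens()</code> function, and the <code>MAX_TOKENS</code> limit are all hypothetical names chosen for the example.</p>

```c
#include <stddef.h>
#include <string.h>

#define MAX_TOKENS 256   /* hypothetical limit, for illustration only */

/* Hypothetical token kinds: each opener has exactly one closer. */
static const char OPENERS[] = "[({";
static const char CLOSERS[] = "])}";

/* Return the kind index (0..2) if c appears in set, otherwise -1. */
static int kind_of(char c, const char *set) {
    const char *p = (c != '\0') ? strchr(set, c) : NULL;
    return p ? (int)(p - set) : -1;
}

/*
 * Pair openers and closers in a single forward pass.
 * mate[i] receives the index of the token paired with token i, or -1.
 * mate must have room for MAX_TOKENS entries.
 * Returns the number of pairs found.
 */
int pair_tokens(const char *src, int *mate) {
    int stack[MAX_TOKENS];          /* indices of openers seen so far */
    int depth = 0;
    int open_count[3] = {0, 0, 0};  /* openers of each kind on the stack */
    int pairs = 0;
    size_t n = strlen(src);

    if (n > MAX_TOKENS) {
        n = MAX_TOKENS;
    }

    for (size_t i = 0; i < n; i++) {
        mate[i] = -1;

        int k = kind_of(src[i], OPENERS);
        if (k >= 0) {
            stack[depth++] = (int)i;
            open_count[k]++;
            continue;
        }

        k = kind_of(src[i], CLOSERS);
        if (k < 0 || open_count[k] == 0) {
            /* Plain text, or no matching opener is waiting: skip the search. */
            continue;
        }

        /* Pop down to the most recent matching opener. Openers popped on
           the way were skipped over and can no longer be matched. */
        while (depth > 0) {
            int j = stack[--depth];
            int jk = kind_of(src[j], OPENERS);
            open_count[jk]--;
            if (jk == k) {
                mate[j] = (int)i;
                mate[i] = j;
                pairs++;
                break;
            }
        }
    }
    return pairs;
}
```

<p>For the input <code>[a(b]c)</code>, the <code>]</code> pairs with the <code>[</code>, the unmatched <code>(</code> is discarded from the stack, and the later <code>)</code> is skipped without any search because no <code>(</code> remains. The per-kind counters play the role of tracking which opening tokens have been discovered, so the pass never revisits earlier tokens.</p>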
<p>The result of this redesigned <abbr title="MultiMarkdown">MMD</abbr> parser is that it can parse short
documents more quickly than <a href="http://commonmark.org/">CommonMark</a>, and takes
only 15% &#8211; 20% longer to parse long documents. I have not delved too deeply
into this, but I presume that CommonMark has a bit more &#8220;set-up&#8221; time that
becomes expensive when parsing a short document (e.g. a paragraph or two). But
this cost becomes negligible when parsing longer documents (e.g. file sizes of
1 MB). So depending on your use case, CommonMark may well be faster than
<abbr title="MultiMarkdown">MMD</abbr>, but we&#8217;re talking about splitting hairs here&#8230;. Recent comparisons
show <abbr title="MultiMarkdown">MMD</abbr> v6 taking approximately 4.37 seconds to parse a 108 MB file
(approximately 24.8 MB/second), and CommonMark took 3.72 seconds for the same
file (29.2 MB/second). For comparison, <abbr title="MultiMarkdown">MMD</abbr> v5.4 took approximately 94
second for the same file (1.15 MB/second).</p>
<p>The result of this redesigned <abbr title="MultiMarkdown">MMD</abbr> parser is that it can parse short documents more quickly than <a href="http://commonmark.org/">CommonMark</a>, and takes only 15% &#8211; 20% longer to parse long documents. I have not delved too deeply into this, but I presume that CommonMark has a bit more &#8220;set-up&#8221; time that becomes expensive when parsing a short document (e.g. a paragraph or two). But this cost becomes negligible when parsing longer documents (e.g. file sizes of 1 MB). So depending on your use case, CommonMark may well be faster than <abbr title="MultiMarkdown">MMD</abbr>, but we&#8217;re talking about splitting hairs here&#8230;. Recent comparisons show <abbr title="MultiMarkdown">MMD</abbr> v6 taking approximately 4.37 seconds to parse a 108 MB file (approximately 24.8 MB/second), and CommonMark took 3.72 seconds for the same file (29.2 MB/second). For comparison, <abbr title="MultiMarkdown">MMD</abbr> v5.4 took approximately 94 seconds for the same file (1.15 MB/second).</p>
<p>For a more realistic file of approx 28 kb (the source of the Markdown Syntax
web page), both <abbr title="MultiMarkdown">MMD</abbr> and CommonMark parse it too quickly to accurately
measure. In fact, it requires a file consisting of the original file copied
32 times over (0.85 MB) before <code>/usr/bin/env time</code> reports a time over the
minimum threshold of 0.01 seconds for either program.</p>
<p>For a more realistic file of approximately 28 KB (the source of the Markdown Syntax web page), both <abbr title="MultiMarkdown">MMD</abbr> and CommonMark parse it too quickly to accurately measure. In fact, it requires a file consisting of the original file copied 32 times over (0.85 MB) before <code>/usr/bin/env time</code> reports a time over the minimum threshold of 0.01 seconds for either program.</p>
<p>There is still potentially room for additional optimization in <abbr title="MultiMarkdown">MMD</abbr>.
However, even if I can&#8217;t close the performance gap with CommonMark on longer
files, the additional features of <abbr title="MultiMarkdown">MMD</abbr> compared with Markdown in addition to
the increased legibility of the source code of <abbr title="MultiMarkdown">MMD</abbr> (in my biased opinion
anyway) make this project worthwhile.</p>
<p>There is still potentially room for additional optimization in <abbr title="MultiMarkdown">MMD</abbr>. However, even if I can&#8217;t close the performance gap with CommonMark on longer files, the additional features of <abbr title="MultiMarkdown">MMD</abbr> compared with Markdown in addition to the increased legibility of the source code of <abbr title="MultiMarkdown">MMD</abbr> (in my biased opinion anyway) make this project worthwhile.</p>
<h3 id="parsetree">Parse Tree</h3>
@ -139,143 +84,85 @@ anyway) make this project worthwhile.</p>
<li><p>Start with a null-terminated string of source text (C style string)</p></li>
<li><p>Lex string into token chain</p></li>
<li><p>Parse token chain into blocks</p></li>
<li><p>Parse tokens within each block into span level structures (e.g. strong,
emph, etc.)</p></li>
<li><p>Export the token tree into the desired output format (e.g. HTML, LaTeX,
etc.) and return the resulting C style string</p>
<li><p>Parse tokens within each block into span level structures (e.g. strong, emph, etc.)</p></li>
<li><p>Export the token tree into the desired output format (e.g. HTML, LaTeX, etc.) and return the resulting C style string</p>
<p><strong>OR</strong></p></li>
<li><p>Use the resulting token tree for your own purposes.</p></li>
</ol>
<p>The token tree (<a href="#gn:2" id="gnref:2" title="see glossary" class="glossary">AST</a>) includes starting offsets and length of each token,
allowing you to use <abbr title="MultiMarkdown">MMD</abbr> as part of a syntax highlighter. <abbr title="MultiMarkdown">MMD</abbr> v5 did not
have this functionality in the public version, in part because the <a href="#gn:1" title="see glossary" class="glossary">PEG</a> parsers
used did not provide reliable offset positions, requiring a great deal of
effort when I adapted <abbr title="MultiMarkdown">MMD</abbr> for use in <a href="http://multimarkdown.com/">MultiMarkdown
Composer</a>.</p>
<p>The token tree (<a href="#gn:2" id="gnref:2" title="see glossary" class="glossary">AST</a>) includes starting offsets and length of each token, allowing you to use <abbr title="MultiMarkdown">MMD</abbr> as part of a syntax highlighter. <abbr title="MultiMarkdown">MMD</abbr> v5 did not have this functionality in the public version, in part because the <a href="#gn:1" title="see glossary" class="glossary">PEG</a> parsers used did not provide reliable offset positions, requiring a great deal of effort when I adapted <abbr title="MultiMarkdown">MMD</abbr> for use in <a href="http://multimarkdown.com/">MultiMarkdown Composer</a>.</p>
<p>These steps are managed using the <code>mmd_engine</code> &#8220;object&#8221;. An individual
<code>mmd_engine</code> cannot be used by multiple threads simultaneously, so if
libMultiMarkdown is to be used in a multithreaded program, a separate
<code>mmd_engine</code> should be created for each thread. Alternatively, just use the
slightly more abstracted <code>mmd_convert_string()</code> function that handles creating
and destroying the <code>mmd_engine</code> automatically.</p>
<p>These steps are managed using the <code>mmd_engine</code> &#8220;object&#8221;. An individual <code>mmd_engine</code> cannot be used by multiple threads simultaneously, so if libMultiMarkdown is to be used in a multithreaded program, a separate <code>mmd_engine</code> should be created for each thread. Alternatively, just use the slightly more abstracted <code>mmd_convert_string()</code> function that handles creating and destroying the <code>mmd_engine</code> automatically.</p>
<h3 id="features">Features</h3>
<h4 id="abbreviationsoracronyms">Abbreviations (Or Acronyms)</h4>
<p>This file includes the use of <abbr title="MultiMarkdown">MMD</abbr> as an abbreviation for MultiMarkdown. The
abbreviation will be expanded on the first use, and the shortened form will be
used on subsequent occurrences.</p>
<p>This file includes the use of <abbr title="MultiMarkdown">MMD</abbr> as an abbreviation for MultiMarkdown. The abbreviation will be expanded on the first use, and the shortened form will be used on subsequent occurrences.</p>
<p>Abbreviations can be specified using inline or reference syntax. The inline
variant requires that the abbreviation be wrapped in parentheses and
immediately follows the <code>&gt;</code>.</p>
<p>Abbreviations can be specified using inline or reference syntax. The inline variant requires that the abbreviation be wrapped in parentheses and immediately follow the <code>&gt;</code>.</p>
<pre><code>[>MMD] is an abbreviation. So is [>(MD) Markdown].
[>MMD]: MultiMarkdown
</code></pre>
<p>There is also a &#8220;shortcut&#8221; method for abbreviations that is similar to the
approach used in prior versions of <abbr title="MultiMarkdown">MMD</abbr>. You specify the definition for the
abbreviation in the usual manner, but <abbr title="MultiMarkdown">MMD</abbr> will automatically identify each
instance where the abbreviation is used and substitute it automatically. In
this case, the abbreviation is limited to a more basic character set which
includes letters, numbers, periods, and hyphens, but not much else. For more
complex abbreviations, you must explicitly mark uses of the abbreviation.</p>
<p>There is also a &#8220;shortcut&#8221; method for abbreviations that is similar to the approach used in prior versions of <abbr title="MultiMarkdown">MMD</abbr>. You specify the definition for the abbreviation in the usual manner, but <abbr title="MultiMarkdown">MMD</abbr> will automatically identify each instance where the abbreviation is used and substitute it automatically. In this case, the abbreviation is limited to a more basic character set which includes letters, numbers, periods, and hyphens, but not much else. For more complex abbreviations, you must explicitly mark uses of the abbreviation.</p>
<h4 id="citations">Citations</h4>
<p>Citations can be specified using an inline syntax, just like inline footnotes.
If you wish to use BibTeX, then configure the <code>bibtex</code> metadata (required) and
the <code>biblio style</code> metadata (optional).</p>
<p>Citations can be specified using an inline syntax, just like inline footnotes. If you wish to use BibTeX, then configure the <code>bibtex</code> metadata (required) and the <code>biblio style</code> metadata (optional).</p>
<p>The HTML output for citations now uses parentheses instead of brackets, e.g.
<code>(1)</code> instead of <code>[1]</code>.</p>
<p>The HTML output for citations now uses parentheses instead of brackets, e.g. <code>(1)</code> instead of <code>[1]</code>.</p>
<h4 id="criticmarkup">CriticMarkup</h4>
<p><abbr title="MultiMarkdown">MMD</abbr> v6 has improved support for <a href="http://criticmarkup.com/">CriticMarkup</a>, both in terms of parsing, and
in terms of support for each output format. You can <ins>insert text</ins>,
<del>delete text</del>, substitute <del>one thing</del><ins>for another</ins>, <mark>highlight text</mark>,
and <span class="critic comment">leave comments</span> in the text.</p>
<p><abbr title="MultiMarkdown">MMD</abbr> v6 has improved support for <a href="http://criticmarkup.com/">CriticMarkup</a>, both in terms of parsing, and in terms of support for each output format. You can <ins>insert text</ins>, <del>delete text</del>, substitute <del>one thing</del><ins>for another</ins>, <mark>highlight text</mark>, and <span class="critic comment">leave comments</span> in the text.</p>
<p>If you don&#8217;t specify any command line options, then <abbr title="MultiMarkdown">MMD</abbr> will apply special
formatting to the CriticMarkup formatting as in the preceding paragraph.
Alternatively, you can use the <code>-a\--accept</code> or <code>-r\--reject</code> options to cause
<abbr title="MultiMarkdown">MMD</abbr> to accept or reject, respectively, the proposed changes within the CM
markup. When doing this, CM will work across blank lines. Without either of
these two options, then CriticMarkup that spans a blank line is not recognized
as such. I working on options for this for the future.</p>
<p>If you don&#8217;t specify any command line options, then <abbr title="MultiMarkdown">MMD</abbr> will apply special formatting to the CriticMarkup, as in the preceding paragraph. Alternatively, you can use the <code>-a\--accept</code> or <code>-r\--reject</code> options to cause <abbr title="MultiMarkdown">MMD</abbr> to accept or reject, respectively, the proposed changes within the CM markup. When doing this, CM will work across blank lines. Without either of these two options, CriticMarkup that spans a blank line is not recognized as such. I am working on options for this for the future.</p>
<h4 id="embeddedimages">Embedded Images</h4>
<p>Supported export formats (<code>odt</code>, <code>epub</code>, <code>bundle</code>, <code>bundlezip</code>) include
images inside the export document:</p>
<p>Supported export formats (<code>odt</code>, <code>epub</code>, <code>bundle</code>, <code>bundlezip</code>) include images inside the export document:</p>
<ul>
<li>Local images are embedded automatically</li>
<li>Images stored on remote servers are embedded <em>if</em> <a href="https://curl.haxx.se/libcurl/">libCurl</a> is
properly installed when <abbr title="MultiMarkdown">MMD</abbr> is compiled. This is true for macOS builds.</li>
<li>Images stored on remote servers are embedded <em>if</em> <a href="https://curl.haxx.se/libcurl/">libCurl</a> is properly installed when <abbr title="MultiMarkdown">MMD</abbr> is compiled. This is true for macOS builds.</li>
</ul>
<h4 id="emphandstrong">Emph and Strong</h4>
<p>The basics of emphasis and strong emphasis are unchanged, but the parsing
engine has been improved to be more accurate, particularly in various edge
cases where proper parsing can be difficult.</p>
<p>The basics of emphasis and strong emphasis are unchanged, but the parsing engine has been improved to be more accurate, particularly in various edge cases where proper parsing can be difficult.</p>
<h4 id="epub3support">EPUB 3 Support</h4>
<p><abbr title="MultiMarkdown">MMD</abbr> v6 now provides support for direct creation of <a href="https://en.wikipedia.org/wiki/EPUB">EPUB 3</a> files. Previously
a separate tool was required to create EPUB files from <abbr title="MultiMarkdown">MMD</abbr>. It&#8217;s now built-
in. Currently, EPUB 3 files are built using the usual HTML 5 output. No
extra CSS is applied, so the default from the reader will be used. Images are
not yet supported, but are planned for the future.</p>
<p><abbr title="MultiMarkdown">MMD</abbr> v6 now provides support for direct creation of <a href="https://en.wikipedia.org/wiki/EPUB">EPUB 3</a> files. Previously a separate tool was required to create EPUB files from <abbr title="MultiMarkdown">MMD</abbr>. It&#8217;s now built-in. Currently, EPUB 3 files are built using the usual HTML 5 output. No extra CSS is applied, so the default from the reader will be used. Images are not yet supported, but are planned for the future.</p>
<p>EPUB files can be highly customized with other tools, and I recommend doing
that for production quality files. For example, apparently performance is
improved when the content is divided into multiple files (e.g. one file per
chapter). <abbr title="MultiMarkdown">MMD</abbr> creates EPUB 3 files using a single file. Tools like <a href="https://sigil-ebook.com/">Sigil</a>
are useful for improving your EPUB files, and I recommend doing that.</p>
<p>EPUB files can be highly customized with other tools, and I recommend doing that for production quality files. For example, apparently performance is improved when the content is divided into multiple files (e.g. one file per chapter). <abbr title="MultiMarkdown">MMD</abbr> creates EPUB 3 files using a single file. Tools like <a href="https://sigil-ebook.com/">Sigil</a> are useful for improving your EPUB files, and I recommend doing that.</p>
<p>Not all EPUB readers support v3 files. I don&#8217;t plan on adding support for
older versions of the EPUB format, but other tools can convert to other
document formats you need. Same goes for Amazon&#8217;s ebook formats &#8211; the
<a href="https://calibre-ebook.com/">Calibre</a> program can also be used to interconvert between formats.</p>
<p>Not all EPUB readers support v3 files. I don&#8217;t plan on adding support for older versions of the EPUB format, but other tools can convert to other document formats you need. Same goes for Amazon&#8217;s ebook formats &#8211; the <a href="https://calibre-ebook.com/">Calibre</a> program can also be used to interconvert between formats.</p>
<p><strong>NOTE</strong>: Because EPUB documents are binary files, <abbr title="MultiMarkdown">MMD</abbr> only creates them when
run in batch mode (using the <code>-b\--batch</code> options). Otherwise, it simply
outputs the HTML 5 file that would serve as the primary content for the EPUB.</p>
<p><strong>NOTE</strong>: Because EPUB documents are binary files, <abbr title="MultiMarkdown">MMD</abbr> only creates them when run in batch mode (using the <code>-b\--batch</code> options). Otherwise, it simply outputs the HTML 5 file that would serve as the primary content for the EPUB.</p>
<h4 id="fencedcodeblocks">Fenced Code Blocks</h4>
<p>Fenced code blocks are fundamentally the same as <abbr title="MultiMarkdown">MMD</abbr> v5, except:</p>
<ol>
<li><p>The leading and trailing fences can be 3, 4, or 5 backticks in length. That
should be sufficient to account for complex documents without requiring a more
complex parser.</p></li>
<li><p>If there is no trailing fence, then everything after the leading fence is
considered to be part of the code block.</p></li>
<li><p>The leading and trailing fences can be 3, 4, or 5 backticks in length. That should be sufficient to account for complex documents without requiring a more complex parser.</p></li>
<li><p>If there is no trailing fence, then everything after the leading fence is considered to be part of the code block.</p></li>
</ol>
<h4 id="footnotes">Footnotes</h4>
<p>The HTML output for footnotes now uses superscripts instead of brackets, e.g.
<code>&lt;sup&gt;1&lt;/sup&gt;</code> instead of <code>[1]</code>.</p>
<p>The HTML output for footnotes now uses superscripts instead of brackets, e.g. <code>&lt;sup&gt;1&lt;/sup&gt;</code> instead of <code>[1]</code>.</p>
<h4 id="glossaryterms">Glossary Terms</h4>
<p>If there are terms in your document you wish to define in a <a href="#gn:3" id="gnref:3" title="see glossary" class="glossary">glossary</a> at
the end of your document, you can define them using the glossary syntax.</p>
<p>If there are terms in your document you wish to define in a <a href="#gn:3" id="gnref:3" title="see glossary" class="glossary">glossary</a> at the end of your document, you can define them using the glossary syntax.</p>
<p>Glossary terms can be specified using inline or reference syntax. The inline
variant requires that the term be wrapped in parentheses and
immediately follow the <code>?</code>.</p>
<p>Glossary terms can be specified using inline or reference syntax. The inline variant requires that the term be wrapped in parentheses and immediately follow the <code>?</code>.</p>
<pre><code>[?(glossary) The glossary collects information about important
terms used in your document] is a glossary term.
@ -286,49 +173,32 @@ terms used in your document] is a glossary term.
terms used in your document
</code></pre>
<p>Much like abbreviations, there is also a &#8220;shortcut&#8221; method that is similar to
the approach used in prior versions of <abbr title="MultiMarkdown">MMD</abbr>. You specify the definition for
the glossary term in the usual manner, but <abbr title="MultiMarkdown">MMD</abbr> will automatically identify
each instance where the term is used and substitute it automatically. In this
case, the term is limited to a more basic character set which includes
letters, numbers, periods, and hyphens, but not much else. For more complex
glossary terms, you must explicitly mark uses of the term.</p>
<p>Much like abbreviations, there is also a &#8220;shortcut&#8221; method that is similar to the approach used in prior versions of <abbr title="MultiMarkdown">MMD</abbr>. You specify the definition for the glossary term in the usual manner, but <abbr title="MultiMarkdown">MMD</abbr> will automatically identify each instance where the term is used and substitute it automatically. In this case, the term is limited to a more basic character set which includes letters, numbers, periods, and hyphens, but not much else. For more complex glossary terms, you must explicitly mark uses of the term.</p>
<h4 id="htmlcomments">HTML Comments</h4>
<p>Previously, HTML Comments were used by MultiMarkdown to include raw text for
inclusion in the output file. This was useful, but limited, as it could only
work for one output format at a time.</p>
<p>Previously, HTML Comments were used by MultiMarkdown to include raw text for inclusion in the output file. This was useful, but limited, as it could only work for one output format at a time.</p>
<p>HTML Comments are now only included in HTML output, but not in any other
format since they would cause errors.</p>
<p>HTML Comments are now only included in HTML output, but not in any other format since they would cause errors.</p>
<p>Take a look at the <code>HTML Comments.text</code> file in the test suite for a better
understanding of comment blocks vs comment spans, and how they are parsed.</p>
<p>Take a look at the <code>HTML Comments.text</code> file in the test suite for a better understanding of comment blocks vs comment spans, and how they are parsed.</p>
<h4 id="internationalization">Internationalization</h4>
<p><abbr title="MultiMarkdown">MMD</abbr> v6 includes support for substituting certain text phrases in other
languages. This only affects the HTML format.</p>
<p><abbr title="MultiMarkdown">MMD</abbr> v6 includes support for substituting certain text phrases in other languages. This only affects the HTML format.</p>
<h4 id="latexchanges">LaTeX Changes</h4>
<p>LaTeX support is slightly different than in prior versions of <abbr title="MultiMarkdown">MMD</abbr>. It is
designed to be a bit more consistent, and easier for basic use.</p>
<p>LaTeX support is slightly different than in prior versions of <abbr title="MultiMarkdown">MMD</abbr>. It is designed to be a bit more consistent, and easier for basic use.</p>
<p>The previous approach used two types of metadata:</p>
<ul>
<li><p><code>latex input</code> &#8211; this uses the name of a latex file that will be used in a
<code>\input{file}</code> command. This key can be used multiple times (the only
metadata key that worked this way), and all the basic metadata is written to
the LaTeX file in order.</p></li>
<li><p><code>latex footer</code> &#8211; this file worked the same way as <code>latex input</code>, but was
inserted at the end of the file</p></li>
<li><p><code>latex input</code> &#8211; this uses the name of a latex file that will be used in a <code>\input{file}</code> command. This key can be used multiple times (the only metadata key that worked this way), and all the basic metadata is written to the LaTeX file in order.</p></li>
<li><p><code>latex footer</code> &#8211; this file worked the same way as <code>latex input</code>, but was inserted at the end of the file</p></li>
</ul>
<p>In practice, one typically needs to be able to insert <code>\input</code> commands at
only a few key places in the final document:</p>
<p>In practice, one typically needs to be able to insert <code>\input</code> commands at only a few key places in the final document:</p>
<ol>
<li>At the very beginning</li>
@ -339,72 +209,48 @@ only a few key places in the final document:</p>
<p><abbr title="MultiMarkdown">MMD</abbr> 6 standardizes the metadata to use 3 new keys:</p>
<ol>
<li><p><code>latex leader</code> &#8211; this specifies a file that will be used at the very
beginning of the document.</p></li>
<li><p><code>latex begin</code> &#8211; this comes after metadata, and before the body of the
document. This will usually include the <code>\begin{document}</code> command, hence the
name.</p></li>
<li><p><code>latex leader</code> &#8211; this specifies a file that will be used at the very beginning of the document.</p></li>
<li><p><code>latex begin</code> &#8211; this comes after metadata, and before the body of the document. This will usually include the <code>\begin{document}</code> command, hence the name.</p></li>
<li><p><code>latex footer</code> &#8211; this comes after the body of the document.</p></li>
</ol>
<p>You can use these 3 keys to replace the old <code>latex input</code> metadata keys, as
long as you pay attention as to which is which. If you used more than three
include statements, you may have to combine your latex files to fit into the
new system.</p>
<p>You can use these 3 keys to replace the old <code>latex input</code> metadata keys, as long as you pay attention as to which is which. If you used more than three include statements, you may have to combine your latex files to fit into the new system.</p>
<p><strong><em>In addition</em></strong>, there is a new shortcut key &#8211; <code>latex config</code>. This allows
you to specify a &#8220;document name&#8221; that is used to automatically identify the
corresponding <code>latex leader</code>, <code>latex begin</code>, and <code>latex footer</code> files. For
example, using <code>latex config: article</code> is the same as using:</p>
<p><strong><em>In addition</em></strong>, there is a new shortcut key &#8211; <code>latex config</code>. This allows you to specify a &#8220;document name&#8221; that is used to automatically identify the corresponding <code>latex leader</code>, <code>latex begin</code>, and <code>latex footer</code> files. For example, using <code>latex config: article</code> is the same as using:</p>
<pre><code>latex leader: mmd6-article-leader
latex begin: mmd6-article-begin
latex footer: mmd6-article-footer
</code></pre>
<p>Using the new system will require migrating your old configuration to the new
naming convention, but once done I believe it should be much more intuitive to
use.</p>
<p>Using the new system will require migrating your old configuration to the new naming convention, but once done I believe it should be much more intuitive to use.</p>
<p>The LaTeX support files included with the <abbr title="MultiMarkdown">MMD</abbr> v6 repository support the use of
the following <code>latex config</code> values by default:</p>
<p>The LaTeX support files included with the <abbr title="MultiMarkdown">MMD</abbr> v6 repository support the use of the following <code>latex config</code> values by default:</p>
<ul>
<li><code>article</code></li>
<li><code>beamer</code></li>
<li><code>letterhead</code></li>
<li><code>manuscript</code></li>
<li><code>memoir-book</code></li>
<li><code>tufte-book</code></li>
<li><code>tufte-handout</code></li>
</ul>
<p><strong>NOTE</strong>: You do have to install the <abbr title="MultiMarkdown">MMD</abbr> support files into the proper
location for your system. I would like to make this easier, but haven&#8217;t found
the best configuration yet.</p>
<p><strong>NOTE</strong>: You do have to install the <abbr title="MultiMarkdown">MMD</abbr> support files into the proper location for your system. I would like to make this easier, but haven&#8217;t found the best configuration yet.</p>
<h4 id="metadata">Metadata</h4>
<p>Metadata in <abbr title="MultiMarkdown">MMD</abbr> v6 includes new support for LaTeX &#8211; the <code>latex config</code> key
allows you to automatically set up multiple <code>latex include</code> files at once.
The default setups that I use would typically consist of one LaTeX file to be
included at the top of the file, one to be included right at the beginning of
the document, and one to be included at the end of the document. If you want
to specify the latex files separately, you can use <code>latex leader</code>, <code>latex
begin</code>, and <code>latex footer</code>.</p>
<p>Metadata in <abbr title="MultiMarkdown">MMD</abbr> v6 includes new support for LaTeX &#8211; the <code>latex config</code> key allows you to automatically set up multiple <code>latex include</code> files at once. The default setups that I use would typically consist of one LaTeX file to be included at the top of the file, one to be included right at the beginning of the document, and one to be included at the end of the document. If you want to specify the latex files separately, you can use <code>latex leader</code>, <code>latex begin</code>, and <code>latex footer</code>.</p>
<p>There are new metadata keys for controlling internationalization:</p>
<ul>
<li><p><code>language</code> &#8211; specify the content language for a document, using the two
letter code for the language (e.g. <code>en</code> for English). Where possible, this
will also set the default <code>quotes language</code>.</p></li>
<li><p><code>quotes language</code> &#8211; specify which variant of smart quotes to use. Valid
options are <code>dutch</code>, <code>french</code>, <code>german</code>, <code>germanguillemets</code>, <code>swedish</code>, <code>nl</code>,
<code>fr</code>, <code>de</code>, <code>sv</code>. Anything else defaults to English.</p></li>
<li><p><code>language</code> &#8211; specify the content language for a document, using the two letter code for the language (e.g. <code>en</code> for English). Where possible, this will also set the default <code>quotes language</code>.</p></li>
<li><p><code>quotes language</code> &#8211; specify which variant of smart quotes to use. Valid options are <code>dutch</code>, <code>french</code>, <code>german</code>, <code>germanguillemets</code>, <code>swedish</code>, <code>nl</code>, <code>fr</code>, <code>de</code>, <code>sv</code>. Anything else defaults to English.</p></li>
</ul>
<p>Additionally, the <code>MMD Header</code> and <code>MMD Footer</code> metadata work slightly
differently. In v5, these fields were used to list names of files that should
be transcluded before and after the main body. In v6, these fields represent
the actual text to be inserted. If you want them to reference separate files,
use the transclusion functionality:</p>
<p>Additionally, the <code>MMD Header</code> and <code>MMD Footer</code> metadata work slightly differently. In v5, these fields were used to list names of files that should be transcluded before and after the main body. In v6, these fields represent the actual text to be inserted. If you want them to reference separate files, use the transclusion functionality:</p>
<pre><code>Title: Some Title
MMD Header: This is *MMD* text.
@ -413,40 +259,27 @@ MMD Footer: {{footer.txt}}
<h4 id="outputformats">Output Formats</h4>
<p>MultiMarkdown 6 supports the following output formats, using the <code>-t</code>
command-line argument:</p>
<p>MultiMarkdown 6 supports the following output formats, using the <code>-t</code> command-line argument:</p>
<ul>
<li><code>html</code> &#8211; (Default) create HTML 5</li>
<li><code>latex</code> &#8211; create <a href="https://en.wikipedia.org/wiki/LaTeX">LaTeX</a> for conversion to PDF using high quality
typography</li>
<li><code>beamer</code> and <code>memoir</code> &#8211; two additional LaTeX variants for creating
slide presentations and longer documents, respectively</li>
<li><code>mmd</code> &#8211; output the <abbr title="MultiMarkdown">MMD</abbr> text before converting to another format,
but after performing transclusion. This format is not generally needed.</li>
<li><code>odt</code> &#8211; OpenDocument text file, used by OpenOffice and compatible
word processors. Images are embedded inside the file package.</li>
<li><code>fodt</code> &#8211; OpenDocument text variant using a single text (XML) file
instead of a compressed zip file. Images are not embedded in this format.</li>
<li><code>epub</code> &#8211; EPUB 3 ebook format. Images and CSS are embedded in the
file package.</li>
<li><code>bundle</code> &#8211; [TextBundle] format consisting of Markdown/MultiMarkdown
text file and embedded images and CSS. Useful for sharing Markdown files
and images between applications (on any OS, but especially on iOS)</li>
<li><code>bundlezip</code> &#8211; TextPack variant of the TextBundle format &#8211; the file
package is compressed to a single zip file (similar to EPUB and ODT
formats).</li>
<li><code>latex</code> &#8211; create <a href="https://en.wikipedia.org/wiki/LaTeX">LaTeX</a> for conversion to PDF using high quality typography</li>
<li><code>beamer</code> and <code>memoir</code> &#8211; two additional LaTeX variants for creating slide presentations and longer documents, respectively</li>
<li><code>mmd</code> &#8211; output the <abbr title="MultiMarkdown">MMD</abbr> text before converting to another format, but after performing transclusion. This format is not generally needed.</li>
<li><code>odt</code> &#8211; OpenDocument text file, used by OpenOffice and compatible word processors. Images are embedded inside the file package.</li>
<li><code>fodt</code> &#8211; OpenDocument text variant using a single text (XML) file instead of a compressed zip file. Images are not embedded in this format.</li>
<li><code>epub</code> &#8211; EPUB 3 ebook format. Images and CSS are embedded in the file package.</li>
<li><code>opml</code> &#8211; <a href="http://en.wikipedia.org/wiki/OPML">OPML</a> is a standard file format used for a wide range of outlining programs. This allows you to use a single file for editing MultiMarkdown text and for outlining longer documents. <a href="https://multimarkdown.com/">MultiMarkdown Composer</a> can read/write the OPML format, making it easy to share documents with other programs.</li>
<li><code>itmz</code> &#8211; ITMZ is the file format used for the <a href="http://www.ithoughts.co.uk/">iThoughts</a> mind mapping software (macOS, iOS, Windows). Much like OPML, this format allows you to use a single file for your outlining/brainstorming and final production. <a href="https://multimarkdown.com/">MultiMarkdown Composer</a> can read/write this format as well, giving you additional flexibility.</li>
<li><code>bundle</code> &#8211; [TextBundle] format consisting of Markdown/MultiMarkdown text file and embedded images and CSS. Useful for sharing Markdown files and images between applications (on any OS, but especially on iOS)</li>
<li><code>bundlezip</code> &#8211; TextPack variant of the TextBundle format &#8211; the file package is compressed to a single zip file (similar to EPUB and ODT formats).</li>
</ul>
<h4 id="rawsource">Raw Source</h4>
<p>In older versions of MultiMarkdown you could use an HTML comment to pass raw
LaTeX or other content to the final document. This worked reasonably well,
but was limited and didn&#8217;t work well when exporting to multiple formats. It
was time for something new.</p>
<p>In older versions of MultiMarkdown you could use an HTML comment to pass raw LaTeX or other content to the final document. This worked reasonably well, but was limited and didn&#8217;t work well when exporting to multiple formats. It was time for something new.</p>
<p><abbr title="MultiMarkdown">MMD</abbr> v6 offers a new feature to handle this. Code spans and code blocks can
be flagged as representing raw source:</p>
<p><abbr title="MultiMarkdown">MMD</abbr> v6 offers a new feature to handle this. Code spans and code blocks can be flagged as representing raw source:</p>
<pre><code>foo `*bar*`{=html}
@ -461,7 +294,7 @@ be flagged as representing raw source:</p>
<ul>
<li><code>html</code></li>
<li><code>odt</code></li>
<li><code>odt</code> &#8211; for ODT and FODT</li>
<li><code>epub</code></li>
<li><code>latex</code></li>
<li><code>*</code> &#8211; wildcard matches any output format</li>
@ -469,25 +302,17 @@ be flagged as representing raw source:</p>
<h4 id="tableofcontents">Table of Contents</h4>
<p>By placing <code>{{TOC}}</code> in your document, you can insert an automatically
generated Table of Contents in your document. As of <abbr title="MultiMarkdown">MMD</abbr> v6, the native
Table of Contents functionality is used when exporting to LaTeX or
OpenDocument formats.</p>
<p>By placing <code>{{TOC}}</code> in your document, you can insert an automatically generated Table of Contents in your document. As of <abbr title="MultiMarkdown">MMD</abbr> v6, the native Table of Contents functionality is used when exporting to LaTeX or OpenDocument formats.</p>
<h4 id="tables">Tables</h4>
<p>Tables in MultiMarkdown-6 work basically the same as before, but a caption, if
present, must come <em>after</em> the body of the table, not <em>before</em>.</p>
<p>Tables in MultiMarkdown-6 work basically the same as before, but a caption, if present, must come <em>after</em> the body of the table, not <em>before</em>.</p>
<h4 id="transclusion">Transclusion</h4>
<p>File transclusion works basically the same way &#8211; <code>{{file}}</code> is used to
indicate a file that needs to be transcluded. <code>{{file.*}}</code> allows for
wildcard transclusion. What&#8217;s different is that the way search paths are
handled is more flexible, though it may take a moment to understand.</p>
<p>File transclusion works basically the same way &#8211; <code>{{file}}</code> is used to indicate a file that needs to be transcluded. <code>{{file.*}}</code> allows for wildcard transclusion. What&#8217;s different is that the way search paths are handled is more flexible, though it may take a moment to understand.</p>
<p>When you process a file with <abbr title="MultiMarkdown">MMD</abbr>, it uses that file&#8217;s directory as the search
path for included files. For example:</p>
<p>When you process a file with <abbr title="MultiMarkdown">MMD</abbr>, it uses that file&#8217;s directory as the search path for included files. For example:</p>
<table>
<colgroup>
@ -523,18 +348,14 @@ path for included files. For example:</p>
</tbody>
</table>
<p>This is the same as <abbr title="MultiMarkdown">MMD</abbr> v5. What&#8217;s different is that when you transclude a
file, the search path stays the same as the &#8220;parent&#8221; file, <strong>UNLESS</strong> you use
the <code>transclude base</code> metadata to override it. The simplest override is:</p>
<p>This is the same as <abbr title="MultiMarkdown">MMD</abbr> v5. What&#8217;s different is that when you transclude a file, the search path stays the same as the &#8220;parent&#8221; file, <strong>UNLESS</strong> you use the <code>transclude base</code> metadata to override it. The simplest override is:</p>
<pre><code>transclude base: .
</code></pre>
<p>This means that any transclusions within the file will be calculated relative
to the file, regardless of the original search path.</p>
<p>This means that any transclusions within the file will be calculated relative to the file, regardless of the original search path.</p>
<p>Alternatively you could specify that any transclusion happens inside a
subfolder:</p>
<p>Alternatively you could specify that any transclusion happens inside a subfolder:</p>
<pre><code>transclude base: folder/
</code></pre>
@ -544,65 +365,37 @@ subfolder:</p>
<pre><code>transclude base: /some/path
</code></pre>
<p>This flexibility means that you can transclude different files based on
whether a file is being processed by itself or as part of a &#8220;parent&#8221; file.
This can be useful when a particular file can either be a standalone document,
or a chapter inside a larger document.</p>
<p>This flexibility means that you can transclude different files based on whether a file is being processed by itself or as part of a &#8220;parent&#8221; file. This can be useful when a particular file can either be a standalone document, or a chapter inside a larger document.</p>
<h3 id="developernotes">Developer Notes</h3>
<p>If you&#8217;re using <abbr title="MultiMarkdown">MMD</abbr> as a library in another application, there are a few
things to be aware of.</p>
<p>If you&#8217;re using <abbr title="MultiMarkdown">MMD</abbr> as a library in another application, there are a few things to be aware of.</p>
<h4 id="objectpools">Object Pools</h4>
<p>To improve performance, <abbr title="MultiMarkdown">MMD</abbr> has the option to allocate the memory for the
tokens used in parsing in large chunks (&#8220;object pools&#8221;). Allocating a single
large chunk of memory is more efficient than allocating many smaller chunks.
However, this does complicate memory management.</p>
<p>To improve performance, <abbr title="MultiMarkdown">MMD</abbr> has the option to allocate the memory for the tokens used in parsing in large chunks (&#8220;object pools&#8221;). Allocating a single large chunk of memory is more efficient than allocating many smaller chunks. However, this does complicate memory management.</p>
<p>By default <code>token.h</code> defines <code>kUseObjectPool</code> which enables this performance
improvement. This does require more caution with the way that memory is
managed. (See <code>main.c</code> for an example of how the object pool is allocated and
drained.) I recommend disabling object pools unless you really understand C
memory management, and understand MultiMarkdown&#8217;s program flow. Failure to
properly manage the object pool can lead to massive memory leaks, freeing
memory before that is still in use, or other potential problems.</p>
<p>By default <code>token.h</code> defines <code>kUseObjectPool</code> which enables this performance improvement. This does require more caution with the way that memory is managed. (See <code>main.c</code> for an example of how the object pool is allocated and drained.) I recommend disabling object pools unless you really understand C memory management, and understand MultiMarkdown&#8217;s program flow. Failure to properly manage the object pool can lead to massive memory leaks, freeing memory that is still in use, or other potential problems.</p>
<h4 id="htmlbooleanattributes">HTML Boolean Attributes</h4>
<p>Most HTML attributes are of the key-value type (e.g. <code>key=&quot;value&quot;</code>). But some
less frequently used attributes are boolean attributes (e.g. <code>&lt;video
controls&gt;</code>). Properly distinguishing HTML from other uses of the <code>&lt;</code>
character requires matching both types under certain circumstances.</p>
<p>Most HTML attributes are of the key-value type (e.g. <code>key=&quot;value&quot;</code>). But some less frequently used attributes are boolean attributes (e.g. <code>&lt;video controls&gt;</code>). Properly distinguishing HTML from other uses of the <code>&lt;</code> character requires matching both types under certain circumstances.</p>
<p>There are some trade-offs to be made:</p>
<ul>
<li><p>Performance when compiling MultiMarkdown</p></li>
<li><p>Performance when processing parts of documents that are <em>not</em> HTML</p></li>
<li><p>Accuracy when matching HTML</p></li>
<li>Performance when compiling MultiMarkdown</li>
<li>Performance when processing parts of documents that are <em>not</em> HTML</li>
<li>Accuracy when matching HTML</li>
</ul>
<p>So far, there seem to be four main approaches:</p>
<ul>
<li><p>Ignore boolean attributes &#8211; this is how <abbr title="MultiMarkdown">MMD</abbr>-6 started. This is fast, but
not accurate for some users. Several users found issues with the <code>&lt;video&gt;</code> tag
when <abbr title="MultiMarkdown">MMD</abbr> was used in HTML heavy documents.</p></li>
<li><p>Use regexp to match all boolean attributes. This is fast to compile, but
adds roughly 5&#8211;8% processing time (probably due to false positive HTML
matches). This <em>may</em> cause some text to be classified as HTML when it
shouldn&#8217;t.</p></li>
<li><p>Explicitly match all possible boolean attributes &#8211; This would presumably be
relatively fast when processing (due to the nature of re2c lexers), but it may
be prohibitively slow to compile for some users. As someone who compiles <abbr title="MultiMarkdown">MMD</abbr>
frequently, it is too slow to compile be useful for me during development.</p></li>
<li><p>Use a hand-curated list of boolean attributes that are most commonly used &#8211;
this does not incur much of a performance hit when parsing, and compiles
faster than the complete list of all boolean attributes. For now, this is the
option I have chosen as default for <abbr title="MultiMarkdown">MMD</abbr> &#8211; it seems to be a reasonable trade-
off. I will continue to research additional options.</p></li>
<li><p>Ignore boolean attributes &#8211; this is how <abbr title="MultiMarkdown">MMD</abbr>-6 started. This is fast, but not accurate for some users. Several users found issues with the <code>&lt;video&gt;</code> tag when <abbr title="MultiMarkdown">MMD</abbr> was used in HTML heavy documents.</p></li>
<li><p>Use regexp to match all boolean attributes. This is fast to compile, but adds roughly 5&#8211;8% processing time (probably due to false positive HTML matches). This <em>may</em> cause some text to be classified as HTML when it shouldn&#8217;t.</p></li>
<li><p>Explicitly match all possible boolean attributes &#8211; This would presumably be relatively fast when processing (due to the nature of re2c lexers), but it may be prohibitively slow to compile for some users. As someone who compiles <abbr title="MultiMarkdown">MMD</abbr> frequently, it is too slow to compile for it to be usable by me during development.</p></li>
<li><p>Use a hand-curated list of boolean attributes that are most commonly used &#8211; this does not incur much of a performance hit when parsing, and compiles faster than the complete list of all boolean attributes. For now, this is the option I have chosen as default for <abbr title="MultiMarkdown">MMD</abbr> &#8211; it seems to be a reasonable trade-off. I will continue to research additional options.</p></li>
</ul>
<h3 id="futuresteps">Future Steps</h3>
@ -610,11 +403,7 @@ off. I will continue to research additional options.</p></li>
<p>Some features I plan to implement at some point:</p>
<ol>
<li>OPML export support is not available in v6. I plan on adding improved
support for this at some point. I was hoping to be able to re-use the
existing v6 parser but it might be simpler to use the approach from v5 and
earlier, which was to have a separate parser tuned to only identify headers
and &#8220;stuff between headers&#8221;.</li>
<li><del>OPML export support is not available in v6. I plan on adding improved support for this at some point. I was hoping to be able to re-use the existing v6 parser but it might be simpler to use the approach from v5 and earlier, which was to have a separate parser tuned to only identify headers and &#8220;stuff between headers&#8221;.</del><span class="critic comment">OPML read/write support implemented.</span></li>
</ol>
<div class="glossary">
@ -630,8 +419,7 @@ AST: <p>Abstract Syntax Tree <a href="https://en.wikipedia.org/wiki/Abstract_syn
</li>
<li id="gn:3">
glossary: <p>The
glossary collects information about important terms used in your document <a href="#gnref:3" title="return to body" class="reverseglossary">&#160;&#8617;</a></p>
glossary: <p>The glossary collects information about important terms used in your document <a href="#gnref:3" title="return to body" class="reverseglossary">&#160;&#8617;</a></p>
</li>
</ol>


@ -11,246 +11,132 @@ uuid: 0d6313fa-9135-477e-9c14-7d62c1977833
# Introduction #
Version: [%version]
Version: [%version]
This document serves as a description of MMD v6, as well as a sample
document to demonstrate the various features. Specifically, differences from
MMD v5 will be pointed out.
This document serves as a description of MMD v6, as well as a sample document to demonstrate the various features. Specifically, differences from MMD v5 will be pointed out.
# Performance #
A big motivating factor leading to the development of MMD v6 was
performance. When MMD first migrated from Perl to C (based on [peg-
markdown](https://github.com/jgm/peg-markdown)), it was among the fastest
Markdown parsers available. That was many years ago, and the "competition"
has made a great deal of progress since that time.
A big motivating factor leading to the development of MMD v6 was performance. When MMD first migrated from Perl to C (based on [peg-markdown](https://github.com/jgm/peg-markdown)), it was among the fastest Markdown parsers available. That was many years ago, and the "competition" has made a great deal of progress since that time.
When developing MMD v6, one of my goals was to keep MMD at least in the
ballpark of the fastest processors. Of course, being *the* fastest would be
fantastic, but I was more concerned with ensuring that the code was easily
understood, and easily updated with new features in the future.
When developing MMD v6, one of my goals was to keep MMD at least in the ballpark of the fastest processors. Of course, being *the* fastest would be fantastic, but I was more concerned with ensuring that the code was easily understood, and easily updated with new features in the future.
MMD v3 -- v5 used a PEG to handle the parsing. This made it easy to
understand the relationship between the MMD grammar and the parsing code,
since they were one and the same. However, the parsing code generated by
the parsers was not particularly fast, and was prone to troublesome edge
cases with terrible performance characteristics.
MMD v3 -- v5 used a PEG to handle the parsing. This made it easy to understand the relationship between the MMD grammar and the parsing code, since they were one and the same. However, the parsing code generated by the parsers was not particularly fast, and was prone to troublesome edge cases with terrible performance characteristics.
The first step in MMD v6 parsing is to break the source text into a series
of tokens, which may consist of plain text, whitespace, or special characters
such as '*', '[', etc. This chain of tokens is then used to perform the
actual parsing.
The first step in MMD v6 parsing is to break the source text into a series of tokens, which may consist of plain text, whitespace, or special characters such as '*', '[', etc. This chain of tokens is then used to perform the actual parsing.
MMD v6 divides the parsing into two separate phases, which actually fits
more with Markdown's design philosophically.
MMD v6 divides the parsing into two separate phases, which actually fits more with Markdown's design philosophically.
1. Block parsing consists of identifying the "type" of each line of the
source text, and grouping the lines into blocks (e.g. paragraphs, lists,
blockquotes, etc.) Some blocks are a single line (e.g. ATX headers), and
others can be many lines long. The block parsing in MMD v6 is handled
by a parser generated by [lemon](http://www.hwaci.com/sw/lemon/). This
parser allows the block structure to be more readily understood by
non-programmers, but the generated parser is still fast.
1. Block parsing consists of identifying the "type" of each line of the source text, and grouping the lines into blocks (e.g. paragraphs, lists, blockquotes, etc.) Some blocks are a single line (e.g. ATX headers), and others can be many lines long. The block parsing in MMD v6 is handled by a parser generated by [lemon](http://www.hwaci.com/sw/lemon/). This parser allows the block structure to be more readily understood by non-programmers, but the generated parser is still fast.
2. Span parsing consists of identifying Markdown/MMD structures that occur inside of blocks, such as links, images, strong, emph, etc. Most of these structures require matching pairs of tokens to specify where the span starts and where it ends. Most of these spans allow arbitrary levels of nesting as well. This made parsing them correctly in the PEG-based code difficult and slow. MMD v6 uses a different approach that is accurate and has good performance characteristics even with edge cases. Basically, it keeps a stack of each "opening" token as it steps through the token chain. When a "closing" token is found, it is paired with the most recent appropriate opener on the stack. Any tokens in between the opener and closer are removed, as they are not able to be matched any more. To avoid unnecessary searches for non-existent openers, the parser keeps track of which opening tokens have been discovered. This allows the parser to continue moving forwards without having to go backwards and re-parse any previously visited tokens.
2. Span parsing consists of identifying Markdown/MMD structures that occur
inside of blocks, such as links, images, strong, emph, etc. Most of these
structures require matching pairs of tokens to specify where the span starts
and where it ends. Most of these spans allow arbitrary levels of nesting as
well. This made parsing them correctly in the PEG-based code difficult and
slow. MMD v6 uses a different approach that is accurate and has good
performance characteristics even with edge cases. Basically, it keeps a stack
of each "opening" token as it steps through the token chain. When a "closing"
token is found, it is paired with the most recent appropriate opener on the
stack. Any tokens in between the opener and closer are removed, as they are
not able to be matched any more. To avoid unnecessary searches for non-
existent openers, the parser keeps track of which opening tokens have been
discovered. This allows the parser to continue moving forwards without having
to go backwards and re-parse any previously visited tokens.
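The opener-stack approach described above can be sketched roughly as follows. This is a minimal Python illustration, not MMD's actual C implementation: the token representation, the `PAIRS` table, and the name `pair_spans` are all hypothetical, and it assumes distinct opener and closer characters (real emphasis markers such as `*` can act as either, which needs extra rules).

```python
# Hypothetical closer -> opener table; real MMD tokens are richer than this.
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def pair_spans(tokens):
    """Return (opener_index, closer_index) pairs found in one forward pass."""
    stack = []   # indices of openers still waiting for a closer
    seen = {}    # opener kind -> how many are currently on the stack
    pairs = []

    for i, tok in enumerate(tokens):
        if tok in OPENERS:
            stack.append(i)
            seen[tok] = seen.get(tok, 0) + 1
        elif tok in PAIRS:
            want = PAIRS[tok]
            # Skip the search entirely when no matching opener is pending.
            if seen.get(want, 0) == 0:
                continue
            # Pop back to the most recent matching opener; anything popped
            # along the way sat between opener and closer and can no longer
            # be matched.
            while stack:
                j = stack.pop()
                seen[tokens[j]] -= 1
                if tokens[j] == want:
                    pairs.append((j, i))
                    break
    return pairs
```

Each opener is pushed and popped at most once, so the pass never revisits earlier tokens, which is the property the description above relies on.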
The result of this redesigned MMD parser is that it can parse short documents more quickly than [CommonMark](http://commonmark.org/), and takes only 15% -- 20% longer to parse long documents. I have not delved too deeply into this, but I presume that CommonMark has a bit more "set-up" time that becomes expensive when parsing a short document (e.g. a paragraph or two). But this cost becomes negligible when parsing longer documents (e.g. file sizes of 1 MB). So depending on your use case, CommonMark may well be faster than MMD, but we're talking about splitting hairs here.... Recent comparisons show MMD v6 taking approximately 4.37 seconds to parse a 108 MB file (approximately 24.8 MB/second), and CommonMark took 3.72 seconds for the same file (29.2 MB/second). For comparison, MMD v5.4 took approximately 94 seconds for the same file (1.15 MB/second).
The result of this redesigned MMD parser is that it can parse short
documents more quickly than [CommonMark](http://commonmark.org/), and takes
only 15% -- 20% longer to parse long documents. I have not delved too deeply
into this, but I presume that CommonMark has a bit more "set-up" time that
becomes expensive when parsing a short document (e.g. a paragraph or two). But
this cost becomes negligible when parsing longer documents (e.g. file sizes of
1 MB). So depending on your use case, CommonMark may well be faster than
MMD, but we're talking about splitting hairs here.... Recent comparisons
show MMD v6 taking approximately 4.37 seconds to parse a 108 MB file
(approximately 24.8 MB/second), and CommonMark took 3.72 seconds for the same
file (29.2 MB/second). For comparison, MMD v5.4 took approximately 94
seconds for the same file (1.15 MB/second).
For a more realistic file of approx 28 kb (the source of the Markdown Syntax web page), both MMD and CommonMark parse it too quickly to accurately measure. In fact, it requires a file consisting of the original file copied 32 times over (0.85 MB) before `/usr/bin/env time` reports a time over the minimum threshold of 0.01 seconds for either program.
For a more realistic file of approx 28 kb (the source of the Markdown Syntax
web page), both MMD and CommonMark parse it too quickly to accurately
measure. In fact, it requires a file consisting of the original file copied
32 times over (0.85 MB) before `/usr/bin/env time` reports a time over the
minimum threshold of 0.01 seconds for either program.
There is still potentially room for additional optimization in MMD.
However, even if I can't close the performance gap with CommonMark on longer
files, the additional features of MMD compared with Markdown in addition to
the increased legibility of the source code of MMD (in my biased opinion
anyway) make this project worthwhile.
There is still potentially room for additional optimization in MMD. However, even if I can't close the performance gap with CommonMark on longer files, the additional features of MMD compared with Markdown in addition to the increased legibility of the source code of MMD (in my biased opinion anyway) make this project worthwhile.
# Parse Tree #
MMD v6 performs its parsing in the following steps:
MMD v6 performs its parsing in the following steps:
1. Start with a null-terminated string of source text (C style string)
1. Start with a null-terminated string of source text (C style string)
2. Lex string into token chain
2. Lex string into token chain
3. Parse token chain into blocks
3. Parse token chain into blocks
4. Parse tokens within each block into span level structures (e.g. strong,
emph, etc.)
4. Parse tokens within each block into span level structures (e.g. strong, emph, etc.)
5. Export the token tree into the desired output format (e.g. HTML, LaTeX,
etc.) and return the resulting C style string
5. Export the token tree into the desired output format (e.g. HTML, LaTeX, etc.) and return the resulting C style string
**OR**
**OR**
6. Use the resulting token tree for your own purposes.
6. Use the resulting token tree for your own purposes.
The token tree ([?AST]) includes starting offsets and length of each token,
allowing you to use MMD as part of a syntax highlighter. MMD v5 did not
have this functionality in the public version, in part because the PEG parsers
used did not provide reliable offset positions, requiring a great deal of
effort when I adapted MMD for use in [MultiMarkdown
Composer](http://multimarkdown.com/).
The token tree ([?AST]) includes starting offsets and length of each token, allowing you to use MMD as part of a syntax highlighter. MMD v5 did not have this functionality in the public version, in part because the PEG parsers used did not provide reliable offset positions, requiring a great deal of effort when I adapted MMD for use in [MultiMarkdown Composer](http://multimarkdown.com/).
These steps are managed using the `mmd_engine` "object". An individual
`mmd_engine` cannot be used by multiple threads simultaneously, so if
libMultiMarkdown is to be used in a multithreaded program, a separate
`mmd_engine` should be created for each thread. Alternatively, just use the
slightly more abstracted `mmd_convert_string()` function that handles creating
and destroying the `mmd_engine` automatically.
These steps are managed using the `mmd_engine` "object". An individual `mmd_engine` cannot be used by multiple threads simultaneously, so if libMultiMarkdown is to be used in a multithreaded program, a separate `mmd_engine` should be created for each thread. Alternatively, just use the slightly more abstracted `mmd_convert_string()` function that handles creating and destroying the `mmd_engine` automatically.
# Features #
## Abbreviations (Or Acronyms) ##
This file includes the use of MMD as an abbreviation for MultiMarkdown. The
abbreviation will be expanded on the first use, and the shortened form will be
used on subsequent occurrences.
This file includes the use of MMD as an abbreviation for MultiMarkdown. The abbreviation will be expanded on the first use, and the shortened form will be used on subsequent occurrences.
Abbreviations can be specified using inline or reference syntax. The inline
variant requires that the abbreviation be wrapped in parentheses and
immediately follow the `>`.
Abbreviations can be specified using inline or reference syntax. The inline variant requires that the abbreviation be wrapped in parentheses and immediately follow the `>`.
[>MMD] is an abbreviation. So is [>(MD) Markdown].
[>MMD]: MultiMarkdown
There is also a "shortcut" method for abbreviations that is similar to the
approach used in prior versions of MMD. You specify the definition for the
abbreviation in the usual manner, but MMD will automatically identify each
instance where the abbreviation is used and substitute it automatically. In
this case, the abbreviation is limited to a more basic character set which
includes letters, numbers, periods, and hyphens, but not much else. For more
complex abbreviations, you must explicitly mark uses of the abbreviation.
There is also a "shortcut" method for abbreviations that is similar to the approach used in prior versions of MMD. You specify the definition for the abbreviation in the usual manner, but MMD will automatically identify each instance where the abbreviation is used and substitute it automatically. In this case, the abbreviation is limited to a more basic character set which includes letters, numbers, periods, and hyphens, but not much else. For more complex abbreviations, you must explicitly mark uses of the abbreviation.
## Citations ##
Citations can be specified using an inline syntax, just like inline footnotes.
If you wish to use BibTeX, then configure the `bibtex` metadata (required) and
the `biblio style` metadata (optional).
Citations can be specified using an inline syntax, just like inline footnotes. If you wish to use BibTeX, then configure the `bibtex` metadata (required) and the `biblio style` metadata (optional).
The HTML output for citations now uses parentheses instead of brackets, e.g.
`(1)` instead of `[1]`.
The HTML output for citations now uses parentheses instead of brackets, e.g. `(1)` instead of `[1]`.
## CriticMarkup ##
MMD v6 has improved support for [CriticMarkup], both in terms of parsing, and
in terms of support for each output format. You can {++insert text++},
{--delete text--}, substitute {~~one thing~>for another~~}, {==highlight text==},
and {>>leave comments<<} in the text.
MMD v6 has improved support for [CriticMarkup], both in terms of parsing, and in terms of support for each output format. You can {++insert text++}, {--delete text--}, substitute {~~one thing~>for another~~}, {==highlight text==}, and {>>leave comments<<} in the text.
If you don't specify any command line options, then MMD will apply special
formatting to the CriticMarkup formatting as in the preceding paragraph.
Alternatively, you can use the `-a\--accept` or `-r\--reject` options to cause
MMD to accept or reject, respectively, the proposed changes within the CM
markup. When doing this, CM will work across blank lines. Without either of
these two options, then CriticMarkup that spans a blank line is not recognized
as such. I working on options for this for the future.
If you don't specify any command line options, then MMD will apply special formatting to the CriticMarkup formatting as in the preceding paragraph. Alternatively, you can use the `-a\--accept` or `-r\--reject` options to cause MMD to accept or reject, respectively, the proposed changes within the CM markup. When doing this, CM will work across blank lines. Without either of these two options, then CriticMarkup that spans a blank line is not recognized as such. I am working on options for this for the future.
## Embedded Images ##
Supported export formats (`odt`, `epub`, `bundle`, `bundlezip`) include
images inside the export document:
Supported export formats (`odt`, `epub`, `bundle`, `bundlezip`) include images inside the export document:
* Local images are embedded automatically
* Images stored on remote servers are embedded *if* [libCurl] is
properly installed when MMD is compiled. This is true for macOS builds.
* Local images are embedded automatically
* Images stored on remote servers are embedded *if* [libCurl] is properly installed when MMD is compiled. This is true for macOS builds.
[libCurl]: https://curl.haxx.se/libcurl/
## Emph and Strong ##
The basics of emphasis and strong emphasis are unchanged, but the parsing
engine has been improved to be more accurate, particularly in various edge
cases where proper parsing can be difficult.
The basics of emphasis and strong emphasis are unchanged, but the parsing engine has been improved to be more accurate, particularly in various edge cases where proper parsing can be difficult.
## EPUB 3 Support ##
MMD v6 now provides support for direct creation of [EPUB 3] files. Previously
a separate tool was required to create EPUB files from MMD. It's now built-
in. Currently, EPUB 3 files are built using the usual HTML 5 output. No
extra CSS is applied, so the default from the reader will be used. Images are
not yet supported, but are planned for the future.
MMD v6 now provides support for direct creation of [EPUB 3] files. Previously a separate tool was required to create EPUB files from MMD. It's now built-in. Currently, EPUB 3 files are built using the usual HTML 5 output. No extra CSS is applied, so the default from the reader will be used. Images are not yet supported, but are planned for the future.
EPUB files can be highly customized with other tools, and I recommend doing
that for production quality files. For example, apparently performance is
improved when the content is divided into multiple files (e.g. one file per
chapter). MMD creates EPUB 3 files using a single file. Tools like [Sigil]
are useful for improving your EPUB files, and I recommend doing that.
EPUB files can be highly customized with other tools, and I recommend doing that for production quality files. For example, apparently performance is improved when the content is divided into multiple files (e.g. one file per chapter). MMD creates EPUB 3 files using a single file. Tools like [Sigil] are useful for improving your EPUB files, and I recommend doing that.
Not all EPUB readers support v3 files. I don't plan on adding support for
older versions of the EPUB format, but other tools can convert to other
document formats you need. Same goes for Amazon's ebook formats -- the
[Calibre] program can also be used to interconvert between formats.
Not all EPUB readers support v3 files. I don't plan on adding support for older versions of the EPUB format, but other tools can convert to other document formats you need. Same goes for Amazon's ebook formats -- the [Calibre] program can also be used to interconvert between formats.
**NOTE**: Because EPUB documents are binary files, MMD only creates them when
run in batch mode (using the `-b\--batch` options). Otherwise, it simply
outputs the HTML 5 file that would serve as the primary content for the EPUB.
**NOTE**: Because EPUB documents are binary files, MMD only creates them when run in batch mode (using the `-b\--batch` options). Otherwise, it simply outputs the HTML 5 file that would serve as the primary content for the EPUB.
## Fenced Code Blocks ##
Fenced code blocks are fundamentally the same as MMD v5, except:
Fenced code blocks are fundamentally the same as MMD v5, except:
1. The leading and trailing fences can be 3, 4, or 5 backticks in length. That
should be sufficient to account for complex documents without requiring a more
complex parser.
1. The leading and trailing fences can be 3, 4, or 5 backticks in length. That should be sufficient to account for complex documents without requiring a more complex parser.
2. If there is no trailing fence, then everything after the leading fence is
considered to be part of the code block.
2. If there is no trailing fence, then everything after the leading fence is considered to be part of the code block.
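The two rules above can be illustrated with a small line scanner. This is a hedged Python sketch rather than MMD's own parser: the name `fenced_blocks` is hypothetical, and it assumes that a block is closed only by a bare fence of the same length as the one that opened it, which the text above does not state explicitly.

```python
import re

# A fence line: 3 to 5 backticks, optionally followed by an info string.
FENCE = re.compile(r"^(`{3,5})(?!`)\s*(.*)$")


def fenced_blocks(lines):
    """Return the list of code blocks, each as a list of content lines."""
    blocks = []
    current = None   # lines of the block being collected, or None
    fence = None     # backtick run that opened the current block

    for line in lines:
        m = FENCE.match(line)
        if current is None:
            if m:
                fence = m.group(1)
                current = []
        elif m and m.group(1) == fence and not m.group(2):
            # Assumed rule: same-length bare fence closes the block.
            blocks.append(current)
            current = None
        else:
            current.append(line)

    # No trailing fence: everything after the leading fence is code.
    if current is not None:
        blocks.append(current)
    return blocks
```

Under that assumption, a four-backtick fence can wrap text that itself contains three-backtick fences, which is one way the 3, 4, or 5 backtick range could cover nested examples.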
## Footnotes ##
The HTML output for footnotes now uses superscripts instead of brackets, e.g.
`<sup>1</sup>` instead of `[1]`.
The HTML output for footnotes now uses superscripts instead of brackets, e.g. `<sup>1</sup>` instead of `[1]`.
## Glossary Terms ##
If there are terms in your document you wish to define in a [?(glossary) The
glossary collects information about important terms used in your document] at
the end of your document, you can define them using the glossary syntax.
If there are terms in your document you wish to define in a [?(glossary) The glossary collects information about important terms used in your document] at the end of your document, you can define them using the glossary syntax.
Glossary terms can be specified using inline or reference syntax. The inline
variant requires that the term be wrapped in parentheses and
immediately follow the `?`.
Glossary terms can be specified using inline or reference syntax. The inline variant requires that the term be wrapped in parentheses and immediately follow the `?`.
[?(glossary) The glossary collects information about important
terms used in your document] is a glossary term.
@ -260,122 +146,81 @@ immediately follows the `?`.
[?glossary]: The glossary collects information about important
terms used in your document
Much like abbreviations, there is also a "shortcut" method that is similar to
the approach used in prior versions of MMD. You specify the definition for
the glossary term in the usual manner, but MMD will automatically identify
each instance where the term is used and substitute it automatically. In this
case, the term is limited to a more basic character set which includes
letters, numbers, periods, and hyphens, but not much else. For more complex
glossary terms, you must explicitly mark uses of the term.
Much like abbreviations, there is also a "shortcut" method that is similar to the approach used in prior versions of MMD. You specify the definition for the glossary term in the usual manner, but MMD will automatically identify each instance where the term is used and substitute it automatically. In this case, the term is limited to a more basic character set which includes letters, numbers, periods, and hyphens, but not much else. For more complex glossary terms, you must explicitly mark uses of the term.
## HTML Comments ##
Previously, HTML Comments were used by MultiMarkdown to include raw text for
inclusion in the output file. This was useful, but limited, as it could only
work for one output format at a time.
Previously, HTML Comments were used by MultiMarkdown to include raw text for inclusion in the output file. This was useful, but limited, as it could only work for one output format at a time.
HTML Comments are now only included in HTML output, but not in any other
format since they would cause errors.
HTML Comments are now only included in HTML output, but not in any other format since they would cause errors.
Take a look at the `HTML Comments.text` file in the test suite for a better
understanding of comment blocks vs comment spans, and how they are parsed.
Take a look at the `HTML Comments.text` file in the test suite for a better understanding of comment blocks vs comment spans, and how they are parsed.
## Internationalization ##
MMD v6 includes support for substituting certain text phrases in other
languages. This only affects the HTML format.
MMD v6 includes support for substituting certain text phrases in other languages. This only affects the HTML format.
## LaTeX Changes ##
LaTeX support is slightly different than in prior versions of MMD. It is
designed to be a bit more consistent, and easier for basic use.
LaTeX support is slightly different than in prior versions of MMD. It is designed to be a bit more consistent, and easier for basic use.
The previous approach used two types of metadata:
The previous approach used two types of metadata:
* `latex input` -- this uses the name of a latex file that will be used in a
`\input{file}` command. This key can be used multiple times (the only
metadata key that worked this way), and all the basic metadata is written to
the LaTeX file in order.
* `latex input` -- this uses the name of a latex file that will be used in a `\input{file}` command. This key can be used multiple times (the only metadata key that worked this way), and all the basic metadata is written to the LaTeX file in order.
* `latex footer` -- this file worked the same way as `latex input`, but was
inserted at the end of the file
* `latex footer` -- this file worked the same way as `latex input`, but was inserted at the end of the file
In practice, one typically needs to be able to insert `\input` commands at
only a few key places in the final document:
In practice, one typically needs to be able to insert `\input` commands at only a few key places in the final document:
1. At the very beginning
2. After metadata, and before the body of the document
3. After the body of the document
1. At the very beginning
2. After metadata, and before the body of the document
3. After the body of the document
MMD 6 standardizes the metadata to use 3 new keys:
MMD 6 standardizes the metadata to use 3 new keys:
1. `latex leader` -- this specifies a file that will be used at the very
beginning of the document.
1. `latex leader` -- this specifies a file that will be used at the very beginning of the document.
2. `latex begin` -- this comes after metadata, and before the body of the
document. This will usually include the `\begin{document}` command, hence the
name.
2. `latex begin` -- this comes after metadata, and before the body of the document. This will usually include the `\begin{document}` command, hence the name.
3. `latex footer` -- this comes after the body of the document.
3. `latex footer` -- this comes after the body of the document.
You can use these 3 keys to replace the old `latex input` metadata keys, as
long as you pay attention as to which is which. If you used more than three
include statements, you may have to combine your latex files to fit into the
new system.
You can use these 3 keys to replace the old `latex input` metadata keys, as long as you pay attention as to which is which. If you used more than three include statements, you may have to combine your latex files to fit into the new system.
***In addition***, there is a new shortcut key -- `latex config`. This allows
you to specify a "document name" that is used to automatically identify the
corresponding `latex leader`, `latex begin`, and `latex footer` files. For
example, using `latex config: article` is the same as using:
***In addition***, there is a new shortcut key -- `latex config`. This allows you to specify a "document name" that is used to automatically identify the corresponding `latex leader`, `latex begin`, and `latex footer` files. For example, using `latex config: article` is the same as using:
latex leader: mmd6-article-leader
latex begin: mmd6-article-begin
latex footer: mmd6-article-footer
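The expansion shown for `article` can be written as a small helper. This is only an illustrative Python sketch: the function name `expand_latex_config` is hypothetical, and it assumes the `mmd6-<name>-leader/begin/footer` pattern seen in the `article` example holds for every `latex config` value.

```python
def expand_latex_config(name):
    """Map a `latex config` value to the three metadata keys it stands for."""
    # Assumption: every config name follows the pattern shown for `article`.
    return {
        "latex leader": f"mmd6-{name}-leader",
        "latex begin": f"mmd6-{name}-begin",
        "latex footer": f"mmd6-{name}-footer",
    }
```

For example, `expand_latex_config("article")` yields the three file names listed above.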
Using the new system will require migrating your old configuration to the new
naming convention, but once done I believe it should be much more intuitive to
use.
Using the new system will require migrating your old configuration to the new naming convention, but once done I believe it should be much more intuitive to use.
The LaTeX support files included with the MMD v6 repository support the use of
the following `latex config` values by default:
The LaTeX support files included with the MMD v6 repository support the use of the following `latex config` values by default:
* `article`
* `tufte-book`
* `tufte-handout`
* `article`
* `beamer`
* `letterhead`
* `manuscript`
* `memoir-book`
* `tufte-book`
* `tufte-handout`
**NOTE**: You do have to install the MMD support files into the proper
location for your system. I would like to make this easier, but haven't found
the best configuration yet.
**NOTE**: You do have to install the MMD support files into the proper location for your system. I would like to make this easier, but haven't found the best configuration yet.
## Metadata ##
Metadata in MMD v6 includes new support for LaTeX -- the `latex config` key
allows you to automatically set up multiple `latex include` files at once.
The default setups that I use would typically consist of one LaTeX file to be
included at the top of the file, one to be included right at the beginning of
the document, and one to be included at the end of the document. If you want
to specify the latex files separately, you can use `latex leader`, `latex
begin`, and `latex footer`.
Metadata in MMD v6 includes new support for LaTeX -- the `latex config` key allows you to automatically set up multiple `latex include` files at once. The default setups that I use would typically consist of one LaTeX file to be included at the top of the file, one to be included right at the beginning of the document, and one to be included at the end of the document. If you want to specify the latex files separately, you can use `latex leader`, `latex begin`, and `latex footer`.
There are new metadata keys for controlling internationalization:
There are new metadata keys for controlling internationalization:
* `language` -- specify the content language for a document, using the two
letter code for the language (e.g. `en` for English). Where possible, this
will also set the default `quotes language`.
* `language` -- specify the content language for a document, using the two letter code for the language (e.g. `en` for English). Where possible, this will also set the default `quotes language`.
* `quotes language` -- specify which variant of smart quotes to use. Valid
options are `dutch`, `french`, `german`, `germanguillemets`, `swedish`, `nl`,
`fr`, `de`, `sv`. Anything else defaults to English.
* `quotes language` -- specify which variant of smart quotes to use. Valid options are `dutch`, `french`, `german`, `germanguillemets`, `swedish`, `nl`, `fr`, `de`, `sv`. Anything else defaults to English.
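The fallback behaviour of `quotes language` can be sketched as a lookup. This Python snippet is an illustration only: the names are hypothetical, it assumes the two-letter codes are aliases for the spelled-out names, and the trimming and lower-casing of the value is an added assumption, not documented behaviour.

```python
# Assumption: the two-letter codes are aliases for the spelled-out names.
QUOTE_ALIASES = {"nl": "dutch", "fr": "french", "de": "german", "sv": "swedish"}
QUOTE_STYLES = {"dutch", "french", "german", "germanguillemets", "swedish"}


def quotes_language(value):
    """Resolve a `quotes language` metadata value to a quote style name."""
    key = value.strip().lower()   # normalisation is an assumption
    key = QUOTE_ALIASES.get(key, key)
    # Anything that is not a recognised option falls back to English.
    return key if key in QUOTE_STYLES else "english"
```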
Additionally, the `MMD Header` and `MMD Footer` metadata work slightly
differently. In v5, these fields were used to list names of files that should
be transcluded before and after the main body. In v6, these fields represent
the actual text to be inserted. If you want them to reference separate files,
use the transclusion functionality:
Additionally, the `MMD Header` and `MMD Footer` metadata work slightly differently. In v5, these fields were used to list names of files that should be transcluded before and after the main body. In v6, these fields represent the actual text to be inserted. If you want them to reference separate files, use the transclusion functionality:
Title: Some Title
MMD Header: This is *MMD* text.
@ -384,42 +229,32 @@ use the transclusion functionality:
## Output Formats ##
MultiMarkdown 6 supports the following output formats, using the `-t`
command-line argument:
MultiMarkdown 6 supports the following output formats, using the `-t` command-line argument:
* `html` -- (Default) create HTML 5
* `latex` -- create [LaTeX] for conversion to PDF using high quality
typography
* `beamer` and `memoir` -- two additional LaTeX variants for creating
slide presentations and longer documents, respectively
* `mmd` -- output the MMD text before converting to another format,
but after performing transclusion. This format is not generally needed.
* `odt` -- OpenDocument text file, used by OpenOffice and compatible
word processors. Images are embedded inside the file package.
* `fodt` -- OpenDocument text variant using a single text (XML) file
instead of a compressed zip file. Images are not embedded in this format.
* `epub` -- EPUB 3 ebook format. Images and CSS are embedded in the
file package.
* `bundle` -- [TextBundle] format consisting of Markdown/MultiMarkdown
text file and embedded images and CSS. Useful for sharing Markdown files
and images between applications (on any OS, but especially on iOS)
* `bundlezip` -- TextPack variant of the TextBundle format -- the file
package is compressed to a single zip file (similar to EPUB and ODT
formats).
* `latex` -- create [LaTeX] for conversion to PDF using high quality typography
* `beamer` and `memoir` -- two additional LaTeX variants for creating slide presentations and longer documents, respectively
* `mmd` -- output the MMD text before converting to another format, but after performing transclusion. This format is not generally needed.
* `odt` -- OpenDocument text file, used by OpenOffice and compatible word processors. Images are embedded inside the file package.
* `fodt` -- OpenDocument text variant using a single text (XML) file instead of a compressed zip file. Images are not embedded in this format.
* `epub` -- EPUB 3 ebook format. Images and CSS are embedded in the file package.
* `opml` -- [OPML] is a standard file format used for a wide range of outlining programs. This allows you to use a single file for editing MultiMarkdown text and for outlining longer documents. [MultiMarkdown Composer] can read/write the OPML format, making it easy to share documents with other programs.
* `itmz` -- ITMZ is the file format used for the [iThoughts] mind mapping software (macOS, iOS, Windows). Much like OPML, this format allows you to use a single file for your outlining/brainstorming and final production. [MultiMarkdown Composer] can read/write this format as well, giving you additional flexibility.
* `bundle` -- [TextBundle] format consisting of Markdown/MultiMarkdown text file and embedded images and CSS. Useful for sharing Markdown files and images between applications (on any OS, but especially on iOS)
* `bundlezip` -- TextPack variant of the TextBundle format -- the file package is compressed to a single zip file (similar to EPUB and ODT formats).
[iThoughts]: http://www.ithoughts.co.uk/
[LaTeX]: https://en.wikipedia.org/wiki/LaTeX
[MultiMarkdown Composer]: https://multimarkdown.com/
[OPML]: http://en.wikipedia.org/wiki/OPML
## Raw Source ##
In older versions of MultiMarkdown you could use an HTML comment to pass raw
LaTeX or other content to the final document. This worked reasonably well,
but was limited and didn't work well when exporting to multiple formats. It
was time for something new.
In older versions of MultiMarkdown you could use an HTML comment to pass raw LaTeX or other content to the final document. This worked reasonably well, but was limited and didn't work well when exporting to multiple formats. It was time for something new.
MMD v6 offers a new feature to handle this. Code spans and code blocks can
be flagged as representing raw source:
MMD v6 offers a new feature to handle this. Code spans and code blocks can be flagged as representing raw source:
foo `*bar*`{=html}
@ -427,40 +262,32 @@ be flagged as representing raw source:
*foo*
```
The contents of the span/block will be passed through unchanged.
The contents of the span/block will be passed through unchanged.
You can specify which output format is compatible with the specified source:
You can specify which output format is compatible with the specified source:
* `html`
* `odt`
* `odt` -- for ODT and FODT
* `epub`
* `latex`
* `*` -- wildcard matches any output format
* `*` -- wildcard matches any output format
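
As a rough illustration of the matching rule above, here is a minimal Python sketch. It is not MMD's implementation (MMD is written in C); the names `RAW_SPAN` and `raw_source_applies` and the simplified regular expression are assumptions made for this example only.

```python
import re

# Simplified pattern for an inline raw-source span such as `*bar*`{=html}.
# Assumption: the real MMD parser handles more cases than this regex does.
RAW_SPAN = re.compile(r"`([^`]*)`\{=([^}]+)\}")


def raw_source_applies(flag, output_format):
    """Return True if raw source flagged with `flag` targets `output_format`."""
    if flag == "*":
        # The wildcard matches any output format.
        return True
    if flag == "odt":
        # `odt` covers both ODT and FODT output.
        return output_format in ("odt", "fodt")
    return flag == output_format


match = RAW_SPAN.search("foo `*bar*`{=html}")
content, flag = match.groups()
print(content, flag, raw_source_applies(flag, "html"))
```

The sketch only decides whether a flag matches the output format. It does not model what the processor does with raw source whose flag does not match.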
## Table of Contents ##
By placing `{{TOC}}` in your document, you can insert an automatically
generated Table of Contents in your document. As of MMD v6, the native
Table of Contents functionality is used when exporting to LaTeX or
OpenDocument formats.
By placing `{{TOC}}` in your document, you can insert an automatically generated Table of Contents in your document. As of MMD v6, the native Table of Contents functionality is used when exporting to LaTeX or OpenDocument formats.
## Tables ##
Tables in MultiMarkdown-6 work basically the same as before, but a caption, if
present, must come *after* the body of the table, not *before*.
Tables in MultiMarkdown-6 work basically the same as before, but a caption, if present, must come *after* the body of the table, not *before*.
## Transclusion ##
File transclusion works basically the same way -- `{{file}}` is used to
indicate a file that needs to be transcluded. `{{file.*}}` allows for
wildcard transclusion. What's different is that the way search paths are
handled is more flexible, though it may take a moment to understand.
File transclusion works basically the same way -- `{{file}}` is used to indicate a file that needs to be transcluded. `{{file.*}}` allows for wildcard transclusion. What's different is that the way search paths are handled is more flexible, though it may take a moment to understand.
When you process a file with MMD, it uses that file's directory as the search
path for included files. For example:
When you process a file with MMD, it uses that file's directory as the search path for included files. For example:
| Directory | Transcluded Filename | Resolved Path |
| ------------------ | ----------------------------- | ------------------------------ |
@ -468,105 +295,68 @@ path for included files. For example:
| `/foo/bar/` | `baz/bat` | `/foo/bar/baz/bat` |
| `/foo/bar/` | `../bat` | `/foo/bat` |
This is the same as MMD v5. What's different is that when you transclude a
file, the search path stays the same as the "parent" file, **UNLESS** you use
the `transclude base` metadata to override it. The simplest override is:
This is the same as MMD v5. What's different is that when you transclude a file, the search path stays the same as the "parent" file, **UNLESS** you use the `transclude base` metadata to override it. The simplest override is:
transclude base: .
This means that any transclusions within the file will be calculated relative
to the file, regardless of the original search path.
This means that any transclusions within the file will be calculated relative to the file, regardless of the original search path.
Alternatively you could specify that any transclusion happens inside a
subfolder:
Alternatively you could specify that any transclusion happens inside a subfolder:
transclude base: folder/
Or you can specify an absolute path:
Or you can specify an absolute path:
transclude base: /some/path
This flexibility means that you can transclude different files based on
whether a file is being processed by itself or as part of a "parent" file.
This can be useful when a particular file can either be a standalone document,
or a chapter inside a larger document.
This flexibility means that you can transclude different files based on whether a file is being processed by itself or as part of a "parent" file. This can be useful when a particular file can either be a standalone document, or a chapter inside a larger document.
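
The search path rules in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not MMD's C code: the function names `resolve` and `child_search_path` are hypothetical, and the sketch assumes that a relative `transclude base` value is interpreted relative to the directory of the file that declares it.

```python
import posixpath


def resolve(search_path, filename):
    """Join a transcluded filename onto the search path and normalize it."""
    return posixpath.normpath(posixpath.join(search_path, filename))


def child_search_path(parent_search_path, child_file, transclude_base=None):
    """Search path used for transclusions found inside `child_file`.

    Without `transclude base` metadata the parent's search path is kept.
    With it, the value is applied relative to the child's own directory
    (an assumption of this sketch); an absolute value replaces it outright,
    because posixpath.join discards earlier parts before an absolute path.
    """
    if transclude_base is None:
        return parent_search_path
    child_dir = posixpath.dirname(child_file)
    return posixpath.normpath(posixpath.join(child_dir, transclude_base))


# Rows from the table above.
print(resolve("/foo/bar/", "baz/bat"))
print(resolve("/foo/bar/", "../bat"))

# A chapter transcluded from /doc/book.txt, with and without an override.
print(child_search_path("/doc", "/doc/chapters/one.txt"))
print(child_search_path("/doc", "/doc/chapters/one.txt", "."))
print(child_search_path("/doc", "/doc/chapters/one.txt", "folder/"))
print(child_search_path("/doc", "/doc/chapters/one.txt", "/some/path"))
```

The file names `/doc/book.txt` and `/doc/chapters/one.txt` are made up for the example. The point is that the same chapter file resolves its own transclusions differently depending on whether it declares a `transclude base`.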
# Developer Notes #
If you're using MMD as a library in another application, there are a few
things to be aware of.
If you're using MMD as a library in another application, there are a few things to be aware of.
## Object Pools ##
To improve performance, MMD has the option to allocate the memory for the
tokens used in parsing in large chunks ("object pools"). Allocating a single
large chunk of memory is more efficient than allocating many smaller chunks.
However, this does complicate memory management.
To improve performance, MMD has the option to allocate the memory for the tokens used in parsing in large chunks ("object pools"). Allocating a single large chunk of memory is more efficient than allocating many smaller chunks. However, this does complicate memory management.
By default `token.h` defines `kUseObjectPool` which enables this performance
improvement. This does require more caution with the way that memory is
managed. (See `main.c` for an example of how the object pool is allocated and
drained.) I recommend disabling object pools unless you really understand C
memory management, and understand MultiMarkdown's program flow. Failure to
properly manage the object pool can lead to massive memory leaks, freeing
memory before that is still in use, or other potential problems.
By default `token.h` defines `kUseObjectPool`, which enables this performance improvement. This requires more caution in how memory is managed. (See `main.c` for an example of how the object pool is allocated and drained.) I recommend disabling object pools unless you really understand both C memory management and MultiMarkdown's program flow. Failure to properly manage the object pool can lead to massive memory leaks, freeing memory that is still in use, or other problems.
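
To make the hazard concrete, here is a conceptual Python sketch of a chunked object pool. It is only an analogy: MMD's pool is written in C and its actual API is not reproduced here, so `Token`, `TokenPool`, `new_token`, and `drain` are hypothetical names. Python cannot reproduce a real use-after-free, but the aliasing below shows the same class of bug.

```python
class Token:
    """Minimal stand-in for a parser token (fields are illustrative)."""
    __slots__ = ("type", "start", "length")

    def __init__(self):
        self.type = 0
        self.start = 0
        self.length = 0


class TokenPool:
    """Hands out tokens from preallocated chunks instead of one at a time."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = []
        self.next_free = 0  # index of the next unused slot in the last chunk

    def new_token(self, type, start, length):
        # Allocate a whole chunk only when the current one is exhausted.
        if not self.chunks or self.next_free == self.chunk_size:
            self.chunks.append([Token() for _ in range(self.chunk_size)])
            self.next_free = 0
        token = self.chunks[-1][self.next_free]
        self.next_free += 1
        token.type, token.start, token.length = type, start, length
        return token

    def drain(self):
        # Release every token at once: keep one chunk and start reusing it.
        del self.chunks[1:]
        self.next_free = 0


pool = TokenPool(chunk_size=2)
stale = pool.new_token(1, 0, 5)
pool.drain()                      # every token from the pool is now invalid
fresh = pool.new_token(2, 10, 3)  # reuses the slot `stale` still points at
print(stale is fresh, stale.type)
```

Any reference kept across `drain()` silently sees another token's data. In C the equivalent mistake reads or writes memory that has been released or reused, which is why the pool should only be drained once nothing from the previous parse is still in use.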
## HTML Boolean Attributes ##
Most HTML attributes are of the key-value type (e.g. `key="value"`). But some
less frequently used attributes are boolean attributes (e.g. `<video
controls>`). Properly distinguishing HTML from other uses of the `<`
character requires matching both types under certain circumstances.
Most HTML attributes are of the key-value type (e.g. `key="value"`). But some less frequently used attributes are boolean attributes (e.g. `<video controls>`). Properly distinguishing HTML from other uses of the `<` character requires matching both types under certain circumstances.
There are some trade-offs to be made:
There are some trade-offs to be made:
* Performance when compiling MultiMarkdown
* Performance when compiling MultiMarkdown
* Performance when processing parts of documents that are *not* HTML
* Accuracy when matching HTML
* Performance when processing parts of documents that are *not* HTML
So far, there seem to be four main approaches:
* Accuracy when matching HTML
* Ignore boolean attributes -- this is how MMD-6 started. This is fast, but not accurate for some users. Several users found issues with the `<video>` tag when MMD was used in HTML-heavy documents.
So far, there seem to be four main approaches:
* Use a regexp to match all boolean attributes. This is fast to compile, but adds roughly 5-8% processing time (probably due to false-positive HTML matches). This *may* cause some text to be classified as HTML when it shouldn't be.
* Ignore boolean attributes -- this is how MMD-6 started. This is fast, but
not accurate for some users. Several users found issues with the `<video>` tag
when MMD was used in HTML heavy documents.
* Explicitly match all possible boolean attributes -- this would presumably be relatively fast when processing (due to the nature of re2c lexers), but it may be prohibitively slow to compile for some users. Because I compile MMD frequently, this approach compiles too slowly to be usable for me during development.
* Use regexp to match all boolean attributes. This is fast to compile, but
adds roughly 5-8% processing time (probably due to false positive HTML
matches). This *may* cause some text to be classified as HTML when it
shouldn't.
* Explicitly match all possible boolean attributes -- This would presumably be
relatively fast when processing (due to the nature of re2c lexers), but it may
be prohibitively slow to compile for some users. As someone who compiles MMD
frequently, it is too slow to compile be useful for me during development.
* Use a hand-curated list of boolean attributes that are most commonly used --
this does not incur much of a performance hit when parsing, and compiles
faster than the complete list of all boolean attributes. For now, this is the
option I have chosen as default for MMD -- it seems to be a reasonable trade-
off. I will continue to research additional options.
* Use a hand-curated list of boolean attributes that are most commonly used -- this does not incur much of a performance hit when parsing, and compiles faster than the complete list of all boolean attributes. For now, this is the option I have chosen as default for MMD -- it seems to be a reasonable trade-off. I will continue to research additional options.
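
The accuracy side of these trade-offs can be sketched with Python regular expressions. This is an illustration only: MMD's lexer is generated by re2c, the patterns below are simplified, and the attribute list `CURATED_BOOLEAN_ATTRIBUTES` is a hypothetical subset rather than the list MMD actually uses. Compile-time and parsing-speed differences are not reproduced here.

```python
import re

# Hypothetical hand-picked subset; not MMD's actual list.
CURATED_BOOLEAN_ATTRIBUTES = ("controls", "autoplay", "loop", "muted",
                              "disabled", "checked")

KEY_VALUE = r'[A-Za-z][\w:-]*\s*=\s*"[^"]*"'   # key="value"
ANY_BOOLEAN = r'[A-Za-z][\w:-]*'               # any bare word
CURATED_BOOLEAN = r'(?:' + '|'.join(CURATED_BOOLEAN_ATTRIBUTES) + r')\b'


def tag_pattern(boolean_attribute=None):
    """Build an opening-tag matcher; None means boolean attributes are ignored."""
    attribute = KEY_VALUE
    if boolean_attribute is not None:
        attribute += '|' + boolean_attribute
    return re.compile(r'<[A-Za-z][A-Za-z0-9]*(?:\s+(?:' + attribute + r'))*\s*/?>')


ignore_boolean = tag_pattern()                  # first approach
regexp_boolean = tag_pattern(ANY_BOOLEAN)       # second approach
curated_boolean = tag_pattern(CURATED_BOOLEAN)  # fourth approach

for text in ('<a href="x">', '<video controls>', '<this is not html>'):
    print(text,
          bool(ignore_boolean.fullmatch(text)),
          bool(regexp_boolean.fullmatch(text)),
          bool(curated_boolean.fullmatch(text)))
```

In this sketch the first approach misses `<video controls>`, the general regexp accepts it but also treats `<this is not html>` as a tag, and the curated list accepts the former while rejecting the latter. The third approach would have the same shape as the curated one, just with a much longer list of alternatives.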
# Future Steps #
Some features I plan to implement at some point:
Some features I plan to implement at some point:
1. OPML export support is not available in v6. I plan on adding improved
support for this at some point. I was hoping to be able to re-use the
existing v6 parser but it might be simpler to use the approach from v5 and
earlier, which was to have a separate parser tuned to only identify headers
and "stuff between headers".
1. {--OPML export support is not available in v6. I plan on adding improved support for this at some point. I was hoping to be able to re-use the existing v6 parser but it might be simpler to use the approach from v5 and earlier, which was to have a separate parser tuned to only identify headers and "stuff between headers".--}{>>OPML read/write support implemented.<<}
[>MMD]: MultiMarkdown
[CriticMarkup]: http://criticmarkup.com/
[?PEG]: Parsing Expression Grammar <https://en.wikipedia.org/wiki/Parsing_expression_grammar>
[?AST]: Abstract Syntax Tree <https://en.wikipedia.org/wiki/Abstract_syntax_tree>
[EPUB 3]: https://en.wikipedia.org/wiki/EPUB
[Sigil]: https://sigil-ebook.com/
[Calibre]: https://calibre-ebook.com/
[>MMD]: MultiMarkdown
[CriticMarkup]: http://criticmarkup.com/
[?PEG]: Parsing Expression Grammar <https://en.wikipedia.org/wiki/Parsing_expression_grammar>
[?AST]: Abstract Syntax Tree <https://en.wikipedia.org/wiki/Abstract_syntax_tree>
[EPUB 3]: https://en.wikipedia.org/wiki/EPUB
[Sigil]: https://sigil-ebook.com/
[Calibre]: https://calibre-ebook.com/