From: Lady
Date: Tue, 31 Mar 2026 01:29:32 +0000 (-0400)
Subject: Add some documentation to the transform
X-Git-Tag: 0.7.0^0
X-Git-Url: https://git.ladys.computer/LesML/commitdiff_plain/51326631af07b5afd059dc21fc2d510b5583c71cba31815d5bc54aacb1a1e064

Add some documentation to the transform
---

diff --git a/xslt/lesml.xslt b/xslt/lesml.xslt
index 9a60383..2100e06 100644
--- a/xslt/lesml.xslt
+++ b/xslt/lesml.xslt
@@ -1,6 +1,6 @@
@@ -21,6 +21,31 @@ This file implements a transformation, via X·S·L·T, from an H·T·M·L
 This forms the “canonical” definition of the Les·M·L syntax; no features will be added to the language which are not feasible to be implemented here.
+
+§ Implementation
+
+❦ On `LesML-´ processing instructions
+
+Processing instructions which begin with `LesML-´ are used internally
+ for bookkeeping as, unlike attributes, the Les·M·L format provides
+ no means by which authors could generate them.
+All of these instructions should be removed by the end of the
+ transformation, but understanding them is important to understanding
+ how this code operates.
+The processing instructions are as follows :⁠—
+
+• `´ identifies list items which are also footnotes.
+
+• `´ gives the level of the containing block or
+ paragraph.
+
+❦ Doctype, entity, and namespace definitions
+
+The `&block-types;´ and `&paragraph-types;´ entities provide X·Path
+ tests for determining if the current node is a block or paragraph
+ element.
+The `&level-pi;´ entity provides an X·Path selector for
+ `´ processing instructions.
 <书社:id>urn:fdc:ladys.computer:20240512:LesML:parser.xslt
+
+
+❦ Template parameters
+
+The `LESML_SECTION_BREAK_CHARS´ parameter is provided to allow users to
+ configure which characters are treated as section break characters.
+Changing this value isn¦t recommended, but the option is provided for
+ now.
+
+
+
+
+❦ Functions
+
+The `LesML:split´ function returns a nodeset of H·T·M·L `´
+ elements, with each one giving the next item which results from
+ splitting the provided source text on the provided separator.
+This is just a convenience wrapper for the `LesML:do-split´ named
+ template.
+
+
@@ -57,6 +103,22 @@ This forms the “canonical” definition of the Les·M·L syntax;
+
+
+❦ Named templates
+
+In contrast to matching templates, named templates generally operate on
+ a single string or a flat list of lines, which are provided as
+ parameters.
+Some of these are small utility templates, whereas others provide the
+ bulk of the block‐level processing.
+
+✠ `LesML:do-split´
+
+The `LesML:do-split´ template provides the internal implementation for
+ the `LesML:split´ function.
+
+
+
+
+✠ `LesML:comment-out´
+
+The `LesML:comment-out´ template simply produces an X·M·L comment
+ containing the provided text.
+This is only nontrivial because it has to escape any `{U+2D}{U+2D}´
+ sequences, which are not permitted in X·M·L.
+
+The escaping simply inserts `U+034F COMBINING GRAPHEME JOINER´, which
+ is (per Unicode) intended to be used when disambiguating digraphs
+ from two independent characters in sequence.
+
+
+
+
+✠ `LesML:unescape´
+
+The `LesML:unescape´ template takes the provided source text and
+ processes the Unicode character escapes within it.
+The result of this template is raw, unescaped X·M·L, because the only
+ way to implement Unicode character escapes is to convert them to
+ X·M·L entities, which do not have a representation in the X·S·L·T
+ data model.
+Consequently, this template must be among the last ones called and its
+ output cannot be fed back into X·S·L·T safely.
+
+
+
+
+✠ `LesML:expand-sigils´
+
+The `LesML:expand-sigils´ template converts a list of block sigils into
+ the resulting H·T·M·L structure.
+It requires a number of parameters :⁠—
+
+• The sigils themselves, as a flat list of nodes.
+
+• The level of the resulting (outermost) block.
+
+• The type of paragraph to be created within the (innermost) block.
+
+• For preformatted code paragraphs, the associated syntax name.
+
+• The lines of content within the paragraph, as a flat list of nodes.
+
+For as long as the sigils are nonempty, the template is simply called
+ again with the level incremented and the outer sigil removed, with
+ the result wrapped appropriately.
+When no sigils remain, an appropriate paragraph element is created,
+ processing as necessary a leading `¶´ only if it appears on the first
+ line.
+
+Paragraphs and comments are “wrapped” in a `´ to enable them to
+ participate in level nesting, and this `´ is assigned their
+ `@id´ and `@lang´.
+They will later be “unwrapped”, and reclaim these attributes, if they
+ are not footnotes or comments and do not have any block children.
+If a block is nested within the wrapping `´ of a paragraph, the
+ wrapping `´ is preserved as the wrapper for both the paragraph
+ and its nested contents.
+
+
+
+
+✠ `LesML:chunk´
+
+The `LesML:chunk´ template converts the provided list of lines into the
+ list of block elements that they represent.
+This comprises the following steps :⁠—
+
+• Discovering the “last” lines of each block, which are nonempty lines
+ that are not followed by a nonempty line.
+
+• Collecting all of the lines in the block; i·e all lines since the
+ previous “last” line that are not empty.
+
+• Determining the level of the block and its sigils.
+
+• Passing the lines of the block, with any leading sigils removed, to
+ `LesML:expand-sigils´ to generate the actual block element.
+This also requires processing any paragraph‐type indicators and
+ stripping them if necessary to get the type of the underlying
+ paragraph.
+
+
+
+
+✠ `LesML:paragraphize´
+
+The `LesML:paragraphize´ template first turns the provided lines into
+ blocks via `LesML:chunk´, arranges those blocks by applying templates
+ in the `LesML:blockify´ mode, then processes the inline contents of
+ those blocks by applying templates in the `LesML:inlinify´ mode.
+This does much of the processing for the body of a Les·M·L document,
+ altho some things, like footnote handling, can¦t be fully finalized
+ until later.
+
+
+
+
+✠ `LesML:parse´
+
+The `LesML:parse´ template is the main entrypoint for Les·M·L handling,
+ taking a list of input lines and returning a series of resulting
+ Les·M·L documents.
+This template operates by outputting the result of the first document
+ in the provided lines, and then calling itself recursively with any
+ remaining documents until the lines are exhausted.
+
+
+
+
+❦ Templates in the default mode
+
+This file provides a single template in the default mode: one which
+ matches H·T·M·L `