-<!--
-SPDX-FileCopyrightText: 2024, 2025 Lady <https://www.ladys.computer/about/#lady>
-SPDX-License-Identifier: CC0-1.0
--->
-# 💄📝 Les·M·L
-
-<b>Ladys simple markup language.</b>
-
-💄📝 Les·M·L is a document markup language designed with two goals in
- mind :—
-
-1. It must be trivial to parse, even with limited tooling such as that
- provided by X·S·L·T.
-
-2. It must be sophisticated enough to handle longform hypertext
- documents and associated metadata.
-
-It is implemented as an X·S·L·T transformation from a
- `<html:script type="text/lesml">` element into H·T·M·L
- (`parser.xslt`).
-
-## Nomenclature
-
-<i>Les·M·L</i> is an abbreviation of the phrase “Ladys Extremely Simple
- Markup Language”.
-
-## Markup Syntax
-
-The first line of any 💄📝 Les·M·L document should be the string
- `#!lesml`.
-A language tag may follow this, beginning with `@` and terminated with
- `$`, like so:
-`#!lesml@en$`.
-Regardless of whether a language tag is present, the shebang line may
- be terminated by a space‐separated list of properties of the form
- `key=value`.
-Only one property is currently permitted: `profile`, whose value should
- be a U·R·I and is translated to the `@data-lesml-profile` attribute
- on the resulting `<html:article>` element.
-
-Following the shebang line, document metadata may be provided in the
- [Record Jar][draft-phillips-record-jar-01] format.
-The body of the document begins after the last line which begins with
- the string `%%`, or after the shebang line if none exists.
-
-Multiple documents can be catenated into a single file; a new document
- is begun on any line which starts with `#!lesml` or `##`.
-Documents in the latter case inherit the latest preceding `#!lesml`
- declaration.
-`##` may be followed by other text; this is treated as an interdocument
- comment.
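-
-As a rough illustration of the rules above, a file holding two
- documents might look like the following sketch.
-The `title` field, the `example.com` profile U·R·I, and the single
- space placed between the language tag and the property are
- assumptions made for the example, not details confirmed by this
- document.
-
-```
-#!lesml@en$ profile=https://example.com/profile
-title: An illustrative title
-%%
-The body of the first document begins here.
-
-## this text is an interdocument comment
-The second document inherits the declaration above.
-```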
-
-Documents are broken into paragraphs by blank lines.
-Empty paragraphs are ignored.
-
-If every line in the paragraph begins with (optional white·space
- followed by) `»` it is quoted (`<html:blockquote>`); if every line
- begins with `]` it is bracketed.
-The lines, minus this leading marker, are then re‐analysed.
-Bracketed paragraphs which end quotes are treated as captions
- (`<html:figcaption>`); otherwise, they are footers (`<html:footer>`).
-
-Non·empty paragraphs (which, to be clear, may still result in empty
- `<html:p>` elements) are classified as follows :—
-
-- If the paragraph consists of only the following section‐break
- characters, plus any amount of white·space, then it is
- considered to be a section break (`<html:hr>`).
-
- The section break characters are :—
-
- | Character | Codepoint | Unicode Name |
- | --------- | --------- | ------------ |
- | `*` | `U+002A` | `ASTERISK` |
- | `-` | `U+002D` | `HYPHEN-MINUS` |
- | `.` | `U+002E` | `FULL STOP` |
- | `=` | `U+003D` | `EQUALS SIGN` |
- | `_` | `U+005F` | `LOW LINE` |
- | `~` | `U+007E` | `TILDE` |
- | `·` | `U+00B7` | `MIDDLE DOT` |
- | `․` | `U+2024` | `ONE DOT LEADER` |
- | `‥` | `U+2025` | `TWO DOT LEADER` |
- | `…` | `U+2026` | `HORIZONTAL ELLIPSIS` |
- | `⁂` | `U+2042` | `ASTERISM` |
- | `⋯` | `U+22EF` | `MIDLINE HORIZONTAL ELLIPSIS` |
- | `─` | `U+2500` | `BOX DRAWINGS LIGHT HORIZONTAL` |
- | `━` | `U+2501` | `BOX DRAWINGS HEAVY HORIZONTAL` |
- | `┄` | `U+2504` | `BOX DRAWINGS LIGHT TRIPLE DASH HORIZONTAL` |
- | `┅` | `U+2505` | `BOX DRAWINGS HEAVY TRIPLE DASH HORIZONTAL` |
- | `┈` | `U+2508` | `BOX DRAWINGS LIGHT QUADRUPLE DASH HORIZONTAL` |
- | `┉` | `U+2509` | `BOX DRAWINGS HEAVY QUADRUPLE DASH HORIZONTAL` |
- | `╌` | `U+254C` | `BOX DRAWINGS LIGHT DOUBLE DASH HORIZONTAL` |
- | `╍` | `U+254D` | `BOX DRAWINGS HEAVY DOUBLE DASH HORIZONTAL` |
- | `═` | `U+2550` | `BOX DRAWINGS DOUBLE HORIZONTAL` |
- | `╴` | `U+2574` | `BOX DRAWINGS LIGHT LEFT` |
- | `╶` | `U+2576` | `BOX DRAWINGS LIGHT RIGHT` |
- | `╸` | `U+2578` | `BOX DRAWINGS HEAVY LEFT` |
- | `╺` | `U+257A` | `BOX DRAWINGS HEAVY RIGHT` |
- | `☙` | `U+2619` | `REVERSED ROTATED FLORAL HEART BULLET` |
- | `❧` | `U+2767` | `ROTATED FLORAL HEART BULLET` |
- | `　` | `U+3000` | `IDEOGRAPHIC SPACE` |
- | `・` | `U+30FB` | `KATAKANA MIDDLE DOT` |
- | `＊` | `U+FF0A` | `FULLWIDTH ASTERISK` |
- | `－` | `U+FF0D` | `FULLWIDTH HYPHEN-MINUS` |
- | `．` | `U+FF0E` | `FULLWIDTH FULL STOP` |
- | `＝` | `U+FF1D` | `FULLWIDTH EQUALS SIGN` |
- | `＿` | `U+FF3F` | `FULLWIDTH LOW LINE` |
- | `～` | `U+FF5E` | `FULLWIDTH TILDE` |
-
-- If every line in the paragraph begins with zero or more white·space
- characters followed by `|`, it is a “preformatted” paragraph and
- white·space is not collapsed (`<html:pre>`).
-
-- Otherwise, the paragraph is ordinary.
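-
-For instance, the sketch below contains a section break followed by a
- preformatted paragraph; the contents are invented for illustration.
-
-```
-* * *
-
-| every line of this paragraph
-|     begins with a vertical bar
-```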
-
-After this classification, each paragraph which is not a section
- break is further classified by type based on its first character
- (which must be followed by white·space or a pilcrow, or else be the
- only thing on the line) :—
-
-- If the paragraph is preformatted, it is an ordinary paragraph.
-
-- If the paragraph begins with `⁌`, it is a chapter heading
- (`<html:h1>`).
-
-- If the paragraph begins with `§`, it is a section heading
- (`<html:h2>`).
-
-- If the paragraph begins with `❦`, it is a subsection heading
- (`<html:h3>`).
-
-- If the paragraph begins with `✠`, it is a subsubsection heading
- (`<html:h4>`).
-
-- If the paragraph begins with `•` or `🔢`, it is a primary unordered
- or ordered list item (`<html:li class="unordered" data-level="1">`
- or `<html:li class="ordered" data-level="1">`).
-
-- If the paragraph begins with `◦` or `🔠`, it is a secondary unordered
- or ordered list item (`<html:li class="unordered" data-level="2">`
- or `<html:li class="ordered" data-level="2">`).
- Secondary list items are considered to be nested inside of primary
- list items which precede them.
-
-- If the paragraph begins with `▪` or `🔡`, it is a tertiary unordered
- or ordered list item (`<html:li class="unordered" data-level="3">`
- or `<html:li class="ordered" data-level="3">`).
- Tertiary list items are considered to be nested inside of primary
- and secondary list items which precede them.
-
-- If the paragraph begins with `⁃` or `🔣`, it is a quaternary
- unordered or ordered list item
- (`<html:li class="unordered" data-level="4">` or
- `<html:li class="ordered" data-level="4">`).
- Quaternary list items are considered to be nested inside of primary,
- secondary, and tertiary list items which precede them.
-
-- If the paragraph begins with `※`, it is an ordinary note
- (`<html:section role="note" class="note">`).
-
-- If the paragraph begins with `☡`, it is a cautionary note
- (`<html:section role="note" class="caution">`).
-
-- If the paragraph begins with `⯑`, it is a questioning note
- (`<html:section role="note" class="query">`).
-
-- If the paragraph begins with `@`, it is an abstract
- (`<html:section role="doc-abstract">`).
-
-- If the paragraph begins with `🛈`, it is an (informative) tip
- (`<html:section role="doc-tip">`).
-
-- If the paragraph begins with `⚠︎`, it is a (warning) notice
- (`<html:section role="doc-notice">`).
-
-- If the paragraph begins with `^`, it is a footnote
- (`<html:li class="ordered footnote" data-level="1">`).
- Footnotes are ignored unless their first paragraph has an i·d
- (specified with `¶`) which is referenced by one or more footnote
- references.
- Footnotes are treated as level 1 ordered list items, so they can
- contain nested lists.
-
- Footnotes are removed from the normal document flow and placed in a
- footer (`<html:section role="doc-endnotes">`) in order of first
- reference.
- It is recommended that the i·d¦s you choose are kept stable, so that
- links to footnotes do not break.
-
-- If the paragraph begins with `#`, it is a comment.
- Comments produce X·M·L comment nodes and can be used to break up list
- items into separate lists.
-
-- If the paragraph begins with `⋯`, it is a continuation paragraph.
- Continuation paragraphs may be used to continue a preceding note,
- footnote, or list item.
- If there is no such preceding note, footnote, or list item, they will
- attach to adjacent heading elements to form heading groups
- (`<html:hgroup>`).
- Otherwise, they will be treated as ordinary paragraphs.
-
-- Otherwise, it is an ordinary paragraph.
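-
-The following sketch shows a few of these sigils in use; the wording
- of each paragraph is invented for illustration.
-
-```
-§ Orchard notes
-
-• Apples
-
-◦ Braeburn
-
-⋯ A second paragraph continuing the Braeburn item.
-
-# This comment separates the list above from the one below.
-
-• Pears
-
-※ An ordinary note.
-```
-
-Going by the rules above, the two primary items should end up in
- separate lists, because the comment falls between them.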
-
-Following this sigil (if any) there may be a `¶` followed by zero or
- more non·white·space characters.
-The characters following the `¶` give the identifier for the paragraph,
- which is expected to be unique within a document.
-This may be suffixed with a language tag beginning with `@` and
- terminated with `$`.
-
-When a paragraph produces an `<html:p>` element “wrapped in” another
- kind of element (e·g, a blockquote, section, or list item), the
- identifier and language of the first paragraph are applied to the
- wrapping element.
-If the first paragraph has no other contents, it is deleted.
-To apply the identifier or language to the `<html:p>` element itself,
- and not its wrapper, one can simply make the first paragraph empty
- (using a literal `¶` with no other contents).
-This paragraph will be dropped, but the following paragraphs will still
- be processed as non·initial.
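-
-As a sketch of the identifier syntax, with invented identifiers and
- the `¶` placed directly after the sigil:
-
-```
-§¶intro Introduction
-
-•¶first-item@en$ This identifier and language tag go on the list item.
-```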
-
-The remaining characters in a paragraph form its contents.
-Markup within paragraphs is delimited with·out exception by pairs of
- characters, with the following precedence :—
-
-- The characters `⌦` and `⌫` indicate inline comments.
- A single character `⌧` may be used to indicate an “empty” comment
- (consisting of `U+034F COMBINING GRAPHEME JOINER` for X·M·L
- compatibility).
-
-- The characters `{@` and `"}` indicate attribute specifications.
- The attribute specification must contain at least one `="` which
- separates the key of the attribute from the value.
- Attributes attach to the previous element or text node, with
- white·space‐only text nodes after elements ignored; if there is no
- such previous element or text node, an empty text node is used
- instead.
- Multiple attributes can be given in sequence using multiple
- specifications.
- Text nodes with attributes are wrapped in `<html:span>`.
-
-- The characters `{🔗` and `>}` indicate a hyperlink to a U·R·L
- (`<html:a>`).
- The hyperlink must contain at least one `<`; the content before the
- last `<` gives the text of the link, and the content after gives
- the U·R·L that the link points to.
- If no text is given, the U·R·L will be used instead.
-
-- The characters `⸠` and `⸡` indicate a strikethru (`<html:s>`).
-
-- The characters `⸤` and `⸥` indicate underlining (`<html:u>`).
-
-- The characters `⟦` and `⟧` indicate an inline note
- (`<html:small role="note">`).
-
-- The characters `⸨` and `⸩` indicate parenthetical content
- (`<html:small>`).
-
-- The characters `` ` `` and `´` indicate code (`<html:code>`).
-
-- The characters `⟪` and `⟫` indicate titles (`<html:cite>`).
-
-- The characters `⸶` and `⸷` indicate names (`<html:u class="name">`).
-
-- The characters `⟨` and `⟩` indicate offset text (`<html:i>`).
-
-- The characters `⦃` and `⦄` indicate keyword highlighting
- (`<html:b>`).
-
-- The characters `☞︎` and `☜︎` indicate strong importance
- (`<html:strong>`).
-
-- The characters `⹐` and `⹑` indicate emphasis (`<html:em>`).
-
-- The characters `^` and `.` indicate a footnote reference
- (`<html:a role="doc-noteref">`).
- The characters between these sigils must match the i·d of the first
- paragraph of some footnote in the same document.
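-
-A few of these delimiters are combined in the sketch below.
-The footnote identifier `note-1` and the `class` value are invented
- for illustration; the U·R·L is the one this document already uses
- for REUSE.
-
-```
-This is ⹐emphasised⹑, this is `code´, and this is a ⟪Title⟫.
-
-See {🔗the REUSE specification<https://reuse.software/spec/>}.
-
-The attribute attaches to the ⟨offset text⟩{@class="example"} before it.
-
-This claim has a footnote.^note-1.
-
-^¶note-1 The text of the footnote.
-```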
-
-Once the tree is built as above, it is remediated into its final form
- by the following steps :—
-
-- Continuation paragraphs are joined with the preceding list items or
- sections.
-
-- List items of a higher level are nested in preceding list items, when
- present.
- List items of a level greater than 1 can also be nested in preceding
- sections (notes, abstracts, ⁊·c…).
-
-- Successive list items of the same level and class are joined into
- a single list.
-
-- Linebreaks in preformatted paragraphs are replaced with `<html:br>`.
-
-Finally, any character can be escaped by instead providing its Unicode
- codepoint in the form `{U+NNNN}`, where `NNNN` is one or more
- hexadecimal digits.
-Multiple codepoints may be provided separated by periods, as in
- `{U+WWWW.ZZZZ}`.
-Due to limitations in X·S·L·T, characters cannot be escaped in
- attributes (including link targets).
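-
-For example, going by the description above, the first escape below
- should stand for a literal `¶`, and the second for `❧` followed by
- `☙`.
-
-```
-A literal pilcrow: {U+B6}
-
-Two codepoints in one escape: {U+2767.2619}
-```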
-
-## Usage
-
-💄📝 Les·M·L is designed for usage with [⛩📰 书社][Shushe].
-Simply provide the `parser.xslt` from this repository to ⛩📰 书社 as
- an additional parser, and `magic` as an additional magic file.
-
-## License
-
-This repository conforms to [REUSE][].
-
-The parser is licensed under the terms of the <cite>Mozilla Public
- License, version 2.0</cite>.
-
-[REUSE]: <https://reuse.software/spec/>
-[Shushe]: <https://git.ladys.computer/Shushe/>
-[draft-phillips-record-jar-01]: <https://datatracker.ietf.org/doc/html/draft-phillips-record-jar-01>