Support hgroups with heading continuations
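
As a reading aid for the second hunk, the new quoted and bracketed paragraph rule can be sketched roughly as below. This is a minimal Python sketch, not the project's implementation (the README itself refers to X·S·L·T). The name `strip_prefix` is hypothetical. Two points the README text leaves open are treated as assumptions here: that optional leading white·space is also allowed before `]`, and that only the sigil itself (not any space after it) is removed before re‐analysis.

```python
def strip_prefix(paragraph: str):
    """Classify a paragraph as quoted or bracketed and strip the leading sigil.

    Returns (kind, remaining_text), where kind is "quoted", "bracketed",
    or None when the paragraph has no uniform prefix.
    """
    lines = paragraph.split("\n")
    for sigil, kind in (("»", "quoted"), ("]", "bracketed")):
        # Every line must begin with optional white-space followed by the sigil.
        if all(line.lstrip().startswith(sigil) for line in lines):
            # Assumption: drop the leading white-space and the sigil only.
            stripped = [line.lstrip()[len(sigil):] for line in lines]
            return kind, "\n".join(stripped)
    return None, paragraph


if __name__ == "__main__":
    print(strip_prefix("» first line\n  » second line"))
    print(strip_prefix("] a caption or footer"))
```

The remaining text would then be re‐analysed as an ordinary paragraph; whether a bracketed paragraph becomes a caption or a footer depends on whether it ends a quote, which this sketch does not model.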

diff --git a/README.markdown b/README.markdown

index 2bfa515283882215a19e576e040b1476277bd468..ac2d0dc5b89facc1fbea5275baed244f33140145 100644
--- a/README.markdown
+++ b/README.markdown
@@ -1,5 +1,5 @@
  <!--
-SPDX-FileCopyrightText: 2024 Lady <https://www.ladys.computer/about/#lady>
+SPDX-FileCopyrightText: 2024, 2025 Lady <https://www.ladys.computer/about/#lady>
  SPDX-License-Identifier: CC0-1.0
  -->
  # 💄📝 Les·M·L
@@ -52,6 +52,14 @@ Documents in the later case inherit the latest preceding `#!lesml`
  
  Documents are broken into paragraphs by blank lines.
  Empty paragraphs are ignored.
+
+If every line in the paragraph begins with (optional white·space
+  followed by) `»` it is quoted (`<html:blockquote>`); if every line
+  begins with `]` it is bracketed.
+The lines, minus this leading, are then re‐analysed.
+Bracketed paragraphs which end quotes are treated as captions
+  (`<html:figcaption>`); otherwise, they are footers (`<html:footer>`).
+
  Non·empty paragraphs are classified as follows :⁠—
  
  - If the paragraph consists of only the following section‐break
@@ -98,21 +106,16 @@ Non·empty paragraphs are classified as follows :⁠—
    | `＿` | `U+FF3F` | `FULLWIDTH LOW LINE` |
    | `～` | `U+FF5E` | `FULLWIDTH TILDE` |
  
-- If every line in the paragraph begins with at least one space, then
-    it is considered to be a quoted paragraph (`<html:blockquote>`).
-  There is only one level of paragraph quoting; quoted paragraphs may
-    not be quoted again.
-
  - If every line in the paragraph begins with zero or more white·space
      characters followed by `|`, it is a “preformatted” paragraph and
      white·space is not collapsed (`<html:pre>`).
-  A paragraph may be both quoted and preformatted.
  
-- Otherwise, the paragraph is unquoted.
+- Otherwise, the paragraph is ordinary.
  
-After this classification, each quoted or unquoted paragraph is further
+After this classification, each ordinary paragraph is further
    classified by type based on its first character (which is must be
-   followed by white·space, or else the only thing on the line) :⁠—
+  followed by white·space, a pilcrow, or else the only thing on the
+  line) :⁠—
  
  - If the paragraph is preformatted, it is an ordinary paragraph.
  
@@ -170,24 +173,42 @@ After this classification, each quoted or unquoted paragraph is further
    Comments produce X·M·L comment nodes and can be used to break up list
      items into separate lists.
  
-- If the paragraph begins with `⋯`, it is a continuation paragraph
-    (`<html:div class="continuation">`).
-  Continuation paragraphs may be used to continue a preceding list item
-    or quote.
-  Note, however, that an unquoted paragraph cannot continue a quoted
-    one, or vice·versa.
+- If the paragraph begins with `⋯`, it is a continuation paragraph.
+  Continuation paragraphs may be used to continue a preceding div or
+    list item.
+  If there is no such preceding div or list item, they will attach to
+    adjacent heading elements to form heading groups (`<html:hgroup>`).
+  Otherwise, they will be treated as ordinary paragraphs.
  
  - Otherwise, it is an ordinary paragraph.
  
-Following this sigil (if any, including trailing white·space) there may
-  be a `¶` followed by zero or more non·white·space characters.
+Following this sigil (if any) there may be a `¶` followed by zero or
+  more non·white·space characters.
  The characters following the `¶` give the identifier for the paragraph,
    which is expected to be unique within a document.
+This may be suffixed with a language tag beginning with `@` and
+  terminated with `$`.
  
  The remaining characters in a paragraph form its contents.
  Markup within paragraphs is delimited with·out exception by pairs of
    characters, with the following precedence :⁠—
  
+- The characters `⌦` and `⌫` indicate inline comments.
+  A single character `⌧` may be used to indicate an “empty” comment
+    (consisting of `U+034F COMBINING GRAPHEME JOINER` for X·M·L
+    compatibility).
+
+- The characters `{@` and `"}` indicate attribute specifications.
+  The attribute specification must contain at least one `="` which
+    separates the key of the attribute from the value.
+  Attributes attach to the previous element or text node, with
+    white·space‐only text nodes after elements ignored; if there is no
+    such previous element or text node, an empty text node is used
+    instead.
+  Multiple attributes can be given in sequence using multiple
+    specifications.
+  Text nodes with attributes are wrapped in `<html:span>`.
+
  - The characters `{🔗` and `>}` indicate a hyperlink to a U·R·L
      (`<html:a>`).
    The hyperlink must contain at least one `<`; the content before the
@@ -205,32 +226,27 @@ Markup within paragraphs is delimited with·out exception by pairs of
  - The characters `⸨` and `⸩` indicate parenthetical content
      (`<html:small>`).
  
-- The characters `☞︎` and `☜︎` indicate strong importance
-    (`<html:strong>`).
-
-- The characters `⹐` and `⹑` indicate emphasis (`<html:em>`).
+- The characters `` ` `` and `´` indicate code (`<html:code>`).
  
  - The characters `⟪` and `⟫` indicate titles (`<html:cite>`).
  
+- The characters `⸶` and `⸷` indicate names (`<html:u class="name">`).
+
  - The characters `⟨` and `⟩` indicate offset text (`<html:i>`).
-  This may be followed by a `@`, a language tag, and a `$` to provide
-    the language of the text.
  
  - The characters `⦃` and `⦄` indicate keyword highlighting
      (`<html:b>`).
  
-- The characters `` ` `` and `´` indicate code (`<html:code>`).
+- The characters `☞︎` and `☜︎` indicate strong importance
+    (`<html:strong>`).
+
+- The characters `⹐` and `⹑` indicate emphasis (`<html:em>`).
  
  Once the tree is built as above, it is remediated into its final form
    by the following steps :⁠—
  
-- Successive quoted paragraphs are joined into one quote.
-  If the final quoted paragraph is an ordinary paragraph which begins
-    with `—` and a space, the quote is wrapped in a `<html:figure>`
-    and the final paragraph becomes its `<html:figcaption>`.
-
  - Continuation paragraphs are joined with the preceding list items or
-    quotes.
+    divs.
  
  - List items of a higher level are nested in preceding list items, when
      present.
@@ -241,10 +257,12 @@ Once the tree is built as above, it is remediated into its final form
  - Linebreaks in preformatted paragraphs are replaced with `<html:br>`.
  
  Finally, any character can be escaped by instead providing its Unicode
-  codepoint in the form `<U+NNNN>`, where `NNNN` is one or more
+  codepoint in the form `{U+NNNN}`, where `NNNN` is one or more
    hexadecimal digits.
  Multiple codepoints may be provided separated by periods, as in
-  `<U+WWWW.ZZZZ>`
+  `{U+WWWW.ZZZZ}`.
+Due to limitations in X·S·L·T, characters cannot be escaped in
+  attributes (including link targets).
  
  ## Usage
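
As a note on the final hunk above (not part of the patch), the new `{U+NNNN}` escape form could be expanded along the following lines. This is a minimal Python sketch under stated assumptions, not the project's X·S·L·T implementation: the name `expand_escapes` is hypothetical, hexadecimal digits are assumed to be accepted in either case, and values that are not valid codepoints are not handled.

```python
import re

# Matches {U+NNNN} or {U+WWWW.ZZZZ}: one or more groups of hexadecimal
# digits separated by periods. Accepting lower-case digits is an assumption.
ESCAPE = re.compile(r"\{U\+([0-9A-Fa-f]+(?:\.[0-9A-Fa-f]+)*)\}")


def expand_escapes(text: str) -> str:
    """Replace each escape with the characters its codepoints name."""

    def replace(match: re.Match) -> str:
        codepoints = match.group(1).split(".")
        return "".join(chr(int(cp, 16)) for cp in codepoints)

    return ESCAPE.sub(replace, text)


if __name__ == "__main__":
    print(expand_escapes("a{U+0042.0043}d"))
```

The old `<U+NNNN>` form is left untouched by this pattern, which matches the direction of the change. The README's caveat still applies: escapes inside attributes and link targets are not expanded by the real tool.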