<!--
SPDX-FileCopyrightText: 2024 Lady <https://www.ladys.computer/about/#lady>
SPDX-License-Identifier: CC0-1.0
-->
# 💄📝 Les·M·L

<b>Ladys simple markup language.</b>

💄📝 Les·M·L is a document markup language designed with two goals in
mind :⁠—

1. It must be trivial to parse, even with limited tooling such as that
provided by X·S·L·T.

2. It must be sophisticated enough to handle longform hypertext
documents and associated metadata.

It is implemented as an X·S·L·T transformation from a
`<html:script type="text/lesml">` element into H·T·M·L
(`parser.xslt`).

## Nomenclature

<i>Les·M·L</i> is an abbreviation of the phrase “Ladys Extremely Simple
Markup Language”.

## Markup Syntax

The first line of any 💄📝 Les·M·L document should be the string
`#!lesml`.

Following the shebang, document metadata may be provided in the [Record
Jar][draft-phillips-record-jar-01] format.
The body of the document begins after the last line which begins with
the string `%%`, or after the shebang line if none exists.
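
As an illustration only, the split between metadata and body might be sketched in Python as follows.
This is not the X·S·L·T parser from this repository: the function name `split_document` is hypothetical, raising an error on a missing shebang is an assumption, and so is keeping the final `%%` line with the metadata.

```python
def split_document(source: str) -> tuple[str, str]:
    """Return (metadata, body) for a Les·M·L source string.

    Hypothetical helper: the name and the error handling are assumptions.
    """
    lines = source.split("\n")
    if lines[0] != "#!lesml":
        raise ValueError("expected the first line to be #!lesml")
    rest = lines[1:]
    # Index of the last line beginning with "%%", or -1 when there is none.
    last = max(
        (i for i, line in enumerate(rest) if line.startswith("%%")),
        default=-1,
    )
    # Everything up to and including that line is treated as metadata;
    # with no "%%" line the metadata is empty and the body starts
    # immediately after the shebang.
    metadata = "\n".join(rest[: last + 1])
    body = "\n".join(rest[last + 1 :])
    return metadata, body
```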

Documents are broken into paragraphs by blank lines.
Non·empty paragraphs are classified as follows :⁠—

- If the paragraph consists of only the characters
`#*-=_~⁂─━┄┅┈┉╌╍═╴╶╸╺☙❧` plus any amount of white·space, then it is
considered to be a section break (`<html:hr>`).

- If every line in the paragraph begins with at least one space, then
it is considered to be a quoted paragraph (`<html:blockquote>`).
There is only one level of paragraph quoting; quoted paragraphs may
not be quoted again.

- Otherwise, the paragraph is unquoted.
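
The classification above can be sketched in Python roughly as follows.
The names `split_paragraphs` and `classify` are hypothetical, and treating a line that contains only white·space as blank is an assumption; the actual parser may differ.

```python
BREAK_CHARS = set("#*-=_~⁂─━┄┅┈┉╌╍═╴╶╸╺☙❧")


def split_paragraphs(body: str) -> list[str]:
    """Split a document body into non-empty paragraphs at blank lines."""
    paragraphs, current = [], []
    for line in body.split("\n"):
        if line.strip():
            current.append(line)
        elif current:
            # A blank line closes the paragraph collected so far.
            paragraphs.append("\n".join(current))
            current = []
    if current:
        paragraphs.append("\n".join(current))
    return paragraphs


def classify(paragraph: str) -> str:
    """Classify a non-empty paragraph as 'break', 'quoted' or 'unquoted'."""
    # The section-break test comes first, matching the order of the rules.
    if all(ch in BREAK_CHARS or ch.isspace() for ch in paragraph):
        return "break"
    if all(line.startswith(" ") for line in paragraph.split("\n")):
        return "quoted"
    return "unquoted"
```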

After this classification, each quoted or unquoted paragraph is further
classified by type based on its first character (which must be
followed by white·space to be recognized) :⁠—

- If the paragraph begins with `⁌`, it is a chapter heading
(`<html:h1>`).

- If the paragraph begins with `§`, it is a section heading
(`<html:h2>`).

- If the paragraph begins with `✠`, it is a subsection heading
(`<html:h3>`).

- If the paragraph begins with `❦`, it is a subsubsection heading
(`<html:h4>`).

- If the paragraph begins with `•` or `🔢`, it is a primary unordered
or ordered list item (`<html:li class="unordered" data-level="1">`
or `<html:li class="ordered" data-level="1">`).

- If the paragraph begins with `◦` or `🔠`, it is a secondary unordered
or ordered list item (`<html:li class="unordered" data-level="2">`
or `<html:li class="ordered" data-level="2">`).
Secondary list items are considered to be nested inside of primary
list items which precede them.

- If the paragraph begins with `‣` or `🔡`, it is a tertiary unordered
or ordered list item (`<html:li class="unordered" data-level="3">`
or `<html:li class="ordered" data-level="3">`).
Tertiary list items are considered to be nested inside of primary
and secondary list items which precede them.

- If the paragraph begins with `⁃` or `🔣`, it is a quaternary
unordered or ordered list item
(`<html:li class="unordered" data-level="4">` or
`<html:li class="ordered" data-level="4">`).
Quaternary list items are considered to be nested inside of primary,
secondary, and tertiary list items which precede them.

- If the paragraph begins with `※`, it is an ordinary note
(`<html:div role="note" class="note">`).

- If the paragraph begins with `☡`, it is a cautionary note
(`<html:div role="note" class="caution">`).

- If the paragraph begins with `🛈`, it is an informative note
(`<html:div role="note" class="info">`).

- If the paragraph begins with `⯑`, it is a questioning note
(`<html:div role="note" class="query">`).

- If the paragraph begins with `⚠︎`, it is a warning note
(`<html:div role="note" class="warn">`).

- If the paragraph begins with `⋯`, it is a continuation paragraph
(`<html:div class="continuation">`).
Continuation paragraphs may be used to continue a preceding list item
or quote.
Note, however, that an unquoted paragraph cannot continue a quoted
one, or vice·versa.

- Otherwise, it is an ordinary paragraph.
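
A table lookup is one way to picture the sigil rules above.
The following Python sketch is illustrative only: `SIGILS`, `classify_sigil` and the type labels are hypothetical names, the warning sigil is assumed to be U+26A0 followed by the text variation selector U+FE0E, and the quote indentation is assumed to be stripped before the sigil is examined.

```python
SIGILS = {
    "⁌": "chapter-heading",
    "§": "section-heading",
    "✠": "subsection-heading",
    "❦": "subsubsection-heading",
    "•": "unordered-1", "🔢": "ordered-1",
    "◦": "unordered-2", "🔠": "ordered-2",
    "‣": "unordered-3", "🔡": "ordered-3",
    "⁃": "unordered-4", "🔣": "ordered-4",
    "※": "note",
    "☡": "caution",
    "🛈": "info",
    "⯑": "query",
    "\u26A0\uFE0E": "warn",  # assumed: warning sign + text variation selector
    "⋯": "continuation",
}


def classify_sigil(text: str) -> tuple[str, str]:
    """Return (type, remaining text) for one paragraph."""
    text = text.lstrip()  # assumption: drop any quote indentation first
    for sigil, kind in SIGILS.items():
        rest = text[len(sigil):]
        # A sigil only counts when white-space follows it.
        if text.startswith(sigil) and rest[:1].isspace():
            return kind, rest.lstrip()
    return "paragraph", text
```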

Following this sigil (if any, including trailing white·space) there may
be a `¶` followed by zero or more non·white·space characters.
The characters following the `¶` give the identifier for the paragraph,
which is expected to be unique within a document.
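
For illustration, extracting the identifier could look like the Python sketch below.
The helper name `take_identifier` is hypothetical, and discarding the white·space after the identifier is an assumption rather than documented behaviour.

```python
import re
from typing import Optional


def take_identifier(text: str) -> tuple[Optional[str], str]:
    """Return (identifier, remaining text) for text that follows the sigil."""
    # A pilcrow, then zero or more non-white-space characters.
    match = re.match(r"¶(\S*)\s*", text)
    if match is None:
        return None, text
    return match.group(1), text[match.end():]
```

Under these assumptions, `take_identifier("¶intro Welcome.")` yields the identifier `intro` and the contents `Welcome.`.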

The remaining characters in a paragraph form its contents.
Markup within paragraphs is delimited with·out exception by pairs of
characters, with the following precedence :⁠—

- The characters `{🔗` and `>}` indicate a hyperlink to a U·R·L
(`<html:a>`).
The hyperlink must contain at least one `<`; the content before the
last `<` gives the text of the link, and the content after gives
the U·R·L that the link points to.
If no text is given, the U·R·L will be used instead.

- The characters `⸠` and `⸡` indicate a strikethru (`<html:s>`).

- The characters `⸤` and `⸥` indicate underlining (`<html:u>`).

- The characters `⟦` and `⟧` indicate an inline note
(`<html:small role="note">`).

- The characters `⸨` and `⸩` indicate parenthetical content
(`<html:small>`).

- The characters `☞︎` and `☜︎` indicate strong importance
(`<html:strong>`).

- The characters `⹐` and `⹑` indicate emphasis (`<html:em>`).

- The characters `⟪` and `⟫` indicate titles (`<html:cite>`).

- The characters `⟨` and `⟩` indicate offset text (`<html:i>`).
This may be followed by a `@`, a language tag, and a `$` to provide
the language of the text.

- The characters `⦃` and `⦄` indicate keyword highlighting
(`<html:b>`).

- The characters `` ` `` and `´` indicate code (`<html:code>`).
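
As a small example of the hyperlink rule only, the Python sketch below turns `{🔗text<url>}` into an anchor element.
It is a simplification under stated assumptions: `render_links` is a hypothetical name, the output uses a plain `<a>` rather than a namespaced element, text outside links is left unescaped, and none of the other delimiter pairs or their precedence are handled.

```python
import re
from html import escape

# Shortest run of text between "{🔗" and the next ">}".
LINK = re.compile(r"\{🔗(.*?)>\}", re.DOTALL)


def render_links(text: str) -> str:
    """Replace Les·M·L hyperlinks in text with HTML anchors."""
    def replace(match: re.Match) -> str:
        # Split at the last "<": label before it, U·R·L after it.
        label, separator, url = match.group(1).rpartition("<")
        if not separator:
            return match.group(0)  # no "<" at all: leave the text untouched
        # An empty label falls back to the U·R·L itself.
        return '<a href="{}">{}</a>'.format(escape(url), escape(label or url))

    return LINK.sub(replace, text)
```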

Once the tree is built as above, it is remediated into its final form
by the following steps :⁠—

- Successive quoted paragraphs are joined into one quote.
If the final quoted paragraph is an ordinary paragraph which begins
with `—` and a space, the quote is wrapped in a `<html:figure>`
and the final paragraph becomes its `<html:figcaption>`.

- Continuation paragraphs are joined with the preceding list items or
quotes.

- List items of a higher level are nested in preceding list items, when
present.

- Successive list items of the same level and class are joined into
a single list.
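
The two list steps above might be pictured with the following Python sketch.
It covers a single run of consecutive list items only; the function name `nest_list_items` and the dictionary shape of its output are hypothetical, and quotes and continuation paragraphs are left out.

```python
def nest_list_items(items: list[tuple[str, int, str]]) -> list[dict]:
    """Group (class, level, text) list items into nested lists."""
    root: list[dict] = []                 # lists with no containing item
    stack: list[tuple[int, dict]] = []    # (level, item) chain of open items
    for kind, level, text in items:
        node = {"text": text, "children": []}
        # Close every open item at this level or deeper.
        while stack and stack[-1][0] >= level:
            stack.pop()
        # Nest inside the nearest preceding shallower item, when present.
        siblings = stack[-1][1]["children"] if stack else root
        if siblings and siblings[-1]["kind"] == kind:
            # Same level and class as the previous list: join it.
            siblings[-1]["items"].append(node)
        else:
            siblings.append({"kind": kind, "items": [node]})
        stack.append((level, node))
    return root
```

Given two primary unordered items followed by two secondary ordered items, this sketch places the ordered list inside the second unordered item.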

Finally, any character can be escaped by instead providing its Unicode
codepoint in the form `<U+NNNN>`, where `NNNN` is one or more
hexadecimal digits.
Multiple codepoints may be provided separated by periods, as in
`<U+WWWW.ZZZZ>`.
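
A minimal Python sketch of decoding these escapes follows.
The name `decode_escapes` is hypothetical, accepting lowercase hexadecimal digits is an assumption, and when the real parser applies this step relative to the other markup is not shown here.

```python
import re

# "<U+" followed by one or more period-separated runs of hex digits, then ">".
ESCAPE = re.compile(r"<U\+([0-9A-Fa-f]+(?:\.[0-9A-Fa-f]+)*)>")


def decode_escapes(text: str) -> str:
    """Replace each <U+NNNN> escape with the characters it names."""
    def replace(match: re.Match) -> str:
        codepoints = match.group(1).split(".")
        # chr() raises ValueError for values beyond U+10FFFF.
        return "".join(chr(int(cp, 16)) for cp in codepoints)

    return ESCAPE.sub(replace, text)
```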

## Usage

💄📝 Les·M·L is designed for usage with [⛩️📰 书社][Shushe].
Simply include the `parser.xslt` provided by this repository to
⛩️📰 书社 as an additional parser, and `magic` as an additional
magic file.

## License

This repository conforms to [REUSE][].

The parser is licensed under the terms of the <cite>Mozilla Public
License, version 2.0</cite>.

[REUSE]: <https://reuse.software/spec/>
[Shushe]: <https://git.ladys.computer/Shushe/>
[draft-phillips-record-jar-01]: <https://datatracker.ietf.org/doc/html/draft-phillips-record-jar-01>