1 # ⛩️📰 书社
2
3 <b>A make·file for X·M·L.</b>
4
5 <dfn>⛩️📰 书社</dfn> aims to make it easy to generate websites with
6 X·S·L·T and G·N·U Make.
7 It is consequently only a good choice for people who like X·S·L·T and
8 G·N·U Make and wish it were easier to make websites with them.
9
10 It makes things easier by :⁠—
11
12 - Automatically identifying source files and characterizing them by
13 type (X·M·L, text, or asset).
14
15 - Parsing supported text types into X·M·L trees.
16
17 - Enabling easy inclusion of source files within each other.
18
19 It aims to do this with zero dependencies beyond the programs already
20 installed on your computer.
21 (On Linux machines, you may need to install `libxml2-utils` to get the
22 commandline programs from `libxml2`.)
23
24 **Note:**
25 ⛩️📰 书社 requires functionality present in G·N·U Make 3.81 (or later)
26 and will not work in previous versions, or other implementations of
27 Make.
28 Compatibility with later versions of G·N·U Make is assumed, but not
29 tested.
30
31 ## Nomenclature
32
33 <i lang="cmn-Hans">书社</i> is a Chinese word meaning “publishing
34 house”.
35
36 The first character, <i lang="cmn-Hans">书</i>, is the simplified form
37 of the character meaning “document”.
38
39 The second character, <i lang="cmn-Hans">社</i>, contemporarily means
40 “association”, but historically referred to the god of the soil and
41 related altars or festivities.
42 In Japanese, it is an alternate spelling for <i lang="ja">やしろ</i>,
43 the word for “Shinto shrine”.
44
45 The name <i lang="cmn-Hans">书社</i> was chosen to play on this pun, as
46 it is intended as a publishing program for webshrines.
47
48 In Ascii environments, ⛩️📰 书社 should be written `Shushe`, following
49 the pinyin transliteration.
50
51 ## Basic Usage
52
53 Place source files in `sources/` and run `make install` to compile
54 the result to `public/`.
55 Compilation involves the following steps :⁠—
56
57 1. ⛩️📰 书社 compiles all of the magic files in `magic/` into a single
58 file, `build/magic.mgc`.
59
60 2. ⛩️📰 书社 processes all of the parsers in `parsers/` and determines
61 the list of supported plaintext types.
62
63 3. ⛩️📰 书社 identifies all of the source files and includes and uses
64 `build/magic.mgc` to classify them by media type.
65
66 4. ⛩️📰 书社 parses all plaintext and X·M·L source files and includes
67 and then builds a dependency tree between them.
68
69 5. ⛩️📰 书社 uses the dependency tree to establish prerequisites for
70 each output file.
71
72 6. ⛩️📰 书社 compiles each output file to `build/public`.
73
74 7. ⛩️📰 书社 copies the output files to `public`.
75
76 You can use `make list` to list each identified source file or include
77 alongside its computed type and dependencies.
78 As this is a Make‐based program, steps will only be run if the
79 corresponding buildfile or output file is older than its
80 prerequisites.
81
82 ## Namespaces
83
84 The ⛩️📰 书社 namespace is `urn:fdc:ladys.computer:20231231:Shu1She4`.
85
86 This document uses a few namespace prefixes, with the following
87 meanings :⁠—
88
89 | Prefix | Expansion |
90 | -------: | :----------------------------------------- |
91 | `html:` | `http://www.w3.org/1999/xhtml` |
92 | `xlink:` | `http://www.w3.org/1999/xlink` |
93 | `xslt:` | `http://www.w3.org/1999/XSL/Transform` |
94 | `书社:` | `urn:fdc:ladys.computer:20231231:Shu1She4` |
95
96 ## Setup and Configuration
97
98 ⛩️📰 书社 depends on the following programs to run.
99 In every case, you may supply your own implementation by overriding the
100 corresponding (allcaps) variable (e·g, set `MKDIR` to supply your own
101 `mkdir` implementation).
102
103 - `awk`
104 - `cat`
105 - `cp`
106 - `date`
107 - `echo`
108 - `file`
109 - `find`
110 - `git` (optional; set `GIT=` to disable)
111 - `mkdir` (requires support for `-p`)
112 - `mv`
113 - `od` (requires support for `-t x1`)
114 - `printf`
115 - `rm`
116 - `sed`
117 - `sleep`
118 - `stat`
119 - `test`
120 - `touch`
121 - `tr` (requires support for `-d`)
122 - `uuencode` (requires support for `-m` and `-r`)
123 - `xargs` (requires support for `-0`)
124 - `xmlcatalog` (provided by `libxml2`)
125 - `xmllint` (provided by `libxml2`)
126 - `xsltproc` (provided by `libxslt`)
127
128 The following additional variables can be used to control the behaviour
129 of ⛩️📰 书社 :⁠—
130
131 - **`SRCDIR`:**
132 The location of the source files (default: `sources`).
133 Multiple source directories can be provided, so long as the same
134 file subpath doesn’t exist in more than one of them.
135
136 - **`INCLUDEDIR`:**
137 The location of source includes (default: `sources/includes`).
138 This can be inside of `SRCDIR`, but needn’t be.
139 Multiple include directories can be provided, so long as the same
140 file subpath doesn’t exist in more than one of them.
141
142 - **`BUILDDIR`:**
143 The location of the (temporary) build directory (default: `build`).
144 `make clean` will delete this, and it is recommended that it not be
145 used for programs aside from ⛩️📰 书社.
146
147 - **`DESTDIR`:**
148 The location of the directory to output files to (default: `public`).
149 `make install` will overwrite files in this directory which
150 correspond to those in `SRCDIR`.
151 It *will not* touch other files, including those generated from files
152 in `SRCDIR` which have since been deleted.
153
154 Files are first compiled to `$(BUILDDIR)/public` before they are
155 copied to `DESTDIR`, so the output folder is relatively quick and
156 inexpensive to re·create.
157 It’s reasonable to simply delete it before every `make install` to
158 ensure stale content is removed.
159
160 - **`THISDIR`:**
161 The location of the ⛩️📰 书社 `GNUmakefile`.
162 This should be set automatically when calling Make and shouldn’t ever
163 need to be set manually.
164 This variable is used to find the ⛩️📰 书社 `lib/` folder, which is
165 expected to be in the same location.
166
167 - **`MAGICDIR`:**
168 The location of the magic files to use (default: `$(THISDIR)/magic`).
169
170 - **`FINDRULES`:**
171 Rules to use with `find` when searching for source files.
172 The default ignores files that start with a period or hyphen‐minus
173 and those which contain a hash, buck, percent, asterisk, colon,
174 semi, eroteme, bracket, backslash, or pipe.
175
176 - **`EXTRAFINDRULES`:**
177 The value of this variable is appended to `FINDRULES` by default, to
178 enable additional rules without overriding the existing ones.
179
180 - **`FINDINCLUDERULES`:**
181 Rules to use with `find` when searching for includes (default:
182 `$(FINDRULES)`).
183
184 - **`EXTRAFINDINCLUDERULES`:**
185 The value of this variable is appended to `FINDINCLUDERULES` by
186 default, to enable additional rules without overriding the existing
187 ones.
188
189 - **`PARSERS`:**
190 A white·space‐separated list of parsers to use (default:
191 `$(THISDIR)/parsers/*.xslt`).
192
193 - **`EXTRAPARSERS`:**
194 The value of this variable is appended to `PARSERS` by default, to
195 enable additional parsers without overriding the existing ones.
196
197 - **`TRANSFORMS`:**
198 A white·space‐separated list of transforms to use (default:
199 `$(THISDIR)/transforms/*.xslt`).
200
201 - **`EXTRATRANSFORMS`:**
202 The value of this variable is appended to `TRANSFORMS` by default, to
203 enable additional transforms without overriding the existing ones.
204
205 - **`XMLTYPES`:**
206 A white·space‐separated list of media types to consider X·M·L
207 (default: `application/xml text/xml`).
208
209 - **`VERBOSE`:**
210 If this variable has a value, every recipe instruction will be
211 printed when it runs (default: empty).
212 This is helpful for debugging, but typically too noisy for general
213 usage.
214
215 ## Source Files
216
217 Source files may be placed in `SRCDIR` in any manner; the file
218 structure used there will match the output.
219 The type of source files is *not* determined by file extension, but
220 rather by magic number; this means that files **must** begin with
221 something recognizable.
222 Supported magic numbers include :⁠—
223
224 - `<?xml` for `application/xml` files
225 - `#!js` for `text/javascript` files
226 - `@charset "` for `text/css` files
227 - `#!tsv` for `text/tab-separated-values` files
228 - `%%` for `text/record-jar` files (unregistered; see
229 [[draft-phillips-record-jar-01][]])
230
231 Text formats with associated X·S·L·T parsers are wrapped in an H·T·M·L
232 `<script>` element whose `@type` gives its media type, and then
233 passed to the parser to process.
234 Source files whose media type does not have an associated X·S·L·T
235 parser are considered “assets” and will not be transformed.
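
As a rough illustration of the wrapping step, a tab‐separated source file might reach its parser looking something like the sketch below. The column names and rows are invented, and whether the `#!tsv` line and the surrounding white·space are preserved exactly this way is an assumption, not documented behaviour.

```xml
<html:script
	xmlns:html="http://www.w3.org/1999/xhtml"
	type="text/tab-separated-values"
>#!tsv
name	colour
torii	red
lantern	grey
</html:script>
```

The `@type` value is what a parser template matches on, which is why the media type has to be recognizable from the start of the file.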
236
237 **☡ For compatibility with this program, source file·names must not
238 contain Ascii white·space, colons (`:`), semis (`;`), pipes (`|`),
239 bucks (`$`), percents (`%`), hashes (`#`), asterisks (`*`), brackets
240 (`[` or `]`), erotemes (`?`), backslashes (`\`), or control
241 characters, and must not begin with a hyphen‐minus (`-`).**
242 The former characters have the potential to conflict with make syntax,
243 and a leading hyphen‐minus is confusable for a command‐line argument.
244
245 ## Parsers
246
247 Parsers are used to convert plaintext files into X·M·L trees, as well
248 as convert plaintext formats which are already included inline in
249 existing source X·M·L documents.
250 ⛩️📰 书社 comes with some parsers; namely :⁠—
251
252 - **`parsers/plain.xslt`:**
253 Wraps `text/plain` contents in a `<html:pre class="plain">` element.
254
255 - **`parsers/record-jar.xslt`:**
256 Converts `text/record-jar` contents into a
257 `<html:div class="record-jar">` of `<html:dl>` elements (one for
258 each record).
259
260 - **`parsers/tsv.xslt`:**
261 Converts `text/tab-separated-values` contents into an
262 `<html:table class="tsv">` element.
263
264 New ⛩️📰 书社 parsers which target plaintext formats should have an
265 `<xslt:template>` element with no `@name` or `@mode` and whose
266 `@match` attribute…
267
268 - Starts with an appropriately‐namespaced qualified name for a
269 `<html:script>` element.
270
271 - Follows this with the string `[@type=`.
272
273 - Follows this with a quoted string giving a media type supported by
274 the parser.
275 Media type parameters are *not* supported.
276
277 - Follows this with the string `]`.
278
279 For example, the trivial `text/plain` parser is defined as follows :⁠—
280
```xml
<?xml version="1.0"?>
<transform
	xmlns="http://www.w3.org/1999/XSL/Transform"
	xmlns:html="http://www.w3.org/1999/xhtml"
	xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
	version="1.0"
>
	<书社:id>example:text/plain</书社:id>
	<template match="html:script[@type='text/plain']">
		<html:pre><value-of select="."/></html:pre>
	</template>
</transform>
```
295
296 ⛩️📰 书社 will scan the provided parsers for this pattern to determine
297 the set of allowed plaintext file types.
298 Multiple such `<xslt:template>` elements may be provided in a single
299 parser, for example if the parser supports multiple media types.
300 Alternatively, you can set the `@书社:supported-media-types` attribute
301 on the root element of the parser to override media type support
302 detection.
303
304 Even when `@书社:supported-media-types` is set, it is a requirement
305 that each parser transform any `<html:script>` elements with a
306 `@type` which matches their registered types into something else.
307 Otherwise the parser will be stuck in an endless loop.
308 The result tree of applying the transform to the `<html:script>`
309 element will be reparsed (in case any new `<html:script>` elements
310 were added in its subtree), and a `@书社:parsed-by` attribute will be
311 added to each toplevel element in the result.
312 The value of this attribute will be the value of the `<书社:id>`
313 toplevel element in the parser.
314
315 It is possible for parsers to support zero plaintext types.
316 This is useful when targeting specific dialects of X·M·L; parsers in
317 this sense operate on the same basic principles as transforms
318 (described below).
319 The major distinction between X·M·L parsers and transforms is where in
320 the process the transformation happens:
321 Parsers are applied *prior* to embedding (and can be used to generate
322 embeds); transforms are applied *after*.
323
324 It is **strongly recommended** that auxiliary templates in parsers be
325 namespaced (by `@name` or `@mode`) whenever possible, to avoid
326 conflicts between parsers.
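
The sketch below shows one way these recommendations might look in practice: a parser for a hypothetical `text/x-lines` media type that turns each non‐empty line into a list item. The media type, the `eg:` prefix and its namespace, the `class` value, and the `<书社:id>` value are all invented for illustration. For whole source files of this type to be recognized, a matching magic file would presumably also be needed, and if the format began with a magic line that line would become a list item here.

```xml
<?xml version="1.0"?>
<transform
	xmlns="http://www.w3.org/1999/XSL/Transform"
	xmlns:html="http://www.w3.org/1999/xhtml"
	xmlns:eg="urn:example:lines"
	xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
	exclude-result-prefixes="eg 书社"
	version="1.0"
>
	<书社:id>example:text/x-lines</书社:id>
	<!-- Entry point: an unnamed, modeless template matching the script element by type. -->
	<template match="html:script[@type='text/x-lines']">
		<html:ul class="lines">
			<call-template name="eg:split">
				<with-param name="text" select="string(.)"/>
			</call-template>
		</html:ul>
	</template>
	<!-- Auxiliary template, namespaced by its name to avoid clashing with other parsers. -->
	<template name="eg:split">
		<param name="text"/>
		<!-- Take everything up to the first newline (or the whole text if there is none). -->
		<variable name="line" select="substring-before(concat($text, '&#xA;'), '&#xA;')"/>
		<if test="$line != ''">
			<html:li><value-of select="$line"/></html:li>
		</if>
		<!-- Recurse on the remainder while newlines are left. -->
		<if test="contains($text, '&#xA;')">
			<call-template name="eg:split">
				<with-param name="text" select="substring-after($text, '&#xA;')"/>
			</call-template>
		</if>
	</template>
</transform>
```

Because the result contains no `<html:script>` of the same type, reparsing the output cannot loop back into this template.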
327
328 ## Embedding
329
330 Documents can be embedded in other documents using a `<书社:link>`
331 element with `@xlink:show="embed"`.
332 The `@xlink:href`s of these elements should have the format
333 `about:shushe?source=<path>`, where `<path>` provides the path to the
334 file within `SRCDIR`.
335 Includes, which do not generate outputs of their own but may still be
336 freely embedded, instead use the format
337 `about:shushe?include=<path>`, where `<path>` provides the path
338 within `INCLUDEDIR`.
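
A minimal sketch of both link formats follows. The root element and the two file paths are placeholders, and it is an assumption that paths are written relative to `SRCDIR` or `INCLUDEDIR` without a leading slash.

```xml
<?xml version="1.0"?>
<html:article
	xmlns:html="http://www.w3.org/1999/xhtml"
	xmlns:xlink="http://www.w3.org/1999/xlink"
	xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
	<html:h1>Example</html:h1>
	<!-- Embeds a source file, which also produces its own output. -->
	<书社:link xlink:show="embed" xlink:href="about:shushe?source=notes/intro.xml"/>
	<!-- Embeds an include, which produces no output of its own. -->
	<书社:link xlink:show="embed" xlink:href="about:shushe?include=signoff.xml"/>
</html:article>
```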
339
340 Embeds are replaced with the parsed contents of a file, unless the file
341 is an asset, in which case an `<html:object>` element is produced
342 instead (with the contents of the asset file provided as a base64
343 `data:` u·r·i).
344
345 Embedding takes place after parsing but before transformation, so
346 parsers are able to generate their own embeds.
347 ⛩️📰 书社 is able to detect the transitive embed dependencies of files
348 and update them accordingly; it will signal an error if the
349 dependencies are recursive.
350
351 ## Output Redirection
352
353 By default, ⛩️📰 书社 installs files to the same location in `DESTDIR`
354 as they were placed in their `SRCDIR`.
355 This behaviour can be customized by setting the `@书社:destination`
356 attribute on the root element, whose value can give a different path.
357 This attribute is read after parsing, but before transformation (where
358 it is silently dropped).
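
For instance, a source file might request a different output location as sketched below. The path is a placeholder, and it is an assumption that the value is interpreted relative to `DESTDIR`.

```xml
<?xml version="1.0"?>
<html:article
	xmlns:html="http://www.w3.org/1999/xhtml"
	xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
	书社:destination="about/index.html"
>
	<html:h1>About</html:h1>
</html:article>
```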
359
360 ## Transforms
361
362 Transforms are used to convert X·M·L files into their final output,
363 after all necessary parsing and embedding has taken place.
364 ⛩️📰 书社 comes with some transforms; namely :⁠—
365
366 - **`transforms/attributes.xslt`:**
367 Applies transforms to the children of any `<书社:apply-attributes>`
368 elements, and then applies the attributes of the
369 `<书社:apply-attributes>` to each result child, replacing the
370 element with the result.
371 This is useful in combination with image embeds to apply alt‐text to
372 the resulting `<html:img>`.
373
374 - **`transforms/asset.xslt`:**
375 Converts `<html:object>` elements which correspond to recognized
376 media types into the appropriate H·T·M·L elements, and deletes
377 `<html:style>` elements from the body of the document and moves
378 them to the head.
379
380 - **`transforms/metadata.xslt`:**
381 Provides basic `<html:head>` metadata.
382 This metadata is generated from `<html:meta>` elements with one of
383 the following `@itemprop` attributes :⁠—
384
385 - **`urn:fdc:ladys.computer:20231231:Shu1She4:title`:**
386 Provides the title of the page.
387
388 ⛩️📰 书社 automatically encapsulates embeds so that their metadata
389 does not propagate up to the embedding document.
390 To undo this behaviour, remove the `@itemscope` and `@itemtype`
391 attributes from the embed during the transformation phase.
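
The following source file sketches how the bundled transforms might be used together: a title supplied for `transforms/metadata.xslt`, and alt‐text applied to an embedded image via `transforms/attributes.xslt`. The image path and text are placeholders. Supplying the title through a `@content` attribute is an assumption based on ordinary H·T·M·L microdata usage, as is the expectation that the embedded file is recognized as an image and ends up as an `<html:img>`.

```xml
<?xml version="1.0"?>
<html:article
	xmlns:html="http://www.w3.org/1999/xhtml"
	xmlns:xlink="http://www.w3.org/1999/xlink"
	xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
	<!-- Page title, read by the metadata transform. -->
	<html:meta
		itemprop="urn:fdc:ladys.computer:20231231:Shu1She4:title"
		content="Shrine Photos"
	/>
	<!-- The alt attribute is copied onto whatever the embed becomes. -->
	<书社:apply-attributes alt="A red torii gate at dusk">
		<书社:link xlink:show="embed" xlink:href="about:shushe?source=images/torii.png"/>
	</书社:apply-attributes>
</html:article>
```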
392
393 The following are recommendations on effective creation of
394 transforms :⁠—
395
396 - Make template matchers as specific as possible.
397 It is likely an error if two transforms have templates which match
398 the same element (unless the templates have different priority).
399
400 - Namespace templates (with `@name` or `@mode`) whenever possible.
401
402 - Set `@exclude-result-prefixes` on the root `xslt:transform` element
403 to reduce the number of declared namespaces in the final result.
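
A small transform following these recommendations might look like the sketch below. The `eg:note` element, its namespace, and the `<书社:id>` value are hypothetical; the point is the narrow `@match` and the `@exclude-result-prefixes` declaration.

```xml
<?xml version="1.0"?>
<transform
	xmlns="http://www.w3.org/1999/XSL/Transform"
	xmlns:html="http://www.w3.org/1999/xhtml"
	xmlns:eg="urn:example:notes"
	xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
	exclude-result-prefixes="eg 书社"
	version="1.0"
>
	<书社:id>example:notes.xslt</书社:id>
	<!-- Matches only elements in this transform’s own namespace. -->
	<template match="eg:note">
		<html:aside class="note">
			<!-- Children are handed back to the other transforms. -->
			<apply-templates/>
		</html:aside>
	</template>
</transform>
```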
404
405 ## Global Params
406
407 The following params are made available globally in parsers and
408 transforms :⁠—
409
410 - **`BUILDTIME`:**
411 The current time.
412
413 - **`SRCREV`:**
414 The tag or hash of the current commit in the working directory (if
415 `GIT` is defined and `./.git` exists).
416
417 - **`SRCTIME`:**
418 The time at which the source file was last modified.
419
420 - **`VERSION`:**
421 The tag or hash of the current commit in `THISDIR` (if `GIT` is
422 defined and `$(THISDIR)/.git` exists).
423
424 The following params are only available in transforms :⁠—
425
426 - **`CATALOG`:**
427 The path of the catalog file (within `BUILDDIR`).
428
429 - **`PATH`:**
430 The path of the output file (within `DESTDIR`).
431
432 ## Output Wrapping
433
434 ⛩️📰 书社 will wrap the final output of the transforms in appropriate
435 `<html:html>` and `<html:body>` elements, so it is not necessary for
436 transforms to do this explicitly.
437 After performing the initial transform, ⛩️📰 书社 will match the root
438 node of the result in the following modes to fill in areas of the
439 wrapper :⁠—
440
441 - **`书社:header`:**
442 The result of matching in this mode is prepended into the
443 `<html:body>` of the output (before the transformation result).
444
445 - **`书社:footer`:**
446 The result of matching in this mode is appended into the
447 `<html:body>` of the output (after the transformation result).
448
449 - **`书社:metadata`:**
450 The result of matching in this mode is inserted into the
451 `<html:head>` of the output.
452
453 In addition to being called with the transform result, each of these
454 modes will also be called with an `<xslt:include>` element
455 corresponding to each transform.
456 If a transform has a `<书社:id>` top‐level element whose value is an
457 i·r·i, its `<xslt:include>` element will have a corresponding
458 `@书社:id` attribute.
459 This mechanism can be used to allow transforms to insert content
460 without matching any elements in the result; for example, the
461 following transform adds a link to a stylesheet to the `<html:head>`
462 of every page :⁠—
463
```xml
<?xml version="1.0"?>
<transform
	xmlns="http://www.w3.org/1999/XSL/Transform"
	xmlns:html="http://www.w3.org/1999/xhtml"
	xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
	xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
	exclude-result-prefixes="书社"
	version="1.0"
>
	<书社:id>example:add-stylesheet-links.xslt</书社:id>
	<template match="xslt:include[@书社:id='example:add-stylesheet-links.xslt']" mode="书社:metadata">
		<html:link rel="stylesheet" type="text/css" href="/style.css"/>
	</template>
</transform>
```
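
The same pattern should carry over to the other modes. As a sketch, the transform below uses the `书社:footer` mode to append a footer to the `<html:body>` of every page; the `<书社:id>` value and the footer text are placeholders.

```xml
<?xml version="1.0"?>
<transform
	xmlns="http://www.w3.org/1999/XSL/Transform"
	xmlns:html="http://www.w3.org/1999/xhtml"
	xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
	xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
	exclude-result-prefixes="书社"
	version="1.0"
>
	<书社:id>example:add-footer.xslt</书社:id>
	<!-- Matches this transform’s own include element, in footer mode only. -->
	<template match="xslt:include[@书社:id='example:add-footer.xslt']" mode="书社:footer">
		<html:footer>
			<html:p>Published with ⛩️📰 书社.</html:p>
		</html:footer>
	</template>
</transform>
```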
480
481 Output wrapping can be entirely disabled by adding a
482 `@书社:disable-output-wrapping` attribute to the top‐level element in
483 the result tree.
484
485 ## License
486
487 Source files are licensed under the terms of the <cite>Mozilla Public
488 License, version 2.0</cite>.
489 For more information, see [LICENSE](./LICENSE).
490
491 [draft-phillips-record-jar-01]: <https://datatracker.ietf.org/doc/html/draft-phillips-record-jar-01>