1 <!--
2 SPDX-FileCopyrightText: 2024, 2025 Lady <https://www.ladys.computer/about/#lady>
3 SPDX-License-Identifier: CC0-1.0
4 -->
5 # ⛩📰 书社
6
7 <b>A make·file for X·M·L.</b>
8
9 <dfn>⛩📰 书社</dfn> aims to make it easy to generate websites with
10 X·S·L·T and G·N·U Make.
11 It is consequently only a good choice for people who like X·S·L·T and
12 G·N·U Make and wish it were easier to make websites with them.
13
14 It makes things easier by :⁠—
15
16 - Automatically identifying source files and characterizing them by
17 type (X·M·L, text, or asset).
18
19 - Parsing supported text types into X·M·L trees.
20
21 - Enabling easy inclusion of source files within each other.
22
23 It aims to do this with zero dependencies beyond the programs already
24 installed on your computer†.
25
26 † Assuming an operating system with a fairly featureful, and
27 Posix‐compliant, development setup (e·g, Macintosh ≥ version 10.8).
28 In fact, on Linux you will probably need to install a few programs:
29 `libxml2-utils`, `xsltproc`, `sharutils`, and `pax`.
30
31 ## Nomenclature
32
33 <i lang="cmn-Hans">书社</i> is a Chinese word meaning “publishing
34 house”.
35
36 The first character, <i lang="cmn-Hans">书</i>, is the simplified form
37 of “document”.
38
39 The second character, <i lang="cmn-Hans">社</i>, contemporarily means
40 “association”, but historically referred to the god of the soil and
41 related altars or festivities.
42 In Japanese, it is an alternate spelling for <i lang="ja">やしろ</i>,
43 the word for “Shinto shrine”.
44
45 The name <i lang="cmn-Hans">书社</i> was chosen to play on this pun, as
46 it is intended as a publishing program for webshrines.
47
48 In Ascii environments, ⛩📰 书社 should be written `Shushe`, following
49 the pinyin transliteration.
50
51 ## Prerequisites
52
53 In most cases, ⛩📰 书社 aims to require only functionality which is
54 present in all Posix‐compliant (`POSIX.1-2001`) operating systems.
55 There are a few exceptions.
56 Details on particular programs are given below; if a program is not
57 listed, it is assumed that any Posix‐compliant implementation will
58 work.
59
60 ### `diff`
61
62 This is a Posix utility, but ⛩📰 书社 depends on functionality
63 introduced after `POSIX.1-2001` (the `-u` option, introduced in
64 `POSIX.1-2008`).
65 Macintosh systems somewhat interestingly implement this option
66 correctly in legacy mode (`COMMAND_MODE=legacy`) but incorrectly by
67 default (despite claiming `POSIX.1-2008` conformance for this
68 utility).
69 [Note this erroneous comment claiming nanosecond & timezone are
70 extensions rather than standardized.][rdar-92753335]
71 Despite this, the default Macintosh implementation will still work with
72 ⛩📰 书社, with the caveat that the timestamp will only include a
73 fractional component when a Posix⹀compliant (e·g, Macintosh legacy or
74 G·N·U) implementation is used.
75
76 ### `file`
77
78 This is a Posix utility, but it was considered optional in
79 `POSIX.1-2001` (altho it was made mandatory in `POSIX.1-2008`) and
80 ⛩📰 书社 currently depends on unspecified behaviour.
81 It requires support for the following additional options :⁠—
82
83 - **`-C`**, when supplied with `-m`, must be useable to compile a
84 `.mgc` magicfile for use with future invocations of `file`.
85
86 - **`--files-from`** must be useable to provide a file that `file`
87 should read file·names from, and `-` must be useable in this
88 context to specify the standard input.
89
90 - **`--mime-type`** must cause `file` to print the internet media type
91 of the file with no charset parameter.
92
93 - **`--separator`** must be useable to set the separator that `file`
94 uses to separate file names from types.
95
96 These options are implemented by the
97 [Fine Free File Command](https://darwinsys.com/file/), which is used
98 by most operating systems.
99
100 ### `git`
101
102 This is not a Posix utility.
103 Usage of `git` is optional, but recommended (and activated by default).
104 To disable it, set `GIT=`.
105
106 ### `make`
107
108 This is a Posix utility, but it is considered an optional Software
109 Development utility and ⛩📰 书社 currently depends on unspecified
110 behaviour.
111 ⛩📰 书社 requires specifically the G·N·U version of `make`, and
112 depends on functionality present in version 3.81 or later.
113 It is not expected to work in previous versions, or with other
114 implementations of Make.
115
116 ### `pax`
117
118 This is a Posix utility, but it is not included in the Linux Standard
119 Base or installed by default in many distributions.
120 ⛩📰 书社 only requires support for the `ustar` format.
121
122 ### `uudecode` and `uuencode`
123
These are Posix utilities, but they were considered optional in
`POSIX.1-2001` (altho they were made mandatory in `POSIX.1-2008`) and
they are not included in the Linux Standard Base or installed by
default in many distributions.
The G·N·U [Sharutils](https://www.gnu.org/software/sharutils/) package
provides one implementation.
130
131 ### `xmlcatalog` and `xmllint`
132
These are not Posix utilities.
They are a part of `libxml2`, but may need to be installed separately
on some platforms (e·g, by the name `libxml2-utils`).
136
137 ### `xsltproc`
138
139 This is not a Posix utility.
140 It is a part of `libxslt`, but may need to be installed separately on
141 some platforms.
142
143 ## Basic Usage
144
145 Place source files in `sources/` and run `make install` to compile
146 the result to `public/`.
147 Compilation involves the following steps :⁠—
148
149 1. ⛩📰 书社 compiles all of the magic files in `magic/` into a single
150 file, `build/magic.mgc`.
151
152 2. ⛩📰 书社 processes all of the parsers in `parsers/` and determines
153 the list of supported plaintext types.
154
155 3. ⛩📰 书社 identifies all of the source files and includes and uses
156 `build/magic.mgc` to classify them by media type.
157
158 4. ⛩📰 书社 parses all plaintext and X·M·L source files and includes
159 and then builds a dependency tree between them.
160
161 5. ⛩📰 书社 uses the dependency tree to establish prerequisites for
162 each output file.
163
164 6. ⛩📰 书社 compiles each output file to `build/result`.
165
166 7. ⛩📰 书社 copies most output files from `build/result` to
167 `build/public`, but it does some additional processing instead on
168 those which indicate a non‐X·M·L desired final output form.
169
170 8. ⛩📰 书社 copies the final resulting files to `public`.
171
172 You can use `make list` to list each identified source file or include
173 alongside its computed type and dependencies.
174 As this is a Make‐based program, steps will only be run if the
175 corresponding buildfile or output file is older than its
176 prerequisites.
177
178 ## Name·spaces
179
180 The ⛩📰 书社 name·space is `urn:fdc:ladys.computer:20231231:Shu1She4`.
181
182 This document uses a few name·space prefixes, with the following
183 meanings :⁠—
184
185 | Prefix | Expansion |
186 | ---------: | :-------------------------------------------- |
187 | `catalog:` | `urn:oasis:names:tc:entity:xmlns:xml:catalog` |
188 | `exsl:` | `http://exslt.org/common` |
189 | `exslstr:` | `http://exslt.org/strings` |
190 | `html:` | `http://www.w3.org/1999/xhtml` |
191 | `svg:` | `http://www.w3.org/2000/svg` |
192 | `xlink:` | `http://www.w3.org/1999/xlink` |
193 | `xslt:` | `http://www.w3.org/1999/XSL/Transform` |
194 | `书社:` | `urn:fdc:ladys.computer:20231231:Shu1She4` |
195
196 ## Setup and Configuration
197
198 ⛩📰 书社 depends on the following programs to run.
199 In every case, you may supply your own implementation by overriding the
200 corresponding (allcaps) variable (e·g, set `MKDIR` to supply your own
201 `mkdir` implementation).
202
203 - `awk`
204 - `cat`
205 - `cd`
206 - `cksum`
207 - `cp`
208 - `date`
209 - `diff`
210 - `file`
211 - `find`
212 - `git` (optional; set `GIT=` to disable)
213 - `grep`
214 - `ln`
215 - `mkdir`
216 - `mv`
217 - `od`
218 - `pax` (only when generating archives)
219 - `printf`
220 - `rm`
221 - `sed`
222 - `sleep`
223 - `test`
224 - `touch`
225 - `tr`
226 - `uuencode`
227 - `uudecode`
228 - `xargs`
229 - `xmlcatalog` (provided by `libxml2`)
230 - `xmllint` (provided by `libxml2`)
231 - `xsltproc` (provided by `libxslt`)
232
233 The following additional variables can be used to control the behaviour
234 of ⛩📰 书社 :⁠—
235
236 - **`SRCDIR`:**
237 The location of the source files (default: `sources`).
238 Multiple source directories can be provided, so long as the same
239 file subpath doesn’t exist in more than one of them.
240
241 - **`INCLUDEDIR`:**
242 The location of source includes (default: `sources/includes`).
243 This can be inside of `SRCDIR`, but needn’t be.
244 Multiple include directories can be provided, so long as the same
245 file subpath doesn’t exist in more than one of them.
246
247 - **`DATADIR`:**
248 If set to the location of a directory, ⛩📰 书社 will run a two‐stage build.
249 In the first stage, only files in `SRCDIR` which match `FINDDATARULES` (see below) will be built, with files in `DATADIR` serving as includes.
250 In the second stage, the remaining files in `SRCDIR` will be built, with the files built during the first stage, in addition to any files in `INCLUDEDIR`, serving as includes.
251 Files built during the first stage are copied into `DESTDIR` alongside those from the second stage when installing.
252
253 This functionality is intended for sites where the bulk of the site can be built from a few data files which are expensive to create.
254
255 - **`BUILDDIR`:**
256 The location of the (temporary) build directory (default: `build`).
257 `make clean` will delete this, and it is recommended that it not be
258 used for programs aside from ⛩📰 书社.
259
- **`DESTDIR`:**
  The location of the directory to output files to (default: `public`).
  `make install` will overwrite files in this directory which
  correspond to those in `SRCDIR`.
  It *will not* touch other files, including those generated from files
  in `SRCDIR` which have since been deleted.
266
267 Files are first compiled to `$(BUILDDIR)/public` before they are
268 copied to `DESTDIR`, so this folder is relatively quick and
269 inexpensive to re·create.
270 It’s reasonable to simply delete it before every `make install` to
271 ensure stale content is removed.
272
273 - **`THISDIR`:**
274 The location of the ⛩📰 书社 `GNUmakefile`.
275 This should be set automatically when calling Make and shouldn’t ever
276 need to be set manually.
277 This variable is used to find the ⛩📰 书社 `lib/` folder, which is
278 expected to be in the same location.
279
280 - **`MAGIC`:**
281 A white·space‐separated list of magic files to use (default:
282 `$(THISDIR)/magic/*`).
283
284 - **`EXTRAMAGIC`:**
285 The value of this variable is appended to `MAGIC` by default, to
286 enable additional magic files without overriding the existing ones.
287
288 - **`FINDRULES`:**
289 Rules to use with `find` when searching for source files.
290 The default ignores files that start with a period or hyphen‐minus,
291 those which end with a cloparen, and those which contain a hash,
292 buck, percent, asterisk, colon, semi, eroteme, bracket, backslash,
293 or pipe.
294 It is important that these rules not produce any output, as anything
295 printed to `stdout` will be considered a result of the find.
296
297 - **`EXTRAFINDRULES`:**
298 The value of this variable is appended to `FINDRULES` by default, to
299 enable additional rules without overriding the existing ones.
300
301 - **`FINDINCLUDERULES`:**
302 Rules to use with `find` when searching for includes (default:
303 `$(FINDRULES)`).
304
305 - **`EXTRAFINDINCLUDERULES`:**
306 The value of this variable is appended to `FINDINCLUDERULES` by
307 default, to enable additional rules without overriding the existing
308 ones.
309
310 - **`DATAOPTS`:**
311 Additional options to use when calling Make during the first stage of a two‐stage build using `DATADIR`.
312
313 This can be used to override variables which are only applicable during the second stage.
314 Note that when supplying this variable on the shell, it will need to be double‐quoted.
315
316 - **`DATAEXT`:**
317 A list of file extensions which signify “data” files during a two‐stage build using `DATADIR`.
318
319 - **`FINDDATARULES`:**
320 Rules to use with `find` when searching for data files.
321 By default, these rules are derived from `DATAEXT`.
322
323 - **`EXTRAFINDDATARULES`:**
324 The value of this variable is appended to `FINDDATARULES` by
325 default, to enable additional rules without overriding the existing
326 ones.
327
328 - **`FINDFILTERONLY`:**
329 A semicolon‐separated list of regular expressions, at least one of which the paths for sources and includes are required to match, unless empty (default: empty).
330
331 - **`FINDFILTEROUT`:**
332 A semicolon‐separated list of regular expressions, each of which matches paths that should _not_ be considered sources or includes (default: empty).
333
334 - **`FINDINCLUDEFILTERONLY`:**
335 A semicolon‐separated list of regular expressions, at least one of which the paths for includes are required to match, unless empty (default: empty).
336
337 Note that only paths which already match `FINDFILTERONLY` are considered.
338
339 - **`FINDINCLUDEFILTEROUT`:**
340 A semicolon‐separated list of regular expressions, each of which matches paths that should _not_ be considered includes, but may still be considered sources (default: empty).
341
342 - **`FINDFILTERONLYEXTENDED`:**
343 If non·empty, `FINDFILTERONLY` is an extended regular expression; otherwise, it is basic (default: empty).
344
345 - **`FINDFILTEROUTEXTENDED`:**
346 If non·empty, `FINDFILTEROUT` is an extended regular expression; otherwise, it is basic (default: matches `FINDFILTERONLYEXTENDED`).
347
348 - **`FINDINCLUDEFILTERONLYEXTENDED`:**
349 If non·empty, `FINDINCLUDEFILTERONLY` is an extended regular expression; otherwise, it is basic (default: matches `FINDFILTERONLYEXTENDED`).
350
351 - **`FINDINCLUDEFILTEROUTEXTENDED`:**
352 If non·empty, `FINDINCLUDEFILTEROUT` is an extended regular expression; otherwise, it is basic (default: `1` if either `FINDFILTEROUTEXTENDED` or `FINDINCLUDEFILTERONLYEXTENDED` is non·empty).
353
354 - **`PARSERS`:**
355 A white·space‐separated list of parsers to use (default:
356 `$(THISDIR)/parsers/*.xslt`).
357
358 - **`EXTRAPARSERS`:**
359 The value of this variable is appended to `PARSERS` by default, to
360 enable additional parsers without overriding the existing ones.
361
362 - **`PARSERLIBS`:**
363 A white·space‐separated list of parser dependencies (default:
364 `$(THISDIR)/lib/split.xslt`).
365
366 - **`EXTRAPARSERLIBS`:**
367 The value of this variable is appended to `PARSERLIBS` by default, to
368 enable additional parser dependencies without overriding the
369 existing ones.
370
371 - **`TRANSFORMS`:**
372 A white·space‐separated list of transforms to use (default:
373 `$(THISDIR)/transforms/*.xslt`).
374
375 - **`EXTRATRANSFORMS`:**
376 The value of this variable is appended to `TRANSFORMS` by default, to
377 enable additional transforms without overriding the existing ones.
378
379 - **`TRANSFORMLIBS`:**
380 A white·space‐separated list of transform dependencies (default:
381 `$(THISDIR)/lib/serialize.xslt`).
382
383 - **`EXTRATRANSFORMLIBS`:**
384 The value of this variable is appended to `TRANSFORMLIBS` by default,
385 to enable additional transform dependencies without overriding the
386 existing ones.
387
388 - **`XMLTYPES`:**
389 A white·space‐separated list of media types or media type suffixes to
390 consider X·M·L (default: `application/xml text/xml +xml`).
391
392 - **`FINALIZE`:**
393 A program to run on (unspecial) X·M·L files after they are
394 transformed (default: `xmllint --nonet --nsclean`).
395 This variable can be used for postprocessing.
396
397 - **`THISREV`:**
398 The current version of ⛩📰 书社 (default: derived from the current
399 git tag/branch/commit).
400
401 - **`SRCREV`:**
402 The current version of the source files (default: derived from the
403 current git tag/branch/commit).
404
405 - **`QUIET`:**
406 If this variable has a value, informative messages will not be
407 printed (default: empty).
408 Informative messages print to stderr, not stdout, so disabling them
409 usually shouldn’t be necessary.
410 This does not (cannot) disable messages from Make itself, for which
411 the `-s`, `--silent` ∕ `--quiet` Make option is more likely to be
412 useful.
413
414 - **`VERBOSE`:**
415 If this variable has a value, every recipe instruction will be
416 printed when it runs (default: empty).
417 This is helpful for debugging, but typically too noisy for general
418 usage.
419
420 ## Source Files
421
422 Source files may be placed in `SRCDIR` in any manner; the file
423 structure used there will match the output.
424 The type of source files is *not* determined by file extension, but
425 rather by magic number; this means that files **must** begin with
426 something recognizable.
427 Supported magic numbers include :⁠—
428
429 - `<?xml` for `application/xml` files
430 - `#!js` for `text/javascript` files
431 - `@charset "` for `text/css` files
432 - `#!tsv` for `text/tab-separated-values` files
433 - `%%` for `text/record-jar` files (unregistered; see
434 [[draft-phillips-record-jar-01][]])
435
436 Text formats with associated X·S·L·T parsers are wrapped in a H·T·M·L
437 `<script>` element whose `@type` gives its media type, and then
438 passed to the parser to process.
439 Source files whose media type does not have an associated X·S·L·T
440 parser are considered “assets” and will not be transformed.
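
As a rough illustration of the wrapping described above, a `text/tab-separated-values` source might reach its parser looking something like the following.
The sample rows are invented, and whether the `#!tsv` magic line is retained in the wrapped text is an assumption of this sketch, not something this document states.

```xml
<?xml version="1.0"?>
<!-- Sketch only: a TSV source after being wrapped for parsing. -->
<!-- Tabs are written as &#9; so that they survive copying. -->
<html:script
  xmlns:html="http://www.w3.org/1999/xhtml"
  type="text/tab-separated-values"
>#!tsv
name&#9;colour
apple&#9;red
plum&#9;purple
</html:script>
```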
441
442 **☡ For compatibility with this program, source file·names must not
443 contain Ascii white·space, colons (`:`), semis (`;`), pipes (`|`),
444 bucks (`$`), percents (`%`), hashes (`#`), asterisks (`*`), brackets
445 (`[` or `]`), erotemes (`?`), backslashes (`\`), or control
446 characters, must not begin with a hyphen‐minus (`-`), and must not end
447 with a cloparen (`)`).**
448 The former characters have the potential to conflict with make syntax,
449 a leading hyphen‐minus is confusable for a commandline argument, and a
450 trailing cloparen [activates a bug in G·N·U Make
451 3.81](https://stackoverflow.com/questions/17148468/capturing-filenames-including-parentheses-with-gnu-makes-wildcard-function#comment24825307_17148894).
452
453 ## Parsers
454
455 Parsers are used to convert plaintext files into X·M·L trees, as well
456 as convert plaintext formats which are already included inline in
457 existing source X·M·L documents.
458 ⛩📰 书社 comes with some parsers; namely :⁠—
459
460 - **`parsers/plain.xslt`:**
461 Wraps `text/plain` contents in a `<html:pre>` element.
462
463 - **`parsers/record-jar.xslt`:**
464 Converts `text/record-jar` contents into a `<html:div>` of
465 `<html:dl>` elements (one for each record).
466
467 - **`parsers/tsv.xslt`:**
468 Converts `text/tab-separated-values` contents into an `<html:table>`
469 element.
470
471 New ⛩📰 书社 parsers which target plaintext formats should have an
472 `<xslt:template>` element with no `@name` or `@mode` and whose
473 `@match` attribute…
474
475 - Starts with an appropriately‐name·spaced qualified name for a
476 `<html:script>` element.
477
478 - Follows this with the string `[@type=`.
479
480 - Follows this with a quoted string giving a media type supported by
481 the parser.
482 Media type parameters are *not* supported.
483
484 - Follows this with the string `]`.
485
486 For example, the trivial `text/plain` parser is defined as follows :⁠—
487
```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  version="1.0"
>
  <书社:id>example:text/plain</书社:id>
  <template match="html:script[@type='text/plain']">
    <html:pre><value-of select="."/></html:pre>
  </template>
</transform>
```
502
503 ⛩📰 书社 will scan the provided parsers for this pattern to determine
504 the set of allowed plaintext file types.
505 Multiple such `<xslt:template>` elements may be provided in a single
506 parser, for example if the parser supports multiple media types.
507 Alternatively, you can set the `@书社:supported-media-types` attribute
508 on the root element of the parser to override media type support
509 detection.
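
A minimal sketch of a single parser which supports two media types might look like the following.
Both media types (`text/x.example-log` and `text/x.example-notes`) and the parser identifier are hypothetical; for such files to be picked up at all, a magic file would presumably also need to identify them.

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  version="1.0"
>
  <书社:id>example:preformatted</书社:id>
  <!-- Each @match follows the scanned pattern:
       script element, [@type=, quoted media type, ]. -->
  <template match="html:script[@type='text/x.example-log']">
    <html:pre class="log"><value-of select="."/></html:pre>
  </template>
  <template match="html:script[@type='text/x.example-notes']">
    <html:pre class="notes"><value-of select="."/></html:pre>
  </template>
</transform>
```

Because both templates turn the `<html:script>` element into an `<html:pre>` element, neither leaves behind anything which would be parsed again.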
510
511 Even when `@书社:supported-media-types` is set, it is a requirement
512 that each parser transform any `<html:script>` elements with a
513 `@type` which matches their registered types into something else.
514 Otherwise the parser will be stuck in an endless loop.
515 The result tree of applying the transform to the `<html:script>`
516 element will be reparsed (in case any new `<html:script>` elements
517 were added in its subtree), and a `@书社:parsed-by` attribute will be
518 added to each toplevel element in the result.
519 The value of this attribute will be the value of the `<书社:id>`
520 toplevel element in the parser.
521
522 It is possible for parsers to support zero plaintext types.
523 This is useful when targeting specific dialects of X·M·L; parsers in
524 this sense operate on the same basic principles as transforms
525 (described below).
526 The major distinction between X·M·L parsers and transforms is where in
527 the process the transformation happens:
528 Parsers are applied *prior* to embedding (and can be used to generate
529 embeds); transforms are applied *after*.
530
It is **strongly recommended** that auxiliary templates in parsers be
name·spaced (by `@name` or `@mode`) whenever possible, to avoid
conflicts between parsers.
534
535 ### Attributes added during parsing
536
537 ⛩📰 书社 will add a few attributes to elements which result from
538 parsing plaintext `<html:script>` elements.
539 These include :⁠—
540
541 - A `@书社:parsed-by` attribute, giving a space‐separated list of
542 parsers which parsed the node.
543 (Generally, this will be a list of one, but it is possible for the
544 result of a parse to be another plaintext node, which may be parsed
545 by a different parser.)
546
547 - A `@书社:media-type` attribute, giving the identified media type of
548 the plaintext node.
549
550 ## Output Redirection
551
552 By default, ⛩📰 书社 installs files to the same location in `DESTDIR`
553 as they were placed in their `SRCDIR`.
554 This behaviour can be customized by setting the `@书社:destination`
555 attribute on the root element, whose value can give a different path.
556 This attribute is read after parsing, but before transformation (where
557 it is silently dropped).
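
For illustration, a source file might request a different output location as follows.
The file names are hypothetical, and this sketch assumes that the path is interpreted relative to `DESTDIR`, which this document does not state outright.

```xml
<?xml version="1.0"?>
<!-- Hypothetical source file: sources/drafts/about.xml -->
<html:article
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  书社:destination="about/index.html"
>
  <html:h1>About</html:h1>
  <html:p>This page is installed somewhere other than its source path.</html:p>
</html:article>
```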
558
559 ## Embedding
560
561 Documents can be embedded in other documents using a `<书社:link>`
562 element with `@xlink:show="embed"`.
563 The `@xlink:href`s of these elements should have the format
564 `about:shushe?source=<path>`, where `<path>` provides the path to the
565 file within `SRCDIR`.
566 Includes, which do not generate outputs of their own but may still be
567 freely embedded, instead use the format
568 `about:shushe?include=<path>`, where `<path>` provides the path
569 within `INCLUDEDIR`.
570
571 Embeds are replaced with the parsed contents of a file, unless the file
572 is an asset, in which case an `<html:object>` element is produced
573 instead (with the contents of the asset file provided as a base64
574 `data:` u·r·i).
575 Embed replacements will be given a `@书社:identifier` attribute whose
576 value will match the `@xlink:href` of the embed.
577
578 Embedding takes place after parsing but before transformation, so
579 parsers are able to generate their own embeds.
580 ⛩📰 书社 is able to detect the transitive embed dependencies of files
581 and update them accordingly; it will signal an error if the
582 dependencies are recursive.
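
A minimal sketch of a document with two embeds, one for a source file and one for an include, might look like this.
The file names are hypothetical, and the exact path syntax (for example, whether a leading slash is expected) is an assumption.

```xml
<?xml version="1.0"?>
<html:article
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
  <html:h1>Recipes</html:h1>
  <!-- Embeds a source file found at recipes/soup.xml within SRCDIR. -->
  <书社:link
    xlink:show="embed"
    xlink:href="about:shushe?source=recipes/soup.xml"
  />
  <!-- Embeds an include found at signature.xml within INCLUDEDIR. -->
  <书社:link
    xlink:show="embed"
    xlink:href="about:shushe?include=signature.xml"
  />
</html:article>
```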
583
584 ### Attributes added during expansion
585
586 ⛩📰 书社 will add a few attributes to toplevel result elements, both
587 in the main document and any embedded documents, during the expansion
588 phase prior to the main transformation.
589 These include :⁠—
590
591 - A `@书社:cksum` attribute giving the `cksum` checksum of the
592 corresponding source file.
593
594 - A `@书社:mtime` attribute giving the last modified time of the
595 corresponding source file.
596
597 - A `@书社:identifier` attribute giving the ⛩📰 书社 identifier
598 (i·e, starting with `about:shushe?`) of the corresponding source
599 file.
600
601 - For elements in the `html` namespace, an `itemscope` attribute and an
602 `itemtype` attribute with a value of
603 `urn:fdc:ladys.computer:20231231:Shu1She4:document` (for the main
604 document) or `urn:fdc:ladys.computer:20231231:Shu1She4:embed` (for
605 embedded documents).
606 These attributes are used to scope any nested `<html:meta>` elements
607 with `@itemprop` attributes to their containing documents.
608
609 ## Transforms
610
611 Transforms are used to convert X·M·L files into their final output,
612 after all necessary parsing and embedding has taken place.
613 ⛩📰 书社 comes with some transforms; namely :⁠—
614
615 - **`transforms/asset.xslt`:**
616 Converts `<html:object>` elements which correspond to recognized
617 media types into the appropriate H·T·M·L elements, and deletes
618 `<html:style>` elements from the body of the document and moves
619 them to the head.
620 This conversion happens during the finalization phase, after the main
621 transformation.
622
623 - **`transforms/metadata.xslt`:**
624 Provides basic `<html:head>` metadata.
625 This metadata is generated from `<html:meta>` elements with one of
626 the following `@itemprop` attributes :⁠—
627
628 - **`urn:fdc:ladys.computer:20231231:Shu1She4:title`:**
629 Provides the title of the page.
630
  ⛩📰 书社 automatically encapsulates H·T·M·L embeds so that their
  metadata does not propagate up to the embedding document.
  To undo this behaviour, remove the `@itemscope` and `@itemtype`
  attributes from the embed during the transformation phase.
635
636 - **`transforms/serialization.xslt`:**
637 Replaces `<书社:serialize-xml>` elements with the (escaped)
638 serialized X·M·L of their contents.
639 This replacement happens during the finalization phase, after most
640 other transformations have taken place.
641
642 If a `@with-namespaces` attribute is provided, any name·space nodes
643 on the toplevel serialized elements whose U·R·I’s correspond to the
644 definitions of the provided prefixes, as defined for the
645 `<书社:serialize-xml>` element, will be declared using name·space
646 attributes on the serialized elements.
647 Otherwise, only name·space nodes which _differ_ from the definitions
648 on the `<书社:serialize-xml>` element will be declared.
649 The string `#default` may be used to represent the default
650 name·space.
651 Multiple prefixes may be provided, separated by white·space.
652
653 When it comes to name·spaces used internally by ⛩📰 书社, the
654 prefix used by ⛩📰 书社 may be declared _in addition to_ the
655 prefix(es) used in the source document(s).
656 It is not possible to selectively only declare one prefix for a
657 name·space to the exclusion of others.
658
659 `<书社:raw-output>` elements may be used inside of
660 `<书社:serialize-xml>` elements to inject raw output into the
661 serialized X·M·L.
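
As a sketch of the serialization transform in use, the following document asks for a paragraph to be shown as escaped X·M·L source, with the `html` prefix declared on the serialized element.
It assumes that the `<书社:serialize-xml>` element reaches the finalization phase unchanged; the surrounding `<html:figure>` and `<html:pre>` elements are only illustrative.

```xml
<?xml version="1.0"?>
<html:figure
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
  <html:figcaption>A paragraph, as source code</html:figcaption>
  <!-- The contents of 书社:serialize-xml are replaced with their
       escaped serialization; @with-namespaces lists prefixes whose
       name·spaces should be declared on the serialized element. -->
  <html:pre><书社:serialize-xml with-namespaces="html"><html:p>Hello!</html:p></书社:serialize-xml></html:pre>
</html:figure>
```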
662
663 The following are recommendations on effective creation of
664 transforms :⁠—
665
666 - Make template matchers as specific as possible.
667 It is likely an error if two transforms have templates which match
668 the same element (unless the templates have different priority).
669
670 - Name·space templates (with `@name` or `@mode`) whenever possible.
671
672 - Set `@exclude-result-prefixes` on the root `xslt:transform` element
673 to reduce the number of declared name·spaces in the final result.
674
675 ## Global Params
676
677 The following params are made available globally in parsers and
678 transforms :⁠—
679
680 - **`BUILDTIME`:**
681 The current time.
682
683 - **`IDENTIFIER`:**
684 The ⛩📰 书社 identifier of the source file (a u·r·i beginning with
685 `about:shushe`).
686
687 - **`SRCREV`:**
688 The value of the `SRCREV` variable (if present).
689
690 - **`THISREV`:**
691 The value of the `THISREV` variable (if present).
692
693 ## Output Wrapping
694
Provided at least one toplevel result element belongs to the H·T·M·L
namespace, ⛩📰 书社 will wrap the final output of the transforms in
appropriate `<html:html>` and `<html:body>` elements, so it is not
necessary for transforms to do this explicitly.
If a toplevel result element _is_ a `<html:html>` or `<html:body>`
element, it will be merged with the one that ⛩📰 书社 creates.
Consequently, wrapping the result in a `<html:body>` element can be
used to enable wrapping for non‐H·T·M·L content, when desired.
703
704 As a part of this process, after performing the initial transform
705 ⛩📰 书社 will match in the following modes to fill in areas of the
706 wrapper :⁠—
707
708 - **`书社:header`:**
709 The result of matching in this mode is prepended into the
710 `<html:body>` of the output (before the transformation result).
711
712 - **`书社:footer`:**
713 The result of matching in this mode is appended into the
714 `<html:body>` of the output (after the transformation result).
715
716 - **`书社:metadata`:**
717 The result of matching in this mode is inserted into the
718 `<html:head>` of the output.
719
720 The document being matched will contain the full transform result
721 prior to wrapping as well as an `<书社:id>` element for each
722 transform.
723 The latter elements can be matched to enable transforms to provide
724 content _without_ matching any elements in the result; for example,
725 the following transform adds a link to a stylesheet to the
726 `<html:head>` of every page :⁠—
727
```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  exclude-result-prefixes="书社"
  version="1.0"
>
  <书社:id>example:add-stylesheet-links.xslt</书社:id>
  <template match="书社:id[string(.)='example:add-stylesheet-links.xslt']" mode="书社:metadata">
    <html:link rel="stylesheet" type="text/css" href="/style.css"/>
  </template>
</transform>
```
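
The same approach should work for the other modes.
As a sketch, the following transform (whose identifier and footer text are hypothetical) matches its own `<书社:id>` element in the `书社:footer` mode to append a footer to every wrapped page.

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  exclude-result-prefixes="书社"
  version="1.0"
>
  <书社:id>example:add-footer.xslt</书社:id>
  <!-- Matching in 书社:footer appends the result to html:body,
       after the transformation result. -->
  <template match="书社:id[string(.)='example:add-footer.xslt']" mode="书社:footer">
    <html:footer>
      <html:p>Made with ⛩📰 书社.</html:p>
    </html:footer>
  </template>
</transform>
```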
744
745 Output wrapping can be entirely disabled by adding a
746 `@书社:disable-output-wrapping` attribute to the top‐level element in
747 the result tree.
748 It will not be performed on outputs whose root elements are
749 `<书社:archive>`, `<书社:base64-binary>`, or `<书社:raw-text>`
750 (described below), or on result trees which do not contain a toplevel
751 element in the H·T·M·L namespace.
752
753 ## Applying Attributes
754
755 The `<书社:apply-attributes>` element will apply any attributes on the
756 element to the element(s) it wraps.
757 It is especially useful in combination with embeds.
758
759 The `<书社:apply-attributes-to-root>` element will apply any attributes
760 on the element to the root node of the final transformation result.
761 It is especially useful in combination with output wrapping.
762
763 In both cases, attributes from various sources are combined with
764 white·space between them.
765 Attribute application takes place after each stage of the
766 transformation, including after the initial embedding phase.
767
768 Both elements ignore attributes in the `xml:` name·space, except for
769 `@xml:lang`, which ignores all but the first definition (including
770 any already present on the root element).
771 On H·T·M·L and S·V·G elements, `@lang` has the same behaviour as
772 `@xml:lang`.
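
A minimal sketch of both elements in use follows.
The include name and class values are hypothetical, and using `<书社:apply-attributes-to-root>` as an empty element is an assumption; this document does not say whether it is expected to wrap anything.

```xml
<?xml version="1.0"?>
<html:section
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
  <!-- Applies class="sidebar" to whatever the embed expands to. -->
  <书社:apply-attributes class="sidebar">
    <书社:link
      xlink:show="embed"
      xlink:href="about:shushe?include=sidebar.xml"
    />
  </书社:apply-attributes>
  <!-- Applies class="has-sidebar" to the root node of the final
       transformation result. -->
  <书社:apply-attributes-to-root class="has-sidebar"/>
</html:section>
```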
773
774 ## Other Kinds of Output
775
776 There are a few special elements in the `书社:` name·space which, if
777 they appear as the toplevel element in a transformation result, cause
778 ⛩📰 书社 to produce something other than an X·M·L file.
779 They are :⁠—
780
781 - **`<书社:archive>`:**
782 Each child element with a `@书社:archived-as` attribute will be
783 archived as a separate file in a resulting tarball (this attribute
784 gives the file name).
785 These elements will be processed the same as the root elements of any
786 other file (e·g, they will be wrapped; they can themselves specify
787 non X·M·L output types, ⁊·c).
788 Other child elements will be ignored.
789
790 If the `<书社:archive>` element is given an `@书社:expanded`
791 attribute, rather than producing a tarball ⛩📰 书社 will output
792 the directory which expanding the tarball would produce.
793 This mechanism can be used to generate multiple files from a single
794 source, provided all of the files are contained with·in the same
795 directory.
796
797 - **`<书社:base64-binary>`:**
798 The text nodes in the transformation result will, after removing all
799 Ascii whitespace, be treated as a Base·64 string, which is then
800 decoded.
801
802 - **`<书社:raw-text>`:**
803 A plaintext (U·T·F‐8) file will be produced from the text nodes in
804 the transformation result.
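
As a sketch of plaintext output, the following transform turns a hypothetical `<example:robots>` source element into a `<书社:raw-text>` result.
The `example:` name·space, the `@disallow` attribute, and the transform identifier are all invented for illustration.

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:example="urn:example:shushe-readme"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  version="1.0"
>
  <书社:id>example:robots.xslt</书社:id>
  <template match="example:robots">
    <!-- Only the text nodes of the result end up in the output file;
         white·space between the instructions below is not kept. -->
    <书社:raw-text>
      <text>User-agent: *&#10;</text>
      <text>Disallow: </text>
      <value-of select="@disallow"/>
      <text>&#10;</text>
    </书社:raw-text>
  </template>
</transform>
```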
805
806 ## License
807
808 This repository conforms to [REUSE][].
809
810 Most source files are licensed under the terms of the <cite>Mozilla
811 Public License, version 2.0</cite>.
812
813 [REUSE]: <https://reuse.software/spec/>
814 [draft-phillips-record-jar-01]: <https://datatracker.ietf.org/doc/html/draft-phillips-record-jar-01>
815 [rdar-92753335]: <https://github.com/apple-oss-distributions/patch_cmds/blob/5084833f90df1b0e0924ea56f94c0199b3b8bbc6/diff/diffreg.c#L1800-L1808>