   1 <!--
   2 SPDX-FileCopyrightText: 2024, 2025, 2026 Lady <https://www.ladys.computer/about/#lady>
   3 SPDX-License-Identifier: CC0-1.0
   4 -->
   5 # ⛩📰 书社
   6
   7 <b>A make·file for X·M·L.</b>
   8
   9 <dfn>⛩📰 书社</dfn> aims to make it easy to generate websites with
  10   X·S·L·T and G·N·U Make.
  11 It is consequently only a good choice for people who like X·S·L·T and
  12   G·N·U Make and wish it were easier to make websites with them.
  13
  14 It makes things easier by :⁠—
  15
  16 - Automatically identifying source files and characterizing them by
  17     type (X·M·L, text, or asset).
  18
  19 - Parsing supported text types into X·M·L trees.
  20
  21 - Enabling easy inclusion of source files within each other.
  22
  23 It aims to do this with zero dependencies beyond the programs already
  24   installed on your computer†.
  25
  26 † Assuming an operating system with a fairly featureful, and
  27   Posix‐compliant, development setup (e·g, Macintosh ≥ version 10.8).
  28 In fact, on Linux you will probably need to install a few programs:
  29   `libxml2-utils`, `xsltproc`, `sharutils`, and `pax`.
  30
  31 ## Nomenclature
  32
  33 <i lang="cmn-Hans">书社</i> is a Chinese word meaning “publishing
  34   house”.
  35
The first character, <i lang="cmn-Hans">书</i>, is the simplified form
  of the character for “document”.
  38
  39 The second character, <i lang="cmn-Hans">社</i>, contemporarily means
  40   “association”, but historically referred to the god of the soil and
  41   related altars or festivities.
  42 In Japanese, it is an alternate spelling for <i lang="ja">やしろ</i>,
  43   the word for “Shinto shrine”.
  44
  45 The name <i lang="cmn-Hans">书社</i> was chosen to play on this pun, as
  46   it is intended as a publishing program for webshrines.
  47
  48 In Ascii environments, ⛩📰 书社 should be written `Shushe`, following
  49   the pinyin transliteration.
  50
  51 ## Prerequisites
  52
  53 In most cases, ⛩📰 书社 aims to require only functionality which is
  54   present in all Posix‐compliant (`POSIX.1-2001`) operating systems.
  55 There are a few exceptions.
  56 Details on particular programs are given below; if a program is not
  57   listed, it is assumed that any Posix‐compliant implementation will
  58   work.
  59
  60 ### `diff`
  61
  62 This is a Posix utility, but ⛩📰 书社 depends on functionality
  63   introduced after `POSIX.1-2001` (the `-u` option, introduced in
  64   `POSIX.1-2008`).
  65 Macintosh systems somewhat interestingly implement this option
  66   correctly in legacy mode (`COMMAND_MODE=legacy`) but incorrectly by
  67   default (despite claiming `POSIX.1-2008` conformance for this
  68   utility).
  69 [Note this erroneous comment claiming nanosecond & timezone are
  70   extensions rather than standardized.][rdar-92753335]
  71 Despite this, the default Macintosh implementation will still work with
  72   ⛩📰 书社, with the caveat that the timestamp will only include a
  73   fractional component when a Posix‐compliant (e·g, Macintosh legacy or
  74   G·N·U) implementation is used.
  75
  76 ### `file`
  77
  78 This is a Posix utility, but it was considered optional in
  79   `POSIX.1-2001` (altho it was made mandatory in `POSIX.1-2008`) and
  80   ⛩📰 书社 currently depends on unspecified behaviour.
  81 It requires support for the following additional options :⁠—
  82
  83 - **`-C`**, when supplied with `-m`, must be useable to compile a
  84     `.mgc` magicfile for use with future invocations of `file`.
  85
  86 - **`--files-from`** must be useable to provide a file that `file`
  87     should read file·names from, and `-` must be useable in this
  88     context to specify the standard input.
  89
  90 - **`--mime-type`** must cause `file` to print the internet media type
  91     of the file with no charset parameter.
  92
  93 - **`--separator`** must be useable to set the separator that `file`
  94     uses to separate file names from types.
  95
  96 These options are implemented by the [Fine Free File Command][F3C],
  97   which is used by most operating systems.
  98
  99 ### `git`
 100
 101 This is not a Posix utility.
 102 Usage of `git` is optional, but recommended (and activated by default).
 103 To disable it, set `GIT=`.
 104
 105 ### `make`
 106
 107 This is a Posix utility, but it is considered an optional Software
 108   Development utility and ⛩📰 书社 currently depends on unspecified
 109   behaviour.
 110 ⛩📰 书社 requires specifically the G·N·U version of `make`, and
 111   depends on functionality present in version 3.81 or later.
 112 It is not expected to work in previous versions, or with other
 113   implementations of Make.
 114
 115 ### `pax`
 116
 117 This is a Posix utility, but it is not included in the Linux Standard
 118   Base or installed by default in many distributions.
 119 ⛩📰 书社 only requires support for the `ustar` format.
 120
 121 ### `uudecode` and `uuencode`
 122
These are Posix utilities, but they were considered optional in
  `POSIX.1-2001` (altho they were made mandatory in `POSIX.1-2008`) and
  they are not included in the Linux Standard Base or installed by
  default in many distributions.
 127 The G·N·U [Sharutils][] package provides one implementation.
 128
 129 ### `xmlcatalog` and `xmllint`
 130
These are not Posix utilities.
 132 They are a part of `libxml2`, but may need to be installed separately
 133   on some platforms (e·g by the name `libxml2-utils`).
 134
 135 ### `xsltproc`
 136
 137 This is not a Posix utility.
 138 It is a part of `libxslt`, but may need to be installed separately on
 139   some platforms.
 140
 141 ## Basic Usage
 142
 143 Place source files in `sources/` and run `make install` to compile
 144   the result to `public/`.
 145 Compilation involves the following steps :⁠—
 146
 147 1. ⛩📰 书社 compiles all of the magic files in `magic/` into a single
 148     file, `build/magic.mgc`.
 149
 150 2. ⛩📰 书社 processes all of the parsers in `parsers/` and determines
 151     the list of supported plaintext types.
 152
 153 3. ⛩📰 书社 identifies all of the source files and includes and uses
 154     `build/magic.mgc` to classify them by media type.
 155
 156 4. ⛩📰 书社 parses all plaintext and X·M·L source files and includes
 157     and then builds a dependency tree between them.
 158
 159 5. ⛩📰 书社 uses the dependency tree to establish prerequisites for
 160     each output file.
 161
 162 6. ⛩📰 书社 compiles each output file to `build/result`.
 163
 164 7. ⛩📰 书社 copies most output files from `build/result` to
 165      `build/public`, but it does some additional processing instead on
 166      those which indicate a non‐X·M·L desired final output form.
 167
 168 8. ⛩📰 书社 copies the final resulting files to `public`.
 169
 170 You can use `make list` to list each identified source file or include
 171   alongside its computed type and dependencies.
 172 As this is a Make‐based program, steps will only be run if the
 173   corresponding buildfile or output file is older than its
 174   prerequisites.
 175
 176 ## Name·spaces
 177
 178 The ⛩📰 书社 name·space is `urn:fdc:ladys.computer:20231231:Shu1She4`.
 179
 180 This document uses a few name·space prefixes, with the following
 181   meanings :⁠—
 182
 183 |     Prefix | Expansion                                     |
 184 | ---------: | :-------------------------------------------- |
 185 | `catalog:` | `urn:oasis:names:tc:entity:xmlns:xml:catalog` |
 186 |    `exsl:` | `http://exslt.org/common`                     |
 187 | `exslstr:` | `http://exslt.org/strings`                    |
 188 |    `html:` | `http://www.w3.org/1999/xhtml`                |
 189 |     `rdf:` | `http://www.w3.org/1999/02/22-rdf-syntax-ns#` |
 190 |     `svg:` | `http://www.w3.org/2000/svg`                  |
 191 |   `xlink:` | `http://www.w3.org/1999/xlink`                |
 192 |    `xslt:` | `http://www.w3.org/1999/XSL/Transform`        |
 193 |    `书社:` | `urn:fdc:ladys.computer:20231231:Shu1She4`    |
 194
 195 ## Setup and Configuration
 196
 197 ⛩📰 书社 depends on the following programs to run.
 198 In every case, you may supply your own implementation by overriding the
 199   corresponding (allcaps) variable (e·g, set `MKDIR` to supply your own
 200   `mkdir` implementation).
 201
 202 - `awk`
 203 - `cat`
 204 - `cd`
 205 - `cksum`
 206 - `cp`
 207 - `date`
 208 - `diff`
 209 - `file`
 210 - `find`
 211 - `git` (optional; set `GIT=` to disable)
 212 - `grep`
 213 - `ln`
 214 - `mkdir`
 215 - `mv`
 216 - `od`
 217 - `pax` (only when generating archives)
 218 - `printf`
 219 - `rm`
 220 - `sed`
 221 - `sleep`
 222 - `test`
 223 - `touch`
 224 - `tr`
 225 - `uuencode`
 226 - `uudecode`
 227 - `xargs`
 228 - `xmlcatalog` (provided by `libxml2`)
 229 - `xmllint` (provided by `libxml2`)
 230 - `xsltproc` (provided by `libxslt`)
 231
 232 The following additional variables can be used to control the behaviour
 233   of ⛩📰 书社 :⁠—
 234
 235 - **`SRCDIR`:**
 236   The location of the source files (default: `sources`).
 237   Multiple source directories can be provided, so long as the same
 238     file subpath doesn’t exist in more than one of them.
 239
 240 - **`INCLUDEDIR`:**
 241   The location of source includes (default: `sources/includes`).
 242   This can be inside of `SRCDIR`, but needn’t be.
 243   Multiple include directories can be provided, so long as the same
 244     file subpath doesn’t exist in more than one of them.
 245
 246 - **`DATADIR`:**
 247   If set to the location of a directory, ⛩📰 书社 will run a two‐stage
 248     build.
 249   In the first stage, only files in `SRCDIR` which match
 250     `FINDDATARULES` (see below) will be built, with files in `DATADIR`
 251     serving as includes.
 252   In the second stage, the remaining files in `SRCDIR` will be built,
 253     with the files built during the first stage, in addition to any
 254     files in `INCLUDEDIR`, serving as includes.
 255   Files built during the first stage are copied into `DESTDIR`
 256     alongside those from the second stage when installing.
 257
 258   This functionality is intended for sites where the bulk of the site
 259     can be built from a few data files which are expensive to create.
 260
 261 - **`BUILDDIR`:**
 262   The location of the (temporary) build directory (default: `build`).
 263   `make clean` will delete this, and it is recommended that it not be
 264     used for programs aside from ⛩📰 书社.
 265
 266 - **`DESTDIR`:**
  The location of the directory to output files to (default: `public`).
 268   `make install` will overwrite files in this directory which
 269     correspond to those in `SRCDIR`.
 270   It _will not_ touch other files, including those generated from files
 271     in `SRCDIR` which have since been deleted.
 272
 273   Files are first compiled to `$(BUILDDIR)/public` before they are
 274     copied to `DESTDIR`, so this folder is relatively quick and
 275     inexpensive to re·create.
 276   It’s reasonable to simply delete it before every `make install` to
 277     ensure stale content is removed, assuming copies are quick on your
 278     file·system.
 279
 280 - **`THISDIR`:**
 281   The location of the ⛩📰 书社 `GNUmakefile`.
  This should be set automatically when calling Make and shouldn’t ever
    need to be set manually.
 284   This variable is used to find the ⛩📰 书社 `lib/` folder, which is
 285     expected to be in the same location.
 286
 287 - **`MAGIC`:**
 288   A white·space‐separated list of magic files to use (default:
 289     `$(THISDIR)/magic/*`).
 290
 291 - **`EXTRAMAGIC`:**
 292   The value of this variable is appended to `MAGIC` by default, to
 293     enable additional magic files without overriding the existing
 294     ones (default: empty).
 295
 296 - **`FINDRULES`:**
 297   Rules to use with `find` when searching for source files.
 298   The default ignores files that start with a period or hyphen‐minus,
 299     those which end with a cloparen, and those which contain a hash,
 300     buck, percent, asterisk, colon, semi, eroteme, bracket, backslash,
 301     or pipe.
 302   It is important that these rules not produce any output, as anything
 303     printed to `stdout` will be considered a result of the find.
 304
 305 - **`EXTRAFINDRULES`:**
 306   The value of this variable is appended to `FINDRULES` by default, to
 307     enable additional rules without overriding the existing ones
 308     (default: empty).
 309
 310 - **`FINDINCLUDERULES`:**
 311   Rules to use with `find` when searching for includes (default:
 312     `$(FINDRULES)`).
 313
 314 - **`EXTRAFINDINCLUDERULES`:**
 315   The value of this variable is appended to `FINDINCLUDERULES` by
 316     default, to enable additional rules without overriding the existing
 317     ones (default: empty).
 318
 319 - **`DATAOPTS`:**
 320   Additional options to use when calling Make during the first stage of
 321     a two‐stage build using `DATADIR` (default: empty).
 322
 323   This can be used to override variables which are only applicable
 324     during the second stage.
 325   Note that when supplying this variable on the shell, it will need to
 326     be twice‐quoted.
 327
 328 - **`DATAEXT`:**
  A list of file extensions which signify “data” files during a
    two‐stage build using `DATADIR` (default: `rdf`).
 331
 332 - **`FINDDATARULES`:**
 333   Rules to use with `find` when searching for data files.
 334   By default, these rules are derived from `DATAEXT`.
 335
 336 - **`EXTRAFINDDATARULES`:**
 337   The value of this variable is appended to `FINDDATARULES` by
 338     default, to enable additional rules without overriding the existing
 339     ones (default: empty).
 340
 341 - **`FINDFILTERONLY`:**
 342   A semicolon‐separated list of regular expressions, at least one of
 343     which the paths for sources and includes are required to match,
 344     unless empty (default: empty).
 345
 346 - **`FINDFILTEROUT`:**
 347   A semicolon‐separated list of regular expressions, each of which
 348     matches paths that should _not_ be considered sources or includes
 349     (default: empty).
 350
 351 - **`FINDINCLUDEFILTERONLY`:**
 352   A semicolon‐separated list of regular expressions, at least one of
 353     which the paths for includes are required to match, unless empty
 354     (default: empty).
 355
 356   Note that only paths which already match `FINDFILTERONLY` are
 357     considered.
 358
 359 - **`FINDINCLUDEFILTEROUT`:**
 360   A semicolon‐separated list of regular expressions, each of which
 361     matches paths that should _not_ be considered includes, but may
 362     still be considered sources (default: empty).
 363
 364 - **`FINDFILTERONLYEXTENDED`:**
 365   If non·empty, `FINDFILTERONLY` is an extended regular expression;
 366     otherwise, it is basic (default: empty).
 367
 368 - **`FINDFILTEROUTEXTENDED`:**
 369   If non·empty, `FINDFILTEROUT` is an extended regular expression;
 370     otherwise, it is basic (default: matches `FINDFILTERONLYEXTENDED`).
 371
 372 - **`FINDINCLUDEFILTERONLYEXTENDED`:**
 373   If non·empty, `FINDINCLUDEFILTERONLY` is an extended regular
 374     expression; otherwise, it is basic (default: matches
 375     `FINDFILTERONLYEXTENDED`).
 376
 377 - **`FINDINCLUDEFILTEROUTEXTENDED`:**
 378   If non·empty, `FINDINCLUDEFILTEROUT` is an extended regular
 379     expression; otherwise, it is basic (default: `1` if either
 380     `FINDFILTEROUTEXTENDED` or `FINDINCLUDEFILTERONLYEXTENDED` is
 381     non·empty).
 382
 383 - **`PARSERS`:**
 384   A white·space‐separated list of parsers to use (default:
 385     `$(THISDIR)/parsers/*.xslt`).
 386
 387 - **`EXTRAPARSERS`:**
 388   The value of this variable is appended to `PARSERS` by default, to
 389     enable additional parsers without overriding the existing ones
 390     (default: empty).
 391
 392 - **`PARSERLIBS`:**
 393   A white·space‐separated list of parser dependencies (default:
 394     `$(THISDIR)/lib/split.xslt`).
 395
 396 - **`EXTRAPARSERLIBS`:**
 397   The value of this variable is appended to `PARSERLIBS` by default, to
 398     enable additional parser dependencies without overriding the
 399     existing ones (default: empty).
 400
 401 - **`TRANSFORMS`:**
 402   A white·space‐separated list of transforms to use (default:
 403     `$(THISDIR)/transforms/*.xslt`).
 404
 405 - **`EXTRATRANSFORMS`:**
 406   The value of this variable is appended to `TRANSFORMS` by default, to
 407     enable additional transforms without overriding the existing ones
 408     (default: empty).
 409
 410 - **`TRANSFORMLIBS`:**
 411   A white·space‐separated list of transform dependencies (default:
 412     `$(THISDIR)/lib/serialize.xslt`).
 413
 414 - **`EXTRATRANSFORMLIBS`:**
 415   The value of this variable is appended to `TRANSFORMLIBS` by default,
 416     to enable additional transform dependencies without overriding the
 417     existing ones (default: empty).
 418
 419 - **`XMLTYPES`:**
 420   A white·space‐separated list of media types or media type suffixes to
 421     consider X·M·L (default: `application/xml text/xml +xml`).
 422
 423 - **`FINALIZE`:**
 424   A program to run on (unspecial) X·M·L files after they are
 425     transformed (default: `xmllint --nonet --nsclean`).
 426   This variable can be used for postprocessing.
 427
 428 - **`THISREV`:**
 429   The current version of ⛩📰 书社 (default: derived from the current
 430     git tag/branch/commit).
 431
 432 - **`SRCREV`:**
 433   The current version of the source files (default: derived from the
 434     current git tag/branch/commit).
 435
 436 - **`QUIET`:**
 437   If this variable has a value, informative messages will not be
 438     printed (default: empty).
 439   Informative messages print to stderr, not stdout, so disabling them
 440     usually shouldn’t be necessary.
 441   This does not (cannot) disable messages from Make itself, for which
 442     the `-s`, `--silent` ∕ `--quiet` Make option is more likely to be
 443     useful.
 444
 445 - **`VERBOSE`:**
 446   If this variable has a value, every recipe instruction will be
 447     printed when it runs (default: empty).
 448   This is helpful for debugging, but typically too noisy for general
 449     usage.
 450
 451 ## Source Files
 452
 453 Source files may be placed in `SRCDIR` in any manner; the file
 454   structure used there will match the output.
 455 The type of source files is _not_ determined by file extension, but
 456   rather by magic number; this means that files **must** begin with
 457   something recognizable.
 458 Supported magic numbers include :⁠—
 459
 460 - `<?xml` for `application/xml` files
 461 - `#!js` for `text/javascript` files
 462 - `@charset "` for `text/css` files
 463 - `#!tsv` for `text/tab-separated-values` files
 464 - `%%` for `text/record-jar` files (unregistered; see
 465     [[draft-phillips-record-jar-01][]])
 466
 467 Text formats with associated X·S·L·T parsers are wrapped in a H·T·M·L
 468   `<script>` element whose `@type` gives its media type, and then
 469   passed to the parser to process.
 470 Source files whose media type does not have an associated X·S·L·T
 471   parser are considered “assets” and will not be transformed.
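
As a rough illustration of the wrapping step, a small
  `text/tab-separated-values` source might reach its parser looking
  something like the following.
The file contents are hypothetical, and the exact shape of the wrapper
  (including whether the `#!tsv` line is kept and whether further
  attributes are present) is an assumption, not documented behaviour.
Tabs are written as character references here only so that they
  survive copying.

```xml
<!-- Hypothetical sketch of a wrapped plaintext source; the real
     wrapper may differ in detail. -->
<script
  xmlns="http://www.w3.org/1999/xhtml"
  type="text/tab-separated-values"
>#!tsv
name&#x9;colour
apple&#x9;red
</script>
```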
 472
 473 **☡ For compatibility with this program, source file·names must not
 474   contain Ascii white·space, colons (`:`), semis (`;`), pipes (`|`),
 475   bucks (`$`), percents (`%`), hashes (`#`), asterisks (`*`), brackets
 476   (`[` or `]`), erotemes (`?`), backslashes (`\`), or control
 477   characters, must not begin with a hyphen‐minus (`-`), and must not end
 478   with a cloparen (`)`).**
 479 The former characters have the potential to conflict with make syntax,
 480   a leading hyphen‐minus is confusable for a commandline argument, and a
 481   trailing cloparen [activates a bug in G·N·U Make
 482   3.81][so-17148468-comment].
 483
 484 ## Parsers
 485
 486 Parsers are used to convert plaintext files into X·M·L trees, as well
 487   as convert plaintext formats which are already included inline in
 488   existing source X·M·L documents.
 489 ⛩📰 书社 comes with some parsers; namely :⁠—
 490
 491 - **`parsers/plain.xslt`:**
 492   Wraps `text/plain` contents in a `<html:pre>` element.
 493
 494 - **`parsers/record-jar.xslt`:**
 495   Converts `text/record-jar` contents into a `<html:div>` of
 496     `<html:dl>` elements (one for each record).
 497
 498 - **`parsers/tsv.xslt`:**
 499   Converts `text/tab-separated-values` contents into an `<html:table>`
 500     element.
 501
 502 New ⛩📰 书社 parsers which target plaintext formats should have an
 503   `<xslt:template>` element with no `@name` or `@mode` and whose
 504   `@match` attribute…
 505
 506 - Starts with an appropriately‐name·spaced qualified name for a
 507     `<html:script>` element.
 508
 509 - Follows this with the string `[@type=`.
 510
 511 - Follows this with a quoted string giving a media type supported by
 512     the parser.
 513   Media type parameters are _not_ supported.
 514
 515 - Follows this with the string `]`.
 516
 517 For example, the trivial `text/plain` parser is defined as follows :⁠—
 518
```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  version="1.0"
>
  <书社:id>example:text/plain</书社:id>
  <template match="html:script[@type='text/plain']">
    <html:pre><value-of select="."/></html:pre>
  </template>
</transform>
```
 533
 534 ⛩📰 书社 will scan the provided parsers for this pattern to determine
 535   the set of allowed plaintext file types.
 536 Multiple such `<xslt:template>` elements may be provided in a single
 537   parser, for example if the parser supports multiple media types.
 538 Alternatively, you can set the `@书社:supported-media-types` attribute
 539   on the root element of the parser to override media type support
 540   detection.
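
A minimal sketch of a parser with more than one detectable template
  might look like the following.
Both media types and the identifier are hypothetical, and files of
  these types would still need to be recognized by a magic file before
  the parser is ever applied to them.

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  version="1.0"
>
  <!-- Hypothetical identifier; any unique value would do. -->
  <书社:id>example:preformatted</书社:id>
  <!-- One template per media type, each following the
       html:script[@type='…'] pattern so that it can be detected.
       Both media types are invented for this example. -->
  <template match="html:script[@type='text/x.example-log']">
    <html:pre class="log"><value-of select="."/></html:pre>
  </template>
  <template match="html:script[@type='text/x.example-notes']">
    <html:pre class="notes"><value-of select="."/></html:pre>
  </template>
</transform>
```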
 541
 542 Even when `@书社:supported-media-types` is set, it is a requirement
 543   that each parser transform any `<html:script>` elements with a
 544   `@type` which matches their registered types into something else.
 545 Otherwise the parser will be stuck in an endless loop.
 546 The result tree of applying the transform to the `<html:script>`
 547   element will be reparsed (in case any new `<html:script>` elements
 548   were added in its subtree), and a `@书社:parsed-by` attribute will be
 549   added to each toplevel element in the result.
 550 The value of this attribute will be the value of the `<书社:id>`
 551   toplevel element in the parser.
 552
 553 Parsers **should** have an `<书社:id>` and, if present, it **must** be
 554   unique.
 555
 556 It is possible for parsers to support zero plaintext types.
 557 This is useful when targeting specific dialects of X·M·L; parsers in
 558   this sense operate on the same basic principles as transforms
 559   (described below).
 560 The major distinction between X·M·L parsers and transforms is where in
 561   the process the transformation happens:
 562 Parsers are applied _prior_ to embedding (and can be used to generate
 563   embeds); transforms are applied _after_.
 564
 565 It is **strongly recommended** that auxiliary templates in parsers be
 566   name·spaced (by `@name` or `@mode`) whenever possible, to avoid
 567   conflicts between parsers.
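
The following sketch shows one way to name·space an auxiliary template.
The `example:` prefix, its U·R·I, the identifier, and the media type
  are all hypothetical; only the overall pattern is taken from the
  description above.
The helper is a named template with a qualified name, so an unrelated
  parser which also defines a `lines` helper will not collide with it.

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:example="urn:example:my-parser"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  exclude-result-prefixes="example"
  version="1.0"
>
  <书社:id>example:text/x.example-lines</书社:id>
  <!-- Entry point: follows the detectable html:script pattern. -->
  <template match="html:script[@type='text/x.example-lines']">
    <html:ul>
      <call-template name="example:lines">
        <with-param name="text" select="string(.)"/>
      </call-template>
    </html:ul>
  </template>
  <!-- Name·spaced recursive helper: emits one list item for each
       nonempty line of the supplied text. -->
  <template name="example:lines">
    <param name="text"/>
    <variable
      name="line"
      select="substring-before(concat($text, '&#xA;'), '&#xA;')"
    />
    <if test="$line != ''">
      <html:li><value-of select="$line"/></html:li>
    </if>
    <if test="contains($text, '&#xA;')">
      <call-template name="example:lines">
        <with-param name="text" select="substring-after($text, '&#xA;')"/>
      </call-template>
    </if>
  </template>
</transform>
```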
 568
 569 ### Attributes added during parsing
 570
 571 ⛩📰 书社 will add a few attributes to elements which result from
 572   parsing plaintext `<html:script>` elements.
 573 These include :⁠—
 574
 575 - A `@书社:parsed-by` attribute, giving a space‐separated list of
 576     parsers which parsed the node.
 577   (Generally, this will be a list of one, but it is possible for the
 578     result of a parse to be another plaintext node, which may be parsed
 579     by a different parser.)
 580
 581 - A `@书社:media-type` attribute, giving the identified media type of
 582     the plaintext node.
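
For instance, assuming the trivial `text/plain` parser shown earlier
  (whose `<书社:id>` is `example:text/plain`), the parsed result of a
  one‐line text file might be expected to look roughly like this.
The text content is hypothetical, and any further attributes which
  later stages add are not shown.

```xml
<!-- Sketch of a parsed plaintext node; the text is invented. -->
<html:pre
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  书社:parsed-by="example:text/plain"
  书社:media-type="text/plain"
>Hello, world!
</html:pre>
```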
 583
 584 ### Parsed metadata
 585
 586 It is possible to extract metadata from a document at the same time as
 587   it is being parsed.
 588 This is done by creating result elements in the `书社:about` mode;
 589   these should be R·D·F property elements which apply to the conceptual
 590   entity that is the document being parsed.
 591
 592 During transformation, metadata for the file with identifier `$FILE`
 593   can be read from the children of
 594   `$书社:about//*[@rdf:about=$FILE]/nie:interpretedAs/*`.
 595
 596 ## Output Redirection
 597
 598 By default, ⛩📰 书社 installs files to the same location in `DESTDIR`
 599   as they were placed in their `SRCDIR`.
 600 This behaviour can be customized by setting the `@书社:destination`
 601   attribute on the root element, whose value can give a different path.
 602 This attribute is read after parsing, but before transformation (where
 603   it is silently dropped).
 604
Multiple destinations can be provided if the same file should be output
  to multiple places.
The file is retransformed each time, with the value of the
  `DESTINATION` global param set appropriately.
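
A minimal sketch of a redirected source file follows.
The path is hypothetical, and treating the value as a path relative to
  `DESTDIR` is an assumption; the syntax for listing several
  destinations is not described here and so is not shown.

```xml
<?xml version="1.0"?>
<!-- Wherever this file sits in SRCDIR, the attribute asks for it to
     be installed at the given (hypothetical) path instead. -->
<article
  xmlns="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  书社:destination="archive/2024/index.html"
>
  <p>Hello!</p>
</article>
```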
 607
 608 ## Embedding
 609
 610 Documents can be embedded in other documents using a `<书社:link>`
 611   element with `@xlink:show="embed"` and an `@xlink:actuate` which is
 612   absent or `"none"`.
 613 The `@xlink:href`s of these elements should have the format
 614   `about:shushe?source=<path>`, where `<path>` provides the path to the
 615   file within `SRCDIR`.
 616 Includes, which do not generate outputs of their own but may still be
 617   freely embedded, instead use the format
 618   `about:shushe?include=<path>`, where `<path>` provides the path
 619   within `INCLUDEDIR`.
 620 If `<path>` indicates a directory and ends with a slash (`/`),
 621   everything within that directory will be embedded.
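
The two link formats can be sketched as follows.
The file and directory names are hypothetical; the attributes and
  `about:shushe?` formats are those described above.

```xml
<?xml version="1.0"?>
<article
  xmlns="http://www.w3.org/1999/xhtml"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
  <!-- Embed a single include, looked up within INCLUDEDIR. -->
  <书社:link
    xlink:show="embed"
    xlink:href="about:shushe?include=header.xml"
  />
  <h1>Posts</h1>
  <!-- Embed everything within a directory in SRCDIR; note the
       trailing slash. An explicit actuate of "none" is equivalent to
       leaving the attribute off. -->
  <书社:link
    xlink:show="embed"
    xlink:actuate="none"
    xlink:href="about:shushe?source=posts/"
  />
</article>
```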
 622
 623 Embeds are replaced with the parsed contents of a file, unless the file
 624   is an asset, in which case an `<html:object>` element is produced
 625   instead (with the contents of the asset file provided as a base64
 626   `data:` u·r·i).
 627 Embed replacements will be given a `@书社:identifier` attribute whose
 628   value will match the `@xlink:href` of the embed.
 629
 630 Embedding takes place after parsing but before transformation, so
 631   parsers are able to generate their own embeds.
 632 ⛩📰 书社 is able to detect the transitive embed dependencies of files
 633   and update them accordingly; it will signal an error if the
 634   dependencies are recursive.
 635
 636 ### Attributes added during expansion
 637
 638 ⛩📰 书社 will add a few attributes to toplevel result elements, both
 639   in the main document and any embedded documents, during the expansion
 640   phase prior to the main transformation.
 641 These include :⁠—
 642
 643 - A `@书社:cksum` attribute giving the `cksum` checksum of the
 644     corresponding source file.
 645
 646 - A `@书社:mtime` attribute giving the last modified time of the
 647     corresponding source file.
 648
 649 - A `@书社:identifier` attribute giving the ⛩📰 书社 identifier
 650     (i·e, starting with `about:shushe?`) of the corresponding source
 651     file.
 652
- For elements in the `html` name·space, an `itemscope` attribute and an
 654     `itemtype` attribute with a value of
 655     `urn:fdc:ladys.computer:20231231:Shu1She4:document` (for the main
 656     document) or `urn:fdc:ladys.computer:20231231:Shu1She4:embed` (for
 657     embedded documents).
 658   These attributes are used to scope any nested `<html:meta>` elements
 659     with `@itemprop` attributes to their containing documents.
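
Put together, the toplevel element of an expanded embed might be
  expected to carry attributes along these lines.
The identifier is hypothetical, the checksum and timestamp values are
  elided because their exact formats are not specified here, and the
  empty value for `itemscope` is an assumption.

```xml
<!-- Sketch of an expanded embed; values marked … are elided. -->
<section
  xmlns="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  书社:identifier="about:shushe?include=header.xml"
  书社:cksum="…"
  书社:mtime="…"
  itemscope=""
  itemtype="urn:fdc:ladys.computer:20231231:Shu1She4:embed"
>
  <p>Contents of the embedded document.</p>
</section>
```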
 660
 661 ## Soft Dependencies
 662
 663 When a file depends only on the metadata of another file, and not its
 664   contents, it can be added as a soft dependency rather than an embed.
 665 Soft dependencies are indicated using a `<书社:link>` element with an
 666   `@xlink:show` of `"other"`, `"none"`, or absent, and an
 667   `@xlink:actuate` which is absent or `"none"`.
 668 A change to a soft dependency requires a file to be rebuilt, but no
 669   embedding occurs automatically.
 670 Because there is no automatic embedding, soft dependencies are allowed
 671   to be recursive.
 672
 673 The `@xlink:href`s of soft dependency `<书社:link>`s are processed in
 674   exactly the same fashion as embeds, described above.
 675
 676 If the value of `@xlink:show` is `"other"`, the soft dependency is
 677   transitive.
 678 Any dependencies of the indicated file which have a `@name` which
 679   matches that of the referencing `<书社:link>` element will also be
 680   treated as soft dependencies.
 681 If no `@name` is given, it is treated as the empty string.
 682
 683 When a document is embedded directly, all of its soft dependencies are
 684   also treated as soft dependencies of the embedding object.
However, when a document is embedded in a transitive soft dependency,
  the embed is treated exactly as tho it were itself a transitive soft
  dependency.
 688 That means it must have a matching `@name` to be included, and
 689   like·wise for any embeds or soft dependencies it contains.
 690
 691 If the value of `@xlink:show` is `"none"` or absent, the soft
 692   dependency is not transitive and its own dependencies are not
 693   checked.
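
The difference between the two kinds of soft dependency can be
  sketched as follows.
The file names and the `feed` name are hypothetical.

```xml
<?xml version="1.0"?>
<article
  xmlns="http://www.w3.org/1999/xhtml"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
  <!-- Plain soft dependency: this file is rebuilt when about.xml
       changes, but the dependencies of about.xml are not checked. -->
  <书社:link
    xlink:show="none"
    xlink:href="about:shushe?source=about.xml"
  />
  <!-- Transitive soft dependency: dependencies of posts/index.xml
       whose @name is also "feed" are treated as soft dependencies of
       this file too. -->
  <书社:link
    name="feed"
    xlink:show="other"
    xlink:href="about:shushe?source=posts/index.xml"
  />
  <p>Nothing is embedded automatically by either link.</p>
</article>
```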
 694
 695 ## Transforms
 696
 697 Transforms are used to convert X·M·L files into their final output,
 698   after all necessary parsing and embedding has taken place.
 699 ⛩📰 书社 comes with some transforms; namely :⁠—
 700
 701 - **`transforms/asset.xslt`:**
 702   Converts `<html:object>` elements which correspond to recognized
 703     media types into the appropriate H·T·M·L elements, and deletes
 704     `<html:style>` elements from the body of the document while
 705     transferring them to the head.
 706   This conversion happens during the finalization phase, after the main
 707     transformation.
 708
 709 - **`transforms/expansion.xslt`:**
 710   Performs embedding, as described above.
 711
 712 - **`transforms/metadata.xslt`:**
 713   Provides basic `<html:head>` metadata.
 714   This metadata is generated from `<html:meta>` elements with one of
 715     the following `@itemprop` attributes :⁠—
 716
 717   - **`urn:fdc:ladys.computer:20231231:Shu1She4:title`:**
 718     Provides the title of the page.
 719
 720   ⛩📰 书社 automatically encapsulates H·T·M·L embeds so that their
 721     metadata does not propagate up to the embedding document.
 722   To undo this behaviour, remove the `@itemscope` and `@itemtype`
 723     attributes from the embed during the transformation phase.
 724
- **`transforms/serialization.xslt`:**
  Replaces `<书社:serialize-xml>` elements with the (escaped)
    serialized X·M·L of their contents.
  This replacement happens during the finalization phase, after most
    other transformations have taken place.

  If a `@with-namespaces` attribute is provided, any name·space nodes
    on the toplevel serialized elements whose U·R·I’s correspond to the
    definitions of the provided prefixes, as defined for the
    `<书社:serialize-xml>` element, will be declared using name·space
    attributes on the serialized elements.
  Otherwise, only name·space nodes which _differ_ from the definitions
    on the `<书社:serialize-xml>` element will be declared.
  The string `#default` may be used to represent the default
    name·space.
  Multiple prefixes may be provided, separated by white·space.

  When it comes to name·spaces used internally by ⛩📰 书社, the
    prefix used by ⛩📰 书社 may be declared _in addition to_ the
    prefix(es) used in the source document(s).
  It is not possible to declare only one prefix for a name·space to
    the exclusion of the others.

  `<书社:raw-output>` elements may be used inside of
    `<书社:serialize-xml>` elements to inject raw output into the
    serialized X·M·L.

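A fragment along the following lines illustrates the
  `@with-namespaces` behaviour of `<书社:serialize-xml>` described
  above.
It is only a sketch: the paragraph content and the choice of the `html`
  prefix are made up for the example, and the serialized text itself is
  not shown because its exact form depends on the implementation.
Going by the description above, omitting `@with-namespaces` here should
  leave the `html` name·space undeclared on the serialized element,
  since it does not differ from the enclosing definition.

```xml
<?xml version="1.0"?>
<!-- `html` is listed in @with-namespaces and is defined on the
     enclosing element, so the serialized <html:p> is expected to
     declare xmlns:html itself. -->
<书社:serialize-xml
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  with-namespaces="html"
>
  <html:p>An example paragraph.</html:p>
</书社:serialize-xml>
```
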
The following are recommendations on effective creation of
  transforms :⁠—

- Make template matchers as specific as possible.
  It is likely an error if two transforms have templates which match
    the same element (unless the templates have different priority).

- Name·space templates (with `@name` or `@mode`) whenever possible.

- Set `@exclude-result-prefixes` on the root `xslt:transform` element
    to reduce the number of declared name·spaces in the final result.

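The recommendations above might translate into a transform shaped like
  the following sketch.
The `example` name·space, the `<example:note>` element, and its `@kind`
  attribute are all hypothetical; only the X·S·L·T constructs
  themselves are standard, and anything ⛩📰 书社 itself may expect of
  a transform is left out.

```xml
<?xml version="1.0"?>
<xslt:transform
  xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:example="urn:example:ns"
  exclude-result-prefixes="example"
  version="1.0"
>
  <!-- A specific matcher: only <example:note> elements whose @kind is
       "aside", rather than every <example:note>. -->
  <xslt:template match="example:note[@kind='aside']">
    <html:aside>
      <xslt:apply-templates mode="example:inline"/>
    </html:aside>
  </xslt:template>

  <!-- A name·spaced mode, so this template cannot collide with
       text() templates defined by other transforms. -->
  <xslt:template match="text()" mode="example:inline">
    <xslt:value-of select="."/>
  </xslt:template>
</xslt:transform>
```
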
## Global Params

The following params are made available globally in parsers and
  transforms :⁠—

- **`BUILDTIME`:**
  The current time.

- **`IDENTIFIER`:**
  The ⛩📰 书社 identifier of the source file (a U·R·I beginning with
    `about:shushe`).

- **`SRCREV`:**
  The value of the `SRCREV` variable (if present).

- **`THISREV`:**
  The value of the `THISREV` variable (if present).

In transforms, the following params are additionally available :⁠—

- **`DESTINATION`:**
  The destination being targeted by this transform.

- **`书社:about`:**
  R·D·F metadata about all of the documents ⛩📰 书社 knows about.
  Use `$书社:about//*[@rdf:about=$IDENTIFIER]` to get the metadata for
    the current document.

- **`书社:source`:**
  The parsed source document being transformed, prior to any expansion.

- **`书社:expansion`:**
  The document after all of the embeds have been expanded.
  Unavailable during the `书社:expand` stage.

- **`书社:result`:**
  The document after the main set of transformations has been applied.
  Only available during the `书社:finalize` stage, where it is used to
    apply output wrapping and other clean·up.

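As a rough illustration, a transform might use these params as in the
  sketch below.
The `<example:build-info>` element and the `example` name·space are
  hypothetical.
The sketch also assumes that ⛩📰 书社 brings the params into scope
  itself, so that no `<xslt:param>` declarations are needed; run on its
  own, outside of ⛩📰 书社, the stylesheet would need them.

```xml
<?xml version="1.0"?>
<xslt:transform
  xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:example="urn:example:ns"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  exclude-result-prefixes="rdf example 书社"
  version="1.0"
>
  <!-- Assumption: $BUILDTIME, $IDENTIFIER, and $书社:about are
       supplied by ⛩📰 书社 and are not declared here. -->
  <xslt:template match="example:build-info">
    <!-- Metadata for the current document, using the lookup given
         above. -->
    <xslt:variable
      name="this"
      select="$书社:about//*[@rdf:about=$IDENTIFIER]"
    />
    <html:p>
      <xslt:text>Built at </xslt:text>
      <xslt:value-of select="$BUILDTIME"/>
      <xslt:text> from </xslt:text>
      <xslt:value-of select="$IDENTIFIER"/>
      <xslt:if test="$this">
        <xslt:text> (metadata available)</xslt:text>
      </xslt:if>
      <xslt:text>.</xslt:text>
    </html:p>
  </xslt:template>
</xslt:transform>
```
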
## Output Wrapping

Provided at least one toplevel result element belongs to the H·T·M·L
  name·space, ⛩📰 书社 will wrap the final output of the transforms in
  appropriate `<html:html>` and `<html:body>` elements, so it is not
  necessary for transforms to do this explicitly.
If a toplevel result element _is_ a `<html:html>` or `<html:body>`
  element, it will be merged with the one that ⛩📰 书社 creates.
Consequently, wrapping the result in a `<html:body>` element can be
  used to enable wrapping for non‐H·T·M·L content, when desired.

As a part of this process, after performing the initial transform
  ⛩📰 书社 will match in the following modes to fill in areas of the
  wrapper :⁠—

- **`书社:header`:**
  The result of matching in this mode is prepended into the
    `<html:body>` of the output (before the transformation result).

- **`书社:footer`:**
  The result of matching in this mode is appended into the
    `<html:body>` of the output (after the transformation result).

- **`书社:metadata`:**
  The result of matching in this mode is inserted into the
    `<html:head>` of the output.

The document being matched will contain the full transform result
  prior to wrapping as well as an `<书社:id>` element for each
  transform.
The latter elements can be matched to enable transforms to provide
  content _without_ matching any elements in the result; for example,
  the following transform adds a link to a stylesheet to the
  `<html:head>` of every page :⁠—

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  exclude-result-prefixes="书社"
  version="1.0"
>
  <书社:id>example:add-stylesheet-links.xslt</书社:id>
  <template match="书社:id[string(.)='example:add-stylesheet-links.xslt']" mode="书社:metadata">
    <html:link rel="stylesheet" type="text/css" href="/style.css"/>
  </template>
</transform>
```

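The same pattern should carry over to the other wrapper modes.
The following sketch, modelled directly on the example above, would
  append a footer to the `<html:body>` of every page; the transform
  identifier `example:add-footer.xslt` and the footer text are
  placeholders.

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  exclude-result-prefixes="书社"
  version="1.0"
>
  <书社:id>example:add-footer.xslt</书社:id>
  <!-- Matching this transform's own <书社:id> in the 书社:footer
       mode provides footer content without matching anything in the
       transformation result. -->
  <template match="书社:id[string(.)='example:add-footer.xslt']" mode="书社:footer">
    <html:footer>
      <html:p>Generated with ⛩📰 书社.</html:p>
    </html:footer>
  </template>
</transform>
```
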
Output wrapping can be entirely disabled by adding a
  `@书社:disable-output-wrapping` attribute to the toplevel element in
  the result tree.
It will not be performed on outputs whose root elements are
  `<书社:archive>`, `<书社:base64-binary>`, or `<书社:raw-text>`
  (described below), or on result trees which do not contain a toplevel
  element in the H·T·M·L name·space.

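A transform could opt out of wrapping roughly as follows.
The `<example:fragment>` element and its name·space are hypothetical,
  and the empty attribute value is an assumption, since the text above
  does not say which value (if any) is expected.

```xml
<?xml version="1.0"?>
<xslt:transform
  xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:example="urn:example:ns"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  exclude-result-prefixes="example"
  version="1.0"
>
  <xslt:template match="example:fragment">
    <!-- The attribute sits on the toplevel result element; its
         (empty) value here is an assumption. -->
    <html:div 书社:disable-output-wrapping="">
      <xslt:apply-templates/>
    </html:div>
  </xslt:template>
</xslt:transform>
```
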
## Applying Attributes

The `<书社:apply-attributes>` element will apply any attributes on the
  element to the element(s) it wraps.
It is especially useful in combination with embeds.

The `<书社:apply-attributes-to-root>` element will apply any attributes
  on the element to the root node of the final transformation result.
It is especially useful in combination with output wrapping.

In both cases, attributes from various sources are combined with
  white·space between them.
Attribute application takes place after each stage of the
  transformation, including after the initial embedding phase.

Both elements ignore attributes in the `xml:` name·space, except for
  `@xml:lang`, for which only the first definition is used (counting
  any already present on the root element).
On H·T·M·L and S·V·G elements, `@lang` has the same behaviour as
  `@xml:lang`.

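A minimal sketch of both elements, assuming they may be written
  directly in a source document, and that
  `<书社:apply-attributes-to-root>` may be left empty, might look like
  the following.
The class names and text are placeholders.
Going by the rules above, the first paragraph should end up with both
  `intro` and `note` in its `@class`, and the `@lang` should only take
  effect if the root element does not already define one.

```xml
<?xml version="1.0"?>
<html:section
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
  <!-- Applies class="note" to each wrapped paragraph. -->
  <书社:apply-attributes class="note">
    <html:p class="intro">First paragraph.</html:p>
    <html:p>Second paragraph.</html:p>
  </书社:apply-attributes>
  <!-- Applies these attributes to the root of the final result. -->
  <书社:apply-attributes-to-root class="example-page" lang="en"/>
</html:section>
```
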
## Other Kinds of Output

There are a few special elements in the `书社:` name·space which, if
  they appear as the toplevel element in a transformation result, cause
  ⛩📰 书社 to produce something other than an X·M·L file.
They are :⁠—

- **`<书社:archive>`:**
  Each child element with a `@书社:archived-as` attribute will be
    archived as a separate file in a resulting tarball (this attribute
    gives the file name).
  These elements will be processed the same as the root elements of any
    other file (e·g, they will be wrapped; they can themselves specify
    non‐X·M·L output types, ⁊·c).
  Other child elements will be ignored.

  If the `<书社:archive>` element is given an `@书社:expanded`
    attribute, rather than producing a tarball ⛩📰 书社 will output
    the directory which expanding the tarball would produce.
  This mechanism can be used to generate multiple files from a single
    source, provided all of the files are contained with·in the same
    directory.

- **`<书社:base64-binary>`:**
  The text nodes in the transformation result will, after removing all
    Ascii white·space, be treated as a Base·64 string, which is then
    decoded to produce the output file.

- **`<书社:raw-text>`:**
  A plaintext (U·T·F‐8) file will be produced from the text nodes in
    the transformation result.

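For instance, a transformation result shaped like the following sketch
  would, per the description above, be expected to yield a tarball
  holding two files.
The file names and contents are placeholders.

```xml
<?xml version="1.0"?>
<书社:archive
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
  <!-- Processed like any other root element (so subject to output
       wrapping) and archived under the given file name. -->
  <html:article 书社:archived-as="index.html">
    <html:h1>Example</html:h1>
  </html:article>
  <!-- A child may itself specify a non‐X·M·L output type. -->
  <书社:raw-text 书社:archived-as="robots.txt">User-agent: *
Disallow:
</书社:raw-text>
  <!-- No @书社:archived-as, so this child is ignored. -->
  <html:p>Not archived.</html:p>
</书社:archive>
```
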
## License

This repository conforms to [REUSE][].

Most source files are licensed under the terms of the <cite>Mozilla
  Public License, version 2.0</cite>.

[F3C]: <https://darwinsys.com/file/>
[REUSE]: <https://reuse.software/spec/>
[Sharutils]: <https://www.gnu.org/software/sharutils/>
[draft-phillips-record-jar-01]: <https://datatracker.ietf.org/doc/html/draft-phillips-record-jar-01>
[rdar-92753335]: <https://github.com/apple-oss-distributions/patch_cmds/blob/5084833f90df1b0e0924ea56f94c0199b3b8bbc6/diff/diffreg.c#L1800-L1808>
[so-17148468-comment]: <https://stackoverflow.com/questions/17148468/capturing-filenames-including-parentheses-with-gnu-makes-wildcard-function#comment24825307_17148894>