<!--
SPDX-FileCopyrightText: 2024 Lady <https://www.ladys.computer/about/#lady>
SPDX-License-Identifier: CC0-1.0
-->

<b>A make·file for X·M·L.</b>

<dfn>⛩️📰 书社</dfn> aims to make it easy to generate websites with
X·S·L·T and G·N·U Make.
It is consequently only a good choice for people who like X·S·L·T and
G·N·U Make and wish it were easier to make websites with them.

It makes things easier by :—

- Automatically identifying source files and characterizing them by
  type (X·M·L, text, or asset).

- Parsing supported text types into X·M·L trees.

- Enabling easy inclusion of source files within each other.

It aims to do this with zero dependencies beyond the programs already
installed on your computer†.

† The only non‐Posix programs‡ required are those provided by `libxml2`
  and `libxslt` (which most operating systems provide), but on Linux
  machines the command‐line utilities may need to be installed
  separately as **`libxml2-utils`** and **`xsltproc`**.
  Additionally, not all Linux distributions bundle all necessary Posix
  programs; on Debian (for example) you may need to separately install
  **`sharutils`** for `uudecode` and `uuencode` and **`pax`** for
  `pax`.

‡ This make·file also currently depends on non‐Posix `stat` but
  attempts to handle both the G·N·U and B·S·D variants.
  It expects `xargs` to accept a `-0` option, which, while widely
  supported, is not a part of the Posix standard.

⛩️📰 书社 requires functionality present in G·N·U Make 3.81 (or later)
and will not work in previous versions, or other implementations of
Make.
Compatibility with later versions of G·N·U Make is assumed, but not
tested.

<i lang="cmn-Hans">书社</i> is a Chinese word meaning “publishing
house”.
The first character, <i lang="cmn-Hans">书</i>, is the simplified form
of <i lang="cmn-Hant">書</i>, meaning “book”.
The second character, <i lang="cmn-Hans">社</i>, contemporarily means
“association”, but historically referred to the god of the soil and
related altars or festivities.
In Japanese, it is an alternate spelling for <i lang="ja">やしろ</i>,
the word for “Shinto shrine”.

The name <i lang="cmn-Hans">书社</i> was chosen to play on this pun, as
it is intended as a publishing program for webshrines.

In Ascii environments, ⛩️📰 书社 should be written `Shushe`, following
the pinyin transliteration.

Place source files in `sources/` and run `make install` to compile
the result to `public/`.
Compilation involves the following steps :—

1. ⛩️📰 书社 compiles all of the magic files in `magic/` into a single
   file, `build/magic.mgc`.

2. ⛩️📰 书社 processes all of the parsers in `parsers/` and determines
   the list of supported plaintext types.

3. ⛩️📰 书社 identifies all of the source files and includes and uses
   `build/magic.mgc` to classify them by media type.

4. ⛩️📰 书社 parses all plaintext and X·M·L source files and includes
   and then builds a dependency tree between them.

5. ⛩️📰 书社 uses the dependency tree to establish prerequisites for
   each output file.

6. ⛩️📰 书社 compiles each output file to `build/result`.

7. ⛩️📰 书社 copies most output files from `build/result` to
   `build/public`, but it does some additional processing instead on
   those which indicate a non‐X·M·L desired final output form.

8. ⛩️📰 书社 copies the final resulting files to `public`.

You can use `make list` to list each identified source file or include
alongside its computed type and dependencies.
As this is a Make‐based program, steps will only be run if the
corresponding buildfile or output file is older than its
prerequisites.
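
As a minimal sketch of the workflow above, the following shell commands set up a project with a single X·M·L source file.
The directory name `my-site` and the file contents are hypothetical, and the `make` calls are left as comments, since they assume the ⛩️📰 书社 `GNUmakefile` is reachable from the project directory :

```shell
#!/bin/sh
# Hypothetical project directory; any location works.
site=$(mktemp -d)/my-site
mkdir -p "$site/sources"

# An X·M·L source file. It begins with `<?xml` so that it can be
# recognized by its magic number.
cat > "$site/sources/index.xml" <<'EOF'
<?xml version="1.0"?>
<article xmlns="http://www.w3.org/1999/xhtml">
  <h1>Hello</h1>
</article>
EOF

# From inside "$site", and assuming Make can find the 书社
# GNUmakefile, the build would then be :
#   make list      # show identified sources, types, and dependencies
#   make install   # compile sources/ into public/
```

Running `make install` a second time without touching `sources/` should do very little, because no output is older than its prerequisites.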

The ⛩️📰 书社 name·space is `urn:fdc:ladys.computer:20231231:Shu1She4`.

This document uses a few name·space prefixes, with the following
definitions :—

|     Prefix | Expansion                                     |
| ---------: | :-------------------------------------------- |
| `catalog:` | `urn:oasis:names:tc:entity:xmlns:xml:catalog` |
|    `exsl:` | `http://exslt.org/common`                     |
| `exslstr:` | `http://exslt.org/strings`                    |
|    `html:` | `http://www.w3.org/1999/xhtml`                |
|     `svg:` | `http://www.w3.org/2000/svg`                  |
|   `xlink:` | `http://www.w3.org/1999/xlink`                |
|    `xslt:` | `http://www.w3.org/1999/XSL/Transform`        |
|    `书社:` | `urn:fdc:ladys.computer:20231231:Shu1She4`    |

## Setup and Configuration

⛩️📰 书社 depends on the following programs to run.
In every case, you may supply your own implementation by overriding the
corresponding (allcaps) variable (e·g, set `MKDIR` to supply your own
`mkdir` implementation).

- `git` (optional; set `GIT=` to disable)
- `pax` (only when generating archives)
- `stat` (BSD *or* GNU)
- `xargs` (requires support for `-0`)
- `xmlcatalog` (provided by `libxml2`)
- `xmllint` (provided by `libxml2`)
- `xsltproc` (provided by `libxslt`)

The following additional variables can be used to control the behaviour
of ⛩️📰 书社 :—

The location of the source files (default: `sources`).
Multiple source directories can be provided, so long as the same
file subpath doesn’t exist in more than one of them.

The location of source includes (default: `sources/includes`).
This can be inside of `SRCDIR`, but needn’t be.
Multiple include directories can be provided, so long as the same
file subpath doesn’t exist in more than one of them.

The location of the (temporary) build directory (default: `build`).
`make clean` will delete this, and it is recommended that it not be
used for programs aside from ⛩️📰 书社.

The location of the directory to output files to (default: `public`).
`make install` will overwrite files in this directory which
correspond to those in `SRCDIR`.
It *will not* touch other files, including those generated from files
in `SRCDIR` which have since been deleted.

Files are first compiled to `$(BUILDDIR)/public` before they are
copied to `DESTDIR`, so this folder is relatively quick and
inexpensive to re·create.
It’s reasonable to simply delete it before every `make install` to
ensure stale content is removed.

The location of the ⛩️📰 书社 `GNUmakefile`.
This should be set automatically when calling Make and shouldn’t ever
need to be set manually.
This variable is used to find the ⛩️📰 书社 `lib/` folder, which is
expected to be in the same location.

A white·space‐separated list of magic files to use (default:
`$(THISDIR)/magic/*`).

The value of this variable is appended to `MAGIC` by default, to
enable additional magic files without overriding the existing ones.

Rules to use with `find` when searching for source files.
The default ignores files that start with a period or hyphen‐minus,
those which end with a cloparen, and those which contain a hash,
buck, percent, asterisk, colon, semi, eroteme, bracket, backslash,

- **`EXTRAFINDRULES`:**
  The value of this variable is appended to `FINDRULES` by default, to
  enable additional rules without overriding the existing ones.

- **`FINDINCLUDERULES`:**
  Rules to use with `find` when searching for includes (default:

- **`EXTRAFINDINCLUDERULES`:**
  The value of this variable is appended to `FINDINCLUDERULES` by
  default, to enable additional rules without overriding the existing
  ones.

A white·space‐separated list of parsers to use (default:
`$(THISDIR)/parsers/*.xslt`).

- **`EXTRAPARSERS`:**
  The value of this variable is appended to `PARSERS` by default, to
  enable additional parsers without overriding the existing ones.

A white·space‐separated list of transforms to use (default:
`$(THISDIR)/transforms/*.xslt`).

- **`EXTRATRANSFORMS`:**
  The value of this variable is appended to `TRANSFORMS` by default, to
  enable additional transforms without overriding the existing ones.

A white·space‐separated list of media types to consider X·M·L
(default: `application/xml text/xml`).

The current version of ⛩️📰 书社 (default: derived from the current
git tag/branch/commit).

The current version of the source files (default: derived from the
current git tag/branch/commit).

If this variable has a value, every recipe instruction will be
printed when it runs (default: empty).
This is helpful for debugging, but typically too noisy for general
use.
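
Because `make install` never removes stale files from `DESTDIR`, a small wrapper can delete the output directory before each install.
The following is only a sketch : the `rebuild` function and the overridable `MAKE` variable are hypothetical helpers, and it assumes that `make install` already works from the current directory.

```shell
#!/bin/sh
# MAKE may be overridden (for example, with a stub while testing).
MAKE=${MAKE:-make}

# rebuild [DESTDIR]: delete the output directory, then reinstall into
# it. DESTDIR defaults to `public`, matching the documented default.
rebuild() {
  destdir=${1:-public}
  rm -rf -- "$destdir"
  "$MAKE" install DESTDIR="$destdir"
}
```

Calling `rebuild` with no argument rebuilds `public`; calling `rebuild out` passes `DESTDIR=out` on the Make command line, which overrides the default value.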

Source files may be placed in `SRCDIR` in any manner; the file
structure used there will match the output.
The type of source files is *not* determined by file extension, but
rather by magic number; this means that files **must** begin with
something recognizable.
Supported magic numbers include :—

- `<?xml` for `application/xml` files
- `#!js` for `text/javascript` files
- `@charset "` for `text/css` files
- `#!tsv` for `text/tab-separated-values` files
- `%%` for `text/record-jar` files (unregistered; see
  [[draft-phillips-record-jar-01][]])

Text formats with associated X·S·L·T parsers are wrapped in an H·T·M·L
`<script>` element whose `@type` gives its media type, and then
passed to the parser to process.
Source files whose media type does not have an associated X·S·L·T
parser are considered “assets” and will not be transformed.
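
To illustrate, this shell sketch writes three plaintext sources, each beginning with one of the magic numbers listed above.
The file names and contents are hypothetical, and the layout of each file after its magic number is an assumption rather than something this document specifies.

```shell
#!/bin/sh
src=$(mktemp -d)/sources
mkdir -p "$src"

# text/tab-separated-values: begins with `#!tsv`.
printf '#!tsv\nname\tcolour\nplum\tpurple\n' > "$src/fruit.tsv"

# text/css: begins with `@charset "`.
printf '@charset "UTF-8";\nbody { margin: 0 }\n' > "$src/style.css"

# text/javascript: begins with `#!js`.
printf '#!js\nconsole.log("hello");\n' > "$src/hello.js"
```

The file extensions here are only for the benefit of human readers; classification depends on the first bytes of each file.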

**☡ For compatibility with this program, source file·names must not
contain Ascii white·space, colons (`:`), semis (`;`), pipes (`|`),
bucks (`$`), percents (`%`), hashes (`#`), asterisks (`*`), brackets
(`[` or `]`), erotemes (`?`), backslashes (`\`), or control
characters, must not begin with a hyphen‐minus (`-`), and must not
end with a cloparen (`)`).**
The former characters have the potential to conflict with make syntax,
a leading hyphen‐minus is confusable for a command‐line argument, and
a trailing cloparen [activates a bug in G·N·U Make
3.81](https://stackoverflow.com/questions/17148468/capturing-filenames-including-parentheses-with-gnu-makes-wildcard-function#comment24825307_17148894).
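
These restrictions can be checked before a build.
The following is a minimal sketch, not part of ⛩️📰 书社 : `is_safe_name` is a hypothetical helper which tests a single file·name (not a whole path), and its notion of white·space and control characters follows the current locale.

```shell
#!/bin/sh
# is_safe_name NAME: succeed if NAME avoids the characters and
# positions which the restrictions above rule out.
is_safe_name() {
  case $1 in
    # A leading hyphen‐minus or a trailing cloparen.
    -*|*\)) return 1 ;;
    # White·space or control characters anywhere.
    *[[:space:]]*|*[[:cntrl:]]*) return 1 ;;
    # Any of  : ; | $ % # * [ ] ? \  anywhere.
    *:*|*\;*|*\|*|*\$*|*%*|*\#*|*\**|*\[*|*\]*|*\?*|*\\*) return 1 ;;
  esac
  return 0
}
```

For example, `is_safe_name 'index.xml'` succeeds, while `is_safe_name 'photo (1)'` fails on both the space and the trailing cloparen.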

Parsers are used to convert plaintext files into X·M·L trees, as well
as to convert plaintext formats which are already included inline in
existing source X·M·L documents.
⛩️📰 书社 comes with some parsers; namely :—

- **`parsers/plain.xslt`:**
  Wraps `text/plain` contents in a `<html:pre>` element.

- **`parsers/record-jar.xslt`:**
  Converts `text/record-jar` contents into a `<html:div>` of
  `<html:dl>` elements (one for each record).

- **`parsers/tsv.xslt`:**
  Converts `text/tab-separated-values` contents into an `<html:table>`
  element.

New ⛩️📰 书社 parsers which target plaintext formats should have an
`<xslt:template>` element with no `@name` or `@mode` and whose
`@match` attribute :—

- Starts with an appropriately‐name·spaced qualified name for a
  `<html:script>` element.

- Follows this with the string `[@type=`.

- Follows this with a quoted string giving a media type supported by
  the parser.
  Media type parameters are *not* supported.

- Follows this with the string `]`.

For example, the trivial `text/plain` parser is defined as follows :—

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  version="1.0"
>
  <书社:id>example:text/plain</书社:id>
  <template match="html:script[@type='text/plain']">
    <html:pre><value-of select="."/></html:pre>
  </template>
</transform>
```

⛩️📰 书社 will scan the provided parsers for this pattern to determine
the set of allowed plaintext file types.
Multiple such `<xslt:template>` elements may be provided in a single
parser, for example if the parser supports multiple media types.
Alternatively, you can set the `@书社:supported-media-types` attribute
on the root element of the parser to override media type support
detection.

Even when `@书社:supported-media-types` is set, it is a requirement
that each parser transform any `<html:script>` elements with a
`@type` which matches their registered types into something else.
Otherwise the parser will be stuck in an endless loop.
The result tree of applying the transform to the `<html:script>`
element will be reparsed (in case any new `<html:script>` elements
were added in its subtree), and a `@书社:parsed-by` attribute will be
added to each toplevel element in the result.
The value of this attribute will be the value of the `<书社:id>`
toplevel element in the parser.

It is possible for parsers to support zero plaintext types.
This is useful when targeting specific dialects of X·M·L; parsers in
this sense operate on the same basic principles as transforms.
The major distinction between X·M·L parsers and transforms is where in
the process the transformation happens:
Parsers are applied *prior* to embedding (and can be used to generate
embeds); transforms are applied *after*.

It is **strongly recommended** that auxiliary templates in parsers be
name·spaced (by `@name` or `@mode`) whenever possible, to avoid
conflicts between parsers.

### Attributes added during parsing

⛩️📰 书社 will add a few attributes to the output of the parsing step,
namely :—

- A `@书社:cksum` attribute on toplevel result elements, giving the
  `cksum` checksum of the corresponding source file.

- For the elements which result from parsing plaintext `<html:script>`
  elements :—

  - A `@书社:parsed-by` attribute, giving a space‐separated list of
    parsers which parsed the node.
    (Generally, this will be a list of one, but it is possible for the
    result of a parse to be another plaintext node, which may be
    parsed by a different parser.)

  - A `@书社:media-type` attribute, giving the identified media type of
    the plaintext node.

Documents can be embedded in other documents using a `<书社:link>`
element with `@xlink:show="embed"`.
The `@xlink:href`s of these elements should have the format
`about:shushe?source=<path>`, where `<path>` provides the path to the
file within `SRCDIR`.
Includes, which do not generate outputs of their own but may still be
freely embedded, instead use the format
`about:shushe?include=<path>`, where `<path>` provides the path
to the file within the include directory.

Embeds are replaced with the parsed contents of a file, unless the file
is an asset, in which case an `<html:object>` element is produced
instead (with the contents of the asset file provided as base64‐encoded
data).
Embed replacements will be given a `@书社:identifier` attribute whose
value will match the `@xlink:href` of the embed.

Embedding takes place after parsing but before transformation, so
parsers are able to generate their own embeds.
⛩️📰 书社 is able to detect the transitive embed dependencies of files
and update them accordingly; it will signal an error if the
dependencies are recursive.
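
As an illustration, this shell sketch writes a tab‐separated‐values source and an X·M·L document which embeds it.
The file names and the choice of `<article>` as root element are hypothetical; only the `<书社:link>` element, its `@xlink:show`, and the `about:shushe?source=<path>` format come from the description above.

```shell
#!/bin/sh
src=$(mktemp -d)/sources
mkdir -p "$src"

# The file to be embedded.
printf '#!tsv\nname\tcolour\nplum\tpurple\n' > "$src/fruit.tsv"

# The embedding document. The <书社:link> is replaced by the parsed
# contents of fruit.tsv after parsing but before transformation.
cat > "$src/index.xml" <<'EOF'
<?xml version="1.0"?>
<article
  xmlns="http://www.w3.org/1999/xhtml"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
  <h1>Fruit</h1>
  <书社:link xlink:show="embed" xlink:href="about:shushe?source=fruit.tsv"/>
</article>
EOF
```

If `fruit.tsv` lived in the include directory instead, the `@xlink:href` would use `about:shushe?include=fruit.tsv`, and the file would produce no output of its own.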

## Output Redirection

By default, ⛩️📰 书社 installs files to the same location in `DESTDIR`
as they were placed in their `SRCDIR`.
This behaviour can be customized by setting the `@书社:destination`
attribute on the root element, whose value can give a different path.
This attribute is read after parsing, but before transformation (where
it is silently dropped).
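
A sketch of what this might look like in a source file follows.
The file name, root element, and destination value are hypothetical, and it is an assumption here that the path is interpreted relative to `DESTDIR`.

```shell
#!/bin/sh
src=$(mktemp -d)/sources
mkdir -p "$src"

# Placed at sources/about.xml, but (under the assumption above)
# installed to about/index.html rather than about.xml.
cat > "$src/about.xml" <<'EOF'
<?xml version="1.0"?>
<article
  xmlns="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  书社:destination="about/index.html"
>
  <h1>About</h1>
</article>
EOF
```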

Transforms are used to convert X·M·L files into their final output,
after all necessary parsing and embedding has taken place.
⛩️📰 书社 comes with some transforms; namely :—

- **`transforms/asset.xslt`:**
  Converts `<html:object>` elements which correspond to recognized
  media types into the appropriate H·T·M·L elements, and deletes
  `<html:style>` elements from the body of the document and moves
  them to the `<html:head>`.

- **`transforms/metadata.xslt`:**
  Provides basic `<html:head>` metadata.
  This metadata is generated from `<html:meta>` elements with one of
  the following `@itemprop` attributes :—

  - **`urn:fdc:ladys.computer:20231231:Shu1She4:title`:**
    Provides the title of the page.

  ⛩️📰 书社 automatically encapsulates H·T·M·L embeds so that their
  metadata does not propagate up to the embedding document.
  To undo this behaviour, remove the `@itemscope` and `@itemtype`
  attributes from the embed during the transformation phase.

- **`transforms/serialization.xslt`:**
  Replaces `<书社:serialize-xml>` elements with the (escaped)
  serialized X·M·L of their contents.
  This replacement happens during the application phase, after most
  other transformations have taken place.

  If a `@with-namespaces` attribute is provided, any name·space nodes
  on the toplevel serialized elements whose U·R·Is correspond to the
  definitions of the provided prefixes, as defined for the
  `<书社:serialize-xml>` element, will be declared using name·space
  attributes on the serialized elements.
  Otherwise, only name·space nodes which _differ_ from the definitions
  on the `<书社:serialize-xml>` element will be declared.
  The string `#default` may be used to represent the default
  name·space.
  Multiple prefixes may be provided, separated by white·space.

  When it comes to name·spaces used internally by ⛩️📰 书社, the
  prefix used by ⛩️📰 书社 may be declared _in addition to_ the
  prefix(es) used in the source document(s).
  It is not possible to selectively only declare one prefix for a
  name·space to the exclusion of others.

  `<书社:raw-output>` elements may be used inside of
  `<书社:serialize-xml>` elements to inject raw output into the
  serialization.

The following are recommendations on effective creation of
transforms :—

- Make template matchers as specific as possible.
  It is likely an error if two transforms have templates which match
  the same element (unless the templates have different priority).

- Name·space templates (with `@name` or `@mode`) whenever possible.

- Set `@exclude-result-prefixes` on the root `xslt:transform` element
  to reduce the number of declared name·spaces in the final result.

The following params are made available globally in parsers and
transforms :—

The checksum of the source file (⅌ `cksum`).

The ⛩️📰 书社 identifier of the source file (a u·r·i beginning with
`about:shushe?`).

The value of the `SRCREV` variable (if present).

The time at which the source file was last modified.

The value of the `THISREV` variable (if present).

The following params are only available in transforms :—

The path of the catalog file (within `BUILDDIR`).

The path of the output file (within `DESTDIR`).

⛩️📰 书社 will wrap the final output of the transforms in appropriate
`<html:html>` and `<html:body>` elements, so it is not necessary for
transforms to do this explicitly.
After performing the initial transform, ⛩️📰 书社 will match the root
node of the result in the following modes to fill in areas of the
output :—

The result of matching in this mode is prepended into the
`<html:body>` of the output (before the transformation result).

The result of matching in this mode is appended into the
`<html:body>` of the output (after the transformation result).

The result of matching in this mode is inserted into the
`<html:head>` of the output.

In addition to being called with the transform result, each of these
modes will also be called with a `<xslt:include>` element
corresponding to each transform.
If a transform has a `<书社:id>` top‐level element whose value is an
i·r·i, its `<xslt:include>` element will have a corresponding
`@书社:id` attribute.
This mechanism can be used to allow transforms to insert content
without matching any elements in the result; for example, the
following transform adds a link to a stylesheet to the `<html:head>` :—

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  exclude-result-prefixes="书社"
  version="1.0"
>
  <书社:id>example:add-stylesheet-links.xslt</书社:id>
  <template match="xslt:include[@书社:id='example:add-stylesheet-links.xslt']" mode="书社:metadata">
    <html:link rel="stylesheet" type="text/css" href="/style.css"/>
  </template>
</transform>
```

Output wrapping can be entirely disabled by adding a
`@书社:disable-output-wrapping` attribute to the top‐level element in
the result.
It will not be performed on outputs whose root elements are
`<书社:archive>`, `<书社:base64-binary>`, or `<书社:raw-text>`
elements.

## Applying Attributes

The `<书社:apply-attributes>` element will apply any attributes on the
element to the element(s) it wraps.
It is especially useful in combination with embeds.

The `<书社:apply-attributes-to-root>` element will apply any attributes
on the element to the root node of the final transformation result.
It is especially useful in combination with output wrapping.
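
For instance, an embed can be wrapped so that its replacement picks up a class.
This is only a sketch : the file names, the root element, and the `fruit` class are hypothetical, and the expectation that the attribute lands on the embedded content (rather than on the `<书社:link>` itself) follows from attribute application happening after embedding.

```shell
#!/bin/sh
src=$(mktemp -d)/sources
mkdir -p "$src"

printf '#!tsv\nname\tcolour\nplum\tpurple\n' > "$src/fruit.tsv"

# The class attribute on <书社:apply-attributes> is applied to
# whatever element(s) it ends up wrapping.
cat > "$src/index.xml" <<'EOF'
<?xml version="1.0"?>
<article
  xmlns="http://www.w3.org/1999/xhtml"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
  <书社:apply-attributes class="fruit">
    <书社:link xlink:show="embed" xlink:href="about:shushe?source=fruit.tsv"/>
  </书社:apply-attributes>
</article>
EOF
```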

In both cases, attributes from various sources are combined with
white·space between them.
Attribute application takes place after all ordinary transforms have
completed.

Both elements ignore attributes in the `xml:` name·space, except for
`@xml:lang`, which ignores all but the first definition (including
any already present on the root element).
On H·T·M·L and S·V·G elements, `@lang` has the same behaviour as
`@xml:lang`.

## Other Kinds of Output

There are a few special elements in the `书社:` name·space which, if
they appear as the toplevel element in a transformation result, cause
⛩️📰 书社 to produce something other than an X·M·L file.

- **`<书社:archive>`:**
  Each child element with a `@书社:archived-as` attribute will be
  archived as a separate file in a resulting tarball (this attribute
  gives the file name).
  These elements will be processed the same as the root elements of any
  other file (e·g, they will be wrapped; they can themselves specify
  non‐X·M·L output types, ⁊·c).
  Other child elements will be ignored.

- **`<书社:base64-binary>`:**
  The text nodes in the transformation result will, after removing all
  Ascii whitespace, be treated as a Base·64 string, which is then
  decoded.

- **`<书社:raw-text>`:**
  A plaintext (U·T·F‐8) file will be produced from the text nodes in
  the transformation result.
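
As a rough illustration of the archive form, this shell sketch writes a document whose root is a `<书社:archive>` with two archived children.
The file names are hypothetical, and the sketch assumes that no transform rewrites the root element, so that it is still the toplevel element of the transformation result.

```shell
#!/bin/sh
src=$(mktemp -d)/sources
mkdir -p "$src"

# Each child with @书社:archived-as becomes one file in the tarball;
# the first child produces plaintext, the second an X·M·L page.
cat > "$src/bundle.xml" <<'EOF'
<?xml version="1.0"?>
<书社:archive
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
  <书社:raw-text 书社:archived-as="README.txt">Plain text goes here.</书社:raw-text>
  <html:article 书社:archived-as="index.html">
    <html:h1>Hello</html:h1>
  </html:article>
</书社:archive>
EOF
```

Generating archives requires `pax`, as noted in the list of dependencies.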

This repository conforms to [REUSE][].

Most source files are licensed under the terms of the <cite>Mozilla
Public License, version 2.0</cite>.

[REUSE]: <https://reuse.software/spec/>
[draft-phillips-record-jar-01]: <https://datatracker.ietf.org/doc/html/draft-phillips-record-jar-01>