<!--
-SPDX-FileCopyrightText: 2024 Lady <https://www.ladys.computer/about/#lady>
+SPDX-FileCopyrightText: 2024, 2025, 2026 Lady <https://www.ladys.computer/about/#lady>
SPDX-License-Identifier: CC0-1.0
-->
# ⛩📰 书社
extensions rather than standardized.][rdar-92753335]
Despite this, the default Macintosh implementation will still work with
⛩📰 书社, with the caveat that the timestamp will only include a
- fractional component when a Posix⹀compliant (e·g, Macintosh legacy or
+ fractional component when a Posix‐compliant (e·g, Macintosh legacy or
G·N·U) implementation is used.
### `file`
- **`--separator`** must be useable to set the separator that `file`
uses to separate file names from types.
-These options are implemented by the
- [Fine Free File Command](https://darwinsys.com/file/), which is used
- by most operating systems.
+These options are implemented by the [Fine Free File Command][F3C],
+ which is used by most operating systems.
### `git`
`POSIX.1-2001` (altho they are made mandatory in `POSIX.1-2008`) and
they are not included in the Linux Standard Base or installed by
default in many distributions.
-The G·N·U [Sharutils](https://www.gnu.org/software/sharutils/) package
- provides one implementation.
+The G·N·U [Sharutils][] package provides one implementation.
### `xmlcatalog` and `xmllint`
| `exsl:` | `http://exslt.org/common` |
| `exslstr:` | `http://exslt.org/strings` |
| `html:` | `http://www.w3.org/1999/xhtml` |
+| `rdf:` | `http://www.w3.org/1999/02/22-rdf-syntax-ns#` |
| `svg:` | `http://www.w3.org/2000/svg` |
| `xlink:` | `http://www.w3.org/1999/xlink` |
| `xslt:` | `http://www.w3.org/1999/XSL/Transform` |
Multiple include directories can be provided, so long as the same
file subpath doesn’t exist in more than one of them.
+- **`DATADIR`:**
+ If set to the location of a directory, ⛩📰 书社 will run a two‐stage
+ build.
+ In the first stage, only files in `SRCDIR` which match
+ `FINDDATARULES` (see below) will be built, with files in `DATADIR`
+ serving as includes.
+ In the second stage, the remaining files in `SRCDIR` will be built,
+ with the files built during the first stage, in addition to any
+ files in `INCLUDEDIR`, serving as includes.
+ Files built during the first stage are copied into `DESTDIR`
+ alongside those from the second stage when installing.
+
+ This functionality is intended for sites where the bulk of the site
+ can be built from a few data files which are expensive to create.
+
- **`BUILDDIR`:**
The location of the (temporary) build directory (default: `build`).
`make clean` will delete this, and it is recommended that it not be
The location of the directory to output files to (default: `public`).
`make install` will overwrite files in this directory which
correspond to those in `SRCDIR`.
- It *will not* touch other files, including those generated from files
+ It _will not_ touch other files, including those generated from files
in `SRCDIR` which have since been deleted.
Files are first compiled to `$(BUILDDIR)/public` before they are
copied to `DESTDIR`, so this folder is relatively quick and
inexpensive to re·create.
It’s reasonable to simply delete it before every `make install` to
- ensure stale content is removed.
+ ensure stale content is removed, assuming copies are quick on your
+ file·system.
- **`THISDIR`:**
The location of the ⛩📰 书社 `GNUmakefile`.
- This should be set automatically when calling Make and shouldn’t ever
+ This should be set automatically when calling Make and shouldn’t ever
need to be set manually.
This variable is used to find the ⛩📰 书社 `lib/` folder, which is
expected to be in the same location.
- **`EXTRAMAGIC`:**
The value of this variable is appended to `MAGIC` by default, to
- enable additional magic files without overriding the existing ones.
+ enable additional magic files without overriding the existing
+ ones (default: empty).
- **`FINDRULES`:**
Rules to use with `find` when searching for source files.
- **`EXTRAFINDRULES`:**
The value of this variable is appended to `FINDRULES` by default, to
- enable additional rules without overriding the existing ones.
+ enable additional rules without overriding the existing ones
+ (default: empty).
- **`FINDINCLUDERULES`:**
Rules to use with `find` when searching for includes (default:
- **`EXTRAFINDINCLUDERULES`:**
The value of this variable is appended to `FINDINCLUDERULES` by
default, to enable additional rules without overriding the existing
- ones.
+ ones (default: empty).
+
+- **`DATAOPTS`:**
+ Additional options to use when calling Make during the first stage of
+ a two‐stage build using `DATADIR` (default: empty).
+
+ This can be used to override variables which are only applicable
+ during the second stage.
+ Note that when supplying this variable on the shell, it will need to
+ be twice‐quoted.
+
+- **`DATAEXT`:**
+ A list of file extensions which signify “data” files during a
+ two‐stage build using `DATADIR` (default: `rdf`).
+
+- **`FINDDATARULES`:**
+ Rules to use with `find` when searching for data files.
+ By default, these rules are derived from `DATAEXT`.
+
+- **`EXTRAFINDDATARULES`:**
+ The value of this variable is appended to `FINDDATARULES` by
+ default, to enable additional rules without overriding the existing
+ ones (default: empty).
+
+- **`FINDFILTERONLY`:**
+ A semicolon‐separated list of regular expressions, at least one of
+ which the paths for sources and includes are required to match,
+ unless empty (default: empty).
+
+- **`FINDFILTEROUT`:**
+ A semicolon‐separated list of regular expressions, each of which
+ matches paths that should _not_ be considered sources or includes
+ (default: empty).
+
+- **`FINDINCLUDEFILTERONLY`:**
+ A semicolon‐separated list of regular expressions, at least one of
+ which the paths for includes are required to match, unless empty
+ (default: empty).
+
+ Note that only paths which already match `FINDFILTERONLY` are
+ considered.
+
+- **`FINDINCLUDEFILTEROUT`:**
+ A semicolon‐separated list of regular expressions, each of which
+ matches paths that should _not_ be considered includes, but may
+ still be considered sources (default: empty).
+
+- **`FINDFILTERONLYEXTENDED`:**
+ If non·empty, `FINDFILTERONLY` is an extended regular expression;
+ otherwise, it is basic (default: empty).
+
+- **`FINDFILTEROUTEXTENDED`:**
+ If non·empty, `FINDFILTEROUT` is an extended regular expression;
+ otherwise, it is basic (default: matches `FINDFILTERONLYEXTENDED`).
+
+- **`FINDINCLUDEFILTERONLYEXTENDED`:**
+ If non·empty, `FINDINCLUDEFILTERONLY` is an extended regular
+ expression; otherwise, it is basic (default: matches
+ `FINDFILTERONLYEXTENDED`).
+
+- **`FINDINCLUDEFILTEROUTEXTENDED`:**
+ If non·empty, `FINDINCLUDEFILTEROUT` is an extended regular
+ expression; otherwise, it is basic (default: `1` if either
+ `FINDFILTEROUTEXTENDED` or `FINDINCLUDEFILTERONLYEXTENDED` is
+ non·empty).
- **`PARSERS`:**
A white·space‐separated list of parsers to use (default:
- **`EXTRAPARSERS`:**
The value of this variable is appended to `PARSERS` by default, to
- enable additional parsers without overriding the existing ones.
+ enable additional parsers without overriding the existing ones
+ (default: empty).
+
+- **`PARSERLIBS`:**
+ A white·space‐separated list of parser dependencies (default:
+ `$(THISDIR)/lib/split.xslt`).
+
+- **`EXTRAPARSERLIBS`:**
+ The value of this variable is appended to `PARSERLIBS` by default, to
+ enable additional parser dependencies without overriding the
+ existing ones (default: empty).
- **`TRANSFORMS`:**
A white·space‐separated list of transforms to use (default:
- **`EXTRATRANSFORMS`:**
The value of this variable is appended to `TRANSFORMS` by default, to
- enable additional transforms without overriding the existing ones.
+ enable additional transforms without overriding the existing ones
+ (default: empty).
+
+- **`TRANSFORMLIBS`:**
+ A white·space‐separated list of transform dependencies (default:
+ `$(THISDIR)/lib/serialize.xslt`).
+
+- **`EXTRATRANSFORMLIBS`:**
+ The value of this variable is appended to `TRANSFORMLIBS` by default,
+ to enable additional transform dependencies without overriding the
+ existing ones (default: empty).
- **`XMLTYPES`:**
A white·space‐separated list of media types or media type suffixes to
Source files may be placed in `SRCDIR` in any manner; the file
structure used there will match the output.
-The type of source files is *not* determined by file extension, but
+The type of source files is _not_ determined by file extension, but
rather by magic number; this means that files **must** begin with
something recognizable.
Supported magic numbers include :—
The former characters have the potential to conflict with make syntax,
a leading hyphen‐minus is confusable for a commandline argument, and a
trailing cloparen [activates a bug in G·N·U Make
- 3.81](https://stackoverflow.com/questions/17148468/capturing-filenames-including-parentheses-with-gnu-makes-wildcard-function#comment24825307_17148894).
+ 3.81][so-17148468-comment].
## Parsers
- Follows this with a quoted string giving a media type supported by
the parser.
- Media type parameters are *not* supported.
+ Media type parameters are _not_ supported.
- Follows this with the string `]`.
The value of this attribute will be the value of the `<书社:id>`
toplevel element in the parser.
+Parsers **should** have an `<书社:id>` and, if present, it **must** be
+ unique.
+
It is possible for parsers to support zero plaintext types.
This is useful when targeting specific dialects of X·M·L; parsers in
this sense operate on the same basic principles as transforms
(described below).
The major distinction between X·M·L parsers and transforms is where in
the process the transformation happens:
-Parsers are applied *prior* to embedding (and can be used to generate
- embeds); transforms are applied *after*.
+Parsers are applied _prior_ to embedding (and can be used to generate
+ embeds); transforms are applied _after_.
-It is **strongly recommended** that auxillary templates in parsers be
+It is **strongly recommended** that auxiliary templates in parsers be
name·spaced (by `@name` or `@mode`) whenever possible, to avoid
conflicts between parsers.
- A `@书社:media-type` attribute, giving the identified media type of
the plaintext node.
+### Parsed metadata
+
+It is possible to extract metadata from a document at the same time as
+ it is being parsed.
+This is done by creating result elements in the `书社:about` mode;
+ these should be R·D·F property elements which apply to the conceptual
+ entity that is the document being parsed.
+
+During transformation, metadata for the file with identifier `$FILE`
+ can be read from the children of
+ `$书社:about//*[@rdf:about=$FILE]/nie:interpretedAs/*`.
+
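As a rough sketch of the above, a parser for a hypothetical X·M·L dialect might emit a title property alongside its ordinary output.
The `ex:` name·space, the `<ex:post>` element, the `example:` identifier, and the choice of `dcterms:title` are all illustrative, and the `书社:` name·space u·r·i is assumed rather than taken from this section.
It is also an assumption that the `书社:about` mode is applied to the same node as the default mode.

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:dcterms="http://purl.org/dc/terms/"
  xmlns:ex="https://example.com/ns/post"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  version="1.0"
>
  <书社:id>example:post-parser.xslt</书社:id>
  <!-- Ordinary parsing: convert the hypothetical dialect to H·T·M·L. -->
  <template match="ex:post">
    <html:article>
      <html:h1><value-of select="@title"/></html:h1>
      <apply-templates/>
    </html:article>
  </template>
  <!-- Metadata: an R·D·F property element describing the document. -->
  <template match="ex:post" mode="书社:about">
    <dcterms:title><value-of select="@title"/></dcterms:title>
  </template>
</transform>
```

Under these assumptions, a transform could read the value back from `$书社:about//*[@rdf:about=$FILE]/nie:interpretedAs/*/dcterms:title`.
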
## Output Redirection
By default, ⛩📰 书社 installs files to the same location in `DESTDIR`
This attribute is read after parsing, but before transformation (where
it is silently dropped).
+Multiple destinations can be provided if the same file should be output to multiple places.
+The file is retransformed each time, with the value of the `DESTINATION` global param set appropriately.
+
## Embedding
Documents can be embedded in other documents using a `<书社:link>`
- element with `@xlink:show="embed"`.
+ element with `@xlink:show="embed"` and an `@xlink:actuate` which is
+ absent or `"none"`.
The `@xlink:href`s of these elements should have the format
`about:shushe?source=<path>`, where `<path>` provides the path to the
file within `SRCDIR`.
freely embedded, instead use the format
`about:shushe?include=<path>`, where `<path>` provides the path
within `INCLUDEDIR`.
+If `<path>` indicates a directory and ends with a slash (`/`),
+ everything within that directory will be embedded.
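
For illustration, a source file might contain embeds like the following.
The paths are hypothetical, and the `书社:` name·space u·r·i is an assumption not stated in this section.

```xml
<html:article
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
  <!-- Embed a single file from SRCDIR. -->
  <书社:link xlink:show="embed" xlink:href="about:shushe?source=posts/hello.xml"/>
  <!-- Embed everything in a directory within INCLUDEDIR (note the trailing slash). -->
  <书社:link xlink:show="embed" xlink:href="about:shushe?include=snippets/"/>
</html:article>
```
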
Embeds are replaced with the parsed contents of a file, unless the file
is an asset, in which case an `<html:object>` element is produced
These attributes are used to scope any nested `<html:meta>` elements
with `@itemprop` attributes to their containing documents.
+## Soft Dependencies
+
+When a file depends only on the metadata of another file, and not its
+ contents, it can be added as a soft dependency rather than an embed.
+Soft dependencies are indicated using a `<书社:link>` element with an
+ `@xlink:show` of `"other"`, `"none"`, or absent, and an
+ `@xlink:actuate` which is absent or `"none"`.
+A change to a soft dependency requires a file to be rebuilt, but no
+ embedding occurs automatically.
+Because there is no automatic embedding, soft dependencies are allowed
+ to be recursive.
+
+The `@xlink:href`s of soft dependency `<书社:link>`s are processed in
+ exactly the same fashion as embeds, described above.
+
+If the value of `@xlink:show` is `"other"`, the soft dependency is
+ transitive.
+Any dependencies of the indicated file which have a `@name` which
+ matches that of the referencing `<书社:link>` element will also be
+ treated as soft dependencies.
+If no `@name` is given, it is treated as the empty string.
+
+When a document is embedded directly, all of its soft dependencies are
+ also treated as soft dependencies of the embedding object.
+However, when a document is embedded in a transitive soft dependency, the
+ embed is treated exactly as tho it were itself a transitive soft
+ dependency.
+That means it must have a matching `@name` to be included, and
+ like·wise for any embeds or soft dependencies it contains.
+
+If the value of `@xlink:show` is `"none"` or absent, the soft
+ dependency is not transitive and its own dependencies are not
+ checked.
+
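The cases above might look as follows in a source file.
This is only a sketch: the paths and the `feed` name are hypothetical, and the `书社:` name·space u·r·i is assumed.

```xml
<html:article
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
  <!-- Not transitive: a change to about.xml triggers a rebuild, but its own dependencies are not checked. -->
  <书社:link xlink:show="none" xlink:href="about:shushe?source=about.xml"/>
  <!-- Also not transitive, because @xlink:show is absent. -->
  <书社:link xlink:href="about:shushe?source=contact.xml"/>
  <!-- Transitive: dependencies of posts/index.xml whose @name is "feed" are also treated as soft dependencies. -->
  <书社:link name="feed" xlink:show="other" xlink:href="about:shushe?source=posts/index.xml"/>
</html:article>
```

None of these links causes any content to be embedded; they only affect when the containing file is rebuilt.
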
## Transforms
Transforms are used to convert X·M·L files into their final output,
- **`transforms/asset.xslt`:**
Converts `<html:object>` elements which correspond to recognized
media types into the appropriate H·T·M·L elements, and deletes
- `<html:style>` elements from the body of the document and moves
- them to the head.
+ `<html:style>` elements from the body of the document while
+ transferring them to the head.
+ This conversion happens during the finalization phase, after the main
+ transformation.
+
+- **`transforms/expansion.xslt`:**
+ Performs embedding, as described above.
- **`transforms/metadata.xslt`:**
Provides basic `<html:head>` metadata.
Provides the title of the page.
⛩📰 书社 automatically encapsulates H·T·M·L embeds so that their
- metadata does not propogate up to the embedding document.
+ metadata does not propagate up to the embedding document.
To undo this behaviour, remove the `@itemscope` and `@itemtype`
attributes from the embed during the transformation phase.
- **`transforms/serialization.xslt`:**
Replaces `<书社:serialize-xml>` elements with the (escaped)
serialized X·M·L of their contents.
- This replacement happens during the application phase, after most
+ This replacement happens during the finalization phase, after most
other transformations have taken place.
If a `@with-namespaces` attribute is provided, any name·space nodes
- **`THISREV`:**
The value of the `THISREV` variable (if present).
-The following params are only available in transforms :—
+In transforms, the following params are additionally available :—
-- **`PATH`:**
- The path of the output file (within `DESTDIR`).
+- **`DESTINATION`:**
+ The destination being targeted by this transform.
+
+- **`书社:about`:**
+ R·D·F metadata about all of the documents ⛩📰 书社 knows about.
+ Use `$书社:about//*[@rdf:about=$IDENTIFIER]` to get the metadata for
+ the current document.
+
+- **`书社:source`:**
+ The parsed source document being transformed, prior to any expansion.
+
+- **`书社:expansion`:**
+ The document after all embeds have been expanded.
+ Unavailable during the `书社:expand` stage.
+
+- **`书社:result`:**
+ The document after the main set of transformations has been applied.
+ Only available during the `书社:finalize` stage, where it is used to
+ apply output wrapping and other clean·up.
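
As a minimal sketch of how these params might be used, the following transform adds a description to the `<html:head>` when one is recorded in the metadata.
It assumes that the params are declared by ⛩📰 书社 and need not be redeclared, that `$IDENTIFIER` names the same resource which parsed metadata is attached to, and that some parser emits a `dcterms:description` property.
The `example:` identifier is hypothetical and the `书社:` name·space u·r·i is assumed.

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:dcterms="http://purl.org/dc/terms/"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:nie="http://www.semanticdesktop.org/ontologies/2007/01/19/nie#"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  version="1.0"
>
  <书社:id>example:description-meta.xslt</书社:id>
  <template match="书社:id[string(.)='example:description-meta.xslt']" mode="书社:metadata">
    <!-- The node carrying parsed metadata for the current document. -->
    <variable name="about" select="$书社:about//*[@rdf:about=$IDENTIFIER]/nie:interpretedAs/*"/>
    <!-- Only emit the element when a description was actually recorded. -->
    <if test="$about/dcterms:description">
      <html:meta name="description" content="{$about/dcterms:description}"/>
    </if>
  </template>
</transform>
```
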
## Output Wrapping
-⛩📰 书社 will wrap the final output of the transforms in appropriate
- `<html:html>` and `<html:body>` elements, so it is not necessary for
- transforms to do this explicitly.
-After performing the initial transform, ⛩📰 书社 will match the root
- node of the result in the following modes to fill in areas of the
+Provided at least one toplevel result element belongs to the H·T·M·L
+ namespace, ⛩📰 书社 will wrap the final output of the transforms in
+ appropriate `<html:html>` and `<html:body>` elements, so it is not
+ necessary for transforms to do this explicitly.
+If a toplevel result element _is_ an `<html:html>` or `<html:body>`
+ element, it will be merged with the one that ⛩📰 书社 creates.
+Consequently, wrapping the result in a `<html:body>` element can be
+ used to enable wrapping for non‐H·T·M·L content, when desired.
+
+As a part of this process, after performing the initial transform,
+ ⛩📰 书社 will match in the following modes to fill in areas of the
wrapper :—
- **`书社:header`:**
The result of matching in this mode is inserted into the
`<html:head>` of the output.
-In addition to being called with the transform result, each of these
- modes will additionally be called with a `<xslt:include>` element
- corresponding to each transform.
-If a transform has a `<书社:id>` top‐level element whose value is an
- i·r·i, its `<xslt:include>` element will have a corresponding
- `@书社:id` attribute.
-This mechanism can be used to allow transforms to insert content
- without matching any elements in the result; for example, the
- following transform adds a link to a stylesheet to the `<html:head>`
- of every page :—
+The document being matched will contain the full transform result
+ prior to wrapping as well as an `<书社:id>` element for each
+ transform.
+The latter elements can be matched to enable transforms to provide
+ content _without_ matching any elements in the result; for example,
+ the following transform adds a link to a stylesheet to the
+ `<html:head>` of every page :—
```xml
<?xml version="1.0"?>
version="1.0"
>
<书社:id>example:add-stylesheet-links.xslt</书社:id>
- <template match="xslt:include[@书社:id='example:add-stylesheet-links.xslt']" mode="书社:metadata">
+ <template match="书社:id[string(.)='example:add-stylesheet-links.xslt']" mode="书社:metadata">
<html:link rel="stylesheet" type="text/css" href="/style.css"/>
</template>
</transform>
the result tree.
It will not be performed on outputs whose root elements are
`<书社:archive>`, `<书社:base64-binary>`, or `<书社:raw-text>`
- (described below).
+ (described below), or on result trees which do not contain a toplevel
+ element in the H·T·M·L namespace.
## Applying Attributes
In both cases, attributes from various sources are combined with
white·space between them.
-Attribute application takes place after all ordinary transforms have
- completed.
+Attribute application takes place after each stage of the
+ transformation, including after the initial embedding phase.
Both elements ignore attributes in the `xml:` name·space, except for
`@xml:lang`, which ignores all but the first definition (including
Most source files are licensed under the terms of the <cite>Mozilla
Public License, version 2.0</cite>.
+[F3C]: <https://darwinsys.com/file/>
[REUSE]: <https://reuse.software/spec/>
+[Sharutils]: <https://www.gnu.org/software/sharutils/>
[draft-phillips-record-jar-01]: <https://datatracker.ietf.org/doc/html/draft-phillips-record-jar-01>
[rdar-92753335]: <https://github.com/apple-oss-distributions/patch_cmds/blob/5084833f90df1b0e0924ea56f94c0199b3b8bbc6/diff/diffreg.c#L1800-L1808>
+[so-17148468-comment]: <https://stackoverflow.com/questions/17148468/capturing-filenames-including-parentheses-with-gnu-makes-wildcard-function#comment24825307_17148894>