X-Git-Url: https://git.ladys.computer/Shushe/blobdiff_plain/af8b42c720c6aaa54199b556a656d40e8b8951a6..ca7be60f15de191e25f8cc1890e13dc499576ec4:/README.markdown?ds=inline
diff --git a/README.markdown b/README.markdown
index 84c3206..70a9730 100644
--- a/README.markdown
+++ b/README.markdown
@@ -1,8 +1,12 @@
-# ⛩️📰 书社
+
+# ⛩📰 书社
 
-An X·S·L·T‐based static site generator.
+A make·file for X·M·L.
 
-⛩️📰 书社 aims to make it easy to generate websites with
+⛩📰 书社 aims to make it easy to generate websites with
   X·S·L·T and G·N·U Make.
 It is consequently only a good choice for people who like X·S·L·T and
   G·N·U Make and wish it were easier to make websites with them.
@@ -17,7 +21,12 @@ It makes things easier by :—
 - Enabling easy inclusion of source files within each other.
 
 It aims to do this with zero dependencies beyond the programs already
-  installed on your computer.
+  installed on your computer†.
+
+† Assuming an operating system with a fairly featureful, and
+  Posix‐compliant, development setup (e·g, Macintosh ≥ version 10.8).
+In fact, on Linux you will probably need to install a few programs:
+  `libxml2-utils`, `xsltproc`, `sharutils`, and `pax`.
 
 ## Nomenclature
 
@@ -36,33 +45,129 @@ In Japanese, it is an alternate spelling for やしろ,
 The name 书社 was chosen to play on this pun, as
   it is intended as a publishing program for webshrines.
 
-In Ascii environments, ⛩️📰 书社 should be written `Shushe`, following
+In Ascii environments, ⛩📰 书社 should be written `Shushe`, following
   the pinyin transliteration.
 
+## Prerequisites
+
+In most cases, ⛩📰 书社 aims to require only functionality which is
+  present in all Posix‐compliant (`POSIX.1-2001`) operating systems.
+There are a few exceptions.
+Details on particular programs are given below; if a program is not
+  listed, it is assumed that any Posix‐compliant implementation will
+  work.
+
+### `diff`
+
+This is a Posix utility, but ⛩📰 书社 depends on functionality
+  introduced after `POSIX.1-2001` (the `-u` option, introduced in
+  `POSIX.1-2008`).
+Macintosh systems somewhat interestingly implement this option
+  correctly in legacy mode (`COMMAND_MODE=legacy`) but incorrectly by
+  default (despite claiming `POSIX.1-2008` conformance for this
+  utility).
+[Note this erroneous comment claiming nanosecond & timezone are
+  extensions rather than standardized.][rdar-92753335]
+Despite this, the default Macintosh implementation will still work with
+  ⛩📰 书社, with the caveat that the timestamp will only include a
+  fractional component when a Posix⹀compliant (e·g, Macintosh legacy or
+  G·N·U) implementation is used.
+
+### `file`
+
+This is a Posix utility, but it was considered optional in
+  `POSIX.1-2001` (altho it was made mandatory in `POSIX.1-2008`) and
+  ⛩📰 书社 currently depends on unspecified behaviour.
+It requires support for the following additional options :—
+
+- **`-C`**, when supplied with `-m`, must be useable to compile a
+    `.mgc` magicfile for use with future invocations of `file`.
+
+- **`--files-from`** must be useable to provide a file that `file`
+    should read file·names from, and `-` must be useable in this
+    context to specify the standard input.
+
+- **`--mime-type`** must cause `file` to print the internet media type
+    of the file with no charset parameter.
+
+- **`--separator`** must be useable to set the separator that `file`
+    uses to separate file names from types.
+
+These options are implemented by the
+  [Fine Free File Command](https://darwinsys.com/file/), which is used
+  by most operating systems.
+
+### `git`
+
+This is not a Posix utility.
+Usage of `git` is optional, but recommended (and activated by default).
+To disable it, set `GIT=`.
+
+### `make`
+
+This is a Posix utility, but it is considered an optional Software
+  Development utility and ⛩📰 书社 currently depends on unspecified
+  behaviour.
+⛩📰 书社 requires specifically the G·N·U version of `make`, and
+  depends on functionality present in version 3.81 or later.
+It is not expected to work in previous versions, or with other
+  implementations of Make.
+
+### `pax`
+
+This is a Posix utility, but it is not included in the Linux Standard
+  Base or installed by default in many distributions.
+⛩📰 书社 only requires support for the `ustar` format.
+
+### `uudecode` and `uuencode`
+
+These are Posix utilities, but they were considered optional in
+  `POSIX.1-2001` (altho they are made mandatory in `POSIX.1-2008`) and
+  they are not included in the Linux Standard Base or installed by
+  default in many distributions.
+The G·N·U [Sharutils](https://www.gnu.org/software/sharutils/) package
+  provides one implementation.
+
+### `xmlcatalog` and `xmllint`
+
+These are not a Posix utilities.
+They are a part of `libxml2`, but may need to be installed separately
+  on some platforms (e·g by the name `libxml2-utils`).
+
+### `xsltproc`
+
+This is not a Posix utility.
+It is a part of `libxslt`, but may need to be installed separately on
+  some platforms.
+
 ## Basic Usage
 
 Place source files in `sources/` and run `make install` to compile
   the result to `public/`.
 Compilation involves the following steps :—
 
-1. ⛩️📰 书社 compiles all of the magic files in `magic/` into a single
+1. ⛩📰 书社 compiles all of the magic files in `magic/` into a single
     file, `build/magic.mgc`.
 
-2. ⛩️📰 书社 processes all of the parsers in `parsers/` and determines
+2. ⛩📰 书社 processes all of the parsers in `parsers/` and determines
     the list of supported plaintext types.
 
-3. ⛩️📰 书社 identifies all of the source files and includes and uses
+3. ⛩📰 书社 identifies all of the source files and includes and uses
     `build/magic.mgc` to classify them by media type.
 
-4. ⛩️📰 书社 parses all plaintext and X·M·L source files and includes
+4. ⛩📰 书社 parses all plaintext and X·M·L source files and includes
     and then builds a dependency tree between them.
 
-5. ⛩️📰 书社 uses the dependency tree to establish prerequisites for
+5. ⛩📰 书社 uses the dependency tree to establish prerequisites for
     each output file.
 
-6. ⛩️📰 书社 compiles each output file to `build/public`.
+6. ⛩📰 书社 compiles each output file to `build/result`.
 
-7. ⛩️📰 书社 copies the output files to `public`.
+7. ⛩📰 书社 copies most output files from `build/result` to
+     `build/public`, but it does some additional processing instead on
+     those which indicate a non‐X·M·L desired final output form.
+
+8. ⛩📰 书社 copies the final resulting files to `public`.
 
 You can use `make list` to list each identified source file or include
   alongside its computed type and dependencies.
@@ -70,91 +175,241 @@ As this is a Make‐based program, steps will only be run if the
   corresponding buildfile or output file is older than its
   prerequisites.
 
-## Namespaces
+## Name·spaces
 
-The ⛩️📰 书社 namespace is `urn:fdc:ladys.computer:20231231:Shu1She4`.
+The ⛩📰 书社 name·space is `urn:fdc:ladys.computer:20231231:Shu1She4`.
 
-This document uses a few namespace prefixes, with the following
+This document uses a few name·space prefixes, with the following
   meanings :—
 
-|   Prefix | Expansion                                  |
-| -------: | :----------------------------------------- |
-|  `html:` | `http://www.w3.org/1999/xhtml`             |
-| `xlink:` | `http://www.w3.org/1999/xlink`             |
-|  `xslt:` | `http://www.w3.org/1999/XSL/Transform`     |
-|  `书社:` | `urn:fdc:ladys.computer:20231231:Shu1She4` |
+|     Prefix | Expansion                                     |
+| ---------: | :-------------------------------------------- |
+| `catalog:` | `urn:oasis:names:tc:entity:xmlns:xml:catalog` |
+|    `exsl:` | `http://exslt.org/common`                     |
+| `exslstr:` | `http://exslt.org/strings`                    |
+|    `html:` | `http://www.w3.org/1999/xhtml`                |
+|     `svg:` | `http://www.w3.org/2000/svg`                  |
+|   `xlink:` | `http://www.w3.org/1999/xlink`                |
+|    `xslt:` | `http://www.w3.org/1999/XSL/Transform`        |
+|    `书社:` | `urn:fdc:ladys.computer:20231231:Shu1She4`    |
 
 ## Setup and Configuration
 
-⛩️📰 书社 depends on the following programs to run.
+⛩📰 书社 depends on the following programs to run.
 In every case, you may supply your own implementation by overriding the
   corresponding (allcaps) variable (e·g, set `MKDIR` to supply your own
   `mkdir` implementation).
 
+- `awk`
 - `cat`
+- `cd`
+- `cksum`
 - `cp`
-- `echo`
+- `date`
+- `diff`
 - `file`
 - `find`
-- `mkdir` (requires support for `-p`)
+- `git` (optional; set `GIT=` to disable)
+- `grep`
+- `ln`
+- `mkdir`
 - `mv`
+- `od`
+- `pax` (only when generating archives)
 - `printf`
 - `rm`
 - `sed`
 - `sleep`
 - `test`
 - `touch`
-- `tr` (requires support for `-d`)
-- `uuencode` (requires support for `-m` and `-r`)
+- `tr`
+- `uuencode`
+- `uudecode`
+- `xargs`
 - `xmlcatalog` (provided by `libxml2`)
 - `xmllint` (provided by `libxml2`)
 - `xsltproc` (provided by `libxslt`)
 
 The following additional variables can be used to control the behaviour
-  of ⛩️📰 书社 :—
+  of ⛩📰 书社 :—
 
 - **`SRCDIR`:**
   The location of the source files (default: `sources`).
+  Multiple source directories can be provided, so long as the same
+    file subpath doesn’t exist in more than one of them.
 
 - **`INCLUDEDIR`:**
-  The location of the source files (default: `sources/includes`).
+  The location of source includes (default: `sources/includes`).
   This can be inside of `SRCDIR`, but needn’t be.
+  Multiple include directories can be provided, so long as the same
+    file subpath doesn’t exist in more than one of them.
+
+- **`DATADIR`:**
+  If set to the location of a directory, ⛩📰 书社 will run a two‐stage build.
+  In the first stage, only files in `SRCDIR` which match `FINDDATARULES` (see below) will be built, with files in `DATADIR` serving as includes.
+  In the second stage, the remaining files in `SRCDIR` will be built, with the files built during the first stage, in addition to any files in `INCLUDEDIR`, serving as includes.
+  Files built during the first stage are copied into `DESTDIR` alongside those from the second stage when installing.
+
+  This functionality is intended for sites where the bulk of the site can be built from a few data files which are expensive to create.
 
 - **`BUILDDIR`:**
   The location of the (temporary) build directory (default: `build`).
+  `make clean` will delete this, and it is recommended that it not be
+    used for programs aside from ⛩📰 书社.
 
 - **`DESTDIR`:**
   The location of directory to output files to (default: `public`).
+  `make install` will overwrite files in this directory which
+    correspond to those in `SRCDIR`.
+  It *will not* touch other files, including those generated from files
+    in `SRCDIR` which have since been deleted.
+
+  Files are first compiled to `$(BUILDDIR)/public` before they are
+    copied to `DESTDIR`, so this folder is relatively quick and
+    inexpensive to re·create.
+  It’s reasonable to simply delete it before every `make install` to
+    ensure stale content is removed.
 
 - **`THISDIR`:**
-  The location of the ⛩️📰 书社 `GNUmakefile`.
+  The location of the ⛩📰 书社 `GNUmakefile`.
   This should be set automatically when calling Make and shouldn’t ever
     need to be set manually.
-  This variable is used to find the ⛩️📰 书社 `lib/` folder, which is
+  This variable is used to find the ⛩📰 书社 `lib/` folder, which is
     expected to be in the same location.
 
-- **`MAGICDIR`:**
-  The location of the magic files to use (default: `$(THISDIR)/magic`).
+- **`MAGIC`:**
+  A white·space‐separated list of magic files to use (default:
+    `$(THISDIR)/magic/*`).
 
-- **`FINDOPTS`:**
-  Options to pass to `find` when searching for source files (default:
-    `-LE`).
+- **`EXTRAMAGIC`:**
+  The value of this variable is appended to `MAGIC` by default, to
+    enable additional magic files without overriding the existing ones.
 
 - **`FINDRULES`:**
-  Rules to use with `find` when searching for source files (default:
-    `-flags -nohidden -and -not -name '.*'`).
+  Rules to use with `find` when searching for source files.
+  The default ignores files that start with a period or hyphen‐minus,
+    those which end with a cloparen, and those which contain a hash,
+    buck, percent, asterisk, colon, semi, eroteme, bracket, backslash,
+    or pipe.
+  It is important that these rules not produce any output, as anything
+    printed to `stdout` will be considered a result of the find.
+
+- **`EXTRAFINDRULES`:**
+  The value of this variable is appended to `FINDRULES` by default, to
+    enable additional rules without overriding the existing ones.
+
+- **`FINDINCLUDERULES`:**
+  Rules to use with `find` when searching for includes (default:
+    `$(FINDRULES)`).
+
+- **`EXTRAFINDINCLUDERULES`:**
+  The value of this variable is appended to `FINDINCLUDERULES` by
+    default, to enable additional rules without overriding the existing
+    ones.
+
+- **`DATAOPTS`:**
+  Additional options to use when calling Make during the first stage of a two‐stage build using `DATADIR`.
+
+  This can be used to override variables which are only applicable during the second stage.
+  Note that when supplying this variable on the shell, it will need to be double‐quoted.
+
+- **`DATAEXT`:**
+  A list of file extensions which signify “data” files during a two‐stage build using `DATADIR`.
+
+- **`FINDDATARULES`:**
+  Rules to use with `find` when searching for data files.
+  By default, these rules are derived from `DATAEXT`.
+
+- **`EXTRAFINDDATARULES`:**
+  The value of this variable is appended to `FINDDATARULES` by
+    default, to enable additional rules without overriding the existing
+    ones.
+
+- **`FINDFILTERONLY`:**
+  A semicolon‐separated list of regular expressions, at least one of which the paths for sources and includes are required to match, unless empty (default: empty).
+
+- **`FINDFILTEROUT`:**
+  A semicolon‐separated list of regular expressions, each of which matches paths that should _not_ be considered sources or includes (default: empty).
+
+- **`FINDINCLUDEFILTERONLY`:**
+  A semicolon‐separated list of regular expressions, at least one of which the paths for includes are required to match, unless empty (default: empty).
+
+  Note that only paths which already match `FINDFILTERONLY` are considered.
+
+- **`FINDINCLUDEFILTEROUT`:**
+  A semicolon‐separated list of regular expressions, each of which matches paths that should _not_ be considered includes, but may still be considered sources (default: empty).
+
+- **`FINDFILTERONLYEXTENDED`:**
+  If non·empty, `FINDFILTERONLY` is an extended regular expression; otherwise, it is basic (default: empty).
+
+- **`FINDFILTEROUTEXTENDED`:**
+  If non·empty, `FINDFILTEROUT` is an extended regular expression; otherwise, it is basic (default: matches `FINDFILTERONLYEXTENDED`).
+
+- **`FINDINCLUDEFILTERONLYEXTENDED`:**
+  If non·empty, `FINDINCLUDEFILTERONLY` is an extended regular expression; otherwise, it is basic (default: matches `FINDFILTERONLYEXTENDED`).
+
+- **`FINDINCLUDEFILTEROUTEXTENDED`:**
+  If non·empty, `FINDINCLUDEFILTEROUT` is an extended regular expression; otherwise, it is basic (default: `1` if either `FINDFILTEROUTEXTENDED` or `FINDINCLUDEFILTERONLYEXTENDED` is non·empty).
 
 - **`PARSERS`:**
   A white·space‐separated list of parsers to use (default:
     `$(THISDIR)/parsers/*.xslt`).
 
+- **`EXTRAPARSERS`:**
+  The value of this variable is appended to `PARSERS` by default, to
+    enable additional parsers without overriding the existing ones.
+
+- **`PARSERLIBS`:**
+  A white·space‐separated list of parser dependencies (default:
+    `$(THISDIR)/lib/split.xslt`).
+
+- **`EXTRAPARSERLIBS`:**
+  The value of this variable is appended to `PARSERLIBS` by default, to
+    enable additional parser dependencies without overriding the
+    existing ones.
+
 - **`TRANSFORMS`:**
   A white·space‐separated list of transforms to use (default:
     `$(THISDIR)/transforms/*.xslt`).
 
+- **`EXTRATRANSFORMS`:**
+  The value of this variable is appended to `TRANSFORMS` by default, to
+    enable additional transforms without overriding the existing ones.
+
+- **`TRANSFORMLIBS`:**
+  A white·space‐separated list of transform dependencies (default:
+    `$(THISDIR)/lib/serialize.xslt`).
+
+- **`EXTRATRANSFORMLIBS`:**
+  The value of this variable is appended to `TRANSFORMLIBS` by default,
+    to enable additional transform dependencies without overriding the
+    existing ones.
+
 - **`XMLTYPES`:**
-  A white·space‐separated list of media types to consider X·M·L
-    (default: `application/xml text/xml`).
+  A white·space‐separated list of media types or media type suffixes to
+    consider X·M·L (default: `application/xml text/xml +xml`).
+
+- **`FINALIZE`:**
+  A program to run on (unspecial) X·M·L files after they are
+    transformed (default: `xmllint --nonet --nsclean`).
+  This variable can be used for postprocessing.
+
+- **`THISREV`:**
+  The current version of ⛩📰 书社 (default: derived from the current
+    git tag/branch/commit).
+
+- **`SRCREV`:**
+  The current version of the source files (default: derived from the
+    current git tag/branch/commit).
+
+- **`QUIET`:**
+  If this variable has a value, informative messages will not be
+    printed (default: empty).
+  Informative messages print to stderr, not stdout, so disabling them
+    usually shouldn’t be necessary.
+  This does not (cannot) disable messages from Make itself, for which
+    the `-s`, `--silent` ∕ `--quiet` Make option is more likely to be
+    useful.
 
 - **`VERBOSE`:**
   If this variable has a value, every recipe instruction will be
@@ -175,6 +430,8 @@ Supported magic numbers include :—
 - `#!js` for `text/javascript` files
 - `@charset "` for `text/css` files
 - `#!tsv` for `text/tab-separated-values` files
+- `%%` for `text/record-jar` files (unregistered; see
+    [[draft-phillips-record-jar-01][]])
 
 Text formats with associated X·S·L·T parsers are wrapped in a H·T·M·L
   `