X-Git-Url: https://git.ladys.computer/Shushe/blobdiff_plain/ca7be60f15de191e25f8cc1890e13dc499576ec4..refs/heads/current:/README.markdown?ds=sidebyside diff --git a/README.markdown b/README.markdown index 70a9730..658866e 100644 --- a/README.markdown +++ b/README.markdown @@ -1,5 +1,5 @@ # ⛩📰 书社 @@ -70,7 +70,7 @@ Macintosh systems somewhat interestingly implement this option extensions rather than standardized.][rdar-92753335] Despite this, the default Macintosh implementation will still work with ⛩📰 书社, with the caveat that the timestamp will only include a - fractional component when a Posix⹀compliant (e·g, Macintosh legacy or + fractional component when a Posix‐compliant (e·g, Macintosh legacy or G·N·U) implementation is used. ### `file` @@ -93,9 +93,8 @@ It requires support for the following additional options :⁠— - **`--separator`** must be useable to set the separator that `file` uses to separate file names from types. -These options are implemented by the - [Fine Free File Command](https://darwinsys.com/file/), which is used - by most operating systems. +These options are implemented by the [Fine Free File Command][F3C], + which is used by most operating systems. ### `git` @@ -125,8 +124,7 @@ These are Posix utilities, but they were considered optional in `POSIX.1-2001` (altho they are made mandatory in `POSIX.1-2008`) and they are not included in the Linux Standard Base or installed by default in many distributions. -The G·N·U [Sharutils](https://www.gnu.org/software/sharutils/) package - provides one implementation. +The G·N·U [Sharutils][] package provides one implementation. 
### `xmlcatalog` and `xmllint` @@ -188,6 +186,7 @@ This document uses a few name·space prefixes, with the following | `exsl:` | `http://exslt.org/common` | | `exslstr:` | `http://exslt.org/strings` | | `html:` | `http://www.w3.org/1999/xhtml` | +| `rdf:` | `http://www.w3.org/1999/02/22-rdf-syntax-ns#` | | `svg:` | `http://www.w3.org/2000/svg` | | `xlink:` | `http://www.w3.org/1999/xlink` | | `xslt:` | `http://www.w3.org/1999/XSL/Transform` | @@ -245,12 +244,19 @@ The following additional variables can be used to control the behaviour file subpath doesn’t exist in more than one of them. - **`DATADIR`:** - If set to the location of a directory, ⛩📰 书社 will run a two‐stage build. - In the first stage, only files in `SRCDIR` which match `FINDDATARULES` (see below) will be built, with files in `DATADIR` serving as includes. - In the second stage, the remaining files in `SRCDIR` will be built, with the files built during the first stage, in addition to any files in `INCLUDEDIR`, serving as includes. - Files built during the first stage are copied into `DESTDIR` alongside those from the second stage when installing. - - This functionality is intended for sites where the bulk of the site can be built from a few data files which are expensive to create. + If set to the location of a directory, ⛩📰 书社 will run a two‐stage + build. + In the first stage, only files in `SRCDIR` which match + `FINDDATARULES` (see below) will be built, with files in `DATADIR` + serving as includes. + In the second stage, the remaining files in `SRCDIR` will be built, + with the files built during the first stage, in addition to any + files in `INCLUDEDIR`, serving as includes. + Files built during the first stage are copied into `DESTDIR` + alongside those from the second stage when installing. + + This functionality is intended for sites where the bulk of the site + can be built from a few data files which are expensive to create. 
- **`BUILDDIR`:** The location of the (temporary) build directory (default: `build`). @@ -261,18 +267,19 @@ The following additional variables can be used to control the behaviour The location of directory to output files to (default: `public`). `make install` will overwrite files in this directory which correspond to those in `SRCDIR`. - It *will not* touch other files, including those generated from files + It _will not_ touch other files, including those generated from files in `SRCDIR` which have since been deleted. Files are first compiled to `$(BUILDDIR)/public` before they are copied to `DESTDIR`, so this folder is relatively quick and inexpensive to re·create. It’s reasonable to simply delete it before every `make install` to - ensure stale content is removed. + ensure stale content is removed, assuming copies are quick on your + file·system. - **`THISDIR`:** The location of the ⛩📰 书社 `GNUmakefile`. - This should be set automatically when calling Make and shouldn’t ever + This should be set automatically when calling Make and shouldn¦t ever need to be set manually. This variable is used to find the ⛩📰 书社 `lib/` folder, which is expected to be in the same location. @@ -283,7 +290,8 @@ The following additional variables can be used to control the behaviour - **`EXTRAMAGIC`:** The value of this variable is appended to `MAGIC` by default, to - enable additional magic files without overriding the existing ones. + enable additional magic files without overriding the existing + ones (default: empty). - **`FINDRULES`:** Rules to use with `find` when searching for source files. @@ -296,7 +304,8 @@ The following additional variables can be used to control the behaviour - **`EXTRAFINDRULES`:** The value of this variable is appended to `FINDRULES` by default, to - enable additional rules without overriding the existing ones. + enable additional rules without overriding the existing ones + (default: empty). 
- **`FINDINCLUDERULES`:** Rules to use with `find` when searching for includes (default: @@ -305,16 +314,20 @@ The following additional variables can be used to control the behaviour - **`EXTRAFINDINCLUDERULES`:** The value of this variable is appended to `FINDINCLUDERULES` by default, to enable additional rules without overriding the existing - ones. + ones (default: empty). - **`DATAOPTS`:** - Additional options to use when calling Make during the first stage of a two‐stage build using `DATADIR`. + Additional options to use when calling Make during the first stage of + a two‐stage build using `DATADIR` (default: empty). - This can be used to override variables which are only applicable during the second stage. - Note that when supplying this variable on the shell, it will need to be double‐quoted. + This can be used to override variables which are only applicable + during the second stage. + Note that when supplying this variable on the shell, it will need to + be twice‐quoted. - **`DATAEXT`:** - A list of file extensions which signify “data” files during a two‐stage build using `DATADIR`. + A list of file extensions which signify “data” files during a + two‐stage build using `DATADIR` (default: `rdf`). - **`FINDDATARULES`:** Rules to use with `find` when searching for data files. @@ -323,33 +336,49 @@ The following additional variables can be used to control the behaviour - **`EXTRAFINDDATARULES`:** The value of this variable is appended to `FINDDATARULES` by default, to enable additional rules without overriding the existing - ones. + ones (default: empty). - **`FINDFILTERONLY`:** - A semicolon‐separated list of regular expressions, at least one of which the paths for sources and includes are required to match, unless empty (default: empty). + A semicolon‐separated list of regular expressions, at least one of + which the paths for sources and includes are required to match, + unless empty (default: empty). 
- **`FINDFILTEROUT`:** - A semicolon‐separated list of regular expressions, each of which matches paths that should _not_ be considered sources or includes (default: empty). + A semicolon‐separated list of regular expressions, each of which + matches paths that should _not_ be considered sources or includes + (default: empty). - **`FINDINCLUDEFILTERONLY`:** - A semicolon‐separated list of regular expressions, at least one of which the paths for includes are required to match, unless empty (default: empty). + A semicolon‐separated list of regular expressions, at least one of + which the paths for includes are required to match, unless empty + (default: empty). - Note that only paths which already match `FINDFILTERONLY` are considered. + Note that only paths which already match `FINDFILTERONLY` are + considered. - **`FINDINCLUDEFILTEROUT`:** - A semicolon‐separated list of regular expressions, each of which matches paths that should _not_ be considered includes, but may still be considered sources (default: empty). + A semicolon‐separated list of regular expressions, each of which + matches paths that should _not_ be considered includes, but may + still be considered sources (default: empty). - **`FINDFILTERONLYEXTENDED`:** - If non·empty, `FINDFILTERONLY` is an extended regular expression; otherwise, it is basic (default: empty). + If non·empty, `FINDFILTERONLY` is an extended regular expression; + otherwise, it is basic (default: empty). - **`FINDFILTEROUTEXTENDED`:** - If non·empty, `FINDFILTEROUT` is an extended regular expression; otherwise, it is basic (default: matches `FINDFILTERONLYEXTENDED`). + If non·empty, `FINDFILTEROUT` is an extended regular expression; + otherwise, it is basic (default: matches `FINDFILTERONLYEXTENDED`). - **`FINDINCLUDEFILTERONLYEXTENDED`:** - If non·empty, `FINDINCLUDEFILTERONLY` is an extended regular expression; otherwise, it is basic (default: matches `FINDFILTERONLYEXTENDED`). 
+ If non·empty, `FINDINCLUDEFILTERONLY` is an extended regular + expression; otherwise, it is basic (default: matches + `FINDFILTERONLYEXTENDED`). - **`FINDINCLUDEFILTEROUTEXTENDED`:** - If non·empty, `FINDINCLUDEFILTEROUT` is an extended regular expression; otherwise, it is basic (default: `1` if either `FINDFILTEROUTEXTENDED` or `FINDINCLUDEFILTERONLYEXTENDED` is non·empty). + If non·empty, `FINDINCLUDEFILTEROUT` is an extended regular + expression; otherwise, it is basic (default: `1` if either + `FINDFILTEROUTEXTENDED` or `FINDINCLUDEFILTERONLYEXTENDED` is + non·empty). - **`PARSERS`:** A white·space‐separated list of parsers to use (default: @@ -357,7 +386,8 @@ The following additional variables can be used to control the behaviour - **`EXTRAPARSERS`:** The value of this variable is appended to `PARSERS` by default, to - enable additional parsers without overriding the existing ones. + enable additional parsers without overriding the existing ones + (default: empty). - **`PARSERLIBS`:** A white·space‐separated list of parser dependencies (default: @@ -366,7 +396,7 @@ The following additional variables can be used to control the behaviour - **`EXTRAPARSERLIBS`:** The value of this variable is appended to `PARSERLIBS` by default, to enable additional parser dependencies without overriding the - existing ones. + existing ones (default: empty). - **`TRANSFORMS`:** A white·space‐separated list of transforms to use (default: @@ -374,7 +404,8 @@ The following additional variables can be used to control the behaviour - **`EXTRATRANSFORMS`:** The value of this variable is appended to `TRANSFORMS` by default, to - enable additional transforms without overriding the existing ones. + enable additional transforms without overriding the existing ones + (default: empty). 
- **`TRANSFORMLIBS`:** A white·space‐separated list of transform dependencies (default: @@ -383,7 +414,7 @@ The following additional variables can be used to control the behaviour - **`EXTRATRANSFORMLIBS`:** The value of this variable is appended to `TRANSFORMLIBS` by default, to enable additional transform dependencies without overriding the - existing ones. + existing ones (default: empty). - **`XMLTYPES`:** A white·space‐separated list of media types or media type suffixes to @@ -421,7 +452,7 @@ The following additional variables can be used to control the behaviour Source files may be placed in `SRCDIR` in any manner; the file structure used there will match the output. -The type of source files is *not* determined by file extension, but +The type of source files is _not_ determined by file extension, but rather by magic number; this means that files **must** begin with something recognizable. Supported magic numbers include :⁠— @@ -448,7 +479,7 @@ Source files whose media type does not have an associated X·S·L·T The former characters have the potential to conflict with make syntax, a leading hyphen‐minus is confusable for a commandline argument, and a trailing cloparen [activates a bug in G·N·U Make - 3.81](https://stackoverflow.com/questions/17148468/capturing-filenames-including-parentheses-with-gnu-makes-wildcard-function#comment24825307_17148894). + 3.81][so-17148468-comment]. ## Parsers @@ -479,7 +510,7 @@ New ⛩📰 书社 parsers which target plaintext formats should have an - Follows this with a quoted string giving a media type supported by the parser. - Media type parameters are *not* supported. + Media type parameters are _not_ supported. - Follows this with the string `]`. @@ -519,16 +550,19 @@ The result tree of applying the transform to the `` The value of this attribute will be the value of the `<书社:id>` toplevel element in the parser. +Parsers **should** have an `<书社:id>` and, if present, it **must** be + unique. 
+ It is possible for parsers to support zero plaintext types. This is useful when targeting specific dialects of X·M·L; parsers in this sense operate on the same basic principles as transforms (described below). The major distinction between X·M·L parsers and transforms is where in the process the transformation happens: -Parsers are applied *prior* to embedding (and can be used to generate - embeds); transforms are applied *after*. +Parsers are applied _prior_ to embedding (and can be used to generate + embeds); transforms are applied _after_. -It is **strongly recommended** that auxillary templates in parsers be +It is **strongly recommended** that auxiliary templates in parsers be name·spaced (by `@name` or `@mode`) whenever possible, to avoid conflicts between parsers. @@ -547,6 +581,18 @@ These include :⁠— - A `@书社:media-type` attribute, giving the identified media type of the plaintext node. +### Parsed metadata + +It is possible to extract metadata from a document at the same time as + it is being parsed. +This is done by creating result elements in the `书社:about` mode; + these should be R·D·F property elements which apply to the conceptual + entity that is the document being parsed. + +During transformation, metadata for the file with identifier `$FILE` + can be read from the children of + `$书社:about//*[@rdf:about=$FILE]/nie:interpretedAs/*`. + ## Output Redirection By default, ⛩📰 书社 installs files to the same location in `DESTDIR` @@ -556,10 +602,14 @@ This behaviour can be customized by setting the `@书社:destination` This attribute is read after parsing, but before transformation (where it is silently dropped). +Multiple destinations can be provided if the same file should be output to multiple places. +The file is retransformed each time, with the value of the `DESTINATION` global param set appropriately. + ## Embedding Documents can be embedded in other documents using a `<书社:link>` - element with `@xlink:show="embed"`. 
+ element with `@xlink:show="embed"` and an `@xlink:actuate` which is + absent or `"none"`. The `@xlink:href`s of these elements should have the format `about:shushe?source=`, where `` provides the path to the file within `SRCDIR`. @@ -567,6 +617,8 @@ Includes, which do not generate outputs of their own but may still be freely embedded, instead use the format `about:shushe?include=`, where `` provides the path within `INCLUDEDIR`. +If `` indicates a directory and ends with a slash (`/`), + everything within that directory will be embedded. Embeds are replaced with the parsed contents of a file, unless the file is an asset, in which case an `` element is produced @@ -606,6 +658,40 @@ These include :⁠— These attributes are used to scope any nested `` elements with `@itemprop` attributes to their containing documents. +## Soft Dependencies + +When a file depends only on the metadata of another file, and not its + contents, it can be added as a soft dependency rather than an embed. +Soft dependencies are indicated using a `<书社:link>` element with an + `@xlink:show` of `"other"`, `"none"`, or absent, and an + `@xlink:actuate` which is absent or `"none"`. +A change to a soft dependency requires a file to be rebuilt, but no + embedding occurs automatically. +Because there is no automatic embedding, soft dependencies are allowed + to be recursive. + +The `@xlink:href`s of soft dependency `<书社:link>`s are processed in + exactly the same fashion as embeds, described above. + +If the value of `@xlink:show` is `"other"`, the soft dependency is + transitive. +Any dependencies of the indicated file which have a `@name` which + matches that of the referencing `<书社:link>` element will also be + treated as soft dependencies. +If no `@name` is given, it is treated as the empty string. + +When a document is embedded directly, all of its soft dependencies are + also treated as soft dependencies of the embedding object. 
+However, if a document is embedded in a transitive soft dependency, the + embed is treated exactly as tho it were itself a transitive soft + dependency. +That means it must have a matching `@name` to be included, and + like·wise for any embeds or soft dependencies it contains. + +If the value of `@xlink:show` is `"none"` or absent, the soft + dependency is not transitive and its own dependencies are not + checked. + ## Transforms Transforms are used to convert X·M·L files into their final output, @@ -615,11 +701,14 @@ Transforms are used to convert X·M·L files into their final output, - **`transforms/asset.xslt`:** Converts `` elements which correspond to recognized media types into the appropriate H·T·M·L elements, and deletes - `` elements from the body of the document and moves - them to the head. + `` elements from the body of the document while + transferring them to the head. This conversion happens during the finalization phase, after the main transformation. +- **`transforms/expansion.xslt`:** + Performs embedding, as described above. + - **`transforms/metadata.xslt`:** Provides basic `` metadata. This metadata is generated from `` elements with one of @@ -629,7 +718,7 @@ Transforms are used to convert X·M·L files into their final output, Provides the title of the page. ⛩📰 书社 automatically encapsulates H·T·M·L embeds so that their - metadata does not propogate up to the embedding document. + metadata does not propagate up to the embedding document. To undo this behaviour, remove the `@itemscope` and `@itemtype` attributes from the embed during the transformation phase. @@ -690,6 +779,28 @@ The following params are made available globally in parsers and - **`THISREV`:** The value of the `THISREV` variable (if present). +In transforms, the following params are additionally available :⁠— + +- **`DESTINATION`:** + The destination being targeted by this transform. + +- **`书社:about`:** + R·D·F metadata about all of the documents ⛩📰 书社 knows about.
+ Use `$书社:about//*[@rdf:about=$IDENTIFIER]` to get the metadata for + the current document. + +- **`书社:source`:** + The parsed source document being transformed, prior to any expansion. + +- **`书社:expansion`:** + The document after all embeds have been expanded. + Unavailable during the `书社:expand` stage. + +- **`书社:result`:** + The document after the main set of transformations has been applied. + Only available during the `书社:finalize` stage, where it is used to + apply output wrapping and other clean·up. + ## Output Wrapping Provided at least one toplevel result element belongs to the H·T·M·L @@ -810,6 +921,9 @@ This repository conforms to [REUSE][]. Most source files are licensed under the terms of the Mozilla Public License, version 2.0. +[F3C]: [REUSE]: +[Sharutils]: [draft-phillips-record-jar-01]: [rdar-92753335]: +[so-17148468-comment]: