Improve default parser/transform i·d’s

[Shushe] / README.markdown
diff --git a/README.markdown b/README.markdown

index cae7ac1a82b71159967ea3e76f2fdcc5c14d2449..4427c7fed710eb40a5051e147688b77e1f843478 100644
--- a/README.markdown
+++ b/README.markdown
@@ -91,6 +91,7 @@ In every case, you may supply your own implementation by overriding the
    corresponding (allcaps) variable (e·g, set `MKDIR` to supply your own
    `mkdir` implementation).
  
+- `awk`
  - `cat`
  - `cp`
  - `date`
@@ -99,6 +100,7 @@ In every case, you may supply your own implementation by overriding the
  - `find`
  - `mkdir` (requires support for `-p`)
  - `mv`
+- `od` (requires support for `-t x1`)
  - `printf`
  - `rm`
  - `sed`
@@ -108,6 +110,7 @@ In every case, you may supply your own implementation by overriding the
  - `touch`
  - `tr` (requires support for `-d`)
  - `uuencode` (requires support for `-m` and `-r`)
+- `xargs` (requires support for `-0`)
  - `xmlcatalog` (provided by `libxml2`)
  - `xmllint` (provided by `libxml2`)
  - `xsltproc` (provided by `libxslt`)
@@ -117,10 +120,14 @@ The following additional variables can be used to control the behaviour
  
  - **`SRCDIR`:**
    The location of the source files (default: `sources`).
+  Multiple source directories can be provided, so long as the same
+    file subpath doesn’t exist in more than one of them.
  
  - **`INCLUDEDIR`:**
-  The location of the source files (default: `sources/includes`).
+  The location of source includes (default: `sources/includes`).
    This can be inside of `SRCDIR`, but needn’t be.
+  Multiple include directories can be provided, so long as the same
+    file subpath doesn’t exist in more than one of them.
  
  - **`BUILDDIR`:**
    The location of the (temporary) build directory (default: `build`).
@@ -150,17 +157,11 @@ The following additional variables can be used to control the behaviour
  - **`MAGICDIR`:**
    The location of the magic files to use (default: `$(THISDIR)/magic`).
  
-- **`FINDOPTS`:**
-  Options to pass to `find` when searching for source files (default:
-    `-PE`).
-
  - **`FINDRULES`:**
-  Rules to use with `find` when searching for source files (default:
-    `-flags -nohidden -and -not -name '.*'`).
-
-- **`FINDINCLUDEOPTS`:**
-  Options to pass to `find` when searching for includes (default:
-    `$(FINDOPTS)`).
+  Rules to use with `find` when searching for source files.
+  The default ignores hidden files, those that start with a period or
+    hyphen‐minus, and those which contain a pipe, buck, percent, or
+    colon.
  
  - **`FINDINCLUDERULES`:**
    Rules to use with `find` when searching for includes (default:
@@ -206,11 +207,12 @@ Text formats with associated X·S·L·T parsers are wrapped in a H·T·M·L
  Source files whose media type does not have an associated X·S·L·T
    parser are considered “assets” and will not be transformed.
  
-For compatibility with this program, source filenames should not
-  contain Ascii whitespace or any of the following Ascii characters:
-  ``!"#$%&()-:<>?\^`{|}``.
-These characters are either invalid in u·r·i’s or conflict with aspects
-  of the Make or commandline syntax.
+**☡ For compatibility with this program, source filenames must not
+  contain Ascii whitespace, colons (`:`), pipes (`|`), bucks (`$`),
+  percents (`%`) or control characters, and must not begin with a
+  hyphen‐minus (`-`).**
+The former characters have the potential to conflict with make syntax,
+  and a leading hyphen‐minus is confusable for a command‐line argument.
  
  ## Parsers
  
@@ -253,8 +255,10 @@ For example, the trivial `text/plain` parser is defined as follows :⁠—
  <transform
    xmlns="http://www.w3.org/1999/XSL/Transform"
    xmlns:html="http://www.w3.org/1999/xhtml"
+  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
    version="1.0"
  >
+  <书社:id>example:text/plain</书社:id>
    <template match="html:script[@type='text/plain']">
      <html:pre><value-of select="."/></html:pre>
    </template>
@@ -269,8 +273,21 @@ Alternatively, you can set the `@书社:supported-media-types` attribute
    on the root element of the parser to override media type support
    detection.
  
-Parsers can also target specific dialects of X·M·L, in which case they
-  operate on the same basic principles as transforms (described below).
+Even when `@书社:supported-media-types` is set, it is a requirement
+  that each parser transform any `<html:script>` elements with a
+  `@type` which matches their registered types into something else.
+Otherwise the parser will be stuck in an endless loop.
+The result tree of applying the transform to the `<html:script>`
+  element will be reparsed (in case any new `<html:script>` elements
+  were added in its subtree), and a `@书社:parsed-by` attribute will be
+  added to each toplevel element in the result.
+The value of this attribute will be the value of the `<书社:id>`
+  toplevel element in the parser.
+
+It is possible for parsers to support zero plaintext types.
+This is useful when targeting specific dialects of X·M·L; parsers in
+  this sense operate on the same basic principles as transforms
+  (described below).
  The major distinction between X·M·L parsers and transforms is where in
    the process the transformation happens:
  Parsers are applied *prior* to embedding (and can be used to generate
@@ -303,6 +320,15 @@ Embedding takes place after parsing but before transformation, so
    and update them accordingly; it will signal an error if the
    dependencies are recursive.
  
+## Output Redirection
+
+By default, ⛩️📰 书社 installs files to the same location in `DESTDIR`
+  as they were placed in their `SRCDIR`.
+This behaviour can be customized by setting the `@书社:destination`
+  attribute on the root element, whose value can give a different path.
+This attribute is read after parsing, but before transformation (where
+  it is silently dropped).
+
  ## Transforms
  
  Transforms are used to convert X·M·L files into their final output,
@@ -378,7 +404,7 @@ In addition to being called with the transform result, each of these
    modes will additionally be called with a `<xslt:include>` element
    corresponding to each transform.
  If a transform has a `<书社:id>` top‐level element whose value is an
-  i·r·i, its `<xslt:import>` element will have a corresponding
+  i·r·i, its `<xslt:include>` element will have a corresponding
    `@书社:id` attribute.
  This mechanism can be used to allow transforms to insert content
    without matching any elements in the result; for example, the