# ⛩️📰 书社

<b>An X·S·L·T‐based static site generator.</b>

<dfn>⛩️📰 书社</dfn> aims to make it easy to generate websites with
  X·S·L·T and G·N·U Make.
It is consequently only a good choice for people who like X·S·L·T and
  G·N·U Make and wish it were easier to make websites with them.

It makes things easier by :⁠—

- Automatically identifying source files and characterizing them by
    type (X·M·L, text, or asset).

- Parsing supported text types into X·M·L trees.

- Enabling easy inclusion of source files within each other.

It aims to do this with zero dependencies beyond the programs already
  installed on your computer.

## Nomenclature

<i lang="cmn-Hans">书社</i> is a Chinese word meaning “publishing
  house”.

The first character, <i lang="cmn-Hans">书</i>, is the simplified form
  of “document”.

The second character, <i lang="cmn-Hans">社</i>, contemporarily means
  “association”, but historically referred to the god of the soil and
  related altars or festivities.
In Japanese, it is an alternate spelling for <i lang="ja">やしろ</i>,
  the word for “Shinto shrine”.

The name <i lang="cmn-Hans">书社</i> was chosen to play on this pun, as
  it is intended as a publishing program for webshrines.

In Ascii environments, ⛩️📰 书社 should be written `Shushe`, following
  the pinyin transliteration.

## Basic Usage

Place source files in `sources/` and run `make install` to compile
  the result to `public/`.
Compilation involves the following steps :⁠—

1. ⛩️📰 书社 compiles all of the magic files in `magic/` into a single
    file, `build/magic.mgc`.

2. ⛩️📰 书社 processes all of the parsers in `parsers/` and determines
    the list of supported plaintext types.

3. ⛩️📰 书社 identifies all of the source files and includes and uses
    `build/magic.mgc` to classify them by media type.

4. ⛩️📰 书社 parses all plaintext and X·M·L source files and includes
    and then builds a dependency tree between them.

5. ⛩️📰 书社 uses the dependency tree to establish prerequisites for
    each output file.

6. ⛩️📰 书社 compiles each output file to `build/public`.

7. ⛩️📰 书社 copies the output files to `public`.

You can use `make list` to list each identified source file or include
  alongside its computed type and dependencies.
As this is a Make‐based program, steps will only be run if the
  corresponding buildfile or output file is older than its
  prerequisites.

## Namespaces

The ⛩️📰 书社 namespace is `urn:fdc:ladys.computer:20231231:Shu1She4`.

This document uses a few namespace prefixes, with the following
  meanings :⁠—

|   Prefix | Expansion                                  |
| -------: | :----------------------------------------- |
|  `html:` | `http://www.w3.org/1999/xhtml`             |
| `xlink:` | `http://www.w3.org/1999/xlink`             |
|  `xslt:` | `http://www.w3.org/1999/XSL/Transform`     |
|  `书社:` | `urn:fdc:ladys.computer:20231231:Shu1She4` |

## Setup and Configuration

⛩️📰 书社 depends on the following programs to run.
In every case, you may supply your own implementation by overriding the
  corresponding (allcaps) variable (e·g, set `MKDIR` to supply your own
  `mkdir` implementation).

- `cat`
- `cp`
- `echo`
- `file`
- `find`
- `mkdir` (requires support for `-p`)
- `mv`
- `printf`
- `rm`
- `sed`
- `sleep`
- `test`
- `touch`
- `tr` (requires support for `-d`)
- `uuencode` (requires support for `-m` and `-r`)
- `xmlcatalog` (provided by `libxml2`)
- `xmllint` (provided by `libxml2`)
- `xsltproc` (provided by `libxslt`)

The following additional variables can be used to control the behaviour
  of ⛩️📰 书社 :⁠—

- **`SRCDIR`:**
  The location of the source files (default: `sources`).

- **`INCLUDEDIR`:**
  The location of the include files (default: `sources/includes`).
  This can be inside of `SRCDIR`, but needn’t be.

- **`BUILDDIR`:**
  The location of the (temporary) build directory (default: `build`).

- **`DESTDIR`:**
  The location of the directory to output files to (default: `public`).

- **`THISDIR`:**
  The location of the ⛩️📰 书社 `GNUmakefile`.
  This should be set automatically when calling Make and shouldn’t ever
    need to be set manually.
  This variable is used to find the ⛩️📰 书社 `lib/` folder, which is
    expected to be in the same location.

- **`MAGICDIR`:**
  The location of the magic files to use (default: `$(THISDIR)/magic`).

- **`FINDOPTS`:**
  Options to pass to `find` when searching for source files (default:
    `-LE`).

- **`FINDRULES`:**
  Rules to use with `find` when searching for source files (default:
    `-flags -nohidden -and -not -name '.*'`).

- **`PARSERS`:**
  A white·space‐separated list of parsers to use (default:
    `$(THISDIR)/parsers/*.xslt`).

- **`TRANSFORMS`:**
  A white·space‐separated list of transforms to use (default:
    `$(THISDIR)/transforms/*.xslt`).

- **`XMLTYPES`:**
  A white·space‐separated list of media types to consider X·M·L
    (default: `application/xml text/xml`).

- **`VERBOSE`:**
  If this variable has a value, every recipe instruction will be
    printed when it runs (default: empty).
  This is helpful for debugging, but typically too noisy for general
    usage.

## Source Files

Source files may be placed in `SRCDIR` in any manner; the file
  structure used there will match the output.
The type of source files is *not* determined by file extension, but
  rather by magic number; this means that files **must** begin with
  something recognizable.
Supported magic numbers include :⁠—

- `<?xml` for `application/xml` files
- `#!js` for `text/javascript` files
- `@charset "` for `text/css` files
- `#!tsv` for `text/tab-separated-values` files
- `%%` for `text/record-jar` files (unregistered; see
    [[draft-phillips-record-jar-01][]])

Text formats with associated X·S·L·T parsers are wrapped in a H·T·M·L
  `<script>` element whose `@type` gives its media type, and then
  passed to the parser to process.
Source files whose media type does not have an associated X·S·L·T
  parser are considered “assets” and will not be transformed.
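
As a rough illustration of the wrapping described above, a source file
  beginning with `#!tsv` might reach its parser looking something like
  the following.
This is only a sketch: the sample rows are invented, and details such
  as whether the magic line is kept and how the text is escaped are
  assumptions rather than documented behaviour.

```xml
<!-- Hypothetical wrapper around a tab-separated source file.
     The element name and @type come from the description above;
     everything inside the element is made-up sample data. -->
<script xmlns="http://www.w3.org/1999/xhtml" type="text/tab-separated-values">#!tsv
name	colour
plum	purple
</script>
```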

For compatibility with this program, source filenames should conform to
  the following rules :⁠—

- They should not start with a hyphen‐minus.
  This is to prevent confusion between filenames and options on the
    commandline.

- They should not contain spaces, colons, percent signs, backticks,
    question marks, hashes, or backslashes.

In general, filenames should be such that they do not require
  percent‐encoding in the path component of an i·r·i.

## Parsers

Parsers are used to convert plaintext files into X·M·L trees, as well
  as convert plaintext formats which are already included inline in
  existing source X·M·L documents.
⛩️📰 书社 comes with some parsers; namely :⁠—

- **`parsers/plain.xslt`:**
  Wraps `text/plain` contents in a `<html:pre>` element.

- **`parsers/tsv.xslt`:**
  Converts `text/tab-separated-values` contents into an `<html:table>`
    element.

New ⛩️📰 书社 parsers should have a `<xslt:template>` element with no
  `@name` or `@mode` and whose `@match` attribute…

- Starts with an appropriately‐namespaced qualified name for a
    `<html:script>` element.

- Follows this with the string `[@type=`.

- Follows this with a quoted string giving a media type supported by
    the parser.
  Media type parameters are *not* supported.

- Follows this with the string `]`.

For example, the trivial `text/plain` parser is defined as follows :⁠—

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  version="1.0"
>
  <template match="html:script[@type='text/plain']">
    <html:pre><value-of select="."/></html:pre>
  </template>
</transform>
```

⛩️📰 书社 will scan the provided parsers for this pattern to determine
  the set of allowed plaintext file types.
Multiple such `<xslt:template>` elements may be provided in a single
  parser, for example if the parser supports multiple media types.

It is **strongly recommended** that all templates in parsers other than
  those described above be namespaced (by `@name` or `@mode`), to avoid
  conflicts between templates in multiple parsers.
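
The two points above (several entry templates in one parser, and
  namespaced helpers) can be sketched together as follows.
The media types `text/x-shout` and `text/x-whisper`, the `example:`
  prefix, and its namespace are all hypothetical; a real parser would
  also need a matching magic file so that such sources are recognized.

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:example="urn:example:shout-parser"
  exclude-result-prefixes="example"
  version="1.0"
>
  <!-- Entry templates: no @name or @mode, one per (hypothetical) media
       type, each following the html:script[@type='…'] pattern. -->
  <template match="html:script[@type='text/x-shout']">
    <call-template name="example:shout"/>
  </template>
  <template match="html:script[@type='text/x-whisper']">
    <html:p><html:small><value-of select="."/></html:small></html:p>
  </template>

  <!-- Helper template: its name is namespaced so that it cannot clash
       with a helper of the same local name in another parser.
       call-template keeps the current node, so "." is still the
       script element here. -->
  <template name="example:shout">
    <html:p>
      <html:strong>
        <value-of select="translate(., 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')"/>
      </html:strong>
    </html:p>
  </template>
</transform>
```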

## Embedding

Documents can be embedded in other documents using a `<书社:link>`
  element with `@xlink:show="embed"`.
The `@xlink:href`s of these elements should have the format
  `about:shushe?source=<path>`, where `<path>` provides the path to the
  file within `SRCDIR`.
Includes, which do not generate outputs of their own but may still be
  freely embedded, instead use the format
  `about:shushe?include=<path>`, where `<path>` provides the path
  within `INCLUDEDIR`.

Embeds are replaced with the parsed contents of a file, unless the file
  is an asset, in which case an `<html:object>` element is produced
  instead (with the contents of the asset file provided as a base64
  `data:` u·r·i).

Embedding takes place after parsing but before transformation, so
  parsers are able to generate their own embeds.
⛩️📰 书社 is able to detect the transitive embed dependencies of files
  and update them accordingly; it will signal an error if the
  dependencies are recursive.
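
A source document using both kinds of embed might look something like
  the following sketch.
The `<article>` root element and the paths `notes/intro.xml` and
  `footer.xml` are hypothetical, and whether paths should carry a
  leading slash is an assumption not settled by the text above.

```xml
<?xml version="1.0"?>
<article
  xmlns="http://www.w3.org/1999/xhtml"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
  <h1>Example page</h1>
  <!-- Embeds a source file (hypothetical path, relative to SRCDIR). -->
  <书社:link xlink:show="embed" xlink:href="about:shushe?source=notes/intro.xml"/>
  <!-- Embeds an include (hypothetical path, relative to INCLUDEDIR). -->
  <书社:link xlink:show="embed" xlink:href="about:shushe?include=footer.xml"/>
</article>
```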

## Transforms

Transforms are used to convert X·M·L files into their final output,
  after all necessary parsing and embedding has taken place.
⛩️📰 书社 comes with some transforms; namely :⁠—

- **`transforms/asset.xslt`:**
  Converts `<html:object>` elements which correspond to recognized
    media types into the appropriate H·T·M·L elements, and deletes
    `<html:style>` elements from the body of the document and moves
    them to the head.

- **`transforms/metadata.xslt`:**
  Provides basic `<html:head>` metadata.
  This metadata is generated from `<html:meta>` elements with one of
    the following `@itemprop` attributes :⁠—

  - **`urn:fdc:ladys.computer:20231231:Shu1She4:title`:**
    Provides the title of the page.

  ⛩️📰 书社 automatically encapsulates embeds so that their metadata
    does not propagate up to the embedding document.
  To undo this behaviour, remove the `@itemscope` and `@itemtype`
    attributes from the embed during the transformation phase.
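
For the title property handled by `transforms/metadata.xslt`, a source
  document might carry an element along these lines.
The `@itemprop` value comes from the list above; supplying the title
  through a `@content` attribute is an assumption based on how
  `<html:meta>` is ordinarily used, and the title text is invented.

```xml
<!-- Hypothetical title metadata for a page. -->
<html:meta
  xmlns:html="http://www.w3.org/1999/xhtml"
  itemprop="urn:fdc:ladys.computer:20231231:Shu1She4:title"
  content="Example page title"
/>
```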

The following are recommendations on effective creation of
  transforms :⁠—

- Make template matchers as specific as possible.
  It is likely an error if two transforms have templates which match
    the same element (unless the templates have different priority).

- Namespace templates (with `@name` or `@mode`) whenever possible.

- Set `@exclude-result-prefixes` on the root `xslt:transform` element
    to reduce the number of declared namespaces in the final result.
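
A transform following all three recommendations might look like the
  sketch below.
The `example:` namespace and the `<example:note>` and `<example:para>`
  elements are hypothetical stand‐ins for whatever vocabulary a site
  actually uses.

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:example="urn:example:notes"
  exclude-result-prefixes="example"
  version="1.0"
>
  <!-- Specific matcher: only notes which declare a @kind are handled,
       which leaves other notes free for other transforms. -->
  <template match="example:note[@kind]">
    <html:aside class="note-{@kind}">
      <apply-templates mode="example:note-body"/>
    </html:aside>
  </template>

  <!-- Helper templates live in a namespaced mode, so they cannot
       collide with templates from other transforms. -->
  <template match="example:para" mode="example:note-body">
    <html:p><value-of select="."/></html:p>
  </template>

  <!-- Drop stray text (such as indentation) between paragraphs. -->
  <template match="text()" mode="example:note-body"/>
</transform>
```

Because `example` is listed in `@exclude-result-prefixes`, its
  namespace declaration is not copied onto the `<html:aside>` elements
  in the result.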

## Output Wrapping

⛩️📰 书社 will wrap the final output of the transforms in appropriate
  `<html:html>` and `<html:body>` elements, so it is not necessary for
  transforms to do this explicitly.
After performing the initial transform, ⛩️📰 书社 will match the root
  node of the result in the following modes to fill in areas of the
  wrapper :⁠—

- **`书社:header`:**
  The result of matching in this mode is prepended into the
    `<html:body>` of the output (before the transformation result).

- **`书社:footer`:**
  The result of matching in this mode is appended into the
    `<html:body>` of the output (after the transformation result).

- **`书社:metadata`:**
  The result of matching in this mode is inserted into the
    `<html:head>` of the output.
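
As an illustration of these modes, the following sketch appends a
  footer to pages whose transformation result is an `<html:article>`.
That root element and the footer text are assumptions made for the
  example; the sketch also assumes that a template matching the root
  element is reached when ⛩️📰 书社 matches “the root node” in this
  mode.

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  exclude-result-prefixes="书社"
  version="1.0"
>
  <!-- Matches the (assumed) root element of the result, but only in
       the footer mode, so it does not affect the main transformation. -->
  <template match="html:article" mode="书社:footer">
    <html:footer>
      <html:p>Generated with ⛩️📰 书社.</html:p>
    </html:footer>
  </template>
</transform>
```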

In addition to being called with the transform result, each of these
  modes will also be called with a `<xslt:include>` element
  corresponding to each transform.
If a transform has a `<书社:id>` top‐level element whose value is an
  i·r·i, its `<xslt:include>` element will have a corresponding
  `@书社:id` attribute.
This mechanism can be used to allow transforms to insert content
  without matching any elements in the result; for example, the
  following transform adds a link to a stylesheet to the `<html:head>`
  of every page :⁠—

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  exclude-result-prefixes="书社"
  version="1.0"
>
  <书社:id>example:add-stylesheet-links.xslt</书社:id>
  <template match="xslt:include[@书社:id='example:add-stylesheet-links.xslt']" mode="书社:metadata">
    <html:link rel="stylesheet" type="text/css" href="/style.css"/>
  </template>
</transform>
```

Output wrapping can be entirely disabled by adding a
  `@书社:disable-output-wrapping` attribute to the top‐level element in
  the result tree.
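
A transform which opts out of wrapping might look like the sketch
  below.
The `example:` namespace and its elements are hypothetical, and the
  empty attribute value is an assumption: the text above only says
  that the attribute must be present.

```xml
<?xml version="1.0"?>
<transform
  xmlns="http://www.w3.org/1999/XSL/Transform"
  xmlns:example="urn:example:raw"
  xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
  version="1.0"
>
  <!-- Handles a (hypothetical) source document whose root element is
       <example:raw>, copying its children through unchanged. -->
  <template match="/example:raw">
    <!-- The attribute sits on the top-level element of the result;
         its value is left empty here as an assumption. -->
    <example:output 书社:disable-output-wrapping="">
      <copy-of select="node()"/>
    </example:output>
  </template>
</transform>
```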

## License

Source files are licensed under the terms of the <cite>Mozilla Public
  License, version 2.0</cite>.
For more information, see [LICENSE](./LICENSE).

[draft-phillips-record-jar-01]: <https://datatracker.ietf.org/doc/html/draft-phillips-record-jar-01>