# ⛩️📰 书社

<b>An X·S·L·T‐based static site generator.</b>

<dfn>⛩️📰 书社</dfn> aims to make it easy to generate websites with
  X·S·L·T and G·N·U Make.
It is consequently only a good choice for people who like X·S·L·T and
  G·N·U Make and wish it were easier to make websites with them.

It makes things easier by :⁠—

- Automatically identifying source files and characterizing them by
    type (X·M·L, text, or asset).

- Parsing supported text types into X·M·L trees.

- Enabling easy inclusion of source files within each other.

It aims to do this with zero dependencies beyond the programs already
  installed on your computer.

## Nomenclature

<i lang="cmn-Hans">书社</i> is a Chinese word meaning “publishing
  house”.

The first character, <i lang="cmn-Hans">书</i>, is the simplified form
  of the character for “document”.

The second character, <i lang="cmn-Hans">社</i>, contemporarily means
  “association”, but historically referred to the god of the soil and
  related altars or festivities.
In Japanese, it is an alternate spelling for <i lang="ja">やしろ</i>,
  the word for “Shinto shrine”.

The name <i lang="cmn-Hans">书社</i> was chosen to play on this pun, as
  it is intended as a publishing program for webshrines.

In Ascii environments, ⛩️📰 书社 should be written `Shushe`, following
  the pinyin transliteration.

## Basic Usage

Place source files in `sources/` and run `make install` to compile
  the result to `public/`.
Compilation involves the following steps :⁠—

1. ⛩️📰 书社 compiles all of the magic files in `magic/` into a single
    file, `build/magic.mgc`.

2. ⛩️📰 书社 processes all of the parsers in `parsers/` and determines
    the list of supported plaintext types.

3. ⛩️📰 书社 identifies all of the source files and includes and uses
    `build/magic.mgc` to classify them by media type.

4. ⛩️📰 书社 parses all plaintext and X·M·L source files and includes
    and then builds a dependency tree between them.

5. ⛩️📰 书社 uses the dependency tree to establish prerequisites for
    each output file.

6. ⛩️📰 书社 compiles each output file to `build/public`.

7. ⛩️📰 书社 copies the output files to `public`.

You can use `make list` to list each identified source file or include
  alongside its computed type and dependencies.
As this is a Make‐based program, steps will only be run if the
  corresponding buildfile or output file is older than its
  prerequisites.

## Namespaces

The ⛩️📰 书社 namespace is `urn:fdc:ladys.computer:20231231:Shu1She4`.

This document uses a few namespace prefixes, with the following
  meanings :⁠—

|   Prefix | Expansion                                  |
| -------: | :----------------------------------------- |
|  `html:` | `http://www.w3.org/1999/xhtml`             |
| `xlink:` | `http://www.w3.org/1999/xlink`             |
|  `xslt:` | `http://www.w3.org/1999/XSL/Transform`     |
|  `书社:` | `urn:fdc:ladys.computer:20231231:Shu1She4` |

## Setup and Configuration

⛩️📰 书社 depends on the following programs to run.
In every case, you may supply your own implementation by overriding the
  corresponding (allcaps) variable (e·g, set `MKDIR` to supply your own
  `mkdir` implementation).

- `cat`
- `cp`
- `echo`
- `file`
- `find`
- `mkdir` (requires support for `-p`)
- `mv`
- `printf`
- `rm`
- `sed`
- `sleep`
- `test`
- `touch`
- `tr` (requires support for `-d`)
- `uuencode` (requires support for `-m` and `-r`)
- `xmlcatalog` (provided by `libxml2`)
- `xmllint` (provided by `libxml2`)
- `xsltproc` (provided by `libxslt`)

The following additional variables can be used to control the behaviour
  of ⛩️📰 书社 :⁠—

- **`SRCDIR`:**
  The location of the source files (default: `sources`).

- **`INCLUDEDIR`:**
  The location of the include files (default: `sources/includes`).
  This can be inside of `SRCDIR`, but needn’t be.

- **`BUILDDIR`:**
  The location of the (temporary) build directory (default: `build`).

- **`DESTDIR`:**
  The location of the directory to output files to (default: `public`).

- **`THISDIR`:**
  The location of the ⛩️📰 书社 `GNUmakefile`.
  This should be set automatically when calling Make and shouldn’t ever
    need to be set manually.
  This variable is used to find the ⛩️📰 书社 `lib/` folder, which is
    expected to be in the same location.

- **`MAGICDIR`:**
  The location of the magic files to use (default: `$(THISDIR)/magic`).

- **`FINDOPTS`:**
  Options to pass to `find` when searching for source files (default:
    `-LE`).

- **`FINDRULES`:**
  Rules to use with `find` when searching for source files (default:
    `-flags -nohidden -and -not -name '.*'`).

- **`PARSERS`:**
  A white·space‐separated list of parsers to use (default:
    `$(THISDIR)/parsers/*.xslt`).

- **`TRANSFORMS`:**
  A white·space‐separated list of transforms to use (default:
    `$(THISDIR)/transforms/*.xslt`).

- **`XMLTYPES`:**
  A white·space‐separated list of media types to consider X·M·L
    (default: `application/xml text/xml`).

- **`VERBOSE`:**
  If this variable has a value, every recipe instruction will be
    printed when it runs (default: empty).
  This is helpful for debugging, but typically too noisy for general
    usage.

## Source Files

Source files may be placed in `SRCDIR` in any manner; the file
  structure used there will match the output.
The type of source files is *not* determined by file extension, but
  rather by magic number; this means that files **must** begin with
  something recognizable.
Supported magic numbers include :⁠—

- `<?xml` for `application/xml` files
- `#!js` for `text/javascript` files
- `@charset "` for `text/css` files
- `#!tsv` for `text/tab-separated-values` files

Text formats with associated X·S·L·T parsers are wrapped in an H·T·M·L
  `<script>` element whose `@type` gives its media type, and then
  passed to the parser to process.
Source files whose media type does not have an associated X·S·L·T
  parser are considered “assets” and will not be transformed.

For compatibility with this program, source filenames should conform to
  the following rules :⁠—

- They should not start with a hyphen‐minus.
  This is to prevent confusion between filenames and options on the
    commandline.

- They should not contain spaces, colons, percent signs, backticks,
    question marks, hashes, or backslashes.

In general, filenames should be such that they do not require
  percent‐encoding in the path component of an i·r·i.

## Parsers

Parsers are used to convert plaintext files into X·M·L trees, as well
  as to convert plaintext formats which are already included inline in
  existing source X·M·L documents.
⛩️📰 书社 comes with some parsers; namely :⁠—

- **`parsers/plain.xslt`:**
  Wraps `text/plain` contents in a `<html:pre>` element.

- **`parsers/tsv.xslt`:**
  Converts `text/tab-separated-values` contents into an `<html:table>`
    element.

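
As a rough sketch of the inline case, a source X·M·L document might
  carry plaintext in an `<html:script>` element as follows.
The document contents here are invented for illustration; with the
  bundled `parsers/plain.xslt`, the `<script>` would presumably be
  replaced by an `<html:pre>` holding the same text.

```xml
<?xml version="1.0"?>
<article xmlns="http://www.w3.org/1999/xhtml">
        <h1>Notes</h1>
        <!-- Inline plaintext; @type selects the parser template. -->
        <script type="text/plain">First line.
Second line.</script>
</article>
```
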
New ⛩️📰 书社 parsers should have a `<xslt:template>` element with no
  `@name` or `@mode` and whose `@match` attribute…

- Starts with an appropriately‐namespaced qualified name for a
    `<html:script>` element.

- Follows this with the string `[@type=`.

- Follows this with a quoted string giving a media type supported by
    the parser.
  Media type parameters are *not* supported.

- Follows this with the string `]`.

For example, the trivial `text/plain` parser is defined as follows :⁠—

```xml
<?xml version="1.0"?>
<transform
        xmlns="http://www.w3.org/1999/XSL/Transform"
        xmlns:html="http://www.w3.org/1999/xhtml"
        version="1.0"
>
        <template match="html:script[@type='text/plain']">
                <html:pre><value-of select="."/></html:pre>
        </template>
</transform>
```

⛩️📰 书社 will scan the provided parsers for this pattern to determine
  the set of allowed plaintext file types.
Multiple such `<xslt:template>` elements may be provided in a single
  parser, for example if the parser supports multiple media types.

It is **strongly recommended** that all templates in parsers other than
  those described above be namespaced (by `@name` or `@mode`), to avoid
  conflicts between templates in multiple parsers.

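
The following is a minimal sketch of a parser which handles two media
  types and keeps its helper template namespaced.
The media types, the `example:` prefix and its namespace u·r·i, and the
  `class` values are all hypothetical; only the shape of the `@match`
  attributes follows the pattern described above.

```xml
<?xml version="1.0"?>
<transform
        xmlns="http://www.w3.org/1999/XSL/Transform"
        xmlns:html="http://www.w3.org/1999/xhtml"
        xmlns:example="urn:example:parsers"
        exclude-result-prefixes="example"
        version="1.0"
>
        <!-- One pattern template per (hypothetical) media type. -->
        <template match="html:script[@type='text/x.example-log']">
                <call-template name="example:pre">
                        <with-param name="class" select="'log'"/>
                </call-template>
        </template>
        <template match="html:script[@type='text/x.example-diff']">
                <call-template name="example:pre">
                        <with-param name="class" select="'diff'"/>
                </call-template>
        </template>
        <!-- Helper template, namespaced by @name to avoid conflicts. -->
        <template name="example:pre">
                <param name="class"/>
                <html:pre class="{$class}"><value-of select="."/></html:pre>
        </template>
</transform>
```
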
## Embedding

Documents can be embedded in other documents using a `<书社:link>`
  element with `@xlink:show="embed"`.
The `@xlink:href`s of these elements should have the format
  `about:shushe?source=<path>`, where `<path>` provides the path to the
  file within `SRCDIR`.
Includes, which do not generate outputs of their own but may still be
  freely embedded, instead use the format
  `about:shushe?include=<path>`, where `<path>` provides the path
  within `INCLUDEDIR`.

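
As an illustration, a source document might embed one source file and
  one include like so.
The file names are hypothetical, and writing the paths without a
  leading slash is an assumption.

```xml
<?xml version="1.0"?>
<article
        xmlns="http://www.w3.org/1999/xhtml"
        xmlns:xlink="http://www.w3.org/1999/xlink"
        xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
>
        <!-- Embeds a (hypothetical) source file from SRCDIR. -->
        <书社:link xlink:show="embed" xlink:href="about:shushe?source=notes/intro.xml"/>
        <!-- Embeds a (hypothetical) include from INCLUDEDIR. -->
        <书社:link xlink:show="embed" xlink:href="about:shushe?include=footer.xml"/>
</article>
```
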
Embeds are replaced with the parsed contents of a file, unless the file
  is an asset, in which case an `<html:object>` element is produced
  instead (with the contents of the asset file provided as a base64
  `data:` u·r·i).

Embedding takes place after parsing but before transformation, so
  parsers are able to generate their own embeds.
⛩️📰 书社 is able to detect the transitive embed dependencies of files
  and update them accordingly; it will signal an error if the
  dependencies are recursive.

## Transforms

Transforms are used to convert X·M·L files into their final output,
  after all necessary parsing and embedding has taken place.
⛩️📰 书社 comes with some transforms; namely :⁠—

- **`transforms/asset.xslt`:**
  Converts `<html:object type="text/css">` elements into corresponding
    `<html:link rel="stylesheet">` elements and
    `<html:object type="text/javascript">` elements into corresponding
    `<html:script>` elements.
  This transform enables embedding of `text/css` and `text/javascript`
    files, which ordinarily are considered assets (as they lack
    associated parsers).

- **`transforms/metadata.xslt`:**
  Provides basic `<html:head>` metadata.
  This metadata is generated from `<html:meta>` descendants of the
    first element with an `@itemscope` attribute (recommended to just
    be the root element).
  Such elements can provide metadata using the following `@itemprop`
    attributes :⁠—

  - **`urn:fdc:ladys.computer:20231231:Shu1She4:title`:**
    Provides the title of the page.

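
As a sketch of how a page might supply its title to
  `transforms/metadata.xslt`, consider the following source document.
The page contents are invented, and giving the value in `@content` is
  an assumption based on ordinary `<html:meta>` usage rather than
  something stated above.

```xml
<?xml version="1.0"?>
<article xmlns="http://www.w3.org/1999/xhtml" itemscope="">
        <!-- Title metadata; reading it from @content is an assumption. -->
        <meta itemprop="urn:fdc:ladys.computer:20231231:Shu1She4:title" content="My Page"/>
        <h1>My Page</h1>
</article>
```
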
The following are recommendations on effective creation of
  transforms :⁠—

- Make template matchers as specific as possible.
  It is likely an error if two transforms have templates which match
    the same element (unless the templates have different priority).

- Namespace templates (with `@name` or `@mode`) whenever possible.

- Set `@exclude-result-prefixes` on the root `xslt:transform` element
    to reduce the number of declared namespaces in the final result.

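
A transform following these recommendations might look something like
  the sketch below.
The `external` class, the `example:` prefix and its namespace u·r·i,
  and the added `rel` value are hypothetical and chosen only to show a
  narrow matcher, a named helper template, and
  `@exclude-result-prefixes` together.

```xml
<?xml version="1.0"?>
<transform
        xmlns="http://www.w3.org/1999/XSL/Transform"
        xmlns:html="http://www.w3.org/1999/xhtml"
        xmlns:example="urn:example:transforms"
        exclude-result-prefixes="example"
        version="1.0"
>
        <!-- Narrow matcher: only links with the (hypothetical) class. -->
        <template match="html:a[@class='external']">
                <html:a>
                        <copy-of select="@*"/>
                        <call-template name="example:rel"/>
                        <copy-of select="node()"/>
                </html:a>
        </template>
        <!-- Helper template, namespaced by @name. -->
        <template name="example:rel">
                <attribute name="rel">noopener</attribute>
        </template>
</transform>
```
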
## Output Wrapping

⛩️📰 书社 will wrap the final output of the transforms in appropriate
  `<html:html>` and `<html:body>` elements, so it is not necessary for
  transforms to do this explicitly.
After performing the initial transform, ⛩️📰 书社 will match the root
  node of the result in the following modes to fill in areas of the
  wrapper :⁠—

- **`书社:metadata`:**
  The result of matching in this mode is inserted into the
    `<html:head>` of the output.

In addition to being called with the transform result, each of these
  modes will also be called with a `<xslt:include>` element
  corresponding to each transform.
If a transform has a `<书社:id>` top‐level element whose value is an
  i·r·i, its `<xslt:include>` element will have a corresponding
  `@书社:id` attribute.
This mechanism can be used to allow transforms to insert content
  without matching any elements in the result; for example, the
  following transform adds a link to a stylesheet to the `<html:head>`
  of every page :⁠—

```xml
<?xml version="1.0"?>
<transform
        xmlns="http://www.w3.org/1999/XSL/Transform"
        xmlns:html="http://www.w3.org/1999/xhtml"
        xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
        xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4"
        exclude-result-prefixes="书社"
        version="1.0"
>
        <书社:id>example:add-stylesheet-links.xslt</书社:id>
        <template match="xslt:include[@书社:id='example:add-stylesheet-links.xslt']" mode="书社:metadata">
                <html:link rel="stylesheet" type="text/css" href="/style.css"/>
        </template>
</transform>
```

## License

Source files are licensed under the terms of the <cite>Mozilla Public
  License, version 2.0</cite>.
For more information, see [LICENSE](./LICENSE).