From: Lady
Date: Sun, 4 Feb 2024 22:36:46 +0000 (-0500)
Subject: Update parser documentation
X-Git-Tag: 0.6.0~2
X-Git-Url: https://git.ladys.computer/Shushe/commitdiff_plain/662456c81fa21e10334c1e23a36394794f71cd99?ds=sidebyside

Update parser documentation

- Remove class names which are no longer provided from description.
- Document attributes added during parsing.
---

diff --git a/README.markdown b/README.markdown
index e0a143e..885fe75 100644
--- a/README.markdown
+++ b/README.markdown
@@ -272,16 +272,15 @@ Parsers are used to convert plaintext files into X·M·L trees, as well
 ⛩️📰 书社 comes with some parsers; namely :⁠—
 
 - **`parsers/plain.xslt`:**
-  Wraps `text/plain` contents in a `` element.
+  Wraps `text/plain` contents in a `` element.
 
 - **`parsers/record-jar.xslt`:**
-  Converts `text/record-jar` contents into a
-    `` of `` elements (one for
-    each record).
+  Converts `text/record-jar` contents into a `` of
+    `` elements (one for each record).
 
 - **`parsers/tsv.xslt`:**
-  Converts `text/tab-separated-values` contents into an
-    `` element.
+  Converts `text/tab-separated-values` contents into an ``
+    element.
 
 New ⛩️📰 书社 parsers which target plaintext formats should have an
   `` element with no `@name` or `@mode` and whose
@@ -347,6 +346,26 @@ It is **strongly recommended** that auxillary templates in parsers be
   namespaced (by `@name` or `@mode`) whenever possible, to avoid
   conflicts between parsers.
 
+### Attributes added during parsing
+
+⛩️📰 书社 will add a few attributes to the output of the parsing step,
+  namely :⁠—
+
+- A `@书社:cksum` attribute on toplevel result elements, giving the
+  `cksum` checksum of the corresponding source file.
+
+- For the elements which result from parsing plaintext ``
+  elements :⁠—
+
+  - A `@书社:parsed-by` attribute, giving a space‐separated list of
+    parsers which parsed the node.
+    (Generally, this will be a list of one, but it is possible for the
+    result of a parse to be another plaintext node, which may be
+    parsed by a different parser.)
+
+  - A `@书社:media-type` attribute, giving the identified media type of
+    the plaintext node.
+
 ## Embedding
 
 Documents can be embedded in other documents using a `<书社:link>`