From: Lady <redacted> Date: Sun, 14 Apr 2024 19:44:22 +0000 (-0400) Subject: Add support for manually serializing X·M·L X-Git-Tag: 0.7.2~1 X-Git-Url: https://git.ladys.computer/Shushe/commitdiff_plain/3dc36d1f23aff7cb60b498d88e503b619a7bc543?ds=sidebyside;hp=c26abcf07cd0f116b5810946744dd804cc41a208 Add support for manually serializing X·M·L This commit adds a transform for a new `<书社:serialize-xml>` element, which is useful in conjunction with `<书社:raw-text>` to produce a more finely‐controlled X·M·L output, or in other X·M·L‐y situations where an escaped X·M·L value is required. The algorithm used for serialization attempts to closely match the DOM Parsing and Serialization spec, including such behaviours as mandating an undeclared `xml:` prefix for the X·M·L name·space and dropping the prefix from elements whose name·space matches the default, but it probably isn’t exactly the same (due in part to the fact that the underlying data structure is an X·M·L infoset, not a potentially dynamically‐modified Dom). No special allowances are made for elements in the H·T·M·L name·space; this is not (yet) a suitable polyglot serializer (or intended to be one). --- diff --git a/README.markdown b/README.markdown index 55d6f8f..643a4d8 100644 --- a/README.markdown +++ b/README.markdown @@ -87,19 +87,23 @@ As this is a Make‐based program, steps will only be run if the corresponding buildfile or output file is older than its prerequisites. -## Namespaces +## Name·spaces -The ⛩️📰 书社 namespace is `urn:fdc:ladys.computer:20231231:Shu1She4`. +The ⛩️📰 书社 name·space is `urn:fdc:ladys.computer:20231231:Shu1She4`. 
-This document uses a few namespace prefixes, with the following +This document uses a few name·space prefixes, with the following meanings :— -| Prefix | Expansion | -| -------: | :----------------------------------------- | -| `html:` | `http://www.w3.org/1999/xhtml` | -| `xlink:` | `http://www.w3.org/1999/xlink` | -| `xslt:` | `http://www.w3.org/1999/XSL/Transform` | -| `书社:` | `urn:fdc:ladys.computer:20231231:Shu1She4` | +| Prefix | Expansion | +| ---------: | :-------------------------------------------- | +| `catalog:` | `urn:oasis:names:tc:entity:xmlns:xml:catalog` | +| `exsl:` | `http://exslt.org/common` | +| `exslstr:` | `http://exslt.org/strings` | +| `html:` | `http://www.w3.org/1999/xhtml` | +| `svg:` | `http://www.w3.org/2000/svg` | +| `xlink:` | `http://www.w3.org/1999/xlink` | +| `xslt:` | `http://www.w3.org/1999/XSL/Transform` | +| `书社:` | `urn:fdc:ladys.computer:20231231:Shu1She4` | ## Setup and Configuration @@ -293,7 +297,7 @@ New ⛩️📰 书社 parsers which target plaintext formats should have an `<xslt:template>` element with no `@name` or `@mode` and whose `@match` attribute… -- Starts with an appropriately‐namespaced qualified name for a +- Starts with an appropriately‐name·spaced qualified name for a `<html:script>` element. - Follows this with the string `[@type=`. @@ -350,7 +354,7 @@ Parsers are applied *prior* to embedding (and can be used to generate embeds); transforms are applied *after*. It is **strongly recommended** that auxillary templates in parsers be - namespaced (by `@name` or `@mode`) whenever possible, to avoid + name·spaced (by `@name` or `@mode`) whenever possible, to avoid conflicts between parsers. ### Attributes added during parsing @@ -432,6 +436,29 @@ Transforms are used to convert X·M·L files into their final output, To undo this behaviour, remove the `@itemscope` and `@itemtype` attributes from the embed during the transformation phase. 
+- **`transforms/serialization.xslt`:** + Replaces `<书社:serialize-xml>` elements with the (escaped) + serialized X·M·L of their contents. + This replacement happens during the application phase, after most + other transformations have taken place. + + If a `@with-namespaces` attribute is provided, any name·space nodes + on the toplevel serialized elements whose U·R·I’s correspond to the + definitions of the provided prefixes, as defined for the + `<书社:serialize-xml>` element, will be declared using name·space + attributes on the serialized elements. + Otherwise, only name·space nodes which _differ_ from the definitions + on the `<书社:serialize-xml>` element will be declared. + The string `#default` may be used to represent the default + name·space. + Multiple prefixes may be provided, separated by white·space. + + When it comes to name·spaces used internally by ⛩️📰 书社, the + prefix used by ⛩️📰 书社 may be declared _in addition to_ the + prefix(es) used in the source document(s). + It is not possible to selectively only declare one prefix for a + name·space to the exclusion of others. + The following are recommendations on effective creation of transforms :— @@ -439,10 +466,10 @@ The following are recommendations on effective creation of It is likely an error if two transforms have templates which match the same element (unless the templates have different priority). -- Namespace templates (with `@name` or `@mode`) whenever possible. +- Name·space templates (with `@name` or `@mode`) whenever possible. - Set `@exclude-result-prefixes` on the root `xslt:transform` element - to reduce the number of declared namespaces in the final result. + to reduce the number of declared name·spaces in the final result. ## Global Params @@ -547,7 +574,7 @@ In both cases, attributes from various sources are combined with Attribute application takes place after all ordinary transforms have completed. 
-Both elements ignore attributes in the `xml:` namespace, except for +Both elements ignore attributes in the `xml:` name·space, except for `@xml:lang`, which ignores all but the first definition (including any already present on the root element). On H·T·M·L and S·V·G elements, `@lang` has the same behaviour as @@ -555,7 +582,7 @@ On H·T·M·L and S·V·G elements, `@lang` has the same behaviour as ## Other Kinds of Output -There are a few special elements in the `书社:` namespace which, if +There are a few special elements in the `书社:` name·space which, if they appear as the toplevel element in a transformation result, cause ⛩️📰 书社 to produce something other than an X·M·L file. They are :— diff --git a/lib/catalog2transform.xslt b/lib/catalog2transform.xslt index 2315762..2b5c417 100644 --- a/lib/catalog2transform.xslt +++ b/lib/catalog2transform.xslt @@ -272,7 +272,7 @@ If a copy of the M·P·L was not distributed with this file, You can obtain one </xslt:copy> </xslt:template> <xslt:template match="@书社:destination|@书社:disable-output-wrapping|@书社:archived-as[../ancestor::*[not(self::书社:apply-attributes-to-root or self::书社:apply-attributes)]]" mode="书社:application" priority="1"/> - <xslt:template match="书社:archive" mode="书社:application"> + <xslt:template match="书社:archive" mode="书社:application" priority="1"> <xslt:copy> <xslt:for-each select="@*|node()"> <xslt:choose> @@ -301,7 +301,7 @@ If a copy of the M·P·L was not distributed with this file, You can obtain one </xslt:template> <xslt:template match="书社:apply-attributes" mode="书社:application" priority="1"> <xslt:variable name="children"> - <xslt:apply-templates select="node()" mode="书社:application"/> + <xslt:apply-templates mode="书社:application"/> </xslt:variable> <xslt:call-template name="书社:apply-attributes"> <xslt:with-param name="context-nodes" select="."/> @@ -309,7 +309,7 @@ If a copy of the M·P·L was not distributed with this file, You can obtain one </xslt:call-template> </xslt:template> <xslt:template 
match="书社:apply-attributes-to-root" mode="书社:application" priority="1"> - <xslt:apply-templates select="node()" mode="书社:application"/> + <xslt:apply-templates mode="书社:application"/> </xslt:template> <xslt:template match="/node()" mode="书社:expand" priority="0"> <xslt:copy> diff --git a/lib/serialize.xslt b/lib/serialize.xslt new file mode 100644 index 0000000..5f792ea --- /dev/null +++ b/lib/serialize.xslt @@ -0,0 +1,257 @@ +<?xml version="1.0"?> +<!-- +SPDX-FileCopyrightText: 2024 Lady <https://www.ladys.computer/about/#lady> +SPDX-License-Identifier: MPL-2.0 +--> +<!-- +⁌ ⛩️📰 书社 ∷ lib/serialize.xslt + +© 2024 Lady [@ Lady’s Computer]. + +This Source Code Form is subject to the terms of the Mozilla Public License, v 2.0. +If a copy of the M·P·L was not distributed with this file, You can obtain one at <https://mozilla.org/MPL/2.0/>. +--> +<transform + xmlns="http://www.w3.org/1999/XSL/Transform" + xmlns:exsl="http://exslt.org/common" + xmlns:exslstr="http://exslt.org/strings" + xmlns:html="http://www.w3.org/1999/xhtml" + xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4" + extension-element-prefixes="exsl exslstr" + version="1.0" +> + <template name="书社:sanitize-_amp_lt_gt"> + <param name="source"/> + <choose> + <when test="contains($source, '&')"> + <call-template name="书社:sanitize-_lt_gt"> + <with-param name="source" select="substring-before($source, '&')"/> + </call-template> + <text>&amp;</text> + <call-template name="书社:sanitize-_amp_lt_gt"> + <with-param name="source" select="substring-after($source, '&')"/> + </call-template> + </when> + <otherwise> + <call-template name="书社:sanitize-_lt_gt"> + <with-param name="source" select="$source"/> + </call-template> + </otherwise> + </choose> + </template> + <template name="书社:sanitize-_amp_quot_lt_gt"> + <param name="source"/> + <choose> + <when test="contains($source, '&')"> + <call-template name="书社:sanitize-_quot_lt_gt"> + <with-param name="source" select="substring-before($source, '&')"/> + </call-template> + 
<text>&amp;</text> + <call-template name="书社:sanitize-_amp_quot_lt_gt"> + <with-param name="source" select="substring-after($source, '&')"/> + </call-template> + </when> + <otherwise> + <call-template name="书社:sanitize-_quot_lt_gt"> + <with-param name="source" select="$source"/> + </call-template> + </otherwise> + </choose> + </template> + <template name="书社:sanitize-_gt"> + <param name="source"/> + <choose> + <when test="contains($source, '>')"> + <value-of select="substring-before($source, '>')"/> + <text>&gt;</text> + <call-template name="书社:sanitize-_gt"> + <with-param name="source" select="substring-after($source, '>')"/> + </call-template> + </when> + <otherwise> + <value-of select="$source"/> + </otherwise> + </choose> + </template> + <template name="书社:sanitize-_lt_gt"> + <param name="source"/> + <choose> + <when test="contains($source, '<')"> + <call-template name="书社:sanitize-_gt"> + <with-param name="source" select="substring-before($source, '<')"/> + </call-template> + <text>&lt;</text> + <call-template name="书社:sanitize-_lt_gt"> + <with-param name="source" select="substring-after($source, '<')"/> + </call-template> + </when> + <otherwise> + <call-template name="书社:sanitize-_gt"> + <with-param name="source" select="$source"/> + </call-template> + </otherwise> + </choose> + </template> + <template name="书社:sanitize-_quot_lt_gt"> + <param name="source"/> + <choose> + <when test="contains($source, '"')"> + <call-template name="书社:sanitize-_lt_gt"> + <with-param name="source" select="substring-before($source, '"')"/> + </call-template> + <text>&quot;</text> + <call-template name="书社:sanitize-_quot_lt_gt"> + <with-param name="source" select="substring-after($source, '"')"/> + </call-template> + </when> + <otherwise> + <call-template name="书社:sanitize-_lt_gt"> + <with-param name="source" select="$source"/> + </call-template> + </otherwise> + </choose> + </template> + <template name="书社:serialize-attribute-value"> + <param name="source"/> + <call-template 
name="书社:sanitize-_amp_quot_lt_gt"> + <with-param name="source" select="string($source)"/> + </call-template> + </template> + <template name="书社:serialize-text-value"> + <param name="source"/> + <call-template name="书社:sanitize-_amp_lt_gt"> + <with-param name="source" select="string($source)"/> + </call-template> + </template> + <template match="node()" mode="书社:serialize"> + <param name="namespace" select="''"/> + <param name="declare-namespaces" select="''"/> + <param name="prefix-map"> + <html:dl> + <html:div> + <html:dt>xml</html:dt> + <html:dd>http://www.w3.org/XML/1998/namespace</html:dd> + </html:div> + </html:dl> + </param> + <variable name="namespaces-to-declare" select="exslstr:tokenize($declare-namespaces)"/> + <choose> + <when test="self::*"> + <variable name="local-namespace" select="string(namespace::*[local-name()=''])"/> + <variable name="local-prefixes" select="namespace::*[local-name()!='' and local-name()!='xml']"/> + <variable name="local-prefixes-map"> + <html:dl> + <html:div> + <html:dt>xml</html:dt> + <html:dd>http://www.w3.org/XML/1998/namespace</html:dd> + </html:div> + <for-each select="exsl:node-set($prefix-map)//html:div"> + <if test="string(html:dt)!='xml' and not($local-prefixes[local-name()=string(current()/html:dt)])"> + <copy-of select="."/> + </if> + </for-each> + <for-each select="$local-prefixes"> + <html:div> + <html:dt> + <value-of select="local-name()"/> + </html:dt> + <html:dd> + <value-of select="string(.)"/> + </html:dd> + </html:div> + </for-each> + </html:dl> + </variable> + <variable name="qualified-name"> + <choose> + <when test="$namespace=namespace-uri()"> + <if test="$namespace='http://www.w3.org/XML/1998/namespace'"> + <text>xml:</text> + </if> + <value-of select="local-name()"/> + </when> + <otherwise> + <value-of select="name()"/> + </otherwise> + </choose> + </variable> + <text><</text> + <value-of select="$qualified-name"/> + <if test="$local-namespace!=$namespace or 
$namespaces-to-declare[string(.)=$local-namespace]"> + <text> xmlns="</text> + <call-template name="书社:serialize-attribute-value"> + <with-param name="source" select="$local-namespace"/> + </call-template> + <text>"</text> + </if> + <for-each select="exsl:node-set($local-prefixes-map)//html:div"> + <if test="$namespaces-to-declare[string(.)=string(current()/html:dd)] or string(html:dt)!='xml' and not(exsl:node-set($prefix-map)//html:div[string(html:dt)=string(current()/html:dt) and string(html:dd)=string(current()/html:dd)])"> + <text> xmlns:</text> + <value-of select="html:dt"/> + <text>="</text> + <call-template name="书社:serialize-attribute-value"> + <with-param name="source" select="string(html:dd)"/> + </call-template> + <text>"</text> + </if> + </for-each> + <for-each select="@*"> + <text> </text> + <choose> + <when test="namespace-uri()='http://www.w3.org/XML/1998/namespace'"> + <text>xml:</text> + <value-of select="local-name()"/> + </when> + <otherwise> + <value-of select="name()"/> + </otherwise> + </choose> + <text>="</text> + <call-template name="书社:serialize-attribute-value"> + <with-param name="source" select="string(.)"/> + </call-template> + <text>"</text> + </for-each> + <choose> + <when test="node()"> + <text>></text> + <apply-templates mode="书社:serialize"> + <with-param name="namespace" select="$local-namespace"/> + <with-param name="prefix-map" select="$local-prefixes-map"/> + </apply-templates> + <text></</text> + <value-of select="$qualified-name"/> + <text>></text> + </when> + <otherwise> + <text>/></text> + </otherwise> + </choose> + </when> + <when test="node()"> + <apply-templates mode="书社:serialize"> + <with-param name="namespace" select="$namespace"/> + <with-param name="prefix-map" select="$prefix-map"/> + </apply-templates> + </when> + <when test="self::comment()"> + <text><!--</text> + <value-of select="."/> + <text>--></text> + </when> + <when test="self::text()"> + <call-template name="书社:serialize-text-value"> + <with-param 
name="source" select="string(.)"/> + </call-template> + </when> + <when test="self::processing-instruction()"> + <text><?</text> + <value-of select="local-name()"/> + <text> </text> + <value-of select="."/> + <text>?></text> + </when> + <otherwise/> + </choose> + </template> +</transform> diff --git a/transforms/serialization.xslt b/transforms/serialization.xslt new file mode 100644 index 0000000..09f595e --- /dev/null +++ b/transforms/serialization.xslt @@ -0,0 +1,67 @@ +<?xml version="1.0"?> +<!-- +SPDX-FileCopyrightText: 2024 Lady <https://www.ladys.computer/about/#lady> +SPDX-License-Identifier: MPL-2.0 +--> +<!-- +⁌ ⛩️📰 书社 ∷ transforms/serialization.xslt + +© 2024 Lady [@ Lady’s Computer]. + +This Source Code Form is subject to the terms of the Mozilla Public License, v 2.0. +If a copy of the M·P·L was not distributed with this file, You can obtain one at <https://mozilla.org/MPL/2.0/>. +--> +<transform + xmlns="http://www.w3.org/1999/XSL/Transform" + xmlns:exsl="http://exslt.org/common" + xmlns:exslstr="http://exslt.org/strings" + xmlns:html="http://www.w3.org/1999/xhtml" + xmlns:书社="urn:fdc:ladys.computer:20231231:Shu1She4" + extension-element-prefixes="exsl exslstr" + version="1.0" +> + <import href="../lib/serialize.xslt"/> + <书社:id>urn:fdc:ladys.computer:20231231:Shu1She4:serialization.xslt</书社:id> + <template match="书社:serialize-xml" mode="书社:application" priority="1"> + <variable name="namespaces" select="namespace::*"/> + <variable name="contents"> + <apply-templates mode="书社:application"/> + </variable> + <variable name="passthru-namespaces" select="exslstr:tokenize(string(@with-namespaces))"/> + <apply-templates select="exsl:node-set($contents)" mode="书社:serialize"> + <with-param name="namespace"> + <value-of select="namespace::*[local-name()='']"/> + </with-param> + <with-param name="declare-namespaces"> + <if test="$passthru-namespaces[string(.)='xml']"> + <text> http://www.w3.org/XML/1998/namespace</text> + </if> + <for-each 
select="namespace::*[local-name()!='xml']"> + <if test="local-name()='' and $passthru-namespaces[string(.)='#default'] or $passthru-namespaces[string(.)=local-name(current())]"> + <text> </text> + <value-of select="string(.)"/> + </if> + </for-each> + <text> </text> + </with-param> + <with-param name="prefix-map"> + <html:dl> + <html:div> + <html:dt>xml</html:dt> + <html:dd>http://www.w3.org/XML/1998/namespace</html:dd> + </html:div> + <for-each select="namespace::*[local-name()!='' and local-name()!='xml']"> + <html:div> + <html:dt> + <value-of select="local-name()"/> + </html:dt> + <html:dd> + <value-of select="string(.)"/> + </html:dd> + </html:div> + </for-each> + </html:dl> + </with-param> + </apply-templates> + </template> +</transform>
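A note on the escaping in `lib/serialize.xslt`: the `书社:sanitize-…` templates split on `&` first and only then handle `"`, `<`, and `>`, so the ampersands introduced by earlier replacements are never escaped a second time. Text nodes go through the `&`/`<`/`>` chain (`书社:serialize-text-value`), and attribute values additionally escape `"` (`书社:serialize-attribute-value`). The following is a minimal Python sketch of that ordering, not part of this commit; the function names `escape_text` and `escape_attribute` are hypothetical and chosen only for illustration.

```python
def escape_text(source: str) -> str:
    """Escape a string for use as X·M·L text content.

    The ampersand is replaced first; doing it later would re-escape the
    ampersands introduced by the other replacements.
    """
    return (
        source.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
    )


def escape_attribute(source: str) -> str:
    """Escape a string for use inside a double-quoted attribute value.

    Same as escape_text, plus the double quote. "&quot;" contains no
    character that escape_text touches, so applying it afterwards is safe.
    """
    return escape_text(source).replace('"', "&quot;")


if __name__ == "__main__":
    print(escape_text("a < b && c > d"))
    print(escape_attribute('say "hi" & <wave>'))
```

The recursive named templates in the stylesheet reach the same result by repeated `substring-before`/`substring-after` calls, since X·S·L·T 1.0 has no general string‐replace function.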