Lady [Fri, 2 Feb 2024 04:09:47 +0000 (23:09 -0500)]
Make percent‐decoding awk script portable
This script depended on `printf` having the same behaviour within `awk`
and on the commandline. This doesn’t appear to be true in G·N·U Awk.
Instead, pipe into the shell version from within the Awk script.
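A minimal sketch of one portable way to percent-decode, not the project's actual script: here `awk` only rewrites the input into escape sequences, and the shell's `printf '%b'` emits the bytes. The function name `percent_decode` is hypothetical.

```shell
# Hypothetical helper: decode %XX sequences in the first argument.
# awk never emits raw bytes itself; it rewrites each %XX as a \0ooo octal
# escape (and doubles literal backslashes), and the shell printf decodes them.
percent_decode() {
  printf '%b' "$(printf '%s' "$1" | awk '
    function hexval(c) { return index("0123456789abcdef", tolower(c)) - 1 }
    {
      out = ""
      n = length($0)
      for (i = 1; i <= n; i++) {
        c = substr($0, i, 1)
        if (c == "%" && substr($0, i + 1, 2) ~ /^[0-9A-Fa-f][0-9A-Fa-f]$/) {
          v = hexval(substr($0, i + 1, 1)) * 16 + hexval(substr($0, i + 2, 1))
          out = out sprintf("\\0%03o", v)   # octal escape understood by %b
          i += 2                            # skip the two hex digits
        } else if (c == "\\") {
          out = out "\\\\"                  # keep literal backslashes literal
        } else {
          out = out c
        }
      }
      print out
    }')"
}
```

Because of the command substitution, trailing newlines in the decoded result are dropped.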
Lady [Fri, 2 Feb 2024 04:06:22 +0000 (23:06 -0500)]
Disallow filenames which end in a cloparen
There is a bug in G·N·U Make which causes the `wildcard` function to
ignore files and directories which end in a cloparen (`)`). To be safe,
disallow these files as sources, even though parens are generally 🆗.
Lady [Wed, 31 Jan 2024 05:48:08 +0000 (00:48 -0500)]
Specify magic files, not a directory
This is more flexible and matches how parsers and transforms work.
Compiling magic requires these to all be placed in the same directory
at some point, but symbolic linking works for this purpose.
Lady [Wed, 31 Jan 2024 05:40:55 +0000 (00:40 -0500)]
Add EXTRA* variables
It shouldn’t be necessary to know where existing parsers and transforms
are kept or what the default find rules are in order to supply
additional ones.
Lady [Wed, 31 Jan 2024 05:34:32 +0000 (00:34 -0500)]
Reduce the strength of magic matches
The strength of magic matches is based on the length of the first line.
Boost this by `100` by default, but add an additional `10` for each
byte in additional match lines. This should provide a more comfortable
set of default strengths to work with.
Lady [Mon, 22 Jan 2024 01:36:38 +0000 (20:36 -0500)]
Disallow backslash in filenames; filter out spaces
Space characters will break this make·file, so, like other characters
which might break it, they should be filtered out by `find`. Backslashes feel
very fraught, especially under secondary expansion, and should be
dis·allowed for simplicity.
Lady [Mon, 22 Jan 2024 01:10:39 +0000 (20:10 -0500)]
Consider destinations akin to dependencies
There were a couple statements which predate the addition of the
destinations file which check for or remove the dependencies file.
These two files should be considered interchangeable for the purpose of
this kind of detection, i·e their impact on make·file rebuilding.
Lady [Mon, 22 Jan 2024 00:49:17 +0000 (19:49 -0500)]
Make find more cross‐compatible
The `-flags -nohidden` check doesn’t appear to work in G·N·U Find.
Apparently `-a` and `-o` are the Posix‐correct versions of `-and` and
`-or`, even though both G·N·U and B·S·D find seem to prefer the latter.
The `-type f` check is moved to the end, as it is computationally more
expensive than `-name` and so would benefit from an early exit.
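As a rough illustration of the operator spelling and test ordering described here (the function name and the `*.xml`/`*.html` patterns are assumptions for the example, not the project's actual rules):

```shell
# Hypothetical helper: list candidate files under a directory using only the
# Posix operator spellings (-a, -o, !), with the -type f test placed last so
# the -name tests can reject an entry first.
find_sources() {
  find "$1" \( -name '*.xml' -o -name '*.html' \) -a ! -name '.*' -a -type f
}
```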
Lady [Mon, 22 Jan 2024 00:44:23 +0000 (19:44 -0500)]
Add uninstall command; fix other phonies
`make uninstall` can now be used to remove (only) installed files. Note
that it only removes files which correspond to presently‐extant source
files; if a source is removed, then `make uninstall` will not remove
any previously‐installed derivatives.
`make all` used to compile assets, but this is unnecessary.
`make clean` now guards against `BUILDDIR` being empty (which would
otherwise result in a dangerous `rm -rf /`).
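The kind of guard described for `make clean` can be sketched in plain shell as follows. This is an assumption about the general shape of such a check, not the recipe used in the make·file, and `safe_clean` is a hypothetical name.

```shell
# Hypothetical helper: remove a build directory, but refuse to do anything
# when the path is empty (the case in which a recipe that appends `/` to the
# variable would end up pointing at the filesystem root).
safe_clean() {
  builddir=$1
  if [ -z "$builddir" ]; then
    printf '%s\n' 'refusing to clean: build directory is not set' >&2
    return 1
  fi
  rm -rf -- "$builddir"
}
```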
Lady [Sat, 20 Jan 2024 17:02:33 +0000 (12:02 -0500)]
Pass single result tree to modes in wrapper
When matching a `<xslt:include>` inside of a `书社:metadata` transform,
it’s useful to be able to easily match elements in the result. Right
now, that requires a reference to the little‐documented variable
`$书社:result`, because `<xslt:include>` elements don’t belong to the
same document as result tree elements.
This commit creates a new tree which simply copies over all the result
nodes and all of the `<xslt:include>` elements, and then passes that to
the modal templates. Consequently, matches like `/html:div` or
`/xslt:include` should work regardless of what the current context node
is.
Lady [Sat, 20 Jan 2024 17:02:11 +0000 (12:02 -0500)]
Better wrapper support for plural result trees
It’s not unsensible for a result tree to consist of both a
`<html:body>` and an `<html:head>` without a wrapping `<html:html>`
element, assuming that the result is going to be wrapped. This commit
improves support for this pattern in the wrapper.
Lady [Fri, 19 Jan 2024 01:16:04 +0000 (20:16 -0500)]
Disallow a few more characters in file·names
- Make will try to expand the glob characters `*`, `?`, and `[` when
followed by `]`. Forbid all of these (including both brackets in all
cases for simplicity).
- `#` and `;` are dangerous in make prerequisites (at least under
secondary expansion).
Lady [Thu, 18 Jan 2024 02:31:11 +0000 (21:31 -0500)]
Improve global params in parsers/transforms
- Uppercase global params to make them distinct.
- Make global params available in parsers, not just transforms, where
possible.
- Add params for the current ⛩️📰 书社 version and the current rev
of the source files (this requires `git` and makes some assumptions
about the location of the git directory).
Lady [Thu, 18 Jan 2024 02:13:06 +0000 (21:13 -0500)]
Improve default parser/transform i·d’s
Use a format of `about:shushe?parser=<name>` and
`about:shushe?transform=<name>` as default i·d’s for parsers and
transforms which do not have one explicitly specified.
Lady [Thu, 18 Jan 2024 01:55:19 +0000 (20:55 -0500)]
Add @书社:parsed-by to parse results
This switches the parser to use a two‐stage parse, in which each node
is by default first processed in the `书社:parse` mode, which then
applies templates to the node. This provides a hook for selecting
certain kinds of elements, for example `<html:script>` elements, and
doing something to the result.
The “something” in this case is “making note of the parser which is
registered to that type on each result element by setting the
`@书社:parsed-by` attribute to its `@id`”.
This setup also allows a reparsing of the parse result (in case new
`<html:script>` elements were produced by it); it is worth noting that
this could result in an endless loop if the `<html:script>` element is
not actually transformed by any parser.
Lady [Wed, 17 Jan 2024 06:36:01 +0000 (01:36 -0500)]
Remember types in parser
Rather than derive the types from the parser via a separate transform,
collect them when building the parser and remember them in a
`<html:dl>` which can be queried with X·Path.
This is a prerequisite to accessing this information at parse time,
but also a useful optimization in its own right.
Lady [Tue, 16 Jan 2024 06:04:33 +0000 (01:04 -0500)]
Update FINDRULES documentation
This comment is out‐of‐date (since the removal of `FINDOPTS`), and the
readme prose misses that percents are also problematic and not matched
by default.
Lady [Tue, 16 Jan 2024 05:58:45 +0000 (00:58 -0500)]
Don’t double‐newline specials in perdec
It’s not necessary to insert newlines (as pipes) before and after the
`sed` substitutions for pipe and backslash, because these substitutions
take place before the substitution for percent‐encoded characters
(which adds its own newlines). (The extra newlines are harmless; they
just mean `awk` gets a few more empty records to process.)
Lady [Tue, 16 Jan 2024 05:07:00 +0000 (00:07 -0500)]
Enable output redirection with 书社:destination
Specifying this attribute on the root element (after parsing, but
before transformation) will override the default output location. All
of the processing for this can be done at the same time as dependency
detection, as it depends on media typing but not on the dependency
tree.
Lady [Tue, 16 Jan 2024 03:20:45 +0000 (22:20 -0500)]
Use pipe as internal delimiter instead of colon
Colons are valid characters in u·r·i’s, whereas pipes are not. Both
characters are forbidden in filenames because they have special meaning
in make·files, so there’s no reason to use the more confusing option.
Lady [Tue, 16 Jan 2024 02:41:30 +0000 (21:41 -0500)]
Update allowed characters; make find more selective
With percent‐encoding, it seems like the only problematic characters
are :—
- Whitespace (incompatible with make)
- Colons (incompatible with make)
- Pipes (incompatible with make as they introduce order‐only
prerequisites)
- Bucks (incompatible with make secondary expansion)
- Percents (incompatible with secondary expansion inside static pattern
rules, and possibly other things)
- Leading hyphen‐minuses (confusable with a command‐line argument)
This commit updates the `find` rules to not select these files, in
addition to the existing behaviour of not selecting hidden files or
those which start with a period.
`FINDOPTS` is removed as the default is virtually always the correct
behaviour; users can override `FIND` if supplying options is absolutely
necessary.
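The exclusions listed above could be expressed with `find` roughly as follows. This is a sketch under assumptions: the function name is hypothetical and the actual default rules may be written differently.

```shell
# Hypothetical helper: list regular files under a directory, skipping names
# which contain whitespace, a colon, a pipe, a buck, or a percent, and names
# which start with a hyphen-minus or a period.
list_safe_files() {
  find "$1" ! -name '*[[:space:]:|$%]*' -a ! -name '-*' -a ! -name '.*' -a -type f
}
```

This only tests the final path component, so it does not exclude files inside a hidden directory.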
Lady [Tue, 16 Jan 2024 01:46:10 +0000 (20:46 -0500)]
Percent‐encode filenames when generating u·r·i’s
It’s not known whether tools like `xmlcatalog` can handle full leiris,
and it shouldn’t be expected of them. It’s safer to use only u·r·i’s
for identifying resources.
Note that this _does_ have implications on includes (they must also be
percent‐encoded). Ideally, it would be possible to run this conversion
in the transforms, but this probably is not possible in X·S·L·T 1.0.
Lady [Tue, 16 Jan 2024 01:29:06 +0000 (20:29 -0500)]
Use colon as delimiter and hymin as recursive sigil
Colon is already forbidden in paths by the make syntax, and initial
hyphen is forbidden because it is confusable with command‐line options.
Re·use these for other semantics to hopefully reduce the number of
forbidden characters in filenames.
Lady [Thu, 11 Jan 2024 01:20:48 +0000 (20:20 -0500)]
Allow multiple source directories
⛩️📰 书社 might be called from another script or make·file, which
might have built files of its own. It would be a pain if each parent
script needed to copy all the source files into a new build directory
at every step, and it’s much easier to just allow ⛩️📰 书社 to support
multiple source directories (one for the original sources, and
additional ones for any files built by other scripts).
Naturally, ⛩️📰 书社 can’t support the same file subpath across
multiple source directories, as these would compile to the same place.
This commit tries to mitigate this by just taking the first match, but
it hasn’t been tested and the behaviour should formally be considered
undefined.
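“Taking the first match” can be pictured with the following sketch. It is only an illustration of the idea; the helper name and argument order are assumptions, and (as noted) the real behaviour is untested and should be treated as undefined.

```shell
# Hypothetical helper: given a subpath and a list of source directories,
# print the first existing file with that subpath and stop looking.
first_match() {
  subpath=$1
  shift
  for dir in "$@"; do
    if [ -f "$dir/$subpath" ]; then
      printf '%s\n' "$dir/$subpath"
      return 0
    fi
  done
  return 1   # no source directory contains the subpath
}
```

For example, `first_match index.xml sources build/generated` prefers the copy in `sources` when both directories contain `index.xml`.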
Lady [Thu, 11 Jan 2024 01:20:38 +0000 (20:20 -0500)]
Make X·M·L types take priority over plaintext ones
If `XMLTYPES` defines something as X·M·L, it should be treated as
X·M·L, regardless of whether there is a parser which claims to support
it.
This avoids awkward footguns where a parser might transform and claim
support for (through `@书社:supported-media-types`) an X·M·L dialect,
resulting in ⛩️📰 书社 treating that type as plaintext and wrapping it
in an `<html:script>` element. X·M·L types listed in
`@书社:supported-media-types` should instead not have any effect
(⛩️📰 书社 should not require parsers to reparse X·M·L).
This commit also removes the unused `simpletypes` variable; it was
formerly used for categorization of types into plaintext or asset prior
to the implementation of automatic detection.
Lady [Thu, 11 Jan 2024 01:20:19 +0000 (20:20 -0500)]
Allow separate find rules for includes
This hypothetically enables the situation where `SRCDIR` and
`INCLUDEDIR` are the same, and files are grouped into one or the other
by some other factor.
Lady [Thu, 11 Jan 2024 01:19:32 +0000 (20:19 -0500)]
Do not follow symbolic links with `find`
⛩️📰 书社 expects that source files exist within `SRCDIR` and includes
exist within `INCLUDEDIR`. Following symlinks can break this
assumption. Other commands should follow symlinks by default, so there
shouldn’t be any need to resolve them this early in the process.
Lady [Thu, 11 Jan 2024 01:15:56 +0000 (20:15 -0500)]
Update readme documentation
- Provide more information regarding parsers, including X·M·L parsers.
- Update advice on allowed characters, to exclude all Ascii characters
not allowed in u·r·i’s as well as those known to cause potential
commandline problems.
- Improve the documentation regarding BUILDDIR and DESTDIR.
Lady [Sat, 6 Jan 2024 03:59:15 +0000 (22:59 -0500)]
Add <书社:apply-attributes> transformation
The behaviour of `transforms/asset.xslt` is useful, but limited in that
the H·T·M·L elements it creates don’t have any attributes beyond
`@src`. `<书社:apply-attributes>` fixes this by allowing attributes to
be declared in a parent element which wraps the `<书社:link>`.
Lady [Sat, 6 Jan 2024 02:31:17 +0000 (21:31 -0500)]
Fix magic file generation in non‐default location
As it turns out, `file -C` always creates a file named `magic.mgc` in
the current working directory. Navigate to the build directory before
calling it instead of moving the file after.
Lady [Mon, 1 Jan 2024 22:18:49 +0000 (17:18 -0500)]
Provide a mechanism to override parser media types
If `@书社:supported-media-types` is present on the root element of a
parser, the normal media type detection is disabled and the value of
the attribute is used instead.
Lady [Mon, 1 Jan 2024 20:49:15 +0000 (15:49 -0500)]
Re·order remakes to (again) fix restarts
In cases where `$(BUILDDIR)/dependencies` exists but
`$(BUILDDIR)/.update-types` (initially) does not, it is important to
check for dependency updates *first*, prior to checking for parser
updates. This is because when parsers are updated, the dependency file
will be deleted, causing the dependency reload recipe to activate
immediately (prior to a restart) if it hasn’t already been checked.
Having correct behaviour depend on the ordering of these recipes isn’t
ideal, but the alternative is checking whether
`$(BUILDDIR)/.update-types` was created *in the course of processing
the make·file* and disabling dependency creation until the next restart
if it had been. This sounds unbearably complex and difficult to phrase
in a readable manner.
Lady [Mon, 1 Jan 2024 19:00:12 +0000 (14:00 -0500)]
Improve asset transforms
This commit converts `audio/*`, `image/*` and `video/*` embeds to their
appropriate H·T·M·L element, enables inline `<html:style>`s, and
improves the handling of `text/css` embeds.
Lady [Mon, 1 Jan 2024 18:40:20 +0000 (13:40 -0500)]
Allow inserting nodes before and after result
This commit adds two new modes akin to `书社:metadata`: `书社:header`,
which supplies nodes to insert at the beginning of the `<html:body>`,
and `书社:footer`, which supplies nodes to insert at the end. Like
`书社:metadata`, these modes do not run if output wrapping is disabled.
Lady [Mon, 1 Jan 2024 16:49:24 +0000 (11:49 -0500)]
Automatically encapsulate metadata and preserve it
During the embedding phase, give top‐level elements and embeds
`@itemscope` properties as well as an `@itemtype` which indicates which
they are. Don’t remove microdata from the output, and make use of these
properties when processing to ensure only document metadata is actually
used.
Lady [Mon, 1 Jan 2024 16:19:37 +0000 (11:19 -0500)]
Allow creation of metadata without matching result
Each node in the result can only be matched once in any given mode, and
transforms need a mechanism for inserting elements without requiring a
match. This commit gives them a means of doing so by also matching
every `<xslt:include>` in the main transform. If a transform has a
`书社:id` top‐level element which is an i·r·i, then its include will
have a corresponding attribute, and transforms can (by convention)
match this include without fear of conflicts.
This commit also makes the expansion and result available as top‐level
variables in the `书社:` namespace, so that transforms can easily match
within them.
Lady [Mon, 1 Jan 2024 06:32:28 +0000 (01:32 -0500)]
Fix/improve restarts by just waiting a sec
The previous method of attempting to retroactively reduce the timestamp
of the make·file when compiling dependencies hasn’t seemed reliable in
practice and probably isn’t portable either. However, a simple
`sleep 1` after touching the make·file but before the first restart
seems to reliably ensure the second restart happens.