Lady [Fri, 28 Mar 2025 17:44:56 +0000 (13:44 -0400)]
Drop `assert()´ check in subpath parsing
This is a minor refactor to use `for´ loops instead of `while´ ones
when parsing subpaths, and to condition exiting the second loop on
filling the array, rather than on reaching the end of the string. If
there is a bug in the code which causes the array to be too small, this
will simply clip the result rather than try to assign to out‐of‐bounds
memory. If there is a bug in the code which causes the array to be too
big, the program will loop endlessly rather than fail an `assert()´
check. The improvement in the first case is deemed to justify the
slight degradation of behaviour in the second.
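
The loop shape described above might be sketched roughly as follows. This is an illustration only, not the code from this commit: `count_segments´ and `fill_segments´ are hypothetical names, and the real parser may size and fill its array differently. Unlike the behaviour described above, the inner scan here stops at the string terminator, so an oversized array is padded with pointers to the end of the string rather than looping endlessly.

```c
#include <stddef.h>

/* Hypothetical helper: count the slash-separated segments in `path`. */
static size_t count_segments(char const *path) {
	size_t n = 1;
	for (char const *p = path; *p != '\0'; ++p) {
		if (*p == '/') ++n;
	}
	return n;
}

/*
 * Hypothetical helper: record the start of each segment of `path` in
 * `out`, which has room for `cap` entries. The outer loop exits once
 * the array is full, so it never writes past `out[cap - 1]`; if `cap`
 * is too small, the result is simply clipped.
 */
static void fill_segments(char const *path, char const **out, size_t cap) {
	char const *p = path;
	for (size_t i = 0; i < cap; ++i) {
		out[i] = p;
		/* Scan to the next slash or the end of the string. */
		while (*p != '\0' && *p != '/') ++p;
		/* Step over the slash, if there was one. */
		if (*p == '/') ++p;
	}
}
```

A caller would typically size the array with `count_segments´ and pass that same count as `cap´.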
Lady [Fri, 28 Mar 2025 17:44:52 +0000 (13:44 -0400)]
Improve documentation generation
• Update Les·M·L to 0.4.0.
• Switch to using a local transform, `xslt/documentation.xslt´, which
includes the Les·M·L parser, instead of using the Les·M·L parser
directly. This offers a great deal more configurability and the
potential for customization. Styling information is moved into this
transform and expanded. `<html:i>´ inside of `<html:code>´ is replaced
with `<html:var>´, unless it has a language tag.
• Automatically provide syntax hiliting to code blocks generated from
header files. It¦s impossible to do this perfectly, but assuming files
follow a consistent style, this should handle most cases transparently.
It would be good to document what exactly “consistent style” means.
⋯ More generally, it might be worth extracting this documentation
generation code (and maybe some stuff in `cgirls.mak´ as well?) out
into a separate repository of C tools, but this work will probably be
deferred until there is a second C repository which actually needs to
make use of it.
• Update the markup, and some text, in the documentation comments in
`request.h´.
Lady [Thu, 20 Mar 2025 03:59:56 +0000 (23:59 -0400)]
Document and improve `request.h´
Mostly, this commit just adds documentation comments to `request.h´ to
fully explain its behaviour, and renames some things for stylistic
reasons. How·ever, it does make one significant change:
It reverts the definition of `cgirls_mtype´ and `cgirls_vb´ back to
enums.
Previously, I thought that it might be possible to be clever and
define these as `constexpr´ strings. That would enable them to be
serialized by string value and compared by pointer value, perhaps
offering the best of both worlds. And this worked in initial tests!
Unfortunately, `constexpr´ declarations have internal linkage, which
means that a ⹐different⹑ object is created for each translation unit
(file). This renders the existence of these `constexpr´s essentially
use·less in a header file, since nobody outside of `request.c´ will
have the same pointers that `request.c´ has.
(This C “feature” is presumably to help guarantee constancy, since
anything with external linkage cannot be truly guaranteed to be
constant.)
The new approach goes back to enums but provides `static const*const´
arrays which map those enums to string values. This of course means the
enums need to be roughly sequential, and one needs to check that a
given index is actually in bounds and does not point to `nullptr´
before using its associated value. Defining the enums as
`unsigned char´ at least means they can never be negative.
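
A minimal sketch of the enum‐plus‐lookup‐table approach, under loudly stated assumptions: the `example_vb´ names and their string values are made up for illustration (the real `cgirls_vb´ values are not listed here), and the sketch uses `NULL´ and a plain enum so that it compiles before C23, where the real code presumably uses `nullptr´ and an enum with `unsigned char´ as its underlying type.

```c
#include <stddef.h>

/* Hypothetical verbs; 2 is left unassigned to show a gap in the table. */
enum example_vb {
	example_vb_index = 0,
	example_vb_raw = 1,
	example_vb_blob = 3,
};

/*
 * Maps each enum value to its string form. Entries without a
 * designated initializer (here, index 2) are null pointers.
 */
static char const *const example_vb_names[] = {
	[example_vb_index] = "index",
	[example_vb_raw] = "raw",
	[example_vb_blob] = "blob",
};

/*
 * Returns the string for `vb`, or NULL if `vb` is out of bounds or
 * falls in a gap. Taking an `unsigned char` means no negative check
 * is needed.
 */
static char const *example_vb_name(unsigned char vb) {
	size_t const count =
		sizeof example_vb_names / sizeof example_vb_names[0];
	if (vb >= count) return NULL;
	return example_vb_names[vb];
}
```

Callers still need to check the result for a null pointer before using it, which is the cost of this approach noted above.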
Lady [Wed, 19 Mar 2025 01:06:31 +0000 (21:06 -0400)]
Improve handling of strings
• String constants are now defined with `constexpr´. Because these are
(associated at runtime with) `char const*const´ values, they can be
compared more‐or‐less like the old enum values used to be; because
those pointers point to actual strings, the code for processing them
and serializing them is simplified quite a bit. A few arrays give the
list of available strings; these are ⹐not⹑ (cannot be) `constexpr´s
because while the strings themselves are known at compile time, the
pointers which point to them cannot be. Instead, they are
`static const*const´ arrays; the `static´ keyword keeps their
visibility internal.
⋯ Exceptionally, `cgirls_mtype_any´ is defined as `nullptr´ rather
than a string of zero length; handling this should always be a
special case.
• Most of the verbs have been commented out to reduce the amount of
code needed for an initial working implementation.
• The path·info parsing code has been refactored a bit, making use of a
new function, `cgirls_gobblepath´ to encapsulate the task of reading
up thru the next slash. The serialization code has also been
refactored here and there for tidiness.
• Some comments in `request.c´ used spaces instead of tabs. Whoops!
Note that Clang only supports `constexpr´ in version 19 and later.
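
For illustration, the task of reading up thru the next slash might look something like the following. This is a sketch only: `gobble_segment´ is a hypothetical stand‐in, and the actual signature and behaviour of `cgirls_gobblepath´ are not given in this message.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for `cgirls_gobblepath`. Starting at
 * `*cursor`, it measures the segment which ends at the next slash (or
 * at the end of the string), moves `*cursor` past that slash, and
 * returns the length of the segment.
 */
static size_t gobble_segment(char const **cursor) {
	char const *start = *cursor;
	char const *p = start;
	while (*p != '\0' && *p != '/') ++p;
	size_t const length = (size_t)(p - start);
	/* Step over the slash if present; otherwise stay at the end. */
	*cursor = (*p == '/') ? p + 1 : p;
	return length;
}
```

Calling this repeatedly walks a path one segment at a time; once the cursor reaches the end of the string, every further call returns zero.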
Lady [Tue, 18 Mar 2025 04:31:32 +0000 (00:31 -0400)]
Add request parsing and related tests
This commit adds a function for processing a “path info” string, for
example one received through C·G·I (as the `PATH_INFO´ environment
variable), into a structure which represents its semantics,
`cgirls_req´. It also adds a function for reserializing this structure
into a canonical form. The program `cgirls-test-pathinfo´ is used with
the existing test infrastructure to ensure that strings are processed
correctly.
There is a flaw in this design (which I realized after making the
original commit, but before writing this updated message), in that an
empty identifier string is represented as `..´, which in a URL already
has a different, and very normative, meaning of “parent directory”.
This flaw will need to be fixed in a later commit.
Probably some more tests could be added here; in particular only a few
verbs and extensions are being tested right now and ideally they all
would be.
Lady [Tue, 18 Mar 2025 04:01:20 +0000 (00:01 -0400)]
Build testing infrastructure
For the actual testing script, see `sh/test.sh´, but note the following
details :—
• The file `cgirls.h´ has been renamed `aa.h´ because it will be needed
in test binaries as well.
• Source files are now automatically found rather than needing to list
them explicitly in `make/cgirls.mak´.
• Support for a `make/config.mak´ configuration file was added, for
modifying the build. A sample is provided which offers the sort of
optimizations one might want in a production environment.
• Tests consist of comparing the result of running a program with the
input in `test/´ against an expected output in `expect/´. Lines which
start with a `#´ are ignored in both. This is to enable
REUSE‐conformance (all of these files are `CC0-1.0´) and hopefully
will not cause problems down the road.
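
The rule about ignoring `#´ lines can be sketched as below. The actual logic lives in `sh/test.sh´ and is not reproduced here; this is a C illustration of the idea, with hypothetical names (`next_line´, `same_ignoring_comments´), which drops lines starting with `#´ from two texts before comparing them. Exactly where the real script applies this filtering is not stated in this message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the next line in `*text` which does not
 * start with `#`, storing its length (without the newline) in `*len`
 * and advancing `*text` past it. Returns NULL at the end of the text.
 */
static char const *next_line(char const **text, size_t *len) {
	while (**text != '\0') {
		char const *line = *text;
		size_t const n = strcspn(line, "\n");
		*text = line + n + (line[n] == '\n');
		if (line[0] != '#') {
			*len = n;
			return line;
		}
	}
	return NULL;
}

/* True if the two texts match once `#` lines are dropped from both. */
static bool same_ignoring_comments(char const *actual, char const *expected) {
	for (;;) {
		size_t alen = 0, elen = 0;
		char const *a = next_line(&actual, &alen);
		char const *e = next_line(&expected, &elen);
		/* If either side has run out, both must have. */
		if (a == NULL || e == NULL) return a == e;
		if (alen != elen || memcmp(a, e, alen) != 0) return false;
	}
}
```

With this rule, a licence header at the top of an expectation file does not count as a difference.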