#!/usr/bin/sed -f
# SPDX-FileCopyrightText: 2025, 2026 Lady
# SPDX-License-Identifier: MPL-2.0
## ⋯ 🧮🖍 Codemark ∷ sed ∷ SYNTAXES ∷ sh.sed
##
## ⁌ Shell script syntax
##
## ] Copyright © 2026 Lady [@ Ladys Computer].
## ]
## ] This Source Code Form is subject to the terms of the Mozilla
## ] Public License, version 2.0.
## ] If a copy of the M·P·L was not distributed with this file, You
## ] can obtain one at {🔗}.

## § Implementation
##
## S·P·D·X copyright comments are made empty with·out replacement.
## Rather, any copyright information should be expressed in prose.
/^# SPDX-.*/{
	s/.*//
	b
}

## Documentation comments begin with two hashes.
## The hashes are removed and the next cycle is begun.
/^##/{
	:indoc
	s/^## *//
	n
	/^##/b indoc
	/^$/b
}

## The behaviour of blank lines (not handled by the above rules) varies
## depending on the line which follows.
## If it is not a documentation comment, a pipe is prepended.
/^$/{
	N
	/\n##/!s/^/|/
	P
	D
}

## We can now assume that the current line did not begin with two
## hashes; it should be treated as code.
##
## Lines which begin with (single) hashes are comments and not
## processed further.
/^ *#/{
	s/^\( *\)\(#.*\)/|\1⟦\2⟧/
	b
}

## An initial and final space is added to make processing easier.
## These will be removed at the end.
s/^/ /
s/$/ /

## Variable defaults must follow a colon `:´ command and be specified
## either as a string or as a reference to another variable.
/^ *: "[$]{[^:}]*:=/{
	s/"[$]{\([^:]*\):=\([^$"}]*\)}"/⟨"${⸤\1⸥:=\2}"⟩{@class="string"}/
	s/"[$]{\([^:]*\):=\([$]{[^}]*}\)}"/⟨"${⸤\1⸥:=\2}"⟩{@class="string"}/
}

## Control flow keywords such as `if´ or `for´ must begin their line.
## They are handled upfront as they are easily recognized.
##
## For `then´, `else´, and `do´, following them with the no·op command
## `:´ is recommended as a matter of style, but not required.
/^ *if [^ ]/s/if \([^ ]*\)/⦃︎if⦄︎ ⦃︎\1⦄︎/
/^ *then [^ ]/s/then \([^ ]*\)/⦃then⦄︎ ⦃︎\1⦄︎/
/^ *else [^ ]/s/else \([^ ]*\)/⦃else⦄︎ ⦃︎\1⦄︎/
/^ *fi /s/fi/⦃︎fi⦄︎/
/^ *for [^ ][^ ]* in /s/for \([^ ]*\) in/⦃︎for⦄︎ ⸤\1⸥ ⦃︎in⦄︎/
/^ *do [^ ]/s/do \([^ ]*\)/⦃do⦄︎ ⦃︎\1⦄︎/
/^ *done /s/done/⦃done⦄︎/

## It is not permitted to pipe into a control flow keyword.
## So, after pipes, the script will jump here to begin processing.
:postpipe

## When a line begins with the sequence `)"´, it is assumed to be
## closing a subshell.
## In this case, the script skips ahead to avoid recognizing this
## sequence as a command.
/^ *)"/b postback

## The first word which is not a variable assignment is assumed to be
## the name of a command, unless this line already contains markup
## from control flow processing.
/⦃︎/!s/ \([^ =][^ =]*\) / ⦃︎\1⦄︎ /

## Variables must be processed in a loop because there can be multiple
## on a line.
## However, they are only processed up to the first instance of `⦃︎´
## markup (presumably inserted by the above rules).
:variable
/^\([^=]*\([^A-Za-z][^0-9A-Za-z_]*=\)*\)*⦃︎/!{
	s/ \([A-Za-z][0-9A-Za-z_]*\)=/ ⸤\1⸥=/
	/ [A-Za-z][0-9A-Za-z_]*=/b variable
}

## After backslashes (line continuations) or subshells, it is assumed
## that the line is in the middle of a command.
## By jumping here, the program skips variable and command name
## processing.
:postback

## After `&&´, `||´, and `|´, the following word is a command.
s/ \(&&\) \([^ ]*\) / \1 ⦃\2⦄︎ /
s/ || \([^ ]*\) / || ⦃\1⦄︎ /
s/ | \([^ ]*\) / | ⦃\1⦄︎ /

## Some rules are enforced for strings :⁠—
##
## • When a string wraps a variable reference, it must have the form
##   `"${VARIABLE_NAME}"´ (the variable must be the only thing in the
##   string).
##
## • When a string wraps a subshell, the text of the subshell
##   invocation must be on its own line.
##   The string must not contain anything aside from the subshell; it
##   must begin `"$(´ and end `)"´.
##
## • Strings consisting of only a straight single quote must be
##   double‐quoted.
##
## • Otherwise, strings must be single‐quoted.
##
## The shell syntax allows strings to be concatenated by placing them
## directly adjacent to each other, so a variable reference and a
## literal value may be joined together in this manner.
s/"'"/⟨"'"⟩{@class="string"}/g
s/"${\([^}]*\)}"/⟨"${\1}"⟩{@class="string"}/g
/^ *)"/s/)"/⟨)"⟩{@class="string"}/g
s/"$(/⟨"$(⟩{@class="string"}/g
s/\('[^']*'\)/⟨\1⟩{@class="string"}/g

## The initial and final spaces added above can now be removed.
s/^ //
s/ $//

## Finally, a `|´ can be prepended to mark the line as preformatted.
s/^/|/

## If the current line of code ends in a backslash or a pipe, the
## following line is a continuation of it.
## The `n´ command manually advances to the next line with·out starting
## a new cycle, and the program then jumps to the correct place for
## what might follow.
/[\]$/{
	n
	s/^/ /
	s/$/ /
	b postback
}
/|$/{
	n
	s/^/ /
	s/$/ /
	b postpipe
}