From: Lady <redacted>
Date: Wed, 10 May 2023 01:31:26 +0000 (-0700)
Subject: [2023-05-09] fannish_metadata
X-Git-Url: https://git.ladys.computer/Blog/commitdiff_plain/3687b980db4c6e9ece6b1bbb2284164046b57e99?ds=sidebyside;hp=b5ee3171c80dbe83c0d767074ba8672170704a98

[2023-05-09] fannish_metadata
---

diff --git a/2023-05-09/fannish_metadata/#entry.rdf b/2023-05-09/fannish_metadata/#entry.rdf
new file mode 100644
index 0000000..f7907cc
--- /dev/null
+++ b/2023-05-09/fannish_metadata/#entry.rdf
@@ -0,0 +1,165 @@
+<awol:Entry
+  xml:lang="en"
+  xmlns:awol="http://bblfish.net/work/atom-owl/2006-06-06/"
+  xmlns:dc11="http://purl.org/dc/elements/1.1/"
+  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
+  xmlns:sioc="http://rdfs.org/sioc/ns#"
+>
+  <dc11:title>Requirements for fannish resource identifiers</dc11:title>
+  <dc11:date>2023-05-09T18:31:26-07:00</dc11:date>
+  <dc11:abstract rdf:parseType="Markdown">< discord about I·D
+requirements for various types of fannish resources, and how these
+things might federate out or be handled by other services.
+]]></dc11:abstract>
+  <sioc:content rdf:parseType="Markdown">< discord about I·D
+requirements for various types of fannish resources, and how these
+things might federate out or be handled by other services. Our goal is
+to create a decentralized network of fannish platforms, so figuring out
+resource identification requirements is an important first step.
+
+Note that in the discussion which follows, a “resource” might be a
+work, an author, a tag, a bookmark, or something else… anything which
+might be a metadata subject.
+
+## ① Resources should have Tag U·R·I’s.
+
+Resource [Tag U·R·I](https://taguri.org)’s should be U·R·I’s of the
+form :—
+
+```
+tag:<domain>,<date>:<path>
+```
+
+—: where `<domain>` is the domain name of a site, `<date>` is some
+date (in `YYYY-MM-DD` format), and `<path>` is some path decided by the
+person or people who owned `<domain>` at `<date>` to uniquely identify
+the resource. Tag U·R·I’s are ideal for fannish resources for the
+following reasons :—
+
+- In order for a fannish resource to be published on the internet, it
+  must be published at a domain on a date. So these requirements are
+  easily satisfied.
+
+- No external registration (beyond owning a domain name) is necessary
+  to mint U·R·I’s, and no maintenance is necessary.
+
+- The domain name in the Tag U·R·I indicates who should be the trusted
+  party when it comes to information about the resource: `<domain>`.
+  If you hear about the resource from somewhere else, you know to view
+  the information you receive with some level of suspicion.
+
+Some additional notes :—
+
+- The term “Tag U·R·I” has no relation to the normal fannish use of
+  “tag”; it’s just what the U·R·I scheme happens to be called.
+
+- `<date>` does *not* (and maybe *should* not) have to be the actual
+  date a resource was created. My recommendation would be to set
+  `<date>` to the date that a service was founded, so that e·g if a
+  service dies and a new one is started at the same domain, the two
+  generate clearly distinguishable U·R·I’s.
+
+- It’s not possible to distinguish between beneficial reasons for
+  content changes at `<domain>` (an author editing a work) and
+  malicious ones (hostile domain takeover). It’s also not possible to
+  verify that the people at `<domain>` actually controlled the domain
+  at `<date>`. But if people play by the rules, an *accidental* name
+  collision will never happen.
+
+## ② Resources should have canonical U·R·L’s containing their Tag.
+
+The canonical U·R·L for a resource should look like this :—
+
+```
+https://<domain>/<subpath>/tag:<domain>,<date>:<path>
+```
+
+There are a few important things of note here :—
+
+1. Both instances of `<domain>` **must be** the same, or else the U·R·L
+   is not canonical.
+
+2. The entire Tag U·R·I is present in the U·R·L, allowing it to be
+   identified even if the U·R·L ceases to be dereferenceable.
+
+3. `<path>` may contain anything, including a query or fragment part.
+
+It is possible for resources to be *mirrored*. Mirrors **must** have
+U·R·L’s like the following :—
+
+```
+https://<mirror-domain>/<mirror-subpath>/tag:<domain>,<date>:<path>
+```
+
+—: that is, the same easily‐recognizable Tag U·R·I, but at a different
+domain and subpath. Mirrors **must** identify the canonical U·R·L of
+the resource they are mirroring. `owl:sameAs` might be one mechanism of
+doing this in R·D·F.
+
+## ③ Crossposted resources should link to each other.
+
+If a work is crossposted in two locations, one is not necessarily
+“canonical” and the other a “mirror”. Likely, both will be canonical
+and have their own Tag U·R·I’s (and this is a good thing). Crossposted
+works should instead identify themselves by linking to each other in
+some reciprocal fashion. We may need to come up with our own metadata
+term for specifying this, but see e·g `dcterms:hasFormat` and
+`dcterms:isFormatOf` which encode a similar (but not necessarily
+reciprocal in the same way) relationship.
+
+## ④ Platforms should only trust mirrors as a last resort.
+
+And with copious warnings. If at all possible, platforms should direct
+users to the canonical U·R·L associated with a resource. However, this
+may not be possible (if an archive moves or goes down). In that case,
+a platform *may* direct users to a mirror, with a warning that the
+mirrored version is not the original published work and may differ in
+significant ways.
+
+## Additional thoughts.
+
+These things were either only briefly touched on, or else are my own
+ideas which came as I was writing this post.
+
+- Mirroring should be explicitly opt·in, and ideally automated (to
+  reduce the likelihood of intentional or unintentional error). We will
+  need to develop protocols for this.
+
+- For added security, publishing platforms might implement Webfinger,
+  to guard against mirrors which correctly identify works they control
+  but misidentify their path (thus making them appear to be down).
+  Discovery platforms *may*, and probably *should*, attempt to make a
+  Webfinger request for the resource with its Tag U·R·I instead of
+  trusting the canonical path. However, supporting Webfinger *should
+  not* be required of all publishing platforms, and the attack vector
+  from mirrors in this sense is pretty small.
+
+- Webfinger or ordinary H·T·T·P redirects could be used to forward
+  services to new “canonical” U·R·L’s in the case that a service moves.
+  However, this trail would only be followable for as long as the
+  redirects or Webfinger endpoint remains up at the original domain.
+
+- Instead of mirroring tags, a service might indicate that its version
+  of a tag is intended to be synonymous with another service’s version
+  of a tag using `skos:closeMatch`. The stronger statement
+  `skos:exactMatch` requires agreement from both services. Tag mirrors
+  are useful in case the canonical service for a tag goes down, but
+  should not be relied upon otherwise.
+
+- Publishing platforms *may* serve a “tombstone” at the canonical U·R·L
+  for a resource, indicating that it was intentionally deleted. In this
+  case, a mirrored version **must not** be used.
+]]></sioc:content>
+  <dc11:rights rdf:parseType="Markdown"><![CDATA[
+Copyright © 2023 Lady <small>[Fannish Metadata Nerd]</small>.
+Some rights reserved.
+
+This blogpost is licensed under a <a rel="license"
+href="http://creativecommons.org/licenses/by/4.0/"><cite>Creative
+Commons Attribution 4.0 International License</cite></a>.
+]]></dc11:rights>
+</awol:Entry>