<awol:Entry
        xml:lang="en"
        xmlns:awol="http://bblfish.net/work/atom-owl/2006-06-06/"
        xmlns:dc11="http://purl.org/dc/elements/1.1/"
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:sioc="http://rdfs.org/sioc/ns#"
>
        <dc11:title>Requirements for fannish resource identifiers</dc11:title>
        <dc11:date>2023-05-09T18:31:26-07:00</dc11:date>
        <dc11:abstract rdf:parseType="Markdown"><![CDATA[
A summary of a discussion which was held in the
[Fandom Coders](https://www.fancoders.com) Discord about I·D
requirements for various types of fannish resources, and how these
things might federate out or be handled by other services.
]]></dc11:abstract>
        <sioc:content rdf:parseType="Markdown"><![CDATA[
The following blogpost is a summary of a discussion which was held in
the [Fandom Coders](https://www.fancoders.com) Discord about I·D
requirements for various types of fannish resources, and how these
things might federate out or be handled by other services. Our goal is
to create a decentralized network of fannish platforms, so figuring out
resource identification requirements is an important first step.

Note that in the discussion which follows, a “resource” might be a
work, an author, a tag, a bookmark, or something else… anything which
might be a metadata subject.

## ① Resources should have Tag U·R·I’s.

Resource [Tag U·R·I](https://taguri.org)’s should be U·R·I’s of the
form :—

```
tag:<domain>,<date>:<path>
```

—: where `<domain>` is the domain name of a site, `<date>` is some
date (in `YYYY-MM-DD` format), and `<path>` is some path decided by the
person or people who owned `<domain>` at `<date>` to uniquely identify
the resource. Tag U·R·I’s are ideal for fannish resources for the
following reasons :—

- In order for a fannish resource to be published on the internet, it
  must be published at a domain on a date. So these requirements are
  easily satisfied.

- No external registration (beyond owning a domain name) is necessary
  to mint U·R·I’s, and no maintenance is necessary.

- The domain name in the Tag U·R·I indicates who should be the trusted
  party when it comes to information about the resource: `<domain>`.
  If you hear about the resource from somewhere else, you know to view
  the information you receive with some level of suspicion.

Some additional notes :—

- The term “Tag U·R·I” has no relation to the normal fannish use of
  “tag”; it’s just what the U·R·I scheme happens to be called.

- `<date>` does *not* have to be (and maybe *should* not be) the actual
  date a resource was created. My recommendation would be to set
  `<date>` to the date that a service was founded, so that e·g if a
  service dies and a new one is started at the same domain, the two
  generate clearly distinguishable U·R·I’s.

- It’s not possible to distinguish between beneficial reasons for
  content changes at `<domain>` (an author editing a work) and
  malicious ones (hostile domain takeover). It’s also not possible to
  verify that the people at `<domain>` actually controlled the domain
  at `<date>`. But if people play by the rules, an *accidental* name
  collision will never happen.

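
As a rough illustration of the form described in this section, here is a
minimal Python sketch for minting and parsing Tag U·R·I’s. The names
`TagURI`, `mint_tag_uri`, and `parse_tag_uri`, and the domain
`archive.example`, are my own assumptions and come from no existing
library. The pattern is deliberately loose, and it only accepts the
full `YYYY-MM-DD` date recommended here, although R·F·C 4151 also
permits the shorter `YYYY` and `YYYY-MM`.

```python
import re
from typing import NamedTuple


class TagURI(NamedTuple):
    """The three parts of a Tag URI of the form tag:<domain>,<date>:<path>."""
    domain: str
    date: str
    path: str

    def __str__(self) -> str:
        return f"tag:{self.domain},{self.date}:{self.path}"


# Loose pattern for the form used in this post (an assumption, not the
# full RFC 4151 grammar): a domain name, a YYYY-MM-DD date, and any path.
_TAG_PATTERN = re.compile(
    r"tag:(?P<domain>[A-Za-z0-9.-]+),(?P<date>\d{4}-\d{2}-\d{2}):(?P<path>.*)"
)


def parse_tag_uri(text: str) -> TagURI:
    """Split a Tag URI into its parts, or raise ValueError."""
    match = _TAG_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a Tag URI of the expected form: {text!r}")
    return TagURI(match["domain"], match["date"], match["path"])


def mint_tag_uri(domain: str, founded: str, path: str) -> TagURI:
    """Mint a Tag URI, using the service's founding date as <date>."""
    return parse_tag_uri(f"tag:{domain},{founded}:{path}")
```

For example, `str(mint_tag_uri("archive.example", "2023-05-09", "works/1234"))`
gives `tag:archive.example,2023-05-09:works/1234`.
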
## ② Resources should have canonical U·R·L’s containing their Tag.

The canonical U·R·L for a resource should look like this :—

```
https://<domain>/<subpath>/tag:<domain>,<date>:<path>
```

There are a few important things of note here :—

1. Both instances of `<domain>` **must be** the same, or else the U·R·L
   is not canonical.

2. The entire Tag U·R·I is present in the U·R·L, allowing it to be
   identified even if the U·R·L ceases to be dereferenceable.

3. `<path>` may contain anything, including a query or fragment part.

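
The three points above can be sketched in Python as follows. This is
only an illustration: the function names, the default subpath `id`, and
the domains in the example are assumptions of mine. The Tag U·R·I is
pulled out of the raw U·R·L string rather than its parsed path, because
`<path>` may itself contain a query or fragment part.

```python
from typing import Optional
from urllib.parse import urlsplit


def canonical_url(domain: str, date: str, path: str, subpath: str = "id") -> str:
    """Build https://<domain>/<subpath>/tag:<domain>,<date>:<path>."""
    tag = f"tag:{domain},{date}:{path}"
    return f"https://{domain}/{subpath.strip('/')}/{tag}"


def embedded_tag(url: str) -> Optional[str]:
    """Return the Tag URI carried at the end of a URL, or None."""
    index = url.find("/tag:")
    if index == -1:
        return None
    # Keep everything after the slash, including any "?" or "#" part.
    return url[index + 1:]


def is_canonical(url: str) -> bool:
    """True when the URL's host equals the domain inside its Tag URI."""
    tag = embedded_tag(url)
    if tag is None:
        return False
    tag_domain = tag[len("tag:"):].split(",", 1)[0]
    host = urlsplit(url).hostname
    return host is not None and host == tag_domain.lower()
```
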
It is possible for resources to be *mirrored*. Mirrors **must** have
U·R·L’s like the following :—

```
https://<mirror-domain>/<mirror-subpath>/tag:<domain>,<date>:<path>
```

—: that is, the same easily‐recognizable Tag U·R·I, but at a different
domain and subpath. Mirrors **must** identify the canonical U·R·L of
the resource they are mirroring. `owl:sameAs` might be one mechanism of
doing this in R·D·F.

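
A mirror could state this relationship with a single triple. The sketch
below is one possible approach and not a settled protocol: it checks
that the mirror and canonical U·R·L’s carry the same Tag U·R·I and then
emits an `owl:sameAs` statement as one line of N‑Triples. The function
names and example domains are hypothetical.

```python
from typing import Optional

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def embedded_tag(url: str) -> Optional[str]:
    """Return the Tag URI carried at the end of a URL, or None."""
    index = url.find("/tag:")
    return url[index + 1:] if index != -1 else None


def mirror_statement(mirror_url: str, canonical_url: str) -> str:
    """Return an N-Triples line linking a mirror to its canonical URL."""
    mirror_tag = embedded_tag(mirror_url)
    if mirror_tag is None or mirror_tag != embedded_tag(canonical_url):
        raise ValueError("mirror and canonical URLs must carry the same Tag URI")
    return f"<{mirror_url}> <{OWL_SAME_AS}> <{canonical_url}> ."


if __name__ == "__main__":
    print(mirror_statement(
        "https://mirror.example/mirrors/tag:archive.example,2023-05-09:works/1234",
        "https://archive.example/id/tag:archive.example,2023-05-09:works/1234",
    ))
```

Refusing to emit the statement when the two Tag U·R·I’s differ keeps a
mirror from claiming to be the same resource as some unrelated work.
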
## ③ Crossposted resources should link to each other.

If a work is crossposted in two locations, one is not necessarily
“canonical” and the other a “mirror”. Likely, both will be canonical
and have their own Tag U·R·I’s (and this is a good thing). Crossposted
works should instead identify themselves by linking to each other in
some reciprocal fashion. We may need to come up with our own metadata
term for specifying this, but see e·g `dcterms:hasFormat` and
`dcterms:isFormatOf` which encode a similar (but not necessarily
reciprocal in the same way) relationship.

## ④ Platforms should only trust mirrors as a last resort.

And with copious warnings. If at all possible, platforms should direct
users to the canonical U·R·L associated with a resource. However, this
may not be possible (if an archive moves or goes down). In that case,
a platform *may* direct users to a mirror, with a warning that the
mirrored version is not the original published work and may differ in
significant ways.

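
The fallback order described in this section might look something like
the following Python sketch. Everything in it is an assumption for the
sake of illustration: the `probe` callback stands in for whatever
network check a platform actually performs, and its three return values
(`"ok"`, `"gone"`, `"unreachable"`) are made up. The `"gone"` case
anticipates the “tombstone” idea from the additional thoughts below; how
a tombstone is actually signalled is left open here.

```python
from typing import Callable, Iterable, NamedTuple, Optional


class Resolution(NamedTuple):
    url: Optional[str]      # where to send the user, if anywhere
    warning: Optional[str]  # text to show alongside the link


MIRROR_WARNING = (
    "This is a mirrored copy; it is not the original published work "
    "and may differ in significant ways."
)


def resolve(
    canonical_url: str,
    mirror_urls: Iterable[str],
    probe: Callable[[str], str],
) -> Resolution:
    """Prefer the canonical URL; fall back to a mirror only as a last resort."""
    status = probe(canonical_url)
    if status == "ok":
        return Resolution(canonical_url, None)
    if status == "gone":
        # Tombstone: intentionally deleted, so mirrors must not be used.
        return Resolution(None, "This resource was intentionally deleted.")
    for mirror in mirror_urls:
        if probe(mirror) == "ok":
            return Resolution(mirror, MIRROR_WARNING)
    return Resolution(None, "No copy of this resource could be reached.")
```

Passing the check in as a callback keeps the ordering rule separate
from any particular H·T·T·P client, which also makes it easy to test.
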
## Additional thoughts.

These things were either only briefly touched on, or else are my own
ideas which came as I was writing this post.

- Mirroring should be explicitly opt·in, and ideally automated (to
  reduce the likelihood of intentional or unintentional error). We will
  need to develop protocols for this.

- For added security, publishing platforms might implement WebFinger,
  to guard against mirrors which correctly identify works they control
  but misidentify their path (thus making them appear to be down).
  Discovery platforms *may*, and probably *should*, attempt to make a
  WebFinger request for the resource with its Tag U·R·I instead of
  trusting the canonical path. However, supporting WebFinger *should
  not* be required of all publishing platforms, and the attack vector
  from mirrors in this sense is pretty small.

- WebFinger or ordinary H·T·T·P redirects could be used to forward
  services to new “canonical” U·R·L’s in the case that a service moves.
  However, this trail would only be followable for as long as the
  redirects or WebFinger endpoint remain up at the original domain.

- Instead of mirroring tags, a service might indicate that its version
  of a tag is intended to be synonymous with another service’s version
  of a tag using `skos:closeMatch`. The stronger statement
  `skos:exactMatch` requires agreement from both services. Tag mirrors
  are useful in case the canonical service for a tag goes down, but
  should not be relied upon otherwise.

- Publishing platforms *may* serve a “tombstone” at the canonical U·R·L
  for a resource, indicating that it was intentionally deleted. In this
  case, a mirrored version **must not** be used.
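
To make the WebFinger suggestion above a little more concrete, here is
a small Python sketch of the request a discovery platform might build.
It assumes the publishing platform serves the standard
`/.well-known/webfinger` endpoint over H·T·T·P·S at the domain named in
the Tag U·R·I, and that it accepts a Tag U·R·I as the `resource`
parameter; whether any platform will do so is not settled by this post.
The function name is hypothetical.

```python
from urllib.parse import urlencode


def webfinger_url(tag_uri: str) -> str:
    """Build a WebFinger query URL for a Tag URI at its own domain."""
    if not tag_uri.startswith("tag:") or "," not in tag_uri:
        raise ValueError(f"not a Tag URI: {tag_uri!r}")
    # The domain is everything between "tag:" and the first comma.
    domain = tag_uri[len("tag:"):].split(",", 1)[0]
    # urlencode percent-encodes the ":", "," and "/" in the Tag URI.
    query = urlencode({"resource": tag_uri})
    return f"https://{domain}/.well-known/webfinger?{query}"
```

The response, if any, could then be compared against the canonical
U·R·L that a mirror claims for the resource.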
]]></sioc:content>
        <dc11:rights rdf:parseType="Markdown"><![CDATA[
Copyright © 2023 Lady <small>[Fannish Metadata Nerd]</small>.
Some rights reserved.

This blogpost is licensed under a <a rel="license"
href="http://creativecommons.org/licenses/by/4.0/"><cite>Creative
Commons Attribution 4.0 International License</cite></a>.
]]></dc11:rights>
</awol:Entry>