RDTF metadata that is exposed using the Linked Data approach must be made available under an open licence as RDF datasets.
The RDF datasets must be made available using simple HTTP GET requests as one or more RDF dumps (e.g. files containing RDF/XML, N-Triples, N-Quads or RDF/JSON).
The location of all RDF dumps must be disclosed in accordance with the Semantic Web Crawling Sitemap Extension.
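As an illustration of the disclosure requirement, the following minimal sketch builds a sitemap that advertises one dataset and its RDF dumps. The `sc:` namespace URI and element names are given as recalled from the Semantic Web Crawling Sitemap Extension and should be checked against the specification; the function name `build_sitemap` and all `example.org` URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumption: namespace of the Semantic Web Crawling Sitemap Extension,
# as recalled; verify against the published specification.
SC_NS = "http://sw.deri.org/2007/07/sitemapextension/scschema.xsd"


def build_sitemap(label, dataset_uri, dump_urls):
    """Return sitemap XML (as a string) listing one dataset and its RDF dumps."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("sc", SC_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    dataset = ET.SubElement(urlset, f"{{{SC_NS}}}dataset")
    ET.SubElement(dataset, f"{{{SC_NS}}}datasetLabel").text = label
    ET.SubElement(dataset, f"{{{SC_NS}}}datasetURI").text = dataset_uri
    # One dataDumpLocation element per RDF dump file.
    for url in dump_urls:
        ET.SubElement(dataset, f"{{{SC_NS}}}dataDumpLocation").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap(
        "Example collection",                      # hypothetical label
        "http://example.org/void#collection",      # hypothetical dataset URI
        ["http://example.org/dumps/collection.nt"],
    ))
```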
Metadata about multiple collections may be made available. If so, the dataset corresponding to each collection should be made available using separate RDF dumps.
The dataset in each RDF dump must include links to other (external) RDF datasets, for example those describing people, organisations, topics or places.
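One rough way to check this requirement is to scan an N-Triples dump for statements whose object is a URI outside the provider's own namespace. The sketch below is only an approximation: it uses a regular expression rather than a full N-Triples parser, ignores literals and blank nodes, and skips `rdf:type` statements on the assumption that class URIs are vocabulary references rather than links to other datasets. The sample data, the `example.org` namespace and the function name `external_links` are hypothetical.

```python
import re

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Matches statements whose subject, predicate and object are all URIs.
# Literals and blank nodes are deliberately not handled in this sketch.
_URI_TRIPLE = re.compile(r"^\s*<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def external_links(ntriples_text, local_prefix):
    """Return (subject, predicate, object) tuples that point outside local_prefix."""
    links = []
    for line in ntriples_text.splitlines():
        match = _URI_TRIPLE.match(line)
        if not match:
            continue
        subject, predicate, obj = match.groups()
        if predicate == RDF_TYPE:
            continue  # class URIs are not counted as dataset links here
        if not obj.startswith(local_prefix):
            links.append((subject, predicate, obj))
    return links


# Hypothetical sample dump: one local person linked to an external description.
SAMPLE = """\
<http://example.org/id/person/1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://example.org/id/person/1> <http://xmlns.com/foaf/0.1/name> "Charles Dickens" .
<http://example.org/id/person/1> <http://www.w3.org/2002/07/owl#sameAs> <http://dbpedia.org/resource/Charles_Dickens> .
<http://example.org/id/book/7> <http://purl.org/dc/terms/creator> <http://example.org/id/person/1> .
"""

if __name__ == "__main__":
    for triple in external_links(SAMPLE, "http://example.org/"):
        print(triple)
```

A dump for which this returns an empty list would, on this reading, fall under the RDF data approach rather than the Linked Data approach.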
The dataset in each RDF dump must be described using the Vocabulary of Interlinked Datasets (VoID). VoID files must be made available over HTTP and must be listed in the sitemap(s) above.
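To give a sense of what a VoID description involves, the following sketch emits a small Turtle document describing one dataset, its licence, its dumps and an example resource. It uses only `void:Dataset`, `void:dataDump`, `void:exampleResource` and two Dublin Core terms; a real description would probably also describe linksets to external datasets. The function name `void_description` and any URIs passed to it are hypothetical, and the string formatting here is not a substitute for an RDF library.

```python
def void_description(dataset_uri, title, licence_uri, dump_urls, example_uri):
    """Return a minimal VoID description of one dataset as Turtle text."""
    # Escape backslashes and double quotes so the title is a valid Turtle string.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{dataset_uri}> a void:Dataset ;",
        f'    dcterms:title "{safe_title}" ;',
        f"    dcterms:license <{licence_uri}> ;",
    ]
    # One void:dataDump statement per RDF dump file.
    lines += [f"    void:dataDump <{url}> ;" for url in dump_urls]
    lines.append(f"    void:exampleResource <{example_uri}> .")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(void_description(
        "http://example.org/void#collection",                  # hypothetical
        "Example collection",
        "http://creativecommons.org/publicdomain/zero/1.0/",   # one possible open licence
        ["http://example.org/dumps/collection.nt"],
        "http://example.org/id/book/7",
    ))
```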
All significant resources associated with the collection of interest must be assigned a unique ‘http’ URI.
All URIs must dereference to a human-readable HTML description and an RDF description (e.g. using RDF/XML, N-Triples, RDF/JSON or RDFa [40]) of the resource, either by using one of the patterns described in Cool URIs for the Semantic Web [41] or by combining the HTML and RDF descriptions using embedded RDFa.
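One of the patterns in Cool URIs for the Semantic Web is the 303 redirect: a GET on the resource URI is redirected to either an HTML document or an RDF document, depending on the Accept header. The sketch below shows only the decision logic, not a web server. The `/id/`, `/doc/` and `/data/` path layout, the list of RDF media types and the function names are assumptions made for illustration, and the Accept parsing is simplified (it honours q-values but not media-type specificity).

```python
RDF_TYPES = ("application/rdf+xml", "text/turtle", "application/n-triples")
HTML_TYPES = ("text/html", "application/xhtml+xml", "text/*", "*/*")


def parse_accept(header):
    """Return the media types in an Accept header, most preferred first."""
    prefs = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # Sort by descending q-value, then by order of appearance.
            prefs.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(prefs)]


def redirect_for(resource_path, accept_header):
    """Return (status, location) for a GET on a resource URI such as /id/alice."""
    if not resource_path.startswith("/id/"):
        raise ValueError("expected a resource path under /id/")
    local_name = resource_path[len("/id/"):]
    for media_type in parse_accept(accept_header):
        if media_type in RDF_TYPES:
            return 303, f"/data/{local_name}"
        if media_type in HTML_TYPES:
            return 303, f"/doc/{local_name}"
    # Nothing recognised: fall back to the human-readable page.
    return 303, f"/doc/{local_name}"


if __name__ == "__main__":
    print(redirect_for("/id/alice", "application/rdf+xml"))
    print(redirect_for("/id/alice", "text/html,application/xml;q=0.9,*/*;q=0.8"))
```

Falling back to HTML when nothing matches is a design choice in this sketch; a stricter server might return 406 Not Acceptable instead.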
For libraries, the use of RDF data modeled according to FRBR and including links to other RDF sources (such as people, organisations, topics and places) is acceptable. The JISC OpenBib project [42] provides an example of this.
For museums, the use of RDF data modeled according to the CIDOC CRM and including links to other RDF sources (such as people, organisations, topics and places) is acceptable. The CLAROS project [43] provides an example of this.
For archives, the use of RDF data modeled according to EAD and including links to other RDF sources (such as people, organisations, topics and places) is acceptable. The JISC LOCAH project [44] provides an example of this.
- it will be possible to cite and bookmark resources of interest using their URIs, knowing that dereferencing the URI will offer a description or a representation of the resource;
- browsing the resulting ‘web of data’ will encourage the discovery of new resources in other collections.
- combining metadata from different providers should become easier (depending on the consistency of FRBR, CIDOC CRM and EAD adoption in RDF), building on shared URIs for external resources (people, organisations, topics, places, etc.);
- As a provider:
- metadata will offer more value because it is fully integrated into a web of data;
- greater (and more fine-grained) control over the metadata associated with resources means that access can be optimised (load balancing, caching, etc.).
- modelling resources (e.g. using FRBR, CIDOC CRM or EAD) and constructing RDF metadata may be time consuming;
- assigning URIs to resources may be non-trivial;
- discovering appropriate external collections and creating links to the resources they contain may be time consuming;
- investment will be required to configure web servers to serve descriptions of resources at their URIs.
Comment posted on behalf of Markus Enders, “I think it is difficult for anybody who is not involved in the LinkedData development to get a good overview over the various datasets and vocabularies. If the guidelines could provide a list, this would be helpful. But I wouldn’t consider this list to be a recommendation.”
FRBRised records for libraries and museums may be setting the bar rather high, at this time at least. And the FRBR approach doesn’t represent a wholly coherent and consistent approach to structuring metadata records (i.e., it is possible to represent some relationships both horizontally and vertically).
In terms of: a) testing; b) providing useful usage examples, would it also make sense to *require* that publishers give an example showing:
1) how to call their API;
2) how their data can be enriched with data from a third party via the LD approach;
3) how third-party data can be enriched with data from the first party/publisher via the LD approach?
Assuming we all use the same URI, or that appropriate relationships are established between URIs representing the same object.
I think it would be worth mentioning cross-domain links here, and elsewhere. “Other RDF sources” include (for museums) books and other library resources, and some archive resources; (for libraries) museum objects and archive resources; (for archives) library resources.
This doesn’t use the must/should terminology. The implication is that RDF data modeled on anything other than FRBR is unacceptable, which is unacceptable. ISBD needs to be included. And there will eventually be MODS/MADS in RDF …
+1. much more flexibility should be allowed, even though this part is just about examples. In fact I really don’t see why the paragraph here should be different from the corresponding paragraph in the “RDF data” section!
Is there a reason why this is stated as “must”? What if the data itself does not require those links? Or the institution isn’t interested in using them (e.g. when using simple Dublin Core)?
I think this requirement as a ‘must’ is justifiable in the guidelines as they currently stand. All it means is that if you don’t do this you are adopting the ‘RDF data approach’, not the ‘Linked Data’ approach, which would seem fair if you don’t incorporate links to external resources. (Although I can see the terminology creating objections: after all, with the RDF approach you are still making your data ‘linkable’ even if you aren’t linking out.)
I suggest adding the EDM as an example spanning the three communities here.
Question from the authors: Do we want to recommend particular RDF datasets/vocabularies as the target for external links?
I think examples rather than anything stronger? DBpedia is the obvious leading contender, and id.loc.gov for library data, but all this is bound to evolve and change, not least if institutions start following these guidelines!
Obvious, but might need stating: Must include links to RDF datasets that are intrinsic to the model/ontology/format being used; e.g. RDA and ISBD vocabularies for content and carrier types.
It’s still work-in-progress, but if you’re searching for examples in the library domain (may work as well in a wider LAM context!) there’s
http://www.w3.org/2005/Incubator/lld/wiki/Vocabularies#Reference_value_vocabularies
I can certainly see a role for indicating preferred vocabularies for example. Preferred datasets in some cases, although cautiously perhaps….
Question from the authors: Are we happy allowing both RDF/XML and RDFa to be served from resource URIs?
Yes, there’s really nothing against RDF served by both channels I think. Of course they should be aligned, ideally.
Question from the authors: Do we want to be more prescriptive here about how URIs should be dereferenced?
Probably not. There is wide variation in what is returned from dereferencing; is there an agreed standard or recommendation for good practice?