<?xml version="1.0" encoding="us-ascii"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article article-type="research-article" xml:lang="EN" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">LIBER</journal-id>
<journal-title-group>
<journal-title>LIBER QUARTERLY</journal-title>
</journal-title-group>
<issn pub-type="epub">2213-056X</issn>
<publisher>
<publisher-name>Uopen Journals</publisher-name>
<publisher-loc>Utrecht, The Netherlands</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">lq.10247</article-id>
<article-id pub-id-type="doi">10.18352/lq.10247</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Supporting FAIR Data Principles with Fedora</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">http://orcid.org/0000-0001-5411-9208</contrib-id>
<name>
<surname>Wilcox</surname>
<given-names>David</given-names>
</name>
<email>dwilcox@duraspace.org</email>
<xref ref-type="aff" rid="aff1"/>
</contrib>
<aff id="aff1">DuraSpace, Beaverton OR, Canada</aff>
</contrib-group>
<pub-date pub-type="epub">
<month>8</month>
<year>2018</year>
</pub-date>
<volume>28</volume>
<fpage>xx</fpage>
<lpage>xx</lpage>
<permissions>
<copyright-statement>Copyright 2018, The copyright of this article remains with the author</copyright-statement>
<copyright-year>2018</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See <uri xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</uri>.</license-p>
</license>
</permissions>
<self-uri xlink:href="https://www.liberquarterly.eu/article/10.18352/lq.10247"/>
<abstract>
<p>Making data findable, accessible, interoperable, and re-usable is an important but challenging goal. From an infrastructure perspective, repository technologies play a key role in supporting FAIR data principles. Fedora is a flexible, extensible, open source repository platform for managing, preserving, and providing access to digital content. Fedora is used in a wide variety of institutions including libraries, museums, archives, and government organizations. Fedora provides native linked data capabilities and a modular architecture based on well-documented APIs and ease of integration with existing applications. As both a project and a community, Fedora has been increasingly focused on research data management, making it well-suited to supporting FAIR data principles as a repository platform.</p>
<p>Fedora provides strong support for persistent identifiers, both by minting HTTP URIs for each resource and by allowing any number of additional identifiers to be associated with resources as RDF properties. Fedora also supports rich metadata in any schema that can be indexed and disseminated using a variety of protocols and services. As a linked data server, Fedora allows resources to be semantically linked both within the repository and on the broader web. Along with these and other features supporting research data management, the Fedora community has been actively participating in related initiatives, most notably the Research Data Alliance. Fedora representatives participate in a number of interest and working groups focused on requirements and interoperability for research data repository platforms. This participation allows the Fedora project to both influence and be influenced by an international group of Research Data Alliance stakeholders.</p>
<p>This paper will describe how Fedora supports FAIR data principles, both in terms of relevant features and community participation in related initiatives.</p>
</abstract>
<kwd-group>
<kwd>fedora</kwd>
<kwd>repository</kwd>
<kwd>fair data</kwd>
<kwd>open source</kwd>
<kwd>linked data</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1. Introduction</title>
<p>Fedora, the Flexible, Extensible, Digital Object Repository Architecture, is open source repository software that stores, preserves, and provides access to digital content.<sup>1</sup> Fedora is built around the notion of flexibility; content can be modeled in a variety of different ways to support both simple and complex use cases. This flexibility is primarily based on linked data. Fedora uses Resource Description Framework (RDF) triples to create semantic links between resources, thereby allowing for data models unrestricted by traditional file and folder hierarchies. Along with this flexibility, Fedora supports millions of resources, both large and small, with configurable storage capabilities. But perhaps most importantly, Fedora is interoperable; it has been designed around a robust REST-API and an event-based messaging service that establish well-documented patterns for integrating Fedora with other applications and services to build a larger system (<xref ref-type="bibr" rid="r9">Technical Specifications, 2018</xref>).</p>
<p>Institutions adopt Fedora for a variety of reasons; as a flexible system, Fedora can satisfy a number of use cases and requirements. Most institutions, however, turn to Fedora for its flexibility: while local use cases may start out relatively simple, they inevitably grow more complex over time. Fedora supports this natural growth by accommodating more complex needs as they arise. Just as importantly, Fedora is designed with durability in mind. Digital preservation is a complex topic, and Fedora does not seek to be an all-in-one digital preservation system, but it provides a number of features and integration patterns that support an overall digital preservation strategy (<xref ref-type="bibr" rid="r3">Duraspace, 2018</xref>). Fedora has also been successful; it is not enough for a project to be open source, it must also be sustainable and well-adopted. Fedora has been deployed in over 400 institutions around the world, which demonstrates its stability and success (<xref ref-type="bibr" rid="r11">Wilcox, 2018</xref>). Fedora also focuses on standards; as an API-driven application, Fedora implements a set of modern, well-adopted web standards to provide its services. These standards help ensure that data don&#x2019;t become trapped in a Fedora repository with application-specific customizations, while also making it easier to integrate with other applications and services to share data. Finally, Fedora is backed by a thriving, global community that provides distributed support and control.</p>
<p>The FAIR Data principles (<xref ref-type="bibr" rid="r5">Force11, n.d.</xref>), introduced by Force11 in 2014 and first published in 2016, provide a set of guidelines for making data Findable, Accessible, Interoperable, and Reusable. Each principle has an associated list of criteria which can be aligned with the relevant Fedora features in order to demonstrate how Fedora can effectively support the FAIR Data principles.</p>
</sec>
<sec id="s2">
<title>2. FAIR Data Principles</title>
<p>This section will look at each FAIR Data principle and its criteria in turn and describe how Fedora meets these criteria to support each principle.</p>
<sec id="s2a">
<title>2.1. Findable</title>
<p>Findability is defined by the following criteria:</p>
<list list-type="order">
<list-item><p>(meta)data are assigned a globally unique and eternally persistent identifier.</p></list-item>
<list-item><p>data are described with rich metadata.</p></list-item>
<list-item><p>(meta)data are registered or indexed in a searchable resource.</p></list-item>
<list-item><p>metadata specify the data identifier.</p></list-item>
</list>
<p>As a resource-centric repository, Fedora assigns each resource, whether it be a metadata record or a file, a Uniform Resource Identifier (URI) that serves as a persistent identifier for that resource. Additional persistent identifiers may also be used; for example, a public-facing DOI could be registered and mapped to a Fedora resource, and that DOI could be stored as a metadata property on the resource in Fedora. This also means that metadata resources can be linked to the data they describe by storing and linking to the data identifier. Strong support for metadata is another key Fedora feature; any type of metadata based on any schema (including custom fields) may be used. This flexibility allows Fedora to be used across research domains. Fedora also provides strong support for indexing: metadata and data (in the case of text-based resources) can be indexed in any number of external indices; common choices include Solr, Elasticsearch, and triplestores.</p>
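<p>As a minimal sketch of the identifier pattern described above, the following Python function builds a JSON-LD description that records a DOI on a repository resource as an RDF property. The resource URI, the DOI, and the function name are hypothetical placeholders, and the use of the Dublin Core <italic>identifier</italic> property is an assumption made for illustration; Fedora does not prescribe a particular property for this purpose.</p>
<preformat preformat-type="code">
```python
import json

# Dublin Core "identifier" property (an illustrative choice, not a Fedora requirement).
DCTERMS_IDENTIFIER = "http://purl.org/dc/terms/identifier"


def describe_with_doi(resource_uri, doi):
    """Return a JSON-LD description that records a DOI on a resource.

    resource_uri: the HTTP URI the repository minted for the resource.
    doi: the bare DOI string (hypothetical in this sketch).
    """
    return {
        "@id": resource_uri,
        # Store the DOI in its resolvable form as a literal value.
        DCTERMS_IDENTIFIER: [{"@value": "https://doi.org/" + doi}],
    }


# Hypothetical resource URI and DOI, for illustration only.
description = describe_with_doi(
    "http://localhost:8080/rest/datasets/survey-2018",
    "10.1234/example",
)
print(json.dumps(description, indent=2))
```
</preformat>
<p>The resulting document could be sent to a repository that accepts JSON-LD, or translated into another RDF serialization; the point of the sketch is only that the repository URI and the external identifier sit side by side in the same description.</p>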
</sec>
<sec id="s2b">
<title>2.2. Accessible</title>
<p>Accessibility is defined by the following criteria:</p>
<list list-type="order">
<list-item><p>(meta)data are retrievable by their identifier using a standard protocol.</p>
<p> a. the protocol is open, free, and universally implementable.</p>
<p> b. the protocol allows for authentication and authorization.</p>
</list-item>
<list-item><p>metadata are accessible, even when the data are no longer available.</p></list-item>
</list>
<p>Fedora provides a well-documented REST-API (<xref ref-type="bibr" rid="r12">Woods, 2018</xref>). This API serves as an open protocol for accessing resources in the repository using their identifiers. Any REST-based client can request repository resources, and standard authentication can be applied to such requests to prevent access by users or machines without the proper credentials. Once a user has been authenticated, Fedora uses the World Wide Web Consortium (W3C) Web Access Control standard to enforce authorization (<xref ref-type="bibr" rid="r10">WebAccessControl, 2018</xref>). This ensures that authenticated users receive only the appropriate level of access based on their credentials. The same authorization scheme can be used to control access to data and metadata separately; for example, a repository administrator could lock down access to data while still providing access to the related metadata.</p>
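<p>To illustrate retrieval by identifier over a standard protocol, the following Python sketch builds, but does not send, an authenticated HTTP GET request for a resource description using only the standard library. The resource URI, the credentials, and the function name are hypothetical, and the sketch assumes a server that accepts HTTP Basic authentication and can return RDF as JSON-LD; a given deployment may be configured differently.</p>
<preformat preformat-type="code">
```python
import base64
import urllib.request


def build_metadata_request(resource_uri, username, password):
    """Build (but do not send) an authenticated GET for a resource description."""
    credentials = "{}:{}".format(username, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        resource_uri,
        headers={
            # Ask for the RDF description as JSON-LD (assumed to be offered).
            "Accept": "application/ld+json",
            # HTTP Basic authentication (assumed to be enabled on the server).
            "Authorization": "Basic " + token,
        },
        method="GET",
    )


# Hypothetical resource URI and credentials, for illustration only.
request = build_metadata_request(
    "http://localhost:8080/rest/datasets/survey-2018", "alice", "secret"
)
print(request.get_method(), request.full_url)
# Sending the request would be: urllib.request.urlopen(request)
```
</preformat>
<p>Because the request is plain HTTP, any client library in any language could issue the same call; whether the response is granted or refused is then decided by the repository&#x2019;s authorization rules.</p>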
</sec>
<sec id="s2c">
<title>2.3. Interoperable</title>
<p>Interoperability is defined by the following criteria:</p>
<list list-type="order">
<list-item><p>(meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.</p></list-item>
<list-item><p>(meta)data use vocabularies that follow FAIR principles.</p></list-item>
<list-item><p>(meta)data include qualified references to other (meta)data.</p></list-item>
</list>
<p>In terms of metadata, Fedora has strong multilingual support. Any language may be used, and metadata can be stored in whatever format is most relevant (e.g. RDF, XML). Additionally, Fedora&#x2019;s native linked data support allows vocabularies to be used to enhance knowledge representation by referencing well-known terms rather than using custom values. For example, a subject field could reference a term in the Library of Congress Subject Headings vocabulary (<xref ref-type="bibr" rid="r6">Library of Congress, 2011</xref>) and store the URI for that term. A user can then follow the URI to gather more information on the subject. This concept of following links to other resources is a fundamental principle of linked data and also supports the FAIR data notion of interoperability.</p>
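<p>The difference between a custom value and a vocabulary reference can be sketched in a few lines of Python. The record below is a hypothetical JSON-LD description; the subject identifier is a placeholder rather than a real Library of Congress term, and the helper function is an illustration, not part of Fedora.</p>
<preformat preformat-type="code">
```python
DCTERMS_SUBJECT = "http://purl.org/dc/terms/subject"
DCTERMS_TITLE = "http://purl.org/dc/terms/title"


def linked_uris(description):
    """Collect the URIs that a JSON-LD description points to."""
    uris = []
    for prop, values in description.items():
        if prop.startswith("@"):
            continue  # skip JSON-LD keywords such as @id
        for value in values:
            # A dict with "@id" is a link; a dict with "@value" is a literal.
            if isinstance(value, dict) and "@id" in value:
                uris.append(value["@id"])
    return uris


# Hypothetical record: the title is a literal, the subject is a link.
record = {
    "@id": "http://localhost:8080/rest/datasets/survey-2018",
    DCTERMS_TITLE: [{"@value": "Survey responses"}],
    DCTERMS_SUBJECT: [
        # Placeholder identifier, not a real subject heading.
        {"@id": "http://id.loc.gov/authorities/subjects/sh00000000"}
    ],
}
print(linked_uris(record))
```
</preformat>
<p>Only the subject appears in the result, because it is the only value a client can dereference; the title is a plain string that carries no link to follow.</p>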
</sec>
<sec id="s2d">
<title>2.4. Reusable</title>
<p>Reusability is defined by the following criteria:</p>
<list list-type="order">
<list-item><p>(meta)data have a plurality of accurate and relevant attributes.</p>
<p> a. (meta)data are released with a clear and accessible usage license.</p>
<p> b. (meta)data are associated with their provenance.</p>
<p> c. (meta)data meet domain-relevant community standards.</p>
</list-item>
</list>
<p>Fedora&#x2019;s rich metadata support allows for a wide variety of attributes, and RDF can be used to link resources to their licenses. These licenses could be stored as resources in the repository, or they could be external licenses such as those provided by Creative Commons.<sup>2</sup> Fedora also has an optional Audit module that can be enabled to track the provenance of resources in the repository (including metadata). Once enabled, the module will create PREMIS metadata (<xref ref-type="bibr" rid="r7">PREMIS, 2018</xref>) associated with events in the repository (e.g. when something is added, changed, or deleted) and this PREMIS metadata can be stored and queried for provenance reporting. Finally, another optional application can be used with Fedora to provide additional functionality on top of the standard set of Fedora services. This application, the API Extension Framework, can be used to build and share modules to do things such as metadata validation to ensure compliance with relevant community standards (<xref ref-type="bibr" rid="r1">API Extension, 2018</xref>).</p>
</sec>
</sec>
<sec id="s3">
<title>3. Related Community Initiatives</title>
<p>Fedora is more than software; it is also a community. The Fedora community participates in a variety of international efforts aimed at making progress on key project priorities. One such effort is the Next Generation Repositories report that was published by the Confederation of Open Access Repositories in 2017 (<xref ref-type="bibr" rid="r2">Confederation of Open Access Repositories, 2017</xref>). This report is based on the efforts of an international working group which included participation from the Fedora Product Manager. The report recommends a number of behaviours and supporting technologies that the next generation of repositories should implement, and these recommendations are very much in line with the FAIR data principles.</p>
<p>The Fedora community is also involved in the Research Data Alliance,<sup>3</sup> an international group focused on enabling research data sharing across borders and around the world. This group is obviously well-aligned with the FAIR data principles, and there are a number of interest and working groups within the RDA that are making progress toward these shared goals. One such group, the Research Data Repository Interoperability working group, recently published recommendations on a data packaging standard for increased interoperability between repository platforms on a machine-to-machine level (<xref ref-type="bibr" rid="r8">RDA Research Data, 2018</xref>). This group was co-chaired by the Fedora Product Manager, and the recommendations are consistent with both the FAIR data principles and ongoing work in the Fedora community to support standardized data import and export.</p>
</sec>
<sec id="s4">
<title>4. Supporting and Sustaining Fedora</title>
<p>Fedora is stewarded by DuraSpace,<sup>4</sup> a not-for-profit organization funded primarily through membership. Institutions join DuraSpace and direct annual funding to support the project(s) of their choice. In 2017, 74 DuraSpace member institutions supported Fedora with $562,300 in funding (<xref ref-type="bibr" rid="r4">Fedora Users, 2018</xref>). This funding pays for two full-time equivalent (FTE) staff members, as well as travel for conferences, workshops, and user groups, marketing and communication, and other priorities as determined by the project governance group. Fedora is designed, built, and maintained by the community; DuraSpace provides support, but the majority of the development is done by members of the community.</p>
</sec>
<sec id="s5">
<title>5. Conclusion</title>
<p>The FAIR data principles represent an important community goal of making data Findable, Accessible, Interoperable, and Reusable. However, in order to put these principles into practice they must be broken down into criteria that can be supported by infrastructure. For each principle, Fedora has a set of features that satisfy these criteria and support the overall implementation of the FAIR Data principles. This can be demonstrated not only at the level of the software, but also in the Fedora community&#x2019;s participation in related community efforts that further the goals of the FAIR data principles at a more strategic level. As community-supported, open source software, Fedora will continue to evolve to meet the needs of the research data management community as data are made more Findable, Accessible, Interoperable, and Reusable.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="r1"><mixed-citation>API Extension. (2018). <italic>Fedora API extension framework</italic>. Retrieved from GitHub. <ext-link ext-link-type="uri" xlink:href="https://github.com/fcrepo4-labs/fcrepo-api-x">https://github.com/fcrepo4-labs/fcrepo-api-x</ext-link>.</mixed-citation></ref>
<ref id="r2"><mixed-citation>Confederation of Open Access Repositories. (2017). <italic>Next generation repositories.</italic> Retrieved from <ext-link ext-link-type="uri" xlink:href="https://www.coar-repositories.org/files/NGR-Final-Formatted-Report-cc.pdf">https://www.coar-repositories.org/files/NGR-Final-Formatted-Report-cc.pdf</ext-link>.</mixed-citation></ref>
<ref id="r3"><mixed-citation>Duraspace. (2018). <italic>Fedora and digital preservation.</italic> Retrieved from <ext-link ext-link-type="uri" xlink:href="https://duraspace.org/fedora/resources/publications/fedora-digital-preservation/">https://duraspace.org/fedora/resources/publications/fedora-digital-preservation/</ext-link>.</mixed-citation></ref>
<ref id="r4"><mixed-citation>Fedora Users. (2018). Retrieved from <ext-link ext-link-type="uri" xlink:href="https://duraspace.org/fedora/community/fedora-users/">https://duraspace.org/fedora/community/fedora-users/</ext-link>.</mixed-citation></ref>
<ref id="r5"><mixed-citation>Force11. (n.d.). <italic>The FAIR data principles</italic>. Retrieved from <ext-link ext-link-type="uri" xlink:href="https://www.force11.org/group/fairgroup/fairprinciples">https://www.force11.org/group/fairgroup/fairprinciples</ext-link>.</mixed-citation></ref>
<ref id="r6"><mixed-citation>Library of Congress. (2011). <italic>Library of Congress subject headings</italic>. Retrieved from <ext-link ext-link-type="uri" xlink:href="http://id.loc.gov/authorities/subjects.html">http://id.loc.gov/authorities/subjects.html</ext-link>.</mixed-citation></ref>
<ref id="r7"><mixed-citation>PREMIS. (2018). <italic>Preservation metadata maintenance activity</italic>. Retrieved from Library of Congress website: <ext-link ext-link-type="uri" xlink:href="https://www.loc.gov/standards/premis/">https://www.loc.gov/standards/premis/</ext-link>.</mixed-citation></ref>
<ref id="r8"><mixed-citation>RDA Research Data Repository Interoperability Working Group. (2018). <italic>Research Data Repository Interoperability WG final recommendations</italic>. <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.15497/RDA00025">http://dx.doi.org/10.15497/RDA00025</ext-link>.</mixed-citation></ref>
<ref id="r9"><mixed-citation>Technical Specifications. (2018). Retrieved from Fedora website <ext-link ext-link-type="uri" xlink:href="https://duraspace.org/fedora/resources/technical-specifications/">https://duraspace.org/fedora/resources/technical-specifications/</ext-link>.</mixed-citation></ref>
<ref id="r10"><mixed-citation>WebAccessControl. (2018). Retrieved from W3C wiki. <ext-link ext-link-type="uri" xlink:href="https://www.w3.org/wiki/WebAccessControl">https://www.w3.org/wiki/WebAccessControl</ext-link>.</mixed-citation></ref>
<ref id="r11"><mixed-citation>Wilcox, D. (2018). <italic>2017 Fedora annual report.</italic> Retrieved from <ext-link ext-link-type="uri" xlink:href="https://wiki.duraspace.org/display/FF/Annual&#x002B;Reports">https://wiki.duraspace.org/display/FF/Annual&#x002B;Reports</ext-link>.</mixed-citation></ref>
<ref id="r12"><mixed-citation>Woods, A. (2018). <italic>RESTful HTTP API.</italic> Retrieved from Fedora 4.7.5 documentation: <ext-link ext-link-type="uri" xlink:href="https://wiki.duraspace.org/display/FEDORA475/RESTful&#x002B;HTTP&#x002B;API">https://wiki.duraspace.org/display/FEDORA475/RESTful&#x002B;HTTP&#x002B;API</ext-link>.</mixed-citation></ref>
</ref-list>
<fn-group>
<fn id="fn1"><p><ext-link ext-link-type="uri" xlink:href="https://duraspace.org/fedora/">https://duraspace.org/fedora/</ext-link>.</p></fn>
<fn id="fn2"><p><ext-link ext-link-type="uri" xlink:href="https://creativecommons.org">https://creativecommons.org</ext-link>.</p></fn>
<fn id="fn3"><p><ext-link ext-link-type="uri" xlink:href="https://www.rd-alliance.org">https://www.rd-alliance.org</ext-link>.</p></fn>
<fn id="fn4"><p><ext-link ext-link-type="uri" xlink:href="https://duraspace.org">https://duraspace.org</ext-link>.</p></fn>
</fn-group>
</back>
</article>