<?xml version="1.0" encoding="us-ascii"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article article-type="research-article" xml:lang="EN" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">LIBER</journal-id>
<journal-title-group>
<journal-title>LIBER QUARTERLY</journal-title>
</journal-title-group>
<issn pub-type="epub">2213-056X</issn>
<publisher>
<publisher-name>Uopen Journals</publisher-name>
<publisher-loc>Utrecht, The Netherlands</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">lq.10138</article-id>
<article-id pub-id-type="doi">10.18352/lq.10138</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Developing Infrastructure to Support Closer Collaboration of Aggregators with Open Repositories</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Pontika</surname>
<given-names>Nancy</given-names>
</name>
<email>nancy.pontika@open.ac.uk</email>
<xref ref-type="aff" rid="aff1"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Knoth</surname>
<given-names>Petr</given-names>
</name>
<email>petr.knoth@open.ac.uk</email>
<xref ref-type="aff" rid="aff1"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Cancellieri</surname>
<given-names>Matteo</given-names>
</name>
<email>matteo.cancellieri@open.ac.uk</email>
<xref ref-type="aff" rid="aff1"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Pearce</surname>
<given-names>Samuel</given-names>
</name>
<email>samuel.pearce@open.ac.uk</email>
<xref ref-type="aff" rid="aff1"/>
</contrib>
<aff id="aff1">The Open University, UK</aff>
</contrib-group>
<pub-date pub-type="epub">
<month>2</month>
<year>2016</year>
</pub-date>
<volume>25</volume>
<issue>4</issue>
<fpage>172</fpage>
<lpage>188</lpage>
<permissions>
<copyright-statement>Copyright 2016, The copyright of this article remains with the author</copyright-statement>
<copyright-year>2016</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See <uri xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</uri>.</license-p>
</license>
</permissions>
<self-uri xlink:href="https://www.liberquarterly.eu/article/10.18352/lq.10138"/>
<abstract>
<p>The amount of open access content stored in repositories has increased dramatically, which has created new technical and organisational challenges for bringing this content together. The COnnecting REpositories (CORE) project has been dealing with these challenges by aggregating and enriching content from hundreds of open access repositories, increasing the discoverability and reusability of millions of open access manuscripts. As repository managers and library directors often wish to know the details of the content harvested from their repositories and keep a certain level of control over it, CORE is now facing the challenge of how to enable content providers to manage their content in the aggregation and control the harvesting process. In order to improve the quality and transparency of the aggregation process and create a two-way collaboration between the CORE project and the content providers, we propose the CORE Dashboard.</p>
</abstract>
<kwd-group>
<kwd>open access</kwd>
<kwd>repositories</kwd>
<kwd>harvesting</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1. Introduction</title>
<p>Over the past five years the amount of open access content has increased dramatically (<xref ref-type="bibr" rid="r2">Gargouri, Larivi&#x00E8;re, Gingras, Carr, &#x0026; Harnad, 2012</xref>; <xref ref-type="bibr" rid="r7">Laakso &#x0026; Bj&#x00F6;rk, 2012</xref>; <xref ref-type="bibr" rid="r8">Morrison, 2015</xref>). According to the Registry of Open Access Repository Mandates and Policies<xref ref-type="fn" rid="fn1">1</xref> (ROARMAP), there are currently 79 funder, 54 organisational, 520 institutional and 72 departmental open access mandates. These mandates require research manuscripts to be openly accessible and call for a shift in scholars&#x2019; publishing behaviour towards open access. As a result, a high volume of scientific publications is being self-archived in institutional and subject repositories. Even though an increasing number of manuscripts can be accessed on the web for free, there are still technical challenges in automatically bringing together full-text open access content from different systems and reusing it (<xref ref-type="bibr" rid="r6">Knoth, Anastasiou, &#x0026; Pearce, 2014</xref>).</p>
<p>For the past five years the CORE project has been harvesting research manuscripts from open institutional and subject repositories, and open access journals. CORE&#x2019;s mission is not only to increase the visibility of the open access research manuscripts, but also to enable all research stakeholders to discover, access and reuse this open access content by providing three levels of access:</p>
<list list-type="order">
<list-item><p>Programmable data access</p></list-item>
<list-item><p>Transaction information access</p></list-item>
<list-item><p>Analytical information access (<xref ref-type="bibr" rid="r5">Knoth &#x0026; Zdrahal, 2012</xref>).</p></list-item>
</list>
<p>Programmable data access provides access to raw data through the CORE API.<xref ref-type="fn" rid="fn2">2</xref> So far we have 135 registered API users, who can access CORE&#x2019;s content and the Data Dumps<xref ref-type="fn" rid="fn3">3</xref> that permit text-mining of these manuscripts. This level of access is intended primarily for researchers, developers and companies. The second level, transaction information access, is implemented through a set of services: the CORE portal,<xref ref-type="fn" rid="fn4">4</xref> where users can search and retrieve manuscripts from CORE; the Mobile application, which gives users more flexibility in searching the CORE content; and the Plug-in,<xref ref-type="fn" rid="fn5">5</xref> a tool that, when integrated with repositories, recommends research papers hosted in the CORE collection. The CORE portal receives high traffic every year; in 2015 we had 70,465 new visitors, while 8,513 returned to our page. In addition, throughout this time, 14,704,530 full-text documents were downloaded from CORE. This level of access serves mainly researchers, students and life-long learners. For the third level, analytical information access, CORE has newly implemented the Repositories Dashboard,<xref ref-type="fn" rid="fn6">6</xref> which is presented in this article. The target group of this application is primarily the repositories that act as CORE&#x2019;s data providers and their repository managers.</p>
<p>Other products, such as Google Scholar<xref ref-type="fn" rid="fn7">7</xref> or CiteSeerX,<xref ref-type="fn" rid="fn8">8</xref> currently offer services similar to CORE, but there are some major differences between them. First, none of them was designed to aggregate repository systems. These products crawl and index research papers located anywhere on the web, providing access to them either directly through their own system, like CiteSeerX, or by linking to the original source, like Google Scholar. Once the content is aggregated by these systems, the originator has no control over this content and the aggregation system is not accountable to the original repository. CORE aims to strike a balance between the need for aggregating, promoting and exploiting the repository content and the need of the repository owners to have control over their content. Another popular project that relates to repository harvesting is the European-funded project Open Access Infrastructure for Research in Europe (OpenAIRE).<xref ref-type="fn" rid="fn9">9</xref> OpenAIRE works with the full-text content of the articles but does not store it, whereas CORE caches the full-text file. In this respect, CORE is most similar to PubMed,<xref ref-type="fn" rid="fn10">10</xref> a free search engine for medical literature, since it collects and disseminates papers from many content providers, both publishers and repositories, but serves the needs of the providers of the open access content instead.</p>
</sec>
<sec id="s2">
<title>2. The Harvesting Process</title>
<p>In order to collect the world&#x2019;s resources, CORE implements a harvesting technique with which it aggregates the open access content via the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).<xref ref-type="fn" rid="fn11">11</xref> The OAI-PMH is one of the most widely used standards (<xref ref-type="bibr" rid="r4">Horwood, Sullivan, Young, &#x0026; Garner, 2004</xref>) in content collection and the vast majority of repositories support it.<xref ref-type="fn" rid="fn12">12</xref> At CORE, the harvesting process is divided into eight different but interdependent tasks.</p>
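<p>To make the protocol concrete, an OAI-PMH harvest typically begins with a <monospace>ListRecords</monospace> request for Dublin Core metadata (<monospace>metadataPrefix=oai_dc</monospace>) and continues with follow-up requests that carry only a <monospace>resumptionToken</monospace>. The Python sketch below only builds such request URLs. The function name and the repository base URL are hypothetical and are not taken from CORE&#x2019;s own code.</p>
<preformat preformat-type="code"><![CDATA[
```python
from urllib.parse import urlencode


def list_records_url(base_url, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (illustrative helper)."""
    if resumption_token:
        # Follow-up requests carry only the verb and the resumption token.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        # The first request asks for Dublin Core metadata.
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return base_url + "?" + urlencode(params)


# Hypothetical repository endpoint, used only for demonstration.
print(list_records_url("http://repository.example.org/oai"))
```
]]></preformat>
<p>A harvester would fetch each URL in turn, read the resumption token from the response, and stop when the repository no longer returns one.</p>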
<sec id="s2a">
<title>2.1. Metadata Download, Extraction and Cleaning</title>
<p>As the first step, a repository&#x2019;s metadata are downloaded into the CORE database. Since CORE uses OAI-PMH, it is essential for the harvesting process that the metadata are formatted in the Dublin Core schema, a set of metadata elements used to describe objects in an online environment.<xref ref-type="fn" rid="fn13">13</xref> Afterwards, the metadata are extracted into our database for local storage and cleaned; for example, the order of the authors is corrected and normalised when necessary, or the digital object identifiers (DOIs) are extracted in case they appear in the wrong field.</p>
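<p>The DOI clean-up described above could be sketched as follows. This is a minimal Python illustration, not CORE&#x2019;s actual implementation: the sample record, the loose DOI pattern and the function name are all assumptions made for the example. It scans every Dublin Core element of a record and returns the first DOI it finds, regardless of the field in which it was entered.</p>
<preformat preformat-type="code"><![CDATA[
```python
import re
import xml.etree.ElementTree as ET

DC_NS = "{http://purl.org/dc/elements/1.1/}"
# Loose DOI pattern chosen for illustration; real-world rules are stricter.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")

# Made-up record in which the DOI sits in dc:relation, not dc:identifier.
SAMPLE_RECORD = """
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example title</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:relation>http://dx.doi.org/10.1234/example.5678</dc:relation>
  <dc:identifier>http://repository.example.org/id/eprint/42</dc:identifier>
</oai_dc:dc>
"""


def find_doi(record_xml):
    """Return the first DOI found in any Dublin Core element, or None."""
    root = ET.fromstring(record_xml)
    for element in root.iter():
        if element.tag.startswith(DC_NS) and element.text:
            match = DOI_PATTERN.search(element.text)
            if match:
                # Drop trailing punctuation that often follows a DOI in text.
                return match.group(0).rstrip(".,;")
    return None


print(find_doi(SAMPLE_RECORD))
```
]]></preformat>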
</sec>
<sec id="s2b">
<title>2.2. Full-text Harvesting</title>
<p>Apart from downloading a record&#x2019;s metadata, CORE also downloads the article full-text and stores it in the CORE database. Users can retrieve the cached content via either the CORE search engine or any other search engine.</p>
</sec>
<sec id="s2c">
<title>2.3. Text Extraction</title>
<p>After the full-text harvesting task, CORE extracts the full-text of an output into a text file, which is indexed to facilitate full-text searching.</p>
</sec>
<sec id="s2d">
<title>2.4. Language Detection</title>
<p>Because repositories hold large collections of manuscripts written in many languages, CORE has a dedicated task that recognizes the language of an output. Thanks to the language detection task, CORE&#x2019;s users can filter manuscripts by language.</p>
</sec>
<sec id="s2e">
<title>2.5. Citation Extraction</title>
<p>This task extracts the citations from an output&#x2019;s references. During this task the titles of all references are extracted and then CORE searches for the referenced output in the CORE collection. If the item is available in our collection then the two items are linked together; if not, CORE submits the titles of the referenced manuscripts to a DOI resolution service, CrossRef,<xref ref-type="fn" rid="fn14">14</xref> which sends back to CORE the output&#x2019;s DOI, if available.</p>
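<p>The two-stage lookup described above can be sketched in Python as follows. The function and parameter names are hypothetical, the local collection is modelled as a plain dictionary keyed by lower-cased title, and <monospace>resolve_doi</monospace> is a stand-in for the call to the DOI resolution service; no real CrossRef API call is shown or implied.</p>
<preformat preformat-type="code"><![CDATA[
```python
def link_reference(title, collection, resolve_doi):
    """Link a cited title to a local record, else fall back to a DOI lookup.

    collection  -- dict mapping lower-cased titles to local record ids
    resolve_doi -- callable taking a title and returning a DOI or None
                   (placeholder for an external resolution service)
    """
    key = title.strip().lower()
    if key in collection:
        # The referenced output is already in the local collection.
        return ("local", collection[key])
    doi = resolve_doi(title)
    if doi is not None:
        return ("doi", doi)
    return ("unresolved", None)


# Toy data, invented for demonstration only.
collection = {"a known paper": "record-123"}


def fake_resolver(title):
    return "10.1234/abc" if "external" in title.lower() else None


print(link_reference("A Known Paper", collection, fake_resolver))
print(link_reference("An External Paper", collection, fake_resolver))
```
]]></preformat>
<p>Exact title matching is used here only to keep the sketch short; a production system would need a more tolerant comparison.</p>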
</sec>
<sec id="s2f">
<title>2.6. Related Content Identification</title>
<p>This step identifies and matches semantically related manuscripts using information retrieval techniques, which improves their discoverability.</p>
</sec>
<sec id="s2g">
<title>2.7. Detection of Duplicates</title>
<p>In this step, the CORE system detects the duplicate records and groups these duplicates together in the database.</p>
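<p>The article does not state how duplicates are recognised, so the following Python sketch uses one simple, assumed approach: records whose titles are identical after lower-casing and stripping punctuation are grouped together. The record layout and function names are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import re
from collections import defaultdict


def normalise_title(title):
    """Lower-case a title and collapse punctuation and spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def group_duplicates(records):
    """Return lists of record ids that share the same normalised title."""
    groups = defaultdict(list)
    for record in records:
        groups[normalise_title(record["title"])].append(record["id"])
    # Only groups with more than one member are duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]


# Invented records for demonstration only.
records = [
    {"id": 1, "title": "Open Access: A Review"},
    {"id": 2, "title": "open access a review."},
    {"id": 3, "title": "Something else"},
]
print(group_duplicates(records))
```
]]></preformat>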
</sec>
<sec id="s2h">
<title>2.8. Indexing</title>
<p>Once the whole harvesting process is completed and a large volume of data is stored in the CORE database, the data is indexed. This task enables searching of the CORE content and is also required both for the CORE API to function and for the Data Dumps to be created.</p>
</sec>
</sec>
<sec id="s3">
<title>3. The Need for a Repositories Dashboard</title>
<p>Existing research studies (<xref ref-type="bibr" rid="r1">Allard, Mack, &#x0026; Feltner-Reichert, 2005</xref>; <xref ref-type="bibr" rid="r12">Walters, 2007</xref>; <xref ref-type="bibr" rid="r14">Wickham, 2010</xref>) that describe the roles of repository managers indicate that they &#x201C;<italic>manage the repository service by identifying goals and future strategies for improvement in the repository service based on new developments, usage statistics and feedback from users</italic>&#x201D; (<xref ref-type="bibr" rid="r14">Wickham, 2010</xref>, p. 5). Furthermore, <xref ref-type="bibr" rid="r1">Allard et al. (2005)</xref> found that repository managers do not always have specialized technical skills, which suggests that they may be unable to take direct advantage of the information that CORE offers through the API and the data dumps.</p>
<p>When these studies were conducted, five to ten years ago, the number of open access mandates was not as high as it is now. According to ROARMAP, by the end of 2007 there were 22 funder and 137 institutional open access mandates, in 2010 there were 34 funder and 258 institutional, while in the third quarter of 2015 ROARMAP recorded 67 funder and 430 institutional open access mandates. In addition, SPARC Europe, an organization focusing on scholarly communications, conducted an analysis of the global funder open access policies in 2013 and discovered that of the 48 mandatory policies, 33 were green open access policies, which means that compliance is met via self-archiving in a repository (<xref ref-type="bibr" rid="r11">SPARC Europe, 2013</xref>). Therefore, the repository landscape has been significantly shifted by these open access mandates. In 2010 SHERPA Services surveyed the United Kingdom Council of Research Repositories members (<xref ref-type="bibr" rid="r14">Wickham, 2010</xref>) and discovered that it is the repository manager&#x2019;s responsibility to &#x201C;<italic>develop workflows to manage the capture, description and preservation etc. of research outputs</italic>&#x201D;. In this new environment, repository managers have to develop further skills and deal with timely deposits and publishers&#x2019; embargo periods, count compliance percentages and assist authors with licensing their manuscripts (<xref ref-type="bibr" rid="r9">Pontika &#x0026; Rozenberga, 2015</xref>).</p>
<p>CORE has been dealing with the aggregation challenges over the past four years by harvesting and enriching content from open access repositories, allowing the discoverability and reusability of millions of open access manuscripts via its own search engine and the API. While CORE has been able to provide the aggregated content from a single harmonised endpoint, it now faces the challenge of how to enable the content providers to manage their content through the aggregation and control the harvesting process. Throughout the past four years of its existence, CORE has harvested 687 repositories from all over the world. During this time, CORE has received dozens of opt-in requests and only a few repositories have opted out of the service. The primary reason for the opt-out requests was the institutions&#x2019; fear of losing control of their content through the aggregation process. On the other hand, those repository managers and library directors who have opted in often email us requesting details of their aggregated content and wishing to gain control over it.</p>
<p>In order to improve the quality and transparency of the aggregation process of the open access content and create a two-way collaboration between the CORE project and the providers of this content, CORE has created the Repositories Dashboard. The purpose of the Dashboard is to provide an online interface for CORE&#x2019;s data providers, the vast majority of whom are repository managers (<xref ref-type="fig" rid="fg001">Figure 1</xref>). This online interface gives data providers more control over their content in CORE by giving them access to information that they did not have in the past. It allows repository managers to manage the aggregation process efficiently by, for example, requesting metadata updates or managing takedown requests directly in the CORE aggregation. The tool also reports how frequently the content is aggregated, lists all detected technical issues, offers suggestions for improving both the efficiency of the harvesting process and the quality of metadata, and checks compliance with existing metadata guidelines. Furthermore, the CORE dashboard provides a range of statistics about the aggregated content.</p>
<fig id="fg001">
<label>Fig. 1:</label>
<caption><p>Repositories Dashboard Purpose.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="figures/Pontika_fig1.jpg"/>
</fig>
</sec>
<sec id="s4">
<title>4. Repositories Dashboard Overview</title>
<sec id="s4a">
<title>4.1. Institution Main Page</title>
<p>The vast majority of the repositories hosted in CORE are institutional, hosted and maintained by academic or research institutions. In the dashboard each institution and their affiliated repositories&#x2014;there are cases where one institution may host more than one repository&#x2014;have a dedicated page that includes the name and logo of the institution, the repository name and corresponding email. This page is intentionally left blank and it is the responsibility of the repository manager to fill in all this information (<xref ref-type="fig" rid="fg002">Figure 2</xref>).</p>
<fig id="fg002">
<label>Fig. 2:</label>
<caption><p>Institution Main Page in the Dashboard.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="figures/Pontika_fig2.jpg"/>
</fig>
</sec>
<sec id="s4b">
<title>4.2. Invite Users to the Dashboard</title>
<p>There are two requirements for a repository to gain access to the dashboard. First, it needs to be one of CORE&#x2019;s data providers; second, the repository manager needs to allocate and manage the dashboard invitations for the members of their own institution. To register a repository manager in the Dashboard, CORE applies the following process: CORE sends an invitation to either the personal email address of the repository manager or the generic repository email address, granting this account administrative privileges. Afterwards, the person who handles this account can create as many accounts for members of her/his own institution as s/he wishes. These accounts have two levels of access: advanced and standard. The standard account allows users only to view the content hosted in CORE and the information related to their repository (<xref ref-type="fig" rid="fg003">Figure 3</xref>). Users with advanced accounts can also perform actions, for example take down material, request the re-harvesting of the repository, or download Comma Separated Values (CSV) files, which contain the same fields as the information in the Content tab explored in section 4.3.</p>
<fig id="fg003">
<label>Fig. 3:</label>
<caption><p>Invite Users to the Dashboard.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="figures/Pontika_fig3.jpg"/>
</fig>
</sec>
<sec id="s4c">
<title>4.3. Content Tab</title>
<p>The content tab of one repository lists all the manuscripts that are harvested from this content provider. This page contains the title of the harvested document, the output&#x2019;s unique identifier (OAI ID), the author name, the date the output was harvested and whether there is an openly accessible version through CORE (<xref ref-type="fig" rid="fg004">Figure 4</xref>). On this page repository managers can perform four tasks: take down and take up content, update metadata records and request a full re-harvesting of their repository.</p>
<fig id="fg004">
<label>Fig. 4:</label>
<caption><p>Content Tab in the Dashboard.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="figures/Pontika_fig4.jpg"/>
</fig>
<p>The take down and take up buttons are considered critical for repository managers, who often receive take down requests for theses and dissertations from authors or publishers and need to act promptly on them. CORE&#x2019;s intention was not only to make the process of taking down a document as simple and as fast as possible, but also to hand over the control of the harvested content to its data providers. Our future goal is to integrate this functionality with the repository software. For example, every manuscript taken down from an EPrints repository would then be automatically removed from the CORE collection as well, and vice versa.</p>
</sec>
<sec id="s4d">
<title>4.4. Issues Related to Harvesting</title>
<p>During the harvesting process, explained above, various issues may occur. These issues can be critical to CORE&#x2019;s ability to complete the harvesting task and can lead to the whole corpus of a repository being inaccessible in CORE, or to poor harvesting, where only some items are retrieved. As already mentioned, repository managers may not have the technical skills to deal with these issues and in most cases they receive support from technical staff in their institution. In an effort to improve the communication between the repository managers and Information Technology staff, which would also result in the improvement of the harvesting process and the quality of the harvested content, CORE has created the &#x201C;Issues&#x201D; tab, where all possible issues are explained in further detail in a way that should be understood by both technical and non-technical staff.</p>
<p>First, the issues are divided into three types, a) error, b) warning and c) info, and a related description is provided for each one of them:</p>
<list list-type="alpha-lower">
<list-item><p>Error: When harvesting your repository/document we encountered an error that we couldn&#x2019;t resolve. These errors need to be fixed in order to harvest your repository/document.</p></list-item>
<list-item><p>Warning: We encountered an error but we are still able to harvest the repository/document. We strongly recommend that these issues are resolved as they may lead to incompatibility problems in the future.</p></list-item>
<list-item><p>Info: This may not be a problem but it may be a clue for misconfiguration or future incompatibilities.</p></list-item>
</list>
<p>Apart from these generic descriptions, the Dashboard also lists issues that relate to a specific repository, as they were recorded during the harvesting process by CORE&#x2019;s systems. These issues are divided into two sections, repository and document issues (<xref ref-type="fig" rid="fg005">Figure 5</xref>). The first category relates to issues accessing the repository, mainly because the CORE crawlers are blocked from accessing items in a repository<xref ref-type="fn" rid="fn15">15</xref> or the OAI endpoint has changed. The second category relates to issues that the CORE harvester has encountered with regard to the aggregated full-text. For example, on this page repository managers can find out whether there are links where the uniform resource locator (URL) in the &#x003C;dc:identifier&#x003E; tag was not properly formed, or whether there are documents that require a username and password to access their content.<xref ref-type="fn" rid="fn16">16</xref></p>
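<p>As an illustration of the kind of document-level check described above, the Python sketch below tests whether a value taken from a <monospace>dc:identifier</monospace> element is a usable web address. The rule applied here (an <monospace>http</monospace> or <monospace>https</monospace> scheme plus a host name) is an assumption made for the example and is not necessarily the rule CORE applies.</p>
<preformat preformat-type="code"><![CDATA[
```python
from urllib.parse import urlparse


def is_well_formed_url(value):
    """Return True if the value has an http(s) scheme and a host name."""
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Invented identifier values for demonstration only.
for candidate in (
    "http://repository.example.org/42.pdf",
    "repository.example.org/42.pdf",
    "http:/broken",
):
    print(candidate, is_well_formed_url(candidate))
```
]]></preformat>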
<fig id="fg005">
<label>Fig. 5:</label>
<caption><p>Repositories Issues During the Harvesting Process.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="figures/Pontika_fig5.jpg"/>
</fig>
</sec>
<sec id="s4e">
<title>4.5. IRUS-UK Statistics</title>
<p>In the past, CORE often received emails from repository managers requesting download and usage statistics from CORE. Generally speaking, there are four main reasons why it is important for repository managers to be aware of their repository&#x2019;s statistics. The statistics indicate the level of exposure of the research that is being conducted in an institution; they can serve as information regarding the return on investment both for the conducted research and the maintenance of the repository (<xref ref-type="bibr" rid="r10">Sch&#x00F6;pfel &#x0026; Boukacem-Zeghmouri, 2011</xref>); in some subject fields they can verify an increase in the citation rate of the papers (<xref ref-type="bibr" rid="r3">Gentil-Beccot, Mele, &#x0026; Brooks, 2010</xref>); and they are a signal for prospective citations (<xref ref-type="bibr" rid="r13">Watson, 2009</xref>). Apart from downloading the metadata files of the repositories&#x2019; collections, CORE also downloads the full-text of the manuscripts during the harvesting process and caches this PDF version in its own database. In an effort to provide repository managers with information regarding the manuscripts&#x2019; downloads from CORE, we have integrated the Institutional Repository Usage Statistics (IRUS-UK, see <xref ref-type="fig" rid="fg006">Figure 6</xref>) into the Dashboard. IRUS-UK<xref ref-type="fn" rid="fn17">17</xref> is a Jisc-funded project that serves as a national repository usage statistics aggregation service. IRUS-UK aims to provide article download statistics for content from UK repositories. Repositories that participate in the IRUS-UK project, currently close to 90, have access to these statistics from the CORE Dashboard as well.</p>
<fig id="fg006">
<label>Fig. 6:</label>
<caption><p>IRUS Statistics in the Dashboard.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="figures/Pontika_fig6.jpg"/>
</fig>
</sec>
<sec id="s4f">
<title>4.6. RIOXX Metadata</title>
<p>The RIOXX Metadata application profile<xref ref-type="fn" rid="fn18">18</xref> aims to assist repository managers in tracking compliance with the Research Councils UK Policy on Open Access and Guidance.<xref ref-type="fn" rid="fn19">19</xref> Via the UK Metadata Guidelines for Open Access Repositories,<xref ref-type="fn" rid="fn20">20</xref> RIOXX mainly provides guidance on the discoverability of research manuscripts across different systems through a set of metadata elements, which enables the automated detection of the RCUK compliant manuscripts in a repository. During the harvesting process, CORE can detect those UK repositories that support the RIOXX metadata and run a compliance check. The purpose of this task is to provide repository managers with the ability to validate the metadata inserted in their repositories. The dashboard supports two validation types, &#x201C;Basic&#x201D; and &#x201C;Full&#x201D;, similar to the RIOXX application (<xref ref-type="fig" rid="fg007">Figure 7</xref>). The difference between these two types is that the first has less strict constraints and fewer fields, while the latter has more fields and requires the input of more data with rigid metadata rules; the latter could prove more important from a funder perspective.</p>
<fig id="fg007">
<label>Fig. 7:</label>
<caption><p>RIOXX Compliance in the Dashboard.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="figures/Pontika_fig7.jpg"/>
</fig>
</sec>
<sec id="s4g">
<title>4.7. Benefits of the Repositories Dashboard</title>
<p>CORE foresees some distinctive benefits from the implementation of the CORE Dashboard. First, it is expected to bring increased and simplified collaboration between the aggregator, that is the CORE service, and the content providers, which are the repositories and their administrators. The Dashboard will also improve the control of the content providers over the harvested content, something that we hope will reduce the scepticism and fear of sharing open access content with third party systems, which provide this content openly as well. We also hope that the technical issues notification system will not only improve the harvesting process, but also foster mutual understanding and closer collaboration between the repository managers and the technical staff who support the repository. Finally, CORE&#x2019;s foremost goal is to broaden the discoverability of the open access content and its reuse when permitted.</p>
</sec>
</sec>
<sec id="s5">
<title>5. Conclusion</title>
<p>The idea of facilitating the collaboration between CORE and repositories using the CORE Dashboard can be generalised to the collaboration of any aggregator with content providers, such as national libraries and archives. The overall aim of this approach is to strike a balance between the ability of aggregators to disseminate content more effectively and the need of content providers to keep full control over it at all times.</p>
</sec>
</body>
<back>
<ack>
<title>Acknowledgement</title>
<p>This paper was presented at the 44<sup>th</sup> LIBER International Conference in London, on 24&#x2013;26 June, 2015. The CORE team would like to thank the Dashboard volunteer testers, Nick Sheppard and Chris Biggs for their comments and feedback.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="r1"><mixed-citation>Allard, S.L., Mack, T.R., &#x0026; Feltner-Reichert, M. (2005). The librarian&#x2019;s role in institutional repositories: a content analysis of the literature. <italic>Reference Services Review</italic>, <italic>33</italic>(3), 325&#x2013;336. doi:<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1108/00907320510611357">10.1108/00907320510611357</ext-link>.</mixed-citation></ref>
<ref id="r2"><mixed-citation>Gargouri, Y., Larivi&#x00E8;re, V., Gingras, Y., Carr, L., &#x0026; Harnad, S. (2012). Green and gold open access percentages and growth by discipline. In <italic>17<sup>th</sup> International Conference on Science and Technology Indicators (STI)</italic>, Montreal, CA, 05&#x2013;08 September 2012. Retrieved February 16, 2016, from <ext-link ext-link-type="uri" xlink:href="http://eprints.soton.ac.uk/340294/">http://eprints.soton.ac.uk/340294/</ext-link></mixed-citation></ref>
<ref id="r3"><mixed-citation>Gentil-Beccot, A., Mele, S., &#x0026; Brooks, T.C. (2010). Citing and reading behaviours in high-energy physics. How a community stopped worrying about journals and learned to love repositories. <italic>Scientometrics</italic>, <italic>84</italic>(2), 345&#x2013;355. Retrieved February 16, 2016, from <ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/0906.5418">http://arxiv.org/abs/0906.5418</ext-link></mixed-citation></ref>
<ref id="r4"><mixed-citation>Horwood, L., Sullivan, S., Young, E., &#x0026; Garner, J. (2004). OAI compliant institutional repositories and the role of library staff. <italic>Library Management</italic>, <italic>25</italic>(4/5), 170&#x2013;176. doi:<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1108/01435120410533756">10.1108/01435120410533756</ext-link>.</mixed-citation></ref>
<ref id="r5"><mixed-citation>Knoth, P., &#x0026; Zdrahal, Z. (2012). CORE: Three access levels to underpin open access. <italic>D-Lib Magazine</italic>, <italic>18</italic>(11/12). Retrieved February 16, 2016, from <ext-link ext-link-type="uri" xlink:href="http://www.dlib.org/dlib/november12/knoth/11knoth.html">http://www.dlib.org/dlib/november12/knoth/11knoth.html</ext-link></mixed-citation></ref>
<ref id="r6"><mixed-citation>Knoth, P., Anastasiou, L., &#x0026; Pearce, S. (2014). My repository is being aggregated: a blessing or a curse? In <italic>9<sup>th</sup> International Conference on Open Repositories</italic>, Helsinki, Finland, 9&#x2013;13 June 2014. Retrieved February 16, 2016, from <ext-link ext-link-type="uri" xlink:href="http://blog.core.ac.uk/files/OpenRepositories2014_v2.pdf">http://blog.core.ac.uk/files/OpenRepositories2014_v2.pdf</ext-link></mixed-citation></ref>
<ref id="r7"><mixed-citation>Laakso, M., &#x0026; Bj&#x00F6;rk, B.-C. (2012). Anatomy of open access publishing: a study of longitudinal development and internal structure. <italic>BMC Medicine</italic>, <italic>10</italic>(124), 9. Retrieved February 16, 2016, from <ext-link ext-link-type="uri" xlink:href="http://www.biomedcentral.com/1741-7015/10/124">http://www.biomedcentral.com/1741-7015/10/124</ext-link></mixed-citation></ref>
<ref id="r8"><mixed-citation>Morrison, H. (2015). The dramatic growth of open access June 30, 2015. <italic>The Imaginary Journal of Poetic Economics</italic>. [Weblog] Retrieved February 16, 2016, from <ext-link ext-link-type="uri" xlink:href="http://poeticeconomics.blogspot.co.uk/2015/06/dramatic-growth-of-open-access-june-30.html">http://poeticeconomics.blogspot.co.uk/2015/06/dramatic-growth-of-open-access-june-30.html</ext-link></mixed-citation></ref>
<ref id="r9"><mixed-citation>Pontika, N., &#x0026; Rozenberga, D. (2015). Developing strategies to ensure compliance with funders&#x2019; open access policies. <italic>Insights</italic>, <italic>28</italic>(1), 32&#x2013;36. doi:<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1629/uksg.168">10.1629/uksg.168</ext-link>. Retrieved February 16, 2016, from <ext-link ext-link-type="uri" xlink:href="http://insights.uksg.org/articles/10.1629/uksg.168/">http://insights.uksg.org/articles/10.1629/uksg.168/</ext-link></mixed-citation></ref>
<ref id="r10"><mixed-citation>Sch&#x00F6;pfel, J., &#x0026; Boukacem-Zeghmouri, C. (2011). Assessing the return on investments in GL institutional repositories. In <italic>Gray Literature in Library and Information Studies</italic> (pp. 1&#x2013;20). De Gruyter Saur. Retrieved February 16, 2016, from <ext-link ext-link-type="uri" xlink:href="https://halshs.archives-ouvertes.fr/sic_00601568/document">https://halshs.archives-ouvertes.fr/sic_00601568/document</ext-link></mixed-citation></ref>
<ref id="r11"><mixed-citation>SPARC Europe. (2013). <italic>Analysis of funder open access policies around the world</italic>. Retrieved February 16, 2016, from <ext-link ext-link-type="uri" xlink:href="http://sparceurope.org/analysis-of-funder-open-access-policies-around-the-world/">http://sparceurope.org/analysis-of-funder-open-access-policies-around-the-world/</ext-link></mixed-citation></ref>
<ref id="r12"><mixed-citation>Walters, T.O. (2007). Reinventing the library: How repositories are causing librarians to rethink their professional roles. <italic>portal: Libraries and the Academy</italic>, <italic>7</italic>(2), 213&#x2013;225. doi:<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1353/pla.2007.0023">10.1353/pla.2007.0023</ext-link>. Retrieved February 16, 2016, from <ext-link ext-link-type="uri" xlink:href="https://courses.washington.edu/mlis550/au10/pdf/Module_4_Walters_Reinventing_the_Library.pdf">https://courses.washington.edu/mlis550/au10/pdf/Module_4_Walters_Reinventing_the_Library.pdf</ext-link></mixed-citation></ref>
<ref id="r13"><mixed-citation>Watson, A.B. (2009). Comparing citations and downloads for individual articles at the Journal of Vision. <italic>Journal of Vision</italic>, <italic>9</italic>(4):i, 1&#x2013;4. doi:<ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.1167/9.4.i">10.1167/9.4.i</ext-link>. Retrieved February 16, 2016, from <ext-link ext-link-type="uri" xlink:href="http://jov.arvojournals.org/article.aspx?articleid&#x003D;2193506">http://jov.arvojournals.org/article.aspx?articleid&#x003D;2193506</ext-link></mixed-citation></ref>
<ref id="r14"><mixed-citation>Wickham, J. (2010). Repository management: an emerging profession in the information sector. In <italic>Online Information 2010</italic>, London, UK, 30 November&#x2013;2 December 2010. Retrieved February 16, 2016, from <ext-link ext-link-type="uri" xlink:href="http://eprints.nottingham.ac.uk/1511/">http://eprints.nottingham.ac.uk/1511/</ext-link></mixed-citation></ref>
</ref-list>
<fn-group>
<fn id="fn1"><p><ext-link ext-link-type="uri" xlink:href="http://roarmap.eprints.org/">http://roarmap.eprints.org/</ext-link>.</p></fn>
<fn id="fn2"><p><ext-link ext-link-type="uri" xlink:href="http://core.ac.uk/intro/api">http://core.ac.uk/intro/api</ext-link>.</p></fn>
<fn id="fn3"><p><ext-link ext-link-type="uri" xlink:href="http://core.ac.uk/intro/data_dumps">http://core.ac.uk/intro/data_dumps</ext-link>.</p></fn>
<fn id="fn4"><p><ext-link ext-link-type="uri" xlink:href="http://core.ac.uk/">http://core.ac.uk/</ext-link>.</p></fn>
<fn id="fn5"><p><ext-link ext-link-type="uri" xlink:href="http://core.ac.uk/intro/plugin">http://core.ac.uk/intro/plugin</ext-link>.</p></fn>
<fn id="fn6"><p><ext-link ext-link-type="uri" xlink:href="http://core.ac.uk/intro/dashboard">http://core.ac.uk/intro/dashboard</ext-link>.</p></fn>
<fn id="fn7"><p><ext-link ext-link-type="uri" xlink:href="https://scholar.google.co.uk/">https://scholar.google.co.uk/</ext-link>.</p></fn>
<fn id="fn8"><p><ext-link ext-link-type="uri" xlink:href="http://citeseerx.ist.psu.edu">http://citeseerx.ist.psu.edu</ext-link>.</p></fn>
<fn id="fn9"><p><ext-link ext-link-type="uri" xlink:href="https://www.openaire.eu/">https://www.openaire.eu/</ext-link>.</p></fn>
<fn id="fn10"><p><ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/pubmed">https://www.ncbi.nlm.nih.gov/pubmed</ext-link>.</p></fn>
<fn id="fn11"><p><ext-link ext-link-type="uri" xlink:href="https://www.openarchives.org/OAI/openarchivesprotocol.html">https://www.openarchives.org/OAI/openarchivesprotocol.html</ext-link>.</p></fn>
<fn id="fn12"><p><ext-link ext-link-type="uri" xlink:href="https://www.openarchives.org/pmh/tools/tools.php">https://www.openarchives.org/pmh/tools/tools.php</ext-link>.</p></fn>
<fn id="fn13"><p><ext-link ext-link-type="uri" xlink:href="http://dublincore.org/">http://dublincore.org/</ext-link>.</p></fn>
<fn id="fn14"><p><ext-link ext-link-type="uri" xlink:href="http://www.crossref.org/">http://www.crossref.org/</ext-link>.</p></fn>
<fn id="fn15"><p>See point 3 of <ext-link ext-link-type="uri" xlink:href="http://blog.core.ac.uk/2015/10/19/7-tips-for-successful-harvesting/">http://blog.core.ac.uk/2015/10/19/7-tips-for-successful-harvesting/</ext-link>.</p></fn>
<fn id="fn16"><p>See point 5 of <ext-link ext-link-type="uri" xlink:href="http://blog.core.ac.uk/2015/10/19/7-tips-for-successful-harvesting/">http://blog.core.ac.uk/2015/10/19/7-tips-for-successful-harvesting/</ext-link>.</p></fn>
<fn id="fn17"><p><ext-link ext-link-type="uri" xlink:href="http://www.irus.mimas.ac.uk/">http://www.irus.mimas.ac.uk/</ext-link>.</p></fn>
<fn id="fn18"><p><ext-link ext-link-type="uri" xlink:href="http://rioxx.net/">http://rioxx.net/</ext-link>.</p></fn>
<fn id="fn19"><p><ext-link ext-link-type="uri" xlink:href="http://www.rcuk.ac.uk/research/openaccess/policy/">http://www.rcuk.ac.uk/research/openaccess/policy/</ext-link>.</p></fn>
<fn id="fn20"><p><ext-link ext-link-type="uri" xlink:href="http://rioxx.net/guidelines/RIOXX_Metadata_Guidelines_v_3.0.pdf">http://rioxx.net/guidelines/RIOXX_Metadata_Guidelines_v_3.0.pdf</ext-link>.</p></fn>
</fn-group>
</back>
</article>