An Overview of Research Infrastructures in Europe
— and Recommendations to LIBER[1]

Norbert Lossau

Göttingen, HU Berlin[2]

lossau@sub.uni-goettingen.de

Abstract

Research infrastructures (RIs) include major scientific equipment, scientific collections, archives, structured information and ICT-based infrastructures and services[3]. They support top-level research and can be organized at the regional, national (EU Member State), European or global level.

RIs have become a topic of interest and priority for funders, political bodies, and (increasingly) institutional decision makers. In Europe the European Commission is a funder of RIs, complementing the funding provided by EU Member States at the national level. Over the last ten years hundreds of RI projects have been planned, and some have received funding for design, extension and improvement of operations and services to scientific communities. The ESFRI[4] roadmap for research infrastructures represents a financial volume of approx. 20 billion EUR over ten years to construct 44 RIs. From the perspective of realizing the objectives set for RIs, 2012 is an essential milestone, as the discussion of the HORIZON 2020 programmes at the European level will take place, as will consultations with member states. The following overview is by no means complete. It focuses on some RIs that are strongly shaped by the production and management of scientific information and that are relevant to the European political and funding agenda.

RI projects include a variety of typologies, ranging from hard, single-site facilities to distributed, soft facilities relying on networks. Typically they have emerged from discipline-specific or cross-disciplinary requirements. RIs produce, process or manage big and small but heterogeneous volumes of information. They are the so-called ‘scientific data factories’ of the 21st century. They comprise various types of information resources such as publications, digitized collections, learning objects and research data. Key issues on today’s agenda for RIs are their uptake by researchers, and their viability, sustainability and interoperability.

Research libraries’ engagement with RIs has been low. While this may have been understandable in 2005, when the first priorities for RI investments were defined, it now represents a significant gap in the European strategy. Key initiatives such as the ESFRI Research Infrastructures involve no participation by research libraries, except for DARIAH. Participation in EC-funded projects (through LIBER or directly through institutions) has focused (with a few exceptions) on the areas of digitization, cultural heritage and publications. Research libraries need to become visible actors in strategic discussions on RIs and should actively explore their engagement in research data infrastructures. Open Access, open science (data), research data infrastructures and management are the catalysts to get research libraries back into the awareness of researchers beyond the humanities and social sciences.

‘Open Access is global — but implementation is local’. This is a popular slogan of the OpenAIRE project and gives local research libraries an important role in the European context. Research data are discipline-specific, but policies, workflows and standards also need to be implemented at the local level. Creating participatory infrastructures by involving institutional, national and disciplinary actors has been identified by the EC as a key task for the current decade. The term ‘participatory’ is also considered to be of fundamental relevance for European policy strategy, as it matches well with national and European coordination for cost efficiency and is instrumental in avoiding duplication of work.

The primary challenges to building a coherent, fundable and sustainable ecosystem do not lie in ICT, but rather in governance, law, organization, socio-cultural aspects, trust, and, of course, costs.

Key Words: research infrastructures; LIBER; research libraries

An Overview of the RI Landscape

Strategies and roadmaps for RIs have become a priority for many funding organizations and policy makers, e.g., in Australia, Germany, Japan, the UK and the US. An example at a national level is the ‘Concept for Information Infrastructure in Germany’[5], published in May 2011 (in German) and produced by a large group of key stakeholders including major research libraries. The German Scientific Council (‘Wissenschaftsrat’) has released four Recommendations on various aspects of RIs and Information Infrastructures (January 2011, mainly in German). A fifth Recommendation, summing up multiple aspects, is expected in the summer of 2012. Knowledge Exchange (KE)[6], a forum for four countries (Denmark, Germany, The Netherlands, the UK), is addressing RI issues. The European Strategy Forum on Research Infrastructures (ESFRI)[7], established in 2002, comprises delegates nominated by the Research Ministers of the Member and Associate Countries and includes a representative of the Commission; it is working to develop a joint vision and a common RI strategy. The G8+O5 countries have commissioned an international working group on ‘Data Infrastructure’, requesting a roadmap for ‘Global Research Infrastructures’.

In the 7th Framework Programme (FP 7: 2007–2013), the European Commission has earmarked 1715 million euros for RIs, a 42% share of the ‘Capacities’ Programme Area (4097 million euros). One of the largest (if not the largest) distributed RI is GEANT[8], the data network serving research and education in Europe, with links to research environments in the rest of the world (e.g., Israel, South America, Asia).

We have two principal categories of RIs:

Disciplinary RIs

ESFRI

In addition to numerous EU projects addressing Research Infrastructures, the most relevant and systematic activity coordinated by the EU member states, together with the EC, is ESFRI. The ESFRI projects are under the auspices of DG Research & Innovation (RTD), Directorate B.3 ‘Research Infrastructures’. Their focus is on integrating and providing seamless access to and (re-)use of research data (Figure 1).


Figure 1 - ESFRI projects, examples. CLARIN = The Common Language Resources and Technology Infrastructure; LIFEWATCH = Science and technology infrastructure for biodiversity data and observatories; EURO-ARGO = RI for Ocean Science and observations.

The ‘ESFRI Roadmap’ is the structure to provide overview and guidance at a time of proliferating RIs. For the year 2010 it includes a number of projects in six thematic areas, as shown in Figure 2.

Figure 2 - The ESFRI Roadmap 2010.

The latest update of the ESFRI Roadmap (May 2011) lists 48 RIs, including six new activities:

  1. Infrastructure for Analysis and Experimentation on Ecosystems (ANAEE)

  2. Infrastructure for System Biology — Europe (ISBE)

  3. EU Microbial Resource Centre Research Infrastructure (MIRRI)

  4. The European Solar Research Infrastructure for Concentrating Solar Power (EU-Solaris)

  5. Multipurpose Hybrid Research Reactor for High-Technology Application (MYRRHA)

  6. The European Wind Scanner Facility (Windscanner).

A number of ESFRI RIs are in the process of establishing the legal framework for an ERIC (European Research Infrastructure Consortium[9]), e.g., CLARIN and DARIAH. For DARIAH the Research Ministries in France, Germany and The Netherlands have taken the lead to guide the necessary processes with France being the ERIC host. Further countries involved are, e.g., Austria, Denmark, Greece, Ireland, Switzerland and the UK. Funding is provided through national DARIAH projects, e.g., DARIAH-DE[10].

Associated research, education, and infrastructure projects for DARIAH include:

With the exception of DARIAH there is no research library involved in ESFRI projects. Consortia are usually composed of scientific communities, (applied) computer scientists and, sometimes, large-scale computing centres. Looking at the objectives of many of these ESFRI RIs, one could easily imagine research libraries as relevant partners. ELIXIR is an example from the biological sciences[11]. In their brochure the Consortium explains why they are building the infrastructure now and what has changed over the last years:

‘Data is an essential commodity for life science research. Ten years ago, finding a connection between a gene and a characteristic such as drought tolerance or disease susceptibility could take years. Now it takes minutes. The growth in genomic data has outstripped dramatically the growth in storage and processing capacity. Next-generation sequencing machines produce billions of bases of nucleotide data per experiment quickly and at relatively low cost. It is a disruptive technology that is so much better than what we had before that the uptake is skyrocketing and both the users and the informatics infrastructure have trouble adapting to it’[12].

These are the four main objectives of ELIXIR in a nutshell[13]:

Cross-disciplinary Research Infrastructures

At the EC, the DG Information & Society funds ICT-based e-Infrastructures[14] reaching across disciplines. A key advisory group is the e-Infrastructure Reflection Group, e-IRG, established in 2003[15].

OpenAIRE

OpenAIRE[16] (December 2009–November 2012) is a research e-infrastructure initiative for publications. It was set up in support of the FP7 Open Access pilot of the EC, but rapidly evolved to become a flagship initiative and a clear success story of the EC[17]. ‘The success of OpenAIRE is the success of the EC Open Access Initiative’ was an introductory statement made by EC officers as early as the negotiations preparing for the Grant Agreement. OpenAIRE has been invited by the EC to transfer the European ‘model’ to countries outside of Europe, e.g., Australia, India, Latin America or the US. COAR[18], the Confederation of Open Access Repositories, is used as an instrument to facilitate outreach outside of Europe.

OpenAIRE’s basic objective is to provide a support infrastructure for researchers, helping them to comply with Special Clause 39 of their grant agreements in the selected seven thematic areas of the Open Access pilot. It also delivers statistics to the EC about, e.g., the number of publications produced in funded projects and the percentage of Open Access publications. Beyond the technical infrastructure, which builds on the DRIVER project, OpenAIRE has established a Network of National Open Access Desks (NOADs) with partners in all European member states (excluding Luxembourg, but including Norway as a partner) (see Figure 3).

Figure 3 - OpenAIRE.

It is very encouraging to read the result of a recent review by independent experts expressing that:

‘OpenAIRE is poised to change attitudes in a lasting way over the value of freely sharing research papers and, with OpenAIRE+, the accompanying data. Already, one of the great results of OpenAIRE has been the effective building of communities across Europe, including countries where scientific publishing needs to mature. This will do much to advance the quality of science, enabling the checking of results, as well as the re-analysis and re-purposing of the data. It will also benefit communities such as SMEs, teachers and learners who are outside the mainstream research community but who can find value in access to publicly funded research.’

Through its network OpenAIRE engages multiple stakeholders, in particular research project coordinators and researchers, repository managers, research administrators at universities, and policy makers (e.g., at the request of NOADs). OpenAIRE has started to train thematic programme officers at the EC, teaching them how Open Access can be implemented and what the arguments are to convince researchers to start depositing in repositories or publishing in Open Access. Key challenges addressed by the Consortium are the weak mandate of the EC and the resulting reluctance of researchers to comply, differing levels of Open Access maturity across EU member states, publishers’ counter-activities, and repository managers’ slow uptake of the guidelines.

OpenAIREplus

OpenAIREplus[19] (December 2011–May 2014) expands the remit of OpenAIRE by linking publications to research data (and other associated information, such as funding programmes). This initiative could be defined as an RI initiative that crosses from the publication into the data world. There is a clear gap that needs to be filled in terms of linking up scholarly communications, and research libraries are well placed here to fulfill this RI role, due to their long-lasting involvement with researcher communities. The structure of the project and the consortium has changed slightly[20]. Running in parallel for twelve months, OpenAIREplus will continue and expand to implement the OpenAIRE objectives (see Figure 4).


Figure 4 - OpenAIREplus. (Figure provided by Yannis Ioannidis, Univ. of Athens, OpenAIRE/∼plus).

Both OpenAIRE and OpenAIREplus have many research libraries as consortium partners, with an estimated substantial percentage being LIBER members.

Discussions have started to make OpenAIRE/∼plus a permanent pan-European publication infrastructure, similar perhaps to GEANT, the data bandwidth provider.

EUDAT

EUDAT (October 2011–September 2014) stands for ‘European Data Infrastructure’ and aims ‘to contribute to the production of a Collaborative Data Infrastructure’[21]. The cross-disciplinary dimension in handling research data makes EUDAT special. Partners include key representatives from research communities in linguistics (CLARIN), earth sciences (EPOS), climate sciences (ENES), environmental sciences (LIFEWATCH), and biological and medical sciences (VPH), all of which have been allocated project resources to help specify their requirements and co-design related services. The EC has been working with EUDAT and OpenAIREplus as complementary RIs combining research data and publications, to implement a seamless knowledge infrastructure in Europe. There are no research libraries involved in EUDAT.

EGI, the European Grid Infrastructure

EGI.eu[22] (2010–2014), a foundation under Dutch law, was created in 2010 and has its history in the EGEE project (Enabling Grids for E-Science, 2004–2009)[23] and the DataGrid Project (2001–2004)[24]. One of its key objectives is to operate a secure, integrated production grid infrastructure that seamlessly federates resources from providers around Europe. Operations of EGI.eu are currently funded through the EC project EGI-InSPIRE[25], which comprises fifty institutions from forty countries. There are no research libraries involved. The ambition is to transform EGI into a permanent infrastructure, sustained by European countries.

Europeana (2007-)/The European Library

Europeana (2007-)/The European Library focuses on cultural heritage material accessible through digitization. Europeana falls under the auspices of the DG Information & Society branch in Luxembourg, Directorates E.3 ‘Cultural Heritage and Technology Enhanced Learning’ and E.4 ‘Access to Information’. Historically, the focus of the Directorates has been on e-content and digital libraries. Therefore, Europeana is not branded as an RI, although material accessible through Europeana is clearly relevant for research. Europeana has a wide definition of ‘users’ and the new strategic plan somewhat broadens the outreach, e.g., to the tourism sector. Researchers are not the focus of Europeana’s strategic plan 2011–2015, although they are mentioned as a target group.

DART-Europe

DART-Europe is a partnership of 21 research libraries and library consortia that aims to improve global access to European research theses[26]. The group maintains the DART-Europe E-Theses Portal. It is resourced through partner contributions and administered by UCL Library. DART-Europe is not an EC co-funded initiative or branded as an RI as such, but it does cover relevant types of publication for libraries and universities. It is listed as an infrastructure supported by LIBER. From the structural point of view it could be considered as part of the European publication RI as built by OpenAIRE/∼plus.

Issues

Maintaining an Overview

Maintaining an overview of the RI landscape is one of the key challenges. RIs in Europe build a complex, somewhat fragmented system at the national, European and disciplinary level. Different parts of the European programmes fund different and similar types of RIs, both as single, separate projects and in a more systematic way, as outlined above (e.g., ESFRI, OpenAIRE, EUDAT). In addition there are potential infrastructures which are not considered to be RIs, as they fall outside the scope of the established schema of RIs (e.g., Europeana).

Sustainability, Cost and Legal Models, Finance and Governance

Designing and creating a new RI prototype is only the first step. Transforming a project into a production-quality infrastructure and maintaining this infrastructure for the long term is much more of a challenge. The scale of RIs in terms of numbers and heterogeneity has become a real challenge, in particular for funders at the European level (the EC) and the member states (ministries). Usually there are two principal layers to be sustained: 1. the actual data and system infrastructure and 2. the community or network behind this infrastructure. These layers mirror the service and networking layer in an FP7 ‘I3’-project.

With the introduction of ERICs, the EC has created a legal framework for cross-country RIs. The process to implement an ERIC in practice, however, is rather laborious. In some countries issues have arisen with regard to, e.g., taxation.

An established funding model for RIs in the long term within the EU means involving member states, directly or through associated legal entities such as governmental agencies (e.g., for ERICs). A possible alternative is funding RIs through institutional membership contributions. However, to my knowledge there are presently no experiences with large-scale RIs based on such a model.

There is no financial plan at the national or European level to finance RIs in the long term. The financial volume needed to maintain just the ESFRI RIs has been estimated at 42 million euros for the social sciences and humanities (compared to 250 million euros construction costs) and 1,327 million euros for the sciences (compared to 13,446 million euros construction costs; May 2010). To illustrate this, here are some estimates of the operational costs for individual ESFRI infrastructures: ELIXIR, 100 million euros/year; LIFEWATCH (Science and Technology Infrastructure for Research on Biodiversity and Ecosystems), 35.5 million euros/year; ELI (Extreme Light Infrastructure, laser), ca. 70 million euros/year; DARIAH, 2.4 million euros/year. Beyond ESFRI we have at least four more potential permanent RIs: OpenAIRE, EUDAT, EGI and Europeana.
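To put the aggregate ESFRI estimates quoted above into proportion, the following minimal Python sketch expresses the maintenance estimate as a percentage of the construction cost for each area. Only the four figures are taken from the text; the names ESFRI_COSTS and operation_share are hypothetical and chosen for illustration, and the text does not state the period to which the aggregate maintenance figures apply.

```python
# Figures quoted in the text above (million euros, ESFRI estimates, May 2010).
# The dictionary layout and names are illustrative assumptions.
ESFRI_COSTS = {
    "social sciences and humanities": {"construction": 250, "operation": 42},
    "sciences": {"construction": 13_446, "operation": 1_327},
}


def operation_share(costs):
    """Return the maintenance estimate as a percentage of construction cost."""
    return {
        area: round(100 * c["operation"] / c["construction"], 1)
        for area, c in costs.items()
    }


shares = operation_share(ESFRI_COSTS)
for area, pct in shares.items():
    print(f"{area}: {pct}% of construction cost")
```

On these figures the maintenance estimate comes to roughly a sixth of the construction cost in the social sciences and humanities and roughly a tenth in the sciences, which illustrates why long-term financing cannot be treated as a footnote to construction budgets.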

Interoperability, Re-Use

Creating seamless access to publications, research data and cultural heritage material, or indeed supporting interdisciplinary research, will never become a reality if interoperability at various levels is not addressed properly. How do the DRIVER-OpenAIRE guidelines relate to the data models of the ESFRI infrastructures, or to the Europeana Data Model (EDM)? How can researchers not only search and access data but also automatically export large quantities of data from RIs to their own domain or theme-specific virtual research environment? What ICT services are required to support this usage scenario and what should the terms and conditions be for those services? COAR is an existing organization which has started addressing interoperability at an international scale through the COAR Interoperability Initiative[27] and by engaging in international discussions like the recently started ‘Repository Indexing/Google Scholar’ discussion[28]. For Europe, COAR is closely liaising with OpenAIRE/∼plus.
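To make the question of automated export more concrete, the following Python sketch shows what machine harvesting from a repository might look like. It assumes that the repository exposes Dublin Core metadata through the OAI-PMH protocol, on which the DRIVER guidelines are understood to build. The base URL, the set name and the sample response are invented for illustration and do not describe any real service; the sketch also ignores paging via resumption tokens, error handling and the terms and conditions questions raised above.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 and unqualified Dublin Core.
NAMESPACES = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def extract_titles(xml_text):
    """Return all Dublin Core titles found in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iterfind(".//dc:title", NAMESPACES)]


# Invented sample response, so that the sketch runs without network access.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample thesis on biodiversity data</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Hypothetical endpoint and set name, for illustration only.
url = list_records_url("https://repository.example.org/oai", set_spec="example_set")
titles = extract_titles(SAMPLE_RESPONSE)
print(url)
print(titles)
```

A virtual research environment would issue such requests repeatedly and map the harvested records to its own data model, which is precisely where agreement between guidelines and data models becomes decisive.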

How Can/Will These Developments Impact Research Libraries?

Recommendations

The following recommendations reflect my personal views and have not been discussed with the organizations/initiatives described (order reflects priority).

To LIBER

  1. Develop the ‘Scholarly Communication’ Steering Committee into a ‘Scholarly Communication and Research Infrastructure’ Steering Committee, addressing Open Access policy and implementation, RIs in the wider sense, including publications, data and cultural heritage infrastructures;

  2. Become a key actor in a sustained European publication RI and become more aware of what ICT technologies can offer in terms of advanced services for researchers, educators and students (OpenAIRE/∼plus) (including governance, outreach to members etc.);

  3. Continue and expand LIBER’s role in Europeana, supporting the interests of scholars who want to re-use Europeana materials, outside of the portal’s own environment — in research-driven virtual research environments organized by humanities RIs such as DARIAH and CLARIN;

  4. Support developments in global cooperation by becoming a/the European node in COAR e.V. (International); contribute to the development of COAR into a global RI forum; explore joint membership fees;

  5. Join OpenAIRE and COAR in drafting the European RI vision and roadmap;

  6. Design and publish a vision and (practical) roadmap for research libraries on how to ‘redefine the library’ in the digital world (establish research data management as an activity portfolio, engage with researchers by becoming a partner in research projects, etc.);

  7. Commission and publish practical ‘kits’ on topics described in the vision and roadmap;

  8. Intensify discussions with university organizations ranging from LERU to COIMBRA and EUA in selected areas (e.g., Open Access, RI and their sustainability) and start discussions with European organizations representing scientists (in particular EuroScience), positioning LIBER as a relevant strategic body in this context as well;

  9. Education & training: build on the excellent activities of the ‘Organization and Human Resources’ Steering Committee and expand the portfolio of the ‘leadership seminar’ to broad-scale ‘research librarian’ education & training, in collaboration with TICER and iSchools[29]/iCaucus[30]; in addition, pay more attention to organizational changes in research libraries;

  10. Provide room (in space and time) for ‘intimate’ discussions between library leaders (exclusively) to discuss challenges and solutions for selected topics, e.g., organizational changes, motivation of staff towards new areas of activity etc., e.g., at the Annual Conference.

To Local Research Libraries as Dynamic Nodes of a European and Global Web of Knowledge

  1. Get engaged in the implementation of the recommendations for LIBER at the European level, e.g., participate in the Steering Committees and Working Groups;

  2. Get involved in discussions about the university strategy and position the library as a relevant player in the digital research infrastructure;

  3. Be creative and active at the local level (e.g., at your university) by, for example, implementing Open Access, developing research data management staff, or becoming a partner in research projects;

  4. Expand your local library strategy with a European/international view (e.g., implement European standards for digitization, repositories, etc. at the local level);

  5. As a library leader, take the opportunity offered by the LIBER Annual Conferences to enter into discussions with your colleagues (which helps you to reduce local learning curves).

To National (Research) Library Associations

Support LIBER in the above-mentioned activities at the national level, in particular through dissemination of these recommendations to research libraries in your country and by acting as a feedback channel.

Websites Referred to in the Text

COAR http://www.coar-repositories.org/

COAR Interoperability project http://www.coar-repositories.org/working-groups/repository-interoperability/coar-interoperability-project/

DART-Europe E-theses Portal www.dart-europe.eu.

DataGrid Project http://eu-datagrid.web.cern.ch/eu-datagrid/.

EGI-InSPIRE Project http://www.egi.eu/projects/egi-inspire/index.html

Elixir www.elixir-europe.org.

Enabling Grids for E-Science http://www.eu-egee.org/

EUDAT http://www.eudat.eu/

European Grid Infrastructure http://www.egi.eu/

European Strategy Forum on Research Infrastructures http://ec.europa.eu/research/infrastructures/index_en.cfm?pg=esfri

e-Infrastructure http://cordis.europa.eu/fp7/ict/e-infrastructure/home_en.html

e-Infrastructure Reflection Group http://www.e-irg.eu/

GÉANT www.geant.net.

iCaucus http://www.ischools.org/site/about_icaucus/, members include I-Schools at Univ. of California, Carnegie Mellon, Humboldt, Copenhagen etc.

Kommission Zukunft der Informationsinfrastruktur (2011) Gesamtkonzept für die Informationsinfrastruktur in Deutschland Available at: http://www.gwk-bonn.de/fileadmin/Papers/KII_Gesamtkonzept.pdf (accessed 29.02.12).

Knowledge Exchange http://www.knowledge-exchange.info/.

OpenAIRE http://www.openaire.eu/


Notes

The paper was prepared for the Strategy Meeting of the LIBER Board on 23/24 February 2012. The published version has been slightly revised and includes comments from Natalia Manola, Heike Neuroth and Najla Rettberg. I would also like to thank Carlos Morais-Pires for fruitful discussions on the future of the European scientific research space. The author is particularly grateful to Najla Rettberg and Inge Angevaare for final proofreading and editorial support.

Prof. Dr. Norbert Lossau is a member of the LIBER Executive Board. He is the Director of Göttingen State and University Library, Germany, and holds an Honorary Professorship in Library and Information Science at Humboldt University, Berlin. His research areas are research-/information infrastructures, scholarly communication incl. Open Access, e(nhanced)Research and the implications for libraries, and library strategy.

European Strategy Forum for RI.

Kommission Zukunft der Informationsinfrastruktur (2011) Gesamtkonzept für die Informationsinfrastruktur in Deutschland Available at: http://www.gwk-bonn.de/fileadmin/Papers/KII_Gesamtkonzept.pdf (accessed 29.02.12).

Göttingen State and University Library is the Coordinator of the Consortium which comprises 17 partners, http://de.dariah.eu/.

ELIXIR will be a secure, rapidly evolving platform for collection, storage, annotation, validation, dissemination and utilization of biological data. It will comprise a distributed and interlinked collection of core and specialized biological data resources. The core resources will include a substantial upgrade to the existing molecular data resources at the European Bioinformatics Institute (EBI), as well as new resources as appropriate.
The specialized resources will be distributed across Europe. ELIXIR will also include the necessary major upgrade to the computer infrastructure to store and organize this data in a way suitable for rapid search and access, and will provide a sophisticated but user-friendly portal for users. Additionally, it will provide the infrastructure necessary to utilize data in a manner that is most appropriate for users of other research infrastructures in biological and medical sciences and environmental sciences.

The e-Infrastructures activity, as a part of the Research Infrastructures programme, focuses on ICT-based infrastructures and services that cut across a broad range of user disciplines. It aims at empowering researchers with an easy and controlled online access to facilities, resources and collaboration tools, bringing to them the power of ICT for computing, connectivity, storage and instrumentation. This allows for instant access to data and remote instruments, ‘in silico’ experimentation, as well as the setup of virtual research communities (i.e., research collaborations formed across geographical, disciplinary and organizational boundaries). e-Infrastructures foster the emergence of e-Science, i.e., new working methods based on the shared use of ICT tools and resources across different disciplines and technology domains. Furthermore, e-infrastructures enable the circulation of knowledge in Europe online and therefore constitute an essential building block for the European Research Area (ERA). http://cordis.europa.eu/fp7/ict/e-infrastructure/home_en.html.

SURF left the consortium; five countries have joined: Croatia, Iceland, Luxembourg, Switzerland and Turkey.

E-mail from Eloy Rodrigues to SPARC and JISC repositories discussion lists, 17 February 2012: ‘(...)
While increased access to and visibility of scholarship are central to Open Access and both beneficial to researchers and their institutions, we, the international community of open access repositories brought together under the auspices of the Confederation of Open Access Repositories (COAR), believe that the real value of repositories lies in the potential to interconnect them and other related components of the e-research infrastructure in order to create a network of research outputs, a network that will allow for research outputs to be used and re-used by both machines and researchers. This vision for Open Access relies on interoperability between various components of the e-research infrastructure, including, but not limited to, Institutional Repositories. Furthermore, we at COAR, believe that the importance of interoperability is not only as a means of facilitating discovery and access to content, but as a mechanism for developing and implementing new services on top of this e-research infrastructure. Google Scholar is one such service, but there are many other emerging services and not-yet-implemented ideas which will have implications for repositories.’

http://www.ischools.org/site/about_icaucus/, members include I-Schools at Univ. of California, Carnegie Mellon, Humboldt, Copenhagen etc.