Defining the technical framework required to enable and sustain an evolving EOSC federation of systems
The Architecture activity proposes the technical framework required to enable and sustain an evolving EOSC federation of systems. Such a technical framework may include standards, APIs and protocols that will facilitate interoperable services delivered by diverse providers.
To fulfil the EOSC vision, an interoperability layer that makes it possible to build the EOSC federation of systems
- has to be defined
- has to be agreed upon by relevant stakeholders
- has to be developed through an open and transparent process.
It will be built on the results delivered by the EOSC-related Horizon 2020 projects.
The EOSC Architecture activity will provide an in-depth independent review of the current offering and the required evolution of the EOSC technical architecture, its standards and best practices. It will propose a way forward by describing and/or defining:
- the EOSC core services and their interfaces
- the EOSC open source APIs for reuse by thematic services
- the EOSC portal components and federated catalogues of service offerings
- the EOSC data description standards
- any other standards and best practices necessary to ensure the evolution of EOSC and the widening of its user base to the industry and the public sectors.
Based on the consultation of stakeholders, the EOSC should be a federation of existing and planned research data infrastructures, adding a soft overlay to connect them and making them operate as one seamless European research data infrastructure. In terms of architecture, the EOSC would essentially comprise a federating core and a variety of federated research data infrastructures committed to providing services as part of the EOSC. The groundwork for such a federated EOSC architecture was laid by several projects funded under Horizon 2020 Work Programme 2016-2017, which aim to federate data infrastructures at the European level and to offer shared services (e.g. EGI, EUDAT, ELIXIR, EPOS). In addition, resources were committed to examining the EOSC architecture through the EOSCPilot project.
The EOSC federating core is understood to be constituted by EOSC shared resources and by a compliance framework including notably the Rules of Participation. The Work Programme foresees developing the initial shared resources around the EOSC-hub project, the EOSC Portal and a catalogue of data infrastructures and services.
Therefore, the process of federation entails two inter-related activities:
- To develop shared resources as part of the federating core. In the initial phase, Horizon 2020 projects, notably EOSC-hub, will provide an access channel complementing the access mechanisms in use by different data infrastructures. A portfolio of projects (see Annex 1) will provide horizontal services such as a portal, authentication and authorisation, and security services, allowing users to access the computing, data and services of pan-European and disciplinary research data infrastructures, which already federate data infrastructures at the European level. A catalogue of EOSC services, including both thematic and generic services (for data storage, management and analytics, simulation and visualisation, distributed computing, etc.), will help researchers discover, select and use the services they need.
- To connect to the core a large number of research data infrastructures (henceforth data infrastructures). The hub would relay the resources and the services of data infrastructures funded at European, national and regional levels. Services and resources might be both generic and thematic. The progressive federation over time of existing service providers in the EOSC would provide a single, coherent access channel to EOSC services at European level that meets researchers’ needs for data sharing, management and computing.
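The two activities above can be illustrated with a minimal sketch in Python. It is not an EOSC specification: the class names, field names and service names are hypothetical, and only the provider names (EUDAT, ELIXIR) come from the text. The sketch models a federating core that holds a shared catalogue, lets data infrastructures connect to it, and lets researchers discover the relayed services.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Service:
    """One catalogue entry. Field names are illustrative assumptions."""
    name: str
    provider: str   # the data infrastructure offering the service
    kind: str       # "generic" or "thematic"
    category: str   # e.g. "data storage", "analytics"


@dataclass
class FederatingCore:
    """Hypothetical model of the federating core and its service catalogue."""
    catalogue: list = field(default_factory=list)
    members: set = field(default_factory=set)

    def connect(self, infrastructure, services):
        """Federate one data infrastructure and relay its services."""
        self.members.add(infrastructure)
        for name, kind, category in services:
            self.catalogue.append(Service(name, infrastructure, kind, category))

    def discover(self, kind=None, category=None):
        """Return the catalogue entries matching the optional filters."""
        return [
            s for s in self.catalogue
            if (kind is None or s.kind == kind)
            and (category is None or s.category == category)
        ]


core = FederatingCore()
# Providers are named in the text; the service names are invented placeholders.
core.connect("EUDAT", [("example-storage", "generic", "data storage")])
core.connect("ELIXIR", [("example-bio-analytics", "thematic", "analytics")])

generic_names = [s.name for s in core.discover(kind="generic")]
print(generic_names)
```

In this sketch, federation is progressive by construction: each call to `connect` enlarges the catalogue without changing how existing entries are discovered.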
The timeline below shows how resources of Horizon 2020 would serve this particular effort.
In addition to directly supporting the federation of ESFRI projects in the EOSC (INFRAEOSC-04-2018), WP 2018-2020 of Horizon 2020 funds specific actions in scientific areas with a tradition of research data sharing and services, such as transport, food, marine, health and earth observation; this ensures that the EOSC is fully inclusive.
The WP 2018-2020 on food security, sustainable agriculture and forestry, marine, maritime and inland water research and the bioeconomy includes two topics, one each for developing and building cloud services on food data and ocean data, in such a way that they can eventually be federated into the EOSC. In health, a significant development included in WP 2018-2020 is the Health Research and Innovation Cloud (HRIC), which aims first to structure and later to establish a thematic cloud for health-related research, in close relation with the EOSC.
The Commission also invests heavily in data regarding the planet and the environment in the Copernicus programme, the flagship space programme. Copernicus’s Data and Information Access Services (DIAS) provide access, tools and processing capabilities for scientists and innovators to exploit this data. DIAS are operated by the industry and will offer additional services in the EOSC under commercial conditions. Federating Copernicus data and DIAS added-value services into the EOSC will leverage the existing Commission investments for the benefit of multiple science and innovation communities. In line with the intervention logic of the Communication, this will reduce the burden for scientific institutes to engage in complex procurement processes, support cross-analysis of data from heterogeneous sources, create market opportunities for research data services and represent a demand-side stimulus for the commercial DIAS.
The consultation and the available evidence show that EOSC might offer five main types of services for European researchers. While such services are currently being provided to specific scientific communities, they are limited by the contexts of disciplines, by national boundaries or by both. The EOSC would make them all available irrespective of discipline or national boundaries.
These services are:
1. A unique identification and authentication service and an access point and routing system towards the resources of the EOSC.
2. A protected and personalised work environment/space (e.g. logbook, settings, compliance record and pending issues).
3. Access to relevant service information (status of the EOSC, list of federated data infrastructures, policy-related information, description of the compliance framework) and to specific guidelines (how to make data FAIR, to certify a repository or service, to procure joint services).
4. Services to find, access, re-use and analyse research data generated by others, accessible through appropriate catalogues of datasets and data services (e.g. analytics, fusion, mining, processing).
5. Services to make their own data FAIR, to store them and ensure long-term preservation.
The consultation process recommended that the services under 1, 2 and 3 be provided free of charge, as should those under 4, except when the re-use and analysis of data involves big data or large computation power, in particular via a commercial service provider. Such cases would entail co-financing from other sources (e.g. a national or European grant). The cost model of the services described under 5 would be determined when deciding on the long-term business model for EOSC.
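The recommendation on charging can be read as a simple decision rule. The following Python sketch encodes it under stated assumptions: the service types are numbered 1 to 5 in the order in which the list above presents them, and the function name, the `heavy_compute` flag and the returned labels are hypothetical, introduced only for illustration.

```python
def cost_model(service_type, heavy_compute=False):
    """Return the recommended charging outcome for one service type.

    service_type: 1..5, following the order of the list of services.
    heavy_compute: True when re-use and analysis involves big data or
        large computation power (only relevant for type 4).
    """
    if service_type in (1, 2, 3):
        return "free"
    if service_type == 4:
        # Free by default; heavy workloads need co-financing from
        # other sources, e.g. a national or European grant.
        return "co-financed" if heavy_compute else "free"
    if service_type == 5:
        # Left open until the long-term business model is decided.
        return "undecided"
    raise ValueError(f"unknown service type: {service_type}")


for number in range(1, 6):
    print(number, cost_model(number))
print(4, cost_model(4, heavy_compute=True))
```

The sketch treats heavy computation as a single yes/no flag; the text does not define a threshold, so any real rule would need one.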
The services proposed above that could effectively be provided under the EOSC reflect existing offers by service providers across Europe such as EGI, EUDAT and GEANT, and by existing research data repositories. Work to integrate and federate such services has already begun in Horizon 2020 Work Programme 2016-2017, with the EOSC-hub project and other related projects expected to deliver services under the EOSC. The projects will deliver the initial catalogue of services and data to be provided by EOSC and will define the delivery model(s) for the services. Those catalogues would be enriched periodically based on the process of federation.
Access & Interface
The consultation and the evidence gathered indicate that users would benefit from a choice between different entry points for accessing EOSC services, rather than a single access point, both for practical reasons and to ensure a smooth transition from legacy research data systems. Work on the EOSC access and interface has already begun under Horizon 2020 Work Programme 2016-2017.
The entry points to the EOSC would be similar but not equivalent, and would typically consist of a web-based user interface, or front-end, which can be tailored to the specific needs and context of particular user communities. In addition, it would comprise a common platform, building on the EOSC-hub project and further developed in the INFRAEOSC-06-2020 call a) and b), that would be accessible to users via machine-to-machine interfaces and would offer access to shared EOSC resources and to the full range of EOSC services.
Services provided under the EOSC would be made accessible via an EOSC portal, based on the work developed by the EOSC-hub and eInfra Central projects and further support planned in Horizon 2020. Acting as an entry point for all potential users, the portal would have a full-fledged user interface supported by the common platform. Such an entry point usually guarantees that all users have access to the full range of services, irrespective of geographical location or scientific affiliation.
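The relationship between entry points and the common platform can be sketched as follows. This is an illustrative Python model only: all class, method and service names are hypothetical, and the rule used to tailor a community front-end (generic services plus those aimed at that community) is an assumption, not something the text prescribes.

```python
class CommonPlatform:
    """Hypothetical machine-to-machine layer shared by every entry point."""

    def __init__(self, services):
        # Maps a service name to the set of communities it targets;
        # an empty set marks a generic service.
        self._services = dict(services)

    def list_services(self):
        """Full range of services, in a stable order."""
        return sorted(self._services)

    def communities_of(self, name):
        return self._services[name]


class CommunityFrontEnd:
    """Entry point tailored to one user community."""

    def __init__(self, platform, community):
        self.platform = platform
        self.community = community

    def featured(self):
        # Highlight generic services and those aimed at this community.
        # The full range stays reachable through the platform itself.
        return [
            name for name in self.platform.list_services()
            if not self.platform.communities_of(name)
            or self.community in self.platform.communities_of(name)
        ]


class Portal:
    """Entry point for all users: presents the full range of services."""

    def __init__(self, platform):
        self.platform = platform

    def featured(self):
        return self.platform.list_services()


platform = CommonPlatform({
    "example-storage": set(),                    # generic
    "example-ocean-viewer": {"marine"},          # thematic
    "example-genome-search": {"life sciences"},  # thematic
})
marine_view = CommunityFrontEnd(platform, "marine").featured()
portal_view = Portal(platform).featured()
print(marine_view)
print(portal_view)
```

Both entry points read from the same platform object, which is one way to picture entry points that are "similar but not equivalent": they differ in presentation, not in the underlying resources.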
- AAI - Authentication and Authorisation Infrastructure
- PID - Persistent Identifier
- SIRS - The task force Scholarly Infrastructure for Research Software (SIRS) is compiling an inventory of the current operational infrastructures across Europe and comparing their scope and approach. This inventory will make it possible to draw lessons and to establish the rationale for new initiatives that allow EOSC to include software among its research artefacts, next to publications and data.
The Commission funded the integration, interoperability and federation of data infrastructures in various fields and the development of horizontal data services. These actions delivered generic and thematic data services, workflows, interoperable standards and ontologies, which pave the way toward the establishment of a European integrated environment for research data. In particular, Horizon 2020 supported the development and interoperability of pan-European thematic data infrastructures, through targeted support to the implementation and operation of the ESFRI roadmap projects identified through the ESFRI prioritisation exercise. Among these, two priority ESFRI projects, EPOS (European Plate Observing System) and ELIXIR (The European Life-Science Infrastructure for Biological Information), received significant funding in the 2014-2015 WP.