Architecture Working Group

Defining the technical framework required to enable and sustain an evolving EOSC federation of systems

Chair

Jean-François Abramatic

 

The Architecture activity proposes the technical framework required to enable and sustain an evolving EOSC federation of systems. Such a technical framework may include standards, APIs and protocols that will facilitate interoperable services delivered by diverse providers.

In order to fulfill the EOSC vision, an interoperability layer that makes it possible to build the EOSC federation of systems:

  1. has to be defined
  2. has to be agreed upon by relevant stakeholders
  3. has to be developed through an open and transparent process.

It will be built on the results delivered by the EOSC-related Horizon 2020 projects.

 

The EOSC Architecture activity will provide an in-depth independent review of the current offering and the required evolution of the EOSC technical architecture, its standards and best practices. It will propose a way forward by describing and/or defining:

  1. the EOSC core services and their interfaces
  2. the EOSC open source APIs for reuse by thematic services
  3. the EOSC portal components and federated catalogues of service offerings
  4. the EOSC data description standards
  5. any other standards and best practices necessary to ensure the evolution of EOSC and the widening of its user base to the industry and the public sectors.

Based on the consultation of stakeholders, the EOSC should be a federation of existing and planned research data infrastructures, adding a soft overlay that connects them and makes them operate as one seamless European research data infrastructure. In terms of architecture, the EOSC would essentially comprise a federating core and a variety of federated research data infrastructures committed to providing services as part of the EOSC. The groundwork for such a federated EOSC architecture was laid by several projects funded under Horizon 2020 Work Programme 2016-2017, which aim to federate data infrastructures at the European level and offer shared services (e.g. EGI, EUDAT, ELIXIR, EPOS).[1] In addition, resources were committed to examining the EOSC architecture through the EOSCpilot project.

The EOSC federating core is understood to be constituted by EOSC shared resources and by a compliance framework including notably the Rules of Participation. The Work Programme foresees developing the initial shared resources around the EOSC-hub project, the EOSC Portal and a catalogue of data infrastructures and services.

Therefore, the process of federation entails two inter-related activities:

  1. To develop shared resources as part of the federating core. In the initial phase, Horizon 2020 projects, notably EOSC-hub, will provide an access channel complementing the access mechanisms in use by different data infrastructures. A portfolio of projects (see Annex 1) will provide horizontal services such as a portal, authentication and authorisation, and security services, allowing users to access the computing, data and services of pan-European and disciplinary research data infrastructures, which already federate data infrastructures at the European level. A catalogue of EOSC services, including both thematic and generic services (for data storage, management and analytics, simulation and visualisation, distributed computing, etc.), will help researchers discover, select and use the services they need.
  2. To connect a large number of research data infrastructures (henceforth data infrastructures) to the core. The hub would relay the resources and services of data infrastructures funded at European, national and regional levels. Services and resources might be both generic and thematic. The progressive federation over time of existing service providers in the EOSC would provide a single, coherent access channel to EOSC services at European level that meets researchers’ needs for data sharing, management and computing.
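The two activities above can be pictured with a minimal Python sketch. It is an illustration only, not part of any EOSC specification: the names `Service`, `FederatingCore`, `register` and `discover`, the field names, and the sample providers are all hypothetical. The sketch assumes only what the text states, namely that the core holds a shared catalogue and that connected data infrastructures contribute generic or thematic services to it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Service:
    """One catalogue entry (hypothetical fields, for illustration only)."""
    name: str
    provider: str   # the data infrastructure offering the service
    kind: str       # "generic" or "thematic"
    category: str   # e.g. "data storage", "distributed computing"


@dataclass
class FederatingCore:
    """Activity 1: shared resources, reduced here to a service catalogue."""
    catalogue: List[Service] = field(default_factory=list)

    def register(self, service: Service) -> None:
        """Activity 2: a data infrastructure connects a service to the core."""
        if service.kind not in ("generic", "thematic"):
            raise ValueError(f"unknown service kind: {service.kind!r}")
        self.catalogue.append(service)

    def discover(self, category: str) -> List[Service]:
        """Let a researcher find services by category across all providers."""
        return [s for s in self.catalogue if s.category == category]


# Hypothetical providers and services, invented for the example.
core = FederatingCore()
core.register(Service("object-store", "infrastructure-a", "generic", "data storage"))
core.register(Service("seismic-archive", "infrastructure-b", "thematic", "data storage"))
core.register(Service("batch-compute", "infrastructure-a", "generic", "distributed computing"))

print([s.name for s in core.discover("data storage")])
```

The point of the design is that a researcher queries one catalogue while the services themselves stay with their providers, which mirrors the "single, coherent access channel" described above.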

The timeline below shows how resources of Horizon 2020 would serve this particular effort.

In addition to directly supporting the federation of ESFRI projects in the EOSC (INFRAEOSC-04-2018), WP 2018-2020 of Horizon 2020 funds specific actions in scientific areas with a tradition of research data sharing and services, such as transport, food, marine, health and earth observation; this ensures that the EOSC is fully inclusive.

The WP 2018-2020 on food security, sustainable agriculture and forestry, marine, maritime and inland water research and the bioeconomy includes two topics, one each for developing and building cloud services on food data and ocean data, in such a way that they can eventually be federated into the EOSC.[2] In health, a significant development included in WP 2018-2020 is the Health Research and Innovation Cloud (HRIC), which aims first to structure and later to establish a thematic cloud for health-related research, in strict relation with the EOSC.

The Commission also invests heavily in data regarding the planet and the environment in the Copernicus programme, the flagship space programme. Copernicus’s Data and Information Access Services (DIAS) provide access, tools and processing capabilities for scientists and innovators to exploit this data. DIAS are operated by the industry and will offer additional services in the EOSC under commercial conditions. Federating Copernicus data and DIAS added-value services into the EOSC will leverage the existing Commission investments for the benefit of multiple science and innovation communities. In line with the intervention logic of the Communication, this will reduce the burden for scientific institutes to engage in complex procurement processes, support cross-analysis of data from heterogeneous sources, create market opportunities for research data services and represent a demand-side stimulus for the commercial DIAS.

Generic services

The consultation and the available evidence show that EOSC might offer five main types of services for European researchers. While such services are currently being provided to specific scientific communities, they are limited by the contexts of disciplines, by national boundaries or by both. The EOSC would make them all available irrespective of discipline or national boundaries.

These services are:

  1. A unique identification and authentication service and an access point and routing system towards the resources of the EOSC.
  2. A protected and personalised work environment/space (e.g. logbook, settings, compliance record and pending issues).
  3. Access to relevant service information (status of the EOSC, list of federated data infrastructures, policy-related information, description of the compliance framework) and to specific guidelines (how to make data FAIR, to certify a repository or service, to procure joint services).
  4. Services to find, access, re-use and analyse research data generated by others, accessible through appropriate catalogues of datasets and data services (e.g. analytics, fusion, mining, processing).
  5. Services to make their own data FAIR, to store them and ensure long-term preservation.

The consultation process recommended providing the services under 1, 2 and 3 free of charge, as well as those under 4, except when the re-use and analysis of data involves big data or large computation power, in particular via a commercial service provider. Such cases would entail co-financing from other sources (e.g. a national or European grant). The cost model of the services described under 5 would be determined when deciding on the long-term business model for EOSC.
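The recommendation can be restated as a small decision rule. The following Python sketch is illustrative only: the enum member names and the `heavy_compute` flag are hypothetical labels for the five service types and for the "big data or large computation power" exception, and `None` stands for the cost model that the text leaves undecided.

```python
from enum import IntEnum
from typing import Optional


class ServiceType(IntEnum):
    """The five service types, numbered as in the list above."""
    IDENTIFICATION_AND_ACCESS = 1
    WORK_ENVIRONMENT = 2
    SERVICE_INFORMATION = 3
    DATA_REUSE_AND_ANALYSIS = 4
    FAIR_STORAGE_AND_PRESERVATION = 5


def is_free_of_charge(service: ServiceType,
                      heavy_compute: bool = False) -> Optional[bool]:
    """Return True or False per the recommendation, or None where undecided.

    heavy_compute is a hypothetical flag for re-use or analysis that
    involves big data or large computation power.
    """
    if service in (ServiceType.IDENTIFICATION_AND_ACCESS,
                   ServiceType.WORK_ENVIRONMENT,
                   ServiceType.SERVICE_INFORMATION):
        return True
    if service is ServiceType.DATA_REUSE_AND_ANALYSIS:
        # Free unless the exception applies, in which case co-financing
        # from other sources would be needed.
        return not heavy_compute
    # Type 5: to be settled with the long-term business model.
    return None


print(is_free_of_charge(ServiceType.DATA_REUSE_AND_ANALYSIS, heavy_compute=True))
```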

The services proposed above, which could effectively be provided under the EOSC, reflect existing offers by service providers across Europe such as EGI, EUDAT and GÉANT, and by existing research data repositories. Work to integrate and federate such services has already begun in Horizon 2020 Work Programme 2016-2017, with the EOSC-hub project and other related projects expected to deliver services under the EOSC. The projects will deliver the initial catalogue of services and data to be provided by EOSC and will define the delivery model(s) for the services. Those catalogues would be enriched periodically based on the process of federation.

Access & Interface

The consultation and the evidence gathered indicate the benefits of giving users a choice between different entry points for accessing EOSC services, rather than implementing a single access point, both for practical reasons and to ensure a smooth transition from legacy research data systems. Work on the EOSC access and interface has already begun under Horizon 2020 Work Programme 2016-2017.

The entry points to the EOSC would be similar but not equivalent, and would typically consist of a web-based user interface, or front-end, which can be tailored to the specific needs and context of particular user communities. In addition, it would comprise a common platform, building on the EOSC-hub project and further developed in the INFRAEOSC-06-2020 call a) and b), that would be accessible to users via machine-to-machine interfaces and would offer access to shared EOSC resources and to the full range of EOSC services.

Services provided under the EOSC would be made accessible via an EOSC portal, based on the work developed by the EOSC-hub and eInfraCentral projects and on further support planned in Horizon 2020. Acting as an entry point for all potential users, the portal would have a full-fledged user interface supported by the common platform. Such an entry point usually guarantees that all users have access to the full range of services, irrespective of geographical location or scientific affiliation.
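One way to picture several entry points sharing one common platform is the Python sketch below. It is an assumption-laden illustration, not the EOSC design: `CommonPlatform`, `EntryPoint`, `list_services`, `menu` and the `featured` parameter are hypothetical names, and "tailoring" is reduced here to showing a community's featured services first while still exposing the full range.

```python
from typing import Iterable, List


class CommonPlatform:
    """Shared back end exposing the full range of services to machines."""

    def __init__(self, services: Iterable[str]) -> None:
        self._services = sorted(services)

    def list_services(self) -> List[str]:
        # Stand-in for a machine-to-machine interface (hypothetical).
        return list(self._services)


class EntryPoint:
    """A front end tailored to one community but backed by the platform."""

    def __init__(self, community: str, platform: CommonPlatform,
                 featured: Iterable[str] = ()) -> None:
        self.community = community
        self._platform = platform
        self._featured = set(featured)

    def menu(self) -> List[str]:
        # Featured services come first; nothing is hidden, so every
        # entry point still reaches the full range of services.
        services = self._platform.list_services()
        return sorted(services, key=lambda s: s not in self._featured)


# Hypothetical service and community names, invented for the example.
platform = CommonPlatform(["analytics", "authentication", "storage"])
portal = EntryPoint("all users", platform)
marine = EntryPoint("marine research", platform, featured=["storage"])

print(portal.menu())
print(marine.menu())
```

Both entry points return the same set of services in a different order, which is one reading of "similar but not equivalent".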

 


[1]   The Commission funded the integration, interoperability and federation of data infrastructures in various fields and the development of horizontal data services. These actions delivered generic and thematic data services, workflows, interoperable standards and ontologies, which pave the way toward the establishment of a European integrated environment for research data. In particular, Horizon 2020 supported the development and interoperability of pan-European thematic data infrastructures, through targeted support to the implementation and operation of the ESFRI roadmap projects identified through the ESFRI prioritisation exercise. Among these, two priority ESFRI projects, EPOS (European Plate Observing System) and ELIXIR (The European Life-Science Infrastructure for Biological Information), received significant funding in the 2014-2015 WP.

[2]   The two calls are: DT-SFS-26-2019: Food Cloud demonstrators and BG-07-2019-2020: The Future of Seas and Oceans Flagship Initiative, http://ec.europa.eu/programmes/horizon2020/en/h2020-section/food-security-sustainable-agriculture-and-forestry-marine-maritime-and-inland-water



Working Group Architecture members

Architecture WG role: Architecture WG chair

Organisation: Inria - Emeritus Senior Scientist

Jean-François Abramatic

Architecture WG role: Architecture WG EOSCsecretariat Contact Point

Organisation: Jülich

Daniel Mallmann

Architecture WG role: Architecture WG EOSCsecretariat Contact Point

Organisation: Jülich

Meredith Peyser

Architecture WG role: Architecture WG EOSCsecretariat Contact Point

Organisation: Jülich

Eleonora Epp

Architecture WG role: EC Contact Point

Organisation: DG Research & Innovation (DG RTD) - European Commission - Policy Officer - European Open Science Cloud and Open Science policies

Corina Pascu

Architecture WG role: EC Contact Point

Organisation: Directorate-General Research and Innovation (DG RTD) of the European Commission - Policy Officer at the Open Science Unit

Thomas Neidenmark

Architecture WG role: EC Contact Point

Organisation: European Commission

Georgia Tzenou

Architecture WG role: EC Contact Point

Organisation: European Commission

Enrique Gomez

Architecture WG role: EC Contact Point

Organisation: European Commission

Christian Cuciniello

Architecture WG role: EC Contact Point

Organisation: European Commission

Carlos Casorran

Organisation: ICOS-ERIC (representing ENVRI-FAIR cluster project) - Director

Alex Vermeulen

Organisation: Academic and Research network of Slovenia

Avgust Jauk

Organisation: STFC - Head of the Data Science and Technology Group, in the Scientific Computing Department of the Science and Technology Facilities Council

Brian Matthews

Organisation: Institute of Physics Belgrade

Dusan Vudgragovic

Organisation: Wageningen University & Research - Infrastructure Coordinator of the Wageningen Data Competence Center

Erik van den Bergh

Organisation: UNINETT Sigma2 AS

Francesca Iozzi

Organisation: ENEA

Francesco Iannone

Organisation: VIB - Programme Manager/Project Manager, Head of Node ELIXIR Belgium

Frederik Coppens

Organisation: ATHENA Research center & University of Athens - Research associate

George Papastefanatos

Organisation: SURFsara

Hylke Koers

Organisation: Universidad Complutense de Madrid - Head of the Data-intensive Cloud Lab

Ignacio Martín Llorente

Organisation: ETAIS / University of Tartu - IT expert specialising in cloud solutions

Ilja Livenson

Organisation: PaNOSC

Jean Francois Perrin

Organisation: CNRS - Head of the SCIGNE facility at the Hubert Curien Pluridisciplinary Institute

Jérôme Pansanel

Organisation: LIP

Jorge Gomes

Organisation: JNP - founding partner

Jorge Sanchez

Organisation: Aalborg University

Josva Kleist

Organisation: ESCAPE Cluster - General manager and senior staff researcher

Kay Graf

Organisation: Malta Information Technology Agency - Consultant

Keith Aquilina

Organisation: GÉANT - Chief Community Support Officer

Klaas Wierenga

Organisation: Swedish Research Council, SUNET

Leif Johansson

Organisation: Vilnius University - Head of IT Development Department at Vilnius University IT Service centre

Levaldas Zigmantas

Organisation: Poznan Supercomputing and Networking Center (PSNC) - Head of IoT Systems department

Marcin Płóciennik

Organisation: SURFsara - System architect, data preservation services team

Mark van de Sanden

Organisation: DataCite

Martin Fenner

Organisation: CNR - ISTI - (PhD) Researcher in computer science

Paolo Manghi

Organisation: MPG - Max Planck Computing and Data Facility (MPCDF) - Head of Data Division

Raphael Ritz

Organisation: Indiana University

Robert Quick

Organisation: Department of Computer Architecture at UCM and OpenNebula

Ruben Santiago

Organisation: Swiss National Supercomputing Centre (CSCS) - ETH Zurich - Chief Technology Officer (CTO)

Sadaf Alam

Organisation: CERN - Senior staff member

Simone Campana

Organisation: Athena Research Center (ARC) - Project Manager and Research Associate

Spiros Athanasiou

Organisation: EMBL-EBI

Steven Newhouse

Organisation: OPERAS/TGIR Huma-Num (CNRS) - Research engineer

Suzanne Dumouchel

Organisation: Deutsches Klimarechenzentrum (DKRZ)

Tobias Weigel

Organisation: INFN (Italian National Institute for Nuclear Physics) - Senior Researcher

Tommaso Boccali

Organisation: Scientific Computing Department - STFC

Vasily Bunakov

Organisation: University of Helsinki - Team leader of the IT for Science group, coordinator of the Data Support service, data manager, technology architect, project manager and Lean coach

Ville Tenhunen

Organisation: ExPaNDS

Volker Gulzow

Organisation: Instituut voor Informatica UvA / ENVRI-FAIR

Zhiming Zhao