Research data management (RDM) refers to the practices aimed at organising, storing, documenting and preserving data that is generated and collected during the research data lifecycle. The research data lifecycle identifies the stages that research data goes through before, during and after a research project. While the specific stages may vary, they generally include data collection, data analysis and interpretation, publication and sharing, and preservation and archiving. Proper research data management is crucial at every stage, as it helps ensure that data is handled efficiently and responsibly throughout a research project.
Within this context, the term “research data” covers the diverse set of information, knowledge and results generated by, and at the same time supporting, research projects in different scientific fields. In recent years, research data has taken on increasing importance because it eases the transition to open science. The implementation of good research data management practices is therefore key to ensuring that research results can be shared and reused by the greater research community.
Research data is also acquiring more importance in the wider discussion around digitalisation. Equipping future graduates, researchers and society at large with the skills needed to support the digital transition is becoming a priority on European, national and institutional agendas. Research data management and FAIR data are part of this skillset, and research data processing and management careers are increasingly in demand in both the academic and private sectors.
A recent EUA survey exploring universities’ innovation capacity and their role in supporting the digital transition shows that research data management and staff uptake of digital skills are two important challenges for universities pursuing the digital transition through their innovation activities. The survey also highlights how digitalisation is being largely implemented through universities’ research activities. The right infrastructure and skills will therefore be key enablers that help universities in their digital transformation journey.
The availability of data infrastructure at the institutional level is crucial to enable and support the digital transformation of universities. Data infrastructure allows for the storage, management, processing and sharing of the large volume of data that universities produce and work with daily. It facilitates learning and teaching activities (e.g., infrastructure connected to online learning platforms and management systems), research activities (e.g., data storage and data infrastructure), and services to students, researchers and academic staff.
In 2022, EUA published a follow-up report to the main EUA 2020-2021 Open Science Survey report looking at the research data practices of universities in Europe. The follow-up report investigated how the research data landscape is changing in the higher education sector, as FAIR research data and research data management are increasingly ranking higher in the strategic agenda of universities.
Regarding the infrastructure needed to conduct the research activities of universities, most of the surveyed institutions confirmed having some type of research data infrastructure, including infrastructures for data storage and data repositories. At the same time, the landscape presented in the report appeared to be quite diverse: universities seemed to rely less on purely internal infrastructure and more on external or shared infrastructure, or on a combination of all three.
Developing new and/or enhancing current infrastructure also ranks highly among institutional actions to support the transition to open science. In some cases, support is also coming from the national level, with the creation of ad-hoc initiatives and funding opportunities to enhance the availability of data infrastructure. This is the case in Germany, where the National Research Data Infrastructure (NFDI) was established to systematically manage scientific and research data, provide long-term data storage, backup and accessibility, and network the data both nationally and internationally. Furthermore, the Netherlands proposes a publicly accessible data repository platform, open to researchers from research-performing institutions, to deposit and share research data openly with anyone.
At the European and national levels, the landscape of data (e-)infrastructure is quite diverse. (E-)infrastructure can provide discipline-agnostic digital services to universities and research-performing organisations or discipline-specific support in areas where this may be required, such as research data management. Examples of data (e-)infrastructure include OpenAIRE, GÉANT, DARIAH, CLARIN and CESSDA.
Data infrastructure is also thriving at the international level, offering services for domain-specific research such as crystallography, particle physics, astronomy, bioinformatics and genetics. Furthermore, the Research Data Alliance (RDA) offers international research data experts an important forum for discussion and exchange on several topics related to data sharing and reuse.
As universities continue to invest in the development of data infrastructure, questions about the ownership of the data are also entering into the policy discourse related to the digital transformation of universities and open science. Universities are increasingly recognising the need to reclaim and keep control of the data generated by their educational and research activities in their relations with commercial providers and scholarly publishers. This need has led to new initiatives aimed at developing different and mixed solutions to streamline the hosting and keep control of the data produced by universities. An example of this approach comes from France, where regional data centres are being developed and managed by universities established in the same region.
See also digital sovereignty.
At the same time, new European digital regulations such as the Digital Services Act will set obligations and requirements for online platforms. A recent study showed that, as the legislation was drafted with economic services in mind and failed to clearly state that only for-profit services are subject to its provisions, universities’ research and educational data repositories might also have to comply with the requirements set for intermediary services and online platforms. This will lead to much legal uncertainty for universities, as well as new costs and administrative burdens to manage internal or shared data infrastructure.
With the growing relevance of data infrastructure, universities are paying renewed attention to ensuring the availability of the institutional staff needed to manage its correct use. The last decade has seen the emergence of new data career profiles, such as data stewards, research data officers/managers, and data administrators.
However, the lack of a common framework and clear career development path is hindering the recognition and professionalisation of data science and data stewardship job profiles. Responsibilities related to research data management are still largely falling to existing members of the staff, usually working in university libraries and IT teams, who, in the absence of adequate upskilling and reskilling opportunities, may not be able to offer researchers the right support and guidance. Universities should rather create dedicated research data support services and hire for specific data support roles, which are crucial to performing good research data management and complying with requirements to access national and European funding opportunities.
More actions should also be taken to promote a FAIR research data management culture at different institutional levels. This includes promoting the uptake of data-related skills among students and researchers and achieving the reform of research assessment systems. Currently, research data practices still rank low among the indicators used to assess research careers, and quantitative indicators, such as the Journal Impact Factor, continue to be the main evaluation practice in academic assessment, as highlighted in a EUA report on academic assessment. This approach fails to consider the diversity of outputs currently resulting from the research process, including data, protocols, algorithms and software, and the important work needed to ensure these are ready to be shared and reused. As a result, researchers still tend to perceive data management recommendations and requirements as an extra burden, rather than as a practice that helps promote the integrity and visibility of their work.
See also Skills and support.
In the wider European context, the uptake of research data management skills and practices is a key step in the roll-out of the European Open Science Cloud (EOSC). EOSC aims to provide a shared ecosystem by federating existing research data (e-)infrastructure. Its ambition is to equip European researchers, innovators, companies and citizens with a federated and open, cross-border and multi-disciplinary data space (or data commons) where they can publish, find and reuse data, tools and services for research, innovation and educational purposes. In this sense, EOSC should be considered as an overarching transverse European Data Space for research, implemented as orthogonal and supplementary to the thematic common European Data Spaces.
The establishment of EOSC will require a joint effort between different stakeholders and scientific communities, including universities, repositories and research infrastructure. EOSC will offer them a common platform for seamless data discovery, sharing, re-use and analysis. However, the implementation of EOSC is closely linked with the broader adoption of research data management practices at institutional level, which can only be supported by the availability of data infrastructure, technical skills and a clear policy framework for the management and sharing of data.
As shown in the 2020-2021 EUA Open Science Survey report, while universities are aware of the possible benefits brought by EOSC, they are still deciding whether to link their infrastructure to the EOSC ecosystem. This is due to several challenges, such as the issues of interoperability of services, limited institutional capacity and low awareness of what the process will entail. Due in large measure to its complex administration and unclear technical implementation, researchers still struggle to understand how EOSC will contribute to their work and how to engage with its development.
As knowledge-provider institutions and research-performing organisations, universities are both key enablers and major beneficiaries of the EOSC roll-out. Regional, national and European support will therefore be needed to address university needs and concerns, ensuring the engagement of the whole university community with EOSC.