Author: Henri Ala-Lahti (FSD), team member of the Data and Metadata Interoperability Hub

SSHOC aims to provide common services, integrate existing ones, and increase cooperation between the federated research infrastructures within the project. One challenge in creating common services is the diversity and varying practices of different fields of science. This can be seen in the work for SSHOC’s Data and Metadata Interoperability Hub, where we map interoperability issues and solutions across SSHOC member organisations. This article highlights the main findings from the recently published Report on SSHOC (meta)data interoperability problems.

Mapping SSHOC (meta)data interoperability problems

The goal of the report was to find out what kind of interoperability problems there are for research data and metadata in SSHOC member organisations. We also tried to find metadata and data standards and formats that can be recommended for all organisations.

We interviewed 16 people from six research infrastructures and four fields: social sciences, language sciences, arts and humanities, and heritage sciences. The interview findings were supplemented with desk research on the number of records and data formats on the websites of repositories.

This was not the first time interoperability was assessed. The research infrastructures participating in SSHOC have each developed common practices and standards. This time, however, more organisations from different research infrastructures were involved.

 

Different data but similar problems

When interviewing the informants, it quickly became clear that the data formats used in the fields vary a great deal. In social sciences, the data are often in the form of a data matrix or text, while language sciences use a lot of text and voice data. Images are common in the humanities, and in heritage sciences, a dataset can consist of objects or 3D models, among others. Despite the differences in data types, the interoperability problems were similar in all participating organisations. 

  • The most common of these included the conversion needs related to the use of proprietary file formats, loss of information caused by conversions and problems with format versions.
  • There is also a great deal of variety between and within the fields in terms of metadata standards. For instance, the DDI standard often used in social sciences is usually not extensive enough to describe data in the humanities and hardly suitable at all to record the metadata of heritage objects. The metadata needs of individual organisations also vary significantly.
  • The most common metadata interoperability problems had to do with differing interpretations of metadata concepts, incompatibility of older standards with newer ones, and loss of information when converting from a rich metadata format to a less rich one.

 

All in all, fewer interoperability problems were reported than we expected. One reason for this might be that organisations have reacted to the problems by developing their practices around them. On the other hand, we also observed differences between the organisations in their maturity levels in terms of how well they take interoperability into consideration.

 

One size does not fit all

The report confirmed what was noticed in previous interoperability projects: there is no single metadata standard or data format suitable for all fields and situations. A common metadata standard, for example, has to be very bare-bones to be suitable for all fields.

  • To ensure the F and I (findability and interoperability) of the FAIR principles, we ended up recommending Dublin Core and a slightly modified DataCite as common metadata standards for all members. We recommend that any other standards used by the organisations can at least be converted into one of these.
  • Because the fields and organisations have different needs, we also made separate recommendations for metadata standards for each community and data formats by data type. The recommendations are based on the most used standards and formats in the communities.
  • Whichever standard and format organisations use, we recommend documenting it transparently. Many interoperability problems are avoided with thorough documentation.
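
As a rough illustration of the conversion recommendation above, the following Python sketch maps a richer metadata record onto Dublin Core elements and reports which fields have no counterpart. The source field names, the mapping table, and the function name `to_dublin_core` are hypothetical and are not taken from the report; only the target element names (title, creator, description, subject, date, identifier) come from Dublin Core.

```python
# Hypothetical crosswalk: the source field names on the left are illustrative
# only; the targets on the right are Dublin Core element names.
CROSSWALK = {
    "study_title": "title",
    "principal_investigator": "creator",
    "abstract": "description",
    "keywords": "subject",
    "collection_date": "date",
    "persistent_id": "identifier",
}


def to_dublin_core(record, crosswalk=CROSSWALK):
    """Convert a record to Dublin Core and list the fields that were lost."""
    converted = {}
    dropped = []
    for field, value in record.items():
        target = crosswalk.get(field)
        if target is None:
            # No Dublin Core counterpart in this mapping: information is lost.
            dropped.append(field)
            continue
        values = value if isinstance(value, list) else [value]
        converted.setdefault(target, []).extend(values)
    return converted, sorted(dropped)


# Made-up example record for demonstration purposes.
rich_record = {
    "study_title": "Example survey",
    "principal_investigator": "Doe, Jane",
    "keywords": ["attitudes", "values"],
    "sampling_procedure": "Stratified random sample",
    "weighting": "Post-stratification weights",
}

dc_record, lost_fields = to_dublin_core(rich_record)
print(dc_record)
print("Fields without a Dublin Core counterpart:", lost_fields)
```

Listing the dropped fields alongside the converted record is one way to document a conversion transparently, so that users of the simpler record know what the richer original contained.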

 

Common solutions through cooperation in SSHOC

The report on SSHOC (meta)data interoperability problems shows that developing common services is not always straightforward. Different communities have not only their own needs but also their own established practices. However, the purpose of the project is not to fit everyone into the same mould but to create tools and services beneficial for everyone. Cooperating and gaining perspective on the processes and challenges of other communities and organisations is also useful because the problems encountered are often similar, and someone might have already found a solution that works for everyone. The next objective of the Data and Metadata Interoperability Hub is to chart solutions to metadata and data interoperability problems.

 

Want to know more?

Read the full report here: SSHOC D3.1 Report on SSHOC (meta)data interoperability problems

Read the original article on FSD here: https://tietoarkistoblogi.blogspot.com/2019/08/sshoc-project-charted-metadata.html

Tell us about your SSH (meta)data interoperability needs. Join us at the "EOSC Services, Collaborations and the RDA" discussion. Register here.