The NFDI4Ing Metadata Profile Service

The NFDI4Ing Metadata Profile Service is a platform that facilitates creation, curation and sharing of metadata profiles. The profiles are based on the W3C recommendation SHACL and can be created in a graphical user interface by selecting suitable terms from existing ontologies.

Screenshot of the Metadata Profile Service GUI

The description of research data by means of structured metadata is essential for the subsequent use and efficient application of the data. At the same time, the description must be specific, flexible, and interoperable. Metadata schemas or ontologies alone can only help to a limited extent, since the required reusability limits their specificity.

Flexible & RDF-compliant profiles
Within the context of NFDI4Ing, we have developed an approach to create subject-specific RDF-compliant metadata profiles (in the sense of SHACL shapes) that enable precise and flexible documentation of research processes and data. Profiles inherit hierarchically from one another, and complex setups are modelled by combining relatively simple modular profiles. As a result, individual profiles are highly reusable across different contexts, which increases the interoperability of both the profiles and the resulting data. In addition, the approach gets by with ontology terms of low and medium specificity, which can be assumed to be available for many disciplines, and still achieves sufficient precision for typical scientific applications through their context-specific use and combination.
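The inheritance idea can be illustrated with a minimal Python sketch. It is not the service's data model: the Profile class, the profile names, and the ex:maxForce term are hypothetical, and a profile is reduced to a mapping from term to expected datatype. A child profile only declares what it adds, and its effective set of properties is collected along the chain of parents.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Profile:
    """Simplified, hypothetical stand-in for a metadata profile."""
    name: str
    # Maps a term (prefixed name) to the expected datatype of its values.
    properties: Dict[str, str] = field(default_factory=dict)
    parent: Optional["Profile"] = None

    def effective_properties(self) -> Dict[str, str]:
        """Merge properties from the root ancestor down to this profile.

        Entries declared closer to this profile override inherited ones,
        so a child can refine what its parent declares.
        """
        inherited = self.parent.effective_properties() if self.parent else {}
        return {**inherited, **self.properties}


# A generic parent profile built from widely available terms.
experiment = Profile(
    "Experiment",
    {"dcterms:title": "xsd:string", "dcterms:created": "xsd:date"},
)

# A more specific child profile that only declares its additions.
tensile_test = Profile(
    "TensileTest",
    {"ex:maxForce": "xsd:decimal"},  # ex: is a made-up namespace
    parent=experiment,
)

effective = tensile_test.effective_properties()
print(sorted(effective))
```

The child profile stays small and the parent remains reusable for other kinds of experiments, which is the property the approach relies on.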

Easy modelling with GUI
To facilitate the modelling process and make it accessible to users with only limited knowledge of ontologies, we have developed a web service that provides a graphical user interface for creating metadata profiles. The platform is available at https://profiles.nfdi4ing.de. Its frontend lets users search for suitable terms in existing terminologies and add them to a profile together with restrictions on the permitted value nodes: expected datatypes, classes, or node kinds, as well as the cardinality of attributes. A profile is first created as a draft, which can be edited collaboratively by sharing a link but is otherwise not publicly visible. Once a profile is published, it can no longer be edited, but it is visible to everyone and receives a persistent ID. Published profiles can be searched and are indexed according to a metadata schema that records, for example, title, description, author, creation date, and scientific domain. Existing profiles can be used to derive new profiles, typically by creating more specific child profiles.
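To make the restrictions mentioned above more concrete, the following Python sketch renders a small profile as a SHACL node shape in Turtle. It only illustrates what such a profile might look like and is not the output format or code of the service. The helper names (PropertyConstraint, to_turtle), the ex: namespace, and the example terms are assumptions; the sh: terms themselves (sh:datatype, sh:class, sh:nodeKind, sh:minCount, sh:maxCount) come from the SHACL recommendation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PropertyConstraint:
    """One attribute of a profile plus restrictions on its value nodes."""
    path: str
    datatype: Optional[str] = None
    node_class: Optional[str] = None
    node_kind: Optional[str] = None
    min_count: Optional[int] = None
    max_count: Optional[int] = None


def to_turtle(shape: str, target_class: str,
              constraints: List[PropertyConstraint]) -> str:
    """Render a node shape with one sh:property block per constraint."""
    lines = [
        "@prefix sh: <http://www.w3.org/ns/shacl#> .",
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "@prefix ex: <http://example.org/> .",  # made-up namespace
        "",
        f"{shape} a sh:NodeShape ;",
        f"    sh:targetClass {target_class} ;",
    ]
    blocks = []
    for c in constraints:
        parts = [f"sh:path {c.path}"]
        if c.datatype is not None:
            parts.append(f"sh:datatype {c.datatype}")
        if c.node_class is not None:
            parts.append(f"sh:class {c.node_class}")
        if c.node_kind is not None:
            parts.append(f"sh:nodeKind {c.node_kind}")
        if c.min_count is not None:
            parts.append(f"sh:minCount {c.min_count}")
        if c.max_count is not None:
            parts.append(f"sh:maxCount {c.max_count}")
        blocks.append("    sh:property [ " + " ; ".join(parts) + " ]")
    # Property blocks are separated by ';' and the shape is closed with '.'.
    lines.append(" ;\n".join(blocks) + " .")
    return "\n".join(lines)


turtle = to_turtle(
    "ex:SampleShape",
    "ex:Sample",
    [
        # Exactly one title, given as a string literal.
        PropertyConstraint("dcterms:title", datatype="xsd:string",
                           min_count=1, max_count=1),
        # Any number of creators, each identified by an IRI.
        PropertyConstraint("dcterms:creator", node_kind="sh:IRI"),
    ],
)
print(turtle)
```

Datatype, class, and node kind restrict what a value may be, while the minimum and maximum counts express the cardinality of an attribute.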

Share, curate & re-use
The platform also supports curation of metadata profiles. Profiles can be submitted to a scientific community for approval. Submission triggers a merge request to a GitLab group containing the community's appointed reviewers. If a profile is approved, it receives a corresponding metadata entry, so that users can find peer-reviewed profiles recommended by their community.

Users can also upload metadata corresponding to a metadata profile to the platform, either via an API or by manually entering the data in a form. The metadata will be validated and published on the platform. The stored metadata can be searched by selecting a metadata profile, which the platform translates into a form in which users can search for specific attribute values. A SPARQL endpoint is available for integration with other infrastructure.

Validation of metadata is also possible via the API without subsequent publication, allowing users to validate their local data according to a metadata profile.
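What such a validation checks can be sketched locally in plain Python. This is not the service's API, whose endpoints and request format are not described here, and it covers only cardinality and value types rather than full SHACL. The validate function, the constraint tuples, and the example record are hypothetical.

```python
from datetime import date
from typing import Any, Dict, List, Optional, Tuple

# (expected Python type, minimum count, maximum count or None for unbounded)
Constraint = Tuple[type, int, Optional[int]]


def validate(record: Dict[str, List[Any]],
             constraints: Dict[str, Constraint]) -> List[str]:
    """Return human-readable violations; an empty list means the record conforms."""
    violations = []
    for path, (expected_type, min_count, max_count) in constraints.items():
        values = record.get(path, [])
        if len(values) < min_count:
            violations.append(
                f"{path}: expected at least {min_count} value(s), "
                f"found {len(values)}"
            )
        if max_count is not None and len(values) > max_count:
            violations.append(
                f"{path}: expected at most {max_count} value(s), "
                f"found {len(values)}"
            )
        for value in values:
            if not isinstance(value, expected_type):
                violations.append(
                    f"{path}: {value!r} is not of type {expected_type.__name__}"
                )
    return violations


# Hypothetical profile: a mandatory title and date, an optional force value.
constraints: Dict[str, Constraint] = {
    "dcterms:title": (str, 1, 1),
    "dcterms:created": (date, 1, 1),
    "ex:maxForce": (float, 0, 1),
}

# The creation date is missing and the force is given as a string.
record = {"dcterms:title": ["Tensile test 7"], "ex:maxForce": ["12.5"]}

violations = validate(record, constraints)
for message in violations:
    print(message)
```

Reporting all violations at once, instead of stopping at the first one, lets users correct their local data in a single pass before deciding whether to publish it.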

Marc Fuhrmans
Benedikt Heinrichs

Tags

NFDI4Ing services may be relevant to different users according to varying requirements. To support filtering and sorting, we added a tag system indicating which archetype, phase of the data lifecycle, or degree of maturity a service corresponds to. By clicking on one of the tags, you can get an overview of all services aligned with that tag.

The tags correspond to:
The Archetypes: Services relevant to Alex – Bespoke Experiments, Betty – Research Software Engineering, Caden – Provenance Tracking, Doris – High Performance Computing, Ellen – Complex Systems, Fiona – Data Re-Use and Enrichment

The data lifecycle: Services related to Informing & Planning, Organising & Processing, Describing & Documenting, Storing & Computing, Finding & Re-Using, Learning & Teaching

The maturity of the service: Services sorted according to their maturity and the status of their integration into the larger NFDI service landscape. For this we use the Integration Readiness Level (IRL), ranging from IRL0 (no specifications, strictly internal use) up to IRL4 (fully integrated in the German research data landscape and the EOSC).