DataDesc - NFDI4ING

Link to the service

https://github.com/FZJ-IEK3-VSA/DataDesc

Detailed description of the service

Service’s Capabilities:

In addition to capturing general information relevant to the scientific context, DataDesc aims in particular at the detailed documentation of software interfaces. Its programming-language-agnostic schema makes it possible to describe each function of an interface, and each of its input and output parameters, individually. Capturing this information in structured form enables automated processing and increases findability and comparability. Mapping it as machine-actionable metadata allows both humans and computers to discover and understand the capabilities of a piece of software, interact with it, and integrate it with other programs and data without having to refer to the source code or further documentation.
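To illustrate what describing functions and parameters individually can look like, here is a minimal Python sketch. It is not the DataDesc schema: the class names `Parameter` and `FunctionDescription`, their fields, and the helper `find_functions_accepting` are assumptions made up for this example. It only shows how a structured, serialisable description lets a program query an interface without reading its source code.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Parameter:
    """One input or output parameter (hypothetical fields, not the DataDesc schema)."""
    name: str
    data_type: str
    unit: str = ""
    description: str = ""


@dataclass
class FunctionDescription:
    """One function of a software interface, described independently of its code."""
    name: str
    description: str
    inputs: List[Parameter] = field(default_factory=list)
    outputs: List[Parameter] = field(default_factory=list)


def to_json(function: FunctionDescription) -> str:
    """Serialise the description so that other tools can process it."""
    return json.dumps(asdict(function), indent=2)


def find_functions_accepting(functions: List[FunctionDescription], data_type: str) -> List[str]:
    """Return the names of all functions with at least one input of the given type."""
    return [
        f.name
        for f in functions
        if any(p.data_type == data_type for p in f.inputs)
    ]


# Illustrative description of an invented function.
energy = FunctionDescription(
    name="annual_energy",
    description="Energy produced at constant power over a number of hours.",
    inputs=[
        Parameter("power_kw", "float", unit="kW"),
        Parameter("hours", "float", unit="h"),
    ],
    outputs=[Parameter("energy_kwh", "float", unit="kWh")],
)

print(to_json(energy))
print(find_functions_accepting([energy], "float"))
```

Because each parameter is a separate record with its own type and unit, a search such as `find_functions_accepting` works on the metadata alone.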

Added Value for the User:

Software metadata schemas often concentrate on the general provision of information and neglect the description of interfaces, which results in problems for downstream users in the subsequent application and integration of the software. DataDesc is designed to capture precisely this information.

To use different software publication platforms in parallel and exploit their respective strengths to increase the impact and transparency of software, metadata often has to be collected redundantly and adapted to heterogeneous formats and processes. DataDesc offers a machine-processable, programming-language-independent exchange format and automated publication pipelines that allow metadata to be collected only once, thus reducing the documentation effort.
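The collect-once idea can be sketched as a single metadata record passed through one converter per target format. The sketch below is an assumption-laden illustration, not DataDesc's pipeline: the record layout, the function names and the example software are invented, and the source does not name the supported platforms. The two targets use a few field names from the CodeMeta vocabulary and from Zenodo's deposit metadata, and the resulting dictionaries are not complete, valid submissions.

```python
# Hypothetical record, collected once (all values are invented).
record = {
    "name": "example-tool",
    "description": "Illustrative research software.",
    "version": "1.0.0",
    "repository": "https://example.org/example-tool",
    "authors": ["Doe, Jane"],
}


def to_codemeta(rec: dict) -> dict:
    """Map the record onto a small subset of CodeMeta terms."""
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": rec["name"],
        "description": rec["description"],
        "version": rec["version"],
        "codeRepository": rec["repository"],
    }


def to_zenodo(rec: dict) -> dict:
    """Map the record onto a small subset of Zenodo deposit metadata."""
    return {
        "metadata": {
            "title": rec["name"],
            "upload_type": "software",
            "description": rec["description"],
            "version": rec["version"],
            "creators": [{"name": author} for author in rec["authors"]],
        }
    }


# One converter per target; adding a platform means adding one entry here.
EXPORTERS = {"codemeta": to_codemeta, "zenodo": to_zenodo}


def export_all(rec: dict) -> dict:
    """Convert the single record into every registered target format."""
    return {target: convert(rec) for target, convert in EXPORTERS.items()}


exports = export_all(record)
print(sorted(exports))
```

Keeping the converters in a registry means the record itself never needs to be re-entered when a further platform is added.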

Service’s Suitability:

DataDesc is ideal for researchers, academics, and students who create and use research software for their computational analyses and want to improve the interoperability, reusability, and findability of their work.

Typical Use Cases:

Common uses include annotating the data models implemented in research software interfaces directly within the code. Together with general metadata about the software, this information is converted into sustainable and reusable DataDesc documents. Once compiled, it can be automatically uploaded to various software publication platforms for registration, documentation and dissemination.
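How in-code annotations might be turned into a structured description can be sketched with Python's standard `inspect` and `typing` modules. This is not the DataDesc parser: `describe_function`, the output layout and the example function `annual_energy` are hypothetical and serve only to show the general technique of reading type hints, defaults and docstrings from the code itself.

```python
import inspect
import json
from typing import get_type_hints


def describe_function(func) -> dict:
    """Build a plain dictionary describing a function's inputs and output."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    def type_name(hint) -> str:
        # Fall back to "unknown" when a parameter carries no usable annotation.
        return getattr(hint, "__name__", "unknown")

    inputs = []
    for name, param in signature.parameters.items():
        has_default = param.default is not inspect.Parameter.empty
        inputs.append({
            "name": name,
            "type": type_name(hints.get(name)),
            "default": param.default if has_default else None,
        })

    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputs": inputs,
        "output": {"type": type_name(hints.get("return"))},
    }


def annual_energy(power_kw: float, hours: float = 8760.0) -> float:
    """Return the energy in kWh produced at constant power."""
    return power_kw * hours


description = describe_function(annual_energy)
print(json.dumps(description, indent=2))
```

The resulting dictionary can be stored alongside the general software metadata and serialised to any exchange format.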

Strengths of the Service:

The DataDesc framework focuses on research software interfaces and their data models.
A metadata schema maps input and output content, formats, value ranges and structures.
Tools enable the collection, exchange and publication of machine-actionable metadata.
DataDesc reduces annotation efforts and promotes software reuse and integration.

Weaknesses of the Service:

To date, only a parser for Python-based research software and interfaces to five software platforms are available.
DataDesc is offered as a GitHub download but not yet as an online service.

Terms of use & restrictions

DataDesc is an open-source software project available to anyone on GitHub. There are no costs associated with using DataDesc and no registration is required.

Contact

Patrick Kuckertz, p.kuckertz@fz-juelich.de

References

Publications that reference or report on using the service:

Kuckertz, P., Göpfert, J., Karras, O., Neuroth, D., Schönau, J., Pueblas, R., Ferenz, S., Engel, F., Pflugradt, N., Weinand, J. W., Nieße, A., Auer, S. & Stolten, D. (2023). A Metadata-Based Ecosystem to Improve the FAIRness of Research Software. arXiv preprint arXiv:2306.10620.