Analysing large collections of documents with text and data mining (TDM) methods can yield entirely new scientific insights. In the NFDI4Ing task area "Automated data and knowledge discovery in engineering literature" (S-7), we are developing services that make it easier for researchers to apply these methods in practice.
Current Obstacles for TDM
Services currently under development in S-7
To address these obstacles, one aim of our measure is to give researchers access, as far as legally and technically possible, to engineering literature in a machine-readable, structured XML format that is particularly well suited to TDM applications. To achieve this goal, we are currently harvesting engineering publications, preparing their conversion into the uniform format, and examining how they can be provided in compliance with the applicable copyright regulations. In parallel, we are developing guidelines for researchers on the legal aspects of text and data mining. To this end, we are analysing the precise boundaries of the statutory exceptions for text and data mining, as well as the open-access licences and publishers' licence agreements that cover the most important resources in the field.
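As a rough illustration of why a structured XML format helps with TDM, the following minimal Python sketch reads section texts from a small XML document and counts terms. The element names (`article`, `sec`, `title`, `p`) and the sample content are assumptions made for this example only; they do not describe the actual format being prepared in S-7.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample document. The element names are assumptions for
# illustration and are not the actual S-7 format.
SAMPLE = """<article>
  <front>
    <title>Fatigue behaviour of welded steel joints</title>
  </front>
  <body>
    <sec>
      <title>Introduction</title>
      <p>Welded joints are prone to fatigue cracks.</p>
    </sec>
    <sec>
      <title>Methods</title>
      <p>Fatigue tests were run on welded steel specimens.</p>
    </sec>
  </body>
</article>"""


def section_texts(xml_string):
    """Map each section heading to the plain text of its paragraphs."""
    root = ET.fromstring(xml_string)
    sections = {}
    for sec in root.iter("sec"):
        heading = sec.findtext("title", default="(untitled)")
        # itertext() also collects text nested in inline markup inside <p>;
        # split/join normalises whitespace.
        paragraphs = [
            " ".join("".join(p.itertext()).split()) for p in sec.findall("p")
        ]
        sections[heading] = " ".join(paragraphs)
    return sections


def term_counts(text):
    """Count lower-cased alphabetic tokens in a text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


sections = section_texts(SAMPLE)
counts = term_counts(" ".join(sections.values()))
print(list(sections))
print(counts.most_common(3))
```

Because the sections are explicitly marked up, a mining script can restrict itself to, for example, the methods sections of many papers without having to guess the document structure from layout.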
Looking for feedback
We are always interested in the literature needs of text and data mining projects, both to check them against the literature we can already provide and to extend the set of sources we harvest. If you are working on a TDM project, we would be very pleased if you could send us a short message listing the literature you need.
1. Kononova, O., Huo, H., He, T. et al. Text-mined dataset of inorganic materials synthesis recipes. Sci Data 6, 203 (2019). https://doi.org/10.1038/s41597-019-0224-1