What Do We Want? Better Code Quality! When do we want it? NOW!

For most researchers, working with code is common practice. However, many researchers lack knowledge and experience in how to write good code. In Task Area Alex, we addressed this problem and developed a set of guidelines that are currently being tested at the Chair of Fluid Systems, TU Darmstadt.

The quest for good code continues.

For the vast majority of researchers, working with code is common practice. However, many lack the knowledge and experience needed to write good code. As a result, their code is often chaotic, non-reusable and non-interoperable, and it is hard (or impossible) to bring new people such as students or colleagues into its further development. On the positive side, researchers and students are usually highly motivated to improve their programming skills, at least as long as the hurdle is not too high and the effort does not take too much time.

Drawing on past experiences of getting vertigo when looking at code a student has just handed in, of months-long refactoring, and of the common frustration of debugging code that was working “just a minute ago”, the RDM team at the Chair of Fluid Systems of the Technical University of Darmstadt has developed a set of guidelines for assuring the basic quality of code developed there. The guidelines comprise steps such as linting, autoformatting and type checking, which are quick and easy to implement, require little to no experience and lead to a significant improvement of the written code. We have also created a lightweight Python project template for GitLab, which provides a starting point for well-managed and well-documented Python projects.

You can access and fork the project template under TA_Alex / Python Project Template.
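As a rough illustration of what linting, autoformatting and type checking work on, the following sketch shows a small function written so that such tools have something to verify: PEP 8 naming, type hints and a docstring. The function name, its parameters and the fluid-mechanics example are assumptions made for illustration and are not taken from the project template. The article does not name the tools it uses; a linter, a formatter and a type checker of your choice (for example flake8, Black and mypy) can be run on code like this.

```python
"""Illustrative module: a small, typed and documented helper function."""


def reynolds_number(
    density: float,
    velocity: float,
    diameter: float,
    dynamic_viscosity: float,
) -> float:
    """Return the Reynolds number of a pipe flow.

    Args:
        density: Fluid density in kg/m^3.
        velocity: Mean flow velocity in m/s.
        diameter: Inner pipe diameter in m.
        dynamic_viscosity: Dynamic viscosity in Pa*s.

    Raises:
        ValueError: If the dynamic viscosity is not positive.
    """
    # Guard against division by zero and physically meaningless input.
    if dynamic_viscosity <= 0:
        raise ValueError("dynamic_viscosity must be positive")
    return density * velocity * diameter / dynamic_viscosity
```

With the type hints in place, a type checker can flag a call such as reynolds_number("1000", 1.0, 0.05, 0.001) before the code is ever run, and the docstring tells a new colleague which units the function expects.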

The guidelines focus on the programming language Python but are applicable to other programming languages as well. They build on existing standards such as PEP 8 and on other thorough guidelines such as Google's style guide. They consist of the following two groups:

  • Language Rules: these include linting, spellchecking and code structuring.
  • Style Rules: these include writing good documentation, using naming conventions for variables, classes, modules, etc., managing whitespace and automatic formatting.

Unit Testing

While the guidelines are concerned with how the code is written (with regard to language and style), they do not assure that the code actually works. Especially when multiple people develop code cooperatively (either simultaneously within a research team or when using code someone else has written), it is essential to be able to trust the code without having to manually check every single function written by someone else. Moreover, it is necessary to ensure that existing functions keep behaving as expected when new functions are written, new features are added or bugs are fixed. These points can be addressed by so-called unit testing. The idea behind unit testing is to check (in engineering language: validate) each unit (a small piece of code, most commonly a function or method) to ensure that it works as expected.
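A minimal sketch of this idea, using the unittest module from the Python standard library, could look as follows. The function celsius_to_kelvin and the test cases are hypothetical examples chosen for illustration. They are not taken from the workshop material, and the article does not state which test framework is used there.

```python
import unittest


def celsius_to_kelvin(temperature_celsius: float) -> float:
    """Convert a temperature from degrees Celsius to kelvin."""
    if temperature_celsius < -273.15:
        raise ValueError("temperature is below absolute zero")
    return temperature_celsius + 273.15


class TestCelsiusToKelvin(unittest.TestCase):
    """Each test method checks one expected behaviour of the unit."""

    def test_freezing_point_of_water(self) -> None:
        self.assertAlmostEqual(celsius_to_kelvin(0.0), 273.15)

    def test_absolute_zero(self) -> None:
        self.assertAlmostEqual(celsius_to_kelvin(-273.15), 0.0)

    def test_rejects_values_below_absolute_zero(self) -> None:
        # Invalid input should raise an error instead of returning nonsense.
        with self.assertRaises(ValueError):
            celsius_to_kelvin(-300.0)
```

Saved in a file, the tests can be run with python -m unittest followed by the file name. If someone later changes celsius_to_kelvin and accidentally breaks one of these behaviours, the corresponding test fails and points to the affected unit.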

In TA Alex, we developed an introductory workshop that explains the basics of unit testing in Python using simple examples. We held this workshop at the NFDI4Ing Conference (26-27 October 2022) as well as for our colleagues at the Chair of Fluid Systems. We also created a GitLab project for unit testing that contains simple unit tests.

We also offer to hold a workshop on better code quality, with a focus on Python and on unit testing, at your institution. If you are interested, please contact M. Leštáková.

M. Leštáková
