For the vast majority of researchers, working with code is common practice. However, many lack the knowledge and experience needed to write good code. As a result, their code is often chaotic, non-reusable and non-interoperable, and it is hard (or impossible) to involve new people such as students or colleagues in developing it further. On the positive side, researchers and students are usually highly motivated to improve their programming skills, at least as long as the hurdle is not too high and it does not take too much time.
Learning from past experiences of getting vertigo when looking at code a student just handed in, of dealing with months-long refactoring, and of the common frustration when debugging code that was working “just a minute ago”, the RDM team at the Chair of Fluid Systems of the Technical University of Darmstadt has developed a set of guidelines for assuring the basic quality of code developed there. The guidelines comprise steps such as linting, autoformatting, and type checking that are easy and fast to implement, require little to no experience and lead to a significant improvement of the written code. We have also created a lightweight Python project template for GitLab, which is a good starting point for well-managed and well-documented Python projects.
The guidelines focus on the programming language Python but are applicable to other programming languages as well. They rely on existing standards such as PEP 8 and on other thorough guidelines such as those published by Google. They consist of the following two groups:
- Language Rules: these include linting, spellchecking and code structuring.
- Style Rules: these include writing good documentation, using naming conventions for variables, classes, modules, etc., managing whitespace and automatic formatting.
Unit Testing
While the guidelines are concerned with how the code is written (with regard to language and style), they do not assure quality with regard to whether the code works. Especially when multiple people develop code cooperatively (either simultaneously within a research team or when using code someone else has written), it is essential to gain trust in the code without having to manually check every single function written by someone else. Moreover, it is necessary to ensure that existing functions continue to behave as expected when new functions are written, new features are added or bugs are fixed. These points can be addressed by so-called unit testing. The idea behind unit testing is to check (in engineering language: validate) each unit (a small piece of code, most commonly a function or method) to ensure that it works as expected.
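To make the idea concrete, the following is a minimal sketch of a unit test written with Python's built-in `unittest` module. The function `reynolds_number` and the test class `TestReynoldsNumber` are hypothetical examples chosen for illustration; they are not taken from the guidelines, the workshop or the GitLab project. The sketch checks one known result and one invalid input.

```python
import unittest


def reynolds_number(velocity: float, length: float, kinematic_viscosity: float) -> float:
    """Return the Reynolds number Re = velocity * length / kinematic_viscosity.

    Hypothetical example function, used only to have a unit to test.
    """
    if kinematic_viscosity <= 0:
        raise ValueError("kinematic_viscosity must be positive")
    return velocity * length / kinematic_viscosity


class TestReynoldsNumber(unittest.TestCase):
    """Unit tests for the single 'unit' reynolds_number."""

    def test_known_value(self) -> None:
        # 2 m/s * 0.5 m / 1e-6 m^2/s should give Re = 1e6
        # (compared with a tolerance because of floating-point arithmetic).
        self.assertAlmostEqual(reynolds_number(2.0, 0.5, 1e-6), 1.0e6, places=3)

    def test_rejects_non_positive_viscosity(self) -> None:
        # Invalid input should raise an error instead of returning nonsense.
        with self.assertRaises(ValueError):
            reynolds_number(1.0, 1.0, 0.0)
```

Assuming the code is saved in a file such as `test_reynolds.py` (the file name is an assumption), the tests can be run with `python -m unittest test_reynolds.py`. If a later change breaks `reynolds_number`, the failing test points directly to the affected unit.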
In TA Alex, we developed an introductory workshop that explains the basics of unit testing in Python using simple examples. We held this workshop at the NFDI4Ing Conference (26-27 October 2022) and at the Chair of Fluid Systems for our colleagues. We also created a GitLab project containing simple unit tests.
We also offer to hold a workshop on better code quality, with a focus on Python and on unit testing, at your institution. If you are interested, please contact M. Leštáková.
M. Leštáková