CE error handling and code repo issues & merge requests

andrew.manning · May 7, 2025, 3:24pm

Recently one of the module developers and I discovered a bug in the module’s error handling system. Both the bug itself and the programming practices involved in its resolution are universal enough that I wanted to share them here so that the other @devs can benefit.

Bug

As it relates to the Calculation Engine (CE), the bug was that the module failed to catch errors and pass them to the CE so that the user could read the error message. The module did not guarantee that a status.yaml file was in place before it encountered an error, so the only error conveyed to the user via the CE was that status.yaml could not be found. The correct pattern to follow is:

1. The module should initially create a status.yaml, with code 500, before any errors can occur.
2. During module execution, try/except blocks should be used to handle errors, allowing an informative error message and code to be injected into status.yaml before aborting.
3. At the end of the module execution, assuming no errors occurred, this file should be modified to report code 200.
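
For anyone who wants a starting point, here is a minimal Python sketch of that pattern. It is an illustration under assumptions, not the module's actual code: the field names `code` and `message`, the helper names `write_status` and `run_module`, and the location of status.yaml are all hypothetical, so check what the CE really expects before copying it.

```python
import json
from pathlib import Path

# Assumed location: the module's working directory.
STATUS_FILE = Path("status.yaml")


def write_status(code: int, message: str, path: Path = STATUS_FILE) -> None:
    """Overwrite status.yaml with a code and message (field names are assumed)."""
    # json.dumps produces a double-quoted string, which is also a valid YAML scalar,
    # so no YAML library is needed for this simple two-field file.
    path.write_text(f"code: {code}\nmessage: {json.dumps(message)}\n")


def run_module(work, path: Path = STATUS_FILE) -> int:
    """Run `work`, a hypothetical callable holding the module's real logic."""
    # Step 1: create the file first, pessimistically, before anything can fail.
    write_status(500, "Module started but did not finish.", path)
    try:
        # Step 2: do the actual work inside try/except so failures are captured.
        work()
    except Exception as exc:
        # Inject an informative message before aborting.
        write_status(500, f"{type(exc).__name__}: {exc}", path)
        return 1
    # Step 3: only reached when no error occurred.
    write_status(200, "OK", path)
    return 0
```

Because the file is written with code 500 before any work starts, even a crash that bypasses the except block (for example, the process being killed) still leaves a status.yaml behind for the CE to read.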

Programming practices

When implementing and publishing a fix for a bug like this, the easiest (i.e. laziest) thing to do as an individual programmer is to make the changes, commit and push them, and claim victory. However, we are professionals, and as such we need to do a small amount of additional work for the sake of making better scientific software:

When you identify a problem that requires changing code, as we did with the missing status.yaml file, create an issue that describes the problem and how you plan to fix it. Then make your updates in a separate, temporary branch whose name begins with the issue number (or use the equivalent linking mechanism on GitHub).
It is up to you whether you create a merge request for your own issue resolution, but doing so is often a clear way to show what changed and why. In other words, when people visit your repo, they will see a problem (issue) and then its solution (merge request), and, importantly, both provide a forum where anyone can comment on the process or the changes. In open source projects I often see people commenting on an issue or merge request months later, after a problem in their own use of the code led them, via a search engine, to that issue. Issues and merge requests are also useful for historical reference, when a new grad student or postdoc wants to see how the code evolved in order to understand its current form.