The migration from our old cluster to the new one is complete. There were some bumps along the way, as usual, but in the end we have a more modern and robust computing platform. A few notes:
- The new cluster uses all 166 CPUs and 664 GB of RAM available in our Radiant allocation: ten 16-CPU worker nodes, with the remainder consumed by a trio of small control-plane nodes.
- The larger number of worker nodes means that we should have plenty of horsepower for MUSES workflows. To that end, I have increased the number of Calculation Engine task workers to 10, with resource limits set to allow concurrent processing across 140 cores. This should be a good starting point, but I look forward to optimizing the system according to feedback from real-world use. We have the freedom to tune many parameters to improve performance for the needs of our researchers.
- At the request of some module developers (@mrpelicer), there is now a parallel deployment of the CE that supports custom modules. It is available to trusted collaborators on an as-needed basis.
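
For anyone who wants to sanity-check the numbers above, here is a rough sketch of the CPU arithmetic. The constants come straight from the notes. The even split across control-plane nodes and across task workers is only an assumption for illustration, and the function name `cluster_budget` is hypothetical, not part of any deployment tooling.

```python
# CPU figures quoted in the notes above.
TOTAL_CPUS = 166
WORKER_NODES = 10
CPUS_PER_WORKER_NODE = 16
CONTROL_PLANE_NODES = 3
CE_TASK_WORKERS = 10
CE_CORE_BUDGET = 140


def cluster_budget():
    """Return a rough CPU breakdown of the allocation (illustrative only)."""
    worker_cpus = WORKER_NODES * CPUS_PER_WORKER_NODE
    control_plane_cpus = TOTAL_CPUS - worker_cpus
    return {
        "worker_cpus": worker_cpus,
        "control_plane_cpus": control_plane_cpus,
        # Assumption: the control-plane nodes are equally sized.
        "cpus_per_control_plane_node": control_plane_cpus / CONTROL_PLANE_NODES,
        # Assumption: the 140-core budget is divided evenly among task workers.
        "cores_per_task_worker": CE_CORE_BUDGET / CE_TASK_WORKERS,
        # Worker-node CPUs not covered by the task-worker budget.
        "uncommitted_worker_cpus": worker_cpus - CE_CORE_BUDGET,
    }


if __name__ == "__main__":
    for name, value in cluster_budget().items():
        print(f"{name}: {value}")
```

Under those assumptions, the worker CPUs left outside the 140-core budget would be what remains for the other services running on the worker nodes. The actual per-worker limits may differ from an even split.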