Hi. I’ve been running UM vn13.7 global suites on Monsoon2 and they were running fine until recently, but some of my latest suites stop at the end of every cycle for some reason.
u-dq018 and u-dq112 are currently running fine while u-dq115 and u-dq162 are the ones with this problem. These suites are very similar to each other and have only small scientific differences, I believe.
Cylc reports ‘submission failed’ for postproc of that cycle and atmos_main of the next cycle. Other than that I don’t see any error message. Neither job.err nor job.out seems to have been created.
If I manually trigger these failed jobs, they run fine for one cycle and then stop again for the same reason. That means the suites need my constant attention and sit idle for hours overnight.
Please could you help me?
Thanks,
Masaru