Hi guys,
I have an intermittent problem with my Cylc 8 UKESM1.2 (ARIA PROMOTE) suites on ARCHER2. I think the current state of u-ds260/19560701T0000Z/coupled illustrates the symptoms.
The first attempt at running “coupled” seems to succeed, and the restarts needed to start the next cycle are written. For some reason, however, Cylc seems to think the task has failed and automatically retriggers it. The retry doesn’t work, though: the UM xhist file is already pointing to the restart for the next cycle, and the drivers object to the existence of NEMO restarts for the future cycle, so they delete them. The retriggered coupled exec then just sits there until it times out.
I think that means I’m stuck: I can neither rerun this cycle nor manually trigger the next one, now that the NEMO dumps have been deleted. I’ve had several suites get into this state now, although sometimes the first attempt at coupled is explicitly marked as having timed out too.
Can you see what’s causing this? Have I missed something?
cheers,
robin