Restarting a failed coupled suite

Hi,

A coupled vn11.1 UKESM suite (u-du284) was hit by the Archer2 issue over the weekend and has failed on the 19090701T0000Z coupled task with:

[FAIL] No restart data available in NEMO restart directory:
/work/n02/n02/jweber/cylc-run/u-du284/share/data/History_Data/NEMOhist

I can see that the NEMOhist directory for this suite has far fewer files in it than the NEMOhist directories for other coupled runs.
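One quick way to put numbers on that comparison is to count the files in each NEMOhist directory. The following is only a minimal sketch: the function names `count_files` and `compare_counts` are hypothetical, and any directory paths passed in are whatever suites you choose to compare.

```python
from pathlib import Path


def count_files(directory):
    """Return the number of regular files directly inside `directory`.

    Subdirectories are not counted and are not descended into.
    """
    return sum(1 for entry in Path(directory).iterdir() if entry.is_file())


def compare_counts(directories):
    """Map each directory (as a string) to its file count."""
    return {str(d): count_files(d) for d in directories}
```

For example, calling `compare_counts` with the NEMOhist path from the error message and the NEMOhist path of a healthy suite returns a dictionary of counts that can be compared side by side.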

I have the initialisation files for the previous resubmission:

du284o_19090101_restart.nc, du284o_19090101_restart_trc.nc, du284o_icebergs_19090101_restart.nc, du284a.da19090101_00 and du284i.restart.1909-01-01-00000.nc

I could use these to restart the run at 19090101T0000Z. However, when I have done this in the past, the model failed to bit-compare with runs where this stop-start procedure was not used (see “Continuing a run after a successful stop”), even when the other switches (bitcomp_nrun=TRUE, lnrun_as_crun=true, task_recon=false) were set.

Is there any way we can get the run going from 19090701T0000Z again? As it has run for nearly 60 years, we are keen to avoid a complete restart.

Cheers,

James

Hello,

Any advice on this would be greatly appreciated.

Best,

James

James

I don’t see any alternative to starting from 19090101T0000Z.

Maybe switch off automatic resubmission - the 15 resubmissions paint a confused picture of how this suite got into this state.

Grenville

Thanks, Grenville. If I put

du284o_19090101_restart.nc,

du284o_19090101_restart_trc.nc,

du284o_icebergs_19090101_restart.nc,

du284a.da19090101_00,

du284i.restart.1909-01-01-00000.nc

in the ~/cylc-run/u-du284/share/data/History_Data directory, is there a “rose suite-run” type command I can use to start it at 19090101T0000Z?
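Before attempting the restart, it may be worth confirming that all five files are actually in place. This is a minimal sketch only: `missing_restart_files` and `EXPECTED` are hypothetical names, and the directory to check is passed in as an argument rather than assumed.

```python
from pathlib import Path

# The five start files named in this thread for the 19090101 restart.
EXPECTED = [
    "du284o_19090101_restart.nc",
    "du284o_19090101_restart_trc.nc",
    "du284o_icebergs_19090101_restart.nc",
    "du284a.da19090101_00",
    "du284i.restart.1909-01-01-00000.nc",
]


def missing_restart_files(directory, expected_names):
    """Return the names from `expected_names` not found as files in `directory`.

    Only the top level of `directory` is checked; the input order is preserved.
    """
    directory = Path(directory)
    return [name for name in expected_names if not (directory / name).is_file()]
```

An empty list from `missing_restart_files(some_directory, EXPECTED)` means every listed file was found in that directory.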

James

James

It’s been a while since I’ve done this - the best thing to do will be to follow the guidelines here https://code.metoffice.gov.uk/trac/moci/wiki/tips_CRgeneral#RestartingFailingSuites

You are right about the start files, but the history file needs to be correct (you need to get the one from /work/n02/n02/jweber/cylc-run/u-du284/work/19080701T0000Z/coupled/history_archive).

You will probably need to insert the 19090101T0000Z coupled task, and maybe some post-processing tasks (not sure).

Grenville

Thanks, Grenville, I will look into it

James
