Replanca: pp headers on ancillary file do not match

I am getting this error message for suite u-cm245 on ARCHER2 after running from 1949 to 2014.
It crashes during December 2014. I had previously run a predecessor suite (u-cl527) from 1949 to 2015 with a 1994-2005 Ozone ancillary file. In suite u-cm245, I replaced that file with an 1850-2014 Ozone ancillary file. Maybe there is something wrong with my settings, so that it can’t finish running 2014 through to 20150101 with the new Ozone ancillary file? The error message says: “STASH code in dataset 31 STASH code requested 60”. Is this the Ozone STASH code?
I have looked at already.
Any further suggestions?

Error reported fully in this file:


OPEN: File /work/n02/n02/pmcguire/cylc-run/u-cm245/share/data/History_Data/cm245a.pj20141201 to be Opened on Unit 22 Exists
OPEN:  Claimed 4194304 Bytes for Buffering
OPEN: Buffer Address is 0x28bc0390
IO: Open: /work/n02/n02/pmcguire/cylc-run/u-cm245/share/data/History_Data/cm245a.pj20141201 on unit  22
loadHeader: Model Version: 11.5
REPLANCA - time interpolation for ancillary field, stashcode  60
targ_time,time1,time2  1425624.,  1425600.,  1426320.
hours,int,period  1425624,  720,  -1
Information used in checking ancillary data set: position of lookup table in dataset: 168301
Position of first lookup table referring to data type  1
Interval between lookup tables referring to data type  85  Number of steps 1980
STASH code in dataset  31   STASH code requested  60
'start' position of lookup tables for dataset  1 in overall lookup array  1
31,  60,  1

???!!!???!!!???!!!???!!!???!!!       ERROR        ???!!!???!!!???!!!???!!!???!!!
?  Error code: 201
?  Error from routine: UP_ANCIL
?  Error from processor: 0
?  Error number: 66

That (totally useless) error message indicates you’ve run out of data in the ozone ancil file. The thing is, it appears to have completed Dec 2014 successfully: the output data is in /work/n02/n02/pmcguire/cylc-run/u-cm245/share/data/History_Data, so it wrote all of the Dec 2014 data before the crash. The crash actually occurred on the 1st of Jan, during the ancil processing phase.
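
As a rough cross-check of that reading, the numbers printed in the log can be put through some simple arithmetic. This is only a sketch: it assumes the ancil file holds monthly ozone fields (720-hour interval, i.e. 30-day months) and exactly "Number of steps" times "Interval between lookup tables" lookups for ozone. The variable names are made up for illustration and are not UM identifiers.

```python
# Values copied from the REPLANCA / UP_ANCIL log output above.
n_steps = 1980             # "Number of steps"
fields_per_step = 85       # "Interval between lookup tables referring to data type"
first_lookup = 1           # "Position of first lookup table referring to data type"
requested_lookup = 168301  # "position of lookup table in dataset"
interval_hours = 720       # "int" on the "hours,int,period" line
time1_hours = 1425600      # "time1" on the "targ_time,time1,time2" line

# Assumption: ozone occupies n_steps * fields_per_step consecutive lookups.
last_ozone_lookup = first_lookup + n_steps * fields_per_step - 1

# Assumption: monthly data, so 12 steps per year.
years_covered = n_steps // 12

# How many whole intervals lie before time1.
steps_before_time1 = time1_hours // interval_hours

print("last ozone lookup in file :", last_ozone_lookup)
print("lookup requested by model :", requested_lookup)
print("years of monthly data     :", years_covered)
print("intervals before time1    :", steps_before_time1)

if requested_lookup > last_ozone_lookup:
    print("Requested lookup is past the end of the ozone data.")
```

Under those assumptions the file covers 165 years of monthly data, which matches 1850 to 2014, and the requested lookup is one position past the last ozone field. That is consistent with the model asking for January 2015 data that the 1850-2014 file does not contain.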


Yay! Thanks for looking at this. I am so happy.
I see the 20150101 enddump file now. That is what I was worried wasn’t there.

I am manually trying to trigger the postproc app after the partially successful atmos_main app, for the 20131201 cycle of u-cm245, which ran to 20150101 instead of all the way through the 18-month cycle to 20150601. I think I have done this type of thing before. But in this case, I am getting a submit-failed for the postproc app. There are no log files for postproc for this cycle on archer2, but there are job-activity logs for postproc for this cycle on pumatest, which aren’t that informative.

I get the proper ssh response from pumatest to archer2, and I have tried restarting the ssh agent, but that didn’t help.

Any suggestions? I can put this in a different ticket if it is appropriate or better to do so.

OK. This seemed to work just fine for postproc on the 20131201 cycle of u-cm244, a different suite. The postproc app is running there now, for that suite and cycle.

So I did a rose suite-run --restart on u-cm245, and retriggered the postproc app, and it has been submitted properly, I think. It is still waiting to run, but I think it will be fine.
Thanks again.