Postprocessing fails to remove files from History_Data

Hi CMS,

I’m running UKESM1 on Archer2 (suite u-dc325), and have realised that although the postprocessing task succeeds on each cycle, lots of files which I would have expected to be cleared out after contributing to my archive are building up in /share/data/History_Data. This is clogging up my work directory on Archer2, so at the moment I am just manually deleting files corresponding to earlier cycles as I go.

My understanding is that with the various delete options below switched on, the postprocessing task would automatically delete files from History_Data once they are no longer needed. Do you have any suggestions for making sure this happens?

Thanks!

Alistair

Alistair

Click on Atmosphere (under postproc) - you’ll see that the suite is only processing streams pa, p4 and p5, so files from the other streams are left behind in History_Data. You can choose what to process.
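To see which streams are actually piling up, something like the following may help. It is only a rough sketch: count_by_stream is a made-up helper name, and it assumes the files are named like the archived ones in this thread (e.g. dc325a.pa2043jun.pp), i.e. a run id, a dot, then a two-character stream id starting with "p" followed by the date. Anything that does not match that pattern is ignored.

```shell
#!/bin/sh
# Hypothetical helper (not part of postproc): count files per stream
# in a directory, assuming names like dc325a.pa2043jun or
# dc325a.p42043jun.pp (run id, dot, stream id, then the date).
count_by_stream() {
    dir=$1
    ls "$dir" \
        | sed -n 's/^[^.]*\.\(p[a-z0-9]\)[0-9].*$/\1/p' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{print $2, $1}'   # print "stream count", one per line
}

# Usage: pass the path to the suite's History_Data directory, e.g.
#   count_by_stream /path/to/share/data/History_Data
```

Streams that show large counts but are not among those selected under postproc are the likely candidates for the build-up.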

Also - please wait for an update to fcm:moci.xm_br/pkg/rosalynhatcher/postproc_2.3_archer2_jasmin_pkg that addresses a possible data copy problem.

Grenville

Alistair

You have bad data in /work/n02/n02/aduffeyum/archive/u-dc325/20430101T0000Z. Take a look at dc325a.pa2043jun.pp and dc325a.pa2043jul.pp: the June file appears to be corrupted.

Grenville

Hi Alistair,

I have just committed changes to the fcm:moci.xm_br/pkg/rosalynhatcher/postproc_2.3_archer2_jasmin_pkg branch to work around the data copy issue.

Please make sure you are using the latest revision/head of this branch (rev 5094) in your suite(s).

If you are continuing the run you will need to insert the fcm_make_pp & fcm_make2_pp tasks into the running suite to rebuild the postproc scripts. E.g. on PUMA2 run:

cylc insert --no-check u-dc325 fcm_make_pp.<current-cycle>
cylc insert --no-check u-dc325 fcm_make2_pp.<current-cycle>

Where <current-cycle> is of the form 20430101T0000Z
Make sure that fcm_make_pp runs before fcm_make2_pp.
In ..../cylc-run/u-dc325/share/fcm_make_pp/build/bin check that the scripts archer.py and utils.py have been updated.
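As an illustration of filling in <current-cycle>, here is a small shell sketch. The function name pp_build_cmds is made up for this example and is not part of cylc or postproc. It only prints the two commands, so they can be checked before being run on PUMA2, and it rejects cycle points that are not of the form shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of cylc or postproc): prints the two
# insert commands for a given suite and cycle point.
pp_build_cmds() {
    suite=$1
    cycle=$2
    # Accept only cycle points of the form 20430101T0000Z.
    case $cycle in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]T[0-9][0-9][0-9][0-9]Z) ;;
        *) echo "unexpected cycle point: $cycle" >&2; return 1 ;;
    esac
    # fcm_make_pp is listed first because it must run before fcm_make2_pp.
    for task in fcm_make_pp fcm_make2_pp; do
        printf 'cylc insert --no-check %s %s.%s\n' "$suite" "$task" "$cycle"
    done
}

pp_build_cmds u-dc325 20430101T0000Z
```

Printing rather than executing keeps the sketch safe to try anywhere; the printed lines can then be pasted into a PUMA2 session once the suite is running.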

Regards,
Ros.

Hi Ros, thanks for this!

If I restart the run (via suite-run --restart), will this update be picked up automatically? My suites are stopped so the cylc insert command won’t run.

Hi Alistair,

Once you’ve restarted the suite you will then need to run the cylc insert commands to insert the postproc build tasks as above.

Regards,
Ros.

Hi Grenville, thanks very much for your help on this.

I had naively turned off processing for lots of streams because I only need a relatively small set of outputs for these runs. Is there any example suite or guidance you could point me towards on archiving only a relatively small set of outputs, while still processing and cleaning out the full set?

Also, thanks for pointing out the corrupt file issue. This seems to have affected a few of my suites, for certain cycles and streams. Do you think this is related to the issue which Ros’ update fixed?

Okay thanks, will do