Submit-retrying

Hi Luciana,

  • For the install_cold app you’ll need to change the path of
    /work/n02/n02/annette/HRCM/ancil/data/ancil_versions/n1280e_orca12/GA7.1_AMIP/v6/ancils which can now be found under /work/y07/shared/umshared/HRCM/....

  • The cylc command for triggering a task from the command line is:
    cylc trigger REG TASK
    See cylc trigger --help for full details.

  • The job script that is in the log/job directory is the script that is submitted to Slurm and is available as soon as the task is submitted.

    The easiest way, however, to see how many nodes a task is running on once it’s been submitted is to query the Slurm queue on ARCHER2 using squeue -u <username>.

    atmos_main is the task that runs the model and is the only task affected by the change in atmos processor decomposition.

Regards,
Ros.

Dear Ros,

In the suite u-ch427-n1280-template, I’m now getting this error message:

? Error message: Failed to open file /work/n02/n02/grenvill/cylc-run/u-ce930/share/data/History_Data/ce930a.da19790301_00

About the number of nodes, I exchanged some messages with ARCHER2 Support and they told me about:

sacct -j <jobid> --format="JobID,NNodes,elapsed"

I’m still testing the options, because squeue always starts by showing 1 node, since the first process requests only one node.
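
A minimal sketch of how the node count could be pulled out of sacct output once the job has been submitted. It assumes sacct is run with --parsable2 --noheader so that fields are pipe-separated. The function name nodes_by_job, the job ID and the sample values are hypothetical, made up for illustration rather than taken from real ARCHER2 output.

```python
def nodes_by_job(sacct_text):
    """Map each top-level JobID to its NNodes value.

    Expects text as printed by (the job ID is a placeholder):
        sacct -j <jobid> --format=JobID,NNodes,Elapsed --parsable2 --noheader
    """
    result = {}
    for line in sacct_text.splitlines():
        fields = line.strip().split("|")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        job_id, nnodes, _elapsed = fields[:3]
        if "." in job_id:
            continue  # skip job steps such as 1234567.batch or 1234567.0
        result[job_id] = int(nnodes)
    return result


# Illustrative input only (hypothetical job ID and values).
sample = (
    "1234567|92|00:41:07\n"
    "1234567.batch|1|00:41:07\n"
    "1234567.0|92|00:40:58\n"
)
print(nodes_by_job(sample))
```

Job steps are skipped because the .batch step typically reports a single node, which would otherwise hide the allocation of the job as a whole.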

Thank you for the other hints.

Kind regards, Luciana.

Hi Luciana,

Grenville is just copying /work/n02/n02/grenvill/cylc-run/u-ce930/share/data/History_Data/ce930a.da19790301_00 over from the 4-cab, but it will be a while before it’s there as the transfer is very slow. We’ll let you know.

Each task in the suite is submitted as a separate batch job with its own Slurm JobID, so you need to run squeue on the atmos_main job ID. You won’t be able to get any information from squeue or sacct about the number of nodes for the atmos_main task until it has been submitted. If you want this before the task is submitted, you’ll have to look in the suite.rc.processed file yourself to see what number of nodes it has calculated.
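
As a rough sketch, the processed suite definition could be scanned for node requests along these lines. This assumes the request appears as a "--nodes = N" Slurm directive under the task's [[[directives]]] section. The function name and the sample text are hypothetical, and a task that inherits its directives from a parent family will not be resolved by such a simple scan.

```python
import re

SECTION = re.compile(r"^\s*\[\[([^\[\]]+)\]\]\s*$")  # e.g. [[atmos_main]]
NODES = re.compile(r"^\s*--nodes\s*=\s*(\d+)\s*$")   # e.g. --nodes = 92


def nodes_per_section(processed_text):
    """Return {section name: node count} for sections that set --nodes directly."""
    found = {}
    current = None
    for line in processed_text.splitlines():
        section = SECTION.match(line)
        if section:
            current = section.group(1).strip()
            continue
        nodes = NODES.match(line)
        if nodes and current is not None:
            found[current] = int(nodes.group(1))
    return found


# Made-up excerpt in the style of a processed suite.rc (values are illustrative).
sample_rc = """
[runtime]
    [[atmos_main]]
        [[[directives]]]
            --nodes = 92
    [[install_cold]]
        [[[directives]]]
            --nodes = 1
"""
print(nodes_per_section(sample_rc))
```

To use it on a real suite, read the suite.rc.processed file into a string and pass that in, then look up the atmos_main entry.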

Cheers,
Ros.

That’s great. Thank you very much! :)

Luciana

The n1280 version of u-ch427 doesn’t need to run the reconfiguration - please switch it off. I copied the start file /work/n02/n02/grenvill/cylc-run/u-bo026-ens-inc1280/share/data/History_Data/u-b026a.da19790301_00, which the suite is configured to use.

There are one or two more file paths that need changing for n1280:
change /work/n02/n02/annette/HRCM/cmip6spectralmonthly to /work/y07/shared/umshared/HRCM/cmip6spectralmonthly
and change /work/n02/n02/annette/HRCM/easy_aerosol/final/1949-2015/n1280e to /work/y07/shared/umshared/HRCM/easy_aerosol/final/1949-2015/n1280e
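
The two substitutions above, sketched as a small helper in case it is useful for checking a suite's app configs. The mapping is taken from the paths in this post. The function name update_paths and the sample config line (including the variable name spectral_dir) are hypothetical.

```python
# Old locations under /work/n02/n02/annette and their new shared locations.
PATH_UPDATES = {
    "/work/n02/n02/annette/HRCM/cmip6spectralmonthly":
        "/work/y07/shared/umshared/HRCM/cmip6spectralmonthly",
    "/work/n02/n02/annette/HRCM/easy_aerosol/final/1949-2015/n1280e":
        "/work/y07/shared/umshared/HRCM/easy_aerosol/final/1949-2015/n1280e",
}


def update_paths(text):
    """Return text with each old path replaced by its new location."""
    for old, new in PATH_UPDATES.items():
        text = text.replace(old, new)
    return text


# Hypothetical config line, for illustration only.
line = "spectral_dir=/work/n02/n02/annette/HRCM/cmip6spectralmonthly"
print(update_paths(line))
```

Text that contains neither old path is returned unchanged, so the helper is safe to run over a whole file.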

Hopefully that’s all, but see my copy of the suite in case I’ve forgotten any.

Grenville