Permission denied (publickey) failure

James

I’m confused - your suite u-cl073 is running with:

grenvill@ln03:/home/n02/n02/jweber/cylc-run/u-cl073/share/data> ls -lrt
total 6430396
lrwxrwxrwx 1 jweber n02         52 Mar  8 18:07 cl073a.ainitial -> /work/n02/n02/jweber/dump_files/cc298a.da20100101_00
drwxr-sr-x 4 jweber n02       4096 Mar  8 18:12 etc
-rw-r--r-- 1 jweber n02 6585507840 Mar  9 12:25 cl073a.astart
drwxr-sr-x 3 jweber n02       4096 Mar  9 12:30 History_Data
grenvill@ln03:/home/n02/n02/jweber/cylc-run/u-cl073/share/data> ls -lrt /work/n02/n02/jweber/dump_files/cc298a.da20100101_00
-rwxrwxrwx 1 jweber n02 19944304640 Mar  2 16:07 /work/n02/n02/jweber/dump_files/cc298a.da20100101_00

It looks to me as though you have reconfigured the March 2 cc298a.da20100101_00, which, I think, is the one I overwrote.
ainitial should point to cc298a.da20100101_00_cp, but it currently doesn’t:

site/archer2.rc:{% set AINITIAL = AINITIAL_DIR + 'cc298a.da20100101_00' %}

Try
site/archer2.rc:{% set AINITIAL = AINITIAL_DIR + 'cc298a.da20100101_00_cp' %}
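
After rerunning, one way to confirm that the ainitial link now resolves to the _cp dump is a small check like the one below. This is only a sketch: the helper name symlink_target_matches is made up for illustration, and it compares just the file name of the link target, not the full path.

```python
import os


def symlink_target_matches(link_path, expected_name):
    """Return True if link_path is a symlink whose target has the expected file name.

    Hypothetical helper for illustration; it does not check that the target exists.
    """
    if not os.path.islink(link_path):
        return False
    target = os.readlink(link_path)
    return os.path.basename(target) == expected_name


# Example call, using the paths from the listing above:
# symlink_target_matches(
#     "/home/n02/n02/jweber/cylc-run/u-cl073/share/data/cl073a.ainitial",
#     "cc298a.da20100101_00_cp",
# )
```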

Grenville

Hi Grenville,

Sorry, yes, I was editing the dump file setting in the rose-suite.conf file, but I now realise that I need to edit it in site/archer2.rc instead. I have changed it to the _cp version in u-cl073 and am currently rerunning.

Thanks,

James

Hi Grenville,

After I edited archer2.rc to point to the cc298a.da20100101_00_cp start file, the model ran for much longer than before (40 mins v. 2 mins previously) but failed again with the GLUE_CONV_6A error.

I’ve had a look at a few fields in the cc298a.da20100101_00_cp file and it doesn’t appear to be corrupted. However, I’m not certain whether the model actually started to run as I can’t see the timestamps printed in the job.out file. There is also no atmos_main/pe_output subdirectory in the /work/date/ directory which I would expect to find.

Would you have any recommendations?

Thanks,

James

James

The model ran for 1732 time steps (see /home/n02/n02/jweber/cylc-run/u-cl073/work/20100101T0000Z/atmos_main/pe_output/cl073.fort6.pe000) and then met with a problem. You could try the usual things to work around it.
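
If it helps, here is a rough sketch for pulling the last reported timestep out of a pe_output file. The line pattern ("Atm_Step: Timestep" followed by a number) is an assumption about how the file reports timesteps, so check it against your own cl073.fort6.pe000 and adjust as needed; the function name last_timestep is made up for illustration.

```python
import re

# Assumed format of the timestep lines; verify against the actual file.
TIMESTEP_PATTERN = re.compile(r"Atm_Step:\s*Timestep\s+(\d+)")


def last_timestep(lines):
    """Return the timestep number from the last matching line, or None if none match."""
    last = None
    for line in lines:
        match = TIMESTEP_PATTERN.search(line)
        if match:
            last = int(match.group(1))
    return last


# Example use on a real file (path as given above):
# with open("cl073.fort6.pe000") as handle:
#     print(last_timestep(handle))
```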

Grenville

Hi Grenville,

Thank you, I hadn’t realised previously that the cylc-run on Archer2 has more information than the cylc-run on PUMA.

Following the suggestion from http://cms.ncas.ac.uk/ticket/1765, I reduced the chemistry timestep from 1 hour to 40 mins. u-cl073 then ran for 1 month without problems, and the output data looks reasonable.

Thanks again for your help,

James

Just an update on this: Matt Shin and I have found that changing the processor decomposition to 128, with MAIN_ATM_PROCX=32 and MAIN_ATM_PROCY=18, allowed the model to run a full month with 1 hour chemistry timesteps.

I hadn’t changed the processor decomposition over from the Monsoon values previously.
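
For anyone checking their own settings, the arithmetic behind a decomposition can be sketched as below. The number of atmosphere MPI tasks is the product of the two values; the node count assumes that 128 is the number of cores per node and that each task uses one thread, neither of which is stated above. I/O server processes are ignored, and the function names are made up for illustration.

```python
import math


def mpi_tasks(procx, procy):
    """Atmosphere MPI tasks for an east-west by north-south decomposition."""
    return procx * procy


def nodes_needed(tasks, cores_per_node, threads_per_task=1):
    """Whole nodes required, assuming every task uses threads_per_task cores."""
    return math.ceil(tasks * threads_per_task / cores_per_node)


tasks = mpi_tasks(32, 18)        # MAIN_ATM_PROCX=32, MAIN_ATM_PROCY=18
print(tasks)                     # 576
print(nodes_needed(tasks, 128))  # 5, assuming 128 cores per node and 1 thread per task
```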

James