MPICH ERROR - HadGEM3 GC5e

Hello,

I have run into a problem with MPICH in one of my HadGEM3 GC5e experiments.

I thought this might be caused by the restart files, so I tried starting a new run from the previous year, but I got the same error message.

I have checked ocean.output, but it contains no specific error information.
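When one log file shows nothing, it can help to scan the other log files in the same directory for error markers. The following is a minimal sketch, assuming the logs are plain text; the function name `scan_logs`, the keyword list, and the default file pattern are illustrative choices, not part of any HadGEM3 or OASIS tooling.

```python
from pathlib import Path

# Assumption: these two markers are enough to flag a failure line.
KEYWORDS = ("ERROR", "ABORT")

def scan_logs(directory, pattern="*"):
    """Return (file name, line number, text) for every line containing a keyword."""
    hits = []
    for path in sorted(Path(directory).glob(pattern)):
        if not path.is_file():
            continue
        # errors="replace" avoids a crash on bytes that are not valid text.
        with path.open(errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if any(keyword in line for keyword in KEYWORDS):
                    hits.append((path.name, lineno, line.rstrip()))
    return hits
```

Calling `scan_logs(some_work_directory, "debug.*")` would restrict the search to files whose names start with `debug.`; the directory argument here is a placeholder.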

The details can be found here: /home/n02/n02/xuang/cylc-run/u-do729/, where run7 is the new run, restarted from a previous month of run6.

Do you have any idea what causes this error and how to solve it? Any help would be much appreciated.

Xu

Hi,

This appears to be an OASIS coupler failure; see work/22041201T0000Z/coupled/debug.root.02:



(oasis_init_comp) OASIS RUNNING
(oasis_init_comp) OPEN debug file for pe, unit :       0    9999
(oasis_advance_run) ERROR: coupling skipped at earlier time, potential deadlock
(oasis_advance_run) ERROR: my coupler =        1 variable = model01_O_SSTSST
(oasis_advance_run) ERROR: current time =        28800 mseclag =        28800
(oasis_advance_run) ERROR: my coupler last time and dt =            0        3600
(oasis_advance_run) ERROR: skipped coupler =        1 variable = model01_O_SSTSST
(oasis_advance_run) ERROR: skipped coupler last time and dt =            0        3600
(oasis_abort) ABORT: file      = /work/n02/n02/jwc/modulefiles_build/oasis3-mct_4.0/10f61316/extract/oasis3-mct/lib/psmile/src/mod_oasis_advance.F90
(oasis_abort) ABORT: line      =  714
(oasis_abort) ABORT: on model  = toyoce
(oasis_abort) ABORT: on global rank =  576
(oasis_abort) ABORT: on local  rank =  0
(oasis_abort) ABORT: CALLING ABORT FROM OASIS LAYER NOW

However, I do not know what this means. Perhaps someone familiar with OASIS can advise.
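As a starting point, the numbers in the error lines above can be pulled out and compared. This is a minimal sketch, not an OASIS diagnosis: it assumes the times are in seconds and that `dt` is the coupling period of the field, neither of which is confirmed in this thread. The function name `parse_oasis_error` is hypothetical.

```python
import re

# Excerpt of the (oasis_advance_run) lines quoted above.
LOG = """\
(oasis_advance_run) ERROR: coupling skipped at earlier time, potential deadlock
(oasis_advance_run) ERROR: my coupler =        1 variable = model01_O_SSTSST
(oasis_advance_run) ERROR: current time =        28800 mseclag =        28800
(oasis_advance_run) ERROR: my coupler last time and dt =            0        3600
"""

def parse_oasis_error(text):
    """Extract the variable name and the timing integers from the error lines."""
    variable = re.search(r"variable = (\S+)", text).group(1)
    current = int(re.search(r"current time =\s*(\d+)", text).group(1))
    last, dt = map(int, re.search(r"last time and dt =\s*(\d+)\s+(\d+)", text).groups())
    return {"variable": variable, "current": current, "last": last, "dt": dt}

info = parse_oasis_error(LOG)
# Assumption: seconds throughout, so this counts coupling periods elapsed
# between the last recorded exchange and the current model time.
missed = (info["current"] - info["last"]) // info["dt"]
print(info["variable"], missed)  # model01_O_SSTSST 8
```

Read this way, the log says that model01_O_SSTSST was last exchanged at time 0 with a period of 3600, while the model had already reached time 28800, which is eight periods later. That would be consistent with the "coupling skipped at earlier time" message, but the cause of the skipped exchanges is not something the log excerpt shows.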

Mohit

Hi Mohit,

Thanks for pinpointing where the error comes from.

It seems that OASIS may not be reading in the SST variable, or perhaps there are undefined values in the SST field?

I would be grateful if anyone familiar with OASIS could lend a hand with this.

Thank you,

Xu