Hello, could you help with this one:
My submission (u-cw692) fails at the coupled task.
The only changes from a successful suite are new restart files
and a new model basis time in the GUI, set to reflect an October start.
Does the message below indicate a problem with those files?
Part of the error file reads:
/work/n02/n02/jgrist02/cylc-run/u-cw692/log/job/19501001T0000Z/coupled/01/job.err
???
??? WARNING ???
? Warning code: -1
? Warning from routine: eg_SISL_setcon
? Warning message: Constant gravity enforced
? Warning from processor: 0
? Warning number: 34
???
MPICH ERROR [Rank 1024] [job id 3617338.0] [Mon May 8 08:33:10 2023] [nid004973] - Abort(32765) (rank 1024 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 32765) - process 1024
srun: error: nid004973: task 0: Aborted
srun: launch/slurm: _step_signal: Terminating StepId=3617338.0+0
srun: launch/slurm: _step_signal: Terminating StepId=3617338.0+1
slurmstepd: error: *** STEP 3617338.0+1 ON nid004973 CANCELLED AT 2023-05-08T09:33:12 ***
slurmstepd: error: *** STEP 3617338.0+0 ON nid001443 CANCELLED AT 2023-05-08T09:33:12 ***
slurmstepd: error: *** STEP 3617338.0+2 ON nid005759 CANCELLED AT 2023-05-08T09:33:12 ***
srun: launch/slurm: _step_signal: Terminating StepId=3617338.0+2
srun: error: nid005781: tasks 8-11: Terminated
srun: error: nid005778: tasks 4-7: Terminated
srun: error: nid005861: tasks 16-19: Terminated
srun: error: nid005759: tasks 0-3: Terminated
srun: error: nid005859: tasks 12-15: Terminated
srun: Force Terminated StepId=3617338.0+2
srun: error: nid001443: tasks 0-63: Terminated
srun: error: nid001764: tasks 512-575: Terminated
srun: error: nid001511: tasks 384-447: Terminated
srun: error: nid001484: tasks 320-383: Terminated
srun: error: nid001463: tasks 256-319: Terminated
srun: error: nid001710: tasks 448-511: Terminated
srun: error: nid001462: tasks 192-255: Terminated
srun: error: nid001461: tasks 128-191: Terminated
srun: error: nid003954: tasks 576-639: Terminated
srun: error: nid003998: tasks 704-767: Terminated
srun: error: nid004011: tasks 832-895: Terminated
srun: error: nid004012: tasks 896-959: Terminated
srun: error: nid003984: tasks 640-703: Terminated
srun: error: nid001459: tasks 64-127: Terminated
srun: error: nid004010: tasks 768-831: Terminated
srun: error: nid004070: tasks 960-1023: Terminated
srun: Force Terminated StepId=3617338.0+0
srun: error: nid005755: tasks 896-1023: Terminated
srun: error: nid005751: tasks 768-895: Terminated
srun: error: nid004975: tasks 256-383: Terminated
srun: error: nid005009: tasks 512-639: Terminated
srun: error: nid004992: tasks 384-511: Terminated
srun: error: nid005756: tasks 1024-1151: Terminated
srun: error: nid005010: tasks 640-767: Terminated
srun: error: nid004973: tasks 1-127: Terminated
srun: error: nid004974: tasks 128-255: Terminated
srun: Force Terminated StepId=3617338.0+1
[FAIL] run_model <<'STDIN'
[FAIL]
[FAIL] 'STDIN' # return-code=143
2023-05-08T08:33:13Z CRITICAL - failed/EXIT