Stopped with 'submitted'?

MYoshioka · 24 November 2021 10:55

My suite u-cj666 had run for some time before one of the fcst processes stopped and its status was shown as ‘retrying’. When I retriggered that process, the sgc window whited out.

It says stopped with ‘submitted’… What happened?

I don’t seem to be able to restart the run. I thought maybe this is just an issue with sgc and tried to open another sgc window, but it failed saying the suite is not running…

How do I restart the run?
Thanks.
Masaru

grenville · 24 November 2021 13:06

Masaru

I don’t know what happened. In the suite directory, try:

rose suite-restart

Grenville

MYoshioka · 24 November 2021 13:19

Hi Grenville,
When I tried it, I got this:

myosh@xcslc0:u-cj666 $ rose suite-restart
[FAIL] cylc restart u-cj666 # return-code=1, stderr=
[FAIL] /home/d03/myosh/.bash_profile: line 14: /home/d03/myosh/.ssh/ssh-setup: No such file or directory
[FAIL] Traceback (most recent call last):
[FAIL] File "/common/fcm/cylc-7.8.7/bin/cylc-restart", line 25, in <module>
[FAIL] main(is_restart=True)
[FAIL] File "/common/fcm/cylc-7.8.7/lib/cylc/scheduler_cli.py", line 134, in main
[FAIL] scheduler.start()
[FAIL] File "/common/fcm/cylc-7.8.7/lib/cylc/scheduler.py", line 242, in start
[FAIL] self.suite_db_mgr.restart_upgrade()
[FAIL] File "/common/fcm/cylc-7.8.7/lib/cylc/suite_db_mgr.py", line 526, in restart_upgrade
[FAIL] pri_dao.vacuum()
[FAIL] File "/common/fcm/cylc-7.8.7/lib/cylc/rundb.py", line 1031, in vacuum
[FAIL] return self.connect().execute("VACUUM")
[FAIL] sqlite3.OperationalError: disk I/O error
[FAIL] ERROR: command terminated by signal 1: ssh -oBatchMode=yes -oConnectTimeout=8 -oStrictHostKeyChecking=no -n xcslc1 env CYLC_VERSION=7.8.7 bash --login -c ''"'"'exec "$0" "$@"'"'"'' cylc restart u-cj666 --host=localhost

But my other runs were also stopped after that. So I guess this is an issue with Monsoon or my account…?

It may be related to “MASS Planned Outage Wednesday 24th November 10:00-12:00 BST”, although it’s already 1pm, and archiving should not be critical for the model run?

Masaru
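
For anyone landing here with the same traceback: the failure is raised by SQLite's VACUUM on the suite database, and VACUUM has to write to disk. A read-only check can help tell a damaged database apart from a filesystem that cannot be written to. Below is a minimal Python sketch, not part of cylc or rose. The function name check_sqlite_readonly is made up for illustration, and the database path you pass in is an assumption (where the suite database lives depends on your setup).

```python
import sqlite3
from pathlib import Path


def check_sqlite_readonly(db_path):
    """Run PRAGMA integrity_check on a SQLite file opened read-only.

    Returns the list of messages SQLite reports; ["ok"] means no
    corruption was found. (Hypothetical helper, not a cylc API.)
    """
    # Build a file: URI with mode=ro so this check never creates or
    # modifies the database file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]
```

If this reports no corruption but the restart still fails at VACUUM, the problem is more likely on the write side (full disk, quota, or a filesystem outage) than in the database itself.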

MYoshioka · 25 November 2021 00:45

Now it looks like it was a disk quota problem in /scratch. So this problem appears to be resolved. Thanks.
Masaru
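
Since the cause turned out to be disk quota, a quick headroom check before restarting may save a round trip. SQLite's VACUUM can need up to roughly twice the database size in free space while it rebuilds the file. The sketch below is only a rough guide: the helper name vacuum_headroom and the factor of 2 are assumptions for illustration, and shutil.disk_usage reports free space on the whole filesystem, not a per-user quota, so the site's own quota tools are still the authority.

```python
import os
import shutil


def vacuum_headroom(db_path, factor=2.0):
    """Compare free space on the database's filesystem with its size.

    Hypothetical helper: 'factor' is the assumed multiple of the
    database size that VACUUM may need as temporary space.
    """
    db_bytes = os.path.getsize(db_path)
    # disk_usage looks at the filesystem holding the directory; it does
    # not know about per-user or per-project quotas.
    parent = os.path.dirname(os.path.abspath(db_path))
    free_bytes = shutil.disk_usage(parent).free
    return {
        "db_bytes": db_bytes,
        "free_bytes": free_bytes,
        "enough": free_bytes >= factor * db_bytes,
    }
```

A result with "enough" set to False suggests clearing space before retrying rose suite-restart; a True result does not rule out a quota limit.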
