Upgrading from Cylc 7 to Cylc 8 on Monsoon3

Hi,

Many apologies, but I am having trouble with the upgrade process (I knew it wouldn’t be straightforward!). I am following the instructions in “Summary Of Major Changes” in the Cylc 8.6.2 documentation in order to upgrade my u-dg710. I thought I had done the first 4 steps correctly: I validated my suite (sorry, workflow), renamed my suite.rc to flow.cylc, then validated again, with no obvious error messages apart from:

[charles.williams.ext@cazccylc1 u-dg710]$ cylc validate u-dg710
WARNING - Backward compatibility mode ON - support for suite.rc files will be removed at 8.7.0
IllegalItemError: [scheduler]event hooks

but then, when I try running the workflow, I get:

[charles.williams.ext@cazccylc1 u-dg710]$ cylc vip
$ cylc validate /lustre/ehz2col/collaboration/home/users/charles.williams.ext/roses/u-dg710
WARNING - ‘rose-suite.conf[jinja2:suite.rc]’ is deprecated. Use [template variables] instead.
WARNING - Deprecated config items were automatically upgraded. Please alter your workflow to use the new syntax.
WARNING - * (8.0.0) [cylc] → [scheduler] - value unchanged
WARNING - graph items were automatically upgraded in “workflow definition”:
* (8.0.0) [scheduling][dependencies]graph → [scheduling][graph]X - for X in:
P1Y, R1
WARNING - deprecated settings found (please replace with [runtime][LINUX_UM]platform):
[runtime][LINUX_UM][remote]host = $(rose host-select cazccylc1.collab.sc.metoffice.gov.uk)
WARNING - deprecated settings found (please replace with [runtime][HPC_UM]platform):
[runtime][HPC_UM][remote]host = $(rose host-select xcs-c)
IllegalItemError: [scheduler]event hooks

Are you able to help?

Thank you,

Charlie

Hi Charlie,

Can you please give us read access to your home directory so we can see your workflows?

As a starter for ten…

  1. The IllegalItemError:
    [scheduler][[event hooks]] needs to be changed to [scheduler][[events]]

  2. The way hosts are selected has changed. You now specify a platform rather than a host.

    So

    [runtime][HPC_UM][remote]host = $(rose host-select xcs-c)
    

    needs to be replaced with e.g:

    [runtime][[HPC_UM]]platform = ex

    Use platform = cazccylc1 for the other host warning.

Hopefully the first warnings are self-explanatory. For example:
WARNING - * (8.0.0) [cylc] → [scheduler] - value unchanged
means change [cylc] to [scheduler] in the flow.cylc file.

Cheers,
Ros

Hi Ros,

Thanks ever so much.

So firstly, I have set the permissions of my entire directory on Monsoon to 755. Is this enough, or should I go higher? So you should be able to see all of my suites (sorry, workflows). The good news is that I think I only need to upgrade 2 of them, u-dg710 (the one I emailed you about) and another, u-dg751. But maybe let’s stick with 710 to begin with.

Secondly, I have followed most of your instructions, but wasn’t entirely sure about some of the other warnings. Specifically, and going in the order of your instructions rather than the warning messages, I have:

  • Changed event hooks: The only mention of [[event hooks]] is under [cylc] and [runtime][[root]]. I have changed both of these occurrences to [events]. But there is no [scheduler], there is a [scheduling] but this doesn’t have any mention of event hooks
  • I have inserted “platform = ex” (without the quotes, obviously) under [runtime][[HPC_UM]], but wasn’t sure if this was supposed to go under [[[remote]]], where the old one was? And I have removed the old line under [[[remote]]]
  • For the other host warning, under [runtime][LINUX_UM][remote], I have replaced this with platform = cazccylc1
  • Ahh, now I see the issue with my first bullet point, having got down to the end of your instructions. Okay, I have now replaced [cylc] with [scheduler]. But I still changed the [[[event hooks]]] under [runtime][[root]] to [[[events]]]
  • In my rose-suite.conf, I have changed [jinja2:suite.rc] to [template variables]

So firstly, have I made the right changes above, and secondly, can you please help with the other warnings?

Thanks very much,

Charlie

Sorry Ros, I think I have done something silly with my permissions when I was opening everything up for you. As I said in my last message, I started by changing everything to 755, but when I then tried to log into another machine I got:

Permissions 0755 for ‘/home/users/charles.williams.ext/.ssh/id_ed25519’ are too open.
It is required that your private key files are NOT accessible by others.

So I tried changing just my .ssh directory (and its contents) to 600, which I thought gave write access to me but to nobody else. Despite telling me I don’t have permission to do this:

[charles.williams.ext@cazccylc1 ~]$ chmod -R 600 .ssh
chmod: cannot access ‘.ssh/id_ed25519.pub’: Permission denied
chmod: cannot access ‘.ssh/known_hosts’: Permission denied
chmod: cannot access ‘.ssh/known_hosts.old’: Permission denied
chmod: cannot access ‘.ssh/id_ed25519’: Permission denied
chmod: cannot access ‘.ssh/authorized_keys’: Permission denied

It clearly did do something, because now I have:

drw-------. 2 charles.williams.ext charles.williams.ext 4096 Apr 21 13:31 .ssh

but now, when I try to do anything with it, I get permission denied myself!

[charles.williams.ext@cazccylc1 ~]$ cd .ssh/
-bash: cd: .ssh/: Permission denied

What have I done this time?

Charlie

Hi Charlie,

You need to give the .ssh directory 700 permissions (chmod 700 ~/.ssh). Otherwise, as you’ve just found, you don’t have execute permission on the directory, which is needed in order to cd into .ssh.
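
As a minimal sketch of what happened and how to put it right, the snippet below reproduces the problem in a scratch directory and then applies the fix. The demo path and the empty key files are stand-ins invented for illustration, not your real ~/.ssh, and it assumes the GNU/Linux versions of chmod and ls.

```shell
# Build a throwaway stand-in for ~/.ssh (hypothetical demo path and empty files).
demo=$(mktemp -d)
mkdir "$demo/.ssh"
touch "$demo/.ssh/id_ed25519" "$demo/.ssh/id_ed25519.pub"

# chmod -R 600 changes the directory itself first. That strips its execute
# (search) bit, so chmod can then no longer reach the files inside it and
# reports "Permission denied" for each of them.
chmod -R 600 "$demo/.ssh" 2>/dev/null || true

# Fix: restore the directory first (700 = read, write and search for the owner only),
# then set the files inside it.
chmod 700 "$demo/.ssh"
chmod 600 "$demo/.ssh/id_ed25519"      # private key: owner read/write only
chmod 644 "$demo/.ssh/id_ed25519.pub"  # public key: may be readable by others

# Show the resulting permissions.
ls -ld "$demo/.ssh"
ls -l "$demo/.ssh"
```

On the real directory the equivalent would be chmod 700 ~/.ssh, followed by chmod 600 on the private key and on any other files in there that should stay private.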

Cheers,
Ros.

P.S. I’ll take a look at your other questions in the morning.

Very sorry Ros, I will correct that first thing tomorrow. Apologies.

Charlie

Hi Ros,

Sorry for the radio silence with this, I was away on annual leave last week. Anyway, sorry to bother you, but I was just wondering if you had had a chance to look at this? It is possible that I may not need this workflow going forward, but it would be nice to get it working anyway, just in case (especially if it is relatively easy to upgrade it).

Thank you,

Charlie

Hi Charlie,

Hope you had a good week off.

  • Change [scheduler][events]timeout handler → [scheduler][events]stall timeout handlers

    and in the same section timeout → stall timeout

  • Change [runtime][root]initial scripting → [runtime][root]init-script

Cheers
Ros

Hi Charlie,

Sent too soon, there are lots more issues. Unfortunately the IllegalItemErrors you are seeing are a result of having ignored all the upgrade warnings you were previously getting when you were running this suite under cylc7. These IllegalItemErrors are due to still having Cylc 6 stuff in the suite which Cylc 8 knows nothing about and thus can’t tell you how to fix.

I’ve gone back and validated your original u-dg710 under Cylc 7 so you can see all the original warnings which should then give you enough information to use to fix each error as it appears.

[INFO] WARNING -  * (6.4.0) [runtime][root][initial scripting] -> [runtime][root][init-script] - value unchanged
[INFO] WARNING -  * (6.4.0) [runtime][nemo_cice][post-command scripting] -> [runtime][nemo_cice][post-script] - value unchanged
[INFO] WARNING -  * (6.4.0) [runtime][HPC_UM][pre-command scripting] -> [runtime][HPC_UM][pre-script] - value unchanged
[INFO] WARNING -  * (6.4.0) [runtime][postproc][pre-command scripting] -> [runtime][postproc][pre-script] - value unchanged
[INFO] WARNING -  * (6.4.0) [runtime][root][command scripting] -> [runtime][root][script] - value unchanged
[INFO] WARNING -  * (6.11.0) [cylc][event hooks] -> [cylc][events] - value unchanged
[INFO] WARNING -  * (6.11.0) [runtime][root][event hooks] -> [runtime][root][events] - value unchanged
[INFO] WARNING -  * (6.11.0) [runtime][LINUX_UM][job submission] -> [runtime][LINUX_UM][job] - value unchanged
[INFO] WARNING -  * (6.11.0) [runtime][HPC_UM][job submission] -> [runtime][HPC_UM][job] - value unchanged
[INFO] WARNING -  * (6.11.0) [runtime][LINUX_UM][job][method] -> [runtime][LINUX_UM][job][batch system] - value unchanged
[INFO] WARNING -  * (6.11.0) [runtime][HPC_UM][job][method] -> [runtime][HPC_UM][job][batch system] - value unchanged
[INFO] WARNING -  * (6.11.0) [runtime][postproc][retry delays] -> [runtime][postproc][job][execution retry delays] - value unchanged

Cheers,
Ros

Ok thank you very much. So just to be clear: for each of those lines, I find the relevant part in my suite.rc and change it to whatever is after the arrow?

Charlie

Yes.
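
One way to find the relevant parts is to search for the old names on the left of each arrow. The sketch below runs against a made-up scratch flow.cylc (its contents, including the handler and batch method values, are invented for illustration); on the real workflow you would point grep at your own file. The pattern is only a rough filter, so check each hit by eye before changing it.

```shell
# Scratch file with a few Cylc 6 era settings (invented content, illustration only).
workdir=$(mktemp -d)
cat > "$workdir/flow.cylc" <<'EOF'
[runtime]
    [[root]]
        initial scripting = "module load something"
        [[[event hooks]]]
            failed handler = notify
    [[HPC_UM]]
        pre-command scripting = "ulimit -s unlimited"
        [[[job submission]]]
            method = pbs
EOF

# Old names taken from the left-hand side of the Cylc 7 upgrade warnings.
# "command scripting" also catches the pre-command and post-command variants.
old_names='initial scripting|command scripting|event hooks|job submission|^[[:space:]]*method[[:space:]]*=|^[[:space:]]*retry delays'

# Print each matching line with its line number.
matches=$(grep -nE "$old_names" "$workdir/flow.cylc")
echo "$matches"
```

Each reported line is then edited to use the name on the right of the corresponding arrow, and cylc validate is run again.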

Cheers,
Ros

Okay, thank you very much. So I have now corrected all of those lines, which generated a whole load of new warnings:

[charles.williams.ext@cazccylc1 u-dg710]$ cylc validate ~/roses/u-dg710
WARNING - Deprecated config items were automatically upgraded. Please alter your workflow to use the new syntax.
WARNING -  * (8.0.0) [runtime][postproc][job]execution retry delays → [runtime][postproc]execution retry delays - value unchanged
WARNING -  * (8.0.0) [runtime][root][events]succeeded handler → [runtime][root][events]succeeded handlers - value unchanged
WARNING -  * (8.0.0) [runtime][root][events]failed handler → [runtime][root][events]failed handlers - value unchanged
WARNING -  * (8.0.0) [runtime][root][events]submission failed handler → [runtime][root][events]submission failed handlers - value unchanged
WARNING -  * (8.0.0) [runtime][root][events]retry handler → [runtime][root][events]retry handlers - value unchanged
WARNING -  * (8.0.0) [runtime][root][events]execution timeout handler → [runtime][root][events]execution timeout handlers - value unchanged
WARNING -  * (8.0.0) [runtime][root][events]submission timeout handler → [runtime][root][events]submission timeout handlers - value unchanged
WARNING - graph items were automatically upgraded in “workflow definition”:
* (8.0.0) [scheduling][dependencies]graph → [scheduling][graph]X - for X in:
P1Y, R1
WARNING - deprecated settings found (please replace with [runtime][LINUX_UM]platform):
[runtime][LINUX_UM][job]batch system = at
PlatformLookupError: Task ‘HPC_UM’ has the following deprecated ‘[runtime]’ setting(s) which cannot be used with ‘platform = ex’:

[job]batch system

So I corrected all of those, except the last 2, where I didn’t entirely understand what was being asked. Validating again did at least remove all the other errors, so I now get:

[charles.williams.ext@cazccylc1 u-dg710]$ cylc validate ~/roses/u-dg710
WARNING - deprecated settings found (please replace with [runtime][LINUX_UM]platform):
[runtime][LINUX_UM][job]batch system = at
PlatformLookupError: Task ‘HPC_UM’ has the following deprecated ‘[runtime]’ setting(s) which cannot be used with ‘platform = ex’:

[job]batch system

I’m not entirely sure how to fix these?

Charlie

Hi Charlie,

You can remove the [job]batch system = lines, as at Cylc 8 the batch system is defined as part of the platform configuration.

Cheers,
Ros

Okay, I have now done that and got rid of those 2 errors. Now all I have is:

[charles.williams.ext@cazccylc1 u-dg710]$ cylc validate ~/roses/u-dg710
IllegalItemError: [scheduling]R1

which I thought I had changed according to one of the earlier lines, but maybe I didn’t change it correctly?

Charlie

The graph configuration needs to change from

[scheduling]
    ...
    [[ R1 ]]
         graph = """
         ...
"""
    [[[ {{FMT}} ]]]
         graph = """
         ...
"""

To

[scheduling]
    ...
    [[ graph ]]
         R1 = """
         ...
"""

         {{FMT}} = """
         ...
"""

Enormous apologies Ros, somehow I missed the email notification that you had replied to my last question, but I see now that it was 8 days ago. Sorry.

So I have now done that, but get a new error (I definitely didn’t get this before):

[charles.williams.ext@cazccylc1 u-dg710]$ cylc validate ~/roses/u-dg710
IllegalItemError: [scheduling][graph]P1Y

I don’t understand this, because P1Y appears nowhere in my flow.cylc.

Charlie

Hi Charlie,

You have fixed the [[ R1 ]] but have not done the same for the [[[ {{FMT}} ]]]graph= below it. See my previous message.

{{FMT}} is a variable that gets replaced. It’s the cycling frequency, i.e. P1Y in your setup.

Cheers,
Ros

Thank you. But this is odd, because again I didn’t receive an email notification. I only noticed your reply today because I had left the browser open. Have you recently turned off email notifications?

Anyway, no problem, I have done that. But now yet another error:

[charles.williams.ext@cazccylc1 u-dg710]$ cylc validate ~/roses/u-dg710
IllegalItemError: [runtime][LINUX_UM][remote]platform

Why can’t it show all of the errors at once, like it was before?!!

Charlie

Hi Charlie,

Cylc will show all WARNINGS at once but, like most scripts, it can’t show all ERRORS: once it meets a statement it can’t parse, it can’t carry on.

All [[[remote]]] sections should be removed. platform = now goes directly under the family. You have it correct in [[HPC_UM]] but wrong in [[LINUX_UM]].

Change:

    [[LINUX_UM]]
      [[[remote]]]
          platform = cazccylc1

to


    [[LINUX_UM]]
        platform = cazccylc1

With regard to email alerts: you can configure when it does and doesn’t send you emails in your user profile preferences, under the email tab.

Regards,
Ros

Okay, I think we are in business!

[charles.williams.ext@cazccylc1 u-dg710]$ cylc validate ~/roses/u-dg710
Valid for cylc-8.6.3

So, looking at the various official documentation, am I right in thinking that I can now do cylc vip and it should run? Is there anything else I need to change?

About the emails: that’s weird, as I absolutely haven’t changed anything in my settings. This latest message did arrive by email, your other 2 (this morning and 8 days ago) did not.

Charlie