JASMIN $SBATCH settings mismatch between roses/rose-suite.conf and job file (in rundir)

I get the following error for my test run:

ERROR: sbatch: error: Batch job submission failed: Invalid account or account/partition combination specified

I think I have found out what is causing this error but can’t work out why it is happening.

I am running a copy of suite u-ds632 (belonging to eempvg) on JASMIN; my copy is u-dx182.

When I compare the two suites we have the same rose-suite.conf settings within the roses directory. These are:

JASMIN_MPI_NUM_TASKS=30
JASMIN_RUN_QUEUE='par-multi'
JASMIN_WALLTIME_RUN='15:00:00'

The settings are available to compare in /home/users/eempvg/roses/u-ds632/rose-suite.conf and /home/users/earagr/roses/u-dx182/rose-suite.conf.
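A plain diff of the two files confirms they agree. The sketch below uses stand-in temporary files so it can be run anywhere; on JASMIN the two rose-suite.conf paths above would be the real arguments.

```shell
# Stand-in sketch: on JASMIN the two arguments to diff would be the
# rose-suite.conf paths quoted above; temporary files are used here
# so the commands can be run anywhere.
A=$(mktemp); B=$(mktemp)
printf 'JASMIN_MPI_NUM_TASKS=30\n' > "$A"
printf 'JASMIN_MPI_NUM_TASKS=30\n' > "$B"
RESULT=$(diff -u "$A" "$B" > /dev/null && echo "settings match" || echo "settings differ")
echo "$RESULT"
rm -f "$A" "$B"
```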

However, in the test run for my suite (u-dx182), the job file in the run directory ends up with different settings for these variables (head -100 /home/users/earagr/cylc-run/u-dx182/run1/log/job/20010101T0000Z/spinup_01/NN/job). In this file they are set to:

#SBATCH --qos=high
#SBATCH --partition=standard
#SBATCH --time=15:00:00
#SBATCH --ntasks=30
#SBATCH --mem=40000
#SBATCH --nodes=1
This does not happen for the same test run in the u-ds632 run directory (head -100 /home/users/eempvg/cylc-run/u-ds632/run1/log/job/20010101T0000Z/spinup_01/03/job):
JASMIN_MPI_NUM_TASKS=30
JASMIN_RUN_QUEUE='par-multi'
JASMIN_WALLTIME_RUN='15:00:00'

My run seems to be inheriting these settings from /home/users/earagr/roses/u-dx182/location/JASMIN/suite.rc:

[[jules]]
    inherit = None, JASMIN_LOTUS
    [[[job]]]
        batch system = slurm
    [[[directives]]]
        --account = jules
        --qos = high
        --partition = standard
        #--partition = {{ JASMIN_RUN_QUEUE }}
        --time = {{ JASMIN_WALLTIME_RUN }}
        --ntasks = {{ JASMIN_MPI_NUM_TASKS }}
        --mem = 40000
        --nodes = 1
    [[[remote]]]
        retrieve job logs max size = 10M
    [[[environment]]]
        ROSE_LAUNCHER = mpirun
        MPI_NUM_TASKS = {{ JASMIN_MPI_NUM_TASKS }}
        ANCIL_BASE_PWD = {{ JASMIN_ANCIL_PATH }}/{{ MIPID|lower }}/jules_ancils/
        DRIVE_BASE_PWD = {{ JASMIN_DRIVE_PATH }}/{{ MIPID|lower }}
        OUTPUT_BASE = {{ JASMIN_OUTPUT_BASE }}/$ROSE_SUITE_NAME

{% if L_SPINUP_GENERIC %}

{% if INITIALSE_FROM_NON_DUMP_FILE_SPINUP_GENERIC %}

        INITIAL_NON_DUMP_FILE = {{ JASMIN_INITIAL_NON_DUMP_FILE }}

{% else %}

        INITIAL_DUMP_FOR_SPINUP_GENERIC = {{ JASMIN_INITIAL_DUMP_FOR_SPINUP_GENERIC }}

{% endif %}

{% endif %}

{% if L_SPINUP_2NDSPIN and not L_SPINUP_GENERIC %}

    {% if START_WITH_2NDSPIN %}

        INITIAL_DUMP_FOR_SPINUP_2NDSPIN = {{ JASMIN_INITIAL_DUMP_FOR_SPINUP_2NDSPIN }}

    {% endif %}

{% endif %}
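For reference, the directives Cylc actually wrote can be pulled out of the job script with a simple grep. The sketch below creates a stand-in file so the commands run anywhere; on JASMIN, JOB would be the job path quoted earlier.

```shell
# Stand-in sketch: extract just the Slurm directives from a job script.
# On JASMIN, JOB would be e.g.
# /home/users/earagr/cylc-run/u-dx182/run1/log/job/20010101T0000Z/spinup_01/NN/job
JOB=$(mktemp)
cat > "$JOB" <<'EOF'
#!/bin/bash -l
#SBATCH --qos=high
#SBATCH --partition=standard
#SBATCH --time=15:00:00
echo payload
EOF
DIRECTIVES=$(grep '^#SBATCH' "$JOB")   # only the batch directives
echo "$DIRECTIVES"
rm -f "$JOB"
```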

I am not sure why this is happening for my suite (u-dx182) and not for the original (u-ds632), or how I might fix it. Any help or suggestions you can provide would be really helpful.

Thanks,
Ailish

Hi Ailish

Sorry for the delay in replying.

The reason for the differences between /home/users/eempvg/cylc-run/u-ds632/run1/log/job/20010101T0000Z/spinup_01/03/job and /home/users/earagr/cylc-run/u-dx182/run1/log/job/20010101T0000Z/spinup_01/NN/job is that the former was run on 13/10/2025 and the suite u-ds632 has been updated since then. Your suite u-dx182 appears to be based on the latest version of u-ds632. The older version of u-ds632 would no longer work because the partition par-multi no longer exists.

The reason for your error message is that your job script specifies --account=jules when you are not a member of the jules Slurm account. You need to either change jules to a Slurm account of which you are a member (see https://help.jasmin.ac.uk/docs/batch-computing/slurm-queues/#new-slurm-job-accounting-hierarchy for how to get a list) or join the jules group workspace via the JASMIN accounts portal.
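If you would rather not join the jules group workspace, the alternative is a one-line edit in location/JASMIN/suite.rc. MYACCOUNT below is a placeholder for an account the linked help page shows you belong to:

```
[[[directives]]]
    # MYACCOUNT is a placeholder: replace it with a Slurm account
    # you are actually a member of (see the help page linked above)
    --account = MYACCOUNT
```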

Good luck
David

Hi David,

Thanks for your help with this. I requested access to the JULES group and the job now runs the spin-up, which is great. However, I am now getting a different error that seems to be related to the ownership of some folders in my scratch directory.

I get the following error message in my job.err file (tail -100 /home/users/earagr/cylc-run/u-dx182/run3/log/job/20010101T0000Z/spinup_01/NN/job.err):
mkdir: cannot create directory '/work/scratch-pw5/earagr//u-dx182/run3': Permission denied

When I looked into this, I found that eempvg (the owner of the suite I copied) owns the u-dx182 directory in my scratch space.

#check who owns dirs

[earagr@cylc2 earagr]$ find /work/scratch-pw5/earagr -type d -user earagr
/work/scratch-pw5/earagr

[earagr@cylc2 earagr]$ find /work/scratch-pw5/earagr -type d -user eempvg

/work/scratch-pw5/earagr/u-dx182
/work/scratch-pw5/earagr/u-dx182/20CRv3-ERA5

#Test creating files
#can create files in earagr but not in u-dx182

[earagr@cylc2 earagr]$ pwd
/work/scratch-pw5/earagr

[earagr@cylc2 earagr]$ mkdir test

[earagr@cylc2 earagr]$ ls

isimip3a_fire_20CRv3-ERA5_obsclim.dump.20010101.0.nc isimip3a_fire_20CRv3-W5E5_obsclim.dump.20010101.0.nc model_scenario_info_isimip3a.dat u-dx182
isimip3a_fire_20CRv3_obsclim.dump.20010101.0.nc isimip3a_fire_GSWP3-W5E5_obsclim.dump.20010101.0.nc test

[earagr@cylc2 earagr]$ cd u-dx182/

[earagr@cylc2 u-dx182]$ mkdir test
mkdir: cannot create directory 'test': Permission denied
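This looks like standard Unix directory permissions: without the write bit, no new entries can be created in a directory, and only the owner (or root) can change the mode. The same check can be reproduced anywhere with a stand-in directory (a temporary directory plays the role of /work/scratch-pw5/earagr/u-dx182; on JASMIN you would run stat or ls -ld on the real path):

```shell
# Stand-in sketch: a temporary directory plays the role of the scratch
# directory; on JASMIN run `stat` or `ls -ld` on the real path instead.
DEMO=$(mktemp -d)
chmod 555 "$DEMO"              # write bit removed, as for a dir you do not own
MODE=$(stat -c '%A' "$DEMO")   # permission string, e.g. dr-xr-xr-x
OWNER=$(stat -c '%U' "$DEMO")  # owning user
echo "$OWNER $MODE"
chmod 755 "$DEMO" && rmdir "$DEMO"
```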

I am not sure how to fix this issue. When I search my roses directory using:

grep -ir eempvg .

there are no files defining eempvg as the owner of the suite, and the same is true of my cylc-run directory.

Any help or advice you can give would be great :slight_smile:

Thanks again for your help.

Ailish

Hi Ailish

Something strange is going on with file ownership and permissions under your scratch directory (/work/scratch-pw5/earagr). I was going to type more on this subject, but I see that you started a run7 earlier today. Did that work?

David