Cylc and job limits

Dear CMS,
I have a workflow generation script that generates N suites, and I would like to run them in parallel. I think the limit of 32 serial jobs is the one I need to worry about: each suite has, I think, 3 serial jobs running in parallel when it starts and only one job in the standard queue. Is cylc smart enough to realise that it can't submit a job because too many are already submitted, and to retry some time later?

In the limit, I could imagine submitting 50-60 parallel suites. My script can have a limiter – only submit M suites and when they are done submit another M until all are done. What do you think is a reasonable value of M? Larger is better!

Simon

Hi Simon,

Within a single suite/workflow, yes, cylc can be set up with internal queues that limit the number of jobs submitted at once. However, cylc doesn't know what is going on in another suite. The easiest thing is to limit the number of suites submitted at once in your generation script.
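A minimal sketch of such a limiter in Python, in case it helps. It assumes the generation script can call some function that starts one suite and blocks until that suite has finished; `run_suite` below is a hypothetical placeholder for that, not a cylc API, and `run_in_batches` and `batched` are names made up for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_batches(
    suites: Sequence[str],
    m: int,
    run_suite: Callable[[str], object],
) -> List[object]:
    """Run suites M at a time; start the next batch only when the current one is done.

    `run_suite` is a hypothetical callable supplied by the generation script.
    It must start one suite and block until that suite has finished.
    """
    results: List[object] = []
    for batch in batched(suites, m):
        # One thread per suite in the batch; leaving the `with` block
        # waits for every suite in this batch to complete.
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            results.extend(pool.map(run_suite, batch))
    return results


if __name__ == "__main__":
    # Stand-in for the real work: just report the suite name.
    names = [f"suite_{i}" for i in range(5)]
    print(run_in_batches(names, 2, lambda name: f"{name} done"))
```

This waits for a whole batch to finish before starting the next, which is the scheme described in the question. A rolling limiter that starts a new suite as soon as any one finishes would keep the queue fuller, at the cost of a little more bookkeeping.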

In combination with that, you could also set up retries on the serial tasks, so that if a task fails to submit it will retry after a specified delay, up to a specified number of tries.

If you are using cylc 8, see the Retries section of the Cylc 8.4.3 documentation.

If you are using cylc 7, see The Cylc Suite Engine 7.9.3 documentation.

As for what combination of values to use, I can only suggest trial and error.

If you just go with limiting the number of suites submitted in parallel, it sounds, from what you say above, like you should be able to run 11 at one time. Personally, I would probably just submit in batches of 10.
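As a rough worst-case check, assuming every running suite holds all three of its serial jobs at the same moment (the figures 32 and 3 are taken from the question; the function name is just for illustration):

```python
def max_parallel_suites(job_limit: int, jobs_per_suite: int) -> int:
    """Largest number of suites whose jobs all fit under the limit at once."""
    if jobs_per_suite < 1:
        raise ValueError("jobs_per_suite must be at least 1")
    return job_limit // jobs_per_suite


# 32 serial jobs allowed, 3 serial jobs per suite at start-up.
print(max_parallel_suites(32, 3))  # 10
```

Under that assumption 10 suites use 30 of the 32 serial slots. A larger number only fits if the suites do not all have three serial jobs queued at the same time.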

Regards,
Ros.