Hi Patrick, I’m working from home and it seems to only work when I connect to Uni of Bristol VPN (I’m currently at Uni of Exeter but still have my username from my old job at Bristol). Do you know why that’s the case?
I can run u-al752 now - thank you so much for all your help!
A bit of a cheeky question, I tried running u-au253 this morning and my fcm_make failed with this error:
/apps/slurm/spool/slurmd/job7044469/slurm_script: line 64: /usr/local/bin/prg_ifort-12.0: No such file or directory
Hi Ayesha:
I am glad you have a VPN from Bristol to use, so that ssh works with Xwindows. Are you using ssh -AY with login2? That might work in more places than ssh -AX with login1. You might talk to the University of Exeter people and/or the JASMIN people if you continue to have ssh problems.
If you want to use login1 (it is better to use login1 if possible), did you do the reverse_dns_check step described on the Login problems page of the JASMIN help docs?
About your error with u-au253, you can do:
cd ~/roses/u-au253
grep -r ifort *
This will tell you which file is trying to open the ifort file. You’ll see that the path to the ifort file doesn’t exist on JASMIN. Did you make the necessary changes to the suite to make it work on JASMIN? (i.e., have you ported the suite to JASMIN?)
When I type ifort at the cylc1 command line, then ifort is there. So maybe you need to change those ifort lines in the suite?
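The search above can be made a little more targeted. This is only a sketch: find_in_suite and check_path are hypothetical helper names, not part of rose or cylc, and the paths in the usage comments are the ones quoted in this thread.

```shell
# Hypothetical helper: list every line under a suite directory that mentions
# a given string, with file names and line numbers.
find_in_suite() {
    suite_dir=$1
    needle=$2
    grep -rn -- "$needle" "$suite_dir"
}

# Hypothetical helper: report whether a path referenced by the suite
# exists on the machine you are logged in to.
check_path() {
    if [ -e "$1" ]; then
        echo "present: $1"
    else
        echo "missing: $1"
    fi
}

# Example usage on JASMIN (paths taken from the messages above):
#   find_in_suite ~/roses/u-au253 ifort
#   check_path /usr/local/bin/prg_ifort-12.0
#   command -v ifort    # shows where ifort is on the current host, if anywhere
```

Any line that find_in_suite reports with a path that check_path calls missing is a candidate for editing when porting the suite.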
Please create a new ticket on the CMS Helpdesk if you need or want further help with this other suite.
Patrick
Hi Ayesha:
I am glad you can use login1 with the Bristol VPN. I am a little surprised that it works even though your Reverse DNS lookup failed.
If you need to get your Reverse DNS working, then you might speak with the Bristol IT and/or inquire with the JASMIN support email.
Patrick
Sorry, me again! I tried fixing this by myself but not having too much luck.
When I run u-al752 it fails at the make_plot stage (it says submit-failed).
On job.err it says:
ERROR: file not found: /home/users/ash221/cylc-run/u-al752/log/job/1/make_plots/01/job.err
This is what it says in the job-activity.log in /home/users/ash221/cylc-run/u-al752/log/job/1/make_plots/01:
[jobs-submit cmd] cylc jobs-submit -- /home/users/ash221/cylc-run/u-al752/log/job 1/make_plots/01
[jobs-submit ret_code] 1
[jobs-submit out] 2022-07-19T13:09:08+01:00|1/make_plots/01|1|None
2022-07-19T13:09:08+01:00 [STDERR] sbatch: error: Batch job submission failed: Requested time limit is invalid (missing or exceeds some limit)
[(('event-mail', 'submission failed'), 1) ret_code] 0
~
Can you please help? I’m not getting any output files.
Hi Ayesha:
If you give us read permission to your home directory and subdirectories, I could take a look at your setup and log files. You can do this with:
chmod -R g+rX /home/users/ash221/
If you have anything private or confidential, you might want to change back the read access on those items.
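To see what g+rX does before applying it to a real home directory, it can be tried on a throwaway tree. This is a minimal sketch: the directory and file names are made up for illustration. The capital X adds execute (search) permission only to directories and to files that are already executable, so ordinary files become group-readable without becoming executable.

```shell
# Build a throwaway directory tree so the effect can be seen safely.
demo=$(mktemp -d)
mkdir "$demo/sub"
: > "$demo/sub/notes.txt"

# Start from owner-only permissions.
chmod 700 "$demo" "$demo/sub"
chmod 600 "$demo/sub/notes.txt"

# Same form as the command above, applied to the throwaway tree.
chmod -R g+rX "$demo"

# Keep only the first ten characters of each mode string.
dir_mode=$(ls -ld "$demo/sub" | cut -c1-10)
file_mode=$(ls -l "$demo/sub/notes.txt" | cut -c1-10)
echo "directory: $dir_mode"
echo "file:      $file_mode"

# To withdraw group access from a private item again:
chmod -R g-rwx "$demo/sub"
private_mode=$(ls -ld "$demo/sub" | cut -c1-10)
echo "after g-rwx: $private_mode"
```

The directory should show drwxr-x--- and the file -rw-r----- after g+rX, and the directory should return to drwx------ after g-rwx.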
Patrick
Hi Ayesha:
The job.err file is not there because the make_plots app hasn’t run yet.
I looked from the command line with vi /home/users/ash221/cylc-run/u-al752/log/job/1/make_plots/01/job-activity.log
And I do see the error that you report: sbatch: error: Batch job submission failed: Requested time limit is invalid (missing or exceeds some limit)
(Often it is easier to look at the log files from the command line instead of in the cylc GUI.)
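As a sketch of doing this without opening each file in an editor, the sbatch errors for all submit attempts of a task can be pulled out with grep. show_submit_errors is a hypothetical helper name, and the directory layout is assumed to be the one shown in the paths above.

```shell
# Hypothetical helper: print every sbatch error recorded in the
# job-activity.log files of one task, across all of its submit attempts.
show_submit_errors() {
    task_log_dir=$1    # e.g. ~/cylc-run/u-al752/log/job/1/make_plots
    grep -h 'sbatch: error' "$task_log_dir"/*/job-activity.log
}

# Example usage:
#   show_submit_errors ~/cylc-run/u-al752/log/job/1/make_plots
```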
So the problem is that your job file:
/home/users/ash221/cylc-run/u-al752/log/job/1/make_plots/01/job
says:
#SBATCH --partition=test
#SBATCH --time=08:00:00
The test queue only allows runs of up to 4 hours.
You can change this with:
vi ~/roses/u-al752/site/suite.rc.CEDA_JASMIN
(or with some other editor), to:
[[PLOTTING_CEDA_JASMIN]]
    [[[directives]]]
        --partition = test
        --time = 04:00:00
It is quite possible that the plotting can’t get done in 4 hours. If you have permission to use the short-serial queue/partition, then you can run for 8 hours in there instead of in the test queue/partition. You can even run up to 48 hours in the short-serial partition.
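A quick way to check a generated job file against a partition limit before resubmitting is sketched below. The helper names are hypothetical, only the plain HH:MM:SS form is handled (Slurm accepts other time formats too), and the limit is something you pass in yourself, for example the 4 hours for test or the 48 hours for short-serial mentioned above.

```shell
# Hypothetical helper: convert an HH:MM:SS wallclock string to seconds.
# awk treats "08" as the decimal number 8, so leading zeros are safe.
hms_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Hypothetical helper: print the first --time value requested in a job file.
requested_time() {
    sed -n 's/^#SBATCH --time=//p' "$1" | head -n 1
}

# Hypothetical helper: succeed if the requested time ($1) is within the limit ($2).
fits_limit() {
    [ "$(hms_to_seconds "$1")" -le "$(hms_to_seconds "$2")" ]
}

# Example usage:
#   job=~/cylc-run/u-al752/log/job/1/make_plots/01/job
#   fits_limit "$(requested_time "$job")" 04:00:00 || echo "time limit too long for test"
```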
After you change it, then:
cd ~/roses/u-al752
do a rose suite-run --reload
if a GUI doesn’t pop up, then do a rose sgc
and then in the GUI, use a right mouse-click to retrigger the make_plots app.
the 2nd-iteration version of the job file /home/users/ash221/cylc-run/u-al752/log/job/1/make_plots/02/job should then have the new wallclock time limit in it, and it should be able to start running when the queue lets it run.
These are missing the initial slash / before work.
Furthermore, you don’t have any output data existing from your JULES runs in the OUTPUT_FOLDER path ‘/work/scratch-nopw/ayeshahussain/fluxnet/u-al752/jules_output’.
If you want to use data from your prior runs, you might want to change it to: OUTPUT_FOLDER='/work/scratch-nopw/ayeshahussain/fluxnet/run11a/jules_output'
and maybe you want to change the PLOT_FOLDER to: PLOT_FOLDER='/work/scratch-nopw/ayeshahussain/fluxnet/run11a/plots'
After you make the changes, then:
cd ~/roses/u-al752
rose suite-run --reload
retrigger the make_plots app from the menu found with a right mouse click in the GUI.
when it starts running, this job script file should have the correct paths for OUTPUT_FOLDER and PLOT_FOLDER: /home/users/ash221/cylc-run/u-al752/log/job/1/make_plots/08/job
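The two checks discussed above (a leading slash, and the folder actually existing) can be scripted against the job file. This is a sketch under assumptions: the helper names are hypothetical, and the job file is assumed to contain lines of the form NAME='value' or NAME="value", possibly after export.

```shell
# Hypothetical helper: print the value assigned to a variable in a job file.
get_var() {
    job_file=$1
    name=$2
    grep -o "$name=.*" "$job_file" | head -n 1 | cut -d= -f2- | tr -d "'\""
}

# Succeed only if the path starts with a slash.
is_absolute() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Hypothetical helper: report on one folder variable in a job file.
check_folder() {
    job_file=$1
    name=$2
    value=$(get_var "$job_file" "$name")
    if ! is_absolute "$value"; then
        echo "$name is not an absolute path: $value"
    elif [ ! -d "$value" ]; then
        echo "$name does not exist: $value"
    else
        echo "$name looks fine: $value"
    fi
}

# Example usage:
#   job=~/cylc-run/u-al752/log/job/1/make_plots/08/job
#   check_folder "$job" OUTPUT_FOLDER
#   check_folder "$job" PLOT_FOLDER
```

A PLOT_FOLDER that does not exist yet is not necessarily a problem if the plotting app creates it; the OUTPUT_FOLDER, however, has to exist and contain the JULES output.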
Hi Ayesha
I noticed an issue with make_plots for the u-al752 suite on JASMIN. Since your ticket has been closed, I just reopened the ticket, to respond properly.
The make_plots app fails for some weird reason unless this line is deleted in the [[PLOTTING_CEDA_JASMIN]] section of site/suite.rc.CEDA_JASMIN : env | grep LD_LIBRARY_PATH
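One possible explanation, which is an assumption and has not been confirmed in this thread: grep exits with status 1 when it finds no match, so if LD_LIBRARY_PATH is not set in the job environment, that line returns a non-zero status, and a job script that aborts on any failing command would stop there. The sketch below only demonstrates the grep behaviour, using a made-up environment listing instead of the real env output.

```shell
# A simulated environment listing that does not contain LD_LIBRARY_PATH.
fake_env() {
    printf 'HOME=/home/users/example\nSHELL=/bin/bash\n'
}

# grep finds no match, so the pipeline's exit status is 1.
status=0
fake_env | grep LD_LIBRARY_PATH || status=$?
echo "exit status without a match: $status"

# Appending "|| true" makes the same diagnostic line harmless.
tolerant_status=0
{ fake_env | grep LD_LIBRARY_PATH || true; } || tolerant_status=$?
echo "exit status with '|| true': $tolerant_status"
```

If the diagnostic output is still wanted, writing the line as env | grep LD_LIBRARY_PATH || true would be an alternative to deleting it, under the same assumption.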
Does that help you?
Patrick