Problem with pptransfer from ARCHER2 -> JASMIN

Hi,
I am running a UKESM1.1 job (u-dc258) on ARCHER2 (PUMA2) and am trying to transfer data to JASMIN.
I was having issues with the pptransfer.
I have been trying the old rsync method, which I found to work only intermittently.
So I have tried switching to GridFTP, following this guidance:

I think my credentials are all OK on JASMIN (xfer1 and xfer2) and on ARCHER2.
But the transfer fails with the following output:


[WARN] [SUBPROCESS]: Command: globus-url-copy -vb -cd -cc 4 -sync file:///work/n02/n02/schn02/archive/u-dc258/17730101T0000Z/ sshftp://gridftp1.jasmin.ac.uk/gws/nopw/j04/glosat/production/UKESM/raw/u-dc258/17730101T0000Z/
[SUBPROCESS]: Error = 1:

error: Unable to check destination url for sync: sshftp://gridftp1.jasmin.ac.uk/gws/nopw/j04/glosat/production/UKESM/raw/u-dc258/17730101T0000Z/
an end-of-file was reached
globus_xio: An end of file occurred


I did a module load:
module load gct/v6.2.20201212
and tried the command from an ARCHER2 node, and the same problem occurred.

Is this still the recommended method for pptransfer? And if so, any advice as to what the problem may be would be very gratefully received.

Many thanks,
Andrew

Andrew

Have you got access to hpxfer1.jasmin.ac.uk (high-performance transfer)?

Grenville

Hi Grenville,
Yes I have and can ssh from ARCHER2 to hpxfer1.
As far as I can tell, I have the correct credentials set up at both ends (but I do not fully understand the process, so I may not have done something quite right).

e.g.

schn02@ln02:~> openssl x509 -in cred.jasmin -noout -startdate -enddate
notBefore=Jan 19 15:24:34 2024 GMT
notAfter=Feb 18 15:24:34 2024 GMT

schn02@ln02:~> ssh aschurer@hpxfer1.jasmin.ac.uk

[aschurer@hpxfer1 ~]$ openssl x509 -in cred.jasmin -noout -startdate -enddate
notBefore=Jan 19 15:26:15 2024 GMT
notAfter=Feb 18 15:26:15 2024 GMT

cred.jasmin needs to be in the ARCHER2 /work space. (It’s not needed on JASMIN)
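
The openssl checks above can also be scripted so that an expired or missing credential is caught before a transfer is attempted. This is only a minimal sketch: the function name cred_valid_for is made up for illustration, and it assumes the openssl on your system supports the standard -checkend option of openssl x509.

```shell
#!/bin/sh
# cred_valid_for FILE SECONDS   (hypothetical helper, not part of postproc)
# Returns 0 if FILE is a certificate that stays valid for at least SECONDS,
#         1 if it expires sooner than that or cannot be parsed,
#         2 if FILE does not exist or is not readable.
cred_valid_for() {
    cred_file=$1
    min_seconds=$2
    [ -r "$cred_file" ] || return 2
    # -checkend N exits non-zero if the certificate expires within N seconds.
    openssl x509 -in "$cred_file" -noout -checkend "$min_seconds" >/dev/null 2>&1
}
```

For example, cred_valid_for /work/n02/n02/schn02/cred.jasmin 86400 succeeds only if the credential in the /work space is readable and still valid for at least another day.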

I now have a cred.jasmin file on ARCHER2 in /work/n02/n02/schn02/
I originally had it in my home folder, so I removed everything and started again in my work folder.

schn02@ln02:/work/n02/n02/schn02> openssl x509 -in cred.jasmin -noout -startdate -enddate
notBefore=Jan 22 11:44:54 2024 GMT
notAfter=Feb 21 11:44:54 2024 GMT

But the transfer still returns an error.

schn02@ln02:/work/n02/n02/schn02> globus-url-copy -vb -cd -cc 4 -sync file:///work/n02/n02/schn02/archive/u-dc258/17730101T0000Z/ sshftp://gridftp1.jasmin.ac.uk/gws/nopw/j04/glosat/production/UKESM/raw/u-dc258/17730101T0000Z

error: Unable to check destination url for sync: sshftp://gridftp1.jasmin.ac.uk/gws/nopw/j04/glosat/production/UKESM/raw/u-dc258/17730101T0000Z
an end-of-file was reached
globus_xio: An end of file occurred

Should this command be working?
Is there any other way of testing my setup?
Thanks again
Andrew

Andrew

Try deleting the .globus directory on JASMIN (or move it elsewhere)

Grenville

I’ve now moved the directory as suggested but unfortunately still get the same problem.

Andrew, is this the full command you’re running?

schn02@ln02:/work/n02/n02/schn02> globus-url-copy -vb -cd -cc 4 -sync file:///work/n02/n02/schn02/archive/u-dc258/17730101T0000Z/ sshftp://gridftp1.jasmin.ac.uk/gws/nopw/j04/glosat/production/UKESM/raw/u-dc258/17730101T0000Z

If so, it’s missing -cred /work/n02/n02/<YOU>/cred.jasmin

I was indeed missing the credential part.
I’ve added this and my command is now:

schn02@ln04:/work/n02/n02/schn02> globus-url-copy -cred cred.jasmin -vb -cd -cc 4 -sync file:///work/n02/n02/schn02/archive/u-dc258/17730101T0000Z/ sshftp://gridftp1.jasmin.ac.uk/gws/nopw/j04/glosat/production/UKESM/raw/u-dc258/17730101T0000Z/

It still produces the same error, though:

error: Unable to check destination url for sync: sshftp://gridftp1.jasmin.ac.uk/gws/nopw/j04/glosat/production/UKESM/raw/u-dc258/17730101T0000Z/
an end-of-file was reached
globus_xio: An end of file occurred

Looking through the documentation, I found that gsiftp was often used, so I tried the following, which worked. I don’t understand enough about this transfer to know the significance of this, though…

schn02@ln04:/work/n02/n02/schn02> globus-url-copy -cred cred.jasmin -vb -cd -cc 4 -sync file:///work/n02/n02/schn02/archive/u-dc258/17730101T0000Z/ gsiftp://gridftp1.jasmin.ac.uk/gws/nopw/j04/glosat/production/UKESM/raw/u-dc258/17730101T0000Z/
Source: file:///work/n02/n02/schn02/archive/u-dc258/17730101T0000Z/
Dest: gsiftp://gridftp1.jasmin.ac.uk/gws/nopw/j04/glosat/production/UKESM/raw/u-dc258/17730101T0000Z/

Thanks again,
Andrew
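
On the significance: as far as I understand it, a gsiftp:// URL makes globus-url-copy authenticate to the GridFTP server with a certificate (the one passed via -cred), whereas sshftp:// goes through SSH instead, which is why the two behave differently here. The only differences between the failing and the working command in this thread are that URL scheme and the -cred option. A small sketch that composes the working command from variables is below; CRED, SRC_DIR, DEST_DIR and transfer_cmd are illustrative names, and the values are the ones quoted above.

```shell
#!/bin/sh
# Sketch: compose the globus-url-copy command that worked in this thread.
# CRED, SRC_DIR and DEST_DIR are illustrative variables; edit them for your
# own account, suite and group workspace.
CRED=/work/n02/n02/schn02/cred.jasmin
SRC_DIR=/work/n02/n02/schn02/archive/u-dc258/17730101T0000Z/
DEST_DIR=/gws/nopw/j04/glosat/production/UKESM/raw/u-dc258/17730101T0000Z/

# Print the command rather than running it, so it can be inspected first.
# Note the gsiftp:// scheme on the destination (not sshftp://).
transfer_cmd() {
    printf 'globus-url-copy -cred %s -vb -cd -cc 4 -sync file://%s gsiftp://gridftp1.jasmin.ac.uk%s\n' \
        "$CRED" "$SRC_DIR" "$DEST_DIR"
}

transfer_cmd
```

Using the absolute path to cred.jasmin means the command does not depend on which directory it is run from.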

Hi Andrew,

You need to update the version of the postproc branch in your suite. The one you are using doesn’t support gridftp with certificates, hence the incorrect gridftp command.

In fcm_make_pp → configuration, update the versions of the pp_sources to:

postproc_2.3_pptransfer_gridftp_nopw@4557
postproc_2.3_archer2@4988

Regards,
Ros.

Hi Ros,
Thanks for this.
I’ve updated the branch as suggested.
My model simulation is currently running using the old pptransfer (not gridftp) but the next time it stops I will change to gridftp, reload and try this.
Thanks and kind regards
Andrew

Hi Andrew,

If you’re switching mid-run you will need to re-insert the fcm_make_pp and fcm_make2_pp tasks into the suite to re-build the postproc scripts. A reload alone won’t cause a rebuild of the scripts.

After you have restarted/reloaded try:

cylc insert --no-check u-dc258 fcm_make_pp.<cycle point>

That should insert the task; you may need to manually trigger it to run. Once that has completed, do the same for the fcm_make2_pp task:

cylc insert --no-check u-dc258 fcm_make2_pp.<cycle point>

Regards,
Ros.
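
The order of the steps above can be sketched as a small helper that prints the commands to run. This is a sketch only: rebuild_cmds is a made-up name, the cycle point is left as a placeholder for you to fill in, and the cylc trigger lines assume the Cylc 7 form "cylc trigger SUITE TASK.CYCLE" for the manual trigger mentioned above.

```shell
#!/bin/sh
# rebuild_cmds SUITE CYCLE   (hypothetical helper)
# Prints, in order, the Cylc commands for re-inserting the two postproc
# build tasks. The "cylc trigger" lines are an assumption (Cylc 7 syntax)
# and are only needed if an inserted task does not start by itself.
rebuild_cmds() {
    suite=$1
    cycle=$2
    for task in fcm_make_pp fcm_make2_pp; do
        echo "cylc insert --no-check $suite $task.$cycle"
        echo "cylc trigger $suite $task.$cycle"
    done
}

# Replace the placeholder with the cycle point the suite is currently at.
# Wait for fcm_make_pp to complete before running the fcm_make2_pp pair.
rebuild_cmds u-dc258 '<cycle point>'
```

The helper only prints the commands, so nothing is inserted into the running suite until you run them yourself, one pair at a time.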