Setting up Globus for PPTransfer in non-standard postproc branch

I’m trying to set up Globus for data transfer from ARCHER2 to JASMIN in my suite u-dk694, which is NEMO-BISICLES only. I’m following the instructions in “IMPORTANT: Retirement of JASMIN Gridftp server”.

I’m using this postproc branch in fcm_make_pp → configuration:

fcm:moci.xm-br/dev/robinsmith/postproc_2.3_nemo-bisicles_support

I wondered if you could help with updating this to use the new file transfer system, or tell me whether the postproc_2.3_archer2_jasmin_rewrite branch will work in my suite? Thank you so much for your help!
Best wishes,
Ruth

Just to add that I’ve found Jing’s message about her suite u-dh564, and I’m currently editing my suite to use the same postproc branches in the hope that will work. (My suite has NEMO v4.2 rather than v3.6, but I don’t know if that matters.)

Hi Ruth,

The same version of postproc should work irrespective of the NEMO version.

You will probably get some errors if you simply switch to the postproc_2.3_archer2_jasmin_rewrite branch, because it needs extra namelist variables to support putting data to JASMIN Elastic Tape.

If it doesn’t work, you can revert to your original branches and simply change the revision number of the postproc_2.3_pptransfer_gridftp_nopw branch from @4557 to @5411.

Then add in the 3 globus_* variables to the pptransfer namelist as per the instructions.

Regards,
Ros.

Hi Ros,
Thank you so much, the suite is working and running. However, I’ve found not all the files are being transferred successfully and the new globus transfer seems much less reliable than gridftp. I wondered if you are aware of this and if there’s anything I can do to add a check in pptransfer and perhaps retry if the file is not copied successfully? I’m currently checking manually and transferring the files that were missed.
Best wishes,
Ruth

Hi Ruth,

Thanks for letting us know. We did see a few occurrences of this before Christmas as a result of a “Globus endpoint connection timeout”, which ARCHER2 believe has since been fixed. Could you please give me some more information, as below, so I can determine whether this is the same issue still occurring or a different one?

  1. In the Globus web app can you take a look at the copies/tasks that didn’t complete properly when it said they did and see if there is any “endpoint connection” error listed in the log?

  2. Again in the Globus web app, can you see if Globus had to retry the task multiple times before it completed?

  3. Could you please send us the Globus task ids so ARCHER2/JASMIN can take a look in the Globus logs.

  4. How frequently is this problem occurring?

Regards,
Ros.

Hi Ruth,

Could you also please confirm when you last saw this problem?

Cheers,
Ros.

Hi Ros,
The transfers that failed were on 1/1/25, the files since then seem to have transferred ok.

  1. I’ve checked the log and indeed, there’s an endpoint connection error for the affected files:

Error (transfer)
Endpoint: Archer2 file systems (3e90d018-0d05-461a-bbaf-aab605283d21)
Server: 193.62.216.42:443
File: /work/n02/n02/rutrns/archive/u-dk694/19860101T0000Z/nemo_dk694o_1m_19860101-19860201_grid-T.nc
Command: RETR /work/n02/n02/rutrns/archive/u-dk694/19860101T0000Z/nemo_dk694o_1m_19860101-19860201_grid-T.nc
Message: Data channel authentication failed

Details: 500-Command failed. : globus_xio: The GSI XIO driver failed to establish a secure connection. The failure occured during a handshake read.\r\n500-globus_xio: Operation was canceled\r\n500-globus_xio: Operation timed out\r\n500 End.\r\n

  2. I can’t see if it tried multiple times - there were quite a few files to transfer for each year and there are a few entries in the events log.

  3. This is the task id for one of the years affected:
    c3dc95ae-c87a-11ef-9bae-1508cc3e4972

  4. There were 4 years affected where a few files from each year were empty.

Thank you for your help.
Best wishes,
Ruth

Thanks for the information, Ruth. I have passed it on to both ARCHER2 & JASMIN.

If you get the problem again can you please let me know along with:

  • date & time it happened
  • the globus task id
  • the filenames of those files that didn’t transfer properly

JASMIN have turned on verbose logging so we can hopefully get some more information to send to Globus.

Thanks
Regards,
Ros.
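
Editor's note: for the manual check described earlier in the thread (a few files from each affected year arrived empty), a small script can list the zero-length files under a destination directory so they can be re-transferred and reported. This is only a sketch under assumptions: the function name find_empty_files and the "*.nc" pattern are illustrative and are not part of postproc or pptransfer, and the check only catches empty files, not truncated ones.

```python
from pathlib import Path


def find_empty_files(root, pattern="*.nc"):
    """Return a sorted list of files under `root` that match `pattern`
    and have a size of 0 bytes.

    Hypothetical helper for illustration only; it is not part of
    postproc/pptransfer. A zero size is used as the failure signal
    because the affected files in this thread were reported as empty.
    """
    root = Path(root)
    return sorted(
        path
        for path in root.rglob(pattern)  # search all subdirectories
        if path.is_file() and path.stat().st_size == 0
    )


# Example use (the path is a placeholder, not a real location):
#   for path in find_empty_files("/path/to/transferred/u-dk694"):
#       print(path)
```

Run against the destination directory on JASMIN, the printed paths give the list of filenames to send on together with the date and time and the Globus task id.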