Hi,
I am trying to run the suite u-cm655_da191_20081229-W_naew-11 and I keep running into the following error. Could you advise me how to fix it?
Error (transfer)
Endpoint: Archer2 file systems (3e90d018-0d05-461a-bbaf-aab605283d21)
Server: 193.62.216.44:443
File: /work/n02/n02/egentile/cylc-run/u-dq099_cm655_da191_20081229-W_naew-02/share/cycle/20081227T0000Z/dq099_cm655_da191_20081229-W_naew-02a_upe_t_30m_m01s00i408_20081227-20081227.nc
Command: RETR /work/n02/n02/egentile/cylc-run/u-dq099_cm655_da191_20081229-W_naew-02/share/cycle/20081227T0000Z/dq099_cm655_da191_20081229-W_naew-02a_upe_t_30m_m01s00i408_20081227-20081227.nc
Message: Data channel authentication failed
Hi Emanuele,
That error shows that data channel authentication is failing during the transfer. Please check whether your credentials have expired, re-authenticate your Globus session (globus login), check that the Archer2 endpoint is active, and then try the transfer again.
Best regards,
Juan
Hi Emanuele,
Please do check your credentials as advised by Juan.
However, assuming they are fine, there is a known issue that generates this error message. I suspect the problem is due to performance issues with the JASMIN Quobyte storage, as documented in this helpdesk ticket: Globus transfer failures
Whilst we work to find a temporary workaround until the storage issues are fixed, the best course of action is:
- Go to the Globus web app and kill the retrying transfer
- In the Cylc GUI, retrigger the pptransfer task
This should hopefully cause Globus to use a different JASMIN transfer node. Nodes that have problems are rebooted, but I don’t know whether that is happening automatically over the bank holiday weekend.
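If you prefer the command line to the web app, the first step can be sketched roughly as below. This assumes the Globus CLI is installed and you are logged in; `cancel_stuck_transfer` is a hypothetical helper name, and the task ID is whatever `globus task list` reports for the retrying transfer.

```shell
# Hypothetical helper: cancel one retrying Globus transfer by its task ID.
# Assumes the Globus CLI is installed and `globus login` has been run.
cancel_stuck_transfer() {
    task_id="$1"
    if [ -z "$task_id" ]; then
        echo "usage: cancel_stuck_transfer TASK_ID" >&2
        return 1
    fi
    # Show the task first so you can confirm it is the right transfer.
    globus task show "$task_id"
    # Cancel it so Globus stops retrying.
    globus task cancel "$task_id"
}
```

Run `globus task list` to find the ID, then `cancel_stuck_transfer <task-id>`. Retriggering the pptransfer task is still done in the Cylc GUI as described above.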
Regards,
Ros.
Hi Ros,
Thanks so much for your reply. I’ve already done both steps multiple times, to no avail.
I have now deleted the directory with the transfer issues and retriggered the transfer, but I can already see the error log growing on Globus again…
What do you advise?
Thanks so much for your help again.
Emanuele
Hi Emanuele,
At the moment, I’m afraid there is nothing we can do other than keep resubmitting the Globus task when it times out. This is a JASMIN issue with the Globus nodes and the GWS (Quobyte) filesystem; I suspect some of the Globus nodes need rebooting. I have flagged it with them, but the RAL site has a closure day today, so nothing will be done until tomorrow at the earliest.
Have a chat with Ben. I don’t know whether the data is going to tape; if it is, you will be better off transferring it to the JASMIN transfer cache (XFC) temporarily, as this is a different filesystem and doesn’t exhibit the same “stuck” issues.
Regards,
Ros.
Hi Rosalyn,
Many thanks for this,
I am now running into a slightly different problem: some of the pptransfer tasks are stuck in the submitted state.
Should I just retrigger them again?
Best,
Emanuele