It’s been a little while since I last ran it (not since before the upgrade), but the 1 km domain of my suite u-cu484 has started timing out on all forecasts.
For reference, the forecasts used to finish in under 60 minutes of requested wall time. The 12 km domain (204x160, running on 8x6 CPUs) and 3 km domain (608x424, running on 20x16 CPUs) both still complete, in ~18 and ~37 minutes on average respectively.
My 1 km domain is 800x600 and was previously running fine with 32x28 CPUs. I’ve since bumped it up to 36x32 CPUs and increased the wall time to 80 minutes, but it’s still failing around a third of the way through the forecast.
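As a rough sketch of the arithmetic behind these configurations, the snippet below works out the total task count and the average number of grid points per task for each domain. It assumes that each "AxB CPUs" figure is a two-dimensional decomposition with one MPI task per CPU and no OpenMP threads counted; the function name `decomposition_summary` is made up for illustration.

```python
# Grid sizes and decompositions as quoted in the post: (nx, ny, procs_x, procs_y).
# Assumption: each "AxB CPUs" figure means A x B MPI tasks, one per CPU.
domains = {
    "12 km": (204, 160, 8, 6),
    "3 km": (608, 424, 20, 16),
    "1 km (old)": (800, 600, 32, 28),
    "1 km (new)": (800, 600, 36, 32),
}


def decomposition_summary(nx, ny, px, py):
    """Return (total tasks, average grid points per task)."""
    tasks = px * py
    return tasks, nx * ny / tasks


for name, (nx, ny, px, py) in domains.items():
    tasks, points = decomposition_summary(nx, ny, px, py)
    print(f"{name}: {tasks} tasks, ~{points:.0f} grid points per task")
```

Under that assumption the 1 km domain goes from 896 to 1152 tasks, and each task already holds fewer grid points than in the 12 km and 3 km domains, so the slowdown does not look like a simple shortage of CPUs.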
I have been seeing this warning in the .err file though, which isn’t present in the 12 km or 3 km output:
WARNING: Requested total thread count and/or thread affinity may result in
oversubscription of available CPU resources! Performance may be degraded.
Explicitly set OMP_WAIT_POLICY=PASSIVE or ACTIVE to suppress this message.
Set CRAY_OMP_CHECK_AFFINITY=TRUE to print detailed thread-affinity messages.
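To illustrate what "oversubscription" means in that warning, here is a minimal sketch of the check, assuming it comes down to the threads requested on a node exceeding the cores available on it. The function name and all the numbers are hypothetical and are not taken from the suite or the machine.

```python
def is_oversubscribed(tasks_per_node, threads_per_task, cores_per_node):
    """True if the threads requested on one node exceed its physical cores."""
    return tasks_per_node * threads_per_task > cores_per_node


# Hypothetical values for illustration only; not taken from u-cu484.
print(is_oversubscribed(tasks_per_node=128, threads_per_task=2, cores_per_node=128))
print(is_oversubscribed(tasks_per_node=64, threads_per_task=2, cores_per_node=128))
```

If the 1 km job requests more tasks per node or more OpenMP threads per task than the other domains, that could explain why only its output shows the warning.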
Has something changed that means my previous config is no longer efficient? If so, any advice on how to fix it?