My suite ‘u-cm283’ is a copy of a working suite from ARCHER, which I’m setting up for use on ARCHER2. I have reinstalled miniconda and included the anaconda environment. One or more of the tasks in my suite make use of ‘module load anaconda’.
Archer admin support have advised me that to load the base anaconda environment for use in my suite, I will need to enter the command:
eval "$(/work/n02/n02/lre/miniconda3/anaconda_install/bin/conda shell.bash hook)"
But, I’m unsure where to enter this command. My suite currently has ‘module load anaconda’ as a pre-script in my suite.rc file and also in some python code within /app/perturb/rose-app.conf
Could you please suggest a sensible place to include this command?
So just to check if I’m understanding correctly; the 'module load anaconda' lines you refer to are just what you used on old ARCHER? If so all that did was load the system python environment so can you not just use the system python3 on ARCHER2 by using 'module load cray-python' instead?
I’m not sure. I inherited these aspects of the suite and the associated code. I’ll try running with the alternative you suggested.
Just looking in your suite… in the archer.rc file you had specifications for [[FLIGHT_TRACK_RESOURCE]] which just loaded the system python and added in our UM python libs. So put the same in your archer2.rc file, replacing ‘module load anaconda’ with ‘module load cray-python’.
The equivalent of
I also note that your [[HPC_SERIAL]] is still set to use the compute nodes, I would suggest switching these back to using the serial nodes. (We had to use the compute nodes for some intensive serial tasks on the 4-cab as we didn’t have the serial nodes there.)
Thanks for the advice. I’m glad to know this will work in advance, so thanks for taking the time to look.
With [[HPC_SERIAL]] , it’s not clear to me which part of my site/archer2.rc file indicates use of the compute nodes. Could you please share an example of how to change back to using the serial nodes?
[[HPC_SERIAL]] is setting directives like --cpus-per-task and is inheriting the compute node queues (short or standard) from its parent family.
Change [[HPC_SERIAL]] to be:
inherit = HPC
ROSE_TASK_N_JOBS = 32
That makes sense. I appreciate the explanation.
Does this change also affect the ‘Max number of processes/node’? I had it set to 128, so assume this should be reduced to 32. Is that correct?
“Max number of processes/node” specified in the rose edit GUI should be set to 128. The compute nodes have 128 cores per node.
The ‘perturb’ task in my suite now fails on an attempt to load the ‘pandas’ python library. I guess this is why there was a specific module load command? Is there a straightforward way of making this library generally available?
‘module load anaconda’ was just the ARCHER equivalent of ARCHER2’s ‘module load cray-python’, there was nothing special about it.
pandas is in the cray-python library, so it looks like the command hasn’t run properly.
ARCHER2-23cab> module load cray-python
Python 3.8.5 (default, Aug 24 2020, 19:11:09)
[GCC 9.3.0 20200312 (Cray Inc.)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
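To see which interpreter a task is really picking up, and whether pandas is visible to it, a short check like the one below can be run inside the task's environment. This is only a minimal sketch: the helper name `describe_python_env` is hypothetical and not part of the suite, and it assumes a Python 3 interpreter (under python2.7 the `importlib.util` import itself fails, which is also a useful hint).

```python
import importlib.util
import sys


def describe_python_env(module_name="pandas"):
    """Return interpreter details and whether module_name can be located."""
    # find_spec returns None when a top-level module is not on sys.path.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "module": module_name,
        "found": spec is not None,
        "location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    # Print one "key: value" line per item so it is easy to read in job.out.
    for key, value in describe_python_env().items():
        print("{}: {}".format(key, value))
```

If `executable` points somewhere other than the cray-python installation, or `found` is False, the module load has probably not taken effect for that task.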
I’ll take a look tomorrow.
The app/perturb/bin/perturb_ppe.py script is hard-wired to python2.7, so that’s the reason why it can’t find pandas.
Change the first line to be:
I also noticed that your [[FLIGHT_TRACK_RESOURCE]] settings are doing a ‘module load um’; this should be ‘module load cray-python’ to load the python3 environment.
The perturb app then runs successfully for me.
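For other inherited scripts in a suite, a quick way to spot ones still pinned to Python 2 is to inspect their first line. The following is a rough sketch under stated assumptions: the function names `read_shebang` and `is_pinned_to_python2` are hypothetical, and it only looks for interpreter names of the form python2 or python2.N in the shebang, so it will not catch scripts that are launched by an explicit interpreter elsewhere.

```python
import re
import sys

# Matches interpreter names such as python2 or python2.7 in a shebang line.
_PY2_PATTERN = re.compile(r"python2(\.\d+)?\b")


def read_shebang(script_path):
    """Return the first line of script_path if it is a shebang, else None."""
    with open(script_path, "r", errors="replace") as handle:
        first_line = handle.readline().rstrip("\n")
    return first_line if first_line.startswith("#!") else None


def is_pinned_to_python2(script_path):
    """Return True when the script's shebang names a Python 2 interpreter."""
    shebang = read_shebang(script_path)
    return shebang is not None and _PY2_PATTERN.search(shebang) is not None


if __name__ == "__main__":
    # Usage: python check_shebang.py app/*/bin/*.py
    for path in sys.argv[1:]:
        print(path, read_shebang(path), is_pinned_to_python2(path))
```

Any script reported as pinned would need its first line updated before it can see the libraries provided by cray-python.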
Thanks for taking the extra step of checking the resource setups. My suite is running now, thanks to your advice.