Changing temperature initial conditions in Nemo but seeing no response

Hi,

Sorry to bother you, but I have a problem with one of my recent experiments. I fear I have done something silly in the setup, because I see absolutely no difference between 2 suites which, I thought at least, should be different.

The suites in question are dw696 and dw704, both of which are fully coupled simulations. They are absolutely identical, other than the initial conditions for climatological temperature and salinity…

dw696 uses the following ancillary versions file:

/work/n02/n02/cjrw09/gc31/pliod/plioda.d/ancil_versions/GC3.1_eORCA1v2.2x_nemo_ancils_CMIP6_pliodatest1

which points to these T & S initial conditions:

export NEMO_CLIM_START_TEMP=/work/y07/shared/umshared/hadgem3/initial/ocean/eORCA1v2.2x/EN4_v1.1.1995_2014.monthlymean_eORCA1T_NEMO_L75.nc

whereas dw704 uses the following ancillary versions file:

/work/n02/n02/cjrw09/gc31/pliod/plioda.d/ancil_versions/GC3.1_eORCA1v2.2x_nemo_ancils_CMIP6_pliodatest2

which points to these T & S initial conditions:

export NEMO_CLIM_START_TEMP=/work/n02/n02/cjrw09/gc31/pliod/plioda.d/ancils/ocean/EN4_v1.1.1995_2014.monthlymean_eORCA1T_NEMO_L75_new_surface_okatt.nc

They are clearly different, because in the latter file I have added an anomaly.

However, when I run the model for 20 years with both of these, the results - whether SST or surface temperature, looking at the first month of the first year (which should definitely be covered by the initial conditions change) - are absolutely identical. Not just similar, but numerically identical.

I wasn’t expecting a great deal of difference between the 2, but I was expecting at least some difference. Have I done something stupid?

Thanks a lot,

Charlie

Hi Charlie.

I’ll check out those workflows and will take a look. :eyes:

All the best.

Jonny

Hi again.

I’ve had a look at the workflow definitions and your output on ARCHER2, and it looks to me like they are actually both using the values of T and S from the $NEMO_START environment variable defined in the rose-suite.conf file. This would certainly explain what you’re seeing, assuming that this is indeed what the model is using at run time.

The job.out log files on ARCHER2 bear this out too…

My recommendation would be to edit this variable/path and run for a month. As you rightly say, this difference should be obvious very quickly.
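For reference, the line to blank out would look something like this (a hypothetical rose-suite.conf excerpt; the variable name comes from this thread, but the exact quoting and location within your suite's configuration may differ):

```shell
# Hypothetical rose-suite.conf excerpt -- leaving NEMO_START empty so the
# climatological T & S ancils are picked up instead of a restart file:
NEMO_START=''
```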

Good luck!

Jonny

Thanks very much Jonny. This agrees with what I suspected. Can you confirm that the climatological T & S file, as specified in my ancillary versions file, is ONLY used if the restart in NEMO_START is left blank, and that if a restart file is specified there, the climatological T & S is ignored because the restart overrides everything else? With the benefit of hindsight, that actually makes sense. So if I have changed my climatological T & S, must I leave the restart file blank?

Charlie

Hi Charlie.

I think we can confidently say that the $NEMO_START definition is ‘winning’, i.e. overriding the climatological files in your ancil versions file. This would make sense since it is defined in the rose-suite.conf file, which tends to contain top-level simulation control parameters.

As to whether leaving $NEMO_START blank will trigger the inclusion of your files I am not 100% sure off the top of my head but that certainly looks likely. I’m out of the office until Thursday so can’t easily check.

What I do know is that if nothing is specified at all for the initial T and S, then NEMO will impose a climatology for you, although it’s likely that this will not be suitable for a paleo run!

For now you should be able to tell very quickly what is going on by leaving $NEMO_START blank and seeing whether different output results when you vary the input in your ancil versions file.

Hope that helps! :slight_smile:

Jonny

Thank you very much Jonny, I will try that right now and let you know.

Charlie

Hi again,

For reasons I don’t understand, the second of my suites crashed almost straight away yesterday. As a reminder, they are dw696 and dw704. The first appears to be fine (as of last night at least), the second crashed. Both of these leave $NEMO_START blank, so the only difference between them is the T&S file to which they point in the ocean ancil file.

I am teaching all day today so can’t check the logs, but do you have any idea?

Charlie

Hey Charlie.

The crash is coming from the message passing interface (MPI).

It did have 3 goes at this and crashed in the same way each time.

I’m doing a quick test now to see if I get the same error.

Cheers.

Jonny

Hi Charlie.

Quick update. I’m running into an error in the reconfiguration task, which is curious since you are not. I’m running it at Cylc 8 (which everyone will have to move to in time…) but will try it at Cylc 7 to make sure.

All the best.

Jonny

Hi,

If it is of any help - NEMO debug logs are generally found in work/date-time/coupled/ocean.output.

From the messages above it looks like the (NEMO) model itself called MPI_abort so most likely a check in the code has failed.

Mohit

Hi again.

I’ve done a couple of tests and the error is coming from the NEMO code, as can be seen here in the work/18500101T0000Z/coupled/ocean.output file (note the spaces in between the letters in E R R O R!)
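For anyone searching these logs later, grepping for the literal spaced string works; the command below writes a mock ocean.output so the sketch is self-contained (using the STOP text quoted later in this thread), whereas on ARCHER2 the real file lives under work/<cycle>/coupled/:

```shell
# Mock up an ocean.output containing NEMO's spaced error banner, then
# grep for the literal 'E R R O R' string (the spaces matter):
printf '===>>> : E R R O R\n\nSTOP\nCritical errors in NEMO initialisation\n' > /tmp/ocean.output
grep -A3 'E R R O R' /tmp/ocean.output
```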

I’ve checked and this is the same error you get in your u-dw704 run.

Essentially somehow the NetCDF dimensions have gotten confused, presumably in the ncatted commands which can be seen in the history attribute of /work/n02/n02/cjrw09/gc31/pliod/plioda.d/ancils/ocean/EN4_v1.1.1995_2014.monthlymean_eORCA1T_NEMO_L75_new_surface_okatt.nc.

All the best.

Jonny

PS - Thanks for your suggestion @mdalvi. You were right that it is the NEMO code ultimately causing the failure.

Thank you Jonny. Looking at the image, the error is with the salinity field, which I never actually changed. I only changed the temperature. I simply read in the salinity, then wrote it back out again, unchanged. Then I used ncatted to put all the attributes back in, making sure they were identical to the original. So what has been confused, and how can I avoid this? Presumably it needs all of the attributes that are listed in the original?

Charlie

Ps. At least we know it is reading in the right file!

Hey Charlie.

Ah, I neglected to say that this error results from use of either T or S, which further indicated that there was a problem with the file itself; see here for the temperature variable…

In this instance, the help information in the error message was very helpful and indeed following the advice therein I can get u-dw704 to run fine. :smiley:

I created a new test file as follows…

ncks --mk_rec_dmn time \
EN4_v1.1.1995_2014.monthlymean_eORCA1T_NEMO_L75_new_surface_okatt.nc \
EN4_v1.1.1995_2014.monthlymean_eORCA1T_NEMO_L75_new_surface_okatt-jonny.nc

… and then pointed the ancil versions file to it.

Cheers.

Jonny

… and here’s the global mean SST difference for days 30 to 60 FYI.

Cheers

Jonny

Perfect, that’s brilliant Jonny. I think I understand what you did to change the file using ncks, but can you advise on what I could have done differently to avoid the problem in the first place? My workflow was: using IDL, read in the original T & S file, modify the temperature, write out the new temperature and existing salinity to a new file without any attributes, then use ncatted to put back the attributes, identical to the original ones.

Charlie

Hi Jonny

Very sorry about this, but my suite (only dw704 this time, which has the modified T & S file; my other suite, dw696, with the original T & S file, is fine) has again crashed, this time in the 4th year, i.e. 1853. Rather oddly, job.status says that the coupled task for this year only ran for 3 minutes, but looking at the output (in ~/cylc-run/u-dw704/share/data/History_Data), it contains almost all of the output I would expect for the entire year, i.e. there are 12 months and a whole bunch of restart files. I have just checked the last month (i.e. December) and all the data are present and correct.

Looking at the ocean.output (at ~/cylc-run/u-dw704/work/18530101T0000Z/coupled), it says it got to timestep 2700, which, if my maths is correct and given the ocean timestep = 32 (or 45 minutes), is about 84 days, i.e. just under 3 months. There is an error message at the end of ocean.output, but nothing very helpful:

===>>> : E R R O R

STOP
Critical errors in NEMO initialisation

Do you have any idea what’s going on here, i.e. why it would crash at this point? And, if the ocean.output is correct in terms of where it got to, i.e. just under 3 months, how come I have valid data for the entire year, all the way up to and including December?
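As a sanity check on the arithmetic above (assuming the 45-minute ocean timestep stated there):

```shell
# 2700 ocean timesteps at 45 minutes each, converted to days:
awk 'BEGIN { printf "%.3f days\n", 2700 * 45 / (60 * 24) }'
# prints 84.375 days, i.e. just under 3 months as stated
```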

Charlie

Hi Charlie.

I’m afraid I’m not an IDL user so can’t help you with that. That said, the only thing you’d need to change in your workflow would be to add the ncks command as I did…

ncks --mk_rec_dmn time \
EN4_v1.1.1995_2014.monthlymean_eORCA1T_NEMO_L75_new_surface_okatt.nc \
EN4_v1.1.1995_2014.monthlymean_eORCA1T_NEMO_L75_new_surface_okatt-jonny.nc

I think what somehow happened is that the time dimension was not being recognised as a ‘record dimension’ (in the language of NCO operators such as ncatted) and so was being interpreted as a spatial dimension.
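A quick way to spot this condition is to run `ncdump -h <file>` and look for the `UNLIMITED` marker on the time dimension; the header lines below are mocked so the sketch is self-contained:

```shell
# Mocked 'ncdump -h' header lines for the two cases; a record dimension
# shows up as UNLIMITED, a fixed one just has a count:
good='	time = UNLIMITED ; // (12 currently)'
bad='	time = 12 ;'
for h in "$good" "$bad"; do
  case "$h" in
    *UNLIMITED*) echo "record dimension OK" ;;
    *)           echo "fixed dimension -> run: ncks --mk_rec_dmn time in.nc out.nc" ;;
  esac
done
```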

Cheers.

Jonny

Very sorry about this …

Firstly, no need to apologise. :smiling_face:

I’ve had a look at the output files (job.out, job.err) for the 18530101T0000Z/coupled task and it actually seems to have completed successfully on the first attempt, which explains why you are seeing all the files you expected.

> cat ~cjrw09/cylc-run/u-dw704/log/job/1853*/coupled/01/job.out |tail -1
2026-03-15T16:45:53Z INFO - succeeded

Why it decided to run again is not clear. What I can say is that the 1/1/1854 NEMO restarts weren’t created as far as I can tell (the CICE one was though) and none of the postproc tasks ran for the 1853 cycle point (there are no logs present for them). The lack of these NEMO restarts will certainly have caused the simulation to stall although I can’t guarantee that is exactly what happened to your run.

I’ve started a run of u-dw704 to see if it gets past 1/1/1854.

All the best.

Jonny

Ok, that’s kind of good news I guess. But why would it fail to create the restarts, when it did so for the previous 3 years?

Charlie

It certainly should have done! I’ll let you know if I encounter the same issue…

Cheers.

Jonny