"negative tracer values"

Hi,

I’m trying to run suite u-ck880, but atmos_main fails after about 10 minutes with:

“ERROR in UKCA_TRACERS_COPY_TO_UM: negative tracer values”

The run gets going and reaches ~60 timesteps before failing.

This suite is a modification of u-ck729 and also has future emissions, SST/SI, ocean biogeochemistry and land cover ancils.

u-ck729 had problems with too many negatives occurring in the solver when the future emissions were used (the model ran fine with 2010 emissions), and this was dealt with by increasing maxneg in the branch, following Luke’s advice ("Too many negatives" failure in atmos_main).

In the case of u-ck729, the “hotter” chemistry associated with the 2095 emissions was a possible reason why the solver was crashing. In the case of u-ck880 the error is different, but when I swap the 2050 emissions for 2010 emissions the model runs fine, again identifying the emissions as the problem. I wonder if the error is related to negative values appearing in the solver, caused by the emissions.

Are there any other checks I could do to test this and/or solver settings I should modify?

Thanks for your help,

James

Hi James,

Somewhat worryingly, we’re starting to see this error crop up more frequently. I’m not sure what’s causing it. I don’t think it is directly connected to the emissions; rather, they seem to push the chemistry into regimes where the model returns negative values.

A solution that I have seen work is to perturb the dump, e.g. following the advice here:

However, I’ve also seen it be “fixed” when it happens during a model run if, when restarting, you set BITCOMP_NRUN=False.

As this is at the start of a run I’d try the perturb_theta method, although note that this will give a slightly different model evolution. If this doesn’t work let me know. It may also be worth checking with the UKESM core group at the Met Office to see if they’ve seen this and, if so, what they do about it.

Best wishes,
Luke

Hi Luke,

Thank you for your help and sorry for the delay in responding, I’ve been investigating the dump-perturbation solution (BITCOMP_NRUN was already set to false in my suite). I ran the perturb script, and the printed output suggests the script only changed one STASH field (which I note is the script’s default):

Perturbing Field - Sec 0, Item 388: ThetaVD After Timestep

I did a mule-cumf comparison of the original and perturbed dump files and this also shows the only difference between the dump files is this field.
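
(If it helps anyone else, this kind of check can also be scripted with the mule Python library; below is a rough sketch, where the dump file names are placeholders and the field-by-field loop is just my assumption of a reasonable approach rather than what mule-cumf itself does.)

import numpy as np
import mule

# Load the original and perturbed dumps (file names are placeholders)
orig = mule.DumpFile.from_file("atmos_dump_orig")
pert = mule.DumpFile.from_file("atmos_dump_pert")

# Walk the two field lists in parallel and report any field whose data differ
for f_o, f_p in zip(orig.fields, pert.fields):
    if f_o.lbrel == -99:   # skip empty lookup entries
        continue
    diff = np.abs(f_o.get_data() - f_p.get_data())
    if diff.max() > 0.0:
        print("STASH", f_o.lbuser4, "max abs difference", diff.max())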

When I tried to run with the perturbed dump, the model failed again after 63 timesteps, but I’m not sure whether this is because the dump file needed to be perturbed more or because the perturbation method isn’t working full stop in this case.

Matt Shin suggested I try changing the initialisation of the CRI species from 1e-18 to 1e-15 to see if this was the issue. In a short (development queue) run with this updated initialisation, the model progressed further before failing on the 20 min wallclock limit. It did appear to be running quite slowly, but I have set off a test run on the main queue to see how it goes.

Best wishes,

James

Hi James,

Thanks for your feedback. The fact that the perturb_theta method didn’t overcome the issue is perhaps not surprising, but that increasing the tracer concentrations (even at these low levels) managed to get the model through the initial stages is encouraging. Essentially, what you want to do is make the model technically different enough that the model evolution allows it to proceed, while still being scientifically similar to what your experiment requires. I’m sorry that I can’t be more specific in terms of a fix!

Best wishes,
Luke

Hi Luke,

The perturb_theta approach might work, but I think I would need to change a large number of the input fields, probably by some sort of loop over stash codes or by specifying a list of stash codes on the command line. I remember now that I solved a negative tracer value error which cropped up immediately at the start of a run (rather than after ~60 timesteps) by changing to another dump file, so the problems may be related.
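
(For reference, a rough sketch of what I have in mind, using the mule library rather than the perturb script itself; the STASH codes, perturbation size and file names below are only illustrative.)

import numpy as np
import mule

# Illustrative set of STASH codes (section*1000 + item) to perturb
stash_to_perturb = {388}      # e.g. 0/388 ThetaVD; extend as needed
amplitude = 1.0e-4            # illustrative relative perturbation size

dump = mule.DumpFile.from_file("atmos_dump_orig")   # placeholder name
rng = np.random.default_rng(12345)

for field in dump.fields:
    if field.lbrel == -99 or field.lbuser4 not in stash_to_perturb:
        continue
    data = field.get_data()
    # Small multiplicative perturbation, clipped to stay non-negative
    perturbed = np.maximum(data * (1.0 + amplitude * rng.standard_normal(data.shape)), 0.0)
    field.set_data_provider(mule.ArrayDataProvider(perturbed))

dump.to_file("atmos_dump_perturbed")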

I’m not certain the model will run a full month even in the maximum wallclock time but will wait to find out.

Best wishes,

James

Hi Luke,

Just an update on this. u-ck880 ran for 427 timesteps (an improvement on the previous attempt) before failing with the “North/South halos too small for advection” error.

Matt kindly double-checked my emissions and found an error with the oXYLENE burning timeslice emissions. I think I missed this originally, as I had checked my timeseries emissions but foolishly not the resulting timeslice emissions (which are what this run was actually using). The error must have occurred in the conversion of my timeseries emissions to timeslice emissions. I have run a test and the model completes over 700 timesteps in 20 mins (suggesting a wallclock time of about 1 hour), so I think things are working OK, but I will check the output.

One thing which does appear odd is the print statements (in print status Diag mode) for the BVOC oxidation product UCARB12, which appear to suggest this species is much higher in concentration (by many orders of magnitude) than other species (even with the emissions correction).

UKCA_TRACERS_COPY_FROM_UM: Copying tracers In.
tracer: 214 108 UCARB12 =
0.1933732442615302E+09 0.0000000000000000E+00 0.7697903047282982E+08
Compared to, say, PHAN:
tracer: 203 88 PHAN =
0.5001502882252428E-13 0.0000000000000000E+00 0.1913978863250520E-14

I guess the acid test will be the monthly output, which I will look at when it is available, but have you encountered anything odd about the output from this routine before?

Best wishes,

James

Hi James,

If those are kg/kg values then they are very large! You could do a 1-day run with timestep output of that tracer to see how it evolves over time. If the numbers are that big it may be that the model grinds to a halt before the end of the month. Are the large numbers everywhere or localised, and do they grow over time?
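
Something along these lines would show both of those at once (a sketch using iris; the file name and STASH code are placeholders, and I’m assuming the usual level/latitude/longitude ordering of the data):

import iris
import numpy as np

# Placeholder file name and STASH code for the tracer's timestep output
cube = iris.load_cube("timestep_output.pp",
                      iris.AttributeConstraint(STASH="m01s34i108"))

# For each timestep, print the global maximum and roughly where it sits
for ts in cube.slices_over("time"):
    idx = np.unravel_index(np.argmax(ts.data), ts.data.shape)
    print(ts.coord("time").points[0],
          float(ts.data.max()),
          float(ts.coord("latitude").points[idx[-2]]),
          float(ts.coord("longitude").points[idx[-1]]))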

You mean odd output from UKCA_TRACERS_COPY_FROM_UM? This should copy the tracers from the UM into UKCA - it shouldn’t be too controversial, but that doesn’t mean that something odd isn’t happening.

Best wishes,
Luke

Hi Luke,

With my corrected emissions, the model now runs and completes a month in ~ 1 hour.

However, as you say, I am still a bit confused by those print statement values. When looking at the monthly mean output for UCARB12, the spatial distribution and concentrations look reasonable. The highest concentrations are ~0.3 ppb over the major biogenic regions, which makes sense.

I think the print statement comes from the ukca_all_tracers_copy_mod.F90 module, specifically:

WRITE(umMessage,'(3E24.16)')                       &
    MAXVAL(tracer_ukca_um(:,:,:,tr_lookup(i))),    &
    MINVAL(tracer_ukca_um(:,:,:,tr_lookup(i))),    &
    SUM(tracer_ukca_um(:,:,:,tr_lookup(i))) /      &
    REAL(SIZE(tracer_ukca_um(:,:,:,tr_lookup(i))))

So it would appear that somewhere a very high value of UCARB12 (the 0.1772E+09) is being read into UKCA. This high value is found at all timesteps, and other tracers don’t seem to have such a high value.

tracer: 214 108 UCARB12 =
0.1772351990191835E+09 0.0000000000000000E+00 0.8205902285571828E+08

From the monthly output, there isn’t a single grid cell with an unusually high value, so it could be that this value moves spatially.

Do you mean doing a run where I output the value every 20 mins? If so, is there a particular time domain for that? The highest frequency I am aware of is T1HR.

Thanks for your help,

James

Hi James,

Yes, you could do a run where you use TALLTS to output the species on all timesteps.

Thanks,
Luke

Hi Luke,

Thank you. I ran a single day with UCARB12’s MMR output on TALLTS, and across the whole 72 timesteps the global maximum MMR was 1.5e-9 (corresponding to a VMR of ~0.5 ppb), so all very normal. The spatial distribution of surface MMR also looked fine. I summed the MMR over the 3 spatial dimensions to get a timeseries, and this too looked quite normal, with jumps every hour at the chemistry timestep and a small decline after each jump. The maximum value was also reached around 20:00 UTC (i.e. the model wasn’t still spinning up at the end of the day for this species).
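
(In case it is useful, this sort of check can be scripted along the following lines; this is a sketch only, with a placeholder file name and STASH code, and an assumed round-number molar mass for UCARB12 in the MMR-to-VMR conversion.)

import iris

# Placeholders for the TALLTS output file and the UCARB12 STASH code
cube = iris.load_cube("tallts_output.pp",
                      iris.AttributeConstraint(STASH="m01s34i108"))

# Global maximum MMR over the whole day
max_mmr = float(cube.data.max())
print("max MMR:", max_mmr)

# Rough MMR -> VMR conversion: 28.97 g/mol for dry air and an assumed
# molar mass of order 100 g/mol for UCARB12 (use the real value)
print("max VMR (ppb):", max_mmr * 28.97 / 100.0 * 1.0e9)

# Sum over the three spatial dimensions to get a global timeseries
ts = cube.collapsed(["model_level_number", "latitude", "longitude"],
                    iris.analysis.SUM)
print(ts.data)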

Do you think this apparently sensible output is a better indicator that the model is functioning correctly than the UM print statements?

Best wishes,

James