I have a ticket with the same title and that one was resolved, but this one doesn’t seem to be solved. Here is the error message I’m getting from glm_um_fcst_000 in u-ci559.
? Error from routine: WGDOS Packing (f_shum_wgdos_pack)
? Error message: Problem packing field...
? STASH: 431
? Accuracy: -28
? Minimum: -0.1157114003E+39
? Maximum: 0.6946223300E+38
? Message: Unable to WGDOS pack to this accuracy
I don’t think I’m requesting STASH 431.
Looking at the pe output, I thought this error might be related to unit 15 (pp5), so I increased the reserved header and set the packing option to unpacked, but the task still fails with the same error. Maybe the problem is caused somewhere else.
The minimum and maximum values above appear to be wrong, but I’m not sure what values these are meant to be. (Maybe they are in /home/d03/myosh/cylc-run/u-ci559/share/cycle/20190711T0000Z/glm/um/umglaa_cb000 ? but then not sure why that data is wrong or corrupted…)
This suite finished running the first 24 hours (20190710) and this is the run for the second day. I’m using the dump file created by the first day’s run as the UKCA initial file and the operational forecast dump (20190711T0000Z_glm_t+0) for dm_ic_file. Maybe I’m missing something else. Or is it necessary to use the same dump for dm_ic_file?
STASH 431 is probably the section 0 LBC item (look in /home/d03/myosh/roses/u-ci559/app/glm_um/opt/rose-app-lbc.conf), which has operational packing.
The error usually indicates that the model has produced a number that the packing cannot handle, which in turn usually points to a problem in the model itself. You could try not packing the LBCs to see what values the model has created. Try setting packing=0 in /home/d03/myosh/roses/u-ci559/app/glm_um/opt/rose-app-lbc.conf
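Before changing packing=0 it can help to see which sections of a config file set packing at all. The following is a minimal Python sketch, assuming the file uses the usual Rose layout of `[section]` headers followed by `key=value` lines. The section names and values in `sample` are made up for illustration and are not taken from u-ci559.

```python
def find_packing_settings(conf_text):
    """Return (section, key, value) for each 'packing' setting in Rose-style text."""
    results = []
    section = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if "=" in line:
            key, value = line.split("=", 1)
            key = key.strip()
            # A leading '!' is kept in the reported key so that settings
            # marked as ignored remain visible as such.
            if key.lstrip("!") == "packing":
                results.append((section, key, value.strip()))
    return results


# Hypothetical config text, for illustration only.
sample = """\
[namelist:example_stream_a]
packing=5
[namelist:example_stream_b]
file_id='pp5'
packing=0
"""

for section, key, value in find_packing_settings(sample):
    print(f"{section}: {key}={value}")
```

To use it on a real file, read the file contents into a string first and pass that to `find_packing_settings`.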
? Error from routine: EG_BICGSTAB
? Error message: Convergence failure in BiCGstab after 1 iterations: omg is too small
This is a problem I have encountered before and raised once (http://cms.ncas.ac.uk/ticket/3561), but it was never really resolved. I just found a grid setting that did not cause the problem. And now it has come back.
I got this just before the error message in /home/d03/myosh/cylc-run/u-ci559/log/job/20190711T0000Z/glm_um_fcst_000/NN/job.out:
Grid orientation in degrees - -0.1074E+10
Year Day Hour Minute Second
Atmosphere time = -0.1074E+10 -0.1074E+10 -0.1074E+10 -0.1074E+10 -0.1074E+10
Mass, energy, energy drift = -0.1074E+10 -0.1074E+10 -0.1074E+10
Is this the problem? Any idea how to fix it?
Masaru
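One observation on the repeated value -0.1074E+10 in the output above: it matches -2**30 (-1073741824) when printed to four significant digits. As far as I know the UM uses -32768.0*32768.0 as its real missing-data indicator, but treat that as an assumption here. If it holds, these header values may never have been set rather than having been computed wrongly. A small Python sketch of the arithmetic (the helper `fortran_e` is written for this illustration and is not a UM routine):

```python
import math

# Assumed UM real missing-data indicator: -32768.0 * 32768.0 == -2**30
RMDI = -32768.0 * 32768.0


def fortran_e(x, digits=4):
    """Format x in Fortran-like E notation with a 0.dddd mantissa."""
    if x == 0:
        return "0." + "0" * digits + "E+00"
    exp = math.floor(math.log10(abs(x))) + 1
    mant = round(abs(x) / 10.0 ** exp, digits)
    if mant >= 1.0:  # rounding pushed the mantissa up to 1.0
        mant /= 10.0
        exp += 1
    sign = "-" if x < 0 else ""
    return f"{sign}{mant:.{digits}f}E{exp:+03d}"


print(RMDI)             # -1073741824.0
print(fortran_e(RMDI))  # -0.1074E+10
```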
I tried that and all previous results were deleted, including regional dump files that were to be used in this continuation run… Anyway that’s something I can take care of next time I run the suite.
But when you ran the suite didn’t you get this error in the regional model?
I have had so many problems running these suites and many of them have not been resolved but only somehow avoided. So I have been haunted by them repeatedly…
I may have misunderstood your message of Oct 15th - I thought you were getting the EG_BICGSTAB error in the global run. I’d not paid attention to the regional model. But having run the suite longer, I did get Regn1_Brit2Port_RA2M_um_fcst_000 failing with the error message “Boundary data starts after start of current boundary data interval”,
and the Regn1_wPorto_RA2M_um_recon task fails too (I don’t understand why).
No, Grenville. I don’t think you misunderstood the problem.
I just had another problem, and it is also a very familiar one…
I’ve been haunted by these problems and others…
Where is the python_env file? I found copies in the cylc-run folder for a couple of suites (like ~/cylc-run/u-ci463/bin/python_env) but not for this one. And how should I change it?
u-ci559 is configured a little differently to the suite I was testing ANTS on. Please look at /home/d03/gmslis/roses/u-ci559, where I have made changes so that it uses ANTS to calculate Regn1_Brit2Port_ancil_ants_vegfrac.
Thank you for this.
ants_general_aero is the only other directory I can see in app/ants*. Should I change this line in app/ants_general_aero/rose-app.conf
It looks like the ANCIL and ANTS processes went fine, except that qrclim.sulpdms has value zero everywhere. This can be replaced with the file I created from the global data using python and xancil. So this is progress.
But Regn1_Brit2Port_RA2M_um_fcst_000 fails with the same error as above… INBOUNDA: Boundary data starts after start of current boundary data interval
I forgot to set I_override_date_time back to 0 for the regional model. I think this was causing the INBOUNDA problem.
Now I have a different problem. I’ll ask you about this after I make some checks and tests.
I was having a problem similar to the one at the top of this page:
? Error from routine: WGDOS Packing (f_shum_wgdos_pack)
? Error message: Problem packing field...
? STASH: 2380
? Accuracy: -24
? Minimum: 0.5540517937E-20
? Maximum: 0.4970327279E+04
? Message: Unable to WGDOS pack to this accuracy
The difference is that I do request STASH item 2-380 and want to have this output.
But ‘Accuracy: -24’ sounds strange.
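A rough way to see why this combination might fail: assuming the accuracy is a power-of-two exponent (values stored to the nearest 2**accuracy, which is my understanding of WGDOS packing but is not stated in this thread), the packer has to represent (maximum - minimum) / 2**accuracy distinct steps. The Python sketch below only does that arithmetic. The function name `bits_needed` is made up for illustration, and the real packer's limit is not given here.

```python
import math


def bits_needed(minimum, maximum, accuracy):
    """Bits required to index every step of size 2**accuracy between min and max."""
    step = 2.0 ** accuracy
    levels = (maximum - minimum) / step
    return math.ceil(math.log2(levels + 1))


# Values copied from the STASH 2380 error message above.
print(bits_needed(0.5540517937e-20, 0.4970327279e04, -24))  # 37
```

Under that assumption the field would need 37 bits per value, more than fits in a 32-bit integer, so the large dynamic range of the field (about 1e-20 to 5e3) rather than the accuracy alone could be what the packer objects to. The same calculation for the STASH 431 case at the top of the page gives well over 100 bits.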
I set packing=0 in /home/d03/myosh/roses/u-ci960/app/glm_um/opt/rose-app-lbc.conf and packing=0 for pp2 used by upc through which I request item 2-380. I don’t know why but it seems to be running fine now.
u-ci960 ran fine for two 24 hour simulations, so I think it is mostly fine (except for some minor issues).
u-ci985 is a copy of u-ci960 but I made both nests a little larger. (I initially tried including a third nest to the SW of Britain but it wasn’t created correctly.)
In this suite Regn1_Brit2PortL_RA2M_um_fcst_000 fails with the error message “Convergence failure in BiCGstab after 1 iterations: omg is too small”. This is the same error message as before (above), but that time it occurred in the global model.
I compared u-ci960 and u-ci985 and now I can’t see a difference that might cause the problem…
This problem does not go away even if I run the suite as new.
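For the comparison of u-ci960 and u-ci985 mentioned above, a line-by-line diff of the corresponding config files can make small differences easier to spot than reading them side by side. This is a minimal Python sketch using the standard `difflib` module. The section and key names in the sample text are hypothetical and are not taken from either suite.

```python
import difflib


def diff_configs(text_a, text_b, name_a="a", name_b="b"):
    """Return unified-diff lines between two config texts (empty list if identical)."""
    return list(
        difflib.unified_diff(
            text_a.splitlines(),
            text_b.splitlines(),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )


# Hypothetical config fragments, for illustration only.
conf_a = "[namelist:example]\nnx=100\nny=80\n"
conf_b = "[namelist:example]\nnx=120\nny=80\n"

for line in diff_configs(conf_a, conf_b, "u-ci960", "u-ci985"):
    print(line)
```

In practice the two texts would be read from the matching files in each suite directory (for example with `pathlib.Path(...).read_text()`), one pair of files at a time.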