Add L2SM model into JULES

Request to add the L2SM model into JULES as an optional replacement for the current Sellers scheme. Ideally the new branch should be submitted to the trunk.

The master version of L2SM is here:

but this is a very generalised form of the model, written in Python, and we (probably Tristan) would need to boil it down into a single Fortran function.

Note - a key task will be deciding how we input vertically varying canopy properties to JULES. Probably, in the first instance, this should be done on a per-PFT basis in the appropriate NML file.
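For illustration only, a per-PFT entry might look something like the sketch below. The group is the existing jules_pftparm namelist, but the variable name and the layer dimension are placeholders rather than existing JULES inputs:

    &jules_pftparm
      ! Hypothetical per-PFT, per-layer LAI fractions for L2SM
      ! (5 PFTs x 4 canopy layers in this sketch; the name is a placeholder)
      lai_frac_layer_io = 0.10, 0.20, 0.30, 0.40,
                          0.10, 0.20, 0.30, 0.40,
                          0.25, 0.25, 0.25, 0.25,
                          0.25, 0.25, 0.25, 0.25,
                          0.25, 0.25, 0.25, 0.25
    /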

Thanks, Tristan:
I will get working on this.
Patrick

Hi Tristan:
Just a quick check-in.
I was able to fork your layeredCanopyTwoStream repository, then run both layeredCanopy_wrapper_example.py and leafGeometry.py with python2 on JASMIN. The first program produced numbers and the second program produced plots, both without error.
The checked-out code is now in:
/gws/nopw/j04/odanceo/pmcguire/layeredCanopyTwoStream_pcm/layeredCanopyTwoStream

I guess you want me to translate the function get_layered_canopy_fluxes( ) into FORTRAN. Let me know if this is not the case. Otherwise, I will try to do that next week.
Patrick

Hi Tristan:
I have studied a bit of the JULES radiative-transfer (RT) FORTRAN source code, which uses the Sellers scheme. I am working on JASMIN, currently with the layered-output-canopy branch of JULES 5.2 trunk, which I call vn5.2_sif1. The RT code I looked at is here: ~pmcguire/jules/vn5.2_sif1/src/science/radiation/albpft_jls_mod.F90

I had the idea of comparing the (Python) code differences between your L2SM model (which uses the Meador & Weaver two-stream scheme) and the Sellers scheme. This led me to another repository of yours:

The albpft( ) FORTRAN function in albpft_jls_mod.F90 isn’t a standalone function for vegetation. It also contains code for urban surfaces, for snow-covered canopies, and for matching observed albedos with l_albedo_obs=TRUE. But I guess we mainly just want an L2SM version of albpft( )?

Patrick


Hi Tristan & Meg:
I have been able to add the L2SM model to JULES. It’s a first-draft version of the code, and I am sure there are mistakes of various sorts. It is a manual translation into FORTRAN of the Python code in the GitHub repository tquaife/layeredCanopyTwoStream.

The new code is checked into a MOSRS branch at https://code.metoffice.gov.uk/svn/jules/main/branches/dev/patrickmcguire/r25125_vn5.2_layeredCanopyTwoStream

I modified a GL7 (global land) N96 suite so that it uses this branch of JULES 5.2. The new suite is u-cw591, and it is currently running: it has compiled the code, passed the RECON step, and is more than one year into a 20-year spinup cycle at the moment. The output values will probably show mistakes of some sort, but it is running! I added a new can_rad_mod=7 option to switch from the Sellers scheme (can_rad_mod=6) to L2SM; apart from the radiation scheme itself, the can_rad_mod=7 option behaves like can_rad_mod=6.
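In the suite, the switch itself is just a namelist setting along these lines (a sketch only; can_rad_mod sits in the jules_vegetation namelist, with the other items in that namelist left at their existing values):

    &jules_vegetation
      can_rad_mod = 7   ! hypothetical new value selecting L2SM; 6 = existing Sellers-based option
    /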
Patrick


Hi Tristan & Meg:
The L2SM GL7 (global land) N96 suite did run on JASMIN for a few years (1860-1863) before crashing sometime during 1864. It turned out to be running low on memory. I increased the memory requested from SLURM on the JASMIN LOTUS batch-processing nodes to near the normal maximum of 64GB, and it has now run for 10 years (1860-1869).

It still has many years to go to reach the 21st century. The L2SM version currently takes about 90 minutes per year, whereas the original Sellers version took about 4 minutes per year.

Also, the L2SM version gives negative annual-average GPP values in many places, whereas the original Sellers version gives non-negative GPP everywhere. So there is certainly something that needs to be fixed.
Patrick

Hi Tristan & Meg:
I started trying to figure out why the GPP carbon-flux values were negative in places. They should always be non-negative.

It looks like some of my code changes for putting L2SM into JULES were never being reached when I thought they should be. This turns out to be because l_getprofile is hard-wired to .false. in the code, so the can_rad_mod == 7 changes are never reached in the file ~pmcguire/jules/r25125_vn5.2_layeredCanopyTwoStream/src/science/radiation/albpft_jls_mod.F90.

I am trying out hardwiring it with l_getprofile=.true. in ~pmcguire/jules/r25125_vn5.2_layeredCanopyTwoStream/src/science/radiation/tile_albedo_jls_mod.F90.

There is a comment that says:

LOGICAL, INTENT(IN) :: l_getprofile
                       ! TRUE means profiles through the
                       ! canopy are calculated
                       ! FALSE means no profiles calculated

This l_getprofile LOGICAL variable is there both in the JULES 5.2 trunk and in the canopy-layering branch
https://code.metoffice.gov.uk/svn/jules/main/branches/dev/patrickmcguire/vn5.2_sif1

The new suite that uses the latest revised version of the L2SM JULES branch is u-cx323.
It is running right now; we should know, maybe by tomorrow, whether it runs better with l_getprofile=.true.
Patrick

Thanks Patrick.

Hi Tristan:
It turns out that setting l_getprofile=.true. was the wrong thing to do. This gets set to true when the albpft_jls subroutine is called in an alternative way.

It turns out that I had a mistake in the can_rad_mod == 7 logic, which is why the can_rad_mod == 7 section wasn’t being reached in the albpft_jls subroutine. I have fixed this mistake and am rerunning now, so hopefully it works better this time.
Patrick


Hi Tristan:
I have been through several more rounds of debugging now. With the can_rad_mod == 7 logic fixed, and a couple of other bugs fixed, the next bug is that the code fails when cosz_gb = mu = 0.0, due to division by zero. I have put in an if statement to skip these cases.
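The guard is essentially a conditional around the L2SM calculation, something like the sketch below (structure is illustrative only, with the call left as a comment rather than the actual routine; the 0.01 threshold is the value mentioned in the next paragraph):

    ! Sketch of the guard: skip the L2SM calculation when the sun is at or
    ! below the horizon, where mu = cos(solar zenith angle) ~ 0 and the
    ! two-stream expressions would divide by zero.
    IF ( mu > 0.01 ) THEN
      ! ... do the can_rad_mod == 7 (L2SM) two-stream calculation ...
    ELSE
      ! ... leave the fluxes/albedos at their default (night-time) values ...
    END IF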

I had some print statements embedded, and prior to adding this if (mu > 0.01) statement, the L2SM code in JULES was producing a fair number of NaNs in the recon and spinup phases of the suite. With the statement added, there were no NaNs in the recon phase; the L2SM code was producing float values that didn’t seem unreasonable. However, the recon phase then failed because the print statements filled the available disk space with log files. I have now commented out the print statements in the L2SM code, and the recon phase succeeds.

Maybe now the spinup phase will succeed, and the main-run phase of the suite will start producing output in which the GPP carbon flux is no longer < 0.0.
Patrick

Thanks Patrick.

I assume the Sellers code must also need to guard against mu=0.0, as it also has to divide by mu. I’ll check the code later, but from memory it sets a variable K=G/mu.

Hi Tristan
Yes. Both the Sellers code and the L2SM code need to guard against division by mu=0. I think mu=0 when the sun doesn’t rise, due to the eternal night of winter in the Arctic.
Patrick

Hi Tristan:
The code now seems to run the L2SM sub-code (unlike before) and get to the main_run of the JULES suite u-cx323, after spinning up.

It previously crashed during spinup for some mysterious reason. I think it was running out of memory: the normal maximum memory for a JASMIN node is 64GB, and I had it set to use the maximum. The global JULES suite does domain decomposition, normally with 10 processors, each handling 1/10th of the global domain. With the par-multi queue setup that we use for this suite, 3 of those processors were assigned to the same node, and I think having 3 such JULES processes on one node is what exhausted the memory. My initial solution is to double the number of processors from 10 to 20, which should reduce the memory usage per processor by up to a factor of 2. I tried this, and this time it is running without crashing mysteriously.

But the GPP is still often < 0, so I still need to work on that. I will try adding some print statements within the L2SM code for the period after spinup.
Patrick

Hi Tristan:
I suspect one reason that GPP<0 with the L2SM code in JULES is that I don’t currently return all of the output variables from the albpft routine:

Currently, these values are returned in normal non-L2SM usage (can_rad_mod == 5):

            ! Net downward fluxes (down minus up) at layer boundary i, for
            ! direct-beam and diffuse illumination respectively:
            rnet_dir(i) = rdird - rdiru
            rnet_dif(i) = rdifd - rdifu

            ! Fraction of incident PAR absorbed in layer i, per unit LAI:
            fapar_dir(l,n,i) = (rnet_dir(i-1) - rnet_dir(i)) / dlai
            fapar_dif(l,n,i) = (rnet_dif(i-1) - rnet_dif(i)) / dlai

            ! Partition of the direct-beam absorption into its unscattered
            ! (dir2dir) and scattered/diffuse (dir2dif) components; om is the
            ! leaf scattering coefficient and dabeer_dla the Beer's-law
            ! direct-beam absorption per unit LAI:
            fapar_dir2dir(l,n,i) = (1.0 - om) * dabeer_dla(i)
            fapar_dir2dif(l,n,i) =  om * dabeer_dla(i) + fapar_dir(l,n,i)
            fapar_dif2dif(l,n,i) = fapar_dif(l,n,i)

            ! Sunlit fraction of layer i (k is the direct-beam extinction coefficient):
            fsun(l,n,i) = exp_minus_kla * (EXP(k * dlai) - 1.0) / (k * dlai)

            ! Scale by the canopy structure factor for this PFT:
            fapar_dir(l,n,i)     = fapar_dir(l,n,i) * can_struct_a(n)
            fapar_dif(l,n,i)     = fapar_dif(l,n,i) * can_struct_a(n)
            fapar_dir2dir(l,n,i) = fapar_dir2dir(l,n,i) * can_struct_a(n)
            fapar_dir2dif(l,n,i) = fapar_dir2dif(l,n,i) * can_struct_a(n)
            fapar_dif2dif(l,n,i) = fapar_dif2dif(l,n,i) * can_struct_a(n)

But in the L2SM version with can_rad_mod == 7, I currently only return a couple of the variables (fapar_dir & fapar_dif). I need to go through the math and return the other variables besides those two.

            !Original L2SM Python code from Tristan Quaife:
            !Idn=self.layers[layer].Idn_dir
            !Iup=self.layers[layer].Iup_dir
            !self.layers[layer].Iab_dir=np.abs(Iup_below-Iup+Idn_above-Idn)
            !Idn=self.layers[layer].Idn_dif
            !Iup=self.layers[layer].Iup_dif
            !self.layers[layer].Iab_dif=np.abs(Iup_below-Iup+Idn_above-Idn)

            !initial L2SM FORTRAN code using JULES variable names, by Patrick McGuire
            rdn = rdirdn(i)
            rup = rdirup(i)
            fapar_dir(l,n,i) = ABS(rdirup_below-rup+rdirdn_above-rdn)
            rdn = rdifdn(i)
            rup = rdifup(i)
            fapar_dif(l,n,i) = ABS(rdifup_below-rup+rdifdn_above-rdn)

Patrick

Hi Patrick,

It might be quicker for me to take you through what is required here. You should be able to pull those calculations from other parts of the code, but when we have vertically varying canopy properties they will need to be slightly different.

The basis of the equations JULES uses is given in this paper:

Thanks,
Tristan.

Hi Tristan:
Thanks.
For the can_rad_mod == 7 version (L2SM), I did lift the equations for fsun, fapar_dir2dir, fapar_dir2dif, etc. verbatim from the can_rad_mod == 5 version (as quoted in my previous post here). I assumed that can_struct_a(NPFT) == 1.0, as it is in the GL7 suite, which can be revisited if necessary. The new suite (u-cx605) compiles and is starting to run now.
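Concretely, with can_struct_a(n) = 1.0 the final scaling lines become no-ops and the reused expressions look roughly like this (a sketch using the variable names from the can_rad_mod == 5 excerpt quoted earlier; dabeer_dla would need to be supplied from the L2SM direct-beam fluxes):

            ! Sketch only: can_rad_mod == 5 expressions reused in the L2SM branch,
            ! assuming can_struct_a(n) = 1.0 so the final scaling drops out.
            fapar_dir2dir(l,n,i) = (1.0 - om) * dabeer_dla(i)
            fapar_dir2dif(l,n,i) =  om * dabeer_dla(i) + fapar_dir(l,n,i)
            fapar_dif2dif(l,n,i) = fapar_dif(l,n,i)
            fsun(l,n,i) = exp_minus_kla * (EXP(k * dlai) - 1.0) / (k * dlai)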
Patrick

Hi Tristan:
The suite u-cx605 did start running, but it soon failed in the RECON step. I had some trouble tracking this down: using jules_print() statements wasn’t enough, so I also switched from JULES_BUILD=normal to JULES_BUILD=debug. The logs then showed that in the albpft( ) code I was addressing one of the L2SM arrays with a 0 index when its lowest index was actually 1.
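For what it’s worth, the shape of that bug is illustrated below with made-up names (not the actual JULES declarations): a layer array that needs a value from the boundary above the top layer must have a 0 lower bound, otherwise the reference at i-1 = 0 is out of bounds.

    ! Illustrative fragment only (made-up names): if layer i references the
    ! boundary above it, the array needs a 0 lower bound.
    INTEGER, PARAMETER :: nlayers = 10
    REAL :: rdn_layer(0:nlayers)   ! element 0 = flux incident at the canopy top
    REAL :: dflux
    INTEGER :: i
    DO i = 1, nlayers
      dflux = rdn_layer(i-1) - rdn_layer(i)   ! i-1 = 0 is valid with this bound
    END DO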

I also found a division-by-zero-snowdepth error in an unrelated file, relayersnow_mod.F90, which I have temporarily disabled.

I also found that my lower limit of mu = cos_solar_zenith >= 0.01 still allowed overflows in the albpft( ) code, with tau/mu >= 38 and EXP(tau/mu) overflowing in some cases. So I changed the lower limit on mu to 0.05, which allowed the code to get through the RECON step.

Now it gets into the SPINUP step, but there are multiple different errors on different processors after running for up to 6 months of the year 1860. I am currently using 40 processors for the domain decomposition, having increased first from 10 to 20 to avoid running out of memory and then from 20 to 40 to try to get rid of the aforementioned overflow. But making it to the SPINUP step is progress, as I can now look at the end dump from the RECON step to check that the numbers appear legitimate.

I do note that with JULES_BUILD=debug, once it makes it through the RECON step, it fails with an overflow in the compactsnow( ) routine, and possibly elsewhere. So I am now running with the maximum number of snow layers (the nsmax variable in the Rose/cylc suite) changed from 3 to 0, in order to avoid the compactsnow( ) routine’s overflow.

With these settings, in the SPINUP step there is another overflow in the albpft( ) code at the same spot. So I raised the minimum mu=cos_solar_zenith further, from 0.05 to 0.10. Now it can run through at least 2 spinup years on each of the 40 processors handling the subdomains; previously it was limited to no more than 6 to 10 months of the first spinup year (1860) before overflowing. It should start producing output after 10 spinup years. Maybe the output will now have the GPP carbon flux >= 0.0, as it should.

Patrick


Thanks Patrick.

Long term, I am not sure that having such a big threshold on mu is appropriate. It is probably better to work out what is going on. The L2SM model itself should be fine with any value >0.0 and hence the numbers it returns to the rest of JULES should be OK.

Tristan.