JULES NEON suite

Hi,

I’d like to have a suite to run JULES at the NEON terrestrial sites in the US. Observed meteorology should be available via the NEON API, but reanalysis could be used as an alternative.

Thanks,
Tristan.

Thanks, Tristan:
I have started working on this, using the u-al752 JULES FLUXNET suite as a basis. We have contacted the primary owners of u-al752 (Karina Williams and Anna Harper), and they have pointed me to some code that could be adapted to add the NEON sites to the u-al752 suite.

I also tried to download some of the NEON data from their website. It is a bit slow to download. Maybe it’s faster (maybe there isn’t any data throttling) if I get a NEON account? I will try that.
Patrick

I don’t know if having an account will speed up the download - I suspect not, but worth a try.

Hi Tristan:
I thought I saw somewhere on the NEON site that there might be throttling of some sort for users without a NEON account. But I can’t immediately find that now. Anyway, I will try downloading after creating an account soon.
Patrick

Hi Karina and Anna
Tristan suggests making a suite that can run JULES for the NEON terrestrial sites (approx. 30 sites).

When I heard this idea, I immediately thought of the u-al752 JULES FLUXNET suite that I worked with you to get working on JASMIN. I suggested to Tristan that maybe I could adapt that suite to work with the NEON sites.

Tristan doesn’t think it is absolutely necessary for the suite to be able to overplot the time series of the tower observations on the same plot as the time series of the JULES model outputs, but I think that capability is interesting and useful, and worth porting from your suite to the NEON-capable version.

Would it be OK to adapt your u-al752 suite so that it can run JULES for the NEON sites? Do you know of anybody else who has already done something like this? You do have some tools that are not part of the u-al752 JULES FLUXNET suite for preprocessing the atmospheric-forcing data and populating the soil/vegetation/etc. ancillary data fields. Do you think those tools would be useful for the NEON sites?
Patrick

Hi Patrick,

Sure, you can use u-al752 - you could add the NEON sites directly to the u-al752 trunk if you like?

I’ll forward the info summary of how to put new FLUXNET sites into the suite, as lots of that can be adapted I think.

You do have some tools that are not part of the u-al752 JULES FLUXNET suite for preprocessing
the atmospheric-forcing data and populating the soil/vegetation/etc. ancillary data fields. Do you
think those tools would be useful for the NEON sites?

Yes, I think so. The code is stored at

https://code.metoffice.gov.uk/trac/utils/wiki/smstress_jpeg

https://code.metoffice.gov.uk/trac/roses-u/browser/a/l/7/5/2/u-al752-processing/bin

All of this code works on Python 2.7 (if you need a Python 2.7 environment to run the scripts, you can use the MONSOON postprocessor). Some of the scripts work in both Python 2.7 and Python 3 (using future imports, e.g. https://code.metoffice.gov.uk/trac/utils/browser/smstress_jpeg/trunk/jules.py#L18). If you fancy adapting some of the other scripts to also run on both Python 2.7 and 3, that would give us a lot more flexibility for running the suite, especially

https://code.metoffice.gov.uk/trac/utils/browser/smstress_jpeg/trunk/fluxnet_evaluation.py

and

https://code.metoffice.gov.uk/trac/utils/browser/smstress_jpeg/trunk/test_fluxnet_evaluation.py

(I’ve been using this script for lots of non-FLUXNET sites, so it should hopefully be useful for NEON sites too)
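For reference, the future-import pattern mentioned above (as in jules.py) looks roughly like the sketch below; the helper function is purely illustrative, not taken from the suite.

```python
# Sketch of the future-import pattern (as in jules.py) that lets a script
# run under both Python 2.7 and Python 3. The helper below is purely
# illustrative, not from the suite.
from __future__ import print_function, division

def mean_of(values):
    # true division under both interpreters, thanks to the division import
    return sum(values) / len(values)

print(mean_of([1, 2]))
```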

Cheers,

Karina

Hi Patrick

That sounds great, I don’t know of other work to add NEON sites to the suite. Thanks Karina for sending along the scripts!

Cheers,
Anna Harper

Hi Fred [and Patrick]:
[originally sent by email to Fred Otu-Larbi, CC: Patrick McGuire et al., in July 2020]

I’ve added site BR-Sa3 to u-al752 and [in the next post in this 2022 ticket, I attach] a description of how I did it (notes_on_adding_a_site.txt), which hopefully you can follow for the sites you need: US-Blo and US-Sa1 [or the NEON sites]. But if any steps don’t make sense, send me an email and we can work through them together over Skype. I’m sure there’ll be things I forgot to write down, so if there’s anything not immediately clear from the description, email me.

These are the changes made in u-al752:
https://code.metoffice.gov.uk/trac/roses-u/changeset?reponame=&new=164463%40a%2Fl%2F7%2F5%2F2%2Ftrunk&old=164432%40a%2Fl%2F7%2F5%2F2%2Ftrunk
and these are the changes made in u-al752-processing:
https://code.metoffice.gov.uk/trac/roses-u/changeset?reponame=&new=164464%40a%2Fl%2F7%2F5%2F2%2Fu-al752-processing&old=164428%40a%2Fl%2F7%2F5%2F2%2Fu-al752-processing

[I’ve sent plots by email] showing LBA-K83 and BR-Sa3.

It doesn’t include any processing of soil moisture or LAI at the moment - that’ll be the next stage.

Next, I’ll plot the soil moisture observations from FLUXNET2015, Ameriflux and LBA-Ecology Km 83 Data, Publications, and Presentations for this site, and see how they compare to the observations for this site from LBA-MIP.

cheers,

Karina

notes_on_adding_a_site.txt
Karina Williams
6.7.2020

Adding a new FLUXNET2015 site BR-Sa3 (without prescribing soil moisture or LAI)

Changes

These are the changes made in u-al752:https://code.metoffice.gov.uk/trac/roses-u/changeset?reponame=&new=164463%40a%2Fl%2F7%2F5%2F2%2Ftrunk&old=164432%40a%2Fl%2F7%2F5%2F2%2Ftrunk
and these are the changes made in u-al752-processing:https://code.metoffice.gov.uk/trac/roses-u/changeset?reponame=&new=164464%40a%2Fl%2F7%2F5%2F2%2Fu-al752-processing&old=164428%40a%2Fl%2F7%2F5%2F2%2Fu-al752-processing

Steps

  1. Download Tier 1+2 FLUXNET2015 data for BR-Sa3. Unzip. Check out the trunk and processing branches if you don’t have them already:
    cd ~/roses
    svn co https://code.metoffice.gov.uk/svn/roses-u/a/l/7/5/2/trunk u-al752
    svn co https://code.metoffice.gov.uk/svn/roses-u/a/l/7/5/2/u-al752-processing u-al752-processing
    If you have them already, update them (“svn update”) to check you have the latest version.

  2. Create a file called filelist_br-sa3.txt from an edited version of the ‘download manifest’ text that appears on the screen giving the download links. Because only one more site is being added, this file just contains the line:
    BR-Sa3,2020-06-30/FLX_BR-Sa3_FLUXNET2015_FULLSET_HH_2000-2004_1-4.csv,2000,2004,1-4,FLUX-MET,202001230043
    Put the name of this file in u-al752-processing/bin/convert_for_jules_edited_for_suite_u-al752.py (see the code comments in the main function for more information) and run this script. This produces files in u-al752-processing/processed_files/drive and u-al752-processing/processed_files/fluxes/subdaily_obs

  3. Change the site list in u-al752-processing/bin/create_daily_GMT_files.py and run. This produces files in u-al752-processing/processed_files/fluxes

  4. Create a new folder vn1.2 within your suite_data folder. Copy across the files from vn1.1. Copy across the new file u-al752-processing/processed_files/drive/BR_Sa3-met.dat and the new file u-al752-processing/processed_files/fluxes/BR_Sa3-energyandcarbon-dailyUTC.dat

  5. Add sites to the site_info_fluxnet dictionary in the get_site_info function in u-al752-processing/bin/generate_conf_files.py.
    Specify the type if you don’t want to use the category given in u-al752-processing/bin/fluxnet2015_site_info.csv. The file with this type in the filename must exist in u-al752/ancil/tilefracs (if it’s not there already, then create it). Years are the same as in the original ‘download manifest’. HH is timestep 1800 seconds and HR is timestep 3600 seconds.
    i.e. add
    "BR_Sa3" : {"drive_startyear" : 2000, "drive_endyear" : 2004,
    "type" : "BET-Tr",
    "timestep" : 1800,
    },

  6. If you have site fractions of sand/silt/clay, add them to u-al752-processing/bin/soil_texture.dat
    If you have other site obs for soil, e.g. wilting point and field capacity, add them explicitly to the make_soil_props_files function in u-al752-processing/bin/create_presc_soil_moisture.py (following the methods used for AT_Neu, BE_Vie, FI_Hyy).
    If you have used site sand/silt/clay or other data, add this information to the function print_soil_properties in u-al752-processing/bin/generate_soil_props.py (e.g. see notes_dict['AT-Neu'])
    Make links to files qrdata.soil.texture and HadGEM2ES_Soil_Ancil.nc, and the folder suite_data in the folder u-al752-processing/links

  7. Add new site to site_info in u-al752/var/info.inc. Change DATA_VERSION in u-al752/suite.rc to 1.2.

  8. Make sure Python has access to the scripts listed in u-al752-processing/bin/README. Run u-al752-processing/bin/generate_soil_props.py (this will generate the files of soil properties in u-al752/ancil/soil and the ancil conf file u-al752/app/jules/opt/rose-app-ancil-BR_Sa3.conf)
    Compile u-al752-processing/bin/display_soil_props_table.tex (i.e. “pdflatex display_soil_props_table.tex”) and check that the resulting file u-al752-processing/bin/display_soil_props_table.pdf looks ok.

  9. Create a link within suite_data called link-to-current-version that points to vn1.2.
    Run u-al752-processing/bin/generate_conf_files.py. This will create the drive conf file u-al752/app/jules/opt/rose-app-drive-BR_Sa3.conf. Check that there are no messages saying a file is an unexpected length (i.e. no warning that the driving data does not have the expected amount of data).

  10. “svn add” all the new files in u-al752. Run the suite to check it works ok with the new sites. Compare the new plots to the plots made by the last version. Commit changes to the u-al752 and u-al752-processing folders. Copy the vn1.2 data to MONSooN. Make a tarball of the vn1.2 data (change the version number, SUITE_DIR and TMP_DIR in u-al752-processing/bin/archive_data.bat and run) and put it on Dropbox. N.b. the tarball on Dropbox shouldn’t include the soil moisture and LAI files until we have sorted out what can be used beyond the original study (i.e. Anna’s paper) and provided guidelines on how the data should be credited. Update the table on the twiki page (https://code.metoffice.gov.uk/trac/jules/wiki/FluxnetandLbaSites) and the “processed obs files” section. Attach plots and display_soil_props_table.pdf. Add any particular information about the site to the “Site specific information” section. Email the rest of the u-al752 team.

N.b. some of the sites are also in the LBA and/or Ameriflux networks (BR-Sa3 is in both) - so these are other places to download data from. There could also be extra information on site websites, e.g. BR_Sa3 has data at LBA-Ecology Km 83 Data, Publications, and Presentations.
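As a sanity check on step 2, the filelist line can be split into its fields as below; the field meanings are inferred from the example line, so they should be checked against convert_for_jules_edited_for_suite_u-al752.py before relying on them.

```python
# Parse one line of filelist_br-sa3.txt (step 2 above). Field meanings are
# inferred from the example line; check them against
# convert_for_jules_edited_for_suite_u-al752.py.
line = ("BR-Sa3,2020-06-30/FLX_BR-Sa3_FLUXNET2015_FULLSET_HH_2000-2004_1-4.csv,"
        "2000,2004,1-4,FLUX-MET,202001230043")

site, csv_path, startyear, endyear, version_range, kind, timestamp = line.split(",")
startyear, endyear = int(startyear), int(endyear)

# Per step 5: "HH" in the filename means an 1800 s timestep, "HR" means 3600 s
timestep = 1800 if "_HH_" in csv_path else 3600
```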

Hi Tristan
I have looked at the temperature data for the NEON SCBI site. I am wondering what heights the 5 vertical levels are at. Do you know where I can find this information? I did all sorts of Google searches for it, and I can’t see the vertical level heights in the data files. More generally, a table of the vertical levels for the various sites would be useful to me, if it’s not in each site’s dataset somewhere already.

I have one month of data here in the landsurf_rdg GWS on JASMIN, if you’d like to look at it. You would need to apply for access to that GWS first. This is the path for the 1 month of NEON data:

/gws/nopw/j04/landsurf_rdg/pmcguire/TristanQ/NEON.D02.SCBI.DP4.00200.001.2017-01.basic.20220120T173946Z.RELEASE-2022

Patrick

Hi Patrick,

Which data product is it? Taking DP1.00002.001 (Single aspirated air temperature) as an example, then you can check the data endpoint of the API:

https://data.neonscience.org/api/v0/data/DP1.00002.001/SCBI/2022-02

(Note - my choice of month here was arbitrary). That endpoint just returns the URL of a bunch of files, and one of them is:

NEON.D02.SCBI.DP1.00002.001.sensor_positions.20220313T193816Z.csv

And that contains the sensor heights. Full URL is:

https://storage.googleapis.com/neon-publication/NEON.DOM.SITE.DP1.00002.001/SCBI/20220201T000000--20220301T000000/expanded/NEON.D02.SCBI.DP1.00002.001.sensor_positions.20220313T193816Z.csv

I assume there will be similar files for other data types too. What I don’t fully understand is why the heights are described as an “offset” … offset from what? Presumably the ground, but I am not sure.
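For what it’s worth, that lookup can be scripted; here is a minimal sketch, assuming the v0 data endpoint returns JSON shaped like {"data": {"files": [{"name": ..., "url": ...}, ...]}} (which is what the endpoint above appears to return).

```python
# Find the sensor_positions file for a product/site/month via the NEON API.
# Assumes the v0 data endpoint returns JSON shaped like
# {"data": {"files": [{"name": ..., "url": ...}, ...]}}.
import json
from urllib.request import urlopen

def pick_sensor_positions(files):
    """Return the URL of the first sensor_positions file in a files list."""
    for f in files:
        if "sensor_positions" in f["name"]:
            return f["url"]
    return None

def sensor_positions_url(product, site, month):
    endpoint = ("https://data.neonscience.org/api/v0/data/%s/%s/%s"
                % (product, site, month))
    with urlopen(endpoint) as resp:
        return pick_sensor_positions(json.load(resp)["data"]["files"])

# e.g. sensor_positions_url("DP1.00002.001", "SCBI", "2022-02")
```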

Cheers,
Tristan.

Hi Tristan

That’s impressive that you found that info with little trouble.

I had downloaded with the browser one month of data from “Bundled data products - eddy covariance” in the landsurf_rdg GWS on JASMIN.

/gws/nopw/j04/landsurf_rdg/pmcguire/TristanQ/NEON.D02.SCBI.DP4.00200.001.2017-01.basic.20220120T173946Z.RELEASE-2022

The weblink that has this data is, for example:

https://data.neonscience.org/data-products/DP4.00200.001

But I can’t figure out which buttons to push on the website to get to the link for the sensor_positions file that you found via the API data endpoint

https://data.neonscience.org/api/v0/data/DP1.00002.001/SCBI/2022-02

Can you tell me how you found that API endpoint link?

I have been starting out with

https://data.neonscience.org/data-products/explore

Patrick

Hi,

I normally start with the site endpoint, e.g. for SCBI:

https://data.neonscience.org/api/v0/sites/SCBI

and that lists all the products for that site and the dates they are available. Then I use the data endpoint to examine what data is available for a given product and month, e.g.:

https://data.neonscience.org/api/v0/data/DP1.00002.001/SCBI/2022-02

There’s normally a bunch of metadata included at that level, but it can be a pain to spot. Is it possible this is why downloads were slow for you? If you use the data endpoint, it gives you the addresses of the individual CSV files, so you can fine-tune what you want.
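This two-step workflow can be scripted too; a sketch, assuming the v0 sites endpoint returns JSON shaped like {"data": {"dataProducts": [{"dataProductCode": ..., "availableMonths": [...]}, ...]}} (which matches what the endpoint above shows).

```python
# List the products available at a NEON site and their months. Assumes the
# v0 sites endpoint returns {"data": {"dataProducts": [
#   {"dataProductCode": ..., "availableMonths": [...]}, ...]}}.
import json
from urllib.request import urlopen

def months_by_product(site_json):
    """Map product code -> available months from a sites-endpoint response."""
    return {p["dataProductCode"]: p["availableMonths"]
            for p in site_json["data"]["dataProducts"]}

def site_products(site):
    with urlopen("https://data.neonscience.org/api/v0/sites/%s" % site) as resp:
        return months_by_product(json.load(resp))

# e.g. site_products("SCBI") should include "DP4.00200.001"
```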

Cheers,
Tristan.

Thanks, Tristan:
I understand the site endpoint better now (https://data.neonscience.org/api/v0/sites/SCBI).
I see the various data products for the SCBI Smithsonian NEON site, including DP4.00200.001 (Bundled data products - eddy covariance). And I now know how to find out sensor positions for unbundled products.

I am now downloading almost 6 years of DP4.00200.001 bundled data for SCBI from the https://data.neonscience.org/data-products/explore website. This amounts to 5.5 GB, and it is being downloaded at almost 4 MB/s, which is much faster than before. Previously, I attempted to download a dataset from there, and it was going at only 0.6-0.7 MB/s, so I gave up. The 5.5 GB download is more than half done now, and it has only been about 10 minutes.
Patrick

Hi Tristan:

Anecdotal download trials suggest that downloading from the NEON site with a NEON account can be about 2x faster than downloading without one, though I am not sure this holds all the time.

Furthermore, I have written some Python code that runs on JASMIN that can read & manipulate/plot/etc. the Bundled Eddy Covariance temperature data (in HDF format) from the NEON repository. There are also the equivalents of sw_down, lw_down, presBaro, and wind data in this bundled file. There is also the precip data and the VERTLEVEL=000 relative humidity data that is available in separate CSV files.

So I am continuing to revise the code so that it can read in all the data. Below is what I have so far, as Python lists of strings to iterate over in Python. Maybe some of these variables need adjustment? For example, should sw_down and lw_down be at vertlevel=000 or at the vertlevel of the tower top? Another example: should pstar be at the soil surface or at vertlevel=35? The star generally means canopy top in JULES, I think.

Patrick

#tTop is an extra one for temperature at the top of the tower, t is at VERTLEVEL=000, to match the RH measurement location. 
#We need to compute q = specific humidity, which is not relative humidity as in the NEON files, so I think it is important to use 'tempRHMean' for this calculation of q

#JULES variable names
var=['t',          'tTop',       'sw_down',    'q',      'lw_down',    'pstar',      'precip',        'wind'            ]
#data file source
va0=['RH',         'BundleEddy', 'BundleEddy', 'RH',     'BundleEddy', 'BundleEddy', 'PRIPRE',        'BundleEddy'      ]
#variable names in data file
va1=['tempRHMean', 'tempAirTop', 'radiNet',    'RHMean', 'radiNet',    'presBaro',   'priPrecipBulk', 'soni'            ]
#vertlevel and time for variable name
va2=[None,         '000_040_30m','000_040_30m',None,     '000_040_30m','000_035_30m',None,            '000_040_30m'     ]
#sub-variable name
va3=[None,         'temp',       'radiSwIn',   None,     'radiLwIn',   'presAtm',    None,            'veloXaxsYaxsErth']
#statistic of sub-variable (i.e., mean, max, min)
va4=[None,         'mean',       'mean',       None,     'mean',       'mean',       None,            'mean'            ]
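For the q calculation mentioned above, here is a minimal sketch of converting relative humidity plus the co-located temperature (tempRHMean) and pressure into specific humidity. It uses Bolton’s (1980) fit for the saturation vapour pressure; the constants are textbook values, not from the suite.

```python
import math

def specific_humidity(temp_c, rh_percent, pressure_pa):
    """Specific humidity q (kg/kg) from air temperature (deg C), relative
    humidity (%) and air pressure (Pa), via Bolton's (1980) fit for the
    saturation vapour pressure over water. Textbook constants, not from
    the suite."""
    e_sat = 611.2 * math.exp(17.67 * temp_c / (temp_c + 243.5))  # Pa
    e = (rh_percent / 100.0) * e_sat  # actual vapour pressure, Pa
    return 0.622 * e / (pressure_pa - 0.378 * e)

# 20 degC at 50% RH and standard pressure gives roughly 7 g/kg
```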

Tristan suggests using all driving-data variables above the canopy.
But we’re not sure right now how to define z1_tq. The default for z1_tq is 10 meters. But if we’re using driving-data variables all above the canopy, then maybe some sort of variable height for z1_tq for the different sites is needed?
Patrick

Hi Tristan:
Further down in the Google Scholar search for “zero-plane jules” was this relevant article.

Catherine Van den Hoof, Pier Luigi Vidale, Anne Verhoef, and Caroline Vincke
“Improved evaporative flux partitioning and carbon flux in the land surface model JULES: Impact on the simulation of land surface processes in temperate Europe”

[Agricultural and Forest Meteorology] Vol. 181 (2013) Pages 108-124

https://www.sciencedirect.com/science/article/pii/S0168192313001913
where they say:

“The distance between canopy height and the zero-plane displacement height, h − d, with d the displacement height, is a required parameter in JULES. The values for d were taken from site-specific information for Brasschaat and Vielsalm: 19 m (Carrara et al., 2003) and 28.5 m (Ligne et al., 2010), respectively, or set to 2/3 h (Brutsaert, 1982) for the other sites.” [h is the vegetation height.]
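Following that convention, a per-site reference height above the zero-plane displacement could be sketched as below; the d = (2/3) h default comes from the quote above, and the example heights are placeholders, not measured values.

```python
# One plausible way to get a per-site reference height above the zero-plane
# displacement, following the d = 2/3 h convention quoted above. The example
# heights are placeholders, not measured values.
def height_above_displacement(sensor_height_m, canopy_height_m, d_site=None):
    """Sensor height minus displacement height d; d defaults to
    (2/3) * canopy height (Brutsaert, 1982) unless a site-specific
    value is supplied."""
    d = d_site if d_site is not None else (2.0 / 3.0) * canopy_height_m
    return sensor_height_m - d

# Placeholder numbers: a 52 m sensor over a 30 m canopy gives roughly 32 m
```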

Hi Tristan:
Here are my updated variables (now at the top of the tower: VERT=060 for SCBI) that I will try to extract from the HDF5 files. Note that the barometric pressure is at VERT=015, and the precipitation is at VERT=000.

#BundleEddy
filename=NEON_eddy-flux-3/NEON.D02.SCBI.DP4.00200.001.2017-11.basic.20220120T173946Z.RELEASE-2022/NEON.D02.SCBI.DP4.00200.001.nsae.2017-11.basic.20211220T234551Z.h5
#PRIPRE
filename=NEON_precipitation/NEON.D02.SCBI.DP1.00006.001.2017-11.basic.20220120T173946Z.RELEASE-2022/NEON.D02.SCBI.DP1.00006.001.900.000.030.PRIPRE_30min.2017-11.basic.20211210T173420Z.csv
#RH
#HOR.VER=000.060 = 52 meters (tower top)
filename=NEON_rel-humidity/NEON.D02.SCBI.DP1.00098.001.2017-11.basic.20220120T173946Z.RELEASE-2022/NEON.D02.SCBI.DP1.00098.001.000.060.030.RH_30min.2017-11.basic.20211210T211406Z.csv
#t is at HOR.VER=000.060, to match the RH measurement location.
#We need to compute q = specific humidity, which is not relative humidity as in the NEON files, so I think it is important to use 'tempRHMean' for this calculation of q
#JULES variable names for SCBI site
var=['t',          'sw_down',    'q',      'lw_down',    'pstar',      'precip',        'wind'            ]
#data file source
va0=['RH',         'BundleEddy', 'RH',     'BundleEddy', 'BundleEddy', 'PRIPRE',        'BundleEddy'      ]
#variable names in data file
va1=['tempRHMean', 'radiNet',    'RHMean', 'radiNet',    'presBaro',   'priPrecipBulk', 'soni'            ]
#vertlevel and time for variable name
va2=[None,         '000_060_30m',None,     '000_060_30m','000_015_30m',None,            '000_060_30m'     ]
#sub-variable name
va3=[None,         'radiSwIn',   None,     'radiLwIn',   'presAtm',    None,            'veloXaxsYaxsErth']
#statistic of sub-variable (i.e., mean, max, min)
va4=[None,         'mean',       None,     'mean',       'mean',       None,            'mean'            ]
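The HDF5 dataset paths implied by the lists above could be assembled as below; the SITE/dp01/data/&lt;va1&gt;/&lt;va2&gt;/&lt;va3&gt; layout is my assumption about the bundled file’s group structure and should be checked with h5py against an actual file before use.

```python
# Assemble an (assumed) HDF5 dataset path for a Bundled Eddy Covariance
# variable from the lists above. The SITE/dp01/data/<va1>/<va2>/<va3>
# layout is a guess at the bundled file's group structure; inspect an
# actual file with h5py (e.g. f.visit(print)) to confirm.
def bundle_dataset_path(site, va1_name, va2_levels, va3_sub):
    return "%s/dp01/data/%s/%s/%s" % (site, va1_name, va2_levels, va3_sub)

# With h5py one would then read, e.g.:
#   with h5py.File(h5_filename, "r") as f:
#       wind = f[bundle_dataset_path("SCBI", "soni", "000_060_30m",
#                                    "veloXaxsYaxsErth")]["mean"][:]
```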

Hi Tristan:
I have a new draft of the Python code that runs and iterates over the monthly HDF5 files from the NEON SCBI site, and can read in the various top-of-tower variables. I still need to merge in the reading of the CSV files for the RH-sensor temperature, RH, and primary precipitation. I also need to check the units and verify that the numbers are right.
Below are some print statements from the code for April and May of 2017 for the Smithsonian SCBI site in Virginia.
Patrick

/gws/nopw/j04/landsurf_rdg/pmcguire/TristanQ/NEON.SCBI/NEON.D02.SCBI.DP4.00200.001.2017-04.basic.20220120T173946Z.RELEASE-2022
sw_down radiNet
b'2017-04-01T00:00:00.000Z' b'2017-04-30T23:30:00.000Z'
Range of min  data (30min samples): -4.99 983.8
Range of mean data (30min samples): -4.97 992.02
Range of max  data (30min samples): -4.93 1388.18
lw_down radiNet
b'2017-04-01T00:00:00.000Z' b'2017-04-30T23:30:00.000Z'
Range of min  data (30min samples): 225.3 410.3
Range of mean data (30min samples): 225.8 414.6
Range of max  data (30min samples): 226.4 431.2
pstar presBaro
b'2017-04-01T00:00:00.000Z' b'2017-04-30T23:30:00.000Z'
Range of min  data (30min samples): 94.92945 98.74174
Range of mean data (30min samples): 94.97379 98.75372
Range of max  data (30min samples): 95.02046 98.76273
wind soni
b'2017-04-01T00:00:00.000Z' b'2017-04-30T23:30:00.000Z'
Range of min  data (30min samples): 0.000707106815127673 6.385916815387671
Range of mean data (30min samples): 0.47491681358692645 9.21354457544177
Range of max  data (30min samples): 1.031953593772511 21.11054109383807
/gws/nopw/j04/landsurf_rdg/pmcguire/TristanQ/NEON.SCBI/NEON.D02.SCBI.DP4.00200.001.2017-05.basic.20220120T173946Z.RELEASE-2022
sw_down radiNet
b'2017-05-01T00:00:00.000Z' b'2017-05-31T23:30:00.000Z'
Range of min  data (30min samples): -4.99 1042.88
Range of mean data (30min samples): -4.99 1091.44
Range of max  data (30min samples): -4.99 1547.04
lw_down radiNet
b'2017-05-01T00:00:00.000Z' b'2017-05-31T23:30:00.000Z'
Range of min  data (30min samples): 236.5 418.7
Range of mean data (30min samples): 237.3 427.7
Range of max  data (30min samples): 238.8 442.4
pstar presBaro
b'2017-05-01T00:00:00.000Z' b'2017-05-31T23:30:00.000Z'
Range of min  data (30min samples): 95.02946 98.3237
Range of mean data (30min samples): 95.03807 98.35979
Range of max  data (30min samples): 95.05346 98.37971
wind soni
b'2017-05-01T00:00:00.000Z' b'2017-05-31T23:30:00.000Z'
Range of min  data (30min samples): 0.0003535534075638365 36.681834081692266
Range of mean data (30min samples): 0.33413772032718864 36.681834081692266
Range of max  data (30min samples): 1.1196487562327306 58.91044169838825