Replacing a field in a dump file with the equivalent field from another dump file

Hello, I want to replace the soil moisture field from the start dump of one of our simulations (u-ds882), which uses UM vn13.0, with the start dump from another simulation (bb055) which used vn10.8. I’m having problems converting these files into netcdf and back cleanly (using cfa), so I’m not able to use xarray or nco to edit, which would be my preferred tools. I think I need to modify the files directly but I have very little experience with Met Office file formats.

What would be the recommended tool(s) to do this? Reading online, it seems like mule and iris are the main tools, maybe cf-python - which would be better for this task?

Does anyone have any scripts which do something similar, they are happy to share? Or point in the direction of some training example scripts which do something similar?

Hi Scott,

Can you elaborate which field (stashcode) you are trying to replace ? With reconfiguration one can initialise any prognostic either from an ancillary, or directly another dump (e.g. initialising of chemical tracers).
Note that there may be changes to the land-sea mask between the two model versions, so directly overwriting can cause a mismatch at sea/ lake edges, ice points, etc.

Mohit

Hi Mohit,

I’m trying to replace “SOIL MOISTURE CONTENT IN A LAYER”, stash code: 0-9. Advice on how to do this welcome.

Thanks for the warning about the land-sea mask, do you know how I can check that?

Many thanks,

Scott

Hi Scott,

If the suite-ids above are correct- are you trying to replace the soil moisture field from a N96 global model into a 25-km LAM ?

Mohit

Hi Mohit,

The data I was shared was from the mi-bb055/ada.file , it looks like it is from a nested domain from that run, which has the same dimensions as the LAM we’re trying to run.

Scott

Ok, I was looking at u-bb055.
See panel below in the Reconfiguration and Ancillary settings => Configure Ancils and dump files => created by *Right click on list and [Add New section] then again right click on the new line and [View namelist])
*

In this case the domain has to be identical to target model otherwise the sourcefile will be rejected.

Mohit

Edit: New screenshot showing one more setting at the bottom of the panel -stashcode of field in the donor file.

Thank you Mohit, I have just set that up and started the run. Will update once I’ve seen whether or not it is successful.

Many thanks,

Scott

Hi Mohit, it looks like the run has failed in the recon step with the error:

???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!

? Error code: 200

? Error from routine: EXPAND_FROM_MASK

? Error message: compressedField array (size 881553) is too small to hold the compressed field (size 881905).

? Error from processor: 0

? Error number: 5

???

Any ideas what that could be?

The suite is u-dx772, log files here:

/home/n02/n02/scottan/cylc-run/u-dx772/run2/log/job/20091009T0000Z/SAm_4km_RA2T_um_recon/02

Hi Scott,

From the log files, the two values in the error (881553, 881905) refer to the number of land-points for bb055 and dx772. So, the domains are not identical and direct import of the field will not be possible.
Any solution to try now will be complicated. You can either run a Recon-only step for the u-dx772/SAm_4km… domain using the bb055 dump as ainitial and then use that output to import the soli moisture field (i.e. use that astart as ancilfilename= for stash_req=9).
Alternatively, you will have to regrid the stash-code=9 offline to the original_run/u-cr358/qrparm.mask maybe using cf-python => NetCDF => Xancil => soil-moisture file to read in the same way as above.
Mohit

Hi Mohit,

Thank you, that’s frustrating but makes sense.

I think we will do something like the second suggestion, using Iris or cf-Python. Do you have any suggestions for guides or examples which do something similar that we can work from?

Many thanks,

Scott

Hi Mohit,

I managed to get something to work using Mule running on ARCHER2 which merged the data from the soil moisture field from bb055 with our dump file over regions where both domains had land_mask = True. I then read in the data to initialise the soil moisture field as suggested by you above. I’m including some code below mainly in case it helps someone else in the future, please let me know if you see any major problems with this method.

Thanks for the help.

import numpy as np

import mule
from mule import DumpFile

SM_STASH = 9         # stash code for SOIL MOISTURE CONTENT IN A LAYER
LANDMASK_STASH = 30  # stash code for LAND MASK (No halo) (LAND=TRUE)

class ReplaceOperator(mule.DataOperator):
    """
    Operator to replace a field with new data
    https://code.metoffice.gov.uk/doc/um/mule/2024.11.1/user_guide_operators.html
    """
    def __init__(self, replace_data, landmask):
        self.replace_data = replace_data
        self.landmask = landmask

    def new_field(self, source_field):
        return source_field.copy()

    def transform(self, source_field, result_field):
        data = source_field.get_data()
        result_data = data.copy()
        if result_data.shape != self.landmask.shape:
             raise ValueError("Land mask not compatible with soil moisture field")
        if result_data.shape != self.replace_data.shape:
             raise ValueError("Domains don't have the same shape")
        np.copyto(
            result_data,
            self.replace_data,
            where=self.landmask
        )
        return result_data

def get_landmask(dump_file):
    """
    Function to extract landmask field from a dump file
    """
    landmask_data = None
    for ifield, field in enumerate(dump_file.fields):
        if field.lbrel in (2,3) and field.lbuser4 == LANDMASK_STASH:
            landmask_data = field.get_data().astype(bool)  # Land mask (2D)

    if landmask_data is None:
        raise ValueError(f"Landmask field (Stashcode {LANDMASK_STASH}) not found")

    return landmask_data

def mule_sm_merge(file_src: str, file_tgt: str, file_out:str, verbose: int=0):
    """
    Reads UM start dumps and merges soil moisture from source into target where the
    land-mask is true in both dumps.
    """
    dump_src = DumpFile.from_file(file_src)
    dump_tgt = DumpFile.from_file(file_tgt)

    landmask_src_data = get_landmask(dump_src)
    landmask_tgt_data = get_landmask(dump_tgt)

    # Create combined landmask field where both domains are true
    landmask_combined = landmask_src_data & landmask_tgt_data
    if verbose > 0:
        print(f"Land points in source mask: {np.count_nonzero(landmask_src_data)}")
        print(f"Land points in target mask: {np.count_nonzero(landmask_tgt_data)}")
        print(f"Land points used for merge: {np.count_nonzero(landmask_combined)}")
    if not landmask_combined.any():
        raise RuntimeError(
            "Combined land mask is empty — no soil moisture will be merged"
        )

    # Replace soil moisture data where landmasks are both true using operator
    for ifield, field in enumerate(dump_tgt.fields):
        if field.lbrel in (2,3) and field.lbuser4 == SM_STASH:
            replace_data = dump_src.fields[ifield].get_data()
            replace_operator = ReplaceOperator(replace_data=replace_data, landmask=landmask_combined)
            dump_tgt.fields[ifield] = replace_operator(field)

    # Save output
    dump_tgt.to_file(file_out)
    

This topic was automatically closed 24 hours after the last reply. New replies are no longer allowed.