GeoZarrHandler
- class dtcg.datacube.geozarr.GeoZarrHandler(ds=None, ds_name='L1', target_chunk_mb=5.0, compressor=None, metadata_mapping_data=None, metadata_mapping_coords=None, zarr_format=2)
Bases:
MetadataMapper
Attributes
metadata_mappings_data
metadata_mappings_coords
Methods
__init__([ds, ds_name, target_chunk_mb, ...]) – Initialise a GeoZarrHandler object.
add_datacube(datacubes, datacube_name[, ...]) – Add a new dataset as a child group of the DataTree at the root.
add_layer(ds, ds_name[, overwrite]) – Add a new dataset as a child group of the DataTree at the root.
export(storage_directory[, overwrite]) – Write the dataset to GeoZarr format.
get_layer(ds_name) – Get a dataset from a DataTree.
read_metadata_mappings(schema, map_file) – Load and validate metadata mappings from a YAML file.
update_metadata(dataset, ds_name) – Apply variable and shared metadata to an xarray Dataset.
- Parameters:
ds (xr.Dataset)
ds_name (str)
target_chunk_mb (float)
compressor (Optional[Blosc])
metadata_mapping_data (str)
metadata_mapping_coords (str)
zarr_format (int)
- __init__(ds=None, ds_name='L1', target_chunk_mb=5.0, compressor=None, metadata_mapping_data=None, metadata_mapping_coords=None, zarr_format=2)
Initialise a GeoZarrHandler object.
- Parameters:
ds (xarray.DataTree | xarray.Dataset, default None) – Input dataset with dimensions (‘x’, ‘y’) or (‘t’, ‘x’, ‘y’). Must include coordinate variables. Accepts either a dataset or data tree.
data_tree (xarray.DataTree, default None) – Input data_tree. Either ds or data_tree must be provided.
ds_name (str, default 'L1') – Name of datacube.
target_chunk_mb (float, default 5.0) – Approximate chunk size in megabytes for efficient storage.
compressor (Blosc, default None) – Compressor to apply on arrays. If None, the compression will be Blosc with zstd.
metadata_mapping_data (str, default None) – Path to the YAML file containing variable metadata mappings. If None, defaults to ‘metadata_mapping_data.yaml’ in the current directory.
metadata_mapping_coords (str, default None) – Path to the YAML file containing time coordinate metadata mappings. If None, defaults to ‘metadata_mapping_data.yaml’ in the current directory.
zarr_format (int, default 2) – Zarr format version to use (2 or 3).
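To make the target_chunk_mb parameter concrete, the sketch below shows one way a chunk shape of roughly that size could be derived from an array shape and element size. It is an illustration only, not the handler's actual chunking logic: the helper name estimate_chunk_shape, the one-time-step-per-chunk choice, and the square spatial tiles are all assumptions.

```python
import math


def estimate_chunk_shape(shape, itemsize, target_chunk_mb=5.0):
    """Hypothetical helper: choose a chunk shape of at most target_chunk_mb.

    shape is (x, y) or (t, x, y); itemsize is the number of bytes per element.
    Assumptions: one step per chunk along any leading dimension, and
    square spatial tiles clipped to the array extent.
    """
    # Number of elements that fit into the requested chunk size.
    target_elems = max(1, int(target_chunk_mb * 1024 * 1024) // itemsize)
    *leading, nx, ny = shape
    # Largest square tile whose element count does not exceed the target.
    side = max(1, math.isqrt(target_elems))
    return tuple(1 for _ in leading) + (min(nx, side), min(ny, side))


# A float32 cube with 120 time steps on a 4000 x 4000 grid.
chunks = estimate_chunk_shape((120, 4000, 4000), itemsize=4)
chunk_bytes = math.prod(chunks) * 4
```

Smaller targets produce more, smaller chunks, which tends to favour partial reads. Larger targets reduce the number of stored objects.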
- add_datacube(datacubes, datacube_name, overwrite=False)
Add a new dataset as a child group of the DataTree at the root.
- Parameters:
datacubes (dict) – A dictionary whose keys are currently supported L2 datacubes (‘monthly’, ‘annual_hydro’, ‘daily_smb’) and whose values are the corresponding xr.Dataset objects.
datacube_name (str) – Layer name to be used for this node of the tree. It should contain either L2 or L3. If neither is present, L2_ is added to the name as a suffix.
overwrite (bool) – If True, allow a layer of the same name to be overwritten.
- Return type:
None
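The constraints on the add_datacube arguments can be checked before calling the method. The sketch below is a stand-alone illustration using plain dictionaries. The helper names and the choice of ValueError are assumptions, and the library may validate its inputs differently.

```python
# Keys listed in the parameter description above.
SUPPORTED_L2_DATACUBES = {"monthly", "annual_hydro", "daily_smb"}


def check_datacube_keys(datacubes):
    """Hypothetical pre-check: reject keys that are not supported L2 datacubes."""
    unknown = sorted(set(datacubes) - SUPPORTED_L2_DATACUBES)
    if unknown:
        # ValueError is an assumption; the library may raise something else.
        raise ValueError(f"Unsupported datacube keys: {unknown}")
    return datacubes


def has_level_tag(datacube_name):
    """Return True if the name already contains an L2 or L3 tag."""
    return "L2" in datacube_name or "L3" in datacube_name


# Placeholder values stand in for xr.Dataset objects.
checked = check_datacube_keys({"monthly": object(), "daily_smb": object()})
tagged = has_level_tag("L2_glacier")
untagged = has_level_tag("glacier")
```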
- add_layer(ds, ds_name, overwrite=False)
Add a new dataset as a child group of the DataTree at the root.
- Parameters:
ds (xarray.Dataset) – New dataset layer to be added to the existing data tree.
ds_name (str) – Layer name to be used for this node of the tree.
overwrite (bool) – If True, allow a layer of the same name to be overwritten.
- Return type:
None
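The role of the overwrite flag can be illustrated with a minimal stand-in for the data tree. This is a sketch under stated assumptions rather than the library's code: LayerStore is a hypothetical class, and the documentation does not say what happens when a layer exists and overwrite is False, so raising ValueError here is a guess.

```python
class LayerStore:
    """Hypothetical stand-in for a data tree: maps layer names to datasets."""

    def __init__(self):
        self.layers = {}

    def add_layer(self, ds, ds_name, overwrite=False):
        # Assumption: an existing layer is protected unless overwrite is True.
        if ds_name in self.layers and not overwrite:
            raise ValueError(f"Layer {ds_name!r} already exists.")
        self.layers[ds_name] = ds


store = LayerStore()
store.add_layer({"version": 1}, "L1")
# Replacing the layer requires an explicit opt-in.
store.add_layer({"version": 2}, "L1", overwrite=True)
```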
- export(storage_directory, overwrite=True)
Write the dataset to GeoZarr format.
- Parameters:
storage_directory (str) – Path to write the Zarr data.
overwrite (bool, default True) – Whether to overwrite existing Zarr contents in the target location.
- Return type:
None
- get_layer(ds_name)
Get a dataset from a DataTree.
- Parameters:
ds_name (str) – Layer name.
- Returns:
Dataset layer in tree.
- Return type:
xr.Dataset
- Raises:
KeyError – If the layer name is not present in the data tree.
AttributeError – If the layer does not contain a dataset.
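The two documented failure modes differ: a missing name raises KeyError, while a node that exists but holds no dataset raises AttributeError. The sketch below reproduces that distinction with a hypothetical Node class and a plain dictionary as the tree. It is not the method's implementation.

```python
class Node:
    """Hypothetical tree node that may or may not carry a dataset."""

    def __init__(self, dataset=None):
        self.dataset = dataset


def get_layer(tree, ds_name):
    """Sketch of the documented lookup behaviour on a dict-based tree."""
    if ds_name not in tree:
        raise KeyError(f"Layer {ds_name!r} is not present in the tree.")
    node = tree[ds_name]
    if node.dataset is None:
        raise AttributeError(f"Layer {ds_name!r} does not contain a dataset.")
    return node.dataset


tree = {"L1": Node({"var": [1, 2, 3]}), "empty": Node()}


def describe(name):
    """Show how a caller might handle each failure mode separately."""
    try:
        get_layer(tree, name)
        return "found"
    except KeyError:
        return "missing layer"
    except AttributeError:
        return "layer without dataset"
```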
- read_metadata_mappings(schema, map_file)
Load and validate metadata mappings from a YAML file.
- Parameters:
schema (Schema) – The schema structure used for validation.
map_file (str) – Path to the YAML file containing metadata mappings.
self (MetadataMapper)
- Returns:
Metadata mappings loaded from YAML file.
- Return type:
dict
- Raises:
schema.SchemaError – If any of the metadata entries fail schema validation.
- update_metadata(dataset, ds_name)
Apply variable and shared metadata to an xarray Dataset.
- Parameters:
dataset (xarray.Dataset) – Dataset to which the metadata should be applied.
ds_name (str) – Name of dataset.
self (MetadataMapper)
- Returns:
The input dataset with updated metadata.
- Return type:
xarray.Dataset
- Warns:
UserWarning – If any dataset variables are missing in the metadata mapping.
Notes
This function adds both per-variable and global metadata attributes. Missing variable mappings are reported as warnings, not errors.
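The behaviour described in the notes, where per-variable and global attributes are both applied and unmapped variables produce a warning rather than an error, can be sketched with plain dictionaries. The function name apply_metadata, the dictionary layout, and the warning text are assumptions for illustration. The real method operates on an xarray Dataset.

```python
import warnings


def apply_metadata(variables, mapping, shared):
    """Hypothetical sketch: merge mapped attributes into each variable.

    variables maps variable names to attribute dicts, mapping holds the
    per-variable metadata, and shared holds the global attributes.
    """
    missing = sorted(name for name in variables if name not in mapping)
    for name, attrs in variables.items():
        # Variables without a mapping are left unchanged.
        attrs.update(mapping.get(name, {}))
    if missing:
        # Reported as a warning so that processing continues.
        warnings.warn(f"No metadata mapping for: {missing}", UserWarning)
    return {"attrs": dict(shared), "variables": variables}


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = apply_metadata(
        {"smb": {}, "extra": {}},
        {"smb": {"units": "m w.e."}},
        {"title": "demo"},
    )
```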
