GeoZarrHandler
- class dtcg.datacube.geozarr.GeoZarrHandler(ds, ds_name='L1', target_chunk_mb=5.0, compressor=None, metadata_mapping_file_path=None, zarr_format=2)
Bases: MetadataMapper
Attributes
metadata_mappings
Methods
__init__(ds[, ds_name, target_chunk_mb, ...]) – Initialise a GeoZarrHandler object.
add_layer(ds, ds_name[, overwrite]) – Add a new dataset as a child group of the DataTree at the root.
export(storage_directory[, overwrite]) – Write the dataset to GeoZarr format.
get_layer(ds_name) – Get a dataset from a DataTree.
read_metadata_mappings(metadata_mapping_file_path) – Load and validate metadata mappings from a YAML file.
update_metadata(dataset, ds_name) – Apply variable and shared metadata to an xarray Dataset.
- Parameters:
ds (xr.Dataset)
ds_name (str)
target_chunk_mb (float)
compressor (Optional[Blosc])
metadata_mapping_file_path (str)
zarr_format (int)
- __init__(ds, ds_name='L1', target_chunk_mb=5.0, compressor=None, metadata_mapping_file_path=None, zarr_format=2)
Initialise a GeoZarrHandler object.
- Parameters:
ds (xarray.Dataset) – Input dataset with dimensions ('x', 'y') or ('t', 'x', 'y'). Must include coordinate variables.
ds_name (str, default 'L1') – Name of the datacube.
target_chunk_mb (float, default 5.0) – Approximate chunk size in megabytes for efficient storage.
compressor (Blosc, default None) – Compressor to apply to arrays. If None, Blosc with zstd is used.
metadata_mapping_file_path (str, default None) – Path to the YAML file containing variable metadata mappings. If None, defaults to 'metadata_mapping.yaml' in the current directory.
zarr_format (int, default 2) – Zarr format version to use (2 or 3).
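As a rough illustration of what an approximate chunk size in megabytes implies, the sketch below derives a chunk shape from an array shape and element size. The helper `estimate_chunk_shape` is hypothetical, and so is its strategy (one step along any leading dimension, square spatial tiles). It is not necessarily how GeoZarrHandler chooses its chunks.

```python
import math


def estimate_chunk_shape(shape, itemsize, target_chunk_mb=5.0):
    """Return a chunk shape holding at most about target_chunk_mb of data.

    Hypothetical helper for illustration only. Assumes the last two
    dimensions are spatial and any leading dimension (e.g. time) is
    chunked one step at a time.
    """
    target_bytes = target_chunk_mb * 1024 * 1024
    # Largest number of elements that still fits in the byte budget.
    max_elems = max(1, int(target_bytes // itemsize))
    *leading, nx, ny = shape
    # Side length of a square spatial tile within the element budget.
    side = max(1, math.isqrt(max_elems))
    return tuple([1] * len(leading) + [min(nx, side), min(ny, side)])


# A float32 cube with dimensions (t, x, y).
chunk = estimate_chunk_shape((365, 4000, 4000), itemsize=4)
chunk_mb = math.prod(chunk) * 4 / (1024 * 1024)
print(chunk, round(chunk_mb, 2))
```

Small arrays fall back to a single chunk covering the whole spatial extent, because each tile side is capped at the array dimension.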
- add_layer(ds, ds_name, overwrite=False)
Add a new dataset as a child group of the DataTree at the root.
- Parameters:
ds (xarray.Dataset) – New dataset layer to be added to the existing data tree.
ds_name (str) – Layer name to be used for this node of the tree.
overwrite (bool) – If True, allow a layer of the same name to be overwritten.
- Return type:
None
- export(storage_directory, overwrite=True)
Write the dataset to GeoZarr format.
- Parameters:
storage_directory (str) – Path to write the Zarr data.
overwrite (bool, default True) – Whether to overwrite existing Zarr contents in the target location.
- Return type:
None
- get_layer(ds_name)
Get a dataset from a DataTree.
- Parameters:
ds_name (str) – Layer name.
- Returns:
Dataset layer in tree.
- Return type:
xr.Dataset
- Raises:
KeyError – If the layer name is not present in the data tree.
AttributeError – If the layer does not contain a dataset.
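The layer semantics documented for add_layer and get_layer can be sketched with a toy, dictionary-backed stand-in. The `LayerTree` class is hypothetical and does not use xarray or dtcg. The `ValueError` raised when `overwrite` is False is an assumption, since the reference does not state which exception add_layer raises.

```python
class LayerTree:
    """Toy stand-in for a tree of named layers (not the dtcg implementation)."""

    def __init__(self):
        self._nodes = {}

    def add_layer(self, ds, ds_name, overwrite=False):
        # Assumption: refusing to replace an existing layer raises ValueError.
        if ds_name in self._nodes and not overwrite:
            raise ValueError(f"Layer {ds_name!r} already exists")
        self._nodes[ds_name] = ds

    def get_layer(self, ds_name):
        # An unknown name raises KeyError, as documented for get_layer.
        node = self._nodes[ds_name]
        if node is None:
            # A node without a dataset maps to the documented AttributeError.
            raise AttributeError(f"Layer {ds_name!r} does not contain a dataset")
        return node


tree = LayerTree()
tree.add_layer({"thickness": [1.0, 2.0]}, "L1")
tree.add_layer({"thickness": [3.0]}, "L1", overwrite=True)  # replaces the layer
print(tree.get_layer("L1"))
```

Callers that cannot be sure a layer exists would typically wrap get_layer in a `try`/`except KeyError` block.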
- read_metadata_mappings(metadata_mapping_file_path)
Load and validate metadata mappings from a YAML file.
- Parameters:
metadata_mapping_file_path (str) – Path to the YAML file containing metadata mappings.
self (MetadataMapper)
- Raises:
schema.SchemaError – If any of the metadata entries fail schema validation.
- Return type:
None
- update_metadata(dataset, ds_name)
Apply variable and shared metadata to an xarray Dataset.
- Parameters:
dataset (xarray.Dataset) – Dataset to which the metadata should be applied.
ds_name (str) – Name of dataset.
self (MetadataMapper)
- Returns:
The input dataset with updated metadata.
- Return type:
xarray.Dataset
- Warns:
UserWarning – If any dataset variables are missing in the metadata mapping.
Notes
This function adds both per-variable and global metadata attributes. Missing variable mappings are reported as warnings, not errors.
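The warn-rather-than-fail behaviour described in the note can be sketched with plain dictionaries and the standard `warnings` module. The function `apply_metadata` and its arguments are hypothetical stand-ins. The real method operates on an xarray Dataset, and the structure of its mapping file is not shown here.

```python
import warnings


def apply_metadata(variables, variable_mapping, shared_metadata):
    """Attach per-variable and shared attributes (hypothetical sketch).

    variables maps each variable name to its attribute dict.
    Returns the updated variables and a dict of global attributes.
    """
    missing = sorted(name for name in variables if name not in variable_mapping)
    if missing:
        # Missing mappings are reported as a warning; processing continues.
        warnings.warn(f"No metadata mapping for variables: {missing}", UserWarning)
    for name, attrs in variables.items():
        attrs.update(variable_mapping.get(name, {}))
    return variables, dict(shared_metadata)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    data, global_attrs = apply_metadata(
        {"thickness": {}, "mystery": {}},
        {"thickness": {"units": "m"}},
        {"title": "demo"},
    )
print(data, global_attrs, len(caught))
```

Code that should treat missing mappings as fatal can promote the warning to an exception with `warnings.simplefilter("error", UserWarning)`.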
