GeoZarrHandler

class dtcg.datacube.geozarr.GeoZarrHandler(
ds,
ds_name='L1',
target_chunk_mb=5.0,
compressor=None,
metadata_mapping_file_path=None,
zarr_format=2,
)

Bases: MetadataMapper

Attributes

metadata_mappings

Methods

__init__(ds[, ds_name, target_chunk_mb, ...])

Initialise a GeoZarrHandler object.

add_layer(ds, ds_name[, overwrite])

Add a new dataset as a child group of the DataTree at the root.

export(storage_directory[, overwrite])

Write the dataset to GeoZarr format.

get_layer(ds_name)

Get a dataset from a DataTree.

read_metadata_mappings(...)

Load and validate metadata mappings from a YAML file.

update_metadata(dataset, ds_name)

Apply variable and shared metadata to an xarray Dataset.

Parameters:
  • ds (xarray.Dataset)

  • ds_name (str)

  • target_chunk_mb (float)

  • compressor (Optional[Blosc])

  • metadata_mapping_file_path (str)

  • zarr_format (int)

__init__(
ds,
ds_name='L1',
target_chunk_mb=5.0,
compressor=None,
metadata_mapping_file_path=None,
zarr_format=2,
)

Initialise a GeoZarrHandler object.

Parameters:
  • ds (xarray.Dataset) – Input dataset with dimensions (‘x’, ‘y’) or (‘t’, ‘x’, ‘y’). Must include coordinate variables.

  • ds_name (str, default 'L1') – Name of the datacube.

  • target_chunk_mb (float, default 5.0) – Approximate chunk size in megabytes for efficient storage.

  • compressor (Blosc, default None) – Compressor to apply to arrays. If None, Blosc with zstd is used.

  • metadata_mapping_file_path (str, default None) – Path to the YAML file containing variable metadata mappings. If None, defaults to ‘metadata_mapping.yaml’ in the current directory.

  • zarr_format (int, default 2) – Zarr format version to use (2 or 3).

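As a rough illustration of what target_chunk_mb implies, the sketch below derives a chunk shape of about the requested size from an array shape and an element size in bytes. The helper name estimate_chunk_shape and its square-tile strategy are assumptions made for illustration only; this page does not describe how GeoZarrHandler actually chooses chunks. The sketch also assumes 1 MB = 1024 × 1024 bytes.

```python
import math


def estimate_chunk_shape(shape, itemsize, target_chunk_mb=5.0):
    """Hypothetical helper: pick a chunk shape of roughly target_chunk_mb.

    shape is (x, y) or (t, x, y); itemsize is the number of bytes per element.
    """
    # Number of elements that fit into the target chunk size.
    target_elements = int(target_chunk_mb * 1024 * 1024 / itemsize)
    *leading, nx, ny = shape
    # Side length of a square spatial tile, capped by the array extent.
    side = max(1, math.isqrt(target_elements))
    cx, cy = min(nx, side), min(ny, side)
    if not leading:
        return (cx, cy)
    # Spend whatever budget remains along the time dimension.
    ct = max(1, min(leading[0], target_elements // (cx * cy)))
    return (ct, cx, cy)
```

Under these assumptions, a float32 array (4 bytes per element) of shape (100, 2000, 2000) gets chunks of one time step by 1144 × 1144 cells, which is just under 5 MB.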

add_layer(ds, ds_name, overwrite=False)

Add a new dataset as a child group of the DataTree at the root.

Parameters:
  • ds (xarray.Dataset) – New dataset layer to be added to the existing data tree.

  • ds_name (str) – Layer name to be used for this node of the tree.

  • overwrite (bool) – If True, allow a layer of the same name to be overwritten.


Return type:

None

export(storage_directory, overwrite=True)

Write the dataset to GeoZarr format.

Parameters:
  • storage_directory (str) – Path to write the Zarr data.

  • overwrite (bool, default True) – Whether to overwrite existing Zarr contents in the target location.


Return type:

None
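The constructor, add_layer and export signatures documented above can be combined into a short workflow. The sketch below is one possible arrangement, not a prescribed one: the helper name build_and_export is hypothetical, and the handler class is passed in as an argument so the snippet does not depend on how dtcg is installed. In practice handler_cls would be dtcg.datacube.geozarr.GeoZarrHandler and the datasets would be xarray.Dataset objects.

```python
def build_and_export(handler_cls, ds, extra_layers, storage_directory):
    """Hypothetical helper: build a handler, attach layers, write to disk.

    handler_cls       -- class with the GeoZarrHandler interface documented here
    ds                -- base dataset, stored under the default name 'L1'
    extra_layers      -- dict mapping layer name -> dataset
    storage_directory -- path to write the Zarr data
    """
    # Keyword values match the documented defaults and are spelled out for clarity.
    handler = handler_cls(ds, ds_name="L1", target_chunk_mb=5.0, zarr_format=2)
    for name, layer in extra_layers.items():
        # overwrite=False keeps the documented default behaviour.
        handler.add_layer(layer, name, overwrite=False)
    # overwrite=True replaces existing Zarr contents at the target location.
    handler.export(storage_directory, overwrite=True)
    return handler
```

A call might look like build_and_export(GeoZarrHandler, ds_l1, {"L2": ds_l2}, "datacube.zarr"), where the layer name "L2" and the output path are placeholders.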

get_layer(ds_name)

Get a dataset from a DataTree.

Parameters:
  • ds_name (str) – Name of the layer to retrieve.

Returns:

Dataset layer in tree.

Return type:

xarray.Dataset

Raises:
  • KeyError – If the layer name is not present in the data tree.

  • AttributeError – If the layer does not contain a dataset.
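Callers that prefer a fallback value over an exception can wrap get_layer. The sketch below is a minimal example of that pattern; the helper name get_layer_or_none is hypothetical and only relies on the two exceptions documented above.

```python
def get_layer_or_none(handler, ds_name):
    """Hypothetical helper: return the layer dataset, or None if unavailable."""
    try:
        return handler.get_layer(ds_name)
    except KeyError:
        # Documented case: the layer name is not present in the data tree.
        return None
    except AttributeError:
        # Documented case: the layer does not contain a dataset.
        return None
```

Note that catching AttributeError this broadly also hides the case where handler has no get_layer method at all, so this pattern suits code paths where the handler type is already known.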

read_metadata_mappings(
metadata_mapping_file_path,
)

Load and validate metadata mappings from a YAML file.

Parameters:
  • metadata_mapping_file_path (str) – Path to the YAML file containing metadata mappings.


Raises:

schema.SchemaError – If any of the metadata entries fail schema validation.

Return type:

None

update_metadata(dataset, ds_name)

Apply variable and shared metadata to an xarray Dataset.

Parameters:
  • dataset (xarray.Dataset) – Dataset to which the metadata should be applied.

  • ds_name (str) – Name of the dataset.


Returns:

The input dataset with updated metadata.

Return type:

xarray.Dataset

Warns:

UserWarning – If any dataset variables are missing in the metadata mapping.

Notes

This method adds both per-variable and global metadata attributes. Variables missing from the metadata mapping are reported as warnings, not errors.
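The warn-rather-than-fail behaviour described in the note can be pictured with plain dictionaries. The sketch below is an illustration under stated assumptions, not the MetadataMapper implementation: the function name apply_metadata, the dictionary layout, and the warning text are all hypothetical.

```python
import warnings


def apply_metadata(variables, variable_mapping, shared_metadata):
    """Hypothetical sketch of per-variable plus global metadata assignment.

    variables        -- dict mapping variable name -> attribute dict
    variable_mapping -- dict mapping variable name -> metadata to apply
    shared_metadata  -- dict of global attributes
    """
    missing = sorted(name for name in variables if name not in variable_mapping)
    if missing:
        # Missing mappings are reported, but processing continues.
        warnings.warn(f"No metadata mapping for variables: {missing}", UserWarning)
    for name, attrs in variables.items():
        attrs.update(variable_mapping.get(name, {}))
    return {"attrs": dict(shared_metadata), "variables": variables}
```

Code that must treat a missing mapping as fatal can turn the warning into an exception with warnings.simplefilter("error", UserWarning) before making the call.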