adi_py.utils

This module provides a variety of functions that are used by the Process class to serialize/deserialize between ADI and XArray. Most of these utility functions are used within Process class methods and are generally not intended to be called directly by developers when implementing a specific Process subclass.

Classes

DatastreamIdentifier

NamedTuple class that holds various information used to identify a specific ADI dataset.

Functions

add_vartag_attributes

For the given ADI Variable, extract the source_ds_name and source_var_name from the ADI var tags and add them to the attributes Dict used for the XArray variable.

adi_hook_exception_handler

Python function decorator used to consistently handle exceptions in hooks so that they return the proper integer value to ADI core.

correct_zero_length_dims

Using dsproc APIs, inspect the given output dataset. If there are dimensions with 0 length, correct them to be of length 1.

get_adi_var_as_dict

Convert the given ADI variable to a dictionary that can be used to create an xarray.DataArray.

get_cds_type

For a given Python data value, convert the data type into the corresponding ADI CDS data type.

get_dataset_dims

get_dataset_vars

get_datastream_files

Return the full path to each data file found for the given datastream

get_datastream_id

Gets the corresponding dataset id for the given datastream (input or output)

get_empty_ndarray_for_var

For the given ADI variable object, initialize an empty numpy ndarray data array with the correct shape and data type.

get_time_data_as_datetime64

Get the time values from dsproc as seconds since 1970, then convert those values to datetime64 with microsecond precision.

get_xr_datasets

Get an ADI dataset converted to an xarray.Dataset.

is_empty_function

Evaluates a given function to see if the code contains anything more than docstrings and ‘pass’.

sync_xarray

Carefully inspect the xr.Dataset and synchronize any changes back to the given ADI dataset.

sync_xr_dataset

Sync the contents of the given XArray.Dataset with the corresponding ADI data structure.

to_xarray

Convert the specified CDS.Group into an XArray dataset.

Function Descriptions

class adi_py.utils.DatastreamIdentifier

Bases: NamedTuple

NamedTuple class that holds various information used to identify a specific ADI dataset.

datastream_name :str
dsid :int
facility :str
site :str
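
As a minimal sketch, a NamedTuple carrying the four fields listed above could look like the following. The class name ExampleDatastreamIdentifier, the field order, and the field values are hypothetical placeholders and are not taken from adi_py.

```python
from typing import NamedTuple


class ExampleDatastreamIdentifier(NamedTuple):
    """Hypothetical stand-in mirroring the documented fields."""
    datastream_name: str
    site: str
    facility: str
    dsid: int


# Placeholder values chosen only for illustration.
ident = ExampleDatastreamIdentifier(
    datastream_name="example_datastream.b1",
    site="xyz",
    facility="F1",
    dsid=0,
)

# NamedTuple instances support attribute access as well as tuple behavior.
print(ident.datastream_name, ident.dsid)
```

Using keyword arguments and attribute access, as here, avoids depending on the positional order of the fields.
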
adi_py.utils.add_vartag_attributes(xr_var_attrs: Dict, adi_var: cds3.Var)

For the given ADI Variable, extract the source_ds_name and source_var_name from the ADI var tags and add them to the attributes Dict that will be used for the XArray variable.

Note

Currently we are not including the coordinate system and output targets as part of the XArray variable’s attributes since these are unlikely to be changed. If a user creates a new variable, then they should call the corresponding Process methods assign_coordinate_system_to_variable or assign_output_datastream_to_variable to add the new variable to the designated coordinate system or output datastream, respectively.

Parameters
  • xr_var_attrs (Dict) – A Dictionary of attributes to be assigned to the XArray variable.

  • adi_var (cds3.Var) – The original ADI variable object.
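
The effect on the attributes Dict can be sketched as follows. This is an illustration only: the helper add_source_attrs is hypothetical and takes the two source names as plain strings, whereas the real function reads them from the tags of a cds3.Var. It also assumes the attribute keys are named after the tags.

```python
from typing import Dict, Optional


def add_source_attrs(xr_var_attrs: Dict,
                     source_ds_name: Optional[str],
                     source_var_name: Optional[str]) -> None:
    """Hypothetical helper: copy source tags into the attribute dict in place."""
    if source_ds_name is not None:
        xr_var_attrs["source_ds_name"] = source_ds_name
    if source_var_name is not None:
        xr_var_attrs["source_var_name"] = source_var_name


# Placeholder attribute and tag values.
attrs = {"units": "degC"}
add_source_attrs(attrs, "example_datastream.b1", "temp_mean")
print(attrs)
```
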

adi_py.utils.adi_hook_exception_handler(hook_func: Callable, pre_hook_func: Callable = None, post_hook_func: Callable = None) Callable

Python function decorator used to consistently handle exceptions in hooks so that they return the proper integer value to ADI core. Also used to ensure that consistent logging and debug dumps happen for hook methods.

Parameters
  • hook_func (Callable) – The original hook function implemented by the developer.

  • pre_hook_func (Callable) – An optional function to be invoked right before the hook function (e.g., to do debug dumps)

  • post_hook_func (Callable) – An optional function to be invoked right after the hook function (e.g., to do debug dumps)

Returns

Callable – Decorator function that wraps the original hook function to provide built-in, ADI-compliant logging and exception handling.
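
The general pattern described here, wrapping a hook so that any exception is logged and turned into an integer status, can be sketched as below. This is not the adi_py implementation: the name example_hook_handler, the logger name, and the status codes 1 and -1 are assumptions made for illustration.

```python
import functools
import logging
from typing import Callable, Optional

logger = logging.getLogger("example_hooks")

# Hypothetical status codes; the values ADI core expects are not stated here.
SUCCESS, FAILURE = 1, -1


def example_hook_handler(hook_func: Callable,
                         pre_hook_func: Optional[Callable] = None,
                         post_hook_func: Optional[Callable] = None) -> Callable:
    """Wrap hook_func so that it always returns an integer status."""
    @functools.wraps(hook_func)
    def wrapper(*args, **kwargs) -> int:
        try:
            if pre_hook_func is not None:
                pre_hook_func(*args, **kwargs)
            hook_func(*args, **kwargs)
            if post_hook_func is not None:
                post_hook_func(*args, **kwargs)
            return SUCCESS
        except Exception:
            # Log the traceback once, in one place, for every hook.
            logger.exception("Hook %s failed", hook_func.__name__)
            return FAILURE
    return wrapper


@example_hook_handler
def good_hook():
    pass


@example_hook_handler
def bad_hook():
    raise ValueError("simulated failure")
```

Centralizing the try/except in a decorator keeps individual hooks free of boilerplate and makes the logging behavior uniform.
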

adi_py.utils.correct_zero_length_dims(dsid: Optional[int] = None, datastream_name: Optional[str] = None)

Using dsproc APIs, inspect the given output dataset. If there are dimensions with 0 length, correct them to be of length 1. For corresponding coordinate variables, create a data array of length 1 and set the value to -9999.

Values apparently only need to be added for the coordinate variable: according to Krista, ADI will automatically fill in missing values along the second dimension for data variables.

Parameters
  • dsid (Optional[int]) – The dsproc dsid of the dataset (used to find the dataset)

  • datastream_name (Optional[str]) – Or alternatively, the datastream name of the output dataset
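
The correction rule can be illustrated on plain dictionaries. In this sketch, fix_zero_length_dims and the dict layout are hypothetical stand-ins for the ADI dataset; the real function works through dsproc APIs.

```python
from typing import Dict, List

MISSING = -9999  # value assigned to the single coordinate entry


def fix_zero_length_dims(dims: Dict[str, int],
                         coords: Dict[str, List[float]]) -> None:
    """Hypothetical in-place fix: grow 0-length dims to length 1."""
    for name, length in dims.items():
        if length == 0:
            dims[name] = 1
            # Only coordinate variables get an explicit value.
            if name in coords:
                coords[name] = [MISSING]


dims = {"time": 3, "height": 0}
coords = {"time": [0.0, 60.0, 120.0], "height": []}
fix_zero_length_dims(dims, coords)
print(dims, coords["height"])
```
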

adi_py.utils.get_adi_var_as_dict(adi_var: cds3.Var) Dict

Convert the given ADI variable to a dictionary that can be used to create an xarray.DataArray.

Parameters

adi_var (cds3.Var) – An ADI variable object

Returns

Dict – A Dictionary representation of the variable that can be used in the XArray.DataArray constructor to initialize a corresponding XArray variable.

adi_py.utils.get_cds_type(value: Any) int

For a given Python data value, convert the data type into the corresponding ADI CDS data type.

Parameters

value (Any) – Can be a single value, a List of values, or a numpy.ndarray of values.

Returns

int – The corresponding CDS data type
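
The idea of inferring a storage type from a Python value can be sketched as follows. The function example_type_label and the string labels it returns are hypothetical. The real function returns an integer CDS type constant, and its actual mapping rules are not documented here.

```python
from typing import Any


def example_type_label(value: Any) -> str:
    """Return a hypothetical type label for a scalar or a (nested) list."""
    # Descend into sequences to find a representative element.
    while isinstance(value, (list, tuple)):
        if not value:
            raise ValueError("cannot infer a type from an empty sequence")
        value = value[0]
    # bool is checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "byte"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "char"
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(example_type_label(3), example_type_label([1.5, 2.5]))
```
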

adi_py.utils.get_dataset_dims(adi_dataset: cds3.Group) List[cds3.Dim]
adi_py.utils.get_dataset_vars(adi_dataset: cds3.Group) List[cds3.Var]
adi_py.utils.get_datastream_files(dsid: int, begin_date: int, end_date: int) List[str]

Return the full path to each data file found for the given datastream and time range.

Parameters
  • dsid (int) – the datastream id (call get_dsid() to retrieve)

  • begin_date (int) – the begin timestamp of the current processing interval (seconds since 1970)

  • end_date (int) – the end timestamp of the current processing interval (seconds since 1970)

Returns

List[str] – A list of file paths that match the datastream query.
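
Because begin_date and end_date are integer seconds since 1970, it may help to see how such values can be derived from calendar dates. The sketch below uses only the standard library and assumes the timestamps are interpreted as UTC. The chosen date is arbitrary.

```python
from datetime import datetime, timedelta, timezone

# One-day processing interval starting at midnight UTC (arbitrary example date).
day = datetime(2020, 1, 1, tzinfo=timezone.utc)
begin_date = int(day.timestamp())
end_date = int((day + timedelta(days=1)).timestamp())

print(begin_date, end_date)
```

The resulting integers could then be passed as the begin_date and end_date arguments.
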

adi_py.utils.get_datastream_id(datastream_name: str, site: str = None, facility: str = None, dataset_type: adi_py.constants.ADIDatasetType = None) Optional[int]

Gets the corresponding dataset id for the given datastream (input or output)

Parameters
  • datastream_name (str) – The name of the datastream to find

  • site (str) – Optional parameter used only to find some input datasets (RETRIEVED or TRANSFORMED). Site is only required if the retrieval rules in the PCM specify two different rules for the same datastream that differ by site.

  • facility (str) – Optional parameter used only to find some input datasets (RETRIEVED or TRANSFORMED). Facility is only required if the retrieval rules in the PCM specify two different rules for the same datastream that differ by facility.

  • dataset_type (ADIDatasetType) – The type of the dataset to find (RETRIEVED, TRANSFORMED, OUTPUT)

Returns

Optional[int] – The dataset id or None if not found

adi_py.utils.get_empty_ndarray_for_var(adi_var: cds3.Var, attrs: Dict = None) numpy.ndarray

For the given ADI variable object, initialize an empty numpy ndarray data array with the correct shape and data type. All values will be filled with the appropriate fill value. The rules for selecting a fill value are as follows:

  • If this is a qc variable, the missing value bit flag will be used. If no missing value bit, then the failed transformation bit flag will be used. If no transformation failed bit, then use _FillValue. If no _FillValue, then use the netcdf default fill value for integer data type.

  • Else if a missing_value attribute is available, missing_value will be used

  • Else if a _FillValue attribute is available, _FillValue will be used

  • Else use the netcdf default fill value for the variable’s data type

Parameters
  • adi_var (cds3.Var) – The ADI variable object

  • attrs (Dict) – A Dictionary of attributes that will be assigned to the variable when it is converted to XArray. If not provided, it will be created from the ADI variable’s attrs.

Returns

np.ndarray – An empty ndarray of the same shape as the variable.
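
The fill value selection rules listed above can be sketched as a small function. This is an illustration, not the adi_py code: choose_fill_value and its parameters are hypothetical, and the QC bit flags are passed in directly rather than derived from the variable's attributes. The two constants are the netCDF default fill values for the int and double types.

```python
from typing import Dict, Optional

# netCDF default fill values (NC_FILL_INT and NC_FILL_DOUBLE).
NC_FILL_INT = -2147483647
NC_FILL_DOUBLE = 9.9692099683868690e36


def choose_fill_value(attrs: Dict,
                      is_qc: bool = False,
                      missing_bit_flag: Optional[int] = None,
                      transform_failed_bit_flag: Optional[int] = None,
                      default_fill=NC_FILL_DOUBLE):
    """Hypothetical helper applying the precedence rules in order."""
    if is_qc:
        if missing_bit_flag is not None:
            return missing_bit_flag
        if transform_failed_bit_flag is not None:
            return transform_failed_bit_flag
        if "_FillValue" in attrs:
            return attrs["_FillValue"]
        return NC_FILL_INT
    if "missing_value" in attrs:
        return attrs["missing_value"]
    if "_FillValue" in attrs:
        return attrs["_FillValue"]
    # default_fill stands in for the netCDF default of the variable's type.
    return default_fill


print(choose_fill_value({"missing_value": -9999, "_FillValue": -1}))
```
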

adi_py.utils.get_time_data_as_datetime64(time_var: cds3.Var) numpy.ndarray

Get the time values from dsproc as seconds since 1970, then convert those values to datetime64 with microsecond precision.

Parameters

time_var (cds3.Var) – An ADI time variable object

Returns

np.ndarray – An ndarray of the same shape as the variable with time values converted to the np.datetime64 data type with microsecond precision.
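
The conversion from seconds since 1970 to microsecond-precision timestamps can be sketched with the standard library alone. The helper seconds_to_datetimes is hypothetical and returns Python datetime objects, whereas the real function returns a numpy array of datetime64 values.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def seconds_to_datetimes(seconds: Iterable[float]) -> List[datetime]:
    """Convert seconds since 1970 to datetimes with microsecond precision."""
    # Round to whole microseconds so no sub-microsecond residue is carried over.
    return [EPOCH + timedelta(microseconds=round(s * 1e6)) for s in seconds]


times = seconds_to_datetimes([0.0, 1577836800.5])
print(times[1].isoformat())
```
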

adi_py.utils.get_xr_datasets(dataset_type: adi_py.constants.ADIDatasetType, dsid: Optional[int] = None, datastream_name: Optional[str] = None, site: Optional[str] = None, facility: Optional[str] = None, coordsys_name: Optional[str] = None) List[xarray.Dataset]

Get an ADI dataset converted to an xarray.Dataset.

Parameters
  • dataset_type (ADIDatasetType) – The type of the dataset to convert (RETRIEVED, TRANSFORMED, OUTPUT)

  • dsid (int) – If the dsid is known, you can use it to look up the adi dataset. If it is not known, then use datastream_name, and optionally site/facility to identify the dataset.

  • datastream_name (str) – The name of one of the process’ datastreams as specified in the PCM.

  • coordsys_name (str) – Optional parameter used only to find TRANSFORMED datasets. Must be a coordinate system specified in the PCM or None if no coordinate system was specified.

  • site (str) – Optional parameter used only to find some input datasets (RETRIEVED or TRANSFORMED). Site is only required if the retrieval rules in the PCM specify two different rules for the same datastream that differ by site.

  • facility (str) – Optional parameter used only to find some input datasets (RETRIEVED or TRANSFORMED). Facility is only required if the retrieval rules in the PCM specify two different rules for the same datastream that differ by facility.

Returns

List[xr.Dataset] – A list of xr.Datasets, one for each file. If there are no files / datasets for the specified datastream / site / facility / coord system, then the list will be empty.

adi_py.utils.is_empty_function(func: Callable) bool

Evaluates a given function to see if the code contains anything more than docstrings and ‘pass’. If not, it is considered an ‘empty’ function.

Parameters

func (Callable) – The function to evaluate.

Returns

bool – True if the function is empty, otherwise False.
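
One way such a check could work is to parse the function's source and inspect the statements in its body. The sketch below is an assumption about the approach, not the adi_py implementation. The hypothetical helper is_empty_source takes source text directly; for a live function object, the text could be obtained with inspect.getsource.

```python
import ast
import textwrap


def is_empty_source(source: str) -> bool:
    """Return True if the first function in source holds only docstrings and pass."""
    tree = ast.parse(textwrap.dedent(source))
    func = next(node for node in ast.walk(tree)
                if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)))
    for stmt in func.body:
        if isinstance(stmt, ast.Pass):
            continue
        # A bare string expression is how a docstring appears in the AST.
        if (isinstance(stmt, ast.Expr)
                and isinstance(stmt.value, ast.Constant)
                and isinstance(stmt.value.value, str)):
            continue
        return False
    return True


EMPTY_HOOK = '''
def init_process_hook(self):
    """Placeholder hook."""
    pass
'''

REAL_HOOK = '''
def init_process_hook(self):
    return 1
'''

print(is_empty_source(EMPTY_HOOK), is_empty_source(REAL_HOOK))
```
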

adi_py.utils.sync_xarray(xr_dataset: xarray.Dataset, adi_dataset: cds3.Group)

Carefully inspect the xr.Dataset and synchronize any changes back to the given ADI dataset.

Parameters
  • xr_dataset (xr.Dataset) – The XArray dataset to sync

  • adi_dataset (cds3.Group) – The ADI dataset where changes will be applied

adi_py.utils.sync_xr_dataset(xr_dataset: xarray.Dataset)

Sync the contents of the given XArray.Dataset with the corresponding ADI data structure.

Parameters

xr_dataset (xr.Dataset) – The xr.Dataset(s) to sync.

adi_py.utils.to_xarray(adi_dataset: cds3.Group) xarray.Dataset

Convert the specified CDS.Group into an XArray dataset. Attributes will be copied, but the DataArrays for each variable will be backed by an np.ndarray that links directly to the C ADI data via the NumPy C API function PyArray_SimpleNewFromData.

Parameters

adi_dataset (cds3.Group) – An ADI dataset object.

Returns

xr.Dataset – The corresponding XArray dataset object.
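
The practical consequence of arrays that link directly to the underlying data is that no copy is made, so a write on either side is visible on the other. The following is only an analogy using the standard library: an array.array stands in for memory owned by the C library, and a memoryview stands in for the ndarray that wraps it.

```python
from array import array

# A buffer standing in for data owned by the C library.
c_side = array("d", [1.0, 2.0, 3.0])

# memoryview wraps the existing buffer without copying it.
view = memoryview(c_side)

view[0] = 42.0     # write through the view
c_side[1] = -1.0   # write to the underlying buffer

print(c_side[0], view[1])
```
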