`adi_py`¶

This module provides the new ADI Python bindings which incorporate full XArray compatibility.

Submodules¶

Classes¶

`ADIAtts`
`ADIDataArrayAccessor`	Used to apply special ADI functions to an xarray data array (i.e., variable)
`ADIDatasetAccessor`	Used to apply special ADI functions to an xarray dataset with the
`ADIDatasetType`	Used to easily reference different types of ADI datasets.
`ADILogger`	This class provides python-like logging API facade around the dsproc
`BitAssessment`	Used to easily reference bit assessment values used in ADI QC
`DatastreamIdentifier`	NamedTuple class that holds various information used to identify a specific
`LogLevel`	Generic enumeration.
`Process`	The base class for running an ADI process in Python. All Python processes
`SpecialXrAttributes`	Enumerates the special XArray variable attributes that are assigned
`SplitMode`	Enumerates the split mode which is used to define the output file size
`TransformAttributes`	Used to easily reference transformation metadata attrs used in ADI QC

exception adi_py.DatasetConversionException¶

Bases: Exception

Exception used when converting from XArray to ADI or vice versa and the data are incompatible.

exception adi_py.SkipProcessingIntervalException(msg: str = '', log_level: adi_py.logger.LogLevel = LogLevel.INFO)¶

Bases: Exception

Processes should raise this exception if the current processing interval should be skipped. Any other exception is treated as a process failure.
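The skip-versus-fail distinction can be pictured with a small, self-contained sketch. It does not use adi_py: `SkipInterval` is a hypothetical stand-in for SkipProcessingIntervalException, `run_intervals` is an invented driver loop, and the standard `logging` levels stand in for LogLevel.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("adi_sketch")


class SkipInterval(Exception):
    """Hypothetical stand-in for adi_py.SkipProcessingIntervalException."""

    def __init__(self, msg: str = "", log_level: int = logging.INFO):
        super().__init__(msg)
        self.msg = msg
        self.log_level = log_level


def run_intervals(intervals, hook):
    """Call hook(begin, end) per interval; return (status, skipped intervals)."""
    skipped = []
    for begin, end in intervals:
        try:
            hook(begin, end)
        except SkipInterval as exc:
            # A skip is not a failure: log it and continue with the next interval.
            log.log(exc.log_level, "Skipping %s-%s: %s", begin, end, exc.msg)
            skipped.append((begin, end))
        except Exception:
            # Any other exception fails the whole run (status 1).
            log.exception("Process failed in interval %s-%s", begin, end)
            return 1, skipped
    return 0, skipped


def example_hook(begin, end):
    # Pretend the second interval has no usable input data.
    if begin == 100:
        raise SkipInterval("no input data for this interval")


status, skipped = run_intervals([(0, 100), (100, 200), (200, 300)], example_hook)
```

In a real process, the exception would be raised from inside one of the Process hooks rather than from a free function.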

class adi_py.ADIAtts¶

ANCILLARY_VARIABLES = ancillary_variables¶

DESCRIPTION = description¶

FILL_VALUE = ['_FillValue']¶

LONG_NAME = long_name¶

MISSING_VALUE = missing_value¶

STANDARD_NAME = standard_name¶

UNITS = units¶

VALID_MAX = valid_max¶

VALID_MIN = valid_min¶

class adi_py.ADIDataArrayAccessor(xarray_obj)¶

Used to apply special ADI functions to an xarray data array (i.e., variable) with the namespace ‘adi’

Class Methods

`assign_coordinate_system`
`assign_output_datastream`
`nsamples`
`source_ds_name`
`source_var_name`

Method Descriptions

assign_coordinate_system(self, coordinate_system_name: str)¶

assign_output_datastream(self, output_datastream_name: str, variable_name_in_datastream: str = None)¶

property nsamples(self) → int¶

property source_ds_name(self) → str¶

property source_var_name(self) → str¶

class adi_py.ADIDatasetAccessor(xarray_obj)¶

Used to apply special ADI functions to an xarray dataset with the namespace ‘adi’

Class Methods

`add_qc_variable`
`add_variable`
`convert_units`
`drop_transform_metadata`
`drop_variables`
`get_companion_transform_variable_names`
`get_qc_variable`
`record_qc_results`
`variables_exist`

Method Descriptions

add_qc_variable(self, variable_name: str)¶

add_variable(self, variable_name: str, dim_names: List[str], data: numpy.ndarray, long_name: str = None, standard_name: str = None, units: str = None, valid_min=None, valid_max=None, missing_value: numpy.ndarray = None, fill_value=None)¶

convert_units(self, old_units: str, new_units: str, variable_names: List[str] = None, converter_function: Callable = None)¶

drop_transform_metadata(self, variable_names: List[str]) → xarray.Dataset¶

drop_variables(self, variable_names: List[str]) → xarray.Dataset¶

get_companion_transform_variable_names(self, variable_name: str) → List[str]¶

get_qc_variable(self, variable_name: str)¶

record_qc_results(self, variable_name: str, bit_number: int = None, test_results: numpy.ndarray = None)¶

variables_exist(self, variable_names: List[str] = []) → numpy.ndarray¶

class adi_py.ADIDatasetType¶

Bases: enum.Enum

Used to easily reference different types of ADI datasets.

OUTPUT = 3¶

RETRIEVED = 1¶

TRANSFORMED = 2¶

class adi_py.ADILogger¶

This class provides python-like logging API facade around the dsproc logging methods.

Class Methods

`debug`
`error`
`exception`	Use this method to log the stack trace of any raised exception to the process’s
`info`
`warning`

Method Descriptions

static debug(message, debug_level=1)¶

static error(message)¶

static exception(message)¶

Use this method to log the stack trace of any raised exception to the process’s ADI log file.

Parameters: message (str) – An optional additional message to log, in addition to the stack trace.

static info(message)¶

static warning(message)¶

class adi_py.BitAssessment¶

Bases: enum.Enum

Used to easily reference bit assessment values used in ADI QC

BAD = Bad¶

INDETERMINATE = Indeterminate¶

class adi_py.DatastreamIdentifier¶

Bases: NamedTuple

NamedTuple class that holds various information used to identify a specific ADI dataset.

datastream_name :str¶

dsid :int¶

facility :str¶

site :str¶
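As a rough illustration of how such a tuple is used, the sketch below defines a hypothetical `DatastreamId` stand-in with the same four field names. The field order of the real class is not stated here, so the example uses keyword arguments only, and the `dsid` value is invented.

```python
from typing import NamedTuple


class DatastreamId(NamedTuple):
    """Hypothetical stand-in with the same field names as DatastreamIdentifier."""
    datastream_name: str
    site: str
    facility: str
    dsid: int


# Keyword arguments avoid relying on a field order that is only assumed here.
ds = DatastreamId(datastream_name="mfrsrcldod1min.c1", site="ena", facility="C1", dsid=0)

# NamedTuples are immutable; _replace returns a modified copy.
other = ds._replace(dsid=3)
```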

class adi_py.LogLevel¶

Bases: enum.Enum

Generic enumeration.

Derive from this class to define new enumerations.

DEBUG = debug¶

ERROR = error¶

INFO = info¶

WARNING = warning¶

class adi_py.Process¶

The base class for running an ADI process in Python. All Python processes should extend this class.

Class Methods

`add_qc_variable`	Add a companion qc variable for the given variable
`add_variable`	Create a new variable in the given xarray dataset with the specified dimensions,
`assign_coordinate_system_to_variable`	Assign the given variable to the designated ADI coordinate system.
`assign_output_datastream_to_variable`	Assign the given variable to the designated output datastream.
`convert_units`	For the specified variables, convert the units from old_units to new_units.
`debug_level`	Get the debug level passed on the command line when running the process.
`drop_transform_metadata`	This method removes all associated companion variables that are generated
`drop_variables`	This method removes the given variables plus all associated companion
`facility`	Get the facility where this invocation of the process is running
`find_retrieved_variable`	Find the input datastream where the given retrieved variable came
`finish_process_hook`	This hook will be called once just after the main data processing loop finishes. This function should be used
`get_bad_qc_mask`	Get a mask of same shape as the variable’s data which contains True values
`get_companion_transform_variable_names`	For the given variable, get a list of the companion/ancillary variables
`get_datastream_files`	See `utils.get_datastream_files()`
`get_dsid`	Gets the corresponding dataset id for the given datastream (input or output)
`get_missing_value_mask`	Get True/False mask of same shape as passed variable(s) which is used to
`get_non_missing_value_mask`	Get a True/False mask of same shape as passed variable(s) that is used to
`get_nsamples`	Get the ADI sample count for the given variable (i.e., the length
`get_output_dataset`	Get an ADI output dataset converted to an xr.Dataset.
`get_output_dataset_by_dsid`
`get_output_datasets`	Get the ADI output datasets converted to a list of xr.Datasets.
`get_output_datasets_by_dsid`
`get_qc_variable`	Return the companion qc variable for the given data variable.
`get_quicklooks_file_name`	Create a properly formatted file name where a quicklooks plot should be
`get_retrieved_dataset`	Get an ADI retrieved dataset converted to an xr.Dataset.
`get_retrieved_dataset_by_dsid`
`get_retrieved_datasets`	Get the ADI retrieved datasets converted to a list of xarray Datasets.
`get_retrieved_datasets_by_dsid`
`get_source_ds_name`	For the given variable, get name of the input datastream
`get_source_var_name`	For the given variable, get the name of the variable
`get_transformed_dataset`	Get an ADI transformed dataset converted to an xr.Dataset.
`get_transformed_dataset_by_dsid`
`get_transformed_datasets`	Get the ADI transformed datasets converted to a list of xr.Datasets.
`get_transformed_datasets_by_dsid`
`include_debug_dumps`	Setting controlling whether this process should provide debug dumps of the
`init_process_hook`	This hook will be called once just before the main data processing loop begins and before the initial
`location`	Get the location where this invocation of the process is running.
`post_retrieval_hook`	This hook will be called once per processing interval just after data retrieval,
`post_transform_hook`	This hook will be called once per processing interval just after data
`pre_retrieval_hook`	This hook will be called once per processing interval just prior to data retrieval.
`pre_transform_hook`	This hook will be called once per processing interval just prior to data
`process_data_hook`	This hook will be called once per processing interval just after the output
`process_model`	The processing model to use. It can be one of:
`process_name`	The name of the process that is currently being run.
`process_names`	The name(s) of the process(es) that could run this code. Subclasses must
`process_version`	The version of this process’s code. Subclasses must define the
`quicklook_hook`	This hook will be called once per processing interval just after all data
`record_qc_results`	For the given variable, add bitwise test results to the companion qc
`rollup_qc`	ADI setting controlling whether all the qc bits are rolled up into a
`run`	Run the process.
`set_datastream_flags`	Apply a set of ADI control flags to a datastream as identified by the
`set_datastream_split_mode`	This method should be called in your init_process_hook if you need to
`set_retriever_time_offsets`	This method should be called in your init_process_hook if you need to override
`shift_output_interval`	This method should be called in your init_process_hook (i.e., before the
`shift_processing_interval`	This method should be called in your init_process_hook (i.e., before the
`site`	Get the site where this invocation of the process is running
`sync_datasets`	Sync the contents of one or more XArray.Datasets with the corresponding ADI
`variables_exist`	Check if the given variables exist in the given dataset.

Method Descriptions

static add_qc_variable(dataset: xarray.Dataset, variable_name: str)¶

Add a companion qc variable for the given variable

Parameters

dataset (xr.Dataset) –
variable_name (str) –

Returns

The newly created DataArray

static add_variable(dataset: xarray.Dataset, variable_name: str, dim_names: List[str], data: numpy.ndarray, long_name: str = None, standard_name: str = None, units: str = None, valid_min: Any = None, valid_max: Any = None, missing_value: numpy.ndarray = None, fill_value: Any = None)¶

Create a new variable in the given xarray dataset with the specified dimensions, data, and attributes.

Important

If you want to add the created variable to a given coordinate system, then you should follow this with a call to assign_coordinate_system_to_variable. Similarly, if you want to add the created variable to a given output datastream, then you should follow this with a call to assign_output_datastream_to_variable.

See also

assign_coordinate_system_to_variable
assign_output_datastream_to_variable

Parameters

dataset (xr.Dataset) – The xarray dataset to add the new variable to
variable_name (str) – The name of the variable
dim_names (List[str]) – A list of dimension names for the variable
data (np.ndarray) – A multidimensional array of the variable’s data. Must have the same shape as the dimensions.
long_name (str) – The long_name attribute for the variable
standard_name (str) – The standard_name attribute for the variable
units (str) – The units attribute for the variable
valid_min (Any) – The valid_min attribute for the variable. Must be the same data type as the variable.
valid_max (Any) – The valid_max attribute for the variable. Must be the same data type as the variable.
missing_value (np.ndarray) – An array of possible missing_value attributes for the variable. Must be the same data type as the variable.
fill_value (Any) – The fill_value attribute for the variable. Must be the same data type as the variable.

Returns

The newly created variable (i.e., xr.DataArray object)

static assign_coordinate_system_to_variable(variable: xarray.DataArray, coordinate_system_name: str)¶

Assign the given variable to the designated ADI coordinate system.

Parameters

variable (xr.DataArray) – A data variable from an xarray dataset
coordinate_system_name (str) – The name of one of the process’s coordinate systems as specified in the PCM process definition.

static assign_output_datastream_to_variable(variable: xarray.DataArray, output_datastream_name: str, variable_name_in_datastream: str = None)¶

Assign the given variable to the designated output datastream.

Parameters

variable (xr.DataArray) – A data variable from an xarray dataset
output_datastream_name (str) – An output datastream name as specified in PCM process definition
variable_name_in_datastream (str) – The name of the variable as it should appear in the output datastream. If not specified, then the name of the given variable will be used.

static convert_units(xr_datasets: List[xarray.Dataset], old_units: str, new_units: str, variable_names: List[str] = None, converter_function: Callable = None)¶

For the specified variables, convert the units from old_units to new_units. For applicable variables, this conversion will include changing the units attribute value and optionally converting all the data values if a converter function is provided.

This method is needed for special cases where the units conversion is not supported by udunits and the default ADI converters.

Parameters

xr_datasets (List[xr.Dataset]) – One or more xarray datasets upon which to apply the conversion
old_units (str) – The old units (e.g., ‘degree F’)
new_units (str) – The new units (e.g., ‘K’)
variable_names (List[str]) – A list of specific variable names to convert. If not specified, it converts all variables with the given old_units to new_units.
converter_function (Callable) –
A function to run on an xarray variable (i.e., a DataArray) that converts the variable’s values from old_units to new_units. If not specified, then only the units attribute value will be changed. This is useful when the units attribute value merely contains a typo and the data values are already correct.

The function should take one parameter, an xarray.DataArray, and operate in place on the variable’s values.
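A converter function of the kind described above might look like the following sketch. To stay runnable without xarray, it uses `FakeVariable`, a hypothetical stand-in that only mimics the `values` and `attrs` members of a DataArray. The function name and the sample values are likewise illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeVariable:
    """Hypothetical stand-in for xr.DataArray: just values plus attrs."""
    values: List[float]
    attrs: Dict[str, str] = field(default_factory=dict)


def fahrenheit_to_kelvin(variable) -> None:
    """Convert the variable's values from degree F to K in place."""
    # Slice assignment mutates the existing container instead of rebinding it,
    # which is what "operate in place" requires.
    variable.values[:] = [(v - 32.0) * 5.0 / 9.0 + 273.15 for v in variable.values]


temp = FakeVariable(values=[32.0, 212.0], attrs={"units": "degree F"})
fahrenheit_to_kelvin(temp)
# In this sketch the units attribute is updated by hand; the documented method
# changes the units attribute as part of the conversion.
temp.attrs["units"] = "K"
```

With a real DataArray, the same in-place update would more likely be written as vectorized arithmetic on `variable.values`.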

property debug_level(self) → int¶

Get the debug level passed on the command line when running the process.

Returns: int – the debug level

static drop_transform_metadata(dataset: xarray.Dataset, variable_names: List[str]) → xarray.Dataset¶

This method removes all associated companion variables that are generated by the transformation process (if they exist), as well as transformation attributes, but it does not remove the original variable.

Parameters

dataset (xr.Dataset) – The dataset containing the transformed variables.
variable_names (List[str]) – The variable names for which to remove transformation metadata.

Returns

xr.Dataset – A new dataset with the transform companion variables and metadata removed.

static drop_variables(dataset: xarray.Dataset, variable_names: List[str]) → xarray.Dataset¶

This method removes the given variables plus all associated companion variables that were added as part of the transformation step (if they exist).

Parameters

dataset (xr.Dataset) – The dataset containing the given variables.
variable_names (List[str]) – The variable names to remove.

Returns

xr.Dataset – A new dataset with the given variables and their transform companion variables removed.

property facility(self) → str¶

Get the facility where this invocation of the process is running

Returns: str – The facility where this process is running

static find_retrieved_variable(retrieved_variable_name) → Optional[adi_py.utils.DatastreamIdentifier]¶

Find the input datastream where the given retrieved variable came from. We may need this if there are complex retrieval rules and the given variable may be retrieved from different datastreams depending upon the site/facility where this process runs. We need to get the DatastreamIdentifier so we can load the correct xarray dataset if we need to modify the data values.

Parameters: retrieved_variable_name (str) – The name of the retrieved variable to find
Returns: A DatastreamIdentifier containing all the information needed to look up the given dataset or None if the retrieved variable was not found.

finish_process_hook(self)¶: This hook will be called once just after the main data processing loop finishes. This function should be used to clean up any temporary files used.

static get_bad_qc_mask(dataset: xarray.Dataset, variable_name: str, include_indeterminate: bool = False, bit_numbers: List[int] = None) → numpy.ndarray¶

Get a mask of same shape as the variable’s data which contains True values for each data point that has a corresponding bad qc bit set.

Parameters

dataset (xr.Dataset) – The dataset containing the variables
variable_name (str) – The variable name to check qc for
include_indeterminate (bool) – Whether to include indeterminate bits when determining the mask. By default this is False and only bad bits are used to compute the mask.
bit_numbers (List[int]) – The specific bit numbers to include in the qc check (i.e., 1,2,3,4, etc.). Note that if not specified, all bits will be used to compute the mask.

Returns

np.ndarray – An array of same shape as the variable consisting of True/False values, where each True indicates that the corresponding data point had bad (or indeterminate if include_indeterminate is specified) qc for the specified bit numbers (all bits if bit_numbers not specified).
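The masking logic described above can be sketched in plain Python. This is not the adi_py implementation: `bad_qc_mask` is a hypothetical helper, the QC values are a flat list of integers rather than an array, and the `assessments` dict (bit number to "Bad" or "Indeterminate") is an assumed input, since how each bit's assessment is looked up is not described here.

```python
from typing import Dict, List, Optional


def bad_qc_mask(qc_values: List[int],
                assessments: Dict[int, str],
                include_indeterminate: bool = False,
                bit_numbers: Optional[List[int]] = None) -> List[bool]:
    """Return True for each point with at least one selected qc bit set."""
    wanted = {"Bad", "Indeterminate"} if include_indeterminate else {"Bad"}
    bits = bit_numbers if bit_numbers is not None else list(assessments)

    # Build one integer with every selected bit set.
    check = 0
    for bit in bits:
        if assessments.get(bit) in wanted:
            check |= 1 << (bit - 1)  # bit numbering starts at 1

    return [(qc & check) != 0 for qc in qc_values]


# Hypothetical assessments: bits 1 and 3 are Bad, bit 2 is Indeterminate.
assessments = {1: "Bad", 2: "Indeterminate", 3: "Bad"}
qc = [0, 1, 2, 4, 6]

bad_only = bad_qc_mask(qc, assessments)
with_indeterminate = bad_qc_mask(qc, assessments, include_indeterminate=True)
bit_one_only = bad_qc_mask(qc, assessments, bit_numbers=[1])
```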

static get_companion_transform_variable_names(dataset: xarray.Dataset, variable_name: str) → List[str]¶

For the given variable, get a list of the companion/ancillary variables that were added as a result of the ADI transformation.

Parameters

dataset (xr.Dataset) – The dataset
variable_name (str) – The name of a data variable in the dataset

Returns

A list of string companion variable names that were created from the transform engine. This is used for cleaning up associated variables when a variable is deleted from a dataset.

static get_datastream_files(datastream_name: str, begin_date: int, end_date: int) → List[str]¶: See utils.get_datastream_files()

static get_dsid(datastream_name: str, site: str = None, facility: str = None, dataset_type: adi_py.constants.ADIDatasetType = None) → Optional[int]¶

Gets the corresponding dataset id for the given datastream (input or output)

Parameters

datastream_name (str) – The name of the datastream to find
site (str) – Optional parameter used only to find some input datasets (RETRIEVED or TRANSFORMED). Site is only required if the retrieval rules in the PCM specify two different rules for the same datastream that differ by site.
facility (str) – Optional parameter used only to find some input datasets (RETRIEVED or TRANSFORMED). Facility is only required if the retrieval rules in the PCM specify two different rules for the same datastream that differ by facility.
dataset_type (ADIDatasetType) – The type of the dataset to convert (RETRIEVED, TRANSFORMED, OUTPUT)

Returns

Optional[int] – The dataset id or None if not found

static get_missing_value_mask(*args) → xarray.DataArray¶

Get True/False mask of same shape as passed variable(s) which is used to select data points for which one or more of the values of any of the specified variables are missing.

Parameters: *args (xr.DataArray) – Pass one or more xarray variables to check for missing values. All variables in the list must have the same shape.
Returns: xr.DataArray – An array of True/False values of the same shape as the input variables where each True represents the case where one or more of the variables has a missing_value at that index.

static get_non_missing_value_mask(*args) → xarray.DataArray¶

Get a True/False mask of same shape as passed variable(s) that is used to select data points for which none of the values of any of the specified variables are missing.

Parameters: *args (xr.DataArray) – Pass one or more xarray variables to check. All variables in the list must have the same shape.
Returns: xr.DataArray – An array of True/False values of the same shape as the input variables where each True represents the case where all variables passed in have non-missing value data at that index.
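The two masks are complements of each other. The sketch below shows the element-wise logic with plain lists instead of DataArrays. The helper names and the shared `MISSING` sentinel are assumptions for illustration; the real methods read each variable's own missing value metadata.

```python
from typing import List, Sequence

MISSING = -9999.0  # hypothetical missing value shared by all variables


def missing_value_mask(*variables: Sequence[float]) -> List[bool]:
    """True where one or more of the variables is missing at that index."""
    if len({len(v) for v in variables}) != 1:
        raise ValueError("all variables must have the same shape")
    return [any(x == MISSING for x in point) for point in zip(*variables)]


def non_missing_value_mask(*variables: Sequence[float]) -> List[bool]:
    """True where none of the variables is missing at that index."""
    return [not missing for missing in missing_value_mask(*variables)]


temperature = [1.0, MISSING, 3.0]
humidity = [50.0, 60.0, MISSING]

missing = missing_value_mask(temperature, humidity)
usable = non_missing_value_mask(temperature, humidity)
```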

static get_nsamples(xr_var: xarray.DataArray) → int¶

Get the ADI sample count for the given variable (i.e., the length of the first dimension or 1 if the variable has no dimensions)

Parameters: xr_var (xr.DataArray) –
Returns: int – The ADI sample count
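The rule in the description (length of the first dimension, or 1 for a dimensionless variable) reduces to a one-liner. The helper below is a hypothetical illustration that takes a shape tuple rather than a DataArray.

```python
from typing import Tuple


def nsamples_from_shape(shape: Tuple[int, ...]) -> int:
    """Length of the first dimension, or 1 if there are no dimensions."""
    return shape[0] if shape else 1


two_dim = nsamples_from_shape((1440, 10))  # e.g. time x height
scalar = nsamples_from_shape(())           # variable with no dimensions
```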

static get_output_dataset(output_datastream_name: str) → Optional[xarray.Dataset]¶

Get an ADI output dataset converted to an xr.Dataset.

Note: This method will return at most a single xr.Dataset. If you expect multiple datasets, or would like to handle cases where multiple dataset files may be retrieved, please use the Process.get_output_datasets() function.

Parameters

output_datastream_name (str) – The name of one of the process’ output datastreams as specified in the PCM.

Returns

xr.Dataset | None –

Returns a single xr.Dataset, or None if no output datasets exist for the specified datastream / site / facility / coord system.

static get_output_dataset_by_dsid(dsid: int) → Optional[xarray.Dataset]¶

static get_output_datasets(output_datastream_name: str) → List[xarray.Dataset]¶

Get the ADI output datasets converted to a list of xr.Datasets.

Parameters

output_datastream_name (str) – The name of one of the process’ output datastreams as specified in the PCM.

Returns

List[xr.Dataset] –

Returns a list of xr.Datasets. If no output datasets exist for the specified datastream / site / facility / coord system then the list will be empty.

static get_output_datasets_by_dsid(dsid: int) → List[xarray.Dataset]¶

static get_qc_variable(dataset: xarray.Dataset, variable_name: str) → xarray.DataArray¶

Return the companion qc variable for the given data variable.

Parameters

dataset (xr.Dataset) –
variable_name (str) –

Returns

xr.DataArray – The companion qc variable or None if it doesn’t exist

get_quicklooks_file_name(self, datastream_name: str, begin_date: int, description: str = None, ext: str = 'png', mkdirs: bool = False)¶

Create a properly formatted file name where a quicklooks plot should be saved for the given processing interval. For example:

${QUICKLOOK_DATA}/ena/enamfrsrcldod1minC1.c1/2021/01/01/enamfrsrcldod1minC1.c1.20210101.000000.lwp.png

Parameters

datastream_name (str) – The name of the datastream which this plot applies to. For example, mfrsrcldod1min.c1
begin_date (int) – The begin timestamp of the current processing interval as passed to the quicklook hook function
description (str) – The description of the plot to be used in the file name. For example, in the file enamfrsrcldod1minC1.c1.20210101.000000.lwp.png, the description is ‘lwp’.
ext (str) – The file extension for the image. Default is ‘png’
mkdirs (bool) – If True, then the folder path to the quicklooks file will be automatically created if it does not exist. Default is False.

Returns

str – The full path to where the quicklooks file should be saved.
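The file name layout in the example above can be reproduced with a short sketch. It is not the adi_py implementation: `quicklook_file_name` is a hypothetical helper, `root` stands in for ${QUICKLOOK_DATA}, site and facility are passed explicitly, and `begin_date` is assumed to be seconds since the Unix epoch in UTC.

```python
from datetime import datetime, timezone
from typing import Optional


def quicklook_file_name(root: str, site: str, facility: str,
                        datastream_name: str, begin_date: int,
                        description: Optional[str] = None,
                        ext: str = "png") -> str:
    """Build <root>/<site>/<full name>/YYYY/MM/DD/<full name>.<date>.<time>[.<desc>].<ext>."""
    # "mfrsrcldod1min.c1" -> class "mfrsrcldod1min", level "c1"
    ds_class, level = datastream_name.rsplit(".", 1)
    full_name = f"{site}{ds_class}{facility}.{level}"

    start = datetime.fromtimestamp(begin_date, tz=timezone.utc)
    parts = [full_name, start.strftime("%Y%m%d"), start.strftime("%H%M%S")]
    if description:
        parts.append(description)
    parts.append(ext)

    folder = "/".join([root, site, full_name, start.strftime("%Y/%m/%d")])
    return folder + "/" + ".".join(parts)


path = quicklook_file_name(
    root="/data/quicklook",          # hypothetical value of ${QUICKLOOK_DATA}
    site="ena", facility="C1",
    datastream_name="mfrsrcldod1min.c1",
    begin_date=1609459200,           # 2021-01-01 00:00:00 UTC
    description="lwp",
)
```

The documented method also offers `mkdirs` to create the folder; this sketch only builds the string.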

static get_retrieved_dataset(input_datastream_name: str, site: Optional[str] = None, facility: Optional[str] = None) → Optional[xarray.Dataset]¶

Get an ADI retrieved dataset converted to an xr.Dataset.

Note: This method will return at most a single xr.Dataset. If you expect multiple datasets, or would like to handle cases where multiple dataset files may be retrieved, please use the Process.get_retrieved_datasets() function.

Parameters

input_datastream_name (str) – The name of one of the process’ input datastreams as specified in the PCM.
site (str) – Optional parameter used only to find some input datasets (RETRIEVED or TRANSFORMED). Site is only required if the retrieval rules in the PCM specify two different rules for the same datastream that differ by site.
facility (str) – Optional parameter used only to find some input datasets (RETRIEVED or TRANSFORMED). Facility is only required if the retrieval rules in the PCM specify two different rules for the same datastream that differ by facility.

Returns

xr.Dataset | None –

Returns a single xr.Dataset, or None if no retrieved datasets exist for the specified datastream / site / facility.

static get_retrieved_dataset_by_dsid(dsid: int) → Optional[xarray.Dataset]¶

static get_retrieved_datasets(input_datastream_name: str, site: Optional[str] = None, facility: Optional[str] = None) → List[xarray.Dataset]¶

Get the ADI retrieved datasets converted to a list of xarray Datasets.

Parameters

input_datastream_name (str) – The name of one of the process’ input datastreams as specified in the PCM.
site (str) – Optional parameter used only to find some input datasets (RETRIEVED or TRANSFORMED). Site is only required if the retrieval rules in the PCM specify two different rules for the same datastream that differ by site.
facility (str) – Optional parameter used only to find some input datasets (RETRIEVED or TRANSFORMED). Facility is only required if the retrieval rules in the PCM specify two different rules for the same datastream that differ by facility.

Returns

List[xr.Dataset] –

Returns a list of xr.Datasets. If no retrieved datasets exist for the specified datastream / site / facility / coord system then the list will be empty.

static get_retrieved_datasets_by_dsid(dsid: int) → List[xarray.Dataset]¶

static get_source_ds_name(xr_var: xarray.DataArray) → str¶

For the given variable, get the name of the input datastream where it came from.

Parameters: xr_var (xr.DataArray) –
Returns: str

static get_source_var_name(xr_var: xarray.DataArray) → str¶

For the given variable, get the name of the variable used in the input datastream.

Parameters: xr_var (xr.DataArray) –
Returns: str

static get_transformed_dataset(input_datastream_name: str, coordinate_system_name: str, site: Optional[str] = None, facility: Optional[str] = None) → Optional[xarray.Dataset]¶

Get an ADI transformed dataset converted to an xr.Dataset.

Note: This method will return at most a single xr.Dataset. If you expect multiple datasets, or would like to handle cases where multiple dataset files may be retrieved, please use the Process.get_transformed_datasets() function.

Parameters

input_datastream_name (str) – The name of one of the process’ input datastreams as specified in the PCM.
coordinate_system_name (str) – A coordinate system specified in the PCM or None if no coordinate system was specified.
site (str) – Optional parameter used only to find some input datasets (RETRIEVED or TRANSFORMED). Site is only required if the retrieval rules in the PCM specify two different rules for the same datastream that differ by site.
facility (str) – Optional parameter used only to find some input datasets (RETRIEVED or TRANSFORMED). Facility is only required if the retrieval rules in the PCM specify two different rules for the same datastream that differ by facility.

Returns

xr.Dataset | None –

Returns a single xr.Dataset, or None if no transformed datasets exist for the specified datastream / site / facility / coord system.

static get_transformed_dataset_by_dsid(dsid: int, coordinate_system_name: str) → Optional[xarray.Dataset]¶

static get_transformed_datasets(input_datastream_name: str, coordinate_system_name: str, site: Optional[str] = None, facility: Optional[str] = None) → List[xarray.Dataset]¶

Get the ADI transformed datasets converted to a list of xr.Datasets.

Parameters

input_datastream_name (str) – The name of one of the process’ input datastreams as specified in the PCM.
coordinate_system_name (str) – A coordinate system specified in the PCM or None if no coordinate system was specified.
site (str) – Optional parameter used only to find some input datasets (RETRIEVED or TRANSFORMED). Site is only required if the retrieval rules in the PCM specify two different rules for the same datastream that differ by site.
facility (str) – Optional parameter used only to find some input datasets (RETRIEVED or TRANSFORMED). Facility is only required if the retrieval rules in the PCM specify two different rules for the same datastream that differ by facility.

Returns

List[xr.Dataset] –

Returns a list of xr.Datasets. If no transformed datasets exist for the specified datastream / site / facility / coord system then the list will be empty.

static get_transformed_datasets_by_dsid(dsid: int, coordinate_system_name: str) → List[xarray.Dataset]¶

property include_debug_dumps(self) → bool¶

Setting controlling whether this process should provide debug dumps of the data after each hook.

Returns: bool – Whether debug dumps should be automatically included. If True and debug level is > 1, then debug dumps will be performed automatically before and after each code hook.

init_process_hook(self)¶: This hook will be called once just before the main data processing loop begins and before the initial database connection is closed.

property location(self) → dsproc3.PyProcLoc¶

Get the location where this invocation of the process is running.

Returns: dsproc.PyProcLoc – A class containing the alt, lat, and lon where the process is running.

post_retrieval_hook(self, begin_date: int, end_date: int)¶

This hook will be called once per processing interval just after data retrieval, but before the retrieved observations are merged and QC is applied.

Parameters

begin_date (int) – the begin time of the current processing interval
end_date (int) – the end time of the current processing interval

post_transform_hook(self, begin_date: int, end_date: int)¶

This hook will be called once per processing interval just after data transformation, but before the output datasets are created.

Parameters

begin_date (int) – the begin time of the current processing interval
end_date (int) – the end time of the current processing interval

pre_retrieval_hook(self, begin_date: int, end_date: int)¶

This hook will be called once per processing interval just prior to data retrieval.

Parameters

begin_date (int) – the begin time of the current processing interval
end_date (int) – the end time of the current processing interval

pre_transform_hook(self, begin_date: int, end_date: int)¶

This hook will be called once per processing interval just prior to data transformation, and after the retrieved observations are merged and QC is applied.

Parameters

begin_date (int) – the begin time of the current processing interval
end_date (int) – the end time of the current processing interval

process_data_hook(self, begin_date: int, end_date: int)¶

This hook will be called once per processing interval just after the output datasets are created, but before they are stored to disk.

Parameters

begin_date (int) – the begin time of the current processing interval
end_date (int) – the end time of the current processing interval

property process_model(self) → int¶

The processing model to use. It can be one of:

dsproc.PM_GENERIC
dsproc.PM_INGEST
dsproc.PM_RETRIEVER_INGEST
dsproc.PM_RETRIEVER_VAP
dsproc.PM_TRANSFORM_INGEST
dsproc.PM_TRANSFORM_VAP

Default value is PM_TRANSFORM_VAP. Subclasses can override in their constructor.

Returns: int – The processing model (see dsproc.ProcModel cdeftype)

property process_name(self) → str¶

The name of the process that is currently being run.

Returns: str – the name of the current process

property process_names(self) → List[str]¶

The name(s) of the process(es) that could run this code. Subclasses must define the self._names field in their constructor.

Returns: List[str] – One or more process names

property process_version(self) → str¶

The version of this process’s code. Subclasses must define the self._process_version field in their constructor.

Returns: str – The process version

quicklook_hook(self, begin_date: int, end_date: int)¶

This hook will be called once per processing interval just after all data is stored.

Parameters

begin_date (int) – the begin timestamp of the current processing interval
end_date (int) – the end timestamp of the current processing interval
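
The per-interval hooks above fire in a fixed order relative to transformation and storage. The sketch below illustrates that order for the three hooks whose signatures appear in this section. `StandInProcess` and its `_run_one_interval` driver are hypothetical and exist only so the example runs without ADI installed; only the hook names and signatures come from this reference. It also assumes the int arguments are seconds since the Unix epoch.

```python
from datetime import datetime, timezone


class StandInProcess:
    """Hypothetical stand-in for adi_py.Process; NOT the real base class."""

    def pre_transform_hook(self, begin_date: int, end_date: int):
        pass

    def process_data_hook(self, begin_date: int, end_date: int):
        pass

    def quicklook_hook(self, begin_date: int, end_date: int):
        pass

    def _run_one_interval(self, begin_date: int, end_date: int):
        # Hypothetical driver that calls the hooks in the documented order.
        self.pre_transform_hook(begin_date, end_date)  # before transformation
        self.process_data_hook(begin_date, end_date)   # outputs created, not yet stored
        self.quicklook_hook(begin_date, end_date)      # after all data is stored


class MyProcess(StandInProcess):
    def __init__(self):
        self.calls = []

    @staticmethod
    def _label(begin_date: int, end_date: int) -> str:
        # Assumption: the timestamps are seconds since the Unix epoch (UTC).
        begin = datetime.fromtimestamp(begin_date, tz=timezone.utc)
        end = datetime.fromtimestamp(end_date, tz=timezone.utc)
        return f"{begin:%Y-%m-%d %H:%M} to {end:%Y-%m-%d %H:%M}"

    def pre_transform_hook(self, begin_date: int, end_date: int):
        self.calls.append(("pre_transform", self._label(begin_date, end_date)))

    def process_data_hook(self, begin_date: int, end_date: int):
        self.calls.append(("process_data", self._label(begin_date, end_date)))

    def quicklook_hook(self, begin_date: int, end_date: int):
        self.calls.append(("quicklook", self._label(begin_date, end_date)))


process = MyProcess()
process._run_one_interval(1577836800, 1577923200)  # one day starting 2020-01-01 UTC
```

In a real process the subclass would extend `adi_py.Process`, and ADI itself would drive the hooks.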

static record_qc_results(xr_dataset: xarray.Dataset, variable_name: str, bit_number: int = None, test_results: numpy.ndarray = None)¶

For the given variable, add bitwise test results to the companion qc variable for the given test.

Parameters

xr_dataset (xr.Dataset) – The xr dataset
variable_name (str) – The name of the data variable (e.g., rh_ambient).
bit_number (int) – The bit/test number to record. Note that bit numbering starts at 1 (i.e., 1, 2, 3, 4, etc.).
test_results (np.ndarray) – An ndarray mask of the same shape as the variable with True/False values for each data point. True means the test failed for that data point. False means the test passed for that data point.
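
Because bit numbering starts at 1, test `n` would correspond to the integer value `2 ** (n - 1)` in the companion qc variable. The sketch below shows that arithmetic on plain Python lists. `set_qc_bit` is a hypothetical helper written for illustration only; it is not how `record_qc_results` is implemented, and it does not touch an xarray dataset.

```python
def set_qc_bit(qc_values, bit_number, test_results):
    """Return new qc integers with bit `bit_number` set where the test failed.

    Hypothetical illustration. bit_number is 1-based: bit 1 has value 1,
    bit 2 has value 2, bit 3 has value 4, and so on.
    """
    if bit_number < 1:
        raise ValueError("bit numbering starts at 1")
    bit_value = 1 << (bit_number - 1)
    return [
        (qc | bit_value) if failed else qc
        for qc, failed in zip(qc_values, test_results)
    ]


qc = [0, 0, 1, 0]                    # the third point already failed test 1
failed = [False, True, True, False]  # True means the test failed
qc = set_qc_bit(qc, bit_number=3, test_results=failed)
```

A bitwise OR preserves any bits already set by earlier tests, which is why the third point ends up carrying both bit 1 and bit 3.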

property rollup_qc(self) → bool¶

ADI setting controlling whether all the qc bits are rolled up into a single 0/1 value.

Returns: bool – Whether this process should roll up qc or not

run(self) → int¶

Run the process.

Returns

int – The processing status:

1 if an error occurred
0 if successful

static set_datastream_flags(dsid: int, flags: int)¶

Apply a set of ADI control flags to a datastream as identified by the dsid. Multiple flags can be combined together using a bitwise OR (e.g., dsproc.DS_STANDARD_QC | dsproc.DS_FILTER_NANS). The allowed flags are identified below:

dsproc.DS_STANDARD_QC = Apply standard QC before storing a dataset.

dsproc.DS_FILTER_NANS = Replace NaN and Inf values with missing values before storing a dataset.

dsproc.DS_OVERLAP_CHECK = Check for overlap with previously processed data. This flag will be ignored and the overlap check will be skipped if reprocessing mode or asynchronous processing mode is enabled.

dsproc.DS_PRESERVE_OBS = Preserve distinct observations when retrieving data. Only observations that start within the current processing interval will be read in.

dsproc.DS_DISABLE_MERGE = Do not merge multiple observations in retrieved data. Only data for the current processing interval will be read in.

dsproc.DS_SKIP_TRANSFORM = Skip the transformation logic for all variables in this datastream.

dsproc.DS_ROLLUP_TRANS_QC = Consolidate the transformation QC bits for all variables when mapped to the output datasets.

dsproc.DS_SCAN_MODE = Enable scan mode for datastreams that are not expected to be continuous. This prevents warning messages from being generated when data is not found within a processing interval. Instead, a message will be written to the log file indicating that the processing interval was skipped.

dsproc.DS_OBS_LOOP = Loop over observations instead of time intervals. This also sets the DS_PRESERVE_OBS flag.

dsproc.DS_FILTER_VERSIONED_FILES = Check for files with .v# version extensions and filter out lower versioned files. Files without a version extension take precedence.

Call self.get_dsid() to obtain the dsid value for a specific datastream. If the flags value is < 0, then the following default flags will be set:

dsproc.DS_STANDARD_QC – ‘b’ level datastreams
dsproc.DS_FILTER_NANS – ‘a’ and ‘b’ level datastreams
dsproc.DS_OVERLAP_CHECK – all output datastreams
dsproc.DS_FILTER_VERSIONED_FILES – input datastreams that are not level ‘0’

Parameters

dsid (int) – Datastream ID
flags (int) – Flags to set

Returns

int – The processing model (see dsproc.ProcModel cdeftype)
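
As a rough illustration of combining flags with a bitwise OR, the sketch below uses a stand-in `IntFlag` enum. The class `StandInFlags` and its numeric values are placeholders invented for this example; the real constants are defined in dsproc and may have different values.

```python
from enum import IntFlag


class StandInFlags(IntFlag):
    """Placeholder values only; the real constants live in dsproc."""
    DS_STANDARD_QC = 0x1
    DS_FILTER_NANS = 0x2
    DS_OVERLAP_CHECK = 0x4


# Combine two flags into the single int that a `flags` argument would carry.
combined = StandInFlags.DS_STANDARD_QC | StandInFlags.DS_FILTER_NANS

# Membership tests show which flags the combined value contains.
has_standard_qc = StandInFlags.DS_STANDARD_QC in combined
has_overlap_check = StandInFlags.DS_OVERLAP_CHECK in combined
```

With the real bindings, the equivalent combined value would be written as `dsproc.DS_STANDARD_QC | dsproc.DS_FILTER_NANS`, as noted in the description above.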

static set_datastream_split_mode(output_datastream_name: str, split_mode: adi_py.constants.SplitMode, split_start: int, split_interval: int)¶

This method should be called in your init_process_hook if you need to change the size of the output file for a given datastream. For example, to create monthly output files.

Parameters

output_datastream_name (str) – The name of the output datastream whose file output size will be changed.
split_mode (SplitMode) – One of the options from the SplitMode enum
split_start (int) – Depends on the split_mode selected
split_interval (int) – Depends on the split_mode selected

static set_retriever_time_offsets(input_datastream_name: str, begin_offset: int, end_offset: int)¶

This method should be called in your init_process_hook if you need to override the offsets per input datastream. By default, PCM only allows you to set global offsets that apply to all datastreams. If you need to change only one datastream, then you can do it via this method.

Parameters

input_datastream_name (str) – The specific input datastream to change the processing interval for.
begin_offset (int) – Seconds of data to fetch BEFORE the process interval starts
end_offset (int) – Seconds of data to fetch AFTER the process interval ends
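
The effect of the two offsets can be sketched as simple arithmetic on the processing interval. `retrieval_window` is a hypothetical helper, and it assumes the offsets simply widen the interval on each side; ADI performs the actual retrieval internally.

```python
def retrieval_window(begin_date: int, end_date: int,
                     begin_offset: int, end_offset: int):
    """Return the (start, stop) timestamps of the data that would be fetched.

    Hypothetical illustration: all values are in seconds, and the offsets
    extend the processing interval outward on both sides.
    """
    return begin_date - begin_offset, end_date + end_offset


# A one-day interval, fetching 1 hour before it and 30 minutes after it.
start, stop = retrieval_window(1577836800, 1577923200,
                               begin_offset=3600, end_offset=1800)
```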

static shift_output_interval(output_datastream_name: str, hours: int)¶

This method should be called in your init_process_hook (i.e., before the processing loop begins) if you need to shift the output interval to account for the timezone difference at the data location. For example, if you shift the output interval by -6 hours at SGP, the file will be split at 6:00 a.m. GMT.

Parameters

output_datastream_name (str) – The name of the output datastream whose file output will be shifted.
hours (int) – Number of hours to shift
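
The SGP example above can be checked with a little modular arithmetic. The sketch below assumes that unshifted daily files split at 00:00 GMT and that the shift moves the split in the opposite direction of `hours`, as in the example. `split_hour_gmt` is a hypothetical helper, not part of adi_py.

```python
def split_hour_gmt(hours: int) -> int:
    """Return the GMT hour at which a daily file would split.

    Hypothetical illustration: assumes the unshifted split is at 00:00 GMT
    and that a shift of -6 hours moves the split to 06:00 GMT.
    """
    return (-hours) % 24


sgp_split = split_hour_gmt(-6)  # the SGP example from the description
```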

static shift_processing_interval(seconds: int)¶

This method should be called in your init_process_hook (i.e., before the processing loop begins) if you need to shift the processing interval.

Parameters: seconds (int) – Number of seconds to shift

property site(self) → str¶

The site where this invocation of the process is running.

Returns: str – The site where this process is running

static sync_datasets(*args: xarray.Dataset)¶

Sync the contents of one or more XArray.Datasets with the corresponding ADI data structure.

Important

This method MUST be called at the end of a hook function if any changes have been made to the XArray Dataset so that updates can be pushed back to ADI.

Important

This dataset must have been previously loaded via one of the get_*_dataset methods in order to have the correct embedded metadata to be able to sync to ADI. Specifically, this will include datastream name, coordinate system, dataset type, and obs_index.
