casys.readers

Sub-modules related to data readers.

Functions

create_fields(data, fields)

Create a dataset from the fields clips and sources.

Classes

CasysReader([date_start, date_end, ...])

Abstract class representing a casys data reader.

CLSReader([ges_table_dir])

OCTANT CLS data source.

DatasetReader([data, data_path, ...])

xarray Dataset reader.

ZarrDatasetReader([data, data_path, ...])

xarray Dataset reader for the zarr format.

MultiReader(readers[, markers, tolerance, ...])

Reader that reads from a set of readers.

CLSTableReader(name[, ges_table_dir, ...])

OCTANT CLS TableMeasure data reader.

CLSTableInSituReader(sensor_type, sensor_name)

OCTANT TableInSitu data reader.

ZDatasetReader(data[, data_path, ...])

Reader for the zcollection.Dataset format.

ScCollectionReader([collection, data_path, ...])

Reader for a swot_calval Collection.

ZCollectionReader([collection, data_path, ...])

Reader for a Zcollection Collection.

StoreReader([store, store_path, ...])

xarray Dataset reader.

Exceptions

CasysReaderError

Exception raised when a problem related to data reading occurs.

class casys.readers.CLSReader(ges_table_dir=None, **kwargs)

Bases: CasysReader, ABC

OCTANT CLS data source.

Parameters:
  • ges_table_dir (str | None) – Path of the GES_TABLE_DIR to use.

  • date_start – Starting date of the interval we’re working on.

  • date_end – Ending date of the interval we’re working on.

  • select_clip – Selection clip allowing to work on a subset of the source’s data.

  • select_shape – Shape file, GeoDataFrame or Geometry to select.

  • data_cleaner – Data cleaning applied just after reading. This cleaning might consist of sorting, removing duplicates, or removing indexes so that the remaining ones keep increasing.

  • orf – Source’s indexer.

  • reference_track – Reference track.

  • time – Time field.

  • longitude – Longitude field.

  • latitude – Latitude field.

FIELDS_SOURCE_FULL_CHECK: bool = True
FORBIDDEN_PARAMETERS: list[str] = []
KNOWN_PARAMETERS: dict[str, tuple[Any, Any]] = {'cross_track_distance': ('Optional[str]', 'cross_track_distance'), 'cycle_number': ('Optional[str]', 'CYCLE_NUMBER'), 'data_cleaner': ('Optional[DataCleaner]', None), 'date_end': ('Optional[DateType]', None), 'date_start': ('Optional[DateType]', None), 'latitude': ('Optional[str]', 'LATITUDE'), 'latitude_nadir': ('Optional[str]', 'latitude_nadir'), 'longitude': ('Optional[str]', 'LONGITUDE'), 'longitude_nadir': ('Optional[str]', 'longitude_nadir'), 'orf': ('Optional[PassIndexer | str]', None), 'pass_number': ('Optional[str]', 'PASS_NUMBER'), 'reference_track': ('Optional[ReferenceTrackType]', None), 'select_clip': ('Optional[str]', None), 'select_shape': ('Optional[str | gpd.GeoDataFrame | shg.Polygon]', None), 'swath_lines': ('Optional[str]', 'num_lines'), 'swath_pixels': ('Optional[str]', 'num_pixels'), 'time': ('Optional[str]', 'time')}
REQUIRED_PARAMETERS: list[str] = ['date_start', 'date_end']
RESOURCES: ClassVar[ong_data_cls.CLSResourcesManager] = <octantng.data.cls.resources.CLSResourcesManager object>
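
The three parameter-related class attributes above (KNOWN_PARAMETERS, REQUIRED_PARAMETERS, FORBIDDEN_PARAMETERS) suggest how constructor keyword arguments could be vetted. The sketch below is a standalone illustration of that pattern, not casys code: `check_parameters` is an invented helper, the dictionaries are trimmed copies of the documented values, and when casys actually performs such checks is not stated here.

```python
from typing import Any

# Trimmed copies of the documented class attributes: name -> (type hint, default).
KNOWN_PARAMETERS: dict[str, tuple[Any, Any]] = {
    "date_start": ("Optional[DateType]", None),
    "date_end": ("Optional[DateType]", None),
    "time": ("Optional[str]", "time"),
    "longitude": ("Optional[str]", "LONGITUDE"),
    "latitude": ("Optional[str]", "LATITUDE"),
}
REQUIRED_PARAMETERS: list[str] = ["date_start", "date_end"]
FORBIDDEN_PARAMETERS: list[str] = []


def check_parameters(**kwargs: Any) -> dict[str, Any]:
    """Hypothetical helper: validate kwargs and fill in documented defaults."""
    unknown = sorted(set(kwargs) - set(KNOWN_PARAMETERS))
    if unknown:
        raise ValueError(f"Unknown parameters: {unknown}")
    forbidden = sorted(set(kwargs) & set(FORBIDDEN_PARAMETERS))
    if forbidden:
        raise ValueError(f"Forbidden parameters: {forbidden}")
    # Start from the defaults, then apply what the caller provided.
    resolved = {name: default for name, (_, default) in KNOWN_PARAMETERS.items()}
    resolved.update(kwargs)
    missing = [name for name in REQUIRED_PARAMETERS if resolved[name] is None]
    if missing:
        raise ValueError(f"Missing required parameters: {missing}")
    return resolved


params = check_parameters(date_start="2020-01-01", date_end="2020-02-01")
```

Here `params["longitude"]` falls back to the documented default `'LONGITUDE'`, while omitting `date_end` raises an error because it is listed as required.
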
apply_clip(data)

Apply reader’s selection clip to the provided data.

Parameters:

data (Dataset) – Data on which to apply selection.

Return type:

Dataset

Returns:

Selected data.

apply_shape(data, with_clip=False)

Apply reader’s shape selection to the provided data.

Parameters:
  • data (Dataset) – Data from which to select data.

  • with_clip (bool) – Whether pre-selection with clip has to be done or not.

Return type:

Dataset

Returns:

Dataset reduced to the reader’s shape.

check_empty()

Check whether the reader contains data or not.

check_field(field, fields_ext=None, fill_properties=False)

Check the provided field validity.

Parameters:
  • field (Field) – Field to check.

  • fields_ext (list[str | Hashable] | None) – Additional valid external fields.

  • fill_properties (bool) – Whether to fill missing non-clip field’s properties or not.

Raises:

CasysReaderError – If the provided field is not valid.

check_fields_source(fields)

Check that the provided fields exist in this reader.

Parameters:

fields (Sequence[str]) – Fields to check.

Raises:

CasysReaderError – If one or more fields do not exist.

close()

Close used resources.

computation_dates(index, freq)

Computation starting and ending time for the provided parameters.

Parameters:
  • index (int | None) – Period’s number.

  • freq (FreqType) – Computation frequency.

Return type:

tuple[datetime64 | None, datetime64 | None]

Returns:

(start, end) dates.

dask_reader()

Return a version of the reader that might be used on a dask worker.

Parameters:

kwargs – Additional reader parameters.

Return type:

CLSReader

Returns:

Dask worker compatible reader.

property data: Dataset | None

Full reader’s data.

property date_end: DateHandler
property date_np_end: datetime64 | None
property date_np_start: datetime64 | None
property date_start: DateHandler
property field_cross_track_distance: Field | None
property field_cycle_number: Field | None
property field_lat: Field | None
property field_lat_nadir: Field | None
property field_lon: Field | None
property field_lon_nadir: Field | None
property field_pass_number: Field | None
property field_time: Field | None
property fields

Returns the dictionary of existing fields in the source.

Returns:

Dictionary of existing fields as Field objects.

property ges_table_dir: str

GES_TABLE_DIR set by this reader.

abstract get_table()

Instantiate a CLS table.

Return type:

TableMeasure | TableInSitu

Returns:

New instance of a CLS table.

have_special_fields(fields)

Check whether provided fields are set or not.

Parameters:

fields (list[FieldType]) – Fields to check.

Return type:

bool

Returns:

True if fields are set, False otherwise.

property index: str

Name of the index.

initialize(check_coords=True, force=True, time_extension=False, real_start=None)

Check and initialize reader’s parameters.

Parameters:
  • check_coords (bool) – Whether to check coordinates or not.

  • force (bool) – Whether to initialize if already initialized or not.

  • time_extension (bool) – Whether reading before or after the initialization dates is allowed or not.

  • real_start (datetime64) – Real starting time of this processing. This is used to normalize interpolation index.

light_reader(freq, start, end)

Return a light version of the reader (that might be scattered on dask).

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

Return type:

CasysReader

Returns:

Lightened and dask compatible reader.

property orf: PassIndexer | None
property periods: list[Period]
pre_computed_diagnostics()

List of pre-computed diagnostics (mainly used for stored diagnostics).

Return type:

list[MethodCaller]

Returns:

List of pre-computed diagnostics.

read_data(fields, start=None, end=None, period=None, include_end=True)

Read the requested fields and rename them according to the dictionary.

Parameters:
  • fields (dict[str, Field]) – Dictionary of fields names matched to their source.

  • start (datetime64 | None) – Starting date of the data to get.

  • end (datetime64 | None) – Ending date of the data to get.

  • period (int) – Period’s number to read.

  • include_end (bool) – Whether to include the end date or not.

Returns:

Fields content as a Dataset
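
The `start`, `end` and `include_end` parameters describe a time window whose upper bound may be closed or open. The following is a minimal sketch of that selection rule on plain Python datetimes, assuming `include_end=True` means samples at exactly `end` are kept; `select_window` is a hypothetical helper and not part of casys.

```python
from datetime import datetime
from typing import Optional


def select_window(
    times: list[datetime],
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
    include_end: bool = True,
) -> list[datetime]:
    """Keep the samples inside [start, end] or [start, end)."""
    selected = []
    for t in times:
        if start is not None and t < start:
            continue
        if end is not None and (t > end or (t == end and not include_end)):
            continue
        selected.append(t)
    return selected


times = [datetime(2020, 1, day) for day in (1, 2, 3)]
closed = select_window(times, datetime(2020, 1, 1), datetime(2020, 1, 3))
half_open = select_window(
    times, datetime(2020, 1, 1), datetime(2020, 1, 3), include_end=False
)
```

With a closed window all three samples are kept; with `include_end=False` the sample falling exactly on `end` is dropped.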

property real_start: datetime64 | None
property reference_track: ReferenceTrack | None
reset_periods()

Reset the reader’s periods to None.

property select_clip: str | None
property select_shape: GeoDataFrame | None
set_dask_processing(freq, start, end, reference=None)

Do whatever needs to be done in case of a dask usage.

set_parameters(*, date_start=None, date_end=None, select_clip=None, select_shape=None, orf=None, reference_track=None, swath=False, **kwargs)

Set the reader’s parameters if they are unset.

Parameters:
  • date_start (DateHandler | None) – Starting date of the interval we’re working on.

  • date_end (DateHandler | None) – Ending date of the interval we’re working on.

  • select_clip (str | None) – Selection clip allowing to work on a subset of the source’s data.

  • select_shape (str | GeoDataFrame | Polygon | None) – Shape file, GeoDataFrame or Geometry to select.

  • orf (PassIndexer | str | None) – Source’s indexer.

  • reference_track (ReferenceTrack | None) – Reference track.

  • swath (bool) – Whether this reader contains swath type data or not.

  • kwargs – Special fields.

classmethod set_signature()

Fix the class initialization signature.

abstract property source: str

Reader’s source information.

special_field(ftype)
Parameters:

ftype (FieldType)

Return type:

Field | None

special_field_name(ftype)
Parameters:

ftype (FieldType)

Return type:

str | None

property special_fields: dict[FieldType, Field]
class casys.readers.CLSTableInSituReader(sensor_type, sensor_name, ges_table_dir=None, *, date_start=None, date_end=None, select_clip=None, select_shape=None, data_cleaner=None, orf=None, reference_track=None, time='time', longitude='LONGITUDE', latitude='LATITUDE', longitude_nadir='longitude_nadir', latitude_nadir='latitude_nadir', cycle_number='CYCLE_NUMBER', pass_number='PASS_NUMBER', cross_track_distance='cross_track_distance', swath_lines='num_lines', swath_pixels='num_pixels')

Bases: CLSReader

OCTANT TableInSitu data reader.

Parameters:
  • sensor_type (str) – Type of the in situ sensor.

  • sensor_name (str) – Name of the in situ sensor.

  • ges_table_dir (str | None) – Path of the GES_TABLE_DIR to use.

  • date_start – Starting date of the interval we’re working on.

  • date_end – Ending date of the interval we’re working on.

  • select_clip – Selection clip allowing to work on a subset of the source’s data.

  • select_shape – Shape file, GeoDataFrame or Geometry to select.

  • data_cleaner – Data cleaning applied just after reading. This cleaning might consist of sorting, removing duplicates, or removing indexes so that the remaining ones keep increasing.

  • orf – Source’s indexer.

  • reference_track – Reference track.

  • time – Time field.

  • longitude – Longitude field.

  • latitude – Latitude field.
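
The data_cleaner description names three typical operations: sorting, duplicate removal, and dropping samples so that the index keeps increasing. A standalone sketch of those operations on `(time, value)` pairs follows; `clean_samples` and its `sort` flag are illustrative assumptions and do not reflect the casys DataCleaner interface.

```python
def clean_samples(
    samples: list[tuple[float, float]], sort: bool = True
) -> list[tuple[float, float]]:
    """Return samples whose time index is strictly increasing.

    With sort=True the samples are ordered by time first, so only
    duplicated times are dropped. With sort=False the original order is
    kept and any sample that does not advance the index is dropped.
    """
    ordered = sorted(samples, key=lambda s: s[0]) if sort else list(samples)
    cleaned: list[tuple[float, float]] = []
    for time, value in ordered:
        if cleaned and time <= cleaned[-1][0]:
            continue  # duplicate or backward step: skip it
        cleaned.append((time, value))
    return cleaned


raw = [(0.0, 1.0), (2.0, 3.0), (1.0, 2.0), (2.0, 9.0), (3.0, 4.0)]
sorted_clean = clean_samples(raw)
ordered_clean = clean_samples(raw, sort=False)
```

Sorting first keeps the out-of-order sample at time 1.0, whereas the order-preserving variant discards it along with the duplicate at time 2.0.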

FIELDS_SOURCE_FULL_CHECK: bool = True
FORBIDDEN_PARAMETERS: list[str] = []
KNOWN_PARAMETERS: dict[str, tuple[Any, Any]] = {'cross_track_distance': ('Optional[str]', 'cross_track_distance'), 'cycle_number': ('Optional[str]', 'CYCLE_NUMBER'), 'data_cleaner': ('Optional[DataCleaner]', None), 'date_end': ('Optional[DateType]', None), 'date_start': ('Optional[DateType]', None), 'latitude': ('Optional[str]', 'LATITUDE'), 'latitude_nadir': ('Optional[str]', 'latitude_nadir'), 'longitude': ('Optional[str]', 'LONGITUDE'), 'longitude_nadir': ('Optional[str]', 'longitude_nadir'), 'orf': ('Optional[PassIndexer | str]', None), 'pass_number': ('Optional[str]', 'PASS_NUMBER'), 'reference_track': ('Optional[ReferenceTrackType]', None), 'select_clip': ('Optional[str]', None), 'select_shape': ('Optional[str | gpd.GeoDataFrame | shg.Polygon]', None), 'swath_lines': ('Optional[str]', 'num_lines'), 'swath_pixels': ('Optional[str]', 'num_pixels'), 'time': ('Optional[str]', 'time')}
REQUIRED_PARAMETERS: list[str] = ['date_start', 'date_end']
RESOURCES: ClassVar[ong_data_cls.CLSResourcesManager] = <octantng.data.cls.resources.CLSResourcesManager object>
apply_clip(data)

Apply reader’s selection clip to the provided data.

Parameters:

data (Dataset) – Data on which to apply selection.

Return type:

Dataset

Returns:

Selected data.

apply_shape(data, with_clip=False)

Apply reader’s shape selection to the provided data.

Parameters:
  • data (Dataset) – Data from which to select data.

  • with_clip (bool) – Whether pre-selection with clip has to be done or not.

Return type:

Dataset

Returns:

Dataset reduced to the reader’s shape.

check_empty()

Check whether the reader contains data or not.

check_field(field, fields_ext=None, fill_properties=False)

Check the provided field validity.

Parameters:
  • field (Field) – Field to check.

  • fields_ext (list[str | Hashable] | None) – Additional valid external fields.

  • fill_properties (bool) – Whether to fill missing non-clip field’s properties or not.

Raises:

CasysReaderError – If the provided field is not valid.

check_fields_source(fields)

Check that the provided fields exist in this reader.

Parameters:

fields (Sequence[str]) – Fields to check.

Raises:

CasysReaderError – If one or more fields do not exist.

close()

Close used resources.

computation_dates(index, freq)

Computation starting and ending time for the provided parameters.

Parameters:
  • index (int | None) – Period’s number.

  • freq (FreqType) – Computation frequency.

Return type:

tuple[datetime64 | None, datetime64 | None]

Returns:

(start, end) dates.

dask_reader()

Return a version of the reader that might be used on a dask worker.

Parameters:

kwargs – Additional reader parameters.

Return type:

CLSReader

Returns:

Dask worker compatible reader.

property data: Dataset | None

Full reader’s data.

property date_end: DateHandler
property date_np_end: datetime64 | None
property date_np_start: datetime64 | None
property date_start: DateHandler
property field_cross_track_distance: Field | None
property field_cycle_number: Field | None
property field_lat: Field | None
property field_lat_nadir: Field | None
property field_lon: Field | None
property field_lon_nadir: Field | None
property field_pass_number: Field | None
property field_time: Field | None
property fields

Returns the dictionary of existing fields in the source.

Returns:

Dictionary of existing fields as Field objects.

property ges_table_dir: str

GES_TABLE_DIR set by this reader.

get_table()

Instantiate a CLS table.

Return type:

TableMeasure | TableInSitu

Returns:

New instance of a CLS table.

have_special_fields(fields)

Check whether provided fields are set or not.

Parameters:

fields (list[FieldType]) – Fields to check.

Return type:

bool

Returns:

True if fields are set, False otherwise.

property index: str

Name of the index.

initialize(check_coords=True, force=True, time_extension=False, real_start=None)

Check and initialize reader’s parameters.

Parameters:
  • check_coords (bool) – Whether to check coordinates or not.

  • force (bool) – Whether to initialize if already initialized or not.

  • time_extension (bool) – Whether reading before or after the initialization dates is allowed or not.

  • real_start (datetime64) – Real starting time of this processing. This is used to normalize interpolation index.

light_reader(freq, start, end)

Return a light version of the reader (that might be scattered on dask).

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

Return type:

CasysReader

Returns:

Lightened and dask compatible reader.

property orf: PassIndexer | None
property periods: list[Period]
pre_computed_diagnostics()

List of pre-computed diagnostics (mainly used for stored diagnostics).

Return type:

list[MethodCaller]

Returns:

List of pre-computed diagnostics.

read_data(fields, start=None, end=None, period=None, include_end=True)

Read the requested fields and rename them according to the dictionary.

Parameters:
  • fields (dict[str, Field]) – Dictionary of fields names matched to their source.

  • start (datetime64 | None) – Starting date of the data to get.

  • end (datetime64 | None) – Ending date of the data to get.

  • period (int) – Period’s number to read.

  • include_end (bool) – Whether to include the end date or not.

Returns:

Fields content as a Dataset

property real_start: datetime64 | None
property reference_track: ReferenceTrack | None
reset_periods()

Reset the reader’s periods to None.

property select_clip: str | None
property select_shape: GeoDataFrame | None
property sensor_name: str

Name of the in situ sensor.

property sensor_type: str

Type of the in situ sensor.

set_dask_processing(freq, start, end, reference=None)

Do whatever needs to be done in case of a dask usage.

set_parameters(*, date_start=None, date_end=None, select_clip=None, select_shape=None, orf=None, reference_track=None, swath=False, **kwargs)

Set the reader’s parameters if they are unset.

Parameters:
  • date_start (DateHandler | None) – Starting date of the interval we’re working on.

  • date_end (DateHandler | None) – Ending date of the interval we’re working on.

  • select_clip (str | None) – Selection clip allowing to work on a subset of the source’s data.

  • select_shape (str | GeoDataFrame | Polygon | None) – Shape file, GeoDataFrame or Geometry to select.

  • orf (PassIndexer | str | None) – Source’s indexer.

  • reference_track (ReferenceTrack | None) – Reference track.

  • swath (bool) – Whether this reader contains swath type data or not.

  • kwargs – Special fields.

classmethod set_signature()

Fix the class initialization signature.

property source: str

Reader’s source information.

special_field(ftype)
Parameters:

ftype (FieldType)

Return type:

Field | None

special_field_name(ftype)
Parameters:

ftype (FieldType)

Return type:

str | None

property special_fields: dict[FieldType, Field]
class casys.readers.CLSTableReader(name, ges_table_dir=None, *, date_start=None, date_end=None, select_clip=None, select_shape=None, data_cleaner=None, orf=None, reference_track=None, time='time', longitude='LONGITUDE', latitude='LATITUDE', longitude_nadir='longitude_nadir', latitude_nadir='latitude_nadir', cycle_number='CYCLE_NUMBER', pass_number='PASS_NUMBER', cross_track_distance='cross_track_distance', swath_lines='num_lines', swath_pixels='num_pixels')

Bases: CLSReader

OCTANT CLS TableMeasure data reader.

Parameters:
  • name (str) – Table’s name.

  • ges_table_dir (str | None) – Path of the GES_TABLE_DIR to use.

  • date_start – Starting date of the interval we’re working on.

  • date_end – Ending date of the interval we’re working on.

  • select_clip – Selection clip allowing to work on a subset of the source’s data.

  • select_shape – Shape file, GeoDataFrame or Geometry to select.

  • data_cleaner – Data cleaning applied just after reading. This cleaning might consist of sorting, removing duplicates, or removing indexes so that the remaining ones keep increasing.

  • orf – Source’s indexer.

  • reference_track – Reference track.

  • time – Time field.

  • longitude – Longitude field.

  • latitude – Latitude field.

FIELDS_SOURCE_FULL_CHECK: bool = True
FORBIDDEN_PARAMETERS: list[str] = []
KNOWN_PARAMETERS: dict[str, tuple[Any, Any]] = {'cross_track_distance': ('Optional[str]', 'cross_track_distance'), 'cycle_number': ('Optional[str]', 'CYCLE_NUMBER'), 'data_cleaner': ('Optional[DataCleaner]', None), 'date_end': ('Optional[DateType]', None), 'date_start': ('Optional[DateType]', None), 'latitude': ('Optional[str]', 'LATITUDE'), 'latitude_nadir': ('Optional[str]', 'latitude_nadir'), 'longitude': ('Optional[str]', 'LONGITUDE'), 'longitude_nadir': ('Optional[str]', 'longitude_nadir'), 'orf': ('Optional[PassIndexer | str]', None), 'pass_number': ('Optional[str]', 'PASS_NUMBER'), 'reference_track': ('Optional[ReferenceTrackType]', None), 'select_clip': ('Optional[str]', None), 'select_shape': ('Optional[str | gpd.GeoDataFrame | shg.Polygon]', None), 'swath_lines': ('Optional[str]', 'num_lines'), 'swath_pixels': ('Optional[str]', 'num_pixels'), 'time': ('Optional[str]', 'time')}
REQUIRED_PARAMETERS: list[str] = ['date_start', 'date_end']
RESOURCES: ClassVar[ong_data_cls.CLSResourcesManager] = <octantng.data.cls.resources.CLSResourcesManager object>
apply_clip(data)

Apply reader’s selection clip to the provided data.

Parameters:

data (Dataset) – Data on which to apply selection.

Return type:

Dataset

Returns:

Selected data.

apply_shape(data, with_clip=False)

Apply reader’s shape selection to the provided data.

Parameters:
  • data (Dataset) – Data from which to select data.

  • with_clip (bool) – Whether pre-selection with clip has to be done or not.

Return type:

Dataset

Returns:

Dataset reduced to the reader’s shape.

check_empty()

Check whether the reader contains data or not.

check_field(field, fields_ext=None, fill_properties=False)

Check the provided field validity.

Parameters:
  • field (Field) – Field to check.

  • fields_ext (list[str | Hashable] | None) – Additional valid external fields.

  • fill_properties (bool) – Whether to fill missing non-clip field’s properties or not.

Raises:

CasysReaderError – If the provided field is not valid.

check_fields_source(fields)

Check that the provided fields exist in this reader.

Parameters:

fields (Sequence[str]) – Fields to check.

Raises:

CasysReaderError – If one or more fields do not exist.

close()

Close used resources.

computation_dates(index, freq)

Computation starting and ending time for the provided parameters.

Parameters:
  • index (int | None) – Period’s number.

  • freq (FreqType) – Computation frequency.

Return type:

tuple[datetime64 | None, datetime64 | None]

Returns:

(start, end) dates.

dask_reader()

Return a version of the reader that might be used on a dask worker.

Parameters:

kwargs – Additional reader parameters.

Return type:

CLSReader

Returns:

Dask worker compatible reader.

property data: Dataset | None

Full reader’s data.

property date_end: DateHandler
property date_np_end: datetime64 | None
property date_np_start: datetime64 | None
property date_start: DateHandler
property field_cross_track_distance: Field | None
property field_cycle_number: Field | None
property field_lat: Field | None
property field_lat_nadir: Field | None
property field_lon: Field | None
property field_lon_nadir: Field | None
property field_pass_number: Field | None
property field_time: Field | None
property fields

Returns the dictionary of existing fields in the source.

Returns:

Dictionary of existing fields as Field objects.

property ges_table_dir: str

GES_TABLE_DIR set by this reader.

get_table()

Instantiate a CLS table.

Return type:

TableMeasure | TableInSitu

Returns:

New instance of a CLS table.

have_special_fields(fields)

Check whether provided fields are set or not.

Parameters:

fields (list[FieldType]) – Fields to check.

Return type:

bool

Returns:

True if fields are set, False otherwise.

property index: str

Name of the index.

initialize(check_coords=True, force=True, time_extension=False, real_start=None)

Check and initialize reader’s parameters.

Parameters:
  • check_coords (bool) – Whether to check coordinates or not.

  • force (bool) – Whether to initialize if already initialized or not.

  • time_extension (bool) – Whether reading before or after the initialization dates is allowed or not.

  • real_start (datetime64) – Real starting time of this processing. This is used to normalize interpolation index.

light_reader(freq, start, end)

Return a light version of the reader (that might be scattered on dask).

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

Return type:

CasysReader

Returns:

Lightened and dask compatible reader.

property name: str

Table’s name.

property orf: PassIndexer | None
property periods: list[Period]
pre_computed_diagnostics()

List of pre-computed diagnostics (mainly used for stored diagnostics).

Return type:

list[MethodCaller]

Returns:

List of pre-computed diagnostics.

read_data(fields, start=None, end=None, period=None, include_end=True)

Read the requested fields and rename them according to the dictionary.

Parameters:
  • fields (dict[str, Field]) – Dictionary of fields names matched to their source.

  • start (datetime64 | None) – Starting date of the data to get.

  • end (datetime64 | None) – Ending date of the data to get.

  • period (int) – Period’s number to read.

  • include_end (bool) – Whether to include the end date or not.

Returns:

Fields content as a Dataset

property real_start: datetime64 | None
property reference_track: ReferenceTrack | None
reset_periods()

Reset the reader’s periods to None.

property select_clip: str | None
property select_shape: GeoDataFrame | None
set_dask_processing(freq, start, end, reference=None)

Do whatever needs to be done in case of a dask usage.

set_parameters(*, date_start=None, date_end=None, select_clip=None, select_shape=None, orf=None, reference_track=None, swath=False, **kwargs)

Set the reader’s parameters if they are unset.

Parameters:
  • date_start (DateHandler | None) – Starting date of the interval we’re working on.

  • date_end (DateHandler | None) – Ending date of the interval we’re working on.

  • select_clip (str | None) – Selection clip allowing to work on a subset of the source’s data.

  • select_shape (str | GeoDataFrame | Polygon | None) – Shape file, GeoDataFrame or Geometry to select.

  • orf (PassIndexer | str | None) – Source’s indexer.

  • reference_track (ReferenceTrack | None) – Reference track.

  • swath (bool) – Whether this reader contains swath type data or not.

  • kwargs – Special fields.

classmethod set_signature()

Fix the class initialization signature.

property source: str

Reader’s source information.

special_field(ftype)
Parameters:

ftype (FieldType)

Return type:

Field | None

special_field_name(ftype)
Parameters:

ftype (FieldType)

Return type:

str | None

property special_fields: dict[FieldType, Field]
class casys.readers.CasysReader(date_start=None, date_end=None, select_clip=None, select_shape=None, data_cleaner=None, orf=None, reference_track=None, *, time='time', longitude='LONGITUDE', latitude='LATITUDE', longitude_nadir='longitude_nadir', latitude_nadir='latitude_nadir', cycle_number='CYCLE_NUMBER', pass_number='PASS_NUMBER', cross_track_distance='cross_track_distance', swath_lines='num_lines', swath_pixels='num_pixels')

Bases: ABC

Abstract class representing a casys data reader.
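
The members marked abstract in this class (close, dask_reader, fields, read_data and source) are what a concrete reader has to provide. The stand-in below mirrors that contract with Python's `abc` module. `ReaderBase` and `ListReader` are hypothetical names, `dask_reader` is left out for brevity, and the method bodies are placeholders rather than casys behavior.

```python
from abc import ABC, abstractmethod
from typing import Any


class ReaderBase(ABC):
    """Hypothetical stand-in for an abstract reader."""

    @property
    @abstractmethod
    def source(self) -> str:
        """Reader's source information."""

    @property
    @abstractmethod
    def fields(self) -> dict[str, Any]:
        """Existing fields in the source."""

    @abstractmethod
    def read_data(self, fields: dict[str, str]) -> dict[str, list[float]]:
        """Read the requested fields and rename them."""

    @abstractmethod
    def close(self) -> None:
        """Close used resources."""


class ListReader(ReaderBase):
    """Concrete reader over an in-memory dict of lists."""

    def __init__(self, data: dict[str, list[float]]) -> None:
        self._data = data

    @property
    def source(self) -> str:
        return "in-memory"

    @property
    def fields(self) -> dict[str, Any]:
        return {name: name for name in self._data}

    def read_data(self, fields: dict[str, str]) -> dict[str, list[float]]:
        # Keys are the output names, values the names in the source.
        return {new: self._data[old] for new, old in fields.items()}

    def close(self) -> None:
        self._data = {}


reader = ListReader({"LONGITUDE": [1.0, 2.0]})
renamed = reader.read_data({"lon": "LONGITUDE"})
```

Instantiating `ReaderBase` directly raises `TypeError`, which is the same mechanism that prevents an abstract reader class from being used without a concrete subclass.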

FIELDS_SOURCE_FULL_CHECK: bool = True
FORBIDDEN_PARAMETERS: list[str] = []
KNOWN_PARAMETERS: dict[str, tuple[Any, Any]] = {'cross_track_distance': ('Optional[str]', 'cross_track_distance'), 'cycle_number': ('Optional[str]', 'CYCLE_NUMBER'), 'data_cleaner': ('Optional[DataCleaner]', None), 'date_end': ('Optional[DateType]', None), 'date_start': ('Optional[DateType]', None), 'latitude': ('Optional[str]', 'LATITUDE'), 'latitude_nadir': ('Optional[str]', 'latitude_nadir'), 'longitude': ('Optional[str]', 'LONGITUDE'), 'longitude_nadir': ('Optional[str]', 'longitude_nadir'), 'orf': ('Optional[PassIndexer | str]', None), 'pass_number': ('Optional[str]', 'PASS_NUMBER'), 'reference_track': ('Optional[ReferenceTrackType]', None), 'select_clip': ('Optional[str]', None), 'select_shape': ('Optional[str | gpd.GeoDataFrame | shg.Polygon]', None), 'swath_lines': ('Optional[str]', 'num_lines'), 'swath_pixels': ('Optional[str]', 'num_pixels'), 'time': ('Optional[str]', 'time')}
REQUIRED_PARAMETERS: list[str] = []
RESOURCES: ClassVar[CLSResourcesManager] = <octantng.data.cls.resources.CLSResourcesManager object>
apply_clip(data)

Apply reader’s selection clip to the provided data.

Parameters:

data (Dataset) – Data on which to apply selection.

Return type:

Dataset

Returns:

Selected data.

apply_shape(data, with_clip=False)

Apply reader’s shape selection to the provided data.

Parameters:
  • data (Dataset) – Data from which to select data.

  • with_clip (bool) – Whether pre-selection with clip has to be done or not.

Return type:

Dataset

Returns:

Dataset reduced to the reader’s shape.

check_empty()

Check whether the reader contains data or not.

check_field(field, fields_ext=None, fill_properties=False)

Check the provided field validity.

Parameters:
  • field (Field) – Field to check.

  • fields_ext (list[str | Hashable] | None) – Additional valid external fields.

  • fill_properties (bool) – Whether to fill missing non-clip field’s properties or not.

Raises:

CasysReaderError – If the provided field is not valid.

check_fields_source(fields)

Check that the provided fields exist in this reader.

Parameters:

fields (Sequence[str]) – Fields to check.

Raises:

CasysReaderError – If one or more fields do not exist.

abstract close()

Close used resources.

computation_dates(index, freq)

Computation starting and ending time for the provided parameters.

Parameters:
  • index (int | None) – Period’s number.

  • freq (FreqType) – Computation frequency.

Return type:

tuple[datetime64 | None, datetime64 | None]

Returns:

(start, end) dates.

abstract dask_reader(**kwargs)

Return a version of the reader that might be used on a dask worker.

Parameters:

kwargs – Additional reader parameters.

Return type:

CasysReader

Returns:

Dask worker compatible reader.

property data: Dataset | None

Full reader’s data.

property date_end: DateHandler
property date_np_end: datetime64 | None
property date_np_start: datetime64 | None
property date_start: DateHandler
property field_cross_track_distance: Field | None
property field_cycle_number: Field | None
property field_lat: Field | None
property field_lat_nadir: Field | None
property field_lon: Field | None
property field_lon_nadir: Field | None
property field_pass_number: Field | None
property field_time: Field | None
abstract property fields: dict[str, Field]

Returns the dictionary of existing fields in the source.

Returns:

Dictionary of existing fields as Field objects.

have_special_fields(fields)

Check whether provided fields are set or not.

Parameters:

fields (list[FieldType]) – Fields to check.

Return type:

bool

Returns:

True if fields are set, False otherwise.

property index: str

Name of the index.

initialize(check_coords=True, force=True, time_extension=False, real_start=None)

Check and initialize reader’s parameters.

Parameters:
  • check_coords (bool) – Whether to check coordinates or not.

  • force (bool) – Whether to initialize if already initialized or not.

  • time_extension (bool) – Whether reading before or after the initialization dates is allowed or not.

  • real_start (datetime64 | None) – Real starting time of this processing. This is used to normalize interpolation index.

light_reader(freq, start, end)

Return a light version of the reader (that might be scattered on dask).

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

Return type:

CasysReader

Returns:

Lightened and dask compatible reader.

property orf: PassIndexer | None
property periods: list[Period]
pre_computed_diagnostics()

List of pre-computed diagnostics (mainly used for stored diagnostics).

Return type:

list[MethodCaller]

Returns:

List of pre-computed diagnostics.

abstract read_data(fields, start=None, end=None, period=None, include_end=True)

Read the requested fields and rename them according to the dictionary.

Parameters:
  • fields (dict[str, Field]) – Dictionary of fields names matched to their source.

  • start (datetime64 | None) – Starting date of the data to get.

  • end (datetime64 | None) – Ending date of the data to get.

  • period (int | None) – Period’s number to read.

  • include_end (bool) – Whether to include the end date or not.

Return type:

Dataset

Returns:

Fields content as a Dataset

property real_start: datetime64 | None
property reference_track: ReferenceTrack | None
reset_periods()

Reset the reader’s periods to None.

property select_clip: str | None
property select_shape: GeoDataFrame | None
set_dask_processing(freq, start, end, reference=None)

Do whatever needs to be done in case of a dask usage.

set_parameters(*, date_start=None, date_end=None, select_clip=None, select_shape=None, orf=None, reference_track=None, swath=False, **kwargs)

Set the reader’s parameters that are not already set.

Parameters:
  • date_start (DateHandler | None) – Starting date of the interval we’re working on.

  • date_end (DateHandler | None) – Ending date of the interval we’re working on.

  • select_clip (str | None) – Selection clip allowing to work on a subset of the source’s data.

  • select_shape (str | GeoDataFrame | Polygon | None) – Shape file, GeoDataFrame or Geometry to select.

  • orf (PassIndexer | str | None) – Source’s indexer.

  • reference_track (ReferenceTrack | None) – Reference track.

  • swath (bool) – Whether this reader contains swath type data or not.

  • kwargs – Special fields.

classmethod set_signature()

Fix the class initialization signature.

abstract property source: str

Reader’s source information.

special_field(ftype)
Parameters:

ftype (FieldType)

Return type:

Field | None

special_field_name(ftype)
Parameters:

ftype (FieldType)

Return type:

str | None

property special_fields: dict[FieldType, Field]
exception casys.readers.CasysReaderError

Bases: AltiDataError

Exception raised when a problem related to data reading occurs.

add_note(object, /)

Exception.add_note(note) – add a note to the exception

args
with_traceback(object, /)

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class casys.readers.DatasetReader(data=None, data_path=None, backend_fields=None, backend_kwargs=None, *, date_start=None, date_end=None, select_clip=None, select_shape=None, data_cleaner=None, orf=None, reference_track=None, time='time', longitude='LONGITUDE', latitude='LATITUDE', longitude_nadir='longitude_nadir', latitude_nadir='latitude_nadir', cycle_number='CYCLE_NUMBER', pass_number='PASS_NUMBER', cross_track_distance='cross_track_distance', swath_lines='num_lines', swath_pixels='num_pixels')

Bases: CasysReader

xarray Dataset reader.

Parameters:
  • data (Dataset | None) – Dataset.

  • data_path (str | list[str] | None) – Dataset’s file(s) path.

  • backend_fields (list[str] | None) – List of fields (variables) to read.

  • backend_kwargs (dict[str, Any] | None) – kwargs to provide to the backend when using a data_path to load the data.

  • date_start – Starting date of the interval we’re working on.

  • date_end – Ending date of the interval we’re working on.

  • select_clip – Selection clip allowing to work on a subset of the source’s data.

  • select_shape – Shape file, GeoDataFrame or Geometry to select.

  • data_cleaner – Data cleaning applied just after the reader. This cleaning might consist of sorting, duplication removal or removing indexes in order to keep them increasing.

  • orf – Source’s indexer.

  • reference_track – Reference track.

  • time – Time field.

  • longitude – Longitude field.

  • latitude – Latitude field.

  • swath_lines – Swath main dimension.

  • swath_pixels – Swath cross_track dimension.

  • cycle_number – The cycle number field.

  • pass_number – The pass number field.

  • longitude_nadir – The nadir’s longitude field.

  • latitude_nadir – The nadir’s latitude field.

  • cross_track_distance – Cross track distance field.

FIELDS_SOURCE_FULL_CHECK: bool = True
FORBIDDEN_PARAMETERS: list[str] = []
KNOWN_PARAMETERS: dict[str, tuple[Any, Any]] = {'cross_track_distance': ('Optional[str]', 'cross_track_distance'), 'cycle_number': ('Optional[str]', 'CYCLE_NUMBER'), 'data_cleaner': ('Optional[DataCleaner]', None), 'date_end': ('Optional[DateType]', None), 'date_start': ('Optional[DateType]', None), 'latitude': ('Optional[str]', 'LATITUDE'), 'latitude_nadir': ('Optional[str]', 'latitude_nadir'), 'longitude': ('Optional[str]', 'LONGITUDE'), 'longitude_nadir': ('Optional[str]', 'longitude_nadir'), 'orf': ('Optional[PassIndexer | str]', None), 'pass_number': ('Optional[str]', 'PASS_NUMBER'), 'reference_track': ('Optional[ReferenceTrackType]', None), 'select_clip': ('Optional[str]', None), 'select_shape': ('Optional[str | gpd.GeoDataFrame | shg.Polygon]', None), 'swath_lines': ('Optional[str]', 'num_lines'), 'swath_pixels': ('Optional[str]', 'num_pixels'), 'time': ('Optional[str]', 'time')}
REQUIRED_PARAMETERS: list[str] = []
RESOURCES: ClassVar[ong_data_cls.CLSResourcesManager] = <octantng.data.cls.resources.CLSResourcesManager object>
apply_clip(data)

Apply reader’s selection clip to the provided data.

Parameters:

data (Dataset) – Data on which to apply selection.

Return type:

Dataset

Returns:

Selected data.

apply_shape(data, with_clip=False)

Apply reader’s shape selection to the provided data.

Parameters:
  • data (Dataset) – Data from which to select data.

  • with_clip (bool) – Whether pre-selection with clip has to be done or not.

Return type:

Dataset

Returns:

Dataset reduced to the reader’s shape.

check_empty()

Check whether the reader contains data or not.

check_field(field, fields_ext=None, fill_properties=False)

Check the provided field validity.

Parameters:
  • field (Field) – Field to check.

  • fields_ext (list[str | Hashable] | None) – Additional valid external fields.

  • fill_properties (bool) – Whether to fill missing non-clip field’s properties or not.

Raises:

CasysReaderError – If the provided field is not valid.

check_fields_source(fields)

Check that the provided fields exist in this reader.

Parameters:

fields (Sequence[str]) – Fields to check.

Raises:

CasysReaderError – If one or more fields do not exist.
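
A minimal sketch of this kind of check, written without casys so that it runs on its own. ReaderError is a stand-in for CasysReaderError, the free function and the field names (SLA, SWH, WIND) are hypothetical, and reporting every missing field in a single message is an assumption about the behaviour, not something the documentation states.

```python
class ReaderError(Exception):
    """Stand-in for casys.readers.CasysReaderError (hypothetical)."""


def check_fields_source(requested, available):
    """Raise ReaderError naming every requested field that is not available."""
    missing = [name for name in requested if name not in available]
    if missing:
        raise ReaderError(f"Unknown field(s): {', '.join(missing)}")


available = {"time", "LONGITUDE", "LATITUDE", "SLA"}

# All requested fields exist: the check passes silently.
check_fields_source(["SLA", "time"], available)

# Two requested fields are absent: the error lists both.
try:
    check_fields_source(["SLA", "SWH", "WIND"], available)
except ReaderError as error:
    message = str(error)
```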

close()

Close used resources.

computation_dates(index, freq)

Computation starting and ending time for the provided parameters.

Parameters:
  • index (int | None) – Period’s number.

  • freq (FreqType) – Computation frequency.

Return type:

tuple[datetime64 | None, datetime64 | None]

Returns:

(start, end) dates.

dask_reader(data_dask=None)

Return a version of the reader that might be used on a dask worker.

Parameters:
  • kwargs – Additional reader parameters.

  • data_dask (Dataset | None)

Return type:

DatasetReader

Returns:

Dask worker compatible reader.

property data: Dataset | None

Full reader’s data.

property data_path: str

Dataset’s file(s) path.

property date_end: DateHandler
property date_np_end: datetime64 | None
property date_np_start: datetime64 | None
property date_start: DateHandler
property field_cross_track_distance: Field | None
property field_cycle_number: Field | None
property field_lat: Field | None
property field_lat_nadir: Field | None
property field_lon: Field | None
property field_lon_nadir: Field | None
property field_pass_number: Field | None
property field_time: Field | None
property fields

Returns the dictionary of existing fields in the source.

Returns:

Dictionary of existing fields as Field objects.

have_special_fields(fields)

Check whether provided fields are set or not.

Parameters:

fields (list[FieldType]) – Fields to check.

Return type:

bool

Returns:

True if fields are set, False otherwise.

property index: str

Name of the index.

initialize(check_coords=True, force=True, time_extension=False, real_start=None)

Check and initialize reader’s parameters.

Parameters:
  • check_coords (bool) – Whether to check coordinates or not.

  • force (bool) – Whether to initialize if already initialized or not.

  • time_extension (bool) – Whether reading before or after the initialization dates is allowed or not.

  • real_start (datetime64 | None) – Real starting time of this processing. This is used to normalize interpolation index.

light_reader(freq, start, end)

Return a light version of the reader (that might be scattered on dask).

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

Return type:

CasysReader

Returns:

Lightened and dask compatible reader.

load_data()

Load data as an xarray dataset.

Return type:

Dataset

Returns:

Loaded data as a dataset.

property orf: PassIndexer | None
property periods: list[Period]
pre_computed_diagnostics()

List of pre-computed diagnostics (mainly used for stored diagnostics).

Return type:

list[MethodCaller]

Returns:

List of pre-computed diagnostics.

read_data(fields, start=None, end=None, period=None, include_end=True)

Read the requested fields and rename them according to the dictionary.

Parameters:
  • fields (dict[str, Field]) – Dictionary of field names matched to their source.

  • start (datetime64 | None) – Starting date of the data to get.

  • end (datetime64 | None) – Ending date of the data to get.

  • period (int) – Period’s number to read.

  • include_end (bool) – Whether to include the end date or not.

Returns:

Fields content as a Dataset

property real_start: datetime64 | None
property reference_track: ReferenceTrack | None
reset_periods()

Reset the reader’s periods to None.

property select_clip: str | None
property select_shape: GeoDataFrame | None
set_dask_processing(freq, start, end, reference=None)

Do whatever needs to be done in case of a dask usage.

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

  • reference (list[tuple[int, int]]) – List of reference’s orbits.

set_parameters(*, date_start=None, date_end=None, select_clip=None, select_shape=None, orf=None, reference_track=None, swath=False, **kwargs)

Set the reader’s parameters that are not already set.

Parameters:
  • date_start (DateHandler | None) – Starting date of the interval we’re working on.

  • date_end (DateHandler | None) – Ending date of the interval we’re working on.

  • select_clip (str | None) – Selection clip allowing to work on a subset of the source’s data.

  • select_shape (str | GeoDataFrame | Polygon | None) – Shape file, GeoDataFrame or Geometry to select.

  • orf (PassIndexer | str | None) – Source’s indexer.

  • reference_track (ReferenceTrack | None) – Reference track.

  • swath (bool) – Whether this reader contains swath type data or not.

  • kwargs – Special fields.
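
One way to read "if unset" is that a value given at construction time wins over a value supplied later. The sketch below illustrates that reading only. ParameterHolder is a hypothetical class, the string values are placeholders, and the real method may apply different rules.

```python
class ParameterHolder:
    """Hypothetical holder that only fills parameters that are still None."""

    def __init__(self, date_start=None, select_clip=None):
        self.date_start = date_start
        self.select_clip = select_clip

    def set_parameters(self, **params):
        for name, value in params.items():
            # Skip empty values and never overwrite an existing setting.
            if value is not None and getattr(self, name) is None:
                setattr(self, name, value)


holder = ParameterHolder(select_clip="user clip")
holder.set_parameters(date_start="2020-01-01", select_clip="default clip")
```

After the call, date_start holds the supplied value because it was empty, while select_clip keeps the value chosen by the user.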

classmethod set_signature()

Fix the class initialization signature.

property source: str

Reader’s source information.

special_field(ftype)
Parameters:

ftype (FieldType)

Return type:

Field | None

special_field_name(ftype)
Parameters:

ftype (FieldType)

Return type:

str | None

property special_fields: dict[FieldType, Field]
class casys.readers.MultiReader(readers, markers=None, tolerance=np.timedelta64(0, 'ns'), *, date_start=None, date_end=None, select_clip=None, select_shape=None, data_cleaner=None, orf=None, reference_track=None, time='time', longitude='LONGITUDE', latitude='LATITUDE', longitude_nadir='longitude_nadir', latitude_nadir='latitude_nadir', cycle_number='CYCLE_NUMBER', pass_number='PASS_NUMBER', cross_track_distance='cross_track_distance', swath_lines='num_lines', swath_pixels='num_pixels')

Bases: CasysReader

Reader that reads from a set of readers.

The first reader is used as reference (for time and coordinates). Fields from all readers are available and prefixed by the provided markers.

Parameters:
  • readers (list[CasysReader]) – List of readers.

  • markers (list[str]) – List of field prefixes, one for each reader. Defaults to Sx_ with x being the reader’s number.

  • tolerance (timedelta64) – Gap tolerance used to fill missing indexes from a reader when aligning it on the reference reader’s index (defaults to 0).

  • date_start – Starting date of the interval we’re working on.

  • date_end – Ending date of the interval we’re working on.

  • select_clip – Selection clip allowing to work on a subset of the source’s data.

  • select_shape – Shape file, GeoDataFrame or Geometry to select.

  • data_cleaner – Data cleaning applied just after the reader. This cleaning might consist of sorting, duplication removal or removing indexes in order to keep them increasing.

  • orf – Source’s indexer.

  • reference_track – Reference track.

  • time – Time field.

  • longitude – Longitude field.

  • latitude – Latitude field.
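
The roles of markers and tolerance can be sketched without casys. The example assumes nearest-neighbour matching on the reference index, with gaps left as None when no sample lies within the tolerance. The real alignment strategy is not specified here, and align_on_reference, prefix_fields, the field name SLA and the marker S2_ are illustrative choices.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def align_on_reference(ref_times, times, values, tolerance=timedelta(0)):
    """For each reference time, take the nearest value within tolerance, else None."""
    aligned = []
    for ref in ref_times:
        pos = bisect_left(times, ref)
        # Candidates: the neighbours just before and just after the reference time.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(times)]
        best = min(candidates, key=lambda i: abs(times[i] - ref), default=None)
        if best is not None and abs(times[best] - ref) <= tolerance:
            aligned.append(values[best])
        else:
            aligned.append(None)
    return aligned


def prefix_fields(fields, marker):
    """Prefix every field name with the marker of the reader it comes from."""
    return {marker + name: content for name, content in fields.items()}


t0 = datetime(2020, 1, 1)
ref_times = [t0, t0 + timedelta(seconds=1), t0 + timedelta(seconds=2)]
other_times = [t0, t0 + timedelta(seconds=1, milliseconds=100)]
other_values = [10.0, 11.0]

# Zero tolerance: only exact time matches are kept.
exact = align_on_reference(ref_times, other_times, other_values)
# 200 ms tolerance: the sample that is 100 ms late is now accepted.
loose = align_on_reference(
    ref_times, other_times, other_values, timedelta(milliseconds=200)
)
merged = prefix_fields({"SLA": loose}, "S2_")
```

With the default tolerance of 0, only samples that share the exact reference timestamp survive, which is why a small positive tolerance is useful when two sources are sampled on slightly different clocks.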

FIELDS_SOURCE_FULL_CHECK: bool = True
FORBIDDEN_PARAMETERS: list[str] = []
KNOWN_PARAMETERS: dict[str, tuple[Any, Any]] = {'cross_track_distance': ('Optional[str]', 'cross_track_distance'), 'cycle_number': ('Optional[str]', 'CYCLE_NUMBER'), 'data_cleaner': ('Optional[DataCleaner]', None), 'date_end': ('Optional[DateType]', None), 'date_start': ('Optional[DateType]', None), 'latitude': ('Optional[str]', 'LATITUDE'), 'latitude_nadir': ('Optional[str]', 'latitude_nadir'), 'longitude': ('Optional[str]', 'LONGITUDE'), 'longitude_nadir': ('Optional[str]', 'longitude_nadir'), 'orf': ('Optional[PassIndexer | str]', None), 'pass_number': ('Optional[str]', 'PASS_NUMBER'), 'reference_track': ('Optional[ReferenceTrackType]', None), 'select_clip': ('Optional[str]', None), 'select_shape': ('Optional[str | gpd.GeoDataFrame | shg.Polygon]', None), 'swath_lines': ('Optional[str]', 'num_lines'), 'swath_pixels': ('Optional[str]', 'num_pixels'), 'time': ('Optional[str]', 'time')}
REQUIRED_PARAMETERS: list[str] = []
RESOURCES: ClassVar[ong_data_cls.CLSResourcesManager] = <octantng.data.cls.resources.CLSResourcesManager object>
apply_clip(data)

Apply reader’s selection clip to the provided data.

Parameters:

data (Dataset) – Data on which to apply selection.

Return type:

Dataset

Returns:

Selected data.

apply_shape(data, with_clip=False)

Apply reader’s shape selection to the provided data.

Parameters:
  • data (Dataset) – Data from which to select data.

  • with_clip (bool) – Whether pre-selection with clip has to be done or not.

Return type:

Dataset

Returns:

Dataset reduced to the reader’s shape.

check_empty()

Check whether the reader contains data or not.

check_field(field, fields_ext=None, fill_properties=False)

Check the provided field validity.

Parameters:
  • field (Field) – Field to check.

  • fields_ext (list[str]) – Additional valid external fields.

  • fill_properties (bool) – Whether to fill missing non-clip field’s properties or not.

Raises:

CasysReaderError – If the provided field is not valid.

check_fields_source(fields)

Check that the provided fields exist in this reader.

Parameters:

fields (Sequence[str]) – Fields to check.

Raises:

CasysReaderError – If one or more fields do not exist.

close()

Close used resources.

computation_dates(index, freq)

Computation starting and ending time for the provided parameters.

Parameters:
  • index (int | None) – Period’s number.

  • freq (FreqType) – Computation frequency.

Return type:

tuple[datetime64 | None, datetime64 | None]

Returns:

(start, end) dates.

dask_reader(**kwargs)

Return a version of the reader that might be used on a dask worker.

Parameters:

kwargs – Additional reader parameters.

Return type:

MultiReader

Returns:

Dask worker compatible reader.

property data: Dataset | None

Full reader’s data.

property date_end: DateHandler
property date_np_end: datetime64 | None
property date_np_start: datetime64 | None
property date_start: DateHandler
property field_cross_track_distance: Field | None
property field_cycle_number: Field | None
property field_lat: Field | None
property field_lat_nadir: Field | None
property field_lon: Field | None
property field_lon_nadir: Field | None
property field_pass_number: Field | None
property field_time: Field | None
property fields: dict[str, Field]

Returns the dictionary of existing fields in the source.

Returns:

Dictionary of existing fields as Field objects.

have_special_fields(fields)

Check whether provided fields are set or not.

Parameters:

fields (list[FieldType]) – Fields to check.

Return type:

bool

Returns:

True if fields are set, False otherwise.

property index: str

Name of the index.

initialize(check_coords=True, force=True, time_extension=False, real_start=None)

Check and initialize reader’s parameters.

Parameters:
  • check_coords (bool) – Whether to check coordinates or not.

  • force (bool) – Whether to initialize if already initialized or not.

  • time_extension (bool) – Whether reading before or after the initialization dates is allowed or not.

  • real_start (datetime64) – Real starting time of this processing. This is used to normalize interpolation index.

light_reader(freq, start, end)

Return a light version of the reader (that might be scattered on dask).

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

Return type:

CasysReader

Returns:

Lightened and dask compatible reader.

property markers: list[str]
property orf: PassIndexer | None
property periods: list[Period]
pre_computed_diagnostics()

List of pre-computed diagnostics (mainly used for stored diagnostics).

Return type:

list[MethodCaller]

Returns:

List of pre-computed diagnostics.

read_data(fields, start=None, end=None, period=None, include_end=True)

Read the requested fields and rename them according to the dictionary.

Parameters:
  • fields (dict[str, Field]) – Dictionary of field names matched to their source.

  • start (datetime64 | None) – Starting date of the data to get.

  • end (datetime64 | None) – Ending date of the data to get.

  • period (int) – Period’s number to read.

  • include_end (bool) – Whether to include the end date or not.

Return type:

Dataset

Returns:

Fields content as a Dataset

property reader_ref: CasysReader
property readers: list[CasysReader]
property real_start: datetime64 | None
property reference_track: ReferenceTrack | None
reset_periods()

Reset the reader’s periods to None.

property select_clip: str | None
property select_shape: GeoDataFrame | None
set_dask_processing(freq, start, end, reference=None)

Do whatever needs to be done in case of a dask usage.

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

  • reference (list[tuple[int, int]]) – List of reference’s orbits.

set_parameters(*, date_start=None, date_end=None, select_clip=None, select_shape=None, orf=None, reference_track=None, swath=False, **kwargs)

Set the reader’s parameters that are not already set.

Parameters:
  • date_start (DateHandler | None) – Starting date of the interval we’re working on.

  • date_end (DateHandler | None) – Ending date of the interval we’re working on.

  • select_clip (str | None) – Selection clip allowing to work on a subset of the source’s data.

  • select_shape (str | GeoDataFrame | Polygon | None) – Shape file, GeoDataFrame or Geometry to select.

  • orf (PassIndexer | str | None) – Source’s indexer.

  • reference_track (ReferenceTrack | None) – Reference track.

  • swath (bool) – Whether this reader contains swath type data or not.

  • kwargs – Special fields.

classmethod set_signature()

Fix the class initialization signature.

property source: str

Reader’s source information.

special_field(ftype)
Parameters:

ftype (FieldType)

Return type:

Field | None

special_field_name(ftype)
Parameters:

ftype (FieldType)

Return type:

str | None

property special_fields: dict[FieldType, Field]
property tolerance: timedelta64
class casys.readers.ScCollectionReader(collection=None, data_path=None, backend_fields=None, backend_kwargs=None, *, date_start=None, date_end=None, data_cleaner=None, orf=None, time='time', longitude='LONGITUDE', latitude='LATITUDE', longitude_nadir='longitude_nadir', latitude_nadir='latitude_nadir', cycle_number='CYCLE_NUMBER', pass_number='PASS_NUMBER', cross_track_distance='cross_track_distance', swath_lines='num_lines', swath_pixels='num_pixels')

Bases: ZCollectionReader

Reader for a swot_calval Collection.

Parameters:
  • collection (Collection | None) – Collection.

  • data_path (str | None) – Collection path.

  • backend_fields (list[str] | None) – List of fields (variables) to read.

  • backend_kwargs (dict | None) – Kwargs dictionary to pass to the underlying collection.

  • date_start – Starting date of the interval we’re working on.

  • date_end – Ending date of the interval we’re working on.

  • select_clip – Selection clip allowing to work on a subset of the source’s data.

  • select_shape – Shape file, GeoDataFrame or Geometry to select.

  • data_cleaner – Data cleaning applied just after the reader. This cleaning might consist of sorting, duplication removal or removing indexes in order to keep them increasing.

  • orf – Source’s indexer.

  • reference_track – Reference track.

  • time – Time field.

  • longitude – Longitude field.

  • latitude – Latitude field.

  • swath_lines – Swath main dimension.

  • swath_pixels – Swath cross_track dimension.

  • cycle_number – The cycle number field.

  • pass_number – The pass number field.

  • longitude_nadir – The nadir’s longitude field.

  • latitude_nadir – The nadir’s latitude field.

  • cross_track_distance – Cross track distance field.

FIELDS_SOURCE_FULL_CHECK: bool = True
FORBIDDEN_PARAMETERS: list[str] = ['select_clip', 'select_shape', 'reference_track']
KNOWN_PARAMETERS: dict[str, tuple[Any, Any]] = {'cross_track_distance': ('Optional[str]', 'cross_track_distance'), 'cycle_number': ('Optional[str]', 'CYCLE_NUMBER'), 'data_cleaner': ('Optional[DataCleaner]', None), 'date_end': ('Optional[DateType]', None), 'date_start': ('Optional[DateType]', None), 'latitude': ('Optional[str]', 'LATITUDE'), 'latitude_nadir': ('Optional[str]', 'latitude_nadir'), 'longitude': ('Optional[str]', 'LONGITUDE'), 'longitude_nadir': ('Optional[str]', 'longitude_nadir'), 'orf': ('Optional[PassIndexer | str]', None), 'pass_number': ('Optional[str]', 'PASS_NUMBER'), 'reference_track': ('Optional[ReferenceTrackType]', None), 'select_clip': ('Optional[str]', None), 'select_shape': ('Optional[str | gpd.GeoDataFrame | shg.Polygon]', None), 'swath_lines': ('Optional[str]', 'num_lines'), 'swath_pixels': ('Optional[str]', 'num_pixels'), 'time': ('Optional[str]', 'time')}
REQUIRED_PARAMETERS: list[str] = []
RESOURCES: ClassVar[ong_data_cls.CLSResourcesManager] = <octantng.data.cls.resources.CLSResourcesManager object>
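
How a reader uses FORBIDDEN_PARAMETERS is not described here. A plausible reading is that the listed parameters are rejected when a caller tries to set them. The sketch below shows that idea with the three names listed for this class. reject_forbidden, the ValueError and the sample values are hypothetical.

```python
FORBIDDEN_PARAMETERS = ["select_clip", "select_shape", "reference_track"]


def reject_forbidden(params, forbidden=FORBIDDEN_PARAMETERS):
    """Return params unchanged, or raise if a forbidden parameter is set."""
    used = sorted(
        name
        for name, value in params.items()
        if name in forbidden and value is not None
    )
    if used:
        raise ValueError(
            f"Parameter(s) not supported by this reader: {', '.join(used)}"
        )
    return params


# A forbidden parameter left at None is not an error.
accepted = reject_forbidden({"date_start": "2020-01-01", "select_clip": None})

try:
    reject_forbidden({"select_shape": "zone.shp", "select_clip": "some clip"})
except ValueError as error:
    message = str(error)
```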
apply_clip(data)

Apply reader’s selection clip to the provided data.

Parameters:

data (Dataset) – Data on which to apply selection.

Return type:

Dataset

Returns:

Selected data.

apply_shape(data, with_clip=False)

Apply reader’s shape selection to the provided data.

Parameters:
  • data (Dataset) – Data from which to select data.

  • with_clip (bool) – Whether pre-selection with clip has to be done or not.

Return type:

Dataset

Returns:

Dataset reduced to the reader’s shape.

check_empty()

Check whether the reader contains data or not.

check_field(field, fields_ext=None, fill_properties=False)

Check the provided field validity.

Parameters:
  • field (Field) – Field to check.

  • fields_ext (list[str | Hashable] | None) – Additional valid external fields.

  • fill_properties (bool) – Whether to fill missing non-clip field’s properties or not.

Raises:

CasysReaderError – If the provided field is not valid.

check_fields_source(fields)

Check that the provided fields exist in this reader.

Parameters:

fields (Sequence[str]) – Fields to check.

Raises:

CasysReaderError – If one or more fields do not exist.

close()

Close used resources.

property collection
property collection_type: type
computation_dates(index, freq)

Computation starting and ending time for the provided parameters.

Parameters:
  • index (int | None) – Period’s number.

  • freq (FreqType) – Computation frequency.

Return type:

tuple[datetime64 | None, datetime64 | None]

Returns:

(start, end) dates.

dask_reader(data_dask=None)

Return a version of the reader that might be used on a dask worker.

Parameters:
  • kwargs – Additional reader parameters.

  • data_dask (Dataset | None)

Return type:

DatasetReader

Returns:

Dask worker compatible reader.

property data: Dataset | None

Full reader’s data.

property data_path: str

Dataset’s file(s) path.

property date_end: DateHandler
property date_np_end: datetime64 | None
property date_np_start: datetime64 | None
property date_start: DateHandler
property field_cross_track_distance: Field | None
property field_cycle_number: Field | None
property field_lat: Field | None
property field_lat_nadir: Field | None
property field_lon: Field | None
property field_lon_nadir: Field | None
property field_pass_number: Field | None
property field_time: Field | None
property fields

Returns the dictionary of existing fields in the source.

Returns:

Dictionary of existing fields as Field objects.

have_special_fields(fields)

Check whether provided fields are set or not.

Parameters:

fields (list[FieldType]) – Fields to check.

Return type:

bool

Returns:

True if fields are set, False otherwise.

property index: str

Name of the index.

initialize(check_coords=True, force=True, time_extension=False, real_start=None)

Check and initialize reader’s parameters.

Parameters:
  • check_coords (bool) – Whether to check coordinates or not.

  • force (bool) – Whether to initialize if already initialized or not.

  • time_extension (bool) – Whether reading before or after the initialization dates is allowed or not.

  • real_start (datetime64 | None) – Real starting time of this processing. This is used to normalize interpolation index.

light_reader(freq, start, end)

Return a light version of the reader (that might be scattered on dask).

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

Return type:

CasysReader

Returns:

Lightened and dask compatible reader.

static load_collection(data_path)
Parameters:

data_path (str)

Return type:

Collection

load_data()

Load data as an xarray dataset.

Return type:

Dataset

Returns:

Loaded data as a dataset.

load_zdata()

Load data from the collection.

Return type:

Dataset

Returns:

Data as a zcollection.Dataset.

property orf: PassIndexer | None
property periods: list[Period]
pre_computed_diagnostics()

List of pre-computed diagnostics (mainly used for stored diagnostics).

Return type:

list[MethodCaller]

Returns:

List of pre-computed diagnostics.

read_data(fields, start=None, end=None, period=None, include_end=True)

Read the requested fields and rename them according to the dictionary.

Parameters:
  • fields (dict[str, Field]) – Dictionary of field names matched to their source.

  • start (datetime64 | None) – Starting date of the data to get.

  • end (datetime64 | None) – Ending date of the data to get.

  • period (int) – Period’s number to read.

  • include_end (bool) – Whether to include the end date or not.

Returns:

Fields content as a Dataset

property real_start: datetime64 | None
property reference_track: ReferenceTrack | None
reset_periods()

Reset the reader’s periods to None.

property select_clip: str | None
property select_shape: GeoDataFrame | None
set_dask_processing(freq, start, end, reference=None)

Do whatever needs to be done in case of a dask usage.

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

  • reference (list[tuple[int, int]]) – List of reference’s orbits.

set_parameters(*, date_start=None, date_end=None, select_clip=None, select_shape=None, orf=None, reference_track=None, swath=False, **kwargs)

Set the reader’s parameters that are not already set.

Parameters:
  • date_start (DateHandler | None) – Starting date of the interval we’re working on.

  • date_end (DateHandler | None) – Ending date of the interval we’re working on.

  • select_clip (str | None) – Selection clip allowing to work on a subset of the source’s data.

  • select_shape (str | GeoDataFrame | Polygon | None) – Shape file, GeoDataFrame or Geometry to select.

  • orf (PassIndexer | str | None) – Source’s indexer.

  • reference_track (ReferenceTrack | None) – Reference track.

  • swath (bool) – Whether this reader contains swath type data or not.

  • kwargs – Special fields.

classmethod set_signature()

Fix the class initialization signature.

property source: str

Reader’s source information.

special_field(ftype)
Parameters:

ftype (FieldType)

Return type:

Field | None

special_field_name(ftype)
Parameters:

ftype (FieldType)

Return type:

str | None

property special_fields: dict[FieldType, Field]
property zdata: Dataset

Data as a zcollection Dataset.

class casys.readers.StoreReader(store=None, store_path=None, analyse_type=CUSTOM, analyse_date=None, auto=False, **kwargs)

Bases: CasysReader

xarray Dataset reader.

Parameters:
  • store (DiagnosticStore | None) – Diagnostic store.

  • store_path (str | None) – Diagnostic store’s path.

  • analyse_type (FreqType | str) – Type of period covered by this analysis (cycle, pass or custom). It is used to determine the type of storage group to create.

  • analyse_date (Union[datetime64, Timestamp, datetime, str, DateHandler]) – Date representing the set of data used in this analysis. It is used to determine at which timestamp to store non-temporal diagnostics.

  • auto (bool) – Whether to automatically detect and add existing diagnostics or not.

  • date_start – Starting date of the interval we’re working on.

  • date_end – Ending date of the interval we’re working on.

  • select_clip – Selection clip allowing to work on a subset of the source’s data.

  • select_shape – Shape file, GeoDataFrame or Geometry to select.

  • data_cleaner – Data cleaning applied just after the reader. This cleaning might consist of sorting, duplication removal or removing indexes in order to keep them increasing.

  • orf – Source’s indexer.

  • reference_track – Reference track.

  • time – Time field.

  • longitude – Longitude field.

  • latitude – Latitude field.

FIELDS_SOURCE_FULL_CHECK: bool = False
FORBIDDEN_PARAMETERS: list[str] = []
KNOWN_PARAMETERS: dict[str, tuple[Any, Any]] = {'cross_track_distance': ('Optional[str]', 'cross_track_distance'), 'cycle_number': ('Optional[str]', 'CYCLE_NUMBER'), 'data_cleaner': ('Optional[DataCleaner]', None), 'date_end': ('Optional[DateType]', None), 'date_start': ('Optional[DateType]', None), 'latitude': ('Optional[str]', 'LATITUDE'), 'latitude_nadir': ('Optional[str]', 'latitude_nadir'), 'longitude': ('Optional[str]', 'LONGITUDE'), 'longitude_nadir': ('Optional[str]', 'longitude_nadir'), 'orf': ('Optional[PassIndexer | str]', None), 'pass_number': ('Optional[str]', 'PASS_NUMBER'), 'reference_track': ('Optional[ReferenceTrackType]', None), 'select_clip': ('Optional[str]', None), 'select_shape': ('Optional[str | gpd.GeoDataFrame | shg.Polygon]', None), 'swath_lines': ('Optional[str]', 'num_lines'), 'swath_pixels': ('Optional[str]', 'num_pixels'), 'time': ('Optional[str]', 'time')}
REQUIRED_PARAMETERS: list[str] = []
RESOURCES: ClassVar[ong_data_cls.CLSResourcesManager] = <octantng.data.cls.resources.CLSResourcesManager object>
property analyse_date: datetime64 | None
property analyse_type: FreqType
apply_clip(data)

Apply reader’s selection clip to the provided data.

Parameters:

data (Dataset) – Data on which to apply selection.

Return type:

Dataset

Returns:

Selected data.

apply_shape(data, with_clip=False)

Apply reader’s shape selection to the provided data.

Parameters:
  • data (Dataset) – Data from which to select data.

  • with_clip (bool) – Whether pre-selection with clip has to be done or not.

Return type:

Dataset

Returns:

Dataset reduced to the reader’s shape.

check_empty()

Check whether the reader contains data or not.

check_field(field, fields_ext=None, fill_properties=False)

Not checking fields here.

Parameters:
check_fields_source(fields)

Check that the provided fields exist in this reader.

Parameters:

fields (Sequence[str]) – Fields to check.

Raises:

CasysReaderError – If one or more fields do not exist.

close()

Close used resources.

computation_dates(index, freq)

Computation starting and ending time for the provided parameters.

Parameters:
  • index (int | None) – Period’s number.

  • freq (FreqType) – Computation frequency.

Return type:

tuple[datetime64 | None, datetime64 | None]

Returns:

(start, end) dates.
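
As a rough illustration of what a (start, end) pair per period can look like, the sketch below splits an interval into consecutive fixed-length periods. It is not the casys implementation: period_bounds is a hypothetical helper, the frequency is a datetime.timedelta instead of a FreqType, standard-library datetime objects replace datetime64, and treating index=None as "the whole interval" is an assumption made for the example.

```python
from datetime import datetime, timedelta


def period_bounds(date_start, date_end, index, freq):
    """Return the (start, end) dates of period number `index`.

    Periods are consecutive windows of length `freq` starting at
    `date_start`; the last one is truncated at `date_end`.
    """
    if index is None:
        # Assumed convention: no period number means the full interval.
        return date_start, date_end
    start = date_start + index * freq
    if start >= date_end:
        # No period with this number inside the interval.
        return None, None
    end = min(start + freq, date_end)
    return start, end


first = datetime(2020, 1, 1)
last = datetime(2020, 1, 10)
week = timedelta(days=7)

print(period_bounds(first, last, 0, week))
print(period_bounds(first, last, 1, week))
print(period_bounds(first, last, None, week))
```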

dask_reader()

Return a version of the reader that might be used on a dask worker.

Parameters:

kwargs – Additional reader parameters.

Return type:

StoreReader

Returns:

Dask worker compatible reader.

property data: Dataset | None

Full reader’s data.

property date_end: DateHandler
property date_np_end: datetime64 | None
property date_np_start: datetime64 | None
property date_start: DateHandler
diag_group(name)

Group’s name of a diagnostic or None if unknown.

Parameters:

name (str) – Name of the diagnostic.

Return type:

str | None

Returns:

Name of the diagnostic group.

property field_cross_track_distance: Field | None
property field_cycle_number: Field | None
property field_lat: Field | None
property field_lat_nadir: Field | None
property field_lon: Field | None
property field_lon_nadir: Field | None
property field_pass_number: Field | None
property field_time: Field | None
property fields: dict[str, Field]

Returns the dictionary of existing fields in the source.

Returns:

Dictionary of existing fields as Field objects.

property groups: dict[str, str]
have_special_fields(fields)

Check whether provided fields are set or not.

Parameters:

fields (list[FieldType]) – Fields to check.

Return type:

bool

Returns:

True if fields are set, False otherwise.

property index: str

Name of the index.

initialize(check_coords=True, force=True, time_extension=False, real_start=None)

Check and initialize reader’s parameters.

Parameters:
  • check_coords (bool) – Whether to check coordinates or not.

  • force (bool) – Whether to initialize if already initialized or not.

  • time_extension (bool) – Whether reading before or after the initialization dates is allowed or not.

  • real_start (datetime64 | None) – Real starting time of this processing. This is used to normalize interpolation index.

light_reader(freq, start, end)

Return a light version of the reader (that might be scattered on dask).

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

Return type:

CasysReader

Returns:

Lightened and dask compatible reader.

property orf: PassIndexer | None
property periods: list[Period]
pre_computed_diagnostics()

List of pre-computed diagnostics (mainly used for stored diagnostics).

Return type:

list[MethodCaller]

Returns:

List of pre-computed diagnostics.

read_data(fields, start=None, end=None, period=None, include_end=True)

Not used.

Return type:

Dataset

property real_start: datetime64 | None
property reference_track: ReferenceTrack | None
reset_periods()

Reset the reader’s periods to None.

property select_clip: str | None
property select_shape: GeoDataFrame | None
set_dask_processing(freq, start, end, reference=None)

Do whatever needs to be done in case of a dask usage.

set_parameters(*, date_start=None, date_end=None, select_clip=None, select_shape=None, orf=None, reference_track=None, swath=False, **kwargs)

Set the reader’s parameters if they are unset.

Parameters:
  • date_start (DateHandler | None) – Starting date of the interval we’re working on.

  • date_end (DateHandler | None) – Ending date of the interval we’re working on.

  • select_clip (str | None) – Selection clip allowing to work on a subset of the source’s data.

  • select_shape (str | GeoDataFrame | Polygon | None) – Shape file, GeoDataFrame or Geometry to select.

  • orf (PassIndexer | str | None) – Source’s indexer.

  • reference_track (ReferenceTrack | None) – Reference track.

  • swath (bool) – Whether this reader contains swath type data or not.

  • kwargs – Special fields.

classmethod set_signature()

Fix the class initialization signature.

property source: str

Reader’s source information.

special_field(ftype)
Parameters:

ftype (FieldType)

Return type:

Field | None

special_field_name(ftype)
Parameters:

ftype (FieldType)

Return type:

str | None

property special_fields: dict[FieldType, Field]
property store: DiagnosticStore
property store_path: str

Diagnostic store’s path.

class casys.readers.ZCollectionReader(collection=None, data_path=None, backend_fields=None, backend_kwargs=None, *, date_start=None, date_end=None, data_cleaner=None, orf=None, time='time', longitude='LONGITUDE', latitude='LATITUDE', longitude_nadir='longitude_nadir', latitude_nadir='latitude_nadir', cycle_number='CYCLE_NUMBER', pass_number='PASS_NUMBER', cross_track_distance='cross_track_distance', swath_lines='num_lines', swath_pixels='num_pixels')

Bases: ZDatasetReader

Reader for a Zcollection Collection.

Parameters:
  • collection (Collection | None) – Collection.

  • data_path (str | None) – Collection path.

  • backend_fields (list[str] | None) – List of fields (variables) to read.

  • backend_kwargs (dict | None) – Kwargs dictionary to pass to the underlying collection.

  • date_start – Starting date of the interval we’re working on.

  • date_end – Ending date of the interval we’re working on.

  • select_clip – Selection clip allowing to work on a subset of the source’s data.

  • select_shape – Shape file, GeoDataFrame or Geometry to select.

  • data_cleaner – Data cleaning applied just after the reader. This cleaning might consist of sorting, duplication removal or removing indexes in order to keep them increasing.

  • orf – Source’s indexer.

  • reference_track – Reference track.

  • time – Time field.

  • longitude – Longitude field.

  • latitude – Latitude field.

  • swath_lines – Swath main dimension.

  • swath_pixels – Swath cross_track dimension.

  • cycle_number – The cycle number field.

  • pass_number – The pass number field.

  • longitude_nadir – The nadir’s longitude field.

  • latitude_nadir – The nadir’s latitude field.

  • cross_track_distance – Cross track distance field.

FIELDS_SOURCE_FULL_CHECK: bool = True
FORBIDDEN_PARAMETERS: list[str] = ['select_clip', 'select_shape', 'reference_track']
KNOWN_PARAMETERS: dict[str, tuple[Any, Any]] = {'cross_track_distance': ('Optional[str]', 'cross_track_distance'), 'cycle_number': ('Optional[str]', 'CYCLE_NUMBER'), 'data_cleaner': ('Optional[DataCleaner]', None), 'date_end': ('Optional[DateType]', None), 'date_start': ('Optional[DateType]', None), 'latitude': ('Optional[str]', 'LATITUDE'), 'latitude_nadir': ('Optional[str]', 'latitude_nadir'), 'longitude': ('Optional[str]', 'LONGITUDE'), 'longitude_nadir': ('Optional[str]', 'longitude_nadir'), 'orf': ('Optional[PassIndexer | str]', None), 'pass_number': ('Optional[str]', 'PASS_NUMBER'), 'reference_track': ('Optional[ReferenceTrackType]', None), 'select_clip': ('Optional[str]', None), 'select_shape': ('Optional[str | gpd.GeoDataFrame | shg.Polygon]', None), 'swath_lines': ('Optional[str]', 'num_lines'), 'swath_pixels': ('Optional[str]', 'num_pixels'), 'time': ('Optional[str]', 'time')}
REQUIRED_PARAMETERS: list[str] = []
RESOURCES: ClassVar[ong_data_cls.CLSResourcesManager] = <octantng.data.cls.resources.CLSResourcesManager object>
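
The three parameter tables above can be read as a small validation contract: known names carry a type hint and a default, forbidden names are rejected, and required names must be supplied. The sketch below shows one way such tables could be applied to keyword arguments. It is an assumption-laden illustration, not the library's code: resolve_parameters is a hypothetical function, the tables are shortened excerpts, and raising ValueError is a stand-in for whatever error the reader actually raises.

```python
# Excerpt of the tables shown above: name -> (type hint, default value).
KNOWN_PARAMETERS = {
    "time": ("Optional[str]", "time"),
    "longitude": ("Optional[str]", "LONGITUDE"),
    "latitude": ("Optional[str]", "LATITUDE"),
    "select_clip": ("Optional[str]", None),
}
FORBIDDEN_PARAMETERS = ["select_clip", "select_shape", "reference_track"]
REQUIRED_PARAMETERS = []


def resolve_parameters(**kwargs):
    """Hypothetical check: reject forbidden or unknown names, apply defaults."""
    for name in kwargs:
        if name in FORBIDDEN_PARAMETERS:
            raise ValueError(f"Parameter '{name}' is forbidden for this reader.")
        if name not in KNOWN_PARAMETERS:
            raise ValueError(f"Unknown parameter '{name}'.")
    missing = [name for name in REQUIRED_PARAMETERS if name not in kwargs]
    if missing:
        raise ValueError(f"Missing required parameters: {missing}")
    # Start from the defaults of the allowed parameters, then apply overrides.
    resolved = {
        name: default
        for name, (_hint, default) in KNOWN_PARAMETERS.items()
        if name not in FORBIDDEN_PARAMETERS
    }
    resolved.update(kwargs)
    return resolved


print(resolve_parameters(longitude="lon"))
```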
apply_clip(data)

Apply reader’s selection clip to the provided data.

Parameters:

data (Dataset) – Data on which to apply selection.

Return type:

Dataset

Returns:

Selected data.

apply_shape(data, with_clip=False)

Apply reader’s shape selection to the provided data.

Parameters:
  • data (Dataset) – Data from which to select data.

  • with_clip (bool) – Whether pre-selection with clip has to be done or not.

Return type:

Dataset

Returns:

Dataset reduced to the reader’s shape.

check_empty()

Check whether the reader contains data or not.

check_field(field, fields_ext=None, fill_properties=False)

Check the provided field validity.

Parameters:
  • field (Field) – Field to check.

  • fields_ext (list[str | Hashable] | None) – Additional valid external fields.

  • fill_properties (bool) – Whether to fill missing non-clip field’s properties or not.

Raises:

CasysReaderError – If the provided field is not valid.

check_fields_source(fields)

Check that the provided fields exist in this reader.

Parameters:

fields (Sequence[str]) – Fields to check.

Raises:

CasysReaderError – If one or more fields do not exist.

close()

Close used resources.

property collection
property collection_type: type
computation_dates(index, freq)

Computation starting and ending time for the provided parameters.

Parameters:
  • index (int | None) – Period’s number.

  • freq (FreqType) – Computation frequency.

Return type:

tuple[datetime64 | None, datetime64 | None]

Returns:

(start, end) dates.

dask_reader(data_dask=None)

Return a version of the reader that might be used on a dask worker.

Parameters:
  • kwargs – Additional reader parameters.

  • data_dask (Dataset | None)

Return type:

DatasetReader

Returns:

Dask worker compatible reader.

property data: Dataset | None

Full reader’s data.

property data_path: str

Dataset’s file(s) path.

property date_end: DateHandler
property date_np_end: datetime64 | None
property date_np_start: datetime64 | None
property date_start: DateHandler
property field_cross_track_distance: Field | None
property field_cycle_number: Field | None
property field_lat: Field | None
property field_lat_nadir: Field | None
property field_lon: Field | None
property field_lon_nadir: Field | None
property field_pass_number: Field | None
property field_time: Field | None
property fields

Returns the dictionary of existing fields in the source.

Returns:

Dictionary of existing fields as Field objects.

have_special_fields(fields)

Check whether provided fields are set or not.

Parameters:

fields (list[FieldType]) – Fields to check.

Return type:

bool

Returns:

True if fields are set, False otherwise.

property index: str

Name of the index.

initialize(check_coords=True, force=True, time_extension=False, real_start=None)

Check and initialize reader’s parameters.

Parameters:
  • check_coords (bool) – Whether to check coordinates or not.

  • force (bool) – Whether to initialize if already initialized or not.

  • time_extension (bool) – Whether reading before or after the initialization dates is allowed or not.

  • real_start (datetime64 | None) – Real starting time of this processing. This is used to normalize interpolation index.

light_reader(freq, start, end)

Return a light version of the reader (that might be scattered on dask).

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

Return type:

CasysReader

Returns:

Lightened and dask compatible reader.

static load_collection(data_path)
Parameters:

data_path (str)

Return type:

Collection

load_data()

Load data as an xarray dataset.

Return type:

Dataset

Returns:

Loaded data as a dataset.

load_zdata()

Load data from the collection.

Return type:

Dataset

Returns:

Data as a zcollection.Dataset.

property orf: PassIndexer | None
property periods: list[Period]
pre_computed_diagnostics()

List of pre-computed diagnostics (mainly used for stored diagnostics).

Return type:

list[MethodCaller]

Returns:

List of pre-computed diagnostics.

read_data(fields, start=None, end=None, period=None, include_end=True)

Read the requested fields and rename them according to the dictionary.

Parameters:
  • fields (dict[str, Field]) – Dictionary of fields names matched to their source.

  • start (datetime64 | None) – Starting date of the data to get.

  • end (datetime64 | None) – Ending date of the data to get.

  • period (int) – Period’s number to read.

  • include_end (bool) – Whether to include the end date or not.

Returns:

Fields content as a Dataset.
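
The interplay of start, end, include_end and the renaming dictionary can be sketched without any backend. The example below is a simplified stand-in, not the reader's implementation: read_window is a hypothetical function, the data is a plain dict of lists instead of a Dataset, the fields mapping uses plain source names instead of Field objects, and the column name ssh_raw is invented for the illustration.

```python
from datetime import datetime


def read_window(records, fields, start=None, end=None, include_end=True):
    """Select the rows in the time window and rename the requested columns.

    `records` maps a source column name to a list of values and must hold
    a "time" column; `fields` maps each output name to its source column.
    """
    def keep(t):
        if start is not None and t < start:
            return False
        if end is not None and (t > end or (t == end and not include_end)):
            return False
        return True

    rows = [i for i, t in enumerate(records["time"]) if keep(t)]
    return {out: [records[src][i] for i in rows] for out, src in fields.items()}


records = {
    "time": [datetime(2020, 1, 1), datetime(2020, 1, 2), datetime(2020, 1, 3)],
    "ssh_raw": [0.1, 0.2, 0.3],
}
window = dict(start=datetime(2020, 1, 2), end=datetime(2020, 1, 3))

print(read_window(records, {"ssh": "ssh_raw"}, **window))
print(read_window(records, {"ssh": "ssh_raw"}, include_end=False, **window))
```

With include_end=True the sample dated exactly at end is returned; with include_end=False the window is half-open and that sample is dropped.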

property real_start: datetime64 | None
property reference_track: ReferenceTrack | None
reset_periods()

Reset the reader’s periods to None.

property select_clip: str | None
property select_shape: GeoDataFrame | None
set_dask_processing(freq, start, end, reference=None)

Do whatever needs to be done in case of a dask usage.

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

  • reference (list[tuple[int, int]]) – List of reference’s orbits.

set_parameters(*, date_start=None, date_end=None, select_clip=None, select_shape=None, orf=None, reference_track=None, swath=False, **kwargs)

Set the reader’s parameters if they are unset.

Parameters:
  • date_start (DateHandler | None) – Starting date of the interval we’re working on.

  • date_end (DateHandler | None) – Ending date of the interval we’re working on.

  • select_clip (str | None) – Selection clip allowing to work on a subset of the source’s data.

  • select_shape (str | GeoDataFrame | Polygon | None) – Shape file, GeoDataFrame or Geometry to select.

  • orf (PassIndexer | str | None) – Source’s indexer.

  • reference_track (ReferenceTrack | None) – Reference track.

  • swath (bool) – Whether this reader contains swath type data or not.

  • kwargs – Special fields.

classmethod set_signature()

Fix the class initialization signature.

property source: str

Reader’s source information.

special_field(ftype)
Parameters:

ftype (FieldType)

Return type:

Field | None

special_field_name(ftype)
Parameters:

ftype (FieldType)

Return type:

str | None

property special_fields: dict[FieldType, Field]
property zdata: Dataset

Data as a zcollection Dataset.

class casys.readers.ZDatasetReader(data, data_path=None, backend_fields=None, *, date_start=None, date_end=None, data_cleaner=None, orf=None, time='time', longitude='LONGITUDE', latitude='LATITUDE', longitude_nadir='longitude_nadir', latitude_nadir='latitude_nadir', cycle_number='CYCLE_NUMBER', pass_number='PASS_NUMBER', cross_track_distance='cross_track_distance', swath_lines='num_lines', swath_pixels='num_pixels')

Bases: DatasetReader

Reader for the zcollection.Dataset format.

Parameters:
  • data (Dataset) – ZCollection Dataset.

  • data_path (str | None) – Zcollection path.

  • backend_fields (list[str] | None) – List of fields (variables) to read.

  • backend_kwargs – kwargs to provide to the backend when using a data_path to load the data.

  • date_start – Starting date of the interval we’re working on.

  • date_end – Ending date of the interval we’re working on.

  • select_clip – Selection clip allowing to work on a subset of the source’s data.

  • select_shape – Shape file, GeoDataFrame or Geometry to select.

  • data_cleaner – Data cleaning applied just after the reader. This cleaning might consist of sorting, duplication removal or removing indexes in order to keep them increasing.

  • orf – Source’s indexer.

  • reference_track – Reference track.

  • time – Time field.

  • longitude – Longitude field.

  • latitude – Latitude field.

  • swath_lines – Swath main dimension.

  • swath_pixels – Swath cross_track dimension.

  • cycle_number – The cycle number field.

  • pass_number – The pass number field.

  • longitude_nadir – The nadir’s longitude field.

  • latitude_nadir – The nadir’s latitude field.

  • cross_track_distance – Cross track distance field.

FIELDS_SOURCE_FULL_CHECK: bool = True
FORBIDDEN_PARAMETERS: list[str] = ['select_clip', 'select_shape', 'reference_track']
KNOWN_PARAMETERS: dict[str, tuple[Any, Any]] = {'cross_track_distance': ('Optional[str]', 'cross_track_distance'), 'cycle_number': ('Optional[str]', 'CYCLE_NUMBER'), 'data_cleaner': ('Optional[DataCleaner]', None), 'date_end': ('Optional[DateType]', None), 'date_start': ('Optional[DateType]', None), 'latitude': ('Optional[str]', 'LATITUDE'), 'latitude_nadir': ('Optional[str]', 'latitude_nadir'), 'longitude': ('Optional[str]', 'LONGITUDE'), 'longitude_nadir': ('Optional[str]', 'longitude_nadir'), 'orf': ('Optional[PassIndexer | str]', None), 'pass_number': ('Optional[str]', 'PASS_NUMBER'), 'reference_track': ('Optional[ReferenceTrackType]', None), 'select_clip': ('Optional[str]', None), 'select_shape': ('Optional[str | gpd.GeoDataFrame | shg.Polygon]', None), 'swath_lines': ('Optional[str]', 'num_lines'), 'swath_pixels': ('Optional[str]', 'num_pixels'), 'time': ('Optional[str]', 'time')}
REQUIRED_PARAMETERS: list[str] = []
RESOURCES: ClassVar[ong_data_cls.CLSResourcesManager] = <octantng.data.cls.resources.CLSResourcesManager object>
apply_clip(data)

Apply reader’s selection clip to the provided data.

Parameters:

data (Dataset) – Data on which to apply selection.

Return type:

Dataset

Returns:

Selected data.

apply_shape(data, with_clip=False)

Apply reader’s shape selection to the provided data.

Parameters:
  • data (Dataset) – Data from which to select data.

  • with_clip (bool) – Whether pre-selection with clip has to be done or not.

Return type:

Dataset

Returns:

Dataset reduced to the reader’s shape.

check_empty()

Check whether the reader contains data or not.

check_field(field, fields_ext=None, fill_properties=False)

Check the provided field validity.

Parameters:
  • field (Field) – Field to check.

  • fields_ext (list[str | Hashable] | None) – Additional valid external fields.

  • fill_properties (bool) – Whether to fill missing non-clip field’s properties or not.

Raises:

CasysReaderError – If the provided field is not valid.

check_fields_source(fields)

Check that the provided fields exist in this reader.

Parameters:

fields (Sequence[str]) – Fields to check.

Raises:

CasysReaderError – If one or more fields do not exist.

close()

Close used resources.

computation_dates(index, freq)

Computation starting and ending time for the provided parameters.

Parameters:
  • index (int | None) – Period’s number.

  • freq (FreqType) – Computation frequency.

Return type:

tuple[datetime64 | None, datetime64 | None]

Returns:

(start, end) dates.

dask_reader(data_dask=None)

Return a version of the reader that might be used on a dask worker.

Parameters:
  • kwargs – Additional reader parameters.

  • data_dask (Dataset | None)

Return type:

DatasetReader

Returns:

Dask worker compatible reader.

property data: Dataset | None

Full reader’s data.

property data_path: str

Dataset’s file(s) path.

property date_end: DateHandler
property date_np_end: datetime64 | None
property date_np_start: datetime64 | None
property date_start: DateHandler
property field_cross_track_distance: Field | None
property field_cycle_number: Field | None
property field_lat: Field | None
property field_lat_nadir: Field | None
property field_lon: Field | None
property field_lon_nadir: Field | None
property field_pass_number: Field | None
property field_time: Field | None
property fields

Returns the dictionary of existing fields in the source.

Returns:

Dictionary of existing fields as Field objects.

have_special_fields(fields)

Check whether provided fields are set or not.

Parameters:

fields (list[FieldType]) – Fields to check.

Return type:

bool

Returns:

True if fields are set, False otherwise.

property index: str

Name of the index.

initialize(check_coords=True, force=True, time_extension=False, real_start=None)

Check and initialize reader’s parameters.

Parameters:
  • check_coords (bool) – Whether to check coordinates or not.

  • force (bool) – Whether to initialize if already initialized or not.

  • time_extension (bool) – Whether reading before or after the initialization dates is allowed or not.

  • real_start (datetime64 | None) – Real starting time of this processing. This is used to normalize interpolation index.

light_reader(freq, start, end)

Return a light version of the reader (that might be scattered on dask).

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

Return type:

CasysReader

Returns:

Lightened and dask compatible reader.

load_data()

Load data as an xarray dataset.

Return type:

Dataset

Returns:

Loaded data as a dataset.

property orf: PassIndexer | None
property periods: list[Period]
pre_computed_diagnostics()

List of pre-computed diagnostics (mainly used for stored diagnostics).

Return type:

list[MethodCaller]

Returns:

List of pre-computed diagnostics.

read_data(fields, start=None, end=None, period=None, include_end=True)

Read the requested fields and rename them according to the dictionary.

Parameters:
  • fields (dict[str, Field]) – Dictionary of fields names matched to their source.

  • start (datetime64 | None) – Starting date of the data to get.

  • end (datetime64 | None) – Ending date of the data to get.

  • period (int) – Period’s number to read.

  • include_end (bool) – Whether to include the end date or not.

Returns:

Fields content as a Dataset.

property real_start: datetime64 | None
property reference_track: ReferenceTrack | None
reset_periods()

Reset the reader’s periods to None.

property select_clip: str | None
property select_shape: GeoDataFrame | None
set_dask_processing(freq, start, end, reference=None)

Do whatever needs to be done in case of a dask usage.

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

  • reference (list[tuple[int, int]]) – List of reference’s orbits.

set_parameters(*, date_start=None, date_end=None, select_clip=None, select_shape=None, orf=None, reference_track=None, swath=False, **kwargs)

Set the reader’s parameters if they are unset.

Parameters:
  • date_start (DateHandler | None) – Starting date of the interval we’re working on.

  • date_end (DateHandler | None) – Ending date of the interval we’re working on.

  • select_clip (str | None) – Selection clip allowing to work on a subset of the source’s data.

  • select_shape (str | GeoDataFrame | Polygon | None) – Shape file, GeoDataFrame or Geometry to select.

  • orf (PassIndexer | str | None) – Source’s indexer.

  • reference_track (ReferenceTrack | None) – Reference track.

  • swath (bool) – Whether this reader contains swath type data or not.

  • kwargs – Special fields.

classmethod set_signature()

Fix the class initialization signature.

property source: str

Reader’s source information.

special_field(ftype)
Parameters:

ftype (FieldType)

Return type:

Field | None

special_field_name(ftype)
Parameters:

ftype (FieldType)

Return type:

str | None
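
The special-field accessors (special_field, special_field_name, have_special_fields) all revolve around a mapping from a field type to the field configured for it. The sketch below models that idea in isolation. It rests on assumptions: this FieldType enumeration is a hypothetical stand-in with invented members, SpecialFields is not a casys class, and have_special_fields is taken to mean "all of the given types are set".

```python
from enum import Enum


class FieldType(Enum):
    """Hypothetical stand-in for the casys FieldType enumeration."""

    TIME = "time"
    LON = "longitude"
    LAT = "latitude"
    CYCLE_NUMBER = "cycle_number"


class SpecialFields:
    def __init__(self, **names):
        # Keep only the special fields that were given a name.
        self._names = {
            FieldType(key): name for key, name in names.items() if name is not None
        }

    def special_field_name(self, ftype):
        """Name configured for `ftype`, or None if it is not set."""
        return self._names.get(ftype)

    def have_special_fields(self, fields):
        """True only if every requested field type is set."""
        return all(ftype in self._names for ftype in fields)


special = SpecialFields(
    time="time", longitude="LONGITUDE", latitude="LATITUDE", cycle_number=None
)
print(special.special_field_name(FieldType.LON))
print(special.have_special_fields([FieldType.TIME, FieldType.CYCLE_NUMBER]))
```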

property special_fields: dict[FieldType, Field]
property zdata: Dataset

Data as a zcollection Dataset.

class casys.readers.ZarrDatasetReader(data=None, data_path=None, backend_fields=None, backend_kwargs=None, *, date_start=None, date_end=None, select_clip=None, select_shape=None, data_cleaner=None, orf=None, reference_track=None, time='time', longitude='LONGITUDE', latitude='LATITUDE', longitude_nadir='longitude_nadir', latitude_nadir='latitude_nadir', cycle_number='CYCLE_NUMBER', pass_number='PASS_NUMBER', cross_track_distance='cross_track_distance', swath_lines='num_lines', swath_pixels='num_pixels')

Bases: DatasetReader

xarray Dataset reader for the zarr format.

Parameters:
  • data (Dataset | None) – Dataset.

  • data_path (str | list[str] | None) – Dataset’s file(s) path.

  • backend_fields (list[str] | None) – List of fields (variables) to read.

  • backend_kwargs (dict[str, Any] | None) – kwargs to provide to the backend when using a data_path to load the data.

  • date_start – Starting date of the interval we’re working on.

  • date_end – Ending date of the interval we’re working on.

  • select_clip – Selection clip allowing to work on a subset of the source’s data.

  • select_shape – Shape file, GeoDataFrame or Geometry to select.

  • data_cleaner – Data cleaning applied just after the reader. This cleaning might consist of sorting, duplication removal or removing indexes in order to keep them increasing.

  • orf – Source’s indexer.

  • reference_track – Reference track.

  • time – Time field.

  • longitude – Longitude field.

  • latitude – Latitude field.
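
The data_cleaner description above names three kinds of cleaning: sorting, duplicate removal, and dropping indexes so that the remaining ones keep increasing. The sketch below shows what these operations amount to on plain lists. The two functions are hypothetical and are not the DataCleaner API; which duplicate is kept and whether "increasing" is strict are assumptions made here.

```python
def sort_and_deduplicate(times, values):
    """Sort samples by time and keep the first sample of each duplicated time."""
    cleaned = {}
    # sorted() is stable, so equal times keep their original relative order.
    for t, v in sorted(zip(times, values), key=lambda pair: pair[0]):
        cleaned.setdefault(t, v)
    return list(cleaned.keys()), list(cleaned.values())


def keep_increasing(times, values):
    """Drop every sample whose time is not strictly greater than the last kept one."""
    kept_t, kept_v = [], []
    for t, v in zip(times, values):
        if not kept_t or t > kept_t[-1]:
            kept_t.append(t)
            kept_v.append(v)
    return kept_t, kept_v


times = [3, 1, 2, 2, 5, 4]
values = ["a", "b", "c", "d", "e", "f"]

print(sort_and_deduplicate(times, values))
print(keep_increasing(times, values))
```

The two strategies differ in what they preserve: sorting keeps every distinct time but reorders the samples, while dropping non-increasing indexes keeps the original order at the cost of discarding data.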

FIELDS_SOURCE_FULL_CHECK: bool = True
FORBIDDEN_PARAMETERS: list[str] = []
KNOWN_PARAMETERS: dict[str, tuple[Any, Any]] = {'cross_track_distance': ('Optional[str]', 'cross_track_distance'), 'cycle_number': ('Optional[str]', 'CYCLE_NUMBER'), 'data_cleaner': ('Optional[DataCleaner]', None), 'date_end': ('Optional[DateType]', None), 'date_start': ('Optional[DateType]', None), 'latitude': ('Optional[str]', 'LATITUDE'), 'latitude_nadir': ('Optional[str]', 'latitude_nadir'), 'longitude': ('Optional[str]', 'LONGITUDE'), 'longitude_nadir': ('Optional[str]', 'longitude_nadir'), 'orf': ('Optional[PassIndexer | str]', None), 'pass_number': ('Optional[str]', 'PASS_NUMBER'), 'reference_track': ('Optional[ReferenceTrackType]', None), 'select_clip': ('Optional[str]', None), 'select_shape': ('Optional[str | gpd.GeoDataFrame | shg.Polygon]', None), 'swath_lines': ('Optional[str]', 'num_lines'), 'swath_pixels': ('Optional[str]', 'num_pixels'), 'time': ('Optional[str]', 'time')}
REQUIRED_PARAMETERS: list[str] = []
RESOURCES: ClassVar[ong_data_cls.CLSResourcesManager] = <octantng.data.cls.resources.CLSResourcesManager object>
apply_clip(data)

Apply reader’s selection clip to the provided data.

Parameters:

data (Dataset) – Data on which to apply selection.

Return type:

Dataset

Returns:

Selected data.

apply_shape(data, with_clip=False)

Apply reader’s shape selection to the provided data.

Parameters:
  • data (Dataset) – Data from which to select data.

  • with_clip (bool) – Whether pre-selection with clip has to be done or not.

Return type:

Dataset

Returns:

Dataset reduced to the reader’s shape.

check_empty()

Check whether the reader contains data or not.

check_field(field, fields_ext=None, fill_properties=False)

Check the provided field validity.

Parameters:
  • field (Field) – Field to check.

  • fields_ext (list[str | Hashable] | None) – Additional valid external fields.

  • fill_properties (bool) – Whether to fill missing non-clip field’s properties or not.

Raises:

CasysReaderError – If the provided field is not valid.

check_fields_source(fields)

Check that the provided fields exist in this reader.

Parameters:

fields (Sequence[str]) – Fields to check.

Raises:

CasysReaderError – If one or more fields do not exist.

close()

Close used resources.

computation_dates(index, freq)

Computation starting and ending time for the provided parameters.

Parameters:
  • index (int | None) – Period’s number.

  • freq (FreqType) – Computation frequency.

Return type:

tuple[datetime64 | None, datetime64 | None]

Returns:

(start, end) dates.

dask_reader(data_dask=None)

Return a version of the reader that might be used on a dask worker.

Parameters:
  • kwargs – Additional reader parameters.

  • data_dask (Dataset | None)

Return type:

DatasetReader

Returns:

Dask worker compatible reader.

property data: Dataset | None

Full reader’s data.

property data_path: str

Dataset’s file(s) path.

property date_end: DateHandler
property date_np_end: datetime64 | None
property date_np_start: datetime64 | None
property date_start: DateHandler
property field_cross_track_distance: Field | None
property field_cycle_number: Field | None
property field_lat: Field | None
property field_lat_nadir: Field | None
property field_lon: Field | None
property field_lon_nadir: Field | None
property field_pass_number: Field | None
property field_time: Field | None
property fields

Returns the dictionary of existing fields in the source.

Returns:

Dictionary of existing fields as Field objects.

have_special_fields(fields)

Check whether provided fields are set or not.

Parameters:

fields (list[FieldType]) – Fields to check.

Return type:

bool

Returns:

True if fields are set, False otherwise.

property index: str

Name of the index.

initialize(check_coords=True, force=True, time_extension=False, real_start=None)

Check and initialize reader’s parameters.

Parameters:
  • check_coords (bool) – Whether to check coordinates or not.

  • force (bool) – Whether to initialize if already initialized or not.

  • time_extension (bool) – Whether reading before or after the initialization dates is allowed or not.

  • real_start (datetime64 | None) – Real starting time of this processing. This is used to normalize interpolation index.

light_reader(freq, start, end)

Return a light version of the reader (that might be scattered on dask).

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

Return type:

CasysReader

Returns:

Lightened and dask compatible reader.

load_data()

Load data as an xarray dataset.

Return type:

Dataset

Returns:

Loaded data as a dataset.

property orf: PassIndexer | None
property periods: list[Period]
pre_computed_diagnostics()

List of pre-computed diagnostics (mainly used for stored diagnostics).

Return type:

list[MethodCaller]

Returns:

List of pre-computed diagnostics.

read_data(fields, start=None, end=None, period=None, include_end=True)

Read the requested fields and rename them according to the dictionary.

Parameters:
  • fields (dict[str, Field]) – Dictionary of fields names matched to their source.

  • start (datetime64 | None) – Starting date of the data to get.

  • end (datetime64 | None) – Ending date of the data to get.

  • period (int) – Period’s number to read.

  • include_end (bool) – Whether to include the end date or not.

Returns:

Fields content as a Dataset.

property real_start: datetime64 | None
property reference_track: ReferenceTrack | None
reset_periods()

Reset the reader’s periods to None.

property select_clip: str | None
property select_shape: GeoDataFrame | None
set_dask_processing(freq, start, end, reference=None)

Do whatever needs to be done in case of a dask usage.

Parameters:
  • freq (FrequencyHandler) – Frequency handler used for parallelization.

  • start (datetime64) – Starting date of the required data.

  • end (datetime64) – Ending date of the required data.

  • reference (list[tuple[int, int]]) – List of reference’s orbits.

set_parameters(*, date_start=None, date_end=None, select_clip=None, select_shape=None, orf=None, reference_track=None, swath=False, **kwargs)

Set the reader’s parameters if they are unset.

Parameters:
  • date_start (DateHandler | None) – Starting date of the interval we’re working on.

  • date_end (DateHandler | None) – Ending date of the interval we’re working on.

  • select_clip (str | None) – Selection clip allowing to work on a subset of the source’s data.

  • select_shape (str | GeoDataFrame | Polygon | None) – Shape file, GeoDataFrame or Geometry to select.

  • orf (PassIndexer | str | None) – Source’s indexer.

  • reference_track (ReferenceTrack | None) – Reference track.

  • swath (bool) – Whether this reader contains swath type data or not.

  • kwargs – Special fields.
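
One plausible reading of "if unset" is that values already configured on the reader take precedence over values passed later. The sketch below illustrates that policy on a toy class. ParameterHolder is hypothetical and covers only three of the documented parameters; dates are plain strings for brevity, and the precedence rule itself is an assumption rather than documented behavior.

```python
class ParameterHolder:
    """Hypothetical holder showing a 'set only if unset' policy."""

    def __init__(self, date_start=None, date_end=None, orf=None):
        self.date_start = date_start
        self.date_end = date_end
        self.orf = orf

    def set_parameters(self, *, date_start=None, date_end=None, orf=None):
        candidates = {"date_start": date_start, "date_end": date_end, "orf": orf}
        for name, value in candidates.items():
            # A value that is already set is never overwritten.
            if getattr(self, name) is None and value is not None:
                setattr(self, name, value)


holder = ParameterHolder(date_start="2020-01-01")
holder.set_parameters(date_start="2019-01-01", date_end="2020-12-31")
print(holder.date_start, holder.date_end, holder.orf)
```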

classmethod set_signature()

Fix the class initialization signature.

property source: str

Reader’s source information.

special_field(ftype)
Parameters:

ftype (FieldType)

Return type:

Field | None

special_field_name(ftype)
Parameters:

ftype (FieldType)

Return type:

str | None

property special_fields: dict[FieldType, Field]
casys.readers.create_fields(data, fields)

Create a dataset from the fields clips and sources.

Parameters:
Return type:

Dataset

Returns:

Dataset of computed fields.