Data source
NadirData allows you to:
* specify the source of the data and its characteristics
* apply selection criteria (using time intervals, CLIPs or shapes)
* apply transformations (interpolation to a reference track)
* set and compute diagnostics
NadirData is a data container providing access to a data source and the definition,
then computation, of diagnostics.
Parameters
----------
source
Input source (name of the table if using OCTANT storage).
date_start
Starting date of the period of interest.
date_end
Ending date of the period of interest.
select_clip
Selection clip allowing to work on a subset of the source's data.
select_shape
Shape file, GeoDataFrame or Geometry on which to limit source's data.
orf
Path or name of the orf.
reference_track
Setting this parameter enables the interpolation of the source's data on this
reference track; every diagnostic is then computed using the interpolated data.
File path or data of the reference track on which to interpolate the read data.
A list of existing theoretical reference tracks can be shown using the
show_theoretical_tracks method:
>>> CommonData.show_theoretical_tracks()
Standard along track data (orbits) can be provided as well.
This parameter can be provided as a dictionary containing 'data', 'path'
and 'coordinates' keys.
time
The time field. (if not provided, default is "time" field)
latitude
The latitude field. (if not provided, default is "LATITUDE" field)
longitude
The longitude field. (if not provided, default is "LONGITUDE" field)
cycle_number
Cycle number's field. (if not provided, default is "CYCLE_NUMBER" field)
pass_number
Pass number's field. (if not provided, default is "PASS_NUMBER" field)
diag_overwrite
Define the behavior when adding a diagnostic with an already used name:
* [default] False: raise an error
* True: remove the old diagnostic and add the new one
time_extension
Whether to allow extending the user-defined time interval to meet specific
diagnostic requirements.
source_type
Input source type.
SwathData
SwathData allows you to:
* specify the source of the data and its characteristics
* apply selection criteria (using time intervals, CLIPs or shapes)
* set and compute diagnostics
SwathData is a data container providing access to a data source and the definition,
then computation, of diagnostics.
Parameters
----------
source
Input source (name of the table if using OCTANT storage).
date_start
Starting date of the period of interest.
date_end
Ending date of the period of interest.
select_clip
Selection clip allowing to work on a subset of the source's data.
select_shape
Shape file, GeoDataFrame or Geometry on which to limit source's data.
orf
Path or name of the orf.
reference_track
Setting this parameter enables the interpolation of the source's data on this
reference track; every diagnostic is then computed using the interpolated data.
File path or data of the reference track on which to interpolate the read data.
A list of existing theoretical reference tracks can be shown using the
show_theoretical_tracks method:
>>> CommonData.show_theoretical_tracks()
Standard along track data (orbits) can be provided as well.
This parameter can be provided as a dictionary containing 'data', 'path'
and 'coordinates' keys.
time
The time field. (if not provided, default is "time" field)
latitude
The latitude field. (if not provided, default is "LATITUDE" field)
longitude
The longitude field. (if not provided, default is "LONGITUDE" field)
cycle_number
Cycle number's field. (if not provided, default is "CYCLE_NUMBER" field)
pass_number
Pass number's field. (if not provided, default is "PASS_NUMBER" field)
diag_overwrite
Define the behavior when adding a diagnostic with an already used name:
* [default] False: raise an error
* True: remove the old diagnostic and add the new one
time_extension
Whether to allow extending the user-defined time interval to meet specific
diagnostic requirements.
source_type
Input source type.
Note
The date_start and date_end parameters are optional for some source types: when
they are omitted, the time field is used to define them. For other kinds of
sources, date_start and date_end are mandatory.
Data readers
CasysReader classes are designed to interact with and read from different kinds of
sources. Each source type is associated with a data reader:
* xarray.Dataset -> DatasetReader
* zcollection.Dataset -> ZDatasetReader
* string -> CLSTableReader
* dictionary -> CLSTableInSituReader
* zcollection.Collection -> ZCollectionReader
* swot_calval.io.Collection -> ScCollectionReader
MultiReader requires explicitly instantiating the reader.
xarray datasets
from casys import NadirData, DateHandler
from casys.readers import ZarrDatasetReader
import os
# Instantiate your zarr compatible xarray dataset reader
reader = ZarrDatasetReader(
data_path=os.path.join(os.environ["RESOURCES_DIR"], "data_C_J3_B.zarr"),
date_start=DateHandler("2019/06/01"),
date_end=DateHandler("2019/06/20"),
time="time",
longitude="LONGITUDE",
latitude="LATITUDE",
)
# Create your NadirData object
ad_ds = NadirData(source=reader)
For more details, see the DatasetReader and ZarrDatasetReader classes.
xarray Dataset reader.
Parameters
----------
data
Dataset.
data_path
Dataset's file(s) path.
backend_fields
List of fields (variables) to read.
backend_kwargs
kwargs to provide to the backend when using a data_path to load the data.
date_start
Starting date of the interval we're working on.
date_end
Ending date of the interval we're working on.
select_clip
Selection clip allowing to work on a subset of the source's data.
select_shape
Shape file, GeoDataFrame or Geometry to select.
data_cleaner
Data cleaning applied just after the reader.
This cleaning might consist of sorting, duplication removal or removing indexes
in order to keep them increasing.
orf
Source's indexer.
reference_track
Reference track.
time
Time field.
longitude
Longitude field.
latitude
Latitude field.
swath_lines
Swath main dimension.
swath_pixels
Swath cross_track dimension.
cycle_number
The cycle number field.
pass_number
The pass number field.
longitude_nadir
The nadir's longitude field.
latitude_nadir
The nadir's latitude field.
cross_track_distance
Cross track distance field.
CLS Tables
from casys import NadirData, DateHandler
from casys.readers import CLSTableReader
reader = CLSTableReader(
name="TABLE_C_J3_B_GDRD",
date_start=DateHandler.from_orf("C_J3_GDRD", 122, 1, pos="first"),
date_end=DateHandler.from_orf("C_J3_GDRD", 122, 154, pos="last"),
orf="C_J3_GDRD",
time="time",
longitude="LONGITUDE",
latitude="LATITUDE",
)
ad = NadirData(source=reader)
For more details, see the CLSTableReader class.
OCTANT CLS TableMeasure data reader.
Parameters
----------
name
Table's name.
ges_table_dir
Path of the GES_TABLE_DIR to use.
date_start
Starting date of the interval we're working on.
date_end
Ending date of the interval we're working on.
select_clip
Selection clip allowing to work on a subset of the source's data.
select_shape
Shape file, GeoDataFrame or Geometry to select.
data_cleaner
Data cleaning applied just after the reader.
This cleaning might consist of sorting, duplication removal or removing indexes
in order to keep them increasing.
orf
Source's indexer.
reference_track
Reference track.
time
Time field.
longitude
Longitude field.
latitude
Latitude field.
CLS in-situ Tables
For more details, see the CLSTableInSituReader class.
OCTANT TableInSitu data reader.
Parameters
----------
sensor_type
Type of the in situ sensor.
sensor_name
Name of the in-situ sensor.
ges_table_dir
Path of the GES_TABLE_DIR to use.
date_start
Starting date of the interval we're working on.
date_end
Ending date of the interval we're working on.
select_clip
Selection clip allowing to work on a subset of the source's data.
select_shape
Shape file, GeoDataFrame or Geometry to select.
data_cleaner
Data cleaning applied just after the reader.
This cleaning might consist of sorting, duplication removal or removing indexes
in order to keep them increasing.
orf
Source's indexer.
reference_track
Reference track.
time
Time field.
longitude
Longitude field.
latitude
Latitude field.
In-situ tables can also be read by providing a dictionary containing the
sensor_type and sensor_name keys to the source parameter.
ZCollection datasets
The ZDatasetReader class allows working with ZCollection datasets.
from casys import SwathData
from casys.readers import ZDatasetReader
import os
# Instantiate your Zdataset reader
reader = ZDatasetReader(
data=zds,
time="time",
longitude="LONGITUDE",
latitude="LATITUDE",
)
# Create your SwathData object
ad_ds = SwathData(source=reader)
For more details, see the ZDatasetReader class.
Reader for the zcollection.Dataset format.
Parameters
----------
data
ZCollection Dataset.
data_path
Zcollection path.
backend_fields
List of fields (variables) to read.
backend_kwargs
kwargs to provide to the backend when using a data_path to load the data.
date_start
Starting date of the interval we're working on.
date_end
Ending date of the interval we're working on.
select_clip
Selection clip allowing to work on a subset of the source's data.
select_shape
Shape file, GeoDataFrame or Geometry to select.
data_cleaner
Data cleaning applied just after the reader.
This cleaning might consist of sorting, duplication removal or removing indexes
in order to keep them increasing.
orf
Source's indexer.
reference_track
Reference track.
time
Time field.
longitude
Longitude field.
latitude
Latitude field.
swath_lines
Swath main dimension.
swath_pixels
Swath cross_track dimension.
cycle_number
The cycle number field.
pass_number
The pass number field.
longitude_nadir
The nadir's longitude field.
latitude_nadir
The nadir's latitude field.
cross_track_distance
Cross track distance field.
ZCollection collections
The ZCollectionReader class allows working with ZCollection collections.
from casys import SwathData
from casys.readers import ZCollectionReader
import os
# Instantiate your Zcollection reader
reader = ZCollectionReader(
data_path=os.path.join(os.environ["RESOURCES_DIR"], "my_collection"),
time="time",
longitude="longitude",
latitude="latitude",
)
# Create your SwathData object
ad_ds = SwathData(source=reader)
For more details, see the ZCollectionReader class.
Reader for a ZCollection Collection.
Parameters
----------
collection
Collection.
data_path
Collection path.
backend_fields
List of fields (variables) to read.
backend_kwargs
Kwargs dictionary to pass to the underlying collection.
date_start
Starting date of the interval we're working on.
date_end
Ending date of the interval we're working on.
select_clip
Selection clip allowing to work on a subset of the source's data.
select_shape
Shape file, GeoDataFrame or Geometry to select.
data_cleaner
Data cleaning applied just after the reader.
This cleaning might consist of sorting, duplication removal or removing indexes
in order to keep them increasing.
orf
Source's indexer.
reference_track
Reference track.
time
Time field.
longitude
Longitude field.
latitude
Latitude field.
swath_lines
Swath main dimension.
swath_pixels
Swath cross_track dimension.
cycle_number
The cycle number field.
pass_number
The pass number field.
longitude_nadir
The nadir's longitude field.
latitude_nadir
The nadir's latitude field.
cross_track_distance
Cross track distance field.
Swot_calval collections
The ScCollectionReader class allows working with swot_calval collections.
from casys import SwathData
from casys.readers import ScCollectionReader
import os
# Instantiate your Sc Collection reader
reader = ScCollectionReader(
data_path=os.path.join(os.environ["RESOURCES_DIR"], "my_collection"),
time="time",
longitude="longitude",
latitude="latitude",
)
# Create your SwathData object
ad_ds = SwathData(source=reader)
For more details, see the ScCollectionReader class.
Reader for a swot_calval Collection.
Parameters
----------
collection
Collection.
data_path
Collection path.
backend_fields
List of fields (variables) to read.
backend_kwargs
Kwargs dictionary to pass to the underlying collection.
date_start
Starting date of the interval we're working on.
date_end
Ending date of the interval we're working on.
select_clip
Selection clip allowing to work on a subset of the source's data.
select_shape
Shape file, GeoDataFrame or Geometry to select.
data_cleaner
Data cleaning applied just after the reader.
This cleaning might consist of sorting, duplication removal or removing indexes
in order to keep them increasing.
orf
Source's indexer.
reference_track
Reference track.
time
Time field.
longitude
Longitude field.
latitude
Latitude field.
swath_lines
Swath main dimension.
swath_pixels
Swath cross_track dimension.
cycle_number
The cycle number field.
pass_number
The pass number field.
longitude_nadir
The nadir's longitude field.
latitude_nadir
The nadir's latitude field.
cross_track_distance
Cross track distance field.
Multi-readers
The MultiReader class allows working on a set of readers as if they were a single
data source. Multi-readers' fields can use fields from any of their readers as
their source.
Warning
* Any index not present in the reference's reader will be ignored.
* Reference indexes missing from a reader's data are filled with numpy.nan.
* Reader indexes missing from the reference's data are ignored.
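These alignment rules can be sketched in plain Python; align_on_reference below is a hypothetical helper written for illustration, not part of casys:

```python
import math

def align_on_reference(reference_index, reader_data):
    """Align a reader's {index: value} mapping on the reference index.

    Reference indexes missing from the reader are filled with NaN; reader
    indexes absent from the reference are dropped (illustrative helper).
    """
    return {idx: reader_data.get(idx, math.nan) for idx in reference_index}

reference = [0, 1, 2, 3]
reader = {1: 10.0, 2: 20.0, 5: 50.0}  # index 5 is not in the reference

aligned = align_on_reference(reference, reader)
```

Indexes 0 and 3 end up as NaN, while the reader's extra index 5 disappears from the result, mirroring the warning above.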
Reader allowing to read from a set of readers.
The first reader is used as reference (for time and coordinates).
Fields from all readers are available and prefixed by the provided markers.
Parameters
----------
readers
List of readers.
markers
List of field's prefixes for each reader.
Defaults to ``Sx_`` with x being the reader's number.
tolerance
Gap tolerance used to fill missing indexes from a reader when aligning it
on the reference reader's index (defaults to 0).
date_start
Starting date of the interval we're working on.
date_end
Ending date of the interval we're working on.
select_clip
Selection clip allowing to work on a subset of the source's data.
select_shape
Shape file, GeoDataFrame or Geometry to select.
data_cleaner
Data cleaning applied just after the reader.
This cleaning might consist of sorting, duplication removal or removing indexes
in order to keep them increasing.
orf
Source's indexer.
reference_track
Reference track.
time
Time field.
longitude
Longitude field.
latitude
Latitude field.
A MultiReader can be used:
* when working with multiple sources having the same (or close enough) indexes
* when working with multiple sources interpolated on the same reference track
Example: Sentinel 6 HR vs LR
This example uses a MultiReader in order to work with fields coming from the
Sentinel 6 HR and LR storage.
from casys import NadirData, DateHandler, Field
from casys.readers import MultiReader, CLSTableReader
start = DateHandler.from_orf(orf="C_S6A_LR", cycle_nb=41, pass_nb=1, pos="first")
end = DateHandler.from_orf(orf="C_S6A_LR", cycle_nb=44, pass_nb=254, pos="last")
r1 = CLSTableReader(name="TABLE_C_S6A_LR_B")
r2 = CLSTableReader(name="TABLE_C_S6A_HR_B")
ad = NadirData(
source=MultiReader(
readers=[r1, r2],
markers=["LR_", "HR_"],
date_start=start,
date_end=end,
orf="C_S6A_LR",
time="time",
longitude="LONGITUDE",
latitude="LATITUDE",
)
)
ad.show_fields(containing="RANGE.ALTI")
Name | Description | Unit |
---|---|---|
LR_RANGE.ALTI | All instrumental corrections included, i.e. distance antenna-COG, USO drift correction, internal path correction, Doppler corre | m |
LR_RANGE.ALTI.B2 | All instrumental corrections included, i.e. distance antenna-COG, USO drift correction, internal path correction, Doppler corre | m |
LR_RANGE.ALTI.CORR_GEO | The geographical correction parameter provides the range correction for the acrosstrack shift induced geographical variations. | m |
LR_RANGE.ALTI.CORR_GEO_MLE3 | The geographical correction parameter provides the range correction for the acrosstrack shift induced geographical variations. | m |
LR_RANGE.ALTI.CORR_GEO_NR | The geographical correction parameter provides the range correction for the acrosstrack shift induced geographical variations. | m |
LR_RANGE.ALTI.MLE3 | All instrumental corrections included, i.e. distance antenna-COG, USO drift correction, internal path correction, Doppler corre | m |
LR_RANGE.ALTI.NR | All instrumental corrections included, i.e. distance antenna-COG, USO drift correction, internal path correction, Doppler corre | m |
HR_RANGE.ALTI | All instrumental corrections included, i.e. distance antenna-COG, USO drift correction, internal path correction, Doppler corre | m |
HR_RANGE.ALTI.CORR_GEO | The geographical correction parameter provides the range correction for the acrosstrack shift induced geographical variations. | m |
HR_RANGE.ALTI.CORR_GEO_NR | The geographical correction parameter provides the range correction for the acrosstrack shift induced geographical variations. | m |
HR_RANGE.ALTI.NR | All instrumental corrections included, i.e. distance antenna-COG, USO drift correction, internal path correction, Doppler corre | m |
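The LR_/HR_ prefixes above replace the default markers (Sx_ with x the reader's number). The prefixing scheme itself is simple and can be sketched as follows; default_markers and prefix_fields are illustrative helpers, not casys functions:

```python
def default_markers(n_readers):
    """Default marker for reader number x is 'Sx_' (1-based)."""
    return [f"S{i}_" for i in range(1, n_readers + 1)]

def prefix_fields(fields, marker):
    """Prefix every field name of a reader with its marker."""
    return [marker + name for name in fields]

markers = default_markers(2)                     # default markers for 2 readers
lr = prefix_fields(["RANGE.ALTI"], "LR_")        # custom marker, as in the example
```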
Example: Sentinel 6 vs Jason 3
This example uses a MultiReader in order to work with fields coming from Sentinel 6
and Jason 3, interpolated on a common reference track.
from casys import NadirData, DateHandler, Field
from casys.readers import MultiReader, CLSTableReader
start_s6 = DateHandler.from_orf(orf="C_S6A_LR", cycle_nb=44, pass_nb=1, pos="first")
end_s6 = DateHandler.from_orf(orf="C_S6A_LR", cycle_nb=44, pass_nb=254, pos="last")
start_j3 = DateHandler.from_orf(orf="C_J3", cycle_nb=219, pass_nb=1, pos="first")
end_j3 = DateHandler.from_orf(orf="C_J3", cycle_nb=219, pass_nb=254, pos="last")
# Each reader has its own ORF (important for the along track interpolation)
r1 = CLSTableReader(
name="TABLE_C_S6A_LR_B", orf="C_S6A_LR", date_start=start_s6, date_end=end_s6
)
r2 = CLSTableReader(
name="TABLE_C_J3_B", orf="C_J3", date_start=start_j3, date_end=end_j3
)
ad = NadirData(
source=MultiReader(
readers=[r1, r2],
markers=["S6_", "J3_"],
time="time",
longitude="LONGITUDE",
latitude="LATITUDE",
reference_track="J3",
)
)
ad.show_fields(containing="RANGE.ALTI")
Name | Description | Unit |
---|---|---|
S6_RANGE.ALTI | All instrumental corrections included, i.e. distance antenna-COG, USO drift correction, internal path correction, Doppler corre | m |
S6_RANGE.ALTI.B2 | All instrumental corrections included, i.e. distance antenna-COG, USO drift correction, internal path correction, Doppler corre | m |
S6_RANGE.ALTI.CORR_GEO | The geographical correction parameter provides the range correction for the acrosstrack shift induced geographical variations. | m |
S6_RANGE.ALTI.CORR_GEO_MLE3 | The geographical correction parameter provides the range correction for the acrosstrack shift induced geographical variations. | m |
S6_RANGE.ALTI.CORR_GEO_NR | The geographical correction parameter provides the range correction for the acrosstrack shift induced geographical variations. | m |
S6_RANGE.ALTI.MLE3 | All instrumental corrections included, i.e. distance antenna-COG, USO drift correction, internal path correction, Doppler corre | m |
S6_RANGE.ALTI.NR | All instrumental corrections included, i.e. distance antenna-COG, USO drift correction, internal path correction, Doppler corre | m |
J3_RANGE.ALTI | All instrumental corrections included, i.e. distance antenna-COG, USO drift correction, internal path correction, Doppler corre | m |
J3_RANGE.ALTI.ADAPTIVE | All instrumental corrections included, i.e. distance antenna-COG (cog_corr), USO drift correction (uso_corr), internal path cor | m |
J3_RANGE.ALTI.B2 | All instrumental corrections included, i.e. distance antenna-COG, USO drift correction, internal path correction, Doppler corre | m |
J3_RANGE.ALTI.MLE3 | All instrumental corrections included, i.e. distance antenna-COG, USO drift correction, internal path correction, Doppler corre | m |
Data selection
Using clip conditions
Data can be selected using a clip condition through the select_clip parameter.
from casys import NadirData, DateHandler
# Selecting data having:
# 27 <= LATITUDE <= 46 and -10 <= LONGITUDE <= 40
# with LONGITUDE being normalized between -180, 180
selection = """is_bounded(27, LATITUDE, 46) &&
is_bounded(-10, deg_normalize(-180, LONGITUDE), 40)"""
ad_sel = NadirData(
date_start=DateHandler.from_orf("C_J3_GDRD", 125, 1, pos="first"),
date_end=DateHandler.from_orf("C_J3_GDRD", 125, 254, pos="last"),
source="TABLE_C_J3_B_GDRD",
orf="C_J3_GDRD",
time="time",
longitude="LONGITUDE",
latitude="LATITUDE",
select_clip=selection
)
ad_sel
AltiData | |
---|---|
Source | TABLE_C_J3_B_GDRD |
Source type | CLSTableReader |
ORF | C_J3_GDRD |
Time name | time |
Longitude | LONGITUDE |
Latitude | LATITUDE |
Period start | 2019-06-30 23:26:04.865308 (Cycle: 125 - Pass: 1) |
Period end | 2019-07-10 21:24:35.317009 (Cycle: 125 - Pass: 254) |
Selection clip | is_bounded(27, LATITUDE, 46) && is_bounded(-10, deg_normalize(-180, LONGITUDE), 40) |
Selection shape | False |
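The semantics of the is_bounded and deg_normalize clip functions used above can be reproduced in plain Python. These are illustrative re-implementations of the documented behavior, not the CLIP engine itself:

```python
def deg_normalize(origin, lon):
    """Normalize a longitude into [origin, origin + 360)."""
    return (lon - origin) % 360 + origin

def is_bounded(low, value, high):
    """True when low <= value <= high."""
    return low <= value <= high

# Same selection as the clip above, applied to a single point:
lat, lon = 35.0, 350.0  # 350 deg East is -10 deg once normalized to [-180, 180)
selected = is_bounded(27, lat, 46) and is_bounded(
    -10, deg_normalize(-180, lon), 40
)
```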
Using shape files
Data can be selected on a geographical area using the select_shape parameter.
from casys import NadirData, DateHandler
# Shape file selection
import os
import geopandas as gpd
shape = gpd.read_file(os.path.join(os.environ["RESOURCES_DIR"], "Med", "Med.shp"))
shape = shape.set_crs(crs="EPSG:4326")
ad_sel = NadirData(
date_start=DateHandler.from_orf("C_J3_GDRD", 125, 1, pos="first"),
date_end=DateHandler.from_orf("C_J3_GDRD", 125, 254, pos="last"),
source="TABLE_C_J3_B_GDRD",
orf="C_J3_GDRD",
time="time",
longitude="LONGITUDE",
latitude="LATITUDE",
select_shape=shape
)
ad_sel
AltiData | |
---|---|
Source | TABLE_C_J3_B_GDRD |
Source type | CLSTableReader |
ORF | C_J3_GDRD |
Time name | time |
Longitude | LONGITUDE |
Latitude | LATITUDE |
Period start | 2019-06-30 23:26:04.865308 (Cycle: 125 - Pass: 1) |
Period end | 2019-07-10 21:24:35.317009 (Cycle: 125 - Pass: 254) |
Selection clip | is_bounded(29.133297406790106, LATITUDE, 46.17221385443534) && is_bounded(-5.632114355511645, deg_normalize(-180, LONGITUDE), 37.53314064518961) |
Selection shape | True |
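Note that in the summary above, the shape selection also produced a bounding-box clip derived from the shape's extent. That idea can be sketched as follows; build_bbox_clip is a hypothetical helper written for illustration, not casys code:

```python
def build_bbox_clip(lon_min, lat_min, lon_max, lat_max):
    """Build a CLIP string bounding a shape's extent, as in the summary above."""
    return (
        f"is_bounded({lat_min}, LATITUDE, {lat_max}) && "
        f"is_bounded({lon_min}, deg_normalize(-180, LONGITUDE), {lon_max})"
    )

# Bounding box of a small polygon given as (lon, lat) pairs
polygon = [(-5.6, 29.1), (37.5, 29.1), (37.5, 46.2), (-5.6, 46.2)]
lons = [p[0] for p in polygon]
lats = [p[1] for p in polygon]
clip = build_bbox_clip(min(lons), min(lats), max(lons), max(lats))
```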
Along track data interpolation
The reference_track parameter enables every piece of data read by NadirData to be
interpolated on the provided track. It can be set in several ways:
* by giving the name of an existing reference track (these tracks can be listed
using the show_theoretical_tracks() method)
* by giving the path of a reference track netCDF file
* by providing a TheoreticalTrack
* by providing a StandardAlongTrack
from casys import NadirData, DateHandler
ad_interp = NadirData(
date_start=DateHandler.from_orf("C_J3_GDRD", 122, 1, pos="first"),
date_end=DateHandler.from_orf("C_J3_GDRD", 122, 1, pos="last"),
source="TABLE_C_J3_B_GDRD",
orf="C_J3_GDRD",
reference_track="J3",
time="time",
longitude="LONGITUDE",
latitude="LATITUDE",
)
The interpolation method can be chosen for each field through its interpolation
parameter.
from casys import Field
var_sla_linear = Field(
name="SLA_linear",
source="ORBIT.ALTI - RANGE.ALTI - MEAN_SEA_SURFACE.MODEL.CNESCLS15",
unit="m",
interpolation="linear",
)
var_sla_nearest = Field(
name="SLA_nearest",
source="ORBIT.ALTI - RANGE.ALTI - MEAN_SEA_SURFACE.MODEL.CNESCLS15",
unit="m",
interpolation="nearest",
)
var_sla_spline = Field(
name="SLA_spline",
source="ORBIT.ALTI - RANGE.ALTI - MEAN_SEA_SURFACE.MODEL.CNESCLS15",
unit="m",
interpolation={"mode": "smoothing_spline", "noise_level": 0.1},
)
ad_interp.add_raw_data(name="SLA linear", field=var_sla_linear)
ad_interp.add_histogram(name="SLA nearest hist", x=var_sla_nearest, res_x="auto")
ad_interp.add_time_stat(
name="SLA spline (10 minutes)", field=var_sla_spline, freq="10min"
)
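The difference between the linear and nearest interpolation modes used above can be illustrated in plain Python; interp_linear and interp_nearest are illustrative helpers, not the casys implementation:

```python
import bisect

def interp_nearest(x, xs, ys):
    """Nearest-neighbour interpolation on sorted abscissas xs."""
    i = bisect.bisect_left(xs, x)
    if i == 0:
        return ys[0]
    if i == len(xs):
        return ys[-1]
    # Pick whichever neighbour is closer to x
    return ys[i] if (xs[i] - x) < (x - xs[i - 1]) else ys[i - 1]

def interp_linear(x, xs, ys):
    """Linear interpolation on sorted abscissas xs."""
    i = bisect.bisect_left(xs, x)
    if i == 0:
        return ys[0]
    if i == len(xs):
        return ys[-1]
    t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + t * (ys[i] - ys[i - 1])

xs, ys = [0.0, 1.0, 2.0], [0.0, 10.0, 20.0]
```

With this track, interp_linear(0.5, xs, ys) blends the neighbouring values while interp_nearest(0.4, xs, ys) snaps to the closest sample.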
Merging NadirData
It is possible to create several NadirData objects, add raw data, compute them,
and finally merge them into a single object able to use both sets of fields.
Note
Alignment can only be made on the "time" dimension. Latitudes and longitudes are
considered to be shared.
If that's not the case, diagnostics making use of latitude and longitude must be
considered invalid.
Merge the provided data container's raw data into the current one.
* If both the provided data and the current data include the INTERPOLATED_INDEX
field, the data are considered already aligned; otherwise the provided data are
interpolated or re-indexed along the time dimension using the provided method.
* Longitudes from the provided data are replaced by the current ones.
* Latitudes from the provided data are replaced by the current ones.
Interpolation uses xarray's interp_like method.
Reindexing uses xarray's reindex_like method.
Parameters
----------
data
Data container object containing computed raw data to merge.
interp
Whether to interpolate (True) or just reindex (False) the data.
method
* Interpolation methods:
* {"linear", "nearest"} for multidimensional arrays
* {"linear", "nearest", "zero", "slinear", "quadratic", "cubic"} for
1-dimensional arrays
* "linear" is used by default
* Reindexing methods:
* None (default): don't fill gaps
* pad / ffill: propagate last valid index value forward
* backfill / bfill: propagate next valid index value backward
* nearest: use the nearest valid index value
kwargs
Additional parameters passed to the underlying xarray function.
* Interpolation options:
* Additional keyword passed to scipy’s interpolator.
* Reindexing options:
* tolerance: Maximum distance between original and new labels for
inexact matches. The values of the index at the matching locations
must satisfy the equation abs(index[indexer] - target) <= tolerance.
* fill_value: Value to use for newly missing values.
See the example presented in this notebook.
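The reindexing methods listed above behave like the following plain-Python sketch; reindex is an illustrative helper mirroring the described semantics, not xarray's implementation:

```python
import math

def reindex(index, data, target, method=None, tolerance=math.inf):
    """Reindex {index: value} data on a new target index.

    method: None (don't fill gaps), 'ffill', 'bfill' or 'nearest'.
    """
    out = []
    for t in target:
        if t in data:
            out.append(data[t])
            continue
        if method is None:
            out.append(math.nan)  # no gap filling
            continue
        if method == "ffill":
            candidates = [i for i in index if i <= t]
            pick = max(candidates) if candidates else None
        elif method == "bfill":
            candidates = [i for i in index if i >= t]
            pick = min(candidates) if candidates else None
        else:  # nearest
            pick = min(index, key=lambda i: abs(i - t))
        if pick is None or abs(pick - t) > tolerance:
            out.append(math.nan)  # outside the allowed tolerance
        else:
            out.append(data[pick])
    return out

index = [0, 2, 4]
data = {0: 1.0, 2: 2.0, 4: 3.0}
values = reindex(index, data, [1, 2, 3], method="ffill")
```

With ffill, targets 1 and 3 take the last valid values (from indexes 0 and 2), while a tight tolerance turns inexact matches into NaN.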