API Reference

intake_xarray.netcdf.NetCDFSource(*args, ...)

Open an xarray file.

intake_xarray.opendap.OpenDapSource(*args, ...)

Open an OPeNDAP source.

intake_xarray.xzarr.ZarrSource(*args, **kwargs)

Open an xarray dataset.

intake_xarray.raster.RasterIOSource(*args, ...)

Open an xarray dataset via RasterIO.

intake_xarray.image.ImageSource(*args, **kwargs)

Open an xarray dataset from image files.

class intake_xarray.netcdf.NetCDFSource(*args, **kwargs)[source]

Open an xarray file.

Parameters
urlpath: str or List[str]

Path to the source file. May include glob “*” characters or format pattern strings, or be a list of paths. Some examples:

  • {{ CATALOG_DIR }}/data/air.nc

  • {{ CATALOG_DIR }}/data/*.nc

  • {{ CATALOG_DIR }}/data/air_{year}.nc

chunks: int or dict, optional

Chunks is used to load the new dataset into dask arrays. chunks={} loads the dataset with dask using a single chunk for all arrays.

combine: {‘by_coords’, ‘nested’}, optional

Which function is used to concatenate the files when urlpath contains a wildcard. It is recommended to set this argument explicitly in all your catalogs, because the default has changed and will change again: it was ‘nested’, it currently follows the default of xarray.open_mfdataset (‘auto’), and it is planned to change to ‘by_coords’ in the near future.

concat_dim: str, optional

Name of dimension along which to concatenate the files. Can be new or pre-existing if combine is “nested”. Must be None or new if combine is “by_coords”.

path_as_pattern: bool or str, optional

Whether to treat the path as a pattern (e.g., data_{field}.nc) and create new coordinates in the output corresponding to the pattern fields. If a str, it is treated as the pattern to match on. Default is True.

xarray_kwargs: dict

Additional xarray kwargs for xr.open_dataset() or xr.open_mfdataset().

storage_options: dict

If using a remote filesystem (whether caching locally or not), these are the kwargs to pass to that filesystem.

Attributes
cache
cache_dirs
cat
classname
description
dtype
entry
gui

Source GUI, with parameter selection and plotting

has_been_persisted
hvplot

Returns a hvPlot object to provide a high-level plotting API.

is_persisted
path_as_pattern
pattern
plot

Returns a hvPlot object to provide a high-level plotting API.

plots

List custom associated quick-plots

shape
urlpath

Methods

__call__(**kwargs)

Create a new instance of this source with altered arguments

close()

Delete open file from memory

configure_new(**kwargs)

Create a new instance of this source with altered arguments

describe()

Description from the entry spec

discover()

Open resource and populate the source attributes.

export(path, **kwargs)

Save this data for sharing with other people

get(**kwargs)

Create a new instance of this source with altered arguments

persist([ttl])

Save data from this source to local persistent storage

read()

Return a version of the xarray with all the data in memory

read_chunked()

Return xarray object (which will have chunks)

read_partition(i)

Fetch one chunk of data at tuple index i

to_dask()

Return xarray object where variables are dask arrays

to_spark()

Provide an equivalent data object in Apache Spark

yaml()

Return YAML representation of this data-source

get_persisted

set_cache_dir

class intake_xarray.opendap.OpenDapSource(*args, **kwargs)[source]

Open an OPeNDAP source.

Parameters
urlpath: str

Path to source file.

chunks: None, int or dict

Chunks is used to load the new dataset into dask arrays. chunks={} loads the dataset with dask using a single chunk for all arrays.

auth: None, “esgf”, “urs” or “generic_http”

Method of authenticating to the OPeNDAP server. Choose one of the following:

  • None - [default] anonymous access

  • ‘esgf’ - Earth System Grid Federation

  • ‘urs’ - NASA Earthdata Login, also known as URS

  • ‘generic_http’ - OPeNDAP servers which support plain HTTP authentication

For the authenticated options, set your username and password via the environment variables DAP_USER and DAP_PASSWORD.

engine: str

Engine used for reading OPeNDAP URL. Should be one of ‘pydap’ or ‘netcdf4’.

Attributes
cache
cache_dirs
cat
classname
description
dtype
entry
gui

Source GUI, with parameter selection and plotting

has_been_persisted
hvplot

Returns a hvPlot object to provide a high-level plotting API.

is_persisted
plot

Returns a hvPlot object to provide a high-level plotting API.

plots

List custom associated quick-plots

shape

Methods

__call__(**kwargs)

Create a new instance of this source with altered arguments

close()

Delete open file from memory

configure_new(**kwargs)

Create a new instance of this source with altered arguments

describe()

Description from the entry spec

discover()

Open resource and populate the source attributes.

export(path, **kwargs)

Save this data for sharing with other people

get(**kwargs)

Create a new instance of this source with altered arguments

persist([ttl])

Save data from this source to local persistent storage

read()

Return a version of the xarray with all the data in memory

read_chunked()

Return xarray object (which will have chunks)

read_partition(i)

Fetch one chunk of data at tuple index i

to_dask()

Return xarray object where variables are dask arrays

to_spark()

Provide an equivalent data object in Apache Spark

yaml()

Return YAML representation of this data-source

get_persisted

set_cache_dir

class intake_xarray.xzarr.ZarrSource(*args, **kwargs)[source]

Open an xarray dataset.

Parameters
urlpath: str

Path to the source. This can be a local directory or a remote data service (i.e., with a protocol specifier like ‘s3://’).

storage_options: dict

Parameters passed to the backend file-system

kwargs:

Further parameters are passed to xr.open_zarr

Attributes
cache
cache_dirs
cat
classname
description
dtype
entry
gui

Source GUI, with parameter selection and plotting

has_been_persisted
hvplot

Returns a hvPlot object to provide a high-level plotting API.

is_persisted
plot

Returns a hvPlot object to provide a high-level plotting API.

plots

List custom associated quick-plots

shape

Methods

__call__(**kwargs)

Create a new instance of this source with altered arguments

close()

Delete open file from memory

configure_new(**kwargs)

Create a new instance of this source with altered arguments

describe()

Description from the entry spec

discover()

Open resource and populate the source attributes.

export(path, **kwargs)

Save this data for sharing with other people

get(**kwargs)

Create a new instance of this source with altered arguments

persist([ttl])

Save data from this source to local persistent storage

read()

Return a version of the xarray with all the data in memory

read_chunked()

Return xarray object (which will have chunks)

read_partition(i)

Fetch one chunk of data at tuple index i

to_dask()

Return xarray object where variables are dask arrays

to_spark()

Provide an equivalent data object in Apache Spark

yaml()

Return YAML representation of this data-source

get_persisted

set_cache_dir

close()[source]

Delete open file from memory

class intake_xarray.raster.RasterIOSource(*args, **kwargs)[source]

Open an xarray dataset via RasterIO.

This creates an xarray.DataArray, not a Dataset (i.e., there is exactly one variable).

See https://rasterio.readthedocs.io/en/latest/ for the supported file formats, particularly GeoTIFF, and http://xarray.pydata.org/en/stable/generated/xarray.open_rasterio.html#xarray.open_rasterio for possible extra arguments.

Parameters
urlpath: str or iterable, location of data

May be a local path, or a remote path if a protocol specifier such as ‘s3://’ is included. May include glob wildcards or format pattern strings. Must be a format supported by rasterio (normally GeoTIFF). Some examples:

  • {{ CATALOG_DIR }}data/RGB.tif

  • s3://data/*.tif

  • s3://data/landsat8_band{band}.tif

  • s3://data/{location}/landsat8_band{band}.tif

  • {{ CATALOG_DIR }}data/landsat8_{start_date:%Y%m%d}_band{band}.tif

chunks: None or int or dict, optional

Chunks is used to load the new dataset into dask arrays. chunks={} loads the dataset with dask using a single chunk for all arrays. The default, None, loads numpy arrays.

path_as_pattern: bool or str, optional

Whether to treat the path as a pattern (e.g., data_{field}.tif) and create new coordinates in the output corresponding to the pattern fields. If a str, it is treated as the pattern to match on. Default is True.

Attributes
cache
cache_dirs
cat
classname
description
dtype
entry
gui

Source GUI, with parameter selection and plotting

has_been_persisted
hvplot

Returns a hvPlot object to provide a high-level plotting API.

is_persisted
path_as_pattern
pattern
plot

Returns a hvPlot object to provide a high-level plotting API.

plots

List custom associated quick-plots

shape
urlpath

Methods

__call__(**kwargs)

Create a new instance of this source with altered arguments

close()

Delete open file from memory

configure_new(**kwargs)

Create a new instance of this source with altered arguments

describe()

Description from the entry spec

discover()

Open resource and populate the source attributes.

export(path, **kwargs)

Save this data for sharing with other people

get(**kwargs)

Create a new instance of this source with altered arguments

persist([ttl])

Save data from this source to local persistent storage

read()

Return a version of the xarray with all the data in memory

read_chunked()

Return xarray object (which will have chunks)

read_partition(i)

Fetch one chunk of data at tuple index i

to_dask()

Return xarray object where variables are dask arrays

to_spark()

Provide an equivalent data object in Apache Spark

yaml()

Return YAML representation of this data-source

get_persisted

set_cache_dir

class intake_xarray.image.ImageSource(*args, **kwargs)[source]

Open an xarray dataset from image files.

This creates an xarray.DataArray or an xarray.Dataset. See http://scikit-image.org/docs/dev/api/skimage.io.html#skimage.io.imread for the file formats supported.

NOTE: Although skimage.io.imread is used by default, any reader function which accepts a file object and outputs a numpy array can be used instead.

Parameters
urlpath: str or iterable, location of data

May be a local path, or a remote path if a protocol specifier such as ‘s3://’ is included. May include glob wildcards or format pattern strings. Must be a format supported by skimage.io.imread or the user-supplied imread. Some examples:

  • {{ CATALOG_DIR }}/data/RGB.tif

  • s3://data/*.jpeg

  • https://example.com/image.png

  • s3://data/Images/{{ landuse }}/{{ '%02d' % id }}.tif

chunks: int or dict

Chunks is used to load the new dataset into dask arrays. chunks={} loads the dataset with dask using a single chunk for all arrays.

path_as_pattern: bool or str, optional

Whether to treat the path as a pattern (e.g., data_{field}.tif) and create new coordinates in the output corresponding to the pattern fields. If a str, it is treated as the pattern to match on. Default is True.

concat_dim: str or iterable

Dimension over which to concatenate. If iterable, all fields must be part of the pattern.

imread: function, optional

Optionally provide a custom imread function. The function should expect a file object and produce a numpy array. Defaults to skimage.io.imread.

preprocess: function, optional

Optionally provide a custom function to preprocess the image. The function should expect a numpy array for a single image and return a numpy array.

coerce_shape: iterable of length 2, optional

Optionally coerce the height and width of the image to the desired shape.

Attributes
cache
cache_dirs
cat
classname
description
dtype
entry
gui

Source GUI, with parameter selection and plotting

has_been_persisted
hvplot

Returns a hvPlot object to provide a high-level plotting API.

is_persisted
path_as_pattern
pattern
plot

Returns a hvPlot object to provide a high-level plotting API.

plots

List custom associated quick-plots

shape
urlpath

Methods

__call__(**kwargs)

Create a new instance of this source with altered arguments

close()

Delete open file from memory

configure_new(**kwargs)

Create a new instance of this source with altered arguments

describe()

Description from the entry spec

discover()

Open resource and populate the source attributes.

export(path, **kwargs)

Save this data for sharing with other people

get(**kwargs)

Create a new instance of this source with altered arguments

persist([ttl])

Save data from this source to local persistent storage

read()

Return a version of the xarray with all the data in memory

read_chunked()

Return xarray object (which will have chunks)

read_partition(i)

Fetch one chunk of data at tuple index i

to_dask()

Return xarray object where variables are dask arrays

to_spark()

Provide an equivalent data object in Apache Spark

yaml()

Return YAML representation of this data-source

get_persisted

set_cache_dir