API Reference
- intake_xarray.netcdf.NetCDFSource: Open a xarray file.
- intake_xarray.opendap.OpenDapSource: Open an OPeNDAP source.
- intake_xarray.xzarr.ZarrSource: Open a xarray dataset.
- intake_xarray.raster.RasterIOSource: Open a xarray dataset via RasterIO.
- intake_xarray.image.ImageSource: Open a xarray dataset from image files.
- class intake_xarray.netcdf.NetCDFSource(*args, **kwargs)[source]
Open a xarray file.
- Parameters
- urlpath: str or List[str]
Path to source file. May include glob "*" characters, format pattern strings, or a list of paths. Some examples:
{{ CATALOG_DIR }}/data/air.nc
{{ CATALOG_DIR }}/data/*.nc
{{ CATALOG_DIR }}/data/air_{year}.nc
- chunks: int or dict, optional
Chunks is used to load the new dataset into dask arrays. chunks={} loads the dataset with dask using a single chunk for all arrays.
- combine: {'by_coords', 'nested'}, optional
Which function is used to concatenate all the files when urlpath has a wildcard. It is recommended to set this argument in all your catalogs because the default has changed and will change again: it was "nested", it currently follows the default of xarray.open_mfdataset, which is "auto", and it is planned to change to "by_coords" in the near future.
- concat_dim: str, optional
Name of the dimension along which to concatenate the files. Can be new or pre-existing if combine is "nested". Must be None or new if combine is "by_coords".
- path_as_pattern: bool or str, optional
Whether to treat the path as a pattern (i.e. data_{field}.nc) and create new coordinates in the output corresponding to pattern fields. If a str, it is treated as the pattern to match on. Default is True.
- xarray_kwargs: dict
Additional keyword arguments passed to xr.open_dataset() or xr.open_mfdataset().
- storage_options: dict
If using a remote filesystem (whether caching locally or not), these are the kwargs to pass to that filesystem.
- Attributes
- cache
- cache_dirs
- cat
- classname
- description
- dtype
- entry
- gui: Source GUI, with parameter selection and plotting
- has_been_persisted
- hvplot: Returns a hvPlot object to provide a high-level plotting API.
- is_persisted
- path_as_pattern
- pattern
- plot: Returns a hvPlot object to provide a high-level plotting API.
- plots: List custom associated quick-plots
- shape
- urlpath
Methods
- __call__(**kwargs): Create a new instance of this source with altered arguments
- close(): Delete open file from memory
- configure_new(**kwargs): Create a new instance of this source with altered arguments
- describe(): Description from the entry spec
- discover(): Open resource and populate the source attributes.
- export(path, **kwargs): Save this data for sharing with other people
- get(**kwargs): Create a new instance of this source with altered arguments
- persist([ttl]): Save data from this source to local persistent storage
- read(): Return a version of the xarray with all the data in memory
- read_chunked(): Return xarray object (which will have chunks)
- read_partition(i): Fetch one chunk of data at tuple index i
- to_dask(): Return xarray object where variables are dask arrays
- to_spark(): Provide an equivalent data object in Apache Spark
- yaml(): Return YAML representation of this data-source
- get_persisted
- set_cache_dir
- class intake_xarray.opendap.OpenDapSource(*args, **kwargs)[source]
Open an OPeNDAP source.
- Parameters
- urlpath: str
Path to source file.
- chunks: None, int or dict
Chunks is used to load the new dataset into dask arrays. chunks={} loads the dataset with dask using a single chunk for all arrays.
- auth: None, "esgf" or "urs"
Method of authenticating to the OPeNDAP server. Choose one of the following: None - [Default] Anonymous access. 'esgf' - Earth System Grid Federation. 'urs' - NASA Earthdata Login, also known as URS. 'generic_http' - OPeNDAP servers which support plain HTTP authentication. Note that for authenticated access you will need to set your username and password using the environment variables DAP_USER and DAP_PASSWORD.
- engine: str
Engine used for reading the OPeNDAP URL. Should be one of 'pydap' or 'netcdf4'.
- Attributes
- cache
- cache_dirs
- cat
- classname
- description
- dtype
- entry
- gui: Source GUI, with parameter selection and plotting
- has_been_persisted
- hvplot: Returns a hvPlot object to provide a high-level plotting API.
- is_persisted
- plot: Returns a hvPlot object to provide a high-level plotting API.
- plots: List custom associated quick-plots
- shape
Methods
- __call__(**kwargs): Create a new instance of this source with altered arguments
- close(): Delete open file from memory
- configure_new(**kwargs): Create a new instance of this source with altered arguments
- describe(): Description from the entry spec
- discover(): Open resource and populate the source attributes.
- export(path, **kwargs): Save this data for sharing with other people
- get(**kwargs): Create a new instance of this source with altered arguments
- persist([ttl]): Save data from this source to local persistent storage
- read(): Return a version of the xarray with all the data in memory
- read_chunked(): Return xarray object (which will have chunks)
- read_partition(i): Fetch one chunk of data at tuple index i
- to_dask(): Return xarray object where variables are dask arrays
- to_spark(): Provide an equivalent data object in Apache Spark
- yaml(): Return YAML representation of this data-source
- get_persisted
- set_cache_dir
- class intake_xarray.xzarr.ZarrSource(*args, **kwargs)[source]
Open a xarray dataset.
- Parameters
- urlpath: str
Path to source. This can be a local directory or a remote data service (i.e., with a protocol specifier like 's3://').
- storage_options: dict
Parameters passed to the backend file-system
- kwargs:
Further parameters are passed to xr.open_zarr
- Attributes
- cache
- cache_dirs
- cat
- classname
- description
- dtype
- entry
- gui: Source GUI, with parameter selection and plotting
- has_been_persisted
- hvplot: Returns a hvPlot object to provide a high-level plotting API.
- is_persisted
- plot: Returns a hvPlot object to provide a high-level plotting API.
- plots: List custom associated quick-plots
- shape
Methods
- __call__(**kwargs): Create a new instance of this source with altered arguments
- close(): Delete open file from memory
- configure_new(**kwargs): Create a new instance of this source with altered arguments
- describe(): Description from the entry spec
- discover(): Open resource and populate the source attributes.
- export(path, **kwargs): Save this data for sharing with other people
- get(**kwargs): Create a new instance of this source with altered arguments
- persist([ttl]): Save data from this source to local persistent storage
- read(): Return a version of the xarray with all the data in memory
- read_chunked(): Return xarray object (which will have chunks)
- read_partition(i): Fetch one chunk of data at tuple index i
- to_dask(): Return xarray object where variables are dask arrays
- to_spark(): Provide an equivalent data object in Apache Spark
- yaml(): Return YAML representation of this data-source
- get_persisted
- set_cache_dir
- class intake_xarray.raster.RasterIOSource(*args, **kwargs)[source]
Open a xarray dataset via RasterIO.
This creates an xarray.DataArray, not a Dataset (i.e., there is exactly one variable).
See https://rasterio.readthedocs.io/en/latest/ for the file formats supported, particularly GeoTIFF, and http://xarray.pydata.org/en/stable/generated/xarray.open_rasterio.html#xarray.open_rasterio for possible extra arguments.
- Parameters
- urlpath: str or iterable, location of data
May be a local path, or a remote path if it includes a protocol specifier such as 's3://'. May include glob wildcards or format pattern strings. Must be a format supported by rasterio (normally GeoTIFF). Some examples:
{{ CATALOG_DIR }}/data/RGB.tif
s3://data/*.tif
s3://data/landsat8_band{band}.tif
s3://data/{location}/landsat8_band{band}.tif
{{ CATALOG_DIR }}/data/landsat8_{start_date:%Y%m%d}_band{band}.tif
- chunks: None, int or dict, optional
Chunks is used to load the new dataset into dask arrays. chunks={} loads the dataset with dask using a single chunk for all arrays. The default None loads numpy arrays.
- path_as_pattern: bool or str, optional
Whether to treat the path as a pattern (i.e. data_{field}.tif) and create new coordinates in the output corresponding to pattern fields. If a str, it is treated as the pattern to match on. Default is True.
- Attributes
- cache
- cache_dirs
- cat
- classname
- description
- dtype
- entry
- gui: Source GUI, with parameter selection and plotting
- has_been_persisted
- hvplot: Returns a hvPlot object to provide a high-level plotting API.
- is_persisted
- path_as_pattern
- pattern
- plot: Returns a hvPlot object to provide a high-level plotting API.
- plots: List custom associated quick-plots
- shape
- urlpath
Methods
- __call__(**kwargs): Create a new instance of this source with altered arguments
- close(): Delete open file from memory
- configure_new(**kwargs): Create a new instance of this source with altered arguments
- describe(): Description from the entry spec
- discover(): Open resource and populate the source attributes.
- export(path, **kwargs): Save this data for sharing with other people
- get(**kwargs): Create a new instance of this source with altered arguments
- persist([ttl]): Save data from this source to local persistent storage
- read(): Return a version of the xarray with all the data in memory
- read_chunked(): Return xarray object (which will have chunks)
- read_partition(i): Fetch one chunk of data at tuple index i
- to_dask(): Return xarray object where variables are dask arrays
- to_spark(): Provide an equivalent data object in Apache Spark
- yaml(): Return YAML representation of this data-source
- get_persisted
- set_cache_dir
- class intake_xarray.image.ImageSource(*args, **kwargs)[source]
Open a xarray dataset from image files.
This creates an xarray.DataArray or an xarray.Dataset. See http://scikit-image.org/docs/dev/api/skimage.io.html#skimage.io.imread for the file formats supported.
NOTE: Although skimage.io.imread is used by default, any reader function which accepts a file object and outputs a numpy array can be used instead.
- Parameters
- urlpath: str or iterable, location of data
May be a local path, or a remote path if it includes a protocol specifier such as 's3://'. May include glob wildcards or format pattern strings. Must be a format supported by skimage.io.imread or the user-supplied imread. Some examples:
{{ CATALOG_DIR }}/data/RGB.tif
s3://data/*.jpeg
https://example.com/image.png
s3://data/Images/{{ landuse }}/{{ '%02d' % id }}.tif
- chunks: int or dict
Chunks is used to load the new dataset into dask arrays. chunks={} loads the dataset with dask using a single chunk for all arrays.
- path_as_pattern: bool or str, optional
Whether to treat the path as a pattern (i.e. data_{field}.tif) and create new coordinates in the output corresponding to pattern fields. If a str, it is treated as the pattern to match on. Default is True.
- concat_dim: str or iterable
Dimension over which to concatenate. If iterable, all fields must be part of the pattern.
- imread: function (optional)
Optionally provide a custom imread function. The function should expect a file object and produce a numpy array. Defaults to skimage.io.imread.
- preprocess: function (optional)
Optionally provide a custom function to preprocess the image. The function should expect a numpy array for a single image and return a numpy array.
- coerce_shape: iterable of len 2 (optional)
Optionally coerce the height and width of the image by setting coerce_shape to the desired shape.
- Attributes
- cache
- cache_dirs
- cat
- classname
- description
- dtype
- entry
- gui: Source GUI, with parameter selection and plotting
- has_been_persisted
- hvplot: Returns a hvPlot object to provide a high-level plotting API.
- is_persisted
- path_as_pattern
- pattern
- plot: Returns a hvPlot object to provide a high-level plotting API.
- plots: List custom associated quick-plots
- shape
- urlpath
Methods
- __call__(**kwargs): Create a new instance of this source with altered arguments
- close(): Delete open file from memory
- configure_new(**kwargs): Create a new instance of this source with altered arguments
- describe(): Description from the entry spec
- discover(): Open resource and populate the source attributes.
- export(path, **kwargs): Save this data for sharing with other people
- get(**kwargs): Create a new instance of this source with altered arguments
- persist([ttl]): Save data from this source to local persistent storage
- read(): Return a version of the xarray with all the data in memory
- read_chunked(): Return xarray object (which will have chunks)
- read_partition(i): Fetch one chunk of data at tuple index i
- to_dask(): Return xarray object where variables are dask arrays
- to_spark(): Provide an equivalent data object in Apache Spark
- yaml(): Return YAML representation of this data-source
- get_persisted
- set_cache_dir