Processor

class spectral_indices.Processor(source: ~spectral_indices.sources.sources.DataSource, indices: ~spectral_indices.idb.models.Index | str | ~typing.List[~spectral_indices.idb.models.Index | str], timestamps: str | ~datetime.date | ~typing.List[str | ~datetime.date], rois: ~spectral_indices.roi.bbox.BoundingBox | ~typing.List[~spectral_indices.roi.bbox.BoundingBox], pipeline: ~spectral_indices.pipeline.pipeline.Pipeline, filter: ~spectral_indices.filter.Filter = <spectral_indices.filter.Filter object>, maskings: ~typing.Dict[str, ~typing.List[int]] = {}, chunk_size=500, n_workers=1, processes=1, worker_memory_limit='24GB', threads=1)[source]

Processor wraps all the information needed to process data. This is the main class of the library.

Parameters:
  • source (DataSource) – Source of data to collect data from.

  • indices (Union[Index, str, List[Union[Index, str]]]) – Indices to compute. Pass index short names (str) to retrieve indices from the database.

  • timestamps (Union[str, date, List[Union[str, date]]]) – Date or date range of the data to collect.

  • rois (Union[BoundingBox, List[BoundingBox]]) – Region(s) of interest.

  • pipeline (Pipeline) – Pipeline to apply on data.

  • filter (Filter, optional) – Filter to apply on itemCollection. Defaults to Filter().

  • maskings (Dict[str, List[int]], optional) – Mask bands to apply, as a mapping from band name to the pixel values to mask out. Defaults to {}.

  • chunk_size (int, optional) – Size of the chunks processed by each thread. A value of x represents a chunk of shape (x, x) in the spatial dimensions (height, width). Defaults to 500.

  • n_workers (int, optional) – Number of workers to use. Defaults to 1.

  • processes (int, optional) – Number of processes to launch. Defaults to 1.

  • worker_memory_limit (str, optional) – Maximum memory allowed for a worker. Defaults to “24GB”.

  • threads (int, optional) – Number of threads per worker. Defaults to 1.
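
A minimal construction sketch; the concrete DataSource, the "NDVI" short name, the BoundingBox arguments, and the empty Pipeline are illustrative assumptions, not part of this reference:

    from spectral_indices import Processor
    from spectral_indices.pipeline.pipeline import Pipeline
    from spectral_indices.roi.bbox import BoundingBox

    source = ...                      # any concrete DataSource implementation
    roi = BoundingBox(...)            # spatial extent (constructor arguments assumed)

    processor = Processor(
        source=source,
        indices="NDVI",               # short name resolved from the index database
        timestamps=["2022-01-01", "2022-12-31"],
        rois=roi,
        pipeline=Pipeline(),          # transformations to apply (assumed constructible empty)
        chunk_size=500,               # each chunk covers a 500 x 500 spatial window
        n_workers=1,
        worker_memory_limit="24GB",
    )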

apply_masks(array: DataArray) DataArray[source]

Apply masks to the data array.

Parameters:

array (DataArray) – Data array with mask bands.

Returns:

  • Masked data array.

Return type:

DataArray
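
A usage sketch, assuming get_data() returns the mask bands referenced by the maskings mapping:

    raw = processor.get_data()          # lazy array including mask bands
    clean = processor.apply_masks(raw)  # pixels flagged by the maskings mapping are masked out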

default_cluster_client() Tuple[LocalCluster, Client][source]

Build the default cluster and client based on dask.distributed.
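
Since launch() accepts a custom cluster and client, a sketch of building them explicitly, for example to inspect the dashboard before running:

    cluster, client = processor.default_cluster_client()
    print(client.dashboard_link)    # dask.distributed dashboard URL
    result = processor.launch(cluster=cluster, client=client)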

get_collection() ItemCollection[source]

Get item collection and apply filters.

get_data() DataArray[source]

Get the queried data as a lazy xarray DataArray.

Returns:

  • Lazy DataArray containing the queried data.

Return type:

DataArray
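
Because the array is lazy, inspecting it costs nothing; a sketch (the "time" dimension name is an assumption):

    data = processor.get_data()
    print(data.dims, data.shape)   # metadata only, no pixels fetched
    first = data.isel(time=0)      # still lazy
    first = first.compute()        # materialises just this slice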

graph(array: DataArray) DataArray[source]

Apply the transformation graph to the array and return the lazily processed array.

Parameters:
  • array (DataArray) – DataArray to apply the pipeline on. The transformation context (all objects that could be useful to the transformations being applied) defaults to {}.

Returns:

  • Lazy processed array.

Return type:

DataArray
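
get_data() and graph() compose into a manual run of the pipeline; whether launch() inserts apply_masks() between the two steps is an assumption here:

    data = processor.get_data()           # lazy query result
    masked = processor.apply_masks(data)  # remove flagged pixels (assumed intermediate step)
    processed = processor.graph(masked)   # still lazy
    result = processed.compute()          # triggers the actual computation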

launch(save: str = '', nodata=-1000, compute: bool = False, cluster: Any = None, client: Any = None) DataArray[source]

Run the pipeline.

Parameters:
  • save (str, optional) – Path to save the pipeline output. If provided, the SaveRaster transformation is applied and the array is therefore computed. Defaults to "".

  • nodata (int, optional) – Value used to fill NaN values when writing the raster. Defaults to -1000.

  • compute (bool, optional) – Whether to compute the result array. Defaults to False. Warning: if both save and compute are set, the array is computed twice, once during SaveRaster and once before returning the result, which may be inefficient.

  • cluster (Any, optional) – Custom cluster on which to run the pipeline. Defaults to None (a dask LocalCluster will be used).

  • client (Any, optional) – Custom client with which to run the pipeline. Defaults to None (a dask Client will be used).

Returns:

  • Result of pipeline as lazy DataArray.

Return type:

DataArray
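
Typical invocations, as a sketch (the output path is illustrative):

    lazy = processor.launch()                       # lazy result, nothing computed
    processor.launch(save="out.tif", nodata=-1000)  # SaveRaster computes and writes once
    array = processor.launch(compute=True)          # computed in memory, nothing saved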