Processor

class spectral_indices.Processor(source: ~spectral_indices.sources.sources.DataSource, indices: ~spectral_indices.idb.models.Index | str | ~typing.List[~spectral_indices.idb.models.Index | str], timestamps: str | ~datetime.date | ~typing.List[str | ~datetime.date], rois: ~spectral_indices.roi.bbox.BoundingBox | ~typing.List[~spectral_indices.roi.bbox.BoundingBox], pipeline: ~spectral_indices.pipeline.pipeline.Pipeline, filter: ~spectral_indices.filter.Filter = <spectral_indices.filter.Filter object>, maskings: ~typing.Dict[str, ~typing.List[int]] = {}, chunk_size=500, n_workers=1, processes=1, worker_memory_limit='24GB', threads=1)[source]

Processor wraps all the information needed to process data. This is the main class of the library.

Parameters:
  • source (DataSource) – Source of data to collect data from.

  • indices (Union[Index, str, List[Union[Index, str]]]) – Indices to compute. Pass index short names (str) to retrieve indices from the database.

  • timestamps (Union[str, date, List[Union[str, date]]]) – Date or date range of the data to collect.

  • rois (Union[BoundingBox, List[BoundingBox]]) – Region(s) of interest.

  • pipeline (Pipeline) – Pipeline to apply on data.

  • filter (Filter, optional) – Filter to apply on itemCollection. Defaults to Filter().

  • maskings (Dict[str, List[int]], optional) – Mask bands to apply, as a mapping from band name to the pixel values to mask out. Defaults to {}.

  • chunk_size (int, optional) – Size of the chunks processed by each thread. A value of x represents a chunk of shape (x, x) in the spatial dimensions (height, width). Defaults to 500.

  • n_workers (int, optional) – Number of workers to use. Defaults to 1.

  • processes (int, optional) – Number of processes to launch. Defaults to 1.

  • worker_memory_limit (str, optional) – Maximum memory allowed for a worker. Defaults to “24GB”.

  • threads (int, optional) – Number of threads per worker. Defaults to 1.
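
A minimal construction sketch; the concrete DataSource, the "NDVI" short name, the BoundingBox arguments, and the empty Pipeline are illustrative assumptions, not part of this reference:

    from spectral_indices import Processor
    from spectral_indices.pipeline.pipeline import Pipeline
    from spectral_indices.roi.bbox import BoundingBox

    source = ...                      # any concrete DataSource implementation
    roi = BoundingBox(...)            # spatial extent (constructor arguments assumed)

    processor = Processor(
        source=source,
        indices="NDVI",               # short name resolved from the index database
        timestamps=["2022-01-01", "2022-12-31"],
        rois=roi,
        pipeline=Pipeline(),          # transformations to apply (assumed constructible empty)
        chunk_size=500,               # each chunk covers a 500 x 500 spatial window
        n_workers=1,
        worker_memory_limit="24GB",
    )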

apply_masks(array: DataArray) DataArray[source]

Apply masks to the data array.

Parameters:

array (DataArray) – Data array with mask bands.

Returns:

  • Masked data array.

Return type:

DataArray
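
A usage sketch, assuming get_data() returns the mask bands referenced by the maskings mapping:

    raw = processor.get_data()          # lazy array including mask bands
    clean = processor.apply_masks(raw)  # pixels flagged by the maskings mapping are masked out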

default_cluster_client() Tuple[LocalCluster, Client][source]

Build the default cluster and client based on dask.distributed.
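
Since launch() accepts a custom cluster and client, a sketch of building them explicitly, for example to inspect the dashboard before running:

    cluster, client = processor.default_cluster_client()
    print(client.dashboard_link)    # dask.distributed dashboard URL
    result = processor.launch(cluster=cluster, client=client)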

get_collection() ItemCollection[source]

Get item collection and apply filters.

get_data() DataArray[source]

Get the queried data as a lazy xarray DataArray.

Returns:

  • Lazy DataArray containing the queried data.

Return type:

DataArray
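
Because the array is lazy, inspecting it costs nothing; a sketch (the "time" dimension name is an assumption):

    data = processor.get_data()
    print(data.dims, data.shape)   # metadata only, no pixels fetched
    first = data.isel(time=0)      # still lazy
    first = first.compute()        # materialises just this slice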

graph(array: DataArray) DataArray[source]

Apply the transformation graph to the array and return the lazily processed array.

Parameters:
  • array (DataArray) – DataArray to apply the pipeline on. The transformation context (all objects that could be useful to the transformations being applied) defaults to {}.

Returns:

  • Lazy processed array.

Return type:

DataArray
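
get_data() and graph() compose into a manual run of the pipeline; whether launch() inserts apply_masks() between the two steps is an assumption here:

    data = processor.get_data()           # lazy query result
    masked = processor.apply_masks(data)  # remove flagged pixels (assumed intermediate step)
    processed = processor.graph(masked)   # still lazy
    result = processed.compute()          # triggers the actual computation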

launch(save: str = '', nodata=-1000, compute: bool = False, cluster: Any = None, client: Any = None) DataArray[source]

Run the pipeline.

Parameters:
  • save (str, optional) – Path to save the pipeline output. If provided, the SaveRaster transformation is applied and the array is therefore computed. Defaults to "".

  • nodata (int, optional) – Value used to fill NaN values when writing the raster. Defaults to -1000.

  • compute (bool, optional) – Whether to compute the result array. Defaults to False. Warning: if both save and compute are set, the array is computed twice, once during SaveRaster and once before returning the result, which may be inefficient.

  • cluster (Any, optional) – Custom cluster on which to run the pipeline. Defaults to None (a dask LocalCluster will be used).

  • client (Any, optional) – Custom client with which to run the pipeline. Defaults to None (a dask Client will be used).

Returns:

  • Result of pipeline as lazy DataArray.

Return type:

DataArray
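
Typical invocations, as a sketch (the output path is illustrative):

    lazy = processor.launch()                       # lazy result, nothing computed
    processor.launch(save="out.tif", nodata=-1000)  # SaveRaster computes and writes once
    array = processor.launch(compute=True)          # computed in memory, nothing saved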