poligrain package¶

poligrain: Effortlessly plot and compare (rainfall) sensor data with point, line and grid geometry.

Submodules¶

poligrain.plot_map module¶

Functions for plotting.

poligrain.plot_map.plot_lines(cmls, vmin=None, vmax=None, use_lon_lat=True, cmap='turbo', line_color='C0', line_width=1, pad_width=0.5, pad_color='k', line_style='-', cap_style='round', ax=None, background_map=None, projection=None)¶

Plot paths of line-based sensors like CMLs.

If an xarray.Dataset is passed, the paths are plotted using the defined line_color. If an xarray.DataArray is passed, its content is used to color the lines based on cmap, vmin and vmax. The xarray.DataArray has to be 1D with one entry per line.

Parameters:

cmls (xr.Dataset | xr.DataArray) – The line-based sensors data with coordinates defined according to the OPENSENSE data format conventions.
vmin (float | None, optional) – Minimum value of colormap, by default None.
vmax (float | None, optional) – Maximum value of colormap, by default None.
cmap (str | Colormap, optional) – A matplotlib colormap either as string or a Colormap object, by default “turbo”.
line_color (str, optional) – The color of the lines when plotting based on an xarray.Dataset, by default “C0”.
line_width (float, optional) – The width of the lines. In case of coloring lines with a cmap, this is the width of the colored line, which is extended by pad_width with an outline in pad_color. By default 1.
pad_width (float, optional) – The width of the outline, i.e. edge width, around the lines, by default 0.5.
pad_color (str, optional) – Color of the padding, i.e. the edge color of the lines. Default is “k”.
line_style (str, optional) – Line style as used by matplotlib, default is “-”.
cap_style (str, optional) – Whether to have “round” or rectangular (“butt”) ends of the lines. Default is “round”.
ax (matplotlib.axes.Axes | None, optional) – An Axes object on which to plot. If not supplied, a new figure with an Axes will be created. By default None.
background_map (str | None, optional) – Type of background map.
projection (cartopy.crs.Projection | None, optional) – The map projection to be used.

Return type:

LineCollection
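
As a rough illustration of how cmap, vmin and vmax interact when a DataArray is passed, the sketch below scales values linearly to the range 0 to 1, which is the step that precedes a colormap lookup. It is not poligrain code: the helper name normalize is hypothetical, and falling back to the data minimum and maximum when vmin or vmax is None is an assumption modelled on matplotlib's usual autoscaling.

```python
def normalize(values, vmin=None, vmax=None):
    """Scale values linearly so that vmin maps to 0.0 and vmax maps to 1.0."""
    # Assumption: missing limits are taken from the data itself.
    lower = min(values) if vmin is None else vmin
    upper = max(values) if vmax is None else vmax
    if upper == lower:
        # All values identical: avoid a division by zero.
        return [0.0 for _ in values]
    # Clamp to [0, 1] so out-of-range values end up at the ends of the colormap.
    return [min(1.0, max(0.0, (v - lower) / (upper - lower))) for v in values]


rain_rates = [0.0, 5.0, 10.0]
auto_scaled = normalize(rain_rates)
fixed_scale = normalize(rain_rates, vmin=0.0, vmax=5.0)
```

With a fixed vmax of 5.0, the largest value saturates at the top of the color scale instead of stretching it, which is why shared vmin and vmax values keep several plots comparable.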

poligrain.plot_map.plot_plg(da_grid=None, da_cmls=None, da_gauges=None, vmin=None, vmax=None, cmap='turbo', alpha=1, ax=None, use_lon_lat=True, edge_color='k', edge_width=0.5, marker_size=20, line_color='k', point_color='k', add_colorbar=True, colorbar_label='', kwargs_cmls_plot=None, kwargs_gauges_plot=None, background_map=None, projection=None, extent=None)¶

Plot point, line and grid data.

The data to be plotted has to be provided as xr.DataArray or xr.Dataset conforming to our naming conventions. Data has to be for one selected time step if provided as xr.DataArray. For points and lines, providing data as xr.Dataset is also allowed; in that case only the sensor locations are plotted, in a single color.

Data of the three different sources can be passed all at once, but one can also pass only one or two of them. vmin, vmax and cmap will be the same for all three data sources, but can be adjusted separately via kwargs_cmls_plot and kwargs_gauges_plot.

Parameters:

da_grid (xr.DataArray, optional) – 2D gridded data (only one time step), typically from weather radar
da_cmls (xr.DataArray or xr.Dataset, optional) – CML data (for one specific time step) if passed as xr.DataArray. If passed as xr.Dataset only the locations will be plotted.
da_gauges (xr.DataArray or xr.Dataset, optional) – Gauge data (for one specific time step) if passed as xr.DataArray. If passed as xr.Dataset only the locations will be plotted.
vmin (float, optional) – vmin for all three data sources, by default None. If set to None it will be derived individually for each data source when plotting.
vmax (float, optional) – vmax for all three data sources, by default None. If set to None it will be derived individually for each data source when plotting.
cmap (str, optional) – cmap for all three data sources, by default “turbo”
alpha (float, optional) – Alpha values used for the gridded dataset.
ax (matplotlib.axes.Axes | None, optional) – Axes object from matplotlib, by default None which will create a new figure and return the Axes object.
use_lon_lat (bool, optional) – If set to True use lon-lat coordinates for plotting. If set to False use x-y coordinates (meant to be projected coordinates). By default True. Note that our data conventions enforce that lon-lat coordinates are provided, but projected coordinates might need to be generated first before plotting. This plotting function does not project data on the fly.
edge_color (str, optional) – Edge color of points and lines, by default “k”
edge_width (float, optional) – Width of edge line of points and lines, by default 0.5
marker_size (int, optional) – Size of points and lines, by default 20. Note that the value is passed directly to plt.scatter for plotting points, but for the width of the lines it is divided by 10 so that visually they have more or less the same size.
line_color (str, optional) – Color of lines if da_cmls is provided as xr.Dataset, by default “k”. If da_cmls is provided as xr.DataArray this is ignored and the cmap is applied for the colored lines.
point_color (str, optional) – Color of points if da_gauges is provided as xr.Dataset, by default “k”. If da_gauges is provided as xr.DataArray this is ignored and the cmap is applied for the colored points.
add_colorbar (bool, optional) – If True adds a color bar to the plot, by default True.
colorbar_label (str, optional) – Label for the color bar, by default “”
kwargs_cmls_plot (dict or None, optional) – kwargs to be passed to the CML plotting function, by default None. See plot_lines for supported kwargs.
kwargs_gauges_plot (dict or None, optional) – kwargs to be passed to plt.scatter, by default None.
background_map (str | None)
projection (Projection | None)
extent (list | None)

poligrain.plot_map.scatter_lines(x0, y0, x1, y1, s=3, c='C0', line_style='-', pad_width=0, pad_color='k', cap_style='round', vmin=None, vmax=None, cmap='viridis', ax=None, data_crs=None)¶

Plot lines as if you would use plt.scatter for points.

Parameters:

x0 (npt.ArrayLike | float) – x coordinate of start point of line
y0 (npt.ArrayLike | float) – y coordinate of start point of line
x1 (npt.ArrayLike | float) – x coordinate of end point of line
y1 (npt.ArrayLike | float) – y coordinate of end point of line
s (float, optional) – The width of the lines. In case of coloring lines with a cmap, this is the width of the colored line, which is extended by pad_width with a colored outline using pad_color. By default 3.
c (str | npt.ArrayLike, optional) – The color of the lines. If something array-like is passed, this data is used to color the lines based on the cmap, vmin and vmax. By default “C0”.
line_style (str, optional) – Line style as used by matplotlib, default is “-”.
pad_width (float, optional) – The width of the outline, i.e. edge width, around the lines, by default 0.
pad_color (str, optional) – Color of the padding, i.e. the edge color of the lines. Default is “k”.
cap_style (str, optional) – Whether to have “round” or rectangular (“butt”) ends of the lines. Default is “round”.
vmin (float | None, optional) – Minimum value of colormap, by default None.
vmax (float | None, optional) – Maximum value of colormap, by default None.
cmap (str | Colormap, optional) – A matplotlib colormap either as string or a Colormap object, by default “viridis”.
ax (matplotlib.axes.Axes | None, optional) – An Axes object on which to plot. If not supplied, a new figure with an Axes will be created. By default None.
data_crs (cartopy.crs.Projection | None, optional) – The coordinate reference system of the data provided. The default is None. In the default case cartopy.crs.PlateCarree will be used when plotting with an ax that is a cartopy.mpl.geoaxes.GeoAxes. When plotting with ax being a normal matplotlib.axes.Axes, data_crs has to be None since the coordinate transformations it implies are not supported by matplotlib.

Returns:

The collection of plotted lines.

Return type:

LineCollection

poligrain.plot_map.set_up_axes(background_map=None, projection=None, extent=None)¶

Create and configure matplotlib axes based on background map type.

Set up the plot axes using Cartopy or standard matplotlib. Supports different types of background maps including stock images, OpenStreetMap, and Natural Earth datasets with optional geographic projections and extents.

Parameters:

background_map (str or None) – Type of background map to use. Can be ‘stock’ for the built-in stock image, ‘OSM’ for OpenStreetMap or ‘NE’ for Natural Earth features. If not provided, a matplotlib Axes object without any background is created.
projection (cartopy.crs.Projection, optional) – Projection to use for the plot. For ‘OSM’ map background it is automatically set. For other map backgrounds it defaults to PlateCarree() if not provided.
extent (list-like or None, optional) – Geographic bounding box [lon_min, lat_min, lon_max, lat_max]. Defaults to None.

Returns:

matplotlib.axes.Axes or cartopy.mpl.geoaxes.GeoAxes object with appropriate projection and background.

poligrain.spatial module¶

Functions for calculating spatial distance, intersections and finding neighbors.

class poligrain.spatial.GridAtLines(da_gridded_data, ds_line_data, grid_point_location='center', use_lon_lat=False)¶

Bases: object

Get path-averaged grid values along lines.

For each line in ds_line_data, e.g. a CML path, the grid intersections are calculated and stored as a sparse matrix during initialization. Via __call__ the time series of path-averaged grid values for each line can be calculated.

Note that da_gridded_data and ds_line_data have to contain the correct coordinate variable names, see below.

Parameters:

da_gridded_data (DataArray | Dataset) – The gridded data, typically rainfall fields from weather radar. It has to contain lon and lat variables with coordinates as 2D matrix.
ds_line_data (DataArray | Dataset) – The line data, typically from CMLs. It has to contain lon and lat coordinates for site_0 and site_1 according to the OPENSENSE data format conventions.
grid_point_location (str) – The location of the grid point for which the coordinates are given. Can be “center” or “lower_right”. Default is “center”.
use_lon_lat (bool)

class poligrain.spatial.GridAtPoints(da_gridded_data, da_point_data, nnear, stat='best', use_lon_lat=False)¶

Bases: object

Get grid values at points or in their neighborhood.

This class is based on wradlib.adjust.RawAtObs which already implements all required functionality. To make usage simpler and equivalent to GridAtLines, we just provide a wrapper around the RawAtObs code that was copy-pasted to not rely on an import of wradlib. In addition we use xarray.DataArray or xarray.Dataset as input to avoid reshaping and passing point and grid coordinates.

Note that da_gridded_data and da_point_data have to contain the correct coordinate variable names, see below.

Parameters:

da_gridded_data (DataArray | Dataset) – The gridded data, typically rainfall fields from weather radar. It has to contain lon and lat variables with coordinates as 2D matrix.
da_point_data (DataArray | Dataset) – The point data, typically from rain gauges. The coordinates must be given as lon and lat variable according to the OPENSENSE data format conventions.
nnear (int) – Number of neighbors which should be considered in the vicinity of each point in da_point_data. Note that this uses a nearest-neighbor lookup and does not guarantee that e.g. nnear=9 results in a 3x3 grid with the central pixel in the middle.
stat (str) – Name of stat function to be used to derive one value from all neighboring grid cells in case that nnear > 1. Default is “best”.

poligrain.spatial.best(x, y)¶

Find the values of y which correspond best to x.

If x is an array, the comparison is carried out for each element of x.

This is a copy-paste function from wradlib.adjust that we need for our implementation that is similar to their RawAtObs.

Parameters:

x (float | numpy.ndarray) – float or 1-d array of float
y (numpy.ndarray) – array of float

Returns:

output – 1-d array of float with length len(y)

Return type:

numpy.ndarray
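
The selection rule can be sketched in plain Python as follows. This is only an illustration, not the copied wradlib code: the name best_match is hypothetical, and the layout of y as one row of candidate grid values per entry of x is an assumption about how the function is used together with nnear neighbors.

```python
def best_match(x, y):
    """For each x[i], return the entry of row y[i] that is closest to x[i]."""
    # min() with a key picks the candidate with the smallest absolute
    # difference; on ties the first candidate in the row wins.
    return [min(row, key=lambda value: abs(value - x_i)) for x_i, row in zip(x, y)]


gauge_values = [1.0, 5.0]
grid_neighbors = [
    [0.2, 0.9, 3.0],  # candidates around the first gauge
    [4.0, 7.0, 5.5],  # candidates around the second gauge
]
selected = best_match(gauge_values, grid_neighbors)
```

The result has one value per row of candidates, which matches the documented output length of len(y).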

poligrain.spatial.calc_intersect_weights(x1_line, y1_line, x2_line, y2_line, x_grid, y_grid, grid_point_location='center', offset=None)¶

Calculate intersecting weights for a line and a grid.

Calculate the intersecting weights for the line defined by x1_line, y1_line, x2_line and y2_line and the grid defined by the x- and y- grid points from x_grid and y_grid.

Parameters:

x1_line (float)
y1_line (float)
x2_line (float)
y2_line (float)
x_grid (2D array) – x-coordinates of grid points
y_grid (2D array) – y-coordinates of grid points
grid_point_location (str, optional) – The location of the grid point for which the coordinates are given. Can be “center” or “lower_right”. Default is “center”.
offset (float, optional) – The offset in units of the coordinates to constrain the calculation of intersection to a bounding box around the CML coordinates. The offset specifies by how much this bounding box will be larger than the width- and height-extent of the CML coordinates.

Returns:

intersect – 2D array of intersection weights with the shape of x_grid and y_grid

Return type:

array
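
To make the idea of intersection weights concrete, the sketch below computes, for each grid cell, the fraction of a line's length that lies inside the cell. It is not the poligrain implementation. It assumes a regular, axis-aligned grid described by 1D lists of cell-center coordinates and a fixed cell size, whereas the real function takes 2D coordinate arrays. The names clip_fraction and intersect_weights_regular_grid are hypothetical, and Liang-Barsky clipping is a technique chosen here for brevity.

```python
def clip_fraction(x1, y1, x2, y2, xmin, xmax, ymin, ymax):
    """Fraction of the segment (x1, y1)-(x2, y2) inside an axis-aligned box."""
    t_enter, t_exit = 0.0, 1.0
    dx, dy = x2 - x1, y2 - y1
    # Liang-Barsky clipping: each (p, q) pair describes one edge of the box.
    for p, q in ((-dx, x1 - xmin), (dx, xmax - x1), (-dy, y1 - ymin), (dy, ymax - y1)):
        if p == 0:
            if q < 0:
                return 0.0  # parallel to this edge and outside the box
            continue
        t = q / p
        if p < 0:
            t_enter = max(t_enter, t)  # segment enters the box here
        else:
            t_exit = min(t_exit, t)  # segment leaves the box here
    return max(0.0, t_exit - t_enter)


def intersect_weights_regular_grid(x1, y1, x2, y2, x_centers, y_centers, cell_size):
    """Weights indexed as [y][x] for grid cells given by their center coordinates."""
    half = cell_size / 2
    return [
        [
            clip_fraction(x1, y1, x2, y2, xc - half, xc + half, yc - half, yc + half)
            for xc in x_centers
        ]
        for yc in y_centers
    ]


# A horizontal line crossing three cells of size 1.
weights = intersect_weights_regular_grid(
    0.5, 0.5, 2.5, 0.5, x_centers=[0.5, 1.5, 2.5], y_centers=[0.5, 1.5], cell_size=1.0
)
```

For a line that lies completely inside the grid the weights sum to 1, so multiplying them with a gridded field and summing gives a length-weighted path average.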

poligrain.spatial.calc_point_to_point_distances(ds_points_a, ds_points_b)¶

Calculate the distance between the point coordinates of two datasets.

Note that both datasets that are passed as input have to have the variables x and y which should be projected coordinates that preserve lengths as well as possible.

Parameters:

ds_points_a (xr.DataArray | xr.Dataset) – One dataset of points.
ds_points_b (xr.DataArray | xr.Dataset) – The other dataset of points.

Returns:

Distance matrix in meters, assuming x and y coordinate variables in the supplied data are projected to something like UTM. The dimensions of the matrix are the id dimensions of the two input datasets. The id values are also provided along each dimension. The second dimension name is appended with _neighbor.

Return type:

xr.DataArray
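
The shape of the result can be illustrated with a small pure-Python sketch that builds the same kind of distance matrix from lists of projected (x, y) tuples. The name distance_matrix is hypothetical and the sketch does not handle the xarray coordinates and id dimensions that the real function returns.

```python
import math


def distance_matrix(points_a, points_b):
    """Euclidean distances with one row per point in a and one column per point in b."""
    return [
        [math.hypot(xa - xb, ya - yb) for xb, yb in points_b]
        for xa, ya in points_a
    ]


# Projected coordinates in meters, e.g. from a UTM zone.
points_a = [(0.0, 0.0), (3.0, 4.0)]
points_b = [(0.0, 0.0), (6.0, 8.0)]
distances = distance_matrix(points_a, points_b)
```

Euclidean distances are only meaningful in meters if x and y are projected coordinates, which is why lon and lat values should not be passed as x and y.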

poligrain.spatial.calc_sparse_intersect_weights_for_several_cmls(x1_line, y1_line, x2_line, y2_line, cml_id, x_grid, y_grid, grid_point_location='center')¶

Calculate sparse intersection weights matrix for several CMLs.

This function just loops over calc_intersect_weights for several CMLs, but stores the intersection weight matrices as a sparse matrix to save space and to allow faster calculation with sparse.tensordot afterwards.

Function arguments are the same as in calc_intersect_weights, except that we take a 1D array or list of line coordinates here.

Parameters:

x1_line (1D-array or list of float)
y1_line (1D-array or list of float)
x2_line (1D-array or list of float)
y2_line (1D-array or list of float)
cml_id (1D-array or list of strings)
x_grid (2D array) – x-coordinates of grid points
y_grid (2D array) – y-coordinates of grid points
grid_point_location (str, optional) – The location of the grid point for which the coordinates are given. Can be “center” or “lower_right”. Default is “center”.

Returns:

intersect – The variables x_grid and y_grid are used as coordinates.

Return type:

xarray.DataArray with sparse intersection weights

poligrain.spatial.get_closest_points_to_line(ds_cmls, ds_gauges, max_distance, n_closest)¶

Get closest points to line.

Finds n closest points from a CML within given max distance. Note that the function guarantees that all returned points are within max distance to the CML, not that all points that are within max distance are returned. Uses KDTree for fast processing of large datasets.

Parameters:

ds_cmls (xarray.Dataset) – Dataset of line data using the OpenSense naming convention for CMLs. It must contain the coordinate cml_id with the cml names. It must also contain projected coordinates site_0_y, site_0_x, site_1_y and site_1_x as well as the CML length.
ds_gauges (xarray.Dataset) – Dataset of point data using the OpenSense data format conventions for PWS. The dataset must contain the coordinate ‘id’ with the PWS names. It must also contain projected coordinates x and y.
max_distance (float) – Maximum distance a point can have to the CML, measured as the smallest distance from the point to the line. Points outside this range are not considered close to the CML.
n_closest (int) – Maximum number of points that are returned.

Returns:

closest_gauges – Dataset with CML ids and corresponding n_closest point names and distance. If a CML has fewer than “n_closest” nearby points, the remaining entries in variable “distance” are filled with np.inf and the remaining entries in variable “id_neighbor” are filled with None.

Return type:

xarray.Dataset
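
The selection and padding logic described above can be sketched for a single line as follows. This is a brute-force illustration and not the KDTree-based implementation: the function names are hypothetical, the inputs are plain tuples and a dict instead of xarray Datasets, and treating a point exactly at max_distance as close is an assumption.

```python
import math


def point_to_segment_distance(px, py, x0, y0, x1, y1):
    """Smallest distance from point (px, py) to the segment (x0, y0)-(x1, y1)."""
    dx, dy = x1 - x0, y1 - y0
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # Degenerate line: both ends coincide.
        return math.hypot(px - x0, py - y0)
    # Position of the perpendicular foot along the segment, clamped to its ends.
    t = ((px - x0) * dx + (py - y0) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (x0 + t * dx), py - (y0 + t * dy))


def closest_points_to_line(line, points, max_distance, n_closest):
    """Return (ids, distances), each padded to length n_closest."""
    ranked = sorted(
        (point_to_segment_distance(x, y, *line), point_id)
        for point_id, (x, y) in points.items()
    )
    kept = [(d, point_id) for d, point_id in ranked if d <= max_distance][:n_closest]
    n_missing = n_closest - len(kept)
    # Pad with None and inf, following the documented fill values.
    ids = [point_id for _, point_id in kept] + [None] * n_missing
    distances = [d for d, _ in kept] + [math.inf] * n_missing
    return ids, distances


ids, distances = closest_points_to_line(
    line=(0.0, 0.0, 10.0, 0.0),  # site_0_x, site_0_y, site_1_x, site_1_y
    points={"g1": (5.0, 1.0), "g2": (12.0, 0.0), "g3": (5.0, 30.0)},
    max_distance=5.0,
    n_closest=3,
)
```

Point g2 lies beyond the end of the line, so its distance is measured to the nearest end point rather than to the infinite extension of the line.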

poligrain.spatial.get_closest_points_to_point(ds_points, ds_points_neighbors, max_distance, n_closest)¶

Get the closest points for given point locations.

Note that both datasets that are passed as input have to have the variables x and y which should be projected coordinates that preserve lengths as well as possible.

Parameters:

ds_points (xr.DataArray | xr.Dataset) – This is the dataset for which the nearest neighbors will be looked up. That is, for each point location in this dataset the nearest neighbors from ds_points_neighbors will be returned.
ds_points_neighbors (xr.DataArray | xr.Dataset) – This is the dataset from which the nearest neighbors will be looked up.
max_distance (float) – The allowed distance of neighbors has to be smaller than max_distance. The units are the units used for the projected coordinates x and y in the two datasets.
n_closest (int) – The maximum number of nearest neighbors to be returned.

Returns:

A dataset which has distance and neighbor_id as variables along the dimensions id, taken from ds_points, and n_closest. The unit of the distance follows from the unit of the projected coordinates of the input datasets. The neighbor_id entries for point locations that are further away than max_distance are set to None. The corresponding distances are np.inf.

Return type:

xr.Dataset

poligrain.spatial.get_grid_time_series_at_intersections(grid_data, intersect_weights)¶

Get time series from grid data using sparse intersection weights.

Time series of grid data are derived via intersection weights of CMLs. Please note that it is crucial to have the correct order of dimensions, see parameter list below.

Input can be ndarrays or xarray.DataArrays. If at least one input is a DataArray, a DataArray is returned.

Parameters:

grid_data (ndarray or xarray.DataArray) – 3-D data of the gridded data we want to extract time series from at the given pixel intersection. The order of dimensions must be (‘time’, ‘y’, ‘x’). The size in the x and y dimension must be the same as in the intersection weights.
intersect_weights (ndarray or xarray.DataArray) – 3-D data of intersection weights. The order of dimensions must be (‘cml_id’, ‘y’, ‘x’). The size in the x and y dimension must be the same as in the grid data. Intersection weights do not have to be a sparse.array but will be converted to one internally before doing a sparse.tensordot contraction.

Returns:

grid_intersect_timeseries (ndarray or xarray.DataArray) – The time series for each grid intersection. If at least one of the inputs is a xarray.DataArray, a xarray.DataArray is returned. Coordinates are derived from the input DataArrays.
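
The contraction over the y and x dimensions can be written out in pure Python to show why the dimension order matters. This is a sketch with nested lists instead of sparse arrays. The name path_averages is hypothetical, and returning the result indexed as [time][cml] is an assumption made for the illustration.

```python
def path_averages(grid_data, intersect_weights):
    """Contract grid_data[time][y][x] with intersect_weights[cml][y][x] over y and x."""
    return [
        [
            sum(
                w * v
                for w_row, v_row in zip(weights, field)
                for w, v in zip(w_row, v_row)
            )
            for weights in intersect_weights
        ]
        for field in grid_data
    ]


# Two time steps of a 2x2 grid.
grid_data = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[10.0, 20.0], [30.0, 40.0]],
]
# Two CMLs: the first covers the two upper cells equally, the second only one cell.
intersect_weights = [
    [[0.5, 0.5], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 1.0]],
]
time_series = path_averages(grid_data, intersect_weights)
```

Because the weights of each CML sum to 1 here, every entry is a length-weighted average of the grid cells that the CML crosses.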

poligrain.spatial.get_point_xy(ds_points)¶

Get x and y coordinate data for point Dataset or DataArray.

Use this function instead of just getting the x and y variables from the xarray.Dataset or DataArray, because it will do some additional checks. Furthermore it will facilitate adapting to changing naming conventions in the future.

Parameters:

ds_points (xr.DataArray | xr.Dataset) – The Dataset or DataArray to get x and y from. It has to conform to the OPENSENSE data format conventions.

Returns:

x and y as xr.DataArray

Return type:

tuple[xr.DataArray, xr.DataArray]

poligrain.spatial.project_point_coordinates(x, y, target_projection, source_projection='EPSG:4326')¶

Project coordinates x and y of point data.

Note that x and y have to be xarray.DataArray so that we can return the projected coordinates also as xarray.DataArray with the correct coord data, so that they can easily and safely be added to an existing xarray.Dataset, e.g. like the following code:

>>> ds.coords["x"], ds.coords["y"] = plg.spatial.project_point_coordinates(
...     ds.lon, ds.lat, target_projection="EPSG:25832",
...     )

Parameters:

x (xr.DataArray) – The coordinates along the x-axis
y (xr.DataArray) – The coordinates along the y-axis
target_projection (str) – An EPSG string that defines the projection the points shall be projected to, e.g. “EPSG:25832” for UTM zone 32N
source_projection (str, optional) – An EPSG string that defines the projection of the supplied x and y data, by default “EPSG:4326”

Returns:

The projected coordinates

Return type:

tuple[xr.DataArray, xr.DataArray]