Grouping by multiple arrays with Xarray

Tuesday, July 18th, 2023



TLDR

Xarray now supports grouping by multiple variables (docs). 🎉 😱 🤯 🥳. Try it out!

How do I use it?

Install xarray>=2024.08.0 and optionally flox for better performance with reductions.
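Before running the examples, it can help to confirm that the environment meets the requirement above. This is a minimal sketch, not an official check: it assumes the `packaging` library is importable (xarray depends on it), and the variable names are illustrative.

```python
import importlib.util

import xarray as xr
from packaging.version import Version

# Multi-variable groupby needs xarray 2024.08.0 or newer (see above).
has_multi_groupby = Version(xr.__version__) >= Version("2024.08.0")

# flox is optional; this only checks whether it is installed.
has_flox = importlib.util.find_spec("flox") is not None

print(f"xarray {xr.__version__}: multi-variable groupby available = {has_multi_groupby}")
print(f"flox installed = {has_flox}")
```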

Simple example

Set up a multiple variable groupby using Grouper objects.

import numpy as np
import xarray as xr
from xarray.groupers import UniqueGrouper

da = xr.DataArray(
    np.array([1, 2, 3, 0, 2, np.nan]),
    dims="d",
    coords=dict(
        labels1=("d", np.array(["a", "b", "c", "c", "b", "a"])),
        labels2=("d", np.array(["x", "y", "z", "z", "y", "x"])),
    ),
)

# Pass one Grouper object per coordinate to group by
gb = da.groupby(labels1=UniqueGrouper(), labels2=UniqueGrouper())
gb

<DataArrayGroupBy, grouped over 2 grouper(s), 9 groups in total:
	'labels1': 3 groups with labels 'a', 'b', 'c'
	'labels2': 3 groups with labels 'x', 'y', 'z'>

Reductions work as usual:

gb.mean()

<xarray.DataArray (labels1: 3, labels2: 3)> Size: 72B
array([[1. , nan, nan],
       [nan, 2. , nan],
       [nan, nan, 1.5]])
Coordinates:
  * labels1  (labels1) object 24B 'a' 'b' 'c'
  * labels2  (labels2) object 24B 'x' 'y' 'z'

So does map:

gb.map(lambda x: x[0])

<xarray.DataArray (labels1: 3, labels2: 3)> Size: 72B
array([[ 1., nan, nan],
       [nan,  2., nan],
       [nan, nan,  3.]])
Coordinates:
  * labels1  (labels1) object 24B 'a' 'b' 'c'
  * labels2  (labels2) object 24B 'x' 'y' 'z'

Multiple Groupers

Different grouper types can be combined: categorical grouping with UniqueGrouper, binning with BinGrouper, and resampling with TimeResampler.

from xarray.groupers import BinGrouper

ds = xr.Dataset(
    {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
    coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
)
# Bin along x and group by the unique values of letters
gb = ds.foo.groupby(x=BinGrouper(bins=[5, 15, 25]), letters=UniqueGrouper())
gb

<DataArrayGroupBy, grouped over 2 grouper(s), 4 groups in total:
	'x_bins': 2 groups with labels (5, 15], (15, 25]
	'letters': 2 groups with labels 'a', 'b'>

gb.mean()

<xarray.DataArray 'foo' (x_bins: 2, letters: 2, y: 3)> Size: 96B
array([[[ 0.,  1.,  2.],
        [nan, nan, nan]],

       [[nan, nan, nan],
        [ 3.,  4.,  5.]]])
Coordinates:
  * x_bins   (x_bins) object 16B (5, 15] (15, 25]
  * letters  (letters) object 16B 'a' 'b'
Dimensions without coordinates: y

© 2024, Xarray core developers. Apache 2.0 Licensed.
