drop_duplicates

UnitsAwareDataArray.drop_duplicates(dim: Hashable | Iterable[Hashable], *, keep: Literal['first', 'last', False] = 'first') -> Self

Returns a new DataArray with duplicate dimension values removed.

Parameters:
  • dim (dimension label or labels) – Dimension(s) along which to drop duplicates. Pass ... (Ellipsis) to drop duplicates along all dimensions.

  • keep ({"first", "last", False}, default: "first") –

    Determines which duplicates (if any) to keep.

    • "first" : Drop duplicates except for the first occurrence.

    • "last" : Drop duplicates except for the last occurrence.

    • False : Drop all duplicates.

Return type:

DataArray

See also

Dataset.drop_duplicates

Examples

>>> da = xr.DataArray(
...     np.arange(25).reshape(5, 5),
...     dims=("x", "y"),
...     coords={"x": np.array([0, 0, 1, 2, 3]), "y": np.array([0, 1, 2, 3, 3])},
... )
>>> da
<xarray.DataArray (x: 5, y: 5)> Size: 200B
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
Coordinates:
  * x        (x) int64 40B 0 0 1 2 3
  * y        (y) int64 40B 0 1 2 3 3
>>> da.drop_duplicates(dim="x")
<xarray.DataArray (x: 4, y: 5)> Size: 160B
array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
Coordinates:
  * x        (x) int64 32B 0 1 2 3
  * y        (y) int64 40B 0 1 2 3 3
>>> da.drop_duplicates(dim="x", keep="last")
<xarray.DataArray (x: 4, y: 5)> Size: 160B
array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
Coordinates:
  * x        (x) int64 32B 0 1 2 3
  * y        (y) int64 40B 0 1 2 3 3
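
Drop every label that occurs more than once. The examples above cover "first" and "last" but not keep=False, so the following is a minimal sketch of that case. It uses a plain xarray.DataArray, as the other examples on this page do, and assumes UnitsAwareDataArray behaves the same way here; the name unique_x is only illustrative.

```python
import numpy as np
import xarray as xr

# Same array as in the examples above: x has the duplicated label 0.
da = xr.DataArray(
    np.arange(25).reshape(5, 5),
    dims=("x", "y"),
    coords={"x": np.array([0, 0, 1, 2, 3]), "y": np.array([0, 1, 2, 3, 3])},
)

# keep=False keeps no copy of a duplicated label, so both x == 0 rows are dropped.
unique_x = da.drop_duplicates(dim="x", keep=False)

print(unique_x.x.values.tolist())  # [1, 2, 3]
print(unique_x.shape)  # (3, 5)
```

Unlike "first" and "last", this leaves only labels that were unique to begin with, so the result can be shorter than the number of distinct labels.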

Drop all duplicate dimension values:

>>> da.drop_duplicates(dim=...)
<xarray.DataArray (x: 4, y: 4)> Size: 128B
array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [15, 16, 17, 18],
       [20, 21, 22, 23]])
Coordinates:
  * x        (x) int64 32B 0 1 2 3
  * y        (y) int64 32B 0 1 2 3
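
The three keep options can also be modeled in plain Python. The sketch below is not the library's implementation; keep_mask is a hypothetical helper that only illustrates which positions along one dimension survive under each option.

```python
from collections import Counter


def keep_mask(labels, keep="first"):
    """Return a list of booleans, True for each position that is kept.

    Hypothetical helper for illustration only; it mirrors the documented
    meaning of keep="first", keep="last" and keep=False.
    """
    if keep is False:
        # Keep only labels that occur exactly once.
        counts = Counter(labels)
        return [counts[label] == 1 for label in labels]
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first', 'last' or False")

    # Scan forward for "first", backward for "last"; the first time a label
    # is seen in scan order is the occurrence that is kept.
    n = len(labels)
    order = range(n) if keep == "first" else range(n - 1, -1, -1)
    seen = set()
    mask = [False] * n
    for i in order:
        if labels[i] not in seen:
            seen.add(labels[i])
            mask[i] = True
    return mask


x = [0, 0, 1, 2, 3]  # the x coordinate from the examples above
print(keep_mask(x, "first"))  # [True, False, True, True, True]
print(keep_mask(x, "last"))   # [False, True, True, True, True]
print(keep_mask(x, False))    # [False, False, True, True, True]
```

With dim=..., the same rule is applied to each dimension independently, which is why the last example above shrinks both x and y.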