einops accessor (#51)
* homogenize einops API

* add accessors

* add some docs on accessors

* fix about einops docs

* make running einops tutorial locally easier

* improve docs and update changelog
OriolAbril committed May 30, 2023
1 parent 2b92d28 commit 4fc61d7
Showing 20 changed files with 807 additions and 391 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -7,7 +7,9 @@
# website: http://jupyter.org/

.ipynb_checkpoints
.virtual_documents
*/.ipynb_checkpoints/*
*/.virtual_documents/*
pati.ipynb

# IPython
3 changes: 1 addition & 2 deletions docs/source/api/einops.md
@@ -4,14 +4,13 @@
.. automodule:: xarray_einstats.einops
```


```{eval-rst}
.. autosummary::
:toctree: generated/
rearrange
raw_rearrange
reduce
raw_reduce
```

```{eval-rst}
1 change: 0 additions & 1 deletion docs/source/api/linalg.md
@@ -11,7 +11,6 @@
:toctree: generated/
einsum
raw_einsum
einsum_path
matmul
linalg.matrix_transpose
65 changes: 49 additions & 16 deletions docs/source/background/einops.md
@@ -23,18 +23,11 @@ of the elements respectively: `->`, space as delimiter and parenthesis:
dimension order in xarray doesn't matter and there isn't much to be done without knowing
the dimension names.

:::{attention}
We also provide some cruder wrappers with syntax closer to einops.
We are experimenting on trying to find the right spot between being clear,
semantic and flexible yet concise.

These `raw_` wrappers like {func}`xarray_einstats.einops.raw_rearrange`
impose several extra constraints to accepted xarray inputs, in addition
to dimension names being strings.

The example data we are using on this page uses single word alphabetical
dimensions names which allows us to demonstrate both side by side.
:::
However, in many cases dimension names in xarray are strings
without any spaces or parentheses in them. `da.stack(dim=("dim1", "dim2"))`
can't be used for all valid dimension names, but it is generally easier
to write and less error prone. In the same spirit,
`xarray_einstats.einops` also provides two possible syntaxes.

The guiding principle of the einops module is to take the input expressions
in our list of str/list/dict and translate them to valid einops expressions
@@ -45,20 +38,60 @@ and thus support "partial" expressions that cover only the dimensions
that will be modified.

Another important consideration is to take into account that _in xarray_,
dimension names should not matter, hence the constraint of using dicts
dimension order should not matter, hence the constraint of using dicts
on the left side. Imposing this constraint also
makes our job of filling in the "partial" expressions much easier.
We do accept that in the right side as we can generate sensible
default names.

As for the `raw_` wrappers, in order to avoid rewriting the partial
expression filling logic, their behaviour is very simplified:
As for the alternative API, its syntax is much closer to that of einops
because it is string based, but it adds some extra constraints on the dimension names
that are compatible with it.

To avoid rewriting the partial expression filling logic, its behaviour is kept very simple:
1. Split the expression in two if possible using `->`
2. Convert each side to list of str/list/dict following the rules of the complete wrappers
3. Call the complete wrapper
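
The three steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: `parse_side` and `split_pattern` are hypothetical names, and the sketch assumes dimension names contain no whitespace, parentheses or `=`.

```python
import re

# One token is either a parenthesized group, optionally named with `=name`,
# or a single dimension name. Hypothetical grammar, for illustration only.
_TOKEN = re.compile(
    r"\((?P<group>[^()]*)\)(?:=(?P<name>[^\s()=]+))?|(?P<single>[^\s()=]+)"
)


def parse_side(side):
    """Convert one side of a string pattern to a list of str/list/dict."""
    items = []
    for match in _TOKEN.finditer(side):
        if match.group("single") is not None:
            # plain dimension name
            items.append(match.group("single"))
        elif match.group("name") is not None:
            # named group: "(c d)=e" becomes {"e": ["c", "d"]}
            items.append({match.group("name"): match.group("group").split()})
        else:
            # unnamed group: "(c d)" becomes ["c", "d"]
            items.append(match.group("group").split())
    return items


def split_pattern(pattern):
    """Step 1: split on `->` if present. Step 2: convert each side."""
    if "->" in pattern:
        left, right = pattern.split("->", 1)
        return parse_side(left), parse_side(right)
    return None, parse_side(pattern)


# Step 3 would pass these two lists on to the complete (list based) wrapper.
pattern_in, pattern_out = split_pattern("(a1 a2)=a -> a1 c a2 (d b)=e")
```

Here `pattern_in` holds the left side and `pattern_out` the right side, both already in the str/list/dict form that the complete wrapper takes.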

This has an extra, somewhat hidden advantage. einops supports
_explicit_ ellipsis but we don't; to us, an ellipsis means not writing
the dimension name in the expression. Therefore, `.` are valid
in our `raw_` expressions, we convert those to "full xarray" expressions
in our string expressions, we convert those to "full xarray" expressions
which support everything and we don't need extra logic to handle dots either.

## Examples

Given a {class}`~xarray.DataArray` `da` with dimensions `a`, `b`, `c` and `d`,
the code below shows pairs of equivalent expressions
and the dimensions (and their order) present in the output:

```python
# list syntax
rearrange(da, ["c", "d", "a", "b"])
# string syntax
rearrange(da, "c d a b")
# dims in output: `c`, `d`, `a`, `b`

# ----------------------------

# list syntax
rearrange(
    da,
    [{"e": ["c", "d"]}, {"f": ["a", "b"]}]
)
# string syntax
rearrange(da, "(c d)=e (a b)=f")
# dims in output: `e`, `f`

# ----------------------------

# list syntax
rearrange(
    da,
    ["a1", "c", "a2", {"e": ["d", "b"]}],
    pattern_in=[{"a": ["a1", "a2"]}]
)
# string syntax
rearrange(da, "(a1 a2)=a -> a1 c a2 (d b)=e")
# dims in output: `a1`, `c`, `a2`, `e`
```
3 changes: 3 additions & 0 deletions docs/source/changelog.md
@@ -3,11 +3,14 @@
## v0.x.x (Unreleased)
### New features
* {func}`.ecdf` now returns a DataArray to be compatible with {meth}`~xarray.Dataset.map` {pull}`47`
* Added `.linalg` and `.einops` accessors for `DataArray` objects {pull}`51`

### Maintenance and fixes
* Update dependencies and follow new pylint recommendations {pull}`49`

### Documentation
* Add documentation showing how to use accessors {pull}`51`
* Ease running einops tutorial locally {pull}`51`

## v0.5.1 (2023 Jan 20)
### Maintenance and fixes
1 change: 1 addition & 0 deletions docs/source/conf.py
@@ -41,6 +41,7 @@
"jupyter_sphinx",
"sphinx_design",
"matplotlib.sphinxext.plot_directive",
"sphinx_togglebutton",
]

templates_path = ["_templates"]
5 changes: 5 additions & 0 deletions docs/source/contributing/dev_reference.md
@@ -58,7 +58,10 @@ Runs test suite with pytest
:toctree: generated/
PairHandler
_translate_pattern_string
_einsum_parent
_einsum_path
_einsum
```

### Einops
@@ -70,6 +73,8 @@ Runs test suite with pytest
DimHandler
process_pattern_list
translate_pattern
_reduce
_rearrange
```

### Tutorial