feat: allow np.arrays in ak.full_like as fill_value #3315

Merged: pfackeldey merged 1 commit into main from pfackeldey/ak_full_like_with_broadcasting on Nov 26, 2024

pfackeldey
Collaborator

@pfackeldey pfackeldey commented Nov 22, 2024

This PR allows NumPy arrays to be passed as the fill_value of ak.full_like. They are then broadcast to the shape of the reference array, e.g.:

ref = ak.Array(np.ones((2, 2)))
ak.full_like(ref, fill_value=np.array([2.0, 3.0]))
# >> <Array [[2, 3], [2, 3]] type='2 * 2 * float64'>
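
For readers less familiar with broadcasting, the following is a minimal NumPy-only sketch of what "broadcast to the shape of the reference array" means for the example above. It is an illustration under the assumption that the fill follows NumPy's broadcasting rules, not the code this PR adds; the names `ref_np` and `filled` are made up for the sketch.

```python
import numpy as np

# Plain NumPy stand-ins for the reference array and the fill value
# (illustrative names, not part of Awkward Array).
ref_np = np.ones((2, 2))
fill_value = np.array([2.0, 3.0])

# Broadcast the 1-d fill value against the 2-d reference shape, then
# materialize a writable copy with the reference dtype.
filled = np.broadcast_to(fill_value, ref_np.shape).astype(ref_np.dtype)

print(filled.tolist())
# [[2.0, 3.0], [2.0, 3.0]]
```

The fill value is matched against the trailing dimension of the reference, so each row of the result repeats `[2.0, 3.0]`.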

Fixes #2787.

Currently, this only works with NumPy arrays. Maybe at some point this should be extended to arbitrary Awkward Arrays?

Member

@jpivarski jpivarski left a comment

> Currently, this only works with NumPy arrays. Maybe at some point this should be extended to arbitrary Awkward Arrays?

The main use case for fill_value is an integer or a string, so this is extending upward in complexity. However, you're right that it should accept some complex types: the question is how to bite off an appropriately sized chunk of possibilities. I know that it could support [] as a fill_value in the past; isn't that converted into an Awkward Array?

Is the thing that's new here the preservation of regular type if it's NumPy? I think that is_array_like doesn't check for NumPy specifically, but for an __array__ method or something.

Is it the case right now that passing an array-like becomes a regular-length list and any other sequence becomes a variable-length list?

@pfackeldey
Collaborator Author

pfackeldey commented Nov 22, 2024

OK, so just to clarify what is and isn't possible, with and without this PR:

import awkward as ak
import numpy as np

ref = ak.Array(np.ones((2, 2)))

ak.full_like(ref, 2)
# >> <Array [[2, 2], [2, 2]] type='2 * 2 * float64'>

ak.full_like(ref, "a")
# >> ValueError: could not convert string to float: np.str_('a')

# oh... this is possible, but only if the ref has the correct dtype?
# - a little unexpected for me; I thought there would be automatic
# type promotion to whatever the fill_value is
ak.full_like(ak.Array([["a", "b"], ["c", "d"]]), "a")
# >> <Array [['a', 'a'], ['a', 'a']] type='2 * var * string'>

ak.full_like(ref, [])
# >> ValueError: could not broadcast input array from shape (0,) into shape (4,)

ak.full_like(ref, None)
# >> TypeError: Encountered a None value, but None conversion/promotion is disabled

# this doesn't work because `is_array_like` is False for any `ak.Array`.
# I could allow this to work, but in principle I'm relying on the nplike's correct broadcasting
# implementation in `nplike.full_like`, which would go wrong for var-len `ak.Array`s
ak.full_like(ref, ak.Array([2, 3]))
# >> ValueError: could not broadcast input array from shape (2,) into shape (4,)

So basically, only 0-d numeric values (e.g. int or float) and strings are currently usable as fill_value, and strings only when the reference array itself holds strings.
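
The error messages above match what plain NumPy reports when a fill value cannot be cast or broadcast into the reference buffer. The sketch below reproduces two of them with `np.full_like` alone. It is only a NumPy analogue: `try_fill` is a hypothetical helper written for this illustration, and nothing here claims to be how ak.full_like is implemented internally.

```python
import numpy as np

ref_np = np.ones((2, 2))  # float64 reference, as in the examples above


def try_fill(fill_value):
    """Return the filled array, or a short description of the error raised."""
    try:
        return np.full_like(ref_np, fill_value)
    except (ValueError, TypeError) as err:
        return f"{type(err).__name__}: {err}"


# A scalar is cast to the dtype of the reference (float64 here).
print(try_fill(2).dtype)

# A string cannot be converted to float64, so the fill fails.
print(try_fill("a"))

# An empty sequence has shape (0,), which cannot be broadcast to (2, 2).
print(try_fill([]))
```

In both failing cases the dtype and shape of the result are dictated by the reference array, which is why the fill value is never promoted to a wider type.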

This PR adds the following new possibilities:

import awkward as ak
import numpy as np

ref = ak.Array(np.ones((2, 2)))

ak.full_like(ref, np.array([2, 3]))
# >> <Array [[2, 3], [2, 3]] type='2 * 2 * float64'>


# this also works with other backends, because the check is for any `is_array_like`:
import jax.numpy as jnp

ak.jax.register_and_check()

jax_arr = ak.full_like(ak.to_backend(ref, "jax"), jnp.array([2, 3]))
print(jax_arr)
# >> <Array [[2.0, 3.0], [2.0, 3.0]] type='2 * 2 * float32'>
print(jax_arr.layout.backend)
# >> <awkward._backends.jax.JaxBackend at 0x1095f6860>

# doesn't work if the backends don't match
ak.full_like(ref, jnp.array([2, 3]))
# >> ValueError: cannot operate on arrays with incompatible backends. Use #ak.to_backend to coerce the arrays to the same backend

> Is it the case right now that passing an array-like becomes a regular-length list and any other sequence becomes a variable-length list?

I'm not sure I understand you correctly, but the only difference is that when an array-like is passed, it is broadcast into the reference array in the same way NumPy would do it. This only works for rectangular arrays.

Most of these advanced use cases can actually be achieved through something like:

full_like = ak.Array([2, 3]) * ak.ones_like(ref)

That would do the correct broadcasting, but comes at the price of an unnecessary operation (multiplication here).
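
As a rough NumPy analogue of this trade-off (an assumption-laden sketch, not Awkward code), the snippet below compares the multiply-by-ones workaround with a direct broadcasting fill. The variable names are illustrative only.

```python
import numpy as np

ref_np = np.ones((2, 2))
fill_value = np.array([2, 3])

# Workaround: allocate an array of ones, then multiply by the fill value.
# Broadcasting happens inside the multiplication.
via_multiply = fill_value * np.ones_like(ref_np)

# Direct fill: NumPy broadcasts the fill value while writing it into the
# new array, so no arithmetic is performed.
direct = np.full_like(ref_np, fill_value)

print(np.array_equal(via_multiply, direct))
# True
```

Both routes give the same values here; the direct fill simply skips the intermediate array of ones and the extra multiplication.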

@jpivarski
Member

What I was missing was that this is ak.full_like, rather than ak.fill_none. What I said about filling with [] works for ak.fill_none.

@jpivarski
Member

As I said in our meeting, feel free to merge this. It's a good feature.

@pfackeldey pfackeldey merged commit 582ef9c into main Nov 26, 2024
43 of 44 checks passed
@pfackeldey pfackeldey deleted the pfackeldey/ak_full_like_with_broadcasting branch November 26, 2024 17:54
Successfully merging this pull request may close these issues.

ak.full_like should support asymmetric broadcasting