Cupy fixes #265
Conversation
Thanks for updating this! This looks like a great step forward. (EDIT: somehow lost this line the first time)

The commit history looks a bit messy. Maybe we can clean it up by first rebasing HIPS:cupy onto HIPS:master, then rebasing any new commits here onto the result? (I can do the first step if that would help, then GitHub can probably do the second.)

Also, I think the commit with message "Add vspace support for cupy" might be better as "Add flatten support for cupy". However, it's using the old flatten code. Jamie's new flatten code dispatches to the vspace and would probably be simpler to use here. (Also, I had another implementation of flatten in cupy/autograd_util.py for use with cupy, but I think Jamie's new version should make flatten work for anything that has vspaces set up.)
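Roughly, "dispatches to the vspace" means something like this minimal sketch; the import path and method names here are assumptions for illustration, not necessarily the exact autograd API:

```python
# Sketch of a vspace-based flatten. The point is that flatten itself needs
# no backend-specific code: the value's vspace already knows how to do it.
from autograd.core import vspace  # assumed import path

def flatten(value):
    """Return a flat 1-D array and a function that rebuilds `value`."""
    vs = vspace(value)
    return vs.flatten(value), vs.unflatten
```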
Sure, if you rebase HIPS:cupy onto HIPS:master, I can rebase the new commits here on top of it. Yes, it's using the old flatten code.
Just pushed the rebase. Can you resolve conflicts, or should I? Good point re: unflatten, though can you point out which code you mean?
I rebased, and the commit history looks good now. I generalized the vector spaces a bit so that they have a `lib` attribute.
PS: The only thing that this change breaks is the vector spaces of empty containers.
Hi Bart, thanks for working on this! Wrapping a good GPU library is probably the best thing we can do to get wider adoption of Autograd. But I'm not totally convinced that this is the right approach.

An alternative approach is to have numpy as the default backend. A drawback of this approach is that it gives numpy a special status. We like to hold on to the idea that Autograd should be backend-agnostic. But the truth is that numpy has always been a dependency and it's likely to remain that way, at least as a fallback library. It also helps to have a concrete array type for instantiating and evaluating tests, and numpy arrays are as good a choice as any. (There's an argument that Python's native types could play that role instead.) Are there other objections to this approach I'm not thinking of?

Then there's the question of the user-exposed flatten function.
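A sketch of the numpy-as-fallback idea described above, with the fallback baked into the base vspace; the class and attribute names are hypothetical:

```python
import numpy as np

class VSpace:
    lib = np  # numpy as the default/fallback backend

    def __init__(self, value):
        self.shape = value.shape
        self.dtype = value.dtype

    def zeros(self):
        # Shared logic: every backend-specific subclass reuses this as-is.
        return self.lib.zeros(self.shape, dtype=self.dtype)

try:
    import cupy

    class CupyVSpace(VSpace):
        lib = cupy  # only the backend changes; the logic above is shared
except ImportError:
    pass  # no GPU stack installed; numpy remains the only backend
```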
This is exciting stuff! Having only looked over the PR fairly briefly, I think I agree with @dougalm's points...
I reckon this is the right approach, provided that the performance isn't ridiculously terrible; if, say, it led to the tests taking 10x longer to run, I think that would be unacceptable. I have no experience with GPU computing, but I'm assuming the situation won't be that bad. Then we could go with either @dougalm's suggestion for the high-level flatten, or keep the current approach.

I'm wondering to what extent it's possible to test this new functionality on Travis, which has no GPU support, or on some other CI. There's an option to install cupy without CUDA; however, I've tried doing that, and almost none of cupy's own tests run on my machine (which has no GPU). I was hoping there might be some simulated GPU mode, but this doesn't seem to be the case. Anyone got any thoughts on testing?
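One common pattern, sketched here rather than taken from this PR, is to detect a CUDA device at import time and skip GPU-only tests when none is present, so the rest of the suite still runs on CPU-only CI:

```python
import pytest

try:
    import cupy
    cupy.cuda.runtime.getDeviceCount()  # raises if no CUDA device is visible
    HAS_GPU = True
except Exception:
    HAS_GPU = False

# Decorate GPU-only tests with this marker; CPU-only CI skips them.
requires_gpu = pytest.mark.skipif(not HAS_GPU, reason="no CUDA device available")

@requires_gpu
def test_cupy_roundtrip():
    x = cupy.ones((2, 3))
    assert x.size == 6
```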
In reply to some of @dougalm's points:
The inability to mix NumPy and CuPy arrays isn't caused by any particular approach; it's inherent to the libraries themselves. You can't operate on both data types at the same time, so returning a NumPy array by default will often just crash.
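A quick illustration of the point (this requires a CUDA-capable machine, and the exact error can vary by cupy version):

```python
import numpy as np
import cupy as cp  # needs a CUDA-capable GPU

x_cpu = np.ones(3)
x_gpu = cp.ones(3)

# cupy does not silently convert host arrays, so mixing the two types in
# one operation typically raises instead of returning a result.
try:
    x_gpu + x_cpu
except TypeError as err:
    print("mixing numpy and cupy failed:", err)
```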
If you want to be backend-agnostic, the correct approach to empty container spaces would probably be to special-case them and return empty lists, instead of empty NumPy arrays, when calling flatten, ones, etc. on them. It's the only way of being agnostic.
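For instance, a backend-agnostic flatten helper might handle the special case like this (a hypothetical sketch, not autograd's actual code):

```python
def flatten_leaves(leaves, lib):
    """Flatten a list of same-backend arrays using backend module `lib`
    (e.g. numpy or cupy)."""
    if not leaves:
        # An empty container gives us no leaf to infer a backend from,
        # so return an empty list rather than privileging numpy.
        return []
    return lib.concatenate([lib.ravel(leaf) for leaf in leaves])
```

Calling `flatten_leaves([], None)` then yields `[]` for an empty container regardless of backend.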
That the lib approach requires wrapped libraries to share the same API is exactly what lets you avoid code duplication. If you want to wrap a library that has a different API, you can write a super-thin wrapper around it.
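Something like this hypothetical adapter, which exposes a numpy-style surface over a backend whose names differ, so the shared code sees one interface:

```python
class NumpyLikeWrapper:
    """Hypothetical adapter giving a numpy-compatible API to a backend
    with different function names."""

    def __init__(self, backend):
        self._backend = backend

    def zeros(self, shape, dtype=None):
        # `alloc_zeros` is a made-up name standing in for whatever the
        # wrapped library actually calls this operation.
        return self._backend.alloc_zeros(shape, dtype)

    def __getattr__(self, name):
        # Names that already match pass straight through.
        return getattr(self._backend, name)
```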
Overall, I haven't really followed the back and forth about flatten, but from a software-design perspective it seems like a bad idea to duplicate logic for different backends. It's only a matter of time before the code rots.
The performance aspect seems easy enough to address. You can write a custom "lib" function that acts like "vspace(x).lib" with almost no overhead.
My proposal was to have flatten always return numpy arrays.

The performance issue I was talking about wasn't overhead cost; it was the cost of copying data from the GPU to the CPU, which will always be expensive! But for testing I think it's fine (subject to @j-towns' caveat that if it slows down the tests 10x it might get annoying).
A few fixes I had to make to get the Cupy branch running.