Masked fill: slow operations compared to tch for small models #2602

Open

JHucker opened this issue Nov 7, 2024 · 0 comments

JHucker commented Nov 7, 2024

This is a follow-up to #2583.

In our use case, masking operations are noticeably slower in candle than in tch: ~150 μs vs. ~40 μs at a batch size of 200.

I’ve created a minimal reproduction repo that shows the same gap at various batch sizes; please see: https://github.com/JHucker/candle_mask_minrep

To make sure I was using candle's masked ops correctly, I followed the mask creation and application from this example.
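For reference, here is a minimal plain-Rust sketch of the masked-fill semantics being timed, with a small timing helper. It uses only the standard library and is not candle's or tch's implementation. The names `masked_fill` and `mean_time_per_call` are hypothetical and chosen for illustration. In candle, the same selection is usually written by broadcasting the fill value to the mask's shape and calling `Tensor::where_cond`. The linked repo may do this differently.

```rust
use std::time::{Duration, Instant};

/// Element-wise masked fill over flat slices (illustrative only):
/// where `mask[i]` is true the output is `fill`, otherwise `input[i]`.
fn masked_fill(input: &[f32], mask: &[bool], fill: f32) -> Vec<f32> {
    assert_eq!(
        input.len(),
        mask.len(),
        "input and mask must have the same length"
    );
    input
        .iter()
        .zip(mask)
        .map(|(&x, &m)| if m { fill } else { x })
        .collect()
}

/// Hypothetical helper: mean wall-clock time per call over `iters` runs.
/// Averaging over many iterations matters when a single call takes only
/// tens of microseconds, as in the numbers quoted above.
fn mean_time_per_call<F: FnMut()>(iters: u32, mut f: F) -> Duration {
    assert!(iters > 0, "iters must be positive");
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    start.elapsed() / iters
}
```

To time it, wrap a call in the helper, for example `mean_time_per_call(10_000, || { std::hint::black_box(masked_fill(&x, &mask, f32::NEG_INFINITY)); })`. Swapping the closure body for the candle or tch call would give per-call averages measured the same way on both sides.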

Thanks in advance; I'd appreciate any help or feedback on this.
