
Support additional Execution Providers in ONNX wasi-nn backend #8547

Open
kaivol opened this issue May 3, 2024 · 2 comments

Comments

@kaivol

kaivol commented May 3, 2024

Feature

Currently the ONNX backend in wasmtime-wasi-nn only uses the default CPU execution provider and ignores the ExecutionTarget requested by the WASM caller.

fn load(&mut self, builders: &[&[u8]], target: ExecutionTarget) -> Result<Graph, BackendError> {
    if builders.len() != 1 {
        return Err(BackendError::InvalidNumberOfBuilders(1, builders.len()).into());
    }
    // The session is always built with the default (CPU) execution provider;
    // `target` is only stored on the graph, never used to select a provider.
    let session = Session::builder()?
        .with_optimization_level(GraphOptimizationLevel::Level3)?
        .with_model_from_memory(builders[0])?;
    let box_: Box<dyn BackendGraph> =
        Box::new(ONNXGraph(Arc::new(Mutex::new(session)), target));
    Ok(box_.into())
}

I would like to suggest adding support for additional execution providers (CUDA, TensorRT, ROCm, ...) to wasmtime-wasi-nn.
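As a rough sketch, the backend could map the requested target to ort execution providers before building the session. This assumes ort's CUDAExecutionProvider / DirectMLExecutionProvider types, their build() constructors, and SessionBuilder::with_execution_providers (available behind ort's cuda / directml Cargo features; exact names and signatures vary between ort versions), and is not meant as the final implementation:

fn load(&mut self, builders: &[&[u8]], target: ExecutionTarget) -> Result<Graph, BackendError> {
    if builders.len() != 1 {
        return Err(BackendError::InvalidNumberOfBuilders(1, builders.len()).into());
    }
    let mut builder = Session::builder()?
        .with_optimization_level(GraphOptimizationLevel::Level3)?;
    // Pick execution providers from the caller-requested target instead of
    // always using the default CPU provider. Provider type names are taken
    // from ort and may differ depending on the ort version and enabled features.
    if let ExecutionTarget::Gpu = target {
        builder = builder.with_execution_providers([
            // ort tries providers in order and falls back to CPU if none of
            // them can be initialized on the host.
            CUDAExecutionProvider::default().build(),
            DirectMLExecutionProvider::default().build(),
        ])?;
    }
    let session = builder.with_model_from_memory(builders[0])?;
    let box_: Box<dyn BackendGraph> =
        Box::new(ONNXGraph(Arc::new(Mutex::new(session)), target));
    Ok(box_.into())
}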

Benefit

Improved performance for WASM modules using the wasi-nn API.

Implementation

ort already has support for many execution providers, so integrating these into wasmtime-wasi-nn should not be too much work.
I would be interested in looking into this; however, I only really have the means to test the DirectML and NVIDIA CUDA / TensorRT EPs.
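For the integration itself, one option would be to expose the providers behind optional Cargo features of wasmtime-wasi-nn, so hosts only compile in what they can link and test. A minimal sketch, with hypothetical feature names (onnx-cuda, onnx-directml) that do not exist in the repository today:

// Hypothetical helper collecting the providers to try for a given target.
// The onnx-cuda / onnx-directml feature names are placeholders, not existing
// wasmtime-wasi-nn features; the provider types come from ort.
fn providers_for(target: ExecutionTarget) -> Vec<ExecutionProviderDispatch> {
    let mut providers = Vec::new();
    if let ExecutionTarget::Gpu = target {
        #[cfg(feature = "onnx-cuda")]
        providers.push(CUDAExecutionProvider::default().build());
        #[cfg(feature = "onnx-directml")]
        providers.push(DirectMLExecutionProvider::default().build());
    }
    // Returning an empty list leaves the session on ort's default CPU provider.
    providers
}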

Alternatives

Leave it to the users to add support for additional execution providers.

@abrown
Contributor

abrown commented Jun 11, 2024

I was looking at old issues and ran across this one (sorry for such a late reply!): I completely agree with this idea. I am tempted to say "go for it!" but maybe there is some coordination needed. E.g., I think @jianjunz has started enabling some DirectML bits in #8756. And @devigned may have some opinions on the best way to do this. But from my perspective, this seems like a worthwhile avenue to pursue.

@devigned
Contributor

I think this is a great idea! One interesting part will be testing. We may need to spin up some hardware to make sure the functionality stays evergreen.
