
[FEATURE] Support for 'cuda' device in NexaVoiceInference class #143

Open

nmandic78 opened this issue Oct 3, 2024 · 3 comments

Labels: 💡 feature request (New feature or request)
@nmandic78 commented Oct 3, 2024

I noticed that the NexaVoiceInference class hardcodes the device to "cpu", making it impossible to use a GPU for inference. I suggest adding a device argument to allow switching between "cpu" and "cuda". Here’s the proposed change:

    def __init__(self, model_path, local_path=None, device='cpu', **kwargs):
        self.model_path = model_path
        self.downloaded_path = local_path
        self.device = device   # this line added
        self.params = DEFAULT_VOICE_GEN_PARAMS

and here:

    self.model = WhisperModel(
        self.downloaded_path,
        device=self.device,  # Change this line
        compute_type=self.params["compute_type"],
    )
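
For context, a hypothetical instantiation with the proposed argument might look like this (the import path and model name below are assumptions for illustration, not taken from the SDK docs):

    from nexa.gguf import NexaVoiceInference  # import path assumed

    # Hypothetical usage of the proposed device argument;
    # the model identifier is a placeholder.
    inference = NexaVoiceInference(
        model_path="faster-whisper-base",
        device="cuda",  # run inference on the GPU instead of the hard-coded CPU
    )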

Would you be open to a pull request for this change?

Similar Features or References

No response

@zhiyuan8 (Contributor) commented Oct 3, 2024

Sure, we will add an option to support Hugging Face Transformers-style CUDA usage, such as cuda:0, in our next release. For now, all GPUs are used by default if you build with the CUDA compilation options.
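
Purely as a sketch of what that option might look like once released (the API is not final; the names here are assumptions):

    # Hypothetical future API, mirroring Hugging Face Transformers device strings.
    inference = NexaVoiceInference(
        model_path="faster-whisper-base",  # placeholder model name
        device="cuda:0",                   # pin inference to the first GPU
    )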

@nmandic78 (Author) commented:
I did use the CUDA compilation options (CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON"). NexaTextInference, for example, does use the GPU by default, but this is what the NexaVoiceInference class does:

            self.model = WhisperModel(
                self.downloaded_path,
                device="cpu",
                compute_type=self.params["compute_type"],
            )

@zhycheng614 (Collaborator) commented:

We hard-coded "cpu" here because enabling CUDA for faster-whisper requires either cuBLAS or cuDNN on your machine. Currently we cannot bundle these into our SDK, so if we changed "cpu" to "cuda" or "auto", it would fail due to the missing dependencies.

However, you can run on CUDA by doing the following:

  1. Refer to the official faster-whisper GitHub repository for instructions on installing cuBLAS or cuDNN, the dependencies required for GPU execution.
  2. Change our Python source code locally (on your machine, not through a pull request), either in your environment's installed packages or via pip install -e .: locate the line above and change "cpu" to "auto" or "cuda", as in the sketch below. It should then work for you.
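
A minimal sketch of that local patch, assuming faster-whisper's standard WhisperModel constructor:

    from faster_whisper import WhisperModel

    # Inside NexaVoiceInference, after cuBLAS/cuDNN are installed locally:
    self.model = WhisperModel(
        self.downloaded_path,
        device="auto",  # was "cpu"; "auto" selects CUDA when available, else CPU
        compute_type=self.params["compute_type"],
    )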

Thank you for your question; we are committed to solving this problem thoroughly in the near future.
