Does using kilosort3 in docker require GPU access now? #3562

Open
taningh86 opened this issue Dec 2, 2024 · 11 comments
Labels
question General question regarding SI

Comments

@taningh86

Hi,
I'm wondering whether recent updates to SpikeInterface or Docker have changed a setting so that GPU access is now required. While running kilosort3 I get an 'Unable to find a supported GPU device' error, which I think comes from Docker trying to access the GPU and failing because PyTorch is not currently installed in my 'si_env' environment.
Please confirm whether that is the case or whether there is more to this. If GPU access is now required, I will install a PyTorch build compatible with CUDA 11.8 in my environment.
Thanks
Jimmy

@alejoe91 alejoe91 added the question General question regarding SI label Dec 2, 2024
@alejoe91
Member

alejoe91 commented Dec 2, 2024

Hi, Kilosort3 has always required a GPU. The Docker image comes with pre-installed CUDA binaries, but the GPU needs to be available on the system. Does running `nvidia-smi` work?
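A working `nvidia-smi` on the host does not by itself show that the GPU is visible from inside a container. A minimal sketch of a container-side check follows. It assumes Docker 19.03 or later (for the `--gpus` flag), an installed NVIDIA container toolkit that exposes `nvidia-smi` inside the container, and an image without a custom entrypoint. The image name `spikeinterface/kilosort3-compiled-base` is an assumption; substitute whichever image your run actually pulls.

```python
import shutil
import subprocess

# Assumption: illustrative image name; replace with the image your run pulls.
IMAGE = "spikeinterface/kilosort3-compiled-base"


def build_gpu_check_command(image: str = IMAGE) -> list[str]:
    """Return the docker command that runs nvidia-smi inside `image`."""
    return ["docker", "run", "--rm", "--gpus", "all", image, "nvidia-smi"]


def gpu_visible_in_container(image: str = IMAGE, timeout: float = 300.0) -> bool:
    """Return True if nvidia-smi exits with code 0 inside the container."""
    if shutil.which("docker") is None:
        raise RuntimeError("docker executable not found on PATH")
    result = subprocess.run(
        build_gpu_check_command(image),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # Show whatever the container printed so a failure is easy to read.
    print(result.stdout or result.stderr)
    return result.returncode == 0
```

Calling `gpu_visible_in_container()` from the same environment that launches the sorter shows whether the failure happens at the Docker level or only later, inside Kilosort3.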

@taningh86
Author

Yeah, running nvidia-smi works and gives the following info as usual:

Mon Dec 2 09:46:39 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 556.12 Driver Version: 556.12 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Driver-Model | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 Quadro RTX 5000 WDDM | 00000000:01:00.0 On | Off |
| 34% 32C P8 15W / 230W | 5403MiB / 16384MiB | 7% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 1560 C+G C:\Users\Gregg\anaconda3\python.exe N/A |
| 0 N/A N/A 8556 C+G ...__8wekyb3d8bbwe\Notepad\Notepad.exe N/A |
| 0 N/A N/A 12440 C+G ...oogle\Chrome\Application\chrome.exe N/A |
| 0 N/A N/A 13128 C+G C:\Users\Gregg\anaconda3\python.exe N/A |
| 0 N/A N/A 13916 C+G ...2txyewy\StartMenuExperienceHost.exe N/A |
| 0 N/A N/A 13936 C+G C:\Windows\explorer.exe N/A |
| 0 N/A N/A 14284 C+G ...nt.CBS_cw5n1h2txyewy\SearchHost.exe N/A |
| 0 N/A N/A 14452 C+G ...\Docker\frontend\Docker Desktop.exe N/A |
| 0 N/A N/A 16960 C+G ...CBS_cw5n1h2txyewy\TextInputHost.exe N/A |
| 0 N/A N/A 17572 C+G ...oogle\Chrome\Application\chrome.exe N/A |
| 0 N/A N/A 19176 C+G ...crosoft\Edge\Application\msedge.exe N/A |
| 0 N/A N/A 19512 C+G ...es (x86)\Dropbox\Client\Dropbox.exe N/A |
| 0 N/A N/A 20568 C+G ...\Local\slack\app-4.40.133\slack.exe N/A |
| 0 N/A N/A 21616 C+G ...__8wekyb3d8bbwe\WindowsTerminal.exe N/A |
| 0 N/A N/A 23976 C+G ...5n1h2txyewy\ShellExperienceHost.exe N/A |
| 0 N/A N/A 24660 C+G ...on\HEX\Creative Cloud UI Helper.exe N/A |
| 0 N/A N/A 24720 C+G C:\Users\Gregg\anaconda3\python.exe N/A |
| 0 N/A N/A 27304 C+G C:\Users\Gregg\anaconda3\python.exe N/A |
| 0 N/A N/A 27764 C+G ...es (x86)\Dropbox\Client\Dropbox.exe N/A |
| 0 N/A N/A 28724 C+G ...t Office\root\Office16\POWERPNT.EXE N/A |
+-----------------------------------------------------------------------------------------+

@alejoe91
Member

alejoe91 commented Dec 2, 2024

Mmm, then there might be some incompatibility between your CUDA version and the one in the Docker image... Can you share the entire error you're getting?
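To compare versions, the driver and CUDA versions can be pulled out of the `nvidia-smi` header shown earlier in the thread. The sketch below is only a parsing helper. The `required` argument is a hypothetical value you would look up for the image yourself; the thread does not state which CUDA version the image uses. Note that the "CUDA Version" reported by `nvidia-smi` is the highest version the driver supports, so a larger number on the host does not by itself indicate a mismatch.

```python
import re
from typing import NamedTuple, Optional


class SmiVersions(NamedTuple):
    driver: str
    cuda: str


# Matches the header line, e.g. "Driver Version: 556.12 CUDA Version: 12.5".
_HEADER = re.compile(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)")


def parse_smi_header(text: str) -> Optional[SmiVersions]:
    """Extract driver and CUDA versions from nvidia-smi output, or None."""
    match = _HEADER.search(text)
    if match is None:
        return None
    return SmiVersions(driver=match.group(1), cuda=match.group(2))


def _as_tuple(version: str) -> tuple:
    """Turn a dotted version string such as '12.5' into (12, 5)."""
    return tuple(int(part) for part in version.split("."))


def cuda_at_least(reported: str, required: str) -> bool:
    """Compare dotted version strings numerically rather than as text."""
    return _as_tuple(reported) >= _as_tuple(required)
```

For example, `parse_smi_header(output).cuda` can be passed to `cuda_at_least` together with the version the image was built against, once that is known.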

@taningh86
Author

Sure. Let me restart the notebook and run the code again to get the error. Will update in a few minutes.

@taningh86
Author

This is the error I first got:
`Spikeinterface errors:
Why the following error:

SpikeSortingError Traceback (most recent call last)
Cell In[14], line 2
1 # Only run this if you have NOT sorted your data yet.
----> 2 sorting_KS3_trimmed = ss.run_sorter('kilosort3', recording_saved,
3 folder=base_folder / 'results_KS3_trimmed_1',
4 verbose=True, docker_image=True,
5 )

File ~\anaconda3\envs\si_env\Lib\site-packages\spikeinterface\sorters\runsorter.py:210, in run_sorter(sorter_name, recording, folder, remove_existing_folder, delete_output_folder, verbose, raise_error, docker_image, singularity_image, delete_container_files, with_output, output_folder, **sorter_params)
204 if not has_spython():
205 raise RuntimeError(
206 "The python spython package must be installed to "
207 "run singularity. Install with pip install spython"
208 )
--> 210 return run_sorter_container(
211 container_image=container_image,
212 mode=mode,
213 **common_kwargs,
214 )
216 return run_sorter_local(**common_kwargs)

File ~\anaconda3\envs\si_env\Lib\site-packages\spikeinterface\sorters\runsorter.py:663, in run_sorter_container(sorter_name, recording, mode, container_image, folder, remove_existing_folder, delete_output_folder, verbose, raise_error, with_output, delete_container_files, extra_requirements, installation_mode, spikeinterface_version, spikeinterface_folder_source, output_folder, **sorter_params)
661 if run_error:
662 if raise_error:
--> 663 raise SpikeSortingError(f"Spike sorting in {mode} failed with the following error:\n{run_sorter_output}")
664 else:
665 if with_output:

SpikeSortingError: Spike sorting in docker failed with the following error:
/Cat_GT_Out/catgt_NPX2_2_28_24_2Rght_LHA_RSP_EXP_g0/in_container_sorter_script.py:23: DeprecationWarning: output_folder is deprecated and will be removed in version 0.103.0 Please use folder instead
sorting = run_sorter_local(
Traceback (most recent call last):
File "/Cat_GT_Out/catgt_NPX2_2_28_24_2Rght_LHA_RSP_EXP_g0/in_container_sorter_script.py", line 23, in
RUNNING SHELL SCRIPT: /Cat_GT_Out/catgt_NPX2_2_28_24_2Rght_LHA_RSP_EXP_g0/results_KS3_trimmed_1/sorter_output/run_kilosort3.sh
Warning: X does not support locale C.UTF-8

Time 0s. Computing whitening matrix..

Getting channel whitening matrix...

----------------------------------------Error using gpuArray.zeros

Unable to find a supported GPU device. For more information on GPU support, see GPU Support by Release.
Error running kilosort3
sorting = run_sorter_local(
File "/root/.local/lib/python3.9/site-packages/spikeinterface/sorters/runsorter.py", line 276, in run_sorter_local
SorterClass.run_from_folder(folder, raise_error, verbose)
File "/root/.local/lib/python3.9/site-packages/spikeinterface/sorters/basesorter.py", line 301, in run_from_folder
raise SpikeSortingError(
spikeinterface.sorters.utils.misc.SpikeSortingError: Spike sorting error trace:
Traceback (most recent call last):
File "/root/.local/lib/python3.9/site-packages/spikeinterface/sorters/basesorter.py", line 261, in run_from_folder
SorterClass._run_from_folder(sorter_output_folder, sorter_params, verbose)
File "/root/.local/lib/python3.9/site-packages/spikeinterface/sorters/external/kilosortbase.py", line 217, in _run_from_folder
raise Exception(f"{cls.sorter_name} returned a non-zero exit code")
Exception: kilosort3 returned a non-zero exit code

Spike sorting failed. You can inspect the runtime trace in /Cat_GT_Out/catgt_NPX2_2_28_24_2Rght_LHA_RSP_EXP_g0/results_KS3_trimmed_1/spikeinterface_log.json.

For this code:
sorting_KS3_trimmed = ss.run_sorter('kilosort3', recording_saved,
folder=base_folder / 'results_KS3_trimmed_1',
verbose=True, docker_image=True,
)`
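The traceback above points to a `spikeinterface_log.json` file in the output folder. A small helper for reading it is sketched below. The key names `error` and `error_trace` are assumptions about the log layout, so the code also lists whatever keys are actually present instead of relying on them.

```python
import json
from pathlib import Path


def summarize_sorter_log(log_path):
    """Load a sorter log JSON file and return a short text summary."""
    data = json.loads(Path(log_path).read_text(encoding="utf-8"))
    lines = [f"keys: {sorted(data)}"]
    # Assumption: 'error' and 'error_trace' may not exist in every version.
    for key in ("error", "error_trace"):
        if key in data:
            lines.append(f"{key}: {data[key]}")
    return "\n".join(lines)
```

Printing `summarize_sorter_log(...)` for the log path named in the error message gives the recorded failure without rerunning the sorter.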

@taningh86
Author

Sorry the font changed, and the bold letters are not my intentional doing :)

@alejoe91
Member

alejoe91 commented Dec 2, 2024

Thanks. The error is not due to a missing GPU, but to some internal Kilosort3 failure... It could be something wrong with your data, or high levels of localized activity/noise that make whitening fail.

@taningh86
Author

Hmm, OK. I thought the whitening failure also had something to do with the GPU issue. What is the 'GPU not found' error then?
Also, I have run the same data multiple times and never used to get this error. The only difference is that I have not used SI for a few months. I am using version 0.101.0 of SI and was wondering if anything has changed since my last use.
What do you recommend for this issue?

@alejoe91
Member

alejoe91 commented Dec 2, 2024

Ahh sorry, I misread the error... Yeah, it could be an incompatibility between the GPU/CUDA version on your system and the one KS was compiled against. Unfortunately, in this case the only solution would be to build a new image with the required versions...

@taningh86
Author

OK. I will do that and see if it works.
Also, which version of notebook are you using? Mine is 7.2.2, and I believe it came by default with the si_env environment installation. I may be having some issues with the notebook too.

@alejoe91
Member

alejoe91 commented Dec 2, 2024

We don't specify any notebook version (also there are many different packages involved). If you have problems with Jupyter and spikeinterface, please open another issue about it!
