||Only For YellowRoseCx|| KoboldCPP-v1.79.1.yr1-ROCm // rocBLAS error: Cannot read ...TensileLibrary.dat... #1267

Open
trolly056 opened this issue Dec 15, 2024 · 4 comments

Comments

@trolly056

I couldn't find a way to message YellowRoseCx and there was no way to create an issue on his fork. I was getting this error:
rocBLAS_Error
When I googled this error, I saw some issues here, and it seemed to be about detecting the correct GPU when someone's CPU has an iGPU. Mine doesn't have one.

It also dawned on me that this was a temp file created by koboldcpp. Then I recalled the odd file size difference between YellowRoseCx's new version and the previous one, a reduction that wasn't present in LostRuins's releases. So, as a last attempt before going mald and giving up, I tried the version from before the size reduction, and voila, it works.

I'm assuming (because I don't know how any of this works) that YellowRoseCx forgot to include whatever file ROCm needs in his latest version, hence the file size reduction from 568 MB to 404 MB. KoboldCPP-v1.79.1.yr1-ROCm gives the error; KoboldCPP-v1.79.1.yr0-ROCm does not.

Hopefully this prevents some new issues being created here.
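
For anyone who wants to check the "missing file" guess on their own machine, here is a minimal sketch. It assumes the files in question have names starting with `TensileLibrary` and ending in `.dat`, as in the error above. The function name `find_tensile_libraries` is made up for illustration, and the directory to search is whatever folder your own error message points at (the path is elided in this report, so none is filled in here).

```python
from pathlib import Path


def find_tensile_libraries(root):
    """Return sorted paths under `root` named like TensileLibrary*.dat.

    `root` is a directory you supply yourself, e.g. the folder named in
    your own rocBLAS error message. Nothing here is KoboldCPP-specific.
    """
    root = Path(root)
    if not root.is_dir():
        # A missing directory is itself a useful answer: nothing was unpacked.
        return []
    return sorted(p for p in root.rglob("TensileLibrary*.dat") if p.is_file())


if __name__ == "__main__":
    import sys

    # Usage: python check_tensile.py <directory from your error message>
    hits = find_tensile_libraries(sys.argv[1]) if len(sys.argv) > 1 else []
    for path in hits:
        print(path)
    print(f"{len(hits)} TensileLibrary .dat file(s) found")
```

An empty result for the failing build and a non-empty one for the working build would be consistent with the guess above, though it would not prove it.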

@LostRuins
Owner

Have you tried the Vulkan backend? It might have better compatibility in recent versions.

@trolly056
Author

trolly056 commented Dec 15, 2024

I did not know I could use Vulkan; I thought only ROCm was viable for whatever reason. Just tested Vulkan: I got 0.92x (7.5% slower) token generation but 2.3x faster token processing compared to ROCm. It does work, though. Thanks for letting me know. You can close this issue whenever you feel like it. Hopefully someone sees this and has a way to contact @YellowRoseCx. If anyone is wondering, I'm using an RX 7600.
I will add that I can actually feel the difference in token generation speed. Processing speed feels quite good, although I only tested up to 1.6k context. However, processing is something that only needs to happen the first time you open KoboldCPP, so you can just wait and do something else in the meantime. Generation speed is what actively affects your experience. Again, though, it's a small difference, maybe half a token per second.
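
As a side note on reading those ratios, here is a small sketch of how a speed ratio maps to a percent change. The helper name `relative_change_percent` is made up for illustration, and the only inputs are the two ratios quoted above. No raw tokens/second figures were posted, so none are assumed.

```python
def relative_change_percent(ratio):
    """Convert a speed ratio (new / baseline) into a signed percent change.

    Negative means slower than the baseline, positive means faster.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return (ratio - 1.0) * 100.0


# Ratios as quoted in the comment above (Vulkan relative to ROCm).
for label, ratio in [("generation", 0.92), ("processing", 2.3)]:
    print(f"{label}: {ratio}x -> {relative_change_percent(ratio):+.1f}%")
# generation: 0.92x -> -8.0%
# processing: 2.3x -> +130.0%
```

By this formula 0.92x works out to 8% slower, and 7.5% slower would be about 0.925x, so the quoted figures were presumably rounded.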

@cb88

cb88 commented Dec 18, 2024

The YellowRoseCx fork doesn't have issues enabled, but you can open a discussion about your problem there.

@YellowRoseCx

> The yellowrose fork doesn't have tickets but you can open a discussion about your issue there.

I can't believe I never noticed that. I just enabled issues on the fork, thank you.
