
Feature Request: Add PaliGemma support #7875

Closed · 4 tasks done
nischalj10 opened this issue Jun 11, 2024 · 4 comments

Labels
enhancement (New feature or request), stale

Comments

@nischalj10

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

PaliGemma is a really solid model, and there are many requests for it in the Discussions.

Motivation

It punches well above its weight and has really good OCR capabilities.

Possible Implementation

No response

@nischalj10 nischalj10 added the enhancement New feature or request label Jun 11, 2024
@iamlemec
Collaborator

Yup! Work in progress at #7553.

@github-actions github-actions bot added the stale label Jul 12, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

@abetlen abetlen reopened this Aug 10, 2024
@github-actions github-actions bot removed the stale label Aug 11, 2024
@TalonBvV

TalonBvV commented Sep 9, 2024

Hey there, I hope all is well. Any luck getting the model to run?

I'm quite curious: I managed to use clip.cpp quantization to quantize Phi-3 Vision's projector and llama.cpp quantization to quantize the language-model component, and the result was a pretty useful VLM with a total size of 1.5 GB.
While this is great, because it proves that VLMs can be smaller and still accurate, in terms of overall functionality PaliGemma is a much better choice: it has baked-in bounding-box abilities, which on their own open up limitless possibilities.
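The two-step recipe above can be sketched as shell commands. This is only a sketch under assumptions: the binary names, model file names, and the numeric type argument for clip.cpp's quantize example are typical of those projects' builds, not taken from this thread, so adjust them to your local checkouts.

```shell
# Hypothetical paths and filenames; adjust to your local builds and models.

# 1. Quantize the vision projector with clip.cpp's quantize example
#    (it takes input model, output model, and an integer quantization type;
#    2 is commonly q4_0-style -- check your build's usage message).
./clip.cpp/build/bin/quantize \
    phi3-vision-projector-f16.gguf \
    phi3-vision-projector-q4_0.gguf 2

# 2. Quantize the language-model component with llama.cpp's quantize tool.
./llama.cpp/build/bin/llama-quantize \
    phi3-mini-f16.gguf \
    phi3-mini-q4_0.gguf Q4_0
```

Combining the two quantized files in a llava-style runner is what yields the small (~1.5 GB) VLM the comment describes.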

@github-actions github-actions bot added the stale label Oct 10, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
