Native Support for PDF Reading in Anthropic and Gemini Models (API and Vertex) #7243

Abel1011 · 2024-11-22T19:08:47Z

Abel1011
Nov 22, 2024

Checked

I searched existing ideas and did not find a similar one
I added a very descriptive title
I've clearly described the feature request and motivation for it

Feature request

I propose adding native support for reading PDF files in the Anthropic and Gemini models via their respective APIs (Anthropic API and Vertex AI). This feature would allow users to upload a PDF file directly for processing, enabling the models to extract both text and visual elements, such as images.

Expected functionality:

Direct upload of a PDF file to the API.
Automated processing of the PDF content, including both text and images.
Structured output containing the extracted text, metadata, and visual details.

Motivation

The ability to work with PDFs natively is essential for a wide range of use cases, including legal document analysis, technical reports, academic studies, and any context involving a combination of text and images.

Currently, users need to preprocess PDFs manually before sending them to the models, which adds complexity, time, and potential errors to the workflow. Implementing native support would streamline the process, improve efficiency, and enhance the versatility of the APIs.

Proposal (If applicable)

No response

abc0008 · 2024-12-02T19:55:39Z

abc0008
Dec 2, 2024

This is a huge need! My understanding is that Anthropic is using a ColPali style late interaction mechanism which is far superior to OCR + embedding etc.

Much simpler for user experience as well (provided files fit within sizing parameters).

Please add!

0 replies

khenzo · 2024-12-21T18:08:45Z

khenzo
Dec 21, 2024

+1

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Native Support for PDF Reading in Anthropic and Gemini Models (API and Vertex) #7243

{{title}}

Replies: 2 comments

{{title}}

{{title}}

Select a reply

Native Support for PDF Reading in Anthropic and Gemini Models (API and Vertex) #7243

Abel1011 Nov 22, 2024

Checked

Feature request

Motivation

Proposal (If applicable)

Replies: 2 comments

abc0008 Dec 2, 2024

khenzo Dec 21, 2024

Abel1011
Nov 22, 2024

abc0008
Dec 2, 2024

khenzo
Dec 21, 2024