Insights: DS4SD/docling
Overview
3 Releases published by 1 person
10 Pull requests merged by 7 people
-
docs: add Weaviate RAG recipe notebook
#451 merged
Dec 19, 2024 -
docs: document Haystack & Vectara support
#628 merged
Dec 19, 2024 -
feat: Create a backend to transform PubMed XML files to DoclingDocument
#557 merged
Dec 17, 2024 -
feat: Updated Layout processing with forms and key-value areas
#530 merged
Dec 17, 2024 -
test: generate file from CLI in a temporary directory
#618 merged
Dec 17, 2024 -
feat: create a backend to parse USPTO patents into DoclingDocument
#606 merged
Dec 17, 2024 -
docs: add Haystack RAG example
#615 merged
Dec 17, 2024 -
feat: Add Easyocr parameter recog_network
#613 merged
Dec 17, 2024 -
docs: Fix the path to the run_with_accelerator.py example
#608 merged
Dec 16, 2024 -
feat: Introduce support for Accelerator options
#593 merged
Dec 13, 2024
3 Pull requests opened by 3 people
-
chore: Add example for inspection of picture content
#624 opened
Dec 18, 2024 -
feat: Enable markdown text formatting for docx
#630 opened
Dec 19, 2024 -
feat: experimental, - adding optional unsanitized cells to page_processig and table model
#631 opened
Dec 19, 2024
30 Issues closed by 13 people
-
Error building extension 'MultiScaleDeformableAttention' when running sample from web site.
#603 closed
Dec 18, 2024 -
Integrate Docling in Haystack
#453 closed
Dec 18, 2024 -
Deployment of docling using Docker
#303 closed
Dec 18, 2024 -
name 'reader' is not defined in your sample code
#523 closed
Dec 18, 2024 -
missing compatibility with .safetensor version of docling-models
#591 closed
Dec 18, 2024 -
Change language of our
#471 closed
Dec 18, 2024 -
Issue with reading downloading model
#580 closed
Dec 18, 2024 -
Should the second "if" keyword in adapt_bbox from layout_utils.py rather be an "elif" keyword ?
#362 closed
Dec 18, 2024 -
Docx cannot get pic info
#391 closed
Dec 18, 2024 -
docling vs GROBID
#74 closed
Dec 18, 2024 -
Input document doc\2410.13085v1.pdf is not valid.
#554 closed
Dec 18, 2024 -
Know issues with installation on windows?
#559 closed
Dec 18, 2024 -
cannot find loader for this WMF file <- When extracting a PPTX.
#560 closed
Dec 18, 2024 -
Text missing or displaced in parsed table
#540 closed
Dec 18, 2024 -
Create a backend to transform XML files to DoclingDocument
#446 closed
Dec 17, 2024 -
Create a backend to transform USPTO patents (XML and TXT) to DoclingDocument
#605 closed
Dec 17, 2024 -
Allowing EasyOCR to use the recog_network parameter
#602 closed
Dec 17, 2024 -
Seeing "numbers" as text in converted "tables" json
#588 closed
Dec 17, 2024 -
PermissionError: [Errno 13] Permission denied
#607 closed
Dec 16, 2024 -
Convert model weights to safetensors format
#308 closed
Dec 15, 2024 -
State of GPU support
#133 closed
Dec 15, 2024 -
Unable to find installation candidates for torchvision (0.20.1)
#596 closed
Dec 14, 2024 -
It is hoped that the content translation function can be added during conversion
#589 closed
Dec 14, 2024 -
External OCR API Integration
#575 closed
Dec 13, 2024 -
Turning off unused features (OCR, tables)
#583 closed
Dec 13, 2024 -
May I ask how could I disable download easyocr model file if my linux server has no internet?
#577 closed
Dec 13, 2024 -
how to use the new models in huggingface
#587 closed
Dec 13, 2024 -
Problem with .xlsx file reading
#581 closed
Dec 13, 2024 -
XLSX Not Working- Docling Core Needs Update
#493 closed
Dec 13, 2024
22 Issues opened by 22 people
-
When I run the example code, nothing happens—not even an error.
#635 opened
Dec 19, 2024 -
Pass HTTP request headers to docling when parsing via url
#634 opened
Dec 19, 2024 -
Headers and footers for docx
#632 opened
Dec 19, 2024 -
Problem with Markdown-Export on Windows
#629 opened
Dec 19, 2024 -
Processing of TOC objects in Word Documents DOCX fails
#627 opened
Dec 19, 2024 -
Writing picture enrichment annotations to Markdown file
#625 opened
Dec 18, 2024 -
Convert Markdown document incorrect
#623 opened
Dec 18, 2024 -
OCR on hand-written english text and digits?
#619 opened
Dec 18, 2024 -
support google spreadsheet
#617 opened
Dec 17, 2024 -
Explicitly reexport all attributes meant as public
#614 opened
Dec 17, 2024 -
Numbered headings in Word documents appear as list items
#612 opened
Dec 16, 2024 -
AttributeError: module 'torch.compiler' has no attribute 'is_compiling' error
#611 opened
Dec 16, 2024 -
export_to_markdown with ImageRefMode.EMBEDDED generate error
#610 opened
Dec 16, 2024 -
Arabic OCR is not working
#601 opened
Dec 16, 2024 -
Special characters are no longer recognized correctly
#600 opened
Dec 15, 2024 -
Unable to create tensor
#599 opened
Dec 15, 2024 -
UnicodeEncodeError - Several Different PDFs
#598 opened
Dec 14, 2024 -
Add S3 integration for input/output files
#597 opened
Dec 13, 2024 -
WMF images cause unhandled exceptions in mspowerpoint_backend
#594 opened
Dec 13, 2024 -
Nested List
#592 opened
Dec 13, 2024 -
Doesn't parse the table, treats it as an image
#590 opened
Dec 13, 2024 -
Docling lost the link URLs embedded in the text while parsing the PDF content.
#585 opened
Dec 13, 2024
10 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Placeholder elements in Powerpoint files have no size
#584 commented on
Dec 16, 2024 • 0 new comments -
Backend: epub support
#515 commented on
Dec 18, 2024 • 0 new comments -
UnicodeEncodeError: 'charmap' codec can't encode character '\u015b' in position 895: character maps to <undefined>
#578 commented on
Dec 18, 2024 • 0 new comments -
AsciiDoc backend fails parsing admonitions
#566 commented on
Dec 18, 2024 • 0 new comments -
UnicodeEncodeError: 'charmap' codec can't encode character '\uf0df' in position 28689: character maps to <undefined>
#579 commented on
Dec 18, 2024 • 0 new comments -
Standalone version of EasyOCR giving much better result than using EasyOCR in docling [ tested with Vietnamese ]
#426 commented on
Dec 18, 2024 • 0 new comments -
Text in PDF Recognized as Image Instead of Text During Parsing
#572 commented on
Dec 18, 2024 • 0 new comments -
Incorrect Reading Order in Single-page Image-Text Layouts
#570 commented on
Dec 19, 2024 • 0 new comments -
Docling 2.10.0: Performance Degradation When Reading Large PDF Files
#568 commented on
Dec 19, 2024 • 0 new comments -
Dev/update html parser with h1
#240 commented on
Dec 17, 2024 • 0 new comments