feat: Updated Layout processing with forms and key-value areas #530

Merged: 45 commits, Dec 17, 2024
Commits
b9f8f5a
Upgraded Layout Postprocessing, sending old code back to ERZ
cau-git Dec 2, 2024
05bffd3
Implement hierachical cluster layout processing
cau-git Dec 3, 2024
db70916
Pass nested cluster processing through full pipeline
cau-git Dec 3, 2024
65fa584
Pass nested clusters through GLM as payload
cau-git Dec 3, 2024
a1ac0c6
Move to_docling_document from ds-glm to this repo
cau-git Dec 4, 2024
e826642
Merge branch 'release_v3' of github.com:DS4SD/docling into cau/layout…
cau-git Dec 4, 2024
8b04edd
Clean up imports again
cau-git Dec 4, 2024
ddb8ad9
feat(Accelerator): Introduce options to control the num_threads and d…
nikos-livathinos Dec 2, 2024
40d7a8e
Merge pull request #504 from DS4SD/cau/layout-postprocessing
cau-git Dec 6, 2024
6f0b912
Rebase from release_v3
cau-git Dec 6, 2024
975fe07
fix: Improve the pydantic objects in the pipeline_options and imports.
nikos-livathinos Dec 6, 2024
5d5d14d
fix: TableStructureModel: Refactor the artifacts path to use the new …
nikos-livathinos Dec 9, 2024
03f8690
Updated test ground-truth
cau-git Dec 9, 2024
46ae215
Updated test ground-truth (again), bugfix for empty layout
cau-git Dec 9, 2024
9e99e24
Rebase from main
cau-git Dec 9, 2024
bb1774d
Merge branch 'release_v3' into nli/performance
cau-git Dec 9, 2024
accb7b4
fix: Do proper check to set the device in EasyOCR, RapidOCR.
nikos-livathinos Dec 10, 2024
94caee3
fix: Correct the way to set GPU for EasyOCR, RapidOCR
nikos-livathinos Dec 10, 2024
f46fd9c
fix: Ocr AccleratorDevice
nikos-livathinos Dec 10, 2024
e282bfd
Merge pull request #514 from DS4SD/nli/performance
cau-git Dec 10, 2024
cd579fd
Merge pull request #556 from DS4SD/cau/layout-processing-improvement
cau-git Dec 10, 2024
586abd5
Rebase from main
cau-git Dec 10, 2024
bd30b46
Update lockfile
cau-git Dec 10, 2024
c8b5915
Update tests
cau-git Dec 10, 2024
f4512d0
Update HF model ref, reset test generate
cau-git Dec 10, 2024
5a82f2b
Rebase from main
cau-git Dec 11, 2024
48db8a5
Repin to release package versions
cau-git Dec 11, 2024
55b195c
Many layout processing improvements, add document index type
cau-git Dec 11, 2024
c02af42
Update pinnings to docling-core
cau-git Dec 12, 2024
f57884a
Merge from main
cau-git Dec 12, 2024
3f854bd
Update test GT
cau-git Dec 12, 2024
dd4f72e
Fix table box snapping
cau-git Dec 13, 2024
a9ff29b
Fixes for cluster pre-ordering
cau-git Dec 13, 2024
81bf033
Rebase from main
cau-git Dec 16, 2024
18e01f6
Introduce OCR confidence, propagate to orphan in post-processing
cau-git Dec 16, 2024
bed8fc8
Fix form and key value area groups
cau-git Dec 16, 2024
c5d1fbf
Adjust confidence in EasyOcr
cau-git Dec 17, 2024
8f09fcd
Merge branch 'main' of github.com:DS4SD/docling into release_v3
cau-git Dec 17, 2024
6c8c625
Roll back CLI changes from main
cau-git Dec 17, 2024
8243325
Update test GT
cau-git Dec 17, 2024
d29a245
Update docling-core pinning
cau-git Dec 17, 2024
6d38c7c
Annoying fixes for historical python versions
cau-git Dec 17, 2024
dca32bf
Updated test GT for legacy
cau-git Dec 17, 2024
2c2026d
Merge branch 'main' of github.com:DS4SD/docling into release_v3
cau-git Dec 17, 2024
7649ba7
Comment cleanup
cau-git Dec 17, 2024
9 changes: 8 additions & 1 deletion docling/datamodel/base_models.py
@@ -129,6 +129,7 @@ class Cluster(BaseModel):
     bbox: BoundingBox
     confidence: float = 1.0
     cells: List[Cell] = []
+    children: List["Cluster"] = []  # Add child cluster support


 class BasePageElement(BaseModel):
@@ -143,6 +144,12 @@ class LayoutPrediction(BaseModel):
     clusters: List[Cluster] = []


+class ContainerElement(
+    BasePageElement
+):  # Used for Form and Key-Value-Regions, only for typing.
+    pass
+
+
 class Table(BasePageElement):
     otsl_seq: List[str]
     num_rows: int = 0
@@ -182,7 +189,7 @@ class PagePredictions(BaseModel):
     equations_prediction: Optional[EquationPrediction] = None


-PageElement = Union[TextElement, Table, FigureElement]
+PageElement = Union[TextElement, Table, FigureElement, ContainerElement]


 class AssembledUnit(BaseModel):
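
Note (illustrative, not part of this PR): the new `children` field makes `Cluster` a tree. The sketch below shows one way such a tree might be walked depth-first. It uses a stdlib dataclass as a stand-in for the pydantic model; the tuple-based `bbox`, the label strings, and the `iter_clusters` helper are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Cluster:
    # Stand-in for docling's pydantic Cluster; fields are simplified assumptions.
    id: int
    label: str
    bbox: Tuple[float, float, float, float]  # (left, top, right, bottom)
    confidence: float = 1.0
    children: List["Cluster"] = field(default_factory=list)


def iter_clusters(cluster: Cluster, depth: int = 0) -> Iterator[Tuple[int, Cluster]]:
    """Yield (depth, cluster) pairs depth-first, parents before children."""
    yield depth, cluster
    for child in cluster.children:
        yield from iter_clusters(child, depth + 1)


# A hypothetical form region containing two text clusters.
form = Cluster(
    id=0,
    label="form",
    bbox=(0.0, 0.0, 100.0, 50.0),
    children=[
        Cluster(id=1, label="text", bbox=(5.0, 5.0, 45.0, 20.0)),
        Cluster(id=2, label="text", bbox=(55.0, 5.0, 95.0, 20.0)),
    ],
)

visited = [(depth, c.id) for depth, c in iter_clusters(form)]
```

Here `visited` lists each cluster id with its nesting depth, which is the kind of information a consumer of form and key-value regions would likely need.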
4 changes: 3 additions & 1 deletion docling/datamodel/document.py
@@ -73,7 +73,7 @@

 layout_label_to_ds_type = {
     DocItemLabel.TITLE: "title",
-    DocItemLabel.DOCUMENT_INDEX: "table-of-contents",
+    DocItemLabel.DOCUMENT_INDEX: "table",
     DocItemLabel.SECTION_HEADER: "subtitle-level-1",
     DocItemLabel.CHECKBOX_SELECTED: "checkbox-selected",
     DocItemLabel.CHECKBOX_UNSELECTED: "checkbox-unselected",
@@ -88,6 +88,8 @@
     DocItemLabel.PICTURE: "figure",
     DocItemLabel.TEXT: "paragraph",
     DocItemLabel.PARAGRAPH: "paragraph",
+    DocItemLabel.FORM: DocItemLabel.FORM.value,
+    DocItemLabel.KEY_VALUE_REGION: DocItemLabel.KEY_VALUE_REGION.value,
 }

 _EMPTY_DOCLING_DOC = DoclingDocument(name="dummy")
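
Note (illustrative, not part of this PR): the two new entries map the container labels to their own enum values, and callers look types up with `.get()`, which returns `None` for unmapped labels. A minimal sketch of that behaviour follows. The enum here is a hypothetical subset, its string values are assumptions, and `ds_type_for` is a helper invented for this example.

```python
from enum import Enum
from typing import Dict, Optional


class DocItemLabel(str, Enum):
    # Hypothetical subset of labels; the string values are assumptions.
    TITLE = "title"
    DOCUMENT_INDEX = "document_index"
    FORM = "form"
    KEY_VALUE_REGION = "key_value_region"
    UNMAPPED_EXAMPLE = "unmapped_example"  # deliberately absent from the mapping


layout_label_to_ds_type: Dict[DocItemLabel, str] = {
    DocItemLabel.TITLE: "title",
    DocItemLabel.DOCUMENT_INDEX: "table",
    # Container labels reuse their own enum value as the legacy type name.
    DocItemLabel.FORM: DocItemLabel.FORM.value,
    DocItemLabel.KEY_VALUE_REGION: DocItemLabel.KEY_VALUE_REGION.value,
}


def ds_type_for(label: DocItemLabel) -> Optional[str]:
    """Return the legacy type name, or None when the label has no mapping."""
    return layout_label_to_ds_type.get(label)


form_type = ds_type_for(DocItemLabel.FORM)
index_type = ds_type_for(DocItemLabel.DOCUMENT_INDEX)
missing_type = ds_type_for(DocItemLabel.UNMAPPED_EXAMPLE)
```

Because `.get()` silently yields `None`, any label missing from the mapping would reach the legacy document with an empty `obj_type`, which is presumably why the form and key-value labels were added explicitly.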
2 changes: 2 additions & 0 deletions docling/datamodel/pipeline_options.py
@@ -139,6 +139,8 @@ class EasyOcrOptions(OcrOptions):

     use_gpu: Optional[bool] = None

+    confidence_threshold: float = 0.65
+
     model_storage_directory: Optional[str] = None
     recog_network: Optional[str] = "standard"
     download_enabled: bool = True
1 change: 1 addition & 0 deletions docling/datamodel/settings.py
@@ -31,6 +31,7 @@ class DebugSettings(BaseModel):
     visualize_cells: bool = False
     visualize_ocr: bool = False
     visualize_layout: bool = False
+    visualize_raw_layout: bool = False
     visualize_tables: bool = False

     profile_pipeline_timings: bool = False
38 changes: 34 additions & 4 deletions docling/models/ds_glm_model.py
@@ -22,9 +22,15 @@
 from docling_core.types.legacy_doc.document import CCSFileInfoObject as DsFileInfoObject
 from docling_core.types.legacy_doc.document import ExportedCCSDocument as DsDocument
 from PIL import ImageDraw
-from pydantic import BaseModel, ConfigDict
-
-from docling.datamodel.base_models import Cluster, FigureElement, Table, TextElement
+from pydantic import BaseModel, ConfigDict, TypeAdapter
+
+from docling.datamodel.base_models import (
+    Cluster,
+    ContainerElement,
+    FigureElement,
+    Table,
+    TextElement,
+)
 from docling.datamodel.document import ConversionResult, layout_label_to_ds_type
 from docling.datamodel.settings import settings
 from docling.utils.glm_utils import to_docling_document
@@ -204,7 +210,31 @@ def make_spans(cell):
                             )
                         ],
                         obj_type=layout_label_to_ds_type.get(element.label),
-                        # data=[[]],
+                        payload={
+                            "children": TypeAdapter(List[Cluster]).dump_python(
+                                element.cluster.children
+                            )
+                        },  # hack to channel child clusters through GLM
                     )
                 )
+            elif isinstance(element, ContainerElement):
+                main_text.append(
+                    BaseText(
+                        text="",
+                        payload={
+                            "children": TypeAdapter(List[Cluster]).dump_python(
+                                element.cluster.children
+                            )
+                        },  # hack to channel child clusters through GLM
+                        obj_type=layout_label_to_ds_type.get(element.label),
+                        name=element.label,
+                        prov=[
+                            Prov(
+                                bbox=target_bbox,
+                                page=element.page_no + 1,
+                                span=[0, 0],
+                            )
+                        ],
+                    )
+                )
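
Note (illustrative, not part of this PR): the `payload` entries above serialize child clusters with `TypeAdapter(List[Cluster]).dump_python(...)` so they can travel through GLM as plain data. The sketch below shows that round trip, assuming pydantic v2. The `Cluster` model is a simplified stand-in (no `bbox` or `cells`), and restoring with `validate_python` is an assumption about how a consumer might read the payload back, not something shown in this diff.

```python
from typing import List

from pydantic import BaseModel, TypeAdapter


class Cluster(BaseModel):
    # Simplified stand-in for docling's Cluster (bbox and cells omitted).
    id: int
    label: str
    confidence: float = 1.0
    children: List["Cluster"] = []


cluster_list = TypeAdapter(List[Cluster])

children = [
    Cluster(id=1, label="text"),
    Cluster(id=2, label="text", children=[Cluster(id=3, label="checkbox_selected")]),
]

# Serialize to plain dicts and lists so the payload holds no model instances.
payload = {"children": cluster_list.dump_python(children)}

# On the receiving side, validate the plain data back into Cluster objects.
restored = cluster_list.validate_python(payload["children"])
```

Dumping to plain Python structures keeps the payload independent of the model classes on the GLM side, at the cost of re-validating when the clusters are needed again.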

1 change: 1 addition & 0 deletions docling/models/easyocr_model.py
@@ -118,6 +118,7 @@ def __call__(
                                 ),
                             )
                             for ix, line in enumerate(result)
+                            if line[2] >= self.options.confidence_threshold
                         ]
                         all_ocr_cells.extend(cells)
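
Note (illustrative, not part of this PR): the added `if` clause drops OCR lines whose confidence, the third element of each EasyOCR result tuple, falls below `confidence_threshold`. A minimal standalone sketch of the same filter is below. The `filter_by_confidence` helper and the sample detections are made up for this example.

```python
from typing import List, Tuple

# EasyOCR's readtext() yields (corner_points, text, confidence) per detection.
OcrLine = Tuple[List[List[int]], str, float]


def filter_by_confidence(result: List[OcrLine], threshold: float = 0.65) -> List[OcrLine]:
    """Keep detections whose confidence is at or above the threshold."""
    return [line for line in result if line[2] >= threshold]


# Made-up detections for illustration only.
sample_result: List[OcrLine] = [
    ([[0, 0], [80, 0], [80, 20], [0, 20]], "Invoice", 0.98),
    ([[0, 30], [60, 30], [60, 50], [0, 50]], "T0tal", 0.41),
    ([[0, 60], [90, 60], [90, 80], [0, 80]], "Amount", 0.65),
]

kept_text = [text for _, text, _ in filter_by_confidence(sample_result)]
```

The comparison is inclusive, so a line exactly at the threshold is kept. In the diff the filter sits inside the comprehension over `enumerate(result)`, so the `ix` values of surviving cells keep their original positions and may have gaps.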
