Metadata-Version: 2.1
Name: bbox-align
Version: 0.1.1
Summary: A python library that reorders bounding boxes generated by OCR engines into the correct reading order
Author-email: Gautham V Reddy <gauthamv93@yahoo.com>
Maintainer-email: Gautham V Reddy <gauthamv93@yahoo.com>
Project-URL: homepage, https://github.com/doctor-entropy/bbox-align
Project-URL: repository, https://github.com/doctor-entropy/bbox-align
Project-URL: documentation, https://github.com/doctor-entropy/bbox-align
Keywords: OCR,bounding boxes,reorder,lines
Requires-Python: >=3.8
Description-Content-Type: text/markdown

# bbox-align (In development)

`bbox-align` is a Python library that reorders bounding boxes generated by OCR engines into the correct reading order. It aims to group bounding boxes into logical lines (even when documents have folds, irregular spacing, or distortions) and sort them for downstream use in document processing.

## Installation

`pip install bbox-align`

**Prereqs:** Python 3.8+

## Concept
Two bounding boxes are considered inline if the y-coordinate of one box's vertical center lies within the top and bottom bounds of the other box.

<img src="./images/parallel.png" alt="parallel" style="width:1000px;"/>
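
The inline rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the `bbox-align` API: the `Box` type and the `center_y` and `is_inline` helpers are hypothetical names, boxes are assumed to be axis-aligned, and coordinates are assumed to follow image conventions (y grows downward, so `top <= bottom`). The sketch also assumes the test passes if it holds in either direction.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical axis-aligned box in image coordinates (y grows downward)."""
    left: float
    top: float
    right: float
    bottom: float


def center_y(box: Box) -> float:
    """Return the y-coordinate of the box's vertical center."""
    return (box.top + box.bottom) / 2.0


def is_inline(a: Box, b: Box) -> bool:
    """Return True if either box's vertical center lies within the
    top and bottom bounds of the other box."""
    return (b.top <= center_y(a) <= b.bottom) or (a.top <= center_y(b) <= a.bottom)


# Two words on the same text line, slightly offset vertically.
word_a = Box(left=0, top=10, right=50, bottom=30)
word_b = Box(left=60, top=18, right=120, bottom=42)
# A word on the next text line.
word_c = Box(left=0, top=40, right=50, bottom=60)

print(is_inline(word_a, word_b))
print(is_inline(word_a, word_c))
```

Here `word_a` and `word_b` count as inline because the center of `word_a` (y = 20) falls between the top and bottom of `word_b`, while `word_c` sits entirely below both centers. A check this simple does not account for rotated or skewed boxes, such as the crossed case described next.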

What if the bounding boxes cross?
<img src="./images/cross.png" alt="cross" style="width:1000px;"/>

Now the bounding boxes are reflected:
<img src="./images/reflected.png" alt="reflected" style="width:1000px;"/>
