improve polygons of bottom lines #544
Comments
I know this is a recurrent issue. It's funny because I've never had this problem with columns in Chinese. Is it an issue with the way Shapely computes polygons? |
Which Shapely version do you have installed? |
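For anyone answering the version question above, here is a minimal sketch that reads the installed Shapely version from package metadata without importing Shapely itself. The helper names `installed_version` and `major_version` are made up for this example; only the standard-library `importlib.metadata` calls are real.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def major_version(version_string):
    """Return the leading numeric component of a dotted version string."""
    return int(version_string.split(".")[0])


# Hypothetical check: report which Shapely major version this environment uses.
shapely_version = installed_version("shapely")
if shapely_version is None:
    print("Shapely is not installed in this environment")
else:
    print(f"Shapely {shapely_version} (major version {major_version(shapely_version)})")
```

Running this in the same environment as kraken or eScriptorium shows whether a 1.x or 2.x Shapely is in use.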
@lauxley what is on the stack on msia pls? |
Last time Shapely>=2.0 gave me overlapping polygons (see #430). However, this affected all polygons, not just the last line. Until it got fixed in eScript we used to export the dataset and recalculate all the polygons using another Shapely version. |
Just adding a little note: given that all or much of the data produced during the early days of eScriptorium had masks like that (top ones and bottom ones), I am wondering whether the new model, when we use a new model, has simply learned that bottom and top lines have higher masks, i.e. is it the data or the polygon computation? :) |
@PonteIneptique only the baseline coordinates and the image feature (and optionally other polygons) are taken into account during the polygon computation: kraken/kraken/lib/segmentation.py Lines 637 to 643 in 674a772
|
I stand corrected then ;) Thanks @colibrisson |
This needs work, as such cases yield poor recognition results compared with other lines on the same page. Whenever I have such cases, I adjust the baseline until I get the correct masks. As these adjusted baselines are retained during further training, I usually see some improvement in the automatic baselines. However, this is slow, and the masks just won't budge in cases where the line is very close to the border box, such as in the image here: |
@rohanchn - is there a resolution to this issue? I also notice the same issue. |
I have a hacky API workaround that (1) calculates the average line distance, (2) creates dummy lines above the top line and below the bottom line of each region at that distance, (3) repolygonizes the top and bottom lines, and (4) finally deletes the dummy lines. |
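The geometric part of this workaround (steps 1 and 2) can be sketched roughly as follows. This is not the code from the shared notebook. It assumes roughly horizontal lines, image coordinates with y increasing downward, and baselines given as lists of `(x, y)` points; all function names are hypothetical. Repolygonizing and deleting the dummy lines (steps 3 and 4) depend on the eScriptorium API and are not shown.

```python
from statistics import mean

Point = tuple[float, float]
Baseline = list[Point]


def mean_y(baseline: Baseline) -> float:
    """Average vertical position of a baseline."""
    return mean(y for _, y in baseline)


def average_line_distance(baselines: list[Baseline]) -> float:
    """Mean vertical gap between consecutive baselines of one region (step 1)."""
    ys = sorted(mean_y(b) for b in baselines)
    return mean(b - a for a, b in zip(ys, ys[1:]))


def shift(baseline: Baseline, dy: float) -> Baseline:
    """Copy a baseline, moved vertically by dy."""
    return [(x, y + dy) for x, y in baseline]


def dummy_baselines(baselines: list[Baseline]) -> tuple[Baseline, Baseline]:
    """Return dummy baselines above the top line and below the bottom line (step 2)."""
    if len(baselines) < 2:
        raise ValueError("need at least two lines to estimate the line distance")
    ordered = sorted(baselines, key=mean_y)
    distance = average_line_distance(ordered)
    above = shift(ordered[0], -distance)   # smaller y is higher on the page
    below = shift(ordered[-1], distance)
    return above, below


# Made-up region with three lines at y = 100, 150 and 210.
region = [
    [(0, 100), (500, 100)],
    [(0, 150), (500, 150)],
    [(0, 210), (500, 210)],
]
above, below = dummy_baselines(region)
```

The dummy lines act as temporary neighbours so that the polygonizer constrains the first and last real lines in the same way as the interior ones.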
ahah - thanks for this @dstoekl ! Seems like a proper solution is needed but would be happy to use your temporary solution if you are willing to share it :-) |
Sure. Just shared a Colab with you. Have a look at the following function among the complex ones: restrict_first_and_last_line_polygon_according_to_average_line_height. |
Thanks so much! @mittagessen - is there any chance we could do something more robust to solve this problem? |
On 24/05/05 03:26AM, Alexis Litvine wrote:
> Thanks so much! @mittagessen - is there any chance we could do something more robust to solve this problem?
Yes, I've got a solution in mind that gets rid of bounding polygons completely which would also solve a host of other issues, especially with Arabic manuscripts, illumination, and other decorations. I'll start working on it next month if everything goes according to plan. |
@mittagessen - have you managed to do anything about this? I have found that the polygonisation (at least the one available in eScriptorium) is really problematic, adding many hours to any annotation workflow. |
On one of my test images, the last release, 5.2.9, improves [email protected] from .77 to .8. |
typical case:


