run_project.py #158
Comments
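This is the PDF-to-Markdown script. It combines layout detection, formula detection, formula recognition, and other tasks to extract a PDF's content and convert it to Markdown. For details, see the tutorial: https://pdf-extract-kit.readthedocs.io/zh-cn/latest/project/pdf_extract.html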
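The script described above chains several independent stages, so one way to picture running only some of them is the rough sketch below. Everything in it (`Region`, `STAGES`, `run_pipeline`, `to_markdown`, and the stub stages) is hypothetical and for illustration only. None of it is PDF-Extract-Kit's actual API, and the stubs return canned data instead of calling real models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Region:
    """Hypothetical page region; not PDF-Extract-Kit's real data structure."""
    kind: str      # "title", "text", or "formula"
    content: str


# A stage receives the regions found so far and returns an updated list.
Stage = Callable[[List[Region]], List[Region]]


def stub_layout_detection(regions: List[Region]) -> List[Region]:
    # Stand-in for a layout model: pretends it found a title and a paragraph.
    return regions + [Region("title", "Intro"), Region("text", "Body")]


def stub_formula_detection(regions: List[Region]) -> List[Region]:
    # Stand-in for a formula detector: adds a formula region with no content yet.
    return regions + [Region("formula", "")]


def stub_formula_recognition(regions: List[Region]) -> List[Region]:
    # Stand-in for a formula recognizer: fills in empty formula regions.
    return [
        Region(r.kind, "E = mc^2") if r.kind == "formula" and not r.content else r
        for r in regions
    ]


# Dicts preserve insertion order, so stages run in the order listed here.
STAGES: Dict[str, Stage] = {
    "layout_detection": stub_layout_detection,
    "formula_detection": stub_formula_detection,
    "formula_recognition": stub_formula_recognition,
}


def run_pipeline(enabled: Set[str]) -> List[Region]:
    """Run only the stages whose names appear in `enabled`."""
    regions: List[Region] = []
    for name, stage in STAGES.items():
        if name in enabled:
            regions = stage(regions)
    return regions


def to_markdown(regions: List[Region]) -> str:
    """Render regions as Markdown blocks separated by blank lines."""
    parts = []
    for r in regions:
        if r.kind == "title":
            parts.append(f"# {r.content}")
        elif r.kind == "formula":
            parts.append(f"$$\n{r.content}\n$$")
        else:
            parts.append(r.content)
    return "\n\n".join(parts)
```

Calling `run_pipeline({"layout_detection"})` skips both formula stages, while `run_pipeline(set(STAGES))` runs all three stubs. Whether the real script can be split this cleanly depends on how its stages share intermediate results, which this sketch does not model.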
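Is there a script that doesn't require assembling all four of these together? For now I don't need formula detection or formula recognition.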
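For more accurate stitching of the results, see MinerU. Its post-processing logic is somewhat more complex than Kit's, and the results are better.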
Baidu has recently added a layout region detection model. Is there any plan to integrate it here?
Hello, I'd like to ask whether there is a test for the overall pipeline, and also what the run_project.py script is for.