This utility compares PDF files visually by rendering each page to an image and then comparing the images with OpenCV. It is particularly useful for identifying differences between PDF files that text comparison alone would miss.
- Compares PDF files visually, page by page.
- Supports multi-page PDF files.
- Reports differences between PDF files, specifying the page number and source file.
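The reporting behaviour listed above could be organised along these lines. This is only a sketch: it assumes the first file acts as the reference and every other file is compared against it, which the README does not confirm, and `find_differences` and its `differ` parameter are hypothetical names.

```python
from itertools import zip_longest


def find_differences(pages_by_file, differ=lambda a, b: a != b):
    """Compare every file against the first one, page by page.

    `pages_by_file` maps a file name to its list of rendered pages.
    Returns (page_number, file_name) tuples with 1-based page numbers.
    """
    names = list(pages_by_file)
    reference = pages_by_file[names[0]]
    differences = []
    for name in names[1:]:
        pairs = zip_longest(reference, pages_by_file[name])
        for number, (ref_page, page) in enumerate(pairs, start=1):
            # A page that is missing on either side counts as a difference.
            if ref_page is None or page is None or differ(ref_page, page):
                differences.append((number, name))
    return differences
```

With real page images, pass an image comparison function as `differ`; the default `!=` is only suitable for simple values such as strings.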
- Python 3.x
- PyMuPDF (fitz) library
- OpenCV (cv2) library
- Clone the repository:
  git clone https://github.com/Formartha/compare-pdf.git
- Install the required dependencies:
  pip install pymupdf opencv-python
pip install compare-pdf
compare_pdf --pdf <path_to_pdf1> --pdf <path_to_pdf2> ...
- Replace <path_to_pdf1>, <path_to_pdf2>, etc. with the paths to the PDF files you want to compare.
- At least two PDF files are required for comparison.
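A repeatable --pdf flag with a two-file minimum can be expressed with the standard library's argparse. The sketch below shows one way to get that behaviour; it is not taken from the compare_pdf source, and the `parse_args` helper and the help texts are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse a repeatable --pdf option and require at least two files."""
    parser = argparse.ArgumentParser(
        prog="compare_pdf",
        description="Compare PDF files visually, page by page.",
    )
    parser.add_argument(
        "--pdf",
        action="append",  # each --pdf occurrence adds one path to the list
        default=[],
        metavar="PATH",
        help="path to a PDF file; pass the flag once per file",
    )
    args = parser.parse_args(argv)
    if len(args.pdf) < 2:
        # parser.error prints a usage message and exits with status 2.
        parser.error("at least two --pdf arguments are required")
    return args
```

Calling `parse_args(["--pdf", "file1.pdf", "--pdf", "file2.pdf"])` yields a namespace whose `pdf` attribute holds both paths in order.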
compare_pdf --pdf file1.pdf --pdf file2.pdf
This will compare full/path/to/file1.pdf and full/path/to/file2.pdf visually, reporting any differences found.
This project is licensed under the MIT License - see the LICENSE file for details.