This repository contains the official implementation of "Modeling Stroke Mask for End-to-End Text Erasing" (WACV 2023).
In this paper, we present an end-to-end network (SAEN) that focuses on modeling text stroke masks that provide more accurate locations to compute erased images. The network consists of two stages, i.e., a basic network with stroke generation and a refinement network with stroke awareness. The basic network predicts the text stroke masks and initial erasing results simultaneously. The refinement network receives the masks as supervision to generate natural erased results.
- This work was tested with PyTorch 1.4.0, CUDA 10.1, Python 3.7, and Ubuntu 18.04. Clone this repo and install the dependencies:
pip install -r requirements.txt
The dataset can be accessed at SCUT-EnsText, or the synthetic dataset SCUT-Syn, for training and testing.
The dataset structure is as follows:
SCUT-EnsText
├── train_sets
|   └── all_images
|   └── all_labels
|   └── all_gts
├── test_sets
    └── all_images
    └── all_labels
    └── all_gts
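Before running the preprocessing scripts, it can help to confirm that the folders are in place. The following is a minimal sketch, not part of the repository: the helper name `missing_dirs` is hypothetical, and the folder names are taken from the tree above.

```python
from pathlib import Path

# Folder names copied from the dataset tree in this README.
SPLITS = ("train_sets", "test_sets")
SUBDIRS = ("all_images", "all_labels", "all_gts")


def missing_dirs(root):
    """Return the expected sub-folders that are absent under `root`.

    Hypothetical helper for illustration; it is not shipped with the repo.
    """
    root = Path(root)
    return [
        str(Path(split) / sub)
        for split in SPLITS
        for sub in SUBDIRS
        if not (root / split / sub).is_dir()
    ]


if __name__ == "__main__":
    missing = missing_dirs("SCUT-EnsText")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

An empty list means every expected folder was found.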
To generate the text stroke masks:
python script/generate_stoke.py
- To generate the trainlmdb and testlmdb datasets:
python3 script/create_lmdb_dataset.py --inputPath SCUT-EnsText/train_sets/all_images --gtPath SCUT-EnsText/train_sets/all_labels --maskPath SCUT-EnsText/train_sets/stroke --outputPath SCUT-EnsText/trainlmdb
python3 script/create_lmdb_dataset.py --inputPath SCUT-EnsText/test_sets/all_images --gtPath SCUT-EnsText/test_sets/all_labels --maskPath SCUT-EnsText/test_sets/stroke --outputPath SCUT-EnsText/testlmdb
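The two commands differ only in the split name and the output folder, so they can be generated from one template. This is a small sketch under that assumption; the function names `lmdb_command` and `all_lmdb_commands` are hypothetical, and only the flags shown in the commands above are used.

```python
import shlex


def lmdb_command(root, split, out_name):
    """Build the create_lmdb_dataset.py argument list for one split.

    Hypothetical helper; it mirrors the flags shown in this README.
    """
    base = f"{root}/{split}"
    return [
        "python3", "script/create_lmdb_dataset.py",
        "--inputPath", f"{base}/all_images",
        "--gtPath", f"{base}/all_labels",
        "--maskPath", f"{base}/stroke",
        "--outputPath", f"{root}/{out_name}",
    ]


def all_lmdb_commands(root="SCUT-EnsText"):
    """Return the argument lists for the train and test splits."""
    return [
        lmdb_command(root, "train_sets", "trainlmdb"),
        lmdb_command(root, "test_sets", "testlmdb"),
    ]


if __name__ == "__main__":
    for cmd in all_lmdb_commands():
        # Print only; pass `cmd` to subprocess.run(cmd, check=True) to execute.
        print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the commands first makes it easy to check the paths before anything is written to disk.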
- Download pretrained model from here
- Run demo.py
python demo.py --pretrained "pretrained_model_path" --imgPath "test_img_path" --savedPath "saved_img_path"
For example:
python demo.py --pretrained pretrained.pth --imgPath samples/1.jpg --savedPath result.jpg
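demo.py, as shown above, takes a single image per call. To process a whole folder, one option is to build one command per image. The sketch below assumes that usage; `demo_commands`, the `results` output folder, and the accepted file suffixes are illustrative assumptions, not part of the repository.

```python
from pathlib import Path

# Assumed set of input formats; adjust to whatever demo.py accepts.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def demo_commands(pretrained, image_dir, output_dir):
    """Return one demo.py argument list per image found in `image_dir`.

    Hypothetical helper; it only uses the three flags shown in this README.
    """
    out = Path(output_dir)
    commands = []
    for img in sorted(Path(image_dir).iterdir()):
        if img.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        commands.append([
            "python", "demo.py",
            "--pretrained", str(pretrained),
            "--imgPath", str(img),
            "--savedPath", str(out / img.name),
        ])
    return commands


if __name__ == "__main__":
    image_dir = Path("samples")
    if image_dir.is_dir():
        for cmd in demo_commands("pretrained.pth", image_dir, "results"):
            # Print only; use subprocess.run(cmd, check=True) to execute.
            print(" ".join(cmd))
```

Each result keeps the file name of its input, so outputs are easy to match to their sources.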
Prepare the lmdb datasets and copy them under the ./SCUT-EnsText directory.
To train the model, you can change some parameters in config/config.yaml.
python train.py --config_path ./config/config.yml
To generate results on the test dataset:
python test.py --dataRoot SCUT-EnsText/testlmdb --batchSize 1 --pretrain "pretrained_model_path"
To evaluate the results and calculate metrics of performance:
python evaluatuion.py --target_path "results path" --gt_path "ground_truth path"
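This README does not list which metrics the evaluation script reports. As an illustration of one metric commonly used to compare a restored image with its ground truth, here is a minimal PSNR sketch in plain Python. It is an assumption for illustration, not the script's actual implementation, and it expects flat sequences of pixel values rather than image files.

```python
import math


def psnr(result, ground_truth, max_value=255.0):
    """PSNR in dB between two equally sized flat sequences of pixel values.

    Illustrative only; the repository's evaluation script may differ.
    """
    if len(result) == 0 or len(result) != len(ground_truth):
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(result, ground_truth)) / len(result)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate that the erased result is closer to the ground truth.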
Please consider citing this work in your publications if it helps your research.
@inproceedings{du2023modeling,
title={Modeling Stroke Mask for End-to-End Text Erasing},
author={Du, Xiangcheng and Zhou, Zhao and Zheng, Yingbin and Ma, Tianlong and Wu, Xingjiao and Jin, Cheng},
booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
pages={6151--6159},
year={2023}
}
This code benefits a lot from EraseNet and EdgeConnect. Many thanks to their authors for the excellent work.