OCR TFLITE #38

Closed
tulasiram58827 opened this issue Nov 24, 2020 · 15 comments

@tulasiram58827
Contributor

Previously we successfully created TFLite models for the CRAFT text detector. But text detectors generally aren't of much use unless they are combined with OCR, and I think there aren't many open-source TFLite models available for OCR. So I did some small research and initially converted a captcha OCR model to TFLite, which gives almost the same results as the original model.

Please find this repo for the source code.

FYI: I also observed that training with tf-nightly improved accuracy compared with tensorflow-2.3, keeping all other parameters constant.
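
For reference, a minimal sketch of the kind of conversion involved, assuming a trained Keras prediction model from the captcha OCR example (the paths below are hypothetical, not the repo's exact code):

```python
import tensorflow as tf

# Load the trained captcha OCR prediction model (hypothetical path).
model = tf.keras.models.load_model("captcha_ocr_model")

# Convert the Keras model directly to TFLite.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

with open("captcha_ocr.tflite", "wb") as f:
    f.write(tflite_model)
```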

@sayakpaul

@sayakpaul
Contributor

On-device OCR is pretty useful in my opinion.

Just for more clarification: @tulasiram58827 took the model shown in this official Keras example and was able to successfully convert it to TFLite. He also plans to try out EasyOCR to see if their pre-trained models can be converted to TFLite. This would be particularly useful since EasyOCR has multilingual support and the performance of its pre-trained models is vetted.

@tulasiram58827 could you send a PR listing this project?

Cc: @khanhlvg @margaretmz

@tulasiram58827
Contributor Author

Created PR

@tulasiram58827
Contributor Author

Hi everyone. While converting EasyOCR to ONNX I ran into unsupported-operator issues, so I decided to convert Keras OCR to TFLite instead, and the conversion succeeded. I also ran inference and benchmarked the dynamic-range and float16 quantized models. I am now creating the small dataset required for integer quantization.
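
For context, a minimal sketch (not the notebook's exact code) of the two post-training quantization modes mentioned above, assuming `model` is the converted Keras OCR recognizer:

```python
import tensorflow as tf

# Dynamic-range quantization: weights are stored as int8, activations stay float.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
dynamic_range_tflite = converter.convert()

# float16 quantization: weights are stored as float16.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
float16_tflite = converter.convert()
```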

Please follow this notebook for the conversion process, inference, and benchmarks. @sayakpaul

CC: @khanhlvg @margaretmz

@sayakpaul
Contributor

This is really wonderful.

Keras OCR is indeed a fantastic project that allows us to run off-the-shelf OCR inference and even fine-tune OCR models. @tulasiram58827 I would suggest publishing these models on TF Hub as well.

Also, here are a couple of comments on the notebook -

  • Provide citations to the code where needed.
  • It looks like you have uploaded images to the Colab runtime for inference purposes. I would suggest hosting those images first (you can use imgbb.com), retrieving them on Colab using wget or something similar, and then doing the inference part. This makes the notebook fully executable on Colab without any problems.
  • With respect to "Note: Support for a CTC decoder is not available in TFLite yet. So while converting we removed the CTCDecoder from the model. We need to run the decoder on the output of the model." I would recommend adding a few more sentences pointing to the exact code that was used to achieve this (see the sketch after this list). You can also add comments in the code snippets.
  • Include the inference results.
  • In the Benchmark sections, just include the information that matters: memory usage and inference latency, for example.
  • Maybe run an ls for the TFLite model sizes?
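
To illustrate the CTC point above, a minimal sketch (not the notebook's exact code) of running the decoder outside the model, assuming `preds` is the raw softmax output of shape (batch, timesteps, vocab_size):

```python
import numpy as np
import tensorflow as tf

def ctc_greedy_decode(preds, max_length):
    # Every sequence in the batch uses the full number of timesteps.
    input_len = np.ones(preds.shape[0]) * preds.shape[1]
    # Greedy CTC decoding runs on the host after TFLite inference,
    # because TFLite has no built-in CTC decoder op.
    decoded, _ = tf.keras.backend.ctc_decode(
        preds, input_length=input_len, greedy=True
    )
    return decoded[0][:, :max_length]
```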

@tulasiram58827
Contributor Author

Created a PR for publishing the TFLite models. Updated the notebook with all the points mentioned.

@sayakpaul
Contributor

Looks good to me.

So, in build_model the CTC decoder part is being discarded, right? You can mention this in the notebook. You can also clear out the unnecessary outputs. The rest looks pretty good.

@tulasiram58827
Contributor Author

Done.

@tulasiram58827
Contributor Author

Problems with Integer Quantization:

1. Integer Quantization:

I am able to convert successfully using integer quantization, but while running inference I am facing this issue:

RuntimeError: tensorflow/lite/kernels/kernel_util.cc:309 scale_diff / output_scale <= 0.02 was not true.Node number 22 (FULLY_CONNECTED) failed to prepare.
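
For reference, a hypothetical sketch of the integer-quantization setup that leads to the error above (`representative_images` is a placeholder for the small calibration dataset; this is not the notebook's exact code):

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_dataset():
    for image in representative_images:
        # Each yielded element must match the model's input signature.
        yield [image[None, ...].astype("float32")]

converter.representative_dataset = representative_dataset

# Conversion succeeds, but the interpreter then fails to prepare the
# FULLY_CONNECTED node at inference time, as shown above.
tflite_model = converter.convert()
```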

2. Fully Integer Quantization:

This is the error log while converting using the fully integer quantization technique:

RuntimeError: Quantization not yet supported for op: 'FLOOR'.
Quantization not yet supported for op: 'CAST'.
Quantization not yet supported for op: 'CAST'.
Quantization not yet supported for op: 'CAST'.
Quantization not yet supported for op: 'FLOOR'.
Quantization not yet supported for op: 'CAST'.
Quantization not yet supported for op: 'CAST'.
Quantization not yet supported for op: 'CAST'.
Quantization not yet supported for op: 'ADD_N'.
Quantization not yet supported for op: 'REVERSE_V2'.
Quantization not yet supported for op: 'REVERSE_V2'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'DIV'.
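
The fully integer attempt additionally restricts the converter to int8 ops. A hypothetical sketch of those settings, on top of the converter configured in the previous sketch:

```python
# Require every op to have an int8 implementation; the ops in the error
# log (FLOOR, CAST, ADD_N, REVERSE_V2, EXP, DIV) have none yet.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()  # raises the RuntimeError above
```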

You can reproduce the same with this notebook.

@khanhlvg

@tulasiram58827
Contributor Author

tulasiram58827 commented Dec 2, 2020

Hi @khanhlvg

I have been working on converting EasyOCR to TFLite. I am getting this error while converting to TFLite:

ConverterError: input resource[0] expected type resource != float, the type of assignvariableop_resource_0[0]
In {{node AssignVariableOp}}

I successfully converted the PyTorch model to ONNX and ran inference with sample data, and the results match correctly, so I believe there are no issues in the PyTorch --> ONNX conversion. There are also no issues converting to a TensorFlow SavedModel. I get the error only while converting the SavedModel to TFLite.

FYI: The model consists of two layers of bidirectional LSTM cells.

I have attached all the mentioned details in this notebook. To reproduce the above error you can use the same notebook.
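
For context, a hypothetical sketch of the conversion pipeline described above (file names and the dummy input are assumptions; onnx-tf is assumed for the ONNX --> TensorFlow step):

```python
import onnx
import torch
import tensorflow as tf
from onnx_tf.backend import prepare

# 1. PyTorch -> ONNX (recognizer_model and dummy_input are placeholders).
torch.onnx.export(recognizer_model, dummy_input, "recognizer.onnx",
                  opset_version=12)

# 2. ONNX -> TensorFlow SavedModel.
onnx_model = onnx.load("recognizer.onnx")
prepare(onnx_model).export_graph("saved_model")

# 3. SavedModel -> TFLite: this step raises the AssignVariableOp
#    resource-type ConverterError quoted above.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model")
tflite_model = converter.convert()
```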

@tulasiram58827
Contributor Author

tulasiram58827 commented Dec 4, 2020

Hi @khanhlvg

To study the above issue further and find the root cause, I created a simple model with a single LSTM layer in PyTorch and tried converting it to TFLite. I hit the same error mentioned in the previous comment. Sharing the notebook with you.
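
A minimal repro sketch along those lines, assuming the same PyTorch --> ONNX --> SavedModel --> TFLite pipeline as in the previous comment (layer sizes are arbitrary):

```python
import torch
import torch.nn as nn

class TinyLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)

    def forward(self, x):
        out, _ = self.lstm(x)
        return out

model = TinyLSTM().eval()
dummy = torch.randn(1, 10, 32)  # (batch, timesteps, features)
torch.onnx.export(model, dummy, "tiny_lstm.onnx", opset_version=12)
# Converting tiny_lstm.onnx -> SavedModel -> TFLite then fails with the
# same AssignVariableOp resource-type error.
```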

@khanhlvg

khanhlvg commented Dec 4, 2020

The issue comes from the fact that the SavedModel contains a mutable variable, which isn't supported by the TFLite converter. I think it may come from the PyTorch -> ONNX -> TF conversion pipeline rather than from the LSTM itself being unsupported. I'm waiting for a TFLite engineer to investigate further. I'll keep you posted.

@tulasiram58827
Contributor Author

Okay. Thanks, @khanhlvg for the update.

@sayakpaul
Contributor

What about this one @khanhlvg?

#38 (comment)

Could you shed some light on it?

@khanhlvg

khanhlvg commented Dec 4, 2020

#38 (comment)
Could you shed some light on it?

Unfortunately, those are limitations of the current TFLite integer quantization engine. There isn't much we can do about them.
