Skip to content

Latest commit

 

History

History
121 lines (93 loc) · 6.67 KB

v3.0.0rc.en.md

File metadata and controls

121 lines (93 loc) · 6.67 KB

Interface Change Documentation

1. Model and Module Related

1.1 Model Configuration Files

  • Storage Directory Change: paddlex/configs has been updated to paddlex/configs/modules.
  • Module Name Changes, and related configuration file paths have also been updated:
    • anomaly_detection updated to image_anomaly_detection
    • face_recognition updated to face_feature
    • general_recognition updated to image_feature
    • multilabel_classification updated to image_multilabel_classification
    • pedestrian_attribute updated to pedestrian_attribute_recognition
    • structure_analysis updated to layout_detection
    • table_recognition updated to table_structure_recognition
    • text_detection_seal updated to seal_text_detection
    • vehicle_attribute updated to vehicle_attribute_recognition

1.2 Module Inference

1. create_model()

  • Parameter Change:

    • model_name: Only accepts model name.
    • New Parameters:
      • model_dir: Specifies the local directory for inference model files, defaults to None, which means automatically downloading and using the official model.
      • batch_size: Specifies the batch size during inference, defaults to 1.
      • Supports specifying common model inference hyperparameters, with specific parameters related to the module, as detailed in the module tutorial documentation. For example, image classification module support topk.
      • use_hpip and hpi_params: For supporting high-performance inference, not enabled by default.
  • Function Updates:

    • Supports using PDF files as input samples for CV modules.
    • Prediction results remain of dict type, but the format has changed: from {'key1': val} to {"res": {'key': val}}, using "res" as the key with the original result data as the value.
    • When using the save_to_xxx() method to save prediction results, if save_path is a directory, the name for stored files has changed. For example, saving in JSON format is {input_file_prefix}_res.json; saving in image format is {input_file_prefix}_res_img.{input_file_extension}.

2. Pipeline Related

2.1 Pipeline Configuration Files

  • Configuration File Storage Directory Change: paddlex/pipelines updated to paddlex/configs/pipelines.
  • Pipeline Name Changes:
    • ts_fc updated to ts_forecast
    • ts_ad updated to ts_anomaly_detection
    • ts_cls updated to ts_classification

2.2 Pipeline Inference

1. CLI Inference for Pipelines

  • New Support:
    • Inference hyperparameters, specific parameters related to the pipeline, detailed in the pipeline tutorial documentation. For example, image classification pipeline supports the --topk parameter to specify the topk results to return.
  • Removed:
    • --serial_number, high-performance inference no longer requires the serial number.

2. create_pipeline()

  • Removed:
    • The serial_number parameter in high-performance inference hpi_params, high-performance inference no longer requires the serial number.
  • No Longer Supported:
    • Setting pipeline inference hyperparameters, all related parameters must be set through the pipeline configuration file, such as batch_size, thresholds, etc.
  • Function Updates:
    1. When using the save_to_xxx() method to save prediction results, if save_path is a directory, the name for stored files has updated.
    2. CV model prediction results have a new page_index field, which indicates the page number of the current prediction result only when the input sample is a PDF file.
    3. Model pipeline prediction results have new pipeline inference parameter fields, such as the text_det_params field in the OCR pipeline, with values for the post-processing settings of the text detection model.
  • Configuration File Format Update:
    • After updating the content of the pipeline configuration file, it is divided into three parts: pipeline name, pipeline-related parameter settings, and sub-pipelines and sub-modules composition. For example:

      pipeline_name: pipeline # Pipeline Name
      threshold: 0.5 # Pipeline Inference Related Parameters
      SubPipelines: # Sub-pipelines
        DocPreprocessor:
          pipeline_name: doc_preprocessor
          use_doc_unwarping: True # Settings related to the sub-pipeline DocPreprocessor
      SubModules: # Sub-modules
        TextDetection:
          module_name: text_detection
          model_name: PP-OCRv4_mobile_det
          model_dir: null
          limit_side_len: 960 # Settings related to the sub-module TextDetection
          limit_type: max
          thresh: 0.3
          box_thresh: 0.6
          unclip_ratio: 2.0

3. Pipeline Features Changes

3.1 OCR Pipeline

  • New Features:
    • Document Preprocessing: Supports whole image direction classification and correction, controlled by relevant parameters in the OCR.yaml configuration file.
    • Text Line Direction Classification: Controlled by relevant parameters in the configuration file.
    • Support for modifying model inference hyperparameters, such as post-processing parameters of the text detection model, controlled by relevant parameters in the configuration file.

3.2 Seal Recognition and Formula Recognition Pipeline

  • New Features:
    • Document Preprocessing: Supports whole image direction classification and correction, controlled by relevant parameters in the configuration file.
    • Option to use the layout detection model: Controlled by relevant parameters in the configuration file.

3.3 Table Recognition Pipeline

  • New Features:
    • Document Preprocessing: Supports whole image direction classification and correction, controlled by relevant parameters in the configuration file.
    • Option to use the OCR pipeline for text detection and recognition: Controlled by relevant parameters in the configuration file.

3.4 Layout Analysis Pipeline

  • Updated Features:
    • Supports more inference hyperparameter settings, such as document preprocessing, text recognition, and model post-processing parameter settings, all of which can be configured in the pipeline configuration file.

3.5 PP-ChatOCRv3-doc Pipeline

  • New Features:

    • Supports standard OpenAI API calls, which can be controlled through relevant parameters in the configuration file.
    • Allows switching large language models during Chat API calls by passing the relevant configuration through the API call parameters.
  • Updated Features:

    • Inference Module Initialization: Supports initialization of the inference module upon its first invocation, eliminating the need for full initialization at production line startup.
    • Vector Library: Enables setting block size for long text and removes the control of interval duration between vector library calls.