Implement NLP-Based Data Extraction System for ORCA Log Files #99

SiriChandanaGarimella · 2024-11-11T17:19:31Z

Is your feature request related to a problem? Please describe.
Following the analysis in Issue #98, we need to implement an NLP-based system for extracting data from ORCA log files. The current rule-based system should be replaced with a more robust solution that maintains accuracy, scales efficiently, and reduces maintenance effort.

Describe the solution you'd like
Implement a Python-based extraction system using spaCy and scikit-learn to extract search terms and other sections from ORCA files. The system should handle multiple sections and preserve the structure of the extracted data, with proper error handling. Use a hybrid approach if required.
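
To make the request more concrete, here is a rough sketch of what a hybrid matcher could look like. It is not the proposed implementation: it uses only the standard library, with `difflib` standing in for the similarity scoring that spaCy or scikit-learn would provide. The section headers, the sample log text, and the names `KNOWN_SECTIONS`, `match_section`, `extract_sections`, and `ExtractionResult` are all made up for illustration.

```python
import difflib
from dataclasses import dataclass, field

# Hypothetical section headers; real ORCA output may use different titles.
KNOWN_SECTIONS = (
    "CARTESIAN COORDINATES (ANGSTROEM)",
    "ORBITAL ENERGIES",
    "TOTAL SCF ENERGY",
)


@dataclass
class ExtractionResult:
    sections: dict = field(default_factory=dict)  # header -> list of data lines
    errors: list = field(default_factory=list)    # human-readable problems


def match_section(search_term, known=KNOWN_SECTIONS, threshold=0.6):
    """Map a user search term to a known header, or return None."""
    term = search_term.strip().upper()
    if term in known:  # rule-based path: exact header match
        return term
    # Fuzzy path: difflib is a stand-in for an NLP similarity model.
    best = difflib.get_close_matches(term, known, n=1, cutoff=threshold)
    return best[0] if best else None


def extract_sections(log_text, search_terms):
    """Collect the data lines under each matched header, recording errors."""
    result = ExtractionResult()
    lines = log_text.splitlines()
    for term in search_terms:
        header = match_section(term)
        if header is None:
            result.errors.append(f"no section matches {term!r}")
            continue
        start = next((i for i, ln in enumerate(lines) if ln.strip() == header), None)
        if start is None:
            result.errors.append(f"section {header!r} not found in log")
            continue
        body = []
        for ln in lines[start + 1:]:
            stripped = ln.strip()
            is_separator = set(stripped) == {"-"}
            if not stripped or is_separator:
                if body:      # blank or dashed line after data ends the section
                    break
                continue      # skip decoration between header and data
            body.append(stripped)
        result.sections[header] = body
    return result


# Made-up sample text shaped loosely like a log file; the values are not real results.
SAMPLE = """\
----------------
ORBITAL ENERGIES
----------------

  NO   OCC          E(Eh)            E(eV)
   0   2.0000     -19.123456      -520.3765
   1   2.0000      -1.012345       -27.5473

----------------
TOTAL SCF ENERGY
----------------

Total Energy       :      -76.02677417 Eh
"""

result = extract_sections(SAMPLE, ["orbital energy", "total scf energy", "dipole moment"])
for header, body in result.sections.items():
    print(header, "->", len(body), "line(s)")
print("errors:", result.errors)
```

Returning errors alongside the extracted sections, rather than raising on the first miss, is one way to keep a multi-section run going when a single search term fails to match. Swapping `difflib` for a trained classifier or vector similarity would only change `match_section`.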

SiriChandanaGarimella self-assigned this Nov 11, 2024

SiriChandanaGarimella mentioned this issue Dec 17, 2024

Added NLP-Based Section Matching and Data Extraction Logic Proof of Concept #125

Closed
