feat: support siliconflow in offline #547

IcyKallen · 2025-02-28T05:24:51Z

Fixes #

🤖 AI-Generated PR Description (Powered by Amazon Bedrock)

Description

This pull request adds SiliconFlow support to the offline ETL (Extract, Transform, Load) pipeline. The figure-understanding step in figure_llm.py can now call models from multiple providers (Bedrock, OpenAI, SiliconFlow) with configurable model IDs, API URLs, and API keys, and the API keys are read from AWS Secrets Manager.

The main updates are as follows:

Updated model-construct.ts to attach the new `secretsManagerStatement` policy statement to the `executionRole` role.
Updated iam-helper.ts to add a `secretsManagerStatement` policy statement that allows `GetSecretValue` on AWS Secrets Manager resources.
Modified glue-job-script.py to add configuration options for the model provider, model ID, API secret name, and API URL.
Refactored the figureUnderstand class in figure_llm.py to support multiple model providers and moved the prompts into constants.
Restructured main.py, the entry point for the ETL process, to fetch API keys from Secrets Manager, validate the LLM request parameters, and pass StructureSystem and figureUnderstand instances into structure_predict.

Together, these changes let the offline ETL process use a provider other than Bedrock, such as SiliconFlow, for figure understanding, with the API key kept in AWS Secrets Manager.

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

File Stats Summary

Number of files changed in this PR: 5. The changes are summarized below:

Files	Changes	Change Summary
source/lambda/job/glue-job-script.py	5 added, 0 removed	The code changes add configuration options for model provider, model ID, API secret name, and API URL, likely for integrating with a language model service.
source/infrastructure/lib/shared/iam-helper.ts	19 added, 12 removed	The code changes add a new policy statement called `secretsManagerStatement` to the `IAMHelper` class, which allows the `GetSecretValue` action on all resources in AWS Secrets Manager.
source/model/etl/code/figure_llm.py	191 added, 75 removed	This code refactors the figureUnderstand class to support multiple model providers (Bedrock, OpenAI, SiliconFlow, etc.) with configurable model IDs and API keys/URLs, and separates prompts into constants for better maintainability.
source/infrastructure/lib/model/model-construct.ts	1 added, 1 removed	The code change adds a new policy statement from `modelIamHelper.secretsManagerStatement` to the `executionRole` role, likely granting access to AWS Secrets Manager for the model execution.
source/model/etl/code/main.py	258 added, 100 removed	The code changes include the following updates:

Added support for fetching API keys from AWS Secrets Manager.
Added validation for LLM request parameters (model provider, API key, API URL).
Refactored the structure_predict function to accept the StructureSystem and figureUnderstand instances.
Added support for different LLM providers (Bedrock, OpenAI, SiliconFlow) in the figureUnderstand class.
Restructured the process_pdf_pipeline function to handle API key retrieval, LLM request validation, and instantiation of StructureSystem and figureUnderstand classes.
General code cleanup and formatting improvements.
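
The API key retrieval and request validation described for main.py could look roughly like the sketch below. It is not the code from this PR. The function names `get_api_key` and `validate_llm_request`, the `api_key` field in the secret, and the rule that only non-Bedrock providers need a key and URL are all assumptions made for illustration. The only real AWS call used is the boto3 Secrets Manager client's `get_secret_value`.

```python
import json
from typing import Any, Optional

# Provider names as listed in the file summary; the exact strings used by
# the PR are not shown, so these are assumptions.
SUPPORTED_PROVIDERS = {"Bedrock", "OpenAI", "SiliconFlow"}


def get_api_key(secret_name: str, client: Optional[Any] = None) -> str:
    """Read an API key from AWS Secrets Manager.

    Assumes the secret is either a plain string or a JSON object with an
    "api_key" field (hypothetical layout, not confirmed by the PR).
    """
    if client is None:
        # Imported lazily so the sketch can be exercised without AWS access.
        import boto3

        client = boto3.client("secretsmanager")
    secret = client.get_secret_value(SecretId=secret_name)["SecretString"]
    try:
        parsed = json.loads(secret)
    except json.JSONDecodeError:
        # Not JSON, so treat the whole string as the key.
        return secret
    if isinstance(parsed, dict) and "api_key" in parsed:
        return parsed["api_key"]
    return secret


def validate_llm_request(
    provider: str,
    api_key: Optional[str] = None,
    api_url: Optional[str] = None,
) -> None:
    """Raise ValueError if the provider settings are incomplete.

    Assumption: Bedrock authenticates through IAM, so only the other
    providers need an API key and an API URL.
    """
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"Unsupported model provider: {provider!r}")
    if provider != "Bedrock":
        if not api_key:
            raise ValueError(f"{provider} requires an API key")
        if not api_url:
            raise ValueError(f"{provider} requires an API URL")
```

Passing the client in as a parameter keeps the lookup easy to test with a stub. In a Glue job, the default `boto3.client("secretsmanager")` would rely on the `GetSecretValue` permission that this PR adds to the execution role.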

IcyKallen added 2 commits February 28, 2025 03:17

chore: formatting

b0590b7

feat: support siliconflow

4f7a531

IcyKallen merged commit 6524603 into gcr-custom Feb 28, 2025
6 checks passed
