Not creating ECS models #237

Open · alu-85 opened this issue Feb 6, 2025 · 9 comments

alu-85 commented Feb 6, 2025

Hello

I have deployed LISA and connected it to the IdP. Models are downloaded to the S3 bucket specified in the LISA config, and the infrastructure spins up without error. However, there are no containerised models present in ECS: the cluster instance is created but there are 0 containers. Similarly, the associated model repo is empty. I can log into the Model Management page via the UI and have attempted to create a model there; however, the UI fails to load the instance types in the drop-down, so I can get no further.

So the issue is that the models detailed in the configuration below do not seem to be deployed:

    ecsModels:
      - modelName: mistralai/Mistral-7B-Instruct-v0.2
        inferenceContainer: tgi
        baseImage: ghcr.io/huggingface/text-generation-inference:2.0.1
      - modelName: bigcode/starcoder2-15b
        inferenceContainer: tgi
        baseImage: ghcr.io/huggingface/text-generation-inference:2.0.1
      - modelName: meta-llama/Llama-3.1-70B-Instruct
        inferenceContainer: tgi
        baseImage: ghcr.io/huggingface/text-generation-inference:2.0.1

I can run make checkModels and get the following output:

    Found 4 safetensors for model: mistralai/Mistral-7B-Instruct-v0.2 in bucket: hf-models-sourcegraph.
    Found 15 safetensors for model: bigcode/starcoder2-15b in bucket: hf-models-sourcegraph.
    Found 18 safetensors for model: meta-llama/Llama-3.1-70B-Instruct in bucket: hf-models-sourcegraph.

The CDK output gives no errors. Can someone point me to any areas that might help track down this issue?

Additional info for the custom config:

  • It's an internal deployment.
  • I specify the SSL certificate.
  • I have tried deploying both with and without private subnets specified in the config and get the same result.
  • I also specify the VPC, region, account number, S3 bucket, and LiteLLM key.

dustins commented Feb 6, 2025

Hey @alu-85!

With the release of LISA 3.0, model management has become more dynamic, meaning models are no longer deployed automatically as part of the infrastructure deployment. From what you've done so far, it looks like you've successfully staged them into S3 for use by LISA.

To proceed with model deployment, try using the Models API. For API authentication, refer to our guide on Programmatic API Tokens.
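A rough sketch of what a model-creation call looks like; the URL placeholders and request-body fields here are assumptions that simply mirror your ecsModels entries, so verify the exact schema against the Models API docs:

    # Hedged sketch: create a model via the Models API.
    # ${APIGW_URL}, ${STAGE}, and ${API_TOKEN} are placeholders for your deployment.
    curl -s -X POST "https://${APIGW_URL}/${STAGE}/models" \
      -H "Authorization: Bearer ${API_TOKEN}" \
      -H "Content-Type: application/json" \
      -d '{
            "modelName": "mistralai/Mistral-7B-Instruct-v0.2",
            "inferenceContainer": "tgi",
            "baseImage": "ghcr.io/huggingface/text-generation-inference:2.0.1"
          }'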

If you're troubleshooting a Chat UI issue, check the following:

  • In your browser's developer tools, inspect the network request to https://${apigw-url}/${stage}/models/metadata/instances and review the response and error code.
  • In AWS Lambda, locate the function ending in models-handler and check its CloudWatch logs for any error messages or clues about what went wrong. (Both checks can also be run from the CLI, as sketched below.)
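
A sketch of both checks from the CLI, assuming AWS CLI v2; the URL and function-name placeholders are assumptions about your deployment:

    # Call the instances endpoint directly (the same request the UI makes).
    curl -s -H "Authorization: Bearer ${API_TOKEN}" \
      "https://${APIGW_URL}/${STAGE}/models/metadata/instances"

    # Tail the models-handler Lambda's CloudWatch logs for errors; substitute
    # the full function name from your deployment.
    aws logs tail "/aws/lambda/${MODELS_HANDLER_FUNCTION}" --since 1h --follow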


alu-85 commented Feb 6, 2025

Hi @dustins, thanks for the quick reply.

I was about to look into the API, so thanks for confirming that's the way to deploy models. Does that mean the ecsModels section of the config is now surplus and not required?

I've looked at the API Tokens documentation and cannot see the $DEPLOYMENT_NAME-LISAApiTokenTable it mentions. I only have these tables relating to the APIs:

    $DEPLOYMENT_NAME-lisa-chat-prod-ConfigurationApiConfigurationTable4B2B7EE1-1BL0VS5FROVTC
    $DEPLOYMENT_NAME-lisa-chat-prod-SessionApiSessionsTableDA695141-1BJZACA6X6UF2
    $DEPLOYMENT_NAME-lisa-models-prod-ModelsApiModelTable72B9582E-16NL7JES3SNM1

Are the docs out of sync, or am I missing a table?


alu-85 commented Feb 6, 2025

Looks like $DEPLOYMENT_NAME-LISAApiTokenTable is only created for internet-facing deployments. Ours is currently internal, so the obvious question is: how do you deploy models for internal-only deployments if you need to use the Models API and this table to generate and store a token?

    let tokenTable;
    if (config.restApiConfig.internetFacing) {
        // Create DynamoDB Table for enabling API token usage
        tokenTable = new Table(this, 'TokenTable', {
            tableName: `${config.deploymentName}-LISAApiTokenTable`,
            partitionKey: {
                name: 'token',
                type: AttributeType.STRING,
            },
            billingMode: BillingMode.PAY_PER_REQUEST,
            encryption: TableEncryption.AWS_MANAGED,
            removalPolicy: config.removalPolicy,
        });
    }


bedanley commented Feb 6, 2025

Would you also confirm the version of LISA you have deployed?


dustins commented Feb 6, 2025

I was just about to link to the same spot! I’ve added it here for anyone else who might come across this issue in the future.

While we've supported some partners running within a private VPC, this isn’t a typical deployment scenario for us. You’re correct that setting config.restApiConfig.internetFacing to false will prevent the table from being created. We’ll look into whether this is the best approach for private endpoints.

Given your setup, using the Chat UI for model management might be the most straightforward solution. However, I understand that this approach has caused separate issues for you. As an alternative, you can use the master model management token we generate during deployment. Although this is currently undocumented, you can find it in AWS Secrets Manager under a name like ${config.deploymentName}-lisa-management-key. By using this as your bearer token (Bearer ${secretValue}), you should be able to access the models API and create models.
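
A minimal sketch of that flow, assuming AWS CLI access; the API URL placeholders are assumptions about your deployment:

    # Fetch the management key and use it as a bearer token against the Models API.
    SECRET_VALUE=$(aws secretsmanager get-secret-value \
      --secret-id "${DEPLOYMENT_NAME}-lisa-management-key" \
      --query SecretString --output text)

    curl -s -H "Authorization: Bearer ${SECRET_VALUE}" \
      "https://${APIGW_URL}/${STAGE}/models"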

Thank you for reaching out and letting us know about this need. We're planning to update our documentation to clarify this process. If it would be helpful, we'd be happy to hop on a quick call today to discuss this further. We'd also love to hear more about your use case, as our roadmap is customer-driven. Our product manager enjoys connecting with everyone deploying LISA.


alu-85 commented Feb 7, 2025

> Would you also confirm the version of LISA you have deployed?

Sure, it's the latest, 3.5.1.

I'll do some more digging into the Chat UI issues. It looks like it may be a permissions problem, as the calls to retrieve models, instance types, and history are all timing out or failing. It may be an IdP/token thing?

> If it would be helpful, we'd be happy to hop on a quick call today to discuss this further. We'd also love to hear more about your use case, as our roadmap is customer-driven.

That would be good to arrange, I think. There are some questions about this deployment and other configs we have that would be good to discuss on a call. We're in the GMT timezone, though.


alu-85 commented Feb 7, 2025

> Although this is currently undocumented, you can find it in AWS Secrets Manager under a name like ${config.deploymentName}-lisa-management-key. By using this as your bearer token (Bearer ${secretValue}), you should be able to access the models API and create models.

I'm getting Forbidden errors when attempting this with the master token, both when listing models and when attempting to create a model, e.g.

    curl -s -H "Authorization: Bearer <secret value>" -X GET https://<generated-api>.execute-api.eu-west-2.amazonaws.com/models

results in {"message":"Forbidden"}, which is the same error I see in the Chat UI via the browser developer tools.


dustins commented Feb 11, 2025

I tested this and observed behavior very similar to when Lambda functions lack internet access. While you mentioned that your private subnets have internet connectivity, I recommend double-checking to confirm. Additionally, ensure that the route tables for your private subnets are correctly configured to route external traffic through the NAT Gateway or Internet Gateway.
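
One quick way to verify the routing, as a sketch (the subnet ID is a placeholder): each private subnet's route table should include a 0.0.0.0/0 route targeting a NAT gateway.

    # List the routes associated with a private subnet; look for a
    # 0.0.0.0/0 route whose target is a NAT gateway (nat-...).
    aws ec2 describe-route-tables \
      --filters "Name=association.subnet-id,Values=subnet-0123456789abcdef0" \
      --query "RouteTables[].Routes[]"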

If this resolves the issue, the next steps will be to add rules to control access to the internal load balancer. If you'd like to set up another call, I'd be happy to walk you through that process.


dustins commented Feb 24, 2025

Since this issue has been resolved offline, I will be closing the ticket in 48 hours unless you require any further information. Please let me know if you need anything else.
