-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GCP] Update artifact registry/db code #685
Open
AaDalal
wants to merge
15
commits into
main
Choose a base branch
from
aagamdalal/sgp-3576-model-engine-update-artifact-registry-model-engine-code
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
15 commits
Select commit
Hold shift + click to select a range
8db5a37
Add google-cloud-artifact-registry dep
AaDalal 8d8d8fa
Add gcp_artifact_registry_docker_repository
AaDalal 7894737
[untested] completed gcp_artifact_registry_docker_repository
AaDalal f19da2b
Add TODO comments in places I don't understand
AaDalal 0655ccd
[deps] Bump protobuf and add google-secret-manager
AaDalal c2993f8
Creating a model bundle hits artifact registry!
AaDalal 66d4f3a
Remove todo comments
AaDalal 5d4c90c
Remove gcp_project_id infra config key in favor of reusing ml_account_id
AaDalal 04f5a19
Use redis for task queue gateway and cache
AaDalal 06df969
Add redis on GCP support
AaDalal 8c6265f
Document values_sample.yaml better
AaDalal 3bd84cf
Add documentation on how to build llm-engine docker images to README
AaDalal 777a604
Clean up todos
AaDalal d2019e5
Cleanup more todos
AaDalal a7074ca
Return image tag instead of image name in
AaDalal File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -45,6 +45,7 @@ def excluded_namespaces(): | |
ELASTICACHE_REDIS_BROKER = "redis-elasticache-message-broker-master" | ||
SQS_BROKER = "sqs-message-broker-master" | ||
SERVICEBUS_BROKER = "servicebus-message-broker-master" | ||
GCP_REDIS_BROKER = "redis-gcp-memorystore-message-broker-master" | ||
|
||
UPDATE_DEPLOYMENT_MAX_RETRIES = 10 | ||
|
||
|
@@ -588,6 +589,7 @@ async def main(): | |
ELASTICACHE_REDIS_BROKER: RedisBroker(use_elasticache=True), | ||
SQS_BROKER: SQSBroker(), | ||
SERVICEBUS_BROKER: ASBBroker(), | ||
GCP_REDIS_BROKER: RedisBroker(use_elasticache=False), | ||
} | ||
|
||
broker = BROKER_NAME_TO_CLASS[autoscaler_broker] | ||
|
@@ -598,10 +600,18 @@ async def main(): | |
) | ||
|
||
if broker_type == "redis": | ||
# TODO gcp: change this to use cloud storage | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should be covered in the cloud storage PR @anishxyz was working on. I think we can merge this first, then merge that. |
||
# NOTE: the infra config is not available in the autoscaler (for some reason), so we have | ||
# to use the autoscaler_broker to determine the infra. | ||
backend_protocol = "redis" if "gcp" in autoscaler_broker else "s3" | ||
inspect = { | ||
db_index: inspect_app( | ||
app=celery_app( | ||
None, broker_type=broker_type, task_visibility=db_index, aws_role=aws_profile | ||
None, | ||
broker_type=broker_type, | ||
task_visibility=db_index, | ||
aws_role=aws_profile, | ||
backend_protocol=backend_protocol, | ||
) | ||
) | ||
for db_index in get_all_db_indexes() | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
70 changes: 70 additions & 0 deletions
70
...-engine/model_engine_server/infra/repositories/gcp_artifact_registry_docker_repository.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
from typing import Optional | ||
|
||
from google.api_core.client_options import ClientOptions | ||
from google.api_core.exceptions import NotFound | ||
from google.cloud import artifactregistry_v1 as artifactregistry | ||
from model_engine_server.common.dtos.docker_repository import BuildImageRequest, BuildImageResponse | ||
from model_engine_server.core.config import infra_config | ||
from model_engine_server.core.loggers import logger_name, make_logger | ||
from model_engine_server.domain.exceptions import DockerRepositoryNotFoundException | ||
from model_engine_server.domain.repositories import DockerRepository | ||
|
||
logger = make_logger(logger_name()) | ||
|
||
|
||
class GCPArtifactRegistryDockerRepository(DockerRepository): | ||
def _get_client(self): | ||
client = artifactregistry.ArtifactRegistryClient( | ||
client_options=ClientOptions() | ||
# NOTE: uses default auth credentials for GCP. Read `google.auth.default` function for more details | ||
) | ||
return client | ||
|
||
def _get_repository_prefix(self) -> str: | ||
# GCP is verbose and so has a long prefix for the repository | ||
return f"projects/{infra_config().ml_account_id}/locations/{infra_config().default_region}/repositories" | ||
|
||
def image_exists( | ||
self, image_tag: str, repository_name: str, aws_profile: Optional[str] = None | ||
) -> bool: | ||
client = self._get_client() | ||
|
||
try: | ||
client.get_docker_image( | ||
artifactregistry.GetDockerImageRequest( | ||
# This is the google cloud naming convention: https://cloud.google.com/artifact-registry/docs/docker/names | ||
name=f"{self._get_repository_prefix()}/{repository_name}/dockerImages/{image_tag}" | ||
) | ||
) | ||
except NotFound: | ||
return False | ||
return True | ||
|
||
def get_image_url(self, image_tag: str, repository_name: str) -> str: | ||
return f"{infra_config().docker_repo_prefix}/{repository_name}:{image_tag}" | ||
|
||
def build_image(self, image_params: BuildImageRequest) -> BuildImageResponse: | ||
raise NotImplementedError("GCP image build not supported yet") | ||
|
||
def get_latest_image_tag(self, repository_name: str) -> str: | ||
client = self._get_client() | ||
parent = f"{self._get_repository_prefix()}/{repository_name}" | ||
try: | ||
images_pager = client.list_docker_images( | ||
artifactregistry.ListDockerImagesRequest( | ||
parent=parent, | ||
order_by="update_time_desc", # NOTE: we expect that the artifact registry is immutable, so there should not be any updates after upload | ||
page_size=1, | ||
) | ||
) | ||
|
||
docker_image_page = next(images_pager.pages, None) | ||
if docker_image_page is None: | ||
raise DockerRepositoryNotFoundException | ||
if ( | ||
len(docker_image_page.docker_images) == 0 | ||
): # This condition shouldn't happen since we're asking for 1 image per page | ||
raise DockerRepositoryNotFoundException | ||
return docker_image_page.docker_images[0].tags[0] | ||
except NotFound: | ||
raise DockerRepositoryNotFoundException |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kovben95scale had a good question about this -- is there a reason we don't use redis on azure etc? I think we're just using this as a celery task queue here, which seems like it would fit with redis.