
Distributed-CellProfiler

Run encapsulated Docker containers with CellProfiler on Amazon Web Services infrastructure.

This code is an example of how to use distributed AWS infrastructure to run CellProfiler. The AWS resources are configured using boto3 and the AWS CLI. The worker is written in Python and encapsulated in a Docker container. Four AWS components are minimally needed to run distributed jobs:

  1. An SQS queue
  2. An ECS cluster
  3. An S3 bucket
  4. A spot fleet of EC2 instances

All of them can be managed through the AWS Management Console. However, this code helps you get started quickly and run a job autonomously once the configuration is correct. It prepares the infrastructure to run a distributed job and, when the job is complete, can stop resources and clean up components. It also adds logging and alarms via CloudWatch, helping you troubleshoot runs and destroy stuck machines.
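As a rough sketch of what these components look like from boto3 (the library the setup code uses), the snippet below checks for each of the four resources. All names in it are placeholders, not the ones run.py actually creates; run.py derives the real names from config.py.

```python
# Illustrative sketch only: inspect the four core resources with boto3.
# All names below are placeholders; run.py derives the real ones from config.py.
import boto3

sqs = boto3.client("sqs")
ecs = boto3.client("ecs")
s3 = boto3.client("s3")
ec2 = boto3.client("ec2")

# 1. SQS queue that will hold the individual tasks
queue_url = sqs.get_queue_url(QueueName="MyProject_queue")["QueueUrl"]

# 2. ECS cluster that places the worker containers
cluster = ecs.describe_clusters(clusters=["default"])["clusters"][0]

# 3. S3 bucket with input images and output locations (raises if unreachable)
s3.head_bucket(Bucket="my-imaging-bucket")

# 4. Spot fleet of EC2 instances (only exists after startCluster, Step 3)
fleets = ec2.describe_spot_fleet_requests()["SpotFleetRequestConfigs"]

print(queue_url, cluster["status"], len(fleets))
```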

Documentation

Comprehensive documentation, including troubleshooting, is available at Distributed CellProfiler Documentation.

Running the code

Step 1

Edit the config.py file with all the relevant information for your job. Then, start creating the basic AWS resources by running the following script:

$ python run.py setup

This script initializes the resources in AWS. Note that the Docker registry is built separately, and you can modify the worker code to build your own. Any time you modify the worker code, you need to update the Docker registry using the Makefile inside the worker directory.
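What goes into config.py depends on the version you are running; the excerpt below is only an illustrative sketch of the kind of values it holds. The field names are examples, not an authoritative list, so check the comments in the provided config.py.

```python
# Illustrative excerpt of the kind of settings config.py carries.
# Field names and values are examples; consult the config.py in this
# repository for the authoritative list and documentation.
APP_NAME = "MyProject"                # prefix used for the queue, cluster, and log group
DOCKERHUB_TAG = "myuser/distributed-cellprofiler:latest"  # worker image to pull
AWS_REGION = "us-east-1"
AWS_BUCKET = "my-imaging-bucket"      # S3 bucket holding inputs and receiving outputs
ECS_CLUSTER = "default"
CLUSTER_MACHINES = 4                  # number of Spot instances to request
TASKS_PER_MACHINE = 1                 # worker containers per instance
MACHINE_TYPE = ["m5.xlarge"]          # instance type(s) for the spot fleet
MACHINE_PRICE = 0.10                  # maximum Spot price, USD per hour
SQS_MESSAGE_VISIBILITY = 3 * 60 * 60  # seconds a task stays hidden while a worker holds it
```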

Step 2

Once the first script has run successfully, the job can be submitted to AWS using EITHER of the following commands:

$ python run.py submitJob files/exampleJob.json

OR

$ python run_batch_general.py

Running either script uploads the tasks configured in the JSON file. This assumes that your data is stored in S3 and that the JSON file holds the paths to the input and output directories. You have to customize the exampleJob.json file or the run_batch_general.py file with paths that make sense for your project. The tasks that compose your job are CellProfiler groups, and each one runs in parallel. You need to define each task in your input file to guide the parallelization.
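The exact schema of the job file can differ between versions, so the keys below are illustrative rather than definitive. A minimal sketch of building a job file of the general shape described above (a pipeline, S3 paths, and a list of groups to parallelize over) could look like this:

```python
# Illustrative sketch of writing a job file; key names are examples and
# may differ from what your version of exampleJob.json expects.
import json

job = {
    "pipeline": "projects/myproject/analysis.cppipe",   # path under your S3 bucket
    "data_file": "projects/myproject/load_data.csv",
    "input": "projects/myproject/images/",
    "output": "projects/myproject/output/",
    # Each group becomes one task on the SQS queue and is processed in parallel.
    "groups": [
        {"Metadata": "Metadata_Plate=Plate1,Metadata_Well=A01"},
        {"Metadata": "Metadata_Plate=Plate1,Metadata_Well=A02"},
    ],
}

with open("files/myJob.json", "w") as f:
    json.dump(job, f, indent=4)
```

You would then submit it with `python run.py submitJob files/myJob.json`.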

Step 3

After submitting the job to the queue, you can add computing power to process all the tasks in AWS. This code starts a fleet of Spot EC2 instances that run the worker code. The worker code is encapsulated in Docker containers, and the code uses ECS services to deploy them onto the EC2 instances. All of this is automated with the following command:

$ python run.py startCluster files/exampleFleet.json

After the cluster is ready, the code informs you that everything is set up and saves the spot fleet identifier in a file for later reference.
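If you want to check on the fleet outside of the monitor (Step 4), you can query the saved identifier directly with boto3; the request id below is a placeholder for the one run.py writes out.

```python
# Illustrative sketch: check the state of the spot fleet started in Step 3.
# Replace the request id with the one saved in files/APP_NAMESpotFleetRequestId.json.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
response = ec2.describe_spot_fleet_requests(
    SpotFleetRequestIds=["sfr-00000000-0000-0000-0000-000000000000"]
)
for config in response["SpotFleetRequestConfigs"]:
    print(config["SpotFleetRequestState"], config.get("ActivityStatus"))
```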

Step 4

When the cluster is up and running, you can monitor progress using the following command:

$ python run.py monitor files/APP_NAMESpotFleetRequestId.json

The file APP_NAMESpotFleetRequestId.json is created after the cluster is set up in Step 3. It is important to keep this monitor running if you want computing resources to shut down automatically when there are no more tasks in the queue (recommended).
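The monitor's shutdown decision is driven by the queue emptying out. If you want to peek at the queue yourself, a direct SQS query is enough; the queue name below is a placeholder for whatever your config.py produces.

```python
# Illustrative sketch: count tasks still waiting or in progress on the queue.
# The queue name is a placeholder; use the one derived from your config.py.
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
queue_url = sqs.get_queue_url(QueueName="MyProject_queue")["QueueUrl"]
attrs = sqs.get_queue_attributes(
    QueueUrl=queue_url,
    AttributeNames=[
        "ApproximateNumberOfMessages",           # tasks waiting to be picked up
        "ApproximateNumberOfMessagesNotVisible", # tasks currently being processed
    ],
)["Attributes"]
print("waiting:", attrs["ApproximateNumberOfMessages"])
print("in progress:", attrs["ApproximateNumberOfMessagesNotVisible"])
```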

(Workflow diagram: Distributed-CellProfiler-Workflow)
