biodatageeks/big-data-lab

How to use this repo

  1. Fork this repo (see the fork instructions).
  2. Change parameters in notebooks/conf/00_env_variables.ipynb:
    • USER_ID - a number associated with your WUT email
    • TF_VAR_billing_account - billing account ID (GCP Console > Billing > Billing account ID)
  3. Open the notebook 01_ds_lab_project_bootstrap.ipynb on Colab by clicking the badge below:
    • Bootstrap GCP project: Open in Colab
  4. Run the notebook 02_ds_lab_infra_setup.ipynb on Colab by clicking the badge:
    • Provision Big Data Lab resources: Open in Colab
  5. Change the parameter in notebooks/01_ds_lab_project_bootstrap.ipynb and notebooks/02_ds_lab_infra_setup.ipynb:
    • USER_NAME - set to your GitHub username
  6. Run both Colab notebooks.
  7. Go to GCP Console > Dataproc > Workbench > Open JupyterLab
  8. Clone the repo:
cd
git clone https://github.com/biodatageeks/ds-notebooks.git
  9. Run the notebooks in /root/ds-notebooks/session_1/ (from the LocalDisk mount, not the GCS mount) with the Python 3 kernel.
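
Before running the Colab notebooks, it may help to sanity-check the two values set in notebooks/conf/00_env_variables.ipynb. The sketch below is not part of this repo. It assumes that the values end up as environment variables (the TF_VAR_ prefix is Terraform's convention for passing variables through the environment) and that a billing account ID has the form XXXXXX-XXXXXX-XXXXXX, made of digits and uppercase letters. The helper name check_lab_config is hypothetical.

```python
import os
import re

# Assumption: billing account IDs look like XXXXXX-XXXXXX-XXXXXX
# (three groups of six digits or uppercase letters).
BILLING_ID_PATTERN = re.compile(r"^[0-9A-Z]{6}-[0-9A-Z]{6}-[0-9A-Z]{6}$")


def check_lab_config(env):
    """Hypothetical helper: return a list of problems; empty means OK."""
    problems = []

    # USER_ID is described as a number associated with your WUT email.
    user_id = str(env.get("USER_ID", ""))
    if not user_id.isdigit():
        problems.append("USER_ID should be a number")

    # TF_VAR_billing_account is the ID shown under GCP Console > Billing.
    billing = str(env.get("TF_VAR_billing_account", ""))
    if not BILLING_ID_PATTERN.match(billing):
        problems.append(
            "TF_VAR_billing_account should look like XXXXXX-XXXXXX-XXXXXX"
        )

    return problems


if __name__ == "__main__":
    # Check the current process environment and report anything suspicious.
    for problem in check_lab_config(os.environ):
        print("config problem:", problem)
```

Running this in a notebook cell after the configuration cell prints one line per suspicious value and prints nothing when both values look plausible. It checks only the format of the values, not whether the billing account exists or whether you have access to it.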

About

Self service for Data Science labs
