Create 7 - Transfer Dataset to S3.ipynb #3

Closed
wants to merge 1 commit into from

Conversation

billfreeman44
Member

LMK what you think.

I think transferring from S3 to local can be covered in the docs, since there isn't much of a notebook to run for it.

@billfreeman44 billfreeman44 requested a review from razor-x May 19, 2022 22:48
@razor-x
Member

razor-x commented May 19, 2022

This is a good start but I wanted to put this tool in the dsdk. I was about to start on a PR.

@billfreeman44
Member Author

🤔 Seems a bit simple but OK. Feel free to yoink anything of use.

@razor-x
Member

razor-x commented May 19, 2022

We need to add a warning to this notebook, and try to give users a rough cost estimate. Something with these notes:

  • While the data set subscription as provided is free, exporting and downloading the data set will incur a real money cost to your AWS bill according to their pricing.
  • Even if you are on the AWS free tier, the volume of data in this data set may exceed the free tier limits.
  • If you set up auto-export of the data, you will get daily exports into your S3 bucket until you disable the automatic process. This will incur an increasing daily storage cost.
  • Downloading data from S3 across the internet, including to your local machine, will cost money.
  • Exporting the data from the data exchange to a bucket outside of the region the data set is hosted in will incur additional cost.
  • FPS Critic Inc., makers of PureSkill.gg, is not liable for any AWS costs you incur. Run this notebook only if you understand and accept the AWS billing implications.
  • For S3 pricing, refer to https://aws.amazon.com/s3/pricing/
  • You may use this calculator to estimate your costs: https://calculator.aws

For reference, you can use this cost estimate tool. As an example, exporting one day of data (1 revision), assuming it is 13 GB and contains 350 matches, would cost $x / month in S3 storage if the bucket is in us-east-1, or $y / month if you are on the free tier with no other usage. Downloading this data to your local machine would cost $z. If you use a bucket in a different region, expect to pay an additional $a. This is only an example; actual costs may vary and should be evaluated independently.
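
To make the auto-export note concrete, here is a rough sketch of how storage accrues when one revision lands in the bucket every day and nothing is deleted. The 13 GB figure is the example size from the comment above; the function name and the 30-day month are assumptions for illustration, and no AWS price is hard-coded, so the rate has to come from the S3 pricing page.

```python
def cumulative_storage_gb_months(gb_per_revision, days, days_in_month=30):
    """GB-months accrued when one revision is exported per day and none are deleted.

    Hypothetical helper: assumes every revision has the same size and that
    storage is billed pro rata per day of a `days_in_month`-day month.
    """
    # After the export on day d there are d revisions in the bucket.
    total_gb_days = sum(gb_per_revision * d for d in range(1, days + 1))
    return total_gb_days / days_in_month


def storage_cost_usd(gb_months, usd_per_gb_month):
    """Storage cost for a given rate; the rate must be looked up, not assumed."""
    return gb_months * usd_per_gb_month


# Example: 13 GB per revision, auto-export left running for 30 days.
gb_months = cumulative_storage_gb_months(13, 30)
print(f"{gb_months:.1f} GB-months accrued in the first 30 days")
```

The point of the sketch is that the billed volume grows with every export, so the second month costs more than the first unless old revisions are removed or auto-export is disabled.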

@razor-x
Member

razor-x commented May 19, 2022

Calculator to link to: https://calculator.aws/#/createCalculator/S3

We can create and share a few estimates and link to them here, like this https://calculator.aws/#/estimate?id=baecd1a90fc2fd919a027c8821ad033615961d11

@razor-x
Member

razor-x commented May 19, 2022

Estimates we can generate:

  • Cost per 30 revisions / month for storage only.
  • Cost per 30 revisions / month for download only.
  • Cost per 30 revisions / month for storage and download.
  • Cost per 30 revisions / month for storage and download, also assuming bucket is in another region.
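
A minimal sketch of how those four estimates relate to each other, in case we want to sanity-check the shared calculator links. Everything here is an assumption for illustration: the `Rates` fields and `monthly_estimates` are hypothetical names, every revision is treated as 13 GB and stored for the full month (an upper bound), and request charges and free tier allowances are ignored. The rates in the example call are placeholders, not AWS prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Per-GB rates in USD; fill these in from the S3 pricing page."""
    storage_usd_per_gb_month: float
    internet_egress_usd_per_gb: float
    cross_region_usd_per_gb: float


def monthly_estimates(revisions, gb_per_revision, rates):
    """Return the four monthly scenarios listed above as a dict of USD amounts."""
    total_gb = revisions * gb_per_revision
    storage = total_gb * rates.storage_usd_per_gb_month
    download = total_gb * rates.internet_egress_usd_per_gb
    cross_region = total_gb * rates.cross_region_usd_per_gb
    return {
        "storage_only": storage,
        "download_only": download,
        "storage_and_download": storage + download,
        "storage_download_cross_region": storage + download + cross_region,
    }


# Placeholder rates only, chosen to keep the arithmetic easy to follow.
placeholder = Rates(
    storage_usd_per_gb_month=0.02,
    internet_egress_usd_per_gb=0.10,
    cross_region_usd_per_gb=0.01,
)
for name, usd in monthly_estimates(30, 13, placeholder).items():
    print(f"{name}: ${usd:.2f}")
```

The calculator remains the source of truth; this only shows that each scenario is the previous one plus one more per-GB charge on the same 30 x 13 GB volume.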

@razor-x
Member

razor-x commented May 21, 2022

I updated master with the latest dsdk version. Let's make a new branch from that and use the new functions.

@razor-x razor-x closed this May 21, 2022
@razor-x razor-x deleted the add-adx-download branch June 9, 2022 18:44