Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update for latest ADX version #2

Merged
merged 4 commits into from
May 19, 2022
Merged

Update for latest ADX version #2

merged 4 commits into from
May 19, 2022

Conversation

billfreeman44
Copy link
Member

No description provided.

@billfreeman44 billfreeman44 requested a review from razor-x May 19, 2022 06:29
@billfreeman44
Copy link
Member Author

Fixes dsdk version and downloading sample data.

"- Expand the \"Data set overview\" box near the top to reveal the Data set ID.\n",
"\n",
"_**Replace the `dataset_id` parameter below with your particular ID then run the notebook.**_\n",
"The use license will be in your `PURESKILLGG_TOME_DS_COLLECTION_PATH` named `license.pdf`. You must agree to these terms to use the data.\n",
Copy link
Member

@razor-x razor-x May 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"The use license will be in your `PURESKILLGG_TOME_DS_COLLECTION_PATH` named `license.pdf`. You must agree to these terms to use the data.\n",
"The use license will be in your `PURESKILLGG_TOME_DS_COLLECTION_PATH` named `LICENSE.pdf`. You must agree to these terms to use the data.\n",

Same for README.md in the zip file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix'd

@@ -89,23 +69,36 @@
"outputs": [],
"source": [
"import os\n",
"from pureskillgg_dsdk.exchange import download_dataexchange_dataset_revision"
"import requests\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, when I looked up how to do this I was suggested to use import urllib and urllib.request. Is requests also in std lib? Two ways to do the same thing in Python? 😲

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my google search led me to this and i didn't look more into it haha https://www.codingem.com/python-download-file-from-url/

Requests is mentioned in the top yellow box here https://docs.python.org/3/library/urllib.request.html#module-urllib.request

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😮 yup. done.

"download_dataexchange_dataset_revision(output_path, dataset_id)"
" response = requests.get(url)\n",
" with open(output_filename, \"wb\") as f:\n",
" f.write(response.content)\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just skip the part where we save the zip file to disk: https://stackoverflow.com/a/2463819

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Though now I can't check if the zip exists to see if we've already downloaded it 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is such a small amount of data I'm not sure it matters

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note that the way you've written this, if there was an error extracting the zip file and they reran the loop it would still assume the zip file was extracted correctly. So I recommend just not making this one "smart"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old one took 20 minutes to download, this one takes less than 10 seconds so it is trivial to overwrite, i guess :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix'd

@billfreeman44 billfreeman44 requested a review from razor-x May 19, 2022 19:21
@billfreeman44 billfreeman44 merged commit 03ddfc0 into master May 19, 2022
@billfreeman44 billfreeman44 deleted the ADX-Updates branch May 19, 2022 21:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants