compress raw files to parquet and possibly tar/zip them as well #41

Open
eddyizm opened this issue Feb 1, 2024 · 0 comments
Labels
backend back end logic workflows enhancement New feature or request

Comments

@eddyizm
Owner

eddyizm commented Feb 1, 2024

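Depending on how long we need to store the raw files, converting them to Parquet might be worth it to save storage space, and it would definitely cut egress when moving these files out to client sites. For example, the full NPI file drops from 9.1G as CSV to 2.2G as Parquet: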

$ la raw_data/npidata_pfile_20050523-20231112*
-rw-r--r-- 1 eddyizm 197121 9.1G Nov 30 21:57 raw_data/npidata_pfile_20050523-20231112.csv
-rw-r--r-- 1 eddyizm 197121 2.2G Dec 10 21:55 raw_data/npidata_pfile_20050523-20231112.csv.parquet
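
A rough sketch of how the conversion and the optional tar/zip step could look, not necessarily how the Parquet file above was produced. It assumes `pyarrow` is available; the function names `csv_to_parquet` and `bundle_tar_gz` are hypothetical, and `snappy` is used only because it is pyarrow's default codec. The CSV is streamed in batches so a 9G file does not have to fit in memory. One caveat: the streaming reader infers column types from the first block, so a file with mixed-type columns may need explicit types passed in.

```python
import tarfile
from pathlib import Path


def csv_to_parquet(csv_path, parquet_path, compression="snappy"):
    """Stream a CSV into a Parquet file one record batch at a time."""
    # Imported here so the tar helper below still works without pyarrow.
    import pyarrow as pa
    import pyarrow.csv as pacsv
    import pyarrow.parquet as pq

    # open_csv returns a streaming reader; the schema comes from the first block.
    reader = pacsv.open_csv(str(csv_path))
    with pq.ParquetWriter(str(parquet_path), reader.schema,
                          compression=compression) as writer:
        for batch in reader:
            writer.write_table(pa.Table.from_batches([batch]))
    return Path(parquet_path)


def bundle_tar_gz(paths, archive_path):
    """Bundle files into a single gzip-compressed tar archive."""
    with tarfile.open(str(archive_path), "w:gz") as tar:
        for p in paths:
            # Store only the file name, not the full local path.
            tar.add(str(p), arcname=Path(p).name)
    return Path(archive_path)
```

Parquet output is already compressed, so gzipping it again will probably gain little; the tar step would mostly be for shipping several files as one archive.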
@eddyizm eddyizm added enhancement New feature or request backend back end logic workflows labels Feb 1, 2024