This project contains the source code for the article "How to quickly load a large amount of records to AWS DynamoDB".
-
Clone the project to the local machine
git clone https://github.com/mojitocoder/etl-ddb.git
-
Install dependencies
cd etl-ddb
npm install
-
Create the `Postcodes` table in DynamoDB
cd local_scripts
node createTable.js
-
Deploy the project
npm run sls -- deploy
The Serverless framework will create a `Postcodes` table in DynamoDB and a Lambda named `etl-ddb-dev-loadPostcodes`.
-
Trigger the Lambda to load data into the `Postcodes` table in DynamoDB
aws lambda invoke --region eu-west-1 \
  --function-name etl-ddb-dev-loadPostcodes out --log-type Tail \
  --query 'LogResult' --output text | base64 -d
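With `--log-type Tail`, Lambda returns the last 4 KB of the execution log, base64-encoded, in the `LogResult` field, which is why the command pipes the output through `base64 -d`. As a minimal sketch, the same decoding step in Node.js could look like this; `decodeLogResult` is a hypothetical helper name and is not part of this project.

```javascript
// Hypothetical helper: decode the base64-encoded log tail returned by Lambda.
function decodeLogResult(logResult) {
  return Buffer.from(logResult, "base64").toString("utf8");
}

// A locally encoded string stands in for a real LogResult value here.
const sample = Buffer.from("START RequestId: example\nEND RequestId: example\n", "utf8")
  .toString("base64");
console.log(decodeLogResult(sample));
```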
If you prefer not to use Lambda to load data to DynamoDB, you can do it from your local development environment.
-
Make sure you are in the `local_scripts` folder
cd etl-ddb
cd local_scripts
-
Download UK postcode geo data file
curl https://www.freemaptools.com/download/full-postcodes/ukpostcodes.zip --output ukpostcodes.zip
-
Unzip the file
unzip -a ukpostcodes.zip
-
If you have not deployed the serverless project, you will need to create the `Postcodes` table in DynamoDB manually
node createTable.js
There are three versions of the program to run:
- `lineByLine.js` is the slowest one. It would take more than 2 days to finish.
- `writeInBatches.js` is faster, taking around 2 hours.
- `concurrentRequests.js` uses the same logic as the Lambda version and is the quickest to run. It should take about 15 minutes.
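Judging by the script names, the speed-up comes from sending more items per request and keeping several requests in flight at once. The sketch below illustrates those two ideas without calling AWS: it splits items into groups of 25 (the maximum number of put or delete requests that `BatchWriteItem` accepts per call) and sends the groups with a bounded number of concurrent calls. `chunk`, `runWithConcurrency` and the `sendBatch` callback are hypothetical names chosen for illustration and are not taken from the project's scripts; a real sender would also need to retry any `UnprocessedItems` returned by DynamoDB.

```javascript
// DynamoDB's BatchWriteItem accepts at most 25 put/delete requests per call.
const MAX_BATCH_SIZE = 25;

// Split an array into consecutive groups of at most `size` elements.
function chunk(items, size = MAX_BATCH_SIZE) {
  const groups = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Run `sendBatch` over all batches with at most `limit` calls in flight.
// `sendBatch` is a placeholder for whatever performs the actual write.
async function runWithConcurrency(batches, limit, sendBatch) {
  let next = 0;
  async function worker() {
    while (next < batches.length) {
      const index = next++; // no await between check and increment, so no race
      await sendBatch(batches[index], index);
    }
  }
  const workerCount = Math.min(limit, batches.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
}

// Usage with a fake sender that only counts items instead of writing them.
async function demo() {
  const items = Array.from({ length: 60 }, (_, i) => ({ id: i }));
  let written = 0;
  await runWithConcurrency(chunk(items), 2, async (batch) => {
    written += batch.length;
  });
  console.log(`items handed to the sender: ${written}`);
}

demo();
```

Raising the concurrency limit shortens the run only until the table's write capacity is reached, after which requests start to be throttled.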
-
View the details of the DynamoDB table
aws dynamodb --region eu-west-1 describe-table --table-name Postcodes
-
Scan the DynamoDB table
aws dynamodb --region eu-west-1 scan \
  --table-name Postcodes
-
Delete a DynamoDB table
aws dynamodb --region eu-west-1 delete-table \
  --table-name Postcodes
-
List tables in DynamoDB
aws dynamodb list-tables --region eu-west-1