diff --git a/website/docs/blueprints/data-analytics/spark-operator-s3tables.md b/website/docs/blueprints/data-analytics/spark-operator-s3tables.md
index 315cc9ec3..6adbcdd8c 100644
--- a/website/docs/blueprints/data-analytics/spark-operator-s3tables.md
+++ b/website/docs/blueprints/data-analytics/spark-operator-s3tables.md
@@ -175,9 +175,9 @@ aws s3 cp s3table-iceberg-pyspark.py s3:///s3table-example/scripts/
 
 Navigate to example directory and submit the Spark job.
 
-### Step 5: Create Amazon S3 Table
+### Step 5: Create Amazon S3 table bucket
 
-This is the main step where you will create an S3 bucket that will be used for S3 Tables, which your PySpark job will access later.
+This is the main step where you will create an S3 table bucket that will be used for S3 Tables, which your PySpark job will access later.
 
 Replace `` with your desired bucket name. Replace `` with your AWS region.
 
@@ -188,7 +188,7 @@ aws s3tables create-table-bucket \
     --name ""
 ```
 
-Make note of the S3TABLE ARN generated by this command. Verify the S3 Table ARN from AWS Console.
+Make note of the S3TABLE BUCKET ARN generated by this command. Verify the S3 table bucket ARN from AWS Console.
 
 ![alt text](img/s3table_bucket.png)
 
@@ -197,8 +197,8 @@ aws s3tables create-table-bucket \
 Update the Spark Operator YAML file as below:
 
 - Open [s3table-spark-operator.yaml](https://github.com/awslabs/data-on-eks/blob/main/analytics/terraform/spark-k8s-operator/examples/s3-tables/s3table-spark-operator.yaml) file in your preferred text editor.
-- Replace `` with your S3 bucket created by this blueprint(Check Terraform outputs). S3 Bucket is the place where you copied test data and sample spark job in the above steps.
-- REPLACE `` with your S3 Table ARN captured in the previous step.
+- Replace `` with your S3 bucket created by this blueprint(Check Terraform outputs). S3 bucket is the place where you copied the test data and sample spark job in the above steps.
+- REPLACE `` with your S3 table bucket ARN captured in the previous step.
 
 You can see the snippet of Spark Operator Job config below.
 
@@ -227,7 +227,7 @@ spec:
   mainApplicationFile: "s3a:///s3table-example/scripts/s3table-iceberg-pyspark.py"
   arguments:
     - "s3a:///s3table-example/input/"
-    - ""
+    - ""
   sparkConf:
     "spark.app.name": "s3table-example"
     "spark.kubernetes.driver.pod.name": "s3table-example"
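
Note: for anyone trying the renamed step while reviewing this change, below is a minimal sketch of creating the table bucket and capturing the ARN that the updated text asks readers to note. The placeholder names `<REGION>` and `<S3TABLE_BUCKET_NAME>`, the shell variable, and the use of `--query 'arn'` are illustrative assumptions here, not part of this diff; the doc itself leaves the placeholders to the reader.

```bash
# Sketch (assumptions noted above): create the S3 table bucket and capture its ARN
# so it can be pasted into the Spark Operator job spec argument shown in the diff.
S3TABLE_BUCKET_ARN=$(aws s3tables create-table-bucket \
  --region "<REGION>" \
  --name "<S3TABLE_BUCKET_NAME>" \
  --query 'arn' --output text)

# Echo the captured ARN; this is the value the "Make note of the ARN" step refers to.
echo "Table bucket ARN: ${S3TABLE_BUCKET_ARN}"
```

Capturing the ARN in a variable at creation time avoids having to copy it from the AWS Console before editing s3table-spark-operator.yaml.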