HADOOP-19233: ABFS: [FnsOverBlob] Implementing Rename and Delete APIs over Blob Endpoint #7265

Merged: 43 commits, Feb 3, 2025


@bhattmanish98 bhattmanish98 commented Jan 2, 2025

Description of PR:


This PR is part of the series of work done under the parent Jira: [HADOOP-19179]
Jira for this Patch: [HADOOP-19233]

Currently, rename and delete operations are supported only on the DFS endpoint. They are not supported on the Blob endpoint because that endpoint has no notion of hierarchy, yet the HDFS contracts must still hold when these operations are performed. To rename or delete a directory over the Blob endpoint, the client has to orchestrate the operation itself and rename or delete every blob within the specified directory.
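
To illustrate why the client has to orchestrate a directory rename on a flat namespace, here is a minimal sketch. It is not the code in this patch: `FlatNamespaceRenameSketch` and its methods are hypothetical, and an in-memory sorted map stands in for the blob container. A "directory" is only a name prefix, so renaming it means listing every blob under the prefix, copying each one to the new name, and deleting the source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for a flat (non-hierarchical) blob container. */
final class FlatNamespaceRenameSketch {

    /** Blob name -> content. A sorted map keeps names with a common prefix adjacent. */
    private final TreeMap<String, String> blobs = new TreeMap<>();

    void put(String name, String content) {
        blobs.put(name, content);
    }

    boolean exists(String name) {
        return blobs.containsKey(name);
    }

    /** Lists every blob whose name starts with the directory prefix "dir/". */
    List<String> listUnder(String dir) {
        String prefix = dir + "/";
        List<String> names = new ArrayList<>();
        for (String name : blobs.tailMap(prefix, true).keySet()) {
            if (!name.startsWith(prefix)) {
                break; // names are sorted, so we are past the prefix range
            }
            names.add(name);
        }
        return names;
    }

    /**
     * Client-side directory rename: copy each blob to the new prefix,
     * then delete the source blob. Returns the number of blobs moved.
     */
    int renameDirectory(String src, String dst) {
        List<String> toMove = listUnder(src);
        for (String name : toMove) {
            String target = dst + name.substring(src.length());
            blobs.put(target, blobs.get(name)); // copy to the destination name
            blobs.remove(name);                 // delete the source
        }
        return toMove.size();
    }
}
```

A failure between the copy and the delete, or partway through the loop, leaves the directory half renamed. The sketch does not handle that case; it only shows where the gap comes from.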
 
This task covers the considerations for implementing rename and delete operations over the FNS-Blob endpoint while staying compatible with the HDFS contracts.

  • Blob Endpoint Usage: The blob endpoint does not support hierarchy, so the code needs an abstraction layer that maintains the HDFS contracts while performing rename and delete operations on it.
  • Rename Operations: The AzureBlobFileSystem#rename() method will delegate to a RenameHandler instance, with separate handlers for the DFS and blob endpoints. The handler covers prechecks, destination adjustments, and the orchestration of directory renames for blobs.
  • Atomic Rename: Atomicity needs explicit handling on the blob endpoint, because a directory rename is orchestrated by the client as a copy and delete of each blob within the directory. A configuration will let developers specify the directories that require atomic renaming, and a JSON file will track the status of each such rename.
  • Delete Operations: Delete operations are simpler than renames and require fewer HDFS contract checks. For the blob endpoint, the client must still handle the orchestration, including cleaning up orphaned directories created by AzCopy.
  • Orchestration for Rename/Delete: Orchestration for rename and delete over the blob endpoint consists of listing the blobs and performing the action on each one. The process must handle large numbers of blobs efficiently.
  • Need for Optimization: The ListBlob API returns at most 5000 blobs per call, so large directories need multiple calls. The task proposes a producer-consumer model that processes blobs in parallel to reduce processing time and memory usage.
  • Producer-Consumer Design: The proposed design has a producer that lists blobs, a queue that holds the listed blobs, and consumers that process them in parallel. This approach aims to improve efficiency and mitigate memory issues.
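
The producer-consumer design in the last bullets can be sketched as follows. This is an illustration under stated assumptions, not the implementation in this patch: `ListAndProcessSketch`, `listPage`, and `processAll` are hypothetical names, the paged listing is simulated in memory, and the per-blob action is reduced to a counter. The bounded queue is what limits memory: the producer blocks once two pages are waiting, so only a few pages are held at any time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ListAndProcessSketch {

    /** Sentinel page (compared by identity) that tells a consumer to stop. */
    private static final List<String> END = new ArrayList<>();

    /** Hypothetical paged listing: names in [offset, offset + pageSize), capped at total. */
    static List<String> listPage(int total, int offset, int pageSize) {
        List<String> page = new ArrayList<>();
        for (int i = offset; i < Math.min(total, offset + pageSize); i++) {
            page.add("dir/blob-" + i);
        }
        return page;
    }

    /** Lists totalBlobs names page by page and processes them in parallel. */
    static int processAll(int totalBlobs, int pageSize, int consumers) {
        if (pageSize <= 0 || consumers <= 0) {
            throw new IllegalArgumentException("pageSize and consumers must be positive");
        }
        BlockingQueue<List<String>> queue = new LinkedBlockingQueue<>(2); // bounded
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(consumers + 1);

        // Producer: list one page at a time; put() blocks while the queue is full.
        pool.submit(() -> {
            try {
                for (int offset = 0; offset < totalBlobs; offset += pageSize) {
                    queue.put(listPage(totalBlobs, offset, pageSize));
                }
                for (int i = 0; i < consumers; i++) {
                    queue.put(END); // one stop signal per consumer
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Consumers: take pages until the sentinel arrives.
        for (int c = 0; c < consumers; c++) {
            pool.submit(() -> {
                try {
                    for (List<String> page = queue.take(); page != END; page = queue.take()) {
                        // A real consumer would rename or delete each blob here.
                        processed.addAndGet(page.size());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                pool.shutdownNow();
                throw new IllegalStateException("timed out waiting for workers");
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return processed.get();
    }
}
```

For example, `ListAndProcessSketch.processAll(12001, 5000, 4)` needs three listing pages (5000, 5000, and 2001 names) and returns 12001.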

@bhattmanish98 bhattmanish98 changed the title HADOOP-19381: [ABFS] Support Rename and Delete operation over FNS-Blob endpoint HADOOP-19233: [ABFS] Support Rename and Delete operation over FNS-Blob endpoint Jan 2, 2025
@bhattmanish98 bhattmanish98 changed the title HADOOP-19233: [ABFS] Support Rename and Delete operation over FNS-Blob endpoint HADOOP-19233: ABFS: [FnsOverBlob] Implementing Rename and Delete APIs over Blob Endpoint Jan 3, 2025
@bhattmanish98 bhattmanish98 marked this pull request as ready for review January 6, 2025 07:00

@anujmodi2021 anujmodi2021 left a comment

Added some comments

bhattmanish98 commented Jan 31, 2025

============================================================
HNS-OAuth-DFS

[WARNING] Tests run: 160, Failures: 0, Errors: 0, Skipped: 3
[WARNING] Tests run: 758, Failures: 0, Errors: 0, Skipped: 142
[WARNING] Tests run: 171, Failures: 0, Errors: 0, Skipped: 25
[WARNING] Tests run: 262, Failures: 0, Errors: 0, Skipped: 23

============================================================
HNS-SharedKey-DFS

[WARNING] Tests run: 160, Failures: 0, Errors: 0, Skipped: 4
[WARNING] Tests run: 758, Failures: 0, Errors: 0, Skipped: 94
[WARNING] Tests run: 171, Failures: 0, Errors: 0, Skipped: 25
[WARNING] Tests run: 262, Failures: 0, Errors: 0, Skipped: 10

============================================================
NonHNS-SharedKey-DFS

[WARNING] Tests run: 160, Failures: 0, Errors: 0, Skipped: 11
[WARNING] Tests run: 742, Failures: 0, Errors: 0, Skipped: 340
[WARNING] Tests run: 171, Failures: 0, Errors: 0, Skipped: 27
[WARNING] Tests run: 262, Failures: 0, Errors: 0, Skipped: 11

============================================================
AppendBlob-HNS-OAuth-DFS

[WARNING] Tests run: 160, Failures: 0, Errors: 0, Skipped: 3
[WARNING] Tests run: 758, Failures: 0, Errors: 0, Skipped: 147
[WARNING] Tests run: 171, Failures: 0, Errors: 0, Skipped: 49
[WARNING] Tests run: 262, Failures: 0, Errors: 0, Skipped: 23

============================================================
NonHNS-OAuth-DFS

[WARNING] Tests run: 160, Failures: 0, Errors: 0, Skipped: 11
[WARNING] Tests run: 742, Failures: 0, Errors: 0, Skipped: 344
[WARNING] Tests run: 171, Failures: 0, Errors: 0, Skipped: 27
[WARNING] Tests run: 262, Failures: 0, Errors: 0, Skipped: 24

@anujmodi2021 anujmodi2021 left a comment

+1 with suggestions.
Let's wait for test results to come before merging this.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 49s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 1s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 1s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 🆗 | xmllint | 0m 1s | | xmllint was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 20 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 40m 33s | | trunk passed |
| +1 💚 | compile | 0m 41s | | trunk passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | compile | 0m 35s | | trunk passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga |
| +1 💚 | checkstyle | 0m 32s | | trunk passed |
| +1 💚 | mvnsite | 0m 40s | | trunk passed |
| +1 💚 | javadoc | 0m 40s | | trunk passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | javadoc | 0m 33s | | trunk passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga |
| +1 💚 | spotbugs | 1m 7s | | trunk passed |
| +1 💚 | shadedclient | 39m 2s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 29s | | the patch passed |
| +1 💚 | compile | 0m 33s | | the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | javac | 0m 33s | | the patch passed |
| +1 💚 | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga |
| +1 💚 | javac | 0m 27s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 ⚠️ | checkstyle | 0m 21s | /results-checkstyle-hadoop-tools_hadoop-azure.txt | hadoop-tools/hadoop-azure: The patch generated 6 new + 23 unchanged - 3 fixed = 29 total (was 26) |
| +1 💚 | mvnsite | 0m 31s | | the patch passed |
| -1 ❌ | javadoc | 0m 28s | /results-javadoc-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04.txt | hadoop-tools_hadoop-azure-jdkUbuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 generated 5 new + 10 unchanged - 1 fixed = 15 total (was 11) |
| -1 ❌ | javadoc | 0m 25s | /results-javadoc-javadoc-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga.txt | hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga generated 5 new + 10 unchanged - 1 fixed = 15 total (was 11) |
| +1 💚 | spotbugs | 1m 6s | | the patch passed |
| +1 💚 | shadedclient | 38m 23s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 0m 34s | | hadoop-azure in the patch passed. |
| +1 💚 | asflicense | 0m 37s | | The patch does not generate ASF License warnings. |
| | | 130m 32s | | |
| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7265/31/artifact/out/Dockerfile |
| GITHUB PR | #7265 |
| JIRA Issue | HADOOP-19233 |
| Optional Tests | dupname asflicense codespell detsecrets xmllint compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle |
| uname | Linux 9226e7503c33 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 5765c5a |
| Default Java | Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7265/31/testReport/ |
| Max. process+thread count | 585 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7265/31/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 48s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 1s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 1s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 🆗 | xmllint | 0m 1s | | xmllint was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 21 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 41m 45s | | trunk passed |
| +1 💚 | compile | 0m 40s | | trunk passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | compile | 0m 35s | | trunk passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga |
| +1 💚 | checkstyle | 0m 32s | | trunk passed |
| +1 💚 | mvnsite | 0m 40s | | trunk passed |
| +1 💚 | javadoc | 0m 40s | | trunk passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | javadoc | 0m 33s | | trunk passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga |
| +1 💚 | spotbugs | 1m 9s | | trunk passed |
| +1 💚 | shadedclient | 38m 38s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 30s | | the patch passed |
| +1 💚 | compile | 0m 32s | | the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 |
| +1 💚 | javac | 0m 32s | | the patch passed |
| +1 💚 | compile | 0m 28s | | the patch passed with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga |
| +1 💚 | javac | 0m 28s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 21s | | hadoop-tools/hadoop-azure: The patch generated 0 new + 23 unchanged - 3 fixed = 23 total (was 26) |
| +1 💚 | mvnsite | 0m 31s | | the patch passed |
| +1 💚 | javadoc | 0m 28s | | hadoop-tools_hadoop-azure-jdkUbuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 generated 0 new + 10 unchanged - 1 fixed = 10 total (was 11) |
| +1 💚 | javadoc | 0m 25s | | hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga with JDK Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga generated 0 new + 10 unchanged - 1 fixed = 10 total (was 11) |
| +1 💚 | spotbugs | 1m 6s | | the patch passed |
| +1 💚 | shadedclient | 38m 38s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 0m 35s | | hadoop-azure in the patch passed. |
| +1 💚 | asflicense | 0m 35s | | The patch does not generate ASF License warnings. |
| | | 131m 29s | | |
| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7265/33/artifact/out/Dockerfile |
| GITHUB PR | #7265 |
| JIRA Issue | HADOOP-19233 |
| Optional Tests | dupname asflicense codespell detsecrets xmllint compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle |
| uname | Linux 691b86429183 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / d2c8baf |
| Default Java | Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_432-8u432-gaus1-0ubuntu220.04-ga |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7265/33/testReport/ |
| Max. process+thread count | 620 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7265/33/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@bhattmanish98

:::: AGGREGATED TEST RESULT ::::

============================================================
HNS-OAuth

[WARNING] Tests run: 161, Failures: 0, Errors: 0, Skipped: 4
[WARNING] Tests run: 758, Failures: 0, Errors: 0, Skipped: 144
[WARNING] Tests run: 171, Failures: 0, Errors: 0, Skipped: 25
[WARNING] Tests run: 262, Failures: 0, Errors: 0, Skipped: 23

============================================================
HNS-SharedKey

[WARNING] Tests run: 161, Failures: 0, Errors: 0, Skipped: 5
[WARNING] Tests run: 758, Failures: 0, Errors: 0, Skipped: 96
[WARNING] Tests run: 171, Failures: 0, Errors: 0, Skipped: 25
[WARNING] Tests run: 262, Failures: 0, Errors: 0, Skipped: 10

============================================================
NonHNS-SharedKey

[WARNING] Tests run: 161, Failures: 0, Errors: 0, Skipped: 11
[WARNING] Tests run: 742, Failures: 0, Errors: 0, Skipped: 339
[WARNING] Tests run: 171, Failures: 0, Errors: 0, Skipped: 27
[WARNING] Tests run: 262, Failures: 0, Errors: 0, Skipped: 11

============================================================
AppendBlob-HNS-OAuth

[WARNING] Tests run: 161, Failures: 0, Errors: 0, Skipped: 4
[WARNING] Tests run: 758, Failures: 0, Errors: 0, Skipped: 149
[WARNING] Tests run: 171, Failures: 0, Errors: 0, Skipped: 49
[WARNING] Tests run: 262, Failures: 0, Errors: 0, Skipped: 23

Time taken: 99 mins 10 secs.


@anujmodi2021 anujmodi2021 left a comment

+1
Thanks for the patch @bhattmanish98
And @anmolanmol1234 for the review.

The test results look good.
This is good to merge now.

@anujmodi2021 anujmodi2021 merged commit 6d20de1 into apache:trunk Feb 3, 2025
4 checks passed
bhattmanish98 added a commit to bhattmanish98/hadoop that referenced this pull request Feb 17, 2025
… over Blob Endpoint (apache#7265)

Contributed by Manish Bhatt.
Signed off by Anuj Modi, Anmol Asrani