Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDFS-16968. Recover two replicas when 2-replication write pipepine fails #6125

Open
wants to merge 2 commits into
base: trunk
Choose a base branch
from

Conversation

ted12138
Copy link
Contributor

@ted12138 ted12138 commented Sep 28, 2023

Description of PR

When user tries to write a 2-replica file and fails in the process, write pipeline will only make sure one DN is written successfully, this will make the data lost. This PR aims to solve this case.

How was this patch tested?

  1. Test in UT
  2. This feature has run in our company cluster for more than 6 months and works well

For code changes:

  • Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 28s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 13m 42s Maven dependency ordering for branch
+1 💚 mvninstall 22m 19s trunk passed
+1 💚 compile 3m 14s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 3m 7s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 0m 53s trunk passed
+1 💚 mvnsite 1m 35s trunk passed
+1 💚 javadoc 1m 27s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 52s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 4m 8s trunk passed
+1 💚 shadedclient 25m 26s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 25s Maven dependency ordering for patch
+1 💚 mvninstall 1m 11s the patch passed
+1 💚 compile 3m 15s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 3m 15s the patch passed
+1 💚 compile 3m 19s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 3m 19s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 48s /results-checkstyle-hadoop-hdfs-project.txt hadoop-hdfs-project: The patch generated 2 new + 24 unchanged - 0 fixed = 26 total (was 24)
+1 💚 mvnsite 1m 28s the patch passed
+1 💚 javadoc 1m 10s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 45s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 53s the patch passed
+1 💚 shadedclient 24m 32s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 1m 50s hadoop-hdfs-client in the patch passed.
-1 ❌ unit 209m 55s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 39s The patch does not generate ASF License warnings.
333m 27s
Reason Tests
Failed junit tests hadoop.hdfs.TestClientProtocolForPipelineRecovery
hadoop.hdfs.TestDFSInputStream
hadoop.hdfs.server.namenode.TestFsck
hadoop.hdfs.TestFileAppend4
hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6125/1/artifact/out/Dockerfile
GITHUB PR #6125
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 8b3d1ae2e955 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 5c69f3f
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6125/1/testReport/
Max. process+thread count 3114 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6125/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 27s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 13m 35s Maven dependency ordering for branch
+1 💚 mvninstall 22m 32s trunk passed
+1 💚 compile 3m 48s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 3m 26s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 0m 59s trunk passed
+1 💚 mvnsite 1m 36s trunk passed
+1 💚 javadoc 1m 24s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 43s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 21s trunk passed
+1 💚 shadedclient 24m 0s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 26s Maven dependency ordering for patch
+1 💚 mvninstall 1m 22s the patch passed
+1 💚 compile 3m 41s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 3m 41s the patch passed
+1 💚 compile 3m 17s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 3m 17s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 46s /results-checkstyle-hadoop-hdfs-project.txt hadoop-hdfs-project: The patch generated 2 new + 24 unchanged - 0 fixed = 26 total (was 24)
+1 💚 mvnsite 1m 27s the patch passed
+1 💚 javadoc 1m 10s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 35s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 19s the patch passed
+1 💚 shadedclient 24m 37s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 2s hadoop-hdfs-client in the patch passed.
-1 ❌ unit 209m 2s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 40s The patch does not generate ASF License warnings.
332m 16s
Reason Tests
Failed junit tests hadoop.hdfs.TestClientProtocolForPipelineRecovery
hadoop.hdfs.TestDFSInputStream
hadoop.hdfs.server.namenode.TestFsck
hadoop.hdfs.TestFileAppend4
hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6125/2/artifact/out/Dockerfile
GITHUB PR #6125
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux e605c51a04d3 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 412b38a
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6125/2/testReport/
Max. process+thread count 3622 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6125/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@zhangshuyan0
Copy link
Contributor

If you want to improve the reliability of 2-replica writing, it is recommended that you directly configure dfs.client.block.write.replace-datanode-on-failure.policy to ALWAYS. The current changes conflict with the design ideas of the DEFAULT policy. See:

<property>
<name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
<value>DEFAULT</value>
<description>
This property is used only if the value of
dfs.client.block.write.replace-datanode-on-failure.enable is true.
ALWAYS: always add a new datanode when an existing datanode is removed.
NEVER: never add a new datanode.
DEFAULT:
Let r be the replication number.
Let n be the number of existing datanodes.
Add a new datanode only if r is greater than or equal to 3 and either
(1) floor(r/2) is greater than or equal to n; or
(2) r is greater than n and the block is hflushed/appended.
</description>
</property>

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 28s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 15m 49s Maven dependency ordering for branch
+1 💚 mvninstall 24m 3s trunk passed
+1 💚 compile 3m 43s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 3m 39s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 1m 1s trunk passed
+1 💚 mvnsite 1m 46s trunk passed
-1 ❌ javadoc 0m 47s /branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.txt hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.
+1 💚 javadoc 1m 55s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 59s trunk passed
+1 💚 shadedclient 29m 20s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 24s Maven dependency ordering for patch
+1 💚 mvninstall 1m 25s the patch passed
+1 💚 compile 3m 53s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 3m 53s the patch passed
+1 💚 compile 3m 44s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 3m 44s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 53s /results-checkstyle-hadoop-hdfs-project.txt hadoop-hdfs-project: The patch generated 2 new + 24 unchanged - 0 fixed = 26 total (was 24)
+1 💚 mvnsite 1m 29s the patch passed
-1 ❌ javadoc 0m 42s /patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.txt hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.
+1 💚 javadoc 1m 48s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 4m 3s the patch passed
+1 💚 shadedclient 29m 7s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 0s hadoop-hdfs-client in the patch passed.
-1 ❌ unit 201m 13s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 39s The patch does not generate ASF License warnings.
340m 51s
Reason Tests
Failed junit tests hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS
hadoop.hdfs.TestFileAppend4
hadoop.hdfs.TestClientProtocolForPipelineRecovery
hadoop.hdfs.TestDFSInputStream
hadoop.hdfs.server.namenode.TestFsck
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6125/1/artifact/out/Dockerfile
GITHUB PR #6125
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 262da4d96e9b 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 412b38a
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6125/1/testReport/
Max. process+thread count 3571 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6125/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants