Saturday, May 2, 2015

When will Application Master blacklist Node Manager and when will it ignore the blacklist?

Goal:

This article explains below mechanism of MapReduceV2 job on YARN:
When will Application Master(AM) blacklist one Node Manager(NM), and when will it ignore the blacklist?

Env:

Hadoop 2.5.1

Code Analysis:

1. When will AM blacklist NM?

If the number of task-failures on a Node Manager of a given job/Yarn application reaches 3(mapreduce.job.maxtaskfailures.per.tracker), Application Master of that job will ask Resource Manager to blacklist that Node Manager for that job only.
It means, new tasks of that job will not be assigned to that blacklisted Node Manager.
Note:
  • The blacklist is per job and other jobs will not be affected.
  • This feature is enabled(yarn.app.mapreduce.am.job.node-blacklisting.enable) by default.
The code logic is in function containerFailedOnHost(String hostName) of RMContainerRequestor.java:
    maxTaskFailuresPerNode =
      conf.getInt(MRJobConfig.MAX_TASK_FAILURES_PER_TRACKER, 3);

  protected void containerFailedOnHost(String hostName) {
  
    LOG.info(failures + " failures on node " + hostName);
    if (failures >= maxTaskFailuresPerNode) {
      blacklistedNodes.add(hostName);
      if (!ignoreBlacklisting.get()) {
        blacklistAdditions.add(hostName);
      }
      //Even if blacklisting is ignored, continue to remove the host from
      // the request table. The RM may have additional nodes it can allocate on.
      LOG.info("Blacklisted host " + hostName);

2. When will AM ignore the blacklist?

When the percentage of blacklisted Node Managers reaches 33%(yarn.app.mapreduce.am.job.node-blacklisting.enable), Application Master of that job will ask Resource Manager to ignore the blacklist and set all blacklisted nodes free.
Assume the cluster has 100 Node Managers. When 33 nodes are blacklisted, the blacklist will be ignored since then.

The code logic is in function computeIgnoreBlacklisting() of RMContainerRequestor.java:
      int val = (int) ((float) blacklistedNodes.size() / clusterNmCount * 100);
      if (val >= blacklistDisablePercent) {
        if (ignoreBlacklisting.compareAndSet(false, true)) {
          LOG.info("Ignore blacklisting set to true. Known: " + clusterNmCount
              + ", Blacklisted: " + blacklistedNodeCount + ", " + val + "%");
          // notify RM to ignore all the blacklisted nodes
          blacklistAdditions.clear();
          blacklistRemovals.addAll(blacklistedNodes);
        }

3. Will AM re-blacklist NMs after ignoring-blacklist event happens?

The current behavior is : if that Node Manager has ever been blacklisted before, then it will not be blacklisted again after ignore-blacklist happens; Else, it will be blacklisted.

The code logic is in function containerFailedOnHost(String hostName) of RMContainerRequestor.java:
  protected void containerFailedOnHost(String hostName) {
    if (!nodeBlacklistingEnabled) {
      return;
    }
    if (blacklistedNodes.contains(hostName)) {
      if (LOG.isDebugEnabled()) {
        LOG.debug("Host " + hostName + " is already blacklisted.");
      }
      return; //already blacklisted
The reason of above behavior is in above item 2: when ignoring-blacklist happens, it only ask RM to clear "blacklistAdditions", however it dose not clear the "blacklistedNodes" variable.

This behavior may cause the whole job/application to fail if the previous blacklisted NM was released after ignoring-blacklist event happens.
Imagine a serial murder is released from prison just because the prison is 33% full, and horribly he/she will never be put in prison again. Only new murder will be put in prison.
YARN-3571 is opened for this issue.

4. Will ignoring-blacklist event happen more than once?

Nope, ingoring-blacklist only happens once.
This is because function computeIgnoreBlacklisting() in RMContainerRequestor.java only triggers ignoring-blacklist event "if (ignoreBlacklisting.compareAndSet(false, true))".

This function "compareAndSet"'s return value is:
"true if successful. False return indicates that the actual value was not equal to the expected value."

So initially ignoreBlacklisting=false.
After ignoring-blacklist happens once, ignoreBlacklisting is set to true by function "compareAndSet".
After that, ignoring-blacklist can never be triggered.

Log Analysis:

Test Bed:
4 nodes cluster with 4 NMs.
This is a pig job on YARN after setting yarn.app.mapreduce.am.job.node-blacklisting.ignore-threshold-node-percent to 1. It means if any NodeManger is blacklisted, ignoring-blacklist event will happen.

Below I made some disasters to some nodes, and analyze the Application Master logs to prove above theory. 

Test 1:
One node(h4) has issue, other 3 nodes are healthy.
The job failed with below AM logs:
[root@h1 container_1430425729977_0006_01_000001]# egrep -i 'failures on node|blacklist|FATAL' syslog
2015-05-02 18:38:41,246 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true
2015-05-02 18:38:41,246 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 1
2015-05-02 18:39:07,249 FATAL [IPC Server handler 3 on 41696] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0006_m_000002_0 - exited : java.io.IOException: Spill failed
2015-05-02 18:39:07,297 INFO [Thread-49] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 1 failures on node h4.poc.com
2015-05-02 18:39:07,950 FATAL [IPC Server handler 16 on 41696] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0006_m_000008_0 - exited : java.io.IOException: Spill failed
2015-05-02 18:39:07,954 INFO [Thread-49] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 2 failures on node h4.poc.com
2015-05-02 18:39:08,148 FATAL [IPC Server handler 17 on 41696] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0006_m_000007_0 - exited : java.io.IOException: Spill failed
2015-05-02 18:39:08,152 INFO [Thread-49] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 3 failures on node h4.poc.com
2015-05-02 18:39:08,152 INFO [Thread-49] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Blacklisted host h4.poc.com
2015-05-02 18:39:08,561 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Update the blacklist for application_1430425729977_0006: blacklistAdditions=1 blacklistRemovals=0
2015-05-02 18:39:08,561 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Ignore blacklisting set to true. Known: 4, Blacklisted: 1, 25%
2015-05-02 18:39:09,563 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Update the blacklist for application_1430425729977_0006: blacklistAdditions=0 blacklistRemovals=1
2015-05-02 18:39:32,912 FATAL [IPC Server handler 19 on 41696] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0006_m_000002_1 - exited : java.io.IOException: Spill failed
2015-05-02 18:39:35,076 FATAL [IPC Server handler 1 on 41696] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0006_m_000009_0 - exited : java.io.IOException: Spill failed
2015-05-02 18:39:35,133 FATAL [IPC Server handler 5 on 41696] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0006_m_000008_1 - exited : java.io.IOException: Spill failed
2015-05-02 18:39:57,308 FATAL [IPC Server handler 17 on 41696] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0006_m_000002_2 - exited : java.io.IOException: Spill failed
2015-05-02 18:40:00,174 FATAL [IPC Server handler 10 on 41696] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0006_m_000009_1 - exited : java.io.IOException: Spill failed
2015-05-02 18:40:00,227 FATAL [IPC Server handler 12 on 41696] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0006_m_000007_1 - exited : java.io.IOException: Spill failed
2015-05-02 18:40:22,905 FATAL [IPC Server handler 3 on 41696] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0006_m_000018_0 - exited : java.io.IOException: Spill failed
2015-05-02 18:40:24,413 FATAL [IPC Server handler 19 on 41696] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0006_m_000009_2 - exited : java.io.IOException: Spill failed
2015-05-02 18:40:26,086 FATAL [IPC Server handler 16 on 41696] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0006_m_000002_3 - exited : java.io.IOException: Spill failed

From above logs, we can see the node h4 got blacklisted after 3 task failures.
Immediately after that, the ignoring-blacklist event happened.
Then node h4 will never be blacklisted again.
When task 1430425729977_0006_m_000002 failed for 4 times, the whole job failed.

Test 2:
2 nodes(h3 and h4) have issues, other 2 nodes are healthy.
The job failed with below AM logs:
[root@h1 container_1430425729977_0007_01_000001]# egrep -i 'failures on node|blacklist|FATAL' syslog
2015-05-02 19:08:40,143 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: nodeBlacklistingEnabled:true
2015-05-02 19:08:40,143 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: blacklistDisablePercent is 1
2015-05-02 19:09:06,255 FATAL [IPC Server handler 5 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000005_0 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:06,286 INFO [Thread-49] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 1 failures on node h3.poc.com
2015-05-02 19:09:06,387 FATAL [IPC Server handler 8 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000007_0 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:06,391 INFO [Thread-49] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 2 failures on node h3.poc.com
2015-05-02 19:09:06,860 FATAL [IPC Server handler 26 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000008_0 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:06,872 INFO [Thread-49] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 1 failures on node h4.poc.com
2015-05-02 19:09:06,923 FATAL [IPC Server handler 28 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000001_0 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:06,928 INFO [Thread-49] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 3 failures on node h3.poc.com
2015-05-02 19:09:06,928 INFO [Thread-49] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Blacklisted host h3.poc.com
2015-05-02 19:09:07,496 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Update the blacklist for application_1430425729977_0007: blacklistAdditions=1 blacklistRemovals=0
2015-05-02 19:09:07,496 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Ignore blacklisting set to true. Known: 4, Blacklisted: 1, 25%
2015-05-02 19:09:08,137 FATAL [IPC Server handler 1 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000002_0 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:08,143 INFO [Thread-49] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 2 failures on node h4.poc.com
2015-05-02 19:09:08,499 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Update the blacklist for application_1430425729977_0007: blacklistAdditions=0 blacklistRemovals=1
2015-05-02 19:09:08,705 FATAL [IPC Server handler 13 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000004_0 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:08,712 INFO [Thread-49] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: 3 failures on node h4.poc.com
2015-05-02 19:09:08,712 INFO [Thread-49] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: Blacklisted host h4.poc.com
2015-05-02 19:09:32,252 FATAL [IPC Server handler 8 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000009_0 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:32,560 FATAL [IPC Server handler 17 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000001_1 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:32,815 FATAL [IPC Server handler 23 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000007_1 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:32,993 FATAL [IPC Server handler 5 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000008_1 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:33,145 FATAL [IPC Server handler 8 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000005_1 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:34,731 FATAL [IPC Server handler 24 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000011_0 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:56,388 FATAL [IPC Server handler 19 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000009_1 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:57,443 FATAL [IPC Server handler 21 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000007_2 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:58,607 FATAL [IPC Server handler 25 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000002_1 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:59,282 FATAL [IPC Server handler 18 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000001_2 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:59,557 FATAL [IPC Server handler 23 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000008_2 - exited : java.io.IOException: Spill failed
2015-05-02 19:09:59,838 FATAL [IPC Server handler 5 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000004_1 - exited : java.io.IOException: Spill failed
2015-05-02 19:10:21,783 FATAL [IPC Server handler 4 on 51955] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1430425729977_0007_m_000008_3 - exited : java.io.IOException: Spill failed
From above logs, we can see the node h3 got blacklisted firstly after 3 task failures.
Immediately after that, the ignoring-blacklist event happened.
Then node h4 got blacklisted again, and node h3 will never be blacklisted again.
Note: ignoring-blacklist did not happen for the 2nd time.
When task 1430425729977_0007_m_000008 failed for 4 times, the whole job failed.

Key takeaways

  • AM can blacklist the NM if 3 task failures reaches.
  • AM ignores all blacklist NMs if 33% threshold reaches.
  • AM can not re-blacklist the same NMs again after ignoring-blacklist event happens once.
  • Ignoring-blacklist event only happens once.

No comments:

Post a Comment

Popular Posts