Friday, September 30, 2016

Drill on YARN workshop 1: Find drillbit activity history

Goal:

How to find the activity history of drillbits when using the Drill on YARN feature.

Env:

Drill 1.8 with the Drill on YARN feature.
MapR 5.2

Solution:

The Application Master (AM) is the key: its log contains the full activity history of each drillbit.
For example, in my env the AM log is located at:
/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1475192050844_0002/container_e02_1475192050844_0002_01_000001/stdout

Below are the key entries from the AM log. The list is not exhaustive, but these are the entries to search for when troubleshooting.
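
One way to search an AM log for such entries is to filter on the TaskState logger and group the state changes per drillbit id. The following is a minimal Python sketch that assumes the line layout shown in the excerpts in this post; the function name and the sample list are illustrative and not part of Drill.

```python
import re
from collections import defaultdict

# Assumed layout (taken from the excerpts in this post):
#   ... TaskState - [id=1, type=drillbits, ...] StartState --> RequestingState
TRANSITION = re.compile(r"TaskState - \[id=(\d+),[^\]]*\]\s+(\w+) --> (\w+)")


def transitions_by_drillbit(lines):
    """Return {drillbit id: [(from_state, to_state), ...]} in log order."""
    history = defaultdict(list)
    for line in lines:
        match = TRANSITION.search(line)
        if match:
            task_id, old_state, new_state = match.groups()
            history[int(task_id)].append((old_state, new_state))
    return dict(history)


# Two TaskState lines copied from this post, plus one line that should be skipped.
sample = [
    "o.a.drill.yarn.appMaster.TaskState - [id=1, type=drillbits, name=Drillbit, state=StartState] StartState --> RequestingState",
    "o.a.d.y.appMaster.SchedulerStateImpl - [id=1, type=drillbits, name=Drillbit, host=s4.poc.com, state=EndState] - Task completed",
    "o.a.drill.yarn.appMaster.TaskState - [id=1, type=drillbits, name=Drillbit, host=s4.poc.com, state=RunningState] RunningState --> KillingState",
]

for task_id, steps in transitions_by_drillbit(sample).items():
    for old_state, new_state in steps:
        print(f"drillbit {task_id}: {old_state} --> {new_state}")
```

To run it against a real log, pass an open file handle for the AM stdout file instead of the sample list.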

1. Initialize 2 drillbits

For drillbit with id=1:
o.a.drill.yarn.appMaster.TaskState - [id=1, type=drillbits, name=Drillbit, state=StartState] StartState --> RequestingState
o.a.drill.yarn.appMaster.TaskState - [id=1, type=drillbits, name=Drillbit, state=RequestingState] - Received container: [id: container_e02_1475192050844_0002_01_000002, host: s4.poc.com, priority: 1, memory: 5120 MB, vcores: 1]
o.a.drill.yarn.appMaster.TaskState - [id=1, type=drillbits, name=Drillbit, host=s4.poc.com, state=RequestingState] RequestingState --> LaunchingState
o.a.drill.yarn.appMaster.TaskState - [id=1, type=drillbits, name=Drillbit, host=s4.poc.com, state=LaunchingState] LaunchingState --> WaitStartAckState
o.a.drill.yarn.appMaster.TaskState - [id=1, type=drillbits, name=Drillbit, host=s4.poc.com, state=WaitStartAckState] WaitStartAckState --> RunningState
For drillbit with id=2:
o.a.drill.yarn.appMaster.TaskState - [id=2, type=drillbits, name=Drillbit, state=StartState] StartState --> RequestingState
o.a.drill.yarn.appMaster.TaskState - [id=2, type=drillbits, name=Drillbit, state=RequestingState] - Received container: [id: container_e02_1475192050844_0002_01_000003, host: s1.poc.com, priority: 1, memory: 5120 MB, vcores: 1]
o.a.drill.yarn.appMaster.TaskState - [id=2, type=drillbits, name=Drillbit, host=s1.poc.com, state=RequestingState] RequestingState --> LaunchingState
o.a.drill.yarn.appMaster.TaskState - [id=2, type=drillbits, name=Drillbit, host=s1.poc.com, state=LaunchingState] LaunchingState --> WaitStartAckState
o.a.drill.yarn.appMaster.TaskState - [id=2, type=drillbits, name=Drillbit, host=s1.poc.com, state=WaitStartAckState] WaitStartAckState --> RunningState

2. Add 2 more drillbits

For drillbits with id=3,4:
o.a.drill.yarn.appMaster.TaskState - [id=3, type=drillbits, name=Drillbit, state=RequestingState] - Received container: [id: container_e02_1475192050844_0002_01_000004, host: s3.poc.com, priority: 1, memory: 5120 MB, vcores: 1]
o.a.drill.yarn.appMaster.TaskState - [id=4, type=drillbits, name=Drillbit, state=RequestingState] - Received container: [id: container_e02_1475192050844_0002_01_000005, host: s2.poc.com, priority: 1, memory: 5120 MB, vcores: 1]

3. Remove 2 drillbits

Drillbits with id=1,3 are killed:
o.a.d.y.a.PersistentTaskScheduler - [drillbits] - Cancelling 4 tasks. 0 are already cancelled, 4 more will be cancelled.
o.a.drill.yarn.appMaster.TaskState - [id=1, type=drillbits, name=Drillbit, host=s4.poc.com, state=RunningState] RunningState --> KillingState
o.a.drill.yarn.appMaster.TaskState - [id=3, type=drillbits, name=Drillbit, host=s3.poc.com, state=RunningState] RunningState --> KillingState
o.a.drill.yarn.appMaster.TaskState - [id=1, type=drillbits, name=Drillbit, host=s4.poc.com, state=KillingState] KillingState --> EndState
o.a.d.y.appMaster.SchedulerStateImpl - [id=1, type=drillbits, name=Drillbit, host=s4.poc.com, state=EndState] - Task completed
o.a.drill.yarn.appMaster.TaskState - [id=3, type=drillbits, name=Drillbit, host=s3.poc.com, state=KillingState] KillingState --> EndState
o.a.d.y.appMaster.SchedulerStateImpl - [id=3, type=drillbits, name=Drillbit, host=s3.poc.com, state=EndState] - Task completed

4. Add 2 more drillbits

Drillbits with id=5,6 are added:
o.a.drill.yarn.appMaster.TaskState - [id=5, type=drillbits, name=Drillbit, state=RequestingState] - Received container: [id: container_e02_1475192050844_0002_01_000006, host: s4.poc.com, priority: 1, memory: 5120 MB, vcores: 1]
o.a.drill.yarn.appMaster.TaskState - [id=6, type=drillbits, name=Drillbit, state=RequestingState] - Received container: [id: container_e02_1475192050844_0002_01_000007, host: s3.poc.com, priority: 1, memory: 5120 MB, vcores: 1]
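
Each "Received container" entry records the drillbit id, the container id, and the host, so these lines are enough to tell which container directories to look for on each node. The following is a minimal Python sketch that assumes the line layout shown above; `containers_by_host` is an illustrative helper name, not part of Drill.

```python
import re
from collections import defaultdict

# Assumed layout (taken from the excerpts in this post):
#   ... [id=1, ...] - Received container: [id: container_..., host: s4.poc.com, ...]
ALLOCATION = re.compile(
    r"\[id=(\d+),[^\]]*\] - Received container: "
    r"\[id: ([^,\s]+), host: ([^,\s]+),"
)


def containers_by_host(lines):
    """Return {host: [(drillbit id, container id), ...]} in log order."""
    result = defaultdict(list)
    for line in lines:
        match = ALLOCATION.search(line)
        if match:
            task_id, container_id, host = match.groups()
            result[host].append((int(task_id), container_id))
    return dict(result)


# Two allocation lines copied from this post: both containers landed on s4.
sample = [
    "o.a.drill.yarn.appMaster.TaskState - [id=1, type=drillbits, name=Drillbit, state=RequestingState] - Received container: [id: container_e02_1475192050844_0002_01_000002, host: s4.poc.com, priority: 1, memory: 5120 MB, vcores: 1]",
    "o.a.drill.yarn.appMaster.TaskState - [id=5, type=drillbits, name=Drillbit, state=RequestingState] - Received container: [id: container_e02_1475192050844_0002_01_000006, host: s4.poc.com, priority: 1, memory: 5120 MB, vcores: 1]",
]

for host, allocations in containers_by_host(sample).items():
    for task_id, container_id in allocations:
        print(f"{host}: drillbit {task_id} ran in {container_id}")
```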

Findings:

1. When drillbits are initialized, the state changes are:
StartState -> RequestingState -> LaunchingState -> WaitStartAckState -> RunningState
2. When drillbits are removed, the state changes are:
RunningState -> KillingState -> EndState
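3. With the traditional way of restarting drillbits, the drillbit home directory does not change.
In Drill on YARN, each time a drillbit is restarted, it runs in a new YARN container.
For example, host s4 has one directory for the original container (000002) and another for its replacement (000006):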
[root@s4 application_1475192050844_0002]# ls -altr
total 16
drwxr-s--- 3 mapr mapr 4096 Sep 29 18:15 container_e02_1475192050844_0002_01_000002
drwxr-s--- 4 mapr mapr 4096 Sep 30 13:08 .
drwxr-xr-x 3 mapr mapr 4096 Sep 30 13:09 ..
drwxr-s--- 3 mapr mapr 4096 Sep 30 13:10 container_e02_1475192050844_0002_01_000006
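4. Because of finding #3, the SQL profiles in a dead or killed YARN container cannot be seen unless you configure MFS as the SQL profile location.
For example, the old container still holds a profile, while the profiles directory of the new container is empty: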
[root@s4 application_1475192050844_0002]# ls -altr container_e02_1475192050844_0002_01_000002/profiles/|tail -1
-rw-r--r-- 1 mapr mapr  3514 Sep 30 11:57 28114a53-414e-69e3-f6a9-d60763d4dfa8.sys.drill

[root@s4 application_1475192050844_0002]# ls -altr container_e02_1475192050844_0002_01_000006/profiles
total 8
drwxr-s--- 3 mapr mapr 4096 Sep 30 13:10 ..
drwxr-s--- 2 mapr mapr 4096 Sep 30 13:10 .
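
Findings 1 and 2 can be turned into a simple check: any state change in an AM log that is not part of those two sequences deserves a closer look. The following is a minimal Python sketch. The two sequences are only the ones observed in this post, so the AM may well have other legitimate states (for example around failures) that this sketch would also flag. All names are illustrative.

```python
# State sequences observed in this post (not a complete list of AM states).
STARTUP = ["StartState", "RequestingState", "LaunchingState",
           "WaitStartAckState", "RunningState"]
SHUTDOWN = ["RunningState", "KillingState", "EndState"]


def adjacent_pairs(sequence):
    """Return the set of (from_state, to_state) steps in a sequence."""
    return set(zip(sequence, sequence[1:]))


OBSERVED = adjacent_pairs(STARTUP) | adjacent_pairs(SHUTDOWN)


def unexpected_transitions(transitions):
    """Return the transitions that were not observed in this walkthrough."""
    return [step for step in transitions if step not in OBSERVED]


# A drillbit that starts normally, then jumps straight to EndState.
history = [
    ("StartState", "RequestingState"),
    ("RequestingState", "LaunchingState"),
    ("LaunchingState", "WaitStartAckState"),
    ("WaitStartAckState", "RunningState"),
    ("RunningState", "EndState"),  # hypothetical, not seen in this post
]

for old_state, new_state in unexpected_transitions(history):
    print(f"check this transition: {old_state} --> {new_state}")
```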
