Thursday, January 15, 2015

Union-All with NULL value fails with java.lang.NullPointerException in Hive 0.12

Env:

Hive 0.12

Symptom:

A UNION ALL query whose branches select NULL values fails with java.lang.NullPointerException in Hive 0.12.
A minimal example is:
select * from
(
select col0,col1,NULL,NULL from passwords limit 1
union all 
select col0,col1,NULL,NULL from passwords limit 2
) tmp;
Sample error output:
Ended Job = job_201501081639_0053 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://n1a.mycluster2.com:50030/jobdetails.jsp?jobid=job_201501081639_0053
Examining task ID: task_201501081639_0053_m_000001 (and more) from job job_201501081639_0053

Task with the most failures(4):
-----
Task ID:
  task_201501081639_0053_m_000001

-----
Diagnostic Messages for this Task:
java.lang.RuntimeException: Error in configuring object
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
 at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
 at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:439)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:282)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1122)
 at org.apache.hadoop.mapred.Child.main(Child.java:271)
Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
 ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
 at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
 at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
 at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
 ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
 ... 17 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
 at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:134)
 ... 22 more
Caused by: java.lang.NullPointerException
 at org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector.toString(StructObjectInspector.java:64)
 at java.lang.String.valueOf(String.java:2854)
 at java.lang.StringBuilder.append(StringBuilder.java:128)
 at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:110)
 at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377)
 at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:453)
 at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:409)
 at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:188)
 at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377)
 at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443)
 at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377)
 at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:113)
 ... 22 more


FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1  Reduce: 1   Cumulative CPU: 1.8 sec   MAPRFS Read: 0 MAPRFS Write: 0 SUCCESS
Job 1: Map: 1  Reduce: 1   Cumulative CPU: 1.94 sec   MAPRFS Read: 0 MAPRFS Write: 0 SUCCESS
Job 2: Map: 2   MAPRFS Read: 0 MAPRFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 3 seconds 740 msec

Root Cause:

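This is bug HIVE-4002, which was fixed in Hive 0.13. In the stack trace above, the NullPointerException is thrown from StructObjectInspector.toString, called by UnionOperator.initializeOp while the map operator is being initialized.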

Solution:

The workaround in Hive 0.12 is to rewrite the query so that the NULL columns are selected outside the union-all subquery rather than inside each branch. For example, rewrite the query below:
select * from
(
select col0,col1,NULL,NULL from passwords limit 1
union all 
select col0,col1,NULL,NULL from passwords limit 2
) tmp;
To:
select col0,col1, NULL,NULL from
(
select col0,col1 from passwords limit 1
union all 
select col0,col1 from passwords limit 2
) tmp;
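
If the result needs named columns rather than the default generated names, the same rewrite can carry aliases on the outer NULL columns. The following is only a sketch of that variant: the aliases col2 and col3 are hypothetical names chosen for illustration, and it has not been verified against every Hive 0.12 build.

```sql
-- Sketch: same workaround as above, with hypothetical aliases (col2, col3)
-- given to the NULL columns in the outer query.
select col0, col1, NULL as col2, NULL as col3 from
(
select col0,col1 from passwords limit 1
union all
select col0,col1 from passwords limit 2
) tmp;
```

The NULL literals stay outside the union-all subquery, which is the point of the workaround. The aliases only change the output column names.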

Alternatively, upgrade to Hive 0.13 or later.
