Monday, November 13, 2017

How to configure LDAP client by using SSSD for authentication on CentOS

Goal:

How to configure an LDAP client using SSSD (System Security Services Daemon) for authentication on CentOS.
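A minimal /etc/sssd/sssd.conf sketch for an LDAP domain is shown below; the domain name, server URI, and search base are placeholders (not values from the post) and should be replaced with your own:

[sssd]
config_file_version = 2
services = nss, pam
domains = LDAP

[domain/LDAP]
id_provider = ldap
auth_provider = ldap
ldap_uri = ldap://ldap.example.com
ldap_search_base = dc=example,dc=com
ldap_id_use_start_tls = true
cache_credentials = true

After editing, tighten the permissions (chmod 600 /etc/sssd/sssd.conf), restart the sssd service, and enable SSSD in NSS/PAM, for example with "authconfig --enablesssd --enablesssdauth --update" on CentOS 6/7.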

Monday, November 6, 2017

How to install and configure MapR Hive ODBC driver on Linux

Goal:

How to install and configure the MapR Hive ODBC driver on Linux.
This article gives detailed step-by-step instructions as a supplement to this MapR Documentation.
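As a rough sketch (the driver path and key names below follow the Simba-based Hive ODBC driver conventions and are assumptions to be verified against the MapR documentation), a DSN entry in /etc/odbc.ini might look like:

[MapRHiveDSN]
Description=MapR Hive ODBC DSN
Driver=/opt/mapr/hiveodbc/lib/64/libmaprhiveodbc64.so
HOST=hs2node.example.com
PORT=10000
HiveServerType=2
AuthMech=2
UID=mapr

The connection can then be tested with the unixODBC tool "isql -v MapRHiveDSN".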

Monday, October 30, 2017

How to modify hbase thrift client code if Hbase Thrift Service enables framed transport and compact protocol

Goal:

How to modify hbase thrift client code if Hbase Thrift Service enables framed transport and compact protocol.
The background is:
To avoid the Thrift service crash issue described in HBASE-11052, we need to enable framed transport and the compact protocol in hbase-site.xml and then restart the Hbase Thrift Service as below:
<property> 
  <name>hbase.regionserver.thrift.framed</name> 
  <value>true</value> 
</property> 
<property> 
  <name>hbase.regionserver.thrift.framed.max_frame_size_in_mb</name> 
  <value>2</value> 
</property> 
<property> 
  <name>hbase.regionserver.thrift.compact</name> 
  <value>true</value> 
</property>
After that, the old Hbase thrift client code needs to be modified; otherwise it will fail with the error below:
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes
This article explains what to modify in the hbase thrift client code to make the job compatible with framed transport and the compact protocol.
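For a Python thrift client, the core change is to wrap the socket in a framed transport and to replace TBinaryProtocol with TCompactProtocol. A minimal sketch is shown below; the gateway host and the name of the thrift-generated module are placeholders:

from thrift.transport import TSocket, TTransport
from thrift.protocol import TCompactProtocol
from hbase import Hbase   # thrift-generated client module; the name may differ in your setup

socket = TSocket.TSocket('thriftgateway.example.com', 9090)
# Framed transport, to match hbase.regionserver.thrift.framed=true
transport = TTransport.TFramedTransport(socket)
# Compact protocol, to match hbase.regionserver.thrift.compact=true
protocol = TCompactProtocol.TCompactProtocol(transport)
client = Hbase.Client(protocol)
transport.open()

Clients that keep the default buffered transport and TBinaryProtocol will keep hitting the "TSocket read 0 bytes" error above.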

Friday, October 27, 2017

How to install Thrift and run a sample Hbase thrift job towards Hbase Thrift Gateway on MapR Cluster

Goal:

How to install Thrift and run a sample Hbase thrift job against the Hbase Thrift Gateway on a MapR cluster.
The example job is written in Python and simply scans a MapR-DB table.
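A minimal Python sketch of such a scan job using the Thrift1 API is shown below; the gateway host, port, table path, and the name of the thrift-generated module are placeholders:

from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase            # thrift-generated client module; the name may differ
from hbase.ttypes import TScan

socket = TSocket.TSocket('thriftgateway.example.com', 9090)
transport = TTransport.TBufferedTransport(socket)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()

# Open a scanner on the MapR-DB table and print every row key
scanner = client.scannerOpenWithScan('/tmp/mytable', TScan(), None)
row = client.scannerGet(scanner)
while row:
    print(row[0].row)
    row = client.scannerGet(scanner)
client.scannerClose(scanner)
transport.close()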

Friday, September 22, 2017

How to modify max heap size for Impalad embedded JVM

Goal:

The impalad is a 'native' process with an embedded JVM; the JVM is started from within C++ code.
The -mem_limit startup option sets an overall memory limit for the impalad process (which handles multiple queries concurrently).
However, the max heap size of the impalad embedded JVM is much smaller than that limit.
Some Impala queries may use up the entire embedded JVM heap before reaching the limit set by the "-mem_limit" startup option, causing impalad errors such as "OutOfMemoryError: Java heap space" or simply hanging. In that situation, we need to increase the JVM max heap size.

This article shows how to check and modify the max heap size for impalad embedded JVM.
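As a rough sketch (the env.sh path below is typical for MapR Impala packages and is an assumption to verify for your version), the current max heap can usually be inspected with jmap against the impalad pid, and raised by exporting JAVA_TOOL_OPTIONS, which the embedded JVM picks up at startup:

# Check the current max heap of the embedded JVM
jmap -heap $(pgrep -f impalad)

# In /opt/mapr/impala/impala-<version>/conf/env.sh, set e.g. an 8 GB max heap,
# then restart impalad
export JAVA_TOOL_OPTIONS="-Xmx8g"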

Tuesday, September 19, 2017

Hue could not show Hive tables after Hive enables PAM authentication

Symptom:

a. Hue could not show Hive tables after Hive enabled PAM authentication; see the screenshot below.
b. In /opt/mapr/hue/hue-<version>/logs/runcpserver.log, the following error messages appear:
[19/Sep/2017 15:55:07 -0700] dbms         DEBUG    Query Server: {'server_name': 'beeswax', 'transport_mode': 'socket', 'server_host': 's4.poc.com', 'server_port': 10000, 'auth_password_used': False, 'http_url': 'http://s4.poc.com:10001/cliservice', 'auth_username': 'hue', 'principal': None}
[19/Sep/2017 15:55:10 -0700] thrift_util  INFO     Thrift saw a transport exception: Bad status: 3 (Error validating the login)
c. In the HiveServer2 log /opt/mapr/hive/hive-<version>/logs/mapr/hive.log, the following stack trace appears:
2017-09-19T15:57:11,046 ERROR [HiveServer2-Handler-Pool: Thread-60] transport.TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: Error validating the login [Caused by javax.security.sasl.AuthenticationException: Error authenticating with the PAM service: login [Caused by javax.security.sasl.AuthenticationException: Error authenticating with the PAM service: login]]
 at org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:110)
 at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539)
 at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283)
 at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
 at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
 at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
Caused by: javax.security.sasl.AuthenticationException: Error authenticating with the PAM service: login [Caused by javax.security.sasl.AuthenticationException: Error authenticating with the PAM service: login]
 at org.apache.hive.service.auth.PamAuthenticationProviderImpl.Authenticate(PamAuthenticationProviderImpl.java:54)
 at org.apache.hive.service.auth.PlainSaslHelper$PlainServerCallbackHandler.handle(PlainSaslHelper.java:119)
 at org.apache.hive.service.auth.PlainSaslServer.evaluateResponse(PlainSaslServer.java:103)
 ... 8 more
Caused by: javax.security.sasl.AuthenticationException: Error authenticating with the PAM service: login
 at org.apache.hive.service.auth.PamAuthenticationProviderImpl.Authenticate(PamAuthenticationProviderImpl.java:48)
 ... 10 more
2017-09-19T15:57:11,046 ERROR [HiveServer2-Handler-Pool: Thread-60] server.TThreadPoolServer: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Error validating the login
 at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
 at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException: Error validating the login
 at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
 at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
 at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
 at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
 ... 4 more

Friday, September 15, 2017

Hive 2.x queries got stuck when waiting for "tryAcquireCompileLock" in HS2 stacktrace

Env:

Hive 2.x

Symptom:

Hive 2.x queries got stuck waiting for "tryAcquireCompileLock" in the HS2 stack trace.
When the issue happens, connecting to HS2 using beeline works.
However, any query, e.g. "show databases", will hang.

Below is one example of the jstack output on the HS2 process:
"7d88a5ad-cd2c-4c37-9025-8372164524fd HiveServer2-Handler-Pool: Thread-214" #214 prio=5 os_prio=0 tid=0x00007fdacc2d1800 nid=0x7af0 waiting on condition [0x00007fda9b6fa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000005c1fb7f28> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
        at org.apache.hadoop.hive.ql.Driver.tryAcquireCompileLock(Driver.java:1324)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1236)
        at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1230)
        at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:191)
        at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:276)
        at org.apache.hive.service.cli.operation.Operation.run(Operation.java:324)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:499)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:486)
        at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
        at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
        at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
        at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
        at com.sun.proxy.$Proxy25.executeStatementAsync(Unknown Source)
        at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:294)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:505)
        at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437)
        at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)

How to limit the Hive log size using RFA instead of default DRFA in Hive 2.x using log4j2

Env:

Hive 2.x on MapR

Goal:

By default, MapR Hive uses DRFA (Daily Rolling File Appender) for log4j2. The template for the DRFA settings is in /opt/mapr/hive/hive-<version>/conf/hive-log4j2.properties.template.
Administrators can copy hive-log4j2.properties.template to hive-log4j2.properties in the "conf" directory and make changes as needed.
However, if the daily Hive log grows too large and may eventually fill up all the disk space, we can use RFA (Rolling File Appender) instead to cap the size of each log file and the total number of retained log files.

Note: Per HIVE-11304, Hive upgraded from log4j 1.x to log4j2, so the previous article applies only to Hive 1.x.
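A hedged sketch of the corresponding hive-log4j2.properties changes is shown below; the appender name, file size, and retained-file count are example values, not the template's exact settings:

appender.RFA.type = RollingRandomAccessFile
appender.RFA.name = RFA
appender.RFA.fileName = ${sys:hive.log.dir}/${sys:hive.log.file}
appender.RFA.filePattern = ${sys:hive.log.dir}/${sys:hive.log.file}.%i
appender.RFA.layout.type = PatternLayout
appender.RFA.layout.pattern = %d{ISO8601} %5p [%t] %c{2}: %m%n
appender.RFA.policies.type = Policies
appender.RFA.policies.size.type = SizeBasedTriggeringPolicy
appender.RFA.policies.size.size = 256MB
appender.RFA.strategy.type = DefaultRolloverStrategy
appender.RFA.strategy.max = 10

# Point the root logger at the new appender
rootLogger.appenderRef.root.ref = RFA

With these example settings, each log file is capped at 256 MB and at most 10 rolled files are kept.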
