1. Namenode fails with error "Login failure for hdfs/hdm.xxx.com@OPENKBINFO.COM from keytab /etc/security/phd/keytab/hdfs.service.keytab".
Error message in namenode log:
************************************************************/
2014-06-07 16:49:35,421 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2014-06-07 16:49:35,460 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2014-06-07 16:49:35,460 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
2014-06-07 16:49:36,024 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: Login failure for hdfs/hdm.xxx.com@OPENKBINFO.COM from keytab /etc/security/phd/keytab/hdfs.service.keytab
at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:835)
at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:283)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser(NameNode.java:423)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:434)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:609)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:594)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1169)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1235)
Caused by: javax.security.auth.login.LoginException: java.lang.ExceptionInInitializerError
at javax.crypto.SunJCE_h.<clinit>(DashoA13*..)
at javax.crypto.Cipher.c(DashoA13*..)
at javax.crypto.Cipher.getMaxAllowedKeyLength(DashoA13*..)
at sun.security.krb5.internal.crypto.EType.getBuiltInDefaults(EType.java:179)
at sun.security.krb5.internal.crypto.EType.isSupported(EType.java:261)
at sun.security.krb5.internal.ktab.KeyTab.readServiceKeys(KeyTab.java:263)
at sun.security.krb5.EncryptionKey.acquireSecretKeys(EncryptionKey.java:140)
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:635)
at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:542)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at javax.security.auth.login.LoginContext.invoke(LoginContext.java:769)
at javax.security.auth.login.LoginContext.access$000(LoginContext.java:186)
at javax.security.auth.login.LoginContext$5.run(LoginContext.java:706)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.LoginContext.invokeCreatorPriv(LoginContext.java:703)
at javax.security.auth.login.LoginContext.login(LoginContext.java:575)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:826)
at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:283)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser(NameNode.java:423)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:434)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:609)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:594)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1169)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1235)
Caused by: java.lang.SecurityException: Cannot set up certs for trusted CAs
at javax.crypto.SunJCE_b.<clinit>(DashoA13*..)
... 27 more
Caused by: java.lang.SecurityException: Jurisdiction policy files are not signed by trusted signers!
at javax.crypto.SunJCE_b.a(DashoA13*..)
at javax.crypto.SunJCE_b.i(DashoA13*..)
at javax.crypto.SunJCE_b.g(DashoA13*..)
at javax.crypto.SunJCE_b$1.run(DashoA13*..)
at java.security.AccessController.doPrivileged(Native Method)
... 28 more
at javax.security.auth.login.LoginContext.invoke(LoginContext.java:872)
at javax.security.auth.login.LoginContext.access$000(LoginContext.java:186)
at javax.security.auth.login.LoginContext$5.run(LoginContext.java:706)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.LoginContext.invokeCreatorPriv(LoginContext.java:703)
at javax.security.auth.login.LoginContext.login(LoginContext.java:575)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:826)
... 7 more
2014-06-07 16:49:36,029 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2014-06-07 16:49:36,031 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hdm.xxx.com/192.168.192.101
************************************************************/
Cause:
All hosts run JDK 1.6, which causes compatibility issues. The trace above ends in "Jurisdiction policy files are not signed by trusted signers!".
Fix:
Shut down the cluster, remove the old JDK 1.6, and install JDK 1.7 on all hosts.
rpm -e jdk-1.6.0
yum install java-1.7.0-openjdk
alternatives --config java
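After the reinstall it may be worth confirming that every host really resolves java to 1.7 before restarting the cluster. The following is only a rough sketch, assuming Python 3 is available on the hosts; the helper names parse_java_version, is_jdk7_or_later and local_java_banner are made up for illustration and are not part of Hadoop or the JDK.

```python
import re
import subprocess

# Matches the quoted version in banners such as: java version "1.7.0_55"
_VERSION_RE = re.compile(r'version "(\d+)(?:\.(\d+))?')


def parse_java_version(banner):
    """Return (major, minor) parsed from `java -version` output, or None."""
    match = _VERSION_RE.search(banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def is_jdk7_or_later(banner):
    """True if the banner reports 1.7 or anything newer."""
    version = parse_java_version(banner)
    return version is not None and version >= (1, 7)


def local_java_banner():
    """Run `java -version` on this host; the JVM writes the banner to stderr."""
    result = subprocess.run(
        ["java", "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stderr
```

On a host, is_jdk7_or_later(local_java_banner()) should come back True once alternatives points at the new JDK.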
2. Namenode fails with error "Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled".
Error message in namenode log:
2014-06-07 20:19:16,982 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8020: readAndProcess threw exception javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)] from client 192.168.192.101. Count of bytes read: 0
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)]
at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:177)
at org.apache.hadoop.ipc.Server$Connection.saslReadAndProcess(Server.java:1173)
at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1350)
at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:726)
at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:525)
at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:500)
Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism level: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled)
at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:788)
at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342)
at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285)
at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:155)
... 5 more
Caused by: KrbException: Encryption type AES256 CTS mode with HMAC SHA1-96 is not supported/enabled
at sun.security.krb5.EncryptionKey.findKey(EncryptionKey.java:552)
at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:270)
at sun.security.krb5.KrbApReq.<init>(KrbApReq.java:144)
at sun.security.jgss.krb5.InitSecContextToken.<init>(InitSecContextToken.java:108)
at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:771)
... 8 more
Cause:
The JCE jar files went missing when JDK 1.6 was upgraded to JDK 1.7 (issue 1 above).
Fix:
Follow step <2. Install JCE on all Cluster Hosts> in Installing the MIT Kerberos 5 KDC.
3. Namenode fails with error "javax.security.auth.login.LoginException: No key to store".
Error message in namenode log:
Caused by: javax.servlet.ServletException: javax.security.auth.login.LoginException: No key to store
at org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.init(KerberosAuthenticationHandler.java:185)
at org.apache.hadoop.security.authentication.server.AuthenticationFilter.init(AuthenticationFilter.java:146)
at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:107)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:707)
at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:254)
at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1240)
at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:689)
at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:482)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at org.eclipse.jetty.server.handler.HandlerCollection.doStart(HandlerCollection.java:229)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:172)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:95)
at org.eclipse.jetty.server.Server.doStart(Server.java:279)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at org.apache.hadoop.http.HttpServer.start(HttpServer.java:682)
... 8 more
Caused by: javax.security.auth.login.LoginException: No key to store
at com.sun.security.auth.module.Krb5LoginModule.commit(Krb5LoginModule.java:1072)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at javax.security.auth.login.LoginContext.invoke(LoginContext.java:762)
at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:690)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:688)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:687)
at javax.security.auth.login.LoginContext.login(LoginContext.java:596)
at org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.init(KerberosAuthenticationHandler.java:169)
... 24 more
2014-06-07 21:11:33,511 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2014-06-07 21:11:33,512 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
Cause:
A likely cause is that the /tmp/krb* cache files became corrupted or incompatible after I fixed issue 1 and issue 2 above.
Fix:
Remove the /tmp/krb* cache files on all hosts, then restart the namenode. After that it started normally.
rm -f /tmp/krb*
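Because the glob deletes whatever happens to match, listing the matches first can be a useful sanity check. Below is a small sketch of a dry-run variant, assuming Python 3 on the host; find_krb_cache_files and remove_krb_cache_files are hypothetical helper names, and the code only touches regular files, much like rm -f without -r.

```python
import glob
import os


def find_krb_cache_files(tmp_dir="/tmp"):
    """Return regular files in tmp_dir whose names start with 'krb', sorted."""
    pattern = os.path.join(tmp_dir, "krb*")
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))


def remove_krb_cache_files(tmp_dir="/tmp", dry_run=True):
    """Return the matched paths; delete them only when dry_run is False."""
    paths = find_krb_cache_files(tmp_dir)
    if not dry_run:
        for path in paths:
            os.remove(path)
    return paths
```

Calling remove_krb_cache_files() with the defaults only reports what would be removed; passing dry_run=False performs the deletion.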
4. Node manager fails with error "Couldn't setup connection for yarn/hdw1.xxx.com@OPENKBINFO.COM to null"
Error message in node manager log:
2014-06-08 12:41:02,890 INFO org.apache.hadoop.yarn.service.AbstractService: Service:httpshuffle is started.
2014-06-08 12:41:02,891 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices is started.
2014-06-08 12:41:02,891 INFO org.apache.hadoop.yarn.service.AbstractService: Service:containers-monitor is started.
2014-06-08 12:41:02,891 INFO org.apache.hadoop.yarn.service.AbstractService: Service:Dispatcher is started.
2014-06-08 12:41:03,240 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:yarn/hdw1.xxx.com@OPENKBINFO.COM (auth:KERBEROS) cause:java.io.IOException: Failed to specify server's Kerberos principal name
2014-06-08 12:41:03,326 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:yarn/hdw1.xxx.com@OPENKBINFO.COM (auth:KERBEROS) cause:java.io.IOException: Failed to specify server's Kerberos principal name
2014-06-08 12:41:06,708 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:yarn/hdw1.xxx.com@OPENKBINFO.COM (auth:KERBEROS) cause:java.io.IOException: Failed to specify server's Kerberos principal name
2014-06-08 12:41:07,635 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:yarn/hdw1.xxx.com@OPENKBINFO.COM (auth:KERBEROS) cause:java.io.IOException: Failed to specify server's Kerberos principal name
2014-06-08 12:41:08,376 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:yarn/hdw1.xxx.com@OPENKBINFO.COM (auth:KERBEROS) cause:java.io.IOException: Failed to specify server's Kerberos principal name
2014-06-08 12:41:08,697 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:yarn/hdw1.xxx.com@OPENKBINFO.COM (auth:KERBEROS) cause:java.io.IOException: Failed to specify server's Kerberos principal name
2014-06-08 12:41:08,697 WARN org.apache.hadoop.ipc.Client: Couldn't setup connection for yarn/hdw1.xxx.com@OPENKBINFO.COM to null
Cause:
On the node managers, "yarn.resourcemanager.principal" and "yarn.resourcemanager.keytab" are missing from yarn-site.xml.
Fix:
Add the entries below to yarn-site.xml on all node managers.
<!-- resource manager secure configuration info -->
<property>
  <name>yarn.resourcemanager.principal</name>
  <value>yarn/_HOST@OPENKBINFO.COM</value>
</property>
<property>
  <name>yarn.resourcemanager.keytab</name>
  <value>/etc/security/phd/keytab/yarn.service.keytab</value>
</property>
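With many node managers it is easy to miss one host, so a quick check of each yarn-site.xml can help. The sketch below assumes Python 3 and the usual Hadoop layout of property elements carrying name and value children; missing_properties is a hypothetical helper, and it takes the file contents as a string because the location of yarn-site.xml depends on the installation.

```python
import xml.etree.ElementTree as ET

# The two properties named in the fix above.
REQUIRED = ("yarn.resourcemanager.principal", "yarn.resourcemanager.keytab")


def missing_properties(xml_text, required=REQUIRED):
    """Return the required property names that are absent or have an empty value."""
    root = ET.fromstring(xml_text)
    present = set()
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name and value:
            present.add(name)
    return [name for name in required if name not in present]
```

An empty list means both entries are in place on that host; anything else names what still needs to be added.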
5. YARN resource/node manager process is up but its log shows error "User yarn/hdw1.xxx.com@OPENKBINFO.COM (auth:KERBEROS) is not authorized for protocol interface org.apache.hadoop.yarn.server.api.ResourceTrackerPB, expected client Kerberos principal is null"
Error message in resource/node manager log:
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User yarn/hdw1.xxx.com@OPENKBINFO.COM (auth:KERBEROS) is not authorized for protocol interface org.apache.hadoop.yarn.server.api.ResourceTrackerPB, expected client Kerberos principal is null
at org.apache.hadoop.ipc.Client.call(Client.java:1235)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at com.sun.proxy.$Proxy26.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
... 6 more
Cause:
This is a known issue, tracked as HADOOP-9444.
Fix:
In hadoop-policy.xml, replace all occurrences of ${HADOOP_HDFS_USER} and ${HADOOP_YARN_USER} with *.
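The replacement itself is a plain text substitution, which could be scripted roughly as follows. This is a sketch assuming Python 3; open_up_policy is a hypothetical helper that works on the file contents rather than a path, since the location of hadoop-policy.xml depends on the installation.

```python
# The two placeholders named in the fix above.
PLACEHOLDERS = ("${HADOOP_HDFS_USER}", "${HADOOP_YARN_USER}")


def open_up_policy(policy_text, placeholders=PLACEHOLDERS):
    """Return (new_text, count): each placeholder replaced by '*', plus how many were replaced."""
    count = 0
    for placeholder in placeholders:
        count += policy_text.count(placeholder)
        policy_text = policy_text.replace(placeholder, "*")
    return policy_text, count
```

A count of zero on a second run is a cheap way to confirm that no placeholder is left in the file.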