Thursday, March 19, 2015

How to enable Hive Default Authorization

Env:

Hive 0.12
MapR 4.0.2

Reference:

Hive Default Authorization - Legacy Mode

Goal:

SQL Standards Based Authorization was introduced in HiveServer2 in Hive 0.13 to enable fine-grained access control. On Hive 0.12 and older versions, Hive Default Authorization is available instead. This mode does not have a complete access control model and leaves many security gaps unaddressed.
This article shows how to enable Hive Default Authorization in HiveServer2 and explains how it behaves with and without impersonation.

Solution:

1. Enable Hive Default Authorization.

Add the following properties to hive-site.xml, then restart HiveServer2 and the Hive Metastore.
<property>
  <name>hive.security.authorization.enabled</name>
  <value>true</value>
  <description>enable or disable the hive client authorization</description>
</property>

<property>
  <name>hive.security.authorization.createtable.owner.grants</name>
  <value>ALL</value>
  <description>the privileges automatically granted to the owner whenever a table gets created. 
   An example like "select,drop" will grant select and drop privilege to the owner of the table</description>
</property>
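
One way to double-check the edit before restarting the services is to parse hive-site.xml and print the two properties. The following is a minimal sketch in Python and is not part of Hive. The helper name read_hive_properties and the inline sample are assumptions made for illustration; in practice you would pass in the contents of your own hive-site.xml.

```python
import xml.etree.ElementTree as ET

# Sample content mirroring the properties above; in practice, read your own
# hive-site.xml instead (its location depends on the installation).
SAMPLE_HIVE_SITE = """
<configuration>
  <property>
    <name>hive.security.authorization.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.security.authorization.createtable.owner.grants</name>
    <value>ALL</value>
  </property>
</configuration>
"""

def read_hive_properties(xml_text):
    """Return a {name: value} dict for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text.strip())
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props

props = read_hive_properties(SAMPLE_HIVE_SITE)
for key in ("hive.security.authorization.enabled",
            "hive.security.authorization.createtable.owner.grants"):
    print(key, "=", props.get(key, "<not set>"))
```

A property reported as "<not set>" means the file does not define it, so Hive would fall back to its built-in default.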

2. Behaviors when Hive impersonation is disabled.

In this Hive 0.12 environment, hive.server2.enable.doAs=false by default, which means queries run as the OS user that owns the HiveServer2 process.
For example, on the MapR platform HiveServer2 runs as the "mapr" OS user.
a. testuser1 logs in to beeline and creates a database db1, but cannot create a table inside db1.
hive --service beeline
beeline> !connect jdbc:hive2://localhost:10000 testuser1 testuser1 org.apache.hive.jdbc.HiveDriver

0: jdbc:hive2://localhost:10000> create database db1;
No rows affected (0.071 seconds)
0: jdbc:hive2://localhost:10000> use db1;
No rows affected (0.047 seconds)
0: jdbc:hive2://localhost:10000> create table db1.tableuser1(id int);
Error: Error while compiling statement: Authorization failed:No privilege 'Create' found for outputs { database:db1}. Use show grant to get more details. (state=,code=403)
b. Granting privileges to testuser1 does not help; granting them to the mapr user does.
0: jdbc:hive2://localhost:10000> grant all on database db1 to user testuser1 ;
No rows affected (0.065 seconds)
0: jdbc:hive2://localhost:10000> create table db1.tableuser1(id int);
Error: Error while compiling statement: Authorization failed:No privilege 'Create' found for outputs { database:db1}. Use show grant to get more details. (state=,code=403)

0: jdbc:hive2://localhost:10000> grant all on database db1 to user mapr ;
No rows affected (0.092 seconds)
0: jdbc:hive2://localhost:10000> create table db1.tableuser1(id int);
No rows affected (0.105 seconds)
This is because hive.server2.enable.doAs=false, so all tables and databases are created by the "mapr" OS user.
c. testuser2 logs in to beeline and has all privileges on db1.
hive --service beeline
beeline> !connect jdbc:hive2://localhost:10000 testuser2 testuser2 org.apache.hive.jdbc.HiveDriver

0: jdbc:hive2://localhost:10000> use db1;
No rows affected (0.11 seconds)
0: jdbc:hive2://localhost:10000> show tables;
[HiveQueryResultSet/next] 0
+-------------+
|  tab_name   |
+-------------+
| tableuser1  |
+-------------+
1 row selected (0.275 seconds)
0: jdbc:hive2://localhost:10000> create table db1.tableuser2(id int);
No rows affected (0.101 seconds)
0: jdbc:hive2://localhost:10000> show tables;
[HiveQueryResultSet/next] 0
+-------------+
|  tab_name   |
+-------------+
| tableuser1  |
| tableuser2  |
+-------------+
2 rows selected (0.212 seconds)
0: jdbc:hive2://localhost:10000> drop table db1.tableuser1;
No rows affected (0.158 seconds)
0: jdbc:hive2://localhost:10000> show tables;
[HiveQueryResultSet/next] 0
+-------------+
|  tab_name   |
+-------------+
| tableuser2  |
+-------------+
1 row selected (0.16 seconds)
This is because all operations are performed as the "mapr" user, which already holds the privileges on database "db1".
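
The behavior in this section can be summarized with a toy model: the privilege check is made against the effective user, and with impersonation disabled the effective user is always the HiveServer2 process owner. The sketch below is plain Python written for illustration only. The function names and the grants set are hypothetical, the model only looks at database-level grants, and none of it corresponds to Hive internals.

```python
SERVICE_USER = "mapr"  # OS user that runs HiveServer2 in this article

def effective_user(connected_user, do_as):
    """User whose privileges are checked: the caller only when doAs is on."""
    return connected_user if do_as else SERVICE_USER

def is_authorized(grants, connected_user, do_as, privilege, database):
    """True if the effective user holds the privilege (or ALL) on the database."""
    user = effective_user(connected_user, do_as)
    return (user, privilege, database) in grants or (user, "ALL", database) in grants

grants = {("testuser1", "ALL", "db1")}  # step b: grant to testuser1
print(is_authorized(grants, "testuser1", False, "CREATE", "db1"))  # False

grants.add(("mapr", "ALL", "db1"))  # step b: grant to mapr
print(is_authorized(grants, "testuser1", False, "CREATE", "db1"))  # True
print(is_authorized(grants, "testuser2", False, "DROP", "db1"))  # True, as in step c
```

Passing do_as=True in the same model makes the check depend on the connected user.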

3. Behaviors when Hive impersonation is enabled.

First, enable Hive impersonation by following this documentation.
HiveServer2 then processes each query as the user who submitted it.
a. testuser1 logs in to beeline and creates a database db1; only testuser1 needs to be granted privileges in order to create tables.
0: jdbc:hive2://localhost:10000> create database db1;
No rows affected (1.082 seconds)
0: jdbc:hive2://localhost:10000> use db1;
No rows affected (0.119 seconds)
0: jdbc:hive2://localhost:10000> create table db1.tableuser1(id int);
Error: Error while compiling statement: Authorization failed:No privilege 'Create' found for outputs { database:db1}. Use show grant to get more details. (state=,code=403)
0: jdbc:hive2://localhost:10000> grant all on database db1 to user testuser1 ;
No rows affected (0.133 seconds)
0: jdbc:hive2://localhost:10000> create table db1.tableuser1(id int);
No rows affected (0.657 seconds)
Confirm that Hive impersonation is enabled by checking the ownership of the table and database directories.
[warehouse]# ls -altr db1.db
total 1
drwxrwxrwx. 13 root      root      11 Mar 19 15:16 ..
drwxr-xr-x.  2 testuser1 testuser1  0 Mar 19 15:17 tableuser1
drwxr-xr-x.  3 testuser1 testuser1  1 Mar 19 15:17 . 
b. testuser2 logs in to beeline and can list the tables, but has no privileges on the tables or the database.
0: jdbc:hive2://localhost:10000> use db1;
No rows affected (0.131 seconds)
0: jdbc:hive2://localhost:10000> show tables;
[HiveQueryResultSet/next] 0
+-------------+
|  tab_name   |
+-------------+
| tableuser1  |
+-------------+
1 row selected (0.757 seconds)
0: jdbc:hive2://localhost:10000> select * from tableuser1;
Error: Error while compiling statement: Authorization failed:No privilege 'Select' found for inputs { database:db1, table:tableuser1, columnName:id}. Use show grant to get more details. (state=,code=403)
0: jdbc:hive2://localhost:10000> drop table tableuser1 ;
Error: Error while compiling statement: Authorization failed:No privilege 'Drop' found for outputs { database:db1, table:tableuser1}. Use show grant to get more details. (state=,code=403)
0: jdbc:hive2://localhost:10000> create table db1.tableuser2(id int);
Error: Error while compiling statement: Authorization failed:No privilege 'Create' found for outputs { database:db1}. Use show grant to get more details. (state=,code=403)
c. testuser1 grants CREATE privilege to testuser2.
0: jdbc:hive2://localhost:10000> grant create on database db1 to user testuser2 ;
No rows affected (0.088 seconds)
d. testuser2 still cannot create tables, because the "db1" directory is owned by "testuser1" on the filesystem and is not writable by other users.
0: jdbc:hive2://localhost:10000> create table db1.tableuser2(id int);
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: java.io.IOException Error: Permission denied(13), file: tableuser2, user name: testuser2, ID: 2008) (state=08S01,code=1)
So the system administrator must also grant users the required permissions at the filesystem level.
e. Add the OS user "testuser2" to the group "testuser1" and chmod the "db1" directory to 775 on HDFS.
# usermod -a -G testuser1 testuser2
# id testuser2
uid=2008(testuser2) gid=2008(testuser2) groups=2008(testuser2),2007(testuser1)

[warehouse]# chmod 775 db1.db
f. Now testuser2 can create tables inside db1.
0: jdbc:hive2://localhost:10000> create table db1.tableuser2(id int);
No rows affected (0.113 seconds)

[warehouse]# ls -altr db1.db
total 1
drwxrwxrwx. 13 root      root      11 Mar 19 15:16 ..
drwxr-xr-x.  2 testuser1 testuser1  0 Mar 19 15:17 tableuser1
drwxr-xr-x.  2 testuser2 testuser2  0 Mar 19 15:30 tableuser2
drwxrwxr-x.  4 testuser1 testuser1  2 Mar 19 15:30 .
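
Steps d through f come down to an ordinary POSIX-style write check on the db1.db directory, separate from any Hive grant. The Python sketch below models that check for the modes seen in the listings. can_write is a hypothetical helper written for this illustration, and it ignores ACLs, superusers, and any filesystem-specific rules.

```python
import stat

def can_write(mode, owner, group, user, user_groups):
    """POSIX-style check: use the owner, group, or other write bit, in that order."""
    if user == owner:
        return bool(mode & stat.S_IWUSR)
    if group in user_groups:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)

# Step d: db1.db is drwxr-xr-x and testuser2 is only in its own group.
print(can_write(0o755, "testuser1", "testuser1", "testuser2", {"testuser2"}))  # False

# Step f: after usermod and chmod 775, the group write bit applies.
print(can_write(0o775, "testuser1", "testuser1", "testuser2",
                {"testuser2", "testuser1"}))  # True
```

In this model neither change is enough on its own: group membership without the group write bit, or the group write bit without membership, still fails the check.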

Takeaways:

1. Hive Default Authorization is useful only when Hive impersonation is enabled; otherwise it only controls the user that the HiveServer2 process runs as.
2. Hive Default Authorization does not control filesystem permissions, so a system administrator must also make sure users have the proper permissions on the filesystem.
