Before adding Sentry, complete the general prerequisites below.
Some of these may already be mentioned in the problem description.
Confirm the Hive warehouse directory (the hive.metastore.warehouse.dir property) in the /etc/hive/conf/hive-site.xml file.
The Hive warehouse directory (/user/hive/warehouse) must be owned by the Hive user and group and should have 771 permissions.
# sudo -u hdfs hadoop fs -chown -R hive:hive /user/hive/warehouse
# sudo -u hdfs hadoop fs -chmod -R 771 /user/hive/warehouse
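To double-check this prerequisite from a shell, something like the sketch below may help. It assumes hive-site.xml uses the usual <name>/<value> layout and that the listing comes from "hadoop fs -ls -d"; the function names warehouse_dir and listing_ok are made up for this example.

```shell
#!/bin/sh
# Print the value of hive.metastore.warehouse.dir from the hive-site.xml
# file given as $1. Works when <name> and <value> share a line or sit on
# consecutive lines.
warehouse_dir() {
  awk '
    /<name>hive\.metastore\.warehouse\.dir<\/name>/ { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}

# Succeed only if the "hadoop fs -ls -d <dir>" output line passed as $1
# shows a directory with mode 771 owned by hive:hive. A trailing "+"
# (HDFS ACLs present) is tolerated.
listing_ok() {
  printf '%s\n' "$1" | awk '
    $1 ~ /^drwxrwx--x[+]?$/ && $3 == "hive" && $4 == "hive" { ok = 1 }
    END { exit !ok }
  '
}

# Example use on a node with an HDFS client (not executed here):
#   dir=$(warehouse_dir /etc/hive/conf/hive-site.xml)
#   listing_ok "$(sudo -u hdfs hadoop fs -ls -d "$dir")" && echo "warehouse OK"
```

If the directory reported by warehouse_dir is not /user/hive/warehouse, use that path in the chown and chmod commands instead.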
Disable impersonation for HiveServer2:
CM – Hive – Configuration – Category:Main – HiveServer2 Enable Impersonation
Uncheck the HiveServer2 Enable Impersonation property – save changes.
Enable Hive user to submit jobs:
CM – YARN – configuration – search ‘Allowed system users’
Ensure the Allowed System Users property includes the hive user. If not, add hive.
Save changes.
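If you want to confirm the setting from a NodeManager host, a rough check could look like the sketch below. It assumes the property ends up as an allowed.system.users line in the container-executor.cfg generated for YARN; the location of that file varies, so the path is passed in as an argument. The function name user_allowed is made up for this example.

```shell
#!/bin/sh
# Succeed if user $2 appears in the comma-separated allowed.system.users
# list of the container-executor.cfg file given as $1.
user_allowed() {
  grep -E '^allowed\.system\.users=' "$1" |
    cut -d= -f2 |
    tr -d ' ' |
    tr ',' '\n' |
    grep -qxF "$2"
}

# Example use (not executed here; the cfg path is an assumption):
#   user_allowed /path/to/container-executor.cfg hive || echo "add hive in CM"
```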
IMPORTANT – Disable Sentry Policy Files:
CM – Hive – configuration – search ‘sentry’ –
uncheck “Enable Sentry Authorization using Policy Files” – save changes.
CM – Impala – configuration – search ‘sentry’ –
uncheck “Enable Sentry Authorization using Policy Files” – save changes.
To install Sentry:
Go to CM – Cluster dropdown – add service – select Sentry – continue
Customize role assignments for Sentry service and gateway
Provide database details as given in the problem – test connection – Finish
If database details are not given, select the embedded database – Finish
Confirm Sentry service is up and running.
To configure Sentry:
Configuration will most likely mean enabling the Sentry service for Hive, Impala and Hue.
Go to CM – Hive – configuration – search ‘sentry’ – Enable Sentry service
Ensure prerequisites are done for hive.
CM – Impala – configuration – search ‘sentry’ – Enable Sentry service
CM – Hue – configuration – search ‘sentry’ – Enable Sentry service
Add the Hive, Impala and Hue Groups to Sentry’s Admin Group:
CM – Sentry – configuration – Category:Main – search “Admin groups”
Add hive, impala, hue in the admin groups property – save changes.
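As a quick sanity check on the admin groups value, a small helper like the one below can list whichever groups are still missing. It is only a sketch: it assumes you paste in the comma-separated value shown in CM (stored as sentry.service.admin.group in sentry-site.xml), and the function name missing_admin_groups is made up for this example.

```shell
#!/bin/sh
# Print each required group that is absent from a comma-separated
# admin-groups value. $1: current value; remaining args: required groups.
missing_admin_groups() {
  current=$(printf '%s' "$1" | tr -d ' ')
  shift
  for g in "$@"; do
    case ",$current," in
      *",$g,"*) ;;                 # group already present
      *) printf '%s\n' "$g" ;;     # group missing
    esac
  done
}

# Example use:
#   missing_admin_groups "hive,impala" hive impala hue
```

Empty output means all of the required groups are already in the list.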
Problem Scenarios:
Add a sentry service to the cluster – customize roles assignment as given.
Add a sentry service to the cluster – customize roles assignment as given – troubleshoot any errors that occur while starting the service.
Add a sentry service to the cluster – customize roles assignment as given – configure it for hive.
That covers how to install and configure the Sentry service.
—
Can you please let me know on which nodes we need to run these commands? Is it on the nodes where Hive is installed?
# sudo -u hdfs hadoop fs -chown -R hive:hive /user/hive/warehouse
# sudo -u hdfs hadoop fs -chmod -R 771 /user/hive/warehouse
You can run them on any node (NN, DN, HDFS gateway), because the permissions are applied in HDFS, and HDFS is accessible from any node of the cluster.
Hi @Kannan,
You mean from any node of the cluster that has an HDFS client.
Very useful post.
BR
DF
Hi,
Yes, you're right. The node should have an HDFS role such as DataNode or HDFS gateway to access HDFS. I should have explained that clearly.
Thanks.
Hi @Kannan,
From any node that has an HDFS client 🙂
Very good post anyway.
Regards
DF
Hi Ali,
Thanks for pointing that out. You're right.
The node should have an HDFS role (DN or HDFS gateway) to access HDFS.
I'm assuming that if the Hadoop binaries are installed on a node and we manually copy the client configuration files, then we should be able to access HDFS without any roles assigned. I could be wrong; I still have to test it.
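A rough way to check whether a node looks like it has an HDFS client is sketched below. It only checks for a hadoop binary on the PATH plus core-site.xml and hdfs-site.xml in a client configuration directory (assumed to be /etc/hadoop/conf unless another path is passed); it does not prove that HDFS is actually reachable. The function name has_hdfs_client is made up for this example.

```shell
#!/bin/sh
# Succeed if a hadoop binary is on PATH and the client configuration
# directory ($1, default /etc/hadoop/conf) contains core-site.xml and
# hdfs-site.xml.
has_hdfs_client() {
  conf_dir=${1:-/etc/hadoop/conf}
  command -v hadoop >/dev/null 2>&1 &&
    [ -f "$conf_dir/core-site.xml" ] &&
    [ -f "$conf_dir/hdfs-site.xml" ]
}

# Example use (not executed here):
#   has_hdfs_client && sudo -u hdfs hadoop fs -ls -d /user/hive/warehouse
```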