Tagged: Presto
-
Wang
[Presto] Secure with LDAP
For security reasons we decided to enable LDAP in Presto. To deploy Presto into a Kubernetes cluster we build the Presto image ourselves, including the Kerberos authentication and LDAP configuration.
As the image's directory structure shows, the configuration files under etc and etc/catalog are very important, so please pay attention to them.
krb5.conf and xxx.keytab are used to connect to Kerberos.
password-authenticator.properties and ldap_server.pem under etc, plus hive.properties and hive-security.json under etc/catalog, are used to connect to LDAP and enforce authorization.
password-authenticator.properties
password-authenticator.name=ldap
ldap.url=ldaps://<IP>:<PORT>
ldap.user-bind-pattern=xxxxxx
ldap.user-base-dn=xxxxxx
hive.properties
connector.name=hive-hadoop2
hive.security=file
security.config-file=<hive-security.json>
hive.metastore.authentication.type=KERBEROS
hive.metastore.uri=thrift://<IP>:<PORT>
hive.metastore.service.principal=<SERVER-PRINCIPAL>
hive.metastore.client.principal=<CLIENT-PRINCIPAL>
hive.metastore.client.keytab=<KEYTAB>
hive.config.resources=core-site.xml,hdfs-site.xml
hive-security.json
{
  "schemas": [
    { "user": "user_1", "schema": "db_1", "owner": false },
    { "user": " ", "schema": "db_1", "owner": false },
    { "user": "user_2", "schema": "db_2", "owner": false }
  ],
  "tables": [
    { "user": "user_1", "schema": "db_1", "table": "table_1", "privileges": ["SELECT"] },
    { "user": "user_1", "schema": "db_1", "table": "table_2", "privileges": ["SELECT"] },
    { "user": "user_2", "schema": "db_1", "table": ".*", "privileges": ["SELECT"] },
    { "user": "user_2", "schema": "db_2", "table": "table_1", "privileges": ["SELECT"] },
    { "user": "user_2", "schema": "db_2", "table": "table_2", "privileges": ["SELECT"] }
  ],
  "sessionProperties": [
    { "allow": false }
  ]
}
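The fragments above can be dropped into place with heredocs, in the style used elsewhere in this series. This is a minimal sketch, not the full configuration: the <...> values are placeholders, the JSON rules are trimmed to one example entry, and the final validation step is an assumption of convenience (Presto will fail at startup on malformed JSON, so checking the file first saves a restart cycle).

```shell
# Sketch: create the LDAP/authorization files; <...> values are placeholders.
mkdir -p etc/catalog

cat << 'EOF' > etc/password-authenticator.properties
password-authenticator.name=ldap
ldap.url=ldaps://<IP>:<PORT>
ldap.user-bind-pattern=<BIND-PATTERN>
ldap.user-base-dn=<BASE-DN>
EOF

cat << 'EOF' > etc/catalog/hive-security.json
{
  "schemas": [
    { "user": "user_1", "schema": "db_1", "owner": false }
  ],
  "tables": [
    { "user": "user_1", "schema": "db_1", "table": "table_1", "privileges": ["SELECT"] }
  ],
  "sessionProperties": [
    { "allow": false }
  ]
}
EOF

# Validate the JSON before restarting the server; a malformed file
# makes the hive catalog fail to load.
python3 -m json.tool etc/catalog/hive-security.json > /dev/null \
  && echo "hive-security.json is valid JSON"
```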
-
Wang
[Presto] Kerberos trouble shooting
When I configured a Presto cluster to connect to Hive via Kerberos, I ran into several problems that cost me a lot of time to solve, so I have summarized them here in the hope that they help others.
1. Append -Djava.security.krb5.conf=<krb5.conf location> to etc/jvm.config; otherwise you may see:
8) Error in custom provider, java.lang.NoClassDefFoundError: Could not initialize class com.facebook.presto.hive.authentication.KerberosHadoopAuthentication
  at com.facebook.presto.hive.authentication.AuthenticationModules$1.createHadoopAuthentication(AuthenticationModules.java:59)
  (via modules: com.facebook.presto.hive.authentication.HiveAuthenticationModule -> io.airlift.configuration.ConditionalModule -> com.facebook.presto.hive.authentication.AuthenticationModules$1)
  while locating com.facebook.presto.hive.authentication.HadoopAuthentication annotated with @com.facebook.presto.hive.ForHiveMetastore()
  for the 2nd parameter of com.facebook.presto.hive.authentication.KerberosHiveMetastoreAuthentication.<init>(KerberosHiveMetastoreAuthentication.java:44)
  ... ...
2. Specify hdfs-site.xml/core-site.xml in hive.properties, e.g. hive.config.resources=xxx/core-site.xml,xxx/hdfs-site.xml; otherwise you may see:
Query 20180504_150148_00018_v6ndf failed: java.net.UnknownHostException: xxx
3. Download the hadoop-lzo jar into plugin/hive-hadoop2; otherwise you may see:
Query 20180504_150959_00002_3f2qe failed: Unable to create input format org.apache.hadoop.mapred.TextInputFormat
Caused by: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
  at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:139)
  at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:180)
  at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
  ... 19 more
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
  at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
  at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:132)
  ... 21 more
4. Export KRB5_CONFIG and get a Kerberos TGT with the kinit command; otherwise you may see:
Query 20180504_153940_00000_nrsgy failed: Failed to list directory: hdfs://xxx/user/hive/warehouse/xxx.db/xxx
5. Make sure there is only one coordinator in the cluster; with more than one, workers keep switching coordinator affinity:
2018-05-04T18:10:56.410Z WARN http-worker-4560 com.facebook.presto.execution.SqlTaskManager Switching coordinator affinity from hhbts to qhnep
2018-05-04T18:10:56.500Z WARN http-worker-4560 com.facebook.presto.execution.SqlTaskManager Switching coordinator affinity from qhnep to c83wr
2018-05-04T18:10:56.578Z WARN http-worker-4395 com.facebook.presto.execution.SqlTaskManager Switching coordinator affinity from c83wr to ujj9n
2018-05-04T18:10:56.749Z WARN http-worker-4432 com.facebook.presto.execution.SqlTaskManager Switching coordinator affinity from ujj9n to wdsxf
2018-05-04T18:10:57.009Z WARN http-worker-4584 com.facebook.presto.execution.SqlTaskManager Switching coordinator affinity from wdsxf to hhbts
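Items 1-4 can be re-checked before a restart with a small preflight script. This is a sketch under stated assumptions: PRESTO_HOME points at the Presto installation directory and the standard etc/plugin layout is in use; adjust paths to your environment.

```shell
#!/bin/sh
# Hypothetical preflight checks for the Kerberos pitfalls above.
# PRESTO_HOME is assumed to be the Presto install dir (defaults to ".").
PRESTO_HOME=${PRESTO_HOME:-.}

preflight() {
  # 1. krb5.conf wired into the JVM options
  grep -q 'java.security.krb5.conf' "$PRESTO_HOME/etc/jvm.config" 2>/dev/null \
    && echo "OK: krb5.conf set in etc/jvm.config" \
    || echo "MISSING: -Djava.security.krb5.conf in etc/jvm.config"
  # 2. hdfs-site.xml/core-site.xml referenced by the hive catalog
  grep -q 'hive.config.resources' "$PRESTO_HOME/etc/catalog/hive.properties" 2>/dev/null \
    && echo "OK: hive.config.resources set" \
    || echo "MISSING: hive.config.resources in etc/catalog/hive.properties"
  # 3. hadoop-lzo jar present in the hive plugin directory
  ls "$PRESTO_HOME"/plugin/hive-hadoop2/hadoop-lzo-*.jar > /dev/null 2>&1 \
    && echo "OK: hadoop-lzo jar present" \
    || echo "MISSING: hadoop-lzo jar in plugin/hive-hadoop2"
  # 4. a valid (non-expired) Kerberos ticket in the credential cache
  klist -s 2>/dev/null \
    && echo "OK: valid Kerberos ticket" \
    || echo "MISSING: no valid ticket, run kinit"
}

preflight
```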
-
Wang
[Presto] Connect hive by kerberos
For data security, Hadoop clusters usually implement some security mechanism; the most commonly used is Kerberos. Recently I tested how to connect Presto to Hive via Kerberos.
1. Add krb5.conf, the keytab, hdfs-site.xml and core-site.xml to every node.
2. Modify etc/jvm.config, appending -Djava.security.krb5.conf=<krb5.conf location>
3. Create hive.properties under etc/catalog
cat << 'EOF' > etc/catalog/hive.properties
connector.name=hive-hadoop2
hive.metastore.uri=thrift://xxx:9083
hive.metastore.authentication.type=KERBEROS
hive.metastore.service.principal=xxx@xxx.com
hive.metastore.client.principal=xxx@xxx.com
hive.metastore.client.keytab=<keytab location>
hive.config.resources=<core-site.xml location>,<hdfs-site.xml location>
EOF
4. Download the hadoop-lzo jar into plugin/hive-hadoop2
wget http://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.16/hadoop-lzo-0.4.16.jar -O plugin/hive-hadoop2/hadoop-lzo-0.4.16.jar
5. Get the principal's TGT
export KRB5_CONFIG=<krb5.conf location>
kinit -kt <keytab location> xxx@xxx.com
6. Restart Presto
bin/launcher restart
-
Wang
[Presto] Build pseudo cluster
Presto is a distributed query engine developed by Facebook; for its concepts and advantages, please refer to the official documentation. Below are the steps I used to build a pseudo cluster on my Mac.
1. Download Presto
wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.196/presto-server-0.196.tar.gz
tar -zvxf presto-server-0.196.tar.gz && cd presto-server-0.196
2. Create the configuration files
mkdir etc
cat << 'EOF' > etc/jvm.config
-server
-Xmx16G
-Xms16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
EOF
cat << 'EOF' > etc/log.properties
com.facebook.presto=INFO
EOF
cat << 'EOF' > etc/config1.properties
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8001
query.max-memory=24GB
query.max-memory-per-node=8GB
discovery-server.enabled=true
discovery.uri=http://localhost:8001
EOF
cat << 'EOF' > etc/config2.properties
coordinator=false
http-server.http.port=8002
query.max-memory=24GB
query.max-memory-per-node=8GB
discovery.uri=http://localhost:8001
EOF
cat << 'EOF' > etc/config3.properties
coordinator=false
http-server.http.port=8003
query.max-memory=24GB
query.max-memory-per-node=8GB
discovery.uri=http://localhost:8001
EOF
cat << 'EOF' > etc/node1.properties
node.environment=test
node.id=671d18f9-dd0f-412d-b18c-fe6d7989b040
node.data-dir=/usr/local/Cellar/presto/0.196/data/node1
EOF
cat << 'EOF' > etc/node2.properties
node.environment=test
node.id=e72fdd91-a135-4936-9a3e-f888c5106ed9
node.data-dir=/usr/local/Cellar/presto/0.196/data/node2
EOF
cat << 'EOF' > etc/node3.properties
node.environment=test
node.id=6ab76715-1812-4093-95cf-1945f4cfefe3
node.data-dir=/usr/local/Cellar/presto/0.196/data/node3
EOF
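The repeated per-node heredocs can equivalently be generated in a loop. A sketch, assuming uuidgen (or /proc/sys/kernel/random/uuid on Linux) is available; it makes node 1 the only coordinator, which avoids the coordinator-affinity problem described in the troubleshooting post, and gives each node the unique node.id it requires.

```shell
# Sketch: generate config/node files for a 3-node pseudo cluster in a loop.
mkdir -p etc
for i in 1 2 3; do
  # Only node 1 is the coordinator and runs the discovery server.
  if [ "$i" -eq 1 ]; then
    extra="coordinator=true
node-scheduler.include-coordinator=true
discovery-server.enabled=true"
  else
    extra="coordinator=false"
  fi
  cat > "etc/config$i.properties" << EOF
$extra
http-server.http.port=800$i
query.max-memory=24GB
query.max-memory-per-node=8GB
discovery.uri=http://localhost:8001
EOF
  # node.id must be unique per node; use a freshly generated UUID.
  cat > "etc/node$i.properties" << EOF
node.environment=test
node.id=$(cat /proc/sys/kernel/random/uuid 2>/dev/null || uuidgen)
node.data-dir=/usr/local/Cellar/presto/0.196/data/node$i
EOF
done
```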
P.S. If you want to restrict operations to reads only, add an access-control.properties as below.
cat << 'EOF' > etc/access-control.properties
access-control.name=read-only
EOF
3. Start the Presto servers
bin/launcher start --config=etc/config1.properties --node-config=etc/node1.properties
bin/launcher start --config=etc/config2.properties --node-config=etc/node2.properties
bin/launcher start --config=etc/config3.properties --node-config=etc/node3.properties
4. Download the CLI
wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.196/presto-cli-0.196-executable.jar -O bin/presto-cli
chmod +x bin/presto-cli
5. Create the catalogs
mkdir -p etc/catalog
cat << 'EOF' > etc/catalog/mysql.properties
connector.name=mysql
connection-url=jdbc:mysql://localhost:3306?useSSL=false
connection-user=presto
connection-password=presto
EOF
cat << 'EOF' > etc/catalog/hive.properties
connector.name=hive-hadoop2
hive.metastore.uri=thrift://localhost:9083
EOF
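Before restarting, it is worth checking that every file in etc/catalog declares a connector.name, since the server will fail to load a catalog file without one. A small sketch (the function name is just an illustration):

```shell
#!/bin/sh
# Sketch: verify each catalog file declares a connector.name.
check_catalogs() {
  for f in etc/catalog/*.properties; do
    [ -e "$f" ] || continue   # no catalog files yet; glob stayed literal
    if grep -q '^connector.name=' "$f"; then
      echo "OK: $f"
    else
      echo "BAD: $f (missing connector.name)"
    fi
  done
}

check_catalogs
```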
6. Connect
bin/presto-cli --server localhost:8001 --catalog hive
presto> show catalogs;
 Catalog
---------
 hive
 mysql
 system
(3 rows)

Query 20180318_045410_00013_sq83e, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]
P.S. If you build a real multi-node cluster, pay attention to the following:
1. node.id in node.properties must be unique across the cluster; you can generate one with uuid/uuidgen.
2. query.max-memory-per-node in config.properties should be about half of -Xmx in jvm.config.
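The second rule is a line of arithmetic; a sketch assuming the 16G heap from the jvm.config above:

```shell
# Sketch: derive query.max-memory-per-node as half of the JVM heap.
xmx_gb=16                        # assumed from -Xmx16G in etc/jvm.config
per_node_gb=$((xmx_gb / 2))      # half of the heap
echo "query.max-memory-per-node=${per_node_gb}GB"   # prints 8GB, matching the config above
```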