
Hive Installation and Configuration

夏无忧阳
2017.01.12 16:45, word count: 1598

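Hive is a data warehouse built on Hadoop. It suits applications that can tolerate high latency; if you need low-latency access, consider HBase instead.
Prerequisite: Hadoop must already be installed and configured. See: Detailed installation of a Hadoop 2.7.3 pseudo-distributed environment.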

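Installing MySQL

  1. Download and install MySQL
    yum install mysql-server
  2. Set the default character set and storage engine
    vim /etc/my.cnf
    Add the following under [mysqld] (MySQL 5.5 and later no longer accept default-character-set in this section; use character-set-server=utf8 there instead)
    default-character-set=utf8
    default-storage-engine=INNODB
  3. Start MySQL
    cd /etc/init.d
    ./mysqld start
  4. Open the MySQL client
    mysql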
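
The edit in step 2 can also be scripted. The following is a minimal sketch, assuming the file already contains a `[mysqld]` section. The helper name `add_mysqld_defaults` and the scratch file are illustrative and not part of the original setup. The demo runs against a temporary file, so nothing under /etc is touched until you point the function at the real my.cnf.

```shell
#!/bin/sh
# Hypothetical helper: insert the two settings right after the [mysqld] header,
# unless the storage-engine line is already present. $1 = path to a my.cnf-style file.
add_mysqld_defaults() {
  cnf="$1"
  grep -q '^default-storage-engine=' "$cnf" && return 0   # already configured
  awk '
    { print }
    /^\[mysqld\]/ {
      print "default-character-set=utf8"
      print "default-storage-engine=INNODB"
    }' "$cnf" > "$cnf.new" && mv "$cnf.new" "$cnf"
}

# Demo on a scratch file instead of /etc/my.cnf
demo_cnf="$(mktemp)"
printf '[mysqld]\ndatadir=/var/lib/mysql\n' > "$demo_cnf"
add_mysqld_defaults "$demo_cnf"
cat "$demo_cnf"
```

Calling the function a second time leaves the file unchanged, which makes it safe to rerun.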

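Creating and configuring the hive database

  1. Create a database named hive for the Hive user, with latin1 as its character set
    mysql> create database hive default character set latin1;

  2. Check that the hive database was created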

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hive               |
| mysql              |
| test               |
+--------------------+
4 rows in set (0.00 sec)
  3. Create the hive user and grant privileges
  //Grant the hive user all privileges on the hive database
  mysql>  grant all privileges on hive.* to hive@'%' identified by '123456';
Query OK, 0 rows affected (0.00 sec)
  //Reload the privilege tables
  mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
  4. Test that the hive user can connect to MySQL
[root@cognos init.d]# mysql -u hive -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
...
mysql> use hive;
Database changed
mysql> show tables;
Empty set (0.00 sec)
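
The interactive statements above can be collected into a file and replayed non-interactively. This is a sketch only: the file location is an assumption, `IF NOT EXISTS` is added so that a rerun does not fail, and the database name, user, and password are the ones used in this article. `GRANT ... IDENTIFIED BY` is accepted by the MySQL 5.x servers this article targets; MySQL 8.0 removed that form, so there the user would have to be created with a separate `CREATE USER` statement.

```shell
#!/bin/sh
# Write the metastore setup statements to a file so they can be replayed.
# The file path is an assumption; adjust as needed.
sql_file="${TMPDIR:-/tmp}/hive_metastore_setup.sql"
cat > "$sql_file" <<'EOF'
CREATE DATABASE IF NOT EXISTS hive DEFAULT CHARACTER SET latin1;
GRANT ALL PRIVILEGES ON hive.* TO hive@'%' IDENTIFIED BY '123456';
FLUSH PRIVILEGES;
EOF

# To apply it, run the printed command as a MySQL administrator (not executed here).
echo "mysql -u root -p < $sql_file"
```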

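Installing Hive

  1. Download
    Download hive-2.0.1
  2. Extract
    tar -xzvf apache-hive-2.0.1-bin.tar.gz
  3. Rename the extracted folder and move it under the hadoop directory
    mv apache-hive-2.0.1-bin hive
    mv hive /opt/hadoop/
  4. Download the MySQL JDBC driver and put it in the lib directory of the Hive installation
    I used mysql-connector-java-5.1.36-bin.jar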
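
Steps 2 to 4 can be wrapped in one function. The sketch below assumes the tarball has a single top-level folder (as apache-hive-2.0.1-bin.tar.gz does) and that the target does not already contain a hive directory. `install_hive` is a hypothetical helper name, not something shipped with Hive.

```shell
#!/bin/sh
# Hypothetical helper: unpack a Hive tarball into <parent>/hive and copy a JDBC driver into lib/.
# $1 = tarball, $2 = parent directory (e.g. /opt/hadoop), $3 = MySQL connector jar
install_hive() (
  set -e
  tarball="$1"; parent="$2"; driver="$3"
  if [ -e "$parent/hive" ]; then
    echo "$parent/hive already exists, refusing to overwrite" >&2
    exit 1
  fi
  # Name of the top-level folder inside the archive
  top="$(tar -tzf "$tarball" | head -n 1 | cut -d/ -f1)"
  tar -xzf "$tarball" -C "$parent"
  mv "$parent/$top" "$parent/hive"
  cp "$driver" "$parent/hive/lib/"
)

# Example call with the file names from this article (not executed here):
#   install_hive apache-hive-2.0.1-bin.tar.gz /opt/hadoop mysql-connector-java-5.1.36-bin.jar
```

The function body runs in a subshell, so `set -e` and the variables it sets do not leak into the calling shell.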

Configuration

  1. Set the environment variables
    vi /etc/profile
    Add the following:
#HIVE
export HIVE_HOME=/opt/hadoop/hive
export PATH=$PATH:$HIVE_HOME/bin

Run source /etc/profile to apply the change.

  2. Edit the Hive configuration files
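
If the profile is edited by a script rather than by hand, it helps to make the change idempotent. A minimal sketch, using a hypothetical helper `add_hive_env` and a scratch file in place of /etc/profile:

```shell
#!/bin/sh
# Hypothetical helper: append the Hive variables to a profile file only once.
# $1 = profile file, $2 = Hive install directory
add_hive_env() {
  profile="$1"; hive_home="$2"
  grep -q '^export HIVE_HOME=' "$profile" 2>/dev/null && return 0
  {
    echo '#HIVE'
    echo "export HIVE_HOME=$hive_home"
    echo 'export PATH=$PATH:$HIVE_HOME/bin'
  } >> "$profile"
}

# Demo on a scratch file; on a real machine this would be /etc/profile.
demo_profile="$(mktemp)"
add_hive_env "$demo_profile" /opt/hadoop/hive
. "$demo_profile"
# Show the PATH entry that was added
echo "$PATH" | tr ':' '\n' | grep -x "$HIVE_HOME/bin"
```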

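  • Copy the configuration templates (in the conf directory of the Hive installation)
cp hive-default.xml.template hive-default.xml
cp hive-env.sh.template hive-env.sh
cp hive-log4j2.properties.template hive-log4j2.properties
cp hive-exec-log4j2.properties.template hive-exec-log4j2.properties
  • Edit hive-default.xml
    vim hive-default.xml
    Use vim's search command to find the occurrences of javax, then set the four properties below to match the MySQL setup from earlier. Hive's documentation names hive-site.xml as the file for site-specific settings, so if these values are not picked up, put them in conf/hive-site.xml instead.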
<!-- JDBC driver class -->
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<!-- MySQL connection URL -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://172.16.7.191:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<!-- MySQL user name -->
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<!-- Password for that user -->
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
</property>
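
After editing, it is easy to check from the shell what value a property actually has in the file. The helper below is a rough sketch, not an XML parser: it assumes each `<name>` and `<value>` element sits on one line, as in the Hive templates. `get_prop` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: print the first <value> that follows a given <name>
# in a Hadoop-style XML config. $1 = config file, $2 = property name
get_prop() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, ""); sub(/<\/value>.*/, "")
      print; exit
    }' "$1"
}

# Demo on a small sample file holding one of the four properties
demo_conf="$(mktemp)"
cat > "$demo_conf" <<'EOF'
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
EOF
get_prop "$demo_conf" javax.jdo.option.ConnectionUserName
```

On the real file, the same call can be repeated for ConnectionDriverName, ConnectionURL, ConnectionUserName, and ConnectionPassword to confirm all four edits.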

Search commands in vim on Red Hat
:set hls //turn on search highlighting
/XXX //search forward
?XXX //search backward


Startup

1. Start the Hive Metastore Server process
       hive --service metastore &
2. Initialize the metastore schema before the first Hive login
       schematool -dbType mysql -initSchema
3. Log in to Hive

[root@cognos conf]# hive
which: no hbase in (/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/root/bin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/opt/hadoop/hive/bin)

Logging initialized using configuration in file:/opt/hadoop/hive/conf/hive-log4j2.properties

Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.

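Roughly: Hive on MapReduce is deprecated in Hive 2 and may not be available in future versions, so consider another execution engine (such as Tez or Spark). I don't yet know what impact this will have.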

hive> show databases;
OK
default
Time taken: 0.728 seconds, Fetched: 1 row(s)

4. Verify
Once Hive is configured successfully, you can also connect to the hive database from MySQL and work with the metastore tables.

mysql> use hive
Database changed
mysql> show tables;
+---------------------------+
| Tables_in_hive |
+---------------------------+
| AUX_TABLE |
| BUCKETING_COLS |
| CDS |
| COLUMNS_V2 |
| COMPACTION_QUEUE |
| COMPLETED_COMPACTIONS |
| COMPLETED_TXN_COMPONENTS |
| DATABASE_PARAMS |
| DBS |
| DB_PRIVS |
| DELEGATION_TOKENS |
| FUNCS |
| FUNC_RU |
| GLOBAL_PRIVS |
| HIVE_LOCKS |
| IDXS |
| INDEX_PARAMS |
| MASTER_KEYS |
| NEXT_COMPACTION_QUEUE_ID |
| NEXT_LOCK_ID |
| NEXT_TXN_ID |
| NOTIFICATION_LOG |
| NOTIFICATION_SEQUENCE |
| NUCLEUS_TABLES |
| PARTITIONS |
| PARTITION_EVENTS |
| PARTITION_KEYS |
| PARTITION_KEY_VALS |
| PARTITION_PARAMS |
| PART_COL_PRIVS |
| PART_COL_STATS |
| PART_PRIVS |
| ROLES |
| ROLE_MAP |
| SDS |
| SD_PARAMS |
| SEQUENCE_TABLE |
| SERDES |
| SERDE_PARAMS |
| SKEWED_COL_NAMES |
| SKEWED_COL_VALUE_LOC_MAP |
| SKEWED_STRING_LIST |
| SKEWED_STRING_LIST_VALUES |
| SKEWED_VALUES |
| SORT_COLS |
| TABLE_PARAMS |
| TAB_COL_STATS |
| TBLS |
| TBL_COL_PRIVS |
| TBL_PRIVS |
| TXNS |
| TXN_COMPONENTS |
| TYPES |
| TYPE_FIELDS |
| VERSION |
+---------------------------+
55 rows in set (0.01 sec)
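
Starting the metastore with a bare `&` ties it to the terminal and mixes its output into the session. The sketch below shows one possible alternative using `nohup` and a log file. `start_bg` is a hypothetical helper, and the log path and wait time in the usage comments are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: start a command in the background with its output in a log file
# and print its PID. $1 = log file, remaining arguments = the command to run
start_bg() {
  log="$1"; shift
  nohup "$@" > "$log" 2>&1 &
  echo $!
}

# Intended use on the Hive host (not executed here):
#   schematool -dbType mysql -initSchema        # first run only
#   pid="$(start_bg /tmp/metastore.log hive --service metastore)"
#   sleep 10                                    # give the service time to come up
#   kill -0 "$pid" && hive -e 'show databases;'
```

`kill -0` sends no signal; it only reports whether the process still exists, which is a cheap way to notice that the metastore exited during startup.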





Errors and fixes

1. Multiple SLF4J bindings

which: no hbase in (/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin:/usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/root/bin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/opt/hadoop/hadoop-2.7.3/bin:/opt/hadoop/hadoop-2.7.3/sbin:/opt/hadoop/hive/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Fix:
Two of the jars above bind the same SLF4J logger class; deleting the older one is enough.
     rm -rf /opt/hadoop/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar

2. The Hive Metastore Server process was not started properly.
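
A more cautious variant is to rename the jar instead of deleting it, so it can be restored if Hadoop turns out to need it. This is a sketch with a hypothetical helper `disable_jar`. It relies on Java classpath wildcards only picking up files ending in .jar, so a `.jar.bak` file in the same directory is ignored.

```shell
#!/bin/sh
# Hypothetical helper: move a jar out of the classpath by renaming it to <name>.bak.
# $1 = path to the jar
disable_jar() {
  jar="$1"
  [ -f "$jar" ] || { echo "not found: $jar" >&2; return 1; }
  mv "$jar" "$jar.bak"
}

# Example call with the path from the message above (not executed here):
#   disable_jar /opt/hadoop/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar
```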

Logging initialized using configuration in file:/opt/hadoop/hive/conf/hive-log4j2.properties
Exception in thread "main" java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1550)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3080)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3108)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:543)
at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:516)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:648)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.reflect.InvocationTargetException

Fix:
Start the Hive Metastore Server process with the following command:
       hive --service metastore &

3. MySQL permission problem
javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:mysql://172.16.7.191:3306/hive?createDatabaseIfNotExist=true, username = hive. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
java.sql.SQLException: Access denied for user 'hive'@'cognos' (using password: YES)

Fix:
In hive-default.xml, change jdbc:mysql://172.16.7.191:3306 to jdbc:mysql://localhost:3306
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
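
The same substitution can be done from the shell. A minimal sketch, assuming the URL uses port 3306 as in this article; `set_metastore_host` is a hypothetical helper, and the demo works on a scratch file rather than the real configuration.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the host part of the MySQL JDBC URL in a Hive config file.
# $1 = config file, $2 = new host name. Assumes port 3306.
set_metastore_host() {
  conf="$1"; host="$2"
  sed 's#jdbc:mysql://[^:/]*:3306#jdbc:mysql://'"$host"':3306#' "$conf" > "$conf.new" &&
    mv "$conf.new" "$conf"
}

# Demo on a scratch file
demo_url_conf="$(mktemp)"
echo '<value>jdbc:mysql://172.16.7.191:3306/hive?createDatabaseIfNotExist=true</value>' > "$demo_url_conf"
set_metastore_host "$demo_url_conf" localhost
cat "$demo_url_conf"
```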

  4. Schema not initialized before the first Hive login
javax.jdo.JDODataStoreException: Required table missing : "VERSION" in Catalog "" Schema "". DataNucleus requires this table to perform its persistence operations. Either your MetaData is incorrect, or you need to enable "datanucleus.schema.autoCreateTables"

Fix:
Before the first login, the metastore schema has to be initialized. Run the following command:
schematool -dbType mysql -initSchema

  5. Unresolved path reference: system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
Logging initialized using configuration in file:/opt/hadoop/hive/conf/hive-log4j2.properties
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D

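The cause is that the system:java.io.tmpdir variable in the configuration file cannot be resolved to an actual value, so the path cannot be found. Hive normally creates temporary files and log files at startup; because they cannot be created, the process fails to start.
Fix:
In the Hive configuration file edited above, find every "${system:java.io.tmpdir}" and replace it with an absolute path such as /opt/hadoop/hive/iotmp/
Also define a value for system:user.name: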

<property>
  <name>system:user.name</name>
  <value>user_name</value>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/opt/hadoop/hive/iotmp/${system:user.name}</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/opt/hadoop/hive/iotmp/${hive.session.id}_resources</value>
  <description>Temporary local directory for added resources in the remote file system.</description>
</property>
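
The replacement, together with creating the directory, can be sketched in the shell as follows. `use_fixed_tmpdir` is a hypothetical helper. It only touches `${system:java.io.tmpdir}` and leaves `${system:user.name}` as it is, matching the XML above.

```shell
#!/bin/sh
# Hypothetical helper: create the scratch directory and replace every
# ${system:java.io.tmpdir} in a config file with that absolute path.
# $1 = config file, $2 = absolute directory (e.g. /opt/hadoop/hive/iotmp)
use_fixed_tmpdir() {
  conf="$1"; dir="$2"
  mkdir -p "$dir" || return 1
  sed 's#\${system:java\.io\.tmpdir}#'"$dir"'#g' "$conf" > "$conf.new" &&
    mv "$conf.new" "$conf"
}

# Example call (not executed here; the file name is whichever config file you edited):
#   use_fixed_tmpdir /opt/hadoop/hive/conf/hive-default.xml /opt/hadoop/hive/iotmp
```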

