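1. When a table is created in a Hive database with Chinese comments, the comments are displayed as garbled text.
For example, garbled column comments: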
+-------------------------------+----------------------------------------------------+-----------------------------+
| col_name | data_type | comment |
+-------------------------------+----------------------------------------------------+-----------------------------+
| # col_name | data_type | comment |
| area_code | string | ???? |
| brand | string | ???? |
| channel | string | ?? |
| model | string | ???? |
| mid_id | string | ??id |
| os | string | ???? |
| user_id | string | ??id |
| version_code | string | app??? |
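One plausible explanation for the `?` characters, assuming the metastore columns use the `latin1` character set (a common MySQL default, not confirmed by the listing above), is that MySQL substitutes `?` for every character the target character set cannot represent. A minimal sketch to reproduce the effect in a MySQL session, assuming the client connection itself uses UTF-8:

```sql
-- Assumption: the session character set is utf8/utf8mb4, so the literals arrive intact.
-- CONVERT ... USING re-encodes the string into latin1, which has no Chinese characters.
SELECT CONVERT('地区编码' USING latin1) AS area_code_comment,
       CONVERT('设备id' USING latin1) AS mid_id_comment,
       CONVERT('app版本号' USING latin1) AS version_code_comment;
```

ASCII fragments such as `id` and `app` survive the conversion, which matches the pattern in the listing, where only the Chinese characters are lost.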
2. Solution: change the character set of the MySQL metastore columns
2.1 Fix column comments and table comments
use hive; # the MySQL metastore database
alter table COLUMNS_V2 modify column COMMENT varchar(256) character set utf8;
alter table TABLE_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
2.2 Fix partition column comments
alter table PARTITION_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8 ;
alter table PARTITION_KEYS modify column PKEY_COMMENT varchar(4000) character set utf8;
2.3 Fix index comments
alter table INDEX_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
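To check that the changes in 2.1 to 2.3 took effect, the column character sets can be read back from `information_schema`. This is a sketch that assumes the metastore database is named `hive`, as in the `use hive` statement above:

```sql
-- Lists the character set of each comment-bearing metastore column altered above.
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'hive'
  AND (TABLE_NAME, COLUMN_NAME) IN (
        ('COLUMNS_V2', 'COMMENT'),
        ('TABLE_PARAMS', 'PARAM_VALUE'),
        ('PARTITION_PARAMS', 'PARAM_VALUE'),
        ('PARTITION_KEYS', 'PKEY_COMMENT'),
        ('INDEX_PARAMS', 'PARAM_VALUE'));
```

Each row should report `utf8` (newer MySQL releases may display it as `utf8mb3`); any row still showing another character set points to a statement that did not run.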
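2.4 Set the encoding on Hive's metastore JDBC connection URL (the javax.jdo.option.ConnectionURL property; if hive-site.xml is edited by hand, each & must be written as &amp;):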
jdbc:mysql://192.168.1.151/hive?createDatabaseIfNotExist=true&useUnicode=true&characterEncoding=UTF-8
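Restart Hive.
Then drop and recreate the table. Comments written before the change were already stored as ? and are typically not recovered by the ALTER statements alone.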
drop table if exists db_pxxlog.dwd_page_log;
CREATE EXTERNAL TABLE db_pxxlog.dwd_page_log (
`area_code` string COMMENT '地区编码',
`brand` string COMMENT '手机品牌',
`channel` string COMMENT '渠道',
`model` string COMMENT '手机型号',
`mid_id` string COMMENT '设备id',
`os` string COMMENT '操作系统',
`user_id` string COMMENT '用户id',
`version_code` string COMMENT 'app版本号',
`ts` BIGINT
) COMMENT '页面日志表' PARTITIONED BY (dt string) STORED AS ORC
LOCATION '/warehouse/tablespace/external/hive/db_pxxlog/dwd/dwd_page_log'
TBLPROPERTIES ('orc.compress' = 'SNAPPY');
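The listing that follows resembles the output of Hive's describe command. A minimal sketch, assuming it was produced from a Beeline or Hive CLI session after the table was recreated:

```sql
-- Shows columns, types, and comments, followed by table-level metadata.
DESCRIBE FORMATTED db_pxxlog.dwd_page_log;
```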
+-------------------------------+----------------------------------------------------+-----------------------------+
| col_name | data_type | comment |
+-------------------------------+----------------------------------------------------+-----------------------------+
| # col_name | data_type | comment |
| area_code | string | 地区编码 |
| brand | string | 手机品牌 |
| channel | string | 渠道 |
| model | string | 手机型号 |
| mid_id | string | 设备id |
| os | string | 操作系统 |
| user_id | string | 用户id |
| version_code | string | app版本号 |
| during_time | bigint | 持续时间毫秒 |
| page_item | string | 目标id |
| page_item_type | string | 目标类型 |
| last_page_id | string | 上页类型 |
| page_id | string | 页面ID |
| source_type | string | 来源类型 |
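Where dropping a table is not convenient, the comments can instead be rewritten in place once the metastore fix is applied. This is a sketch of an alternative that the steps above do not use; the statements are standard HiveQL, and the table and column names are taken from the DDL above:

```sql
-- Re-declare a column with the same name and type to overwrite its comment.
ALTER TABLE db_pxxlog.dwd_page_log
  CHANGE COLUMN area_code area_code string COMMENT '地区编码';

-- The table comment is stored as the 'comment' table property.
ALTER TABLE db_pxxlog.dwd_page_log
  SET TBLPROPERTIES ('comment' = '页面日志表');
```

One `CHANGE COLUMN` statement is needed per column whose comment was garbled.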