Database / Big Data: Advanced HBase


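Notes on operating HBase through Hive:

(1) Start the YARN services:

    yarn-daemon.sh start resourcemanager
    yarn-daemon.sh start nodemanager

(2) When creating the table in Hive, append the following clauses:

    stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    with serdeproperties ("hbase.columns.mapping"=":key,column_family:column_name,...")
    tblproperties("hbase.table.name" = "table_name");

The entries in hbase.columns.mapping correspond positionally to the Hive columns: :key maps a Hive column to the HBase row key, and each column_family:column_name entry maps a Hive column to one HBase column.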
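To make the positional relationship between Hive columns and the mapping string concrete, here is a minimal Python sketch that assembles such a statement. The function name `hbase_mapped_ddl` and its parameters are hypothetical helpers for illustration only. They are not part of Hive or HBase, and the sketch only builds a string without talking to either system.

```python
def hbase_mapped_ddl(hive_table, columns, mapping, hbase_table, external=True):
    """Build a Hive CREATE TABLE statement backed by HBaseStorageHandler.

    columns: list of (hive_column_name, hive_type), in Hive column order.
    mapping: list of HBase targets in the same order (":key" or "cf:qualifier").
    This is a hypothetical helper; it only formats text.
    """
    # Each Hive column needs exactly one mapping entry, in the same order.
    if len(columns) != len(mapping):
        raise ValueError("number of Hive columns and mapping entries must match")

    column_list = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    keyword = "create external table" if external else "create table"
    return (
        f"{keyword} {hive_table}({column_list}) "
        "stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' "
        f'with serdeproperties ("hbase.columns.mapping"="{",".join(mapping)}") '
        f'tblproperties("hbase.table.name" = "{hbase_table}");'
    )


ddl = hbase_mapped_ddl(
    "customer",
    [("name", "string"), ("order_numb", "string"), ("addr_city", "string")],
    [":key", "order:numb", "addr:city"],
    "customer",
)
print(ddl)
```

Generating the statement from two parallel lists makes a length mismatch between Hive columns and mapping entries fail early instead of at table-creation time.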

=====Example 1=====

Create the table in HBase:

    create 'customer','order','addr'

Create the mapped table in Hive:

    create external table customer(
      name string,
      order_numb string,
      order_date string,
      addr_city string,
      addr_state string)
    stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    with serdeproperties ("hbase.columns.mapping"=":key,order:numb,order:date,addr:city,addr:state")
    tblproperties("hbase.table.name" = "customer");

Insert a row and query it in Hive:

    insert into table customer values ('James','1121','2018-05-31','toronto','ON');

    select * from customer;

Run the following statements in the HBase shell:

    scan 'customer'

    put 'customer','Smith','order:numb','1122'
    put 'customer','Smith','order:date','2019-09-12'
    put 'customer','Smith','addr:city','beijing'
    put 'customer','Smith','addr:state','HD'

Query again in Hive; the row written through HBase is now visible:

    select * from customer;

=====Example 2=====

1. Do not create the table in HBase beforehand. That is, do not run create 'hive_hbase_emp_table','info'. The Hive table in step 2 is not external, so Hive creates the HBase table itself.

2. Create the table hive_hbase_emp_table in Hive, linked to HBase:

    CREATE TABLE hive_hbase_emp_table(
      empno int,
      ename string,
      job string,
      mgr int,
      hiredate string,
      sal double,
      comm double,
      deptno int)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:ename,info:job,info:mgr,info:hiredate,info:sal,info:comm,info:deptno")
    TBLPROPERTIES ("hbase.table.name" = "hive_hbase_emp_table");

3. Insert data in Hive and query it through HBase.

In Hive:

    insert into table hive_hbase_emp_table values(1,'Eric','Developer',5,'2019-12-18',2800.0,312.0,10);

In the HBase shell:

    scan 'hive_hbase_emp_table'

4. Insert data in HBase and query it through Hive.

In the HBase shell:

    put 'hive_hbase_emp_table','2','info:ename','zhangsan'

In Hive:

    select * from hive_hbase_emp_table;
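The last step writes only info:ename for row key 2, so the other mapped columns have no cell in HBase and come back as NULL in Hive. The Python sketch below models that mapping under simplifying assumptions. `hive_row` is a hypothetical function, type conversion (such as casting the row key to int) is ignored, and None stands in for NULL.

```python
def hive_row(row_key, cells, hive_columns, mapping):
    """Model how one HBase row appears as one Hive row.

    cells: dict of "family:qualifier" -> value for this row key.
    Columns without a matching cell become None (NULL in Hive).
    Hypothetical illustration; real Hive also applies type conversion.
    """
    row = {}
    for column, target in zip(hive_columns, mapping):
        row[column] = row_key if target == ":key" else cells.get(target)
    return row


hive_columns = ["empno", "ename", "job", "mgr", "hiredate", "sal", "comm", "deptno"]
mapping = [":key", "info:ename", "info:job", "info:mgr",
           "info:hiredate", "info:sal", "info:comm", "info:deptno"]

# State of row '2' after the single put shown in the example.
row = hive_row("2", {"info:ename": "zhangsan"}, hive_columns, mapping)
print(row)
```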

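Pre-split with a split algorithm (run in the Linux shell). Here -c 10 requests 10 regions and -f mycf names the column family:

    hbase org.apache.hadoop.hbase.util.RegionSplitter test_split1 HexStringSplit -c 10 -f mycf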
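HexStringSplit treats row keys as 8-character hexadecimal strings and spreads the region boundaries evenly over the range 00000000 to FFFFFFFF. The Python sketch below approximates that idea for 10 regions. It is an illustration of evenly spaced hex boundaries, not a copy of the RegionSplitter source, so the exact boundary values HBase produces may differ in detail.

```python
def hex_split_points(num_regions, width=8):
    """Return num_regions - 1 evenly spaced hex boundaries.

    Assumes the key space 0 .. 16**width - 1; an approximation of
    what a uniform hex split looks like, not HBase's exact code.
    """
    if num_regions < 1:
        raise ValueError("num_regions must be at least 1")
    key_space = 16 ** width
    step = key_space // num_regions
    # Boundary i is the start key of region i (region 0 starts at the empty key).
    return [format(step * i, f"0{width}x") for i in range(1, num_regions)]


points = hex_split_points(10)
for point in points:
    print(point)
```

Ten regions need nine boundaries. This scheme only balances load when the real row keys are themselves uniformly distributed hex strings, for example hashed prefixes.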

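Pre-split with explicit split points (run in the HBase shell):

    create 'test_split2','mycf2',SPLITS=>['aaa','bbb','ccc','ddd','eee','fff']

Pre-split with a split file. First create a file in Linux with one split point per line:

    aaa
    bbb
    ccc
    ddd
    eee
    fff

Then create the table in the HBase shell:

    create 'test_split3','baseinfo',SPLITS_FILE => '/root/data/splits.txt'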
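Six split points produce seven regions, each covering a half-open range [start key, end key) of row keys compared byte by byte. The Python sketch below shows which region a given row key would land in. `region_ranges` and `region_index` are hypothetical names for illustration, and the empty byte string stands for the open-ended first start key and last end key.

```python
from bisect import bisect_right


def region_ranges(split_points):
    """Turn sorted split points into (start_key, end_key) pairs.

    An empty start or end key means the range is unbounded on that side.
    """
    starts = [b""] + list(split_points)
    ends = list(split_points) + [b""]
    return list(zip(starts, ends))


def region_index(split_points, row_key):
    """Index of the region whose [start, end) range contains row_key."""
    # A key equal to a split point belongs to the region that starts there.
    return bisect_right(split_points, row_key)


splits = [b"aaa", b"bbb", b"ccc", b"ddd", b"eee", b"fff"]
ranges = region_ranges(splits)

for key in (b"a", b"aaa", b"bcd", b"zzz"):
    idx = region_index(splits, key)
    print(key, "-> region", idx, ranges[idx])
```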

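Cold region merge (HBase must be stopped first; run in the Linux shell):

    hbase org.apache.hadoop.hbase.util.Merge <table_name> <region-1> <region-2>

Note: each region argument is the full region name, in the form "table_name,start_key,timestamp.regionId."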

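Hot region merge (no need to stop HBase; run in the HBase shell):

    merge_region 'region-1','region-2'

Note: region-1 and region-2 are the regions' IDs, that is, the encoded part of the region name.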
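The two merge variants take different identifiers: the cold merge uses the full region name, while the hot merge uses only the region ID at its end. The Python sketch below splits a full region name of the form "table_name,start_key,timestamp.regionId." into its parts. `parse_region_name` is a hypothetical helper, and the sample region name is made up for illustration.

```python
def parse_region_name(full_name):
    """Split 'table,start_key,timestamp.regionId.' into its components.

    Hypothetical helper. The start key may be empty (first region)
    and may itself contain commas, so we split from both ends.
    """
    body = full_name[:-1] if full_name.endswith(".") else full_name
    prefix, region_id = body.rsplit(".", 1)
    table, rest = prefix.split(",", 1)
    start_key, timestamp = rest.rsplit(",", 1)
    return {
        "table": table,
        "start_key": start_key,
        "timestamp": timestamp,
        "region_id": region_id,
    }


# Made-up example name; real names can be read from the HBase web UI or hbase:meta.
sample = "test_split2,aaa,1711785600000.0123456789abcdef0123456789abcdef."
parts = parse_region_name(sample)
print(parts["region_id"])
```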

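HFile minor compaction. Relevant parameters and their default values:

hbase.hregion.memstore.flush.size [134217728 B = 128 MB]: a MemStore is flushed to disk once it exceeds this size.

hbase.regionserver.optionalcacheflushinterval [3600000 ms = 1 h]: the longest a MemStore's edits stay in memory before being flushed automatically.

hbase.hstore.compactionThreshold [3]: the number of store files in a store at which a compaction is triggered.

hbase.hstore.compaction.max [10]: the maximum number of store files merged in a single minor compaction.

HFile major compaction. Relevant parameter and its default value:

hbase.hregion.majorcompaction [604800000 ms = 7 days]: the interval between automatic major compactions.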
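The raw values are given in bytes and milliseconds, which are easy to misread. As a quick check of the arithmetic, the Python sketch below converts the three size and time values quoted above into readable units. The helper names are hypothetical and the values are simply the ones listed in this section.

```python
def bytes_to_mib(value):
    """Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return value / (1024 * 1024)


def ms_to_hours(value):
    """Convert milliseconds to hours."""
    return value / (60 * 60 * 1000)


def ms_to_days(value):
    """Convert milliseconds to days."""
    return value / (24 * 60 * 60 * 1000)


flush_size_mib = bytes_to_mib(134217728)     # hbase.hregion.memstore.flush.size
flush_interval_h = ms_to_hours(3600000)      # hbase.regionserver.optionalcacheflushinterval
major_interval_d = ms_to_days(604800000)     # hbase.hregion.majorcompaction

print(flush_size_mib, "MiB")
print(flush_interval_h, "h")
print(major_interval_d, "days")
```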


This article was published by a user on 金钥匙 on 2024-03-30. If you have any questions, please contact us.
Article link: https://www.51969.com/post/18701166.html
