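Experiment 3: Getting Familiar with Common HBase Operations

1. Objectives

(1) Understand the role of HBase in the Hadoop architecture;

(2) Become proficient with the common HBase Shell commands;

(3) Become familiar with the common HBase Java API.

2. Platform

Operating system: CentOS 7; Hadoop version: 3.3; HBase version: 2.2.2; JDK version: 1.8; Java IDE: IDEA.

3. Tasks and Requirements

(I) Implement the following functions in code, and complete the same tasks with the HBase Shell commands provided by Hadoop:

(1) List information about all tables in HBase, such as the table name and creation time;

(2) Print all records of a specified table to the terminal;

(3) Add and delete a specified column family or column in an existing table;

(4) Clear all records of a specified table;

(5) Count the number of rows in a table.

(II) HBase database operations

1. The following tables and data come from a relational database. Convert them into tables suitable for HBase storage and insert the data: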

Student table (Student)

S_No      S_Name     S_Sex    S_Age
2015001   Zhangsan   male     23
2015002   Mary       female   22
2015003   Lisi       male     24

Course table (Course)

C_No     C_Name     C_Credit
123001   Math       2.0
123002   Computer   5.0
123003   English    3.0

Course selection table (SC)

SC_Sno    SC_Cno   SC_Score
2015001   123001   86
2015001   123003   69
2015002   123002   77
2015002   123003   99
2015003   123001   98
2015003   123002   95
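
One way to approach the conversion is to take the primary key of each relational table as the HBase row key and to store every remaining attribute as a cell under a column family. The sketch below illustrates that idea with plain Java maps rather than the HBase client, using the Course table as input. The class name `RelationalToHBaseSketch`, the helper `toTable`, and the column family name `info` are assumptions made for illustration and do not come from the assignment.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

final class RelationalToHBaseSketch {

    // Converts relational rows into an HBase-like layout:
    //   row key -> ("family:qualifier" -> value)
    // The first column is assumed to be the primary key; it becomes the
    // row key and is not stored again as a cell.
    public static TreeMap<String, Map<String, String>> toTable(
            String family, String[] columns, String[][] rows) {
        // TreeMap keeps row keys sorted, as HBase keeps rows sorted by key.
        TreeMap<String, Map<String, String>> table = new TreeMap<>();
        for (String[] row : rows) {
            Map<String, String> cells = new LinkedHashMap<>();
            for (int i = 1; i < columns.length; i++) {
                cells.put(family + ":" + columns[i], row[i]);
            }
            table.put(row[0], cells);
        }
        return table;
    }

    public static void main(String[] args) {
        String[] columns = {"C_No", "C_Name", "C_Credit"};
        String[][] courses = {
            {"123001", "Math", "2.0"},
            {"123002", "Computer", "5.0"},
            {"123003", "English", "3.0"},
        };
        // "info" is a hypothetical column family name.
        Map<String, Map<String, String>> table = toTable("info", columns, courses);
        for (Map.Entry<String, Map<String, String>> e : table.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

The SC table has a composite key (SC_Sno, SC_Cno), so it needs a separate decision. One option would be to concatenate both values into the row key; another would be to keep the student as the row key and use one column qualifier per course.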

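2. Implement the following functions in code:

(1) createTable(String tableName, String[] fields)

Create a table. The parameter tableName is the name of the table, and the string array fields holds the names of the record's fields. If HBase already contains a table named tableName, delete the existing table first and then create the new one.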

package Main;

import org.apache.hadoop.conf.Configuration;

import org.apache.hadoop.hbase.HBaseConfiguration;

import org.apache.hadoop.hbase.TableName;

import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;

import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;

import org.apache.hadoop.hbase.client.Connection;

import org.apache.hadoop.hbase.client.Admin;

import org.apache.hadoop.hbase.client.ConnectionFactory;

import org.apache.hadoop.hbase.client.TableDescriptorBuilder;

import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;

public class main {

public static Configuration configuration;

public static Connection connection;

public static Admin admin;

public static void init(){//open the connection

configuration = HBaseConfiguration.create();

configuration.set("hbase.rootdir","hdfs://127.0.0.1:8020/hbase");

try{

connection = ConnectionFactory.createConnection(configuration);

admin = connection.getAdmin();

}catch(IOException e){

e.printStackTrace();

}

}

public static void close(){//close the connection

try{

if(admin != null){

admin.close();

}

if(connection != null){

connection.close();

}

}catch(IOException e){

e.printStackTrace();

}

}

public static void createTable(String tableName,String[] fields) throws IOException{

init();

TableName tablename = TableName.valueOf(tableName);//table name

if(admin.tableExists(tablename)){

System.out.println("table already exists!");

admin.disableTable(tablename);

admin.deleteTable(tablename);

}

TableDescriptorBuilder tableDescriptor = TableDescriptorBuilder.newBuilder(tablename);

for(int i=0;i<fields.length;i++){//one column family per field

ColumnFamilyDescriptor family = ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes(fields[i])).build();

tableDescriptor.setColumnFamily(family);

}

admin.createTable(tableDescriptor.build());

close();

}

public static void main(String[] args){

String[] fields = {"id","score"};

try{

createTable("test",fields);

}catch(IOException e){

e.printStackTrace();

}

}

}

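Run result (screenshot omitted)

(2) addRecord(String tableName, String row, String[] fields, String[] values)

Add the data in values to the cells specified by table tableName, row row (identified by S_Name), and the string array fields. If an element of fields refers to a column qualifier under a column family, write it as "columnFamily:column". For example, to add scores to the three columns "Math", "Computer Science", and "English" at once, fields is {"Score:Math", "Score:Computer Science", "Score:English"} and values holds the scores of these three courses.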

package Main;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;

import org.apache.hadoop.hbase.HBaseConfiguration;

import org.apache.hadoop.hbase.TableName;

import org.apache.hadoop.hbase.client.Admin;

import org.apache.hadoop.hbase.client.Connection;

import org.apache.hadoop.hbase.client.ConnectionFactory;

import org.apache.hadoop.hbase.client.Put;

import org.apache.hadoop.hbase.client.Table;

public class main {

public static Configuration configuration;

public static Connection connection;

public static Admin admin;

public static void init(){//open the connection

configuration = HBaseConfiguration.create();

configuration.set("hbase.rootdir","hdfs://127.0.0.1:8020/hbase");

try{

connection = ConnectionFactory.createConnection(configuration);

admin = connection.getAdmin();

}catch(IOException e){

e.printStackTrace();

}

}

public static void close(){//close the connection

try{

if(admin != null){

admin.close();

}

if(connection != null){

connection.close();

}

}catch(IOException e){

e.printStackTrace();

}

}

public static void addRecord(String tableName,String row,String[] fields,String[] values) throws IOException{

init();//connect to HBase

Table table = connection.getTable(TableName.valueOf(tableName));//get the table

Put put = new Put(row.getBytes());//create the Put object for this row

for(int i=0;i<fields.length;i++){

String[] cols = fields[i].split(":");

if(cols.length == 1){

put.addColumn(fields[i].getBytes(),"".getBytes(),values[i].getBytes());

}

else{

put.addColumn(cols[0].getBytes(),cols[1].getBytes(),values[i].getBytes());

}

}

table.put(put);//write all cells of the row in one request

table.close();

close();//close the connection

}

public static void main(String[] args){

String[] fields = {"Score:Math","Score:Computer Science","Score:English"};

String[] values = {"85","80","90"};

try{

addRecord("grade","S_Name",fields,values);

}catch(IOException e){

e.printStackTrace();

}

}

}
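
The "columnFamily:column" convention that addRecord relies on can be isolated in a small helper. The following is a minimal sketch in plain Java with no HBase dependency; the class name `ColumnSpecSketch` and the method `parseColumn` are hypothetical names chosen for illustration. It splits at the first colon only, so a bare family name yields an empty qualifier, which matches what the listing above passes to `put.addColumn` in that case.

```java
final class ColumnSpecSketch {

    // Splits a column specification into {family, qualifier}.
    // "Score:Math" -> {"Score", "Math"}; "Score" -> {"Score", ""}.
    // Only the first ':' separates family from qualifier, so the
    // qualifier itself may contain further colons.
    public static String[] parseColumn(String spec) {
        int idx = spec.indexOf(':');
        if (idx < 0) {
            return new String[]{spec, ""};
        }
        return new String[]{spec.substring(0, idx), spec.substring(idx + 1)};
    }

    public static void main(String[] args) {
        String[] fields = {"Score:Math", "Score:Computer Science", "Score"};
        for (String field : fields) {
            String[] parts = parseColumn(field);
            System.out.println("family=" + parts[0] + ", qualifier=" + parts[1]);
        }
    }
}
```

Using `indexOf` instead of `split(":")` is a design choice of this sketch: `split` would break a qualifier such as "a:b" into separate pieces, whereas this version keeps it intact.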

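(3) scanColumn(String tableName, String column)

Browse the data of one column of table tableName; if a row has no data in that column, return null. When column is the name of a column family that contains several column qualifiers, list the data of the column for each qualifier; when column is a specific column name (for example "Score:Math"), list only the data of that column.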

package Main;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;

import org.apache.hadoop.hbase.Cell;

import org.apache.hadoop.hbase.CellUtil;

import org.apache.hadoop.hbase.HBaseConfiguration;

import org.apache.hadoop.hbase.TableName;

import org.apache.hadoop.hbase.client.Admin;

import org.apache.hadoop.hbase.client.Connection;

import org.apache.hadoop.hbase.client.ConnectionFactory;

import org.apache.hadoop.hbase.client.Result;

import org.apache.hadoop.hbase.client.ResultScanner;

import org.apache.hadoop.hbase.client.Scan;

import org.apache.hadoop.hbase.client.Table;

import org.apache.hadoop.hbase.util.Bytes;

public class main {

public static Configuration configuration;

public static Connection connection;

public static Admin admin;

public static void init(){//open the connection

configuration = HBaseConfiguration.create();

configuration.set("hbase.rootdir","hdfs://localhost:8020/hbase");

try{

connection = ConnectionFactory.createConnection(configuration);

admin = connection.getAdmin();

}catch(IOException e){

e.printStackTrace();

}

}

public static void close(){//close the connection

try{

if(admin != null){

admin.close();

}

if(connection != null){

connection.close();

}

}catch(IOException e){

e.printStackTrace();

}

}

public static void showResult(Result result){

Cell[] cells = result.rawCells();

for(int i=0;i<cells.length;i++){

System.out.println("RowName:"+new String(CellUtil.cloneRow(cells[i])));//print the row key

System.out.println("ColumnName:"+new String(CellUtil.cloneQualifier(cells[i])));//print the column qualifier

System.out.println("Value:"+new String(CellUtil.cloneValue(cells[i])));//print the value

System.out.println("Column Family:"+new String(CellUtil.cloneFamily(cells[i])));//print the column family

System.out.println();

}

}

public static void scanColumn(String tableName,String column){

init();

try {

Table table = connection.getTable(TableName.valueOf(tableName));

Scan scan = new Scan();

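String[] cols = column.split(":");

if(cols.length == 1){

scan.addFamily(Bytes.toBytes(column));//column is a column family: scan all of its qualifiers

}

else{

scan.addColumn(Bytes.toBytes(cols[0]),Bytes.toBytes(cols[1]));//column is "family:qualifier": scan only that column

}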

ResultScanner scanner = table.getScanner(scan);

for(Result result = scanner.next();result != null;result = scanner.next()){

showResult(result);

}

} catch (IOException e) {

e.printStackTrace();

}

finally{

close();

}

}

public static void main(String[] args){

scanColumn("test","id");

}

}

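Run result (screenshot omitted)

(4) modifyData(String tableName, String row, String column)

Modify the data of the cell specified by table tableName, row row (the student name S_Name can be used), and column column. The implementation below takes the new value as a fourth parameter, value.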

package Main;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;

import org.apache.hadoop.hbase.HBaseConfiguration;

import org.apache.hadoop.hbase.TableName;

import org.apache.hadoop.hbase.client.Admin;

import org.apache.hadoop.hbase.client.Connection;

import org.apache.hadoop.hbase.client.ConnectionFactory;

import org.apache.hadoop.hbase.client.Put;

import org.apache.hadoop.hbase.client.Table;

public class main{

public static Configuration configuration;

public static Connection connection;

public static Admin admin;

public static void init(){//open the connection

configuration = HBaseConfiguration.create();

configuration.set("hbase.rootdir","hdfs://localhost:9000/hbase");

try{

connection = ConnectionFactory.createConnection(configuration);

admin = connection.getAdmin();

}catch(IOException e){

e.printStackTrace();

}

}

public static void close(){//close the connection

try{

if(admin != null){

admin.close();

}

if(connection != null){

connection.close();

}

}catch(IOException e){

e.printStackTrace();

}

}

public static void modifyData(String tableName,String row,String column,String value) throws IOException{

init();

Table table = connection.getTable(TableName.valueOf(tableName));

Put put = new Put(row.getBytes());

String[] cols = column.split(":");

if(cols.length == 1){

put.addColumn(column.getBytes(),"".getBytes(), value.getBytes());

}

else{

put.addColumn(cols[0].getBytes(), cols[1].getBytes(), value.getBytes());

}

table.put(put);

close();

}

public static void main(String[] args){

try{

modifyData("test","1","score","100");

}

catch(Exception e){

e.printStackTrace();

}

}

}

Run result

The score of the row with row key 1 has now been changed to 100.

(5) deleteRow(String tableName, String row)

Delete the record of the row specified by row from table tableName.

package Main;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;

import org.apache.hadoop.hbase.HBaseConfiguration;

import org.apache.hadoop.hbase.TableName;

import org.apache.hadoop.hbase.client.Admin;

import org.apache.hadoop.hbase.client.Connection;

import org.apache.hadoop.hbase.client.ConnectionFactory;

import org.apache.hadoop.hbase.client.Delete;

import org.apache.hadoop.hbase.client.Table;

public class main {

public static Configuration configuration;

public static Connection connection;

public static Admin admin;

public static void init(){//open the connection

configuration = HBaseConfiguration.create();

configuration.set("hbase.rootdir","hdfs://localhost:8020/hbase");

try{

connection = ConnectionFactory.createConnection(configuration);

admin = connection.getAdmin();

}catch(IOException e){

e.printStackTrace();

}

}

public static void close(){//close the connection

try{

if(admin != null){

admin.close();

}

if(connection != null){

connection.close();

}

}catch(IOException e){

e.printStackTrace();

}

}

public static void deleteRow(String tableName,String row) throws IOException{

init();

Table table = connection.getTable(TableName.valueOf(tableName));

Delete delete = new Delete(row.getBytes());

table.delete(delete);

close();

}

public static void main(String[] args){

try{

deleteRow("test","2");

}catch(Exception e){

e.printStackTrace();

}

}

}

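The row with row key 2 has now been deleted.

Problems encountered

Problem 1: After installing HBase, the master-status page could not be opened in the browser, even though HMaster, HRegionServer, and QuorumPeerMain were already running.

Problem 2 (screenshot omitted)

Solutions

Problem 1: After connecting to HBase, running the list command produced the output in question (screenshot omitted). The cause was that I started HBase with my own standalone ZooKeeper installation rather than the ZooKeeper bundled with HBase. Adding the following line to hbase-env.sh

export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"

and restarting HBase resolved the problem.

Problem 2: The cause was a problem with the package imported through Maven; switching the hbase-client package to version 2.5.3 resolved it.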
