Hadoop Study Notes 15: The HBase Framework (Basic Practice)
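I. Installing and Configuring HBase
1.1 Pseudo-Distributed Installation
A pseudo-distributed installation deploys every HBase role on a single machine: HMaster, HRegionServer, and ZooKeeper all run on one computer to simulate a cluster.
First, get the HBase installation package. I use HBase 0.94.7 here, which I have uploaded to Baidu Netdisk (URL: http://pan.baidu.com/s/1pJ3HTY7).
(1) Copy the HBase package to the virtual machine hadoop-master over FTP, then unpack it, rename it, and set the environment variables:
① Unpack: tar -zvxf hbase-0.94.7-security.tar.gz
② Rename: mv hbase-0.94.7-security hbase
③ Set the environment variables: run vim /etc/profile, add the lines below, then apply the change with source /etc/profile
export HBASE_HOME=/usr/local/hbase
export PATH=.:$HADOOP_HOME/bin:$HBASE_HOME/bin:$ZOOKEEPER_HOME/bin:$JAVA_HOME/bin:$PATH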
(2) In the hbase/conf directory, edit hbase-env.sh:
export JAVA_HOME=/usr/local/jdk
export HBASE_MANAGES_ZK=true # Tells HBase to use its own bundled ZooKeeper instance; set this to false in distributed mode
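(3) Still in the hbase/conf directory, edit hbase-site.xml:
<configuration>
  <property><name>hbase.rootdir</name><value>hdfs://hadoop-master:9000/hbase</value></property>
  <property><name>hbase.cluster.distributed</name><value>true</value></property>
  <property><name>hbase.zookeeper.quorum</name><value>hadoop-master</value></property>
  <property><name>dfs.replication</name><value>1</value></property>
</configuration>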
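Because a typo in hbase-site.xml tends to surface only as a confusing startup failure, it can help to read the file back as name/value pairs before starting HBase. The following is a minimal sketch using only the JDK's XML parser, assuming the standard Hadoop-style layout of property elements with name and value children. The class name SiteXmlCheck and its method are hypothetical helpers, not part of HBase.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: not an HBase class, just a JDK-only sanity check.
class SiteXmlCheck {
    /** Reads the name/value pairs of every <property> element, in file order. */
    static Map<String, String> readProperties(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
                props.put(name, value);
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a valid configuration file", e);
        }
    }

    public static void main(String[] args) {
        // The same four settings used in step (3)
        String xml = "<configuration>"
                + "<property><name>hbase.rootdir</name><value>hdfs://hadoop-master:9000/hbase</value></property>"
                + "<property><name>hbase.cluster.distributed</name><value>true</value></property>"
                + "<property><name>hbase.zookeeper.quorum</name><value>hadoop-master</value></property>"
                + "<property><name>dfs.replication</name><value>1</value></property>"
                + "</configuration>";
        readProperties(xml).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

In practice the string would be replaced by the contents of the real file; the sketch only checks structure, not whether the values are reachable hosts.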
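(4) [Optional step] Edit the regionservers file and change localhost to the host name: hadoop-master
(5) Start HBase: start-hbase.sh
PS: As the previous post explained, HBase is built on top of Hadoop HDFS, so make sure Hadoop is running before you start HBase. The command to start Hadoop is: start-all.sh
(6) Verify that HBase has started: jps
The jps output should list three additional Java processes: HMaster, HRegionServer, and HQuorumPeer.
You can also check through the HBase web interface: http://hadoop-master:60010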
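The jps check can also be automated. The sketch below takes the text printed by jps and reports which of the three expected daemons are missing. It assumes each jps line has the form "pid name"; the class JpsCheck and the sample PIDs are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for checking captured jps output.
class JpsCheck {
    static final List<String> EXPECTED =
            Arrays.asList("HMaster", "HRegionServer", "HQuorumPeer");

    /** Returns the expected HBase daemons that do not appear in the jps output. */
    static List<String> missingDaemons(String jpsOutput) {
        List<String> running = new ArrayList<>();
        for (String line : jpsOutput.split("\\R")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length == 2) {
                running.add(parts[1]); // parts[0] is the PID
            }
        }
        List<String> missing = new ArrayList<>(EXPECTED);
        missing.removeAll(running);
        return missing;
    }

    public static void main(String[] args) {
        // Sample output with invented PIDs
        String sample = "2101 NameNode\n2950 HQuorumPeer\n3012 HMaster\n3177 HRegionServer\n3300 Jps";
        System.out.println("missing: " + missingDaemons(sample));
    }
}
```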
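1.2 Distributed Installation
This installation modifies the pseudo-distributed setup from section 1.1 to build a distributed one. The cluster used for this experiment is laid out as follows.
The HMaster role runs on 192.168.80.100 (host name: hadoop-master), and the two HRegionServer roles run on 192.168.80.101 (host name: hadoop-slave1) and 192.168.80.102 (host name: hadoop-slave2).
(1) Edit several key configuration files on the hadoop-master server:
① Edit hbase/conf/hbase-env.sh: change the last line to the following
export HBASE_MANAGES_ZK=false # Do not use the ZooKeeper instance bundled with HBase
② Edit hbase/conf/regionservers: replace the original hadoop-master with the following
hadoop-slave1
hadoop-slave2
(2) Copy the hbase directory and the /etc/profile file from hadoop-master to hadoop-slave1 and hadoop-slave2:
scp -r /usr/local/hbase hadoop-slave1:/usr/local/
scp -r /usr/local/hbase hadoop-slave2:/usr/local/
scp /etc/profile hadoop-slave1:/etc/
scp /etc/profile hadoop-slave2:/etc/
(3) On hadoop-slave1 and hadoop-slave2, apply the configuration:
source /etc/profile
(4) On hadoop-master, start Hadoop, ZooKeeper, and HBase (the order matters):
start-all.sh
zkServer.sh start
start-hbase.sh
(5) Check the HBase cluster status in the HBase web interface (http://hadoop-master:60010).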
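II. Basic HBase Shell Commands
2.1 DDL: Creating and Dropping Tables
(1) Create a table:
>create 'users','user_id','address','info'
# This creates a table named users with three column families: user_id, address, and info
Get the detailed description of the users table:
>describe 'users'
(2) List all tables:
>list
(3) Drop a table: dropping a table in HBase takes two steps, first disable, then drop
>disable 'users'
>drop 'users'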
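The disable-then-drop rule is easy to forget, so here is a small in-memory model of it. This is only a sketch of the state rule, not how HBase implements its catalog: the class TableCatalogSketch is hypothetical, and it simplifies the real behavior (for example, it silently accepts disabling a table twice).

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the "disable before drop" rule; not an HBase class.
class TableCatalogSketch {
    enum State { ENABLED, DISABLED }

    private final Map<String, State> tables = new HashMap<>();

    void create(String name) {
        if (tables.containsKey(name)) {
            throw new IllegalStateException("table already exists: " + name);
        }
        tables.put(name, State.ENABLED); // a newly created table starts out enabled
    }

    void disable(String name) {
        stateOf(name); // fails if the table does not exist
        tables.put(name, State.DISABLED);
    }

    void drop(String name) {
        if (stateOf(name) != State.DISABLED) {
            throw new IllegalStateException("disable '" + name + "' before dropping it");
        }
        tables.remove(name);
    }

    boolean exists(String name) {
        return tables.containsKey(name);
    }

    private State stateOf(String name) {
        State s = tables.get(name);
        if (s == null) {
            throw new IllegalStateException("no such table: " + name);
        }
        return s;
    }

    public static void main(String[] args) {
        TableCatalogSketch catalog = new TableCatalogSketch();
        catalog.create("users");
        try {
            catalog.drop("users"); // rejected: the table is still enabled
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        catalog.disable("users");
        catalog.drop("users");
        System.out.println("users exists: " + catalog.exists("users"));
    }
}
```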
2.2 DML: Insert, Delete, Query, Update
(1) Add records: put
>put 'users','xiaoming','info:age','24'
>put 'users','xiaoming','info:birthday','1987-06-17'
>put 'users','xiaoming','info:company','alibaba'
>put 'users','xiaoming','address:country','china'
>put 'users','xiaoming','address:province','zhejiang'
>put 'users','xiaoming','address:city','hangzhou'
(2) Scan all records of the users table: scan
>scan 'users'
(3) Get a single record
① Get all data for one id (row key)
>get 'users','xiaoming'
② Get all data in one column family for one id
>get 'users','xiaoming','info'
③ Get the data of one column in one column family for one id
>get 'users','xiaoming','info:age'
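The three forms of get are easier to picture with the data model in mind: a table maps a row key to a sorted set of "family:qualifier" cells. The sketch below models this with nested TreeMaps. It is an illustration of the logical model only, under the assumption that one value per cell is enough; RowStoreSketch is a hypothetical class, not the HBase client API.

```java
import java.util.Map;
import java.util.TreeMap;

// In-memory picture of the logical model: row key -> ("family:qualifier" -> value).
class RowStoreSketch {
    private final TreeMap<String, TreeMap<String, String>> rows = new TreeMap<>();

    void put(String row, String column, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>()).put(column, value);
    }

    /**
     * selector == null    -> whole row
     * selector "info"     -> every column in family info
     * selector "info:age" -> that single column
     */
    Map<String, String> get(String row, String selector) {
        TreeMap<String, String> cells = rows.getOrDefault(row, new TreeMap<>());
        if (selector == null) {
            return cells;
        }
        TreeMap<String, String> out = new TreeMap<>();
        for (Map.Entry<String, String> e : cells.entrySet()) {
            String column = e.getKey();
            if (column.equals(selector) || column.startsWith(selector + ":")) {
                out.put(column, e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        RowStoreSketch users = new RowStoreSketch();
        users.put("xiaoming", "info:age", "24");
        users.put("xiaoming", "info:company", "alibaba");
        users.put("xiaoming", "address:city", "hangzhou");
        System.out.println(users.get("xiaoming", null));
        System.out.println(users.get("xiaoming", "info"));
        System.out.println(users.get("xiaoming", "info:age"));
    }
}
```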
(4) Update a record: still put
For example, update Xiaoming's age in the users table to 29
>put 'users','xiaoming','info:age','29'
>get 'users','xiaoming','info:age'
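An update is "still put" because HBase does not overwrite a cell in place: each put adds a new timestamped version, and a plain get returns the newest one, while older versions are kept up to the column family's VERSIONS limit. A minimal sketch of that idea for a single cell follows. VersionedCellSketch and the timestamps are hypothetical, and the version limit is left out for brevity.

```java
import java.util.Collections;
import java.util.TreeMap;

// Toy model of one cell holding several timestamped versions.
class VersionedCellSketch {
    // timestamp -> value, ordered newest first
    private final TreeMap<Long, String> versions = new TreeMap<>(Collections.reverseOrder());

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    /** What a plain read sees: the value with the largest timestamp. */
    String latest() {
        return versions.isEmpty() ? null : versions.firstEntry().getValue();
    }

    int storedVersions() {
        return versions.size();
    }

    public static void main(String[] args) {
        VersionedCellSketch age = new VersionedCellSketch();
        age.put(1000L, "24"); // first put
        age.put(2000L, "29"); // the "update" is just a newer version
        System.out.println("latest = " + age.latest());
        System.out.println("versions kept = " + age.storedVersions());
    }
}
```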
(5) Delete records: delete and deleteall
① Delete the 'info:age' column of the xiaoming row
>delete 'users','xiaoming','info:age'
② Delete the entire xiaoming row
>deleteall 'users','xiaoming'
2.3 Other: A Few More Useful Commands
(1) count: count the rows
>count 'users'
(2) truncate: empty the specified table
>truncate 'users'
III. HBase Java API Operations
3.1 Preparation
(1) Import the HBase project jar
(2) Import all dependency jars under HBase/lib
3.2 Essential for HBase Java Development: Getting the Configuration
/** Get the HBase configuration */
private static Configuration getConfiguration()
{
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://hadoop-master:9000/hbase");
    // Required when running from Eclipse; otherwise the cluster cannot be located
    conf.set("hbase.zookeeper.quorum", "hadoop-master");
    return conf;
}
3.3 DDL Operations with HBaseAdmin
(1) Create a table
/** Create a table (TABLE_NAME and FAMILY_NAME are String constants of the enclosing class) */
private static void createTable() throws IOException {
    HBaseAdmin admin = new HBaseAdmin(getConfiguration());
    if (admin.tableExists(TABLE_NAME)) {
        System.out.println("The table already exists!");
    } else {
        HTableDescriptor tableDesc = new HTableDescriptor(TABLE_NAME);
        tableDesc.addFamily(new HColumnDescriptor(FAMILY_NAME));
        admin.createTable(tableDesc);
        System.out.println("Create table success!");
    }
}
(2) Drop a table
/** Drop a table */
private static void dropTable(String tableName) throws IOException {
    HBaseAdmin admin = new HBaseAdmin(getConfiguration());
    if (admin.tableExists(tableName)) {
        try {
            admin.disableTable(tableName);
            admin.deleteTable(tableName);
            // Report success only after both steps have completed
            System.out.println("Delete " + tableName + " success!");
        } catch (IOException e) {
            e.printStackTrace();
            System.out.println("Delete " + tableName + " failed!");
        }
    }
}
3.4 DML Operations with HTable
(1) Add a record
public static void putRecord(String tableName, String row,
        String columnFamily, String column, String data) throws IOException {
    HTable table = new HTable(getConfiguration(), tableName);
    Put p1 = new Put(Bytes.toBytes(row));
    p1.add(Bytes.toBytes(columnFamily), Bytes.toBytes(column), Bytes.toBytes(data));
    table.put(p1);
    table.close(); // release the resources held by the HTable
    System.out.println("put '" + row + "','" + columnFamily + ":" + column + "','" + data + "'");
}
(2) Read a record
public static void getRecord(String tableName, String row) throws IOException {
    HTable table = new HTable(getConfiguration(), tableName);
    Get get = new Get(Bytes.toBytes(row));
    Result result = table.get(get);
    table.close();
    System.out.println("Get: " + result);
}
(3) Full table scan
public static void scan(String tableName) throws IOException {
    HTable table = new HTable(getConfiguration(), tableName);
    Scan scan = new Scan();
    ResultScanner scanner = table.getScanner(scan);
    for (Result result : scanner) {
        System.out.println("Scan: " + result);
    }
    scanner.close();
    table.close();
}
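A scan returns rows in row-key order, and that order is a byte-by-byte comparison of the keys, not a numeric or locale-aware one. The sketch below reproduces that ordering with plain JDK code, assuming keys are UTF-8 strings as produced by Bytes.toBytes(String). RowOrderSketch and the sample keys are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// JDK-only illustration of byte-wise row-key ordering.
class RowOrderSketch {
    static byte[] toBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length; // a shorter key sorts before its extensions
    }

    /** The order in which a full scan would visit the given keys. */
    static List<String> scanOrder(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort((p, q) -> compare(toBytes(p), toBytes(q)));
        return sorted;
    }

    public static void main(String[] args) {
        // "row10" lands before "row9" because '1' < '9' at the first differing byte
        System.out.println(scanOrder(Arrays.asList("row9", "row10", "row1")));
    }
}
```

This is why numeric parts of a row key are usually zero-padded or written in a fixed-width format such as yyyyMMddHHmmss.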
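3.5 API in Practice: Loading Detail Records into HBase
Using the mobile internet access log from part 5 of these notes, "Handling Mobile Internet Logs with Custom Types", as the background, the task is to import the log into HBase through MapReduce for storage. In the code below, the first tab-separated field of each log line is a millisecond timestamp and the second is the phone number (msisdn). (The log file can be downloaded from: http://pan.baidu.com/s/1dDzqHWX)
(1) Create a table in HBase through the shell: wlan_log
> create 'wlan_log','cf'
To keep things simple, only one column family, cf, is defined here.
(2) Create a new class in Eclipse, BatchImportJob, with the following code: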
package hbase;
import java.text.SimpleDateFormat;
import java.util.Date;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
public class BatchImportJob {
    static class BatchImportMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
        SimpleDateFormat dateformat1 = new SimpleDateFormat("yyyyMMddHHmmss");
        Text v2 = new Text();
        protected void map(LongWritable key, Text value, Context context) throws java.io.IOException, InterruptedException {
            final String[] splited = value.toString().split("\t");
            try {
                final Date date = new Date(Long.parseLong(splited[0].trim()));
                final String dateFormat = dateformat1.format(date);
                String rowKey = splited[1] + ":" + dateFormat;
                v2.set(rowKey + "\t" + value.toString());
                context.write(key, v2);
            } catch (NumberFormatException e) {
                final Counter counter = context.getCounter("BatchImportJob", "ErrorFormat");
                counter.increment(1L);
                System.out.println("Error: " + splited[0] + " " + e.getMessage());
            }
        }
    }
    static class BatchImportReducer extends TableReducer<LongWritable, Text, NullWritable> {
        protected void reduce(LongWritable key, Iterable<Text> values, Context context) throws java.io.IOException, InterruptedException {
            for (Text text : values) {
                final String[] splited = text.toString().split("\t");
                final Put put = new Put(Bytes.toBytes(splited[0]));
                put.add(Bytes.toBytes("cf"), Bytes.toBytes("date"), Bytes.toBytes(splited[1]));
                put.add(Bytes.toBytes("cf"), Bytes.toBytes("msisdn"), Bytes.toBytes(splited[2]));
                // Other fields omitted; call put.add(...) for each of them
                context.write(NullWritable.get(), put);
            }
        }
    }
    public static void main(String[] args) throws Exception {
        final Configuration configuration = new Configuration();
        // Set the ZooKeeper quorum
        configuration.set("hbase.zookeeper.quorum", "hadoop-master");
        // Set the HBase table name
        configuration.set(TableOutputFormat.OUTPUT_TABLE, "wlan_log");
        // Raise this value so that HBase does not exit on a timeout
        configuration.set("dfs.socket.timeout", "180000");
        final Job job = new Job(configuration, "HBaseBatchImportJob");
        job.setMapperClass(BatchImportMapper.class);
        job.setReducerClass(BatchImportReducer.class);
        // Set the map output types; the reduce output types are not set
        job.setMapOutputKeyClass(LongWritable.class);
        job.setMapOutputValueClass(Text.class);
        job.setInputFormatClass(TextInputFormat.class);
        // No output path is set; the output format class is set instead
        job.setOutputFormatClass(TableOutputFormat.class);
        FileInputFormat.setInputPaths(job, "hdfs://hadoop-master:9000/testdir/input/HTTP_20130313143750.dat");
        boolean success = job.waitForCompletion(true);
        if (success) {
            System.out.println("Batch import to HBase success!");
            System.exit(0);
        } else {
            System.out.println("Batch import to HBase failed!");
            System.exit(1);
        }
    }
}
After the job has run, check the import result in HBase with the shell command (list).
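The key step in the mapper is building the row key as phone number, a colon, and the timestamp formatted as yyyyMMddHHmmss. That logic can be tried out without a cluster. The sketch below isolates it in a hypothetical class RowKeySketch; the sample log line is invented, and the time zone is pinned to UTC purely so the output is reproducible (the original job uses the JVM default zone).

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// JDK-only version of the mapper's row-key logic, for experimenting offline.
class RowKeySketch {
    /**
     * Builds "msisdn:yyyyMMddHHmmss" from a tab-separated log line whose first field
     * is a millisecond timestamp and whose second field is the phone number.
     * Returns null for lines the mapper would count as ErrorFormat.
     */
    static String rowKey(String line) {
        String[] fields = line.split("\t");
        if (fields.length < 2) {
            return null; // extra guard, not present in the original mapper
        }
        try {
            long millis = Long.parseLong(fields[0].trim());
            SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmmss");
            format.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: fixed zone
            return fields[1] + ":" + format.format(new Date(millis));
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Invented sample line: timestamp, phone number, one more field
        System.out.println(rowKey("1363157985066\t13600217502\texample-field"));
        System.out.println(rowKey("not-a-number\t13600217502"));
    }
}
```

Putting the phone number first keeps all records of one number adjacent in the table, which is what makes the range queries in the next step possible.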
(3) Create a new class in Eclipse, MobileLogQueryApp, that queries the wlan_log data already stored, with the following code:
package hbase;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;
public class MobileLogQueryApp {
    private static final String TABLE_NAME = "wlan_log";
    private static final String FAMILY_NAME = "cf";
    /** Basic usage example of the HBase Java API */
    public static void main(String[] args) throws Exception {
        scan(TABLE_NAME, "13600217502");
        System.out.println();
        scanPeriod(TABLE_NAME, "136");
    }
    /** Query all internet access records of one phone number, e.g. 13600217502 */
    public static void scan(String tableName, String mobileNum) throws IOException {
        HTable table = new HTable(getConfiguration(), tableName);
        Scan scan = new Scan();
        // Row keys look like "<number>:<yyyyMMddHHmmss>"; '/' sorts just before '0' and ':' just after '9'
        scan.setStartRow(Bytes.toBytes(mobileNum + ":/"));
        scan.setStopRow(Bytes.toBytes(mobileNum + "::"));
        ResultScanner scanner = table.getScanner(scan);
        int i = 0;
        for (Result result : scanner) {
            System.out.println("Scan: " + i + " " + result);
            i++;
        }
        scanner.close();
        table.close();
    }
    /** Query all internet access records of one number segment, e.g. 136 */
    public static void scanPeriod(String tableName, String period) throws IOException {
        HTable table = new HTable(getConfiguration(), tableName);
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes(period + "/"));
        scan.setStopRow(Bytes.toBytes(period + ":"));
        scan.setMaxVersions(1);
        ResultScanner scanner = table.getScanner(scan);
        int i = 0;
        for (Result result : scanner) {
            System.out.println("Scan: " + i + " " + result);
            i++;
        }
        scanner.close();
        table.close();
    }
    /** Get the HBase configuration */
    private static Configuration getConfiguration() {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.rootdir", "hdfs://hadoop-master:9000/hbase");
        // Required when running from Eclipse; otherwise the cluster cannot be located
        conf.set("hbase.zookeeper.quorum", "hadoop-master");
        return conf;
    }
}
Two queries are performed here: one by a specific phone number, and one by a phone number segment (prefix).
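Both queries rely on a property of ASCII: '/' (0x2F) sorts immediately before '0' and ':' (0x3A) immediately after '9', so a start row ending in '/' and a stop row ending in ':' bracket every key that continues with a digit. The sketch below checks this on a sorted set of sample keys. It is a JDK-only illustration: ScanRangeSketch is hypothetical, all keys except the number 13600217502 are invented, and String ordering is used as a stand-in for byte ordering, which is valid only because the keys are pure ASCII.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Models a range scan over sorted row keys with a TreeSet.
class ScanRangeSketch {
    /** Keys k with start <= k < stop: the start row is inclusive, the stop row exclusive. */
    static List<String> scan(TreeSet<String> keys, String start, String stop) {
        return new ArrayList<>(keys.subSet(start, true, stop, false));
    }

    public static void main(String[] args) {
        TreeSet<String> keys = new TreeSet<>(Arrays.asList(
                "13600217502:20130313143750",
                "13600217502:20130313143811",
                "13600217999:20130313143755",
                "13726230503:20130313143801"));

        // All records of one phone number
        String number = "13600217502";
        System.out.println(scan(keys, number + ":/", number + "::"));

        // All records of the 136 segment
        String segment = "136";
        System.out.println(scan(keys, segment + "/", segment + ":"));
    }
}
```

The first call returns only the two keys of 13600217502, and the second returns the three keys that start with 136 followed by a digit. The trick depends on the character after the prefix always being a digit; for arbitrary prefixes a more general stop row would be needed.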
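Author: 周旭龍 (Zhou Xulong)
Copyright of this article is shared by the author and cnblogs (博客園). Reposting is welcome, but unless the author agrees otherwise, this notice must be kept and a link to the original article must be placed prominently on the page.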