HBase实现谷粒微博案例
HBase實(shí)現(xiàn)谷粒微博案例
- 前言
- 一、啟動(dòng)集群
- 二、功能實(shí)現(xiàn)
- 1.創(chuàng)建工程
- 2.constants包
- 3.utils包
- 3.1 createNameSpace 創(chuàng)建命名空間
- 3.2 isTableExist 判斷表是否存在
- 3.3 createTable 創(chuàng)建表
- 4.dao包
- 4.1 發(fā)微博功能
- 4.2 關(guān)注功能
- 4.3 取消關(guān)注
- 4.4 獲得用戶初始頁(yè)
- 4.5 獲得用戶全部微博內(nèi)容
- 5 test包 測(cè)試
- 總結(jié)
- 參考
前言
最近剛剛在b站上看完了尚硅谷hbase相關(guān)課程,在這里記錄一下,完成這個(gè)項(xiàng)目遇到的坑,以及案例結(jié)果(而且不要吐槽的英語(yǔ)注釋,我是個(gè)英語(yǔ)渣,大家盡量意會(huì)一下…)。如果博客有什么不對(duì)的地方,歡迎大佬指正。
一、啟動(dòng)集群
由于HBase是基于hadoop的一個(gè)非結(jié)構(gòu)化數(shù)據(jù)庫(kù),所以需要啟動(dòng)hadoop集群,并且還需要啟動(dòng)我們的“潤(rùn)滑器” zookeeper。
我使用的版本是
CentOS 6
Hadoop 3
HBase 3
Zookeeper 3 (每一個(gè)虛擬機(jī)都要啟動(dòng)zookeeper)
虛擬機(jī)一共三臺(tái),分別是hadoop102,hadoop103,hadoop104
編譯器使用的是IDEA。
二、功能實(shí)現(xiàn)
1.創(chuàng)建工程
這里我只是給出代碼以及一些相關(guān)解釋,具體的需求分析以及更加詳細(xì)的解釋,可以看后面的參考鏈接,我將會(huì)給出b站尚硅谷視頻的鏈接,有興趣的可以去看看。
打開(kāi)IDEA,創(chuàng)建一個(gè)新的maven工程,然后進(jìn)行pom.xml格式如下。這里是需要導(dǎo)入相關(guān)的依賴包,以及各種依賴的版本,build一下就會(huì)開(kāi)始進(jìn)行下載。
然后在工程中創(chuàng)建5個(gè)文件夾(其中一個(gè)是空的,也可以不創(chuàng)建),后續(xù)需要添加代碼。如圖:
ok,現(xiàn)在我們就可以快樂(lè)的編寫(xiě)代碼了。
2.constants包
在這里我們定義的是一些常量,例如是命名空間、表明、列名等,以便后面我們需要使用的時(shí)候反復(fù)重寫(xiě),防止寫(xiě)錯(cuò),也可以進(jìn)行一個(gè)解耦合。
package constants;import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.hbase.*;public class Constants {public static Configuration CONFIGURATION = null;// HBase setting 這里不知道為什么要這要寫(xiě)才可以,視頻里面沒(méi)有寫(xiě)這個(gè)也可以運(yùn)行,// 但是我就會(huì)報(bào)錯(cuò),說(shuō)無(wú)法連接,希望有大佬可以解釋一下。// 那個(gè)set里面第二個(gè)放的是你們集群的名稱,可能會(huì)不一樣,看你們自己的命名了。static {CONFIGURATION = HBaseConfiguration.create();CONFIGURATION.set("hbase.zookeeper.quorum", "hadoop102,hadoop103,hadoop104");}// HBase namespace 這個(gè)是命名空間public static final String NAMESPACE = "weibo";// weibo content table 這個(gè)是內(nèi)容表public static final String CONTENT_TABLE = "weibo:content";public static String CONTENT_TABLE_CF = "info";public static int CONTENT_TABLE_VERSIONS = 1;// user relationship table 這個(gè)是關(guān)系表public static final String RELATION_TABLE = "weibo:relation";public static final String RELATION_TABLE_CF1 = "attends";public static final String RELATION_TABLE_CF2 = "fans";public static final int RELATION_TABLE_VERSIONS = 1;// receipt email box 這個(gè)是收件表public static final String INBOX_TABLE = "weibo:inbox";public static final String INBOX_TABLE_CF = "info";public static final int INBOX_TABLE_VERSIONS = 2;}3.utils包
這里主要是進(jìn)行DDL的操作,例如創(chuàng)建命名空間,以及對(duì)表之類的操作。
代碼總覽,下面再分段解釋:
3.1 createNameSpace 創(chuàng)建命名空間
// 1 create namespacepublic static void createNameSpace(String namespace) throws IOException {// 1 get the connection object 創(chuàng)建一個(gè)連接的對(duì)象Connection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);// 2 get the admin object 創(chuàng)建一個(gè)admin的對(duì)象,// 對(duì)hbase操作基本上通admin來(lái)操作Admin admin = connection.getAdmin();// 1 create namespace description 創(chuàng)建一個(gè)命名空間的解釋器NamespaceDescriptor build = NamespaceDescriptor.create(namespace).build();// 2 create namespace 創(chuàng)建命名空間admin.createNamespace(build);// 3 close resource 關(guān)閉資源admin.close();connection.close();}3.2 isTableExist 判斷表是否存在
// 2 judge the table exist or notprivate static boolean isTableExist(String tableName) throws IOException {// 1 get the connection objectConnection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);// 2 get the admin objectAdmin admin = connection.getAdmin();// 3 judge the table exist or not. // tableExists這個(gè)函數(shù)是需要傳入TableName參數(shù)的,所以需要轉(zhuǎn)換。boolean result = admin.tableExists(TableName.valueOf(tableName));// 4 close resourceadmin.close();connection.close();// return valuereturn result;}3.3 createTable 創(chuàng)建表
這一塊蠻重要的,但是也是固定的套路。
// 3 create tablepublic static void createTable(String tableName, int versions, String... cfs) throws IOException {// 1 judge the command value 判斷一下傳入的參數(shù)是否小于等于0if (cfs.length <= 0) {System.out.println("column information is not available.");return ;}// 2 judge the table exist or not 判斷表是否存在if (isTableExist(tableName)) {System.out.println(tableName + "table is exist.");return;}// 3 get the connection objectConnection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);// 4 get the admin objectAdmin admin = connection.getAdmin();// 5 create table description // 創(chuàng)建表描述器,對(duì)表的操作都是通過(guò)描述器來(lái)操作。HTableDescriptor hTableDescriptor = new HTableDescriptor(TableName.valueOf(tableName));// 6 cycle add column informationfor (String cf : cfs) {// 這里是創(chuàng)建一個(gè)列描述器HColumnDescriptor hColumnDescriptor = new HColumnDescriptor(cf);// 7 set the versionhColumnDescriptor.setMaxVersions(versions);// 將列描述器加入到表描述器當(dāng)中,也就是把列族加進(jìn)去。hTableDescriptor.addFamily(hColumnDescriptor);}// 8 create tableadmin.createTable(hTableDescriptor);// close resourceadmin.close();connection.close();}4.dao包
這里是一些DML的操作,例如一些在表中插入數(shù)據(jù),或者是刪除數(shù)據(jù)的操作。
老規(guī)矩,先來(lái)一個(gè)代碼總覽(比較長(zhǎng)…):
4.1 發(fā)微博功能
這一個(gè)功能比較難,因?yàn)椴粌H對(duì)于當(dāng)前發(fā)布微博的人的內(nèi)容表需要進(jìn)行更新,并且需要對(duì)當(dāng)前這個(gè)人的粉絲的收件表也需要更新,因?yàn)榉劢z需要看到關(guān)注的人的最新動(dòng)態(tài)。所以這里需要對(duì)兩張表,多個(gè)人進(jìn)行操作。
// 1 publish the weibopublic static void publishWeiBo(String uid, String content) throws IOException {// 1 get the connection objectConnection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);// 2 get the admin objectAdmin admin = connection.getAdmin();// 1 operate weibo content table 對(duì)微博內(nèi)容表進(jìn)行操作// 1.1 get the weibo content table objectTable conTable = connection.getTable(TableName.valueOf(Constants.CONTENT_TABLE));// 1.2 get the current timelong ts = System.currentTimeMillis();// 1.3 get the RowkeyString rowKey = uid + "_" + ts;// 1.4 create put object 將需要put的內(nèi)容對(duì)象先創(chuàng)建好Put conPut = new Put(Bytes.toBytes(rowKey));// 1.5 Assignment the put object 給put對(duì)象賦值conPut.addColumn(Bytes.toBytes(Constants.CONTENT_TABLE_CF), Bytes.toBytes("content"), Bytes.toBytes(content));// 1.6 execute insert data operation// 執(zhí)行put操作,即對(duì)表進(jìn)行插入數(shù)據(jù)conTable.put(conPut);// 2 operate weibo email box 操作微博收件表// 2.1 get user relation table objectTable relaTable = connection.getTable(TableName.valueOf(Constants.RELATION_TABLE));// 2.2 get the fans's information of the person who publish weibo in current// 得到最近發(fā)布微博的被關(guān)注者的粉絲的信息Get get = new Get(Bytes.toBytes(uid));get.addFamily(Bytes.toBytes(Constants.RELATION_TABLE_CF2));Result result = relaTable.get(get);// 2.3 create a map, in order to store the put object of weibo content table// 創(chuàng)建一個(gè)集合為了存儲(chǔ)微博的內(nèi)容ArrayList<Put> inboxPuts = new ArrayList<>();// 2.4 traverse the fans list// 遍歷uid的粉絲for (Cell cell : result.rawCells()) {// 2.5 create the put object of weibo infobox table// 創(chuàng)建粉絲的單元格Put inboxPut = new Put(CellUtil.cloneQualifier(cell));// 2.6 assignment the information to the inbox table// 對(duì)粉絲的收件表進(jìn)行賦值,更新關(guān)注的人的最新微博inboxPut.addColumn(Bytes.toBytes(Constants.INBOX_TABLE_CF), Bytes.toBytes(uid), Bytes.toBytes(rowKey));// 2.7 store the put object into the inbox tableinboxPuts.add(inboxPut);}// 2.8 judge fans exist or not// 判斷當(dāng)前發(fā)布微博的人是否有粉絲if (inboxPuts.size() > 0) {// get the inbox objectTable inboxTable = connection.getTable(TableName.valueOf(Constants.INBOX_TABLE));// execute the insert operationinboxTable.put(inboxPuts);// close the inbox tableinboxTable.close();}// close the resourcerelaTable.close();conTable.close();connection.close();}4.2 關(guān)注功能
關(guān)注某個(gè)人之后,需要獲得這個(gè)人的最近發(fā)布的微博,以及在微博關(guān)系表當(dāng)中需要添加新的關(guān)系。代碼也比較長(zhǎng)…
// 2 focus the userpublic static void addAttends(String uid, String... attends) throws IOException {// judge if or not exist attendsif (attends.length <= 0) {System.out.println("please choose attention person");return ;}// get the connection objectConnection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);// first, operate user relationship table// 操作微博關(guān)系表// 1 get the user relationship table objectTable relaTable = connection.getTable(TableName.valueOf(Constants.RELATION_TABLE));// 2 create a map, in order to store a put object of user relationship table// 創(chuàng)建關(guān)注的集合,因?yàn)橛锌赡芤淮涡躁P(guān)注多個(gè)人。ArrayList<Put> relaPuts = new ArrayList<>();// 3 create an operator put objectPut uidPut = new Put(Bytes.toBytes(uid));// 4 cycle create a put object of attended personfor (String attend : attends) {// 5 assignment value to operator's put objectuidPut.addColumn(Bytes.toBytes(Constants.RELATION_TABLE_CF1), Bytes.toBytes(attend), Bytes.toBytes(attend));// 6 create a put object of attended person// 這是被關(guān)注者對(duì)象Put attendPut = new Put(Bytes.toBytes(attend));// 7 assignment value to attended person's put objectattendPut.addColumn(Bytes.toBytes(Constants.RELATION_TABLE_CF2), Bytes.toBytes(uid), Bytes.toBytes(uid));// 8 put the attended person's put object into maprelaPuts.add(attendPut);}// 9 add operator's put object into maprelaPuts.add(uidPut);// 10 add data into user relationship tablerelaTable.put(relaPuts);// second operate inbox table// 1 get the weibo content table objectTable contTable = connection.getTable(TableName.valueOf(Constants.CONTENT_TABLE));// 2 create put object of inbox tablePut inboxPut = new Put(Bytes.toBytes(uid));// 3 cycle attends, get the every attended person current published weibofor (String attend : attends) {// 4 get the attended person's current published weibo -> map resultScanner// 獲得被關(guān)注者最近發(fā)布的微博Scan scan = new Scan(Bytes.toBytes(attend + "_"), Bytes.toBytes(attend + "|"));ResultScanner resultScanner = contTable.getScanner(scan);// define a time stamplong ts = System.currentTimeMillis();// 5 traverse the valuefor (Result result : resultScanner) {// 6 assignment the value into the inbox table// 循環(huán)放入粉絲的收件表中inboxPut.addColumn(Bytes.toBytes(Constants.INBOX_TABLE_CF), Bytes.toBytes(attend), ts++, result.getRow());}}// judge the put object exist or notif (!inboxPut.isEmpty()) {// get the inbox table objectTable inboxTable = connection.getTable(TableName.valueOf(Constants.INBOX_TABLE));// insert datainboxTable.put(inboxPut);// close inboxTable connectioninboxTable.close();}// close resourcerelaTable.close();contTable.close();connection.close();}4.3 取消關(guān)注
// 3 delete attentionpublic static void deleteAttends(String uid, String... dels) throws IOException {if (dels.length <= 0) {System.out.println("please choose deletion person");return;}// get the connection objectConnection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);// first operate user relationship table// 1 get the user relationship table objectTable relatTable = connection.getTable(TableName.valueOf(Constants.RELATION_TABLE));// 2 create a map, in order to store the delete object of user relationship tableArrayList<Delete> relatDelete = new ArrayList<>();// 3 create the delete object of operatorDelete uidDelete = new Delete(Bytes.toBytes(uid));// 4 cycle create the delete object of delete attentionfor (String del : dels) {// 5 assignment value to operator's delete objectuidDelete.addColumns(Bytes.toBytes(Constants.RELATION_TABLE_CF1), Bytes.toBytes(del));// 6 create the conceal person's(取關(guān)者...) deletion objectDelete delDelte = new Delete(Bytes.toBytes(del));// 7 assignment value to the conceal person's deletion objectdelDelte.addColumns(Bytes.toBytes(Constants.RELATION_TABLE_CF2), Bytes.toBytes(uid));// 8 add the conceal person's deletion object into the maprelatDelete.add(delDelte);}// 9 add the operator's deletion object into the maprelatDelete.add(uidDelete);// 10 delete the data in user relationship tablerelatTable.delete(relatDelete);// second operate inbox table// 1 get the inbox table objectTable inboxTable = connection.getTable(TableName.valueOf(Constants.INBOX_TABLE));// 2 create the deletion object of operatorDelete inboxDelete = new Delete(Bytes.toBytes(uid));// 3 assignment value to operator's deletion objectfor (String del : dels) {inboxDelete.addColumns(Bytes.toBytes(Constants.INBOX_TABLE_CF), Bytes.toBytes(del));}// 4 delete data in inbox tableinboxTable.delete(inboxDelete);// close resourcerelatTable.close();inboxTable.close();connection.close();}4.4 獲得用戶初始頁(yè)
// 4 get the initial page datapublic static void getInit(String uid) throws IOException {// 1 get the connection objectConnection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);// 2 get inbox table objectTable inboxTable = connection.getTable(TableName.valueOf(Constants.INBOX_TABLE));// 3 get the content table objectTable conTable = connection.getTable(TableName.valueOf(Constants.CONTENT_TABLE));// 4 create inbox get object, and get the data max versionGet inboxGet = new Get(Bytes.toBytes(uid));inboxGet.setMaxVersions();Result result = inboxTable.get(inboxGet);// 5 traverse the datafor (Cell cell : result.rawCells()) {// 6 construct the content table get objectGet conGet = new Get(CellUtil.cloneValue(cell));// 7 get the object dataResult conResult = conTable.get(conGet);// 8 print datafor (Cell conCell : conResult.rawCells()) {System.out.println("RK:" + Bytes.toString(CellUtil.cloneRow(conCell)) +", CF:" + Bytes.toString(CellUtil.cloneFamily(conCell)) +", CN:" + Bytes.toString(CellUtil.cloneQualifier(conCell)) +", Value:" + Bytes.toString(CellUtil.cloneValue(conCell)));}}// 9 close resourceinboxTable.close();conTable.close();connection.close();}4.5 獲得用戶全部微博內(nèi)容
// 5 get the all the content of somebody weibopublic static void getWeibo(String uid) throws IOException {// 1 get the connectionConnection connection = ConnectionFactory.createConnection(Constants.CONFIGURATION);// 2 get the weibo content tableTable conTable = connection.getTable(TableName.valueOf(Constants.CONTENT_TABLE));// construct scan objectScan scan = new Scan();// construct filterRowFilter rowFilter = new RowFilter(CompareFilter.CompareOp.EQUAL, new SubstringComparator(uid + "_"));scan.setFilter(rowFilter);// 4 get the dataResultScanner resultScanner = conTable.getScanner(scan);// 5 printf datafor (Result result : resultScanner) {for (Cell cell : result.rawCells()) {System.out.println("RK:" + Bytes.toString(CellUtil.cloneRow(cell)) +", CF:" + Bytes.toString(CellUtil.cloneFamily(cell)) +", CN:" + Bytes.toString(CellUtil.cloneQualifier(cell)) +", Value:" + Bytes.toString(CellUtil.cloneValue(cell)));}}// 6 close resourceconTable.close();connection.close();}5 test包 測(cè)試
在test包當(dāng)中,編寫(xiě)測(cè)試代碼,相對(duì)簡(jiǎn)單。
package test;import constants.Constants; import dao.HBaseDao; import utils.HBaseUtil;import java.io.IOException;public class TestWeiBo {public static void init() {try {// create namespaceHBaseUtil.createNameSpace(Constants.NAMESPACE);// create weibo content tableHBaseUtil.createTable(Constants.CONTENT_TABLE, Constants.CONTENT_TABLE_VERSIONS, Constants.CONTENT_TABLE_CF);// create user relationship tableHBaseUtil.createTable(Constants.RELATION_TABLE, Constants.RELATION_TABLE_VERSIONS,Constants.RELATION_TABLE_CF1, Constants.RELATION_TABLE_CF2);// create inbox tableHBaseUtil.createTable(Constants.INBOX_TABLE, Constants.INBOX_TABLE_VERSIONS, Constants.INBOX_TABLE_CF);} catch (IOException e) {e.printStackTrace();}}public static void main(String[] args) throws IOException, InterruptedException {// initializeinit();System.out.println("init done");// 1001 publish weiboHBaseDao.publishWeiBo("1001","這是1001的微博,Hello World");System.out.println("publish done");// 1002 attend 1001, 1003HBaseDao.addAttends("1002", "1001", "1003");System.out.println("attend done");// get the 1002 initial pageHBaseDao.getInit("1002");System.out.println("*********************111**********************");// 1003 publish 3 weibo, at the same time 1002 publish 2 weiboHBaseDao.publishWeiBo("1003", "快結(jié)束了HBase了!!");Thread.sleep(10);HBaseDao.publishWeiBo("1001", "那你還是個(gè)弟弟啊!!");Thread.sleep(10);HBaseDao.publishWeiBo("1003", "哦,不我要當(dāng)哥哥!!");Thread.sleep(10);HBaseDao.publishWeiBo("1001", "弟弟!!");Thread.sleep(10);HBaseDao.publishWeiBo("1003", "叫哥哥!!");// get 1002 initial pageHBaseDao.getInit("1002");System.out.println("*********************222***********************");// 1002 conceal 1003HBaseDao.deleteAttends("1002", "1003");// get the 1002 initial pageHBaseDao.getInit("1002");System.out.println("*********************333***********************");// 1002 attend 1003 againHBaseDao.addAttends("1002", "1003");// get the 1002 initial pageHBaseDao.getInit("1002");System.out.println("*********************444***********************");// get 1001 detail initial pageHBaseDao.getWeibo("1001");System.out.println("*********************555***********************");} }總結(jié)
不知道為什么在configuration配置的時(shí)候總是存在問(wèn)題,最后參考以前寫(xiě)的測(cè)試API的代碼,加上了configuration.set(),才可以跑通。否則一直顯示的是連接超時(shí),不存在zookeeper的master。具體問(wèn)題還不清楚,感覺(jué)像是代碼找不到連接zookeeper的端口了。
最后我這個(gè)弱雞終于成功跑通了代碼,感覺(jué)還是蠻不錯(cuò)的,記錄一下遇到的坑。
1、編寫(xiě)代碼的時(shí)候,需要記住addColumns()函數(shù)里面是需要對(duì)列進(jìn)行操作,好幾次寫(xiě)成了table,忘了加CF了,然后報(bào)錯(cuò)。并且還需要轉(zhuǎn)換成字節(jié)的形式。
2、需要弄清楚業(yè)務(wù)的邏輯,因?yàn)槔锩嬷皇窃O(shè)計(jì)了三張表所以邏輯相對(duì)簡(jiǎn)單。但是盡可能的還是多考慮考慮一些解耦合的操作或者是需求分析的時(shí)候?qū)?chuàng)建表考慮清楚。
最后,加油吧,諸君。
參考
b站尚硅谷的hbase視頻
總結(jié)
以上是生活随笔為你收集整理的HBase实现谷粒微博案例的全部?jī)?nèi)容,希望文章能夠幫你解決所遇到的問(wèn)題。
- 上一篇: CAS客户端整合(三) Otrs
- 下一篇: 专家视角 | 龚健雅院士:当“传统”遥感