Commands for operating HBase from shell scripts
0. Enter the HBase shell
./hbase shell
help
help "get"   # show the help for a single command
2. DDL (Data Definition Language) commands
1. Create a table
create 'table_name','column_family_1','column_family_2','column_family_3'
create 'member','member_id','address','info'
1.1 Create a namespace (a namespace is roughly equivalent to a database)
hbase(main):020:0> create_namespace 'cm'
1.2 Create a table and a column family
hbase(main):020:0> create 'cm:mydemo','base'
2. List all tables
list            # list every table
list 'abc.*'    # list tables whose names start with abc
3. Get a table's description
describe 'table_name'
Table play_error_file is ENABLED
play_error_file
column families description
{NAME => 'cf',
BLOOMFILTER => 'ROW',
# Depends on the application: whether lookups need to be precise to the rowkey or to the column.
# A bloom filter helps find which HFile within a region holds a record. The more HFiles a region has,
# the more the bloom filter helps. It suits applications where compaction cannot keep up with the flush rate.
VERSIONS => '1',
# Usually 3; can be set to 1 for applications that update frequently.
IN_MEMORY => 'false',
KEEP_DELETED_CELLS => 'FALSE',
DATA_BLOCK_ENCODING => 'NONE',
TTL => 'FOREVER',
COMPRESSION => 'NONE',
MIN_VERSIONS => '0',
BLOCKCACHE => 'true',
BLOCKSIZE => '65536',
REPLICATION_SCOPE => '0'}
4. Delete a column family: alter, disable, enable
disable 'member'   # the table must be disabled before a column family is deleted
alter 'member',{NAME=>'member_id',METHOD=>'delete'}
# after the deletion, enable 'member' again
enable 'member'
5. Drop a table
disable 'table_name'
drop 'table_name'
6. Check whether a table exists
exists 'table_name'
7. Check whether a table is enabled
is_enabled 'table_name'
3. DML (Data Manipulation Language) operations
1. Insert
Insert a value into row r1, column c1 of table ns1:t1 or t1; ts1 is the timestamp.
put 'ns1:t1','r1','c1','value'
or
put 't1','r1','c1','value'
or
put 't1','r1','c1','value',ts1
or
put 't1','r1','c1','value',{ATTRIBUTES=>{'mykey'=>'myvalue'}}
put 't1','r1','c1','value',ts1,{ATTRIBUTES=>{'mykey'=>'myvalue'}}
put 't1','r1','c1','value',ts1,{VISIBILITY=>'PRIVATE|SECRET'}
# t is a reference to table 't1'
t.put 'r1','c1','value',ts1,{ATTRIBUTES=>{'mykey'=>'myvalue'}}
put 'table_name','row_index','info:age','24'
put 'table_name','row_index','info:birthday','1987-06-17'
put 'table_name','row_index','info:company','tencent'
put 'table_name','row_index','address:country','china'
put 'table_name','row_index','address:province','china'
put 'table_name','row_index','address:city','shenzhen'
2. Get a single row
# get all data for one id
get 'table_name','row_index'
# get all data in one column family for one id
get 'table_name','row_index','info'
# get all data in one column of one column family for one id
get 'table_name','row_index','info:age'
3. Update a record
Change the company of row qy to qq:
put 'table_name','qy','info:company','qq'
4. Get two versions of the data by timestamp
# get the record whose company is tencent
get 'table_name','qy',{COLUMN=>'info:company',TIMESTAMP=>1321586238965}
# get the record whose company is qq
get 'table_name','qy',{COLUMN=>'info:company',TIMESTAMP=>1321586271843}
5. Full table scan
A scanner specification may include:
TIMERANGE,
FILTER,
LIMIT,
STARTROW (start row),
STOPROW (stop row),
ROWPREFIXFILTER (row prefix filter),
TIMESTAMP,
MAXLENGTH,
or COLUMNS,
CACHE,
or RAW,
VERSIONS
Full table scans are generally avoided: on a large table they are extremely slow.
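6. Delete records
# delete the 'info:age' field of the record whose id is temp
delete 'member','temp','info:age'
# delete the whole row
deleteall 'member','temp'
7. Count the rows in a table
count 'table_name'
or, given a reference t to table t1:
t.count
8. Truncate a table
truncate 'table_name'
HBase first disables the table, then drops it, and finally recreates it.
1.3 Add data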
hbase(main):025:0> put 'mydemo','001','base:name','zhangsanfen'
1.4 Get the data whose row key is 001
hbase(main):021:0> get 'mydemo','001',{COLUMN=>'base'}
1.5 Add a column family 'adv'
hbase(main):023:0> alter 'mydemo',{NAME=>'adv'}
1.6 Query one column from each of the two column families
hbase(main):026:0> get 'mydemo','001',{COLUMN=>['base:name','adv:like']}
1.7 View the table structure: desc 'table'
Change the number of versions kept so that up to 3 historical values can be looked up:
hbase(main):034:0> alter 'mydemo',{NAME=>'base',VERSIONS=>3}
1.8 After the value of name has been changed three times, look up its history with the following command
hbase(main):043:0> get 'mydemo','001',{COLUMN=>'base:name',VERSIONS=>3}
1.9 A get query requires a rowkey (substring: substring match)
Method 1 (substring match):
hbase(main):054:0> get 'mydemo','001',{FILTER=>"ValueFilter(=,'substring:120')"}
Method 2 (binary, exact match):
hbase(main):055:0> get 'mydemo','001',{FILTER=>"ValueFilter(=,'binary:zhangsanfeng')"}
2.0 Query the whole table with scan
hbase(main):054:0> scan 'mydemo',FILTER=>"ValueFilter(=,'substring:12')"
5. scan queries
1. Restricting a scan
scan 'qy',{COLUMNS=>'name'}
scan 'qy',{COLUMNS=>'name:gender'}
scan 'qy',{COLUMNS=>['name','foo']}
Limit the number of rows returned:
scan 'qy',{COLUMNS=>['name','foo'],LIMIT=>2}
Limit the time range:
scan 'qy',{FILTER=>"ValueFilter(=,'substring:120')", TIMERANGE=>[1448045892646,1448045892647]}
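QualifierFilter: column filter
QualifierFilter filters on the column name (qualifier), not on the column value.
scan 'qy',{FILTER=>"PrefixFilter('t') AND QualifierFilter(>=,'binary:b')"}
TimestampsFilter: timestamp filter
TimestampsFilter matches cells whose timestamp equals one of the listed values; it is not a range.
scan 'qy',{FILTER=>"TimestampsFilter(1448069941270,1548069941230)"}
scan 'qy',{FILTER=>"(QualifierFilter(>=,'binary:b')) AND (TimestampsFilter(1348069941270,1548069941270))"}
ColumnPaginationFilter
scan 'qy',{FILTER=>org.apache.hbase.filter.ColumnPaginationFilter.new(2,0)}
This fails with:
cannot load Java class org.apache.hbase.filter.ColumnPaginationFilter
The package name is wrong: the filter classes live in org.apache.hadoop.hbase.filter, so the class is org.apache.hadoop.hbase.filter.ColumnPaginationFilter.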
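Using filter classes in the hbase shell
1. Import the required classes
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
import org.apache.hadoop.hbase.util.Bytes
2. Run the command
scan 'tablename',STARTROW=>'start',COLUMNS=>['family:qualifier'],FILTER=>SingleColumnValueFilter.new(Bytes.toBytes('family'),Bytes.toBytes('qualifier'))
Note: the SingleColumnValueFilter constructor also takes a comparison operator and a comparison value, which the command above leaves out.
Running HBase commands from a shell script, method 1: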
exec hbase_home/bin/hbase shell <<EOF
status
create 'testtable','colfaml'
list 'testtable'
put 'testtable','myrow-1','colfaml:q1','value-1'
scan 'testtable'
disable 'testtable'
drop 'testtable'
EOF
However, exec replaces the script's own process with hbase shell, so the script ends when the here-document reaches EOF and nothing after it runs. This makes the form unsuitable for larger automation scripts.
Drawing on the official HBase documentation and some scripting know-how, the author wrote method 2, which works:
echo "status
create 'testtable','colfaml'
list 'testtable'
put 'testtable','myrow-1','colfaml:q1','value-1'
scan 'testtable'
disable 'testtable'
drop 'testtable'" | hbase_home/bin/hbase shell -n 2>&1
status=$?
echo "The status was " $status
if [ $status -eq 0 ]; then
echo "Functional test succeeded"
else
echo "Functional test failed"
fi
echo "Done"
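Note: 2>&1 redirects standard error to standard output. The exit status comes from the -n (non-interactive) flag: with it, hbase shell returns 0 when the commands succeed and a non-zero status when one fails.
Adding > /dev/null in front, that is hbase_home/bin/hbase shell -n > /dev/null 2>&1, sends the command output to /dev/null so that nothing is printed on the console; the exit status is still 0 on success and non-zero on failure.
This solves the automatic-exit problem and also lets the script check whether the commands succeeded.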
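The pattern above can be wrapped in a small helper. This is only a minimal sketch, assuming hbase is on the PATH and supports the -n flag. The name run_hbase_batch and its optional runner arguments are hypothetical and introduced here for illustration; the runner override exists only so that the helper can be tried on a machine without HBase.

```shell
#!/bin/bash
# run_hbase_batch COMMANDS [RUNNER...]
# Pipes COMMANDS into a non-interactive shell and returns that shell's exit status.
# RUNNER defaults to "hbase shell -n"; passing another command (an assumption made
# for illustration) lets the helper be exercised without HBase installed.
run_hbase_batch() {
  local commands=$1
  shift
  if [ $# -eq 0 ]; then
    set -- hbase shell -n
  fi
  # Output is discarded; the function's status is the status of the runner.
  printf '%s\n' "$commands" | "$@" > /dev/null 2>&1
}

# Example (needs a running HBase):
#   if run_hbase_batch "status"; then echo "ok"; else echo "failed"; fi
```

Because the helper does not use exec, the calling script keeps running after the batch finishes and can branch on the returned status.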
Batch-deleting HBase data from the shell
# Step 1: find the data to delete by timestamp
# Step 2: build the shell script that deletes the data
# Step 3: make delete_all_by_rowkey.sh executable and run it
#!/bin/bash -l
echo '-------------- program starts here ------------'
#  ${1} ns:table_name
#  ${2} columns
#  ${3} ttl
#  ${4} stop_date
#  ${5} start_time : if ${5} is not given, the start time defaults to 0

# hbase(main):002:0> import java.util.Date
# => Java::JavaUtil::Date
# hbase(main):003:0> Date.new(1341304672637).toString()
# => "Tue Jul 03 16:37:52 CST 2012"
# hbase(main):004:0> Date.new(1341301228326).toString()
# => "Tue Jul 03 15:40:28 CST 2012"
# In the shell, can a human-readable date be converted to a long?
# hbase(main):005:0> import java.text.SimpleDateFormat
# => Java::JavaText::SimpleDateFormat
# hbase(main):006:0> import java.text.ParsePosition
# => Java::JavaText::ParsePosition
# hbase(main):015:0> SimpleDateFormat.new("yy/MM/dd").parse("12/07/03",ParsePosition.new(0)).getTime()
# => 1341244800000

table_name=${1}
columns=${2}
ttl=${3}
stop_date=${4}
start_date=0
if [ $# -eq 5 ]; then
  start_date=${5}
fi
echo "table_name : ${table_name}
columns : ${columns}
ttl : ${ttl}
stop_date : ${stop_date}
start_date : ${start_date}
"
base_path=$(cd "$(dirname "$0")"; pwd)
echo '--------------- creating the cache folder --------------'
mkdir -p ${base_path}/cache_of_delete/${table_name}/
touch ${base_path}/cache_of_delete/${table_name}/rowkey.txt
touch ${base_path}/cache_of_delete/${table_name}/delete_all_by_rowkey.sh

# ####### Step 1: find the data to delete by timestamp
# Note: only the rowkey and one column are scanned, because the goal is just to find the rowkeys
echo "scan '${table_name}',{COLUMNS=>'${columns}', TIMERANGE=>[${start_date},${stop_date}]}" | hbase shell | grep 'column' | grep 'timestamp' | awk '{print $1}' > ${base_path}/cache_of_delete/${table_name}/rowkey.txt

# ###### Step 2: build the shell script that deletes the data
echo '#!/bin/bash -l' > ${base_path}/cache_of_delete/${table_name}/delete_all_by_rowkey.sh
echo 'exec hbase shell <<EOF' >> ${base_path}/cache_of_delete/${table_name}/delete_all_by_rowkey.sh
cat ${base_path}/cache_of_delete/${table_name}/rowkey.txt|awk '{print "deleteall '\'${table_name}\''", ",", "'\''"$1"'\''"}' >> ${base_path}/cache_of_delete/${table_name}/delete_all_by_rowkey.sh
# the closing EOF must have no trailing space, or it will not match the here-document delimiter
echo 'EOF' >> ${base_path}/cache_of_delete/${table_name}/delete_all_by_rowkey.sh

# ######## Step 3: make delete_all_by_rowkey.sh executable and run it
chmod +x ${base_path}/cache_of_delete/${table_name}/delete_all_by_rowkey.sh
#sh ${base_path}/cache_of_delete/${table_name}/delete_all_by_rowkey.sh >> ${base_path}/cache_of_delete/${table_name}/delete.log

# ##### Step 4: change the HBase TTL value
echo '#!/bin/bash -l' > ${base_path}/cache_of_delete/${table_name}/alter_ttl.sh
echo 'exec hbase shell <<EOF' >> ${base_path}/cache_of_delete/${table_name}/alter_ttl.sh
# double quotes are used so that the single quotes around the names reach hbase shell
echo "desc '${table_name}'" >> ${base_path}/cache_of_delete/${table_name}/alter_ttl.sh
echo "disable '${table_name}'" >> ${base_path}/cache_of_delete/${table_name}/alter_ttl.sh
echo "alter '${table_name}', {NAME=>'f', TTL=>'${ttl}'}" >> ${base_path}/cache_of_delete/${table_name}/alter_ttl.sh
echo "enable '${table_name}'" >> ${base_path}/cache_of_delete/${table_name}/alter_ttl.sh
echo "desc '${table_name}'" >> ${base_path}/cache_of_delete/${table_name}/alter_ttl.sh
echo 'EOF' >> ${base_path}/cache_of_delete/${table_name}/alter_ttl.sh
chmod +x ${base_path}/cache_of_delete/${table_name}/alter_ttl.sh
#sh ${base_path}/cache_of_delete/${table_name}/alter_ttl.sh
echo '--------------- deleting the cache folder --------------'
#rm -rf ${base_path}/cache_of_delete/${table_name}/delete_all_by_rowkey.sh
echo '-------------- program ends here ------------'
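Step 1 of the script depends on pulling row keys out of the scan output. The filter can be sketched on its own as below, assuming the usual scan output layout in which each cell line starts with the row key followed by column=... and timestamp=.... The function name extract_rowkeys is hypothetical, and the sketch assumes row keys contain no whitespace.

```shell
#!/bin/bash
# extract_rowkeys: reads hbase shell scan output on stdin and prints each
# distinct row key once. Header and summary lines are skipped because they
# contain neither "column=" nor "timestamp=".
# Assumption: row keys contain no spaces, so the key is the first field.
extract_rowkeys() {
  awk '/column=/ && /timestamp=/ { print $1 }' | sort -u
}

# Example (needs a running HBase):
#   echo "scan 't1',{COLUMNS=>'cf:c1'}" | hbase shell | extract_rowkeys > rowkey.txt
```

Deduplicating with sort -u keeps the generated delete script short when a row has several matching cells.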
Some Linux commands for time conversion:
- date +%s prints the current UNIX timestamp;
- converting a date and time to a timestamp in the shell:
      date -d "2015-08-04 00:00:00" +%s     output: 1438617600 (in the UTC+8 time zone)
- converting a timestamp back to a string:
      date -d @1438617600 "+%Y-%m-%d"     output: 2015-08-04
      date -d @1438617600 "+%Y-%m-%d %H:%M:%S"     output: 2015-08-04 00:00:00
- to get the date a few days before or after a given date:
      seconds=`date -d "2015-08-04 00:00:00" +%s`       # get the timestamp
      seconds_new=`expr $seconds + 86400`               # add the number of seconds in one day (86400)
      date_new=`date -d @$seconds_new "+%Y-%m-%d"`      # get the date one day after the given date
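The day arithmetic in the last item can be folded into a function. A minimal sketch, assuming GNU date (for -d) and a time zone without daylight-saving changes, since it simply adds multiples of 86400 seconds; shift_days is a hypothetical name.

```shell
#!/bin/bash
# shift_days DATE DAYS
# Prints the date that lies DAYS days after DATE (negative DAYS goes backwards).
# Assumption: every day has exactly 86400 seconds in the current time zone.
shift_days() {
  local seconds
  seconds=$(date -d "$1" +%s) || return 1
  date -d "@$(( seconds + $2 * 86400 ))" "+%Y-%m-%d"
}

# Example:
#   shift_days "2015-08-04 00:00:00" 1
#   shift_days "2015-08-04 00:00:00" -7
```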
date -d "1970-01-01 UTC 1287331200 seconds" "+%F %T"
date -d @timestamp "+%Y-%m-%d %H:%M:%S"
current="2015-03-11 12:33:41"
echo $current
timeStamp=`date -d "$current" +%s`      # convert current to a timestamp, in seconds
# convert to milliseconds; the millisecond part is taken from the current clock,
# and the 10# prefix stops a leading zero in %N from being read as octal
currentTimeStamp=$((timeStamp*1000+10#`date "+%N"`/1000000))
echo $currentTimeStamp
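HBase timestamps are in milliseconds, while date works in seconds, so the conversion is needed in both directions. A minimal sketch, assuming GNU date; date_to_millis and millis_to_date are hypothetical names, and the millisecond part is simply set to 000 or dropped.

```shell
#!/bin/bash
# date_to_millis "YYYY-MM-DD HH:MM:SS"
# Prints the date as milliseconds since the epoch (millisecond part is 000).
date_to_millis() {
  local seconds
  seconds=$(date -d "$1" +%s) || return 1
  echo $(( seconds * 1000 ))
}

# millis_to_date MILLIS
# Prints a millisecond timestamp as a readable local date; the sub-second
# part is discarded by the integer division.
millis_to_date() {
  date -d "@$(( $1 / 1000 ))" "+%Y-%m-%d %H:%M:%S"
}

# Example:
#   date_to_millis "2015-03-11 12:33:41"
#   millis_to_date 1341304672637
```

Both functions use the local time zone, so the same input gives different numbers on machines with different TZ settings.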
Deleting HBase records by timestamp
1. Query the records within a timestamp range
echo "scan 'event_log', { COLUMN => 'cf:sid', TIMERANGE => [1466265600272, 1471622400481]} " | hbase shell > ./record.txt
Here cf:sid holds the same value as the row key. The timestamp range has to be converted from dates yourself:
#current=`date "+%Y-%m-%d %H:%M:%S"`     # get the current time, e.g. 2015-03-11 12:33:41
current="2015-03-11 12:33:41"
echo $current
timeStamp=`date -d "$current" +%s`      # convert current to a timestamp, in seconds
# convert to milliseconds; the millisecond part is taken from the current clock,
# and the 10# prefix stops a leading zero in %N from being read as octal
currentTimeStamp=$((timeStamp*1000+10#`date "+%N"`/1000000))
echo $currentTimeStamp
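The conversion and the scan command from step 1 can be combined so that the range is given as readable dates. A minimal sketch, assuming GNU date; build_scan_cmd is a hypothetical name, and it only prints the command text so that it can be inspected before being piped to hbase shell. As far as I know, TIMERANGE includes the start timestamp and excludes the end timestamp.

```shell
#!/bin/bash
# build_scan_cmd TABLE COLUMN START_DATE STOP_DATE
# Prints an hbase shell scan command whose TIMERANGE is derived from two
# readable dates (converted to milliseconds in the local time zone).
build_scan_cmd() {
  local table=$1 column=$2 start_s stop_s
  start_s=$(date -d "$3" +%s) || return 1
  stop_s=$(date -d "$4" +%s) || return 1
  printf "scan '%s', {COLUMNS => '%s', TIMERANGE => [%s, %s]}\n" \
    "$table" "$column" "$(( start_s * 1000 ))" "$(( stop_s * 1000 ))"
}

# Example (needs a running HBase):
#   build_scan_cmd 'event_log' 'cf:sid' "2016-06-19 00:00:00" "2016-08-20 00:00:00" \
#     | hbase shell > ./record.txt
```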
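2. Use shell commands to extract the sid field from record.txt and build the HBase row-delete commands
cat record.txt|awk '{print "deleteall '\''event_log'\''", ",", "'\''"$1"'\''"}' > del.sh
3. Generate the HBase delete script
Add the following lines to the top and the bottom of del.sh, respectively:
#!/bin/sh
exec hbase shell <<EOF
and
EOF
4. Run the delete script
sh del.sh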
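Steps 2 and 3 can be done in one pass instead of editing del.sh by hand. The following is a sketch under stated assumptions: the input file holds one row key per line (already filtered from the scan output), row keys contain no single quotes or whitespace, and hbase shell supports -n. The name build_delete_script is hypothetical.

```shell
#!/bin/bash
# build_delete_script TABLE ROWKEY_FILE OUT_SCRIPT
# Writes OUT_SCRIPT: a script that feeds one deleteall command per row key
# to hbase shell through a here-document, then marks it executable.
build_delete_script() {
  local table=$1 rowkey_file=$2 out=$3
  {
    echo '#!/bin/bash'
    # Quoted delimiter: the generated commands are passed through unchanged.
    echo "hbase shell -n <<'EOF'"
    # \047 is the octal escape for a single quote; blank lines are skipped.
    awk -v t="$table" 'NF { printf "deleteall \047%s\047, \047%s\047\n", t, $1 }' "$rowkey_file"
    echo 'EOF'
  } > "$out"
  chmod +x "$out"
}

# Example:
#   build_delete_script 'event_log' rowkey.txt del.sh
#   ./del.sh
```

Reviewing the generated file before running it is worthwhile, since deleteall removes whole rows.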
How is an HBase timestamp converted?
Converting a timestamp to a readable format in the hbase shell:
hbase(main):002:0> import java.util.Date
=> Java::JavaUtil::Date
hbase(main):003:0> Date.new(1341304672637).toString()
=> "Tue Jul 03 16:37:52 CST 2012"
hbase(main):004:0> Date.new(1341301228326).toString()
=> "Tue Jul 03 15:40:28 CST 2012"
In the shell, can a human-readable date be converted to a long?
hbase(main):005:0> import java.text.SimpleDateFormat
=> Java::JavaText::SimpleDateFormat
hbase(main):006:0> import java.text.ParsePosition
=> Java::JavaText::ParsePosition
hbase(main):015:0> SimpleDateFormat.new("yy/MM/dd").parse("12/07/03",ParsePosition.new(0)).getTime()
=> 1341244800000