函数分类,HIVE CLI命令,简单函数,聚合函数,集合函数,特殊函数(窗口函数,分析函数,混合函数,UDTF),常用函数Demo
1.1. 函數分類
1.2. HIVE CLI命令
顯示當前會話有多少函數可用?
SHOW FUNCTIONS;
顯示函數的描述信息?
DESC FUNCTION concat;
顯示函數的擴展描述信息?
DESC FUNCTION EXTENDED concat;
1.3. 簡單函數
函數的計算粒度為單條記錄。?
關系運算?
數學運算?
邏輯運算?
數值計算?
類型轉換?
日期函數?
條件函數?
字符串函數?
統計函數
1.4. 聚合函數
函數處理的數據粒度為多條記錄。?
sum()—求和?
count()—求數據量?
avg()—求平均直?
distinct—求不同值數?
min—求最小值?
max—求最大值
1.5. 集合函數
復合類型構建?
復雜類型訪問?
復雜類型長度
1.6. 特殊函數
窗口函數
應用場景?
用于分區排序?
動態Group By?
Top N?
累計計算?
層次查詢
Windowing functions
lead lag FIRST_VALUE LAST_VALUE分析函數
Analytics functions
RANK ROW_NUMBER DENSE_RANK CUME_DIST PERCENT_RANK NTILE混合函數
java_method(class,method [,arg1 [,arg2])reflect(class,method [,arg1 [,arg2..]])hash(a1 [,a2...])UDTF
lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (‘,‘ columnAlias)*? fromClause: FROM baseTable (lateralView)*?ateral view用于和split, explode等UDTF一起使用,它能夠將一行數據拆成多行數據,在此基礎上可以對拆分后的數據進行聚合。lateral view首先為原始表的每行調用UDTF,UTDF會把一行拆分成一或者多行,lateral view再把結果組合,產生一個支持別名表的虛擬表。
1.7. 常用函數Demo:
create table employee( id string, money double, type string)row format delimited fields terminated by ‘\t‘ lines terminated by ‘\n‘ stored as textfile; load data local inpath '/home/tuzq/software/hivedata/employee.txt' into table employee; 優先級依次為NOT AND ORselect id,money from employee where (id = '2' or id = '3' or id = '4' orid = '5') AND (money > 120 AND money < 250);
?
數據準備:在/home/tuzq/software/hivedata下創建employee.txt,
數據內容如下:
| [root@hadoop1 hivedata]# cat employee.txt 1???? 100.0????? fds 2???? 150.0????? hdfs 3???? 200.0????? hdfs 4???? 35.0 hadoop 5???? 280.7????? hive [root@hadoop1 hivedata]# |
hive中的顯示效果如下:
帶有條件的查詢:
cast類型轉換
select cast(1.5 as int);
if判斷
if(con,‘‘,‘‘); hive (default)> select if(2>1,‘YES‘,‘NO‘); YES case when con then ‘‘ when con then ‘‘ else? ‘‘ end (‘‘里面類型要一樣)select case when id='1' then 'A0001'when id='2' then 'A0002' when id='3' then 'A0003' when id='4' then 'A0004' end from employee;
查詢的顯示效果如下:
get_json_object
get_json_object(json 解析函數,用來處理json,必須是json格式)select get_json_object(‘{"name":"jack","age":"20"}‘,‘$.name‘);
URL解析函數
parse_url(string urlString, string partToExtract [, string keyToExtract]) select parse_url(‘http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1‘, ‘HOST‘) from employee limit 1;
字符串連接函數: concat?
語法: concat(string A, stringB…)?
返回值: string?
說明:返回輸入字符串連接后的結果,支持任意個輸入字符串?
舉例:
?
帶分隔符字符串連接函數: concat_ws?
語法: concat_ws(string SEP,string A, string B…)?
返回值: string?
說明:返回輸入字符串連接后的結果, SEP 表示各個字符串間的分隔符
再如案例:
?
列出該字段所有不重復的值,相當于去重 collect_set(id)? //返回的是數組 列出該字段所有的值,列出來不去重 collect_list(id)?? //返回的是數組 select collect_set(id) from taborder; 求和 sum(money) 統計列數 count(*) select sum(num),count(*) from taborder;
窗口函數
first_value(第一行值) first_value(money) over (partition by id order by money) select ch,num,first_value(num) over (partition by ch order by num) from taborder; rows between 1 preceding and 1 following (當前行以及當前行的前一行與后一行) hive (liguodong)> select ch,num,first_value(num) over (partition by ch order by num ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) from taborder; last_value 最后一行值 hive (liguodong)> select ch,num,last_value(num) over (partition by ch) from taborder; ?lead 去當前行后面的第二行的值 lead(money,2) over (order by money) lag ?去當前行前面的第二行的值 lag(money,2) over (order by money) ``` ? ``` select ch, num, lead(num,2) over (order by num) from taborder; ?select ch, num, lag(num,2) over (order by num) from taborder;rank排名
rank() over(partition by id order by money) select ch, num, rank() over(partition by ch order by num) as rank from taborder; select ch, num, dense_rank() over(partition by ch order by num) as dense_rank from taborder;
cume_dist
cume_dist (相同值的最大行號/行數) cume_dist() over (partition by id order by money) percent_rank (相同值的最小行號-1)/(行數-1) 第一個總是從0開始 percent_rank() over (partition by id order by money) select ch,num,cume_dist() over (partition by ch order by num) as cume_dist, percent_rank() over (partition by ch order by num) as percent_rank from taborder; ?ntile分片 ?ntile(2) over (order by money desc)? 分兩份 select ch,num,ntile(2) over (order by num desc) from taborder;
混合函數
select id,java_method("java.lang,Math","sqrt",cast(id as double)) as sqrt from hiveTest;UDTF
?select id,adid ?from employee ?lateral view explode(split(type,‘B‘)) tt as adid; explode 把一列轉成多行 hive (liguodong)>? select id,adid ??????????????? >? from hiveDemo ??????????????? >? lateral view explode(split(str,‘,‘)) tt as adid;
正則表達式?
使用正則表達式的函數?
regexp_replace(string subject A,stringB,string C)?
regexp_extract(string subject,stringpattern,int index)
?
?
總結
以上是生活随笔為你收集整理的函数分类,HIVE CLI命令,简单函数,聚合函数,集合函数,特殊函数(窗口函数,分析函数,混合函数,UDTF),常用函数Demo的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: w7升级安装时黑屏怎么办 w7升级安装黑
- 下一篇: 管理员权限怎么解除 取消管理员权限的方法