HiveQL:查询
文章目錄
- 1. select from
- 1.1 正則表達式指定列
- 1.2 使用列值計算
- 1.3 使用函數(shù)
- 1.4 limit 限制返回行數(shù)
- 1.5 別名 as name
- 1.6 case when then 語句
- 2. where 語句
- 3. JOIN 優(yōu)化
- 4. 抽樣查詢
- 5. union all
學自《hive編程指南》
1. select from
hive (default)> create table employees(> name string,> salary float,> subordinates array<string>,> deductions map<string, float>,> address struct<street:string, city:string, state:string, zip:int>)> partitioned by(country string, state string);hive (default)> load data local inpath "/home/hadoop/workspace/employees.txt"> overwrite into table employees> partition(country='US', state='CA'); Loading data to table default.employees partition (country=US, state=CA)hive (default)> select * from employees; John Doe 100000.0 ["Mary Smith","Todd Jones"] {"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1} {"street":"1 Michigan Ave.","city":"Chicago","state":"IL","zip":60600} US CA Mary Smith 80000.0 ["Bill King"] {"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1} {"street":"100 Ontario St.","city":"Chicago","state":"IL","zip":60601} US CA Todd Jones 70000.0 [] {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1} {"street":"200 Chicago Ave.","city":"Oak Park","state":"IL","zip":60700} US CA Bill King 60000.0 [] {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1} {"street":"300 Obscure Dr.","city":"Obscuria","state":"IL","zip":60100} US CA Boss Man 200000.0 ["John Doe","Fred Finance"] {"Federal Taxes":0.3,"State Taxes":0.07,"Insurance":0.05} {"street":"1 Pretentious Drive.","city":"Chicago","state":"IL","zip":60500} US CA Fred Finance 150000.0 ["Stacy Accountant"] {"Federal Taxes":0.3,"State Taxes":0.07,"Insurance":0.05} {"street":"2 Pretentious Drive.","city":"Chicago","state":"IL","zip":60500} US CA Stacy Accountant 60000.0 [] {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1} {"street":"300 Main St.","city":"Naperville","state":"IL","zip":60563} US CA- 可以對表起別名
- 提取數(shù)組元素 [idx],不存在為NULL,提取出的字符串也沒有引號
- 提取 map 元素 [key]
- 提取 struct 中的元素,使用 .
1.1 正則表達式指定列
select `price.*` from stocks;以 price為前綴的列
1.2 使用列值計算
- 計算稅后薪資
1.3 使用函數(shù)
- 聚合函數(shù)
- 表生成函數(shù),將單列擴展為多行或者多列
- 內(nèi)置函數(shù)
1.4 limit 限制返回行數(shù)
limit n 返回 n 行
1.5 別名 as name
1.6 case when then 語句
hive (default)> select name, salary,> case when salary < 50000 then 'low'> else 'high'> end as bracket from employees;John Doe 100000.0 high Mary Smith 80000.0 high Todd Jones 70000.0 high Bill King 60000.0 high Boss Man 200000.0 high Fred Finance 150000.0 high Stacy Accountant 60000.0 high2. where 語句
- 過濾條件
- like, rlike(正則)
3. JOIN 優(yōu)化
多個表 join 把小的表放在左邊
4. 抽樣查詢
- 分桶抽樣
- 不使用 rand(), 每次結(jié)果是一樣的
- 百分比抽樣
5. union all
將多個表進行合并,每個表必須有相同的列,且字段類型一致
hive> select name from(> select e1.name from employees e1 where e1.name like "Mary%"> union all> select e2.name from employees e2 where e2.name like "Bill%"> ) name_tab> sort by name;WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases. Query ID = hadoop_20210411221203_b3dde291-8596-4b91-95e0-707eeaa873f6 Total jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. Estimated from input data size: 1 In order to change the average load for a reducer (in bytes):set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the maximum number of reducers:set hive.exec.reducers.max=<number> In order to set a constant number of reducers:set mapreduce.job.reduces=<number> Job running in-process (local Hadoop) 2021-04-11 22:12:04,856 Stage-1 map = 100%, reduce = 100% Ended Job = job_local1468526053_0003 MapReduce Jobs Launched: Stage-Stage-1: HDFS Read: 31360 HDFS Write: 0 SUCCESS Total MapReduce CPU Time Spent: 0 msecBill King Mary Smith總結(jié)
- 上一篇: LeetCode 878. 第 N 个神
- 下一篇: Chapter4-2_Speech_Sy