當(dāng)前位置：首頁(yè) > 编程资源 > 编程问答 >内容正文

编程问答

HIVE快速入门

發(fā)布時(shí)間：2025/6/15 编程问答 21 豆豆

生活随笔收集整理的這篇文章主要介紹了 HIVE快速入门小編覺(jué)得挺不錯(cuò)的,現(xiàn)在分享給大家,幫大家做個(gè)參考.

（一）簡(jiǎn)單入門(mén)
1、創(chuàng)建一個(gè)表
create table if not exists ljh_emp(
name string,
salary float,
gender string)
comment 'basic information of a employee'
row format delimited fields terminated by ',’;

2、準(zhǔn)備數(shù)據(jù)文件
創(chuàng)建test目錄且目錄只有一個(gè)文件，文件內(nèi)容如下：
ljh,25000,male
jediael,25000,male
llq,15000,female

3、將數(shù)據(jù)導(dǎo)入表中
load data local inpath '/home/ljhn1829/test' into table ljh_emp;

4、查詢(xún)表中的內(nèi)容
select * from ljh_emp;
OK
ljh?? ?25000.0?? ?male
jediael?? ?25000.0?? ?male
llq?? ?15000.0?? ?female
Time taken: 0.159 seconds, Fetched: 3 row(s)

（二）關(guān)于分隔符
1、默認(rèn)分隔符
hive中的行默認(rèn)分隔符為 \n，字段分隔符為 ctrl+A，此外還有ctrl+B，ctrl+C，可以用于分隔array,struct,map等，詳見(jiàn)《hive編程指南》P44。
因此，若在建表是不指定row format delimited fields terminated by ‘,’，則認(rèn)為默認(rèn)字段分隔符為ctrl+A。
可以有2種解決方案：
一是在創(chuàng)建表時(shí)指定分隔符，如上例所示，
二是在數(shù)據(jù)文件中使用ctrl+A，見(jiàn)下例

2、在數(shù)據(jù)文件中使用ctrl+A全分隔符
（1）創(chuàng)建表
create table ljh_test_emp(name string, salary float, gender string);
（2）準(zhǔn)備數(shù)據(jù)文件
創(chuàng)建test2目錄，目錄下只有一個(gè)文件，文件內(nèi)容如下：
ljh^A25000^Amale
jediael^A25000^Amale
llq^A15000^Afemale
其中的^A字符僅在vi時(shí)才能看到，cat不能看到。
輸出^A的方法是：在vi的插入模式下，先按ctrl+V，再按ctrl+A
（3）將數(shù)據(jù)導(dǎo)入表
create table ljh_test_emp(name string, salary float, gender string);
（4）查詢(xún)數(shù)據(jù)
hive> select * from ljh_test_emp;
OK
ljh?? ?25000.0?? ?male
jediael?? ?25000.0?? ?male
llq?? ?15000.0?? ?female
Time taken: 0.2 seconds, Fetched: 3 row(s)

3、未指定分隔符，且又未使用ctrl+A作文件中的分隔符，出現(xiàn)以下錯(cuò)誤
(1)創(chuàng)建表
create table if not exists ljh_emp_test(
name string,
salary float,
gender string)
comment 'basic information of a employee’;
（2）準(zhǔn)備數(shù)據(jù)
ljh,25000,male
jediael,25000,male
llq,15000,female
（3）將數(shù)據(jù)導(dǎo)入表中
load data local inpath '/home/ljhn1829/test' into table ljh_emp_test;
（4）查看表中數(shù)據(jù)
select * from ljh_emp_test;
OK
ljh,25000,male?? ?NULL?? ?NULL
jediael,25000,male?? ?NULL?? ?NULL
llq,15000,female?? ?NULL?? ?NULL
Time taken: 0.185 seconds, Fetched: 3 row(s)
可以看出，由于分隔符為ctrl+A，因此導(dǎo)入數(shù)據(jù)時(shí)將文件中的每一行內(nèi)容均只當(dāng)作第一個(gè)字段，導(dǎo)致后面2個(gè)字段均為null。

（三）復(fù)雜一點(diǎn)的表
1、創(chuàng)建表
create table employees (
??? name string,
??? slalary float,
??? suboddinates array<string>,
??? deductions map<string,float>,
??? address struct<stree:string, city:string, state:string, zip:int>
)
partitioned by(country string, state string);

2、準(zhǔn)備數(shù)據(jù)
John Doe^A100001.1^AMary Smith^BTodd Jones^AFederal Taxes^C.2^BStateTaxes^C.05^BInsurance^C.1^A1 Michigan Ave.^BChicago^BIL^B60600
Mary Smith^A80000.0^ABill King^AFederal Taxes^C.2^BState Taxes^C.05^BInsurance^C.1^A100 Ontario St.^BChicago^BIL^B60601
Todd Jones^A70000.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A200 Chicago Ave.^BOak Park^BIL^B60700
Bill King^A60001.0^A^AFederal Taxes^C.15^BState Taxes^C.03^BInsurance^C.1^A300 Obscure Dr.^BObscuria^BIL^B60100
注意? ^A：分隔字段 ^B：分隔array/struct/map中的元素 ^C：分隔map中的KV
詳見(jiàn)《hive編程指南》P44。

3、將數(shù)據(jù)導(dǎo)入表中
load data local inpath '/home/ljhn1829/phd' into table employees partition(country='us',state='ca');

4、查看表數(shù)據(jù)
hive> select * from employees;
OK
John Doe?? ?100001.1?? ?["Mary Smith","Todd Jones"]?? ?{"Federal Taxes":0.2,"StateTaxes":0.05,"Insurance":0.1}?? ?{"stree":"1 Michigan Ave.","city":"Chicago","state":"IL","zip":60600}?? ?us?? ?ca
Mary Smith?? ?80000.0?? ?["Bill King"]?? ?{"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1}?? ?{"stree":"100 Ontario St.","city":"Chicago","state":"IL","zip":60601}?? ?us?? ?ca
Todd Jones?? ?70000.0?? ?[]?? ?{"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1}?? ?{"stree":"200 Chicago Ave.","city":"Oak Park","state":"IL","zip":60700}?? ?us?? ?ca
Bill King?? ?60001.0?? ?[]?? ?{"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1}?? ?{"stree":"300 Obscure Dr.","city":"Obscuria","state":"IL","zip":60100}?? ?us?? ?ca
Time taken: 0.312 seconds, Fetched: 4 row(s)

5、查看hdfs中的文件
hadoop fs -ls /data/gamein/g4_us/meta/employees/country=us/state=ca
Found 1 items
-rwxr-x---?? 3 ljhn1829 g4_us??????? 428 2015-05-12 12:49 /data/gamein/g4_us/meta/employees/country=us/state=ca/progamming_hive_data.txt
該文件中的內(nèi)容與原有文件一致。

（四）通過(guò)select子句插入數(shù)據(jù)
1、創(chuàng)建表
create table employees2 (
??? name string,
??? slalary float,
??? suboddinates array<string>,
??? deductions map<string,float>,
??? address struct<stree:string, city:string, state:string, zip:int>
)
partitioned by(country string, state string);

2、插入數(shù)據(jù)
hive>? set hive.exec.dynamic.partition.mode=nonstrict;
否則會(huì)出現(xiàn)以下異常：
FAILED: SemanticException [Error 10096]: Dynamic partition strict mode requires at least one static partition column. To turn this off set hive.exec.dynamic.partition.mode=nonstrict

insert into table employees2?
partition (country,state)
select name,slalary,suboddinates,deductions,address, e.country, e.state
from employees e;

總結(jié)

以上是生活随笔為你收集整理的HIVE快速入门的全部?jī)?nèi)容，希望文章能夠幫你解決所遇到的問(wèn)題。

如果覺(jué)得生活随笔網(wǎng)站內(nèi)容還不錯(cuò)，歡迎將生活随笔推薦給好友。

上一篇： php显示TABLE数据
下一篇： Hadoop Hive sql语法详解