當前位置：首頁 > 运维知识 > 数据库 >内容正文

数据库

mysql load data infile 导入数据某一列空_Sql数据挑战赛amp;网络销售案例分析

發(fā)布時間：2024/7/23 数据库 35 豆豆

生活随笔收集整理的這篇文章主要介紹了 mysql load data infile 导入数据某一列空_Sql数据挑战赛amp;网络销售案例分析小編覺得挺不錯的,現(xiàn)在分享給大家,幫大家做個參考.

SQL挑戰(zhàn)賽

第一期:

1: 編寫一個查詢，列出員工姓名列表，員工每月工資超過2000美元且員工工作時間少于10個月。通過提升employee_id對結(jié)果進行排序

select name from employee where salary > 2000 and months < 10 order by employee_id;

2: 查詢 Employee 表格中以元音字母開頭的 name 名字。結(jié)果不包含名字重復(fù)記錄。

方法一:left函數(shù) select distinct name from employee where left(name,1) in ("a","o","i""e","u")方法二:like模糊匹配 select distinct name from employee where name like '%a'||'%o'|| '%i'|| '%e'|| '%u'方法三:substr函數(shù) SUBSTR (str, pos, len):由 <str> 中的第 <pos> 位置開始，選出接下去的 <len> 個字元。 select distinct name from employee where substr(name,1,1) in ("a","o","i""e","u")

3:編寫一個查詢，去掉一個最高收入，去掉一個最低收入，該公司員工平均收入是多少？

方法一: select avg(salary) from employee where salary != (select max(salary) from employee) And salary !=(select min(salary) from employee);方法二: select (sum(salary)-max(salary)-min(salary))/(count(*)-2) from employee;方法三: select avg(salary) from employee t where salary not in (select max(salary) from employee unionselect min(salary) from employee)

4：簡述NULL, 空字符串""與 0的區(qū)別

NULL在數(shù)據(jù)庫中表示沒有這條記錄

空字符串""為一個長度為0的字符串

0為數(shù)字0.

在count時,count(0) 與 count("")都會被聚合,但count(null)不會

第二期

數(shù)據(jù)表解釋:

market_data表的字段介紹為：

order_id(訂單ID),order_time(訂單時間),

customer_name(用戶名稱),quantity(購買數(shù)量），

sale(銷售額），profit(利潤)

各項指標的定義為：

R值為：用戶最后一次購買到現(xiàn)在（2016年12月31日）的時間間隔，輸出月份。

L值為：用戶第一次購買和最后一次購買之間的時間間隔，輸出月份。

F值為：用戶的總共購買次數(shù)，僅計算2016年的即可。

M值為：用戶的全部銷售額，僅計算2016年的即可。

1.查詢所有用戶的R值和L值。

#TIMESTAMPDIFF(interval,datetime_expr1,datetime_expr2) 返回日期或日期時間表達式datetime_expr1 和datetime_expr2the 之間的整數(shù)差。其結(jié)果的單位由interval 參數(shù)給出。#TIMESTAMPADD(interval,int_expr,datetime_expr) 將整型表達式int_expr 添加到日期或日期時間表達式 datetime_expr中。式中的interval和上文中列舉的取值是一樣的。#DATEDIFF(date1,date2)返回兩個日期之間的天數(shù) select customer_name,timestampdiff(month,max(order_time),'2016-13-31') as R,timestampdiff(month,min(order_time),max(order_time)) as L from market_data group bycustomer_name

2.查詢用戶的R值，F值和M值，注意F值和M值，僅計算2016年度的數(shù)字。

#IF(expr1,expr2,expr3) 如果 expr1 是TRUE (expr1 <> 0 and expr1 <> NULL)，則 IF()的返回值為expr2; 否則返回值則為 expr3。IF() 的返回值為數(shù)字值或字符串值，具體情況視其所在語境而定。select customer_name,timestampdiff(month,max(order_time),'2016-13-31') as R,count(if(year(order_time)=2016,order_id,null)) as F,round(sum(if(year(order_time)=2016,sale,null)),2) as M frommarket_data Group bycustomer_name

3.查詢用戶的R值，L值和用戶生命周期劃分。生命周期劃分如下：

（新用戶：R<=6 and L<=12；忠誠用戶：R<=6 and L>12;

流失的老用戶：R>6 and L>12; 一次性用戶：R>6 and L<=12）

select temp.*, casewhen R<=6 and L<=12 then '新用戶'when R<=6 and L>12 then '忠誠用戶'when R>6 and L>12 then '流失的老用戶'when R>6 and L<=12 then '一次性用戶' end as 用戶生命周期 from(select customer_name,timestampdiff(month,max(order_time),'2016-13-31') as R,timestampdiff(month,min(order_time),max(order_time)) as L from market_data group bycustomer_name) as temp

第三期

Cinema表結(jié)構(gòu)各字段介紹如下：

Seat_id（座位號，依次遞增），free（0表示有人，1表示空座），fare(對應(yīng)座位的票價）

題目為：

1：查找表中最便宜的票價是多少？

方法一: select * from cinema where free=1 order by fare limit 1方法二 select min(fare) from cinema where free=1

2：女友要求你定的座位必須是連續(xù)的（輸出可用位置的seat_id）

select c1.seat_id,c2.seat_id from cinema c1- -兩表連接(自連接)join cinema c2- -限制條件位連續(xù)座位on c1.seat_id+1 =c2.seat_id- -空閑座位 where c1.free=1 and c2.free=1;

3：女友要求買連續(xù)的座位中總價最低的兩個座位(輸出對應(yīng)的seat_id和總價)

select c1.seat_id,c2.seat_id,c1.fare+c2.fare from cinema c1- -兩表連接(自連接)join cinema c2- -限制條件位連續(xù)座位on c1.seat_id+1 =c2.seat_id- -空閑座位 where c1.free=1 and c2.free=1- -對價格排序并限制輸出一個 order by c1.fare+c2.fare limit 1

第四期

employ表內(nèi)字段的解釋如下：

position_name（職位名稱），min_salary（最低薪資，單位元），

max_salary（最高薪資，單位元），city（工作城市），

educational（學(xué)歷要求），people（招聘人數(shù)），industry（行業(yè)）

題目為：

1.查找不同學(xué)歷要求的平均最低薪資和平均最高薪資；

select educational,round(avg(min_salary),round(avg(max_salary)) from employ group by educational;

2.查找每個行業(yè)，最高工資和最低工資差距最大的職位名稱。

selectin industry,position_name,max(max_salary-min_salary) from employ group by industry

3.查找各個城市電商行業(yè)的平均薪資水平，并按照薪資高低水平進行排序。（崗位的薪資用最低薪資加最高薪資除以二當成平均薪資計算即可，注意要考慮到職位招聘人數(shù)）

select city,round(sum((max_salary+min_salary)/2*people)/sum(people)) as 平均薪資 from employee where industry='互聯(lián)網(wǎng)/電子商務(wù)' group by city order by 平均薪資 desc;

4.問答題：說明UNION和UNION ALL的差別

都是做表的合并連結(jié)

union會刪除重復(fù)值;union all 表中數(shù)據(jù)全部合并,忽略重復(fù)值

數(shù)據(jù)來源于某網(wǎng)站銷售統(tǒng)計

網(wǎng)絡(luò)訂單數(shù)據(jù)

用戶信息

提取碼:3k6i?pan.baidu.com

分析步驟

0.數(shù)據(jù)導(dǎo)入

首先需要先創(chuàng)建對應(yīng)的數(shù)據(jù)庫和相應(yīng)的表

創(chuàng)建orderinfo 表

2..創(chuàng)建userinfo表

#userinfo和orderinfo數(shù)據(jù)信息如下： userinfo 客戶信息表 userId 客戶idsex 性別birth 出生年月日orderinfo 訂單信息表 orderId 訂單序號userId 客戶idisPaid 是否支付price 商品價格paidTime 支付時間

登錄mysql導(dǎo)入相應(yīng)的數(shù)據(jù)

load data local infile "file" into table dbname.tablename ...

# 登錄 mysql --local-infile -uroot -p # 導(dǎo)入數(shù)據(jù)orderinfo load data local infile 'F:BaiduNetdiskDownloadSQLorder_info_utf.csv' into table data.orderinfo fields terminated by ','; # 導(dǎo)入數(shù)據(jù)userinfo load data local infile 'F:BaiduNetdiskDownloadSQLuser_info_utf.csv' into table data.userinfo fields terminated by ',';

2.觀察數(shù)據(jù)，對時間進行處理 ; 更新字符串為日期格式

update orderinfo set paidtime=replace(paidtime,'/','-') where paidtime is not null update orderinfo set paidtime=str_to_date(paidtime,'%Y-%m-%d %H:%i') where paidtime is not null

3.查看數(shù)據(jù)

1.不同月份的下單人數(shù)

思路：按月份進行分組，對用戶進行去重統(tǒng)計

select month(paidTime) as dtmonth, count(distinct userId) as count_users from orderinfo where isPaid = '已支付' group by month(paidTime)

2 用戶三月份的回購率和復(fù)購率

復(fù)購率：自然月內(nèi)，購買多次的用戶占比

首先先找出已支付中3月份的用戶id和對應(yīng)次數(shù)，按用戶分組
然后再嵌套一層，復(fù)購率：購買次數(shù)大于1/ 總購買次數(shù)

select count(ct),count(if(ct>1,1,null)) from(select userID,Count(userId) as ct from orderinfowhere isPaid = '已支付'and month(paidTime) = 3group by userIdorder by userId) t

復(fù)購率： 16916 / 54799 = 0.308

回購率：曾經(jīng)購買過的用戶在某一時期內(nèi)再次購買的占比

首先先查詢已支付userId ,和支付月份的統(tǒng)計

select userId, date_format(paidTime, '%Y-%m-01') as m from orderinfowhere isPaid = '已支付'group by userId , date_format(paidTime,'%Y-%m-01')

然后使用date_sub函數(shù)，將表關(guān)聯(lián)，篩選出本月的消費的userID，和下月的回購userID，即可計算出回購率

select t1.m,count(t1.m) as 消費總數(shù),count(t2.m) as 復(fù)購率,count(t2.m)/ count(t1.m) as 回購率 from ( select userId, date_format(paidTime, '%Y-%m-01') as m from orderinfowhere isPaid = '已支付'group by userId , date_format(paidTime,'%Y-%m-01')) t1 left join ( select userId, date_format(paidTime, '%Y-%m-01') as m from orderinfowhere isPaid = '已支付'group by userId , date_format(paidTime,'%Y-%m-01')) t2 on t1.userId = t2.userId and t1.m = date_sub(t2.m, interval 1 month) group by t1.m

3 統(tǒng)計男女用戶的消費頻次

userinfo因為性別有空值，需要篩選出t orderinfo 再和表t連接統(tǒng)計出用戶男女消費次數(shù)

select o.userId,sex,count(o.userId)as ct from orderinfo oinner join(select * from userinfowhere sex != '') ton o.userId = t.userIdgroup by userId,sexorder by userId

根據(jù)上表，在進行子查詢，統(tǒng)計出男性消費頻次

select sex,avg(ct) from(select o.userId,sex,count(o.userId)as ct from orderinfo oinner join(select * from userinfowhere sex != '') ton o.userId = t.userIdgroup by userId,sexorder by userId)t2 group by sex

4 統(tǒng)計多次消費用戶，分析第一次和最后一次的消費間隔

首先把多次消費的用戶,和相應(yīng)第一次最后一次消費時間提取出來
然后使用datediff 計算時間間隔，以天為單位

select userId,max(paidTime),min(paidTime),datediff(max(paidTime),min(paidTime)) from data.orderinfo where isPaid = '已支付' group by userId having count(1) > 1 order by userId

5 統(tǒng)計不同年齡段用戶的消費金額差異

通過表聯(lián)結(jié)，給用戶劃分不同的年齡段，以10年為基準，過濾出生日期為1900-00-00的異常值,篩選出用戶消費頻次和消費金額

select o.userId,age,price,count(o.userId)as ct from orderinfo o inner join (select userId, ceil((year(now()) - year(birth))/10) as agefrom userinfowhere birth > 1901-00-00) t on o.userId = t.userId where isPaid = '已支付' group by userId order by userId

統(tǒng)計出年齡段的消費頻次和消費金額

select t2.age,avg(ct),avg(price) from (select o.userId,age,price,count(o.userId)as ct from orderinfo o inner join(select userId, ceil((year(now()) - year(birth))/10) as agefrom userinfowhere birth > 1901-00-00)ton o.userId = t.userIdwhere ispaid = '已支付'group by userId, age) t2 group by age order by age

ceil : 向上取整

6 統(tǒng)計消費的二八法則：消費top20%的用戶貢獻了多少消費額度

按照用戶消費總額排序

select userId,sum(price) as total from orderinfo o where isPaid = '已支付' group by userId order by total desc

查看總用戶數(shù)和總金額

select count(userId),sum(total) from (select userId,sum(price) as total from orderinfo owhere isPaid = '已支付'group by userIdorder by total desc) as t

查看前20%的用戶數(shù)量有多少

select count(userId)*0.2,sum(total) from (select userId,sum(price) as total from orderinfo owhere isPaid = '已支付'group by userIdorder by total desc)as t

limit限制前17000用戶

select count(userId),sum(total) from ( select userId,sum(price) as total from orderinfo o where isPaid = '已支付' group by userId order by total desc limit 17129) t

top20%用戶的消費總額占比情況:top20%用戶的消費總額/所有用戶的消費總額=73.93%
top20%的用戶貢獻了73.93%消費額度。

總結(jié)

以上是生活随笔為你收集整理的mysql load data infile 导入数据某一列空_Sql数据挑战赛amp;网络销售案例分析的全部內(nèi)容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網(wǎng)站內(nèi)容還不錯，歡迎將生活随笔推薦給好友。

上一篇： php中没有dockerfile,Doc
下一篇： js原生后代选择器_CSS 后代选择器