Logistic Regression Classification in R
iris is a dataset built into R.
head(iris)  # unlike Python (pandas), where this would be iris.head()
| Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
|---|---|---|---|---|
| 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 5.0 | 3.6 | 1.4 | 0.2 | setosa |
| 5.4 | 3.9 | 1.7 | 0.4 | setosa |
# Check the number of rows and columns
dim(iris)
150 5
# Storage mode of the data
mode(iris)
'list'
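mode() returns 'list' because a data frame is stored internally as a list of equal-length columns. The following is a small sketch in base R that contrasts mode() with typeof() and class(); the variable names are illustrative and not part of the original post.

```r
# A data frame is stored as a list of columns, so mode() and typeof()
# both report "list", while class() reports the higher-level type.
iris_mode  <- mode(iris)
iris_type  <- typeof(iris)
iris_class <- class(iris)

print(c(mode = iris_mode, typeof = iris_type, class = iris_class))

# Both checks hold at once: a data frame is a special kind of list.
print(is.list(iris))
print(is.data.frame(iris))
```

In practice class() or str() is usually the more informative call when inspecting a dataset.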
# Column names
names(iris)
'Sepal.Length' 'Sepal.Width' 'Petal.Length' 'Petal.Width' 'Species'
# In R this is a data.frame; the Python counterpart is pandas.DataFrame
str(iris)
'data.frame': 150 obs. of 5 variables:
 $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
# View the dataset's attributes
attributes(iris)
# Summary statistics for each column
summary(iris)
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50
 Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
# Count the observations in each species
table(iris$Species)
    setosa versicolor  virginica
        50         50         50
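# Histogram of sepal length
hist(iris$Sepal.Length)
# Density plot of sepal length
plot(density(iris$Sepal.Length))
# Scatter plot of sepal length against sepal width
plot(iris$Sepal.Length, iris$Sepal.Width)
# Scatter-plot matrix of all columns
plot(iris)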
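# Logistic regression (binomial family) handles only two classes
a <- which(iris$Species == 'virginica')
head(a)  # row indices of the virginica observations
101 102 103 104 105 106
# Keep the other two classes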
myir <- iris[-a,]
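After the virginica rows are removed, the Species factor still carries three levels. glm() with a factor response treats the first level as failure and every other level as success, so the fit still works, but dropping the unused level keeps later tables tidy. A minimal sketch, where `myir_clean` is a hypothetical name not used in the original post:

```r
# Rebuild the two-class subset so this block runs on its own.
a <- which(iris$Species == "virginica")
myir <- iris[-a, ]

# The unused level "virginica" is still attached to the factor.
print(levels(myir$Species))

# droplevels() removes factor levels that no longer occur in the data.
myir_clean <- droplevels(myir)  # hypothetical name for illustration
print(levels(myir_clean$Species))
print(table(myir_clean$Species))
```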
# Split the data into training and test sets
s <- sample(100, 80)  # draw 80 of the 100 row indices without replacement
# Sort the sampled indices
s <- sort(s)
ir_trian <- myir[s,]
head(ir_trian)
| | Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
|---|---|---|---|---|---|
| 1 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 3 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 4 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 5 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |
| 7 | 4.6 | 3.4 | 1.4 | 0.3 | setosa |
| 9 | 4.4 | 2.9 | 1.4 | 0.2 | setosa |
ir_test <- myir[-s,]
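Because sample() is random, the rows in the training and test sets change on every run. The sketch below fixes the random seed so the split is repeatable; the seed value 42 and the names `train_set` and `test_set` are assumptions for illustration, not values from the original post.

```r
set.seed(42)  # assumed seed; any fixed integer makes the split repeatable

# Rebuild the two-class subset so this block runs on its own.
a <- which(iris$Species == "virginica")
myir <- iris[-a, ]

# Draw 80 of the 100 row positions for training; the rest form the test set.
s <- sort(sample(nrow(myir), 80))
train_set <- myir[s, ]   # hypothetical name
test_set  <- myir[-s, ]  # hypothetical name

print(dim(train_set))
print(dim(test_set))
print(table(train_set$Species))
```

Using nrow(myir) instead of a hard-coded 100 keeps the split correct if the subset size changes.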
model <- glm(Species ~ ., family = binomial(link = "logit"), data = ir_trian)
summary(model)
Call:
glm(formula = Species ~ ., family = binomial(link = "logit"),
    data = ir_trian)

Deviance Residuals:
       Min          1Q      Median          3Q         Max
-1.570e-05  -2.110e-08   2.110e-08   2.110e-08   1.865e-05

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)       4.691 681526.322       0        1
Sepal.Length     -9.568 216769.252       0        1
Sepal.Width      -7.254  99870.123       0        1
Petal.Length     18.946 153746.614       0        1
Petal.Width      25.341 222619.596       0        1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1.1070e+02  on 79  degrees of freedom
Residual deviance: 1.0579e-09  on 75  degrees of freedom
AIC: 10

Number of Fisher Scoring iterations: 25
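The very large standard errors, z values of 0, and a residual deviance close to zero in this summary are consistent with perfect separation, where the two classes can be split exactly by the predictors. This is an interpretation of the output rather than something the original post states. A quick check in base R, with illustrative variable names, shows that Petal.Length alone already separates setosa from versicolor:

```r
# Keep only the two classes used in the model.
two_class <- iris[iris$Species != "virginica", ]

# Range of petal length within each remaining species.
petal_ranges <- tapply(two_class$Petal.Length,
                       droplevels(two_class$Species),
                       range)
print(petal_ranges)

# If the largest setosa value is below the smallest versicolor value,
# a single threshold on Petal.Length classifies every row correctly.
setosa_max     <- max(two_class$Petal.Length[two_class$Species == "setosa"])
versicolor_min <- min(two_class$Petal.Length[two_class$Species == "versicolor"])
separable <- setosa_max < versicolor_min
print(separable)
```

Under perfect separation the maximum-likelihood estimates are not finite, so the individual coefficients and p-values should not be interpreted, even though the predicted classes can still be accurate.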
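# Predicted probabilities on the training set
a <- predict(model, type = "response")
# Label as 1 when the predicted probability exceeds 0.5, otherwise 0
res_train <- ifelse(a > 0.5, 1, 0)
# Predicted probabilities and labels for the test set
b <- predict(model, type = "response", newdata = ir_test)
res_test <- ifelse(b > 0.5, 1, 0)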
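The 0/1 predictions can be compared with the true species to measure how well the model classifies. The sketch below repeats the steps above in a self-contained form and builds a confusion matrix; the seed and the names `actual`, `confusion`, and `accuracy` are assumptions for illustration. It relies on glm() coding the first factor level (setosa) as 0 and the other level present (versicolor) as 1.

```r
set.seed(1)  # assumed seed so the sketch is repeatable

# Same preparation as above: drop virginica, then split 80/20.
a <- which(iris$Species == "virginica")
myir <- iris[-a, ]
s <- sort(sample(100, 80))
ir_trian <- myir[s, ]
ir_test  <- myir[-s, ]

# glm() emits warnings on this data; they are suppressed here for brevity.
model <- suppressWarnings(
  glm(Species ~ ., family = binomial(link = "logit"), data = ir_trian)
)

# Predicted probability of the non-reference class, then a 0.5 cutoff.
b <- predict(model, type = "response", newdata = ir_test)
res_test <- ifelse(b > 0.5, 1, 0)

# True labels coded the same way: setosa = 0, versicolor = 1.
actual <- as.integer(ir_test$Species == "versicolor")

confusion <- table(predicted = res_test, actual = actual)
print(confusion)

accuracy <- mean(res_test == actual)
print(accuracy)
```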
# Refit with a higher iteration limit (the glm default is maxit = 25)
model <- glm(Species ~ ., family = binomial(link = "logit"), data = ir_trian, control = list(maxit = 100))
summary(model)
Call:
glm(formula = Species ~ ., family = binomial(link = "logit"),
    data = ir_trian, control = list(maxit = 100))

Deviance Residuals:
       Min          1Q      Median          3Q         Max
-9.535e-06  -2.110e-08   2.110e-08   2.110e-08   1.132e-05

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)   5.292e+00  1.125e+06       0        1
Sepal.Length -1.013e+01  3.577e+05       0        1
Sepal.Width  -7.501e+00  1.645e+05       0        1
Petal.Length  1.988e+01  2.534e+05       0        1
Petal.Width   2.634e+01  3.667e+05       0        1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1.1070e+02  on 79  degrees of freedom
Residual deviance: 3.8911e-10  on 75  degrees of freedom
AIC: 10

Number of Fisher Scoring iterations: 26