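[Machine Learning] Yellowbrick: A Powerful Tool for Machine Learning Visualization
This article introduces yellowbrick, a powerful extension to the machine learning library Scikit-Learn.
With a few lines of code it visualizes features, models, and model evaluation, which makes it easier to select machine learning models and tune hyperparameters. It depends on Matplotlib and Scikit-Learn.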
Installing yellowbrick
# Install from the Tsinghua mirror for faster downloads
pip install yellowbrick -i https://pypi.tuna.tsinghua.edu.cn/simple
yellowbrick's core "weapon": Visualizers
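A Visualizer can be thought of as a scikit-learn estimator object with visualization capabilities attached. It is used much like a scikit-learn model:
Import a specific visualizer;
Instantiate the visualizer;
Fit the visualizer;
Display the visualization.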
Quick start: yellowbrick examples
Plot ROC curves to compare how different models perform
In feature engineering, show the effect of PCA dimensionality reduction
For regression models, show the residuals between predicted and actual values, with a Q-Q plot to assess the model.
Show how a Lasso regression model performs
More examples are listed in the next section.
Commonly used yellowbrick Visualizers
Feature Visualization
Rank Features: pairwise ranking of features to detect relationships
Parallel Coordinates: horizontal visualization of instances
Radial Visualization: separation of instances around a circular plot
PCA Projection: projection of instances based on principal components
Manifold Visualization: high dimensional visualization with manifold learning
Joint Plots: direct data visualization with feature selection
Classification Visualization
Class Prediction Error: shows error and support in classification
Classification Report: visual representation of precision, recall, and F1
ROC/AUC Curves: receiver operator characteristics and area under the curve
Precision-Recall Curves: precision vs recall for different probability thresholds
Confusion Matrices: visual description of class decision making
Discrimination Threshold: find a threshold that best separates binary classes
Regression Visualization
Prediction Error Plot: find model breakdowns along the domain of the target
Residuals Plot: show the difference in residuals of training and test data
Alpha Selection: show how the choice of alpha influences regularization
Cook’s Distance: show the influence of instances on linear regression
Clustering Visualization
K-Elbow Plot: select k using the elbow method and various metrics
Silhouette Plot: select k by visualizing silhouette coefficient values
Intercluster Distance Maps: show relative distance and size/importance of clusters
Model Selection Visualization
Validation Curve: tune a model with respect to a single hyperparameter
Learning Curve: show if a model might benefit from more data or less complexity
Feature Importances: rank features by importance or linear coefficients for a specific model
Recursive Feature Elimination: find the best subset of features based on importance
Target Visualization
Balanced Binning Reference: generate a histogram with vertical lines showing the recommended value point to bin the data into evenly distributed bins
Class Balance: see how the distribution of classes affects the model
Feature Correlation: display the correlation between features and dependent variables
Text Visualization
Term Frequency: visualize the frequency distribution of terms in the corpus
t-SNE Corpus Visualization: use stochastic neighbor embedding to project documents
Dispersion Plot: visualize how key terms are dispersed throughout a corpus
UMAP Corpus Visualization: plot similar documents closer together to discover clusters
PosTag Visualization: plot the counts of different parts-of-speech throughout a tagged corpus
Customizing yellowbrick figures
See the official documentation: https://www.scikit-yb.org/en/latest/index.html