Py / imblearn: A Detailed Guide to the imblearn/imbalanced-learn Library (Introduction, Installation, and Usage)
Contents
Introduction to imblearn/imbalanced-learn
Installing imblearn/imbalanced-learn
Using imblearn/imbalanced-learn
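Introduction to imblearn/imbalanced-learn
imblearn/imbalanced-learn is a Python package that provides a number of re-sampling techniques commonly used on datasets that show strong between-class imbalance. It is compatible with scikit-learn and is part of the scikit-learn-contrib projects.
imbalanced-learn is tested under Python 3.6+. The dependency requirements are based on the last scikit-learn release: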
- scipy(>=0.19.1)
- numpy(>=1.13.3)
- scikit-learn(>=0.22)
- joblib(>=0.11)
- keras 2 (optional)
- tensorflow (optional)
Installing imblearn/imbalanced-learn
pip install imblearn
pip install imbalanced-learn
pip install -U imbalanced-learn
conda install -c conda-forge imbalanced-learn
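After installing, it can be useful to confirm which versions actually ended up in the environment and compare them with the dependency list in the introduction. The following is a minimal sketch using only the standard library; it assumes Python 3.8 or newer (for importlib.metadata), and the helper name installed_version is purely illustrative.

```python
from importlib import metadata

# Distribution names as published on PyPI (not the import names).
REQUIRED = ("imbalanced-learn", "scikit-learn", "numpy", "scipy", "joblib")


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Map each required distribution to its installed version (or None).
report = {name: installed_version(name) for name in REQUIRED}

for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Any entry reported as "not installed" points to a dependency that the install commands above did not pull in.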
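Using imblearn/imbalanced-learn
Most classification algorithms perform optimally only when the number of samples in each class is roughly the same. Highly skewed datasets, in which the minority class is heavily outnumbered by one or more other classes, have proven to be a challenge while at the same time becoming more and more common.
One way of addressing this issue is to re-sample the dataset so as to offset the imbalance, in the hope of arriving at a more robust and fair decision boundary than would otherwise be obtained.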
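To make the idea of re-sampling concrete, here is a minimal sketch of random over-sampling with replacement written with the standard library only. It is not the library's implementation; the function name random_oversample and the toy data are assumptions made for illustration.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class (with
    replacement) until every class is as large as the largest one."""
    rng = random.Random(seed)
    target = max(Counter(labels).values())

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Start from all original samples, then add the extra draws.
    keep = list(range(len(labels)))
    for indices in by_class.values():
        shortfall = target - len(indices)
        # Draws nothing for the largest class, where shortfall is 0.
        keep.extend(rng.choices(indices, k=shortfall))

    return [samples[i] for i in keep], [labels[i] for i in keep]


# Toy data: 8 samples of class 0 and 2 samples of class 1.
X = [[float(i)] for i in range(10)]
y = [0] * 8 + [1] * 2

X_res, y_res = random_oversample(X, y)
print("before:", Counter(y))
print("after: ", Counter(y_res))
```

Because the extra minority samples are exact copies, this simple approach can encourage over-fitting, which is one motivation for synthetic methods such as SMOTE.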
The methods currently implemented in this module are listed below, grouped by category:
- Under-sampling
  - Random majority under-sampling with replacement
  - Extraction of majority-minority Tomek links [1]
  - Under-sampling with Cluster Centroids
  - NearMiss-(1 & 2 & 3) [2]
  - Condensed Nearest Neighbour [3]
  - One-Sided Selection [4]
  - Neighbourhood Cleaning Rule [5]
  - Edited Nearest Neighbours [6]
  - Instance Hardness Threshold [7]
  - Repeated Edited Nearest Neighbours [14]
  - AllKNN [14]
- Over-sampling
  - Random minority over-sampling with replacement
  - SMOTE - Synthetic Minority Over-sampling Technique [8]
  - SMOTENC - SMOTE for Nominal and Continuous [8]
  - bSMOTE(1 & 2) - Borderline SMOTE of types 1 and 2 [9]
  - SVM SMOTE - Support Vectors SMOTE [10]
  - ADASYN - Adaptive synthetic sampling approach for imbalanced learning [15]
  - KMeans-SMOTE [17]
- Over-sampling followed by under-sampling
  - SMOTE + Tomek links [12]
  - SMOTE + ENN [11]
- Ensemble classifier using samplers internally
  - Easy Ensemble classifier [13]
  - Balanced Random Forest [16]
  - Balanced Bagging
  - RUSBoost [18]
- Mini-batch resampling for Keras and TensorFlow