Spearman's Correlation and Its Implication in Machine Learning
This article is about correlation and its implication in machine learning. In my previous article, I discussed Pearson's correlation coefficient, and we wrote code to show how useful it is. You may be wondering why we need Spearman's correlation when we already have Pearson's correlation to measure the relationship between the feature values and the target values. The answer is that Pearson's correlation measures only linear relationships, whereas Spearman's correlation also works for non-linear relationships, as long as they are monotonic (consistently increasing or consistently decreasing).
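The difference is easy to see on a small example. The sketch below uses made-up data (not the article's dataset) in which y always increases with x, but along a curve; the variable names are chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic data, made up for illustration: y grows monotonically with x,
# but along an exponential curve rather than a straight line.
x = pd.Series(np.arange(1, 21, dtype=float))
y = np.exp(x / 3)

pearson = x.corr(y, method='pearson')
spearman = x.corr(y, method='spearman')

# Spearman sees a perfect monotonic relationship; Pearson is pulled below 1
# because the points do not lie on a straight line.
print('Pearson :', round(pearson, 3))
print('Spearman:', round(spearman, 3))
```

Because Spearman's coefficient only looks at the ordering of the values, any strictly increasing transformation of y would leave it unchanged.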
Another advantage of Spearman's correlation is that, because it is computed from ranks rather than from the raw values, it is well suited to both continuous and discrete datasets.
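To make the role of ranks concrete, here is a minimal sketch on made-up discrete data (the `rating` and `visits` series are hypothetical). It shows how tied values are ranked and checks that Spearman's coefficient is the same as Pearson's coefficient computed on the ranks.

```python
import pandas as pd

# Made-up discrete data: a rating from 1 to 5 and a count-valued outcome.
rating = pd.Series([1, 2, 2, 3, 3, 3, 4, 5])
visits = pd.Series([0, 1, 3, 2, 4, 4, 6, 9])

# Tied values share the average of the ranks they would occupy.
rating_ranks = rating.rank(method='average')
visits_ranks = visits.rank(method='average')
print(rating_ranks.tolist())

# Spearman's coefficient is Pearson's coefficient computed on the ranks.
via_ranks = rating_ranks.corr(visits_ranks, method='pearson')
direct = rating.corr(visits, method='spearman')
print(round(via_ranks, 4), round(direct, 4))
```

The two printed coefficients agree, which is why the method does not care whether the original values were continuous measurements or discrete categories with an order.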
Image source: https://digensia.files.wordpress.com/2012/04/s1.png
Here, d_i is the difference between the two ranks of the i-th observation, d_i = rank(X_i) - rank(Y_i), where X holds the feature values and Y holds the target values.
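For reference, the standard textbook form of Spearman's coefficient is written out below. This is presumably what the referenced image shows; the symbols ρ and n (the number of observations) follow the usual convention and are not taken from the original figure.

```latex
\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^{2}}{n\,(n^{2} - 1)},
\qquad d_i = \operatorname{rank}(X_i) - \operatorname{rank}(Y_i)
```

This shortcut formula is exact only when there are no tied ranks; with ties, the coefficient is computed as Pearson's correlation of the rank values.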
The dataset used can be downloaded from here: headbrain4.CSV
Because we have used a continuous dataset, i.e. the same dataset used for Pearson's correlation, you will not observe much of a difference between the Pearson and Spearman correlations. Download any discrete dataset, run both, and you will see the difference.
Now, let us see how we can use Spearman's correlation in a machine learning program written in Python:
# -*- coding: utf-8 -*-
"""
Created on Sun Jul 29 22:21:12 2018

@author: Raunak Goswami
"""
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# reading the data
"""
here the directory of my code and the headbrain4.csv file is same
make sure both the files are stored in the same folder or directory
"""
data = pd.read_csv('headbrain4.csv')

# show the first five records of the data
print(data.head())

# w holds the feature values for Gender
w = data.iloc[:, 0:1].values
# y holds the feature values for Age Range
y = data.iloc[:, 1:2].values
# x holds the feature values for head size
x = data.iloc[:, 2:3].values
# z holds the target values, i.e. brain weight
z = data.iloc[:, 3:4].values

# note: round() without a digits argument rounds to the nearest integer
print(round(data['Gender'].corr(data['Brain Weight(grams)'], method='spearman')))
plt.scatter(w, z, c='red')
plt.title('Scatter plot for Spearman correlation between gender and brain weight')
plt.xlabel('Gender')
plt.ylabel('brain weight')
plt.show()

print(round(data['Age Range'].corr(data['Brain Weight(grams)'], method='spearman')))
# plot the age range (y) here, not the head size (x)
plt.scatter(y, z, c='red')
plt.title('Scatter plot for Spearman correlation between age and brain weight')
plt.xlabel('age range')
plt.ylabel('brain weight')
plt.show()

print(round(data['Head Size(cm^3)'].corr(data['Brain Weight(grams)'], method='spearman')))
plt.scatter(x, z, c='red')
plt.title('Scatter plot for Spearman correlation between head size and brain weight')
plt.xlabel('head size')
plt.ylabel('brain weight')
plt.show()

data.info()
# Pearson's coefficient (the default method), for comparison
print(data['Head Size(cm^3)'].corr(data['Brain Weight(grams)']))

# table of Spearman's coefficients for every pair of columns
k1 = data.corr(method='spearman')
print("The table for all possible values of Spearman's coefficients is as follows")
print(k1)

After you run the code in the Spyder tool provided by the Anaconda distribution, go to the variable explorer, find the variable named k1, and double-click it to see its values. You will see something like this:
Here, 1 signifies a perfect positive correlation, 0 signifies no correlation, and -1 signifies a perfect negative correlation.
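The correlation table can also be read programmatically instead of through the variable explorer. The sketch below is one possible way to rank features by how strongly they correlate with the target. The rows are made up and only mimic the column names used in the script; they are not values from headbrain4.csv, and `by_strength` is a name introduced here for illustration.

```python
import pandas as pd

# Made-up rows that only mimic the column names used in the script;
# these are NOT values from headbrain4.csv.
data = pd.DataFrame({
    'Gender': [1, 1, 2, 2, 1, 2],
    'Age Range': [1, 2, 1, 2, 2, 1],
    'Head Size(cm^3)': [4512, 3738, 3390, 3215, 4208, 3302],
    'Brain Weight(grams)': [1530, 1297, 1230, 1180, 1405, 1195],
})

target = 'Brain Weight(grams)'
k1 = data.corr(method='spearman')

# Keep only the target's column, drop its correlation with itself, and
# order the features by the absolute value of their correlation.
by_strength = k1[target].drop(target).abs().sort_values(ascending=False)
print(by_strength)
```

Taking the absolute value treats a strong negative correlation as just as informative as a strong positive one, which is usually what is wanted when screening features.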
If you look carefully, you will see that the correlation between brain weight and head size is reported as 1. Note that the script rounds each printed coefficient to the nearest integer, so this indicates a strong positive correlation rather than an exactly perfect one; the unrounded values are in k1. If you remember, we got a similar value with Pearson's correlation.
Now go to the IPython console, where you will see some self-explanatory scatter plots. If you have any trouble understanding them, have a look at my previous article about Pearson's correlation and its implication in machine learning.
That is all for today. I hope you liked it. If you have any queries, just drop a comment below and I will be happy to help.
Translated from: https://www.includehelp.com/ml-ai/spearmans-correlation-and-its-implication-in-machine-learning.aspx