Statistical Testing: Understanding How to Select the Best Test for Your Data
This post is not meant for seasoned statisticians. It is geared towards data scientists and machine learning (ML) learners and practitioners who, like me, do not come from a statistical background.
For someone from a non-statistical background, the most confusing aspect of statistics is the set of fundamental statistical tests, and knowing when to use which one. This post is an attempt to mark out the differences between the most common tests and their key assumptions.
1) TERMINOLOGIES (KEY TERMINOLOGIES FOR THIS POST)
DEPENDENT AND INDEPENDENT VARIABLES
An independent variable, often called a “predictor variable”, is a variable that is manipulated in order to observe its effect on a dependent variable, sometimes called an outcome or output variable.
- Independent variable(s) -> Predictor variable(s)
- Dependent variable(s) -> Outcome/Output variable(s)
TYPES OF VARIABLES
It is important to distinguish between the types of variables, because the type plays a key role in determining the correct statistical test to adopt. There are two main categories:
QUANTITATIVE: expresses the amounts of things (e.g. the number of cigarettes in a pack). The two types of quantitative variables are:
CONTINUOUS: describes measurements and can usually be divided into units smaller than one (e.g. 1.50 kg).
DISCRETE: describes counts and usually can’t be divided into units smaller than one (e.g. 1 cigarette).
Quantitative variables are also described by their level of measurement: interval (no true zero, e.g. temperature in °C) or ratio (a true zero, e.g. weight).
CATEGORICAL: expresses groupings of things (e.g. different types of fruit). The three types of categorical variables are:
ORDINAL: represents data with an order (e.g. rankings).
NOMINAL: represents group names (e.g. brands or species names).
BINARY: represents data with a yes/no or 1/0 outcome (e.g. LEFT or RIGHT).
2) STATISTICAL TESTS
Statistics is all about data. Data alone is not interesting. It is the interpretation of the data that we are interested in.
Statistical testing is one of the most important parts of statistics. If statistics “is the interpretation of the data”, statistical testing can be considered the “formal procedure for investigating our ideas about the world”.
In other words, whenever we want to make claims about the distribution of data, or about whether one set of results is different from another, we must rely on hypothesis testing.
HYPOTHESIS TESTING
Using hypothesis testing, we try to draw conclusions about a population from sample data. We evaluate two mutually exclusive statements about the population to determine which one is better supported by the sample data.
THERE ARE FIVE MAIN STEPS IN HYPOTHESIS TESTING:
Step 1) State your hypothesis as a null (Ho) and an alternate (Ha) hypothesis.
Step 2) Choose a significance level (also called alpha or α).
Step 3) Collect data in a way designed to test the hypothesis.
Step 4) Perform an appropriate statistical test: compute the p-value from the test and compare it to the significance level.
Step 5) Decide whether to “REJECT” the null hypothesis (Ho) or “FAIL TO REJECT” the null hypothesis (Ho).
Note: Though the specific details might vary, the procedure you will use when testing a hypothesis will always follow some version of these steps.
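As a rough sketch of how the five steps fit together, the Python snippet below runs a simple permutation test for a difference in means using only the standard library. The data, the function name `permutation_test`, and the choice of a permutation test are illustrative assumptions, not something this post prescribes. Any appropriate test could take its place in Step 4.

```python
import random
import statistics


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means (illustrative helper)."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are exchangeable, so shuffle and re-split.
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # The add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Step 1: Ho = both groups have the same mean; Ha = the means differ.
# Step 2: choose the significance level.
alpha = 0.05

# Step 3: collect data (made-up numbers, for illustration only).
control = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]
treated = [12.9, 13.1, 12.7, 13.4, 12.8, 13.0, 13.3, 12.6]

# Step 4: run the test and obtain the p-value.
p_value = permutation_test(control, treated)

# Step 5: compare the p-value to alpha and decide.
decision = "REJECT Ho" if p_value < alpha else "FAIL TO REJECT Ho"
print(f"p-value = {p_value:.4f} -> {decision}")
```

Note that “fail to reject” is not the same as “accept”: a large p-value only means the sample did not provide enough evidence against Ho.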
If you want to understand hypothesis testing further, I would highly recommend these two great posts on hypothesis testing.
3) STATISTICAL ASSUMPTIONS
Statistical tests make some common assumptions about the data being tested. If these assumptions are violated, the test may not be valid (e.g. the resulting p-value may not be correct).
Independence of observations: the observations you include in your test should not be related (e.g. several measurements from the same test subject are not independent, while measurements from multiple different test subjects are independent).
Homogeneity of variance: the variance within each group being compared should be similar to the variance within every other group. If one group has a much bigger variance than the others, this will limit the test’s effectiveness.
Normality of data: the data follows a normal distribution, i.e. a symmetric, bell-shaped curve centred on the mean. (The standard normal distribution is the special case with mean 0 and standard deviation 1.)
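Two of these assumptions can be checked on the data itself. The sketch below, which assumes NumPy and SciPy are available and uses made-up samples, applies the Shapiro-Wilk test for normality and Levene's test for equal variances. Independence usually cannot be tested this way, since it depends on how the data was collected.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Made-up samples: two groups drawn from normal distributions with equal spread.
group_a = rng.normal(loc=50.0, scale=5.0, size=40)
group_b = rng.normal(loc=53.0, scale=5.0, size=40)

# Shapiro-Wilk: Ho = the sample comes from a normal distribution.
shapiro_p = {}
for name, sample in (("A", group_a), ("B", group_b)):
    w_stat, p = stats.shapiro(sample)
    shapiro_p[name] = p
    print(f"Shapiro-Wilk, group {name}: W = {w_stat:.3f}, p = {p:.3f}")

# Levene: Ho = the groups have equal variances.
levene_stat, levene_p = stats.levene(group_a, group_b)
print(f"Levene: W = {levene_stat:.3f}, p = {levene_p:.3f}")
```

In both tests a small p-value is evidence that the assumption is violated, while a large p-value only means that no violation was detected.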
4) PARAMETRIC TESTS
Parametric tests are those that should only be run on data that satisfies the three statistical assumptions mentioned above. The most common parametric tests fall into three categories.
Regression tests:
These tests are used to test cause-and-effect relationships, that is, whether a change in one or more continuous variables predicts a change in another variable. Keep in mind that a regression on its own shows prediction; reading the result as cause and effect also depends on how the data was collected.
Simple linear regression: tests how a change in the predictor variable predicts the level of change in the outcome variable.
Multiple linear regression: tests how changes in the combination of two or more predictor variables predict the level of change in the outcome variable.
Logistic regression: is used to describe data and to explain the relationship between one binary dependent variable and one or more nominal, ordinal, interval or ratio-level independent variables.
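To make the simplest of these concrete, here is a minimal sketch of a simple linear regression using `scipy.stats.linregress`. The variable names and numbers are invented for illustration. Multiple and logistic regression need a fuller modelling library (statsmodels is one option) and are not shown.

```python
import numpy as np
from scipy import stats

# Made-up data: one predictor (hours studied) and one outcome (exam score).
hours_studied = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
exam_score = np.array([52, 55, 61, 64, 70, 73, 79, 83], dtype=float)

result = stats.linregress(hours_studied, exam_score)

# The slope estimates the change in the outcome per unit change in the predictor.
print(f"slope = {result.slope:.2f}, intercept = {result.intercept:.2f}")
# The p-value tests Ho: the slope is zero (the predictor has no linear effect).
print(f"r^2 = {result.rvalue ** 2:.3f}, p-value = {result.pvalue:.2g}")
```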
Comparison tests:
These tests look for differences between the means of groups (comparison of means).
T-tests are used when comparing the means of precisely two groups (e.g. the average heights of men and women).
Independent t-test: tests the difference in the same variable between different populations (e.g. comparing dogs to cats).
ANOVA and MANOVA tests are used to compare the means of more than two groups (e.g. the average weights of children, teenagers, and adults).
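The two examples above can be sketched with SciPy as follows. The heights and weights are simulated, so the group means and spreads are assumptions chosen only to make the example run. `ttest_ind` assumes equal variances by default; passing `equal_var=False` switches to Welch's t-test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Independent t-test: exactly two groups (simulated heights in cm).
men = rng.normal(loc=175.0, scale=7.0, size=50)
women = rng.normal(loc=162.0, scale=7.0, size=50)
t_stat, t_p = stats.ttest_ind(men, women)
print(f"t-test: t = {t_stat:.2f}, p = {t_p:.2g}")

# One-way ANOVA: more than two groups (simulated weights in kg).
children = rng.normal(loc=30.0, scale=6.0, size=30)
teenagers = rng.normal(loc=55.0, scale=6.0, size=30)
adults = rng.normal(loc=72.0, scale=6.0, size=30)
f_stat, f_p = stats.f_oneway(children, teenagers, adults)
print(f"ANOVA: F = {f_stat:.2f}, p = {f_p:.2g}")
```

A significant ANOVA result only says that at least one group mean differs; it does not say which one.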
Correlation tests:
These tests look for an association between variables, checking whether two variables are related.
Pearson Correlation: tests for the strength of the association between two continuous variables.
Spearman Correlation: tests for the strength of the association between two ordinal variables (it does not rely on the assumption of normally distributed data).
Chi-Square Test: tests whether there is an association between two categorical variables.
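A short sketch of all three tests with SciPy is given below. Every dataset is made up for illustration: two continuous measurements for Pearson, two sets of rankings for Spearman, and a 2x2 table of counts for the chi-square test.

```python
import numpy as np
from scipy import stats

# Pearson: two continuous variables (made-up heights and weights).
height_cm = np.array([150, 155, 160, 165, 170, 175, 180, 185], dtype=float)
weight_kg = np.array([52, 57, 59, 66, 68, 75, 79, 88], dtype=float)
r, r_p = stats.pearsonr(height_cm, weight_kg)
print(f"Pearson r = {r:.3f}, p = {r_p:.2g}")

# Spearman: two ordinal variables (rankings given by two hypothetical judges).
judge_1 = [1, 2, 3, 4, 5, 6, 7, 8]
judge_2 = [2, 1, 4, 3, 5, 7, 6, 8]
rho, rho_p = stats.spearmanr(judge_1, judge_2)
print(f"Spearman rho = {rho:.3f}, p = {rho_p:.2g}")

# Chi-square: two categorical variables, summarised as a table of counts
# (rows = group, columns = outcome).
observed = np.array([[30, 10],
                     [15, 25]])
chi2, chi_p, dof, expected = stats.chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, dof = {dof}, p = {chi_p:.2g}")
```

The correlation coefficients (`r`, `rho`) measure how strong the association is, while the p-values indicate whether an association is detectable at all.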
5) FLOWCHART: CHOOSING A PARAMETRIC TEST
This flowchart will help you choose among the parametric tests described above. For non-parametric alternatives, see the following section.
PARAMETRIC TEST Flowchart (Image by author)
6) DEALING WITH NON-NORMAL DISTRIBUTIONS
Although the normal distribution plays a central part in statistics, many processes follow non-normal distributions. Many datasets naturally fit a non-normal model:
- The number of accidents tends to fit a “Poisson distribution”.
- The lifetimes of products usually fit a “Weibull distribution”.
Example of Non-Normal Distributions
Well then, how do we deal with non-normal distributions?
When your data is supposed to fit a normal distribution but doesn’t, there are a few things we can do:
- We may still be able to run parametric tests if the sample size is large enough (as a rough rule of thumb, over 20 items) and interpret the results with that caveat in mind.
- We may choose to transform the data (e.g. with a log or square-root transformation) so that it fits a normal distribution more closely.
If the sample is small, or the data is skewed or follows another type of distribution, you might run a non-parametric test instead.
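As an illustration of the transformation option, the sketch below simulates right-skewed data and applies a log transformation, which is one common choice among several. The lognormal parameters are assumptions picked so that the skew is easy to see.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated right-skewed data (e.g. something like incomes or waiting times).
skewed = rng.lognormal(mean=3.0, sigma=0.8, size=200)

# A log transformation requires strictly positive values.
transformed = np.log(skewed)

# Sample skewness: close to 0 for symmetric data, positive for a long right tail.
skew_before = stats.skew(skewed)
skew_after = stats.skew(transformed)
print(f"skewness before = {skew_before:.2f}, after = {skew_after:.2f}")
```

After transforming, any test results describe the transformed scale (here, log units), so conclusions need to be translated back with care.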
Non-Parametric Tests
Non-parametric tests (figure below) make fewer assumptions about the data and are useful when one or more of the three statistical assumptions are violated.
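Note that the inferences non-parametric tests make are generally not as strong as those of parametric tests: when the parametric assumptions do hold, a non-parametric test typically has less statistical power.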
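As one example, the Mann-Whitney U test is a widely used non-parametric counterpart of the independent t-test. The sketch below uses made-up scores that include an outlier, the kind of data where a t-test's normality assumption would be doubtful.

```python
from scipy import stats

# Made-up scores for two independent groups; note the outlier (25) in `treated`.
control = [3, 5, 4, 6, 2, 5, 4, 3, 7, 4]
treated = [8, 9, 6, 10, 7, 9, 12, 8, 25, 9]

# The test works on the ranks of the values, so the outlier has limited influence.
u_stat, u_p = stats.mannwhitneyu(control, treated, alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {u_p:.2g}")
```

SciPy also provides `stats.kruskal` for more than two independent groups and `stats.wilcoxon` for paired samples.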
NON-PARAMETRIC TESTS (Image by author)
Hope you find this post informative and useful. Please let me know if you have any feedback. Thanks a lot for reading!
Translated from: https://towardsdatascience.com/statistical-testing-understanding-how-to-select-the-best-test-for-your-data-52141c305168