What If the Exam Marks Are Not Normally Distributed?
Usually, when I tell you a student scored 90 marks, you would think this is a very good student. If I say the score is 75 instead, you would probably think the student is average. However, as data scientists or analysts, we should immediately ask at least two questions:
1. What is the full mark?
2. How are the marks distributed across the class?
The first question is obviously important, and perhaps everyone would ask it, because 90/150 is certainly no better than 75/100. The second question is more subtle, and possibly only a “data person” would think to ask it.
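To make the comparison concrete, a raw score can be divided by the exam's full mark before two scores are compared. The snippet below is a small illustrative sketch; the helper name `as_fraction` is hypothetical and not part of the original article.

```python
def as_fraction(score, full_mark):
    """Return the score as a fraction of the exam's full mark."""
    if full_mark <= 0:
        raise ValueError("full_mark must be positive")
    return score / full_mark


student_a = as_fraction(90, 150)  # 90 marks out of 150
student_b = as_fraction(75, 100)  # 75 marks out of 100

print(f"90/150 = {student_a:.2f}")  # 90/150 = 0.60
print(f"75/100 = {student_b:.2f}")  # 75/100 = 0.75
```

On this common scale, the student with 75 marks is ahead, even though 90 is the larger raw number.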
In fact, to make sure that an exam produces normally distributed results in a class, it is quite common to select the exam questions in fixed proportions across difficulty levels, from easy to difficult.
What if the exam consists of 100% easy questions or 100% difficult questions? In that case, the results are very unlikely to be normally distributed across the class.
Why Do We Need to Normalise/Standardise Data?
Photo by SamuelFrancisJohnson on Pixabay
This brings us to the main topic. I have been a tutor at a university for 5 years. It cannot always be guaranteed that the exam questions precisely follow the above proportions. To make sure the exam is fair to all students, in other words that neither too many students fail nor too many get A grades, we sometimes need to normalise the marks so that they follow a normal distribution.
Normalisation is also very important when we want to compare students from different universities or aggregate the results of multiple exams.
For demonstration purposes, let’s suppose we have 5 different exams. Because we need to randomly generate marks with the expected distributions, the following imports are needed.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns  # used for the distribution plots below
from scipy.stats import skewnorm  # used to generate skewed distributions
1. Full mark is 100. Too many basic questions
ex1 = np.array(skewnorm.rvs(a=-10, loc=95, scale=20, size=200)).astype(int)
2. Full mark is 100. Too many difficult questions
ex2 = np.array(skewnorm.rvs(a=5, loc=30, scale=20, size=200)).astype(int)
3. Full mark is 100. Normally distributed
ex3 = np.random.normal(70, 15, 200).astype(int)
ex3 = ex3[ex3 <= 100]  # drop generated marks above the full mark
4. Full mark is 50. Normally distributed
ex4 = np.random.normal(25, 7, 200).astype(int)
5. Full mark is 200. Normally distributed
ex5 = np.random.normal(120, 30, 200).astype(int)
Let’s plot them together using Seaborn distplot.
plt.figure(figsize=(16,10))
# distplot is deprecated since seaborn 0.11; histplot(..., kde=True) is the suggested replacement
sns.distplot(ex1)
sns.distplot(ex2)
sns.distplot(ex3)
sns.distplot(ex4)
sns.distplot(ex5)
plt.show()
It is obvious that these 5 exams have completely different distributions. With a dataset like this, we cannot compare the results directly.
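The visual impression can also be checked numerically. The following is a minimal sketch, assuming freshly generated samples with a fixed seed rather than the exact arrays above (the names `rng`, `easy`, `hard` and `balanced` are illustrative). It uses `scipy.stats.skew` to measure how lopsided each distribution is.

```python
import numpy as np
from scipy.stats import skew, skewnorm

rng = np.random.default_rng(42)  # fixed seed so the sketch is repeatable

# Same generator settings as exams 1 to 3 above, but seeded and left as floats.
easy = skewnorm.rvs(a=-10, loc=95, scale=20, size=200, random_state=rng)
hard = skewnorm.rvs(a=5, loc=30, scale=20, size=200, random_state=rng)
balanced = rng.normal(70, 15, 200)

for name, marks in [("easy", easy), ("hard", hard), ("balanced", balanced)]:
    # Negative skew: long tail towards low marks; positive skew: towards high marks.
    print(f"{name:>8}: mean={marks.mean():6.1f}  "
          f"std={marks.std():5.1f}  skew={skew(marks):+.2f}")
```

A skewness close to 0 suggests a roughly symmetric distribution, while a clearly negative or positive value points to an exam that was too easy or too difficult for the class.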
Min-Max Normalisation
Photo by ReadyElements on Pixabay
The basic idea of Min-Max Normalisation is to rescale all the values into the interval [0, 1]. It is fairly easy to do this.
from sklearn import preprocessing
min_max_scaler = preprocessing.MinMaxScaler()
ex1_norm_min_max = min_max_scaler.fit_transform(ex1.reshape(-1,1))
ex2_norm_min_max = min_max_scaler.fit_transform(ex2.reshape(-1,1))
ex3_norm_min_max = min_max_scaler.fit_transform(ex3.reshape(-1,1))
ex4_norm_min_max = min_max_scaler.fit_transform(ex4.reshape(-1,1))
ex5_norm_min_max = min_max_scaler.fit_transform(ex5.reshape(-1,1))
Please note that the scaler expects a 2-D input, so we need to convert our 1-D NumPy arrays before they can be normalised. The easiest way of doing this is to reshape them into column vectors with reshape(-1, 1).
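For reference, min-max normalisation maps each value x to (x - min) / (max - min), which is also what MinMaxScaler does for each column with its default range of [0, 1]. The sketch below applies that formula directly with NumPy to a small made-up array (the name `marks` and its values are illustrative, not data from the article) and shows what reshape(-1, 1) does to the array's shape.

```python
import numpy as np

marks = np.array([30, 45, 60, 75, 90])  # illustrative marks, not real data

# Min-max formula: the smallest mark maps to 0 and the largest to 1.
marks_min_max = (marks - marks.min()) / (marks.max() - marks.min())
print(marks_min_max)  # [0.   0.25 0.5  0.75 1.  ]

# reshape(-1, 1) turns the 1-D array into a single-column 2-D array.
column = marks.reshape(-1, 1)
print(marks.shape, column.shape)  # (5,) (5, 1)
```

Note that the formula only works when the maximum and minimum differ; an exam where every student got the same mark would need special handling.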
After normalising, we don’t have to convert the results back to 1-D arrays to visualise them. Below is the histogram after normalising. We can now be more confident about putting the 5 different exam results together.
Z-Score Standardisation
Photo by aitoff on Pixabay
Z-Score is another commonly used technique. It is called standardisation rather than normalisation because it “standardises” the data in two aspects:
1. The mean of the data becomes 0.
2. The standard deviation of the data becomes 1.
Therefore, we can calculate it as follows.
from sklearn import preprocessing
ex1_scaled = preprocessing.scale(ex1)
ex2_scaled = preprocessing.scale(ex2)
ex3_scaled = preprocessing.scale(ex3)
ex4_scaled = preprocessing.scale(ex4)
ex5_scaled = preprocessing.scale(ex5)
It can be seen that Z-Score Standardisation not only normalised the exam results but also rescaled them.
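The z-score of a mark x is (x - mean) / std, where the mean and standard deviation are taken over the whole class. The sketch below is a minimal NumPy version, assuming the population standard deviation (ddof=0), which to my knowledge is also what preprocessing.scale uses. The helper `z_score`, the seeded arrays `exam_a` and `exam_b`, and the two example students are hypothetical and only serve the illustration.

```python
import numpy as np


def z_score(marks):
    """Standardise marks to mean 0 and standard deviation 1 (population std)."""
    marks = np.asarray(marks, dtype=float)
    return (marks - marks.mean()) / marks.std()


rng = np.random.default_rng(0)     # fixed seed for repeatability
exam_a = rng.normal(25, 7, 200)    # full mark 50, like exam 4
exam_b = rng.normal(120, 30, 200)  # full mark 200, like exam 5

for name, marks in [("exam_a", exam_a), ("exam_b", exam_b)]:
    scaled = z_score(marks)
    print(f"{name}: mean={scaled.mean():.2f}, std={scaled.std():.2f}")

# Compare two hypothetical students who sat different exams.
z_a = (40 - exam_a.mean()) / exam_a.std()    # 40 out of 50 in exam_a
z_b = (150 - exam_b.mean()) / exam_b.std()   # 150 out of 200 in exam_b
print(f"student A z = {z_a:.2f}, student B z = {z_b:.2f}")
```

A larger z-score means the student stands further above the class average, measured in standard deviations, which makes marks from exams with different full marks and spreads comparable.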
Summary
Photo by pasja1000 on Pixabay
In this article, exams are used as examples to explain why we need to normalise or standardise datasets. I have seen many learners and Data Science students who are real algorithm enthusiasts. They may know many different types of data mining and machine learning algorithms. However, I would say that, most of the time, data transformation is more important than selecting an algorithm.
Therefore, I have also demonstrated how to use the Python scikit-learn library to easily normalise/standardise the data. I hope it helps those who have just entered the field of Data Science and Data Analytics.
All the code used in this article can be found in this Google Colab Notebook.
Translated from: https://towardsdatascience.com/what-if-the-exam-marks-are-not-normally-distributed-67e2d2d56286