Clustered & Overlapped Bar Charts
Why & How
1.- Clustered Bar Charts
AKA: grouped, side-by-side, multiset [bar charts, bar graphs, column charts]
Why: Clustered Bar Charts (CBC) display numerical information about the relative proportion between a main category and its subgroups, which belong to a second categorical variable. Like Stacked Bar Graphs, they should be used for Comparisons and Proportions but with emphasis on Composition. Unlike Stacked Bar Graphs, the elements that make up the subcategories may be only loosely related. CBC are particularly effective when a whole is divided into multiple parts. They enable comparisons across subcategories, whereas Stacked Bar Graphs favor comparisons within subcategories.
They can show how subgroups change over time, but the chart becomes harder to read as the time span lengthens and the number of subcategories grows. They should not be used for Relationship or Distribution analysis.
How: like other bar charts, CBC are two-dimensional, with two axes: one axis shows categories, the other shows numerical values. The category axis has no scale, which highlights that it refers to discrete (mutually exclusive) groups. The axis with numerical values must have a scale with its corresponding measurement units.
CBC are represented by means of sets of rectangular bars that can be oriented horizontally or vertically. Each principal category is divided into a cluster of bars representing subcategories of the second categorical variable. The quantity of each subcategory is shown by the length or height of those rectangular bars that are located side by side forming a cluster, with gaps between clusters slightly wider than a single standard bar.
Fig. 1: schematic diagram of a clustered bar chart. The figure was developed with Matplotlib.
Subcategories can be ordinal or nominal, but equivalent subgroups must have the same color in each cluster so as not to confuse the audience. It is essential to use an appropriate color palette, balanced spacing and a layout that facilitates comparison. As bars are heavy visual markers, use gridlines sparingly, only where they improve the storytelling.
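The clustered layout described above can be sketched with Matplotlib. This is a minimal sketch, not the code behind Fig. 1: the function names, the default bar width and any values passed in are assumptions made for illustration. Each subcategory is shifted by a fixed offset around its category position and is drawn with a single call, so it keeps the same color in every cluster.

```python
def cluster_offsets(n_sub, bar_width):
    """Offsets that center n_sub side-by-side bars on a category position."""
    return [(i - (n_sub - 1) / 2) * bar_width for i in range(n_sub)]


def plot_clustered_bars(categories, series, bar_width=0.24):
    """Draw one cluster per category.

    `series` maps a subcategory name to its list of values (one per category).
    With three subcategories, the default width leaves a gap of 0.28 between
    clusters, slightly wider than a single bar.
    """
    # Imported here so the layout helper above has no dependencies.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    positions = list(range(len(categories)))
    offsets = cluster_offsets(len(series), bar_width)
    for offset, (name, values) in zip(offsets, series.items()):
        # One ax.bar call per subcategory: same color in every cluster.
        ax.bar([p + offset for p in positions], values,
               width=bar_width, label=name)
    ax.set_xticks(positions)
    ax.set_xticklabels(categories)  # discrete groups, no numeric scale
    ax.legend()
    return fig, ax
```

For example, `plot_clustered_bars(["2016", "2017"], {"Sales": [10, 12], "Expenses": [7, 8]})` returns the figure and axes for a two-cluster chart; the numbers are placeholders.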
The following figure shows data about a company's performance in terms of sales, expenses and profits for the 2016–2019 period. It is a vertically oriented clustered bar chart with years as the main category. Sales, expenses and profit for each year form a cluster. The visualization clearly highlights that in 2018, even with the increase in expenses and reduction in sales, profit remained relatively constant.
Fig. 2: economic performance of a fictitious company during the 2016–2019 period. The figure was developed with Matplotlib.
It is interesting to compare the same data represented by means of a stacked bar chart. As previously indicated, CBC are appropriate when you want to compare across subcategories: sales in 2016 vs. 2017 vs. 2018 vs. 2019; expenses in 2016 vs. 2017 vs. 2018 vs. 2019; profit in 2016 vs. 2017 vs. 2018 vs. 2019. By contrast, the stacked bar chart only allows a good comparison for the segments that sit on the baseline (sales), because expenses and profits start from a different baseline in each bar. Also, the height of each principal bar (the sum of sales, expenses and profit for a particular year) is meaningless.
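The baseline problem can be made concrete with a few lines of Python. In a stacked chart every segment starts where the previous one ended, so only the first series starts at zero. The function name and the yearly values below are assumptions made for illustration; they are not the data behind Fig. 2.

```python
def stacked_baselines(series):
    """Return the starting height (baseline) of each segment in a stacked chart.

    `series` is a list of value lists, one per subcategory, in stacking order.
    """
    running = [0.0] * len(series[0])
    baselines = []
    for values in series:
        baselines.append(list(running))
        # The next segment starts on top of everything stacked so far.
        running = [r + v for r, v in zip(running, values)]
    return baselines


# Hypothetical yearly figures (arbitrary units), in stacking order.
sales = [100, 120, 110, 130]
expenses = [60, 70, 75, 80]
profit = [40, 50, 35, 50]

for name, base in zip(("sales", "expenses", "profit"),
                      stacked_baselines([sales, expenses, profit])):
    print(name, base)
```

With these numbers, sales start at 0 every year, while expenses start at 100, 120, 110 and 130, so the expense segments cannot be compared by eye. In Matplotlib these lists are what would be passed as the `bottom` argument of `ax.bar` to build the stacked version.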
Fig. 3: stacked bar graph with the same data as Fig. 2.
The next figure relates to statistics on tertiary education in the European Union (EU-28) in 2017. There were 19.8 million tertiary students that year; women accounted for 54% of that number, although the majority of the students following doctoral studies were men. In addition, a quarter of all students were enrolled in business, administration and law studies. The following clustered bar chart shows that women outnumber men in Education, Social Sciences, Arts and Humanities, Health and Welfare and also in Business, Administration and Law studies. On the other hand, men outnumber women in IT and in Engineering, Manufacturing and Construction studies (Eurostat, 2020). The chart clearly displays numerical information about the participation of men and women in tertiary education across broad fields of education. It is a horizontally oriented CBC in which educational fields make up the principal category while gender is the second categorical variable.
Fig. 3: distribution of tertiary education students by field and gender for the European Union during 2017. Source (#1)
The main problem with clustered bar graphs is that they don’t clearly visualize the ratio of the individual parts relative to the whole. As a result, proportions are not easy to evaluate. Their strength lies in direct comparisons between equivalent subcategories of the second categorical variable.
2.- Overlapped Bar Charts
AKA: Overlay, Overlapping, Superimposed [bar charts, bar graphs, column charts]
Why: Overlapped Bar Charts (OVC) are used to make comparisons between different items or categories. OVC compare only two numerical variables per item or category in a single diagram. The numerical variables must be closely related to merit a comparison. They are also used to show trends over time based on similar premises. They should not be used for Relationship or Distribution analysis.
The idea behind OVC is that contrasting the numerical values of two variables, drawn one on top of the other, conveys the message (storytelling) with greater expository power. In that sense, they are better than Clustered Bar Graphs because the comparison is more intuitive. This kind of chart shows surpluses and shortfalls with remarkable precision, particularly when appropriate gridlines are added. They are frequently used to show the level of progress against an objective or a benchmark.
Fig. 4: schematic diagram of an overlapped bar chart. The figure was developed with Matplotlib.
How: it is a two-dimensional graph with two axes, like every standard bar chart, with rectangular bars that can be oriented horizontally or vertically. One axis shows categories, the other shows the numerical values of the two variables. Bars representing the same category share the same baseline and the same location on the category axis. Both numerical variables must be closely related and share the same numerical scale. The bars have a different width for each numerical variable, with the narrower one drawn in front for legibility. The drawback is that a given variable may be the shorter bar in some categories and the longer one in others.
Fig. 5: Actual versus Budgeted expenses for a fictitious company during the 2012–2019 period. The figure was developed with Matplotlib.
Some visualization tools allow several numerical variables (multiple data series) to overlap partially, so that the rectangles representing each successive variable are partly hidden by the rectangles in front of them. Conceptually, these charts are clustered (grouped) bar charts in which the rectangles representing the different data sets begin to overlap instead of sitting side by side. OVC are the extreme case, in which one rectangle overlaps another by 100%. Undoubtedly, audiences will find it very difficult to make comparisons with three or more partially overlapping bars. Their use could be justified when data for multiple subcategories must be compared over very long periods of time in a single diagram.
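The continuum from clustered to fully overlapped bars can be expressed with a single overlap fraction. The sketch below is an illustration under stated assumptions: `series_offsets`, `plot_overlapped_bars`, the `overlap` parameter and the bar widths are names and values chosen here, not part of Matplotlib or of the figures above. An overlap of 0 reproduces the side-by-side layout, and an overlap of 1 places both bars at the same position, where a narrower bar in front keeps both visible.

```python
def series_offsets(n_series, bar_width, overlap):
    """Horizontal offsets for n_series bars around a category position.

    overlap=0.0 places the bars side by side (clustered chart);
    overlap=1.0 places them at the same position (fully overlapped chart).
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be between 0 and 1")
    step = bar_width * (1.0 - overlap)
    return [(i - (n_series - 1) / 2) * step for i in range(n_series)]


def plot_overlapped_bars(categories, back, front, back_label, front_label):
    """Fully overlapped chart: wide bars behind, narrow bars in front."""
    # Imported here so the layout helper above has no dependencies.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    positions = list(range(len(categories)))
    # Both series share position and baseline; only the widths differ.
    ax.bar(positions, back, width=0.6, label=back_label, zorder=2)
    ax.bar(positions, front, width=0.3, label=front_label, zorder=3)
    ax.set_xticks(positions)
    ax.set_xticklabels(categories)
    ax.legend()
    return fig, ax
```

For instance, `series_offsets(2, 0.4, 0.5)` shifts two bars by about -0.1 and 0.1, so each hides half of its neighbor, while `plot_overlapped_bars(["2012", "2013"], [50, 55], [48, 60], "Budgeted", "Actual")` draws a two-category chart with placeholder values.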
Fig. 6: partially overlapped bar charts, source Peltier Tech Blog (#2)
To sum up, you might use a clustered bar graph when you want to make direct comparisons across parts of a whole. On the other hand, overlapped bar graphs allow excellent comparisons between two closely related numerical variables.
As usual with standard bar graphs, I recommend the following tips and warnings for both types of charts:
Start the baseline at 0: if the bars are truncated, the actual value is not properly reflected;
Vertical orientation (column charts) is recommended when chronological data (time series, temporal data) or negative numerical values are present (Fig. 2 & Fig. 5). On the other hand, horizontal orientation is preferable when graphing numerous categories, in particular with very long labels (Fig. 3);
Partially overlapped bar charts only convey a clear message if longer bars are always behind shorter ones;
Avoid all 3D effects. Although they are aesthetically pleasing, they are against all the rules for an appropriate Data Visualization.
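Two of the tips above lend themselves to simple checks before plotting. The following is a rough sketch, not an established rule set: the function names are made up here, and the cut-offs for "numerous categories" and "very long labels" are arbitrary assumptions that should be tuned to the chart at hand.

```python
def suggest_orientation(labels, is_time_series=False, has_negative=False,
                        max_categories=8, max_label_len=12):
    """Return "vertical" or "horizontal" following the orientation tip.

    max_categories and max_label_len are arbitrary cut-offs chosen for
    this sketch, not values taken from the article.
    """
    if is_time_series or has_negative:
        return "vertical"
    too_many = len(labels) > max_categories
    too_long = any(len(label) > max_label_len for label in labels)
    return "horizontal" if too_many or too_long else "vertical"


def longer_bars_always_behind(back, front):
    """True if every back bar is at least as long as the bar in front of it."""
    return all(b >= f for b, f in zip(back, front))
```

A call such as `suggest_orientation(["Engineering, manufacturing and construction", "Education"])` suggests a horizontal layout because of the long label, and `longer_bars_always_behind` flags the series for which a partially overlapped chart would hide the shorter bars.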
If you find this article of interest, please read my previous one:
Stacked Bar Graphs, Why & How, Storytelling & Warnings
#1: https://ec.europa.eu/eurostat/statistics-explained/index.php/Tertiary_education_statistics#Fields_of_education
#2: Peltier Tech Blog, https://peltiertech.com/stacked-vs-clustered/
Original article: https://towardsdatascience.com/clustered-overlapped-bar-charts-94f1db93778e