Think you need a Dashboard? You should build a Notebook instead.
by Mahdi Karabiben
After first establishing themselves as a key component of the standard Business Intelligence model during the first years of the millennium, dashboards were rapidly adopted by most companies as the go-to tool to present data-driven insights and indicators.
When Hadoop was introduced in 2006, its launch was followed by a set of Big Data technologies that radically changed how things are done behind the scenes. They allowed parallelism on a previously unimaginable scale. For a long period, these changes were limited to data storage and data processing. Changing the way end users accessed data felt like an unnecessary step, because dashboards were still doing a fine job.
In a Big Data era that completely changed how companies process their data, dashboards managed to remain the de facto standard for making sense of the mind-boggling amounts of data being produced on a daily basis. Most companies offering dashboarding solutions rapidly adapted their products to Big Data technologies. They also offered connectors that allowed dashboards to remain the undisputed go-to tool when it comes to understanding data.
But with continuous changes and improvements to the standard Big Data technologies happening at a staggering pace, maybe it’s time to update the Big Data User Experience?
The problem with dashboards: you’re always one step behind
When they started being integrated into technology stacks at the turn of the century, dashboards answered a clear and coherent need: presenting KPIs and data-driven insights that offer answers to established questions. They were the portal to the company’s data, and allowed people with different roles and needs to understand what the data had to say. In essence, dashboards were first introduced to democratize data discovery.
But at the turn of the century, data flows were very structured, the data didn’t have that much to say, and the range of questions to ask it was limited.
That is no longer the case. With the exponential growth of the data being produced daily, the value of this new black gold reaches new highs every day. The volumes of data available for exploitation in this Big Data era don’t just offer answers to a specific set of questions. They raise questions you haven’t yet thought of asking. This led to the rise of data exploration, with data scientists trying to extract as much value from data as possible.
Relying on dashboards to visualize and extract value from your data means that you have to use another technology (usually notebooks) to explore it and decide what becomes accessible through your dashboards. With such a mechanism, the dashboard always comes in at a second phase of extracting value from data. In an era where the amounts of data available allow for a practically unlimited number of possibilities when it comes to data exploration, no dashboard can extract all of the value your data offers.
Working with this two-step mechanism means that collaboration between different roles remains limited. This is because the data architectures become too complex due to the number of technologies used by the different data specialists.
This chain of people using different technologies for different needs means that, in order to add certain insights to a dashboard, a data analyst needs to wait for a data scientist to work on the data via a notebook. In turn, the data scientist may need to wait for a data engineer to offer the data in a certain structure through a script. And remember: throughout this whole time-consuming process, the value of the data keeps decreasing.
Multiple dashboard providers have tried to integrate data exploration capabilities within their platforms, with Tableau notably offering an impressive Spark connector that allows you to run Spark SQL jobs directly from your dashboard. Still, the capabilities remain limited and the interactivity is only partial, which leaves the end user always one step behind.
Whether you’re using Kibana, Tableau, or QlikView, your dashboard can offer valuable insights regarding your data. The problem with such technologies is that they were built with data discovery in mind. Because of that, they neglect one key element made possible on a massive scale in this Big Data era: data exploration.
As data flows keep growing exponentially, limiting the main portal to your data to predefined insights means that you’re only reading the first page of a very interesting book.
Notebooks, and how they take interactivity to a completely new level
As mentioned above, notebooks have been the standard tool for data exploration for the past few years. Since the release of Project Jupyter in 2014, and through the set of functionalities it offered on top of what was already available via IPython, notebooks have attracted data scientists as an ideal data exploration tool thanks mainly to one key concept: interactivity.
Thanks to kernels (within the Jupyter ecosystem) and interpreters (within Apache Zeppelin), notebooks let you explore your data through a multitude of Big Data processing technologies. They then offer immediate access to the data via built-in visualization modules and output mechanisms. Gathering both of these capabilities into the same tool is the key to using such a tool for both data discovery and exploration.
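To make the interpreter idea more concrete, here is a minimal sketch of how a notebook might route each paragraph to a different backend. It is a toy model, not Zeppelin's actual implementation: the `%name` directive on the first line mirrors the way Zeppelin paragraphs select an interpreter, but `run_python`, `run_markdown`, `INTERPRETERS`, `run_paragraph`, and the sample data are all hypothetical names invented for this illustration.

```python
def run_python(source, state):
    """Toy 'python' interpreter: evaluate one expression against shared state."""
    return eval(source, {}, state)


def run_markdown(source, state):
    """Toy 'md' interpreter: return the text unchanged, standing in for rendering."""
    return source.strip()


# Hypothetical registry mapping a directive name to the backend that handles it.
INTERPRETERS = {"python": run_python, "md": run_markdown}


def run_paragraph(paragraph, state):
    """Split off the leading %name directive and pass the rest to that interpreter."""
    directive, _, body = paragraph.partition("\n")
    name = directive.lstrip("%").strip()
    if name not in INTERPRETERS:
        raise ValueError(f"no interpreter registered for {name!r}")
    return INTERPRETERS[name](body, state)


# Made-up sample data shared by every paragraph of the notebook.
state = {"daily_orders": [120, 95, 143]}
results = [
    run_paragraph("%md\nOrders over the last three days", state),
    run_paragraph("%python\nsum(daily_orders) / len(daily_orders)", state),
]
print(results)
```

The point of the sketch is the shape rather than the details: text and computation live side by side, and the output of each paragraph is available immediately, which is what lets one tool serve both discovery and exploration.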
Notebooks don’t just allow direct access to data; they do so while maintaining complete interactivity. They blur the line that separates data scientists from data analysts and allow people in these two roles to collaborate seamlessly.
This works perfectly thanks to the powerful protocol that notebooks rely on and to their main building block, cells (paragraphs in Zeppelin). By offering multiple cell types (for code and text), notebooks allow for efficient collaboration.
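As a rough illustration of cells as a building block, the sketch below assembles a tiny notebook document by hand, assuming the JSON layout used by Jupyter's notebook format version 4 (a list of cells, each with a `cell_type`, `metadata`, and `source`). The helper `make_cell` and the cell contents are hypothetical; in practice a notebook editor or a dedicated library would produce this structure for you.

```python
import json


def make_cell(cell_type, source):
    """Build one notebook cell as a plain dict (assumed nbformat v4 layout)."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": source}
    if cell_type == "code":
        # Code cells additionally carry an execution counter and their outputs.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        # A text cell states the question in plain language.
        make_cell("markdown", "## Daily orders\nWhich region sold the most?"),
        # Code cells hold the data preparation and the analysis.
        make_cell("code", "orders_by_region = {'north': 120, 'south': 95}"),
        make_cell("code", "max(orders_by_region, key=orders_by_region.get)"),
    ],
}

# Serialize and read back, as a notebook file on disk would be.
notebook_json = json.dumps(notebook, indent=1)
cell_types = [cell["cell_type"] for cell in json.loads(notebook_json)["cells"]]
print(cell_types)
```

Because text and code cells sit in the same ordered list, the explanation and the computation travel together in a single document, which is what makes the format convenient for collaboration.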
To show their efficiency compared to dashboards, let’s go back to the scenario we talked about earlier. In a notebook-based architecture, when a data analyst needs certain insights within a notebook, the data engineer can add a code cell in which they manipulate the data through the appropriate data processing technology. Then the data scientist uses this data in another code cell to extract the desired information and offer the output to the data analyst. This all happens without any of these three data specialists leaving the notebook.
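The scenario above can be sketched in a few lines. This is a simplified simulation under stated assumptions: each string stands in for one notebook cell, and a single dictionary (`kernel_namespace`, a hypothetical name) plays the role of the kernel's shared state, so variables defined by one cell are visible to the next. The cell contents and the sample events are made up for illustration.

```python
# Hypothetical cell written by the data engineer: load and clean the raw data.
engineer_cell = '''
raw_events = [
    {"region": "north", "amount": 120.0},
    {"region": "south", "amount": 95.5},
    {"region": "north", "amount": 30.0},
    {"region": "south", "amount": None},
]
clean_events = [e for e in raw_events if e["amount"] is not None]
'''

# Hypothetical cell written by the data scientist: aggregate the cleaned data.
scientist_cell = '''
totals = {}
for event in clean_events:
    totals[event["region"]] = totals.get(event["region"], 0.0) + event["amount"]
top_region = max(totals, key=totals.get)
'''

# Hypothetical cell read by the data analyst: present the result.
analyst_cell = '''
summary = f"Top region: {top_region} ({totals[top_region]:.2f})"
'''

# One shared namespace stands in for the kernel that all three cells run in.
kernel_namespace = {}
for cell_source in (engineer_cell, scientist_cell, analyst_cell):
    exec(cell_source, kernel_namespace)

print(kernel_namespace["summary"])  # prints "Top region: north (150.00)"
```

Each specialist only touches their own cell, yet the analyst's output is built directly on the engineer's and the scientist's work, with no handoff between separate tools.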
In an era where Fast Data is the norm, extracting value from your data through a structured pipeline using different tools for each step is no longer a sustainable pattern. The data that comes through an unstructured real-time data flow may offer valuable insights when used for batch processes. But it offers even more value when it’s progressively analyzed via near-real-time processing and interactive dashboards (i.e. notebooks) that offer complete access to the raw data and sophisticated visualizations.
Translated from: https://www.freecodecamp.org/news/think-you-need-a-dashboard-you-should-build-a-notebook-instead-33104d913f95/