How You Should Change Your Data Science Education
There are many types of data scientists, with varying skill sets and responsibilities. In my opinion, the most important distinction is between those who write code used in production and those who do reporting.
Analysts
For lack of a better term, I call the second group analysts. This does not mean they are not data scientists; people in these roles benefit from knowing machine learning, the ins and outs of data, and general programming. However, in this position communication is much more important, and your knowledge of dashboarding, charts, presentation, and interpretable statistics is more likely to be the key to success.
Engineers
An engineer writes code that gets used in production, meaning it affects business processes by being integrated with an organization’s existing software. Here, your knowledge of machine learning is more important, and you’re more likely to be working on a code-oriented team rather than a business intelligence or business analysis team.
The Target of Data Science Education
Most universities have no data science major, so pretty much everyone “moves” into data science. People who studied computer science or worked as software engineers usually take the data science engineer position, while people with other backgrounds more typically move into the analyst position. Data science education as it stands today is much more focused on the analyst than on the engineer. In my opinion, this is because the skills an analyst needs are much easier to teach, and many individuals and organizations want to cash in on the sudden spike of people wanting to change careers.
Since there is so much hype around “The Sexiest Job of the 21st Century,” as some have put it, the parties involved in data science education overemphasize the availability of and demand for analysts. So the primary issue I take with data science education is that there does not seem to be much focus on engineering skills. I believe engineering skills are more useful and in higher demand, and fewer people have them. When I talk with people beginning their careers in data science, I usually hear about their analysis skills, but in job posts I read almost exclusively about engineering skills.
What You Should Change
If you find yourself learning yet another algorithm, charting package, or dashboarding tool, or your projects seem to be stuck in Jupyter notebooks, then it’s time for a change. Once you have the hang of a few tools in one of these areas, I don’t see much point in going deeper. For example, if you already know Tableau and Cognos, is there much benefit to also knowing Power BI? I’m sure you are a smart person, so if you ever need an additional tool in a domain you already know, I think you should wait until that need arises. Other skills have a higher return on investment, and most of them are on the engineering side of things. It can be reassuring to learn more about things that are already familiar to us, but I encourage you to take a step out of your comfort zone. The change I think you should make is to focus on engineering skills. The best reason to make this transition is that the ratio of available jobs to people with the required skills is better, which is probably what you care about.
More Specifically
If you’re now convinced that you should care about being an engineer, the next step is to learn more about programming and DevOps. I have written previously in Towards Data Science about the gap in programming ability and how even many people in technology struggle with basic programming. These are my recommendations for what to learn:
- Pick up another programming language, and make it a general-purpose one like JavaScript, Java, or C++.
I have never come across a single data science tutorial that talked about Object Oriented Programming. I think it is essential to being a good programmer, and understanding the principles of OOP helps you better understand any code you write. The course Programming Foundations: Object Oriented Programming is a great place to start if you’re at the very beginning.
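To make the OOP point concrete, here is a minimal sketch in Python of how a shared interface lets different models be swapped without changing the code that uses them. The class names (`Model`, `MeanModel`, `LinearModel`) and the helper `mean_abs_error` are hypothetical, invented for illustration; they are not taken from the course or from any library.

```python
from statistics import mean


class Model:
    """Base class: every model exposes the same fit/predict interface."""

    def fit(self, xs, ys):
        raise NotImplementedError

    def predict(self, x):
        raise NotImplementedError


class MeanModel(Model):
    """Hypothetical baseline that always predicts the training mean."""

    def fit(self, xs, ys):
        self.mean_ = mean(ys)
        return self

    def predict(self, x):
        return self.mean_


class LinearModel(Model):
    """Hypothetical one-feature least-squares line."""

    def fit(self, xs, ys):
        x_bar, y_bar = mean(xs), mean(ys)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        self.slope_ = sxy / sxx
        self.intercept_ = y_bar - self.slope_ * x_bar
        return self

    def predict(self, x):
        return self.intercept_ + self.slope_ * x


def mean_abs_error(model, xs, ys):
    """Works with any Model subclass because it relies only on predict()."""
    return mean(abs(model.predict(x) - y) for x, y in zip(xs, ys))
```

Because `mean_abs_error` depends only on the `predict` method, adding a third model later means writing one new class and leaving the evaluation code untouched.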
Learn how the command line works. With practice, knowledge of the command line can make you very efficient. Even if you don’t use it regularly, you’ll still need it at some point to complete many substantial projects. To learn this I took the course Unix for Mac OS X Users by Kevin Skoglund.
Learn to properly use git and GitHub. Frankly, if you are writing code and it doesn’t end up on GitHub (or another version control host), then I would doubt your code gets used. Again, I would recommend a course by Kevin Skoglund, called Git Essential Training: The Basics.
Once you have done all of that, I would move on to learning about DevOps, since it makes you more efficient and easier to collaborate with, and the best engineering teams place a lot of emphasis on DevOps practices. DevOps combines software development (Dev) and IT operations (Ops) into one field, which enables the rapid production of new features while maintaining system reliability. Knowing about DevOps can help with continuous integration (testing code and the build as you push to your repository), continuous deployment (updating models and having them reach production immediately), packaging (making sure models can scale up to more users), and monitoring (keeping track of model scoring for users).
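As a rough illustration of the "testing code as you push" part of continuous integration, here is a minimal sketch of the kind of automated test a CI service could run on every push. The `normalize` function is a hypothetical preprocessing step invented for this example, and the tests use Python's standard `unittest` module.

```python
import unittest


def normalize(values):
    """Scale values to the range [0, 1] (hypothetical preprocessing step)."""
    lo, hi = min(values), max(values)  # min() raises ValueError on empty input
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


class NormalizeTests(unittest.TestCase):
    def test_scales_to_unit_range(self):
        self.assertEqual(normalize([2, 4, 6]), [0.0, 0.5, 1.0])

    def test_constant_input_maps_to_zeros(self):
        self.assertEqual(normalize([3, 3]), [0.0, 0.0])

    def test_empty_input_raises(self):
        with self.assertRaises(ValueError):
            normalize([])
```

Saved in a file such as `test_normalize.py` (an assumed name), these tests can be run locally with `python -m unittest`, and a CI pipeline would typically run the same command before allowing a merge.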
Essential to understanding DevOps is understanding the cloud, where services are deployed and then called the way you would call an API. The cloud is a great enabler of machine learning, since your ML models can scale very easily there and they become easier to test, update, and replace. This is why you may frequently see job descriptions require experience with AWS, GCP, Azure, IBM Public Cloud, Cloudflare, or another provider. I think that once you are comfortable doing data science on the cloud, even if you don’t look more attractive as a candidate for a data scientist job, you will at least better understand how the work of a data science engineer gets put into place.
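To give a feel for what "calling a model like an API" means, here is a small sketch that serves a toy model over HTTP on the local machine and then calls it the way a client would call a cloud-hosted endpoint. Everything here is an assumption made for illustration: the weights, the JSON shape, and the names `score`, `PredictHandler`, `start_server`, and `call_model` are hypothetical, and a real deployment would use a cloud provider's own serving tools rather than Python's built-in `http.server`.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical model parameters, standing in for a trained model.
WEIGHTS = [0.5, -0.25]
BIAS = 1.0


def score(features):
    """Linear score: bias plus the weighted sum of the features."""
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body and answer with a JSON prediction.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": score(payload["features"])}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server():
    """Start the service on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_model(url, features):
    """Client side: POST features as JSON and return the prediction."""
    request = urllib.request.Request(
        url,
        data=json.dumps({"features": features}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Bypass any configured proxy, since the service runs locally.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())["prediction"]
```

The client function never sees the model's code, only a URL and a JSON contract, which is what makes a deployed model easy to update or replace behind the same endpoint.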
Translated from: https://towardsdatascience.com/how-you-should-change-your-data-science-education-710d01f36ebd