Youth Report: Understanding Youth Sentiments
Youth-led media is any effort created, planned, implemented, and reflected upon by young people in the form of media, including websites, newspapers, television shows, and publications. Such platforms connect writers, artists, and photographers in the age range of 13–24 all around the globe and promote and defend a free youth press. Members of these platforms not only have the freedom to express their own opinions on various issues and topics but also represent various communities and let their voices be heard.
Hence, such platforms are a good source of data for understanding and analyzing youth aspirations across the globe. In the remaining sections, we explain our data collection methodology and list the results and insights derived from analyzing various topics.
What does the section talk about?
This section gives insights into how the data is distributed across newspapers and articles. The insights and visualizations show how young people are doing and how their sentiments change over time (2015–2020).
Why did we choose this topic?
This topic aims to analyze data from a different perspective, i.e., outside social media. That is why we chose to scrape and analyze data published outside social media and present our insights accordingly.
Objectives
- To scrape and process news articles from different sources and prepare them for sentiment analysis and topic modeling, in order to draw useful insights about youth sentiment.
- To conduct sentiment analysis to better understand youth sentiment.
- To collect the insights from all of these points and visualize the results in a cogent manner for the audience.
Methodology
Data Collection
To collect articles, we scraped data from various media platforms (ref. Table 1) using a scraper built with the BeautifulSoup and requests libraries in Python. A large number of articles, dated from 1994 to 2020, were scraped and merged into the final dataset that we used for analysis. We also focused on extracting articles for certain categories, viz.:
- Education
- Environment & Climate
- Human Rights
- COVID-19
- Politics
- Health and Leisure
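The core of such a scraper is pulling the article body out of a downloaded page. The sketch below is only an illustration of that step, not the project's scraper: to stay dependency-free it uses the standard library's `html.parser` as a stand-in for BeautifulSoup, and it assumes (hypothetically) that the article text lives in `<p>` tags. The names `ParagraphExtractor`, `extract_article_text`, and the sample HTML are made up for the example.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p> tags of an HTML page."""

    def __init__(self):
        super().__init__()
        self._depth = 0        # greater than 0 while inside a <p> element
        self._chunks = []      # text pieces of the paragraph being read
        self.paragraphs = []   # finished paragraphs, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Join the pieces and collapse runs of whitespace.
                text = " ".join("".join(self._chunks).split())
                if text:
                    self.paragraphs.append(text)
                self._chunks = []

    def handle_data(self, data):
        if self._depth > 0:
            self._chunks.append(data)


def extract_article_text(html):
    """Return the article body as paragraphs joined by newlines."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.paragraphs)


# Hypothetical page; a real scraper would download this HTML first.
sample_html = """
<html><body>
  <h1>Youth voices on climate</h1>
  <p>Students marched for <b>climate</b> action.</p>
  <p>Organizers expect more events.</p>
</body></html>
"""
article_text = extract_article_text(sample_html)
```

In practice each platform has its own page layout, so the tag or selector that holds the article body would need to be chosen per site.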
Tools Used:
For scraping data:
- Beautiful Soup
- Requests
- Selenium
For visualizing data:
- Matplotlib
- Seaborn
- Plotly (Python)
- Matplotlib animations
- Tableau
- Python word clouds
For sentiment analysis:
- TextBlob
- Empath analysis
- Region-based analysis
- Knowledge graph
- Network analysis
Data Preprocessing
With all the articles scraped, we next focused on preprocessing them. One of our major challenges was identifying and removing promotional content from the articles. To start with, we removed all URLs. Next, we identified the templates that each platform used for advertisements or for promoting other articles, and used regular expressions to find and remove them. We then passed the articles through a basic preprocessing pipeline that normalized case, stemmed, lemmatized, and removed special characters and common stopwords. We also identified certain redundant words, such as "journalism", that did not add to the analysis and removed them from our dataset.
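The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the project: the promotional patterns, the tiny stopword set, and the function name `preprocess` are all hypothetical, and stemming and lemmatization are left out to keep the example free of third-party libraries.

```python
import re

# Matches http(s) links and bare www. links.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")

# Hypothetical promotional templates; the real ones depend on each platform.
PROMO_PATTERNS = [
    re.compile(r"read more:.*", re.IGNORECASE),
    re.compile(r"subscribe to our newsletter.*", re.IGNORECASE),
]

# A tiny illustrative stopword list; a real run would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}

# Words that are frequent in this corpus but add nothing to the analysis.
DOMAIN_STOPWORDS = {"journalism"}


def preprocess(text):
    """Strip URLs and promo lines, normalize case, and drop stopwords."""
    text = URL_PATTERN.sub(" ", text)
    for pattern in PROMO_PATTERNS:
        # Without re.DOTALL, ".*" stops at the end of the line,
        # so only the promotional line itself is removed.
        text = pattern.sub(" ", text)
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # drop digits and special characters
    tokens = text.split()
    return [t for t in tokens if t not in STOPWORDS and t not in DOMAIN_STOPWORDS]


sample = (
    "Youth journalism is thriving in 2020!\n"
    "Read more: https://example.org/other-story\n"
    "Students demand climate action."
)
tokens = preprocess(sample)
```

Removing URLs before the promotional templates keeps the template patterns simple, since they no longer need to account for link text.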
In addition, we performed a keyword analysis as a preprocessing step to ensure everything was ready before starting the analysis. Next, we used Stanford's NER and Python's geopy library to identify the locations associated with the articles. Then we used LDA and Empath-based analysis for topic modeling and identified the following 9 topics:
- Environment (Climate Change)
- Leadership & Politics (Democracy, Leadership)
- Health
- COVID-19
- Education
- Technology
- Human Rights (LGBT, Black Lives Matter, Bullying)
- Terrorism and Violence
- Career and Employment
1. ENVIRONMENT (Author: Mr. Mateus Broilo)
There is no question that the environment is a key topic that concerns the whole of society, from youngsters to adults to elders. However, the youth of today are the future of tomorrow, and for this reason they are the part of society that will most probably suffer the most in the years to come. The environment cannot be seen as a cultural movement, simply because it is not one. It must instead be seen and dealt with as a political movement and as an economic trend that, most of the time, serves the will of powerful corporations.
The word cloud shows some of the most common and meaningful words from the Environment topic analysis. Notice that words like "climate", "change", "people", "plastic", and others shown in the figure below may be correlated with the basic concerns of young people. Not surprisingly, they are the most common words across the 380 articles analyzed. Clearly, "climate" and "change" are the two halves of a bigram. The climate is changing, and that is a fact. "People" are part of the problem but can also be the solution, especially the youth. After all, youth aspirations are a heat map of where the world should actually be going. And just out of curiosity, have you ever found a "plastic" bottle on the beach? See the figures below for more detail.
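Word frequencies and bigrams such as "climate change" can be counted with a few lines of standard-library Python. The sketch below is illustrative only: the helper name `top_ngrams` and the three tiny tokenized documents are made up for the example and are not the project's data.

```python
from collections import Counter


def top_ngrams(token_lists, n=1, k=20):
    """Return the k most frequent n-grams across tokenized documents."""
    counts = Counter()
    for tokens in token_lists:
        # zip over n shifted copies of the token list yields its n-grams.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return [(" ".join(gram), count) for gram, count in counts.most_common(k)]


# Made-up, already preprocessed documents.
docs = [
    ["climate", "change", "affects", "people"],
    ["plastic", "pollution", "and", "climate", "change"],
    ["people", "demand", "climate", "action"],
]
top_words = top_ngrams(docs, n=1, k=3)
top_bigrams = top_ngrams(docs, n=2, k=1)
```

Counting n-grams per document, rather than over one concatenated token stream, avoids creating spurious bigrams across article boundaries.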
Fig: The top-20 list of most frequent keywords.
Fig: Article sentiments per year analyzed.
One last analysis looks at lexicons, in other words, it performs a text analysis across lexical categories. The main objective here is to connect the text with a broad range of sentiments beyond positive, negative, and neutral, as shown in Figure 3.6.15. Figure 3.6.16 shows the most common categories into which the articles fall, and Figure 3.6.17 shows the Empath values associated with the categories most relevant to the environmental movement.
Fig: Empath values, colored by the most meaningful categories connected to the environmental issue.
2. CAREER AND EDUCATION (Author: Mr. Mario Vasquez Arias, Ms. Adelore Similoluwa Gloria)
Education is one of the chosen categories and is fundamental to this study: any discussion of young people would be incomplete without it. Most young people are at some level of education, be it primary school, high school, or university. Many of them therefore spend so much time at educational institutions that these become a second home and directly affect their lives. Because they are treated as a home, schools reflect the personalities, concerns, and other feelings that young people have at the time, so it is important to analyze this aspect.
In the word cloud, the two words that stand out the most are "school" and "students", which is what education is about, so these results are to be expected. The word "time" also stands out; we can infer that it refers to all the time young people spend in school, which is a large part of the day, five days a week, for many years (these words are also visible in the graph). The phrase "high school" shows that the scraped articles on which the analyses were based focused on a younger population, which is precisely the target population. Another key word is "work", which implies that young people not only study but also work, probably because of economic conditions. Another visible word is "immigrant", a topic that has come up a great deal in recent years and from which education is not exempt. The phrase "home problem" appears in a smaller size but is also important to note, as it reflects that students sometimes bring problems from home to school, affecting their grades and mental health.
Empath Analysis:
The four Empath categories with the highest values are school (also the most frequent), reading, social networks, and holidays. Over this period, these are what young people emphasized most in their writing: school, which as noted above is like a second home; the reading habit, which has been growing in both physical and digital formats; social networks, which have boomed and which practically all young people know and use; and vacations, dates eagerly awaited by young people to enjoy their free time, hobbies, and rest. Another category worth highlighting is technology, which appears at a lower level but is still relevant for young people, given rapid technological progress and the proliferation of services and devices available to anyone, especially this young population.
We can observe the average sentiment value for each year, with the highest peak in 2016 and the lowest point in 2020. The latter could be due to the negative feelings generated by the coronavirus pandemic, such as anxiety, confinement due to quarantine, loneliness, and depression, among others.
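A yearly average like this can be computed by grouping per-article scores by publication year. The sketch below assumes each article already has a polarity score in the range -1 to 1 (the range TextBlob's polarity uses); the function name `mean_sentiment_by_year` and the scores themselves are invented for illustration and are not the study's results.

```python
from collections import defaultdict
from statistics import mean


def mean_sentiment_by_year(records):
    """Average polarity per year from (year, polarity) pairs."""
    by_year = defaultdict(list)
    for year, polarity in records:
        by_year[year].append(polarity)
    # Sort by year so the result reads as a time series.
    return {year: mean(scores) for year, scores in sorted(by_year.items())}


# Made-up (year, polarity) pairs standing in for scored articles.
records = [
    (2016, 0.5),
    (2016, 0.25),
    (2019, 0.0),
    (2020, -0.25),
    (2020, -0.5),
]
yearly = mean_sentiment_by_year(records)
highest_year = max(yearly, key=yearly.get)
lowest_year = min(yearly, key=yearly.get)
```

Years with very few articles can swing the average sharply, so it may help to report the article count per year alongside the mean.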
3. TERRORISM AND VIOLENCE (Author: Ms. Shanya Sharma)
Fig: Emotion Patterns
Sentiment trends over time:
Fig: Sentiment Trends
The dip in sentiment for 2019 can be associated with the 2019 Oakland gun violence. The same can be inferred from the keywords extracted from the 2019 terrorism articles.
Keywords like "domestic violence" appear for 2020, which may be directly related to COVID-19 and the lockdown.
"Police brutality" is also a frequent keyword in the 2020 data, indicating that police brutality in imposing the lockdown (e.g., in India) or that surrounding George Floyd's case kept the youth in terror.
4. HUMAN RIGHTS (Author: Mr. Opeyemi Fabiyi)
Fig: Keywords for certain locations
Let's look at the emotion trends:
- Racism
- Violence
- Poverty
- Immigration
- Homophobia
Some concerning insights that emerged were:
- Youth in India are worried about menstrual hygiene.
- Sex trafficking is a cause of concern in developed nations like the US.
SEX TRAFFICKING:
5. POLITICS (Author: Ms. Kriti Rai Saini)
Lexicons associated with positive sentiment.
Lexicons associated with negative sentiment.
Let's analyze how sentiment changed from year to year across different political topics.
6. COVID-19 (Author: Ms. Monalisa Panda)
Fig: The presence of all topics in the articles
In the figure above, we can see that only a few articles fall under the COVID topic, and all of them are from 2020.
Mean sentiment across months, before lockdown vs. after lockdown.
Word clouds of all the articles on the topic of COVID-19:
Based on positive sentiments:
The words most often detected in positive-sentiment articles on COVID-19 are:
People, Time: most people got time to spend with their families and relatives.
Based on negative sentiments:
In this word cloud, we can see that misinformation, racism, and discrimination are some of the main negative keywords for the COVID topic.
Racism
The peak in negative emotions can be associated with the 2016 US presidential election.
Fear with respect to racism was gradually decreasing but saw a slight rise in 2020.
Region-based positive and negative sentiments on the topic of COVID:
Positive sentiment regions:
Negative sentiment regions:
This concludes the analysis of all the topics listed at the top. Once again, I would like to thank everyone at Omdena for giving me this wonderful opportunity to work on this project.
To learn about upcoming projects, visit Omdena.
Thank you!
Monalisa Panda
Original article: https://medium.com/omdena/understanding-youths-sentiments-c25ccbdb5702