Original title: Although human joys and sorrows are not shared, a sentiment analysis model can understand them
Social media has gradually become a part of people's lives, and it has also become an important source of data for psychological research. At the same time, researchers are trying to use natural language processing and machine learning techniques to predict the emotional fluctuations of social media users.
The sudden outbreak of COVID-19 last year has profoundly affected people's lives. In this unusual period, the public's state of mind has become sensitive and fragile.
With fewer outings and less in-person contact during the epidemic, people have spent more time on social networks. Some inevitably vent their frustrations with work and life online, and negative emotions such as panic, anxiety, sadness, and helplessness have also increased.
Faced with public emergencies, social media users commonly express negative emotions such as anger, fear, worry, confusion, and sadness.
According to one survey, Internet users around the world spend an average of 2 hours and 22 minutes a day on social media. Social media is no longer confined to a purely social role.
Whether domestic platforms such as WeChat Moments, Weibo, and QQ Space, or overseas ones such as Twitter, Instagram, and Facebook, all carry the status updates of countless users.
For psychology researchers, these social media posts undoubtedly provide a considerable amount of research data.
In their latest study, Johannes Eichstaedt of Stanford University and Aaron Weidman of the University of Michigan used natural language processing tools to analyze the posts of Facebook users.
The research shows that machine learning models can infer a person's mood and its fluctuations from social media posts with an accuracy comparable to traditional psychological measures.
Read between the lines and read your emotions
In recent years, the large amount of information on the Internet has become an important data source for personality science. Numerous studies have shown that social media profiles are effective for categorizing personality-related dimensions.
The latest research by Eichstaedt and Weidman provides a cutting-edge case for using social media big data to analyze and track people's psychological states.
Using social media language to track fluctuations in mental states:
A case study based on weekly mood swings
Sampling calibration
The authors used two basic emotional dimensions, valence and arousal, to rate the sentiment of Facebook posts.
Note: “Valence” and “arousal” are two dimensions used in psychology to evaluate emotions. The former expresses how positive or negative a feeling is; the latter expresses how calm or excited it is.
They first asked research assistants who already had a background in psychological research to annotate 2,895 public Facebook posts from an earlier study.
The research assistants scored each post for valence and arousal on a 9-point scale (for “valence”, 1 = “negative” and 9 = “positive”; similarly, for “arousal”, 1 = “low” and 9 = “high”).
“Valence” and “Arousal” annotations for posts by psychology research assistants
This emotion tracking dataset has been made public: https://osf.io/pbjer/files/
After the annotation was complete, the posts were used to train a machine learning model to predict which language conveys which emotion.
The authors then fit a series of models to these ratings, each of which explored a possible link between valence and arousal.
For domestic NLP researchers, a Chinese sentiment analysis dataset is more applicable. Hyperneuron therefore recommends a Chinese Weibo sentiment analysis dataset from NLPCC 2014.
The evaluation data comes from Sina Weibo. Given an entire Weibo post as input, the task is to determine whether the post expresses emotion. For posts that do, the emotion must be classified as one of anger, disgust, fear, happiness, like, sadness, or surprise.
The details of the dataset are as follows:
Chinese Weibo sentiment analysis dataset
Provided by: NLPCC 2014
Published: 2014
Quantity: hundreds of thousands of microblog texts
Data format: .xml
Data size: 18 MB
Download address: https://hyper.ai/datasets/14390
Model Creation
The team used DLATK (Differential Language Analysis ToolKit) to extract linguistic features from the selected Facebook posts based on the relative frequency of words and phrases, keeping only phrases that occur more than three times as often as would be expected by chance. In the end, 1,439 words and phrases were selected to predict “valence”, and 675 to predict “arousal”.
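The relative-frequency idea can be sketched in a few lines of plain Python. This is only an illustration and not DLATK's actual API: the function names `ngrams` and `relative_frequencies`, the simple regex tokenizer, and the restriction to one- and two-word phrases are all assumptions made for the example, and the chance-based filtering step is left out.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequencies(text, max_n=2):
    """Map each word/phrase in one post to its share of that post's n-grams.

    Hypothetical helper: the tokenizer and max_n are illustrative choices.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    features = {}
    for n in range(1, max_n + 1):
        grams = ngrams(tokens, n)
        total = len(grams)
        for gram, count in Counter(grams).items():
            features[gram] = count / total
    return features


post = "So happy, so happy today"
for phrase, freq in sorted(relative_frequencies(post).items()):
    print(f"{phrase!r}: {freq:.2f}")
```

Each post becomes a sparse dictionary of features, which is the kind of input a regression model can then be trained on.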
Next, a ridge regression model was trained on the entire language feature set to predict valence and arousal, using 10-fold cross-validation (i.e., the model is fit on 90% of the data and then evaluated on the remaining 10%).
The model's cross-validated out-of-sample prediction accuracy was 0.63 for “valence” and 0.82 for “arousal”. Compared with earlier standard measures of emotion, the model was found to estimate these two dimensions more accurately.
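To make the procedure concrete, here is a minimal pure-Python sketch of ridge regression with 10-fold cross-validation. It is not the authors' code. It assumes accuracy is scored as the Pearson correlation between out-of-sample predictions and annotated ratings, which the article does not spell out, and it uses synthetic numbers in place of real language features. All function names, the penalty `alpha`, and the omission of an intercept are choices made for illustration.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def fit_ridge(X, y, alpha):
    """Closed-form ridge weights: solve (X^T X + alpha I) w = X^T y (no intercept)."""
    d = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (alpha if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [sum(row[i] * target for row, target in zip(X, y)) for i in range(d)]
    return solve(xtx, xty)


def cross_val_predict_ridge(X, y, k=10, alpha=1.0, seed=0):
    """Predict every sample from a model fit on the other k-1 folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [0.0] * len(X)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w = fit_ridge([X[i] for i in train], [y[i] for i in train], alpha)
        for i in held_out:
            preds[i] = sum(wj * xj for wj, xj in zip(w, X[i]))
    return preds


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


# Synthetic stand-in for (language features, annotated rating) pairs.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [2 * a - b + rng.gauss(0, 0.1) for a, b in X]
preds = cross_val_predict_ridge(X, y, k=10, alpha=1.0)
print(round(pearson(preds, y), 2))
```

Because every prediction comes from a model that never saw that post during fitting, the resulting correlation is an out-of-sample estimate rather than a measure of fit to the training data.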
Validation sample
To test the model, the research team sampled an additional 640 U.S. users, with equal numbers of men and women, from more than 65,000 Facebook posts. Each user had to meet one condition: posting more than 10 statuses for at least 14 consecutive weeks.
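The activity condition can be expressed as a small filter. The sketch below reads it as "more than 10 posts in each of at least 14 consecutive weeks", which is one interpretation of the article's wording rather than a confirmed detail of the study. The function names and the Monday-based week boundaries are likewise assumptions.

```python
from collections import Counter
from datetime import date, timedelta


def week_index(d):
    """Index of the Monday-based week containing date d."""
    return (d.toordinal() - 1) // 7  # ordinal 1 (0001-01-01) is a Monday


def longest_active_run(post_dates, more_than=10):
    """Length of the longest streak of consecutive weeks with > more_than posts."""
    counts = Counter(week_index(d) for d in post_dates)
    active = sorted(w for w, n in counts.items() if n > more_than)
    best = run = 0
    prev = None
    for w in active:
        run = run + 1 if prev is not None and w == prev + 1 else 1
        best = max(best, run)
        prev = w
    return best


def is_eligible(post_dates, min_weeks=14, more_than=10):
    """True if the user meets the (assumed) activity criterion."""
    return longest_active_run(post_dates, more_than) >= min_weeks


# Toy user: 11 posts every Monday for 14 weeks in a row.
start = date(2020, 1, 6)
dates = [start + timedelta(weeks=w) for w in range(14) for _ in range(11)]
print(is_eligible(dates))
```

A filter like this also makes the sampling bias visible: only unusually active users pass it.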
Finally, the research team collected 303,575 posts by these users as a validation sample.
Experimental Results
The authors visualized the users' emotion ratings. The figure below shows the weekly mood (valence) and arousal fluctuations of one female user (left) and one male user (right), along with their predicted Big Five personality traits.
Note: The Big Five personality traits are a structural model used in modern psychology to describe personality. They are extraversion, neuroticism, agreeableness, conscientiousness, and openness to experience.
The horizontal axis is the “valence” value, and the vertical axis is the “arousal” value
As the figure shows, the female user on the left has large mood fluctuations, with frequent episodes of high pleasure (valence) and high excitement (arousal).
In contrast, the male user on the right has smaller mood swings and rarely experiences high pleasure or high excitement.
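One simple way to put a number on "larger" or "smaller" mood swings is to average the predicted scores per week and then measure how much those weekly averages spread. The sketch below does this with the standard deviation. The article does not say how the authors quantified fluctuation, so the metric, the function names, and the toy scores are all assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev


def weekly_means(scored_posts):
    """Average (valence, arousal) per week from (week, valence, arousal) rows."""
    by_week = defaultdict(list)
    for week, valence, arousal in scored_posts:
        by_week[week].append((valence, arousal))
    return {
        week: (mean([v for v, _ in rows]), mean([a for _, a in rows]))
        for week, rows in sorted(by_week.items())
    }


def fluctuation(weekly):
    """Spread of the weekly averages, as (valence_sd, arousal_sd)."""
    valences = [v for v, _ in weekly.values()]
    arousals = [a for _, a in weekly.values()]
    return pstdev(valences), pstdev(arousals)


# Toy scores on the 1-9 scale for two hypothetical users.
user_a = [(1, 3, 2), (1, 5, 4), (2, 8, 7), (3, 2, 3), (3, 4, 3)]
user_b = [(1, 5, 4), (1, 5, 5), (2, 6, 4), (3, 5, 5), (3, 5, 4)]
print(fluctuation(weekly_means(user_a)))
print(fluctuation(weekly_means(user_b)))
```

A user whose weekly averages move across a wide range gets a higher value than one whose averages stay close together.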
This is also a new finding in the team's experiment: women tend to be more optimistic and to have a wider range of mood changes than men. This is in line with
Additionally, the team's analysis found correlations between the “valence” and “arousal” values and the Big Five personality traits.
Model Evaluation
The Facebook users who provided the validation sample had previously volunteered to take the “My Personality” questionnaire, which assessed their Big Five personality traits.
The results showed that the machine learning model's predictions of their personality were consistent with those obtained through psychological survey methods.
Defect Analysis
Of course, the authors also pointed out the model's current problems.
First, they sampled relatively active Facebook users, chosen because they posted status updates frequently enough, so the sample is unlikely to be representative of all Americans.
Second, different social platforms have different attributes and styles. Whether the results obtained from Facebook posts can be replicated on other social media such as Twitter is still unknown.
These limitations and questions of generalizability are therefore directions for researchers to explore further.
The potential of social platforms for psychology is unlimited
Perhaps for many people, social platforms are nothing more than a place to share their lives, post beautiful photos, and follow gossip, but in fact they hold great potential for psychological research.
Through data mining and machine learning, it is possible to extract signals from a huge amount of data, identify people suffering from depression, anxiety, and other emotional disorders, and then intervene in time. In this regard, there are already mature cases in China.
Huang Zhisheng, an artificial intelligence scholar at Vrije Universiteit Amsterdam in the Netherlands, created an AI program called “Tree Hole Rescue Team” in 2018 to search Weibo for posts showing suicidal tendencies. The location of users with suicidal thoughts is then pinned down from whatever traces are available, and rescue volunteers are dispatched to find and guide them in time.
This team of volunteers is still active on the front line of psychological counseling.
As of the end of September 2020, two years after its establishment, the “Tree Hole Rescue Team” had prevented 3,289 suicides
In addition, social media-based sentiment analysis can also track traumatic events (such as major earthquakes, wars, and the COVID-19 epidemic) and their psychological impact on people, helping government departments carry out public opinion guidance, scientific rescue, and public emotional reassurance effectively.
For individuals, maybe these tools can be used in the future to analyze the little moods of boyfriends and girlfriends, and nobody will have to guess anymore~
Author | Nervous Xiaoxi; Editor | Fish Ball Dumplings