From the perspective of natural language processing, the task of sentiment analysis is to extract, from the text of a comment, the entity being commented on and the sentiment the commenter expresses toward it. This touches on all the core technical problems of natural language processing, so sentiment analysis is considered a subtask of natural language processing.
Compared with other artificial intelligence techniques, sentiment analysis is somewhat special: other fields analyze and predict from objective data, whereas sentiment analysis has a strong element of personal subjectivity. Its goal is to analyze, from text, people's sentiment and opinions about entities and their attributes.
With the growth of social media such as Twitter and of e-commerce platforms, a large amount of opinionated content has been generated, which provides the data that sentiment analysis requires. Today, sentiment analysis is widely used in many fields.
For example:
- In retail, user reviews are important feedback for retailers and manufacturers. Sentiment analysis can quantify how much praise and criticism a product and its competitors receive, which helps in understanding what users want from the product and how it compares with competing products.
- In public opinion monitoring, analyzing the public's comments on trending social events makes it possible to track where public opinion is heading.
- In corporate reputation monitoring, sentiment analysis can quickly show how society views a company, provide a basis for the company's strategic planning, and improve its competitiveness in the market.
- In financial trading, analyzing traders' attitudes toward stocks and other financial derivatives can provide supporting evidence for trading decisions.
At present, the vast majority of open AI platforms offer sentiment analysis. In addition to general-domain sentiment analysis, some also offer analysis for specific domains such as automobiles, kitchenware, catering, news and Weibo.
Sentiment analysis example of Posen Chinese Semantic Open Platform
So what exactly is sentiment analysis?
From the perspective of natural language processing, the task of sentiment analysis is to extract, from the text of a comment, the entity being commented on and the sentiment the commenter expresses toward it. The core technical problems of natural language processing, such as lexical semantics, coreference resolution, word sense disambiguation, information extraction and semantic analysis, all come into play in sentiment analysis.
Sentiment analysis is therefore considered a subtask of natural language processing, and a person's sentiment toward a target entity can be expressed as a quintuple: (e, a, s, h, t)
- e is the target entity of the sentiment analysis. It can be a concrete instance or a class, but it must be a single, uniquely identified object.
- a is the specific attribute of entity e that the opinion evaluates.
- s is the sentiment expressed in the opinion about attribute a of entity e. It is usually divided into three categories: positive, negative and neutral. It can also be mapped to a rating from 1 to 5 stars using a regression algorithm.
- h is the opinion holder, who may be the reviewer or someone else.
- t is the time at which the opinion was published.
Take the review in the figure as an example: e is a restaurant, a is the restaurant's value for money, s is a positive evaluation of that value for money, h is the reviewer, and t is 7/27/19. The sentiment analysis of this review can therefore be expressed as the quintuple (a restaurant, value for money, positive, reviewer, 7/27/19).
A user's review of a restaurant
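One way to make the quintuple concrete is to model it as a small record type. The sketch below is only an illustration: the class name `Opinion` and its field names are hypothetical, and the date 7/27/19 is assumed to mean July 27, 2019.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Opinion:
    """One opinion expressed as the quintuple (e, a, s, h, t)."""
    entity: str     # e: the target entity
    attribute: str  # a: the attribute of the entity being evaluated
    sentiment: str  # s: "positive", "negative" or "neutral"
    holder: str     # h: who holds the opinion
    time: date      # t: when the opinion was published


# The restaurant review from the text, with the date read as month/day/year.
review = Opinion(
    entity="a restaurant",
    attribute="value for money",
    sentiment="positive",
    holder="reviewer",
    time=date(2019, 7, 27),
)
print(review)
```

Making the record immutable (`frozen=True`) is a design choice for this sketch, so that extracted opinions can be stored in sets or used as dictionary keys.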
Depending on the granularity of the text being processed, sentiment analysis can be roughly divided into three levels of task: text level, sentence level and attribute level.
Let's look at each in turn:
1. Text-level sentiment analysis
The goal of text-level sentiment analysis is to judge whether an entire document expresses a positive or a negative sentiment, for example a book review or a comment on a current news story. As long as the text to be analyzed is longer than a single sentence, the task can be regarded as text-level sentiment analysis.
Text-level sentiment analysis rests on a premise: the opinions expressed in the whole text concern a single entity e and come from a single opinion holder h.
This approach treats the document as a whole and does not examine the specific entities and attributes it contains. That limits text-level sentiment analysis in practice: it cannot analyze multiple entities in a text separately, and it cannot distinguish the opinions of multiple opinion holders.
For example, take the review: "I think this phone is great."
The reviewer expresses a positive evaluation of the phone as a whole. But consider: "I think the camera function of this phone is very good, but the signal is not very good." Here positive and negative words appear in the same comment. Text-level analysis cannot separate them and can only judge the comment as a whole.
Fortunately, there are many scenarios in which there is no need to distinguish the entity being evaluated or the opinion holder. In the sentiment analysis of product reviews, for example, the object of the review can be assumed to be the product, and the opinion holder to be the reviewer.
Of course, this also depends on the product being reviewed. If it is a travel service such as a family trip, the review is likely to contain more than one opinion holder. In practice, text-level sentiment analysis is too coarse when more detail is required. For a more precise and detailed analysis of a comment, each sentence in the text has to be analyzed separately, which is the problem that sentence-level sentiment analysis studies.
2. Sentence-level sentiment analysis
As with text-level sentiment analysis, the task of sentence-level sentiment analysis is to judge whether a sentence expresses a positive or a negative sentiment. Although the granularity is now the sentence, sentence-level analysis shares the premise of text-level analysis: a sentence expresses only one opinion and one sentiment, and has only one opinion holder.
If a sentence contains two or more evaluations, or the opinions of several opinion holders, sentence-level analysis cannot tell them apart. Fortunately, in real life most sentences express only one sentiment.
Since sentence-level sentiment analysis has the same limitations as text-level sentiment analysis, what is the point of doing it?
To answer this, we need to explain the linguistic distinction between subjective and objective sentences. In everyday language, sentences can be divided into subjective and objective sentences according to whether they carry the speaker's subjective feeling. For example, "I like this new mobile phone." is a subjective sentence that expresses the speaker's inner feeling or opinion, while "This app was updated with new functions yesterday." is an objective sentence that states a fact and carries none of the speaker's subjective feeling.
Identifying whether a sentence is subjective helps filter out sentences that carry no sentiment, which makes data processing more efficient.
In practice, however, this classification turns out not to be particularly accurate, because a subjective sentence may express no sentiment at all and only express an expectation or a guess. For example, "I think he's on his way home now." is a subjective sentence that expresses the speaker's guess but no sentiment.
Conversely, an objective sentence may carry sentiment, implying that the speaker did not want the fact to happen. For example, "The new car bought just yesterday was scratched." is an objective sentence, but with common sense we can see that it actually carries the speaker's negative feeling.
Classifying sentences as subjective or objective is therefore not enough for filtering data. What we need is to classify whether a sentence carries sentiment. If a sentence directly expresses or implies sentiment, it is considered to contain an opinion, and sentences that contain no opinion can be filtered out.
At present, most techniques for classifying whether a sentence carries sentiment use supervised learning. This requires a large amount of manually annotated data and classifies sentences based on sentence features.
In short, sentence-level sentiment analysis can be divided into two steps:
- The first step is to determine whether the sentence to be analyzed contains an opinion.
- The second step is to analyze the sentences that do contain an opinion, find the sentiment tendency, and judge whether it is positive or negative.
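The two steps can be sketched as a small pipeline. This is a toy illustration rather than the supervised approach described above: the word lists `POSITIVE` and `NEGATIVE` are hypothetical stand-ins for a trained classifier, and the function names are made up for this example.

```python
import re

# Hypothetical toy word lists; a real system would use a trained classifier.
POSITIVE = {"like", "great", "good"}
NEGATIVE = {"scratched", "bad", "disappointed"}


def words_of(sentence):
    """Lowercase the sentence and split it into plain words."""
    return re.findall(r"[a-z']+", sentence.lower())


def contains_opinion(sentence):
    """Step 1: keep only sentences that carry sentiment information."""
    return any(w in POSITIVE or w in NEGATIVE for w in words_of(sentence))


def polarity(sentence):
    """Step 2: judge whether an opinionated sentence is positive or negative."""
    words = words_of(sentence)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def analyze(sentences):
    """Run step 1 as a filter, then step 2 on what remains."""
    return [(s, polarity(s)) for s in sentences if contains_opinion(s)]


results = analyze([
    "I like this new mobile phone.",
    "This app was updated with new functions yesterday.",
    "The new car bought just yesterday was scratched.",
])
for sentence, label in results:
    print(label, "|", sentence)
```

In this sketch the sentence about the app update is dropped in step 1, so only the two sentences that carry sentiment reach step 2.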
The method for analyzing sentiment tendency is similar to that at the text level: it can use supervised learning or a sentiment dictionary, both of which are explained in detail in the following sections. Sentence-level sentiment analysis is finer-grained than text-level sentiment analysis, but it can still only judge the overall sentiment. It ignores the attributes of the entity being evaluated, and it cannot handle comparative opinions.
For example: "The user experience of product A is much better than that of product B." A sentence like this expresses more than one sentiment, so it cannot simply be classified as positive or negative. The granularity has to be refined further: the attributes being evaluated are extracted and associated with the relevant entities. This is attribute-level sentiment analysis.
3. Attribute-level sentiment analysis
The text-level and sentence-level sentiment analysis described above cannot tell exactly what the reviewer likes or dislikes, and cannot distinguish the case where the reviewer is positive about attribute A of an entity but negative about attribute B. Yet in real language, a single sentence may contain several opinions with different sentiment tendencies.
For example: "I like the decor of this restaurant, but the taste of the food is average." For sentences like this, text-level and sentence-level sentiment analysis can hardly capture the sentiment toward each attribute.
To go a step finer than sentence-level analysis, we need to find or extract the target of the evaluation from the text, and judge from the context whether the reviewer expresses a positive or a negative sentiment toward each attribute. This is called attribute-level sentiment analysis.
Attribute-level sentiment analysis focuses on the evaluated entities and their attributes, together with the opinion holder and the time of the evaluation. Its goal is to extract and summarize the complete opinion quintuple for the target entity and its attributes.
From a technical point of view, attribute-level sentiment analysis can be divided into the following six steps:
- Entity extraction and resolution: Extract all expressions in the document that refer to entities, and use clustering to group the expressions that refer to the same entity into one class, so that each class corresponds to a unique entity.
- Attribute extraction and resolution: Extract the attributes of all entities in the document and cluster them, so that each class corresponds to a unique attribute of its entity.
- Opinion holder extraction and resolution: Extract the opinion holders in the document and cluster them, so that each class corresponds to a unique opinion holder.
- Time extraction and normalization: Extract the publication time of each opinion and normalize the different time formats.
- Attribute sentiment classification or regression: Analyze the sentiment toward a specific attribute to determine whether it is positive, negative or neutral, or use a regression algorithm to assign the attribute a numerical sentiment score, such as 1 to 5.
- Opinion quintuple generation: Use the results of steps 1-5 to construct the quintuples for all opinions in the document.
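A heavily simplified sketch of the attribute and sentiment steps is shown below, applied to the restaurant sentence used earlier. It is not a real extraction pipeline: the attribute list, the scores and the rule of splitting on "but" are all assumptions made for illustration, and no clustering or resolution is performed.

```python
import re

# Hypothetical attribute vocabulary and sentiment scores for illustration only.
ATTRIBUTES = {"decor", "taste"}
SENTIMENT = {"like": 1, "good": 1, "average": 0, "bad": -1}


def attribute_sentiment(sentence):
    """Map each attribute to the summed sentiment score of the clause it appears in."""
    result = {}
    # Assumption: each clause separated by "but" discusses its own attribute.
    for clause in re.split(r"\bbut\b", sentence.lower()):
        words = re.findall(r"[a-z']+", clause)
        score = sum(SENTIMENT.get(w, 0) for w in words)
        for word in words:
            if word in ATTRIBUTES:
                result[word] = score
    return result


scores = attribute_sentiment(
    "I like the decor of this restaurant, but the taste of the food is average."
)
print(scores)
```

Here the decor receives a positive score and the taste a neutral one, which is exactly the distinction that text-level and sentence-level analysis cannot make.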
Entity extraction and coreference resolution were introduced in the chapters on knowledge graphs, so they are not repeated here. For the three types of sentiment analysis task (text level, sentence level and attribute level), a great deal of research has been done and many classification methods have been proposed. These methods can be roughly divided into two types, dictionary-based and machine learning-based, which are explained in detail below.
Dictionary-based sentiment analysis
Sentiment analysis is inseparable from sentiment words, which are the most basic units that carry sentiment information. Besides individual words, phrases and idioms that carry sentiment are also collectively called sentiment words. Dictionary-based sentiment analysis relies on a dictionary of annotated sentiment words and phrases that records the sentiment tendency and sentiment intensity of each entry. Typically, positive sentiment is labeled with a positive number and negative sentiment with a negative number.
The specific steps are shown in the figure. First, the text to be analyzed is segmented into words, and the result is preprocessed to remove stop words and other uninformative tokens. Then the segmented words are matched against the sentiment dictionary, and the sentiment scores marked in the dictionary are summed over the text. If the final result is positive, the sentiment is positive; if it is negative, the sentiment is negative. A score with no clear tendency is treated as neutral or as carrying no sentiment.
Dictionary-based sentiment analysis process
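The process can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the lexicon entries, their scores, the stop word list and the neutral threshold are all hypothetical, and simple English tokenization stands in for Chinese word segmentation.

```python
import re

# Hypothetical sentiment dictionary: positive scores for positive words,
# negative scores for negative words.
LEXICON = {"great": 2.0, "good": 1.0, "bad": -1.0, "terrible": -2.0}
STOP_WORDS = {"i", "think", "this", "is", "the", "a", "of"}


def tokenize(text):
    """Stand-in for word segmentation: lowercase and split into words."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_score(text):
    """Remove stop words, match the rest against the lexicon and sum the scores."""
    tokens = [t for t in tokenize(text) if t not in STOP_WORDS]
    return sum(LEXICON.get(t, 0.0) for t in tokens)


def classify(text, threshold=0.5):
    """Scores within +/- threshold are treated as having no clear tendency."""
    score = lexicon_score(text)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


print(classify("I think this phone is great."))
print(classify("The signal is bad"))
print(classify("The phone arrived yesterday"))
```

The `threshold` parameter is one possible way to express "no clear tendency"; the value 0.5 is arbitrary and would need tuning against real data.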
The sentiment dictionary is the core of the whole process: the quality of its annotations directly determines the result of sentiment classification. Existing open-source sentiment dictionaries can be used directly, for example BosonNLP's sentiment dictionary built from Weibo, news, forums and other sources, the HowNet sentiment dictionary, the National Taiwan University sentiment polarity dictionary in Simplified Chinese (NTUSD), and the dictionary of the snownlp framework. The synonym thesaurus compiled by Harbin Institute of Technology (the "synonym word forest") can also be used as an aid: it makes it possible to look up synonyms of sentiment words and so extend the coverage of the sentiment dictionary.
Of course, a sentiment dictionary can also be built to suit the needs of the business. At present there are three mainstream ways to construct one: manual methods, dictionary-based methods and corpus-based methods.
As for assigning scores to sentiment words, the simplest approach is to assign +1 to every positive word and -1 to every negative word, and add them up to obtain the result. However, this clearly falls short in practice. Real language has many expressions that change the intensity of a sentiment, the most typical being degree adverbs.
There are two kinds of degree adverb:
One kind strengthens the original sentiment of a sentiment word and is called an intensifier: "very good" is stronger than "good", and "extremely good" is stronger than "very good". The other kind is a weakener, such as "not so good": the tendency is still positive, but the intensity is much weaker than that of "good". When an intensifier appears, the sentiment score is increased relative to its original value; when a weakener appears, it is reduced accordingly.
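One common way to implement this adjustment is to multiply the base score by a factor attached to the degree adverb. The sketch below assumes that design; the words, base scores and multipliers are hypothetical, and "somewhat" is used as an illustrative single-word weakener.

```python
# Hypothetical base scores and degree multipliers.
BASE_SCORES = {"good": 1.0, "bad": -1.0}
DEGREE = {"very": 1.5, "extremely": 2.0, "somewhat": 0.5}


def score_phrase(words):
    """Scale each sentiment word by the degree adverbs directly in front of it."""
    total, multiplier = 0.0, 1.0
    for word in words:
        if word in DEGREE:
            multiplier *= DEGREE[word]  # remember the adverb for the next word
        elif word in BASE_SCORES:
            total += BASE_SCORES[word] * multiplier
            multiplier = 1.0
        else:
            multiplier = 1.0  # an unrelated word breaks the adverb's scope
    return total


for phrase in ("somewhat good", "good", "very good", "extremely good"):
    print(phrase, score_phrase(phrase.split()))
```

With these values the four phrases are ordered from weakest to strongest, matching the intended ranking of weakeners and intensifiers.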
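Another situation that needs attention is negation. A negation word reverses the tendency of the sentiment word it modifies: adding "not" in front of a positive word such as "good" turns it into a negative expression.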
Early research simply took the opposite number for a sentiment word matched with a negation word: if "good" is +1, then "not good" is -1. However, this crude rule does not correspond to how sentiment is really expressed. For example, "too good" is more strongly positive than "good". If "good" is +1, "too good" might be +3, and scoring the negated form "not very good" as -3 is clearly too negative. A value of -1 or -0.5 would be more appropriate.
To handle this, negation words can be given a degree of their own instead of simply flipping the sign. A word that expresses strong negation, for example, can be assigned ±4: it contributes a negative number when combined with a positive word and a positive number when combined with a negative word. For example, the negative word "unpleasant" is scored -3; with the negation word added it becomes the milder "not so unpleasant", with a score of -3 + 4 = 1.
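The difference between the two negation rules can be shown with a short calculation. The function names below are hypothetical, and the strength of 4 is simply the value used in the example above.

```python
def flip_negation(word_score):
    """Early rule: negation takes the opposite number."""
    return -word_score


def shift_negation(word_score, strength=4.0):
    """Graded rule: negation shifts the score toward the opposite polarity."""
    if word_score > 0:
        return word_score - strength  # negation contributes a negative number
    if word_score < 0:
        return word_score + strength  # negation contributes a positive number
    return 0.0


# "too good" scored +3 and "unpleasant" scored -3, as in the text.
print(flip_negation(3.0), shift_negation(3.0))
print(flip_negation(-3.0), shift_negation(-3.0))
```

With the graded rule, the negated +3 word lands at -1 rather than -3, and the negated -3 word lands at 1, matching the arithmetic above. A single fixed strength is a simplification, and in practice each negation word would likely need its own tuned value.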
The third case to be aware of is conditional words. If a conditional word appears in a sentence, the sentence is probably not suitable for sentiment analysis. For example: "If I can travel tomorrow, then I will definitely be very happy." This sentence contains an obvious positive sentiment word, but because of the conditional word "if", it does not express the opinion holder's actual feeling, only a hypothesis.
Besides conditional sentences, another kind of expression needs to be excluded during data preprocessing: interrogative sentences.
For example: "Is this restaurant really as good as you say it is?" Although the sentence contains the strongly positive word "good", it cannot be classified as positive. Interrogative sentences usually end in fixed question words, such as "…?" or "…?", but some omit the ending word and rely on the question mark alone, for example: "Are you unhappy today?" This sentence contains "unhappy", formed from a negation and a positive word, but it cannot be classified as negative.
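Both exclusions can be applied as a simple preprocessing filter. The sketch below is a rough heuristic under stated assumptions: the list of conditional words is hypothetical and incomplete, and questions are detected only by a trailing question mark (half-width or full-width).

```python
import re

# Hypothetical, deliberately small list of conditional words.
CONDITIONAL_WORDS = {"if", "unless"}


def is_suitable_for_sentiment(sentence):
    """Return False for sentences that look conditional or interrogative."""
    stripped = sentence.strip()
    if stripped.endswith(("?", "\uff1f")):  # ASCII or full-width question mark
        return False
    words = re.findall(r"[a-z']+", stripped.lower())
    return not any(word in CONDITIONAL_WORDS for word in words)


sentences = [
    "If I can travel tomorrow, then I will definitely be very happy.",
    "Is this restaurant really as good as you say it is?",
    "Are you unhappy today?",
    "I think this phone is great.",
]
kept = [s for s in sentences if is_suitable_for_sentiment(s)]
print(kept)
```

Only the last sentence survives the filter, so the sentiment words in the other three never reach the scoring stage.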
The last situation to be aware of is transition words, typically "but" and "however". The sentiment before a transition word is usually the opposite of the sentiment after it. For example: "Last time, my accommodation experience at this hotel was very good, but this time I was disappointed." Here "very good" before the transition word is strongly positive, but the sentiment actually being expressed is "disappointed" after it, so the sentence should be classified as negative.
Of course, there are also cases where a transition word appears but the sentiment of the sentence does not change. For example: "You have made great progress in this exam, but I think you can do even better." Here the transition word does not mark a reversal but a progression.
In practice, we first need to determine which part of a transition sentence carries the sentiment actually being expressed before it can be analyzed and scored correctly.
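One possible heuristic for this decision is sketched below. It is an assumption of this example rather than a method from the text: if the clauses before and after the transition word have opposite polarity, only the later clause is kept; otherwise both are summed. The lexicon is a hypothetical toy.

```python
import re

# Hypothetical toy lexicon.
LEXICON = {"good": 1, "great": 1, "better": 1, "disappointed": -1}


def clause_score(clause):
    """Sum the lexicon scores of the words in one clause."""
    return sum(LEXICON.get(w, 0) for w in re.findall(r"[a-z']+", clause.lower()))


def score_with_transition(sentence):
    """Prefer the clause after 'but'/'however' when the two clauses disagree."""
    parts = re.split(r"\b(?:but|however)\b", sentence, maxsplit=1, flags=re.IGNORECASE)
    if len(parts) == 1:
        return clause_score(sentence)
    before, after = clause_score(parts[0]), clause_score(parts[1])
    if before * after < 0:  # opposite polarity: a real reversal
        return after
    return before + after   # same polarity: a progression, not a reversal


print(score_with_transition(
    "The accommodation experience was very good, but this time I was disappointed."
))
print(score_with_transition(
    "You have made great progress in this exam, but I think you can do even better."
))
```

The hotel sentence comes out negative because the two clauses disagree, while the exam sentence stays positive because both clauses point the same way.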
Constructing a sentiment dictionary is labor-intensive. Besides the problems above, dictionaries suffer from low accuracy and from the difficulty of quickly adding new words and Internet slang. Dictionary-based analysis also has a number of limitations of its own.
For example, a sentence may contain sentiment words without expressing any sentiment, or contain no sentiment words and still carry the speaker's sentiment. The meaning of some sentiment words also changes with context: the word "shrewd" can be used positively to praise someone or negatively to criticize them.
Despite these problems, dictionary-based sentiment analysis has an advantage that is hard to replace: it is highly general. In most cases it can analyze the sentiment of a text without any domain-specific annotated data, which makes it a preferred solution for general-domain sentiment analysis.
Machine learning-based sentiment analysis
The chapter on machine learning algorithms introduced many classification algorithms, such as logistic regression, Naive Bayes and KNN. All of them can be used for sentiment analysis.
As with machine learning in general, the method has two steps: first, build a model from the training data; second, feed the test data into the model and output the corresponding results. Let's go through this in detail.
First, we need to prepare some text data for training and manually label it with sentiment classes. The usual practice is as follows: for two classes, positive and negative, positive is labeled 1 and negative 0; for three classes, positive, negative and neutral, positive is labeled 1, neutral 0 and negative -1.
If this labeling is done purely by hand, personal subjectivity may affect the results. For efficiency, there are also some shortcuts for labeling the data automatically.
For example, in e-commerce, product reviews usually come with a 5-star rating in addition to the text. The user's star rating can be used as the basis for labeling: 1-2 stars are labeled negative, 3 stars neutral and 4-5 stars positive.
As another example, many online communities let users like or dislike a post, and this data can also serve as a reference for sentiment labeling.
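The star-rating rule maps directly onto the three-class labels described earlier (1 for positive, 0 for neutral, -1 for negative). A minimal sketch, with a hypothetical function name:

```python
def label_from_stars(stars):
    """Convert a 1-5 star rating into a three-class sentiment label."""
    if not 1 <= stars <= 5:
        raise ValueError("star rating must be between 1 and 5")
    if stars <= 2:
        return -1  # 1-2 stars: negative
    if stars == 3:
        return 0   # 3 stars: neutral
    return 1       # 4-5 stars: positive


labels = [label_from_stars(s) for s in (1, 2, 3, 4, 5)]
print(labels)
```

Rejecting out-of-range ratings is a choice made for this sketch, so that malformed records are noticed instead of being silently labeled.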
The second step is to segment the labeled text into words and preprocess the data. Word segmentation was covered at length earlier, so it is not repeated here.
The third step is to mark the words with sentiment features in the segmentation results. For sentiment classification, the sentiment dictionary can be used as a reference for this, or the TF-IDF algorithm can be used to extract the feature words of a document automatically. If the analysis targets a specific domain, domain-specific words also need to be marked. For sentiment analysis of product reviews, for example, product names, category names and attribute names need to be marked.
The fourth step is to build a bag-of-words model from the word frequencies, forming a feature word matrix, as shown in the table. At this step each feature word can be given a weight according to business needs, and the feature word score is obtained by multiplying the word frequency by the weight.
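A bag-of-words matrix with optional per-word weights can be built with the standard library alone. In the sketch below the documents are assumed to be already segmented into word lists; the function names, sample documents and weight values are hypothetical.

```python
from collections import Counter


def build_vocabulary(docs):
    """Collect the feature words of all documents in a fixed, sorted order."""
    return sorted({word for doc in docs for word in doc})


def bag_of_words(docs, vocabulary, weights=None):
    """One row per document, one column per feature word: frequency x weight."""
    weights = weights or {}
    matrix = []
    for doc in docs:
        counts = Counter(doc)
        matrix.append([counts[word] * weights.get(word, 1.0) for word in vocabulary])
    return matrix


docs = [["good", "phone", "good"], ["bad", "signal"]]
vocabulary = build_vocabulary(docs)
plain = bag_of_words(docs, vocabulary)
weighted = bag_of_words(docs, vocabulary, weights={"good": 2.0})
print(vocabulary)
print(plain)
print(weighted)
```

Words without an explicit weight default to 1.0, so the unweighted matrix is just the raw word frequencies.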
The last step is to feed the feature word matrix into the chosen classification algorithm as input data to obtain the final classification model.
Once the classification model is trained, the test set can be classified. The process is similar to modeling. First, the test text is segmented and preprocessed. Then its feature words are extracted according to the feature word matrix to build a bag-of-words matrix, and the word frequencies in that matrix are fed into the trained model as input to obtain the classification result.
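As an illustration of training and then classifying, the sketch below implements a small multinomial Naive Bayes classifier with add-one smoothing, one of the algorithms named earlier. It is a teaching sketch, not a production implementation: the training data is a hypothetical four-review toy set, the texts are assumed to be already segmented, and the two-class labels follow the 1/0 convention from the text.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Estimate log priors and smoothed log likelihoods from word counts."""
    vocabulary = {word for doc in docs for word in doc}
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)
    model = {}
    for label, n_docs in class_counts.items():
        total = sum(word_counts[label].values())
        log_prior = math.log(n_docs / len(docs))
        log_likelihood = {
            word: math.log((word_counts[label][word] + 1) / (total + len(vocabulary)))
            for word in vocabulary
        }
        model[label] = (log_prior, log_likelihood)
    return model


def predict(model, doc):
    """Return the label with the highest log posterior for a segmented text."""
    def log_posterior(label):
        log_prior, log_likelihood = model[label]
        # Words never seen during training are skipped.
        return log_prior + sum(log_likelihood[w] for w in doc if w in log_likelihood)
    return max(model, key=log_posterior)


train_docs = [["great", "phone"], ["love", "camera"], ["bad", "signal"], ["terrible", "battery"]]
train_labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative
model = train_naive_bayes(train_docs, train_labels)
print(predict(model, ["great", "camera"]))
print(predict(model, ["bad", "battery"]))
```

Skipping unseen words at prediction time mirrors the step of extracting only the known feature words from the test text.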
Machine learning-based sentiment analysis has the following shortcomings:
- First, language differs between application domains, so a classification model trained for one domain cannot be applied to another and has to be built separately.
- Second, the final classification quality depends on the choice of training text and on correct sentiment labels. People's understanding of sentiment is subjective, and biased labels will affect the final result.
Beyond dictionary-based and machine learning-based methods, some researchers combine the two to offset the shortcomings of each, and the classification results are better than with either method alone.
In addition, some researchers have tried deep learning methods such as LSTM for sentiment analysis. I believe that sentiment analysis will be applied in more products in the future, helping us better understand user needs and improve the user experience of smart products.
Difficulties and challenges of sentiment analysis
With the application of algorithms such as deep neural networks, research on sentiment analysis has made great progress. However, some problems remain unsolved. In practice, the following types of data need special attention:
(1) Kaomoji, emoji and image stickers
Communication on the Internet is not carried out through words alone. A great deal of sentiment is expressed through kaomoji or emoji, such as the classic smiley ":D". Such expressions cannot be tied to the context, so it is difficult to tell which entity they are evaluating.
Fortunately, this kind of data carries a very strong sentiment tendency in itself. For text-level and sentence-level analysis, each kaomoji can be treated as a special phrase, added to the sentiment dictionary and assigned a sentiment score by hand. For emoji, the standard emoji codes can likewise be added to the sentiment dictionary. Recognizing image stickers, however, is a computer vision problem, and so far it has received little research attention.
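Treating kaomoji and emoji as dictionary entries can be sketched as follows. The entries and scores are hypothetical and hand-assigned for illustration; the symbols are matched on the raw text, before any tokenization that might strip punctuation.

```python
# Hypothetical hand-assigned scores; emoji are written as Unicode code points.
EMOTICON_LEXICON = {
    ":D": 2.0,
    ":)": 1.0,
    ":(": -1.0,
    "\U0001F600": 2.0,   # grinning face
    "\U0001F62D": -2.0,  # loudly crying face
}


def emoticon_score(text):
    """Sum the scores of every kaomoji or emoji occurrence in the raw text."""
    return sum(score * text.count(symbol) for symbol, score in EMOTICON_LEXICON.items())


print(emoticon_score("Nice place :D"))
print(emoticon_score("So sad :( \U0001F62D"))
```

The resulting score could then be added to the word-based dictionary score for the same sentence or text.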
(2) Ironic sentences
Irony is a special kind of sentiment expression: the literal meaning is positive but the actual meaning is negative, or the literal meaning is negative but the actual meaning is positive.
Example: "Great! This takeaway cured my years of constipation!"
Ironic sentences are very hard to handle in sentiment analysis, because understanding them usually requires common sense or relevant background knowledge, and the context alone is not enough to decode the irony correctly. Ironic sentences are uncommon in product reviews but more common in comments on public opinion or social news. Identifying them is one of the hard research problems in sentiment analysis.
(3) Comparative sentences
Comparative sentences are another special kind of sentiment expression, for example: "I think this dress suits me well, but I prefer that one."
A comparative sentence like this usually involves two or more entities or attributes. At sentence-level granularity, it can be recognized that the sentence contains a positive sentiment. At attribute-level granularity, however, the sentiment quintuple cannot treat one entity as an attribute of another, so it is difficult to tell which entity or attribute the opinion holder's sentiment is directed at. Such sentences are very common in product reviews and require special attention.
(4) Sentiment classification
At present, sentiment analysis is still at an early stage and distinguishes only three sentiments: positive, negative and neutral. Real-life emotions go far beyond these three. In psychology, for example, the wheel of emotions proposed by the psychologist Robert Plutchik contains 8 basic emotions, each divided into different levels of intensity, and the 8 emotions can also be combined with one another to form more emotions, as shown in the figure.
Plutchik's wheel of emotions
The wheel of emotions is widely used in user experience design, and many emotional designs are based on it. In artificial intelligence, however, multi-class emotion classification is much harder than the three-class task of sentiment analysis, and most current classification methods are less than 50% accurate.
This is because emotion has too many categories, and different categories may resemble one another. The same emotion word may express different emotion categories in different contexts, so algorithms have a hard time classifying it. Even manual emotion labeling of text often works poorly: emotion is highly subjective, and different people may understand the same text differently, which makes manually labeling emotion categories extremely difficult.
How to make machines understand real emotions is still an unsolved problem.
This article was written by @黄汉星 and originally published on Everyone is a Product Manager. Reproduction without permission is prohibited.
The title image is from Unsplash, under the CC0 license.