With the recent development of deep learning, research in artificial intelligence (AI) has gained new vigor and prominence. As Noam Chomsky has argued, however, "you do not get discoveries in the sciences by taking huge amounts of data, throwing them into a computer and doing statistical analysis of them: that’s not the way you understand things, you have to have theoretical insights". Sentic computing is a multi-disciplinary approach to natural language understanding that aims to bridge the gap between statistical natural language processing (NLP) and the many other disciplines necessary for understanding human language, such as linguistics, commonsense reasoning, and affective computing. The term derives from the Latin 'sensus' (as in commonsense) and 'sentire' (root of words such as sentiment and sentience). Sentic computing takes a holistic approach to natural language understanding by addressing many NLP problems at once (such as anaphora resolution, named entity recognition, microtext analysis, word sense disambiguation, and aspect extraction) and, hence, enables the analysis of text not only at document, page, or paragraph level, but also at sentence, clause, and concept level. In particular, the novelty of sentic computing centers on three key shifts:
1. Shift from mono- to multi-disciplinarity – evidenced by the concomitant use of AI and Semantic Web techniques for knowledge representation and inference; mathematics for tasks such as graph mining and multi-dimensionality reduction; linguistics for discourse analysis and pragmatics; psychology for cognitive and affective modeling; sociology for understanding social network dynamics and social influence; and, finally, ethics for understanding related issues about the nature of mind and the creation of emotional machines.
2. Shift from syntax to semantics – enabled by the adoption of the bag-of-concepts model instead of simply counting word co-occurrence frequencies in text. Working at concept level entails preserving the meaning carried by multi-word expressions such as cloud_computing, which represent ‘semantic atoms’ that should never be broken down into single words. In the bag-of-words model, for example, the concept cloud_computing would be split into computing and cloud, which may wrongly activate concepts related to the weather and, hence, compromise categorization accuracy.
3. Shift from statistics to linguistics – implemented by allowing sentiments to flow from concept to concept based on the dependency relation between clauses. The sentence “iPhone7 is expensive but nice”, for example, is identical to “iPhone7 is nice but expensive” from a bag-of-words perspective. However, the two sentences bear opposite polarity: the former is positive, as the user seems willing to buy the product despite its high price; the latter is negative, as the user complains about the price of iPhone7 despite liking it.
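The second and third shifts can be illustrated with a minimal Python sketch. It is not the sentic computing implementation: the concept list, the toy polarity lexicon, and the function names are all hypothetical, and the "but" rule is a crude stand-in for dependency-based sentiment flow (it simply lets the clause after "but" decide the polarity).

```python
from collections import Counter

# Hypothetical multi-word concepts; a real system would draw these
# from a knowledge base rather than a hard-coded set.
KNOWN_CONCEPTS = {("cloud", "computing")}

# Hypothetical toy lexicon: word -> polarity.
TOY_POLARITY = {"expensive": -1.0, "nice": 1.0}


def bag_of_words(text):
    """Count single lowercase tokens, ignoring word order."""
    return Counter(text.lower().split())


def bag_of_concepts(text, concepts=KNOWN_CONCEPTS):
    """Count tokens, but keep known two-word concepts as one unit."""
    tokens = text.lower().split()
    bag = Counter()
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in concepts:
            bag["_".join(pair)] += 1  # e.g. cloud_computing
            i += 2
        else:
            bag[tokens[i]] += 1
            i += 1
    return bag


def but_polarity(sentence, lexicon=TOY_POLARITY):
    """Toy adversative rule: only the clause after 'but' is scored."""
    tokens = sentence.lower().split()
    if "but" in tokens:
        tokens = tokens[tokens.index("but") + 1:]
    return sum(lexicon.get(t, 0.0) for t in tokens)


a = "iPhone7 is expensive but nice"
b = "iPhone7 is nice but expensive"
same_bag = bag_of_words(a) == bag_of_words(b)   # word order is lost
scores = (but_polarity(a), but_polarity(b))     # word order matters
```

Under these assumptions, the two sentences produce the same bag of words yet receive opposite polarity scores, and bag_of_concepts("cloud computing is nice") keeps cloud_computing as a single entry instead of activating cloud on its own.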
The core element of sentic computing is SenticNet, a knowledge base of 50,000 commonsense concepts. Unlike many other sentiment analysis resources, SenticNet is not built by manually labelling pieces of knowledge coming from general NLP resources such as WordNet or DBpedia. Instead, it is automatically constructed by applying graph-mining and multi-dimensional scaling techniques to the affective commonsense knowledge collected from three different sources: WordNet-Affect, Open Mind Common Sense, and GECKA. This knowledge is represented redundantly at three levels: semantic network, matrix, and vector space. Subsequently, semantics and sentics are calculated through the ensemble application of spreading activation, neural networks, and an emotion categorization model. More details about this process are provided in the latest sentic computing book (chapter 2).
SenticNet can be used for different sentiment analysis tasks, including polarity detection, which is performed by means of sentic patterns. First, a semantic parser deconstructs natural language text into concepts. Second, linguistic patterns are used together with SenticNet to infer polarity from sentences. If no match is found in SenticNet or in the linguistic patterns, machine learning is used as a fallback. More details about this process are provided in the latest sentic computing book (chapter 3).
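The control flow of this pipeline (parse into concepts, look them up, fall back to machine learning when nothing matches) can be sketched as follows. Everything here is an illustrative assumption: TOY_KB is a three-entry stand-in for SenticNet, extract_concepts replaces the semantic parser with whitespace tokenization, the averaging step replaces the actual linguistic patterns, and fallback_classifier is a placeholder for a trained model.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for SenticNet: concept -> polarity in [-1, 1].
TOY_KB: Dict[str, float] = {
    "nice": 0.8,
    "expensive": -0.6,
    "cloud_computing": 0.1,
}


def extract_concepts(text: str) -> List[str]:
    """Stand-in for the semantic parser: lowercase whitespace tokens."""
    return text.lower().split()


def pattern_polarity(concepts: List[str],
                     kb: Dict[str, float]) -> Optional[float]:
    """Average the polarity of known concepts; None if nothing matches."""
    known = [kb[c] for c in concepts if c in kb]
    if not known:
        return None
    return sum(known) / len(known)


def fallback_classifier(text: str) -> float:
    """Placeholder for a trained model; always returns neutral here."""
    return 0.0


def detect_polarity(
    text: str,
    kb: Dict[str, float] = TOY_KB,
    fallback: Callable[[str], float] = fallback_classifier,
) -> Tuple[float, str]:
    """Return (polarity, source), trying the knowledge base first."""
    concepts = extract_concepts(text)
    score = pattern_polarity(concepts, kb)
    if score is None:
        # No concept matched, so defer to the learned model.
        return fallback(text), "machine_learning"
    return score, "knowledge_base"
```

Returning the source alongside the score makes it easy to see how often the knowledge-driven path resolves a sentence and how often the statistical fallback is needed.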