In the rapidly growing realm of online retail, or e-commerce, people often shop on online platforms for their convenience and flexibility. A customer's decision to purchase a particular brand or product is often shaped by online product reviews, in which other customers freely share their experience of that brand or product. These reviews are valuable from a data-driven perspective because consumer sentiment can be derived from them and used to gain deeper insight into user satisfaction, perceived product quality and brand reputation. Such insight helps a company to fine-tune its business strategy for a product promptly, cater more closely to consumer preferences, and grow its business. This data-driven process is known as sentiment analysis: the analysis of data such as sentences, audio and images for their underlying sentiment, and the classification of their sentiment polarity as positive, neutral or negative, so that deeper insight can be gained into people's opinions and attitudes towards a given object. However, unlike structured data, product reviews are unstructured, subjective and nuanced, owing to the variability and ambiguity of human language, its dependence on context, and linguistic features such as slang and emoticons. These properties make reviews difficult to analyse, and the difficulty is compounded by the growing number of reviews and by their length, which varies from short to long depending on how users choose to express their views. One way to perform sentiment analysis in this setting is deep learning, a subfield of machine learning that uses multiple processing layers to automatically learn and model high-level, complex patterns and representations in text data.
Techniques used in past studies for sentiment analysis include convolutional neural networks (CNN), long short-term memory networks (LSTM) and bidirectional LSTM (bi-LSTM). These models handle sequential review text well and capture its contextual meaning, which improves sentiment classification of product or service reviews.
The revolution in how consumers shop online was initiated by e-commerce platforms such as Amazon, Shopee, and Lazada. Nowadays, consumers can sit at home, browse the products and services available on retail websites, and purchase anything they want with just a few clicks. Because people learn from feedback and experience, whether their own or that of others, their purchasing decisions are strongly influenced by reviews left by other users who share their experience of a product and its quality. The underlying sentiments in these reviews must therefore be analysed so that sellers can discover potential issues with their products and improve their offerings to raise consumer satisfaction. However, sentiment analysis on a large volume of e-commerce review data is challenging because human language is complex and informal. Reviews often contain noise such as stopwords, incomplete text and nuanced expressions, which can hinder conventional machine learning approaches from capturing sentiment. Hence, deep learning techniques may be a more effective approach, as they can better capture complex and intricate relationships and patterns between words, thereby improving the accuracy of sentiment classification.
This project aims to develop deep learning models that classify e-commerce product reviews into three sentiment classes: positive, neutral and negative. The objectives are as follows:
- To develop three initial deep learning models: one hybrid CNN-LSTM and two stacked CNN-biLSTM models
- To tune and validate each of these hybrid deep learning models
- To evaluate the performance of the initial models and their respective tuned versions
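
To make the first objective concrete, the following is a minimal sketch of what a hybrid CNN-LSTM classifier for three sentiment classes might look like in Keras. It is an illustration only, not the architecture used in this project: the function name `build_cnn_lstm`, the vocabulary size, the sequence length, and all layer sizes are assumed values, and the `bidirectional` flag merely shows how the recurrent layer could be swapped for a bi-LSTM.

```python
from tensorflow import keras
from tensorflow.keras import layers

# All of the values below are illustrative assumptions, not project settings.
VOCAB_SIZE = 20_000   # number of distinct token ids
MAX_LEN = 200         # reviews padded or truncated to this many tokens
EMBED_DIM = 128       # size of each learned word vector
NUM_CLASSES = 3       # positive, neutral, negative


def build_cnn_lstm(bidirectional=False):
    """Build a hypothetical CNN-LSTM (or CNN-biLSTM) sentiment classifier."""
    inputs = keras.Input(shape=(MAX_LEN,), dtype="int32")
    # Map each token id to a dense vector.
    x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(inputs)
    # Convolution extracts local n-gram features; pooling halves the sequence.
    x = layers.Conv1D(64, 5, padding="same", activation="relu")(x)
    x = layers.MaxPooling1D(pool_size=2)(x)
    # The recurrent layer models longer-range order over the pooled features.
    recurrent = layers.LSTM(64)
    if bidirectional:
        recurrent = layers.Bidirectional(recurrent)
    x = recurrent(x)
    x = layers.Dropout(0.5)(x)
    # Softmax over the three sentiment classes.
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer labels 0..2
        metrics=["accuracy"],
    )
    return model
```

Calling `build_cnn_lstm()` returns the unidirectional variant, while `build_cnn_lstm(bidirectional=True)` returns a CNN-biLSTM variant. Stacking further convolutional or recurrent layers, as well as tuning the filter counts, dropout rate and optimizer, would follow the same pattern.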