Commit 74ef519c authored by Joel Oksanen

Finished intro and bg, started ontology extraction.

parent a2ecd08d
......@@ -12,20 +12,21 @@ class ConnectionManager: ObservableObject {
     @Published var product = Product()
     @Published var messages: [Message] = [Message]()
-    private let ip = "192.168.0.13"
+    private let ip = "192.168.1.104"
     private let port = "8000"
     private var messageQueue: [Message]? = nil
     init() {
-        requestProduct(id: "B0075SUK14")
+        requestProduct(id: "B000AYW0M2", type: "watch")
         // B00RTGK0N0 - red Canon camera
         // B004J3V90Y - Canon T3i
         // B0012YA85A - Canon Rebel XSI
         // B003ZYF3LO - Nikon D3100
         // B0075SUK14 - Backpack
     }
-    private func requestProduct(id: String) {
-        let url = URL(string: "http://" + ip + ":" + port + "/ios_server/product/?id=" + id)!
+    private func requestProduct(id: String, type: String) {
+        let url = URL(string: "http://" + ip + ":" + port + "/ios_server/product/?id=" + id + "&type=" + type)!
         let task = URLSession.shared.dataTask(with: url) {(data, response, error) in
             guard let data = data else { return }
......
\chapter{Feature-dependent sentiment analysis}
\section{Exploration}
\subsection{Sentence-level SA trained on domain-specific data}
\subsection{Feature-dependent SA trained on SemEval-2014 data}
\section{Evaluation}
\chapter{Conclusion}
\section{Further work}
\subsection{Applications}
\chapter{Evaluation}
\chapter{Ontology extraction}
\section{Exploration}
In this project, we will limit the extraction of features to unigram, bigram, and trigram nouns, as the vast majority of terms for products and features fall into these categories. Although not strictly necessary, this restriction greatly narrows our search for features within the review texts, with little effect on the recall of our model.
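As an illustration of this restriction, the following Python sketch extracts unigram, bigram, and trigram nouns from a POS-tagged sentence (Penn Treebank noun tags are assumed; this is a simplified stand-in for the extraction method detailed in Section \ref{sec:noun_extraction}):

```python
def noun_ngrams(tagged_tokens, max_n=3):
    """Extract maximal runs of consecutive nouns, split into n-grams of length <= max_n."""
    ngrams = []
    run = []
    for word, tag in tagged_tokens + [("", "")]:  # sentinel flushes the final run
        if tag.startswith("NN"):  # NN, NNS, NNP, NNPS
            run.append(word)
        else:
            for n in range(1, max_n + 1):
                for i in range(len(run) - n + 1):
                    ngrams.append(" ".join(run[i:i + n]))
            run = []
    return ngrams

# Tagged form of "The camera housing is made of shiny black plastic"
tagged = [("The", "DT"), ("camera", "NN"), ("housing", "NN"), ("is", "VBZ"),
          ("made", "VBN"), ("of", "IN"), ("shiny", "JJ"), ("black", "JJ"),
          ("plastic", "NN")]
print(noun_ngrams(tagged))
# prints: ['camera', 'housing', 'camera housing', 'plastic']
```

Note that multi-word terms such as \textit{camera housing} are captured alongside their component nouns.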
%\subsection{ConceptNet}
%
%\subsection{Hand-crafted features}
%
%\subsection{Masked BERT for unsupervised ontology extraction}
\section{Implementation}
Our method of ontology extraction is a multi-step pipeline using both hand-crafted grammatical features and two BERT-based models trained using \textit{distantly supervised learning}. The first model is used for \textit{named-entity recognition} (NER) of the various features of the product, while the second model is used for \textit{relation extraction} (RE) in order to extract sub-feature relations between the recognised features. In addition, we use a \textit{Word2Vec} \cite{RefWorks:doc:5edbafdde4b0482c79eb8d95} model to extract word vectors, which are used to group the features into sets of synonyms, or \textit{synsets}, using a method proposed by Leeuwenberg et al.\ \cite{RefWorks:doc:5eaebe76e4b098fe9e0217c2}. The pipeline is structured as follows:
\begin{enumerate}
\item Noun extraction
\item Feature extraction
\item Synonym extraction
\item Ontology extraction.
\end{enumerate}
In this section, we will first detail the annotation method used to obtain the training data for the two BERT-based models, after which we will go through the different pipeline steps in detail.
\subsection{Annotation of training data for masked BERT}
Annotating training data that would be representative of the whole set of Amazon products would be nearly impossible due to the sheer number of different product categories on Amazon. However, in review texts, certain grammatical constructs stay the same regardless of the product. Take for example the two review sentences:
\begin{center}
\textit{I love the \textbf{lens} on this \textbf{camera}} \quad and \quad \textit{I love the \textbf{material} of this \textbf{sweater}.}
\end{center}
Clearly, \textit{lens} is a feature of \textit{camera} and \textit{material} is a feature of \textit{sweater}. If we mask the entities mentioned in the two sentences, we obtain:
\begin{center}
\textit{I love the \textbf{e1} on this \textbf{e2}} \quad and \quad \textit{I love the \textbf{e1} of this \textbf{e2},}
\end{center}
which are nearly identical. So while the entities in review texts are domain-specific, the context is often largely domain-independent. Therefore, using the masked sentences, it would be possible to train a classifier to recognise that \textit{e1} and \textit{e2} are both entities, and that \textit{e1} is a sub-feature of \textit{e2}.
In fact, BERT has inbuilt support for masking, as it plays a central role in its pre-training phase. One of the two tasks on which BERT is pre-trained is \textit{masked language modelling}, in which randomly chosen words are replaced with a \texttt{[MASK]} token and the model is asked to predict the masked words in a large corpus of text.
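As a sketch of this idea, the following Python snippet masks a given set of entity terms in a review sentence; the \texttt{[MASK]} placeholder follows BERT's convention, but the snippet is illustrative rather than the exact preprocessing used later:

```python
import re

def mask_entities(sentence, entities, mask_token="[MASK]"):
    """Replace each whole-word entity mention with a mask token."""
    masked = sentence
    for entity in entities:
        masked = re.sub(r"\b%s\b" % re.escape(entity), mask_token, masked)
    return masked

print(mask_entities("I love the lens on this camera", ["lens", "camera"]))
# prints: I love the [MASK] on this [MASK]
```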
Of course, the above example is an idealised scenario. If we mask the entities in the following review sentences:
\begin{center}
\textit{The \textbf{camera housing} is made of shiny black \textbf{plastic} but it feels nicely weighted and solid}
and
\textit{It's a lovely warm \textbf{jumper} that has a nice \textbf{feel} to it,}
\end{center}
we obtain the masked sentences:
\begin{center}
\textit{The \textbf{e1} is made of shiny black \textbf{e2} but it feels nicely weighted and solid}
and
\textit{It's a lovely warm \textbf{e1} that has a nice \textbf{e2} to it,}
\end{center}
which still contain some rather domain-specific terms such as \textit{shiny}, \textit{weighted}, and \textit{warm}. However, such words are rarely specific to a single product; more often they are common to wider categories of products, such as electronics or clothing. Therefore, there is no need to annotate every product to represent the whole set of Amazon products: a small and varied set of products should be enough to train a domain-independent classifier.
Even annotating texts for just a few products would still require the annotation of at least thousands of texts in order to obtain a sufficiently large dataset. However, we can reduce the amount of work substantially by taking advantage of \textit{distantly supervised learning}, where rather than annotating each and every sentence by hand, we automatically annotate a large amount of text using predetermined heuristics. In our case, the heuristic will be a manually labelled ontology for the product: using this ontology, an automated process can search review texts for terms that appear in the ontology and label as well as mask them correctly. This will allow us to easily annotate data for multiple products, as we will only have to annotate their ontologies, which usually consist of no more than a hundred terms.
Notice that distant supervision is made possible here by the masking of the terms in the ontologies. We are annotating a much smaller set of terms than the size of the resulting training set, which means that the training set will be highly saturated with the annotated terms. Therefore, if the words were not masked, the classifier would simply learn to recognise the terms in the ontology and completely ignore their context. Because the terms are masked, the classifier is forced to rely solely on their context, which is far more varied.
\begin{figure}[h]
\centering
\includegraphics[width=12cm]{images/entity_annotator.png}
\caption{Entity annotator interface}
\label{fig:entity_annotator}
\end{figure}
A program was written to ease the annotation of the ontologies, the interface of which is shown in Figure \ref{fig:entity_annotator} for sweater reviews. The program takes as input review sentences for a given product category such as sweaters, and counts the number of times each noun occurs in the sentences using the same method as the ontology extractor, which is detailed in Section \ref{sec:noun_extraction}. It then displays the nouns one by one to the annotator in descending order of count, and for each noun, the annotator will label it as either the root of the ontology tree (the product), a sub-feature or a synonym of an existing node in the tree, or \textit{nan} for nouns that are not an argument (for example \textit{daughter} in the review text \textit{my daughter loves it}). In order to simplify the annotation process, a noun with a lower count can only be annotated as a sub-feature of a noun with a higher count, as the reverse is rarely true. The annotation of \textit{nan} nouns is important as it allows us to obtain negative samples for the training data.
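The ordering used by the annotation tool can be sketched with a simple frequency count (illustrative Python; the actual tool produces the noun lists with the extraction method of Section \ref{sec:noun_extraction}):

```python
from collections import Counter

def annotation_order(noun_lists):
    """Count noun occurrences across sentences and return the nouns
    in descending order of count, as they are shown to the annotator."""
    counts = Counter()
    for nouns in noun_lists:
        counts.update(nouns)
    return [noun for noun, _ in counts.most_common()]

# One noun list per review sentence
order = annotation_order([["sweater", "material"], ["sweater"],
                          ["sweater", "fabric"], ["material"]])
print(order)
# prints: ['sweater', 'material', 'fabric']
```

Presenting nouns in this order means the product itself, which is typically the most frequent noun, is annotated first, and each subsequent noun can only attach to an already-annotated, more frequent one.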
Once the annotator has labelled a sufficient number of nouns, the program can use the annotated ontology to label the entire set of review sentences to be used as training data for either the feature or relation extraction models.
For the feature extraction training data, the program will select sentences with exactly one of the annotated nouns and label each sentence with a binary label indicating whether the masked noun is an argument. Although selecting only a subset of the sentences limits the amount of training data, it allows us to reduce the NER task to a binary classification problem instead of a sequence-labelling problem. Furthermore, even relatively small product categories have large amounts of review data available, so we can afford to prune some of it in order to improve the accuracy of our model. Table \ref{tab:fe_training_data} shows a positive and a negative example.
\begin{table}[h]
\centering
{\renewcommand{\arraystretch}{1.2}
\begin{tabular}{|c|c|c|}
\hline
\texttt{text} & \texttt{noun} & \texttt{is\textunderscore argument} \\
\hline \hline
"I love this sweater." & "sweater" & 1 \\
\hline
"My daughter loves it!" & "daughter" & 0 \\
\hline
\end{tabular}
}
\caption{Example training data for feature extraction}
\label{tab:fe_training_data}
\end{table}
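The selection rule for feature extraction instances could be sketched as follows (illustrative Python, assuming a dictionary mapping each annotated noun to its \texttt{is\textunderscore argument} label):

```python
import re

def fe_instances(sentences, annotations):
    """Build feature extraction training rows from sentences that
    mention exactly one annotated noun. `annotations` maps each
    annotated noun to 1 (argument) or 0 (nan)."""
    rows = []
    for text in sentences:
        words = set(re.findall(r"[a-z]+", text.lower()))
        found = [noun for noun in annotations if noun in words]
        if len(found) == 1:  # reduces NER to binary classification
            rows.append({"text": text, "noun": found[0],
                         "is_argument": annotations[found[0]]})
    return rows

rows = fe_instances(
    ["I love this sweater.",                # one annotated noun -> kept
     "My daughter loves it!",               # one annotated noun -> kept
     "The sweater's fabric is so soft."],   # two annotated nouns -> skipped
    {"sweater": 1, "fabric": 1, "daughter": 0})
```

Sentences mentioning two annotated nouns, such as the third one above, are reserved for the relation extraction data described next.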
For the relation extraction training data, the program will select sentences with exactly two of the annotated nouns, and mask both of the nouns as well as label the sentence with one of three labels: 0 for no relation between the masked nouns, 1 if the second masked noun is a feature of the first masked noun, and 2 if the first masked noun is a feature of the second masked noun. A noun $n_1$ is considered a feature of noun $n_2$ iff $n_1$ is a descendant of $n_2$ in the annotated ontology tree. For example, \textit{fabric} is considered a feature of both \textit{sweater} and \textit{material} based on the ontology tree in Figure \ref{fig:entity_annotator}. Table \ref{tab:re_training_data} shows examples for each of the labels.
\begin{table}[h]
\centering
{\renewcommand{\arraystretch}{1.2}
\begin{tabular}{|c|c|c|c|}
\hline
\texttt{text} & \texttt{noun\textunderscore 1} & \texttt{noun\textunderscore 2} & \texttt{relation} \\
\hline \hline
"I like the colour and the material." & "colour" & "material" & 0 \\
\hline
"My daughter loves this sweater." & "daughter" & "sweater" & 0 \\
\hline
"The sweater's fabric is so soft." & "sweater" & "fabric" & 1 \\
\hline
"The colour of the sweater is beautiful." & "colour" & "sweater" & 2 \\
\hline
\end{tabular}
}
\caption{Example training data for relation extraction}
\label{tab:re_training_data}
\end{table}
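Since a noun $n_1$ is a feature of $n_2$ iff $n_1$ is a descendant of $n_2$ in the annotated ontology tree, the three-way relation label can be computed directly from the tree, here represented as a child-to-parent map (an illustrative Python sketch):

```python
def is_descendant(node, ancestor, parent_of):
    """True iff `node` lies strictly below `ancestor` in the ontology tree."""
    while node in parent_of:
        node = parent_of[node]
        if node == ancestor:
            return True
    return False

def relation_label(noun1, noun2, parent_of):
    """0: no relation, 1: noun2 is a feature of noun1, 2: noun1 is a feature of noun2."""
    if is_descendant(noun2, noun1, parent_of):
        return 1
    if is_descendant(noun1, noun2, parent_of):
        return 2
    return 0

# Fragment of an annotated sweater ontology: fabric under material under sweater
parent_of = {"material": "sweater", "fabric": "material", "colour": "sweater"}
print(relation_label("sweater", "fabric", parent_of))   # prints: 1
print(relation_label("colour", "sweater", parent_of))   # prints: 2
print(relation_label("colour", "material", parent_of))  # prints: 0
```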
We used the program to obtain training data for five randomly selected product categories: digital cameras, backpacks, laptops, guitars, and cardigans. After resampling to balance the number of instances for each class, we obtained the training data counts shown in Table \ref{tab:training_data}.
\begin{table}[h]
\centering
{\renewcommand{\arraystretch}{1.2}
\begin{tabular}{|c||c|c|}
\hline
& \multicolumn{2}{c|}{Number of training instances} \\
\cline{2-3}
& per product & total \\
\hline \hline
Feature extraction & 56,526 & 282,630 \\
\hline
Relation extraction & 25,110 & 125,550 \\
\hline
\end{tabular}
}
\caption{Training data counts for ontology extraction}
\label{tab:training_data}
\end{table}
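The resampling step can be sketched as random downsampling to the size of the smallest class (illustrative Python; the exact resampling strategy used may differ):

```python
import random

def balance(instances, label_of, seed=0):
    """Downsample every class to the size of the smallest one."""
    by_label = {}
    for inst in instances:
        by_label.setdefault(label_of(inst), []).append(inst)
    n = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed for reproducibility
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, n))
    return balanced

# Toy dataset: 4 positive, 6 negative instances
data = [("t%d" % i, i % 3 == 0) for i in range(10)]
balanced = balance(data, label_of=lambda x: x[1])
# balanced now has 4 positive and 4 negative instances
```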
\subsection{Noun extraction}
\label{sec:noun_extraction}
\subsection{Feature extraction}
\subsubsection{BERT for feature extraction}
\subsubsection{Training BERT model}
\subsection{Synonym extraction}
\subsection{Ontology extraction}
\subsubsection{BERT for relation extraction}
\subsubsection{Training BERT model}
\subsubsection{Ontology construction from votes}
\section{Evaluation}
\chapter{Introduction}
In this chapter, we will first discuss the motivations behind the project and then specify its main objectives and contributions.
\section{Motivations}
People spend an ever growing share of their earnings online, from purchasing daily necessities on e-commerce sites such as Amazon to streaming movies on services such as Netflix. As the market shifts online, people's purchase decisions are increasingly based on product reviews either accompanying the products on e-commerce sites, or on specialised review websites, such as Rotten Tomatoes\footnote{https://www.rottentomatoes.com/} for movies. These reviews can be written by fellow consumers who have purchased the product or by professional critics, such as in the latter example, but what unites most online review platforms is the massive number of individual reviews: a single wrist watch can have more than 20,000 reviews on Amazon\footnote{https://www.amazon.co.uk/Casio-Collection-Unisex-Adults-F-91W-1YER/dp/B000J34HN4}. As people cannot possibly go through all of the individual reviews, purchase decisions are often based on various kinds of review aggregations. The presentation of a review aggregation must be concise and intuitive in order to be effective, but it must also retain some nuances of the original reviews, so that consumers can understand \textit{why} a product is considered good or bad and if the arguments align with their individual preferences. \par
Explanations of review aggregations are also relevant to e-commerce site recommender systems, as the recommendations are often largely based on existing consumer reviews. It has been shown that explanations can help to improve the overall acceptance of a recommender system \cite{RefWorks:doc:5e2f3970e4b0241a7d69e2a4}, and therefore insightful explanations of review aggregations are important to their effectiveness.
\section{Objectives}
Some systems have already been proposed to improve explanations for review aggregations, a few of which are discussed in Chapter 2. One such system is the Argumentative Dialogical Agent (ADA) proposed by Cocarascu et al.\ \cite{RefWorks:doc:5e08939de4b0912a82c3d46c}, which has been explored in the movie and hotel review domains of Rotten Tomatoes \cite{RefWorks:doc:5e08939de4b0912a82c3d46c} and Trip Advisor\footnote{https://www.tripadvisor.com/} \cite{RefWorks:doc:5e0de20ee4b055d63d355913}, respectively. ADA provides the user with insight about reviews for a particular product through conversation centred around the product's features. Cocarascu et al.\ provide the following example of a conversation about the film \textit{The Post}:
\begin{table*}
\centering
\begin{tabular}{@{}p{1cm}p{\dimexpr\textwidth-4cm\relax}}
\textbf{User}:&\textit{Why was The Post highly rated?}\\
\textbf{ADA}:&\textit{The movie was highly rated because the \textbf{acting} was great, although the \textbf{writing} was poor.}\\
\textbf{User}:&\textit{Why was the acting considered to be great?}\\
\textbf{ADA}:&\textit{Its acting was considered to be great because \textbf{Meryl Streep} was great.}\\
\textbf{User}:&\textit{What did critics say about Meryl Streep being great?}\\
\textbf{ADA}:&\textit{"...Streep’s hesitations, rue, and ultimate valor are soul-deep..."}\\
\end{tabular}
\end{table*}
\newpage
In order to participate in such dialogue, ADA has to have knowledge about the \textit{ontology} of the product (i.e.\ \textit{acting} is a sub-category of \textit{film} and \textit{Meryl Streep} is a sub-category of \textit{acting}), as well as knowledge about the reviewers' sentiments towards the various aspects of the film. ADA obtains some of this knowledge from the review texts using \textit{natural language processing} (NLP) methods. In the papers by Cocarascu et al., the ontologies for films and hotels are mostly given, although mining some aspects from the review texts was also explored. The opinions of reviewers were extracted from the review texts using \textit{sentiment analysis} (SA) methods.
The objective of this project is to extend upon the work of Cocarascu et al. in order to design and implement a more generalised ADA to provide explanations for any product reviews on Amazon. As Amazon contains reviews for products ranging from pet supplies to digital music and car parts, this involves taking a more unsupervised and domain-independent approach to the various NLP tasks within the agent, as well as considering the effects of such variance on the dialogue between the user and the agent. Furthermore, ADA's complex methodology has never before been implemented within a fully developed end-to-end system, presenting a considerable software engineering challenge.
\section{Contributions}
The main contributions of this project are threefold:
\begin{enumerate}
\item We propose and implement a novel method for domain-independent product ontology extraction, which allows us to build a \textit{feature-based representation} of any product in an unsupervised manner, an example of which is shown in Figure \ref{fig:toothbrushrepresentation} for wrist watches. The ontologies extracted with this method display significantly higher recall when compared to the widely used manually and autonomously annotated semantic networks WordNet and ConceptNet, respectively.
\item We implement a modified version of the method by Gao et al.\ \cite{RefWorks:doc:5ed3c3bbe4b0445db7f0a369} for feature-dependent sentiment analysis, which allows us to assess the reviewers' sentiment towards the various features of the product, whilst improving upon the accuracy of the phrase-level sentiment analysis in \cite{RefWorks:doc:5e08939de4b0912a82c3d46c}.
\item We implement ADA from scratch, for the first time, as a fully developed end-to-end system, comprising a backend agent with the aforementioned enhancements and a user-facing iOS chat application, shown in Figure \ref{fig:toothbrushscreenshot}.
\end{enumerate}
For both 1.\ and 2.\ we use BERT, a language model proposed by Devlin et al.\ \cite{RefWorks:doc:5e8dcdc3e4b0dba02bfdfa80}, which has produced state-of-the-art results in a wide variety of different NLP tasks.
\vspace{2cm}
\begin{figure}[h]
\centering
\begin{forest}
for tree = {l=2cm}
[watch
[band
[links]
[leather]
[clasp]
]
[face
[hands]
]
[price]
[quality]
[size]
[look]
[battery]
[\dots]
]
\end{forest}
\caption{Feature-based representation for a wrist watch}
\label{fig:toothbrushrepresentation}
\end{figure}
\begin{figure}[b]
\centering
\frame{\includegraphics[height=15cm]{images/watch_screenshot.png}}
\caption{A conversation between the user and ADA on the iOS app about a wrist watch}
\label{fig:toothbrushscreenshot}
\end{figure}
......@@ -13,6 +13,8 @@
\usepackage{graphicx}
\usepackage[colorinlistoftodos]{todonotes}
\usepackage[colorlinks=true, allcolors=blue]{hyperref}
\usepackage[edges]{forest}
\usepackage{multirow}
\usepackage{amsthm}
%% \DeclareMathSymbol{\Alpha}{\mathalpha}{operators}{"41}
......@@ -37,7 +39,7 @@
\theoremstyle{def}
\newtheorem{definition}{Definition}[chapter]
\title{ADA Loves BERT: A System for Domain-Independent Feature-Based Product Review Aggregation}
\author{Joel Oksanen}
% Update supervisor and other title stuff in title/title.tex
......@@ -59,8 +61,9 @@ Your acknowledgements go here
\input{introduction/introduction.tex}
\input{background/background.tex}
\input{evaluation/evaluation.tex}
\input{feature_extraction/feature_extraction.tex}
\input{SA/SA.tex}
\input{system/system.tex}
\input{conclusion/conclusion.tex}
\input{appendix/appendix.tex}
......
\chapter{System}
\section{Architecture}
\subsection{Backend}
\subsection{iOS botplication}
\section{Evaluation}
\subsection{Performance}
\subsection{User evaluation}