Commit 7c4096e6 authored by Joel Oksanen

1) Started report 2) Implemented item file

parent f1cba1d6
@@ -7,34 +7,13 @@ from anytree import Node, PostOrderIter
from functools import reduce
from matplotlib import pyplot
from scipy.stats import pearsonr
from sklearn.metrics import mean_absolute_error
import pickle
from review_tokenizer import tokenize_review, reduce_noise
from item import *
sentiment_threshold = 0.95
camera = Node('camera')
image = Node('image', parent=camera)
video = Node('video', parent=camera)
battery = Node('battery', parent=camera)
flash = Node('flash', parent=camera)
audio = Node('audio', parent=camera)
price = Node('price', parent=camera)
shipping = Node('shipping', parent=camera)
reviewables = [camera, image, video, battery, flash, audio, price, shipping]
features = [image, video, battery, flash, audio, price, shipping]
glossary = {
camera: ['camera', 'device', 'product'],
image: ['image', 'picture', ' pic '],
video: ['video'],
battery: ['battery'],
flash: ['flash'],
audio: ['audio', 'sound'],
price: ['price', 'value', 'cost'],
shipping: ['ship']
}
f = open('camera_review_classifier.pickle', 'rb')
classifier = pickle.load(f)
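For illustration, a hypothetical sketch (not part of the commit) of how a glossary like the one above can map a review phrase to a feature; plain string keys stand in for the anytree Node objects so the example is self-contained. Note that entries such as ' pic ' are padded with spaces so they only match whole words once the phrase itself is padded:

```python
# Minimal glossary with string keys in place of Node objects.
glossary = {
    'image': ['image', 'picture', ' pic '],
    'battery': ['battery'],
    'price': ['price', 'value', 'cost'],
}

def match_feature(phrase, glossary):
    # Pad with spaces so space-delimited synonyms like ' pic '
    # match only as whole words.
    padded = ' ' + phrase.lower() + ' '
    for feature, synonyms in glossary.items():
        if any(syn in padded for syn in synonyms):
            return feature
    return None

print(match_feature('Great picture quality', glossary))  # image
```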
@@ -212,8 +191,13 @@ for product_id, reviews in grouped:
# calculate Pearson's correlation
correlation, _ = pearsonr(camera_strengths, star_rating_averages)
print("pearson correlation: ", correlation)
# calculate MAE
scaled_star_rating_avgs = list(map(lambda x: (x - 1) / 4, star_rating_averages))
mae = mean_absolute_error(scaled_star_rating_avgs, camera_strengths)
print("mae: ", mae)
# plot result correlation
pyplot.scatter(camera_strengths, star_rating_averages)
pyplot.scatter(camera_strengths, scaled_star_rating_avgs)
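The evaluation above rescales star-rating averages from [1, 5] to [0, 1] before comparing them against the [0, 1] camera strengths. A dependency-free sketch of the same computation, using made-up illustrative values (the data below is not from the actual evaluation):

```python
from math import sqrt

# Illustrative data: aggregated strengths lie in [0, 1],
# star-rating averages in [1, 5].
camera_strengths = [0.2, 0.5, 0.9, 0.7]
star_rating_averages = [1.8, 3.2, 4.6, 3.4]

# Rescale star averages from [1, 5] to [0, 1], as in the code above.
scaled = [(x - 1) / 4 for x in star_rating_averages]

def pearson(xs, ys):
    # Pearson correlation: covariance over the product of std deviations.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Mean absolute error between the scaled ratings and the strengths.
mae = sum(abs(s - c) for s, c in zip(scaled, camera_strengths)) / len(scaled)
print('pearson:', pearson(camera_strengths, star_rating_averages))
print('mae:', mae)
```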
\@writefile{toc}{\contentsline {chapter}{\numberline {1}Introduction}{5}{chapter.1}\protected@file@percent }
\@writefile{lof}{\addvspace {10\p@ }}
\@writefile{lot}{\addvspace {10\p@ }}
\@writefile{toc}{\contentsline {section}{\numberline {1.1}Objectives}{5}{section.1.1}\protected@file@percent }
\@writefile{toc}{\contentsline {section}{\numberline {1.2}Challenges}{5}{section.1.2}\protected@file@percent }
\@writefile{toc}{\contentsline {section}{\numberline {1.3}Contributions}{5}{section.1.3}\protected@file@percent }
\@writefile{toc}{\contentsline {chapter}{\numberline {2}Background}{6}{chapter.2}\protected@file@percent }
\@writefile{lof}{\addvspace {10\p@ }}
\@writefile{lot}{\addvspace {10\p@ }}
\@writefile{toc}{\contentsline {chapter}{\numberline {3}PROJECT X}{7}{chapter.3}\protected@file@percent }
\@writefile{lof}{\addvspace {10\p@ }}
\@writefile{lot}{\addvspace {10\p@ }}
\@writefile{toc}{\contentsline {chapter}{\numberline {4}Evaluation}{8}{chapter.4}\protected@file@percent }
\@writefile{lof}{\addvspace {10\p@ }}
\@writefile{lot}{\addvspace {10\p@ }}
\@writefile{toc}{\contentsline {chapter}{\numberline {5}Conclusion}{9}{chapter.5}\protected@file@percent }
\@writefile{lof}{\addvspace {10\p@ }}
\@writefile{lot}{\addvspace {10\p@ }}
\@writefile{toc}{\contentsline {chapter}{\numberline {A}First Appendix}{10}{appendix.A}\protected@file@percent }
\@writefile{lof}{\addvspace {10\p@ }}
\@writefile{lot}{\addvspace {10\p@ }}
This is 8-bit Big BibTeX version 0.99d
Implementation: C for Unix
Release version: 3.71 (04 mar 2019)
The 8-bit codepage and sorting file: 88591lat.csf
The top-level auxiliary file: /Users/joeloksanen/individual_project/individual_project_report/.texpadtmp/main.aux
The style file: plain.bst
Reallocated glb_str_ptr (elt_size=8) to 10 items from 0.
Reallocated global_strs (elt_size=20001) to 10 items from 0.
Reallocated glb_str_end (elt_size=8) to 10 items from 0.
Database file #1: ../bibs/export.bib
Warning--I didn't find a database entry for "greenwade93"
Here's how much of BibTeX's memory you used:
Cites: 0 out of 750
Fields: 0 out of 5000
Hash table: 29849 out of 30000
Strings: 496 out of 30000
Free string pool: 4139 out of 65000
Wizard functions: 2118 out of 3000
(There was 1 warning)
\babel@toc {english}{}
\addvspace {10\p@ }
\addvspace {10\p@ }
\addvspace {10\p@ }
\addvspace {10\p@ }
\addvspace {10\p@ }
\addvspace {10\p@ }
\babel@toc {english}{}
\addvspace {10\p@ }
\addvspace {10\p@ }
\addvspace {10\p@ }
\addvspace {10\p@ }
\addvspace {10\p@ }
\addvspace {10\p@ }
\BOOKMARK [0][-]{chapter.1}{Introduction}{}% 1
\BOOKMARK [1][-]{section.1.1}{Objectives}{chapter.1}% 2
\BOOKMARK [1][-]{section.1.2}{Challenges}{chapter.1}% 3
\BOOKMARK [1][-]{section.1.3}{Contributions}{chapter.1}% 4
\BOOKMARK [0][-]{chapter.2}{Background}{}% 5
\BOOKMARK [0][-]{chapter.3}{PROJECT X}{}% 6
\BOOKMARK [0][-]{chapter.4}{Evaluation}{}% 7
\BOOKMARK [0][-]{chapter.5}{Conclusion}{}% 8
\BOOKMARK [0][-]{appendix.A}{First Appendix}{}% 9
\babel@toc {english}{}
\contentsline {chapter}{\numberline {1}Introduction}{5}{chapter.1}%
\contentsline {section}{\numberline {1.1}Objectives}{5}{section.1.1}%
\contentsline {section}{\numberline {1.2}Challenges}{5}{section.1.2}%
\contentsline {section}{\numberline {1.3}Contributions}{5}{section.1.3}%
\contentsline {chapter}{\numberline {2}Background}{6}{chapter.2}%
\contentsline {chapter}{\numberline {3}PROJECT X}{7}{chapter.3}%
\contentsline {chapter}{\numberline {4}Evaluation}{8}{chapter.4}%
\contentsline {chapter}{\numberline {5}Conclusion}{9}{chapter.5}%
\contentsline {chapter}{\numberline {A}First Appendix}{10}{appendix.A}%
\chapter{First Appendix}
\ No newline at end of file
author={Julian McAuley and Jure Leskovec},
title={Hidden Factors and Hidden Topics: Understanding Rating Dimensions with Review Text},
booktitle={Proceedings of the 7th ACM Conference on Recommender Systems},
series={RecSys ’13},
publisher={Association for Computing Machinery},
address={New York, NY, USA},
location={Hong Kong, China},
author={Fatih Gedikli and Dietmar Jannach and Mouzhi Ge},
title={How should I explain? A comparison of different explanation types for recommender systems},
journal={International Journal of Human-Computer Studies},
note={ID: 272548},
author={Long Jiang and Mo Yu and Ming Zhou and Xiaohua Liu and Tiejun Zhao},
title={Target-dependent Twitter Sentiment Classification},
booktitle={Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
publisher={Association for Computational Linguistics},
address={Portland, Oregon, USA},
author={Li Dong and Furu Wei and Chuanqi Tan and Duyu Tang and Ming Zhou and Ke Xu},
title={Adaptive Recursive Neural Network for Target-dependent Twitter Sentiment Classification},
booktitle={Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
publisher={Association for Computational Linguistics},
address={Baltimore, Maryland},
author={Filip Radlinski and Nick Craswell},
title={A Theoretical Framework for Conversational Search},
booktitle={Proceedings of the 2017 Conference on Human Information Interaction and Retrieval},
series={CHIIR ’17},
publisher={Association for Computing Machinery},
address={New York, NY, USA},
location={Oslo, Norway},
author={Haoyu Song and Wei-Nan Zhang and Yiming Cui and Dong Wang and Ting Liu},
title={Exploiting Persona Information for Diverse Generation of Conversational Responses},
booktitle={Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, {IJCAI-19}},
publisher={International Joint Conferences on Artificial Intelligence Organization},
author={Lei Zhang and Shuai Wang and Bing Liu},
title={Deep learning for sentiment analysis: A survey},
journal={Wiley Interdiscip. Rev. Data Min. Knowl. Discov.},
author={Oana Cocarascu and Antonio Rago and Francesca Toni},
year={Jun 1, 2019},
title={From formal argumentation to conversational systems},
publisher={Association for Computing Machinery},
author={Oana Cocarascu and Antonio Rago and Francesca Toni},
title={Extracting Dialogical Explanations for Review Aggregations with Argumentative Dialogical Agents},
booktitle={Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems},
series={AAMAS '19},
publisher={International Foundation for Autonomous Agents and Multiagent Systems},
address={Richland, SC},
location={Montreal QC, Canada},
\ No newline at end of file
\ No newline at end of file
In this chapter, we will first discuss the motivations behind the project and then specify its main objectives.
People spend an ever-growing share of their earnings online, from purchasing daily necessities on e-commerce sites such as Amazon\footnote{} to streaming movies on services such as Netflix\footnote{}. As the market shifts online, people's purchase decisions are increasingly based on product reviews, found either alongside the products on e-commerce sites or on specialised review websites, such as Rotten Tomatoes\footnote{} for movies. These reviews can be written by fellow consumers who have purchased the product or by professional critics, as in the latter example, but what unites most online review platforms is the sheer number of individual reviews: a single model of electric toothbrush can have more than 10,000 reviews on Amazon\footnote{}. As people cannot possibly go through all of the individual reviews, purchase decisions are often based on various kinds of review aggregations. To be effective, the presentation of a review aggregation must be concise and intuitive, but it must also retain some of the nuances of the original reviews, so that consumers can understand \textit{why} a product is considered good or bad and whether these arguments align with their individual preferences. \par
Clear explanations of review aggregations are also central to the success of recommender systems on e-commerce sites: explanations have been shown to improve the overall acceptance of a recommender system \cite{RefWorks:doc:5e2f3970e4b0241a7d69e2a4}, and such systems' recommendations are often largely based on other consumers' reviews.
There have already been some attempts to improve explanations for review aggregations, some of which are discussed in Chapter 2. One such attempt is the Argumentative Dialogical Agent (ADA) proposed by Cocarascu et al. \cite{RefWorks:doc:5e08939de4b0912a82c3d46c} and implemented for the Rotten Tomatoes and Trip Advisor\footnote{} platforms \cite{RefWorks:doc:5e0de20ee4b055d63d355913}. The goal of this project is to build upon the work of Cocarascu et al. to design and implement a more generalised ADA that provides explanations for Amazon product reviews. The main objectives for the extended agent are threefold:
\item \textbf{Generalise} the agent to work with a larger variety of products. So far, ADA has only been implemented for movie and hotel reviews, two highly homogeneous domains in which key features and review language vary little from one product to another. Implementing ADA for Amazon reviews will require more general NLP methods for extracting review aggregations.
\item \textbf{Enhance dialogue} between the user and the agent to support conversational search. Currently, the agent can respond to only a limited number of questions, centred solely on explanations for a single product review aggregation. Given the large amount of ongoing research into explainable recommender systems and the potential of ADA in this domain, we will extend its dialogue to support conversational search.
\item \textbf{Learn} from user feedback. The enhanced agent should be able to ask for and incorporate information and opinions provided by the user in order to improve its review aggregations and product recommendations.
In addition to the above, we will implement a conversational user interface for ADA on the Alexa\footnote{\&node=13727921011/} virtual assistant. This interface will provide users with a novel way of obtaining explainable product recommendations by voice command on Alexa-compatible smart speakers.