In this blog, we will discuss the concept of a recommendation system, how a review-based recommendation system works, and Python code for implementing one.
A recommendation system is a machine learning model that suggests movies, clothes, blogs, and other items based on your previous selections, which eases the process of choosing. For example, you see “top picks for you” on Netflix after watching a few movies or series, and you see suggested items while searching for products or clothes on an online shopping platform like Amazon.
You must also have seen the automated playlists that an audio streaming platform like Spotify creates for you. All of this is the result of a recommendation system. (Must read: How Spotify uses Machine Learning models?)
According to a McKinsey report, about 75% of Netflix views and around 35% of Amazon purchases are driven by recommendation systems.
Stats Showing Growth of Netflix and Amazon with the help of Recommendation System
There are many types of recommendation systems, such as popularity-based systems, classification models, and content-based systems. This blog focuses on the review-based recommendation system in machine learning and how to implement it in Python.
Earlier, recommendations were based on product trends, meaning the most-used product was recommended to almost everyone. Other approaches used rating histories to provide recommendations. Later, researchers found that users’ textual reviews could serve as an important input to the recommendation system. So in a review-based recommendation system, textual reviews can be used as input alongside ratings or trends.
The main purpose of the review-based recommendation system is to extract relevant information from a user’s textual review of a product, movie, or song. This is how machine learning can be combined with natural language processing. (Related blog: Top 10 Natural Processing Languages (NLP) Libraries with Python)
The reviews are taken as the dataset. As a first step, analysis methods such as text analysis and opinion mining are applied to them. A user profile is then built from the results of that analysis. Finally, the profile is combined with a recommender approach to produce precise recommendations for the individual user. (Read also: 6 Dynamic Challenges in Recommendation System).
Working of Review-Based Recommendation System, Source: ResearchGate
Textual reviews can be fed to the model as word embeddings instead of TF-IDF vectors; the implementation below uses TF-IDF. We shall see how this model performs on a real-world dataset: our implementation is based on customer reviews of Amazon products.
We are using consumer reviews of Amazon products as the dataset. You can download it from Kaggle.
Step 1: Importing libraries and reading the dataset with pandas
# Jupyter/IPython magic for inline plots; remove this line when running as a plain script
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.neighbors import NearestNeighbors
from scipy.spatial.distance import cosine
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
import re
import string
from wordcloud import WordCloud, STOPWORDS
from sklearn.metrics import mean_squared_error
import csv
# Load the review dataset (the CSV file is expected in the working directory)
df = pd.read_csv(r'Datafiniti_Amazon_Consumer_Reviews_of_Amazon_Products.csv')
Step 2: Viewing the column index and shape
print(df.columns)
print(df.shape)
Step 2 (continued): Viewing the dataset
df.head()
Now count the non-null values in each column for every ‘asins’ value (the Amazon product ID), and also compute the per-product mean of the numeric columns.
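# Number of non-null values per column for each product
count = df.groupby("asins", as_index=False).count()
# Mean of the numeric columns for each product; numeric_only=True skips text columns
mean = df.groupby("asins", as_index=False).mean(numeric_only=True)
# Attach the per-product counts to every review row
dfMerged = pd.merge(df, count, how='right', on=['asins'])
dfMerged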
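To make the difference between the two aggregations concrete, here is a minimal sketch on a toy DataFrame. The rows and the name `toy` are made up for illustration and are not part of the Amazon dataset.

```python
import pandas as pd

# Toy stand-in for the review data; the values are invented for illustration.
toy = pd.DataFrame({
    "asins": ["A1", "A1", "B2", "B2", "B2"],
    "reviews.rating": [5, 4, 3, None, 1],
    "reviews.text": ["great", "good", "okay", "no rating given", "bad"],
})

# count() tallies the non-null entries of every column within each product.
toy_count = toy.groupby("asins", as_index=False).count()

# mean() only makes sense for numeric columns, hence numeric_only=True.
toy_mean = toy.groupby("asins", as_index=False).mean(numeric_only=True)

print(toy_count)
print(toy_mean)
```

Product B2 has three reviews but only two ratings, so its rating count is 2 while its text count is 3, and its mean rating is computed over the two available ratings.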
Step 3: Taking the non-null values of reviews.text, reviews.rating, and asins
df1 = df[['reviews.text','reviews.rating','asins']]
df1 = df1.dropna()
df1
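# Per-product averages; numeric_only=True avoids errors on text columns in recent pandas
dfProductReview = df.groupby("asins", as_index=False).mean(numeric_only=True)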
dfProductReview.head(3)
Step 4: Grouping Reviews of Individual Products
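# One text document per product. Note that str() stores the printed form of each
# product's Series (pandas may truncate long ones); apply(" ".join) would keep the full text.
ProductReviewSummary = df1.groupby("asins")["reviews.text"].apply(str)
p = ProductReviewSummary.to_frame()
# Remove digits; regex=True is required because recent pandas treats patterns literally by default
p['reviews.text'] = p['reviews.text'].str.replace(r'\d+', " ", regex=True)
# Replace line breaks with spaces, then trim surrounding spaces
p['reviews.text'] = p['reviews.text'].str.replace('\n', " ")
p['reviews.text'] = p['reviews.text'].str.strip(" ")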
p.shape[0]
-> 24
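As a small illustration of the grouping step, the sketch below builds one document per product on toy data. It joins the reviews with `" ".join` rather than `str`, which is a swapped-in variant and not the code used above; the rows and the names `toy` and `docs` are hypothetical.

```python
import pandas as pd

# Toy reviews, invented for illustration.
toy = pd.DataFrame({
    "asins": ["A1", "A1", "B2"],
    "reviews.text": [
        "Loved it, 5 stars\nfast shipping",
        "Battery lasts 2 days",
        "Screen cracked",
    ],
})

# Concatenate every review of a product into a single string.
docs = toy.groupby("asins")["reviews.text"].apply(" ".join).to_frame()

# Drop digits and line breaks, then collapse repeated whitespace.
docs["reviews.text"] = (
    docs["reviews.text"]
    .str.replace(r"\d+", " ", regex=True)
    .str.replace("\n", " ", regex=False)
    .str.replace(r"\s+", " ", regex=True)
    .str.strip()
)

print(docs)
```

Each row of `docs` now holds the cleaned, combined review text of one product, which is the shape the vectorizer in the next step expects.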
Step 5: TF-IDF Matrix and Cosine Similarity
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
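# Word-level uni-, bi-, and trigrams; min_df must be at least 1 in recent scikit-learn (1 keeps every term)
tf = TfidfVectorizer(analyzer='word', ngram_range=(1, 3), min_df=1, stop_words='english')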
tfidf_matrix = tf.fit_transform(p['reviews.text'])
print((tfidf_matrix.shape))
-> (24, 9968)
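# Pairwise cosine similarity between products; dense_output=False keeps the result sparse
cosine_similarities = cosine_similarity(tfidf_matrix, Y=None, dense_output=False)
cnum = cosine_similarities.toarray()  # dense NumPy copy of the same matrix
# Similarity of the first product to the first five products (24 - 19 = 5 columns)
print(cosine_similarities[:1, :-19])
type(cosine_similarities)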
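To see what the similarity matrix contains, here is a minimal sketch on three toy product documents. The texts and the names `toy_docs`, `toy_tfidf`, and `toy_sim` are assumptions made for illustration; only `TfidfVectorizer` and `cosine_similarity` come from the code above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# One combined review document per product (invented text).
toy_docs = [
    "great tablet with a bright screen",       # product 0
    "bright screen and great battery tablet",  # product 1
    "comfortable headphones with deep bass",   # product 2
]

# Rows are products, columns are terms weighted by TF-IDF.
toy_tfidf = TfidfVectorizer(stop_words="english").fit_transform(toy_docs)

# Entry [i, j] is the cosine similarity between products i and j.
toy_sim = cosine_similarity(toy_tfidf)

print(toy_sim.round(2))
```

The diagonal is 1 because every product is identical to itself. Products 0 and 1 share several terms and therefore score higher with each other than with product 2, which shares no non-stop-word terms with product 0.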
Step 6: Recommendations
def get_recommendations(item_id):
    print("the product selected is {}".format(p.index[item_id]))
    # Dense column of similarities, so that position i always refers to product i
    a = cosine_similarities[:, item_id].toarray().ravel()
    b = dict(enumerate(a))
    print(b)
    # Sort by similarity, highest first; skip the first entry (the product itself)
    c = sorted(b.items(), key=lambda x: x[1], reverse=True)[1:4]
    for k, idx in enumerate(c, start=1):
        print("The {} Recommendation is {}".format(k, p.index[idx[0]]))

get_recommendations(0)
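The ranking step can also be written as a small reusable function. The sketch below is one possible variant, not the code above: `top_n_similar` and `toy_sim` are hypothetical names, and it assumes a dense similarity matrix (a sparse one would need `.toarray()` first).

```python
import numpy as np

def top_n_similar(sim_matrix, item_pos, n=3):
    """Return (position, score) pairs for the n items most similar to item_pos."""
    scores = np.asarray(sim_matrix[item_pos], dtype=float).copy()
    scores[item_pos] = -np.inf            # never recommend the item itself
    best = np.argsort(scores)[::-1][:n]   # positions ordered by descending score
    return [(int(i), float(scores[i])) for i in best]

# Hand-made symmetric similarity matrix for four items.
toy_sim = np.array([
    [1.0, 0.8, 0.1, 0.4],
    [0.8, 1.0, 0.2, 0.3],
    [0.1, 0.2, 1.0, 0.7],
    [0.4, 0.3, 0.7, 1.0],
])

print(top_n_similar(toy_sim, 0, n=2))  # [(1, 0.8), (3, 0.4)]
```

Masking the selected item explicitly avoids relying on it being sorted into first place, which matters when another product has the same similarity score.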
We could also have used count vectors (CountVectorizer) with the k-nearest neighbors (KNN) algorithm to get recommendations.
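A minimal sketch of that alternative on toy data might look as follows. The documents and variable names are assumptions for illustration; the choice of cosine distance with brute-force search is one reasonable configuration, not the only one.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neighbors import NearestNeighbors

# One combined review document per product (invented text).
toy_docs = [
    "great tablet with a bright screen",       # product 0
    "bright screen and great battery tablet",  # product 1
    "comfortable headphones with deep bass",   # product 2
]

# Raw term counts instead of TF-IDF weights.
counts = CountVectorizer(stop_words="english").fit_transform(toy_docs)

# Brute-force nearest neighbours under cosine distance (1 - cosine similarity).
knn = NearestNeighbors(n_neighbors=2, metric="cosine", algorithm="brute")
knn.fit(counts)

# Querying with product 0 returns product 0 itself first, then its closest neighbour.
distances, neighbors = knn.kneighbors(counts[0])
print(neighbors[0], distances[0])
```

Because the query product is part of the fitted data, the first neighbour is the product itself, so recommendations start from the second entry.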
The recommendation system is another wonder of machine learning that eases the selection process. Over time, implementation methods keep changing, and new kinds of data, such as textual reviews, become key resources for machine learning and natural language processing models. (Check also: Machine Learning vs Deep Learning). While everything in the computer science field moves fast, we at Analytics Steps are committed to providing information as quickly as we can.