Project 3: Movie Recommendation using Python

Basically, there are two types of recommendation system:

  • content-based filtering
  • collaborative filtering

You can look these up online to learn more about them.

I am mainly here to put down the projects and the code.

All we do here is predict movies for the customer using angular distance (cosine similarity). Similarity between two items can be measured in two ways, Euclidean distance and angular distance, and you have to decide which one suits the problem at hand.
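
To make the difference between the two measures concrete, here is a small sketch in plain Python. The two vectors are made-up examples, not values from the movie dataset, and the function names are only for illustration. The vectors point in the same direction but have different lengths, so the angle between them is zero while the Euclidean distance is not.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.dist(a, b)

def cosine_similarity_manual(a, b):
    """Cosine of the angle between two vectors (close to 1 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical word-count vectors: b is simply a, three times longer
a = [1, 1]
b = [3, 3]

print(euclidean_distance(a, b))        # not zero, because the lengths differ
print(cosine_similarity_manual(a, b))  # about 1, because the direction is the same
```

For text, a long description and a short one about the same topic look like this pair, which is one reason angular distance is often a reasonable choice for word counts.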

I think this much description is enough, so let's carry on with the code.

So, are you ready?

This is the basic code to get the similarity matrix for text, that is, the relation between the words in two sentences.


from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity
text=["London Paris London","Paris Paris London"]

# count how often each word occurs in each sentence
cv=CountVectorizer()
cv_matrix=cv.fit_transform(text)
#print(cv_matrix.toarray())
# cosine similarity between every pair of sentences
similirity_scores= cosine_similarity(cv_matrix)
print(similirity_scores)

Output:

[[1.  0.8]
 [0.8 1. ]]
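
The 0.8 can be checked by hand. The sketch below uses only the standard library and mimics what CountVectorizer does for this simple input (lowercased words, vocabulary in alphabetical order); the helper name count_vector is my own and is not part of scikit-learn.

```python
import math
from collections import Counter

text = ["London Paris London", "Paris Paris London"]

# Vocabulary: every distinct lowercased word, in alphabetical order
vocabulary = sorted({word.lower() for sentence in text for word in sentence.split()})

def count_vector(sentence):
    """Count how often each vocabulary word occurs in the sentence."""
    counts = Counter(word.lower() for word in sentence.split())
    return [counts[word] for word in vocabulary]

v1 = count_vector(text[0])  # [2, 1]: london twice, paris once
v2 = count_vector(text[1])  # [1, 2]: london once, paris twice

dot = sum(x * y for x, y in zip(v1, v2))    # 2*1 + 1*2 = 4
norm1 = math.sqrt(sum(x * x for x in v1))   # sqrt(5)
norm2 = math.sqrt(sum(y * y for y in v2))   # sqrt(5)

print(vocabulary, v1, v2)
print(round(dot / (norm1 * norm2), 2))      # 4 / 5 = 0.8
```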

Download the dataset from "https://drive.google.com/file/d/1sJ9N2T2zDQwvywHCC6RCO68olL97Mp4O/view"

Now that we are done with the basics, let's move on to the actual project.

import pandas as pd
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
df=pd.read_csv("movie_dataset.csv")
df.head(3)

Let's combine four columns into a single feature column.

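# fill missing values (if any) so the string concatenation below does not fail
for feature in ['keywords','cast','genres','director']:
    df[feature]=df[feature].fillna('')

def combine_features(row):
    # join with spaces so words from different columns stay separate
    return row['keywords']+" "+row['cast']+" "+row['genres']+" "+row['director']

df["combine_features"]=df.apply(combine_features,axis=1)
df["combine_features"].head()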
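
Two details are easy to miss in this step: a missing value (NaN) in any of the four columns makes string concatenation fail, and joining without a separator glues the last word of one column to the first word of the next. Here is a minimal sketch on a made-up DataFrame; the rows are invented and are not taken from movie_dataset.csv.

```python
import pandas as pd

# Invented rows standing in for the real dataset; the second row has no director
toy = pd.DataFrame({
    "keywords": ["space war", "heist"],
    "cast": ["Actor A Actor B", "Actor C"],
    "genres": ["Action", "Crime Thriller"],
    "director": ["Director X", None],
})

features = ["keywords", "cast", "genres", "director"]

# Replace missing values with empty strings so joining strings does not fail
toy[features] = toy[features].fillna("")

# Join the four columns with single spaces so neighbouring words stay separate
toy["combine_features"] = toy[features].apply(lambda row: " ".join(row), axis=1)

print(toy["combine_features"].tolist())
```

Without the space, "war" and "Actor" in the first row would end up as the single token "warActor", which CountVectorizer would never match against anything else.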

The CountVectorizer example above is the prerequisite for understanding this step.

from sklearn.feature_extraction.text import CountVectorizer
cv= CountVectorizer()

cv_matrix=cv.fit_transform(df["combine_features"])
#print(cv_matrix.toarray())
similirity_scores= cosine_similarity(cv_matrix)
print(similirity_scores)

Here the user is asked to enter a movie name.

def get_title_from_index(index):
    return df[df.index == index]["title"].values[0]

def get_index_from_title(title):
    return df[df.title == title]["index"].values[0]

movies_user_like=input("movie_name:  ")
movie_index= get_index_from_title(movies_user_like)
similar_movies=list(enumerate(similirity_scores[movie_index]))
sorted_similar_movies=sorted(similar_movies,key=lambda x:x[1],reverse=True)

# print the 11 best matches; the first one is usually the entered movie itself
for movie in sorted_similar_movies[:11]:
    print(get_title_from_index(movie[0]))
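
The lookup and ranking steps above can also be packed into one function. This is just a sketch on an invented four-movie table so that it runs without the CSV file; the function name recommend and the toy titles are my own. Unlike the loop above, it leaves out the movie the user entered.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented titles and features, only to keep the example self-contained
toy = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "combine_features": [
        "space alien war action",
        "space alien rescue action",
        "wedding comedy romance",
        "alien comedy",
    ],
})

# Same two steps as in the post: count words, then compare every pair of rows
scores = cosine_similarity(CountVectorizer().fit_transform(toy["combine_features"]))

def recommend(title, n=2):
    """Return the n titles most similar to `title`, excluding the title itself."""
    idx = toy.index[toy["title"] == title][0]
    ranked = sorted(enumerate(scores[idx]), key=lambda pair: pair[1], reverse=True)
    return [toy["title"].iloc[i] for i, _ in ranked if i != idx][:n]

print(recommend("Movie A"))  # ['Movie B', 'Movie D']
```

"Movie B" shares three of four words with "Movie A", "Movie D" shares one, and "Movie C" shares none, which is why it does not show up.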



Further improvements are possible:

  • When the user enters the movie name, we can wrap the lookup in a try/except statement and show a message if the movie is not in the list.
  • Using the average vote, we can recommend the most popular movies.
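
Both ideas could look roughly like the sketch below. It runs on an invented table, and the column names vote_average and vote_count are assumptions, so check what your copy of the dataset actually calls them. The min_votes cut-off is also my own addition, so that a movie with a handful of votes does not top the list.

```python
import pandas as pd

# Invented rows; "vote_average" and "vote_count" are assumed column names
toy = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C"],
    "vote_average": [7.9, 6.1, 8.4],
    "vote_count": [1200, 300, 15],
})

def get_index_from_title(title):
    """Return the row index for `title`, or None if the title is not in the list."""
    try:
        matches = toy.index[toy["title"].str.lower() == title.strip().lower()]
        return matches[0]
    except IndexError:
        # no row matched, so there is no first element to return
        return None

def most_popular(n=2, min_votes=100):
    """Rank by average vote, ignoring movies with too few votes."""
    enough_votes = toy[toy["vote_count"] >= min_votes]
    ranked = enough_votes.sort_values("vote_average", ascending=False)
    return ranked["title"].head(n).tolist()

movie_index = get_index_from_title("Unknown Movie")
if movie_index is None:
    print("Movie not found, here are some popular ones instead:")
    print(most_popular())
```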

Thank you 
