hobbyist

Posts

Showing posts from September, 2020

Project 3: Movie Recommendation using Python

- September 13, 2020

Basically, there are two types of recommendation system: content-based filtering and collaborative filtering. You can google these to get to know them; I am here mainly to put down the projects and code. All we do is predict a movie for the customer using the angular distance. (Similarity can be calculated with two methods, i.e. Euclidean distance and angular distance, and you have to decide which one suits the problem.) I think this much description is well and good, so let's carry on with the code. Are you ready?

This is the basic code to get the matrix of the text relation, I mean the relation between the words in two sentences.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    text = ["London Paris London", "Paris Paris London"]
    cv = CountVectorizer()
    cv_matrix = cv.fit_transform(text)
    # print(cv_matrix.toarray())
    similirity_scor...
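To make the two distance options concrete, here is a minimal sketch that computes both by hand on the word counts of the two example sentences. It uses only the standard library. The helper names `count_vector`, `cosine_sim`, and `euclidean` are made up for this illustration and are not part of scikit-learn.

```python
import math

def count_vector(sentence, vocabulary):
    """Count how often each vocabulary word appears in the sentence."""
    words = sentence.lower().split()
    return [words.count(term) for term in vocabulary]

def cosine_sim(a, b):
    """Cosine of the angle between two count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean(a, b):
    """Straight-line distance between two count vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

text = ["London Paris London", "Paris Paris London"]
# Sorted list of the distinct lowercase words in both sentences
vocabulary = sorted({w for s in text for w in s.lower().split()})
v1 = count_vector(text[0], vocabulary)
v2 = count_vector(text[1], vocabulary)

print(vocabulary, v1, v2)
print("cosine similarity:", cosine_sim(v1, v2))
print("euclidean distance:", euclidean(v1, v2))
```

Cosine similarity only looks at the angle between the count vectors, so a short text and a long text with the same mix of words still score close to 1, while the Euclidean distance grows with the difference in raw counts.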

Project 2: PDF extractor using Python

- September 09, 2020

Let us prepare a small project, a few lines of code, to extract the text of a whole PDF.

Install: pip install PyPDF2

code:
.............................................................................................................................................
    # note: PdfFileReader and the get...() methods are the older PyPDF2 API;
    # PyPDF2 3.0 and later use PdfReader instead
    from PyPDF2 import PdfFileReader

    # read a pdf file, i.e. open it in rb (read binary) mode
    file = open("Handbook.pdf", 'rb')
    # reader is a variable used to read the file
    reader = PdfFileReader(file)
    # let's get the info of the pdf document
    print("document info:", reader.getDocumentInfo())
    print()
    # getNumPages() gets the number of pages in the pdf
    print("number of pages are:", reader.getNumPages())
    # keep the page count in the variable "pages" to loop over
    pages = reader.getNumPages()
    for i in range(0, pages):
        print("page number=", i+1)
        pageObj = reader.getPage(i)
        print(pageObj.extractText())
        print()
    print(reader.getDocumentInfo().creator)
    file.close()
............................................................

k nearest neighbors with well defined k value

- September 06, 2020

So let us understand how we can choose the best k for our model. For the last model I had prepared a function:

    # x and y are the features and target from the previous post
    def regression(model):
        x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
        reg_all = model
        reg_all.fit(x_train, y_train)
        y_predict = reg_all.predict(x_test)
        rmse_value = np.sqrt(mean_squared_error(y_test, y_predict))
        print("rms error={}".format(rmse_value))

Now I have prepared a cross-validated version to get the mean of the RMSE, where k=3 means the RMSE is averaged over three folds. Lasso is a way to counteract overfitting (we can also use Ridge) to check.

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.linear_model import Lasso

    def regression_cv(model, k=3):
        # scores are negated MSE values, one per fold
        scores = cross_val_score(model, x, y, scoring='neg_mean_squared_error', cv=k)
        rmse = np.sqrt(-scores)
        print('reg rmse:', rmse)
        print('reg mean:', rmse.mean())

    importi...
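The sign flip in `np.sqrt(-scores)` can look odd, so here is a small standard-library sketch of the same arithmetic. scikit-learn's 'neg_mean_squared_error' scoring returns the MSE negated so that a higher score is always better, which is why the sign is flipped back before taking the square root. The fold scores below are made-up numbers, and `rmse_per_fold` is a hypothetical helper name used only for this illustration.

```python
import math

def rmse_per_fold(neg_mse_scores):
    """Turn negated MSE scores (one per fold) into RMSE values."""
    return [math.sqrt(-score) for score in neg_mse_scores]

# Made-up scores standing in for what cross_val_score would return with cv=3
scores = [-4.0, -9.0, -16.0]

rmse = rmse_per_fold(scores)
mean_rmse = sum(rmse) / len(rmse)

print("reg rmse:", rmse)
print("reg mean:", mean_rmse)
```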

Packt publication linear regression

- September 05, 2020

code:

    '''
    step 1: import all the required packages
    step 2: read the csv file and drop the null values
    step 3: declare x & y, i.e. the independent and target values
    step 4: split train and test values from the data set
    step 5: call Linear regression and fit the x & y train values
    step 6: predict y for our x values from the test set and find the rmse value
    the rmse value can be useful to understand the efficiency of our model
    '''
    import pandas as pd
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split

    In [20]: df_housing.head()
    Out[20]:
           CRIM    ZN  INDUS  CHAS    NOX     RM   AGE     DIS  RAD  TAX  PTRATIO      B  LSTAT   MEDV
    0   0.00632  18.0   2.31     0  0.538  6.575  65.2  4.0900    1  296     35.3  396.9   4.98  24.00
    1   0.02731   0.0   7.07     0  0.469  6.421  78.9  4.9671    2  242     17.8  396.9   9.14  21.60
    2   0.02731   0.0   7.07     0  0.469  6.421  78.9  4.9671    2  242     17.8  396.9   9.14  21.61
    3   0.02731   0.0   7.07     0  0.4...
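The fit, predict, and RMSE steps listed in the code comment above can be sketched without any libraries on a tiny made-up dataset. This is only an illustration under simplifying assumptions: one feature instead of the housing columns, a fixed split instead of the random `train_test_split`, and hypothetical helper names `fit_line` and `rmse`.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up data: the training points lie on y = 2x + 1,
# the two test points are slightly off that line
x_train = [1, 2, 3, 4, 5, 6, 7, 8]
y_train = [3, 5, 7, 9, 11, 13, 15, 17]
x_test = [9, 10]
y_test = [19.5, 20.5]

slope, intercept = fit_line(x_train, y_train)           # step 5: fit
y_predict = [slope * x + intercept for x in x_test]     # step 6: predict
print("slope:", slope, "intercept:", intercept)
print("rms error:", rmse(y_test, y_predict))
```

The RMSE is in the same units as the target, so it reads as a typical size of the prediction error on the test rows.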

image editor using python

- September 03, 2020

Install packages like OpenCV and NumPy.

    import cv2
    import numpy as np

    num_down = 2
    num_bilaterial = 7
    img_rgb = cv2.imread("animesh.jpg")
    print(img_rgb.shape)
    img_rgb = cv2.resize(img_rgb, (400, 400))

    # downsampling, bilateral filter
    img_color = img_rgb
    for _ in range(num_down):
        img_color = cv2.pyrDown(img_color)
    for _ in range(num_bilaterial):
        img_color = cv2.bilateralFilter(img_color, d=9, sigmaColor=9, sigmaSpace=7)
    for _ in range(num_down):
        img_color = cv2.pyrUp(img_color)

    # editing tools
    # image to gray scale (cv2.imread loads the channels in BGR order)
    img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
    # blurring the image
    img_blur = cv2.medianBlur(img_gray, 9)
    # thresholding
    img_edge = cv2.adaptiveThreshold(img_blur, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, blockSize=9,...

scatterplot / violin plot / histogram / boxplot

- September 02, 2020

Dataset address: https://data.gov.uk/dataset/bb3520e6-dd76-46d9-8bdd-86f0a2178be9/organogram-of-staff-roles-salaries/datafile/cb432dfd-a5eb-4eaa-8523-961601d5601b/preview#organogram (save the dataset as uk_statistic).

    import pandas as pd
    import matplotlib.pyplot as plt

    uk = pd.read_csv("uk_statistic")

.......................boxplot.......................

    x = uk['Salary Cost of Reports (£)']
    y = uk['Actual Pay Floor (£)']
    plt.boxplot(x)
    plt.title("UK Statistics")
    plt.xlabel('salary cost of Report')
    plt.ylabel('Actual pay floor')
    plt.show()

.......................violin plot.......................

    x = uk['Salary Cost of Reports (£)']
    # y = uk['Actual Pay Floor (£)']
    plt.violinplot(x)
    plt.show()

.......................histogram.......................

    title = 'UK Statistics'
    plt.figure(figsize=(10,6))
    plt.hist(uk['Actual Pay Floor (£...

Replacing a null value with a mean / 0 / median and Heatmap

- September 02, 2020

Replacing a null value with the mean

    df_housing["AGE"] = df_housing["AGE"].fillna(df_housing["AGE"].mean())
    df_housing["AGE"]

Replacing a null value with "0"

    df_housing["AGE"] = df_housing["AGE"].fillna(0)

Replacing a null value with the median

    df_housing["AGE"] = df_housing["AGE"].fillna(df_housing["AGE"].median())

Correlation

Correlation is a statistical measure between -1 and +1 that indicates how closely two variables are related. A correlation of -1 or +1 means that the variables are completely dependent, and they fall on a perfectly straight line. A correlation of 0 indicates no linear relationship: an increase in one variable gives no straight-line information about the other variable. Visually, this would often be points all over the place. Correlations usually fall somewhere in the middle. For instance, a correlation of 0.75 represents a fairly strong relationship, whereas a correlation of 0.25 is a reasonably weak relationship. ...
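The range of values described above can be checked with a small hand-rolled Pearson correlation. This is a minimal sketch using only the standard library; `pearson` is a hypothetical helper name and the number lists are made up for illustration, not taken from the housing data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [1, 2, 3, 4]
rising = [2, 4, 6, 8]     # straight line going up
falling = [8, 6, 4, 2]    # straight line going down
no_trend = [1, -1, -1, 1] # no upward or downward straight-line trend

print("rising:  ", pearson(xs, rising))
print("falling: ", pearson(xs, falling))
print("no trend:", pearson(xs, no_trend))
```

The first two pairs sit at the ends of the scale and the third sits at zero. In pandas, `DataFrame.corr()` computes this same coefficient for every pair of numeric columns.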