
Machine learning:- K-means

Why do we need Data Preprocessing?


Real-world data generally contains noise and missing values, and it may be in a format that machine learning models cannot use directly. Data preprocessing is the set of tasks that clean the data and make it suitable for a model, which usually also improves the model's accuracy and efficiency.


It involves the following steps:
  • Getting the dataset
  • Importing libraries
  • Importing the dataset
  • Handling missing data
  • Encoding categorical data
  • Splitting the dataset into training and test sets
  • Feature scaling



import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Load the dataset (adjust the path to wherever Data.csv is stored)
data_set = pd.read_csv('d:/Data.csv')

# Independent variables x: every column except the last
x = data_set.iloc[:, :-1].values

# Fit the imputer on columns 1 and 2 of x
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
imputer = imputer.fit(x[:, 1:3])
# Replace missing data with the calculated mean value
x[:, 1:3] = imputer.transform(x[:, 1:3])
print(x)

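# Encode the categorical column 0 as integers, then one-hot encode it
label_encoder_x_1 = LabelEncoder()
x[:, 0] = label_encoder_x_1.fit_transform(x[:, 0])
# scikit-learn >= 1.2 names this parameter sparse_output; older versions use sparse=False
transformer = ColumnTransformer(
    [('Country', OneHotEncoder(sparse_output=False), [0])], remainder='passthrough'
)
x = transformer.fit_transform(x)
print(x)

# Dependent variable y: column 3, encoded as integers
y = data_set.iloc[:, 3].values
labelencoder_y = LabelEncoder()
y = labelencoder_y.fit_transform(y)
print(y)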
# Hold out 20% of the rows as the test set
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)

#print("Training data set is", x_train, y_train)
#print("Testing data set is", x_test, y_test)

# Fit the scaler on the training set only, then apply it to both sets
st_x = StandardScaler()
x_train = st_x.fit_transform(x_train)
x_test = st_x.transform(x_test)

print(x_train)
print(x_test)
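
The listing above depends on a local Data.csv file. As a rough, self-contained sketch of the same steps, the example below builds a small in-memory table instead. The column names (Country, Age, Salary, Purchased) and all values are made up for illustration and are not taken from the original dataset. It also differs from the listing in two ways that are assumptions of this sketch: the data is split before imputing, so the imputer and scaler are fitted on the training rows only, and only the numeric columns are scaled.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Hypothetical stand-in for Data.csv (names and values are invented)
df = pd.DataFrame({
    "Country": ["France", "Spain", "Germany", "Spain", "Germany",
                "France", "Spain", "France", "Germany", "France"],
    "Age": [44, 27, 30, 38, 40, 35, np.nan, 48, 50, 37],
    "Salary": [72000, 48000, 54000, 61000, np.nan,
               58000, 52000, 79000, 83000, 67000],
    "Purchased": ["No", "Yes", "No", "No", "Yes",
                  "Yes", "No", "Yes", "No", "Yes"],
})

X = df[["Country", "Age", "Salary"]]
y = LabelEncoder().fit_transform(df["Purchased"])  # "No" -> 0, "Yes" -> 1

# Split first so that the test rows never influence the fitted statistics
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Numeric columns: fill missing values with the mean, then standardize
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
])

# sparse_threshold=0 asks ColumnTransformer to always return a dense array
preprocess = ColumnTransformer(
    [
        ("country", OneHotEncoder(handle_unknown="ignore"), ["Country"]),
        ("numeric", numeric, ["Age", "Salary"]),
    ],
    sparse_threshold=0,
)

X_train_prepared = preprocess.fit_transform(X_train)
X_test_prepared = preprocess.transform(X_test)

print(X_train_prepared.shape, X_test_prepared.shape)
```

Wrapping the steps in a ColumnTransformer keeps the fitted means and scaling factors in one object, so the same transformation can be reapplied to new rows with a single transform() call.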

Machine learning:- K-means


K-means clustering is a method for finding clusters and cluster centers in a set of unlabelled data.

A cluster is a group of similar data points, and each cluster is identified by its own label. For example, with three clusters labelled red, green, and blue, an item whose color is closest to red becomes part of the red cluster.
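
To make cluster membership concrete, here is a minimal sketch of the red, green, and blue example. It assumes three fixed cluster centers given as RGB values, chosen here only for illustration; in K-means the centers are not fixed in advance but computed from the data. The helper name nearest_cluster is hypothetical.

```python
import numpy as np

# Hypothetical, hand-picked cluster centers (pure red, green, and blue)
centers = {
    "red": np.array([255.0, 0.0, 0.0]),
    "green": np.array([0.0, 255.0, 0.0]),
    "blue": np.array([0.0, 0.0, 255.0]),
}


def nearest_cluster(rgb):
    """Return the label of the center closest to rgb (Euclidean distance)."""
    point = np.asarray(rgb, dtype=float)
    return min(centers, key=lambda name: np.linalg.norm(point - centers[name]))


print(nearest_cluster([200, 30, 40]))   # a reddish color
print(nearest_cluster([30, 200, 60]))   # a greenish color
print(nearest_cluster([10, 20, 240]))   # a bluish color
```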

Steps for clustering:-

1) Prepare the data, either loaded from a repository or built as an array with NumPy.
2) To rescale the data, use whiten(), which divides each feature by its standard deviation.
3) Calculate the centroids from the data for the chosen number of clusters.
4) Match each observation to its nearest centroid (minimum distance); the result is an array of cluster indices 0, 1, 2, ..., one per observation.
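
Steps 3 and 4 can be sketched by hand in NumPy to show what happens inside a K-means routine. This is a simplified illustration, not the SciPy implementation: the function names assign and kmeans_sketch are hypothetical, the six sample points are invented, and the starting centroids are passed in explicitly so that the result is reproducible.

```python
import numpy as np


def assign(points, centroids):
    """Step 4: index of the nearest centroid for every point."""
    # distances has shape (number of points, number of centroids)
    distances = np.linalg.norm(
        points[:, None, :] - centroids[None, :, :], axis=2
    )
    return distances.argmin(axis=1)


def kmeans_sketch(points, initial_centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    centroids = np.array(initial_centroids, dtype=float)
    for _ in range(iterations):
        labels = assign(points, centroids)
        for j in range(len(centroids)):
            members = points[labels == j]
            # Step 3: move the centroid to the mean of its members;
            # an empty cluster keeps its previous centroid
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return centroids, assign(points, centroids)


# Two small, well-separated groups of invented 2-D points
points = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1],
])

# Start from one point in each group
centroids, labels = kmeans_sketch(points, points[[0, 3]])
print(centroids)
print(labels)
```

The fixed iteration count keeps the sketch short; library implementations instead stop once the centroids change by less than a threshold.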
Complete example of the K-means clustering algorithm, which is widely used in machine learning:-
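from numpy import array, vstack
from numpy.random import rand
from scipy.cluster.vq import kmeans, vq

# 20 observations with 3 features: 10 shifted by +1 and 10 in the unit cube
data = vstack((rand(10, 3) + array([1, 1, 1]), rand(10, 3)))

# Compute 5 centroids; the second return value (the distortion) is ignored
centroids, _ = kmeans(data, 5)
print(centroids)

# Match each observation to the index of its nearest centroid
clx, _ = vq(data, centroids)
print(clx)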
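
The example above skips the optional rescaling step with whiten(). The variant below is a sketch of how that step might fit in, under a few assumptions made for this illustration: the random data is seeded so that runs are repeatable, two clusters are requested because the data is built from two groups, and the initial centroids are passed to kmeans() as an explicit guess (one observation from each group) instead of a cluster count.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq, whiten

rng = np.random.default_rng(0)  # seeded generator for repeatable data

# Same layout as above: 10 observations shifted by +1, 10 in the unit cube
data = np.vstack((rng.random((10, 3)) + 1.0, rng.random((10, 3))))

# Rescale every feature (column) to unit standard deviation
scaled = whiten(data)

# Initial guess: the first and the last observation, one from each group
initial = scaled[[0, -1]]

# kmeans returns the centroids and the mean distance to the nearest centroid
centroids, distortion = kmeans(scaled, initial)

# vq returns the nearest-centroid index and the distance for each observation
labels, distances = vq(scaled, centroids)

print(centroids)
print(labels)
```

Note that the centroids are expressed in the whitened scale, so any new observation has to be rescaled the same way before it is matched against them.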
                       
