
Pandas Introduction, Data Selection using pandas


pandas is a library built on top of NumPy. It provides labeled data structures and tools for data manipulation and data selection, similar in spirit to running SQL queries on a database.

How to install pandas on your machine?

pip install pandas

conda install pandas

Pandas provides two main types of objects:-

1)  Series:- A Series is a one-dimensional labeled array. Each value in it is associated with an index label.

It supports numerical operations similar to a NumPy array. Each column of a DataFrame is a Series, so calculations on DataFrame columns are Series operations.

Example 1:-

Create a script to display a Series without labels

import pandas as pd

import numpy as np

arr = np.array([12,23,34,56,89,11])

data = pd.Series(arr)

print(data)

Create a script to display a Series with labels

import pandas as pd

import numpy as np

arr = np.array([12,23,34,56,89,11])

data = pd.Series(arr,index=['P','q','r','s','t','u'])

print(data)
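Once a Series has labels, individual values can be read either by label or by position. The following is a minimal sketch that reuses the values and labels from the example above; the variable names by_label, by_position, and subset are only illustrative.

```python
import pandas as pd

# Same values and labels as the labeled Series example
data = pd.Series([12, 23, 34, 56, 89, 11],
                 index=['P', 'q', 'r', 's', 't', 'u'])

by_label = data.loc['r']        # label-based lookup
by_position = data.iloc[2]      # position-based lookup (0-indexed)
subset = data.loc[['P', 'u']]   # a list of labels returns a smaller Series

print(by_label, by_position)
print(subset)
```

Using .loc and .iloc explicitly keeps it clear whether a lookup is by label or by position.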

Create a Series from a list:-

import pandas as pd

import numpy as np

arr = [12,23,34,56,89,11]

data = pd.Series(arr,index=['P','q','r','s','t','u'])

print(data)

Create a Series from a dictionary:-

A dictionary stores data as key-value pairs, so we can create a labeled Series from a dictionary object. The keys become the index labels.

import pandas as pd

import numpy as np

arr = {'rno':1001,'sname':'jay kumar','branch':'cs','fees':45000}

data = pd.Series(arr)

print(data)

Create a Series from scalar data:-

Scalar data means a single value such as an integer, float, or string. The scalar is repeated for every label in the index.

import pandas as pd

import numpy as np

data = pd.Series('SCS',index=['A','B','C','D','E'])

print(data)

Create a Series object using NumPy predefined functions:-

import pandas as pd

import numpy as np

data = pd.Series(np.linspace(3,100,5))

print(data)
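The Series description earlier mentions numerical operations similar to a NumPy array. As a rough sketch of what that means in practice, the example below applies element-wise arithmetic, filtering, and aggregation to the same numbers used in the earlier examples; the variable name marks is an illustrative assumption.

```python
import pandas as pd
import numpy as np

marks = pd.Series([12, 23, 34, 56, 89, 11])

doubled = marks * 2             # element-wise arithmetic, no loop needed
above_30 = marks[marks > 30]    # boolean filtering keeps matching values
total = marks.sum()             # aggregation over the whole Series
average = marks.mean()
roots = np.sqrt(marks)          # NumPy functions accept a Series and return a Series

print(doubled)
print(above_30)
print(total, average)
```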

2)  DataFrame:-

A DataFrame is a two-dimensional labeled data structure that stores data in rows and columns. It is mostly used to load a dataset into an application.

Datasets are commonly distributed as CSV or Excel files, which can be loaded into DataFrame objects and manipulated easily.

How to create DataFrame objects:-

import pandas as pd

# list of strings

lst = ['SCS', 'For', 'SCS', 'is',
       'portal', 'for', 'SCS']

# Calling DataFrame constructor on list

df = pd.DataFrame(lst)

print(df)

Creating a DataFrame from a dictionary of lists: each key becomes a column name and each list holds the values of that column.

import pandas as pd

# initialise data as a dictionary of lists.

data = {'Name':['C', 'C++', 'DS', 'JAVA'],

        'Age':[20, 21, 19, 18]}

 # Create DataFrame

df = pd.DataFrame(data)

# Print the output.

print(df)
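After creating a DataFrame it is often useful to check its size, column names, and column types. The sketch below does this for the same dictionary used above, relying only on the standard shape, columns, and dtypes attributes.

```python
import pandas as pd

data = {'Name': ['C', 'C++', 'DS', 'JAVA'],
        'Age': [20, 21, 19, 18]}

df = pd.DataFrame(data)

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names taken from the dictionary keys
print(df.dtypes)         # data type of each column
```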

This section explains the view functions of pandas, which display part of a DataFrame or Series.

head():- This function returns the first n rows of a DataFrame or a Series.

Syntax: DataFrame.head(n=5)

Parameters:
n: integer value, number of rows to be returned

Return type: DataFrame with the first n rows

If no argument is passed to head(), it returns the first 5 rows.

import pandas as pd

# making data frame

data = pd.read_csv("nba.csv")

# calling head() method

# storing in new variable

data_top = data.head()

# display

print(data_top)

Select a DataFrame column as a Series:-

import pandas as pd

# making data frame

data = pd.read_csv("nba.csv")

# number of rows to return

n = 9

# creating series

series = data["Name"]

# returning top n rows

top = series.head(n = n)

# display

print(top)

Display rows from the bottom:-

pandas provides tail() to display rows from the bottom of a DataFrame or Series.

# importing pandas module

import pandas as pd

# making data frame

data = pd.read_csv("nba.csv")

# calling tail() method

# storing in new variable

data_bottom = data.tail()

# display

print(data_bottom)

# importing pandas module

import pandas as pd

# making data frame

data = pd.read_csv("nba.csv")

# number of rows to return

n = 2

# creating series

series = data["Name"]

# returning bottom n rows

bottom = series.tail(n = n)

# display

print(bottom)
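Because nba.csv is not included with this post, the same head() and tail() calls can be tried on a small in-memory DataFrame. The sketch below assumes a made-up table called players; the names and numbers in it are placeholders and are not taken from the real file.

```python
import pandas as pd

# Hypothetical stand-in for nba.csv (placeholder values only)
players = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D', 'E', 'F', 'G'],
    'Number': [0, 99, 30, 28, 8, 90, 55],
})

first_rows = players.head()            # no argument: first 5 rows
last_two = players.tail(2)             # last 2 rows of the DataFrame
last_names = players['Name'].tail(n=3) # tail() also works on a Series

print(first_rows)
print(last_two)
print(last_names)
```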


Statistical operations using pandas:-

To get summary statistics such as count, mean, std, min, and max in one call, we can use describe() in pandas.

Syntax: DataFrame.describe(percentiles=None, include=None, exclude=None)

Parameters:
percentiles: list-like of numbers between 0 and 1, the percentiles to include in the output
include: list of data types to include when describing the DataFrame. Default is None
exclude: list of data types to exclude when describing the DataFrame. Default is None

Return type: Statistical summary of the DataFrame.
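A short sketch of describe() in use is given here. It assumes a small hand-made DataFrame (the Age and Height values are illustrative) and shows the default call, a call with custom percentiles, and a call with include='all' so that the text column is summarised as well.

```python
import pandas as pd

# Illustrative data, not from a real dataset
df = pd.DataFrame({
    'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
    'Age': [27, 24, 22, 32],
    'Height': [5.1, 6.2, 5.1, 5.2],
})

summary = df.describe()                        # numeric columns only by default
custom = df.describe(percentiles=[0.1, 0.9])   # request the 10th and 90th percentiles
everything = df.describe(include='all')        # also summarise non-numeric columns

print(summary)
print(custom)
print(everything)
```

Individual statistics can be read from the result with .loc, for example summary.loc['mean', 'Age'].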


Projection, or showing particular columns:-

Projection selects only the chosen columns from a DataFrame.

# Import pandas package

import pandas as pd

# Define a dictionary containing employee data

data = {'Name':['Jai', 'Princi', 'Gaurav', 'Anuj'],

'Age':[27, 24, 22, 32],

'Address':['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],

'Qualification':['Msc', 'MA', 'MCA', 'Phd']}

# Convert the dictionary into DataFrame

df = pd.DataFrame(data)

# select two columns

print(df[['Name', 'Qualification']])
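The type of the result depends on how the columns are requested: a single column name gives a Series, while a list of names gives a DataFrame, even if the list has one entry. A minimal sketch of the difference, using a shortened version of the employee data:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
                   'Age': [27, 24, 22, 32],
                   'Qualification': ['Msc', 'MA', 'MCA', 'Phd']})

names = df['Name']                       # single label: returns a Series
names_frame = df[['Name']]               # list with one label: returns a DataFrame
two_cols = df[['Name', 'Qualification']] # list of labels: returns a DataFrame

print(type(names).__name__)
print(type(names_frame).__name__)
print(two_cols.shape)
```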

How to add a new column to a DataFrame?

# Import pandas package

import pandas as pd

# Define a dictionary containing Students data

data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],

'Height': [5.1, 6.2, 5.1, 5.2],

'Qualification': ['Msc', 'MA', 'Msc', 'Msc']}

# Convert the dictionary into DataFrame

df = pd.DataFrame(data)

# Declare a list that is to be converted into a column

address = ['Delhi', 'Bangalore', 'Chennai', 'Patna']

# Using 'Address' as the column name

# and equating it to the list

df['Address'] = address

# Observe the result

print(df)
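Assigning with df['Address'] = address changes the DataFrame in place and puts the column at the end. Two other standard options are sketched here: assign(), which returns a new DataFrame and leaves the original untouched, and insert(), which places the column at a chosen position. The column name City is only an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
                   'Height': [5.1, 6.2, 5.1, 5.2],
                   'Qualification': ['Msc', 'MA', 'Msc', 'Msc']})

address = ['Delhi', 'Bangalore', 'Chennai', 'Patna']

# assign() returns a new DataFrame; df itself is not modified
with_address = df.assign(Address=address)

# insert() modifies df in place and puts the column at position 1
df.insert(1, 'City', address)

print(with_address)
print(df)
```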

Select rows by index label:-

# importing pandas package

import pandas as pd

# making data frame from csv file

data = pd.read_csv("nba.csv", index_col ="Name")

# retrieving row by loc method

first = data.loc["Avery Bradley"]

second = data.loc["R.J. Hunter"]

print(first, "\n\n\n", second)
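The same kind of row selection can be tried without the CSV file. The sketch below assumes a small hand-made DataFrame indexed by Name and compares label-based selection with loc, position-based selection with iloc, and selection by condition.

```python
import pandas as pd

# Illustrative data with "Name" used as the index
df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
                   'Age': [27, 24, 22, 32],
                   'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj']}
                  ).set_index('Name')

one_row = df.loc['Jai']              # single label: returns the row as a Series
several = df.loc[['Jai', 'Anuj']]    # list of labels: returns a DataFrame
first_row = df.iloc[0]               # position-based: first row
older = df.loc[df['Age'] > 25]       # boolean condition: rows where Age > 25

print(one_row)
print(several)
print(older)
```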

How to add a row to a DataFrame in pandas?

import pandas as pd

# making data frame; keep "Name" as a regular column so it matches new_row
df = pd.read_csv("nba.csv")

print(df.head(10))

# the new row as a one-row DataFrame
new_row = pd.DataFrame({'Name':'Geeks', 'Team':'Boston', 'Number':3,
                        'Position':'PG', 'Age':33, 'Height':'6-2',
                        'Weight':189, 'College':'MIT', 'Salary':99999},
                       index =[0])

# simply concatenate both dataframes
df = pd.concat([new_row, df]).reset_index(drop = True)

print(df.head(5))
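An alternative to calling reset_index(drop=True) after the concatenation is to pass ignore_index=True to pd.concat, which renumbers the rows in one step. A minimal sketch on a small in-memory DataFrame (the two existing rows are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi'],
                   'Age': [27, 24]})

new_row = pd.DataFrame({'Name': ['Geeks'], 'Age': [33]})

# new row first, index renumbered 0..n-1
prepended = pd.concat([new_row, df], ignore_index=True)

# new row last
appended = pd.concat([df, new_row], ignore_index=True)

print(prepended)
print(appended)
```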


Deletion of rows in pandas:-

import pandas as pd

# making data frame from csv file
data = pd.read_csv("nba.csv", index_col ="Name" )

# dropping rows by index label
data.drop(["Avery Bradley", "John Holland", "R.J. Hunter"], inplace = True)

# display
print(data)
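With inplace=True the original DataFrame is changed. Without it, drop() returns a new DataFrame and leaves the original as it was. The sketch below assumes a small hand-made DataFrame and also shows two other standard options of drop(): errors='ignore' to skip labels that do not exist, and columns= to remove a column instead of rows.

```python
import pandas as pd

# Illustrative data with "Name" used as the index
df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
                   'Age': [27, 24, 22, 32]}).set_index('Name')

remaining = df.drop(['Princi', 'Gaurav'])     # returns a new DataFrame
safe = df.drop(['Nobody'], errors='ignore')   # missing label is skipped, no error
no_age = df.drop(columns=['Age'])             # drop a column instead of rows

print(remaining)
print(len(df))   # the original still has all of its rows
```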



