pandas is a library built on top of NumPy. It provides data structures and operations for manipulating and selecting data, similar in spirit to SQL queries against a database.
How to install pandas:
pip install pandas
conda install pandas
pandas provides two main types of objects:
1) Series: A Series is a one-dimensional array that holds labeled data.
It supports numerical operations similar to a NumPy array, and it is also the type of each DataFrame column, so calculations on DataFrame columns are Series operations.
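As a rough sketch of the numerical operations described above (the fee values and variable names are made up for illustration):

```python
import pandas as pd

# Hypothetical fee data, used only for illustration
fees = pd.Series([100, 200, 300])

# Arithmetic is applied element by element, as with a NumPy array
discounted = fees * 0.5

# A DataFrame column is itself a Series, so the same operations work on it
df = pd.DataFrame({"fees": [100, 200, 300]})
total = df["fees"].sum()

print(discounted)
print(total)
```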
Example 1:
Create a Series without labels (pandas assigns a default integer index):
import pandas as pd
import numpy as np
arr = np.array([12,23,34,56,89,11])
data = pd.Series(arr)
print(data)
Create a Series with labels:
import pandas as pd
import numpy as np
arr = np.array([12,23,34,56,89,11])
data = pd.Series(arr,index=['P','q','r','s','t','u'])
print(data)
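Once a Series has labels, values can be looked up by label rather than by position. A minimal sketch, reusing the data from the example above:

```python
import pandas as pd

data = pd.Series([12, 23, 34, 56, 89, 11],
                 index=['P', 'q', 'r', 's', 't', 'u'])

# Look up a single value by its label
print(data['q'])

# Select several labels at once; the result is another Series
print(data[['P', 'u']])
```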
Create a Series from a list:
import pandas as pd
import numpy as np
arr = [12,23,34,56,89,11]
data = pd.Series(arr,index=['P','q','r','s','t','u'])
print(data)
Create a Series from a dictionary:
A dictionary stores data as key-value pairs, so a Series created from a dictionary uses the keys as its labels.
import pandas as pd
import numpy as np
arr = {'rno':1001,'sname':'jay kumar','branch':'cs','fees':45000}
data = pd.Series(arr)
print(data)
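To check that the keys really became the index, the following sketch inspects the Series built from the same dictionary (the variable name student is chosen here for illustration):

```python
import pandas as pd

student = {'rno': 1001, 'sname': 'jay kumar', 'branch': 'cs', 'fees': 45000}
data = pd.Series(student)

# The dictionary keys are now the index labels
print(list(data.index))

# Values are looked up by key, as in the dictionary
print(data['branch'])
```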
Create a Series from scalar data:
Scalar data means a single value such as an integer, a float, or a string. The value is repeated for every label in the index.
import pandas as pd
import numpy as np
data = pd.Series('SCS',index=['A','B','C','D','E'])
print(data)
Create a Series using NumPy's predefined functions:
import pandas as pd
import numpy as np
data = pd.Series(np.linspace(3,100,5))
print(data)
2) DataFrame:
A DataFrame is a two-dimensional labeled data structure that holds data in rows and columns. It is mostly used to load a dataset into an application.
Datasets are commonly distributed as CSV or Excel files, which can be loaded into DataFrame objects and manipulated easily.
How to create a DataFrame:
import pandas as pd
# list of strings
lst = ['SCS', 'For', 'SCS', 'is',
       'portal', 'for', 'SCS']
# Calling DataFrame constructor on list
df = pd.DataFrame(lst)
print(df)
Create a DataFrame from a dictionary of lists, where each key becomes a column name and each list holds that column's values:
import pandas as pd
# initialise data as a dictionary of lists
data = {'Name':['C', 'C++', 'DS', 'JAVA'],
'Age':[20, 21, 19, 18]}
# Create DataFrame
df = pd.DataFrame(data)
# Print the output.
print(df)
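After creating a DataFrame, it can help to check its size, column names, and column types. A small sketch using the same dictionary:

```python
import pandas as pd

data = {'Name': ['C', 'C++', 'DS', 'JAVA'],
        'Age': [20, 21, 19, 18]}
df = pd.DataFrame(data)

# (number of rows, number of columns)
print(df.shape)

# Column labels come from the dictionary keys
print(list(df.columns))

# Each column has its own data type
print(df.dtypes)
```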
This section explains the functions pandas provides for viewing data.
head(): this function returns the first n rows of a DataFrame or a Series.
Syntax: Dataframe.head(n=5)
Parameters:
n: integer value, number of rows to be returned
Return type: Dataframe with top n rows
If no argument is passed to head(), it returns the first 5 rows.
import pandas as pd
# making data frame
data = pd.read_csv("nba.csv")
# calling head() method
# storing in new variable
data_top = data.head()
# display
print(data_top)
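The example above needs an nba.csv file on disk. If that file is not available, the same behaviour can be tried on a small in-memory DataFrame; the column name n and its values are placeholders chosen for this sketch:

```python
import pandas as pd

# Hypothetical in-memory data standing in for a CSV file
df = pd.DataFrame({'n': range(1, 11)})

first_five = df.head()     # no argument: default of 5 rows
first_three = df.head(3)   # explicit number of rows

print(first_five)
print(first_three)
```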
Select one column of a DataFrame as a Series:
import pandas as pd
# making data frame
data = pd.read_csv("nba.csv")
# number of rows to return
n = 9
# creating series
series = data["Name"]
# returning top n rows
top = series.head(n = n)
# display
print(top)
Display rows from the bottom:
pandas provides tail() to return rows from the bottom of a DataFrame or Series.
# importing pandas module
import pandas as pd
# making data frame
data = pd.read_csv("nba.csv")
# calling tail() method
# storing in new variable
data_bottom = data.tail()
# display
print(data_bottom)
Return the last n rows of a single column:
# importing pandas module
import pandas as pd
# making data frame
data = pd.read_csv("nba.csv")
# number of rows to return
n = 2
# selecting one column as a Series
series = data["Name"]
# returning bottom n rows
bottom = series.tail(n = n)
# display
print(bottom)
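tail() can also be tried without a CSV file. The sketch below uses a made-up ten-row DataFrame and shows that the rows returned keep their original index labels:

```python
import pandas as pd

# Hypothetical in-memory data standing in for a CSV file
df = pd.DataFrame({'n': range(1, 11)})

last_five = df.tail()       # no argument: default of 5 rows
last_two = df['n'].tail(2)  # works on a Series as well

print(last_five)
print(last_two)
```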
Statistical operations using pandas:
To compute summary statistics such as count, mean, standard deviation, min, and max in one call, use describe().
Syntax: DataFrame.describe(percentiles=None, include=None, exclude=None)
Parameters:
percentiles: list-like of numbers between 0 and 1, giving the percentiles to include in the output. When None, the 25th, 50th, and 75th percentiles are returned
include: list of data types to include when describing the data frame. Default is None
exclude: list of data types to exclude when describing the data frame. Default is None
Return type: Statistical summary of the data frame.
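A short sketch of describe() and its parameters on a small DataFrame (the names and ages are sample values used only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
                   'Age': [27, 24, 22, 32]})

# By default only numeric columns are summarised
summary = df.describe()
print(summary)

# Request specific percentiles instead of the default 25%, 50%, 75%
custom = df.describe(percentiles=[0.1, 0.9])
print(custom)

# include='all' also summarises non-numeric columns such as Name
print(df.describe(include='all'))
```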
Projection (showing particular columns):
Projection is used to show only selected columns of a data frame.
# Import pandas package
import pandas as pd
# Define a dictionary containing employee data
data = {'Name':['Jai', 'Princi', 'Gaurav', 'Anuj'],
'Age':[27, 24, 22, 32],
'Address':['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
'Qualification':['Msc', 'MA', 'MCA', 'Phd']}
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
# select two columns
print(df[['Name', 'Qualification']])
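Note the double brackets in the selection above: a list of column names returns a DataFrame, while a single column name returns a Series. A minimal sketch of the difference, using a shortened version of the same data:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi'],
                   'Age': [27, 24],
                   'Qualification': ['Msc', 'MA']})

# Single brackets with one name: the result is a Series
names = df['Name']

# Double brackets (a list of names): the result is a DataFrame
subset = df[['Name']]

print(type(names))
print(type(subset))
```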
How to add a new column to a data frame:
# Import pandas package
import pandas as pd
# Define a dictionary containing Students data
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
'Height': [5.1, 6.2, 5.1, 5.2],
'Qualification': ['Msc', 'MA', 'Msc', 'Msc']}
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
# Declare a list that is to be converted into a column
address = ['Delhi', 'Bangalore', 'Chennai', 'Patna']
# Using 'Address' as the column name
# and equating it to the list
df['Address'] = address
# Observe the result
print(df)
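A new column can also be computed from an existing one, or added with assign(), which returns a new DataFrame and leaves the original unchanged. In this sketch the column names Tall and City and the 5.5 threshold are invented for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi'],
                   'Height': [5.1, 6.2]})

# Column computed from an existing column (hypothetical threshold)
df['Tall'] = df['Height'] > 5.5

# assign() returns a copy with the extra column; df itself is not modified
with_city = df.assign(City=['Delhi', 'Patna'])

print(df)
print(with_city)
```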
Select rows by label:
# importing pandas package
import pandas as pd
# making data frame from csv file
data = pd.read_csv("nba.csv", index_col ="Name")
# retrieving row by loc method
first = data.loc["Avery Bradley"]
second = data.loc["R.J. Hunter"]
print(first, "\n\n\n", second)
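The same kind of row selection can be tried without nba.csv. The sketch below builds a small DataFrame with made-up names, uses set_index() to play the role of index_col, and contrasts loc (by label) with iloc (by position):

```python
import pandas as pd

# Hypothetical data; set_index() makes Name the row label
df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav'],
                   'Age': [27, 24, 22]}).set_index('Name')

# loc selects by label and returns the row as a Series
row = df.loc['Princi']

# A row label plus a column label returns a single value
age = df.loc['Princi', 'Age']

# iloc selects by integer position instead of label
first = df.iloc[0]

print(row)
print(age)
print(first)
```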
How to merge rows in pandas?
Deletion of rows in Django:-