Module 4: Introduction to data structures in
Pandas
Introduction
●Pandas is an open-source library used for working with relational or labeled
data both easily and intuitively.
●It provides various data structures and operations for manipulating numerical data
and time series.
●It offers tools for cleaning and processing your data.
●It is one of the most widely used Python libraries for data analysis.
●It supports two data structures:
● Series
● DataFrame
What is a Series?
• A Pandas Series is like a column in a table.
• It is a one-dimensional array holding data of any type.
• If nothing else is specified, the values are labeled with their index
number.
• First value has index 0, second value has index 1 etc.
• This label can be used to access a specified value.
• Syntax: pandas.Series(data=None, index=None, dtype=None,
name=None, copy=False)
data: array-like, Iterable, dict, or scalar value. Contains the data stored in the Series.
index: array-like or Index (1d)
dtype: str, numpy.dtype, or ExtensionDtype, optional
name: str, optional
copy: bool, default False
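The constructor parameters listed above can be sketched as follows. This is a minimal illustration only; the values, labels, and the name "temperature" are made up for the example.

```python
import pandas as pd

# Build a Series with explicit labels (index), a float dtype, and a name.
temps = pd.Series([21, 23, 19], index=["mon", "tue", "wed"],
                  dtype="float64", name="temperature")

print(temps)
print(temps["tue"])   # label-based access to one value
print(temps.name)     # the name passed to the constructor
```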
Example1:
import pandas as pd
# a simple list of characters
# (named chars rather than list, to avoid shadowing the built-in list type)
chars = ['h', 'e', 'l', 'l', 'o']
# create a Series from the list
res = pd.Series(chars)
print(res)
Output:
0 h
1 e
2 l
3 l
4 o
dtype: object
Example 2: Create a simple Pandas Series from a list:
import pandas as pd
a = [1, 7, 2]
myvar = pd.Series(a)
print(myvar)
print(myvar[0])
Output:
0 1
1 7
2 2
dtype: int64
1
Example 3: Create labels with the index argument
import pandas as pd
a = [1,7,2]
myvar = pd.Series(a, index = ["x","y","z"])
print(myvar)
Output:
x 1
y 7
z 2
dtype: int64
Key/Value Objects as Series
Create a simple Pandas Series from a dictionary:
Example 1
import pandas as pd
calories = {"day1": 420, "day2": 380, "day3": 390}
myvar = pd.Series(calories)
print(myvar)
Output:
day1 420
day2 380
day3 390
dtype: int64
Example 2
import pandas as pd
dic = {'Id': 1013, 'Name': 'Mohit',
       'State': 'Manipal', 'Age': 24}
res = pd.Series(dic)
print(res)
Output:
Id 1013
Name Mohit
State Manipal
Age 24
dtype: object
Operations on a Series
• Pandas Series provides two very useful methods for extracting the data from
the top and bottom of the Series Object.
• These methods are head() and tail().
1. head() Method
• The head() method returns elements from the top of the Series. By
default, it returns 5 elements.
Syntax:
<Series Object>.head(n=5)
Example:
Consider the Series created below; the following operations are performed on it.
head() Function without argument
If no argument is passed to head(), it returns the first 5
values by default.
import pandas as pd
# Creating a Pandas Series
data = pd.Series([10, 20, 30, 40, 50, 60, 70, 80])
# Using head() method (default n=5)
print(data.head())
head() Function with Positive Argument
When a positive number n is provided, head() returns the first n rows of the Series
object. For example, head(7) returns the first 7 rows.
head() Function with Negative Argument
When a negative number -n is provided, head() returns all rows except the last n.
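A minimal sketch of head() with a positive and a negative argument. The Series values and variable names are chosen here for illustration.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50, 60, 70, 80])

first_seven = s.head(7)        # the first 7 rows
all_but_last_two = s.head(-2)  # every row except the last 2

print(first_seven)
print(all_but_last_two)
```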
2. tail() Method
The tail() method returns elements from the bottom of the Series.
Syntax:
<Series Object>.tail(n=5)
Example:
The operations below are performed on the same Series as in the head() examples.
tail() function without argument
If no argument is passed, tail() returns by default the last 5 values of the
Series object.
tail() function with positive and negative arguments
With a positive n, tail() returns the last n rows; with a negative -n, it returns all rows except the first n.
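A minimal sketch of tail() without an argument, with a positive argument, and with a negative argument. The Series values and variable names are chosen here for illustration.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50, 60, 70, 80])

last_five = s.tail()            # default n=5: the last 5 rows
last_three = s.tail(3)          # the last 3 rows
all_but_first_two = s.tail(-2)  # every row except the first 2

print(last_five)
print(last_three)
print(all_but_first_two)
```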
3. Vector operations
• Like NumPy arrays, Series support vectorized operations.
• An operation is applied to every element at once, without writing an explicit for loop. This is usually
called vectorization.
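A small sketch of vectorization on a Series, with example values chosen for illustration: each expression applies to all elements at once, with no loop.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40])

doubled = s * 2   # multiply every element by 2
shifted = s + 5   # add 5 to every element
mask = s > 25     # element-wise comparison gives a boolean Series

print(doubled)
print(shifted)
print(s[mask])    # keep only the elements greater than 25
```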
Mathematical operations on Pandas Series
1. You can perform arithmetic operations like addition, subtraction,
division, multiplication on two Series objects.
2. The operations are performed only on the matching indexes.
3. For all non-matching indexes, NaN (Not a Number) will be
returned.
Let us consider the following two Series S1 and S2. We will perform
mathematical operations on these Series.
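The index-matching rule can be sketched with two small Series. The values and labels below are hypothetical, not the S1 and S2 of the original slides.

```python
import pandas as pd

s1 = pd.Series([10, 20, 30], index=["a", "b", "c"])
s2 = pd.Series([1, 2, 3], index=["b", "c", "d"])

# Only labels "b" and "c" exist in both Series; "a" and "d" become NaN.
total = s1 + s2
print(total)

# The add() method accepts fill_value, used where a label is missing in one Series.
filled = s1.add(s2, fill_value=0)
print(filled)
```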
DataFrames
• Data sets in Pandas are usually multi-dimensional tables, called
DataFrames.
• If a Series is like a column, a DataFrame is the whole table.
• A Pandas DataFrame is a two-dimensional data structure, like a
two-dimensional array, or a table with rows and columns.
• Pandas uses the loc attribute to return one or more specified row(s).
• With the index argument, you can name your own indexes.
● A Pandas DataFrame is a two-dimensional, size-mutable, potentially
heterogeneous tabular data structure with labeled axes (rows and
columns).
● Data is aligned in a tabular fashion in rows and columns, like a
spreadsheet, a SQL table, or a dict of Series objects.
● A Pandas DataFrame consists of three principal components:
• Data
• Rows
• Columns
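A minimal sketch of the index argument and the loc attribute mentioned above. The row labels day1 to day3 are illustrative assumptions.

```python
import pandas as pd

data = {"calories": [420, 380, 390], "duration": [50, 40, 45]}

# Name the rows with the index argument.
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

row = df.loc["day2"]             # one label returns the row as a Series
rows = df.loc[["day1", "day3"]]  # a list of labels returns a DataFrame

print(row)
print(rows)
```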
Example 1: Create a DataFrame from a list
import pandas as pd
# list of strings
lst = ['welcome', 'to', 'gods', 'own', 'country']
# Calling DataFrame constructor on list
df = pd.DataFrame(lst)
print(df)
Output:
         0
0  welcome
1       to
2     gods
3      own
4  country
Example 2: Create a DataFrame from a dictionary of two lists:
import pandas as pd
data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}
myvar = pd.DataFrame(data)
print(myvar)
Output:
   calories  duration
0       420        50
1       380        40
2       390        45
Example 3: Create a DataFrame from a dict of arrays/lists.
# Python code demonstrating how to create a
# DataFrame from a dict of ndarrays / lists.
# By default, the index is range(n), where n is the length of the lists.
import pandas as pd
# initialise data of lists.
data = {'Name':['Tom', 'nick', 'krish', 'jack'],
'Age':[20, 21, 19, 18]}
# Create DataFrame
df = pd.DataFrame(data)
# Print the output.
print(df)
DATAFRAME OPERATIONS
Selection of column: The [ ] operator is used to select a column by
mentioning the respective column name.
# Import pandas package
import pandas as pd
# Define a dictionary containing employee data
data = {'Name':['Jai', 'Princi', 'Gaurav', 'Anuj'],
'Age':[27, 24, 22, 32],
'Address':['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
'Qualification':['Msc', 'MA', 'MCA', 'Phd']}
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
# select two columns (Name and Qualification are chosen here for illustration)
print(df[['Name', 'Qualification']])
How to Select Rows and Columns from a
Pandas DataFrame based on a condition?
Example 1: Selecting rows.
pandas.DataFrame.loc is a label-based indexer that can select rows from a Pandas
DataFrame based on a boolean condition.
Syntax: df.loc[df['cname'] <condition>]
Parameters:
● df: represents the DataFrame
● cname: represents the column name
● condition: represents the condition on which rows are selected (for example, == 'value')
Example for selecting rows
from pandas import DataFrame
# Creating a data frame
Data = {'Name': ['Mohe', 'Shyni', 'Parul', 'Sam'],
'ID': [12, 43, 54, 32],
'Place': ['Delhi', 'Kochi', 'Pune', 'Patna']
}
df = DataFrame(Data, columns = ['Name', 'ID', 'Place'])
# Print original data frame
print("Original data frame:\n")
print(df)
# Selecting the rows where Name is 'Mohe'
select_prod = df.loc[df['Name'] == 'Mohe']
print("\n")
# Print selected rows based on the condition
print("Selecting rows:\n")
print(select_prod)
Example for selecting column
# Importing pandas as pd
from pandas import DataFrame
# Creating a data frame
Data = {'Name': ['Mohe', 'Shyni', 'Parul', 'Sam'],
'ID': [12, 43, 54, 32],
'Place': ['Delhi', 'Kochi', 'Pune', 'Patna']
}
df = DataFrame(Data, columns = ['Name', 'ID', 'Place'])
# Print original data frame
print("Original data frame:")
print(df)
print("Selected columns: ")
print(df[['Name', 'ID']])
Add a New Column
Let’s create a DataFrame object to begin.
Method 1:
import pandas as pd
df = pd.DataFrame({'price': [3, 89, 45, 6], 'amount': [57, 42, 70, 43]})
df['total'] = df['price'] * df['amount']
Method 2
If you want to specify where your new column should be inserted in the DataFrame, you can use
the DataFrame.insert() method. The insert method has four parameters:
insert(loc, column, value, allow_duplicates)
● loc: the column insertion index
● column: new column label
● value: the values for the new column
● allow_duplicates: (optional, default False) if False, insert() raises a ValueError when a column with the same label
already exists
Starting from a DataFrame that does not yet have a 'total' column, we can insert it at index 0 using the following code.
df.insert(0, 'total', df['price']*df['amount'], False)
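A minimal sketch of insert() on a fresh copy of the price/amount DataFrame. It also shows what happens when the label already exists; the variable duplicate_error is introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'price': [3, 89, 45, 6], 'amount': [57, 42, 70, 43]})

# Insert the computed column at position 0 (the first column).
df.insert(0, 'total', df['price'] * df['amount'])
print(df)

# A second insert with the same label fails unless allow_duplicates=True.
duplicate_error = None
try:
    df.insert(1, 'total', 0)
except ValueError as exc:
    duplicate_error = str(exc)
print(duplicate_error)
```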
Delete a Column
● A common way to delete DataFrame columns in Pandas is with the DataFrame.drop() method.
● The drop method is very flexible and can be used to drop specific rows or columns.
● It can also drop multiple columns at a time, selected either by position (through df.columns) or by name.
The main parameters are:
● labels: index or column labels to drop
● axis: whether to drop labels from the index (0 or 'index') or columns (1 or 'columns')
● inplace: if True, complete the operation in place and return None (the DataFrame itself is
changed permanently)
# Drop a single column by name (recent pandas versions require axis as a keyword)
df.drop('total', axis=1, inplace=True)
# Alternatively, drop several columns by position (needs at least three columns)
df.drop(df.columns[[1, 2]], axis=1, inplace=True)
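A sketch of both ways of dropping columns, by name and by position. It does not use inplace, so each call returns a new DataFrame and the original stays unchanged. The three-column DataFrame is rebuilt here so the example is self-contained.

```python
import pandas as pd

df = pd.DataFrame({'total': [171, 3738, 3150, 258],
                   'price': [3, 89, 45, 6],
                   'amount': [57, 42, 70, 43]})

# Drop one column by name; columns= is equivalent to labels with axis=1.
without_total = df.drop(columns='total')

# Drop the second and third columns by position.
without_last_two = df.drop(df.columns[[1, 2]], axis=1)

print(without_total)
print(without_last_two)
```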
Rename a Column
The simplest way to achieve this in Pandas is with the DataFrame.rename() method.
● columns: dictionary-like transformations to apply to the column labels
● inplace: if True, complete the operation inplace and return None
df.rename(columns={'amount': 'quantity'}, inplace=True)
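A short sketch of rename() without inplace, assuming the same price/amount columns as above: the call returns a renamed copy and leaves the original DataFrame untouched.

```python
import pandas as pd

df = pd.DataFrame({'price': [3, 89, 45, 6], 'amount': [57, 42, 70, 43]})

# Map old column labels to new ones; labels not in the dict are kept as-is.
renamed = df.rename(columns={'amount': 'quantity'})

print(renamed.columns.tolist())
print(df.columns.tolist())
```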
Binary Operations of Data frame
1. ADDITION OF TWO DATA FRAMES
● The Python library pandas offers several methods to handle two-dimensional data
through the DataFrame class.
● Two DataFrames can be added by using the add() method of the pandas DataFrame class.
● Calling the add() method is similar to using the + operator. However, the add() method
can be passed a fill_value, which is substituted for a value missing (NaN) in one of the DataFrames before the addition.
radd()
• You can also use radd(), which works the same as add() with the operands reversed:
A.add(B) computes A+B, while A.radd(B) computes B+A. (This makes no
difference for addition, but the reversed variants matter for subtraction and division.)
import pandas as pd
dataSet1 = [(10, 20, 30),
(40, 50, 60),
(70, 80, 90)]
dataFrame1 = pd.DataFrame(data=dataSet1)
dataSet2 = [(5, 15, 25),
(35, 45, 55),
(65, 75, 85)]
dataFrame2 = pd.DataFrame(data=dataSet2)
print("DataFrame1:")
print(dataFrame1)
print("DataFrame2:")
print(dataFrame2)
result = dataFrame1.add(dataFrame2)
print("Result of adding two pandas dataframes:")
print(result)
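The difference between + and add() with fill_value can be sketched with two DataFrames whose columns only partly overlap. The frames a and b are hypothetical examples, not the data used above.

```python
import pandas as pd

a = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
b = pd.DataFrame({"x": [10, 20], "z": [5, 6]})

# With +, columns present in only one frame become NaN.
plain = a + b
print(plain)

# With fill_value=0, a value missing in one frame is treated as 0.
filled = a.add(b, fill_value=0)
print(filled)

# radd() reverses the operands: a.radd(b) computes b + a.
reversed_sum = a.radd(b, fill_value=0)
print(reversed_sum)
```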
2. Subtracting A Pandas DataFrame From Another
DataFrame
● The pandas library provides a multitude of functions to work on two-dimensional
data through the DataFrame class.
● The sub() method of a pandas DataFrame subtracts the elements of one DataFrame
from the elements of another DataFrame.
● Invoking the sub() method on a DataFrame object is equivalent to using the binary
subtraction operator (-).
● The sub() method accepts a fill_value parameter for missing values (np.nan, None).
● rsub(): A.sub(B) computes A-B, while A.rsub(B) computes B-A.
SYNTAX
● dataFrame1.rsub(dataFrame2) (computes dataFrame2 - dataFrame1)
import pandas as pd
# Create Data
data1 = [(2, 4, 6, 8),
(1, 3, 5, 7),
(5, 0, 0, 9)]
data2 = [(1, 1, 0 , 1),
(1, 0, 1 , 1),
(0, 1, 1 , 0)]
# Construct DataFrame1
dataFrame1 = pd.DataFrame(data=data1)
print("DataFrame1:")
print(dataFrame1)
# Construct DataFrame2
dataFrame2 = pd.DataFrame(data=data2)
print("DataFrame2:")
print(dataFrame2)
# Subtracting DataFrame2 from DataFrame1
subtractionResults = dataFrame1 - dataFrame2
print("Result of subtracting dataFrame2 from dataFrame1:")
print(subtractionResults)
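A minimal sketch contrasting sub() and rsub() on two small DataFrames. The frames a and b are illustrative values, not the data used above.

```python
import pandas as pd

a = pd.DataFrame([[2, 4], [1, 3]])
b = pd.DataFrame([[1, 1], [1, 0]])

forward = a.sub(b)    # a - b
backward = a.rsub(b)  # b - a

print(forward)
print(backward)
```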
3. Multiplying A DataFrame With Another DataFrame, Series Or A
Python Sequence
• The mul() method of a DataFrame multiplies the elements of the DataFrame
element-wise with another DataFrame, a pandas Series, or any other Python sequence.
• Calling the mul() method is similar to using the binary multiplication operator (*).
• The mul() method provides a fill_value parameter, which supplies a value to
substitute for np.nan or None values present in the data.
• rmul(): A.mul(B) computes A*B, while A.rmul(B) computes B*A.
SYNTAX
• dataFrame1.rmul(dataFrame2)
Example:
# Multiply two DataFrames
multiplicationResults = dataFrame1.mul(dataFrame2)
print("Result of element-wise multiplication of two Data Frames:")
print(multiplicationResults)
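Multiplying by a Python sequence or a Series, as described above, can be sketched as follows. The DataFrame and the factors are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# A list is matched against the columns: column a * 10, column b * 100.
by_list = df.mul([10, 100])

# A Series is aligned on the column labels by default.
by_series = df.mul(pd.Series([10, 100], index=["a", "b"]))

# With axis=0 the Series is aligned on the row index: row 0 * 2, row 1 * 3.
by_rows = df.mul(pd.Series([2, 3]), axis=0)

print(by_list)
print(by_series)
print(by_rows)
```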
4. Dataframe Division Operations
• The div() method performs element-wise division of one pandas DataFrame by another.
• DataFrame elements can be divided by a pandas Series or by a Python sequence as well.
• Calling div() on a DataFrame instance is equivalent to using the division operator (/).
• The div() method provides the fill_value parameter, which replaces np.nan
and None values present in either operand with another value before the division.
• rdiv(): A.div(B) computes A/B, while A.rdiv(B) computes B/A.
SYNTAX
• dataFrame1.rdiv(dataFrame2)
Example:
# Divide the DataFrame1 elements by the elements of DataFrame2
divisionResults = dataFrame1.div(dataFrame2)
print("Elements of DataFrame1:")
print(dataFrame1)
print("Elements of DataFrame2:")
print(dataFrame2)
print("DataFrame1 elements divided by DataFrame2 elements:")
print(divisionResults)
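A small sketch of div() and rdiv(), including what happens when a divisor is zero. The frames a and b are illustrative values.

```python
import pandas as pd

a = pd.DataFrame({"x": [10.0, 20.0], "y": [30.0, 40.0]})
b = pd.DataFrame({"x": [2.0, 4.0], "y": [5.0, 0.0]})

quotient = a.div(b)   # a / b; dividing a non-zero value by 0 gives inf
reverse = a.rdiv(b)   # b / a

print(quotient)
print(reverse)
```

Division by zero does not raise an error here; pandas returns inf for that element.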
Transport modelling at SBB, presentation at EPFL in 2025
Antonin Danalet
 
ML_Unit_V_RDC_ASSOCIATION AND DIMENSIONALITY REDUCTION.pdf
ML_Unit_V_RDC_ASSOCIATION AND DIMENSIONALITY REDUCTION.pdfML_Unit_V_RDC_ASSOCIATION AND DIMENSIONALITY REDUCTION.pdf
ML_Unit_V_RDC_ASSOCIATION AND DIMENSIONALITY REDUCTION.pdf
rameshwarchintamani
 
Generative AI & Large Language Models Agents
Generative AI & Large Language Models AgentsGenerative AI & Large Language Models Agents
Generative AI & Large Language Models Agents
aasgharbee22seecs
 
Little Known Ways To 3 Best sites to Buy Linkedin Accounts.pdf
Little Known Ways To 3 Best sites to Buy Linkedin Accounts.pdfLittle Known Ways To 3 Best sites to Buy Linkedin Accounts.pdf
Little Known Ways To 3 Best sites to Buy Linkedin Accounts.pdf
gori42199
 
Ad

introduction to data structures in pandas

  • 1. Module 4: Introduction to data structures in Pandas
  • 2. Introduction
       ● Pandas is an open-source library for working with relational or labeled data easily and intuitively.
       ● It provides data structures and operations for manipulating numerical data and time series.
       ● It offers tools for cleaning and processing data.
       ● It is one of the most widely used Python libraries for data analysis.
       ● It provides two main data structures:
         ● Series
         ● DataFrame
  • 3. What is a Series?
       • A Pandas Series is like a column in a table.
       • It is a one-dimensional array holding data of any type.
       • If nothing else is specified, the values are labeled with their index number.
       • The first value has index 0, the second value has index 1, and so on.
       • This label can be used to access a specified value.
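  • 4. Syntax: pandas.Series(data=None, index=None, dtype=None, name=None, copy=False)
       data: array-like, iterable, dict, or scalar value; the data stored in the Series
       index: array-like or Index (1d); the labels for the values
       dtype: str, numpy.dtype, or ExtensionDtype, optional; the data type of the values
       name: str, optional; the name given to the Series
       copy: bool, default False; whether to copy the input data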
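The constructor parameters listed above can be combined in one call. The following is a minimal sketch; the values, labels, and the name "temperature" are illustrative assumptions and do not come from the slides.

```python
import pandas as pd

# Illustrative data (not from the slides): three readings with custom labels.
temps = pd.Series(
    data=[21, 23, 19],             # values stored in the Series
    index=["mon", "tue", "wed"],   # one label per value
    dtype="float64",               # force floating-point storage
    name="temperature",            # name attached to the whole Series
)

print(temps)
print(temps.name)    # temperature
print(temps.dtype)   # float64
```

Because dtype is given explicitly, the integer inputs are stored as floats, and the labels can be used for access, for example temps["tue"].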
  • 5. Example 1:
       import pandas as pd
       # a simple list of characters (named chars to avoid shadowing the built-in list)
       chars = ['h', 'e', 'l', 'l', 'o']
       # create a Series from the list
       res = pd.Series(chars)
       print(res)
       Output:
       0    h
       1    e
       2    l
       3    l
       4    o
       dtype: object
  • 6. Example 2: Create a simple Pandas Series from a list:
       import pandas as pd
       a = [1, 7, 2]
       myvar = pd.Series(a)
       print(myvar)
       print(myvar[0])
       Output:
       0    1
       1    7
       2    2
       dtype: int64
       1
  • 7. Example 3: Create labels with the index argument
       import pandas as pd
       a = [1, 7, 2]
       myvar = pd.Series(a, index=["x", "y", "z"])
       print(myvar)
       Output:
       x    1
       y    7
       z    2
       dtype: int64
  • 8. Key/Value Objects as Series
       Create a simple Pandas Series from a dictionary:
       Example 1
       import pandas as pd
       calories = {"day1": 420, "day2": 380, "day3": 390}
       myvar = pd.Series(calories)
       print(myvar)
       Output:
       day1    420
       day2    380
       day3    390
       dtype: int64
  • 9. Example 2
       import pandas as pd
       dic = {'Id': 1013, 'Name': 'Mohit', 'State': 'Manipal', 'Age': 24}
       res = pd.Series(dic)
       print(res)
       Output:
       Id          1013
       Name       Mohit
       State    Manipal
       Age           24
       dtype: object
  • 10. Operations on a Series
       • A Pandas Series provides two very useful methods for extracting data from the top and bottom of the Series object.
       • These methods are head() and tail().
       1. head() method
       • The head() method returns elements from the top of the Series. By default, it returns the first 5 elements.
       Syntax: <Series object>.head(n=5)
  • 11. Example: Consider the following Series S. We will perform the operations on this Series.
  • 12. head() function without argument
       If we do not pass any argument to head(), it returns the first 5 values from the top by default.
       import pandas as pd
       # Creating a Pandas Series
       data = pd.Series([10, 20, 30, 40, 50, 60, 70, 80])
       # Using head() method (default n=5)
       print(data.head())
  • 13. head() function with positive argument
       When a positive number n is provided, head() returns the top n rows of the Series object. In the slide's example, n is 7, so 7 rows from the top are extracted.
       head() function with negative argument
       When a negative number n is provided, head() returns all rows except the last |n|.
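The original slide showed its positive and negative examples as images, which are not preserved in this transcript. As a stand-in, here is a minimal sketch that reuses the eight-element Series from the previous slide; the variable names are assumptions.

```python
import pandas as pd

data = pd.Series([10, 20, 30, 40, 50, 60, 70, 80])

# Positive argument: the first 7 rows.
first_seven = data.head(7)

# Negative argument: every row except the last 2.
all_but_last_two = data.head(-2)

print(first_seven)
print(all_but_last_two)
```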
  • 14. 2. tail() method
       The tail() method returns elements from the bottom of the Series. By default, it returns the last 5 elements.
       Syntax: <Series object>.tail(n=5)
  • 15. Example: Consider the following Series S. We will perform the operations on this Series.
  • 16. tail() function without argument
       If we do not provide any argument, tail() returns by default the last 5 values from the bottom of the Series object.
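  • 17. tail() function with positive and negative arguments
       With a positive n, tail() returns the last n rows. With a negative n, it returns all rows except the first |n|.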
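The examples for this slide were images and are missing from the transcript. The sketch below shows the three cases on an assumed eight-element Series; the data and variable names are illustrative.

```python
import pandas as pd

data = pd.Series([10, 20, 30, 40, 50, 60, 70, 80])

# No argument: the last 5 rows.
last_five = data.tail()

# Positive argument: the last 3 rows.
last_three = data.tail(3)

# Negative argument: every row except the first 2.
all_but_first_two = data.tail(-2)

print(last_five)
print(last_three)
print(all_but_first_two)
```

Note that tail() keeps the original index labels, so last_three is labeled 5, 6, 7 rather than 0, 1, 2.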
  • 18. 3. Vector operations
       • Like NumPy arrays, Series support vector operations.
       • An operation is applied to every element of the Series at once, without writing any for loops. This is usually called vectorization.
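A short sketch of what vectorization looks like in practice. The Series contents and variable names are assumptions chosen for illustration.

```python
import pandas as pd

values = pd.Series([1, 2, 3, 4])

# Each expression is applied to every element; no explicit loop is written.
doubled = values * 2          # element-wise multiplication by a scalar
shifted = values + 10         # element-wise addition of a scalar
is_even = values % 2 == 0     # element-wise comparison yields a boolean Series

print(doubled)
print(shifted)
print(is_even)
```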
  • 19. Mathematical operations on Pandas Series
       1. You can perform arithmetic operations such as addition, subtraction, multiplication, and division on two Series objects.
       2. The operations are performed only on the matching index labels.
       3. For every non-matching index label, the result is NaN (Not a Number).
       Let us consider the following two Series, S1 and S2. We will perform mathematical operations on these Series.
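The Series S1 and S2 were shown as images and are not available in this transcript. The sketch below uses two assumed Series with partly overlapping labels to illustrate the alignment rule; the use of add() with fill_value at the end is an optional extra, not something the slide describes.

```python
import pandas as pd

# Assumed stand-ins for S1 and S2: labels "b" and "c" match, "a" and "d" do not.
s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["b", "c", "d"])

# The + operator aligns on labels; non-matching labels produce NaN.
total = s1 + s2

# add() with fill_value treats a label missing from one side as 0.
filled = s1.add(s2, fill_value=0)

print(total)
print(filled)
```

The result carries the union of both indexes, and its dtype becomes float64 because NaN cannot be stored in an integer Series.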
  • 28. DataFrames
       • Data sets in Pandas are usually multi-dimensional tables, called DataFrames.
       • A Series is like a column; a DataFrame is the whole table.
       • A Pandas DataFrame is a 2-dimensional data structure, like a 2-dimensional array or a table with rows and columns.
       • Pandas uses the loc attribute to return one or more specified rows.
       • With the index argument, you can name your own indexes.
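The last two points can be sketched together: naming the rows with the index argument and then retrieving them with loc. The labels "day1" to "day3" are assumptions; the column data matches the calories/duration example used elsewhere in the deck.

```python
import pandas as pd

data = {"calories": [420, 380, 390], "duration": [50, 40, 45]}

# The index argument replaces the default 0, 1, 2 row labels.
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

# A single label returns that row as a Series.
one_row = df.loc["day2"]

# A list of labels returns a DataFrame with those rows.
two_rows = df.loc[["day1", "day3"]]

print(one_row)
print(two_rows)
```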
  • 29. ● A Pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns).
       ● A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns, like a spreadsheet, an SQL table, or a dict of Series objects.
       ● A Pandas DataFrame consists of three principal components:
         • Data
         • Rows
         • Columns
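  • 30. Example 1:
       import pandas as pd
       # list of strings
       lst = ['welcome', 'to', 'gods', 'own', 'country']
       # Calling DataFrame constructor on list
       df = pd.DataFrame(lst)
       # print() works in a plain script; display() is only available in Jupyter/IPython
       print(df)
       Output:
                0
       0  welcome
       1       to
       2     gods
       3      own
       4  country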
  • 31. 2. Create a DataFrame from a dictionary of lists (each list becomes a column):
       import pandas as pd
       data = {
           "calories": [420, 380, 390],
           "duration": [50, 40, 45]
       }
       myvar = pd.DataFrame(data)
       print(myvar)
       Output:
          calories  duration
       0       420        50
       1       380        40
       2       390        45
  • 32. 3. Creating a DataFrame from a dict of arrays/lists
       # Create a DataFrame from a dict of ndarrays/lists.
       # By default, the row labels are 0, 1, 2, ...
       import pandas as pd
       # initialise data of lists
       data = {'Name': ['Tom', 'nick', 'krish', 'jack'],
               'Age': [20, 21, 19, 18]}
       # Create DataFrame
       df = pd.DataFrame(data)
       # Print the output (display() is only available in Jupyter/IPython)
       print(df)
  • 34. Selection of column: The [ ] operator is used to select a column by mentioning the respective column name.
       # Import pandas package
       import pandas as pd
       # Define a dictionary containing employee data
       data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
               'Age': [27, 24, 22, 32],
               'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
               'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}
       # Convert the dictionary into DataFrame
       df = pd.DataFrame(data)
       # select two columns
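The slide's code stops at the "select two columns" comment, and the transcript does not show which columns were chosen. The sketch below is therefore an assumption: it picks Name and Qualification from a shortened version of the same employee data to show the two forms of the [ ] operator.

```python
import pandas as pd

# Shortened version of the employee data; the column choice below is assumed.
df = pd.DataFrame({
    "Name": ["Jai", "Princi", "Gaurav", "Anuj"],
    "Age": [27, 24, 22, 32],
    "Qualification": ["Msc", "MA", "MCA", "Phd"],
})

# A single column name returns a Series.
ages = df["Age"]

# A list of column names returns a DataFrame with just those columns.
subset = df[["Name", "Qualification"]]

print(ages)
print(subset)
```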
  • 35. How to select rows and columns from a Pandas DataFrame based on a condition?
       Example 1: Selecting rows.
       pandas.DataFrame.loc is a label-based indexer that can also select rows from a DataFrame based on a boolean condition.
       Syntax: df.loc[df['cname'] <condition>]
       Parameters:
       ● df: the DataFrame
       ● cname: the column name
       ● condition: the condition on which rows are selected, for example == 'Mohe'
  • 36. Example for selecting rows
       from pandas import DataFrame
       # Creating a data frame
       Data = {'Name': ['Mohe', 'Shyni', 'Parul', 'Sam'],
               'ID': [12, 43, 54, 32],
               'Place': ['Delhi', 'Kochi', 'Pune', 'Patna']}
       df = DataFrame(Data, columns=['Name', 'ID', 'Place'])
       # Print original data frame
       print("Original data frame:\n")
       print(df)
       # Select the rows where Name is 'Mohe'
       select_prod = df.loc[df['Name'] == 'Mohe']
       print("\n")
       # Print selected rows based on the condition
       print("Selecting rows:\n")
       print(select_prod)
  • 37. Example for selecting columns
       # Importing DataFrame from pandas
       from pandas import DataFrame
       # Creating a data frame
       Data = {'Name': ['Mohe', 'Shyni', 'Parul', 'Sam'],
               'ID': [12, 43, 54, 32],
               'Place': ['Delhi', 'Kochi', 'Pune', 'Patna']}
       df = DataFrame(Data, columns=['Name', 'ID', 'Place'])
       # Print original data frame
       print("Original data frame:")
       print(df)
       # Select the Name and ID columns
       print("Selected columns:")
       print(df[['Name', 'ID']])
  • 38. Add a New Column
       Let’s create a DataFrame object to begin.
       Method 1:
       import pandas as pd
       df = pd.DataFrame({'price': [3, 89, 45, 6], 'amount': [57, 42, 70, 43]})
       # New column computed element-wise from two existing columns
       df['total'] = df['price'] * df['amount']
  • 39. Method 2
       If you want to specify where your new column should be inserted in the DataFrame, you can use the DataFrame.insert() method.
       The insert method has four parameters: insert(loc, column, value, allow_duplicates)
       ● loc: the column insertion index
       ● column: new column label
       ● value: desired row data
       ● allow_duplicates: (optional, default False) if False, insert() raises a ValueError when a column with the same label already exists
       We can insert our new 'total' column at index 0 in our DataFrame object using the following code. If 'total' was already added with Method 1, drop it first, or the call raises a ValueError.
       df.insert(0, 'total', df['price']*df['amount'], False)
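A self-contained sketch of insert(), starting from a fresh DataFrame so that no 'total' column exists yet. The second call is included only to show the duplicate-label check; the flag name duplicate_rejected is an assumption.

```python
import pandas as pd

df = pd.DataFrame({"price": [3, 89, 45, 6], "amount": [57, 42, 70, 43]})

# Insert the computed column at position 0; insert() modifies df in place.
df.insert(0, "total", df["price"] * df["amount"])

# Inserting the same label again is rejected while allow_duplicates is False.
try:
    df.insert(1, "total", 0)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(df)
print(duplicate_rejected)
```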
  • 40. Delete a Column
       ● The usual way to delete DataFrame columns in Pandas is with the DataFrame.drop() method.
       ● The drop method is very flexible and can be used to drop specific rows or columns.
       ● It can also drop multiple columns at a time, selected either by column position (through df.columns) or by column name.
       Parameters:
       ● labels: index or column labels to drop
       ● axis: whether to drop labels from the index (0 or 'index') or columns (1 or 'columns')
       ● inplace: if True, modify the DataFrame in place and return None; if False (the default), return a new DataFrame
       # axis must be passed by keyword; pandas 2.0 and later reject it as a positional argument
       df.drop('total', axis=1, inplace=True)
       # drop by position; assumes the DataFrame has at least three columns
       df.drop(df.columns[[1, 2]], axis=1, inplace=True)
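The slide says drop() can remove rows as well as columns but only shows columns. The following sketch covers both, using the columns= and index= keywords as an alternative to labels plus axis, and leaves inplace at its default so each call returns a new DataFrame. The data and variable names are assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "price": [3, 89, 45, 6],
    "amount": [57, 42, 70, 43],
    "total": [171, 3738, 3150, 258],
})

# Drop one column by name; df itself is left unchanged.
without_total = df.drop(columns="total")

# Drop several columns selected by position.
only_total = df.drop(columns=df.columns[[0, 1]])

# Drop a row by its index label.
without_first_row = df.drop(index=0)

print(without_total)
print(only_total)
print(without_first_row)
```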
  • 41. Rename a Column
       The simplest way to achieve this in Pandas is with the DataFrame.rename() method.
       ● columns: dictionary-like transformations to apply to the column labels
       ● inplace: if True, complete the operation in place and return None
       df.rename(columns={'amount': 'quantity'}, inplace=True)
  • 42. Binary Operations of Data frame
  • 43. 1. ADDITION OF TWO DATA FRAMES
       ● The Python library pandas offers several methods to handle two-dimensional data through the DataFrame class.
       ● Two DataFrames can be added by using the add() method of the DataFrame class.
       ● Calling add() is similar to using the + operator. However, add() also accepts a fill_value, which is substituted for missing (NaN) values before the addition.
       radd()
       • radd() works the same way as add() with the operands swapped: for A+B use A.add(B), and for B+A use A.radd(B). (Order makes no difference for addition, but the same pattern matters for subtraction and division.)
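To make the difference between + and add() with a fill value concrete, here is a small sketch with one missing entry. The DataFrames and the choice of fill_value=0 are assumptions for illustration.

```python
import pandas as pd

# df1 has one missing value in column "b".
df1 = pd.DataFrame({"a": [1, 2], "b": [3.0, float("nan")]})
df2 = pd.DataFrame({"a": [10, 20], "b": [30.0, 40.0]})

# Plain + propagates the NaN into the result.
plain = df1 + df2

# add() with fill_value=0 treats the missing entry as 0 before adding.
filled = df1.add(df2, fill_value=0)

print(plain)
print(filled)
```

If a value is missing in both DataFrames at the same position, the result at that position stays NaN even when fill_value is given.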
  • 44. import pandas as pd
       dataSet1 = [(10, 20, 30), (40, 50, 60), (70, 80, 90)]
       dataFrame1 = pd.DataFrame(data=dataSet1)
       dataSet2 = [(5, 15, 25), (35, 45, 55), (65, 75, 85)]
       dataFrame2 = pd.DataFrame(data=dataSet2)
       print("DataFrame1:")
       print(dataFrame1)
       print("DataFrame2:")
       print(dataFrame2)
       # Element-wise addition of the two DataFrames
       result = dataFrame1.add(dataFrame2)
       print("Result of adding two pandas dataframes:")
       print(result)
  • 46. 2. Subtracting A Pandas DataFrame From Another DataFrame
       ● The pandas library provides a multitude of functions to work on two-dimensional data through the DataFrame class.
       ● The sub() method subtracts, element by element, another DataFrame from the DataFrame it is called on: A.sub(B) computes A-B.
       ● Invoking sub() on a DataFrame object is equivalent to using the binary subtraction operator (-).
       ● The sub() method accepts a fill_value parameter for missing values (np.nan, None).
       ● rsub(): for A-B use A.sub(B); for B-A use A.rsub(B).
       SYNTAX
       ● dataFrame1.rsub(dataFrame2), which computes dataFrame2 - dataFrame1
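A minimal sketch contrasting sub() and rsub() on two small DataFrames. The names a and b and their values are assumptions.

```python
import pandas as pd

a = pd.DataFrame({"x": [10, 20], "y": [30, 40]})
b = pd.DataFrame({"x": [1, 2], "y": [3, 4]})

# a.sub(b) is the same as a - b.
a_minus_b = a.sub(b)

# a.rsub(b) swaps the operands, giving b - a.
b_minus_a = a.rsub(b)

print(a_minus_b)
print(b_minus_a)
```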
  • 47. import pandas as pd
       # Create Data
       data1 = [(2, 4, 6, 8), (1, 3, 5, 7), (5, 0, 0, 9)]
       data2 = [(1, 1, 0, 1), (1, 0, 1, 1), (0, 1, 1, 0)]
       # Construct DataFrame1
       dataFrame1 = pd.DataFrame(data=data1)
       print("DataFrame1:")
       print(dataFrame1)
       # Construct DataFrame2
       dataFrame2 = pd.DataFrame(data=data2)
       print("DataFrame2:")
       print(dataFrame2)
       # Subtracting DataFrame2 from DataFrame1
       subtractionResults = dataFrame1 - dataFrame2
       print("Result of subtracting dataFrame2 from dataFrame1:")
       print(subtractionResults)
  • 49. 3. Multiplying A DataFrame With Another DataFrame, Series Or A Python Sequence
       • The mul() method performs element-wise multiplication of a DataFrame with another DataFrame, a pandas Series, or a Python sequence.
       • Calling mul() is similar to using the binary multiplication operator (*).
       • The mul() method provides a fill_value parameter, whose value replaces np.nan and None entries in the data before multiplying.
       • rmul(): for A*B use A.mul(B); for B*A use A.rmul(B).
       SYNTAX
       • dataFrame1.rmul(dataFrame2)
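The deck's own example multiplies two DataFrames. As a complement, this sketch multiplies a DataFrame by a Series, once per column and once per row. The data, the factor values, and the variable names are assumptions.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})

# Default axis ("columns"): the Series index is matched to the column labels,
# so column x is scaled by 10 and column y by 100.
col_factors = pd.Series({"x": 10, "y": 100})
by_column = df.mul(col_factors)

# axis=0: the Series index is matched to the row labels,
# so row 0 is scaled by 1, row 1 by 2, and row 2 by 3.
row_factors = pd.Series([1, 2, 3])
by_row = df.mul(row_factors, axis=0)

print(by_column)
print(by_row)
```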
  • 50. Example:
       # Multiply two DataFrames (dataFrame1 and dataFrame2 as constructed in the subtraction example)
       multiplicationResults = dataFrame1.mul(dataFrame2)
       print("Result of element-wise multiplication of two Data Frames:")
       print(multiplicationResults)
  • 51. 4. DataFrame Division Operations
       • The div() method performs element-wise division of one pandas DataFrame by another.
       • DataFrame elements can be divided by a pandas Series or by a Python sequence as well.
       • Calling div() on a DataFrame instance is equivalent to using the division operator (/).
       • The div() method provides a fill_value parameter, whose value replaces np.nan and None entries in either DataFrame before dividing.
       • rdiv(): for A/B use A.div(B); for B/A use A.rdiv(B).
       SYNTAX
       • dataFrame1.rdiv(dataFrame2), which computes dataFrame2 / dataFrame1
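A small sketch of div(), div() with fill_value, and rdiv(). The DataFrames a and b, the missing entry, and the choice of fill_value=1 are assumptions for illustration.

```python
import pandas as pd

# a has one missing value in column "y".
a = pd.DataFrame({"x": [10.0, 20.0], "y": [30.0, float("nan")]})
b = pd.DataFrame({"x": [2.0, 4.0], "y": [5.0, 8.0]})

# a / b: the missing entry stays NaN.
a_over_b = a.div(b)

# The missing entry in a is replaced by 1 before dividing, giving 1 / 8.
a_over_b_filled = a.div(b, fill_value=1)

# rdiv swaps the operands: this is b / a.
b_over_a = a.rdiv(b)

print(a_over_b)
print(a_over_b_filled)
print(b_over_a)
```

When the divisor contains zeros, pandas does not raise an error: a non-zero value divided by zero gives inf, and zero divided by zero gives NaN.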
  • 52. Example:
       # Divide the DataFrame1 elements by the elements of DataFrame2
       divisionResults = dataFrame1.div(dataFrame2)
       print("Elements of DataFrame1:")
       print(dataFrame1)
       print("Elements of DataFrame2:")
       print(dataFrame2)
       print("DataFrame1 elements divided by DataFrame2 elements:")
       print(divisionResults)