The Power of Python Data Structures on the Road to Data Science
Essentials; Powerful Tools, Endless Possibilities...
Python is one of the most popular programming languages, and one of the most important factors behind this popularity is the wide range of built-in data structures it offers. In this article, we take an in-depth look at Python's basic data structures and their uses, to help you improve your programming skills and manage your data with confidence. Data structures are critical for organising, manipulating and analysing data in data science projects.
In this article, we will examine the most important data structures in Python and how they are used in data science.
Overview
Python has a number of basic data types:
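Numbers: Integers and floating-point values (e.g., 42, 3.14)
Text Strings: Sequences of characters (e.g., "apple")
Boolean: Logical values (True or False)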
To organize data in more complex ways, the following data structures can be used:
List: Ordered collections of data (e.g., [1, 2, 3], ["apple", "banana", "pear"])
Dictionary: Collections that store key-value pairs (e.g., {"name": "John", "age": 25})
Tuple: Immutable ordered collections of data (e.g., (1, 2, 3), ("apple", "banana", "pear"))
Set: Collections that store unique values (e.g., {1, 2, 3}, {"apple", "banana", "pear"})
These basic data types can be used in conjunction with the programming language’s built-in constructs for lists, dictionaries, tuples, and sets to create more complex data structures.
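As a rough illustration of how these built-in types combine, the sketch below nests dictionaries inside a list to model a small table of records. The names `people`, `all_languages` and `ages`, and all of the values, are invented for this example.

```python
# A list of dictionaries: each dictionary is one record (hypothetical data)
people = [
    {"name": "John", "age": 25, "languages": ("Python", "SQL")},
    {"name": "Maria", "age": 31, "languages": ("Python", "R")},
]

# Collect the unique languages across all records using a set
all_languages = set()
for person in people:
    all_languages.update(person["languages"])

# Pull one "column" out of the records with a list comprehension
ages = [person["age"] for person in people]

print(sorted(all_languages))  # ['Python', 'R', 'SQL']
print(ages)                   # [25, 31]
```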
Why Are Data Structures Important?
Data structures are important for a number of reasons, including:
In this article, we will take a closer look at the most commonly used data structures in Python and how they are used.
Are you ready?
Basic Data Structures
Lists:
Lists can be used to store different data types (numbers, texts, other lists, etc.) in a single variable. Lists are defined using square brackets ([]) and elements are separated by commas (,).
Key Features:
# Create a list containing different types of data
data = ["Ahmet", 25, True, ["apple", "pear"]]
# Select an element from a list
name = data[0]
# Add an element to a list
data.append(3.14)
# Print elements in a list
for item in data:
    print(item)
'''
Ahmet
25
True
['apple', 'pear']
3.14
'''
List Operations:
- Element Access: Elements in the list can be accessed by index. Indexing starts at 0, and negative indices count from the end, so -1 is the last element.
# A list containing texts
fruits = ["apple", "banana", "orange", "strawberry"]
# A list containing numbers
numbers = [1, 2, 3, 4, 5]
# First element
print(fruits[0]) # apple
# Last element
print(fruits[-1]) # strawberry
# Second element
print(numbers[1]) # 2
- Slicing: You can get a specific part of a list with the [start:stop] operator. The element at the stop index is not included.
# First two elements
print(numbers[:2]) # [1, 2]
# The last three elements
print(fruits[-3:]) # ['banana', 'orange', 'strawberry']
# A certain range
print(data[1:3]) # [25, True]
'''
[1, 2]
['banana', 'orange', 'strawberry']
[25, True]
'''
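Slices also accept an optional third value, the step. The short sketch below uses a made-up list named `values` to show a few common patterns. A slice always produces a new list and leaves the original unchanged.

```python
values = [10, 20, 30, 40, 50, 60]

every_second = values[::2]    # from start to end, taking every 2nd element
reversed_copy = values[::-1]  # a negative step walks the list backwards
middle = values[1:-1]         # everything except the first and last element

print(every_second)   # [10, 30, 50]
print(reversed_copy)  # [60, 50, 40, 30, 20, 10]
print(middle)         # [20, 30, 40, 50]
print(values)         # [10, 20, 30, 40, 50, 60] (unchanged)
```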
# Add an element to the end of a list
numbers.append(6)
# Add an element to a specific index
fruits.insert(1, "Orange")
'''
[1, 2, 3, 4, 5, 6]
['apple', 'Orange', 'banana', 'orange', 'strawberry']
'''
# Delete the last element of the list
numbers.pop()
# Delete the first element equal to a given value
fruits.remove("orange")
'''
[1, 2, 3, 4, 5]
['apple', 'Orange', 'banana', 'strawberry']
'''
'''
# List length
print(len(fruits)) # 4
# The biggest element
print(max(numbers)) # 5
# Smallest element (uppercase letters sort before lowercase ones)
print(min(fruits)) # Orange
# Sum of elements
print(sum(numbers)) # 15
List Use Cases:
# Create a shopping list
shopping_list = ["Apple", "Banana", "Milk", "Egg"]
# Create a list containing the names of contacts
names = ["Ahmet", "Ayşe", "Fatma", "Mehmet"]
# Finding the highest score
points = [10, 8, 7, 9]
# Average calculation
notes = [50, 60, 70, 80]
'''
['Apple', 'Banana', 'Milk', 'Egg']
['Ahmet', 'Ayşe', 'Fatma', 'Mehmet']
[10, 8, 7, 9]
[50, 60, 70, 80]
'''
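The lists above only store the data. A minimal sketch of the calculations their comments mention, reusing the `points` and `notes` values from the example, might look like this:

```python
points = [10, 8, 7, 9]
notes = [50, 60, 70, 80]

# Finding the highest score
highest_score = max(points)

# Average calculation: sum of the elements divided by their count
average = sum(notes) / len(notes)

print(highest_score)  # 10
print(average)        # 65.0
```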
List Comprehension:
List comprehension provides a shorter and more readable syntax for creating lists in Python. With this syntax, you can build a list in a single expression, without writing an explicit for loop and repeated append() calls.
# Create a list of even numbers from 1 to 10
# Classical method
list4 = []
for i in range(1, 11):
    if i % 2 == 0:
        list4.append(i)
# With list comprehension
list5 = [i for i in range(1, 11) if i % 2 == 0]
print(list4) # [2, 4, 6, 8, 10]
print(list5) # [2, 4, 6, 8, 10]
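A comprehension can also transform each element, not just filter it, and the same syntax works for dictionaries. The sketch below is illustrative only; the variable names are not from the original example.

```python
numbers = [1, 2, 3, 4, 5]

# Transform every element
squares = [n ** 2 for n in numbers]

# Choose a value per element with a conditional expression
labels = ["even" if n % 2 == 0 else "odd" for n in numbers]

# Dictionary comprehension: map each number greater than 3 to its square
square_by_number = {n: n ** 2 for n in numbers if n > 3}

print(squares)           # [1, 4, 9, 16, 25]
print(labels)            # ['odd', 'even', 'odd', 'even', 'odd']
print(square_by_number)  # {4: 16, 5: 25}
```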
Tuples:
# Create a coordinate pair
coordinates = (40.9089, 28.9784)
# Access the first element in the tuple
latitude = coordinates[0] #40.9089
# Joining tuples
full_coordinates = coordinates + (500, "Turkey") #(40.9089, 28.9784, 500, 'Turkey')
# Checking the type of the tuple
print(type(full_coordinates))
#<class 'tuple'>
#############################################################
# A tuple containing numbers
tuple1 = (1, 2, 3, 4, 5)
# A tuple containing texts
tuple2 = ("Apple", "Pear", "Banana")
# A tuple with mixed data types
tuple3 = (1, "Hello", True, [1, 2, 3])
Tuple and List Differences:
| Feature      | Tuple            | List                 |
|--------------|------------------|----------------------|
| Mutability   | Immutable        | Mutable              |
| Syntax       | Parentheses (()) | Square brackets ([]) |
| Performance  | Usually faster   | Usually slower       |
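The difference in mutability can be seen directly. In the sketch below, which uses an illustrative `point` tuple, trying to assign to an element raises a TypeError. Because a tuple of immutable values is hashable, it can also be used as a dictionary key, which a list cannot.

```python
point = (40.9089, 28.9784)

try:
    point[0] = 41.0  # tuples do not support item assignment
except TypeError as error:
    message = str(error)
    print(message)

# A tuple of immutable values is hashable, so it can be a dictionary key
place_by_location = {point: "Turkey"}
print(place_by_location[(40.9089, 28.9784)])  # Turkey
```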
Tuple Operations:
# First element
print(tuple1[0]) # 1
# Last element
print(tuple2[-1]) # Banana
# Second element
print(tuple3[1]) # Hello
# First two elements
print(tuple1[:2]) # (1, 2)
# Last three elements
print(tuple2[-3:]) # ("Apple", "Pear", "Banana")
# A certain range
print(tuple3[1:3]) # ("Hello", True)
# Tuple length
print(len(tuple3)) # 4
# The biggest element
print(max(tuple1)) # 5
# Smallest element
print(min(tuple2)) # "Apple"
# Sum of elements
print(sum(tuple1)) # 15
Tuple Use Cases:
# Creating a tuple containing a person's first name, last name and age
person_information = ("Ahmet", "Yilmaz", 25)
# Passing a tuple as an argument to a function
def function(name, last_name, age):
    print(f"name: {name}, last name: {last_name}, age: {age}")

function(*person_information)
'''
name: Ahmet, last name: Yilmaz, age: 25
'''
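Tuples can also be unpacked into separate variables, which is how Python functions commonly return more than one value. The helper `min_max` below is a hypothetical function written for this sketch.

```python
person_information = ("Ahmet", "Yilmaz", 25)

# Unpack the tuple into three variables in one statement
name, last_name, age = person_information

def min_max(values):
    """Return the smallest and largest value as a tuple."""
    return min(values), max(values)

# The returned tuple is unpacked on the left-hand side
lowest, highest = min_max([10, 8, 7, 9])

print(name, age)        # Ahmet 25
print(lowest, highest)  # 7 10
```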
In Python, tuples are a very useful data structure for storing fixed sets of data. Because a tuple cannot be changed after it is created, its contents are protected against accidental modification, which helps keep data consistent.
Dictionaries:
Dictionaries are data structures that store key-value pairs. Similar to real-life dictionaries, they can be used to look up the value corresponding to a given key. Dictionaries are defined using curly braces ({}); each key is separated from its value by a colon (:), and the pairs are separated by commas (,). Values can be of any data type, while keys must be hashable (for example strings, numbers or tuples).
# Create a user profile
user = {
    "name": "Ayse",
    "age": 30,
    "city": "Istanbul",
}
# Accessing a value from the dictionary
city = user["city"]
# Add a new key-value pair to the dictionary
user["e-mail"] = "ayse@example.com"
# Print all keys in the dictionary
for key in user:
    print(key)
my_dict = {"name": "Ahmet", "last name": "Yilmaz", "age": 25}
# A dictionary with names and ages of people
persons = {"Ahmet": 25, "Ayşe": 23, "Fatma": 30}
# A dictionary with Turkish names of colors and their English equivalents
colors = {"Kırmızı": "Red", "Yeşil": "Green", "Mavi": "Blue"}
'''
name
age
city
e-mail
'''
Dictionary Operations:
# Dictionary length
print(len(persons)) # 3
# Is the key "Ahmet" in the dictionary?
print("Ahmet" in persons) # True
# List keys in the dictionary
print(persons.keys()) # dict_keys(['Ahmet', 'Ayşe', 'Fatma'])
# List values in the dictionary
print(persons.values()) # dict_values([25, 23, 30])
# List key-value pairs
print(persons.items()) # dict_items([('Ahmet', 25), ('Ayşe', 23), ('Fatma', 30)])
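The items() view is most often used in a loop, where each key-value pair is unpacked into two variables. The sketch below reuses the names-and-ages dictionary from above; `over_24` and `oldest` are names introduced for this example.

```python
persons = {"Ahmet": 25, "Ayşe": 23, "Fatma": 30}

# Loop over key-value pairs
for name, age in persons.items():
    print(f"{name} is {age} years old")

# Dictionary comprehension: keep only the entries with an age above 24
over_24 = {name: age for name, age in persons.items() if age > 24}
print(over_24)  # {'Ahmet': 25, 'Fatma': 30}

# Find the key with the largest value
oldest = max(persons, key=persons.get)
print(oldest)  # Fatma
```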
Dictionary Uses:
# Creating a dictionary containing a student's course grades
grades = {"Maths": 80, "Physics": 70, "Turkish": 90}
# Accessing and Modifying Values
print(f"Maths grade: {grades['Maths']}") # Accessing a value
grades["Chemistry"] = 85 # Adding a new key-value pair
grades["Maths"] = 95 # Modifying an existing value
# Using the `get()` method with a default value
chemistry_grade = grades.get("Chemistry", 0) # Get "Chemistry" grade or default to 0
print(f"Chemistry grade (if not present): {chemistry_grade}")
username = input("User Name: ")
password = input("Password: ")
user_info = {"user_name": "admin", "password": "12345"}
if username == user_info["user_name"] and password == user_info["password"]:
    print("Login successful!")
else:
    print("Invalid credentials.")
'''
Maths grade: 80
Chemistry grade (if not present): 85
User Name: admin
Password: 12345
Login successful!
'''
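The default value of get() is also handy for counting. In the sketch below, built on an invented word list, a missing key is treated as a count of 0, so no existence check is needed before incrementing.

```python
words = ["apple", "pear", "apple", "banana", "pear", "apple"]

counts = {}
for word in words:
    # get() returns 0 the first time a word is seen
    counts[word] = counts.get(word, 0) + 1

print(counts)  # {'apple': 3, 'pear': 2, 'banana': 1}
```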
Sets:
my_set = {"apple", "banana", "orange", "apple"}
print(my_set) # the duplicate "apple" is stored only once; the order may vary
# {'orange', 'banana', 'apple'}
###############################################################
# Creating a set containing different fruits
fruits = {"apple", "pear", "banana", "kiwi", "apple"}
# Add a new element to the set
fruits.add("strawberry")
# {'pear', 'apple', 'kiwi', 'banana', 'strawberry'}
Set Operations:
# A set containing numbers
numbers = {1, 2, 3, 4, 4, 5, 1, 2}
# A set containing texts
texts = {"Apple", "Pear", "Banana", "Apple"}
##############################
# Checking for the presence of an element in a set
"orange" in fruits # False
# Finding the intersection of two sets
other_fruits = {"pear", "pineapple", "mango"}
common_fruits = fruits & other_fruits
# Printing elements at an intersection
for fruit in common_fruits:
    print(fruit)
# pear
# Add a new number to a set
numbers.add(6)
# Add a new text to a set
texts.add("Orange")
# Delete a number from a set
numbers.remove(2)
# Delete a text from a set
texts.discard("Apple")
# The union of two sets
union = numbers | texts # {1, 3, 4, 5, 6, 'Banana', 'Orange', 'Pear'} (order may vary)
# The intersection of two sets
intersection = numbers & texts # set()
# Difference of two sets
difference = numbers - texts # {1, 3, 4, 5, 6}
Set Use Cases:
# Adding items to a shopping list and deleting duplicate items
shopping_list = {"Apple", "Pear", "Banana", "Apple", "Yoghurt"}
# Delete repeated words from a word list
words = {"Hello", "World", "Hello", "Python"}
# Finding common grades by comparing the grades of two students
student1_notes = {10, 8, 7, 9}
student2_notes = {8, 7, 9, 6}
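The sets above only hold the data. A short sketch of the operations the comments describe, reusing the same values (the name `word_list` is introduced here), could look like this:

```python
student1_notes = {10, 8, 7, 9}
student2_notes = {8, 7, 9, 6}

# Grades that both students have in common
common = student1_notes & student2_notes
# Grades that only the first student has
only_first = student1_notes - student2_notes

print(sorted(common))      # [7, 8, 9]
print(sorted(only_first))  # [10]

# Removing repeated words by converting a list to a set
word_list = ["Hello", "World", "Hello", "Python"]
unique_words = set(word_list)
print(len(word_list), len(unique_words))  # 4 3
```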
Sets and Other Data Structures:
Sets have some important differences from other data structures such as lists and dictionaries:
In Python, sets are a very useful data structure for storing and manipulating collections of unique, unordered elements. They are ideal for fast membership checks and for removing duplicates.
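Two practical consequences of sets being unordered are sketched below with an invented `items` list: a set cannot be indexed, and converting a list to a set does not preserve the original order. When order matters, `dict.fromkeys()` is one common way to remove duplicates while keeping the first occurrence of each element.

```python
items = ["pear", "apple", "pear", "kiwi", "apple"]

unique_items = set(items)  # duplicates removed, order not guaranteed

# Sets do not support indexing
try:
    unique_items[0]
    supports_indexing = True
except TypeError:
    supports_indexing = False

# Order-preserving de-duplication using dictionary keys
ordered_unique = list(dict.fromkeys(items))

print(supports_indexing)  # False
print(ordered_unique)     # ['pear', 'apple', 'kiwi']
```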
Strings:
Strings are text data types consisting of characters. They are defined using single quotation marks ('), double quotation marks (") or triple quotation marks (''' or """). Ranges, created with range(), represent a sequence of numbers in a given interval and make it easier to work with numbers.
# A string defined with single quotes
array1 = 'Hello World!'
# A string defined with double quotes
array2 = "Python Programming Language"
# A string defined with triple quotes
array3 = """
This is a
multiline
is an array instance.
"""
# A range of the numbers 1 to 9 (the stop value 10 is not included)
my_range = range(1, 10)
print(my_range) # range(1, 10)
String Properties:
String Operations:
- Accessing Elements: The characters in a string can be accessed by index. Indexing starts at 0, and negative indices count from the end.
# First character
print(array1[0]) # H
# Last character
print(array2[-1]) # e
# Second character (index 0 of array3 is the newline after the opening quotes)
print(array3[1]) # T
- Slicing: You can get a specific part of a string with the [:] operator.
# First three characters
print(array1[:3]) # Hel
# Last two characters
print(array2[-2:]) # ge
# A certain range
print(array3[1:5]) # This
- Replacement: Strings are immutable, so methods such as str.replace() and str.format() return a new string instead of changing the original.
# Writing "Hi" instead of "Hello"
array1 = array1.replace("Hello", "Hi")
# Placing a variable in a string
name = "Ahmet"
array2 = "Hello {}!".format(name)
- String Functions: In Python, many functions and methods such as len(), upper(), lower(), title(), strip() and find() can be used on strings.
# String length (newline characters are counted too)
print(len(array3)) # 43
# Convert the string to upper case
print(array1.upper()) # HI WORLD!
# Convert the string to lower case
print(array2.lower()) # hello ahmet!
# Capitalise the first letter of each word in the string
print(array3.title())
# This Is A
# Multiline
# Is An Array Instance.
# Delete whitespace at the beginning and end of a string
print(array2.strip()) # Hello Ahmet!
# Find the index where the word "Ahmet" first appears
print(array2.find("Ahmet")) # 6
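Like tuples, strings cannot be modified in place: every method shown above returns a new string. The sketch below, using an illustrative variable `text`, shows that item assignment fails and that a changed string has to be built from pieces of the old one.

```python
text = "Hello World!"

try:
    text[0] = "J"  # strings do not support item assignment
    changed_in_place = True
except TypeError:
    changed_in_place = False

# Build a new string from a replacement character and a slice of the old one
new_text = "J" + text[1:]

print(changed_in_place)  # False
print(new_text)          # Jello World!
print(text)              # Hello World! (the original is untouched)
```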
String Use Cases:
# Asking for name and last name
name = input("Your name: ")
last_name = input("Your surname: ")
# Merge and capitalize full name
full_name = f"{name.title()} {last_name.title()}"
# Create personalized message
message = f"Hello {full_name}, welcome!"
# Print the message
print(message)
Special String Methods:
Python provides many special methods that can be used on strings. Let's look at some of them:
- split(): Splits a string into parts at a specific separator.
# Convert space-separated words into a list
words = "Python programming language".split()
print(words) # ['Python', 'programming', 'language']
- join(): Joins the elements of a list into a single string, placing a specific separator between them.
# Concatenate a list with a space
space_flex_sentence = " ".join(words)
print(space_flex_sentence) # Python programming language
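split() and join() are often used together when reading delimited text. The sketch below assumes a made-up comma-separated record; note that split() always returns strings, so numeric fields need to be converted explicitly.

```python
line = "Ahmet,25,Istanbul"

# Split on commas to get the individual fields
fields = line.split(",")
print(fields)  # ['Ahmet', '25', 'Istanbul']

# Fields are strings; convert the age to an integer before doing arithmetic
age = int(fields[1])
print(age + 1)  # 26

# Join the fields back together with a different separator
rejoined = " | ".join(fields)
print(rejoined)  # Ahmet | 25 | Istanbul
```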
- count(): Counts the number of times a given character or substring occurs in a string.
# Finding how many times the letter "a" repeats
a_number = array2.count("a")
print(a_number) #0
- startswith() and endswith(): Check whether the string begins or ends with a specific text.
# Does the string start with "Hello"?
starts = array1.startswith("Hello")
print(starts) # False (array1 is now "Hi World!")
# Does the string end with "!"?
ends = array2.endswith("!")
print(ends) # True
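A typical use of these two methods is filtering a list of names. The file names below are invented for the sketch. Both methods also accept a tuple of alternatives, which avoids chaining several checks with `or`.

```python
filenames = ["sales_2023.csv", "notes.txt", "sales_2024.csv", "report.pdf"]

# Keep only the CSV files
csv_files = [name for name in filenames if name.endswith(".csv")]

# Keep only the files whose name starts with "sales_"
sales_files = [name for name in filenames if name.startswith("sales_")]

# A tuple of suffixes matches any one of them
documents = [name for name in filenames if name.endswith((".txt", ".pdf"))]

print(csv_files)    # ['sales_2023.csv', 'sales_2024.csv']
print(sales_files)  # ['sales_2023.csv', 'sales_2024.csv']
print(documents)    # ['notes.txt', 'report.pdf']
```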
Multiline Strings:
Strings defined using triple quotation marks let you write text that spans multiple lines, which can make the code easier to read.
description = """
This code example
demonstrates how you can use strings in Python.
"""
print(description)
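A multi-line string keeps its line breaks as newline characters, including the ones right after the opening quotes and before the closing quotes. A small sketch, using its own example text, of trimming those and working with the individual lines:

```python
description = """
This code example
demonstrates multi-line strings in Python.
"""

# strip() removes the leading and trailing newlines, splitlines() splits on the rest
lines = description.strip().splitlines()

print(len(lines))  # 2
print(lines[0])    # This code example
print(lines[1])    # demonstrates multi-line strings in Python.
```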
In Python, strings are one of the most frequently used data structures when working with text data. With slicing, replacement and the various string methods, you can perform text processing operations easily and efficiently.
The Power of Data Structures:
Beyond organising and managing data, data structures in Python help you improve your programming skills. With data structures:
Real Life Examples:
Data structures appear in many programmes and applications that we use in daily life.
- Social media platforms: Dictionaries and sets are used to store user profiles and relationships.
- E-commerce sites: Lists and dictionaries are used to store product catalogues and customer information.
- Games: Various data structures are used to store properties of game characters and objects in the game world.
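To make the first two examples more concrete, here is a deliberately simplified sketch. The user names, products and prices are all invented, and real platforms rely on databases rather than in-memory structures like these.

```python
# Social media: a dictionary mapping each user to the set of accounts they follow
follows = {
    "ayse": {"ahmet", "fatma"},
    "ahmet": {"ayse"},
    "fatma": {"ayse", "ahmet"},
}

# Accounts followed by both ayse and fatma (set intersection)
mutual = follows["ayse"] & follows["fatma"]
print(mutual)  # {'ahmet'}

# E-commerce: a product catalogue as a list of dictionaries
catalogue = [
    {"name": "Keyboard", "price": 45.0},
    {"name": "Monitor", "price": 180.0},
    {"name": "Mouse", "price": 20.0},
]

# Names of the products that cost less than 50
affordable = [product["name"] for product in catalogue if product["price"] < 50]
print(affordable)  # ['Keyboard', 'Mouse']
```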
Conclusion
In this article, we have given a comprehensive introduction to the basic data structures commonly used in Python and their uses in data science. Each data structure has its own characteristics and uses. Learning to choose the appropriate data structure for storing, manipulating and analysing different types of data is an important part of writing efficient and effective programs.
Once you are comfortable with data structures, the next step is to learn functions, one of the basic building blocks of Python. Functions prevent code repetition, break down complex operations, and increase the readability and maintainability of code. In the next post, we will focus on the details of Python functions and their applications in data science.
See you in our next article!