IPL Data Analysis using Pandas AI
Last Updated: 08 May, 2025
Analyzing IPL 2023 auction data is important for understanding player purchases, team spending and auction trends. In this guide, we’ll use PandasAI, an AI-powered data analysis tool, to gain insights from the IPL 2023 Auction dataset. PandasAI enhances traditional Pandas by letting you query a DataFrame in natural language through a large language model, making it easier to extract meaningful information from large datasets. Key benefits include:
- Automating data analysis
- Generating quick insights
- Simplifying complex queries
Step-by-Step IPL Data Analysis Using PandasAI
Step 1: Prerequisites
Before starting, ensure that the pandasai and openai libraries are installed. Run the following command in a notebook cell (Jupyter or Colab); in a regular command prompt, drop the leading "!":
!pip install -q pandasai
Step 2: Importing necessary libraries
Python
import pandas as pd
from pandasai import SmartDataframe
from pandasai.llm.openai import OpenAI
Step 3: Initializing an instance of the OpenAI LLM and passing its API key
Python
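# replace "your_api_key" with your generated key
OPENAI_API_KEY = "your_api_key"
# create the LLM wrapper that PandasAI sends prompts to
llm = OpenAI(api_token=OPENAI_API_KEY)
# df is loaded in Step 4 and cleaned in Step 5; run this line after those steps
sdf = SmartDataframe(df, config={"llm": llm})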
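Hardcoding the key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming the key has been exported as an environment variable named OPENAI_API_KEY (the helper name load_api_key is hypothetical, not part of PandasAI):

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Raises RuntimeError if the variable is missing or empty, so the
    problem shows up before any request is sent to the LLM.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

The returned string can then be passed where the tutorial uses the OPENAI_API_KEY constant, for example OpenAI(api_token=load_api_key()).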
Step 4: Importing the IPL 2023 Auction dataset using pandas
We are using the IPL 2023 Auction dataset here. You can download the dataset from Kaggle.
Python
df = pd.read_csv('IPL_Squad_2023_Auction_Dataset.csv')
print(df.shape)
df.head()
Output:
IPL 2023 Auction dataset
Step 5: Drop the "Unnamed: 0" column from the above dataset
Python
df.drop(['Unnamed: 0'], axis=1, inplace=True)
df.head()
Output:
IPL 2023 Auction dataset
Step 6: Data Analysis using PandasAI
Now let's begin our analysis:
Prompt 1:
Python
sdf.chat("Which players are the most costliest buys?")
Output:
['Sam Curran', 'Cameron Green', 'Ben Stokes']
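Because the answer comes from an LLM, it is worth cross-checking against plain pandas. The sketch below uses the column names from this dataset, but the five sample rows are only the ones printed under Prompt 6 in this article, not the full dataset, so the names it returns differ from the answer above. The helper name costliest_buys is hypothetical.

```python
import pandas as pd

NAME_COL = "Player's List"
COST_COL = "COST IN ₹ (CR.)"

# Five rows copied from the Prompt 6 output in this article (not the full dataset).
sample = pd.DataFrame({
    NAME_COL: ["Shivam Mavi", "Kane Williamson", "K.S. Bharat",
               "Odean Smith", "Rahul Tewatia"],
    COST_COL: [6.0, 2.0, 1.2, 0.5, 0.0],
})


def costliest_buys(df: pd.DataFrame, n: int = 3) -> list:
    """Return the names of the n players with the highest auction cost."""
    return df.nlargest(n, COST_COL)[NAME_COL].tolist()


print(costliest_buys(sample))
```

Running the same function on the full df from Step 5 should give a list that can be compared with the PandasAI answer.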
Prompt 2:
Python
prompts = """
Which players were the cheapest buys this season and which team bought them?
"""
sdf.chat(prompts)
Output:
Well, it looks like the cheapest buys this season were Glenn Phillips for Sunrisers Hyderabad,
Raj Angad Bawa and Rishi Dhawan for Punjab Super Kings, Dhruv Jurel and K.C Cariappa
for Rajasthan Royals and many more. The full list includes 163 players and their respective teams.
Prompt 3:
Python
prompts = """
Draw a Bargraph showing How much money was spent by each team this season overall.
"""
sdf.chat(prompts)
Output:
Bar graph showing the money spent by each team
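The numbers behind such a chart can be reproduced with a groupby. This is a rough sketch on made-up rows: "Team A" and "Team B" and their costs are placeholders, and only the column names come from the dataset. The helper name spend_per_team is hypothetical.

```python
import pandas as pd

COST_COL = "COST IN ₹ (CR.)"

# Placeholder rows for illustration only; not real auction figures.
toy = pd.DataFrame({
    "Team": ["Team A", "Team B", "Team A"],
    COST_COL: [6.0, 1.5, 2.0],
})


def spend_per_team(df: pd.DataFrame) -> pd.Series:
    """Total auction cost per team, largest spender first."""
    return df.groupby("Team")[COST_COL].sum().sort_values(ascending=False)


totals = spend_per_team(toy)
print(totals)
```

With matplotlib installed, totals.plot(kind="bar") draws the corresponding bar chart.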
Prompt 4:
Python
sdf.chat("How many bowler remained unsold and what was their base price?")
Output:
There were 108 bowlers who remained unsold in the auction.
Their base price ranged from 2 million to 20 million.
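A range like this is only computable if "Base Price" is numeric, yet the column also holds the text "Retained" (see the Prompt 6 output in this article), so pandas reads it as strings. A minimal sketch of the conversion, using the five rows printed there; the helper name base_price_range is hypothetical:

```python
import pandas as pd

# "Base Price" values copied from the Prompt 6 output in this article.
sample = pd.DataFrame({
    "Player's List": ["Shivam Mavi", "Kane Williamson", "K.S. Bharat",
                      "Odean Smith", "Rahul Tewatia"],
    "Base Price": ["4000000", "20000000", "2000000", "5000000", "Retained"],
})


def base_price_range(df: pd.DataFrame) -> tuple:
    """Return (min, max) base price, ignoring non-numeric entries."""
    # errors="coerce" turns text such as "Retained" into NaN.
    prices = pd.to_numeric(df["Base Price"], errors="coerce").dropna()
    return int(prices.min()), int(prices.max())


print(base_price_range(sample))
```

Filtering the frame to unsold bowlers first and then calling the same function is one way to check the range PandasAI reported.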
Prompt 5:
Python
sdf.chat("How many players remained unsold this season?")
Output:
('Number of players remained unsold this season:', 338)
Prompt 6:
Python
sdf.chat("Which type of players were majorly unsold?")
Output:
     Player's List  Base Price          TYPE  COST IN ₹ (CR.)  Cost IN $ (000)  \
0      Shivam Mavi     4000000        BOWLER              6.0            720.0
2  Kane Williamson    20000000       BATSMAN              2.0            240.0
3      K.S. Bharat     2000000  WICKETKEEPER              1.2            144.0
5      Odean Smith     5000000   ALL-ROUNDER              0.5             60.0
7    Rahul Tewatia    Retained   ALL-ROUNDER              0.0              0.0

   2022 Squad            Team
0         KKR  Gujarat Titans
2         SRH  Gujarat Titans
3          DC  Gujarat Titans
5        PBKS  Gujarat Titans
7          GT  Gujarat Titans

TYPE
ALL-ROUNDER     65
BOWLER          64
BATSMAN         35
WICKETKEEPER    21
Name: TYPE, dtype: int64
Number of unsold players: 0
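The response above mixes a preview of Gujarat Titans rows with a type count and then reports zero unsold players, so a direct pandas check is useful here. The sketch below assumes that unsold players are marked by the value "Unsold" in the "Team" column; that label is an assumption and should be confirmed with df["Team"].unique() on the real data. The rows are placeholders and the helper name unsold_summary is hypothetical.

```python
import pandas as pd

# Placeholder rows for illustration only; the "Unsold" label is an assumption.
toy = pd.DataFrame({
    "TYPE": ["BOWLER", "BOWLER", "BATSMAN", "ALL-ROUNDER", "BOWLER"],
    "Team": ["Unsold", "Unsold", "Unsold", "Team A", "Team A"],
})


def unsold_summary(df: pd.DataFrame, unsold_label: str = "Unsold"):
    """Return (number of unsold players, unsold count per player type)."""
    unsold = df[df["Team"] == unsold_label]
    return len(unsold), unsold["TYPE"].value_counts()


count, by_type = unsold_summary(toy)
print(count)
print(by_type)
```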
Prompt 7:
Python
sdf.chat("Who are three new players Gujrat picked?")
Output:
0 Shivam Mavi
1 Joshua Little
2 Kane Williamson
Name: Player's List, dtype: object
Prompt 8:
Python
sdf.chat("What is total money spent by all teams in dollars?")
Output:
The total amount of money spent by all teams in the auction is $20,040,000.
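The same total can be computed directly. This sketch assumes, as the header "Cost IN $ (000)" suggests, that the column is expressed in thousands of dollars. The five values are the ones printed under Prompt 6 in this article, so the result covers only those rows, not the whole auction. The helper name total_spent_usd is hypothetical.

```python
import pandas as pd

USD_COL = "Cost IN $ (000)"

# Values copied from the Prompt 6 output in this article (five rows only).
sample = pd.DataFrame({USD_COL: [720.0, 240.0, 144.0, 60.0, 0.0]})


def total_spent_usd(df: pd.DataFrame) -> float:
    """Sum the cost column and convert from thousands of dollars to dollars."""
    return float(df[USD_COL].sum()) * 1000


print(f"${total_spent_usd(sample):,.0f}")
```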
Prompt 9:
Python
prompts = """
draw a barplot showing
how much money was spent by Mumbai Indians on all types of players?
"""
sdf.chat(prompts)
Output:
Bar graph for money spent by Mumbai Indians on each type of player
Sure, I can help you with that! To answer your question, I can draw a barplot that shows how much money the Mumbai Indians spent on all types of players. Would you like me to do that for you now?
Prompt 10:
Python
prompts = """
draw a barplot showing how much money was spent by Gujrat on all types of players?
"""
sdf.chat(prompts)
Output:
Bar plot showing money spent on each type of player by Gujarat Titans
Prompt 11:
Python
sdf.chat("Can you predict which team will buy Sam Curran in 2024?")
Output:
Lucknow Super Giants
Prompt 12:
Python
sdf.chat("Perform univariate analysis")
Output:
Histogram of Cost Attribute in dataset
Bar graph visualizing Count of each type of player
Pie Chart showing Percentage of players in 2022 squad
Sure! Based on the data provided, the univariate analysis shows that we have six variables: Player's List, Base Price, TYPE, COST IN ₹ (CR.), Cost IN $ (000), 2022 Squad and Team. The data types for these variables are object, object, object, float64, float64 and object respectively.
Prompt 13:
Python
sdf.chat("Perform multivariate analysis")
Output:
Unfortunately, I was not able to answer your question. Please try again. If the problem persists try rephrasing your question.
PandasAI failed on this input, most likely because the request is too broad and ambiguous.
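When a broad prompt like this fails, one fallback is to do a specific multivariate summary in plain pandas. The sketch below cross-tabulates spending by team and player type with a pivot table. The rows are placeholders ("Team A", "Team B" and their costs are made up), only the column names come from the dataset, and the helper name spend_by_team_and_type is hypothetical.

```python
import pandas as pd

COST_COL = "COST IN ₹ (CR.)"

# Placeholder rows for illustration only; not real auction figures.
toy = pd.DataFrame({
    "Team": ["Team A", "Team A", "Team B", "Team B"],
    "TYPE": ["BOWLER", "BATSMAN", "BOWLER", "BOWLER"],
    COST_COL: [6.0, 2.0, 1.5, 0.5],
})


def spend_by_team_and_type(df: pd.DataFrame) -> pd.DataFrame:
    """Total cost for every (team, player type) pair; missing pairs become 0."""
    return pd.pivot_table(
        df, index="Team", columns="TYPE", values=COST_COL,
        aggfunc="sum", fill_value=0,
    )


table = spend_by_team_and_type(toy)
print(table)
```

Asking PandasAI a narrower question, such as spending per team and player type, may also succeed where the open-ended prompt did not.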
Pros of Pandas AI
- Works well on direct, well-explained inputs.
- Easily performs simple tasks such as plotting graphs and univariate analysis.
- Can perform basic statistical operations.
- Can sometimes make basic predictions.
Cons of Pandas AI
- Cannot process ambiguous inputs.
- Sends each query to a remote LLM server, so it is slower than plain pandas.
- Cannot perform complex tasks like outlier analysis or multivariate analysis.
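Since direct inputs work best and ambiguous ones fail, one option is to make every prompt explicit about the columns it refers to. The helper below is a hypothetical convenience, not part of PandasAI, and whether it improves answers depends on the model:

```python
def explicit_prompt(question: str, columns: list) -> str:
    """Append the exact column names to a question to reduce ambiguity."""
    cols = ", ".join(f'"{c}"' for c in columns)
    return f"{question.strip()} Use only these columns: {cols}."


prompt = explicit_prompt(
    "What is the total cost per team?",
    ["Team", "COST IN ₹ (CR.)"],
)
print(prompt)
```

The resulting string can be passed to sdf.chat in place of a hand-written prompt.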