Introduction to openpyxl
openpyxl is a Python library designed to read, write, and modify Excel files (with the .xlsx extension). It is widely used for automating Excel-related tasks, making it a valuable tool for data analysis, reporting, and more.
Getting Started with openpyxl
pip install openpyxl
from openpyxl import Workbook
# Create a new workbook and select the active worksheet
wb = Workbook()
ws = wb.active
# Rename the worksheet
ws.title = "MySheet"
# Save the workbook
wb.save("example.xlsx")
from openpyxl import load_workbook
# Load an existing workbook
wb = load_workbook("example.xlsx")
ws = wb.active
# Print the value of cell A1
print(ws["A1"].value)
Working with Data
cell = ws['A1']
cell.value = "Hello, World!"
for row in ws.iter_rows(min_row=1, max_col=3, max_row=2):
for cell in row:
print(cell.value)
ws.append(["Name", "Age", "City"])
ws.append(["Alice", 30, "New York"])
ws.append(["Bob", 25, "Los Angeles"])
Advanced Features
Recommended by LinkedIn
ws["A2"] = "=SUM(A1:A10)"
from openpyxl.styles import Font, Color, Alignment, Border, Side
# Set font style
ws["A1"].font = Font(name='Arial', size=14, bold=True, color="FF0000")
# Set cell alignment
ws["A1"].alignment = Alignment(horizontal="center", vertical="center")
# Set cell border
thin_border = Border(left=Side(style='thin'),
right=Side(style='thin'),
top=Side(style='thin'),
bottom=Side(style='thin'))
ws["A1"].border = thin_border
from openpyxl.chart import BarChart, Reference
# Add some data
for i in range(1, 11):
ws.append([i, i**2])
# Create a reference to the data
values = Reference(ws, min_col=1, min_row=1, max_col=2, max_row=10)
# Create a bar chart
chart = BarChart()
chart.add_data(values, titles_from_data=True)
# Add the chart to the worksheet
ws.add_chart(chart, "E5")
from openpyxl.formatting.rule import ColorScaleRule
# Apply a color scale based on cell values
color_scale_rule = ColorScaleRule(start_type="min", start_color="FF0000",
end_type="max", end_color="00FF00")
ws.conditional_formatting.add("A1:A10", color_scale_rule)
Handling Large Datasets
When working with large datasets, openpyxl provides efficient ways to handle data without consuming too much memory:
wb = load_workbook("large_file.xlsx", read_only=True)
ws = wb.active
for row in ws.iter_rows():
for cell in row:
print(cell.value)
wb = Workbook(write_only=True)
ws = wb.create_sheet()
for i in range(1000000):
ws.append([i, i**2, i**3])
wb.save("large_output.xlsx")
Conclusion
openpyxl is a comprehensive library that offers a wide range of functionalities for working with Excel files in Python. From basic operations like reading and writing data to advanced features like styling, charting, and handling large datasets, openpyxl provides the tools you need to automate and enhance your Excel workflows.
Software Engineer | Transforming Ideas into Reality at Kasadara Pvt Ltd
5moInsightful!!