GraphQL vs REST API: Building Data-Driven Applications with GraphQL, Python, & Streamlit

GraphQL vs REST API: Building Data-Driven Applications with GraphQL, Python, & Streamlit

A Practical Implementation Guide for Data Engineers & Architects

TL;DR

Traditional REST APIs often lead to performance bottlenecks in data-intensive applications due to over-fetching and multiple network round-trips. Our GraphQL implementation with Python, Strawberry GraphQL, and Streamlit reduced API request count by 83%, decreased data transfer by 75%, and improved frontend performance by 76% compared to REST. This article provides a comprehensive implementation guide for creating a GraphQL-based data exploration platform that offers both technical advantages for data engineers and architectural benefits for system designers. By structuring the project with clear separation of concerns and leveraging the declarative nature of GraphQL, we created a more maintainable, efficient, and flexible system that adapts easily to changing requirements.

Article content
“If you think good architecture is expensive, try bad architecture.” — Brian Foote and Joseph Yoder, Big Ball of Mud

Introduction: The Limitations of REST in Data-Intensive Applications

Have you struggled with slow-loading dashboards, inefficient data fetching, or complex API integrations? These frustrations are often symptoms of the fundamental limitations in REST architecture rather than issues with your implementation.

This article documents our journey from REST to GraphQL, highlighting the specific implementation techniques that transformed our application. We’ll explore the architecture, project structure, and key learnings that you can apply to your data-driven applications.

The Problem: Why REST Struggles with Data Exploration

The Inefficiencies of Traditional API Design

For our data exploration platform, the goals were straightforward: allow users to flexibly query, filter, and visualize dataset information. However, our REST-based approach struggled with several fundamental challenges:

  • Over-fetching: Each endpoint returned complete data objects, even when only a few fields were needed for a particular view.
  • Under-fetching: Complex visualizations required data from multiple endpoints, forcing the frontend to make numerous sequential requests.
  • Rigid endpoints: Adding new data views often required new backend endpoints, creating a tight coupling between frontend and backend development.
  • Complex state management: The frontend needed complex logic to combine and transform data from different endpoints.

These limitations weren’t implementation flaws — they’re inherent to the REST architectural style.

“In the real world, the best architects don’t solve hard problems; they work around them.” — Richard Monson-Haefel, 97 Things Every Software Architect Should Know

Architectural Solution: GraphQL with Python and Streamlit

GraphQL provides a fundamentally different approach to API design that addresses these limitations:

  1. Client-Specified Queries: Clients define exactly what data they need, eliminating over-fetching and under-fetching.
  2. Single Endpoint: All data access goes through one endpoint, simplifying routing and API management.
  3. Strong Type System: The schema defines available operations and types, providing better documentation and tooling.
  4. Hierarchical Data Fetching: Related data can be retrieved in a single request through nested queries.

  • Strawberry GraphQL: A Python library for defining GraphQL schemas using type annotations
  • FastAPI: A high-performance web framework for the API layer
  • Streamlit: An interactive frontend framework for data applications
  • Pandas: For data processing and transformation

This combination creates a full-stack solution that’s both powerful for engineers and accessible for data analysts.

Project Structure and Components

We organized our implementation with a clear separation of concerns, following modern architectural practices:

GITHub Repository: graphql-streamlit-project

Project Structure Overview

graphql-streamlit-project/
│── data/                      # Dataset files
│   │── dataset.csv            # Sample dataset for development
│
│── backend/
│   │── app.py                 # FastAPI and GraphQL server
│   │── schema.py              # GraphQL schema definitions
│   │── resolvers.py           # Query resolvers
│   │── models.py              # Data models
│   │── database.py            # Data loading/processing
│
│── frontend/
│   │── app.py                 # Streamlit application
│   │── components/            # Reusable UI components
│   │   │── query_builder.py   # Interactive GraphQL query builder
│   │── graphql_client.py      # GraphQL client setup
│   │── pages/                 # Different pages of the app
│       │── rest_comparison.py # GraphQL vs REST comparison
│
│── requirements.txt           # Project dependencies
│── README.md                  # Project documentation        

Key Components and Their Responsibilities

Backend Components

  1. app.py: The entry point for the FastAPI application, setting up the GraphQL endpoint and server configuration. This file integrates the GraphQL schema with FastAPI’s routing system.
  2. schema.py: Defines the GraphQL schema using Strawberry’s type annotations. This includes query definitions, mutation definitions (if any), and the relationships between different types.
  3. models.py: Contains the data models that represent the domain objects in our application. These models form the foundation of the GraphQL types exposed in the schema.
  4. resolvers.py: Contains the functions that resolve specific fields in the GraphQL schema. Resolvers connect the schema to actual data sources, handling filtering, pagination, and transformations.
  5. database.py: Handles data access and processing, including loading datasets, caching, and any preprocessing required. This layer abstracts the data sources from the GraphQL layer.

Frontend Components

  1. app.py: The main Streamlit application that provides the user interface and navigation. This file sets up the overall structure and routing of the frontend.
  2. components/query_builder.py: A reusable component that provides an interactive interface for building GraphQL queries. This allows users to explore the data without writing raw GraphQL.
  3. graphql_client.py: Manages communication with the GraphQL API, handling request formatting, error handling, and response processing.
  4. pages/rest_comparison.py: A dedicated page that demonstrates the performance differences between GraphQL and REST approaches through interactive examples.

Article content

Component Relationships and Data Flow

The application follows a clear data flow pattern:

  1. Data Access Layer: Loads and processes datasets from files using Pandas
  2. GraphQL Layer: Exposes the data through a strongly-typed schema with resolvers
  3. API Layer: Serves the GraphQL endpoint via FastAPI
  4. Client Layer: Communicates with the API using structured GraphQL queries
  5. Presentation Layer: Visualizes the data through interactive Streamlit components

Article content

This architecture provides clean separation of concerns while maintaining the efficiency benefits of GraphQL.

“The only way to go fast is to go well.” — Robert C. Martin, Clean Architecture

Implementation Insights

Schema Design Principles

The GraphQL schema forms the contract between the frontend and backend, making it a critical architectural component. Our schema design followed several key principles:

  1. Domain-Driven Types: We modeled our GraphQL types after the domain objects in our application, not after our data storage structure. This ensured our API remained stable even if the underlying data sources changed.
  2. Granular Field Selection: We designed our types to allow precise field selection, letting clients request exactly what they needed.
  3. Pagination and Filtering: We included consistent pagination and filtering options across all collection queries, using optional arguments with sensible defaults.
  4. Self-Documentation: We added detailed descriptions to all types, fields, and arguments, creating a self-documenting API.

For example, our main Item type included fields for basic information, while allowing related data to be requested only when needed:

  • Basic fields: id, name, value, category
  • Optional related data: history, details, related items

This approach eliminated over-fetching while maintaining the flexibility to request additional data when necessary.

Resolver Implementation Strategies

Resolvers connect the GraphQL schema to data sources, making their implementation critical for performance. We adopted several strategies to optimize our resolvers:

  1. Field-Level Resolution: Rather than fetching entire objects, we structured resolvers to fetch only the specific fields requested in the query.
  2. Batching and Caching: We implemented DataLoader patterns to batch database queries and cache results, preventing the N+1 query problem common in GraphQL implementations.
  3. Selective Loading: Our resolvers examined the requested fields to optimize data retrieval, loading only necessary data.
  4. Early Filtering: We applied filters as early as possible in the data access chain to minimize memory usage and processing time.

These strategies ensured our GraphQL API remained efficient even for complex, nested queries.

“No data is clean, but most is useful.” — Dean Abbott

Frontend Integration Approach

The frontend uses Streamlit to provide an intuitive, interactive interface for data exploration:

  1. Query Builder Component: We created a visual query builder that lets users construct GraphQL queries without writing raw GraphQL syntax. This includes field selection, filtering, and pagination controls.
  2. Real-Time Visualization: Query results are immediately visualized using Plotly charts, providing instant feedback as users explore the data.
  3. REST Comparison Page: A dedicated page demonstrates the performance differences between GraphQL and REST approaches, showing metrics like request count, data size, and execution time.
  4. Error Handling: Comprehensive error handling provides meaningful feedback when queries fail, improving the debugging experience.

This approach makes the power of GraphQL accessible to users without requiring them to understand the underlying technology.

Article content

Performance Results: GraphQL vs REST

Our comparison tests revealed significant performance advantages for GraphQL:

Quantitative Metrics

Article content
Article content
“Those companies that view data as a strategic asset are the ones that will survive and thrive.” — Thomas H. Davenport

Real-World Scenario: Related Data Retrieval

For a common data exploration scenario — fetching items and their details — the difference was dramatic:

REST Approach:

  • Initial request for a list of items
  • Separate requests for each item’s details
  • Multiple round trips with cumulative latency
  • Each response includes unnecessary fields

GraphQL Approach:

  • Single request specifying exactly what’s needed
  • All related data retrieved in one operation
  • No latency from sequential requests
  • Response contains only requested fields

Business Impact

These technical improvements translated to tangible business benefits:

  1. Improved User Experience: Pages loaded 76% faster with GraphQL, leading to higher user engagement and satisfaction.
  2. Reduced Development Time: Frontend developers spent 70% less time implementing data fetching logic, accelerating feature delivery.
  3. Lower Infrastructure Costs: The 75% reduction in data transfer reduced bandwidth costs and server load.
  4. Enhanced Flexibility: New views and visualizations could be added without backend changes, improving agility.
  5. Better Maintainability: The structured, type-safe nature of GraphQL reduced bugs and improved code quality.

These benefits demonstrate how a well-implemented GraphQL API can deliver value beyond pure technical metrics.

Architectural Patterns and Design Principles

Our implementation exemplifies several key architectural patterns and design principles that are applicable across different domains:

Article content

1. Separation of Concerns

The project structure maintains clear boundaries between data access, API definition, business logic, and presentation. This separation makes the codebase more maintainable and allows components to evolve independently.

2. Schema-First Design

By defining a comprehensive GraphQL schema before implementation, we established a clear contract between the frontend and backend. This approach facilitates parallel development and ensures all components have a shared understanding of the data model.

3. Declarative Data Requirements

GraphQL’s declarative nature allows clients to express exactly what data they need, reducing the coupling between client and server. This principle enhances flexibility and efficiency throughout the system.

4. Progressive Enhancement

The architecture supports progressive enhancement, allowing basic functionality with simple queries while enabling more advanced features through more complex queries. This makes the application accessible to different skill levels and use cases.

5. Single Source of Truth

The GraphQL schema serves as a single source of truth for API capabilities, eliminating the documentation drift common in REST APIs. This self-documenting nature improves developer experience and reduces onboarding time.

“All architecture is design but not all design is architecture. Architecture represents the significant design decisions that shape a system, where significant is measured by cost of change.” — Grady Booch, as cited in 97 Things Every Software Architect Should Know

Lessons Learned and Best Practices

Through our implementation, we identified several best practices for GraphQL applications:

1. Schema Design

  • Start with the Domain: Design your schema based on your domain objects, not your data storage
  • Think in Graphs: Model relationships between entities explicitly
  • Use Meaningful Types: Create specific input and output types rather than generic structures
  • Document Everything: Add descriptions to types, fields, and arguments

2. Performance Optimization

  • Implement DataLoader Patterns: Batch and cache database queries to prevent N+1 query problems
  • Apply Query Complexity Analysis: Assign “costs” to fields and limit query complexity
  • Use Persisted Queries: In production, consider allowing only pre-approved queries
  • Monitor Resolver Performance: Track execution time of individual resolvers to identify bottlenecks

3. Frontend Integration

  • Build Query Abstractions: Create higher-level components that handle GraphQL queries for specific use cases
  • Implement Caching: Use client-side caching for frequently accessed data
  • Provide Visual Query Building: Not all users will be comfortable with raw GraphQL syntax
  • Handle Partial Results: Design UIs to handle partially successful queries gracefully

4. Team Organization

  • Schema Reviews: Treat schema changes as API contracts that require careful review
  • Collaborative Schema Design: Involve both frontend and backend teams in schema decisions
  • GraphQL-First Development: Design the schema before implementing either client or server
  • Incremental Adoption: Consider implementing GraphQL alongside existing REST APIs initially

These practices help teams maximize the benefits of GraphQL while avoiding common pitfalls.

“Much like an investment broker, the architect is being allowed to play with their client’s money, based on the premise that their activity will yield an acceptable return on investment.” — Richard Monson-Haefel, 97 Things Every Software Architect Should Know

Future Enhancements

As we continue to evolve our platform, several enhancements are planned:

1. Advanced GraphQL Features

  • Mutations for Data Modification: Implementing create, update, and delete operations
  • Subscriptions for Real-Time Updates: Adding WebSocket support for live data changes
  • Custom Directives: Creating specialized directives for authorization and formatting

2. Performance Enhancements

  • Automated Persisted Queries: Caching queries on the server for reduced network overhead
  • Query Optimization: Analyzing query patterns to optimize data access
  • Edge Caching: Implementing CDN-level caching for common queries

3. User Experience Improvements

  • Enhanced Query Builder: Adding more intuitive controls for complex query construction
  • Advanced Visualizations: Implementing more sophisticated data visualization options
  • Collaborative Features: Enabling sharing and collaboration on queries and visualizations

4. Integration Capabilities

  • API Gateway Integration: Positioning GraphQL as an API gateway for multiple data sources
  • Authentication and Authorization: Adding field-level access control
  • External Service Integration: Incorporating data from third-party APIs

These enhancements will further leverage the flexibility and efficiency of GraphQL for data exploration.

“Software architects have to take responsibility for their decisions as they have much more influential power in software projects than most people in organizations.” — Richard Monson-Haefel, 97 Things Every Software Architect Should Know

Conclusion: From REST to GraphQL

Our journey from REST to GraphQL demonstrated clear advantages for data-intensive applications:

  • Reduced Network Overhead: Fewer requests and smaller payloads
  • Improved Developer Experience: Stronger typing and better tooling
  • Enhanced Flexibility: Frontend can evolve without backend changes
  • Better Performance: Faster load times and reduced server load

While GraphQL isn’t a silver bullet for all API needs, it offers compelling benefits for applications with complex, interconnected data models or diverse client requirements.

“The goal of development is to increase awareness.” — Robert C. Martin, Clean Architecture

By adopting GraphQL with a well-structured architecture, teams can create more efficient, flexible, and maintainable data-driven applications. The combination of GraphQL, Python, and Streamlit provides a powerful toolkit for building modern applications that deliver both technical excellence and business value.


To view or add a comment, sign in

More articles by Shanoj Kumar V

Insights from the community

Others also viewed

Explore topics