Optimizing a Spring Boot Application for High Throughput
Before making any changes, I established clear performance baselines.
Here’s what our initial metrics looked like:
// Initial Performance Metrics
Maximum throughput: 50,000 requests/second
Average response time: 350ms
95th percentile response time: 850ms
CPU utilization during peak: 85-95%
Memory usage: 75% of available heap
Database connections: Often reaching max pool size (100)
Thread pool saturation: Frequent thread pool exhaustion
I used a combination of tools to gather these metrics.
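One workable setup for capturing numbers like these is Spring Boot Actuator with Micrometer feeding Prometheus, paired with a load generator such as Gatling or JMeter; the sketch below assumes that stack and the micrometer-registry-prometheus dependency on the classpath:
# application.yml — assumed monitoring setup, not necessarily the exact tooling used here
management:
  endpoints:
    web:
      exposure:
        include: health,metrics,prometheus     # expose the Prometheus scrape endpoint
  metrics:
    distribution:
      percentiles-histogram:
        "[http.server.requests]": true         # histograms make p95/p99 latencies observable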
With these baseline metrics in hand, I could prioritize optimizations and measure their impact.
Uncovering the Real Bottlenecks 🔍
Initial profiling revealed several interesting bottlenecks: threads sitting blocked on I/O under the one-thread-per-request model, inefficient database access (an N+1 query and connection-pool contention), Jackson serialization eating a surprising share of CPU, and under-tuned thread and connection pools. The sections below address each in turn.
Reactive Programming: The Game Changer ⚡
The most impactful change was adopting reactive programming with Spring WebFlux. This wasn’t a drop-in replacement; it required rethinking how we structured our application.
I started by identifying services with heavy I/O operations:
// BEFORE: Blocking implementation
@Service
public class ProductService {

    @Autowired
    private ProductRepository repository;

    public Product getProductById(Long id) {
        return repository.findById(id)
                .orElseThrow(() -> new ProductNotFoundException(id));
    }
}
And converted them to reactive implementations:
// AFTER: Reactive implementation
@Service
public class ProductService {

    @Autowired
    private ReactiveProductRepository repository;

    public Mono<Product> getProductById(Long id) {
        return repository.findById(id)
                .switchIfEmpty(Mono.error(new ProductNotFoundException(id)));
    }
}
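For reference, the reactive repository behind that service can be as small as the sketch below; it assumes Spring Data R2DBC, though a reactive MongoDB repository is declared the same way:
import org.springframework.data.repository.reactive.ReactiveCrudRepository;

public interface ReactiveProductRepository extends ReactiveCrudRepository<Product, Long> {
    // findById(Long id) already returns Mono<Product> via ReactiveCrudRepository,
    // so no additional methods are needed for the service shown above
}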
The controllers were updated accordingly:
// BEFORE: Traditional Spring MVC controller
@RestController
@RequestMapping("/api/products")
public class ProductController {

    @Autowired
    private ProductService service;

    @GetMapping("/{id}")
    public ResponseEntity<Product> getProduct(@PathVariable Long id) {
        return ResponseEntity.ok(service.getProductById(id));
    }
}

// AFTER: WebFlux reactive controller
@RestController
@RequestMapping("/api/products")
public class ProductController {

    @Autowired
    private ProductService service;

    @GetMapping("/{id}")
    public Mono<ResponseEntity<Product>> getProduct(@PathVariable Long id) {
        return service.getProductById(id)
                .map(ResponseEntity::ok)
                .defaultIfEmpty(ResponseEntity.notFound().build());
    }
}
This change alone doubled our throughput by making more efficient use of threads. Instead of one thread per request, WebFlux uses a small number of threads to handle many concurrent requests.
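One caveat worth illustrating: the gain only holds if nothing blocks those few event-loop threads. For any dependency that can't be made reactive, a common pattern is to push the blocking call onto Reactor's bounded-elastic scheduler; the sketch below uses a hypothetical LegacyReportingClient rather than code from our actual migration:
import org.springframework.stereotype.Service;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

@Service
public class ReportService {

    private final LegacyReportingClient legacyClient; // hypothetical blocking client

    public ReportService(LegacyReportingClient legacyClient) {
        this.legacyClient = legacyClient;
    }

    public Mono<Report> generateReport(Long id) {
        // Wrap the blocking call and subscribe on boundedElastic so the
        // Netty event-loop threads stay free to serve other requests
        return Mono.fromCallable(() -> legacyClient.generate(id))
                .subscribeOn(Schedulers.boundedElastic());
    }
}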
Database Optimization: The Hidden Multiplier 📊
Database interactions were our next biggest bottleneck. I implemented a three-pronged approach:
1. Query Optimization
I used Spring Data’s @Query annotation to replace inefficient auto-generated queries:
// BEFORE: Using derived method name (inefficient)
List<Order> findByUserIdAndStatusAndCreatedDateBetween(
        Long userId, OrderStatus status, LocalDate start, LocalDate end);

// AFTER: Optimized query
@Query("SELECT o FROM Order o WHERE o.userId = :userId " +
       "AND o.status = :status " +
       "AND o.createdDate BETWEEN :start AND :end " +
       "ORDER BY o.createdDate DESC")
List<Order> findUserOrdersInDateRange(
        @Param("userId") Long userId,
        @Param("status") OrderStatus status,
        @Param("start") LocalDate start,
        @Param("end") LocalDate end);
I also optimized a particularly problematic N+1 query by using Hibernate's @BatchSize:
@Entity
public class Order {

    // Other fields

    @OneToMany(mappedBy = "order", fetch = FetchType.EAGER)
    @BatchSize(size = 30) // Batch fetch order items
    private Set<OrderItem> items;
}
2. Connection Pool Tuning
The default HikariCP settings were causing connection contention. After extensive testing, I arrived at this configuration:
spring:
  datasource:
    hikari:
      maximum-pool-size: 30
      minimum-idle: 10
      idle-timeout: 30000
      connection-timeout: 2000
      max-lifetime: 1800000
The key insight was that more connections isn’t always better; we found our sweet spot at 30 connections, which reduced contention without overwhelming the database.
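That figure also lines up with HikariCP's own pool-sizing guidance, which suggests starting from roughly connections ≈ (core_count × 2) + effective_spindle_count; on a typical 8-to-16-core database host that rule of thumb lands in the same low-double-digit range, far below the 100-connection pool we had been saturating.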
3. Implementing Strategic Caching
I introduced Redis caching for frequently accessed data:
@Configuration
@EnableCaching
public class CacheConfig {

    @Bean
    public RedisCacheManager cacheManager(RedisConnectionFactory connectionFactory) {
        RedisCacheConfiguration cacheConfig = RedisCacheConfiguration.defaultCacheConfig()
                .entryTtl(Duration.ofMinutes(10))
                .disableCachingNullValues();

        return RedisCacheManager.builder(connectionFactory)
                .cacheDefaults(cacheConfig)
                .withCacheConfiguration("products",
                        RedisCacheConfiguration.defaultCacheConfig()
                                .entryTtl(Duration.ofMinutes(5)))
                .withCacheConfiguration("categories",
                        RedisCacheConfiguration.defaultCacheConfig()
                                .entryTtl(Duration.ofHours(1)))
                .build();
    }
}
Then applied it to appropriate service methods:
@Service
public class ProductService {

    // Other code

    @Cacheable(value = "products", key = "#id")
    public Mono<Product> getProductById(Long id) {
        return repository.findById(id)
                .switchIfEmpty(Mono.error(new ProductNotFoundException(id)));
    }

    @CacheEvict(value = "products", key = "#product.id")
    public Mono<Product> updateProduct(Product product) {
        return repository.save(product);
    }
}
This reduced database load by 70% for read-heavy operations.
Serialization Optimization: The Surprising CPU Saver 💾
Profiling showed that 15% of CPU time was spent in Jackson serialization, so I switched to a more efficient configuration:
@Configuration
public class JacksonConfig {

    @Bean
    public ObjectMapper objectMapper() {
        ObjectMapper mapper = new ObjectMapper();

        // Use the Afterburner module (from jackson-module-afterburner) for faster serialization
        mapper.registerModule(new AfterburnerModule());

        // Only include non-null values
        mapper.setSerializationInclusion(Include.NON_NULL);

        // Disable features we don't need
        mapper.disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS);
        mapper.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);

        return mapper;
    }
}
For our most performance-critical endpoints, I replaced Jackson with Protocol Buffers:
syntax = "proto3";

package com.example.proto;

message ProductResponse {
  int64 id = 1;
  string name = 2;
  string description = 3;
  double price = 4;
  int32 inventory = 5;
}
@RestController
@RequestMapping("/api/products")
public class ProductController {

    // Jackson-based endpoint
    @GetMapping("/{id}")
    public Mono<ResponseEntity<Product>> getProduct(@PathVariable Long id) {
        // Original implementation
    }

    // Protocol Buffers endpoint for high-performance needs
    @GetMapping("/{id}/proto")
    public Mono<ResponseEntity<byte[]>> getProductProto(@PathVariable Long id) {
        return service.getProductById(id)
                .map(product -> ProductResponse.newBuilder()
                        .setId(product.getId())
                        .setName(product.getName())
                        .setDescription(product.getDescription())
                        .setPrice(product.getPrice())
                        .setInventory(product.getInventory())
                        .build()
                        .toByteArray())
                .map(bytes -> ResponseEntity.ok()
                        .contentType(MediaType.APPLICATION_OCTET_STREAM)
                        .body(bytes));
    }
}
This change reduced serialization CPU usage by 80% and decreased response sizes by 30%.
Thread Pool and Connection Tuning: The Configuration Magic 🧰
With WebFlux, we needed to tune Netty’s event loop settings:
spring:
  reactor:
    netty:
      worker:
        count: 16 # Number of worker threads (2x CPU cores)
      connection:
        provider:
          pool:
            max-connections: 10000
            acquire-timeout: 5000
For the parts of our application still using Spring MVC, I tuned the Tomcat connector:
server:
  tomcat:
    threads:
      max: 200
      min-spare: 20
    max-connections: 8192
    accept-count: 100
    connection-timeout: 2000
These settings allowed us to handle more concurrent connections with fewer resources.
Horizontal Scaling with Kubernetes: The Final Push 🚢
To reach our 1M requests/second target, we needed to scale horizontally. I containerized our application and deployed it to Kubernetes.
FROM openjdk:17-slim
COPY target/myapp.jar app.jar
ENV JAVA_OPTS="-XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:+ParallelRefProcEnabled"
ENTRYPOINT exec java $JAVA_OPTS -jar /app.jar
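The autoscaler below targets a Deployment named myapp; a minimal sketch of that Deployment is included here for completeness, with the image reference, labels, and resource requests as assumptions rather than our actual manifest (the CPU request matters because the HPA's utilization target is measured against it):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 5
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: registry.example.com/myapp:latest  # assumed image reference
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "1"       # the HPA's CPU utilization is computed against this request
              memory: 1Gi
            limits:
              cpu: "2"
              memory: 2Gi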
I then configured auto-scaling based on CPU utilization:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 5
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
We also implemented service mesh capabilities with Istio for better traffic management:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: myapp-vs
spec:
  hosts:
    - myapp-service
  http:
    - route:
        - destination:
            host: myapp-service
      retries:
        attempts: 3
        perTryTimeout: 2s
      timeout: 5s
This allowed us to handle traffic spikes efficiently while maintaining resilience.
Measuring the Results: The Proof 📈
After all optimizations, our metrics improved dramatically:
// Final Performance Metrics
Maximum throughput: 1,200,000 requests/second
Average response time: 85ms (was 350ms)
95th percentile response time: 120ms (was 850ms)
CPU utilization during peak: 60-70% (was 85-95%)
Memory usage: 50% of available heap (was 75%)
Database queries: Reduced by 70% thanks to caching
Thread efficiency: 10x improvement with reactive programming
Key Lessons Learned 💡
Performance optimization isn’t about finding one magic bullet; it’s about methodically identifying and addressing bottlenecks across your entire system. With Spring Boot, the capabilities are there; you just need to know which levers to pull.