Making sense of multithreading and concurrency with Java
Disclaimer: This is not an expert article. I'm just sharing my experience with a concept I personally found interesting and genuinely enjoyed. If you notice anything inaccurate or incorrect, please let me know.
Introduction
The first time I was introduced to multithreading was back in college, in operating systems and Java programming courses. I felt like I knew the concept after grasping the theory behind it, until I ran into a real engineering requirement in the software industry myself.
After reading a few resources on the internet, I was still not convinced and had only a shallow understanding. So, in order to feed my curiosity, I decided to experiment with it using a small prototype.
A thread is the smallest unit of execution that the CPU schedules within a process. Consider an example: determining the number of prime numbers in a given range. When you execute this function (a notably compute-heavy task), your system will allot the required resources to it, including memory (RAM), I/O and processing power. Processing power is allotted in terms of threads, which handle the tasks submitted to them. A thread performs the computation, returns the result (if any) and frees up for further work as programmed.
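To make this concrete, here is a minimal sketch (class and method names are mine, not from any later snippet): we hand the compute-heavy prime count to a separate thread, then wait for its result.

```java
public class ThreadDemo {
    // Trial-division prime count over [start, end].
    static int countPrimes(int start, int end) {
        int count = 0;
        for (int n = Math.max(start, 2); n <= end; n++) {
            boolean prime = true;
            for (int d = 2; d * d <= n; d++) {
                if (n % d == 0) { prime = false; break; }
            }
            if (prime) count++;
        }
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = new int[1];
        Thread worker = new Thread(() -> result[0] = countPrimes(1, 100));
        worker.start(); // the OS schedules this thread on an available core
        worker.join();  // block until the worker finishes
        System.out.println("Primes in [1, 100]: " + result[0]); // 25
    }
}
```

Once `join()` returns, the worker thread has finished and is free to be discarded; thread pools (which we'll use below) exist precisely so threads can be reused instead of created per task.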
You have the power to spin up as many threads as you want (in theory), but the actual limitations show up based on the number of CPU cores available. We will talk about this in a bit.
Implementation
We'll create a system which returns the number of primes in a range as its response. Here we have a simple REST API in Spring Boot. We wrap the response in a CompletableFuture and attach a callback, so that the HTTP worker threads are not blocked while the computation runs asynchronously.
@GetMapping("/countPrimes")
public CompletableFuture<ResponseEntity<Integer>> getCountofPrimes(@RequestParam Integer a, @RequestParam Integer b) {
    return service.computePrimesInRange(a, b)
            .thenApply(ResponseEntity::ok);
}
/countPrimes takes in two request parameters, a and b, which state the range within which the application should return the number of prime numbers. Then we have the service layer. We'll use a customExecutor to manage the computation thread pool and supply the tasks to the available threads:
private final Executor customExecutor;

PrototypeService(@Qualifier("customExecutor") Executor executor) {
    this.customExecutor = executor;
}

public CompletableFuture<Integer> computePrimesInRange(int start, int end) {
    return CompletableFuture.supplyAsync(() -> countPrimesInRange(start, end), customExecutor);
}

public Integer countPrimesInRange(int start, int end) {
    int count = 0;
    for (int curr = start; curr <= end; curr++) {
        if (isPrime(curr)) {
            count++;
        }
    }
    return count;
}
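The service calls an isPrime helper that isn't shown above. A simple trial-division version (my sketch, not necessarily the exact implementation used here) would look like:

```java
public class PrimeCheck {
    public static boolean isPrime(int n) {
        if (n < 2) return false;
        if (n % 2 == 0) return n == 2; // even numbers other than 2 are composite
        // Only odd divisors up to sqrt(n) need checking.
        for (int d = 3; (long) d * d <= n; d += 2) {
            if (n % d == 0) return false;
        }
        return true;
    }
}
```

Trial division is O(sqrt(n)) per number, which is exactly why this endpoint makes a good CPU-bound workload for the experiment.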
A simple endpoint is exposed which takes in two values and returns the number of primes in the range. Nothing fancy. Let's run it locally and test out the endpoint.
Let's see the response time:
That's an average response time of 65 ms for a single request to our app. In other words, if your host machine is idle, not handling any other traffic at that moment, and receives one request, it returns the response in about 65 ms.
Now, let's assume the app becomes highly popular and we start seeing traffic of around 100 TPS. What happens then? Do all the requests receive a response within 65 ms? To find out, let's simulate this traffic on our local system. I'm going to use Apache JMeter (a free and open-source performance testing tool) to achieve this.
We are going to start off with 100 hits/s handled by 1 thread and gradually increase the number of threads to see the difference in response time. This experiment is run on a system with 8 cores, i.e., true parallel processing can only be achieved for up to 8 concurrently submitted tasks.
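As a quick aside, you can check how many logical processors the JVM sees on your own machine (on an 8-core system like the one used here this typically reports 8, though machines with hyper-threading may report more logical processors than physical cores):

```java
public class CoreCount {
    public static void main(String[] args) {
        // Number of logical processors visible to the JVM.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("Available cores: " + cores);
    }
}
```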
Our custom executor looks like:
@Bean(name = "customExecutor")
public Executor taskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(1);   // core threads in pool
    executor.setMaxPoolSize(1);    // max threads in pool
    executor.setQueueCapacity(90); // max tasks that can be queued
    executor.setThreadNamePrefix("AsyncThread-");
    executor.initialize();
    return executor;
}
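One thing worth noting about this configuration: with the pool capped at 1 thread and the queue at 90 tasks, any task arriving while one is running and 90 are queued gets rejected (Spring's ThreadPoolTaskExecutor surfaces this as a TaskRejectedException). A small sketch with the plain JDK ThreadPoolExecutor, using a tiny queue of 2 for brevity, shows the same saturation behavior:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class SaturationDemo {
    // Returns true if the fourth task is rejected: 1 running + 2 queued fill the pool.
    static boolean fourthTaskRejected() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(2));
        CountDownLatch release = new CountDownLatch(1);
        // First task is handed straight to the single worker thread and blocks it.
        pool.execute(() -> { try { release.await(); } catch (InterruptedException ignored) {} });
        pool.execute(() -> {}); // queued (1/2)
        pool.execute(() -> {}); // queued (2/2)
        boolean rejected = false;
        try {
            pool.execute(() -> {}); // no worker free, queue full: rejected
        } catch (RejectedExecutionException e) {
            rejected = true;
        }
        release.countDown();
        pool.shutdown();
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("Rejected: " + fourthTaskRejected());
    }
}
```

In our experiment the queue of 90 is what absorbs the burst of 100 hits/s while the configured threads work through it.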
Results
The execution results below show the average, min and max response times for a sample of 400 requests, i.e., four runs at 100 hits/s each. The number of threads in our pool is configured in the custom executor through executor.setCorePoolSize(n) and executor.setMaxPoolSize(n), where n is the total number of threads in the pool.
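A rough back-of-the-envelope model helps explain the numbers (assuming the ~65 ms average service time measured earlier, and ignoring contention and overhead): a single thread can serve only about 15 requests per second, so at 100 hits/s the queue fills almost immediately and the last queued request waits roughly queueCapacity x serviceTime behind it.

```java
public class QueueMath {
    // Worst-case wait for the last queued request, assuming every request
    // takes serviceMs and threads drain the queue in parallel. This is a
    // simplification, not a precise queueing-theory model.
    static double maxQueueWaitSeconds(int threads, int queueCapacity, double serviceMs) {
        return (queueCapacity * serviceMs) / (threads * 1000.0);
    }

    public static void main(String[] args) {
        System.out.printf("1 thread:  ~%.2f s worst-case queue wait%n",
                maxQueueWaitSeconds(1, 90, 65)); // ~5.85 s
        System.out.printf("8 threads: ~%.2f s worst-case queue wait%n",
                maxQueueWaitSeconds(8, 90, 65)); // ~0.73 s
    }
}
```

This is why adding threads up to the core count shrinks the max response time sharply, while the minimum stays near 65 ms: the first request never waits in the queue.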
Key Observations
Hope you enjoyed the read and took away some learning.