SlideShare a Scribd company logo
Threads
Cesarano Antonio
Del Monte Bonaventura
Università degli studi di
Salerno
7th April 2014
Operating Systems II
Agenda
 Introduction
 Threads models
 Multithreading: single-core Vs multicore
 Implementation
 A Case Study
 Conclusions
CPU Trends
Introduction
What’s a Thread?
Memory: Heavy vs Light processes
Introduction
Why should I care about Threads?
 Pro
• Responsiveness
• Resources
sharing
• Economy
• Scalability
Cons
• Hard implementation
• Synchronization
• Critical section,
deadlock, livelock…
Introduction
Thread Models
Two kinds of Threads
User Threads Kernel Threads
Thread Models
User-level Threads
Implemented in software library
 Pthread
 Win32 API
Pro:
• Easy handling
• Fast context switch
• Trasparent to OS
• No new address
space, no need to
change address
space
Cons:
• Do not benefit from
multithreading or
multiprocessing
• Thread blocked
Process blocked
Thread Models
Kernel-level
Threads Executed only in kernel mode, managed
by OS
 Kthreadd children
Pro:
• Resource Aware
• No need to use a
new address space
• Thread blocked
Scheduled
Con:
• Slower then User-
threads
Thread Models
Thread implementation models:
From many to one
From one to one
From many to many
Thread Models
From many to one
 Whole process is blocked if one thread is
blocked
 Useless on multicore architectures
Thread Models
From one to one
 Works fine on multicore architectures
o Many kernel threads = High overhead
Thread Models
From many to many
 Works fine on multicore architectures
 Less overhead then “one to one” model
Multithreading
Multitasking
Single core Symmetric
Multi-Processor
MultiThreading
Multithreading
Multithreading
HyperThreadin
g
Multithreading
 How can We use multithreading
architectures?
Thread
Level
Parallelism
Data
Level
Parallelis
m
Multithreading
Thread Level Parallelism
Multithreading
Data Level Parallelism
Multithreading
Granularity
 Coarse-
grained:
Multithreading
 Context switch on high latency event
 Very fast thread-switching, no threads slow down
 Loss of throughput due to short stalls: pipeline start-
up
Granularity
 Fine-grained
Multithreading
 Context switch on every cycle
 Interleaved execution of multiple threads: it can
hide
both short and long stalls
 Rarely-stalling threads are slowed down
Granularity
Multithreading
Context Switching
Single-core Vs Multi-core
Xthread_ctxtswitc
h:
pusha
movl esp, [eax]
movl edx, esp
popa
ret
CPU
ESP
Thread 1regs
Thread
2
registers
Thread 1 TCB
SP: ....
Thread 2 TCB
SP: ....
Running Ready
Pushing old context
Single-core Vs Multi-core
Xthread_ctxtswitc
h:
pusha
movl esp, [eax]
movl edx, esp
popa
ret
CPU
ESP
Thread 1regs
Thread 2
registers
Thread 1 TCB
SP: ....
Thread 2 TCB
SP: ....
Thread 1
registers
Running Ready
Saving old stack pointer
Single-core Vs Multi-core
Xthread_ctxtswitc
h:
pusha
movl esp, [eax]
movl edx, esp
popa
ret
CPU
ESP
Thread 1regs
Thread 2
registers
Thread 1 TCB
SP: ....
Thread 2 TCB
SP: ....
Thread 1
registers
Running Ready
Changing stack pointer
Single-core Vs Multi-core
Xthread_ctxtswitc
h:
pusha
movl esp, [eax]
movl edx, esp
popa
ret
CPU
ESP
Thread 1regs
Thread 2
registers
Thread 1 TCB
SP: ....
Thread 2 TCB
SP: ....
Thread 1
registers
Ready Running
Popping off thread #2 old
context
Single-core Vs Multi-core
Xthread_ctxtswitc
h:
pusha
movl esp, [eax]
movl edx, esp
popa
ret
CPU
ESP
Thread 2 regs
Thread 1 TCB
SP: ....
Thread 2 TCB
SP: ....
Thread 1
registers
Ready Running
Done: return
Single-core Vs Multi-core
Xthread_ctxtswitc
h:
pusha
movl esp, [eax]
movl edx, esp
popa
ret
CPU
ESP
Thread 2 regs
Thread 1 TCB
SP: ....
Thread 2 TCB
SP: ....
Thread 1
registers
Ready Running
RET pops of the
returning address
and it assigns its
value to PC reg
Problems
 Critical Section:
When a thread A tries to access to a shared
variable simultaneously to a thread B
 Deadlock:
When a process A is waiting for
resource reserved to B, which is
waiting for resource reserved to A
 Race Condition:
The result of an execution depens on
the order of execution of different
threads
More Issues
 fork() and exec() system calls: to duplicate or
to not deplicate all threads?
 Signal handling in multithreading application.
 Scheduler activation: kernel threads have to
communicate with user thread, i.e.: upcalls
 Thread cancellation: termination a thread
before it has completed.
• Deferred cancellation
• Asynchronous cancellation: immediate
Designing a thread library
 Multiprocessor support
 Virtual processor
 RealTime support
 Memory Management
 Provide functions library rather than a module
 Portability
 No Kernel mode
Implementation
Posix Thread
 Posix standard for threads: IEEE POSIX
1003.1c
 Library made up of a set of types and
procedure calls written in C, for UNIX
platform
 It supports:
a) Thread management
b) Mutexes
c) Condition Variables
d) Synchronization between threads using
R/W locks and barries
Implementation
Thread Pool
 Different threads available in a pool
 When a task arrives, it gets assigned to a
free thread
 Once a thread completes its service, it
returns in the pool and awaits another work.
Implementation
PThred Lib base operations
 pthread_create()- create and launch a new thread
 pthread_exit()- destroy a running thread
 pthread_attr_init()- set thread attributes to their
default values
 pthread_join()- the caller thread blocks and waits
for another thread to finish
 pthread_self()- it retrieves the id assigned to the
calling thread
Implementation Example
N x N Matrix Multiplication
Implementation Example
A simple algorithm
for (int i = 0; i < MATRIX_ELEMENTS; i += MATRIX_LINE)
{
for (int j = 0; j < MATRIX_LINE; ++j)
{
float tmp = 0;
for (int k = 0; k < MATRIX_LINE; k++)
{
tmp +=
A[i + k] * B[(MATRIX_LINE * k) + j];
}
C[i + j] = tmp;
}
}
Implementation Example
SIMD Approach
transpose(B);
for (int i = 0; i < MATRIX_LINE; i++) {
for (int j = 0; j < MATRIX_LINE; j++){
__m128 tmp = _mm_setzero_ps();
for (int k = 0; k < MATRIX_LINE; k += 4){
tmp = _mm_add_ps(tmp,
_mm_mul_ps(_mm_load_ps(&A[MATRIX_LINE * i + k]),
_mm_load_ps(&B[MATRIX_LINE * j +
k])));
}
tmp = _mm_hadd_ps(tmp, tmp);
tmp = _mm_hadd_ps(tmp, tmp);
_mm_store_ss(&C[MATRIX_LINE * i + j], tmp);
}
}
transpose(B);
Implementation Example
TLP Approach
struct thread_params
{
pthread_t id;
float* a;
float* b;
float* c;
int low;
int high;
bool flag;
};
………
int main(int argc, char** argv){
int
ncores=sysconf(_SC_NPROCESSORS_ONLN);
int stride = MATRIX_LINE / ncores;
for (int j = 0; j < ncores; ++j){
pthread_attr_t attr;
pthread_attr_init(&attr);
thread_params* par = new
thread_params;
par->low=j*stride; par-
>high=j*stride+stride;
par->a = A; par->b = B; par->c = C;
pthread_create(&(par->id), &attr, runner,
par);
// set cpu affinity for thread
// sched_setaffinity
}
Implementation Example
TLP Approach
int main(int argc, char**
argv){
….
int completed = 0;
while (true) {
if (completed >= ncores)
break;
completed = 0;
usleep(100000);
for (int j=0; j<ncores;
++j){
if (p[j]->flag)
completed++;
}
}
….
}
void runner(void* p){
thread_params* params = (thread_params*)
p;
int low = params->low; // unpack others
values
for (int i = low; i < high; i++) {
for (int j = 0; j < MATRIX_LINE; j++)
{
float tmp = 0;
for (int k = 0; k < MATRIX_LINE; k++){
tmp +=
A[MATRIX_LINE * i + k] *
B[(MATRIX_LINE * k) + j];
}
C[i + j] = tmp;
}
}
Implementation Performance
0
500
1000
1500
2000
2500
3000
3500
4000
4500
5000
5500
6000
6500
7000
7500
8000
8500
9000
Simple SIMD TLP SIMD&TLP
8 cores
4 cores
A case study
Using threads in Interactive Systems
• Research by XEROX PARC Palo Alto
• Analysis of two large interactive system: Cedar and GVX
• Goals:
i. Identifing paradigms of thread usage
ii. architecture analysis of thread-based environment
iii. pointing out the most important properties of an
interactive system
A case study
Thread model
 Mesa language
 Multiple, lightweight, pre-emptively scheduled threads in shared
address space, threads may have different priorities
 FORK, JOIN, DETACH
 Support to conditional variables and monitors: critical sections and
mutexes
 Finer grain for locks: directly on data structures
A case study
Three types of thread
1. Eternal: run forever, waiting for cond. var.
2. Worker: perform some computation
3. Transient: short life threads, forked off by long-lived
threads
A case study
Dynamic analysis
0
5
10
15
20
25
30
35
40
45
Cedar GVX
# threads idle
Fork rate max
# threads max
Switching intervals: (130/sec, 270/sec) vs. (33/sec,
60/sec)
A case study
Paradigms of thread usage
 Defer Work: forking for reducing latency
 print documents
 Pumps or slack processes: components of pipeline
 Preprocessing user input
 Request to X server
 Sleepers and one-shots: wait for some event and then
execute
 Blink cursor
 Double click
 Deadlock avoiders: avoid violating lock order constraint
 Windows repainting
A case study
Paradigms of thread usage
 Task rejuvenation: recover a service from a bad state,
either forking a new thread or reporting the error
o Avoid fork overhead in input event dispatcher of
Cedar
 Serializers: thread processing a queue
o A window system with input events from many
sources
 Concurrency exploiters: for using multiple processors
 Encapsulated forks: a mix of previous paradigms, code
modularity
A case study
Common Mistakes and Issues
o Timeout hacks for compensate missing NOTIFY
o IF instead of WHILE for monitors
o Handling resources consumption
o Slack processes may need hack YieldButNotToMe
o Using single-thread designed libraries in multi-
threading environment: Xlib and XI
o Spurious lock
A case study
Xerox scientists’ conclusions
 Interesting difficulties were discovered both in
use and implementation of multi-threading
environment
 Starting point for new studies
Conclusion
Ad

More Related Content

What's hot (20)

process and thread.pptx
process and thread.pptxprocess and thread.pptx
process and thread.pptx
HamzaxTv
 
Multithreading models.ppt
Multithreading models.pptMultithreading models.ppt
Multithreading models.ppt
Luis Goldster
 
Semaphores
SemaphoresSemaphores
Semaphores
Mohd Arif
 
Multi threading models
Multi threading modelsMulti threading models
Multi threading models
DarakhshanNayyab
 
Association agggregation and composition
Association agggregation and compositionAssociation agggregation and composition
Association agggregation and composition
baabtra.com - No. 1 supplier of quality freshers
 
contiguous memory allocation.pptx
contiguous memory allocation.pptxcontiguous memory allocation.pptx
contiguous memory allocation.pptx
Rajapriya82
 
Chap2 2 1
Chap2 2 1Chap2 2 1
Chap2 2 1
Hemo Chella
 
Process of operating system
Process of operating systemProcess of operating system
Process of operating system
International Center for Chemical & Biological Sciences
 
Java Networking
Java NetworkingJava Networking
Java Networking
Sunil OS
 
Deadlock Prevention
Deadlock PreventionDeadlock Prevention
Deadlock Prevention
prachi mewara
 
Deadlock
DeadlockDeadlock
Deadlock
Rajandeep Gill
 
Contiguous Memory Allocation.ppt
Contiguous Memory Allocation.pptContiguous Memory Allocation.ppt
Contiguous Memory Allocation.ppt
infomerlin
 
Multithreading
Multithreading Multithreading
Multithreading
WafaQKhan
 
Memory Management in Amoeba
Memory Management in AmoebaMemory Management in Amoeba
Memory Management in Amoeba
Ramesh Adhikari
 
Process synchronization
Process synchronizationProcess synchronization
Process synchronization
Syed Hassan Ali
 
Demand paging
Demand pagingDemand paging
Demand paging
Trinity Dwarka
 
Java Thread Synchronization
Java Thread SynchronizationJava Thread Synchronization
Java Thread Synchronization
Benj Del Mundo
 
Strings in Java
Strings in JavaStrings in Java
Strings in Java
Abhilash Nair
 
Chapter 12 - Mass Storage Systems
Chapter 12 - Mass Storage SystemsChapter 12 - Mass Storage Systems
Chapter 12 - Mass Storage Systems
Wayne Jones Jnr
 
Paging and segmentation
Paging and segmentationPaging and segmentation
Paging and segmentation
Piyush Rochwani
 

Similar to Threads and multi threading (20)

Intro To .Net Threads
Intro To .Net ThreadsIntro To .Net Threads
Intro To .Net Threads
rchakra
 
Introto netthreads-090906214344-phpapp01
Introto netthreads-090906214344-phpapp01Introto netthreads-090906214344-phpapp01
Introto netthreads-090906214344-phpapp01
Aravindharamanan S
 
Hs java open_party
Hs java open_partyHs java open_party
Hs java open_party
Open Party
 
Threads in Operating systems and concepts
Threads in Operating systems and conceptsThreads in Operating systems and concepts
Threads in Operating systems and concepts
RamaSubramanian79
 
Medical Image Processing Strategies for multi-core CPUs
Medical Image Processing Strategies for multi-core CPUsMedical Image Processing Strategies for multi-core CPUs
Medical Image Processing Strategies for multi-core CPUs
Daniel Blezek
 
WEEK07operatingsystemdepartmentofsoftwareengineering.pptx
WEEK07operatingsystemdepartmentofsoftwareengineering.pptxWEEK07operatingsystemdepartmentofsoftwareengineering.pptx
WEEK07operatingsystemdepartmentofsoftwareengineering.pptx
babayaga920391
 
MULTI-THREADING in python appalication.pptx
MULTI-THREADING in python appalication.pptxMULTI-THREADING in python appalication.pptx
MULTI-THREADING in python appalication.pptx
SaiDhanushM
 
CS345 09 - Ch04 Threads operating system1.pptx
CS345 09 - Ch04 Threads operating system1.pptxCS345 09 - Ch04 Threads operating system1.pptx
CS345 09 - Ch04 Threads operating system1.pptx
RichaAgnihotri13
 
Threads
ThreadsThreads
Threads
Sameer Shaik
 
Os
OsOs
Os
DeepaR42
 
Chapter 6 os
Chapter 6 osChapter 6 os
Chapter 6 os
AbDul ThaYyal
 
Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)
Andriy Berestovskyy
 
A22 Introduction to DTrace by Kyle Hailey
A22 Introduction to DTrace by Kyle HaileyA22 Introduction to DTrace by Kyle Hailey
A22 Introduction to DTrace by Kyle Hailey
Insight Technology, Inc.
 
Operating Systems 1 (7/12) - Threads
Operating Systems 1 (7/12) - ThreadsOperating Systems 1 (7/12) - Threads
Operating Systems 1 (7/12) - Threads
Peter Tröger
 
CH04.pdf
CH04.pdfCH04.pdf
CH04.pdf
ImranKhan880955
 
Bglrsession4
Bglrsession4Bglrsession4
Bglrsession4
Nagasuri Bala Venkateswarlu
 
Towards an Integration of the Actor Model in an FRP Language for Small-Scale ...
Towards an Integration of the Actor Model in an FRP Language for Small-Scale ...Towards an Integration of the Actor Model in an FRP Language for Small-Scale ...
Towards an Integration of the Actor Model in an FRP Language for Small-Scale ...
Takuo Watanabe
 
Sucet os module_2_notes
Sucet os module_2_notesSucet os module_2_notes
Sucet os module_2_notes
SRINIVASUNIVERSITYEN
 
Fast switching of threads between cores - Advanced Operating Systems
Fast switching of threads between cores - Advanced Operating SystemsFast switching of threads between cores - Advanced Operating Systems
Fast switching of threads between cores - Advanced Operating Systems
Ruhaim Izmeth
 
Data race
Data raceData race
Data race
James Wong
 
Intro To .Net Threads
Intro To .Net ThreadsIntro To .Net Threads
Intro To .Net Threads
rchakra
 
Introto netthreads-090906214344-phpapp01
Introto netthreads-090906214344-phpapp01Introto netthreads-090906214344-phpapp01
Introto netthreads-090906214344-phpapp01
Aravindharamanan S
 
Hs java open_party
Hs java open_partyHs java open_party
Hs java open_party
Open Party
 
Threads in Operating systems and concepts
Threads in Operating systems and conceptsThreads in Operating systems and concepts
Threads in Operating systems and concepts
RamaSubramanian79
 
Medical Image Processing Strategies for multi-core CPUs
Medical Image Processing Strategies for multi-core CPUsMedical Image Processing Strategies for multi-core CPUs
Medical Image Processing Strategies for multi-core CPUs
Daniel Blezek
 
WEEK07operatingsystemdepartmentofsoftwareengineering.pptx
WEEK07operatingsystemdepartmentofsoftwareengineering.pptxWEEK07operatingsystemdepartmentofsoftwareengineering.pptx
WEEK07operatingsystemdepartmentofsoftwareengineering.pptx
babayaga920391
 
MULTI-THREADING in python appalication.pptx
MULTI-THREADING in python appalication.pptxMULTI-THREADING in python appalication.pptx
MULTI-THREADING in python appalication.pptx
SaiDhanushM
 
CS345 09 - Ch04 Threads operating system1.pptx
CS345 09 - Ch04 Threads operating system1.pptxCS345 09 - Ch04 Threads operating system1.pptx
CS345 09 - Ch04 Threads operating system1.pptx
RichaAgnihotri13
 
Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)
Andriy Berestovskyy
 
Operating Systems 1 (7/12) - Threads
Operating Systems 1 (7/12) - ThreadsOperating Systems 1 (7/12) - Threads
Operating Systems 1 (7/12) - Threads
Peter Tröger
 
Towards an Integration of the Actor Model in an FRP Language for Small-Scale ...
Towards an Integration of the Actor Model in an FRP Language for Small-Scale ...Towards an Integration of the Actor Model in an FRP Language for Small-Scale ...
Towards an Integration of the Actor Model in an FRP Language for Small-Scale ...
Takuo Watanabe
 
Fast switching of threads between cores - Advanced Operating Systems
Fast switching of threads between cores - Advanced Operating SystemsFast switching of threads between cores - Advanced Operating Systems
Fast switching of threads between cores - Advanced Operating Systems
Ruhaim Izmeth
 
Ad

More from Antonio Cesarano (8)

Inspire JSON Merger
Inspire JSON MergerInspire JSON Merger
Inspire JSON Merger
Antonio Cesarano
 
Erasmus Traineeship Report @ RedHat
Erasmus Traineeship Report @ RedHatErasmus Traineeship Report @ RedHat
Erasmus Traineeship Report @ RedHat
Antonio Cesarano
 
Lost John - Mobile Game Development
Lost John - Mobile Game DevelopmentLost John - Mobile Game Development
Lost John - Mobile Game Development
Antonio Cesarano
 
Pitch ItLosers - TechGarage 2014
Pitch ItLosers - TechGarage 2014Pitch ItLosers - TechGarage 2014
Pitch ItLosers - TechGarage 2014
Antonio Cesarano
 
Project Proposal - Project Management
Project Proposal - Project ManagementProject Proposal - Project Management
Project Proposal - Project Management
Antonio Cesarano
 
Project management - Final Report
Project management - Final ReportProject management - Final Report
Project management - Final Report
Antonio Cesarano
 
Tech Talk Project Work
Tech Talk Project WorkTech Talk Project Work
Tech Talk Project Work
Antonio Cesarano
 
Cluster based storage - Nasd and Google file system - advanced operating syst...
Cluster based storage - Nasd and Google file system - advanced operating syst...Cluster based storage - Nasd and Google file system - advanced operating syst...
Cluster based storage - Nasd and Google file system - advanced operating syst...
Antonio Cesarano
 
Erasmus Traineeship Report @ RedHat
Erasmus Traineeship Report @ RedHatErasmus Traineeship Report @ RedHat
Erasmus Traineeship Report @ RedHat
Antonio Cesarano
 
Lost John - Mobile Game Development
Lost John - Mobile Game DevelopmentLost John - Mobile Game Development
Lost John - Mobile Game Development
Antonio Cesarano
 
Pitch ItLosers - TechGarage 2014
Pitch ItLosers - TechGarage 2014Pitch ItLosers - TechGarage 2014
Pitch ItLosers - TechGarage 2014
Antonio Cesarano
 
Project Proposal - Project Management
Project Proposal - Project ManagementProject Proposal - Project Management
Project Proposal - Project Management
Antonio Cesarano
 
Project management - Final Report
Project management - Final ReportProject management - Final Report
Project management - Final Report
Antonio Cesarano
 
Cluster based storage - Nasd and Google file system - advanced operating syst...
Cluster based storage - Nasd and Google file system - advanced operating syst...Cluster based storage - Nasd and Google file system - advanced operating syst...
Cluster based storage - Nasd and Google file system - advanced operating syst...
Antonio Cesarano
 
Ad

Recently uploaded (20)

Do not let staffing shortages and limited fiscal view hamper your cause
Do not let staffing shortages and limited fiscal view hamper your causeDo not let staffing shortages and limited fiscal view hamper your cause
Do not let staffing shortages and limited fiscal view hamper your cause
Fexle Services Pvt. Ltd.
 
Top 12 Most Useful AngularJS Development Tools to Use in 2025
Top 12 Most Useful AngularJS Development Tools to Use in 2025Top 12 Most Useful AngularJS Development Tools to Use in 2025
Top 12 Most Useful AngularJS Development Tools to Use in 2025
GrapesTech Solutions
 
wAIred_LearnWithOutAI_JCON_14052025.pptx
wAIred_LearnWithOutAI_JCON_14052025.pptxwAIred_LearnWithOutAI_JCON_14052025.pptx
wAIred_LearnWithOutAI_JCON_14052025.pptx
SimonedeGijt
 
Programs as Values - Write code and don't get lost
Programs as Values - Write code and don't get lostPrograms as Values - Write code and don't get lost
Programs as Values - Write code and don't get lost
Pierangelo Cecchetto
 
GC Tuning: A Masterpiece in Performance Engineering
GC Tuning: A Masterpiece in Performance EngineeringGC Tuning: A Masterpiece in Performance Engineering
GC Tuning: A Masterpiece in Performance Engineering
Tier1 app
 
Memory Management and Leaks in Postgres from pgext.day 2025
Memory Management and Leaks in Postgres from pgext.day 2025Memory Management and Leaks in Postgres from pgext.day 2025
Memory Management and Leaks in Postgres from pgext.day 2025
Phil Eaton
 
How to Install and Activate ListGrabber Plugin
How to Install and Activate ListGrabber PluginHow to Install and Activate ListGrabber Plugin
How to Install and Activate ListGrabber Plugin
eGrabber
 
How I solved production issues with OpenTelemetry
How I solved production issues with OpenTelemetryHow I solved production issues with OpenTelemetry
How I solved production issues with OpenTelemetry
Cees Bos
 
Solar-wind hybrid engery a system sustainable power
Solar-wind  hybrid engery a system sustainable powerSolar-wind  hybrid engery a system sustainable power
Solar-wind hybrid engery a system sustainable power
bhoomigowda12345
 
UI/UX Design & Development and Servicess
UI/UX Design & Development and ServicessUI/UX Design & Development and Servicess
UI/UX Design & Development and Servicess
marketing810348
 
Troubleshooting JVM Outages – 3 Fortune 500 case studies
Troubleshooting JVM Outages – 3 Fortune 500 case studiesTroubleshooting JVM Outages – 3 Fortune 500 case studies
Troubleshooting JVM Outages – 3 Fortune 500 case studies
Tier1 app
 
The-Future-is-Hybrid-Exploring-Azure’s-Role-in-Multi-Cloud-Strategies.pptx
The-Future-is-Hybrid-Exploring-Azure’s-Role-in-Multi-Cloud-Strategies.pptxThe-Future-is-Hybrid-Exploring-Azure’s-Role-in-Multi-Cloud-Strategies.pptx
The-Future-is-Hybrid-Exploring-Azure’s-Role-in-Multi-Cloud-Strategies.pptx
james brownuae
 
Legacy Code Nightmares , Hellscapes, and Lessons Learned.pdf
Legacy Code Nightmares , Hellscapes, and Lessons Learned.pdfLegacy Code Nightmares , Hellscapes, and Lessons Learned.pdf
Legacy Code Nightmares , Hellscapes, and Lessons Learned.pdf
Ortus Solutions, Corp
 
Reinventing Microservices Efficiency and Innovation with Single-Runtime
Reinventing Microservices Efficiency and Innovation with Single-RuntimeReinventing Microservices Efficiency and Innovation with Single-Runtime
Reinventing Microservices Efficiency and Innovation with Single-Runtime
Natan Silnitsky
 
NYC ACE 08-May-2025-Combined Presentation.pdf
NYC ACE 08-May-2025-Combined Presentation.pdfNYC ACE 08-May-2025-Combined Presentation.pdf
NYC ACE 08-May-2025-Combined Presentation.pdf
AUGNYC
 
Catching Wire; An introduction to CBWire 4
Catching Wire; An introduction to CBWire 4Catching Wire; An introduction to CBWire 4
Catching Wire; An introduction to CBWire 4
Ortus Solutions, Corp
 
iTop VPN With Crack Lifetime Activation Key
iTop VPN With Crack Lifetime Activation KeyiTop VPN With Crack Lifetime Activation Key
iTop VPN With Crack Lifetime Activation Key
raheemk1122g
 
S3 + AWS Athena how to integrate s3 aws plus athena
S3 + AWS Athena how to integrate s3 aws plus athenaS3 + AWS Athena how to integrate s3 aws plus athena
S3 + AWS Athena how to integrate s3 aws plus athena
aianand98
 
Passive House Canada Conference 2025 Presentation [Final]_v4.ppt
Passive House Canada Conference 2025 Presentation [Final]_v4.pptPassive House Canada Conference 2025 Presentation [Final]_v4.ppt
Passive House Canada Conference 2025 Presentation [Final]_v4.ppt
IES VE
 
Quasar Framework Introduction for C++ develpoers
Quasar Framework Introduction for C++ develpoersQuasar Framework Introduction for C++ develpoers
Quasar Framework Introduction for C++ develpoers
sadadkhah
 
Do not let staffing shortages and limited fiscal view hamper your cause
Do not let staffing shortages and limited fiscal view hamper your causeDo not let staffing shortages and limited fiscal view hamper your cause
Do not let staffing shortages and limited fiscal view hamper your cause
Fexle Services Pvt. Ltd.
 
Top 12 Most Useful AngularJS Development Tools to Use in 2025
Top 12 Most Useful AngularJS Development Tools to Use in 2025Top 12 Most Useful AngularJS Development Tools to Use in 2025
Top 12 Most Useful AngularJS Development Tools to Use in 2025
GrapesTech Solutions
 
wAIred_LearnWithOutAI_JCON_14052025.pptx
wAIred_LearnWithOutAI_JCON_14052025.pptxwAIred_LearnWithOutAI_JCON_14052025.pptx
wAIred_LearnWithOutAI_JCON_14052025.pptx
SimonedeGijt
 
Programs as Values - Write code and don't get lost
Programs as Values - Write code and don't get lostPrograms as Values - Write code and don't get lost
Programs as Values - Write code and don't get lost
Pierangelo Cecchetto
 
GC Tuning: A Masterpiece in Performance Engineering
GC Tuning: A Masterpiece in Performance EngineeringGC Tuning: A Masterpiece in Performance Engineering
GC Tuning: A Masterpiece in Performance Engineering
Tier1 app
 
Memory Management and Leaks in Postgres from pgext.day 2025
Memory Management and Leaks in Postgres from pgext.day 2025Memory Management and Leaks in Postgres from pgext.day 2025
Memory Management and Leaks in Postgres from pgext.day 2025
Phil Eaton
 
How to Install and Activate ListGrabber Plugin
How to Install and Activate ListGrabber PluginHow to Install and Activate ListGrabber Plugin
How to Install and Activate ListGrabber Plugin
eGrabber
 
How I solved production issues with OpenTelemetry
How I solved production issues with OpenTelemetryHow I solved production issues with OpenTelemetry
How I solved production issues with OpenTelemetry
Cees Bos
 
Solar-wind hybrid engery a system sustainable power
Solar-wind  hybrid engery a system sustainable powerSolar-wind  hybrid engery a system sustainable power
Solar-wind hybrid engery a system sustainable power
bhoomigowda12345
 
UI/UX Design & Development and Servicess
UI/UX Design & Development and ServicessUI/UX Design & Development and Servicess
UI/UX Design & Development and Servicess
marketing810348
 
Troubleshooting JVM Outages – 3 Fortune 500 case studies
Troubleshooting JVM Outages – 3 Fortune 500 case studiesTroubleshooting JVM Outages – 3 Fortune 500 case studies
Troubleshooting JVM Outages – 3 Fortune 500 case studies
Tier1 app
 
The-Future-is-Hybrid-Exploring-Azure’s-Role-in-Multi-Cloud-Strategies.pptx
The-Future-is-Hybrid-Exploring-Azure’s-Role-in-Multi-Cloud-Strategies.pptxThe-Future-is-Hybrid-Exploring-Azure’s-Role-in-Multi-Cloud-Strategies.pptx
The-Future-is-Hybrid-Exploring-Azure’s-Role-in-Multi-Cloud-Strategies.pptx
james brownuae
 
Legacy Code Nightmares , Hellscapes, and Lessons Learned.pdf
Legacy Code Nightmares , Hellscapes, and Lessons Learned.pdfLegacy Code Nightmares , Hellscapes, and Lessons Learned.pdf
Legacy Code Nightmares , Hellscapes, and Lessons Learned.pdf
Ortus Solutions, Corp
 
Reinventing Microservices Efficiency and Innovation with Single-Runtime
Reinventing Microservices Efficiency and Innovation with Single-RuntimeReinventing Microservices Efficiency and Innovation with Single-Runtime
Reinventing Microservices Efficiency and Innovation with Single-Runtime
Natan Silnitsky
 
NYC ACE 08-May-2025-Combined Presentation.pdf
NYC ACE 08-May-2025-Combined Presentation.pdfNYC ACE 08-May-2025-Combined Presentation.pdf
NYC ACE 08-May-2025-Combined Presentation.pdf
AUGNYC
 
Catching Wire; An introduction to CBWire 4
Catching Wire; An introduction to CBWire 4Catching Wire; An introduction to CBWire 4
Catching Wire; An introduction to CBWire 4
Ortus Solutions, Corp
 
iTop VPN With Crack Lifetime Activation Key
iTop VPN With Crack Lifetime Activation KeyiTop VPN With Crack Lifetime Activation Key
iTop VPN With Crack Lifetime Activation Key
raheemk1122g
 
S3 + AWS Athena how to integrate s3 aws plus athena
S3 + AWS Athena how to integrate s3 aws plus athenaS3 + AWS Athena how to integrate s3 aws plus athena
S3 + AWS Athena how to integrate s3 aws plus athena
aianand98
 
Passive House Canada Conference 2025 Presentation [Final]_v4.ppt
Passive House Canada Conference 2025 Presentation [Final]_v4.pptPassive House Canada Conference 2025 Presentation [Final]_v4.ppt
Passive House Canada Conference 2025 Presentation [Final]_v4.ppt
IES VE
 
Quasar Framework Introduction for C++ develpoers
Quasar Framework Introduction for C++ develpoersQuasar Framework Introduction for C++ develpoers
Quasar Framework Introduction for C++ develpoers
sadadkhah
 

Threads and multi threading

  • 1. Threads Cesarano Antonio Del Monte Bonaventura Università degli studi di Salerno 7th April 2014 Operating Systems II
  • 2. Agenda  Introduction  Threads models  Multithreading: single-core Vs multicore  Implementation  A Case Study  Conclusions
  • 5. Memory: Heavy vs Light processes Introduction
  • 6. Why should I care about Threads?  Pro • Responsiveness • Resources sharing • Economy • Scalability Cons • Hard implementation • Synchronization • Critical section, deadlock, livelock… Introduction
  • 7. Thread Models Two kinds of Threads User Threads Kernel Threads
  • 8. Thread Models User-level Threads Implemented in software library  Pthread  Win32 API Pro: • Easy handling • Fast context switch • Trasparent to OS • No new address space, no need to change address space Cons: • Do not benefit from multithreading or multiprocessing • Thread blocked Process blocked
  • 9. Thread Models Kernel-level Threads Executed only in kernel mode, managed by OS  Kthreadd children Pro: • Resource Aware • No need to use a new address space • Thread blocked Scheduled Con: • Slower then User- threads
  • 10. Thread Models Thread implementation models: From many to one From one to one From many to many
  • 11. Thread Models From many to one  Whole process is blocked if one thread is blocked  Useless on multicore architectures
  • 12. Thread Models From one to one  Works fine on multicore architectures o Many kernel threads = High overhead
  • 13. Thread Models From many to many  Works fine on multicore architectures  Less overhead then “one to one” model
  • 18.  How can We use multithreading architectures? Thread Level Parallelism Data Level Parallelis m Multithreading
  • 21. Granularity  Coarse- grained: Multithreading  Context switch on high latency event  Very fast thread-switching, no threads slow down  Loss of throughput due to short stalls: pipeline start- up
  • 22. Granularity  Fine-grained Multithreading  Context switch on every cycle  Interleaved execution of multiple threads: it can hide both short and long stalls  Rarely-stalling threads are slowed down
  • 24. Context Switching Single-core Vs Multi-core Xthread_ctxtswitc h: pusha movl esp, [eax] movl edx, esp popa ret CPU ESP Thread 1regs Thread 2 registers Thread 1 TCB SP: .... Thread 2 TCB SP: .... Running Ready
  • 25. Pushing old context Single-core Vs Multi-core Xthread_ctxtswitc h: pusha movl esp, [eax] movl edx, esp popa ret CPU ESP Thread 1regs Thread 2 registers Thread 1 TCB SP: .... Thread 2 TCB SP: .... Thread 1 registers Running Ready
  • 26. Saving old stack pointer Single-core Vs Multi-core Xthread_ctxtswitc h: pusha movl esp, [eax] movl edx, esp popa ret CPU ESP Thread 1regs Thread 2 registers Thread 1 TCB SP: .... Thread 2 TCB SP: .... Thread 1 registers Running Ready
  • 27. Changing stack pointer Single-core Vs Multi-core Xthread_ctxtswitc h: pusha movl esp, [eax] movl edx, esp popa ret CPU ESP Thread 1regs Thread 2 registers Thread 1 TCB SP: .... Thread 2 TCB SP: .... Thread 1 registers Ready Running
  • 28. Popping off thread #2 old context Single-core Vs Multi-core Xthread_ctxtswitc h: pusha movl esp, [eax] movl edx, esp popa ret CPU ESP Thread 2 regs Thread 1 TCB SP: .... Thread 2 TCB SP: .... Thread 1 registers Ready Running
  • 29. Done: return Single-core Vs Multi-core Xthread_ctxtswitc h: pusha movl esp, [eax] movl edx, esp popa ret CPU ESP Thread 2 regs Thread 1 TCB SP: .... Thread 2 TCB SP: .... Thread 1 registers Ready Running RET pops of the returning address and it assigns its value to PC reg
  • 30. Problems  Critical Section: When a thread A tries to access to a shared variable simultaneously to a thread B  Deadlock: When a process A is waiting for resource reserved to B, which is waiting for resource reserved to A  Race Condition: The result of an execution depens on the order of execution of different threads
  • 31. More Issues  fork() and exec() system calls: to duplicate or to not deplicate all threads?  Signal handling in multithreading application.  Scheduler activation: kernel threads have to communicate with user thread, i.e.: upcalls  Thread cancellation: termination a thread before it has completed. • Deferred cancellation • Asynchronous cancellation: immediate
  • 32. Designing a thread library  Multiprocessor support  Virtual processor  RealTime support  Memory Management  Provide functions library rather than a module  Portability  No Kernel mode
  • 33. Implementation Posix Thread  Posix standard for threads: IEEE POSIX 1003.1c  Library made up of a set of types and procedure calls written in C, for UNIX platform  It supports: a) Thread management b) Mutexes c) Condition Variables d) Synchronization between threads using R/W locks and barries
  • 34. Implementation Thread Pool  Different threads available in a pool  When a task arrives, it gets assigned to a free thread  Once a thread completes its service, it returns in the pool and awaits another work.
  • 35. Implementation PThred Lib base operations  pthread_create()- create and launch a new thread  pthread_exit()- destroy a running thread  pthread_attr_init()- set thread attributes to their default values  pthread_join()- the caller thread blocks and waits for another thread to finish  pthread_self()- it retrieves the id assigned to the calling thread
  • 36. Implementation Example N x N Matrix Multiplication
  • 37. Implementation Example A simple algorithm for (int i = 0; i < MATRIX_ELEMENTS; i += MATRIX_LINE) { for (int j = 0; j < MATRIX_LINE; ++j) { float tmp = 0; for (int k = 0; k < MATRIX_LINE; k++) { tmp += A[i + k] * B[(MATRIX_LINE * k) + j]; } C[i + j] = tmp; } }
  • 38. Implementation Example SIMD Approach transpose(B); for (int i = 0; i < MATRIX_LINE; i++) { for (int j = 0; j < MATRIX_LINE; j++){ __m128 tmp = _mm_setzero_ps(); for (int k = 0; k < MATRIX_LINE; k += 4){ tmp = _mm_add_ps(tmp, _mm_mul_ps(_mm_load_ps(&A[MATRIX_LINE * i + k]), _mm_load_ps(&B[MATRIX_LINE * j + k]))); } tmp = _mm_hadd_ps(tmp, tmp); tmp = _mm_hadd_ps(tmp, tmp); _mm_store_ss(&C[MATRIX_LINE * i + j], tmp); } } transpose(B);
  • 39. Implementation Example TLP Approach struct thread_params { pthread_t id; float* a; float* b; float* c; int low; int high; bool flag; }; ……… int main(int argc, char** argv){ int ncores=sysconf(_SC_NPROCESSORS_ONLN); int stride = MATRIX_LINE / ncores; for (int j = 0; j < ncores; ++j){ pthread_attr_t attr; pthread_attr_init(&attr); thread_params* par = new thread_params; par->low=j*stride; par- >high=j*stride+stride; par->a = A; par->b = B; par->c = C; pthread_create(&(par->id), &attr, runner, par); // set cpu affinity for thread // sched_setaffinity }
  • 40. Implementation Example TLP Approach int main(int argc, char** argv){ …. int completed = 0; while (true) { if (completed >= ncores) break; completed = 0; usleep(100000); for (int j=0; j<ncores; ++j){ if (p[j]->flag) completed++; } } …. } void runner(void* p){ thread_params* params = (thread_params*) p; int low = params->low; // unpack others values for (int i = low; i < high; i++) { for (int j = 0; j < MATRIX_LINE; j++) { float tmp = 0; for (int k = 0; k < MATRIX_LINE; k++){ tmp += A[MATRIX_LINE * i + k] * B[(MATRIX_LINE * k) + j]; } C[i + j] = tmp; } }
  • 42. A case study Using threads in Interactive Systems • Research by XEROX PARC Palo Alto • Analysis of two large interactive system: Cedar and GVX • Goals: i. Identifing paradigms of thread usage ii. architecture analysis of thread-based environment iii. pointing out the most important properties of an interactive system
  • 43. A case study Thread model  Mesa language  Multiple, lightweight, pre-emptively scheduled threads in shared address space, threads may have different priorities  FORK, JOIN, DETACH  Support to conditional variables and monitors: critical sections and mutexes  Finer grain for locks: directly on data structures
  • 44. A case study Three types of thread 1. Eternal: run forever, waiting for cond. var. 2. Worker: perform some computation 3. Transient: short life threads, forked off by long-lived threads
  • 45. A case study Dynamic analysis 0 5 10 15 20 25 30 35 40 45 Cedar GVX # threads idle Fork rate max # threads max Switching intervals: (130/sec, 270/sec) vs. (33/sec, 60/sec)
  • 46. A case study Paradigms of thread usage  Defer Work: forking for reducing latency  print documents  Pumps or slack processes: components of pipeline  Preprocessing user input  Request to X server  Sleepers and one-shots: wait for some event and then execute  Blink cursor  Double click  Deadlock avoiders: avoid violating lock order constraint  Windows repainting
  • 47. A case study Paradigms of thread usage  Task rejuvenation: recover a service from a bad state, either forking a new thread or reporting the error o Avoid fork overhead in input event dispatcher of Cedar  Serializers: thread processing a queue o A window system with input events from many sources  Concurrency exploiters: for using multiple processors  Encapsulated forks: a mix of previous paradigms, code modularity
  • 48. A case study Common Mistakes and Issues o Timeout hacks for compensate missing NOTIFY o IF instead of WHILE for monitors o Handling resources consumption o Slack processes may need hack YieldButNotToMe o Using single-thread designed libraries in multi- threading environment: Xlib and XI o Spurious lock
  • 49. A case study Xerox scientists’ conclusions  Interesting difficulties were discovered both in use and implementation of multi-threading environment  Starting point for new studies
  翻译: