Threads and Multithreading
Threads and Multithreading¶
Overview¶
A thread is a lightweight execution unit that runs within a process. This lesson covers the differences between threads and processes, user/kernel threads, multithreading models, and pthread API.
Table of Contents¶
- What is a Thread?
- Thread vs Process
- Thread Control Block (TCB)
- User Threads and Kernel Threads
- Multithreading Models
- pthread API Basics
- Practice Problems
1. What is a Thread?¶
Definition¶
Thread = Execution flow within a process
= Basic unit of CPU scheduling
= Lightweight Process
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
â Single-threaded Process â
â âââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â Code â Data â Heap â Stack â â
â âââââââââââââââââââââââââââââââââââââââââââââââââââ â
â Only one execution flow exists â
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
â Multi-threaded Process â
â ââââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â Code â Data â Heap â â â
â ââââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â ââââââââââââ ââââââââââââ ââââââââââââ â
â â Stack1 â â Stack2 â â Stack3 â â
â â Thread1 â â Thread2 â â Thread3 â â
â ââââââââââââ ââââââââââââ ââââââââââââ â
â Three execution flows (can run in parallel) â
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
Why Use Threads?¶
ââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
â Thread Benefits â
ââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ€
â â
â 1. Responsiveness â
â - Other threads continue even if one blocks â
â - Separate UI thread from worker threads â
â â
â 2. Resource Sharing â
â - Share same address space â
â - Exchange data without IPC â
â â
â 3. Economy â
â - Thread creation faster than process creation â
â - Reduced context switch cost â
â â
â 4. Scalability â
â - Utilize multicore CPUs â
â - Each thread runs on different core â
â â
ââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
2. Thread vs Process¶
Shared vs Private Resources¶
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
â Thread Shared/Private â
ââââââââââââââââââââââââââŹâââââââââââââââââââââââââââââââââ€
â Shared â Private â
ââââââââââââââââââââââââââŒâââââââââââââââââââââââââââââââââ€
â âą Code section â âą Thread ID â
â âą Data section â âą Program Counter (PC) â
â âą Heap area â âą Register set â
â âą Open files â âą Stack â
â âą Signal handlers â âą Scheduling info (priority) â
â âą Current directory â âą Signal mask â
â âą User/Group ID â âą errno value â
ââââââââââââââââââââââââââŽâââââââââââââââââââââââââââââââââ
Memory Layout Comparison¶
Process Memory: Multithreaded Process Memory:
âââââââââââââââ âââââââââââââââââââââââââââ
â Kernel â â Kernel â
ââââââââââââââ†âââââââââââââââââââââââââââ€
â Stack â âThread1 âThread2 âThread3â
â â â â Stack â Stack â Stack â
â â â â â â â â â
â â â â â â â †â â â â â â â â â â â â ââ€
â â â â
â â â â â â
â Heap â â Heap (shared) â
ââââââââââââââ†âââââââââââââââââââââââââââ€
â Data â â Data (shared) â
ââââââââââââââ†âââââââââââââââââââââââââââ€
â Code â â Code (shared) â
âââââââââââââââ âââââââââââââââââââââââââââ
Cost Comparison¶
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
â Process vs Thread Cost Comparison â
âââââââââââââââââââââŹââââââââââââââŹââââââââââââââââââââââââ€
â Item â Process â Thread â
âââââââââââââââââââââŒââââââââââââââŒââââââââââââââââââââââââ€
â Creation time â Slow â Fast â
â (Linux baseline) â ~10ms â ~1ms â
âââââââââââââââââââââŒââââââââââââââŒââââââââââââââââââââââââ€
â Context switch â Slow â Fast â
â â TLB flush â TLB can be preserved â
âââââââââââââââââââââŒââââââââââââââŒââââââââââââââââââââââââ€
â Memory usage â High â Low â
â â Separate â Shared space â
âââââââââââââââââââââŒââââââââââââââŒââââââââââââââââââââââââ€
â Communication costâ High â Low â
â â IPC needed â Direct memory access â
âââââââââââââââââââââŒââââââââââââââŒââââââââââââââââââââââââ€
â Stability â High â Low â
â â Isolated â One crash affects allâ
âââââââââââââââââââââŽââââââââââââââŽââââââââââââââââââââââââ
3. Thread Control Block (TCB)¶
TCB Structure¶
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
â TCB (Thread Control Block) â
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââ€
â â
â âââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â Thread ID (TID) â â
â ââââââââââââââââââââââââââââââââââââââââââââââââââ†â
â â Thread State (Running, Ready, Blocked...) â â
â ââââââââââââââââââââââââââââââââââââââââââââââââââ†â
â â Program Counter (PC) â â
â ââââââââââââââââââââââââââââââââââââââââââââââââââ†â
â â CPU Registers (general purpose, flags...) â â
â ââââââââââââââââââââââââââââââââââââââââââââââââââ†â
â â Stack Pointer â â
â ââââââââââââââââââââââââââââââââââââââââââââââââââ†â
â â Scheduling Information (priority) â â
â ââââââââââââââââââââââââââââââââââââââââââââââââââ†â
â â Pointer to parent PCB â â
â âââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â
â TCB is much smaller than PCB (why thread switch â
â is faster) â
â â
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
PCB and TCB Relationship¶
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
â PCB â
â âââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â PID, memory info, open files, signal handlers... â â
â â â â
â â Thread list: â â
â â âââââââââââââââââââââââââââââââââââââââââââââ â â
â â â TCB1 â TCB2 â TCB3 â â â
â â â TID, PC, SP, â TID, PC, SP, â TID, PC, â â â
â â â registers, â registers, â SP, ... â â â
â â â state â state â â â â
â â âââââââââââââââââââââââââââââââââââââââââââââ â â
â â â â
â âââââââââââââââââââââââââââââââââââââââââââââââââââââ â
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
4. User Threads and Kernel Threads¶
User-Level Threads (ULT)¶
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
â User-Level Threads (ULT) â
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ€
â â
â User space: â
â âââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â Application â â
â â âââââââââââââââââââââââââââââââââââââââââââââââ â â
â â â Thread1 Thread2 Thread3 â â â
â â âââââââââââââââââââââââââââââââââââââââââââââââ â â
â â â â â â â
â â ââââââŒâââââ â â
â â â â â
â â Thread Library â â
â â (scheduling, creation, synchronization) â â
â âââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â â
â âââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â â
â Kernel space: â â
â âââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â Kernel (sees as single thread) â â
â â â â
â â Kernel knows process only, not threads â â
â âââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â
â Advantages: Fast context switch, portability â
â Disadvantages: Blocking blocks all, no multicore use â
â â
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
Kernel-Level Threads (KLT)¶
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
â Kernel-Level Threads (KLT) â
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ€
â â
â User space: â
â âââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â Application â â
â â âââââââââââââââââââââââââââââââââââââââââââââââ â â
â â â Thread1 Thread2 Thread3 â â â
â â âââââââââââââââââââââââââââââââââââââââââââââââ â â
â âââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â â â â
â âââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â â â â
â Kernel space: â
â âââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â Kernel Scheduler â â
â â âââââââââââââââââââââââââââââââââââââââââââââââ â â
â â â K-Thread1 K-Thread2 K-Thread3 â â â
â â âââââââââââââââââââââââââââââââââââââââââââââââ â â
â â â â
â â Kernel schedules each thread individually â â
â âââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â
â Advantages: Multicore use, blocking doesn't affect â
â other threads â
â Disadvantages: Slower context switch (syscall needed) â
â â
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
Comparison Table¶
ââââââââââââââââââââââŹâââââââââââââââââââââŹâââââââââââââââââââââ
â Feature â User Thread â Kernel Thread â
ââââââââââââââââââââââŒâââââââââââââââââââââŒâââââââââââââââââââââ€
â Managed by â Thread library â OS kernel â
ââââââââââââââââââââââŒâââââââââââââââââââââŒâââââââââââââââââââââ€
â Creation/switch â Fast (no syscall) â Slow (syscall) â
ââââââââââââââââââââââŒâââââââââââââââââââââŒâââââââââââââââââââââ€
â Multicore use â Not possible â Possible â
ââââââââââââââââââââââŒâââââââââââââââââââââŒâââââââââââââââââââââ€
â Blocking syscall â Blocks all process â Only blocks thread â
ââââââââââââââââââââââŒâââââââââââââââââââââŒâââââââââââââââââââââ€
â OS support needed â Not required â Required â
ââââââââââââââââââââââŒâââââââââââââââââââââŒâââââââââââââââââââââ€
â Examples â GNU Portable â Linux NPTL, â
â â Threads â Windows threads â
ââââââââââââââââââââââŽâââââââââââââââââââââŽâââââââââââââââââââââ
5. Multithreading Models¶
Many-to-One Model (N:1)¶
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
â Many-to-One Model (N:1) â
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ€
â â
â User space: â
â âââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â ULT1 ULT2 ULT3 ULT4 ULT5 â â
â â â â â â â â â
â â âââââââââŒââââââââŒââââââââŒââââââââ â â
â â âââââââââŒââââââââ â â
â â â â â
â â Thread Library â â
â âââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â â
â âââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â â
â Kernel space: ââââââââââââ â
â â KLT 1 â â
â ââââââââââââ â
â â
â âą Advantages: Efficient thread management â
â âą Disadvantages: No multicore, blocking blocks all â
â âą Examples: Early Green Threads â
â â
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
One-to-One Model (1:1)¶
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
â One-to-One Model (1:1) â
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ€
â â
â User space: â
â âââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â ULT1 ULT2 ULT3 ULT4 â â
â â â â â â â â
â âââââŒââââââââŒââââââââŒââââââââŒâââââââââââââââââââââ â
â â â â â â
â ââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â â â â â
â Kernel space: â
â âââââââââââââââââââââââââââââââââââââââââââââââââ â
â â KLT1 KLT2 KLT3 KLT4 â â
â âââââââââââââââââââââââââââââââââââââââââââââââââ â
â â
â âą Advantages: Multicore use, blocking doesn't affect â
â other threads â
â âą Disadvantages: Kernel thread creation overhead â
â âą Examples: Linux NPTL, Windows, macOS â
â â
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
Many-to-Many Model (M:N)¶
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
â Many-to-Many Model (M:N) â
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ€
â â
â User space: â
â âââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â ULT1 ULT2 ULT3 ULT4 ULT5 ULT6 ULT7 â â
â â â â â â â â â â â
â â âââââââŒââââââŒââââââŒââââââŒââââââŒââââââ â â
â â âââââââŒââââââŒââââââŒââââââ â â
â â â â â â â
â â Thread Scheduler â â
â âââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â â â â
â âââââââââââââââââââââââââââââââââââââââââââââââââââââ â
â â â â â
â Kernel space: â
â âââââââââââââââââââââââââââââââââââââââââââââââââ â
â â KLT1 KLT2 KLT3 â â
â âââââââââââââââââââââââââââââââââââââââââââââââââ â
â â
â 7 user threads mapped to 3 kernel threads â
â â
â âą Advantages: Flexibility, balance of efficiency â
â and parallelism â
â âą Disadvantages: Implementation complexity â
â âą Examples: Solaris, Go goroutines â
â â
âââââââââââââââââââââââââââââââââââââââââââââââââââââââââââ
Model Comparison¶
ââââââââââââââââŹâââââââââââââââŹâââââââââââââââŹâââââââââââââââ
â Feature â N:1 â 1:1 â M:N â
ââââââââââââââââŒâââââââââââââââŒâââââââââââââââŒâââââââââââââââ€
â Parallel exec â Not possible â Possible â Possible â
ââââââââââââââââŒâââââââââââââââŒâââââââââââââââŒâââââââââââââââ€
â Blocking issueâ Blocks all â No impact â Partial â
ââââââââââââââââŒâââââââââââââââŒâââââââââââââââŒâââââââââââââââ€
â Thread create â Fast â Slow â Medium â
ââââââââââââââââŒâââââââââââââââŒâââââââââââââââŒâââââââââââââââ€
â Complexity â Low â Low â High â
ââââââââââââââââŒâââââââââââââââŒâââââââââââââââŒâââââââââââââââ€
â Scalability â Low â High â High â
ââââââââââââââââŒâââââââââââââââŒâââââââââââââââŒâââââââââââââââ€
â Modern OS â Rarely used â Commonly usedâ Some use â
ââââââââââââââââŽâââââââââââââââŽâââââââââââââââŽâââââââââââââââ
6. pthread API Basics¶
Thread Creation and Termination¶
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
// Function executed by thread
void* thread_function(void* arg) {
int thread_num = *(int*)arg;
printf("Thread %d starting\n", thread_num);
// Perform work
sleep(1);
printf("Thread %d terminating\n", thread_num);
return NULL;
}
int main() {
pthread_t threads[3];
int thread_args[3] = {1, 2, 3};
// Create threads
for (int i = 0; i < 3; i++) {
int result = pthread_create(&threads[i], NULL,
thread_function, &thread_args[i]);
if (result != 0) {
perror("pthread_create failed");
exit(1);
}
}
// Wait for all threads to complete
for (int i = 0; i < 3; i++) {
pthread_join(threads[i], NULL);
printf("Thread %d joined\n", thread_args[i]);
}
printf("Main thread terminating\n");
return 0;
}
/*
Compile: gcc -pthread thread_example.c -o thread_example
Output (order may vary):
Thread 1 starting
Thread 2 starting
Thread 3 starting
Thread 1 terminating
Thread 2 terminating
Thread 3 terminating
Thread 1 joined
Thread 2 joined
Thread 3 joined
Main thread terminating
*/
pthread API Main Functions¶
ââââââââââââââââââââââââââŹâââââââââââââââââââââââââââââââââââââ
â Function â Description â
ââââââââââââââââââââââââââŒâââââââââââââââââââââââââââââââââââââ€
â pthread_create() â Create new thread â
â pthread_join() â Wait for thread termination â
â pthread_exit() â Exit current thread â
â pthread_self() â Return current thread ID â
â pthread_equal() â Compare two thread IDs â
â pthread_detach() â Detach thread (auto cleanup) â
â pthread_cancel() â Request thread cancellation â
ââââââââââââââââââââââââââŒâââââââââââââââââââââââââââââââââââââ€
â pthread_mutex_init() â Initialize mutex â
â pthread_mutex_lock() â Lock mutex â
â pthread_mutex_unlock() â Unlock mutex â
â pthread_mutex_destroy()â Destroy mutex â
ââââââââââââââââââââââââââŒâââââââââââââââââââââââââââââââââââââ€
â pthread_cond_init() â Initialize condition variable â
â pthread_cond_wait() â Wait on condition variable â
â pthread_cond_signal() â Signal condition variable â
â pthread_cond_broadcast()â Signal all waiting threads â
ââââââââââââââââââââââââââŽâââââââââââââââââââââââââââââââââââââ
Receiving Return Values¶
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
// Calculate and return square of number
void* calculate_square(void* arg) {
int num = *(int*)arg;
int* result = malloc(sizeof(int));
*result = num * num;
printf("Thread: %d squared = %d\n", num, *result);
return (void*)result; // Return heap-allocated result
}
int main() {
pthread_t thread;
int num = 5;
void* result;
// Create thread
pthread_create(&thread, NULL, calculate_square, &num);
// Wait for thread and receive result
pthread_join(thread, &result);
printf("Main: result = %d\n", *(int*)result);
free(result); // Free memory
return 0;
}
/*
Output:
Thread: 5 squared = 25
Main: result = 25
*/
Thread Detachment¶
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
void* detached_thread(void* arg) {
printf("Detached thread starting\n");
sleep(2);
printf("Detached thread terminating\n");
return NULL;
}
int main() {
pthread_t thread;
pthread_attr_t attr;
// Initialize thread attributes
pthread_attr_init(&attr);
// Set detached state
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
// Create detached thread
pthread_create(&thread, &attr, detached_thread, NULL);
// Destroy attribute object
pthread_attr_destroy(&attr);
printf("Main thread: no need to join detached thread\n");
// Give detached thread time to execute
sleep(3);
printf("Main thread terminating\n");
return 0;
}
/*
Detached thread:
- Automatically cleaned up on termination
- pthread_join() unnecessary (causes error)
- Suitable for background tasks
*/
Thread Safety Issue Example¶
#include <stdio.h>
#include <pthread.h>
int counter = 0; // Shared variable
void* increment(void* arg) {
for (int i = 0; i < 1000000; i++) {
counter++; // Race condition!
}
return NULL;
}
int main() {
pthread_t t1, t2;
pthread_create(&t1, NULL, increment, NULL);
pthread_create(&t2, NULL, increment, NULL);
pthread_join(t1, NULL);
pthread_join(t2, NULL);
// Expected: 2000000, Actual: less than that (varies)
printf("Counter: %d\n", counter);
return 0;
}
/*
counter++ operation is not atomic:
1. Read counter value from memory
2. Increment value
3. Store result to memory
When two threads execute simultaneously, values are lost
(solved in next lesson)
*/
7. Practice Problems¶
Problem 1: Thread vs Process¶
Choose all that threads do NOT share.
A. Code section B. Data section C. Stack D. Heap E. Program Counter F. Open files
Show Answer
**C, E** - Stack: Each thread has its own stack (local variables, function call info) - Program Counter: Each thread can execute different code locations All others are shared between threads.Problem 2: Multithreading Models¶
Match each description to the correct multithreading model.
- If one user thread blocks, the entire process blocks.
- The model used by Linux NPTL.
- Can efficiently manage more user threads than kernel threads.
Options: N:1, 1:1, M:N
Show Answer
1. **N:1 (Many-to-One)** - All user threads map to one kernel thread, so blocking one blocks all 2. **1:1 (One-to-One)** - Linux NPTL, Windows, macOS use this model 3. **M:N (Many-to-Many)** - Maps many user threads to fewer kernel threads efficientlyProblem 3: pthread Code Analysis¶
What are the possible outputs of the following code?
#include <stdio.h>
#include <pthread.h>
void* print_msg(void* arg) {
char* msg = (char*)arg;
printf("%s", msg);
return NULL;
}
int main() {
pthread_t t1, t2;
pthread_create(&t1, NULL, print_msg, "A");
pthread_create(&t2, NULL, print_msg, "B");
pthread_join(t1, NULL);
pthread_join(t2, NULL);
printf("C");
return 0;
}
A. ABC B. BAC C. CAB D. ACB
Show Answer
**A, B** - Order of A and B can vary depending on thread scheduling - C is always last (printed after both threads join) - Possible outputs: ABC or BAC CAB, ACB are impossible (C only prints after join)Problem 4: Thread Creation¶
Explain the problem with the following code.
#include <stdio.h>
#include <pthread.h>
void* thread_func(void* arg) {
int* num = (int*)arg;
printf("Thread: %d\n", *num);
return NULL;
}
int main() {
pthread_t threads[5];
for (int i = 0; i < 5; i++) {
pthread_create(&threads[i], NULL, thread_func, &i);
}
for (int i = 0; i < 5; i++) {
pthread_join(threads[i], NULL);
}
return 0;
}
Show Answer
**Problem: Race Condition** All threads receive the address of the same variable `i`. When threads execute, `i`'s value may have already changed. Expected output: 0, 1, 2, 3, 4 Actual possible output: 5, 5, 5, 5, 5 or 2, 3, 4, 5, 5, etc. **Solution:**int thread_args[5];
for (int i = 0; i < 5; i++) {
thread_args[i] = i;
pthread_create(&threads[i], NULL, thread_func, &thread_args[i]);
}
Problem 5: TCB and PCB¶
Explain why TCB is smaller than PCB in relation to thread characteristics.
Show Answer
Why TCB is smaller than PCB: 1. **Resource Sharing**: Threads share code, data, heap, open files with other threads in the same process. This information is stored only in PCB, not in TCB. 2. **Information stored in TCB is minimal**: - Thread ID - Program Counter - Register set - Stack Pointer - Thread state - Priority 3. **Additional information in PCB**: - Memory management info (page table) - Open file list - I/O status - Accounting information - Signal handlers Result: Thread context switch is faster than process context switch.Next Steps¶
- 04_CPU_Scheduling_Basics.md - Basic concepts of CPU scheduling