
C++ Multithreading


Example Application: Concurrent Bank Transactions

For this workshop, we will build a very simple simulation of a bank account. Our shared resource will be the account balance. We will write functions to perform millions of deposits. First, we'll do this in a single-threaded environment to see the correct, expected outcome. Then, we will use multiple threads to simulate multiple ATMs or online users depositing money at the same time. This will allow us to clearly see the problems that arise from concurrent access and learn how to solve them correctly.

How to Compile

Example implementations are provided in different branches of the repository. You can compile the source code using the following command:

g++ -std=c++17 -pthread -o bank_account bank_account.cpp

Introduction: Why Go Parallel?

For many years, computers got faster primarily because CPU clock speeds increased. A program written in 2002 would magically run much faster on a computer from 2006. That era, however, is largely over. Today, performance gains come from having multiple processor cores. Your laptop, your phone—nearly every device has a multi-core processor.

A traditional application runs in a single thread. You can think of a thread as a single sequence of instructions that the computer follows one by one. Your main() function is the start of this primary sequence. This is like having a kitchen with a single, very fast chef. They can only do one task at a time: chop vegetables, then boil water, then stir the sauce.

To prepare a large meal faster, you don't just need a faster chef; you need more chefs working at the same time on different tasks. This is the essence of multithreading: creating multiple threads of execution within a single program that can run in parallel on different cores.

What can we do with this?

  • Responsiveness: Keep a user interface (UI) smooth while a heavy task runs in the background.
  • Performance: Process large datasets, render graphics, or serve many web requests simultaneously.

In modern C++, we will use the thread support library introduced in C++11 (headers such as <thread> and <mutex>), which provides a powerful and platform-independent way to create and manage these "worker" threads.

Our goal today is to learn the C++ basics: how to start a new thread, how to spot the most common and dangerous pitfall known as a race condition, and how to fix it using a mutex. Let's start by building the foundation of our bank account application.

The Single-Threaded Foundation

Before we can introduce multiple threads, we need a simple, single-threaded program that works correctly. This is our baseline—the "ground truth" that we will compare our future results against.

Our bank account simulation will be very straightforward:

  • We'll have a single global variable, accountBalance, which represents our shared resource.
  • We'll create a function, makeDeposits(), which simulates a large number of individual $1 deposits. It does this by looping one million times and incrementing the balance in each iteration.

In our main() function, we'll call this makeDeposits() function twice to simulate two large batches of transactions. Because this program runs on a single thread, these function calls happen sequentially: the second batch of deposits only begins after the first one is completely finished. The execution order is guaranteed, and the result is perfectly predictable.

#include <iostream>

// Our shared resource
long long accountBalance = 0;

// A function that simulates 1,000,000 individual $1 deposits
void makeDeposits() {
    for (int i = 0; i < 1000000; ++i) {
        accountBalance++; // Increment the balance by 1
    }
}

int main() {
    std::cout << "Initial balance: " << accountBalance << std::endl;

    // Perform two batches of deposits, one after the other
    makeDeposits();
    makeDeposits();

    std::cout << "Final balance: " << accountBalance << std::endl;

    return 0;
}

Expected Outcome

When you compile and run this code, the output is deterministic and will always be the same:

Initial balance: 0
Final balance: 2000000

The Motivation for Concurrency

Think of the makeDeposits function not as something we call, but as an event that is triggered externally. A deposit is an asynchronous event. It can happen at any moment, initiated by a customer at an ATM or a transfer from another bank.

The bank's central server doesn't control when these deposit requests arrive. It cannot tell ATM #2, "Please wait, a transaction from ATM #1 is still in progress." The system must be prepared to handle these independent, unpredictable events as they happen, potentially at the exact same time.

Our goal is to simulate this reality. By placing the makeDeposits calls on separate threads, we are no longer running them in a predictable sequence. We are simulating two independent events that are happening concurrently, competing for the same resource: the accountBalance.

Our First Thread with std::thread

We'll use the C++ standard library's thread support, which is available by including the <thread> header.

The core of this library is the std::thread class. An object of this class represents a single thread of execution. When you create a std::thread object, you pass it the function you want to run on that new thread. The moment the object is created, the new thread can start running its function at any time, as scheduled by the operating system.

The join() Method

One critical concept is waiting for your threads to finish. If your main() function returns while a spawned thread is still running, its std::thread object is destroyed while still "joinable", and the C++ runtime responds by calling std::terminate, aborting the entire program.

To prevent this, we use the .join() method. Calling atm1.join() on a thread object will cause the current thread (in this case, main) to pause and wait until the atm1 thread has completed its execution. It's like a manager waiting for their employees to finish their work before closing the office. We must join() (or explicitly detach()) every thread we create to ensure a clean and predictable shutdown.

Let's modify our main function. Instead of calling makeDeposits() twice in a row, we'll create two threads, atm1 and atm2. Each thread will be assigned the makeDeposits function to run.

#include <iostream>
#include <thread>

// Our shared resource
long long accountBalance = 0;

// This function remains unchanged
void makeDeposits() {
    for (int i = 0; i < 1000000; ++i) {
        accountBalance++; // Increment the balance by 1
    }
}

int main() {
    std::cout << "Initial balance: " << accountBalance << std::endl;

    // Create two thread objects.
    // The makeDeposits function will now run on two separate threads concurrently.
    std::thread atm1(makeDeposits);
    std::thread atm2(makeDeposits);

    // Wait for both threads to complete their execution before proceeding.
    atm1.join();
    atm2.join();

    std::cout << "Final balance: " << accountBalance << std::endl;

    return 0;
}

What Do You See?

Now for the interesting part. Compile and run this new code. What is the final balance? Run it again. And again.

You'll likely notice two things:

  • The final balance is almost never 2,000,000.
  • The result is different almost every time you run the program.

Why is this happening? We've successfully launched two threads, but in doing so, we've uncovered a fundamental and dangerous problem in concurrent programming.

The Danger Zone - Uncovering a Race Condition

A race condition occurs when two or more threads try to access and manipulate a shared resource (like our accountBalance variable) at the same time. The final result depends on the precise, unpredictable sequence in which the threads are scheduled by the operating system—it's a "race" to see who gets to access the resource last.

You might look at the line accountBalance++ and think it's a single, instantaneous operation. To us, it is. But to the CPU, it's not. This operation is not atomic, meaning it's not indivisible. It's actually a three-step "Read-Modify-Write" sequence:

  • Read: The CPU copies the current value of accountBalance from the main memory (RAM) into one of its own temporary storage locations, called a register.
  • Modify: The CPU adds one to the value in its register.
  • Write: The CPU copies the new value from its register back to the accountBalance's location in main memory.

The problem is that the operating system can pause a thread and switch to another one in between any of these steps.

A Step-by-Step Explanation

Let's imagine accountBalance is currently 100, and both the atm1 and atm2 threads want to increment it.

  1. atm1 Reads: atm1 reads the value 100 from memory into its private register.
  2. CONTEXT SWITCH! Before atm1 can do anything else, the OS scheduler decides to pause atm1 and give atm2 a turn to run. This is the "unlucky timing".
  3. atm2 Reads: atm2 reads the value of accountBalance from memory. It's still 100, because atm1 never had a chance to write its updated value back.
  4. atm2 Modifies & Writes: atm2 increments its value to 101 and writes this 101 back to accountBalance in memory. The shared variable is now 101.
  5. CONTEXT SWITCH! The OS scheduler now pauses atm2 and switches back to atm1, which continues exactly where it left off.
  6. atm1 Modifies & Writes: atm1 still has the outdated value 100 in its register. It increments this to 101 and writes this value back to accountBalance in memory.

The final result: accountBalance is 101. We performed two increment operations, but because of the unlucky timing, one of them was completely lost.

This exact scenario, happening thousands of times, is why your final balance is much lower than 2,000,000. The section of code that is vulnerable to this problem (accountBalance++) is called a critical section. To fix our program, we need to find a way to ensure that only one thread can be inside this critical section at any given time.

Synchronizing with std::mutex and std::lock_guard

To fix the problem, we need to enforce a rule: only one thread is allowed inside the critical section (accountBalance++) at any given time. This principle is called Mutual Exclusion.

Think of it like a single-person restroom with a lock on the door. To enter, you must lock the door from the inside, preventing anyone else from entering. When you're done, you unlock it, allowing the next person in line to enter.

In C++, this "lock" is called a std::mutex (short for Mutual Exclusion), and it's available in the <mutex> header.

The Old Way: Manual Locking

A std::mutex has two fundamental methods: .lock() and .unlock(). A thread calls .lock() to acquire the lock. If another thread already holds it, the calling thread will simply wait (it is blocked) until the lock is released. Once the thread is finished with the critical section, it must call .unlock() to release the lock for other threads.

#include <iostream>
#include <thread>
#include <mutex> // Include the new header for the mutex

long long accountBalance = 0;
std::mutex accountMutex; // Create a mutex object to protect our balance

void makeDeposits() {
    for (int i = 0; i < 1000000; ++i) {
        accountMutex.lock();   // Lock the door
        accountBalance++;      // CRITICAL SECTION
        accountMutex.unlock(); // Unlock the door
    }
}
// main() function remains the same...

This code works, but it has a serious flaw. What if an exception occurs inside the critical section? The .unlock() line would never be reached, and the mutex would remain locked forever. Any other thread waiting for the lock would be stuck indefinitely, a situation known as deadlock.

The Modern C++ Way: std::lock_guard

To solve this problem elegantly, modern C++ provides a utility called std::lock_guard. This tool uses a core C++ principle called RAII (Resource Acquisition Is Initialization).

It works like this:

  • You create a std::lock_guard object at the beginning of the critical section.
  • In its constructor, it automatically calls .lock() on the mutex you give it.
  • When the lock_guard object goes out of scope (at the end of the code block), its destructor is automatically called, which calls .unlock() on the mutex.

This is guaranteed to happen, even if an exception occurs. It's the safe, simple, and preferred way to handle mutexes in C++.

#include <iostream>
#include <thread>
#include <mutex>

long long accountBalance = 0;
std::mutex accountMutex;

void makeDeposits_safe() {
    for (int i = 0; i < 1000000; ++i) {
        // The lock is acquired when 'guard' is created.
        std::lock_guard<std::mutex> guard(accountMutex);
        
        accountBalance++;
        
    } // 'guard' goes out of scope here, automatically releasing the lock.
}

int main() {
    std::cout << "Initial balance: " << accountBalance << std::endl;

    std::thread atm1(makeDeposits_safe);
    std::thread atm2(makeDeposits_safe);

    atm1.join();
    atm2.join();

    std::cout << "Final balance: " << accountBalance << std::endl;

    return 0;
}

Compile and run this final version. You will now see the correct output every single time:

Initial balance: 0
Final balance: 2000000

We have successfully protected our shared data and solved the race condition! The price we pay is a small performance hit, as threads may now have to wait for each other. But in the world of concurrent programming, correctness always comes before performance.

What's Next? Advanced Topics We Haven't Covered

This workshop gave you the fundamentals, but C++ multithreading has much more to offer. Here are some advanced concepts worth exploring:

Atomic Operations (std::atomic)
Sometimes you can avoid mutexes entirely by using atomic variables that guarantee thread-safe operations without explicit locking.

Condition Variables (std::condition_variable)
Perfect for implementing producer-consumer patterns where threads need to wait for specific conditions to become true.

Read-Write Locks (std::shared_mutex)
Allow multiple threads to read simultaneously while ensuring exclusive access for writers - great for scenarios with many readers and few writers.

Thread Pools
Instead of creating new threads constantly, maintain a pool of worker threads that can be reused for different tasks.

Futures and Promises (std::future, std::promise)
Elegant way to retrieve results from background threads without manual synchronization.

Parallel Algorithms (C++17)
The standard library can automatically parallelize many common operations like sorting and searching with execution policies.

Workshop Challenge

Your task is to extend our bank account application with a new feature: withdrawals. You must ensure that both deposits and withdrawals can happen concurrently without corrupting the final balance.

Your Task

You will be given the complete, working code from our last example. You need to modify it according to the following steps:

  • Create a new function called makeWithdrawals. This function should look very similar to makeDeposits_safe, but instead of incrementing the balance, it should loop 700,000 times and decrement (--) the accountBalance.
  • Protect the critical section. Inside your new makeWithdrawals function, you must protect the accountBalance-- operation. Use the exact same global std::mutex that the deposit function uses. Remember, the safest way to do this is with a std::lock_guard.
  • Update the main function. Instead of creating two threads that both make deposits, change it to create two different threads:
    • One thread should run makeDeposits_safe (which still adds 1,000,000).
    • The second thread should run your new makeWithdrawals function (which subtracts 700,000).
  • Join the threads. Make sure the main function waits for both threads to complete their tasks before printing the result.

Objective

If your code is correct, the race condition will be prevented, and the program will produce the correct final balance every single time you run it.

Starting Balance: 0
Total Deposits: +1,000,000
Total Withdrawals: -700,000
Expected Final Balance: 300,000
