Bugs: Race Condition

The term “race condition” refers to a situation in which multiple processes of a system execute in parallel and the correct result depends on their steps happening in a particular order, but that order cannot be guaranteed because the processes do not coordinate with one another. This often produces unpredictable behaviour and incorrect results.

Example

Two concurrent processes need to increment the same shared value. The value can only be retrieved or stored in a single operation (step) at a time, so each increment takes three steps: retrieve, increment, store.

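Step | Process A                     | Shared value | Process B
-----|-------------------------------|--------------|------------------------------
1    | Retrieve [Internal value: 0]  | 0            | Retrieve [Internal value: 0]
2    | Increment [Internal value: 1] | 0            | Increment [Internal value: 1]
3    | Store                         | 1            | Store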
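The interleaving in the table can be replayed step by step in a few lines of Python. This is only a minimal sketch: the function name `run_interleaved` and the variables `a_local` and `b_local` are made up for illustration, and the steps are executed in a fixed order rather than by real parallel processes so that the outcome is reproducible.

```python
def run_interleaved():
    """Replay the three steps of the table in exactly the order shown."""
    shared = {"value": 0}  # the shared value, initially 0

    # Step 1: both processes retrieve the shared value into a private copy.
    a_local = shared["value"]
    b_local = shared["value"]

    # Step 2: both processes increment their private copy.
    a_local += 1
    b_local += 1

    # Step 3: both processes store their private copy back.
    shared["value"] = a_local
    shared["value"] = b_local  # overwrites A's result with the same number

    return shared["value"]


result = run_interleaved()
print(result)  # prints 1, although two increments were performed
```

Two increments were carried out, yet the shared value only moved from 0 to 1, which is the lost update described above.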

As you can see, because neither process knew that the shared value was being manipulated elsewhere, only one of the two increments took effect; the other process’s work was overwritten. This may seem like nothing more than a cool thought experiment, but imagine that processes A and B represent concert-goers booking tickets to see their favorite band, and the shared value is the number of available seats. Now say there’s only one ticket left and both book at the same time, so the same seat is sold twice. Or worse, the same happens to dozens of pairs.
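The ticket scenario can be sketched with two Python threads. Everything here is an assumption made for the demonstration: the function `book_without_lock`, the `state` dictionary, and in particular the `threading.Barrier`, which is only used to force both customers to check availability before either of them books, so that the unlucky interleaving happens on every run instead of only occasionally.

```python
import threading


def book_without_lock(seats_available=1, customers=2):
    """Let several customers book at once with no synchronization."""
    state = {"seats": seats_available}  # shared number of free seats
    bookings = []                       # names of customers who got a ticket

    # Demo helper only: makes every customer finish the availability check
    # before anyone stores a new seat count.
    all_checked = threading.Barrier(customers)

    def customer(name):
        seen = state["seats"]           # retrieve
        all_checked.wait()              # everyone has now seen the same count
        if seen > 0:                    # decision based on a stale copy
            state["seats"] = seen - 1   # store
            bookings.append(name)

    threads = [
        threading.Thread(target=customer, args=(f"customer-{i}",))
        for i in range(customers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["seats"], bookings


seats_left, sold_to = book_without_lock()
print(seats_left, len(sold_to))  # prints "0 2": one seat, two tickets sold
```

The seat counter looks consistent (it ends at 0), which is what makes this kind of bug hard to notice: only the list of bookings reveals that one seat was sold twice.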

Remediation

To fix this bug, a synchronization mechanism such as a mutex (mutual exclusion lock) must be introduced, so that only one process at a time can retrieve, increment, and store the value (see the example below).

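Step | Process A                              | Shared value | Process B
-----|----------------------------------------|--------------|-----------------------------------
1    | Lock value against outside influences  | 0            |
2    | Retrieve [Internal value: 0]           | 0            | Attempt to lock value => Failed
3    | Increment [Internal value: 1]          | 0            | Wait for release of lock…
4    | Store and release lock                 | 1            | Wait for release of lock…
5    |                                        | 1            | Lock value
6    |                                        | 1            | Retrieve [Internal value: 1]
7    |                                        | 1            | Increment [Internal value: 2]
8    |                                        | 2            | Store
9    |                                        | 2            | Release lock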
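The locking sequence above can be sketched in Python with `threading.Lock` from the standard library. The function name `increment_with_lock` and its parameters are hypothetical; the sketch simply assumes two worker threads that each increment the same value many times.

```python
import threading


def increment_with_lock(workers=2, increments_per_worker=10_000):
    """Increment a shared value from several threads, guarded by a mutex."""
    shared = {"value": 0}
    lock = threading.Lock()

    def worker():
        for _ in range(increments_per_worker):
            with lock:                   # lock value; other workers wait here
                local = shared["value"]  # retrieve
                local += 1               # increment
                shared["value"] = local  # store
            # leaving the "with" block releases the lock

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["value"]


total = increment_with_lock()
print(total)  # prints 20000: no increment is lost
```

Because retrieve, increment, and store all happen while the lock is held, no other worker can read the value halfway through an update. The price is that workers have to wait for each other, so the locked section should be kept as short as possible.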