Simulating the Monty Hall Problem with Python

Written by Toni Karppi on June 28, 2019

The Monty Hall problem goes like this: Suppose you’re on a game show. You’re presented with a row of three doors; behind one of these doors there is a brand new sports car, while behind each of the other doors there is a goat. You win the sports car if you pick the door with the car behind it. Suppose that you pick door number 1. The host, who knows where the car is, will then open whichever of doors 2 and 3 has a goat behind it (either one, if both do). The host now asks you if you would like to change your choice of door. Should you change your choice, or should you stick with your original pick?

Intuitively it would seem like it wouldn’t really make a difference whether we change our selection; it should be 50/50, right? Well, it turns out that you are actually twice as likely to win the car if you change your pick. Specifically, you have a $1/3$ chance of winning the car if you stick with your original pick, while the probability is $2/3$ if you change your pick. Let’s create a simulation in Python to see this in practice.

Simulating the problem

We’ll start with a simple program that represents the structure of the problem.

# The total number of doors.
num_doors = 3

# A list of all doors. Each integer 0..num_doors-1 represents a door.
doors = list(range(num_doors))
print(f"All doors: {doors}")

# The door that the car is behind.
car_behind = 1
print(f"Car is behind: {car_behind}")

# The door the player initially chose.
picked_door = 0
print(f"Picked door: {picked_door}")

# A list of the doors that the player did not pick.
other_doors = [door for door in doors if door != picked_door]
print(f"Other doors: {other_doors}")

# The door that the player did not pick, and which was
# kept closed during the reveal.
alternative_door = car_behind if car_behind in other_doors else other_doors[0]
print(f"Alternative door: {alternative_door}")

# Whether or not changing the initial choice was correct.
change_correct = alternative_door == car_behind
print(f"Was the choice of changing the door correct? {change_correct}")

Each door here is represented by an integer $0..2$. You can play around with the car_behind and picked_door variables to see what happens. Next, let’s introduce some randomness into the program.

import random

# The total number of doors.
num_doors = 3

# A list of all doors. Each integer 0..num_doors-1 represents a door.
doors = list(range(num_doors))
print(f"All doors: {doors}")

# The door that the car is behind.
car_behind = random.randrange(num_doors)
print(f"Car is behind: {car_behind}")

# The door the player initially chose.
picked_door = random.randrange(num_doors)
print(f"Picked door: {picked_door}")

# A list of the doors that the player did not pick.
other_doors = [door for door in doors if door != picked_door]
print(f"Other doors: {other_doors}")

# The door that the player did not pick, and which was
# kept closed during the reveal.
alternative_door = (
    car_behind if car_behind in other_doors else random.choice(other_doors)
)
print(f"Alternative door: {alternative_door}")

# Whether or not changing the initial choice was correct.
change_correct = alternative_door == car_behind
print(f"Was the choice of changing the door correct? {change_correct}")

Now the values of car_behind and picked_door are chosen randomly, and so is alternative_door whenever car_behind is not among other_doors (that is, when the player’s initial pick was the car). Now that we’ve got the structure of the problem set up, let’s run this simulation many times so that we can analyze the results.

import random


def evaluate_change_correct():
    num_doors = 3
    doors = list(range(num_doors))
    car_behind = random.randrange(num_doors)
    picked_door = random.randrange(num_doors)
    other_doors = [door for door in doors if door != picked_door]
    alternative_door = (
        car_behind if car_behind in other_doors else random.choice(other_doors)
    )
    return alternative_door == car_behind


total_runs = 10000
runs_with_change_correct = 0

for i in range(total_runs):
    if evaluate_change_correct():
        runs_with_change_correct += 1

print(runs_with_change_correct / total_runs)

Running this program should give you an output of around $2/3 \approx 0.67$. You can try changing the number of total runs to see if the results stay consistent.
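The program above only tracks the door that stayed closed. As a cross-check, here is a minimal sketch that models the host’s opened door explicitly and counts wins for both strategies over several run counts. The names play_round and estimate_win_rates, and the seed parameter, are my own additions for illustration and are not part of the program above.

```python
import random


def play_round(rng):
    """Play one game; return (stay_wins, switch_wins) as booleans."""
    doors = [0, 1, 2]
    car_behind = rng.choice(doors)
    picked_door = rng.choice(doors)

    # The host opens a door that is neither the player's pick nor the car.
    goat_doors = [d for d in doors if d != picked_door and d != car_behind]
    opened_door = rng.choice(goat_doors)

    # Switching means taking the only door that is still closed and unpicked.
    alternative_door = next(
        d for d in doors if d != picked_door and d != opened_door
    )
    return picked_door == car_behind, alternative_door == car_behind


def estimate_win_rates(total_runs, seed=None):
    """Return the fraction of rounds won by staying and by switching."""
    rng = random.Random(seed)
    stay_wins = 0
    switch_wins = 0
    for _ in range(total_runs):
        stay, switch = play_round(rng)
        stay_wins += stay
        switch_wins += switch
    return stay_wins / total_runs, switch_wins / total_runs


for total_runs in (100, 10_000, 100_000):
    stay_rate, switch_rate = estimate_win_rates(total_runs, seed=0)
    print(f"{total_runs:>7} runs: stay {stay_rate:.3f}, switch {switch_rate:.3f}")
```

The seed only makes a run repeatable; with seed=None every run draws fresh random numbers. The estimates for small run counts tend to wander, while larger run counts should settle close to $1/3$ and $2/3$.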

Explanation of the results

So what is the explanation for these results? Let’s analyze what really happens at each step of the problem. You start off by choosing a door. During your initial selection, the probability of you picking the correct door is $1/3$, since the car is equally likely to be behind any of the three doors. This means that once you’ve selected a door, the probability that the car is behind one of the doors that you did not pick is $1-1/3=2/3$. Now when the host opens one of the doors that you did not pick, the probability of the opened door containing the car drops to 0, and the probability from this door is transferred to the door that stayed closed. The probability for the initially picked door stays the same: the host can always open a goat door among the doors you did not pick, so opening one reveals no new information about whether the initially picked door was correct.
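The argument above can also be checked without any randomness. The following sketch enumerates every equally likely combination of car position and initial pick and adds up exact probabilities with the standard library’s fractions module. The variable names stay_wins, switch_wins, and weight are introduced here for illustration only.

```python
from fractions import Fraction

num_doors = 3
doors = range(num_doors)

stay_wins = Fraction(0)
switch_wins = Fraction(0)

# Each (car_behind, picked_door) pair is equally likely.
weight = Fraction(1, num_doors * num_doors)

for car_behind in doors:
    for picked_door in doors:
        if picked_door == car_behind:
            # The first pick was right, so staying wins. Whichever goat
            # door the host leaves closed, switching loses.
            stay_wins += weight
        else:
            # The first pick was wrong, so the host has to leave the
            # car door closed, and switching wins.
            switch_wins += weight

print(f"Stay wins:   {stay_wins}")    # Stay wins:   1/3
print(f"Switch wins: {switch_wins}")  # Switch wins: 2/3
```

Three of the nine combinations have the player on the car from the start, which gives $3/9=1/3$ for staying and leaves $6/9=2/3$ for switching.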

To understand this explanation better, instead of 3 doors, let’s suppose that we had 1000 doors to choose from. For the initial choice, the probability of picking the correct door is $1/1000$, a very small probability. Next the host opens 998 of the non-picked doors, all of them hiding goats. Your initially selected door is still very unlikely to be correct, since it was picked when there were 1000 doors to choose from. The other closed door instead has a probability of $999/1000$ of being correct. You can verify this by changing the num_doors variable to 1000 in the simulation above.
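As a rough sketch of that experiment, the simulation can take the number of doors as a parameter. The logic is the same as in the earlier program; the function estimate_change_correct and the seed argument are hypothetical additions, and the model assumes the host opens every non-picked door except one.

```python
import random


def evaluate_change_correct(num_doors, rng):
    """Simulate one round; return True if switching would win the car."""
    doors = list(range(num_doors))
    car_behind = rng.randrange(num_doors)
    picked_door = rng.randrange(num_doors)
    other_doors = [door for door in doors if door != picked_door]
    # The host leaves exactly one non-picked door closed: the car door if
    # the car is among them, otherwise a randomly chosen goat door.
    alternative_door = (
        car_behind if car_behind in other_doors else rng.choice(other_doors)
    )
    return alternative_door == car_behind


def estimate_change_correct(num_doors, total_runs=10_000, seed=None):
    """Estimate the probability that switching wins with num_doors doors."""
    rng = random.Random(seed)
    wins = sum(evaluate_change_correct(num_doors, rng) for _ in range(total_runs))
    return wins / total_runs


for num_doors in (3, 10, 1000):
    expected = (num_doors - 1) / num_doors
    estimate = estimate_change_correct(num_doors, total_runs=5_000, seed=0)
    print(f"{num_doors:>4} doors: estimated {estimate:.4f}, expected {expected:.4f}")
```

With 1000 doors the estimate should land very close to $999/1000$, because switching only loses in the rare rounds where the initial pick happened to be the car.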

Summary

In this article I showed you how to simulate the Monty Hall Problem to verify hypothesised results. The point of this was to show how you can take advantage of computer simulations to model and answer questions related to problems or real-life scenarios. Simulation can also be a useful starting point before starting to investigate a problem analytically.

Thanks for reading!