Test the Monty Hall problem (n=1000)

What is the Monty Hall problem?

The Monty Hall Problem is a famous statistic brain teaser, which has a counter-intuitive solution

Wiki: https://en.wikipedia.org/wiki/Monty_Hall_problem

The brain teaser is as follows:

The player is in a game show and has to choose from one of three doors.
Two of these doors lead to a goat each
Only one leads to a car

The player chooses one door at random, since they have no way of knowing the correct door
The host then looks behind the other two doors and reveals one of them showing a goat.
The player now has the opportunity to keep his original guess, or switch to the remaining door.

The statistics answer is: switching to the remaining door gives you a 2/3 chance of winning.

It's counter-intuitive, and that's why, we need to prove it.

Ok. let's simulate this by replicating the steps of the original problem

In [1]:
import random

Check if the door picked at random is 1/3 probability

In [2]:
win = 0
for i in range(0,999):
    winning_door = random.randint(1,3)
    door_picked = random.randint(1,3)
    if door_picked == winning_door:
        win = win + 1

print("Won {} times out of 1000".format(win))
print("Probability: {}".format(win/1000.0))
Won 322 times out of 1000
Probability: 0.322

Yup, this works. This is what happens if we pick one door and stick to it.

Now, lets simulate the other option. No optimizations. Let's play the game in it's entirety

Create an array doors, with values 1, 2 and 3 Pick a number between 1 and 3 at random and that is our correct door

In [3]:
doors = [1,2,3]
correct_door = random.randint(1,3)
correct_door
Out[3]:
3

Now, let's pick a number at random between 1 and 3. That is our initial pick

In [4]:
initial_pick = random.randint(1,3)
print("Initial pick: {}".format(initial_pick))
remaining_doors = [door for door in [1,2,3] if door != initial_pick]
print("Remaining doors: {}".format(remaining_doors))
Initial pick: 3
Remaining doors: [1, 2]

Since we aren't optimizing and following the game exactly, let's play as the game host now

The game host knows the correct door. So:

  1. if the initial_pick is the correct pick, reveal one of the remaining doors at random
  2. if the initial_pick is the wrong pick, reveal the wrong_door
In [5]:
if initial_pick == correct_door:
    revealed_door = random.choice(remaining_doors)
    remaining_doors = [door for door in remaining_doors if door != revealed_door]
    remaining_door = remaining_doors[0]
    print('''Door: {revealed_door} is not the correct one
    You can choose door {remaining_door} now'''
          .format(revealed_door=revealed_door, remaining_door=remaining_door))
else:
    remaining_doors = [door for door in remaining_doors if door != correct_door]
    revealed_door = remaining_doors[0]
    remaining_door = correct_door
    print('''Door: {revealed_door} is not the correct one
    You can choose door {remaining_door} now'''
          .format(revealed_door=revealed_door, remaining_door=remaining_door))
Door: 1 is not the correct one
    You can choose door 2 now

We are back to playing as the Player.

Now, let's check what happens if we stick to our original answer

In [6]:
win_original = (initial_pick == correct_door)
win_original
Out[6]:
True

What happens if we check we switch to the only remaining_door

In [7]:
win_switch = (remaining_door == correct_door)
win_switch
Out[7]:
False

Statistics however is not about just playing once.

It's about finding the probabilities.

Now that we have all the steps, let's pack them into a function and so we can run them as many times as we want

In [8]:
def monty_hall_solution_simulator():
    correct_door = random.randint(1,3)
    initial_pick = random.randint(1,3)
    
    remaining_doors = [door for door in [1,2,3] if door != initial_pick]
    
    if initial_pick == correct_door:
        revealed_door = random.choice(remaining_doors)
        remaining_doors = [door for door in remaining_doors if door != revealed_door]
        remaining_door = remaining_doors[0]
    else:
        remaining_doors = [door for door in remaining_doors if door != correct_door]
        revealed_door = remaining_doors[0]
        remaining_door = correct_door
        
    win_original = (initial_pick == correct_door)
    win_switch = (remaining_door == correct_door)
    
    return [win_original,win_switch]

Now let's make our puppets play... erm... I mean function run 1000 times

In [9]:
count_win_original = 0
count_win_switch = 0
for i in range(0,999):
    win_list = monty_hall_solution_simulator()
    if win_list[0]:
        count_win_original = count_win_original +1
    else:
        count_win_switch = count_win_switch +1


print("Probability of winning if you keep your original choice: {}".format(count_win_original/1000.0))
print("Probability of winning if you switch: {}".format(count_win_switch/1000.0))
Probability of winning if you keep your original choice: 0.341
Probability of winning if you switch: 0.658

Hence proved :)

What's going on?

Given the assumption that we are following the recommended solution for Monty Hall problem,

Pick a random number between 1 and 3.  This is the correct door

Pick another number at random between 1 and 3.  This is the initial answer

If the correct_door == winning door, we lose, since we are switching
If the correct_door != winning door, we win, because switching gives us the correct door


This is the intution we need to develop

The probability of correct_door == winning door has become the probability of losing    
We are abandoning the 1/3 probability of winning


The probability of correct_door != winning_door has become the probability of winning    
In return, the probability of losing on our initial pick, which was 2/3 has now become our probability of winning


Since the probability of correct_door == winning door is 1/3, the probability of losing becomes 1/3
Since the probability of correct_door != winning door is 2/3, the probability of winning becomes 2/3