Simulation¶

from datascience import *
import numpy as np

%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')
import warnings
warnings.simplefilter(action='ignore', category=np.VisibleDeprecationWarning)

1. Coins¶

https://upload.wikimedia.org/wikipedia/commons/7/74/Pompey_by_Nasidius.jpg

If you flip a fair coin 100 times, how often can you expect to get between 40 and 60 heads?

Generic simulation algorithm:

Repeat N times:
- Simulate one trial
- Record the outcome
Make table of all the outcomes and analyze results

Simulating One Trial¶

coin = make_array('heads', 'tails')

np.random.choice(coin)

'tails'

tosses = np.random.choice(coin, 100)
tosses

array(['tails', 'heads', 'heads', 'heads', 'tails', 'tails', 'heads',
       'heads', 'heads', 'tails', 'heads', 'tails', 'tails', 'heads',
       'heads', 'tails', 'tails', 'heads', 'tails', 'heads', 'tails',
       'tails', 'heads', 'heads', 'heads', 'heads', 'tails', 'heads',
       'tails', 'heads', 'tails', 'heads', 'tails', 'tails', 'tails',
       'tails', 'tails', 'heads', 'tails', 'heads', 'tails', 'heads',
       'tails', 'tails', 'heads', 'heads', 'tails', 'tails', 'tails',
       'tails', 'tails', 'tails', 'tails', 'heads', 'tails', 'heads',
       'tails', 'heads', 'heads', 'tails', 'tails', 'heads', 'heads',
       'heads', 'tails', 'tails', 'heads', 'tails', 'heads', 'heads',
       'tails', 'heads', 'tails', 'tails', 'tails', 'heads', 'heads',
       'heads', 'tails', 'tails', 'tails', 'heads', 'heads', 'heads',
       'tails', 'tails', 'heads', 'tails', 'heads', 'tails', 'tails',
       'tails', 'heads', 'heads', 'tails', 'heads', 'heads', 'tails',
       'tails', 'heads'], dtype='<U5')

sum(tosses == 'heads')

All in one cell: run it a bunch!

#In our case, the outcome is the number of heads that appear in 100 tosses
outcome = sum(np.random.choice(coin, 100) == 'heads')
outcome 

Running Many Trials: For Loops¶

num_trials = 10 
for i in np.arange(0,num_trials):
    outcome = sum(np.random.choice(coin, 100) == 'heads')
    print(outcome)

Recording the outcome: appending to an array of outcomes¶

We need some way to record the outcome for each trial (each iteration of our for loop).

all_outcomes = make_array()

One simulation: run it a bunch!

outcome = sum(np.random.choice(coin, 100) == 'heads')
all_outcomes = np.append(all_outcomes, outcome)
all_outcomes

array([46.])

All in one cell:

num_trials = 10000

all_outcomes = make_array()
for i in np.arange(0,num_trials):
    outcome = sum(np.random.choice(coin, 100) == 'heads')
    all_outcomes = np.append(all_outcomes, outcome)
    
all_outcomes

array([51., 37., 47., ..., 54., 43., 53.])

Now let’s create a table so we can easily analyze the results.

simulated_results = Table().with_column('Heads in 100 flips', all_outcomes)

simulated_results.hist(bins=np.arange(30, 70, 1))

target_range = simulated_results.group("Heads in 100 flips").where("Heads in 100 flips", are.between(40,60))
target_range

Heads in 100 flips	count
40	101
41	174
42	238
43	303
44	387
45	495
46	555
47	643
48	776
49	770

... (10 rows omitted)

sum(target_range.column("count")) / simulated_results.num_rows

0.9546

2. The Monty Hall Problem¶

Three doors hide two goats and a car.
You pick one of three doors.
A different door is opened to reveal a goat.
You must decide which door has the car. You can stick with your original choice, or you can switch to the other unopened door.
Which choice (staying or switching) gives you the greatest chance of winning the car?

hidden_behind_doors = make_array('goat', 'goat', 'car')
hidden_behind_doors

array(['goat', 'goat', 'car'], dtype='<U4')

Define functions that give you the result of one trial. Both pick the initial door at random, and then determine which door the player ultimately selects based on our two strategies.

def monty_hall_game_staying_strategy():
    """
    Return what player wins if they stick with their original choice
    """
    return np.random.choice(hidden_behind_doors)

def monty_hall_game_switching_strategy():
    """
    Return what player wins if they switch with their original choice
    """
    contestant_guess = np.random.choice(hidden_behind_doors)
    if contestant_guess == 'car':
        # Revealed door is one goat ...
        # ... and the remaining door is the second goat.
        return 'goat'
    else:
        # Releaved door is the second goat ...
        # ... and the remaining door is the car.
        return 'car'
    

monty_hall_game_staying_strategy()

'goat'

monty_hall_game_switching_strategy()

'car'

Which strategy wins the car with higher probability?

num_trials = 10000

staying = make_array()
for i in np.arange(num_trials):
    staying = np.append(staying, monty_hall_game_staying_strategy())
print ("Win car if staying:", sum(staying == 'car') / num_trials)

switching = make_array()
for i in np.arange(num_trials):    
    switching = np.append(switching, monty_hall_game_switching_strategy())
print ("Win car if switching:", sum(switching == 'car') / num_trials)

Win car if staying: 0.3337

Win car if switching: 0.6638

Chance of At Least One Success in Repeated Independent Trials¶

Chance of no fours in 5 rolls of a die?

prob_no_fours_in_five_rolls = (5/6) ** 5
prob_no_fours_in_five_rolls

0.401877572016461

Chance of at least one 4 in 5 rolls of a die?

1 - prob_no_fours_in_five_rolls

0.598122427983539

Chance of at least one four in n rolls of a die?

rolls = np.arange(1, 51, 1)
results = Table().with_columns(
    'Rolls', rolls,
    'Chance of at least one 4', 1 - (5/6)**rolls
)
results

Rolls	Chance of at least one 4
1	0.166667
2	0.305556
3	0.421296
4	0.517747
5	0.598122
6	0.665102
7	0.720918
8	0.767432
9	0.806193
10	0.838494

... (40 rows omitted)

results.scatter('Rolls')