This problem is from 40 Puzzles and Problems in Probability and Mathematical Statistics, which I have paraphrased below:
An urn contains \(k\) black balls and a single red ball. Peter and Paula draw without replacement balls from this urn, alternating after each draw until the red ball is drawn. The game is won by the player who happens to draw the single red ball. In order to maximize her probability of winning, should Paula go first or second?
Let’s first solve this by simulation. The strategy will be to iterate over \(k = 1, 2, \cdots, n\) and for each value of \(k\), simulate the game a thousand times, and record Paula’s win percentage.
# Initialize vectors to store Paula's and Peter's win rates across different numbers of black balls
paula_win_rate <- numeric(25) # Pre-allocate vector of length 25
Above I initialize two vectors which will store the win percentage for Paula. Next we simulate the gameplay:
# Loop over different counts of black balls in the urn (from 1 to 25)
for (num_black_balls in 1:25) {
# Set the number of simulations for each number of black balls
num_simulations <- 1000
# Initialize vectors to store the outcomes of each simulation for the current number of black balls
paula_wins <- logical(num_simulations)
# Run simulations
for (simulation in 1:num_simulations) {
# Create an urn containing 'num_black_balls' black balls ("B") and 1 red ball ("R")
urn <- c(rep("B", num_black_balls), "R")
# Initialize vectors to record the sequence of picks for Paula and Peter
paula_picks <- character()
peter_picks <- character()
# Initialize a counter to keep track of the turn number
turn <- 1
# Continue the game until either Paula or Peter picks the red ball ("R")
while (!any(c(paula_picks == "R", peter_picks == "R"))) {
# Paula's turn to pick a ball from the urn
selected_ball_index <- sample(seq_along(urn), 1, replace = FALSE) # Randomly select a ball index
paula_picks[turn] <- urn[selected_ball_index] # Record Paula's pick
urn <- urn[-selected_ball_index] # Remove the picked ball from the urn
# Check if Paula picked the red ball or if the urn is empty
if (paula_picks[turn] == "R" || length(urn) == 0) {
break # End the game if Paula wins or if no balls are left
}
# Peter's turn to pick a ball from the urn
selected_ball_index <- sample(seq_along(urn), 1, replace = FALSE) # Randomly select a ball index
peter_picks[turn] <- urn[selected_ball_index] # Record Peter's pick
urn <- urn[-selected_ball_index] # Remove the picked ball from the urn
# Increment the turn counter for the next round
turn <- turn + 1
}
# Record the outcome of the simulation: whether Paula picked the red ball
paula_wins[simulation] <- any(paula_picks == "R")
}
# Calculate and store the win rates for Paula
paula_win_rate[num_black_balls] <- mean(paula_wins)
}
Below we can plot Paula’s win percentage as a function of the number of black balls (\(k\)):
So it seems that going first helps: When the number of black balls (\(k\)) is odd, Paula and Peter have about the same win rate. But when the number of black balls is even and the total number of balls is odd, then Paula’s win rate is subtantially higher in many cases.