Packages used in this post
library(tidyverse)
library(cowplot)
library(kableExtra)
theme_set(theme_minimal_grid())
Paw Hansen
June 12, 2024
The marriage problem, also known as the secretary problem, is a classic probability puzzle that allows us to study optimal stopping theory and decision-making under uncertainty.
It goes like this: Consider a woman who has đť‘› men willing to marry her. If all the men showed up at once, she could order them and choose the best. Unfortunately for her, the men arrive one at a time and in random order. After dating each man for a short period of time, she must decide, before moving on to the next, whether or not to marry him. If she rejects his marriage proposal, she cannot recall him at a later time and, should she decide to marry, she will have to forego meeting the remaining suitors.
What strategy should she adopt if she wants to maximize the probability of marrying the best suitor?
This puzzle is an excellent example of a type of decision-making that requires some form of stopping rule: when should the woman “stop playing” and accept the suitor at hand? This type of puzzle extends to many other forms of decisions such as hiring employees, selling assets, and making investment choices.
To begin, consider a naive strategy: choosing at random. For example, knowing that she has 10 suitors, the woman rolls a fair 10-sided die and chooses the suitor corresponding to the number on the die (e.g. the fifth suitor to show up).
How often will that strategy lead to woman to marrying the optimal suitor?
On average, this strategy results in choosing the best suitor about 10% of the time - pretty bad odds if you ask me.
Also, notice that this is equivalent to picking a specific position every time, such as always choosing the third suitor:
A more sophisticated strategy would be using a training set to first establish a baseline before making a final selection. Here’s how it works:
Let’s implement this strategy in R:
baseline_choice <- function(n_men, n_train) {
# Define suitors
suitors <- 1:n_men
# Shuffle the suitors
suitors <- sample(suitors, n_men)
# Split suitors into "train" and "test" sets
train <- suitors[1:n_train]
test <- suitors[(n_train + 1):n_men]
# Define a cutoff
cutoff <- max(train)
# Choose the first suitor greater than the cutoff
choice <- which(test > cutoff)[1]
# If no suitor is found that is greater than the cutoff, choose the last one
if (is.na(choice)) {
return(suitors[n_men]) # Choose the last suitor in the original list
} else {
return(test[choice])
}
}
To make sure this function works, let’s run a simulation with ten suitors with the first five being set aside for the training set:
Already much better odds than choosing at random.
The next question is of course what is the optimal number of suitors to set aside for the training set? We can look this number up on Wikipedia but simulation allows us to approximate the right number without knowing a lot of math.
Let us rerun our simulation from before but this time with 100 suitors. Using the ever-useful crossing
function and map
we can study how the probability of choosing the best suitor varies with the number of suitors set aside. We’ll do 10.0000 simulations:
And then calculate the probability:
Number of suitors in training set | Probaiblity |
---|---|
38 | 0.37 |
36 | 0.37 |
35 | 0.37 |
39 | 0.37 |
40 | 0.37 |
42 | 0.37 |
We can also visualize the success rate as a function of the training set size:
rs |>
ggplot(aes(n_train, chose_mr_ten)) +
geom_line(color = "firebrick", linewidth = 1) +
scale_y_continuous(labels = scales::percent_format()) +
labs(x = 'Number of Suitors in "Training Set"',
y = "Probability of Choosing the Best Suitor",
title = str_wrap("Using About One-Third of Suitors for Training Will Get You The Best Suitor in The End", 60)) +
theme(plot.title = element_text(hjust = 0.5))
The optimal stopping rule in the marriage problem suggests that the woman should observe and reject roughly the first third of suitors, then marry the first suitor who is better than those she has previously observed.
This approach to solving the marriage problem showcases the power of simulation in understanding decision-making processes. Similar strategies can be applied to various real-world scenarios, such as hiring the best candidate, selling a house at the optimal price, or making investment decisions, where making the right choice is crucial and must be done with incomplete information.
Happy simulating!