I first learned about the secretary problem (aka the googol game) watching the VSauce 2 video The Game You Quit.
Per wikipedia, The basic form of the problem is the following: imagine an administrator who wants to hire the best secretary out of n rankable applicants for a position. The applicants are interviewed one by one in random order. A decision about each particular applicant is to be made immediately after the interview. Once rejected, an applicant cannot be recalled. During the interview, the administrator gains information sufficient to rank the applicant among all applicants interviewed so far, but is unaware of the quality of yet unseen applicants. The question is about the optimal strategy (stopping rule) to maximize the probability of selecting the best applicant. If the decision can be deferred to the end, this can be solved by the simple maximum selection algorithm of tracking the running maximum (and who achieved it), and selecting the overall maximum at the end. The difficulty is that the decision must be made immediately.’
VSauce Kevin claims a player can select the maximum value roughly 37% of the time using the optimal strategy:
1. View the first 1/e (as in Euler’s number) % of n. For example, if the game is 100, you would view the first 37 options.
2. Determine the maximum value in this 1/e group.
3. Go through the remaining options individually and stop at the first value greater than the max from the 1/e group.
I wrote a Python script to simulate the secretary game 10,000 times with the observation group equal to 10%, 20%, 25%, 33%, 1/e%, 50%, and 67% of n. I used n = 1000 and each value is a random number from 1 to 1,000,000. My goals were:
1) Most importantly, practice Python.
2) Hopefully find that using 1/e% as the observation group would result in selecting the maximum value 37% of the time and outperforms other observation group sizes.
3) Determine if an easier-to-calculate % performs close to optimally.
Success! I found that using an observation group of n/e allows the player to stop on the maximum value approximately 37% of the time. In practice, using 33% for the observation group also wins the game 37% and is easier to calculate quickly than n/e.