A countable cup game and certifiable randomness

Consider the following game:

There is a countable sequence of face-down cups and an infinite bag of balls. First, player one places balls underneath finitely many of the cups. Next, player two turns over some (possibly infinite) set of cups and looks to see if they have balls underneath them. Finally, player two points to a cup that she has not yet turned over and declares “this cup does not have a ball underneath it.” Player two wins if her guess is correct; otherwise player one wins.

What’s a good strategy for player two in this game?

(For ease of reading, we’ll consistently refer to player two with “she/her” pronouns and player one with “he/him” pronouns.)

First, let’s see that any deterministic strategy has a counter-strategy that beats it. A deterministic strategy consists of some (possibly infinite) test set T of cups to flip over, together with a function f: 2^T\rightarrow \mathbb N which decides which cup to point to based on what was seen under the uncovered cups; by the rules, f(t) always lies outside T. To beat this, the first player picks any finite-support assignment t\in 2^T and considers f(t). He places balls in T according to t and then also places a ball at f(t). Player two then observes exactly t, points to f(t), and finds a ball.
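Here is a minimal sketch of this counter-strategy in Python. To keep the demo runnable, the test set T is finite and the strategy f is a toy choice; the logic is the same as in the proof.

```python
def counter_strategy(T, f, t=frozenset()):
    """Return a finite ball placement beating the deterministic strategy (T, f).

    t is any finite set of cups in T that player one chooses to fill."""
    guess = f(t)              # the cup player two will point to after seeing t
    assert guess not in T     # she must point to a cup she hasn't opened
    return set(t) | {guess}   # play t inside T, plus one ball under her guess

# Toy example: player two opens cups 0..9, then points to cup 10 + (#balls seen).
T = frozenset(range(10))
f = lambda seen: 10 + len(seen)

balls = counter_strategy(T, f)                 # -> {10}
seen = frozenset(b for b in balls if b in T)   # what player two observes in T
print(f(seen) in balls)                        # True: her guessed cup holds a ball
```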

In fact, we can take this proof a little further and say that player one can still deterministically win even if the second player uses randomness in the second part (but not the first part) of her turn. Actually, we’ll prove something slightly weaker: If the second player chooses T deterministically, then for any \varepsilon>0, the first player has a deterministic strategy guaranteeing that the second player wins with probability at most \varepsilon.

Suppose F: 2^T \to \mathbb N is a randomized strategy in the following sense: for any assignment t\in 2^T of balls to cups in T, F(t) is an \mathbb N-valued random variable which dictates which cup player two will point to. Fix t to be any finite-support assignment of balls to cups in T. F(t) induces a probability distribution on the integers, i.e. \sum_{n=0}^\infty\Pr[F(t) = n] = 1. In particular, the chance that F(t) is very large must go to 0 as “very large” goes to infinity. To be more precise, for any \varepsilon>0, there is some N such that \Pr[F(t) > N] < \varepsilon. So the first player uses the following strategy: inside of T, he places balls according to t. Outside of T, he places balls under all cups up to cup N. With probability at least 1-\varepsilon, the second player will point to a cup with index at most N; every such cup outside T holds a ball, so she loses.
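The following toy simulation illustrates the argument; the test set, the geometric guessing rule F, and the cutoff N are all illustrative choices, not part of the proof.

```python
import random

T = frozenset(range(0, 100, 2))   # player two opens the even cups below 100

def F(seen):
    """A toy randomized guess: a geometric choice among the odd cups (outside T)."""
    k = 1
    while random.random() < 0.5:  # halts with probability 1; Pr[k = m] = 2^{-m}
        k += 1
    return 2 * k - 1              # an odd cup, hence not in T

eps = 0.01
# For this F, Pr[F(t) > 2m - 1] = 2^{-m}, so N = 13 gives Pr[F > N] < eps.
N = 13
balls = {n for n in range(N + 1) if n not in T}   # fill every non-T cup up to N

wins = sum(F(frozenset()) not in balls for _ in range(10**5))
print(wins / 10**5)               # roughly 2^{-7}, comfortably below eps
```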

In particular, our proof shows that if F(t) is bounded, say because player two has access to only a finite amount of randomness (more precisely, because her random source has finite max-entropy), then player one can guarantee a win.

What if the second player is allowed to randomize her choice of T? In that case, how well she can do depends sensitively on the amount of randomness she has access to. One appropriate measure of “amount of randomness” is min-entropy. We say that a random variable X has min-entropy H_\infty(X) = k if \max_x \Pr[X=x] = 2^{-k}. In particular, a deterministic source has 0 min-entropy and a uniform source with k possibilities has \log_2 k min-entropy.

Operationally, p_\text{guess} = 2^{-H_\infty(X)} is the maximum guessing probability for X. If you use the source X to generate a secret, then p_\text{guess} is the maximum probability with which an adversary can guess your secret.
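In code, both quantities fall out of the definition directly; here is a small helper for finite distributions (my own illustration):

```python
from math import log2

def min_entropy(probs):
    """H_inf(X) = -log2(max_x Pr[X = x]) for a finite distribution."""
    return -log2(max(probs))

probs = [0.5, 0.25, 0.125, 0.125]
H = min_entropy(probs)            # 1.0, determined by the heaviest outcome
print(H, 2 ** -H)                 # p_guess = 2^{-H} = 0.5

# A uniform source with k = 8 possibilities has min-entropy log2(8) = 3:
print(min_entropy([1 / 8] * 8))   # 3.0
```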

A source which has min-entropy k is “almost as good” as a source which gives a uniform random string of roughly k bits. This is because of the existence of efficient randomness extractors. These are functions which, with the help of a little bit of independent random seed, can turn a source with min-entropy k into something which is almost indistinguishable from a uniform distribution on roughly k bits. (For more on extractor theory, I recommend Chapter 6 of Salil Vadhan’s monograph on pseudorandomness.)
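As a concrete (and deliberately inefficient) illustration, here is a toy seeded extractor in the spirit of the leftover hash lemma: the seed is a uniformly random binary matrix, which is far longer than the short seeds achieved by the efficient constructions mentioned above.

```python
import secrets

def hash_extract(x_bits, seed_matrix):
    """Multiply the source bits by a random binary matrix over GF(2).

    The leftover hash lemma says a uniformly random matrix (the seed) maps a
    min-entropy-k source to m slightly-less-than-k nearly uniform bits."""
    return [sum(r * x for r, x in zip(row, x_bits)) % 2 for row in seed_matrix]

n, m = 16, 4                                    # source length, output length
seed = [[secrets.randbits(1) for _ in range(n)] for _ in range(m)]
x = [secrets.randbits(1) for _ in range(n)]     # stand-in for a weak source
print(hash_extract(x, seed))                    # m nearly uniform output bits
```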

Because randomness is such a useful resource in computer science, one important problem is that of randomness verification. Suppose you buy a “random number generator,” which is to say you buy a box with a button and a display screen. Whenever you press the button, the display screen spits out a 0 or a 1. How can you tell whether the bits coming out are actually random? In other words, can you give some lower bound on the min-entropy of the random variable that you sample when you press the button n times? How do you know, for example, that the first n bits you observe aren’t just hardcoded into the box? We’ll return to our countable cup game to give a partial answer to this question.

Now that we have our notion of min-entropy fixed, we can state the following theorem:

  • If the second player has access to a random source with min-entropy at least k, she has a strategy winning with probability at least 1-2^{-k}, regardless of the first player’s strategy.
  • If the min-entropy of the second player is at most k, then there exists a first-player strategy ensuring that the second player wins with probability at most 1-2^{-k}.

The second result is one we’ve basically already established, but let’s spell it out. Let F be a randomized strategy for player two. Let f be a deterministic strategy such that f = F with probability at least 2^{-k}; this exists by the definition of min-entropy. Then the first player plays the deterministic strategy which always wins against f, as detailed at the beginning of this post. (If you like, you can think of it this way: the second player starts by sampling her source for a random seed. The first player’s strategy is then to make an exact guess for what the seed is, and then play to win against that seed.)
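Here is a self-contained toy of this seed-guessing counter; the map from seeds to strategies and the seed distribution are made up for illustration.

```python
# Player two derandomizes via a seed; player one guesses the likeliest seed
# and plays the counter-strategy against that branch of her strategy.

def counter(T, f, t=frozenset()):
    return set(t) | {f(t)}                   # balls: t inside T, plus f(t)

def strategy_for(seed):
    """Hypothetical map from seeds to deterministic strategies (T, f)."""
    T = frozenset(range(10 * (seed + 1)))
    return T, (lambda seen, T=T: max(T) + 1 + len(seen))

seed_probs = {0: 0.5, 1: 0.25, 2: 0.25}      # min-entropy exactly 1 bit
modal = max(seed_probs, key=seed_probs.get)
balls = counter(*strategy_for(modal))        # beats the seed-0 branch
# Player two loses whenever her sampled seed equals `modal`, i.e. with
# probability 2^{-1}; so she wins with probability at most 1 - 2^{-1}.
```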

To prove the first result, we need to find a set of deterministic strategies for player two such that a counter to one of them loses to the rest. That way, the strategy from the previous paragraph is the best that player one can do. Player two’s strategy is as follows: Fix some partition of \mathbb N into a collection of infinite sets T_i (e.g. the equivalence classes mod m for some m). On her turn, she picks i at random (in such a way that H_\infty(i) \geq k) and then looks under the cups T:= \mathbb N\setminus T_i. Finally, she points to a cup in T_i with a higher index than any ball-containing cup she saw. (More precisely, letting S denote the set of cups containing balls, she can pick the minimal n\in T_i such that n > \max(S\cap T).)

To see that this works, notice that no matter how S is chosen, as long as it is nonempty there is a unique j such that the highest-numbered ball lies in T_j. Player two wins whenever she picks i\neq j (and she wins automatically if S is empty). By the operational interpretation of min-entropy, i\neq j happens with probability at least 1 - 2^{-k}.
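Here is a runnable sketch of this strategy with the mod-m partition; the ball placement and the choice m = 8 are arbitrary.

```python
import random

def player_two(balls, m, i):
    """Open every cup outside T_i = {n : n % m == i}, then point inside T_i."""
    seen = {b for b in balls if b % m != i}   # balls visible outside T_i
    n = max(seen, default=-1) + 1
    while n % m != i:                         # minimal n in T_i past everything seen
        n += 1
    return n

m = 8                                         # uniform i has min-entropy log2(8) = 3
balls = {0, 4, 17, 23}                        # an arbitrary finite placement
wins = sum(player_two(balls, m, random.randrange(m)) not in balls
           for _ in range(10**5))
print(wins / 10**5)                           # at least 1 - 2^{-3} = 0.875
```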

This theorem shows that our countable cup game is, in a somewhat trivial sense, a one-box randomness verification test: it is a task that can be achieved by a data-generating box iff the box has sufficiently much min-entropy. However, this test is not effective in the following sense: there is no single “test” strategy for the first player against which the second player’s win probability is determined by her min-entropy. To see this, we can use an argument similar to one we’ve seen before.

Fix any randomized strategy for player one; let S be the random variable denoting the set of places where he places balls. Since S is almost surely finite, \Pr[\max S \geq N] \to 0 as N \to \infty. Therefore, for any \varepsilon > 0, there is some N such that \Pr[N \in S] \leq \Pr[\max S \geq N] < \varepsilon. If player two uses the deterministic strategy “turn over no cups and then point to cup N”, she’ll win with probability at least 1-\varepsilon, even though her min-entropy is zero.
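To make this concrete, here is a sketch in which player two estimates the tail of \max S empirically; sample_S is a made-up stand-in for whatever randomized strategy player one commits to.

```python
import random

def sample_S():
    """A stand-in for player one's committed randomized strategy:
    an (always finite) random set of ball positions."""
    return {random.randrange(2 ** random.randrange(1, 16)) for _ in range(5)}

eps, trials = 0.01, 10**5
maxes = sorted(max(sample_S()) for _ in range(trials))
N = maxes[-int(eps * trials)] + 1    # roughly the (1 - eps)-quantile of max S
losses = sum(N in sample_S() for _ in range(trials))
print(losses / trials)               # player two loses with probability < eps
```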

An effective randomness test?

We can try to solve this non-effectiveness problem by appealing to Kolmogorov complexity. The Kolmogorov complexity of a string is the length of the shortest program which outputs that string. (This notion depends on the programming language one uses, but only up to an additive constant.) If the Kolmogorov complexity of the output string is greater than the length of any program that could fit inside the box (say, greater than its number of particles), then we know that the box couldn’t have produced that string deterministically. Furthermore, if the box actually is giving us independent, identically distributed random samples, this bound will eventually come into play. (The number of strings with Kolmogorov complexity at most n is at most the number of programs of length at most n, which is less than 2^{n+1} if our programming language has its source code in binary. So among the 2^{2n} strings of length 2n, fewer than a 2^{-n+1} fraction have Kolmogorov complexity at most n.)
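The counting bound itself is one line of arithmetic; here is a numeric sanity check (no Kolmogorov complexity is computed, of course):

```python
# Programs of length <= n versus strings of length 2n.
n = 10
programs = 2 ** (n + 1) - 1                  # binary programs of length <= n
strings = 2 ** (2 * n)                       # strings of length 2n
print(programs / strings < 2 ** (-n + 1))   # True: a vanishing fraction
```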

Unfortunately for us, Kolmogorov complexity is undecidable. (This makes the sentence at the beginning of the previous paragraph quite laughable.) This means that for any reasonable notion of “algorithm”, there is no algorithm which given an arbitrary string and a natural number n, tells whether the Kolmogorov complexity of the string is at least n.

Generalizing a bit, if we don’t make any assumptions on the computational power of the boxes, then no deterministic algorithm should be able to serve as a good randomness test (by arguments similar to ones we used in analyzing the countable cup game).

Things get a little better if we do make some assumptions about the boxes. In particular, let’s modify our notion of Kolmogorov complexity to include a time parameter. Instead of only requiring that the box be at most a given size while outputting a given string, we also require that the output is achieved within a given time. (Think of this as the amount of time that we measure in our laboratory when we press the button and wait for the result.) This is now computable: just take all of the algorithms of at most the specified length, run each of them for the specified amount of time, and see if any produces the string in question. Unfortunately again, this approach has some problems. First, it requires explicit bounds on the running time and on the size of the box’s source code. Second, computing the time-parameterized Kolmogorov complexity doesn’t seem easy: the naive algorithm takes time on the order of 2^n t for programs of length n and time bound t. (Possibly an argument similar to the proof of the time hierarchy theorem can give a lower bound on the time complexity of this problem.)
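Here is what the naive algorithm looks like; run_program, an interpreter for some fixed binary programming language with a step bound, is assumed rather than implemented.

```python
from itertools import product

def time_bounded_K_at_most(x, n, t, run_program):
    """Naive check: does some program of length <= n output x within t steps?

    `run_program(code, steps)` is a hypothetical interpreter returning the
    program's output if it halts within `steps` steps (else None)."""
    for length in range(n + 1):
        for bits in product("01", repeat=length):   # about 2^{n+1} programs total
            if run_program("".join(bits), steps=t) == x:
                return True
    return False
```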

Furthermore, this approach as written only lets us prove that the box is not deterministic, but it doesn’t give us any nonzero lower bound on the amount of randomness used by the program. (One can imagine a kind of randomness-parameterized Kolmogorov complexity as follows: fix n. What is the length of the smallest program with access to n bits of randomness which produces string x with positive probability? Probably someone has considered this before but I’m not privy to it.)

Let’s end on a positive note and discuss a setting where randomness certification is possible. This idea was introduced by Thomas Vidick and Umesh Vazirani in their paper on Certifiable Quantum Dice. They require that the box being tested comes in the form of two boxes which don’t communicate with each other. The test requires drawing \Omega(n) samples from the boxes and then performing some statistical tests on part of the data. If the tests are passed, then

  • The boxes make nontrivial use of shared quantum entanglement
  • The rest of the data has at least n bits of min-entropy remaining.

Furthermore, the test only requires O(\log n) bits of randomness if we assume that there’s no active interference by outside parties. This is an amazing result: if you have access to a small amount of randomness, and somebody else gives you access to these quantum boxes, then you can generate exponentially much more (trusted, certified) randomness than you started with. (For a more formal statement, read the paper! The exposition is very nice, and the non-adversarial case is accessible to anyone with a bit of computer science background, no quantum required.)

Is there any hope for a similar result in a one-box setting? It seems like the ideas in the post dance around some kind of nice impossibility proof, but I don’t know what the right statement is. If you have any ideas on this or any other comments/questions about the above exposition, please leave a comment!

 
