A few days ago we opened up with some group work surrounding the following problem. I gave no guidance other than, “One representative will share your solution with the class.”

My favorite cereal has just announced that it’s going to start including prizes in the box. There is one of three different prizes in every package. My mom, being cheap and largely unwilling to purchase the kind of cereal that has prizes in it, has agreed to buy me exactly three boxes. What is the probability that, at the end of opening the three boxes, I will have collected

all threedifferent prizes?

It’s a very JV, training-wheels version of the coupon collector’s problem, but it’s nice for a couple of reasons:

- The actual coupon collector’s problem is several years out of reach, but it’s a goody, so why not introduce the basics of it?
- There is a meaningful conversation to be had about independence. (
*Does drawing a prize from Box 1 change the probabilities for Box 2? Truly? Appreciably? Is it okay to assume, for simplicity, that it doesn’t? How many prizes need to be out there in the world for us to feel comfortable treating this thing as if it were a drawing*)**with replacement**? If everybody else is buying up cereal—and prizes—uniformly, does that bring things**closer**to true independence? farther away? - There are enough intuitive wrong answers to require some deeper discussion: e.g,
**1/3**(*Since all the probabilities along the way are 1/3, shouldn’t the final probability of success also be 1/3?*),**1/27**(*There are three chances of 1/3 each, so I multiplied them together.*), and**1/9**(*There are three shots at three prizes, so nine outcomes, and I want the one where I get all different toys.*) The correct answer, by the by, is**6/27 or 2/9**(try it out).

Many groups jumped right into working with the raw numbers (see wrong answers above). A few tried, with varying levels of success, to list all the outcomes individually (interestingly, a lot of these groups correctly counted 27 possibilities, but then woefully miscounted the number of successes…hmmm). A small but determined handful of groups used tree diagrams to help them reason about outcomes sequentially.

This business of using tree diagrams was pleasantly surprising. We hadn’t yet introduced them in class, and I hadn’t made any suggestions whatsoever about how to tackle the problem, so I thought it was nice to see a spark of recollection. That said, it’s not *terribly* surprising; presumably these kids have used them before. But I did run across one student, Z, who **interpreted** his tree diagram in novel way—to me at least.

Most students, when looking at a tree diagram, hunt for paths that meet the criteria for success. *Here’s a path where I get Prize 1, then Prize 2, then Prize 3. Here’s another where I get Prize 1, then Prize 3, then Prize 2… *The algorithm goes something like, follow a path event-by-event and, if you ultimately arrive at the compound event of interest, tally up a success. Repeat until you’re out of paths. That is, most students see each path as an stand-alone entity to be checked, and then either counted or ignored.

What Z did was different in three important ways. First of all, he found his solutions via subtraction rather than addition. Second, he attacked the problem in a very visual—almost **geometric**—way. And third, he didn’t treat each path separately; rather, Z searched for equivalence classes of paths within the overall tree.

Z’s (paraphrased) explanation goes as follows:

First I erased all of the straight paths, because they mean I get the same prize in every box. Then I erased all of the paths that were

almoststraight, but had one segment that was crooked, which means I get two of the same prize. And then I was left with the paths that werethe mostcrooked, which means I get a different prize each time.

Looking at his diagram, I noticed that Z hadn’t even labeled the segments; he simply drew the three stages, with three possibilities at each node, and then deleted everything that wasn’t maximally crooked. How awesome is that? In fact, taking this tack made it really easy for him to answer more complicated followup questions. Since he’d already considered the other cases, he could readily figure out the probability of getting three of the same prize (the **3** branches he pruned first), or getting only two different prizes (the next **18 **trimmings). He could even quickly recognize the probability of getting the same prize twice in a row, followed by a different one (the **6** branches he trimmed that went off in one direction, followed by a **straight-crooked** pattern).

Of course this method isn’t particularly efficient. He had to cut away 21 paths to get down to 6. For *n* prizes and boxes, you end up pruning *n ^{n} — n!* branches. Since

*n*grows much, much faster, than

^{n}*n!*, Z’s algorithm becomes prohibitively tedious in a hurry. If there are 5 prizes and 5 boxes, that’s already

**3005**branches that need to be lopped off. So yes, it’s inefficient, but then again so are tree diagrams. Without more sophisticated tools under his belt, that’s not too shabby. What the algorithm lacks in computational efficiency, it makes up for in conceptual thoughtfulness. I’ll take it that tradeoff any day of the week.

Pingback: Cereal Boxes Redux | Lines and Lines of Tangency

Hey! Can you make that demo into a video that I can pause? (No this is not a Khan Academy joke!) I dig the visualization but need time to process and make sense of each step.