## Expected Number of Consecutive Successes for a Bernoulli Distribution

The Bernoulli distribution is defined by a single parameter $p$, the probability of success. The most common example is a coin flip: assuming the coin is fair, $p = 0.5$.
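As a concrete illustration, here is a minimal sketch of drawing Bernoulli samples in plain Python (the `bernoulli` helper is my own, not something from the original):

```python
import random

def bernoulli(p):
    """Draw one Bernoulli sample: 1 (success) with probability p, else 0."""
    return 1 if random.random() < p else 0

# A fair coin: roughly half the flips should come up heads.
flips = [bernoulli(0.5) for _ in range(10_000)]
print(sum(flips) / len(flips))  # ~0.5
```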

This type of distribution is also commonly used in Automatic Speech Recognition (ASR). A common approach to ASR models speech as a set of Hidden Markov Models (HMMs). Within each HMM there are multiple states, and the transitions between these states are modeled as Bernoulli distributions: the probability of staying in the same state in the next time frame is $p$, and the probability of transitioning to the next state is $(1-p)$.
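To make this concrete, here is a small sketch (my own illustration, not code from any ASR toolkit) that samples a state sequence from a left-to-right chain in which each state has self-transition probability $p$:

```python
import random

def sample_state_sequence(num_states, p):
    """Walk a left-to-right chain: at each time frame, stay in the current
    state with probability p or advance to the next state with probability 1 - p."""
    sequence = []
    state = 0
    while state < num_states:
        sequence.append(state)
        if random.random() >= p:  # leave the state with probability 1 - p
            state += 1
    return sequence

print(sample_state_sequence(num_states=3, p=0.5))
# e.g. [0, 0, 1, 2, 2, 2] -- runs of repeated states come from self-transitions
```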

One question I had recently is: what is the expected length of a sample generated by the HMM? In the types of HMMs commonly used in ASR, answering this question requires knowing the expected number of times a state transitions to itself. Let's work our way up to answering this question.

The probability of exactly $n$ self-transitions is $(1-p) p^{n}$. The $(1-p)$ term is needed because if there were exactly $n$ self-transitions, the state must have transitioned to another state in the next time step. To find the expected number of self-transitions we sum $n$ times the probability of $n$ self-transitions over every possible $n$. Mathematically, we can write this as

$\sum_{n=0}^{\infty} n (1-p) p^{n}$.
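Before deriving a closed form, we can sanity-check this series numerically by truncating it at a large $n$; this is just a throwaway check with an arbitrary choice of $p = 0.5$:

```python
p = 0.5
# Truncate the infinite sum; the terms decay geometrically, so 200 is plenty.
expected = sum(n * (1 - p) * p**n for n in range(200))
print(expected)  # converges to a finite value (~1.0 for p = 0.5)
```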

Since the $(1-p)$ term does not depend on $n$, we can factor it out of the sum.

$(1-p) \sum_{n=0}^{\infty} n p^{n}$

If we ignore the $n$ factor, the remaining sum looks very much like a geometric series. It is known that the infinite sum of a geometric series (for $|p| < 1$) is

$\sum_{n=0}^{\infty} p^{n} = \frac{1}{1-p}$.
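A quick truncated-sum check of this identity, again with $p = 0.5$ chosen arbitrarily:

```python
p = 0.5
approx = sum(p**n for n in range(200))
print(approx, 1 / (1 - p))  # both ~2.0
```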

The one insight required is that differentiating the geometric series with respect to $p$ produces almost exactly the series we are trying to evaluate. Taking the derivative of both sides gives

$\sum_{n=0}^{\infty} n p^{n-1} = \frac{1}{(1-p)^{2}}$,

and multiplying both sides by $p$ gives

$\sum_{n=0}^{\infty} n p^{n} = \frac{p}{(1-p)^{2}}$.
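And the same kind of truncated-sum check for the differentiated series:

```python
p = 0.5
approx = sum(n * p**n for n in range(200))
print(approx, p / (1 - p)**2)  # both ~2.0
```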

Substituting that back into our original series gives us

$(1-p) \sum_{n=0}^{\infty} n p^{n} = (1-p) \frac{p}{(1-p)^{2}} = \frac{p}{1-p}$.

So the expected number of self-transitions for any state is $\frac{p}{1-p}$, which is also the expected number of consecutive successes for a Bernoulli distribution. The expected number of time frames a state occupies is one more than that, since the final frame is the one in which it transitions away: $\frac{p}{1-p} + 1 = \frac{1}{1-p}$.
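As a final sanity check, here is a small Monte Carlo sketch (my own, with an arbitrary $p = 0.75$) that estimates both quantities by simulation:

```python
import random

def run_length(p):
    """Count consecutive successes (self-transitions) before the first failure."""
    n = 0
    while random.random() < p:
        n += 1
    return n

p = 0.75
trials = [run_length(p) for _ in range(100_000)]
mean_successes = sum(trials) / len(trials)
print(mean_successes, p / (1 - p))      # expected self-transitions, ~3.0
print(mean_successes + 1, 1 / (1 - p))  # expected frames in the state, ~4.0
```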

A couple of examples. If I hand you a fair coin and ask how many heads you expect to see before the first tails, the answer is $\frac{0.5}{1 - 0.5} = 1$, and the expected number of flips up to and including that first tails is $\frac{1}{1 - 0.5} = 2$. If I hand you a fair six-sided die and ask how many rolls you expect to make before you see a 1, the answer is $\frac{5/6}{1 - 5/6} = 5$, or $\frac{1}{1 - 5/6} = 6$ rolls if you count the roll that shows the 1.
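The die example is easy to verify by simulation; `rolls_until_one` below is a hypothetical helper written just for this check:

```python
import random

def rolls_until_one():
    """Roll a fair six-sided die until a 1 appears, counting every roll."""
    count = 0
    while True:
        count += 1
        if random.randint(1, 6) == 1:
            return count

samples = [rolls_until_one() for _ in range(100_000)]
print(sum(samples) / len(samples))  # ~6.0, i.e. 1 / (1 - 5/6)
```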