Formulas for Bayesian A/B Testing

By Evan Miller

October 1, 2015 (changes)

This page collects a few formulas I've derived for evaluating A/B tests in a Bayesian context. The formulas on this page are closed-form, so you don't need to do complicated integral evaluations; they can be computed with simple loops and a decent math library. The advantage of Bayesian formulas over the traditional frequentist formulas is that you don't have to collect a pre-ordained sample size in order to get a valid result. (See How Not To Run An A/B Test for more context on the "peeking" problem, and Simple Sequential A/B Testing for a frequentist solution to the problem.)

Table of Contents

  1. A/B testing: binary outcomes
    1. Derivation
    2. Equivalent Formulas
    3. Implementation
  2. A/B/C testing: binary outcomes
    1. Derivation
    2. Implementation
  3. A/B testing: count data
    1. Derivation
    2. Implementation
  4. A/B/C testing: count data
    1. Derivation
    2. Implementation

A/B testing: binary outcomes

For a binary-outcome test (e.g. a test of conversion rates), the probability that B will beat A in the long run is given by:

\[ {\rm Pr}(p_B > p_A) = \sum_{i=0}^{\alpha_B-1}{\frac{B(\alpha_A+i, \beta_B + \beta_A)}{(\beta_B+i)B(1+i,\beta_B)B(\alpha_A,\beta_A)}} \]

Where:

Derivation

Not for the timid! If calculus isn’t your cup of chamomile, skip right to the implementation section.

The beta distribution is a convenient prior distribution for modeling a binomial parameter \(p\). Starting from a non-informative prior,1 the distribution of \(p\) after \(S\) successes and \(F\) failures is given by \({\rm Beta}(S+1, F+1)\). For the remainder of the discussion let \(\alpha=S+1\) and \(\beta=F+1\), which are just the two parameters of the beta distribution for the belief.

Now suppose we have two experimental branches (A and B) and have a Bayesian belief for each one:

\[ \begin{array}{lr} p_A \sim {\rm Beta}(\alpha_A, \beta_A) \\ p_B \sim {\rm Beta}(\alpha_B, \beta_B) \end{array} \]

Using the pdf of the beta distribution, we can get the total probability that \(p_B\) is greater than \(p_A\) by integrating the joint distribution over all values for which \(p_B > p_A\):

\[ \begin{equation} {\rm Pr}(p_B > p_A) = \int_0^1 \int_{p_A}^1 \frac{p_A^{\alpha_A-1}(1-p_A)^{\beta_A-1}}{B(\alpha_A, \beta_A)} \frac{{p_B}^{\alpha_B-1}(1-p_B)^{\beta_B-1}}{B(\alpha_B, \beta_B)} dp_B dp_A \end{equation} \]

Evaluating the inner integral, the equation becomes:

\[ \begin{equation} \label{binary_ab_pr_eval_inner} {\rm Pr}(p_B > p_A) = 1 - \int_0^1 \frac{p_A^{\alpha_A-1}(1-p_A)^{\beta_A-1}}{B(\alpha_A,\beta_A)}I_{p_A}(\alpha_B, \beta_B)dp_A \end{equation} \]

Where \(I_x\) is the regularized incomplete beta function. Using the identity \(I_x(1,b) = 1 - (1 - x)^b\), the recursive relationship

\[ \begin{equation} I_x(a, b) = I_x(a-1, b) - \frac{x^{a-1}(1-x)^b}{(a-1)B(a-1,b)} \end{equation} \]

and the fact that \(\alpha\) and \(\beta\) are integers, we can express \(I_x\) as:

\[ \begin{equation} I_x(a, b) = 1 - (1 - x)^b - \sum_{j=1}^{a-1}\frac{x^{a-j}(1-x)^b}{(a-j)B(a-j,b)} \end{equation} \]

Or equivalently:

\[ \begin{equation} \label{ibeta_sum} I_x(a, b) = 1 - \sum_{i=0}^{a-1} \frac{x^{i}(1 - x)^b}{(b+i)B(1+i,b)} \end{equation} \]

The probability integral \(\eqref{binary_ab_pr_eval_inner}\) can therefore be written:

\[ \begin{array}{ll} {\rm Pr}(p_B > p_A) &=& 1 - \int_0^1 \frac{p_A^{\alpha_A-1}(1-p_A)^{\beta_A-1}}{B(\alpha_A,\beta_A)} \left(1 - \sum_{i=0}^{\alpha_B-1}{\frac{p_A^i(1-p_A)^{\beta_B}}{(\beta_B+i)B(1+i, \beta_B)}}\right)dp_A \\ &=& 1 - 1 + \int_0^1 \frac{p_A^{\alpha_A-1}(1-p_A)^{\beta_A-1}}{B(\alpha_A,\beta_A)} \sum_{i=0}^{\alpha_B-1}{\frac{p_A^i(1-p_A)^{\beta_B}}{(\beta_B+i)B(1+i, \beta_B)}}dp_A \\ &=& \int_0^1 \sum_{i=0}^{\alpha_B-1}{\frac{p_A^{\alpha_A-1+i}(1-p_A)^{\beta_A+\beta_B-1}}{(\beta_B+i)B(\alpha_A, \beta_A)B(1+i, \beta_B)}}dp_A \\ &=& \sum_{i=0}^{\alpha_B-1}\int_0^1{\frac{p_A^{\alpha_A-1+i}(1-p_A)^{\beta_A+\beta_B-1}}{(\beta_B+i)B(\alpha_A, \beta_A)B(1+i, \beta_B)}}dp_A \\ &=& \sum_{i=0}^{\alpha_B-1}\frac{B(\alpha_A+i,\beta_A+\beta_B)}{(\beta_B+i)B(\alpha_A, \beta_A)B(1+i, \beta_B)} \int_0^1{\frac{p_A^{\alpha_A-1+i}(1-p_A)^{\beta_A+\beta_B-1}}{B(\alpha_A+i,\beta_A+\beta_B)}}dp_A \end{array} \]

Finally:

\[ \begin{equation} \label{binary_ab_pr_alpha_b} {\rm Pr}(p_B > p_A) = \sum_{i=0}^{\alpha_B-1}\frac{B(\alpha_A+i,\beta_A+\beta_B)}{(\beta_B+i) B(1+i, \beta_B) B(\alpha_A, \beta_A) } \end{equation} \]

Update, 6/8/2014: Chris Stucchio has derived a decision rule and asymptotic formula using the above formula. Check them out!

Equivalent Formulas

It's possible to derive similar formulas that sum over the other three parameters:

\[ \begin{align} {\rm Pr}(p_B > p_A) &= 1 - \sum_{i=0}^{\alpha_A-1}{\frac{B(\alpha_B+i, \beta_B + \beta_A)}{(\beta_A+i)B(1+i,\beta_A)B(\alpha_B,\beta_B)}} \\ {\rm Pr}(p_B > p_A) &= \sum_{i=0}^{\beta_A-1}{\frac{B(\beta_B+i, \alpha_A + \alpha_B)}{(\alpha_A+i)B(1+i,\alpha_A)B(\alpha_B,\beta_B)}} \\ {\rm Pr}(p_B > p_A) &= 1 - \sum_{i=0}^{\beta_B-1}{\frac{B(\beta_A+i, \alpha_A + \alpha_B)}{(\alpha_B+i)B(1+i,\alpha_B)B(\alpha_A,\beta_A)}} \end{align} \]

The above formulas can be found with symmetry arguments.

Implementation

Here's a quick implementation of \(\eqref{binary_ab_pr_alpha_b}\) in Julia:

function probability_B_beats_A(α_A, β_A, α_B, β_B)
    total = 0.0
    for i = 0:(α_B-1)
        total += exp(lbeta(α_A+i, β_B+β_A) 
            - log(β_B+i) - lbeta(1+i, β_B) - lbeta(α_A, β_A))
    end
    return total
end

The beta function produces very large numbers, so if you’re getting infinite values in your program, be sure to work with logarithms, as in the code above. Your standard library’s log-beta function will come in handy here.

If you don’t have log-beta available, it's easy enough to define one with the log-gamma function and the identity:

\[ \log(B(a, b)) = \log(\Gamma(a)) + \log(\Gamma(b)) - \log(\Gamma(a+b)) \]

If you have neither log-beta nor log-gamma available, first rewrite the equation in terms of gamma function:

\[ {\rm Pr}(p_B > p_A) = \sum_{i=0}^{\alpha_B-1}\frac{\Gamma(\alpha_A+i)\Gamma(\beta_A+\beta_B)\Gamma(1+i+\beta_B)\Gamma(\alpha_A+\beta_A)}{(\beta_B+i) \Gamma(\alpha_A+i+\beta_A+\beta_B) \Gamma(1+i)\Gamma(\beta_B) \Gamma(\alpha_A)\Gamma(\beta_A) } \]

Using the property that \(\Gamma(z)=(z-1)!\), notice that there are an equal number of multiplicative terms in the numerator and denominator. If you alternately multiply and divide one term at a time, you should be able to arrive at an answer without encountering numerical overflow.

As there are four equivalent formulas available, implementers concerned with computational efficiency may want to choose the formula that requires the smallest number of iterations for a particular set of \(\alpha\) and \(\beta\) values.

A final word of caution to implementers: when calling these functions, don’t forget to add 1 to the success and failure counts! Otherwise your results will be slightly off.


A/B/C testing: binary outcomes

It is possible to extend the binary-outcome formula to three test groups, call them A, B, and C. The probability that C will beat both A and B in the long run is:

\[ {\rm Pr}(p_C > \max\{p_A, p_B\}) = \\ 1 - {\rm Pr}(p_A > p_C) - {\rm Pr}(p_B > p_C) + \sum_{i=0}^{\alpha_A-1} \sum_{j=0}^{\alpha_B-1}{\frac{B(i+j+\alpha_C, \beta_A + \beta_B + \beta_C)}{(\beta_A+i)B(1+i,\beta_A)(\beta_B+j)B(1+j,\beta_B)B(\alpha_C, \beta_C)}} \]

Where:

Note that this formula can be computed in \(O(\alpha_A \alpha_B)\) time (see the implementation section below).

Derivation

Start with a Bayesian belief for each of three experimental branches (A, B, and C):

\[ \begin{array}{lr} p_A \sim {\rm Beta}(\alpha_A, \beta_A) \\ p_B \sim {\rm Beta}(\alpha_B, \beta_B) \\ p_C \sim {\rm Beta}(\alpha_C, \beta_C) \\ \end{array} \]

Calling the pdf of the beta distribution \(f(p|\alpha, \beta) = f(p)\), we can get the total probability that \(p_C\) is greater than both \(p_A\) and \(p_B\) by integrating the joint distribution over all values for which \(p_C > p_A\) and \(p_C > p_B\):

\[ {\rm Pr}(p_C > \max{\{p_A, p_B\}}) = \int_0^1 \int_0^{p_C} \int_0^{p_C} f(p_A) f(p_B) f(p_C) dp_A dp_B dp_C \]

Evaluating the inner two integrals, the equation becomes:

\[ \begin{equation} \label{binary_abc_pr_eval_inner} {\rm Pr}(p_C > \max{\{p_A, p_B\}}) = \int_0^1 I_{p_C}(\alpha_A, \beta_A) I_{p_C}(\alpha_B, \beta_B) f(p_C) dp_z \end{equation} \]

Using the identity for \(I_X\) \(\eqref{ibeta_sum}\), we have:

\[ {\rm Pr} = \int_0^1 \left(1-\sum_{i=0}^{\alpha_A-1}\frac{p_C^i (1-p_C)^{\beta_A}}{(\beta_A+i)B(1+i,\beta_A)}\right) \left(1-\sum_{i=0}^{\alpha_B-1}\frac{p_C^i (1-p_C)^{\beta_B}}{(\beta_B+i)B(1+i,\beta_B)}\right) f(p_C) dp_C \\ \]

Multiplying out the parenthetical terms and integrating them separately:

\[ {\rm Pr} = 1 - \int_0^1 \sum_{i=0}^{\alpha_A-1}\frac{p_C^i (1-p_C)^{\beta_A}}{(\beta_A+i)B(1+i,\beta_A)} f(p_C) dp_C - \int_0^1 \sum_{i=0}^{\alpha_B-1}\frac{p_C^i (1-p_C)^{\beta_B}}{(\beta_B+i)B(1+i,\beta_B)} f(p_C) dp_C \\ + \int_0^1 \sum_{i=0}^{\alpha_A-1}\frac{p_C^i (1-p_C)^{\beta_A}}{(\beta_A+i)B(1+i,\beta_A)} \sum_{i=0}^{\alpha_B-1}\frac{p_C^i (1-p_C)^{\beta_B}}{(\beta_B+i)B(1+i,\beta_B)} f(p_C) dp_C \]

From the previous derivation, we can rewrite the first two integrals as \({\rm Pr}(p_A > p_C)\) and \({\rm Pr}(p_B > p_C)\), and consolidate the terms inside the third integral:

\[ \begin{array}{ll} {\rm Pr} &=& 1 - {\rm Pr}(p_A > p_C) - {\rm Pr}(p_B > p_C) \\ && + \int_0^1 \sum_{i=0}^{\alpha_A-1} \sum_{j=0}^{\alpha_B-1} \frac{p_C^{i+j}(1-p_C)^{\beta_A+\beta_B}}{(\beta_A+i)(\beta_B+j)B(1+i,\beta_A)B(1+j,\beta_B)} \frac{p_C^{\alpha_C-1}(1-p_C)^{\beta_C-1}}{B(\alpha_C, \beta_C)} dp_C \\ \\ &=& 1 - {\rm Pr}(p_A > p_C) - {\rm Pr}(p_B > p_C) \\ && + \int_0^1 \sum_{i=0}^{\alpha_A-1} \sum_{j=0}^{\alpha_B-1} \frac{p_C^{i+j+\alpha_C-1}(1-p_C)^{\beta_A+\beta_B+\beta_C-1}}{(\beta_A+i)(\beta_B+j)B(1+i,\beta_A)B(1+j,\beta_B)B(\alpha_C, \beta_C)} dp_C \end{array} \]

Finally, evaluating the integral we have:

\[ \begin{array}{ll} {\rm Pr}(p_C > \max\{p_A, p_B\}) &=& 1 - {\rm Pr}(p_A > p_C) - {\rm Pr}(p_B > p_C) \\ && + \sum_{i=0}^{\alpha_A-1} \sum_{j=0}^{\alpha_B-1} \frac{B(i+j+\alpha_C, \beta_A+\beta_B+\beta_C)}{(\beta_A+i)(\beta_B+j)B(1+i,\beta_A)B(1+j,\beta_B)B(\alpha_C, \beta_C)} \end{array} \]

Implementation

In Julia (taking advantage of probability_B_beats_A defined above):

function probability_C_beats_A_and_B(α_A, β_A, α_B, β_B, α_C, β_C)
    total = 0.0
    for i = 0:(α_A-1)
        for j = 0:(α_B-1)
          total += exp(lbeta(α_C+i+j, β_A+β_B+β_C) - log(β_A+i) - log(β_B+j)
              - lbeta(1+i, β_A) - lbeta(1+j, β_B) - lbeta(α_C, β_C))
        end
    end
    return 1 - probability_B_beats_A(α_C, β_C, α_A, β_A) 
             - probability_B_beats_A(α_C, β_C, α_B, β_B) + total
end

See the hints above if you don't have a log-beta function in your language.


A/B testing: count data

Analyzing count data (for example, if you're comparing the number of sales per salesman, or number of sales week over week) requires a different formula. The probability that group 1 has a higher arrival rate than group 2 is given by:

\[ {\rm Pr}(\lambda_1 > \lambda_2) = \sum_{k=0}^{\alpha_1-1} \frac{ (\beta_1 + \beta_2)^{-(k+\alpha_2)} \beta_1^k \beta_2^{\alpha_2} }{(k+\alpha_2)B(k+1,\alpha_2)} \]

Here the \(\alpha\) values represent the total event counts in each group, and the \(\beta\) values represent the exposure, that is, the relative opportunity for events to occur. For example, if the first group was “exposed” to events for twice as long as the second group, you would set \(\beta_1 = 2\) and \(\beta_2 = 1\). (Alternatively, you could set \(\beta_1 = 20\) and \(\beta_2 = 10\); the math works out the same.) In the above equation, \(B\) is the beta function.

A derivation and implementation follow; they both closely mirror the binary-outcome case.

Derivation

Not for the timid! You may want to skip right to the implementation section.

Let \(\lambda_1\) and \(\lambda_2\) be the Poisson parameter for each group. With a gamma-distributed prior belief, the posterior beliefs are given by:

\[ \begin{array}{lr} \lambda_1 \sim {\rm Gamma}(\alpha_1, \beta_1) \\ \lambda_2 \sim {\rm Gamma}(\alpha_2, \beta_2) \end{array} \]

Using the pdf of the gamma distribution, we can get the total probability that \(\lambda_1\) is greater than \(\lambda_2\) by integrating the joint distribution over all values for which \(\lambda_1 > \lambda_2\):

\[ \begin{equation} {\rm Pr}(\lambda_1 > \lambda_2) = \int_0^\infty \int_{\lambda_2}^\infty \frac{\beta_1^{\alpha_1} \lambda_1^{\alpha_1 - 1} e^{-\beta_1 \lambda_1}}{\Gamma(\alpha_1)} \frac{\beta_2^{\alpha_2} \lambda_2^{\alpha_2 - 1} e^{-\beta_2 \lambda_2}}{\Gamma(\alpha_2)} d\lambda_1 d\lambda_2 \end{equation} \]

Evaluating the inner integral, the equation becomes:

\[ \begin{equation} \label{count_ab_pr_eval_inner} \int_0^\infty Q(\alpha_1, \beta_1 \lambda_2) \frac{\beta_2^{\alpha_2} \lambda_2^{\alpha_2-1} e^{-\beta_2 \lambda_2}}{\Gamma(\alpha_2)} d\lambda_2 \end{equation} \]

Where \(Q\) is the upper incomplete regularized gamma function. Using the identity \(Q(1,z) = e^{-z}\) and the recursive relationship

\[ \begin{equation} Q(a + n, z) = Q(a, z) + z^a e^{-z} \sum_{k=0}^{n-1} \frac{z^k}{\Gamma(a+k+1)} \end{equation} \]

we can express \(Q\) as:

\[ \begin{equation} \label{igamma_sum} Q(n, z) = e^{-z}\sum_{k=0}^{n-1}\frac{z^k}{\Gamma(k+1)} \end{equation} \]

The probability integral \(\eqref{count_ab_pr_eval_inner}\) can therefore be written:

\[ \begin{array}{ll} {\rm Pr}(\lambda_1 > \lambda_2) &=& \int_0^\infty e^{-\beta_1 \lambda_2} \left( \sum_{k=0}^{\alpha_1-1} \frac{(\beta_1 \lambda_2)^k}{\Gamma(k+1)}\right) \frac{\beta_2^{\alpha_2} \lambda_2^{\alpha_2-1} e^{-\beta_2 \lambda_2}}{\Gamma(\alpha_2)} d\lambda_2 \\ &=& \sum_{k=0}^{\alpha_1-1} \int_0^\infty \frac{e^{-\beta_1 \lambda_2} (\beta_1 \lambda_2)^k}{\Gamma(k+1)} \frac{\beta_2^{\alpha_2} \lambda_2^{\alpha_2-1} e^{-\beta_2 \lambda_2}}{\Gamma(\alpha_2)} d\lambda_2 \\ &=& \sum_{k=0}^{\alpha_1-1} \frac{\beta_1^k \beta_2^{\alpha_2}}{\Gamma(k+1) \Gamma(\alpha_2)} \int_0^\infty e^{-(\beta_1+\beta_2) \lambda_2} \lambda_2^{k+\alpha_2-1} d\lambda_2 \\ &=& \sum_{k=0}^{\alpha_1-1} \frac{\beta_1^k \beta_2^{\alpha_2}}{\Gamma(k+1) \Gamma(\alpha_2)} (\beta_1 + \beta_2)^{-(k+\alpha_2)} \Gamma(k+\alpha_2) \\ &=& \sum_{k=0}^{\alpha_1-1} \beta_1^k \beta_2^{\alpha_2} (\beta_1 + \beta_2)^{-(k+\alpha_2)} \frac{\Gamma(k+\alpha_2)}{\Gamma(k+1) \Gamma(\alpha_2)} \frac{k+\alpha_2}{k+\alpha_2} \end{array} \]

Using \(\Gamma(z+1) = z\Gamma(z)\), we have:

\[ \begin{equation} \label{gamma_version} {\rm Pr}(\lambda_1 > \lambda_2) = \sum_{k=0}^{\alpha_1-1} \beta_1^k \beta_2^{\alpha_2} (\beta_1 + \beta_2)^{-(k+\alpha_2)} \frac{\Gamma(k+\alpha_2+1)}{\Gamma(k+1) \Gamma(\alpha_2)} \frac{1}{k+\alpha_2} \end{equation} \]

Replacing the gamma functions with a beta function, we have:

\[ \begin{equation} \label{count_ab_pr_alpha_b} {\rm Pr}(\lambda_1 > \lambda_2) = \sum_{k=0}^{\alpha_1-1} \frac{ (\beta_1 + \beta_2)^{-(k+\alpha_2)} \beta_1^k \beta_2^{\alpha_2} }{(k+\alpha_2)B(k+1,\alpha_2)} \end{equation} \]

Implementation

Here's a quick implementation of \(\eqref{count_ab_pr_alpha_b}\) in Julia:

function probability_1_beats_2(α_1, β_1, α_2, β_2)
    total = 0.0
    for k = 0:(α_1-1)
        total += exp(k * log(β_1) + α_2 * log(β_2) - (k+α_2) * log(β_1 + β_2)
            - log(k + α_2) - lbeta(k + 1, α_2))
    end
    return total
end

The beta function produces very large numbers, so if you’re getting infinite values in your program, be sure to work with logarithms, as in the code above. Your standard library’s log-beta function will come in handy here.

If you don’t have log-beta available, it's easy enough to define one with the log-gamma function and the identity:

\[ \log(B(a, b)) = \log(\Gamma(a)) + \log(\Gamma(b)) - \log(\Gamma(a+b)) \]

For more life-changing tips see the implementation section for the binary-outcome case.


A/B/C testing: count data

The count data formula above can be extended to three groups:

\[ {\rm Pr}(\lambda_1 > \max\{\lambda_2, \lambda_3\}) = 1 - {\rm Pr}(\lambda_2 > \lambda_1) - {\rm Pr}(\lambda_3 > \lambda_1) \\ + \sum_{k=0}^{\alpha_2-1} \sum_{l=0}^{\alpha_3-1} \frac{\beta_1^{\alpha_1} \beta_2^k \beta_3^l}{(\beta_1 + \beta_2 + \beta_3)^{(k+l+\alpha_1)}} \frac{\Gamma(k+l+\alpha_1)}{\Gamma(k+1) \Gamma(l+1) \Gamma(\alpha_1)} \]

Where:

Derivation

Start with a Bayesian belief for each of three experimental branches (A, B, and C):

\[ \begin{array}{lr} \lambda_1 \sim {\rm Gamma}(\alpha_1, \beta_1) \\ \lambda_2 \sim {\rm Gamma}(\alpha_2, \beta_2) \\ \lambda_2 \sim {\rm Gamma}(\alpha_3, \beta_3) \end{array} \]

Calling the pdf of the gamma distribution \(g(\lambda|\alpha, \beta) = g(\lambda)\), we can construct the total probability that \(\lambda_1\) is greater than \(\lambda_2\) and \(\lambda_3\) with a triple integral:

\[ {\rm Pr}(\lambda_1 > \max\{\lambda_2, \lambda_3\}) = \int_0^\infty \int_0^{\lambda_1} \int_0^{\lambda_1} g(\lambda_3) g(\lambda_2) g(\lambda_1) d\lambda_3 d\lambda_2 d\lambda_1 \]

Evaluating the two inner integrals (using the upper incomplete regularized gamma function as before), we can write:

\[ {\rm Pr}(\lambda_1 > \max{\lambda_2, \lambda_3}) = \int_0^\infty (1-Q(\alpha_2, \beta_2 \lambda_1))(1-Q(\alpha_3, \beta_3 \lambda_1)) g(\lambda_1) d\lambda_1 \]

Multiplying out the parenthetical terms and distributing the integral across the four resulting terms:

\[ {\rm Pr} = 1 - \int_0^\infty Q(\alpha_2, \beta_2 \lambda_1) g(\lambda_1) d\lambda_1 - \int_0^\infty Q(\alpha_3, \beta_3 \lambda_1) g(\lambda_1) d\lambda_1 + \int_0^\infty Q(\alpha_2, \beta_2 \lambda_1) Q(\alpha_3, \beta_3 \lambda_1) g(\lambda_1) d\lambda_1 \]

Following the previous derivation, we can rewrite the first two integrals as \({\rm Pr}(\lambda_2 > \lambda_1)\) and \({\rm Pr}(\lambda_3 > \lambda_1)\), and then write the remaining \(Q\) terms as summations:

\[ {\rm Pr} = 1 - {\rm Pr}(\lambda_2 > \lambda_1) - {\rm Pr}(\lambda_3 > \lambda_1) + \int_0^\infty e^{-\beta_2\lambda_1} \left(\sum_{k=0}^{\alpha_2-1} \frac{(\beta_2 \lambda_1)^k}{\Gamma(k+1)} \right) e^{-\beta_3\lambda_1} \left(\sum_{l=0}^{\alpha_3-1} \frac{(\beta_3 \lambda_1)^l}{\Gamma(l+1)} \right) g(\lambda_1) d\lambda_1 \]

Writing out \(g\) explicitly and consolidating terms:

\[ {\rm Pr} = 1 - {\rm Pr}(\lambda_2 > \lambda_1) - {\rm Pr}(\lambda_3 > \lambda_1) + \int_0^\infty e^{-\lambda_1(\beta_1+\beta_2+\beta_3)} \sum_{k=0}^{\alpha_2-1} \sum_{l=0}^{\alpha_3-1} \frac{\beta_1^{\alpha_1} \beta_2^k \beta_3^l \lambda_1^{(k+l+\alpha_1-1)}}{\Gamma(k+1)\Gamma(l+1)\Gamma(\alpha_1)} d\lambda_1 \]

Bringing the integral inside the summations:

\[ {\rm Pr} = 1 - {\rm Pr}(\lambda_2 > \lambda_1) - {\rm Pr}(\lambda_3 > \lambda_1) + \sum_{k=0}^{\alpha_2-1} \sum_{l=0}^{\alpha_3-1} \frac{\beta_1^{\alpha_1} \beta_2^k \beta_3^l}{\Gamma(k+1)\Gamma(l+1)\Gamma(\alpha_1)} \int_0^\infty e^{-\lambda_1(\beta_1+\beta_2+\beta_3)} \lambda_1^{(k+l+\alpha_1-1)} d\lambda_1 \]

And evaluating it:

\[ {\rm Pr} = 1 - {\rm Pr}(\lambda_2 > \lambda_1) - {\rm Pr}(\lambda_3 > \lambda_1) + \sum_{k=0}^{\alpha_2-1} \sum_{l=0}^{\alpha_3-1} \frac{\beta_1^{\alpha_1} \beta_2^k \beta_3^l}{\Gamma(k+1)\Gamma(l+1)\Gamma(\alpha_1)} \frac{\Gamma(k+l+\alpha_1)}{(\beta_1+\beta_2+\beta_3)^{(k+l+\alpha_1)}} \]

Rearranging, we finally have:

\[ \begin{equation} \label{count_abc_pr_alpha_b} {\rm Pr} = 1 - {\rm Pr}(\lambda_2 > \lambda_1) - {\rm Pr}(\lambda_3 > \lambda_1) + \sum_{k=0}^{\alpha_2-1} \sum_{l=0}^{\alpha_3-1} \frac{\beta_1^{\alpha_1} \beta_2^k \beta_3^l}{(\beta_1+\beta_2+\beta_3)^{(k+l+\alpha_1)}} \frac{\Gamma(k+l+\alpha_1)}{\Gamma(k+1)\Gamma(l+1)\Gamma(\alpha_1)} \end{equation} \]

It also is possible to rewrite the gamma functions in terms of a single, multivariate beta function, but that doesn't necessarily increase the equation's clarity.

Implementation

In Julia, leveraging the probability_1_beats_2 function above, we can compute \(\eqref{count_abc_pr_alpha_b}\) with:

function probability_1_beats_2_and_3(α_1, β_1, α_2, β_2, α_3, α_3)
    total = 0.0
    for k = 0:(α_2-1)
        for l = 0:(α_3-1)
            total += exp(α_1 * log(β_1) + k * log(β_2) + l * log(β_3) 
                - (k+l+α_1) * log(β_1 + β_2 + β_3)
                + lgamma(k+l+α_1) - lgamma(k+1) - lgamma(l+1) - lgamma(α_1))
        end
    end
    return 1 - probability_1_beats_2(α_2, β_2, α_1, β_1)
             - probability_1_beats_2(α_3, β_3, α_1, β_1) + total
end

And that's a wrap!


Footnotes

1 An informative prior can also be used here, with the restriction that \(\alpha\) and \(\beta\) remain integers.


Changes


You're reading evanmiller.org, a random collection of math, tech, and musings. If you enjoyed this, you might also like:

If you run A/B tests frequently, you should check out Evan's Awesome A/B Tools:

Sample Size Calculator

Chi-Squared Test

Two-Sample T-Test

Finally, if you own a Mac, my desktop statistics software Wizard can help you analyze more data in less time and communicate discoveries visually without spending days struggling with pointless command syntax. Check it out!


Wizard
Statistical analyzer

Back to Evan Miller's home pageFollow on TwitterSubscribe to RSS