Formulas for Bootstrapping Sample Medians
By Evan Miller
September 9, 2022
A recent paper from researchers at Spotify describes a new method for sampling differences in quantiles using a bootstrap method. To speed up computation compared to existing methods, the authors approximate the distribution of bootstrapped quantiles using a binomial distribution, and conjecture that the bootstrapped distribution of quantiles converges to a binomial in the large-sample limit.
This note derives an exact closed-form solution for the distribution of interest in the special case of the median, valid in both small and large samples. This closed-form solution is somewhat lighter in the tails than a binomial; whether it converges to a binomial, that is, whether the conjecture is true, is left as an open problem. The note concludes with an improved approximation that will yield less conservative confidence intervals than the binomial when measuring differences in the median across two samples.
Preliminaries
The bootstrap method in the paper is a Poisson bootstrap. Each point in the original sample is included in each bootstrap sample according to a Poisson distribution with parameter (mean) 1.
Suppose that the original sample consists of ordered points \((y_1, \ldots, y_n)\); we are interested in the probability that each point in the sample will be the quantile of interest in a bootstrapped sample. Denote this probability \({\rm Pr}(y_i=\hat{\tau}_q)\).
Unlike a traditional bootstrap, the total number of points in a Poisson-bootstrapped sample is a random variable. Poisson parameters are additive, and so the total number of points is Poisson-distributed with parameter \(n\).
As a small curiosity,
\[ \sum_{i=1}^n {\rm Pr}(y_i=\hat{\tau}_q) = 1 - e^{-n} < 1 \]
That is, it is possible that no points in the original sample are included in a given bootstrap sample, let alone included as the quantile of interest. But this probability vanishes with large \(n\).
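To make the preliminaries concrete, here is a minimal pure-Python sketch of the Poisson bootstrap described above. The function names are mine, not the paper's, and a production implementation would vectorize this; note the empty-resample case and the convention (adopted below) of designating either middle point of an even-sized resample as the median with probability one-half.

```python
import math
import random
from collections import Counter

def poisson1(rng):
    # Draw from a Poisson distribution with mean 1 (Knuth's product method).
    threshold = math.exp(-1.0)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p < threshold:
            return k
        k += 1

def poisson_bootstrap_median(y, rng):
    # Include each point of the ordered sample Poisson(1) times, then take the
    # median. For even-sized resamples, pick either middle point with
    # probability 1/2. Returns None if the resample is empty
    # (which happens with probability e^{-n}).
    resample = []
    for point in sorted(y):
        resample.extend([point] * poisson1(rng))
    if not resample:
        return None
    m = len(resample)
    if m % 2 == 1:
        return resample[m // 2]
    return resample[m // 2 - 1] if rng.random() < 0.5 else resample[m // 2]

rng = random.Random(42)
y = list(range(1, 6))  # a toy ordered sample with n = 5
medians = [poisson_bootstrap_median(y, rng) for _ in range(4000)]
empty_rate = sum(m is None for m in medians) / len(medians)
```

Running this with \(n=5\) makes the curiosity visible: a small fraction of resamples, about \(e^{-5}\approx 0.7\%\), contain no points at all and hence have no median.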
The rest of the analysis will be restricted to the median quantile \(\hat{\tau}_{1/2}\), as its symmetric properties make it amenable to exact modeling.
Conditional model
As a first step, let \(m_i\) represent the number of times that \(y_i\) is included in a bootstrap sample. Suppose \(m_i=1\) for some point \(y_i\), \(1<i<n\), that is, \(y_i\) is included exactly once. What is the probability that this \(y_i\) will represent the median in the bootstrap sample?
Momentarily restricting the analysis to odd-numbered bootstrap sample sizes, \(y_i\) will be the bootstrapped median if the number of points drawn below \(y_i\) exactly equals the number of points drawn above \(y_i\).
The additive nature of the Poisson parameter lets us quantify this probability. The number of points drawn below \(y_i\) will be Poisson-distributed with mean \(i-1\), and the number of points drawn above \(y_i\) will be Poisson-distributed with mean \(n-i\). Denoting the odd-numbered sample size restriction with \({\rm Pr}_{\rm odd}\), the probability that these two numbers are equal is
\[ {\rm Pr}_{\rm odd}(y_i=\hat{\tau}_{1/2}\mid m_i=1) = \sum_{k=0}^\infty \frac{e^{-(i-1)}(i-1)^k}{k!} \frac{e^{-(n-i)}(n-i)^k}{k!} = e^{-(n-1)} \sum_{k=0}^\infty \frac{((i-1)(n-i))^k}{(k!)^2} \]
Noting that the modified Bessel function of the first kind with \(\nu=0\) is
\[ I_0(z) = \sum_{k=0}^\infty \frac{(\frac{1}{4}z^2)^k}{(k!)^2} \]
we can set
\[ \begin{array}{c} \frac{1}{4}z^2 = (i-1)(n-i) \\ z = 2\sqrt{(i-1)(n-i)} \end{array} \]
and write
\[ {\rm Pr}_{\rm odd}(y_i=\hat{\tau}_{1/2}\mid m_i=1) = e^{-(n-1)}I_0(2\sqrt{(i-1)(n-i)}) \]
If you implement the above equation as written, you may notice that
\[ \sum_{i=1}^n{\rm Pr}_{\rm odd}(y_i=\hat{\tau}_{1/2}\mid m_i=1) < 0.5 \]
which will seem to indicate a missing factor of two somewhere. This factor of two is attributable to the fact that odd-numbered sample sizes will be chosen approximately half the time.
To rectify the deficiency, consider even-numbered sample sizes, where either of the middle two points in the sorted sample could be considered the median. Retaining the \(m_i=1\) condition, and following the paper’s convention of designating either middle point the median with 50% probability, \(y_i\) will be the bootstrapped median (with probability one-half) if the difference between the number of points drawn less than \(y_i\) and the number of points drawn greater than \(y_i\) is exactly one. Mathematically,
\[ {\rm Pr}_{\rm even}(y_i=\hat{\tau}_{1/2}\mid m_i=1)= \frac{1}{2}\left(\sum_{k=0}^\infty \frac{e^{-(i-1)}(i-1)^k}{k!} \frac{e^{-(n-i)}(n-i)^{k+1}}{(k+1)!} + \sum_{k=0}^\infty \frac{e^{-(i-1)}(i-1)^{k+1}}{(k+1)!} \frac{e^{-(n-i)}(n-i)^{k}}{k!} \right) \]
The first-order modified Bessel function of the first kind can be written
\[ I_1(z) = \frac{1}{2}z \sum_{k=0}^\infty \frac{(\frac{1}{4}z^2)^k}{k!(k+1)!} \]
and so setting again
\[ z = 2\sqrt{(i-1)(n-i)} \]
we have
\[ \begin{array}{c} {\rm Pr}_{\rm even}(y_i=\hat{\tau}_{1/2}\mid m_i=1)= \frac{1}{2}e^{-(n-1)}\left[\sqrt{\frac{i-1}{n-i}}I_1(2\sqrt{(i-1)(n-i)})+\sqrt{\frac{n-i}{i-1}}I_1(2\sqrt{(i-1)(n-i)})\right] \\ =\frac{1}{2}e^{-(n-1)}\left[\sqrt{\frac{i-1}{n-i}}+\sqrt{\frac{n-i}{i-1}}\right]I_1(2\sqrt{(i-1)(n-i)}) \end{array} \]
For notational simplicity, let
\[ I_{\nu}=I_\nu(2\sqrt{(i-1)(n-i)}), \qquad I_{\nu}^*=\left[\left(\frac{i-1}{n-i}\right)^{\nu/2}+\left(\frac{n-i}{i-1}\right)^{\nu/2}\right]I_\nu \]
Then
\[ \begin{array}{c} {\rm Pr}_{\rm odd}(y_i=\hat{\tau}_{1/2}\mid m_i=1)= e^{-(n-1)}I_0 \\ {\rm Pr}_{\rm even}(y_i=\hat{\tau}_{1/2}\mid m_i=1)= \frac{1}{2}e^{-(n-1)}I_1^* \\ {\rm Pr}(y_i=\hat{\tau}_{1/2}\mid m_i=1)= e^{-(n-1)}\left[I_0+\frac{1}{2}I_1^*\right] \end{array} \]
We proceed in like manner for the case when \(m_i=2\), that is, \(y_i\) is drawn exactly twice, finding
\[ \begin{array}{c} {\rm Pr}_{\rm odd}(y_i=\hat{\tau}_{1/2}\mid m_i=2)= e^{-(n-1)}I_1^* \\ {\rm Pr}_{\rm even}(y_i=\hat{\tau}_{1/2}\mid m_i=2)= e^{-(n-1)}\left[I_0+\frac{1}{2}I_2^*\right] \\ {\rm Pr}(y_i=\hat{\tau}_{1/2}\mid m_i=2)= e^{-(n-1)}\left[I_0+I_1^*+\frac{1}{2}I_2^*\right] \end{array} \]
Generically,
\[ {\rm Pr}(y_i=\hat{\tau}_{1/2}\mid m_i>0)= e^{-(n-1)}\left[I_0+\sum_{s=1}^{m_i-1}I_s^*+\frac{1}{2}I_{m_i}^*\right] \]
Unconditional model
Recall that in the Poisson bootstrap, \(m_i\) will be Poisson-distributed with mean 1. We can then write an unconditional expression for the probability that \(y_i\) will be the bootstrapped median as
\[ {\rm Pr}(y_i=\hat{\tau}_{1/2})= \sum_{k=1}^\infty \frac{e^{-1}}{k!} {\rm Pr}(y_i=\hat{\tau}_{1/2}\mid m_i=k)= \sum_{k=1}^\infty \frac{e^{-1}}{k!} e^{-(n-1)}\left[I_0+\sum_{s=1}^{k-1}I_s^*+\frac{1}{2}I_{k}^*\right] \]
A bit of algebra yields
\[ {\rm Pr}(y_i=\hat{\tau}_{1/2})=e^{-n}\left[c_0 I_0 + \sum_{k=1}^\infty c_k I_k^*\right] \]
where
\[ \begin{array}{rcl} c_0&=&e-1\\ c_k &=& \frac{1}{2}\frac{1}{k!}+e-\sum_{s=0}^k\frac{1}{s!} \end{array} \]
An implementation reveals that this equation is well-approximated by the suggested binomial distribution, though it is somewhat lighter in the tails.^{1} We may conclude that for difference-in-quantile inference, the use of the binomial approximation for index selection will yield slightly conservative confidence intervals.
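The closed form can be implemented directly. The sketch below uses a plain power series for \(I_\nu\) to stay self-contained; a production version should use scaled Bessel functions instead, since \(e^{-n}\) times an exponentially growing \(I_\nu\) will overflow double precision at large \(n\). The \(k\)-sum is truncated, which is safe because \(c_k\) decays factorially. Function names are illustrative.

```python
import math

def bessel_i(nu, z):
    # Power series for the modified Bessel function I_nu(z), integer nu >= 0:
    # I_nu(z) = sum_k (z/2)^(2k+nu) / (k! (k+nu)!)
    term = (z / 2.0) ** nu / math.factorial(nu)
    total, k = term, 0
    while term > 1e-17 * total:
        k += 1
        term *= (z * z / 4.0) / (k * (k + nu))
        total += term
    return total

def pr_median_exact(i, n, kmax=40):
    # Exact Pr(y_i = bootstrapped median) for 1 < i < n:
    #   e^{-n} [ (e-1) I_0 + sum_{k>=1} c_k I_k^* ]
    # with c_k = (1/2)(1/k!) + e - sum_{s=0}^{k} 1/s!.
    a, b = i - 1, n - i
    z = 2.0 * math.sqrt(a * b)
    r = math.sqrt(a / b)
    total = (math.e - 1.0) * bessel_i(0, z)
    partial_exp = 1.0  # running sum_{s=0}^{k} 1/s!
    for k in range(1, kmax + 1):
        partial_exp += 1.0 / math.factorial(k)
        c_k = 0.5 / math.factorial(k) + math.e - partial_exp
        i_star = (r ** k + r ** (-k)) * bessel_i(k, z)
        total += c_k * i_star
    return math.exp(-n) * total

p_center = pr_median_exact(11, 21)  # probability for the middle point, n = 21
```

Two sanity checks: the expression is symmetric in \(i \leftrightarrow n+1-i\), and summing it over the interior points \(2 \le i \le n-1\) recovers essentially all of the \(1-e^{-n}\) total mass, since the omitted edge probabilities are negligible.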
Approximation
As an alternative to the binomial, I suggest the approximation
\[ {\rm Pr}(y_i=\hat{\tau}_{1/2})\approx 2e^{-(n+1)}I_0(2\sqrt{(n+1-i)\,i}) \]
This expression can be derived as twice the probability that \((y_1,\ldots,y_i)\) and \((y_i,\ldots,y_n)\) will contribute equal numbers of points to the bootstrapped sample (note that \(y_i\) appears in both sets), conditioned on an odd-numbered bootstrapped sample size; the two spans contribute Poisson counts with means \(i\) and \(n+1-i\), so the same Bessel identity as above applies. The approximation adheres more closely to the exact expression than the binomial, though it remains heavier in the tails than the exact expression. Thus, when used for bootstrapping confidence intervals, it will err very slightly on the conservative side.
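By comparison with the exact closed form, the approximation needs only a single Bessel evaluation per index. A self-contained sketch (again with a hand-rolled \(I_0\) series standing in for a library call; the function name is illustrative):

```python
import math

def bessel_i0(z):
    # I_0(z) = sum_k (z^2/4)^k / (k!)^2, summed to machine precision.
    term, total, k = 1.0, 1.0, 0
    while term > 1e-17 * total:
        k += 1
        term *= (z * z / 4.0) / (k * k)
        total += term
    return total

def pr_median_approx(i, n):
    # Suggested approximation:
    #   Pr(y_i = bootstrapped median) ~ 2 e^{-(n+1)} I_0(2 sqrt((n+1-i) i))
    return 2.0 * math.exp(-(n + 1)) * bessel_i0(2.0 * math.sqrt((n + 1 - i) * i))
```

At the middle index with odd \(n\) the argument is \(z=n+1\), and the asymptotic \(I_0(z)\approx e^z/\sqrt{2\pi z}\) gives a central value of about \(\sqrt{2/(\pi(n+1))}\), matching the central binomial probability.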
The foregoing treatment has omitted exact probability expressions for the edge cases \({\rm Pr}(y_1=\hat{\tau}_{1/2})\) and \({\rm Pr}(y_n=\hat{\tau}_{1/2})\); these are left as an exercise for the highly motivated reader. For all practical applications, they may be approximated as zero.
Conclusion
The Poisson bootstrap has gained a measure of popularity in recent years due to its ease of implementation as well as its adaptability to parallel-data settings. The derivations above demonstrate another benefit: amenability to exact analysis, which may shed some future light on the original authors’ conjecture that the distribution of sample quantile indexes converges to a binomial in the large-sample limit. In the meantime, implementers may use the above equations to tighten their bootstrapped confidence intervals for any statistics involving sample medians.
Notes

1. Implementers are advised to make use of scaled Bessel functions, such as SpecialFunctions.besselix in Julia or scipy.special.ive in Python.