Skip to main content

The Fairest Possible Choice

· 5 min read
Series parts
  1. Part 1The Fairest Possible Choice
On this page

In software, the presence of randomness usually starts with an absence of justification. A load balancer sees two healthy servers. A test harness has several equally valid cases. A tiny experiment needs one user out of many. The system has a few acceptable options, none deserves a preference the code can explain, and something still needs to be picked.

Probability slips in right there. It arrives disguised as indifference, then leaves behind a distribution whether anyone asked for it or not. Most of the time, this whole idea gets compressed into a line of code that looks almost too innocent and ordinary to deserve attention:

floor-index.ts
const i = Math.floor(Math.random() * n);

It all feels trivial until someone asks the annoying but important question: is this actually fair?

That question is where sampling begins.

What “fair” means

Suppose you have a finite set with n elements:

S={x1,x2,,xn}.S = \{x_1, x_2, \dots, x_n\}.

To sample uniformly from SS means that every element has the same probability of being chosen. Written with standard probability notation1:

P(X=xi)=1n,i=1,,n.P(X = x_i) = \frac{1}{n}, \qquad i = 1, \dots, n.

That is the whole definition. Uniform means equal odds.

This is one of those ideas that sounds so simple it almost disappears. But it hides an important distinction:

random does not mean “messy-looking,” and fair does not mean “everyone gets picked quickly.”

A uniform sampler can pick the same item twice in a row. It can pick the same item five times in a row. That may feel suspicious, but it is not evidence of unfairness. Fairness is not about how the first few draws feel. It is about the long-run frequencies.

If you draw from a uniform sampler many times, each item should appear about equally often. Not exactly equally. Just with no built-in favoritism.

Sampling is not magic; it is a probability distribution attached to a choice.

The smallest useful implementation

In TypeScript, a minimal uniform sampler over a non-empty array looks like this:

sample-uniform.ts
export type NonEmptyArray<T> = readonly [T, ...T[]];

export function sampleUniform<T>(xs: NonEmptyArray<T>): T {
  const i = Math.floor(Math.random() * xs.length);
  return xs[i];
}

Almost suspiciously short. And for many ordinary uses, perfectly fine.

If Math.random() behaves like a uniform draw from [0,1)[0, 1), then multiplying by xs.length gives a number in [0,n)[0, n), and Math.floor(...) maps each equal-width interval to one array index.

For n = 4, the intervals look like this:

  • [0, 1) maps to index 0
  • [1, 2) maps to index 1
  • [2, 3) maps to index 2
  • [3, 4) maps to index 3

Each interval has the same width, so each index gets the same probability. That is the whole trick.

A bug that looks innocent

There is a version of this code that many people write once, trust for a few years, and then quietly regret:

round-index.ts
const i = Math.round(Math.random() * (n - 1));

It looks reasonable. It is not.

The problem is that rounding does not carve the interval into equal pieces. Math.round groups numbers by “nearest integer,” and the first and last groups get cut in half by the boundaries of the range.

If n = 4, then Math.random() * (n - 1) is uniform on [0, 3). The mapping is:

  • [0, 0.5) maps to index 0
  • [0.5, 1.5) maps to index 1
  • [1.5, 2.5) maps to index 2
  • [2.5, 3) maps to index 3

That means indices 0 and 3 get intervals of width 0.5, while indices 1 and 2 get intervals of width 1. Since probability is proportional to interval width, the edge indices each get probability 1/6, while the middle indices each get 1/3. The middle values are twice as likely.

A sampler can look random and still be biased. Software loves bugs like this because they do not crash. They just quietly tilt reality.

Uniform Sampling Explorer
n
Draw
n1001k10k
frequency
0
1
2
3
4
5
6
7
8
9

At small sample sizes, both versions can look innocent. That is exactly why this bug survives. Give them a few thousand draws and the disguise falls apart: Math.floor drifts toward balance, while Math.round keeps feeding the middle indices. Random-looking is cheap. Uniform is stricter.

What uniform sampling buys you

The value of uniform sampling is not mathematical purity. It is restraint. When every option is genuinely symmetric, any deterministic rule is just a hidden bias dressed up nicely. Uniform choice makes the weaker and more honest claim: the system has no reason to prefer one item, so it should not pretend that it does.

It is also the baseline for the rest of this series: weighted sampling changes the probabilities; sampling without replacement forbids repeats; reservoir sampling assumes the whole set may never fit in memory.

It is the simplest random choice you can defend. Later variants only add constraints on top of it.


Uniform choice only works while the options are truly symmetric. Real systems rarely stay there for long. Search results come with partial evidence, streams arrive with skew, databases sample from uneven tables, and token decoders rank candidates that are anything but interchangeable. The moment those asymmetries matter, 1/n stops being honest. That is where weighted sampling begins in the next post.