I Found a Probability Bug in My Old TSP Solver

I reopened sampling.h from my old university TSP project because I expected a concrete case study. Line 178 gave me a better article: the weighted sampler cites the correct paper, computes the paper’s random keys, and then keeps the wrong end of the priority queue.

The solver may still have liked the resulting selection pressure. The code did not implement the distribution its comment promised.

That is the danger with randomized code. A deterministic bug often produces a stable wrong answer. A probability bug keeps producing plausible answers with the wrong frequencies.

The cleanest primitive: pick distinct indices

The sample_indexes(low, high, k, random) function samples k distinct indices from a range, using Robert Floyd’s algorithm.

For a neighborhood operator that needs distinct tour positions, this is the right abstraction. It samples the domain the move actually accepts instead of proposing duplicates and retrying.
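To make the idea concrete, here is a minimal TypeScript sketch of Floyd's algorithm. It is not the code from sampling.h: the name sampleIndexes, the half-open range [low, high), and the injected random function are my assumptions for illustration.

```typescript
// Sketch of Robert Floyd's sampling algorithm (illustrative, not the original C++).
// Assumption: the range is half-open, [low, high), and random() returns a float in [0, 1).
function sampleIndexes(
  low: number,
  high: number,
  k: number,
  random: () => number = Math.random
): number[] {
  const n = high - low;
  if (k < 0 || k > n) {
    throw new RangeError("k must satisfy 0 <= k <= high - low");
  }

  const chosen = new Set<number>();
  for (let j = n - k; j < n; j++) {
    // Draw t uniformly from [0, j]. If t was already chosen, take j instead;
    // j cannot be in the set yet because every earlier pick is below j.
    const t = Math.floor(random() * (j + 1));
    chosen.add(chosen.has(t) ? j : t);
  }

  // Shift the offsets back into the caller's range.
  return Array.from(chosen, (offset) => low + offset);
}
```

Each loop iteration adds exactly one new index, so the function makes k draws and never retries. The returned order is not itself a uniform random permutation, so shuffle the result if order matters.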

From indices to moves

Right below that, sample_pair(low, high, random) samples two distinct indices and optionally sorts them before returning the pair.

A small function, but it captures a big idea: a search move often begins as a sampled pair. In TSP, a pair can mean two cities to swap, two cut points for a segment move, or two edges that define a neighborhood operation.

The pair decides which local part of the tour becomes editable. Randomness is choosing a move, not an answer.
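A sketch of how a sampled pair can become a move, again in TypeScript rather than the original C++. The names samplePair and reverseSegment are hypothetical, the range is assumed half-open, and segment reversal is just one example of a cut-point move, not necessarily the operator my solver used.

```typescript
// Two distinct indices from [low, high), drawn without retries (illustrative sketch).
function samplePair(
  low: number,
  high: number,
  sorted: boolean = true,
  random: () => number = Math.random
): [number, number] {
  const n = high - low;
  if (n < 2) throw new RangeError("need at least two indices");

  const i = Math.floor(random() * n);
  let j = Math.floor(random() * (n - 1));
  if (j >= i) j += 1; // skip over i so the pair is always distinct

  const a = low + i;
  const b = low + j;
  return sorted && a > b ? [b, a] : [a, b];
}

// Example move: reverse the tour segment between the two sampled cut points.
// Returns a new array and leaves the input tour untouched.
function reverseSegment(
  tour: readonly number[],
  random: () => number = Math.random
): number[] {
  const [i, j] = samplePair(0, tour.length, true, random);
  return [
    ...tour.slice(0, i),
    ...tour.slice(i, j + 1).reverse(),
    ...tour.slice(j + 1),
  ];
}
```

The result is still a permutation of the same cities. Only the part of the tour between the two sampled positions changes.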

Sometimes you want constraints, not just randomness

The most metaheuristic-flavored function is sample_constrained_window. It samples two uniform random numbers, sorts them, converts them into positions, forces the second one to be at least delta_min away from the first, and clips it to be at most delta_max away.

The constraints are useful, but the distribution is easy to overstate. Clipping every oversized window to a1 + deltaMax piles probability on the maximum distance. The function returns valid windows; it does not sample uniformly from all valid constrained pairs. Its name wisely promises a constrained window, not a uniform one.

sample-constrained-pair.ts

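function sampleConstrainedPair(
  low: number,
  high: number,
  deltaMin: number,
  deltaMax: number
): [number, number] {
  // Number of start positions that still leave room for a gap of deltaMin.
  const space = high - low - deltaMin;

  // Two uniform draws, ordered so that u1 <= u2.
  let u1 = Math.random();
  let u2 = Math.random();
  if (u1 > u2) [u1, u2] = [u2, u1];

  // Scale to positions; the deltaMin shift guarantees a2 >= a1 + deltaMin.
  const a1 = low + Math.floor(u1 * space);
  let a2 = low + Math.floor(u2 * space) + deltaMin;

  // Clip oversized windows. Every clipped draw lands exactly on a1 + deltaMax.
  if (a2 > a1 + deltaMax) a2 = a1 + deltaMax;
  return [a1, a2];
}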
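One way to see the pile-up at the maximum distance is to tally the gaps the clipping rule produces. The sketch below repeats the same logic inside a hypothetical gapHistogram helper. The parameter values (a range of 100 positions, gaps between 2 and 10) are made up for illustration.

```typescript
// Tally how often each gap a2 - a1 occurs under the clip-to-deltaMax rule.
// gapHistogram and its parameters are illustrative, not part of the original file.
function gapHistogram(
  low: number,
  high: number,
  deltaMin: number,
  deltaMax: number,
  trials: number,
  random: () => number = Math.random
): Map<number, number> {
  const space = high - low - deltaMin;
  const counts = new Map<number, number>();

  for (let t = 0; t < trials; t++) {
    let u1 = random();
    let u2 = random();
    if (u1 > u2) [u1, u2] = [u2, u1];

    const a1 = low + Math.floor(u1 * space);
    const unclipped = low + Math.floor(u2 * space) + deltaMin;
    const a2 = Math.min(unclipped, a1 + deltaMax); // same clipping as above

    const gap = a2 - a1;
    counts.set(gap, (counts.get(gap) ?? 0) + 1);
  }
  return counts;
}

const trials = 20000;
const counts = gapHistogram(0, 100, 2, 10, trials);
const atMax = (counts.get(10) ?? 0) / trials;
console.log(`share of windows at deltaMax: ${atMax.toFixed(2)}`);
```

With these made-up parameters, well over half of the windows land exactly on the maximum gap, because most unclipped draws are wider than 10. A sampler that was uniform over valid pairs would spread that mass across the allowed gaps instead.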

The weighted sampler that made me stop

weighted_sample_indexes assigns each candidate a random key

$$\text{key}_i = \frac{\log r_i}{q_i},$$

where $r_i$ is uniform on $(0,1)$ and $q_i$ is a positive weight. Every key is negative, and the Efraimidis-Spirakis method retains the k largest keys, the values closest to zero.¹ Dividing by a larger weight shrinks the magnitude of the negative logarithm, so that candidate's key tends to sit closer to zero.

priority-direction.ts

// Efraimidis-Spirakis: larger keys win.
const expected = [...keys]
  .sort((a, b) => b - a)
  .slice(0, k);

// My old solver kept the other end.
const old = [...keys]
  .sort((a, b) => a - b)
  .slice(0, k);

My code built a min-priority queue and popped the k smallest keys. With two candidates weighted 1 and 2 and k = 1, the published method selects the weight-2 candidate with probability $2/3$. Reversing the heap selects the weight-1 candidate with probability $2/3$.
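The $2/3$ figure follows from a standard fact about exponential random variables. Here is a short derivation for a single pick from two candidates with weights $q_1 = 1$ and $q_2 = 2$, writing $E_i$ for the negated key (my notation, not the paper's). Since $-\log r_i$ is exponential with rate 1, dividing by $q_i$ gives an exponential with rate $q_i$:

$$E_i = -\frac{\log r_i}{q_i} \sim \mathrm{Exp}(q_i), \qquad \Pr[\text{key}_2 > \text{key}_1] = \Pr[E_2 < E_1] = \frac{q_2}{q_1 + q_2} = \frac{2}{1 + 2} = \frac{2}{3}.$$

Keeping the smallest key instead selects whichever candidate has the larger $E_i$. By the same calculation, that is the weight-1 candidate with probability $2/3$.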

The calling code makes the contract murkier. It normalizes tour costs, calls them “probabilities of being removed,” passes them into the sampler, then keeps the returned offspring as the next population. Three ideas have collapsed into one variable: removal pressure, survival fitness, and sampling weight.

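Perhaps the intention was to favor cheap tours as survivors. The reversed heap does push in that direction. But it matches the cited algorithm applied to inverse costs only in the two-candidate case. With weights 1, 1, and 2 and k = 1, the reversed heap selects the weight-2 candidate with probability $1/6$, whereas sampling proportional to inverse weights would select it with probability $1/5$. The implementation, comment, and paper cannot all be describing the same distribution.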

What I would write now

I would start by naming the policy. If high-cost tours should be removed, use cost as a positive removal weight, keep the largest Efraimidis-Spirakis keys, and delete the sampled indices. If low-cost tours should survive, define an explicit positive fitness function, apply the same algorithm correctly, and keep the sample. In either case I would test frequencies on a two-candidate distribution before trusting a 1,665-city benchmark.
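The two-candidate frequency test could look like the sketch below. It is a TypeScript illustration, not the solver's code: weightedSampleIndexes here uses a full sort instead of a heap for brevity, and selectionFrequency, the trial count, and the injected random function are my assumptions.

```typescript
// Efraimidis-Spirakis weighted sampling without replacement (illustrative sketch).
// Keeps the k LARGEST keys, where key_i = log(r_i) / q_i.
function weightedSampleIndexes(
  weights: readonly number[],
  k: number,
  random: () => number = Math.random
): number[] {
  if (k < 0 || k > weights.length) {
    throw new RangeError("k must satisfy 0 <= k <= weights.length");
  }
  const keyed = weights.map((q, index) => {
    if (!(q > 0)) throw new RangeError("weights must be positive");
    const r = 1 - random(); // in (0, 1], so the logarithm stays finite
    return { index, key: Math.log(r) / q };
  });
  keyed.sort((a, b) => b.key - a.key); // descending: largest keys first
  return keyed.slice(0, k).map((entry) => entry.index);
}

// Empirical probability that `target` is the single selected index.
function selectionFrequency(
  weights: readonly number[],
  target: number,
  trials: number,
  random: () => number = Math.random
): number {
  let hits = 0;
  for (let t = 0; t < trials; t++) {
    if (weightedSampleIndexes(weights, 1, random)[0] === target) hits++;
  }
  return hits / trials;
}

// Weights 1 and 2: the weight-2 candidate (index 1) should win about 2/3 of the time.
const frequency = selectionFrequency([1, 2], 1, 30000);
console.log(`weight-2 candidate selected with frequency ${frequency.toFixed(3)}`);
```

If the comparator is flipped to ascending order, the same test should report a frequency near $1/3$ for the weight-2 candidate. That is the kind of signal a large benchmark run would not have shown me.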

This is a better artifact than the clean story I expected to tell. The file contains good sampling primitives, one deliberately non-uniform proposal, and one documented probability contract that the comparator violates. My younger self cited the source; my current self finally checked the direction.