Series
Multi-part posts that follow a single story from constraints to conclusions.
Series index
Track long-form arcs across the blog, grouped by topic and ordered by narrative flow.
Splitmix32: Thirteen Lines of Beautiful Randomness
I went looking for a seeded PRNG in TypeScript and found splitmix32, a 32-bit pseudorandom number generator so elegant it made me want to understand every single bit.
Barcode Scanners Are Keyboards With Extra Steps
From bars and spaces to keystrokes in your input field. How UPC-A barcodes encode data, how scanners decode them, and why your barcode reader acts like a keyboard.
Investigating `createRequire is not a function`
A debugging journey through webpack internals, WebAssembly loading, and ESM edge cases. Four dead ends before finding the fix.
Stochastic Greedy: Scaling Submodular Maximization to Massive Datasets
Stochastic Greedy replaces greedy's full scan with random subsampling, cutting the number of function evaluations from O(nk) to O(n ln(1/ε)) while losing only an additive ε in the approximation guarantee. This post covers the algorithm, its proof, and practical guidance.
The Greedy Algorithm for Submodular Maximization
The greedy algorithm achieves a (1 - 1/e) approximation for monotone submodular maximization under a cardinality constraint, provably the best any efficient algorithm can do. This post covers the algorithm, its proof, Lazy Greedy, and when greedy fails.
An Introduction to Submodularity
A practical introduction to submodular functions, the mathematical framework behind diminishing returns, covering set functions, marginal gains, and real-world applications from sensor placement to influence maximization.
Fixing "SDK Build Tools is Too Low" in React Native
A Gradle-based solution using afterEvaluate to permanently fix SDK Build Tools version mismatches across React Native Android dependencies.
Choosing Fast: From Softmax to FlashSampling
By the time an LLM chooses its next token, it feels like the hard part should be over. In practice, that final step can still dominate memory traffic.
Choosing Several Things, Not Just One
A raffle with one winner is easy. A raffle with ten winners is where people start staring at the code.
From Scores to Probabilities
A classifier gives you raw numbers, one per class. Softmax is the normalization step that turns relative preferences into a proper categorical distribution.
How a TSP Solver Decides What to Try Next
Inside a TSP metaheuristic, sampling means 'choose what to try next.' Different object, same deeper pattern: controlled choice under uncertainty.
How Language Models Choose the Next Token
A language model can be brilliant and still sound dull. The decoding strategy is where much of its behavior shows up.
How Much Can You Trust a Sample?
At some point in every conversation about sampling, someone asks the harder question: sure, but how wrong could we be?
How to Keep a Fair Sample of a Stream
A stream keeps arriving after you have run out of memory. Can you keep a fair sample of everything you have seen so far, without storing everything?
Matrix Calculus for Deep Learning, Without the Fog
A geometric and probabilistic refresher on the matrix calculus that actually matters in deep learning: gradients, Jacobians, affine maps, chain rules, and the softmax-cross-entropy shortcut.
Small Data Structures That Lie Just Enough
A Bloom filter can say 'definitely not' or 'maybe.' That asymmetry turns out to be very useful.
The Case for Hidden Variables
The visible world is often easier to model by introducing unobserved causes. That instinct is the doorway to latent variable models.
The Fairest Possible Choice
Most software randomness enters through the boring door. A line of code that picks an index. Is it actually fair?
What Transformers Really Do
Each token asks a simple question: which other tokens matter for me right now? That is much less mystical than the usual hype makes it sound.
When a Latent Space Becomes Sampleable
A code that is useful for reconstruction is not automatically a code you can sample from. That distinction is the whole reason VAEs exist.
When Uniform Isn't Enough
Uniform choice works only while the options are symmetric. Once one server, task, or token deserves more attention, randomness should favor it more often.
Why Random Rows in SQL Are Weirdly Expensive
Asking a database for random rows feels like it should be boring. It is not. Once data has a physical layout, randomness starts negotiating with the engine.
Why Randomness Helps Algorithms
A deterministic algorithm can be perfectly logical and still be bad at looking around. Randomness helps in three distinct ways.
Streaming Excel to a Database Without Losing a Single Row
The data pipeline behind a warehouse system: exporting snapshots to Excel, streaming bulk imports over gRPC with typed error codes, staging tables, atomic database swaps, and the SQL Server migration that made it all possible.
Running a Warehouse System on a 4 GB Server with No Docker
The architecture behind a production warehouse system: Koa.js over Express, gRPC for process isolation, Unix pipes for log forwarding, systemd instead of Docker, and a polyglot monorepo where a Rust/WASM binary is just another npm package.
Three Generations of a Warehouse Routing Engine
From a Node.js solver built on npm libraries to a 144 KB Rust/WASM binary with Jump Point Search, compile-time code generation, and a nearest-neighbor + 2-opt solver that lands within 1% of the ILP optimum.
Walking Is the Most Expensive Warehouse Operation
How one engineer built a custom route optimization system for a small Italian warehouse: the problem, the constraints, and why naive pick sequences waste half an operator's shift.
Working with Me
A user manual for collaborating with me: communication preferences, working style, and what I care about.