Thinking Machines & Deterministic AI

Research · September 10, 2025 · 3 min read
Determinism in LLM inference

Thinking Machines (www.thinkingmachines.ai) just published an excellent post on defeating nondeterminism in LLM inference. It tackles an issue many teams running LLMs in production have hit: even with temperature set to 0, the same prompt can come back with different outputs from run to run.

The reason is subtle. How a request gets batched depends on how many other requests hit the server at the same moment, and many GPU kernels change their reduction order as the batch size changes. Floating-point addition isn't associative, so those tiny numerical differences can cascade into different token choices.
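To see the numerical effect in isolation, here is a toy sketch (mine, not from the Thinking Machines post): summing the same float32 values with two different reduction orders, the way a kernel might when the batch size changes, gives results that differ in the last bits.

```python
import numpy as np

# The same float32 values, summed in two different orders.
rng = np.random.default_rng(0)
values = rng.standard_normal(10_000).astype(np.float32)

# Order 1: a plain left-to-right accumulation.
sequential = np.float32(0.0)
for v in values:
    sequential += v

# Order 2: NumPy's reduction, which sums blocks pairwise, roughly the way
# a GPU kernel might split the work differently as batch size changes.
reduced = values.sum(dtype=np.float32)

# Floating-point addition is not associative, so the two results
# usually differ in the last bits.
print(sequential, reduced, bool(sequential == reduced))
```

In a transformer forward pass, differences this small can change which token ends up with the highest logit, which is why the output can flip even at temperature 0.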

Their solution — batch-invariant kernels — makes inference deterministic. The same input now produces the same output, every time. That’s a huge step forward for anyone trying to build reproducible, reliable AI systems.
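If you want to check this property on your own stack, a minimal sanity test is to send the same prompt repeatedly and assert that the completions are byte-identical. The `generate` function below is a placeholder for whatever client you use; it is not an API from the Thinking Machines post.

```python
import hashlib

def generate(prompt: str) -> str:
    """Placeholder: call your own inference endpoint here with temperature 0."""
    raise NotImplementedError

def is_reproducible(prompt: str, runs: int = 5) -> bool:
    # Hash each completion so long outputs are cheap to compare and log.
    digests = {
        hashlib.sha256(generate(prompt).encode("utf-8")).hexdigest()
        for _ in range(runs)
    }
    return len(digests) == 1  # True only if every run returned identical text
```

With batch-invariant kernels this check should pass even while other requests are hitting the same server concurrently, which is exactly the regime where the nondeterminism used to show up.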

Why Determinism Matters

Deterministic inference unlocks real benefits:

  • Reproducibility — identical inputs, identical outputs, no surprises.
  • Debugging — easier to trace issues when outputs don’t wobble.
  • Benchmarking — accuracy scores stop drifting between runs.
  • Tuning — preference and correction data collected from the model stay consistent across runs.

In short: reproducibility is infrastructure. It gives teams confidence that what they measure today will hold tomorrow.

Determinism and Accuracy

But reproducibility is only part of the puzzle. Determinism means you’ll always get the same answer — but it doesn’t guarantee that answer is true.

That’s not a flaw in this approach; it’s just the nature of LLMs. They generate by matching patterns and probabilities, not by checking facts. A deterministic model can still be consistently wrong if the underlying weights or data are off.

Where Superficial Fits

This is where Superficial comes in.

Determinism makes models stable. Superficial makes them trustworthy. We take deterministic outputs and break them into atomic claims, then verify each claim against trusted sources. The result is a system that’s not only reproducible but also accurate, auditable, and compliant.
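As a rough illustration of that shape (the names and the naive matching below are hypothetical, not Superficial's actual pipeline), a claim-level verification layer looks roughly like this:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Claim:
    text: str               # one atomic, independently checkable statement
    verdict: str            # "supported", "contradicted", or "unverified"
    source: Optional[str]   # the trusted source that settled it, if any

def split_into_claims(answer: str) -> list[str]:
    # Hypothetical splitter: real systems use an LLM or parser to rewrite
    # an answer into standalone factual statements; sentence-splitting is
    # only a stand-in here.
    return [s.strip() for s in answer.split(".") if s.strip()]

def verify_claim(claim: str, sources: dict[str, str]) -> Claim:
    # Hypothetical verifier: a real one retrieves and compares against
    # trusted documents; a substring match is only a stand-in.
    for name, text in sources.items():
        if claim.lower() in text.lower():
            return Claim(claim, "supported", name)
    return Claim(claim, "unverified", None)

def verify_answer(answer: str, sources: dict[str, str]) -> list[Claim]:
    # Deterministic inference means the same answer, and therefore the
    # same set of claims, comes back every time, keeping the audit trail stable.
    return [verify_claim(c, sources) for c in split_into_claims(answer)]
```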

Put another way:

  • Thinking Machines solves consistency.
  • Superficial solves correctness.

Together, these developments push AI closer to being both reliable and deployable in high-stakes environments.

Looking Ahead

The work from Thinking Machines shows how much progress can come from careful engineering. Deterministic inference removes one of the hidden sources of instability in AI systems. Pair it with a verification layer like Superficial, and we can finally move from systems that are merely repeatable to systems that are reliable.

This is the direction AI needs to go.