Using AI to Talk to AI
I've lost count of how many times I've done this.
I ask Claude something. The response is close but not quite right. So I open ChatGPT and ask it to write a better prompt for Claude. Or I ask Claude to rewrite my prompt for Claude. Sometimes I go three or four iterations deep, one model coaching me on how to talk to another model, or the same model coaching me on how to talk to itself.
Eventually something clicks. The output lands exactly where I wanted it.
And every time, the same question comes back: why do I have to do this at all?
The Extra Work We've Normalized
This isn't a complaint about capability. These models know things. They can reason. They can write. The knowledge is there.
The problem is access.
Getting that knowledge out consistently requires a kind of ritual. Specific phrasing. Careful word choice. Sometimes a completely different sentence structure that means the same thing but somehow works better. The gap between what I'm trying to say and what the model needs to hear has become its own skill to manage.
I've watched people build entire workflows around this. Prompt libraries. Template systems. Chains of models talking to each other just to stabilize output. We've accepted that talking to AI requires translation, even when both sides speak the same language.
That acceptance is the problem.
Water on a Mountain
I think about this the same way I think about drainage on uneven terrain.
Imagine rain falling on a mountain. The water is the same everywhere. Gravity is the same everywhere. The destination, the valley below, is the same everywhere. And yet, the water does not flow uniformly.
Some streams form deep, reliable channels and reach the valley quickly. Others split, wander, pool, or evaporate. Small differences in the surface, a rock here, a shallow groove there, can cause large differences in where the water ends up.
The water isn't confused. The terrain is.
That distinction matters.
The Terrain Inside a Model
In a language model, your prompt is where the rain lands.
The model's internal structure, millions or billions of parameters, is the terrain. Because modern models are highly redundant, that terrain contains many overlapping paths representing similar ideas. Not identical. Similar.
A small change in phrasing can route execution through a different internal path. Some of those paths are well-conditioned and stable. Others are noisy, shallow, poorly aligned. When you reword a prompt and get a completely different answer, you're not teaching the model something new.
You're nudging the water onto a different slope.
This is what I've been doing every time I ask one model to help me talk to another. I'm not adding information. I'm searching for the phrasing that lands on stable ground.
Redundancy Without Alignment
Redundancy itself isn't the problem. It's inherent to how large models learn. Multiple internal representations overlap to cover the same concepts from different angles. That flexibility is part of what makes them powerful.
The problem arises when that redundancy is left unconstrained.
When many internal paths disagree slightly, the model's output becomes sensitive to surface form. Semantically equivalent prompts produce different behaviors because they activate different regions of the terrain. You can ask the same question three ways and get three different answers, not because the model doesn't know, but because you landed on three different slopes.
This is why prompting feels like guesswork. You're not discovering new information. You're searching for a stable channel. And when you can't find one, you ask another model to search for you.
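The "three different slopes" effect can be made measurable. The sketch below scores how much a set of responses to paraphrased prompts agree with each other, using token-set overlap as a crude proxy for semantic agreement. The sample responses are invented for illustration; in practice you would collect them from the model itself.

```python
# Quantifying prompt sensitivity: given responses to semantically equivalent
# prompts, score how much they agree. A low score means surface phrasing is
# steering the model onto different slopes. The responses below are made up.

from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two responses (0 = disjoint, 1 = identical)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def consistency(responses: list[str]) -> float:
    """Mean pairwise Jaccard similarity across all responses."""
    pairs = list(combinations(responses, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Three phrasings of the same question, three very different answers:
responses = [
    "Gradient clipping caps the norm of the gradient before the update.",
    "You rescale gradients whose norm exceeds a fixed threshold.",
    "Sorry, could you clarify what you mean by clipping?",
]
print(consistency(responses))  # low agreement: three slopes, not one channel
```

Real evaluations would use embedding similarity rather than token overlap, but the shape of the measurement is the same: same question, many phrasings, one agreement score.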
The Real Cost
The extra work isn't just annoying. It's expensive.
Every iteration burns tokens. Every prompt-writing prompt adds latency. Every failed attempt erodes trust. I've read about enterprise teams building entire abstraction layers just to get consistent outputs from systems that should already be consistent.
We've turned "talking to AI" into a technical discipline because the alternative, trusting that intent will translate, fails too often.
That's not a prompting problem. That's a variance problem. And variance, I understand.
Reshaping the Mountain
Variance reduction doesn't change what the model knows. It doesn't change the destination. It doesn't tell the water where to go.
It reshapes the terrain.
Unstable channels get removed or suppressed. Reliable channels get deepened. Semantically similar inputs begin to converge naturally toward the same behavior. You don't force the flow. You shape the mountain so the flow becomes inevitable.
In engineering, you don't fight physics. You design so physics works in your favor. The same principle applies here.
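One way to make "deepening reliable channels" concrete is a training-time consistency penalty: pull the model's output distributions for paraphrased prompts toward their common centroid. This is an illustrative regularizer, not SparseKD's actual objective, and the distributions below are made up for the demo.

```python
# A sketch of terrain reshaping as a loss term: the penalty is the mean KL
# divergence of each paraphrase's next-token distribution from the group
# centroid. Minimizing it makes equivalent prompts converge on one channel.

import math

def kl(p: list[float], q: list[float], eps: float = 1e-12) -> float:
    """KL divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def consistency_penalty(dists: list[list[float]]) -> float:
    """Mean KL of each paraphrase's distribution from the group centroid."""
    n = len(dists)
    centroid = [sum(col) / n for col in zip(*dists)]
    return sum(kl(d, centroid) for d in dists) / n

# Divergent next-token distributions for three equivalent prompts...
spread = [[0.70, 0.20, 0.10], [0.10, 0.80, 0.10], [0.20, 0.10, 0.70]]
# ...versus nearly aligned ones. Training against the penalty moves the
# model from the first regime toward the second.
tight = [[0.70, 0.20, 0.10], [0.68, 0.22, 0.10], [0.71, 0.19, 0.10]]
assert consistency_penalty(spread) > consistency_penalty(tight)
```

The penalty never names a correct answer. It only says: whatever you do, do the same thing for inputs that mean the same thing. That is the sense in which the flow is shaped rather than forced.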
Where SparseKD Fits
SparseKD approaches this problem indirectly. It doesn't attempt to fix prompting. It doesn't introduce new prompt rules or heuristics.
Instead, it reduces internal behavioral variance by collapsing redundant pathways and aligning the model's response distribution. The effect isn't that every prompt works perfectly. The effect is that equivalent prompts behave equivalently.
When variance is reduced, surface phrasing matters less. Intent matters more.
You stop needing one model to teach you how to talk to another.
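I won't reproduce SparseKD's actual objective here; the paper has the real form. But the shape of such a method can be sketched: a student model matches a teacher's soft targets (distillation) while a sparsity term pushes redundant weights toward zero (collapsing pathways). Every name in this snippet is hypothetical.

```python
# A guess at the general shape of a sparse-distillation objective:
# cross-entropy against the teacher's output distribution plus an L1
# penalty on student weights. Not SparseKD's published loss.

import math

def softmax(logits: list[float]) -> list[float]:
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def sparse_kd_loss(student_logits, teacher_probs, weights, lam=0.01):
    """Distillation term + lam-weighted L1 sparsity term."""
    student_probs = softmax(student_logits)
    kd = -sum(t * math.log(s + 1e-12)
              for t, s in zip(teacher_probs, student_probs))
    sparsity = lam * sum(abs(w) for w in weights)
    return kd + sparsity

teacher = [0.7, 0.2, 0.1]
aligned = [math.log(p) for p in teacher]  # student matching the teacher
drifted = [0.0, 0.0, 0.0]                 # student outputting uniform
w = [0.5, -0.3, 0.0, 0.1]
assert sparse_kd_loss(aligned, teacher, w) < sparse_kd_loss(drifted, teacher, w)
```

The two terms pull in the directions the essay describes: the distillation term aligns the response distribution, and the sparsity term removes the redundant pathways that make surface phrasing matter.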
Before and After
Before variance reduction, one phrasing yields a shallow answer. Another yields the correct one. A third triggers a refusal or tangent. The user learns to work around the model, testing different approaches until something clicks.
After variance reduction, all three converge to the same core response. Differences are stylistic, not semantic. The experience shifts from learning the model's quirks to simply expressing intent.
That's the difference between a tool you fight and a tool you use.
Prompting Was Never the Real Problem
Prompt engineering emerged as a workaround for internal instability. It helped users discover reliable channels in unstable terrain. The techniques work. They just target the wrong layer.
I don't want to get better at writing prompts. I want to stop needing to.
That's where SparseKD begins.
For the full mathematical framework, read the paper: Hallucinations Live in Variance