Sometime last year I was trying to understand how transformer attention actually works — not the hand-wavy explanation, but the math behind why scaled dot-product attention is scaled. I spent an hour on YouTube and came away more confused than when I started. Then I asked Claude to explain it, and I still didn’t get it. The problem wasn’t the model. The problem was that I didn’t know enough to ask the right question.
That experience pointed at something I keep running into. The web has basically all the knowledge I could ever need. Language models can synthesize it on demand. And yet learning still bottlenecks in the same place it always has: knowing what to ask. Not whether answers exist, but whether you can formulate a question precise enough to get the answer you actually need.
What makes this weird is that the question-formulation problem used to be hidden by good teachers. A good teacher’s actual value isn’t the information they carry — it’s their ability to diagnose what you don’t understand from the way you talk about it, and then address the gap you didn’t know to name. I’ve had professors who could tell from one sentence that I was confused about notation versus concept, and redirect accordingly. That’s not information transfer. That’s interpretive work on the question I failed to ask.
Language models are getting surprisingly good at this too, but only if you give them something to work with. The limiting factor shifts to you: can you describe your confusion in enough detail that the system can respond to where you actually are, not where you think you are?
I don’t think this makes good questions more important in some abstract sense. I think it makes them important in a concrete, immediately practical way — because the gap between a vague question and a precise one now has measurable consequences. With a vague question, you get a generic answer you already knew. With a precise one, you get exactly what you needed.
The skill isn’t really about asking questions. It’s about being honest about what you don’t understand. These feel related but they’re different. Asking questions is a conversational behavior. Being honest about confusion requires you to first notice that you’re confused, which is harder than it sounds. Most people feel a general fog and stop there. The ones who learn faster tend to be the ones who can locate exactly where the fog is — “I understand why attention weights are normalized, but I don’t understand what query/key/value are actually doing geometrically” — and then ask about that specific thing.
I’m not sure whether this skill is teachable in the traditional sense. You kind of have to develop it through failure — through realizing that the answer you got was useless because your question was bad. But that feedback loop is faster now than it’s ever been, which might matter. If every bad question is answered in seconds, you get more opportunities to notice the pattern.
What this probably changes about education is harder to say. I don’t think it means less school — school does things that don’t have much to do with information transfer, and those things still matter. But I think it shifts what the valuable part of school is. The most useful thing a teacher can do now might not be explaining things, but modeling how to ask. Showing what it looks like when someone notices they’re confused and converts that into a question precise enough to resolve.
That’s something I’m still trying to get better at myself. The attention mechanism thing: I eventually got there. But it took me asking six different variations of the same question before I understood why I kept getting answers that didn’t help me.