The child in the park
Watch a child learning to throw a ball. They try. They miss. Their face flushes — that unmistakable look of embarrassment. And then something remarkable happens: they focus. Not on what they already know. Not on the easy throws. They concentrate almost entirely on the throw that went wrong.
A few tries later, they've got it. And once it clicks — once the embarrassment fades — they move on. They don't keep practicing what they've already mastered. They let it go.
That observation became the hypothesis: what if a neural network were trained the same way? Standard training updates every weight on every image, epoch after epoch — most of that compute is redundant. The model already knows what a truck looks like. Why keep spending energy there?
The embarrassment signal
SEL gives each class its own "embarrassment score" — a temperature-scaled cross-entropy loss computed per class at every training step. High embarrassment means the model is genuinely confused. Low embarrassment means it has learned. The compute budget follows the embarrassment.
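As a minimal sketch of what such a signal could look like, the snippet below computes a per-class score as the mean temperature-scaled cross-entropy over the examples of each class in a batch. The function name, the temperature value, and the mean reduction are illustrative assumptions, not SEL's actual implementation.

```python
import numpy as np

def embarrassment_scores(logits, labels, num_classes, temperature=2.0):
    """Hypothetical per-class 'embarrassment' signal: mean
    temperature-scaled cross-entropy over each class's examples.
    High score = the model is still confused about that class."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)               # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(labels)), labels]   # per-example loss
    scores = np.zeros(num_classes)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            scores[c] = nll[mask].mean()
    return scores

# toy batch: the model is confident on class 0, confused on class 1
logits = np.array([[4.0, 0.0, 0.0],
                   [0.1, 0.2, 0.0]])
labels = np.array([0, 1])
scores = embarrassment_scores(logits, labels, num_classes=3)
# class 1's score exceeds class 0's, so a budget that follows
# embarrassment would direct more compute toward class 1
```

A scheduler could then sample training examples in proportion to these scores, so well-learned classes receive little compute while confused ones dominate the batch.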
One month, many failures
Three major versions. The first used a uniform curriculum with no sparsity. The second added the embarrassment signal but not the staged pools. This is version 3 — the first to combine all three mechanisms (sparsity, the embarrassment signal, and staged pools) and produce consistent results. Built on a single T4 GPU, one iteration at a time.