The child in the park
Watch a child learning to throw a ball. They try. They miss. Their face flushes — that unmistakable look of embarrassment. And then something remarkable happens: they focus. Not on what they already know. Not on the easy throws. They concentrate almost entirely on the throw that went wrong.
A few tries later, they've got it. And once it clicks — once the embarrassment fades — they move on. They don't keep practicing what they've already mastered. They let it go.
That observation became the hypothesis: what if a neural network were trained the same way? Standard training updates every weight on every image, epoch after epoch — most of that compute is redundant. The model already knows what a truck looks like. Why keep spending energy there?
The embarrassment signal
SEL gives each class its own "embarrassment score" — a temperature-scaled cross-entropy loss computed per class at every training step. High embarrassment means the model is genuinely confused. Low embarrassment means it has learned. The compute budget follows the embarrassment.
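As a minimal sketch of what such a signal could look like, the snippet below computes a per-class score as the mean temperature-scaled cross-entropy over the examples of each class in a batch. The function name, the temperature value, and the mean reduction are illustrative assumptions, not SEL's actual implementation.

```python
import numpy as np

def embarrassment_scores(logits, labels, num_classes, temperature=2.0):
    """Hypothetical per-class 'embarrassment' signal: mean
    temperature-scaled cross-entropy over each class's examples.
    High score = the model is still confused about that class."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)               # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(labels)), labels]   # per-example loss
    scores = np.zeros(num_classes)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            scores[c] = nll[mask].mean()
    return scores

# toy batch: the model is confident on class 0, confused on class 1
logits = np.array([[4.0, 0.0, 0.0],
                   [0.1, 0.2, 0.0]])
labels = np.array([0, 1])
scores = embarrassment_scores(logits, labels, num_classes=3)
# class 1's score exceeds class 0's, so a budget that follows
# embarrassment would direct more compute toward class 1
```

A scheduler could then sample training examples in proportion to these scores, so well-learned classes receive little compute while confused ones dominate the batch.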
One month, many failures
Three major versions. The first used a uniform curriculum with no sparsity. The second added the embarrassment signal but not the staged pools. This is version 3 — the first to combine all three mechanisms (sparsity, the embarrassment signal, and staged pools) and produce consistent results. Built on a single T4 GPU, one iteration at a time.