Human-In-The-Loop models¶
Human-centered reinforcement learning¶
Human-centered reinforcement learning adds people to the training loop, like having backseat drivers for algorithms. The AI tries something, a human gives feedback (usually in the form of exasperated sighs), and the system slowly, painfully improves. It’s how we train both robots and interns.
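The loop itself is almost embarrassingly simple. Below is a minimal sketch, assuming a bandit-style agent that proposes canned responses and a human who types ratings at a prompt; the candidate responses, the -1 to +1 reward scale, and the exploration rate are all illustrative stand-ins, not anyone’s production system.

```python
import random

# Minimal human-in-the-loop sketch: the agent proposes, the human rates,
# and the agent's value estimates drift toward whatever the human rewards.
candidates = ["formal reply", "casual reply", "sarcastic reply"]
scores = {c: 0.0 for c in candidates}  # running value estimate per candidate
counts = {c: 0 for c in candidates}

def propose(epsilon: float = 0.2) -> str:
    """Epsilon-greedy: usually exploit the best-rated candidate, sometimes explore."""
    if random.random() < epsilon:
        return random.choice(candidates)
    return max(candidates, key=lambda c: scores[c])

def human_feedback(response: str) -> float:
    """Stand-in for the human in the loop: rate from -1 (sigh) to +1 (approval)."""
    return float(input(f"Rate '{response}' [-1, 0, +1]: "))

for _ in range(10):
    response = propose()
    reward = human_feedback(response)
    counts[response] += 1
    # Incremental mean: each rating nudges the estimate for that response.
    scores[response] += (reward - scores[response]) / counts[response]

print("Learned preferences:", scores)
```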
Real-life¶
ChatGPT uses human feedback to become less terrible over time. Thousands of underpaid clickworkers rate its responses, teaching it that “the Earth is flat” is a bad answer unless the user is a conspiracy theorist.
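In RLHF-style pipelines those ratings typically arrive as preference pairs: the rater picks the better of two responses, and a small reward model learns to score the preferred one higher. Here is a toy sketch of that step using the Bradley-Terry formulation, with fabricated feature vectors standing in for responses; this illustrates the idea, not OpenAI’s actual training code.

```python
import numpy as np

rng = np.random.default_rng(1)
w = np.zeros(4)  # toy reward model: a linear scorer over 4 made-up features

def reward(features: np.ndarray) -> float:
    return float(w @ features)

# Each rating is a (preferred, rejected) pair of response feature vectors;
# the +1.0 shift just makes the synthetic "preferred" responses learnable.
ratings = [(rng.normal(size=4) + 1.0, rng.normal(size=4)) for _ in range(200)]

lr = 0.1
for preferred, rejected in ratings:
    # Bradley-Terry: p(preferred beats rejected) = sigmoid(r_pref - r_rej).
    p = 1 / (1 + np.exp(reward(rejected) - reward(preferred)))
    # Gradient ascent on the log-likelihood of the human's choice.
    w += lr * (1 - p) * (preferred - rejected)

print("Reward model weights:", w)
```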
Security & privacy risks (moderate)¶
Human feedback channels can be exploited - imagine trolls deliberately teaching an AI assistant to be racist. There are also privacy concerns when humans review sensitive interactions.
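One partial defense is to stop trusting any single rater: require several independent ratings per example and aggregate robustly, so one troll can’t single-handedly flip a label. A hedged sketch follows; the rater count and the median rule are illustrative choices, not a standard.

```python
from statistics import median

def robust_label(ratings: list[float], min_raters: int = 3) -> float | None:
    """Aggregate one example's ratings; refuse to train on thin evidence."""
    if len(ratings) < min_raters:
        return None          # too few independent raters: discard the example
    return median(ratings)   # the median survives a single adversarial outlier

assert robust_label([1.0, 1.0, -1.0]) == 1.0  # one troll, outvoted
assert robust_label([1.0]) is None            # lone rating, rejected
```

The trade-off is blunt: tripling the raters per example triples the labeling bill, so robustness against trolls is paid for in clickworker hours.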
Active learning¶
Active learning is the most needy of all approaches, constantly interrupting humans to ask “is this right?” like an insecure intern. The algorithm identifies the data points it’s most uncertain about and demands labels, theoretically becoming smarter with less effort.
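The canonical version is uncertainty sampling: train on a small labeled pool, score the unlabeled pool, and send the examples the model is least sure about to a human. Below is a minimal sketch using scikit-learn on synthetic data; the model choice, pool sizes, and batch of 10 are all arbitrary.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(20, 5))
y_labeled = np.array([0, 1] * 10)    # synthetic labels, both classes present
X_pool = rng.normal(size=(200, 5))   # the unlabeled pool

model = LogisticRegression().fit(X_labeled, y_labeled)
proba = model.predict_proba(X_pool)

# Least-confidence score: 1 - max class probability. High score = "ask a human".
uncertainty = 1 - proba.max(axis=1)
ask = np.argsort(uncertainty)[-10:]  # the 10 neediest examples
print("Please label pool items:", ask)
```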
Real-life¶
Medical imaging systems use this to reduce the workload for radiologists. The AI flags the 10% of scans it’s unsure about for human review, while confidently misdiagnosing the other 90%.
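The triage rule behind that 10% figure is just a confidence cutoff, sketched below; the review fraction and the assumption that the model emits a per-scan confidence are both illustrative.

```python
def triage(confidences: list[float], review_fraction: float = 0.10):
    """Split scans into (auto_accept, human_review) by model confidence."""
    ranked = sorted(range(len(confidences)), key=lambda i: confidences[i])
    n_review = max(1, int(len(confidences) * review_fraction))
    return ranked[n_review:], ranked[:n_review]  # least confident go to humans

accept, review = triage([0.99, 0.51, 0.97, 0.62, 0.95, 0.93, 0.88, 0.91, 0.90, 0.85])
print("Human review queue:", review)  # -> [1], the 0.51-confidence scan
```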
Security & privacy risks (moderate)¶
The queries themselves may reveal sensitive information - like an AI asking a doctor to label particularly graphic medical images in a shared workspace. There’s also the risk of the system developing blind spots in areas it never thought to ask about.