Reinforcement Learning from Human Feedback
RLHF is a technique that uses reinforcement learning to optimize an AI agent's policy based on human feedback. It has been successfully applied in areas ranging from natural language processing to video game bots, allowing agents to learn from human preferences and generate more natural, human-preferred behavior.
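To make the idea concrete, here is a minimal sketch of the reward-modeling stage of RLHF, assuming PyTorch. A small network scores responses, and a Bradley-Terry pairwise loss pushes the score of the human-preferred response above the rejected one. All names here (`RewardModel`, `preference_loss`, the embedding size) are illustrative, and the random tensors stand in for real (prompt, response) embeddings labeled by annotators; in a full pipeline the learned reward would then drive a policy-optimization step such as PPO.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Maps a fixed-size response embedding to a scalar reward."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(embed_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.score(x).squeeze(-1)

def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry pairwise loss: maximize the probability that the
    # human-preferred response receives the higher reward.
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy training loop on random embeddings standing in for human-labeled pairs.
model = RewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(100):
    chosen = torch.randn(32, 128)    # embeddings of preferred responses
    rejected = torch.randn(32, 128)  # embeddings of rejected responses
    loss = preference_loss(model(chosen), model(rejected))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The key design choice is that the model never sees absolute reward labels, only comparisons: humans are far more consistent at ranking two responses than at assigning them numeric scores, which is what makes preference data a practical training signal.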