Reinforcement Learning from Human Feedback
RLHF is a technique that uses reinforcement learning to optimize an AI agent's policy based on human feedback. It has been successfully applied in areas ranging from natural language processing to video game bots, allowing agents to learn from human preferences and generate more natural, human-preferred behavior.
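To make the idea concrete, here is a minimal sketch of the reward-modeling stage of RLHF, assuming PyTorch. A small network scores responses, and a Bradley-Terry pairwise loss pushes the score of the human-preferred response above the rejected one. All names here (`RewardModel`, `preference_loss`, the embedding size) are illustrative, and the random tensors stand in for real (prompt, response) embeddings labeled by annotators; in a full pipeline the learned reward would then drive a policy-optimization step such as PPO.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Maps a fixed-size response embedding to a scalar reward."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(embed_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.score(x).squeeze(-1)

def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry pairwise loss: maximize the probability that the
    # human-preferred response receives the higher reward.
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy training loop on random embeddings standing in for human-labeled pairs.
model = RewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(100):
    chosen = torch.randn(32, 128)    # embeddings of preferred responses
    rejected = torch.randn(32, 128)  # embeddings of rejected responses
    loss = preference_loss(model(chosen), model(rejected))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The key design choice is that the model never sees absolute reward labels, only comparisons: humans are far more consistent at ranking two responses than at assigning them numeric scores, which is what makes preference data a practical training signal.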