Reinforcement Learning from Human Feedback

Until a few years ago, the most advanced language models we had were GPT-2 and BERT. GPT-2 was the most advanced autoregressive, decoder-based model suitable for text generation. T5 was state of the art for other tasks such as translation and summarization. These models have been a great starting point but were …

