An approach to better align language models with human preferences:
We've used reinforcement learning from human feedback to train language models for summarization. The resulting models produce better summaries than 10x larger models trained only with supervised learning: openai.com/blog/learning-to-…

Sep 4, 2020 · 4:28 PM UTC

4
10
81
Replying to @gdb
Thanks for sharing this, @gdb. It’s encouraging to see OpenAI making continual enhancements.
Replying to @gdb
Could this be integrated into the api anytime soon?
Replying to @gdb
Hello Greg, I have already sent multiple emails regarding GPT-3 API access. I was told that I would get the API access but I haven't got it yet. Please look into it. 👨‍💻👨‍💻
1
2