One common AI safety concern is that hand-coding a full set of values, with all their nuances and trade-offs, into an AI system may be intractable. So we want to develop techniques for machines to learn from humans in natural language. A step in that direction:
We've fine-tuned GPT-2 with human feedback on tasks such as summarizing articles, so that its outputs match the preferences of human labelers (if not always our own). We hope this brings safety methods closer to machines learning values by talking with humans. openai.com/blog/fine-tuning-…
Sep 19, 2019 · 4:17 PM UTC
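For readers wondering what "fine-tuning with human feedback" looks like in practice, here is a minimal sketch of the core idea, not OpenAI's released code: fit a reward model on pairwise comparisons chosen by labelers, then use its score as the fine-tuning signal for the language model. All names, dimensions, and the toy bag-of-embeddings encoder below are illustrative assumptions; the actual work uses GPT-2 features and PPO.

```python
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Scores a candidate summary (toy pooled-embedding stand-in for GPT-2 features)."""
    def __init__(self, vocab_size=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.score = nn.Linear(dim, 1)

    def forward(self, tokens):                    # tokens: (batch, seq_len)
        feats = self.embed(tokens).mean(dim=1)    # crude pooled representation
        return self.score(feats).squeeze(-1)      # (batch,) scalar reward

reward_model = RewardModel()
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

# Toy preference data: for each pair, labelers preferred `chosen` over `rejected`.
chosen   = torch.randint(0, 1000, (32, 20))
rejected = torch.randint(0, 1000, (32, 20))

for step in range(100):
    r_chosen, r_rejected = reward_model(chosen), reward_model(rejected)
    # Logistic (Bradley-Terry) loss: push the preferred summary's reward higher.
    loss = -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Once trained, the reward model's score on new samples would drive an RL fine-tuning loop (e.g. PPO) on the language model, with fresh human comparisons collected as the policy improves.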