The biggest challenge with language benchmarks is how quickly they get obsoleted.

Jun 3, 2022 · 6:16 PM UTC

Replying to @gdb
Shows the rapid pace of development in the field as well.
Replying to @gdb
Soon we'll be relying on AI to determine how good a model is compared to a prior one. 😅