“LLMs are just trained to predict the next token”

  • This is only true of the pretraining objective, not of post-training
  • Even in pretraining, the point is that building sophisticated internal representations is the best means of predicting the next token (see the objectives sketched below)
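
To make the pretraining vs. post-training distinction concrete, here is a minimal sketch of the two objectives, assuming the common KL-regularized RLHF form for post-training (other recipes such as supervised fine-tuning or DPO differ in detail):

```latex
% Pretraining: next-token prediction, i.e. cross-entropy over the corpus
\mathcal{L}_{\text{pre}}(\theta) = -\sum_{t} \log p_\theta\!\left(x_t \mid x_{<t}\right)

% Post-training (one common recipe, KL-regularized RLHF): maximize a learned
% reward r(x, y) over whole responses while staying close to the reference model
\max_{\theta} \; \mathbb{E}_{y \sim p_\theta(\cdot \mid x)}\!\left[ r(x, y) \right]
  - \beta \, D_{\mathrm{KL}}\!\left( p_\theta(\cdot \mid x) \,\|\, p_{\text{ref}}(\cdot \mid x) \right)
```

Only the first objective is literally “predict the next token”; the second scores whole responses against a reward, which is why the claim does not describe post-trained models.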

“LLMs do very badly on reasoning tasks”

  • Largely true of models circa 2021; not true of today's models