Discussion about this post

Mark A:

Great piece.

This:

“It is often implicitly assumed that these reflect the model’s ‘thoughts’ or inner reasoning process, and that ideas exposed in this CoT [Chain of Thought] trace may be indicative of what the model prefers or believes.” However, “The relationship between CoT and the reasoning process is contested.” “CoT traces may only partially reflect the reasoning process that determines model outputs.”

Is the most interesting area to me. I have a modestly better-than-layman's understanding of LLMs, but I know that the study of thought and reasoning is a deeply contested and challenging area in both psychology and philosophy. I think most people intuitively understand "chain of thought" as "chain of reasoning," though the two can come apart: reasoning has to do with justification and validity, whereas thought is more general and allows for some creative associating. But the idea that the model is 'reasoning' and then reporting on that reasoning, rather than just mechanistically producing text output via the relevant weights given its training, seems like more anthropomorphizing to me.

If these models don't believe or pretend, I don't see how they can reason, unless by "reason" we simply mean good old-fashioned AI, where models essentially just perform logical functions like 'if, then,' etc. I could be wrong, and that's why I'd like to read more about what exactly makes the output of CoT models different from that of regular models.
