Large Language Models (LLMs) have shown remarkable capabilities in tasks like language understanding and reasoning, marking a paradigm shift in how we interact with AI systems. To improve LLM performance, researchers commonly use chain-of-thought prompting, in which the model works through intermediate reasoning steps before giving its response. Although this technique resembles how humans solve problems, it does not fully exploit the computational capabilities of LLMs, so the authors of this paper explore an alternative reasoning approach.
Chain-of-thought (CoT) methods have shown strong results, but their downside is that they delay the generation of the desired final answer. The researchers introduce a new approach called implicit chain-of-thought that, as the name suggests, makes the steps involved in CoT reasoning implicit so that the model produces the final answer directly.
In explicit CoT reasoning, the LLM is trained to produce the intermediate steps before the final output. In implicit CoT reasoning, by contrast, the model sees the intermediate steps only during training and not at test time. It processes these steps in its internal (hidden) states and learns to internalize the reasoning, so it no longer needs to write the steps out.
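To make the contrast concrete, here is a minimal sketch of what the two kinds of output could look like for a multiplication problem. The text format and the function names (`partial_products`, `explicit_cot_target`, `implicit_cot_target`) are assumptions made for illustration and are not taken from the paper; the sketch also assumes non-negative integers.

```python
def partial_products(a, b):
    """Schoolbook partial products of a * b, one per digit of b.

    The least significant digit of b comes first. Assumes b >= 0.
    """
    steps = []
    for position, digit in enumerate(reversed(str(b))):
        steps.append(a * int(digit) * 10 ** position)
    return steps


def explicit_cot_target(a, b):
    """Hypothetical explicit-CoT output: intermediate steps, then the answer."""
    chain = " + ".join(str(step) for step in partial_products(a, b))
    return f"{chain} = {a * b}"


def implicit_cot_target(a, b):
    """Hypothetical implicit-CoT output: the answer only."""
    return str(a * b)


print(explicit_cot_target(12, 34))  # 48 + 360 = 408
print(implicit_cot_target(12, 34))  # 408
```

In this toy format, the explicit output grows with the number of digits in the second operand, while the implicit output stays as short as the answer itself.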
To achieve implicit CoT reasoning, the researchers used a ‘teacher training’ method instead of the traditional ‘teacher forcing’ method. Their strategy has three steps. First, they train a student model to read the hidden states of a teacher model (one trained to reason explicitly) and use some of them to produce the final answer. Second, they apply knowledge distillation, a process that transfers knowledge from a teacher model to a student model (typically from a larger model to a smaller one): they train an emulator to predict the teacher’s hidden states from the input alone. Importantly, this emulation happens vertically, across the model’s layers, rather than across a sequence of generated tokens, which eliminates the need for explicit reasoning steps.
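To illustrate what it can mean for a student to use only some of the teacher's hidden states, the following sketch assumes one particular selection rule: take one state per layer, moving to later intermediate steps as the layer index increases. The function name `select_states_per_layer`, the array layout, and the evenly spaced rule are all assumptions for illustration; the source does not specify how the states are chosen.

```python
import numpy as np


def select_states_per_layer(hidden):
    """Pick one hidden state per layer from a (layers, steps, dim) array.

    Assumed rule: layer 0 reads the first intermediate step, the last layer
    reads the last step, and layers in between are spaced evenly.
    """
    num_layers, num_steps, _ = hidden.shape
    step_index = np.round(np.linspace(0, num_steps - 1, num_layers)).astype(int)
    return hidden[np.arange(num_layers), step_index]


# Toy "teacher" activations: 4 layers, 7 intermediate steps, 2 dimensions.
hidden = np.arange(4 * 7 * 2).reshape(4, 7, 2)
selected = select_states_per_layer(hidden)
print(selected.shape)  # (4, 2)
```

The result has one vector per layer, which is the kind of "vertical" target an emulator could be trained to predict without generating any intermediate tokens.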
The final step couples the emulator with the student, so that the student produces the final output from the emulator’s predictions of the teacher’s hidden states rather than from the teacher itself. The combined system is then optimized end-to-end, which allows the student model to develop its own reasoning methods, and these may differ from the teacher’s.
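The overall flow (student reads teacher states, emulator predicts those states, the two are coupled) can be sketched with a deliberately tiny NumPy toy. Everything here is an assumption made for illustration: the "teacher" is a fixed random stack of tanh layers, the student and emulator are linear maps fitted by least squares, and the end-to-end fine-tuning stage is omitted. It is not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
num_layers, dim, n = 3, 8, 500

# Hypothetical stand-in for a trained teacher: a fixed stack of tanh layers.
teacher_weights = [rng.normal(scale=0.5, size=(dim, dim)) for _ in range(num_layers)]
answer_head = rng.normal(size=dim)


def teacher_states(x):
    """Return one hidden state per layer, shape (num_layers, batch, dim)."""
    states, h = [], x
    for w in teacher_weights:
        h = np.tanh(h @ w)
        states.append(h)
    return np.stack(states)


def fit_linear(features, targets):
    """Least-squares weights mapping features to targets."""
    weights, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return weights


x = rng.normal(size=(n, dim))
states = teacher_states(x)
y = states[-1] @ answer_head  # toy "final answer"

# Step 1: the student reads the teacher's hidden states and predicts the answer.
student_in = np.concatenate(list(states), axis=1)
student_w = fit_linear(student_in, y)

# Step 2: the emulator predicts the teacher's states from the input alone,
# one layer at a time ("vertically"), feeding each emulated state upward.
emulator_ws, h = [], x
for layer in range(num_layers):
    w = fit_linear(h, states[layer])
    emulator_ws.append(w)
    h = h @ w


# Step 3: couple emulator and student; the teacher is not needed at inference.
def coupled_predict(x_new):
    emulated, h = [], x_new
    for w in emulator_ws:
        h = h @ w
        emulated.append(h)
    return np.concatenate(emulated, axis=1) @ student_w


error = np.mean((coupled_predict(x) - y) ** 2)
print(f"coupled mean squared error: {error:.4f}")
```

Because the toy emulator is linear while the toy teacher is not, the emulated states are only approximate. In the approach described above, that gap is what the end-to-end optimization of the combined system is meant to address.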
The researchers conducted experiments on two tasks: multi-digit multiplication and grade school math problems. The results showed that their method enabled the models to solve tasks they previously could not solve without explicit CoT. They observed that the GPT-2 Small model, which achieved 97% accuracy on 4-digit multiplication under implicit CoT, performed poorly on 5-digit multiplication. This suggests that the effectiveness of the technique depends on the model having enough layers to carry out the required intermediate calculations. They also observed that implicit CoT offers faster inference, especially for tasks that require many intermediate steps.
The technique has a few major limitations: it lacks transparency, it depends heavily on the teacher’s thought processes, and it still lags behind explicit CoT in performance. However, this work is only an initial step toward building implicit CoT, and the researchers believe that many refinements could be built on top of it to optimize the process further and strengthen LLMs’ ability to reason.