Improve Reasoning in ChatGPT Through Diversity of Thought (DoT)

ChatGPT and other large language models have shown impressive capabilities, but complex reasoning remains a weak spot. A recent study describes an effective technique for improving reasoning: using diverse prompts.

Researchers from Microsoft and Stanford tested methods to elicit more diverse and structured thinking from models like GPT-3.5 and GPT-4. The key idea is to prompt the model itself to suggest various approaches and personas for solving reasoning problems.

For example, when faced with a math word problem, GPT-4 can propose trying direct calculation, drawing a diagram, working backwards, and more. These diverse strategies are then incorporated into multiple rephrased prompts.

The researchers introduced two techniques building on this idea:

DIV-SE: Execute each diverse prompt separately and combine the responses.
IDIV-SE: Combine multiple approaches into a single prompt.
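
As a rough illustration of the difference between the two techniques, here is a minimal Python sketch. The `ask_model` callable is a hypothetical stand-in for whatever LLM client you use, the prompt wording is our own, and majority voting is an assumption about how the separate responses get combined. None of this is taken from the paper's code.

```python
from collections import Counter
from typing import Callable, List


def div_se(question: str, approaches: List[str],
           ask_model: Callable[[str], str]) -> str:
    """DIV-SE style: one model call per approach, then combine the answers.

    `ask_model` is a hypothetical function that sends a prompt to a model
    and returns its final answer as a string.
    """
    answers = [
        ask_model(f"{question}\nSolve this using the following approach: {a}")
        for a in approaches
    ]
    # Assumption: combine by simple majority vote over the returned answers.
    return Counter(answers).most_common(1)[0][0]


def idiv_se_prompt(question: str, approaches: List[str]) -> str:
    """IDIV-SE style: fold every approach into a single prompt (one call)."""
    listed = "\n".join(
        f"Approach {i}: {a}" for i, a in enumerate(approaches, start=1)
    )
    return (
        f"{question}\n"
        "Solve this separately with each approach below, "
        "then give one final answer.\n"
        f"{listed}"
    )
```

The trade-off this sketch is meant to show: `div_se` costs one call per approach, while `idiv_se_prompt` produces a single, longer prompt that is sent once.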

In this article, we concentrate on IDIV-SE (In-call DIVerse reasoning path Self-Ensemble).

Image Source: Naik, R., Chandrasekaran, V., Yuksekgonul, M., Palangi, H., & Nushi, B. (2023). Diversity of thought improves reasoning abilities of large language models. arXiv preprint arXiv:2310.07088.

Across benchmarks in math, planning, and commonsense reasoning, both DIV-SE and IDIV-SE improved accuracy and cost-effectiveness substantially compared to prior prompting strategies.

On a difficult 4/5 blocks world planning challenge, DIV-SE boosted GPT-4's accuracy by 29.6 percentage points. For grade school math problems, it increased GPT-3.5's performance by over 10 percentage points.

Unlike other methods that modify the decoding process, diverse prompting works by eliciting diversity at the input level. This makes it broadly applicable even to black-box models.

In Summary:

Prompting the model for diverse problem-solving approaches is an effective strategy to improve reasoning.
Combining these diverse prompts boosts accuracy and cost-effectiveness.
DIV-SE and IDIV-SE outperformed existing prompting techniques substantially.
The methods provide gains without needing access to model internals.
Diversity at the prompt level complements diversity during decoding.
Planning, math and commonsense reasoning saw large improvements.
Eliciting diversity directly from the model itself was critical.

The striking gains show the power of diversity for reasoning. While not flawless, diverse prompting pushes ChatGPT notably forward on its journey toward robust reasoning.

Key Takeaways for Readers:

Get GPT's feedback on potential approaches and personas to solve the reasoning problem.
Create demonstrations of solving the problem using different approaches.
Prompt GPT to solve the problem taking on each persona and using the approaches.
Aggregate the solutions from different personas and approaches.
Diversity of approaches and "thinkers" is key to improving reasoning.
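
The aggregation step in the takeaways above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the helper names are hypothetical, answers are compared after trimming whitespace and lowercasing, and the winner is picked by majority vote. The source does not prescribe a specific aggregation rule.

```python
from collections import Counter
from itertools import product
from typing import List, Optional, Tuple


def persona_approach_grid(personas: List[str],
                          approaches: List[str]) -> List[Tuple[str, str]]:
    """Every (persona, approach) pair the model should be asked to try."""
    return list(product(personas, approaches))


def aggregate(answers: List[str]) -> Tuple[Optional[str], float]:
    """Return the most common answer and the share of votes it received.

    Assumption: answers match if they are equal after stripping whitespace
    and lowercasing. Blank answers are ignored.
    """
    cleaned = [a.strip().lower() for a in answers if a.strip()]
    if not cleaned:
        return None, 0.0
    winner, votes = Counter(cleaned).most_common(1)[0]
    return winner, votes / len(cleaned)
```

A low vote share is a useful signal in practice: if the personas and approaches disagree widely, the final answer deserves a second look.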

Here’s a prompt template that we at The Prompt Index have put together, which embodies the Diversity of Thought (DoT) approach:

IDIV-SE (Diverse Reasoning)

[State reasoning problem here for example: In the following question, a number series is given with one term missing. Choose the correct alternative that will follow the same pattern and fill in the blank spaces. 1, 2, 3, 5, x, 13]

To begin, please suggest 3 distinct approaches I could use to accurately solve the above problem:

Approach 1:
Approach 2:
Approach 3:

Now please provide 3 short demonstrations, each solving the original problem using one of the approaches you suggested above:

Demonstration 1 (Approach 1):

Demonstration 2 (Approach 2):

Demonstration 3 (Approach 3):

Great, let's put it all together. Please now take on the role of expert 1 (the persona you feel is most closely aligned to the issue) and solve the original problem using Approaches 1-3.

Now take on the persona of expert 2 (the persona you feel is the next most closely aligned to the issue) and solve the original problem again using Approaches 1-3.

Finally, take on the persona of expert 3 (the persona you feel is the next most closely aligned after that) and solve the original problem a third time using Approaches 1-3.

Please synthesize your responses from the 3 expert personas above and provide your final recommended solution.

You can find this prompt at The Prompt Index.
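
If you want to reuse the template across many problems, it can be generated programmatically. The sketch below is a condensed paraphrase of the template above, not an official version of it. The function name `build_idiv_se_prompt` and the parameter `n` (number of approaches and personas) are hypothetical names we introduce for illustration.

```python
def build_idiv_se_prompt(problem: str, n: int = 3) -> str:
    """Fill a condensed version of the IDIV-SE template with a problem.

    `n` controls how many approaches and expert personas are requested.
    The wording is a paraphrase of the article's template.
    """
    if n < 1:
        raise ValueError("n must be at least 1")

    approach_slots = "\n".join(f"Approach {i}:" for i in range(1, n + 1))
    demo_slots = "\n\n".join(
        f"Demonstration {i} (Approach {i}):" for i in range(1, n + 1)
    )
    persona_steps = "\n\n".join(
        f"Take on the persona of expert {i} and solve the original problem "
        f"using Approaches 1-{n}."
        for i in range(1, n + 1)
    )

    sections = [
        problem,
        f"To begin, please suggest {n} distinct approaches I could use to "
        "accurately solve the above problem:",
        approach_slots,
        f"Now please provide {n} short demonstrations, each solving the "
        "original problem using one of the approaches you suggested above:",
        demo_slots,
        persona_steps,
        f"Please synthesize your responses from the {n} expert personas "
        "above and provide your final recommended solution.",
    ]
    # Blank lines between sections, mirroring the layout of the template.
    return "\n\n".join(sections)
```

For example, `build_idiv_se_prompt("1, 2, 3, 5, x, 13")` returns a single string that can be pasted into ChatGPT or sent through an API client of your choice.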

Prompt Author: The Prompt Index

Full credit to Naik, R., Chandrasekaran, V., Yuksekgonul, M., Palangi, H., & Nushi, B. (2023). Diversity of thought improves reasoning abilities of large language models. arXiv preprint arXiv:2310.07088.