Introduction
Imagine asking an AI to plan your two-week trip to Japan. A basic LLM would dump a wall of text and call it a day. But an agentic AI would break the problem into pieces - research flights, check hotel availability, look up visa requirements, cross-reference your calendar, find restaurants near each hotel - and then stitch everything together into a coherent itinerary. The difference? The reasoning and planning algorithm running under the hood.
That's what this guide is about. Behind every capable AI agent - whether it's writing code, booking travel, debugging software, or orchestrating complex workflows - lies an algorithm that determines how the agent thinks and acts. Some think in chains. Others explore trees. A few build entire graphs of interconnected ideas. And the newest ones? They discover their own algorithms automatically.
This guide walks you through 17 foundational algorithms that form the backbone of agentic AI. For each one, you'll find a flow diagram, a plain-English example, Python pseudocode, the original research paper, and an honest assessment of trade-offs.
The algorithms naturally fall into four categories:
- Reasoning Frameworks (1-4): How agents think step-by-step
- Planning Architectures (5-8): How agents organize and execute multi-step tasks
- Self-Improvement Loops (9-12): How agents learn from mistakes and refine outputs
- Advanced Topologies (13-17): How agents structure complex reasoning beyond simple chains
Part 1: Reasoning Frameworks
1. ReAct: Reasoning + Acting
Paper: ReAct: Synergizing Reasoning and Acting in Language Models - Yao et al., ICLR 2023
ReAct interleaves reasoning (thinking) with acting (tool use) in a continuous loop. Before ReAct, researchers treated these as separate problems. ReAct's insight is that they strengthen each other: reasoning guides better actions, and real-world observations ground reasoning to reduce hallucinations.
Flow Diagram
+------------------+
|    User Query    |
+--------+---------+
         |
         v
+-------------------------+
|        THOUGHT          |<--------+
|  LLM reasons about      |         |
|  what to do next        |         |
+------------+------------+         |
             |                      |
             v                      |
+-------------------------+         |
|        ACTION           |         |
|  Call a tool            |         |
|  (search, calculate)    |         |
+------------+------------+         |
             |                      |
             v                      |
+-------------------------+         |
|      OBSERVATION        |         |
|  Get real-world result  |         |
+------------+------------+         |
             |                      |
             v                      |
       +-----------+                |
       |  Answer   |-------No-------+
       |  found?   |
       +-----+-----+
         Yes |
             v
     +----------------+
     |  Final Answer  |
     +----------------+
Example: "Who painted the ceiling of the Sistine Chapel?"
| Step | Type | Content |
|---|---|---|
| 1 | Thought | I need to search for who painted the Sistine Chapel ceiling. |
| 2 | Action | Search["Sistine Chapel ceiling painter"] |
| 3 | Observation | Michelangelo painted the ceiling between 1508-1512... |
| 4 | Thought | I have the answer - it was Michelangelo. |
| 5 | Answer | Michelangelo |
Python Pseudocode
def react_agent(question: str, tools: dict, max_steps: int = 10) -> str:
"""ReAct Agent: Interleaves Thought, Action, and Observation."""
trajectory = f"Question: {question}\n"
for step in range(1, max_steps + 1):
# THOUGHT: LLM reasons about what to do next
response = llm.generate(REACT_PROMPT + trajectory)
if "Final Answer:" in response:
return extract_final_answer(response)
# ACTION: Parse and execute tool call
thought, action_name, action_input = parse_thought_and_action(response)
trajectory += f"Thought {step}: {thought}\n"
trajectory += f"Action {step}: {action_name}[{action_input}]\n"
# OBSERVATION: Get real-world feedback
observation = tools[action_name](action_input)
trajectory += f"Observation {step}: {observation}\n"
return "Max steps reached."
Key Insight: Reasoning guides better actions, and real-world observations ground reasoning - reducing hallucinations. ReAct is the foundation of modern agent frameworks (LangChain, LlamaIndex).
Trade-offs: High token usage (multiple LLM calls per query), 10-30s latency vs 800ms for single calls, overkill for simple tasks.
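To make the loop concrete, here is a runnable miniature of it. The LLM is replaced by a scripted stand-in (`FakeLLM`) and the toolbox has a single canned `Search` tool; both are illustrative stand-ins, not part of the original paper, but the control flow matches the pseudocode above.

```python
class FakeLLM:
    """Stand-in for a real LLM: replays a fixed Thought/Action script."""
    def __init__(self, script):
        self.script = list(script)

    def generate(self, prompt):
        return self.script.pop(0)

def run_react(question, llm, tools, max_steps=10):
    trajectory = f"Question: {question}\n"
    for step in range(1, max_steps + 1):
        response = llm.generate(trajectory)
        if "Final Answer:" in response:
            return response.split("Final Answer:", 1)[1].strip()
        # Expect "Thought: ...\nAction: Tool[input]"
        thought_line, action_line = response.strip().split("\n")
        thought = thought_line.split(":", 1)[1].strip()
        name, arg = action_line.split(":", 1)[1].strip().rstrip("]").split("[", 1)
        observation = tools[name.strip()](arg)
        trajectory += (f"Thought {step}: {thought}\n"
                       f"Action {step}: {name}[{arg}]\n"
                       f"Observation {step}: {observation}\n")
    return "Max steps reached."

tools = {"Search": lambda q: "Michelangelo painted the ceiling (1508-1512)."}
llm = FakeLLM([
    "Thought: I should search for the painter.\nAction: Search[Sistine Chapel ceiling painter]",
    "Final Answer: Michelangelo",
])
print(run_react("Who painted the Sistine Chapel ceiling?", llm, tools))
```

Swapping `FakeLLM` for a real model client and `tools` for real functions gives the same loop the frameworks mentioned above implement.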
2. Chain-of-Thought (CoT) Prompting
Paper: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models - Wei et al., NeurIPS 2022
Instead of asking an LLM to jump straight to an answer, you show it (or tell it) to think step-by-step. This simple technique unlocks powerful reasoning abilities that already exist in large models but aren't activated by default prompting.
Flow Diagram
+----------------------------------+
|            Question              |
|  + "Let's think step by step"    |
+---------------+------------------+
                |
                v
+----------------------------------+
|  Step 1: Identify what's given   |
+---------------+------------------+
                |
                v
+----------------------------------+
|  Step 2: Apply logic/formula     |
+---------------+------------------+
                |
                v
+----------------------------------+
|  Step 3: Compute result          |
+---------------+------------------+
                |
                v
+----------------------------------+
|  Step 4: Derive final answer     |
+---------------+------------------+
                |
                v
        +---------------+
        | Final Answer  |
        +---------------+
Example: "A store has 23 apples. They sell 8, then get a delivery of 15. How many?"
| Step | Reasoning |
|---|---|
| Step 1 | Start with 23 apples. |
| Step 2 | Sell 8: 23 - 8 = 15 apples. |
| Step 3 | Delivery of 15: 15 + 15 = 30 apples. |
| Answer | 30 apples |
Python Pseudocode
def chain_of_thought(question: str, examples: list = None) -> str:
"""Chain-of-Thought: Encourage step-by-step reasoning."""
if examples:
# Few-shot CoT: provide worked examples
prompt = "Solve by thinking step-by-step.\n\n"
for ex in examples:
prompt += f"Q: {ex['question']}\n"
prompt += f"A: {ex['reasoning']} The answer is {ex['answer']}.\n\n"
prompt += f"Q: {question}\nA:"
else:
# Zero-shot CoT: just add the magic phrase
prompt = f"Q: {question}\nA: Let's think step by step."
return llm.generate(prompt)
Key Insight: Just adding "Let's think step by step" can unlock reasoning in large models - no training needed. But CoT only works with large models (~100B+ params). Reasoning is an emergent property of scale.
Results: CoT with large models significantly outperforms standard prompting on math and reasoning tasks.
3. Self-Consistency CoT
Paper: Self-Consistency Improves Chain of Thought Reasoning - Wang et al., ICLR 2023
Generate many reasoning paths and pick the answer that appears most often. If multiple independent paths converge on the same answer, it's probably correct.
Flow Diagram
                +-----------+
                | Question  |
                +-----+-----+
                      |
      +---------------+---------------+
      |               |               |
      v               v               v
+-----------+   +-----------+   +-----------+
|  Path 1   |   |  Path 2   |   |  Path 3   |  ...
| 23-8=15   |   | 23+15=38  |   | 23+15-8   |
| 15+15=30  |   | 38-8=30   |   |   = 30    |
| Ans: 30   |   | Ans: 30   |   | Ans: 30   |
+-----+-----+   +-----+-----+   +-----+-----+
      |               |               |
      +---------------+---------------+
                      |
                      v
          +----------------------+
          |    MAJORITY VOTE     |
          |  "30" wins (4 of 5)  |
          +----------+-----------+
                     |
                     v
              +------------+
              | Answer: 30 |
              +------------+
Example: Same apple problem, 5 reasoning paths
| Path | Reasoning | Answer |
|---|---|---|
| Path 1 | 23 - 8 = 15, then 15 + 15 = 30 | 30 ✓ |
| Path 2 | 23 + 15 = 38, then 38 - 8 = 30 | 30 ✓ |
| Path 3 | 15 - 8 = 7, then 23 + 7 = 30 | 30 ✓ |
| Path 4 | 23 - 8 = 15, then 15 + 5 = 20 (error!) | 20 ✗ |
| Path 5 | 23 + 15 - 8 = 30 | 30 ✓ |
| Vote | 30 appeared 4/5 times | 30 wins |
Python Pseudocode
from collections import Counter
def self_consistency_cot(question: str, k: int = 10, temperature: float = 0.7) -> str:
"""Self-Consistency: Sample multiple CoT paths, majority vote."""
answers = []
for _ in range(k):
response = llm.generate(
f"Q: {question}\nA: Let's think step by step.",
temperature=temperature # Higher temp = more diverse paths
)
answers.append(extract_answer(response))
# Majority vote
return Counter(answers).most_common(1)[0][0]
Key Insight: Diverse reasoning paths act like an ensemble - errors get outvoted. Consistently outperforms standard CoT across math and reasoning benchmarks.
Trade-off: k× more compute (10 samples = 10× cost), diminishing returns beyond ~20-40 samples.
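The voting step is worth seeing end-to-end. In this runnable miniature the temperature sampling is stubbed out to return the five reasoning paths from the table above (the stub is illustrative; a real version would call the LLM k times):

```python
from collections import Counter

def self_consistency(sample_answer, k=5):
    """Draw k sampled answers and return the most common one."""
    answers = [sample_answer() for _ in range(k)]
    return Counter(answers).most_common(1)[0][0]

# Stand-in for temperature sampling: four paths reach 30, one errs to 20.
sampled = iter(["30", "30", "30", "20", "30"])
result = self_consistency(lambda: next(sampled), k=5)
print(result)  # majority answer: "30"
```

The single erroneous path is simply outvoted, which is the whole mechanism.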
4. Tree of Thoughts (ToT)
Paper: Tree of Thoughts: Deliberate Problem Solving with Large Language Models - Yao et al., NeurIPS 2023
Explore multiple reasoning branches like a tree. At each step, generate candidates, evaluate them, prune bad ones, and backtrack when needed.
Flow Diagram
+------------------------------------------+
|   Problem: Make 24 from [4, 5, 6, 3]     |
+--------+----------------+----------------+
         |                |
         v                v
+-----------------+  +-----------------+   +-----------------+
|    Branch A     |  |    Branch B     |   |    Branch C     |
|   4 + 5 = 9     |  |   5 x 6 = 30    |   |   4 - 3 = 1     |
|   Score: 7/10   |  |   Score: 8/10   |   |   Score: 2/10   |
+--------+--------+  +--------+--------+   |     PRUNED      |
         |                    |            +-----------------+
    +----+----+          +----+-----+
    v         v          v          v
+---------+ +---------+ +---------+ +---------------+
| 9x3=27  | | 9-3=6   | | 30-3=27 | | 5-4+3 = 4     |
| 27-6=21 | | 6x6=36  | | dead end| | 4 x 6 = 24    |
+----+----+ +----+----+ +----+----+ +-------+-------+
     |           |           |              |
     v           v           v              v
 Backtrack   Backtrack   Backtrack  +----------------------+
                                    | Answer:              |
                                    | (5 - 4 + 3) x 6 = 24 |
                                    +----------------------+
Example: Game of 24 - Make 24 from [4, 5, 6, 3]
| Branch | Exploration | Result |
|---|---|---|
| A | 4 + 5 = 9 → 9 × 3 = 27 → 27 - 6 = 21 | ✗ Dead end, backtrack |
| B | 5 × 6 = 30 → dead end → backtrack → (5 - 4 + 3) × 6 = 24 | ✓ Found it! |
| C | 4 - 3 = 1 → Score too low | Pruned |
CoT can't backtrack and struggles with combinatorial problems. ToT explores + backtracks, delivering massive improvements on tasks like Game of 24.
Python Pseudocode
def tree_of_thoughts(problem: str, breadth: int = 5, depth: int = 3) -> str:
"""Tree of Thoughts: BFS over reasoning tree with LLM evaluation."""
current_states = [problem]
for step in range(depth):
candidates = []
for state in current_states:
thoughts = llm.generate_n(f"Next steps for:\n{state}", n=breadth)
for thought in thoughts:
new_state = state + "\n" + thought
score = llm.evaluate(f"Rate this (0-10):\n{new_state}")
candidates.append((new_state, score))
candidates.sort(key=lambda x: x[1], reverse=True)
current_states = [s for s, _ in candidates[:breadth]]
return current_states[0]
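The same breadth-first search runs end-to-end in this miniature, with the LLM's two roles replaced by programmatic stand-ins: `propose` combines any two remaining numbers with an arithmetic operation, and `evaluate` plays the value model by checking whether a state can still reach 24 (a cheap exact lookahead, standing in for the LLM's fuzzier score). These stand-ins are illustrative; the paper uses LLM calls for both.

```python
from itertools import combinations

def propose(nums):
    """Generate successor states by combining any two numbers with +,-,*,/."""
    out = []
    for (i, a), (j, b) in combinations(enumerate(nums), 2):
        rest = [n for k, n in enumerate(nums) if k not in (i, j)]
        vals = {a + b, a - b, b - a, a * b}
        if abs(b) > 1e-9: vals.add(a / b)
        if abs(a) > 1e-9: vals.add(b / a)
        out.extend(rest + [v] for v in vals)
    return out

def evaluate(nums, target=24):
    """Stand-in for the LLM evaluator: 1.0 if the state can still reach 24."""
    if len(nums) == 1:
        return 1.0 if abs(nums[0] - target) < 1e-6 else 0.0
    return max((evaluate(s, target) for s in propose(nums)), default=0.0)

def tot_search(nums, breadth=3):
    states = [[float(n) for n in nums]]
    while len(states[0]) > 1:
        candidates = [s for st in states for s in propose(st)]
        candidates.sort(key=evaluate, reverse=True)  # keep best, prune the rest
        states = candidates[:breadth]
    return states[0][0]

print(tot_search([4, 5, 6, 3]))  # reaches 24.0
```

The generate → evaluate → prune loop is the essence of ToT; only the quality of `propose` and `evaluate` changes when a real LLM fills those roles.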
Part 2: Planning Architectures
5. LATS: Language Agent Tree Search
Paper: Language Agent Tree Search Unifies Reasoning, Acting, and Planning - Zhou et al., ICML 2024
LATS combines Monte Carlo Tree Search (the algorithm behind AlphaGo) with LLM agents. It explores, evaluates, backtracks, and learns from self-reflection on failures.
Flow Diagram
       +----------+
       |   Task   |
       +----+-----+
            |
            v
+----------------------+
|      1. SELECT       |<-----------------+
|   Pick best node     |                  |
+----------+-----------+                  |
           |                              |
           v                              |
+----------------------+                  |
|      2. EXPAND       |                  |
|   LLM generates      |                  |
|   possible actions   |                  |
+----------+-----------+                  |
           |                              |
           v                              |
+----------------------+                  |
|     3. EVALUATE      |                  |
|   Score the state    |                  |
+----------+-----------+                  |
           |                              |
           v                              |
+----------------------+                  |
|  4. BACKPROPAGATE    |                  |
|  Update tree scores  |                  |
+----------+-----------+                  |
           |                              |
      +----+-----+                        |
      | Success? |                        |
      +----+-----+                        |
    Yes |      | No                       |
        |      v                          |
        |  +--------------------+         |
        |  |  5. SELF-REFLECT   |         |
        |  |  "Failed because   |---------+
        |  |   ..." + retry     |
        |  +--------------------+
        v
+----------------------+
|     Best Action      |
|      Sequence        |
+----------------------+
Example: Writing a function to reverse a linked list
| Step | Action | Result |
|---|---|---|
| Select | Root → most promising path | - |
| Expand | Try iterative approach with prev/curr pointers | Code v1 |
| Evaluate | Run tests → 2/5 pass | Score: 0.4 |
| Reflect | "Failed on edge case: empty list. Need base case." | Stored |
| Expand 2 | Add if not head: return None + fix pointers | Code v2 |
| Evaluate 2 | Run tests → 5/5 pass! | Score: 1.0 ✓ |
Key Insight: Self-reflection prevents repeating the same mistakes. Strong improvements on code generation and interactive environments.
Python Pseudocode
import math
def lats(task: str, tools: dict, n_iterations: int = 50) -> str:
"""LATS: Monte Carlo Tree Search + LLM agent with self-reflection."""
root = MCTSNode(state=task)
reflections = [] # Memory of past failures
for _ in range(n_iterations):
# 1. SELECT: pick most promising node using UCB1
node = select_node(root, exploration_weight=1.4)
# 2. EXPAND: generate possible actions via LLM
actions = llm.generate(
f"Task: {node.state}\nPast failures: {reflections}\nSuggest next actions:"
)
children = [MCTSNode(state=apply(node.state, a)) for a in parse_actions(actions)]
node.children.extend(children)
# 3. EVALUATE: score the new state
child = children[0]
result = execute_with_tools(child.state, tools)
score = evaluate_result(result)
# 4. BACKPROPAGATE: update scores up the tree
backpropagate(child, score)
if score >= 1.0: # Success!
return extract_action_sequence(child)
# 5. SELF-REFLECT on failure
reflection = llm.generate(f"This approach failed: {result}\nWhy? How to improve?")
reflections.append(reflection)
return extract_best_path(root)
6. Plan-and-Execute Agent
Paper: Plan-and-Solve Prompting - Wang et al., ACL 2023
First make a complete plan, then execute it step by step. Separates the "strategist" (planner) from the "worker" (executor) - can use different models for each.
Flow Diagram
+--------------------------------------------+
|  Task: "Compare weather in                 |
|         Tokyo and London"                  |
+------------------+-------------------------+
                   |
                   v
+--------------------------------------------+
|  PLANNER (GPT-4 - powerful model)          |
|                                            |
|  Plan:                                     |
|  1. Search Tokyo weather                   |
|  2. Search London weather                  |
|  3. Compare and summarize                  |
+------------------+-------------------------+
                   |
                   v
+--------------------------------------------+
|  EXECUTOR (GPT-3.5 - cheaper model)        |
|                                            |
|  Step 1: Search -> "Tokyo: 28°C, sunny"    |
|                 |                          |
|                 v                          |
|  Step 2: Search -> "London: 15°C, rainy"   |
|                 |                          |
|                 v                          |
|  Step 3: Compare both results              |
+------------------+-------------------------+
                   |
                   v
+--------------------------------------------+
|  "Tokyo: 28°C sunny. London: 15°C          |
|   rainy. Tokyo is 13° warmer."             |
+--------------------------------------------+
Example: "Compare weather in Tokyo and London"
| Phase | Model | Action |
|---|---|---|
| Plan | GPT-4 | 1) Get Tokyo weather → 2) Get London weather → 3) Compare |
| Execute Step 1 | GPT-3.5 | Search → "Tokyo: 28°C, sunny, humidity 65%" |
| Execute Step 2 | GPT-3.5 | Search → "London: 15°C, rainy, humidity 82%" |
| Execute Step 3 | GPT-3.5 | Compare → "Tokyo is 13°C warmer and less humid" |
Key Insight: Use a powerful model for planning and a cheap model for execution - saves cost without sacrificing quality.
Python Pseudocode
def plan_and_execute(task: str, tools: dict) -> str:
"""Plan-and-Execute: Separate planning from execution."""
# PLAN: Use a powerful model to create a step-by-step plan
plan = planner_llm.generate( # e.g., GPT-4
f"Create a step-by-step plan to accomplish:\n{task}"
)
steps = parse_steps(plan)
# EXECUTE: Use a cheaper model to carry out each step
results = []
for step in steps:
result = executor_llm.generate( # e.g., GPT-3.5
f"Execute this step using available tools:\n{step}\n"
f"Previous results: {results}"
)
results.append(execute_with_tools(result, tools))
# SYNTHESIZE: Combine all results into a final answer
return planner_llm.generate(
f"Task: {task}\nStep results: {results}\nSynthesize final answer:"
)
7. ReWOO: Reasoning Without Observation
Paper: ReWOO: Decoupling Reasoning from Observations - Xu et al., 2023
Plan ALL tool calls upfront, execute them all, then reason once - roughly 5× fewer tokens than ReAct.
Flow Diagram
+-----------------------------------------------------+
|  "Who is older: the director of Titanic             |
|   or the director of Avatar?"                       |
+------------------------+----------------------------+
                         |
                         v
+-----------------------------------------------------+
|  PLANNER (one LLM call)                             |
|                                                     |
|  #E1 = Search["director of Titanic"]                |
|  #E2 = Search["director of Avatar"]                 |
|  #E3 = Search["age of #E1"]                         |
|  #E4 = Search["age of #E2"]                         |
+------------------------+----------------------------+
                         |
                         v
+-----------------------------------------------------+
|  WORKER (execute all tool calls)                    |
|                                                     |
|  #E1 -> "James Cameron"                             |
|  #E2 -> "James Cameron"                             |
|  #E3 -> "Born 1954, age 71"                         |
|  #E4 -> Same person!                                |
+------------------------+----------------------------+
                         |
                         v
+-----------------------------------------------------+
|  SOLVER (one LLM call)                              |
|                                                     |
|  "Both films were directed by James Cameron.        |
|   Same person - the question is moot!"              |
+-----------------------------------------------------+
Efficiency Comparison
| Metric | ReAct | ReWOO |
|---|---|---|
| HotpotQA Accuracy | Baseline | Higher |
| Tokens Used | ~10,000 | ~2,000 |
| Token Efficiency | 1x | ~5x |
Key Insight: ReAct re-sends the entire conversation history at every step. ReWOO avoids this by planning upfront - 5× fewer tokens, better accuracy.
Python Pseudocode
def rewoo(question: str, tools: dict) -> str:
"""ReWOO: Plan all tool calls upfront, execute, then reason once."""
# PLANNER: One LLM call to create the full plan with variable references
plan = planner_llm.generate(
f"Plan tool calls to answer: {question}\n"
f"Use #E1, #E2, etc. as variable placeholders.\n"
f"Available tools: {list(tools.keys())}"
)
steps = parse_plan_with_vars(plan) # e.g., [("#E1", "Search", "director of Titanic"), ...]
# WORKER: Execute all tool calls, resolving variable references
evidence = {}
for var, tool_name, args in steps:
# Replace variable references like #E1 with actual results
resolved_args = resolve_variables(args, evidence)
evidence[var] = tools[tool_name](resolved_args)
# SOLVER: One LLM call to synthesize everything
return solver_llm.generate(
f"Question: {question}\nEvidence: {evidence}\nAnswer:"
)
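The `resolve_variables` helper is the core trick in ReWOO: tool arguments may reference earlier evidence as #E1, #E2, and so on. A minimal regex-based sketch (the helper name comes from the pseudocode above; the implementation here is an assumption):

```python
import re

def resolve_variables(args: str, evidence: dict) -> str:
    """Replace #En placeholders with evidence collected so far.

    Unknown placeholders are left untouched so later steps can still
    reference them.
    """
    return re.sub(r"#E\d+",
                  lambda m: str(evidence.get(m.group(0), m.group(0))),
                  args)

evidence = {"#E1": "James Cameron"}
print(resolve_variables("age of #E1", evidence))  # -> "age of James Cameron"
```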
8. LLMCompiler: Parallel Function Calling
Paper: An LLM Compiler for Parallel Function Calling - Kim et al., ICML 2024
Creates a task dependency graph (DAG) and runs independent tasks in parallel.
Flow Diagram
+------------------------------------------------+
|  "Compare ratings of 3 movies:                 |
|   Inception, Interstellar, Tenet"              |
+-----------------------+------------------------+
                        |
                        v
        +-------------------------------+
        |  PLANNER: Create Task DAG     |
        +----+----------+----------+----+
             |          |          |
             v          v          v
      +----------+ +----------+ +----------+
      |  Task 1  | |  Task 2  | |  Task 3  |
      |  Search  | |  Search  | |  Search  |
      | Inception| | Interst. | |  Tenet   |
      | deps: [] | | deps: [] | | deps: [] |
      +----+-----+ +----+-----+ +----+-----+
           |            |            |
           |   ALL RUN IN PARALLEL   |
           |            |            |
           +------------+------------+
                        |
                        v
        +-------------------------------+
        |  Task 4: Compare ratings      |
        |  deps: [Task 1, Task 2,       |
        |         Task 3]               |
        +---------------+---------------+
                        |
                        v
        +-------------------------------+
        |  "Inception: 8.8              |
        |   Interstellar: 8.7           |
        |   Tenet: 7.3"                 |
        +-------------------------------+
Python Pseudocode
import asyncio
def llm_compiler(query: str, tools: dict) -> str:
"""LLMCompiler: Plan a DAG, execute independent tasks in parallel."""
dag = planner_llm.generate(
f"Query: {query}\nCreate tasks with: task_id, tool, args, dependencies"
)
tasks = parse_dag(dag)
results = {}
    async def run(task):
        # Wait until every dependency has produced a result
        while not all(d in results for d in task.deps):
            await asyncio.sleep(0.1)
        args = resolve_vars(task.args, results)
        results[task.id] = tools[task.tool](args)
    async def main():
        await asyncio.gather(*[run(t) for t in tasks])
    asyncio.run(main())  # asyncio.run needs a single coroutine, not a bare gather
return solver_llm.generate(f"Query: {query}\nResults: {results}")
Key Insight: Parallelism is the key win. Independent tasks run simultaneously, dramatically reducing latency and cost compared to sequential approaches like ReAct.
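A runnable sketch of the parallel executor, with the planner replaced by a hard-coded DAG for the three-movie query. Instead of polling, each task waits on an `asyncio.Event` per dependency; the `Task` dataclass, tool functions, and ratings are all illustrative stand-ins.

```python
import asyncio
from dataclasses import dataclass, field

@dataclass
class Task:
    id: str
    tool: str
    args: str
    deps: list = field(default_factory=list)

async def execute_dag(tasks, tools):
    results = {}
    done = {t.id: asyncio.Event() for t in tasks}

    async def run(task):
        for d in task.deps:                      # block until deps finish
            await done[d].wait()
        results[task.id] = await tools[task.tool](task.args, results)
        done[task.id].set()

    await asyncio.gather(*[run(t) for t in tasks])
    return results

RATINGS = {"Inception": 8.8, "Interstellar": 8.7, "Tenet": 7.3}

async def search(movie, _results):
    await asyncio.sleep(0.01)                    # pretend network latency
    return RATINGS[movie]

async def compare(_args, results):
    return max(("t1", "t2", "t3"), key=lambda k: results[k])

tasks = [
    Task("t1", "search", "Inception"),
    Task("t2", "search", "Interstellar"),
    Task("t3", "search", "Tenet"),
    Task("t4", "compare", "", deps=["t1", "t2", "t3"]),
]
out = asyncio.run(execute_dag(tasks, {"search": search, "compare": compare}))
print(out["t4"])  # id of the highest-rated movie: "t1"
```

The three searches run concurrently; only the compare step waits, which is exactly the latency win the paper reports.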
Part 3: Self-Improvement Loops
9. Reflexion: Verbal Reinforcement Learning
Paper: Reflexion: Language Agents with Verbal Reinforcement Learning - Shinn et al., NeurIPS 2023
When the agent fails, it writes a "lessons learned" reflection and stores it in memory. On the next attempt, it reads past reflections and avoids repeating mistakes.
Flow Diagram
       +----------+
       |   Task   |
       +----+-----+
            |
            v
+---------------------------+
|          ACTOR            |<----------------+
|  Attempt task             |                 |
|  (reads past reflections) |                 |
+------------+--------------+                 |
             |                                |
             v                                |
+---------------------------+                 |
|         EVALUATOR         |                 |
|     Run tests / score     |                 |
+------------+--------------+                 |
             |                                |
        +----+-----+                          |
        |  Pass?   |                          |
        +----+-----+                          |
      Yes |      | No                         |
          v      v                            |
  +---------+  +----------------------+       |
  |  Done!  |  |     SELF-REFLECT     |       |
  +---------+  |  "What went wrong?"  |       |
               +----------+-----------+       |
                          |                   |
                          v                   |
               +----------------------+       |
               |        MEMORY        |-------+
               | Store lesson learned |
               +----------------------+
Example: Write is_palindrome()
| Trial | Code | Tests | Reflection |
|---|---|---|---|
| 1 | return s == s[::-1] | "racecar" ✓, "Race Car" ✗ | "Need to normalize: lowercase + remove spaces" |
| 2 | s = s.lower().replace(" ",""); ... | "Race Car" ✓, "A man, a plan" ✗ | "Also strip non-alphanumeric chars" |
| 3 | Uses regex to keep only alphanumeric | All tests pass ✓ | - |
Key Insight: Learning from failure without retraining. Each reflection builds persistent memory that prevents the same mistake twice.
Python Pseudocode
def reflexion(task: str, evaluator, max_trials: int = 5) -> str:
"""Reflexion: Learn from failures via verbal self-reflection."""
memory = [] # Stores reflections from past attempts
for trial in range(1, max_trials + 1):
# ACTOR: Attempt the task (reading past reflections)
prompt = f"Task: {task}\n"
if memory:
prompt += f"Lessons from past attempts:\n" + "\n".join(memory) + "\n"
prompt += "Generate solution:"
solution = llm.generate(prompt)
# EVALUATOR: Check if it's correct
score, feedback = evaluator(solution)
if score >= 1.0:
return solution # Success!
# SELF-REFLECT: Analyze what went wrong
reflection = llm.generate(
f"Task: {task}\nYour solution: {solution}\n"
f"Feedback: {feedback}\nWhat went wrong and how to fix it?"
)
memory.append(f"Trial {trial}: {reflection}")
return solution # Return best attempt
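A runnable miniature of the same loop for the is_palindrome task from the table. The actor is a scripted sequence of candidate solutions (standing in for an LLM conditioned on `memory`), but the evaluator is a real test harness, so the success/failure signal is genuine.

```python
import re

# Scripted "actor" outputs: trial 1 is naive, trial 2 normalizes case and
# spaces, trial 3 strips all non-alphanumeric characters.
ATTEMPTS = [
    "lambda s: s == s[::-1]",
    'lambda s: (lambda t: t == t[::-1])(s.lower().replace(" ", ""))',
    "lambda s: (lambda t: t == t[::-1])(re.sub(r'[^a-z0-9]', '', s.lower()))",
]

def evaluator(solution_src):
    """Real evaluator: run the candidate against concrete test cases."""
    fn = eval(solution_src)
    cases = [
        ("racecar", True),
        ("Race Car", True),
        ("A man, a plan, a canal: Panama", True),
        ("hello", False),
    ]
    failures = [s for s, want in cases if fn(s) != want]
    return (1.0 if not failures else 0.0), failures

def reflexion(max_trials=5):
    memory = []                        # verbal lessons from past failures
    for trial in range(max_trials):
        solution = ATTEMPTS[trial]     # real actor would read `memory` here
        score, feedback = evaluator(solution)
        if score >= 1.0:
            return solution, trial + 1
        memory.append(f"Trial {trial + 1} failed on {feedback}")
    return solution, max_trials

best, trials = reflexion()
print(trials)  # succeeds on the third attempt
```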
10. Self-Refine: Iterative Self-Improvement
Paper: Self-Refine: Iterative Refinement with Self-Feedback - Madaan et al., 2023
One LLM plays three roles: generator, critic, and refiner. Generate β critique β refine β repeat.
Flow Diagram
       +----------+
       |   Task   |
       +----+-----+
            |
            v
+---------------------------+
|         GENERATE          |
|        First draft        |
+------------+--------------+
             |
             v
+---------------------------+
|         CRITIQUE          |<---------+
|      "What's wrong?"      |          |
+------------+--------------+          |
             |                         |
        +----+------+                  |
        |   Good    |                  |
        |  enough?  |                  |
        +----+------+                  |
      Yes |      | No                  |
          |      v                     |
          |  +------------------+      |
          |  |      REFINE      |      |
          |  |  Improve based   |------+
          |  |  on critique     |
          |  +------------------+
          v
+------------------------+
|    Polished Output     |
+------------------------+
Example: Professional email
| Round | Draft | Critique |
|---|---|---|
| 1 | "I can't make the meeting Thursday." | "Too blunt. No greeting, no alternative." |
| 2 | "Hi Sarah, I have a conflict Thursday. Could we reschedule to Friday?" | "Better, but could acknowledge importance." |
| 3 | "Hi Sarah, I appreciate you organizing this. Unfortunately I have a conflict - would Friday at 2pm work?" | "Professional, warm. Looks good! ✓" |
Key Insight: Self-critique catches what the initial generation misses. Humans consistently prefer Self-Refine outputs over single-pass generation.
Python Pseudocode
def self_refine(task: str, max_rounds: int = 5) -> str:
"""Self-Refine: Generate, critique, refine in a loop."""
# GENERATE: Create initial draft
draft = llm.generate(f"Complete this task:\n{task}")
for round in range(max_rounds):
# CRITIQUE: Same LLM evaluates its own work
critique = llm.generate(
f"Task: {task}\nCurrent draft:\n{draft}\n\n"
f"What's wrong with this? Be specific about improvements needed."
)
# Check if the critique says it's good enough
if "looks good" in critique.lower() or "no issues" in critique.lower():
break
# REFINE: Improve based on the critique
draft = llm.generate(
f"Task: {task}\nCurrent draft:\n{draft}\n"
f"Critique: {critique}\n\nImprove the draft based on this feedback:"
)
return draft
11. RAP: Reasoning via Planning
Paper: Reasoning with Language Model is Planning with World Model - Hao et al., EMNLP 2023
The LLM plays dual roles: world model (predicts outcomes) and reasoning agent (picks actions). MCTS searches for the best reasoning path.
Flow Diagram
       +-----------+
       |  Problem  |
       +-----+-----+
             |
             v
+--------------------------------+
|        LLM as AGENT            |<----------+
|    "Go to dairy aisle"         |           |
+---------------+----------------+           |
                |                            |
                v                            |
+--------------------------------+           |
|     LLM as WORLD MODEL         |           |
|    "Now at dairy aisle.        |           |
|     Milk and eggs available."  |           |
+---------------+----------------+           |
                |                            |
                v                            |
+--------------------------------+           |
|         Reward: +2             |           |
|    (2 items collected)         |           |
+---------------+----------------+           |
                |                            |
                v                            |
+--------------------------------+           |
|        Backpropagate           |           |
+---------------+----------------+           |
                |                            |
        More iterations? ---Yes--------------+
                | No
                v
+--------------------------------+
|  Optimal Plan:                 |
|  dairy -> bakery -> checkout   |
+--------------------------------+
Key Insight: The LLM plays both agent AND world model. Smarter search compensates for smaller models.
Python Pseudocode
import math
def rap(problem: str, n_iterations: int = 100, depth: int = 5) -> str:
"""RAP: Monte Carlo Tree Search with LLM as both agent and world model."""
root = MCTSNode(state=problem)
for _ in range(n_iterations):
node = root
# SELECT: traverse tree using UCB1
while node.children and not node.is_terminal:
node = max(node.children, key=lambda c: ucb1_score(c))
# EXPAND: LLM as agent proposes actions
actions = llm.generate(f"Possible next actions for:\n{node.state}")
for action in parse_actions(actions):
# LLM as world model predicts next state
next_state = llm.generate(
f"State: {node.state}\nAction: {action}\nPredict next state:"
)
child = MCTSNode(state=next_state, parent=node)
node.children.append(child)
        # EVALUATE: LLM scores the newly added state
        child = node.children[-1]
        reward = float(llm.generate(f"Rate progress (0-1):\n{child.state}"))
        # BACKPROPAGATE: update scores from the new child up to the root
        node = child
        while node:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
# Return best path
return extract_best_path(root)
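The `ucb1_score` used in the SELECT step (here and in LATS) is standard UCB1: average reward plus an exploration bonus for rarely visited nodes. A sketch, assuming each node tracks `visits`, `total_reward`, and its `parent`:

```python
import math
from types import SimpleNamespace

def ucb1_score(node, c=1.4):
    """Exploitation term plus exploration bonus; c trades off the two."""
    if node.visits == 0:
        return float("inf")            # always try unvisited children first
    exploit = node.total_reward / node.visits
    explore = c * math.sqrt(math.log(node.parent.visits) / node.visits)
    return exploit + explore

# Tiny demo: a fresh child always outranks a well-explored mediocre one.
parent = SimpleNamespace(visits=10)
visited = SimpleNamespace(visits=5, total_reward=4.0, parent=parent)
fresh = SimpleNamespace(visits=0, total_reward=0.0, parent=parent)
print(ucb1_score(fresh) > ucb1_score(visited))  # True
```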
12. ADaPT: Adaptive Planning
Paper: ADaPT: As-Needed Decomposition and Planning - Prasad et al., NAACL 2024
Try first, decompose only when you fail. Simple tasks get done immediately. Complex tasks get recursively broken down.
Flow Diagram
+--------------------------+
|  "Clean the kitchen"     |
+------------+-------------+
             |
             v
+--------------------------+
|  Try executing directly  |
+------------+-------------+
             |
        +----+-----+
        | Success? |
        +----+-----+
      Yes |      | No
          v      v
   +--------+  +----------------------+
   |  Done! |  |   Decompose into     |
   +--------+  |      subtasks        |
               +---+--------+------+--+
                   |        |      |
         +---------+        |      +----------+
         v                  v                 v
  +-------------+  +-------------+  +-----------------+
  | Wash dishes |  | Clean stove |  | Organize pantry |
  |    Try      |  |    Try      |  |      Try        |
  |     OK      |  |     OK      |  |     Failed!     |
  +-------------+  +-------------+  +--------+--------+
                                             |
                                             v
                                  +----------------------+
                                  |  Decompose further   |
                                  +-----+----------+-----+
                                        |          |
                                        v          v
                                 +-----------+ +------------+
                                 | Sort cans | | Sort boxes |
                                 |    OK     | |     OK     |
                                 +-----------+ +------------+
Key Insight: Decomposition depth naturally matches task complexity. Only break down what actually fails.
Python Pseudocode
def adapt(task: str, executor, max_depth: int = 3, depth: int = 0) -> str:
"""ADaPT: Try first, decompose only on failure."""
# Try executing the task directly
result = executor.attempt(task)
if result.success:
return result.output
if depth >= max_depth:
return f"Failed after max decomposition depth: {task}"
# Failed - decompose into subtasks
subtasks = llm.generate(
f"Task '{task}' failed. Break it into smaller subtasks:"
)
results = []
for subtask in parse_subtasks(subtasks):
# Recursively apply ADaPT to each subtask
sub_result = adapt(subtask, executor, max_depth, depth + 1)
results.append(sub_result)
return combine_results(results)
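The recursion is easiest to see running. In this miniature, a stub executor can only handle single-word tasks and a stub decomposer splits on " and " first, then on spaces; both are illustrative stand-ins for LLM calls.

```python
from dataclasses import dataclass

@dataclass
class Result:
    success: bool
    output: str = ""

def attempt(task):
    # Stand-in executor: succeeds only on single-word tasks.
    ok = len(task.split()) == 1
    return Result(ok, f"done:{task}" if ok else "")

def decompose(task):
    # Stand-in decomposer: coarse split first, finer split on retry.
    return task.split(" and ") if " and " in task else task.split()

def adapt(task, max_depth=3, depth=0):
    result = attempt(task)
    if result.success:
        return [result.output]
    if depth >= max_depth:
        return [f"failed:{task}"]
    outs = []
    for sub in decompose(task):
        outs.extend(adapt(sub, max_depth, depth + 1))  # recurse on failure
    return outs

print(adapt("wash dishes and clean stove"))
```

Note how decomposition depth varies per branch: each subtask is only split as far as its own failures require.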
Part 4: Advanced Topologies
13. Hierarchical Planning
Paper: HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face - Shen et al., NeurIPS 2023
A powerful LLM acts as the "brain" - decomposing requests, selecting specialist models, and orchestrating execution. Like a CEO delegating to experts.
Flow Diagram
+------------------------------------------------------+
|  "Describe this image and read aloud the text"       |
+-------------------------+----------------------------+
                          |
                          v
+------------------------------------------------------+
|  Stage 1: TASK PLANNING                              |
|                                                      |
|  Subtask A: Image captioning                         |
|  Subtask B: OCR text extraction                      |
|  Subtask C: Text-to-speech                           |
+-------------------------+----------------------------+
                          |
                          v
+------------------------------------------------------+
|  Stage 2: MODEL SELECTION                            |
|                                                      |
|  A -> BLIP-2 (best caption model)                    |
|  B -> TrOCR  (best OCR model)                        |
|  C -> Bark   (best TTS model)                        |
+--------+---------------+----------------+------------+
         |               |                |
         v               v                v
            Stage 3: EXECUTION
+---------------+  +------------+  +---------------+
|    BLIP-2     |  |   TrOCR    |  |   Bark TTS    |
|  "A cat on    |  |  "print(   |  |   audio.wav   |
|   a laptop"   |  |   hello    |  |               |
|               |  |   world)"  |  |               |
+-------+-------+  +-----+------+  +-------+-------+
        |                |                 |
        +----------------+-----------------+
                         |
                         v
+------------------------------------------------------+
|  Stage 4: RESPONSE GENERATION                        |
|                                                      |
|  "The image shows a cat on a laptop.                 |
|   The code says print('hello world'). [Audio]"       |
+------------------------------------------------------+
Key Insight: No single model can do everything well. Use an LLM as the orchestrator to delegate to specialist models.
Python Pseudocode
def hierarchical_planning(request: str, model_registry: dict) -> str:
"""Hierarchical Planning: LLM orchestrator delegates to specialist models."""
# Stage 1: Task Planning - break into subtasks
plan = orchestrator_llm.generate(
f"Break this into subtasks with dependencies:\n{request}"
)
subtasks = parse_subtasks_with_deps(plan)
# Stage 2: Model Selection - pick best model for each subtask
assignments = {}
for task in subtasks:
selected = orchestrator_llm.generate(
f"Which model best handles '{task.type}'?\n"
f"Available: {list(model_registry.keys())}"
)
assignments[task.id] = model_registry[selected.strip()]
# Stage 3: Execution - run each specialist model
results = {}
for task in topological_sort(subtasks):
input_data = resolve_dependencies(task, results)
results[task.id] = assignments[task.id].run(input_data)
# Stage 4: Response Generation - combine all results
return orchestrator_llm.generate(
f"Original request: {request}\nResults: {results}\nGenerate final response:"
)
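The `topological_sort` in Stage 3 guarantees every subtask runs after its dependencies. A sketch using Kahn's algorithm, assuming each subtask exposes an `id` and a `deps` list of dependency ids (as in the pseudocode above):

```python
from collections import deque
from types import SimpleNamespace

def topological_sort(subtasks):
    """Order subtasks so each one comes after all of its dependencies."""
    by_id = {t.id: t for t in subtasks}
    indegree = {t.id: len(t.deps) for t in subtasks}
    dependents = {t.id: [] for t in subtasks}
    for t in subtasks:
        for d in t.deps:
            dependents[d].append(t.id)
    queue = deque(tid for tid, deg in indegree.items() if deg == 0)
    order = []
    while queue:
        tid = queue.popleft()
        order.append(by_id[tid])
        for nxt in dependents[tid]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if len(order) != len(subtasks):
        raise ValueError("cycle in subtask dependencies")
    return order

# Demo: c depends on a and b; b depends on a.
demo = [
    SimpleNamespace(id="c", deps=["a", "b"]),
    SimpleNamespace(id="a", deps=[]),
    SimpleNamespace(id="b", deps=["a"]),
]
order = [t.id for t in topological_sort(demo)]
print(order)  # ['a', 'b', 'c']
```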
14. Least-to-Most Prompting
Paper: Least-to-Most Prompting Enables Complex Reasoning in Large Language Models - Zhou et al., ICLR 2023
Break the hard problem into easy subproblems. Solve from easiest to hardest, feeding each answer into the next.
Flow Diagram
+-----------------------------------------------------+
|  "5 machines make 5 widgets in 5 min.               |
|   How long for 100 machines to make 100?"           |
+------------------------+----------------------------+
                         |
                         v
         +----------------------------+
         |         DECOMPOSE          |
         |    (easiest -> hardest)    |
         +-------------+--------------+
                       |
                       v
         +----------------------------+
         |  Sub Q1 (easiest):         |
         |  "How long for 1 machine   |
         |   to make 1 widget?"       |
         |                            |
         |  -> Answer: 5 minutes      |
         +-------------+--------------+
                       | answer feeds
                       v
         +----------------------------+
         |  Sub Q2 (medium):          |
         |  "How long for 1 machine   |
         |   to make 100 widgets?"    |
         |                            |
         |  -> Answer: 500 minutes    |
         +-------------+--------------+
                       | answer feeds
                       v
         +----------------------------+
         |  Sub Q3 (hardest):         |
         |  "How long for 100         |
         |   machines to make 100?"   |
         |                            |
         |  -> 100 machines parallel, |
         |     each makes 1 widget    |
         +-------------+--------------+
                       |
                       v
              +----------------+
              | Answer: 5 min  |
              +----------------+
Example: The classic widget problem
| Sub-question | Reasoning | Answer |
|---|---|---|
| Q1 (easy) | Each machine makes 1 widget in 5 min | 5 min |
| Q2 (medium) | 1 widget = 5 min, so 100 = 500 min | 500 min |
| Q3 (hard) | 100 machines in parallel, each makes 1 | 5 min |
Key Insight: Solving easiest sub-problems first builds a foundation of knowledge that makes harder sub-problems tractable.
Python Pseudocode
def least_to_most(question: str) -> str:
    """Least-to-Most: Decompose into sub-questions, solve easiest first."""
    # Stage 1: Decompose into sub-questions (easiest to hardest)
    decomposition = llm.generate(
        f"Break this into sub-questions from easiest to hardest:\n{question}"
    )
    sub_questions = parse_ordered_questions(decomposition)

    # Stage 2: Solve sequentially, feeding answers forward
    context = ""
    answer = ""
    for sub_q in sub_questions:
        prompt = f"Context from previous answers:\n{context}\n\nQuestion: {sub_q}"
        answer = llm.generate(prompt)
        context += f"\nQ: {sub_q}\nA: {answer}\n"

    # The last answer addresses the original (hardest) question
    return answer
15. Algorithm of Thoughts (AoT)
Paper: Algorithm of Thoughts: Enhancing Exploration of Ideas in LLMs - Sel et al., 2023
Teach the LLM to simulate tree search internally using algorithmic examples in the prompt. Tree-like exploration in a single query.
Flow Diagram
βββββββββββββββββββββββββββββββββββββββββββββββ
β π₯ Problem + "Explore like DFS algorithm" β
ββββββββββββββββββββ¬βββββββββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββββββββββ
β π§ SINGLE LLM CALL (internal simulation) β
β β
β Path 1: Try approach A β
β β hit contradiction β
β β β BACKTRACK β
β β
β Path 2: Try approach B β
β β partial progress, keep going β
β β β
β ββ Path 2.1: Sub-approach B1 β
β β β dead end β
β β β β BACKTRACK β
β β β
β ββ Path 2.2: Sub-approach B2 β
β β works! β
β β β
SUCCESS β
β β
β (Path 3: not needed, answer found) β
ββββββββββββββββββββ¬βββββββββββββββββββββββββββ
β
βΌ
ββββββββββββββββββββββββββββ
β β
Answer from Path 2.2 β
ββββββββββββββββββββββββββββ
Example: Solving a logic puzzle
| Path | Exploration | Result |
|---|---|---|
| Path 1 | Assume A is true β B is both true AND false | β Contradiction, backtrack |
| Path 2.1 | Assume A is false, B is false β C must be both | β Dead end |
| Path 2.2 | Assume A is false, B is true β C is false | β Consistent! |
Key Insight: By showing the LLM how an algorithm explores, it can simulate that exploration internally in a single call. Much cheaper than ToT.
Python Pseudocode
def algorithm_of_thoughts(problem: str, algorithm: str = "dfs") -> str:
    """AoT: Teach LLM to simulate tree search in a single call."""
    # Build a prompt that shows the LLM how to explore like an algorithm
    prompt = f"""Solve this problem by exploring like a {algorithm} algorithm.

Rules:
- Explore one path at a time
- If you hit a contradiction or dead end, explicitly BACKTRACK
- Try the next unexplored branch
- Continue until you find a consistent solution
- Show your exploration trace

Problem: {problem}

Exploration trace:
Path 1: """

    # Single LLM call does the entire tree search internally
    response = llm.generate(prompt, max_tokens=2000)

    # Extract the final answer from the exploration trace
    return extract_solution(response)
16. Graph of Thoughts (GoT)
Paper: Graph of Thoughts: Solving Elaborate Problems with LLMs - Besta et al., AAAI 2024
CoT = chain. ToT = tree. GoT = graph. Thoughts can have multiple parents (merging ideas), feedback loops, and arbitrary connections.
Flow Diagram
βββββββββββββββββββββββββββββββ
β π₯ "Sort [7,3,9,1,5,8,2]" β
ββββββββββββ¬βββββββββββββββββββ
β
Split into parts
β
ββββββββββββββ΄βββββββββββββ
βΌ βΌ
ββββββββββββββββββββ ββββββββββββββββββββ
β π Idea A β β π Idea B β
β Sort [7,3,9,1] β β Sort [5,8,2] β
β β [1,3,7,9] β β β [2,5,8] β
ββββββββββ¬ββββββββββ ββββββββββ¬ββββββββββ
β β
ββββββββββββββ¬βββββββββββββ
β
βΌ (2 parents - only graphs can do this!)
ββββββββββββββββββββββββββ
β π AGGREGATE (merge) β ββββββββ
β [1,2,3,5,7,8,9] β β
ββββββββββββββ¬ββββββββββββ β
β β
ββββββ΄βββββ β
β Correct? β β
ββββββ¬βββββ β
Yes β β No β
βΌ βββ π REFINE ββββ
ββββββββββββββββββ
β β
Sorted list! β
ββββββββββββββββββ
The merge node has TWO parents (A and B). Trees can't do this - only graphs can combine ideas from different branches. On tasks like sorting, the GoT paper reports substantially higher quality than ToT while also reducing cost.
Python Pseudocode
from enum import Enum

class Operation(Enum):
    GENERATE = "generate"
    AGGREGATE = "aggregate"
    REFINE = "refine"
    SCORE = "score"

def graph_of_thoughts(problem: str, operations: list) -> str:
    """GoT: Process thoughts as a graph with merging, looping, and refining."""
    graph = ThoughtGraph()
    initial = graph.add_node(thought=problem)

    for op in operations:
        if op.type == Operation.GENERATE:
            # Split: create multiple child thoughts (like ToT)
            children = [llm.generate(f"Approach for: {op.input}") for _ in range(op.k)]
            for child in children:
                graph.add_node(thought=child, parents=[op.input_node])

        elif op.type == Operation.AGGREGATE:
            # Merge: combine multiple thoughts into one (unique to GoT!)
            combined = llm.generate(
                "Combine these partial solutions:\n" +
                "\n".join(n.thought for n in op.input_nodes)
            )
            graph.add_node(thought=combined, parents=op.input_nodes)

        elif op.type == Operation.REFINE:
            # Loop back: improve an existing thought
            improved = llm.generate(f"Improve this:\n{op.input_node.thought}")
            graph.add_node(thought=improved, parents=[op.input_node])

        elif op.type == Operation.SCORE:
            # Evaluate: score a thought for quality
            op.input_node.score = llm.evaluate(op.input_node.thought)

    return graph.get_best_node().thought
17. AFlow: Automated Workflow Generation
Paper: AFlow: Automating Agentic Workflow Generation - Zhang et al., ICLR 2025 (Oral)
The meta-algorithm. Instead of humans choosing which algorithm to use, let an LLM automatically discover the optimal workflow using MCTS.
Flow Diagram
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β π₯ Task Dataset + Available Operators β
β [Generate, Review, Ensemble, Test, Refine, ...] β
βββββββββββββββββββββββββ¬βββββββββββββββββββββββββββββββ
β
βββββββββββββββββββΌββββββββββββββββββββββββββββ
β βΌ β
β ββββββββββββββββββββββββββββββββββββββ β
β β 1. SELECT β β
β β Pick promising workflow to modify β β
β ββββββββββββββββ¬ββββββββββββββββββββββ β
β β β
β βΌ β
β ββββββββββββββββββββββββββββββββββββββ β
β β 2. LLM MODIFIES workflow code β β
β β "Add ensemble step after β β
β β generation" β β
β ββββββββββββββββ¬ββββββββββββββββββββββ β
β β β
β βΌ β
β ββββββββββββββββββββββββββββββββββββββ β
β β 3. EVALUATE on validation set β β
β β Score improves each iteration... β β
β ββββββββββββββββ¬ββββββββββββββββββββββ β
β β β
β βΌ β
β ββββββββββββββββββββββββββββββββββββββ β
β β 4. BACKPROPAGATE score β β
β ββββββββββββββββ¬ββββββββββββββββββββββ β
β β β
β Converged? βββββ No ββββββββββββββ
β β
β Yes β
β βΌ
β ββββββββββββββββββββββββββββββββββββββ
ββββ β
Optimal Workflow Discovered! β
β β
β "Generate β Self-Critique β β
β Ensemble(3) β Format" β
β β
β GPT-4o-mini with this workflow β
β Small model + smart workflow β
β BEATS big model + naive workflow! β
ββββββββββββββββββββββββββββββββββββββ
Example: Finding the best workflow for math problems
| Iteration | Workflow Tried | Trend |
|---|---|---|
| 1 | "Generate answer directly" | Baseline |
| 2 | "Generate + Review" | Better |
| 3 | "Generate + Review + Ensemble(5)" | Much better |
| 10 | "CoT + Self-Consistency(3) + Review + Refine" | Strong |
| 30 | "Generate(temp=0.8) + Test + Reflect + Retry + Ensemble(3)" | Best |
Key Insight: A smaller model with a smart workflow can outperform a larger model with a naive workflow. Smarter workflows > bigger models.
Python Pseudocode
def aflow(task_dataset: list, operators: list, n_iterations: int = 30) -> dict:
    """AFlow: Use MCTS to automatically discover the best workflow."""
    # Start with a simple workflow
    root = WorkflowNode(workflow=["generate"])
    best_workflow = root.workflow
    best_score = 0

    for i in range(n_iterations):
        # 1. SELECT: pick a promising workflow to modify
        node = select_node(root, exploration_weight=1.4)

        # 2. EXPAND: LLM proposes a modification to the workflow
        modification = llm.generate(
            f"Current workflow: {node.workflow}\n"
            f"Available operators: {operators}\n"
            f"Suggest one improvement (add/remove/reorder a step):"
        )
        new_workflow = apply_modification(node.workflow, modification)
        child = WorkflowNode(workflow=new_workflow, parent=node)

        # 3. EVALUATE: run the new workflow on validation data
        score = evaluate_workflow(new_workflow, task_dataset)
        child.score = score
        if score > best_score:
            best_score = score
            best_workflow = new_workflow

        # 4. BACKPROPAGATE: update scores up the tree
        backpropagate(child, score)

    return {"workflow": best_workflow, "score": best_score}
Choosing the Right Algorithm
| Scenario | Recommended Algorithm |
|---|---|
| Simple Q&A with tools | ReAct |
| Math/logic problems | CoT or Self-Consistency CoT |
| Tasks requiring backtracking | Tree of Thoughts |
| Multi-step agent tasks | Plan-and-Execute or ReWOO |
| Independent parallel subtasks | LLMCompiler |
| Code generation with retries | Reflexion |
| Content quality improvement | Self-Refine |
| Complex decomposition problems | Least-to-Most |
| Multi-modal orchestration | Hierarchical Planning |
| Cost-constrained production | AFlow |
Algorithm Composition Patterns
In practice, these algorithms are rarely used in isolation. The real power comes from combining them. Here are five commonly used composition patterns:
Pattern 1: ReAct + Reflexion (Learn-from-Failure Agent)
Use ReAct for action execution and Reflexion for learning from failures. When the ReAct loop fails, Reflexion writes a reflection and the agent retries with that memory.
ReAct Loop βββΊ Failure βββΊ Reflexion (reflect + store)
β² β
βββββββββ Retry with memory ββββ
Use case: Code generation agents that debug their own mistakes across attempts.
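A minimal sketch of the retry-with-memory loop, assuming hypothetical `run_react` and `reflect` callables that wrap your LLM calls (in a real agent, `run_react` would drive a full ReAct loop and `reflect` would prompt the LLM to explain the failure):

```python
def learn_from_failure(task, run_react, reflect, max_attempts=3):
    """Run a ReAct-style attempt; on failure, store a reflection and retry.

    run_react(task, memory) -> (success: bool, result)
    reflect(task, result)   -> str  # a lesson carried into the next attempt
    """
    memory = []  # verbal lessons from earlier failed attempts
    for attempt in range(max_attempts):
        success, result = run_react(task, memory)
        if success:
            return result
        # Reflexion step: turn the failure into a reusable lesson
        memory.append(reflect(task, result))
    return None  # all attempts exhausted
```

The key design choice is that `memory` is plain text fed back into the next attempt's prompt, not gradient updates: the agent "learns" within a single session.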
Pattern 2: CoT + Self-Consistency (Robust Reasoning)
Generate multiple CoT paths with high temperature, then majority-vote the answer. This is the simplest and most commonly used composition.
Question βββΊ CoT Path 1 βββ
βββΊ CoT Path 2 βββΌβββΊ Majority Vote βββΊ Answer
βββΊ CoT Path 3 βββ
Use case: Math problems, factual QA, anywhere correctness matters more than speed.
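The voting step is simple enough to sketch in a few lines. Here `sample_cot` is a hypothetical stand-in for one high-temperature LLM call that returns a reasoning trace and a final answer:

```python
from collections import Counter

def self_consistency(question, sample_cot, n_paths=5):
    """Sample several chain-of-thought paths and majority-vote the answer.

    sample_cot(question) -> (reasoning: str, answer: str)
    """
    answers = [sample_cot(question)[1] for _ in range(n_paths)]
    winner, count = Counter(answers).most_common(1)[0]
    return winner, count / n_paths  # answer plus agreement ratio
```

The agreement ratio is a useful free byproduct: low agreement signals the question may need a stronger model or more paths.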
Pattern 3: Plan-and-Execute + LLMCompiler (Fast Parallel Agent)
Use Plan-and-Execute to create the plan, then hand it to LLMCompiler to parallelize independent steps.
Task βββΊ Planner (GPT-4) βββΊ Dependency Graph βββΊ LLMCompiler
β
Parallel Execution
β
βββΊ Result
Use case: Multi-tool agents that need to gather information from several APIs quickly.
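The executor side of this pattern can be sketched with the standard library: repeatedly find every task whose prerequisites are done and run that batch concurrently. This is a simplified stand-in for what LLMCompiler does, with a hypothetical `run_task` callable in place of real tool calls:

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel_plan(tasks, deps, run_task):
    """Execute a dependency graph, running independent tasks concurrently.

    tasks: list of task ids
    deps:  dict mapping task id -> set of prerequisite task ids
    run_task(task_id, results_so_far) -> result
    """
    results = {}
    remaining = set(tasks)
    with ThreadPoolExecutor() as pool:
        while remaining:
            # Every task whose prerequisites are all done can run now
            ready = [t for t in remaining if deps.get(t, set()) <= results.keys()]
            if not ready:
                raise ValueError("Cycle or missing dependency in plan")
            futures = {t: pool.submit(run_task, t, dict(results)) for t in ready}
            for t, fut in futures.items():
                results[t] = fut.result()
            remaining -= set(ready)
    return results
```

Tasks with no dependency edge between them land in the same `ready` batch and execute in parallel, which is exactly where the latency win over sequential ReAct-style loops comes from.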
Pattern 4: Hierarchical Planning + Self-Refine (Quality Orchestration)
Use Hierarchical Planning to delegate to specialist models, then Self-Refine to polish the combined output.
Request βββΊ Orchestrator βββΊ Specialist A βββ
βββΊ Specialist B βββΌβββΊ Combine βββΊ Self-Refine Loop βββΊ Output
βββΊ Specialist C βββ
Use case: Multi-modal tasks (image + text + audio) that need polished final output.
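The orchestration skeleton can be sketched generically. All of the callables here (`specialists`, `combine`, `critique`, `refine`) are hypothetical hooks you would back with LLM or specialist-model calls:

```python
def orchestrate_and_refine(request, specialists, combine, critique, refine,
                           max_rounds=3):
    """Fan out to specialists, combine their outputs, then critique/refine.

    specialists: list of callables, each handling one subtask or modality
    combine(request, parts) -> draft
    critique(draft) -> feedback string, or "" when satisfied
    refine(draft, feedback) -> improved draft
    """
    parts = [spec(request) for spec in specialists]
    draft = combine(request, parts)
    for _ in range(max_rounds):
        feedback = critique(draft)
        if not feedback:  # critic is satisfied; stop refining
            break
        draft = refine(draft, feedback)
    return draft
```

Capping the loop with `max_rounds` matters in production: Self-Refine critics rarely declare perfection, so an unbounded loop burns tokens for diminishing returns.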
Pattern 5: ADaPT + Least-to-Most (Smart Decomposition)
Try the task directly (ADaPT). If it fails, decompose using Least-to-Most ordering (easiest subtask first).
Task βββΊ Try directly (ADaPT)
β
Success? ββYesβββΊ Done
β
No
β
βΌ
Decompose (Least-to-Most ordering)
Solve easiest βββΊ ... βββΊ Solve hardest βββΊ Done
Use case: Complex multi-step tasks where difficulty is unknown upfront.
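A recursive sketch of this try-then-decompose pattern, assuming hypothetical `try_solve` and `decompose` callables (LLM-backed in practice) and a depth cap to keep recursion bounded:

```python
def adapt_least_to_most(task, try_solve, decompose, max_depth=2):
    """Try the task directly; on failure, solve easier subtasks first.

    try_solve(task, context) -> result, or None on failure
    decompose(task) -> subtasks ordered easiest-to-hardest
    """
    return _solve(task, "", 0, try_solve, decompose, max_depth)

def _solve(task, context, depth, try_solve, decompose, max_depth):
    result = try_solve(task, context)
    if result is not None or depth >= max_depth:
        return result
    # Direct attempt failed: decompose and feed each sub-answer forward
    for sub in decompose(task):
        sub_result = _solve(sub, context, depth + 1,
                            try_solve, decompose, max_depth)
        if sub_result is not None:
            context += f"\n{sub}: {sub_result}"
    # Retry the original task with the accumulated sub-answers as context
    return try_solve(task, context)
```

Because decomposition only happens on failure, easy tasks pay no overhead, which is the main appeal of ADaPT over always-decompose planners.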
Getting Started: Practical Integration
Ready to use these algorithms in your own projects? Here's how to get started with three popular frameworks.
Using LangChain
LangChain has built-in support for several of these algorithms. Here's a quick ReAct agent:
from langchain.agents import create_react_agent, AgentExecutor
from langchain_openai import ChatOpenAI
from langchain.tools import Tool

# Define your tools
tools = [
    Tool(name="Search", func=search_func, description="Search the web"),
    Tool(name="Calculator", func=calc_func, description="Do math"),
]

# Create a ReAct agent
llm = ChatOpenAI(model="gpt-4")
agent = create_react_agent(llm, tools, react_prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Run it
result = executor.invoke({"input": "What is the population of France divided by 3?"})
Using LangGraph (Plan-and-Execute, Reflexion, and more)
LangGraph enables more complex patterns like Plan-and-Execute and Reflexion through its graph-based workflow:
from langgraph.graph import StateGraph, END

# Define a Plan-and-Execute workflow
workflow = StateGraph(PlanExecuteState)

# Add nodes for planning and execution
workflow.add_node("planner", plan_step)      # GPT-4 creates the plan
workflow.add_node("executor", execute_step)  # GPT-3.5 executes each step
workflow.add_node("replan", replan_step)     # Re-plan if needed

# Define edges
workflow.set_entry_point("planner")
workflow.add_edge("planner", "executor")
workflow.add_conditional_edges(
    "executor",
    should_replan,  # Check if we need to adjust
    {"replan": "replan", "end": END}
)
workflow.add_edge("replan", "executor")

app = workflow.compile()
result = app.invoke({"input": "Compare weather in Tokyo and London"})
Using LlamaIndex (RAG + ReAct)
LlamaIndex combines retrieval with agentic reasoning:
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI
from llama_index.core.tools import QueryEngineTool

# Create tools from your data indices
tools = [
    QueryEngineTool.from_defaults(
        query_engine=docs_index.as_query_engine(),
        name="documentation",
        description="Search internal documentation"
    ),
]

# Create a ReAct agent with your tools
agent = ReActAgent.from_tools(
    tools,
    llm=OpenAI(model="gpt-4"),
    verbose=True
)

response = agent.chat("What does our API rate limiting policy say?")
Quick Reference: Which Framework for Which Algorithm?
| Algorithm | LangChain | LangGraph | LlamaIndex |
|---|---|---|---|
| ReAct | create_react_agent | Custom graph | ReActAgent |
| Plan-and-Execute | - | PlanExecute template | - |
| Reflexion | - | Custom with memory | - |
| Self-Consistency | Custom chain | Parallel branches | - |
| Tree of Thoughts | - | Custom BFS graph | - |
The Evolution: From Chains to Graphs to Automation
2022 2023 2024 2025
β β β β
β βββββββββββ β ββββββββββββ β ββββββββββββ β ββββββββββββ
ββββΊβ π CoT β ββββΊβ π³ ToT β βββΊβ πΈοΈ GoT β βββΊβ π€ AFlow β
β β (Linear β β β (Tree + β β β (Graph β β β (Auto- β
β β Chain) β β β Backtrk) β β β Topology)β β β mated) β
β βββββββββββ β ββββββββββββ β ββββββββββββ β ββββββββββββ
β β β β
β βββββββββββ β ββββββββββββ β β
ββββΊβ π³οΈ Self- β ββββΊβ π€ ReAct β β β
β β Consist. β β β ReWOO β β β
β β(Ensemble)β β β Plan&Exe β β β
β βββββββββββ β ββββββββββββ β β
β β β β
β β ββββββββββββ β β
β ββββΊβ πͺ Reflex β β β
β β β Self- β β β
β β β Refine β β β
β β ββββββββββββ β β
β β β β
The trend is unmistakable: we're moving from hand-designed chains to automatically discovered, arbitrarily structured reasoning workflows. The future belongs to systems that dynamically choose and combine these algorithms based on the task at hand.