ReAct

Intermediate · 🧠 Reasoning Patterns · Princeton & Google (Yao et al., 2022)

Intent

Interleave reasoning (thinking about what to do) with acting (using tools) in a Thought → Action → Observation loop.

Problem

Pure reasoning (chain-of-thought) can't access external information or take actions. Pure acting (tool use) without reasoning leads to aimless tool calls and inability to synthesize results. An agent needs to think about what information it needs, get it, then reason about what it learned.

Solution

The agent operates in a loop of three steps: Thought (reason about what to do next), Action (call a tool or take an action), and Observation (process the result). The reasoning determines which action to take, and the observation feeds back into the next round of reasoning. This grounds the reasoning: instead of thinking in a vacuum, the model interacts with its environment and adjusts based on what it finds.
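The loop described above can be sketched without any LLM call. In this minimal illustration, `scripted_model`, `run_loop`, `lookup`, and the `FACTS` table are all hypothetical stand-ins (a real agent would replace `scripted_model` with a model call), and the only tool is an in-memory lookup.

```python
from typing import Callable

# Hypothetical tool: a tiny in-memory lookup standing in for a real search API.
FACTS = {"capital of France": "Paris"}


def lookup(query: str) -> str:
    return FACTS.get(query, "no result")


def scripted_model(transcript: list[str]) -> str:
    """Stand-in for an LLM: picks the next step from the transcript so far."""
    observations = [entry for entry in transcript if entry.startswith("Observation:")]
    if not observations:
        return "Thought: I need the capital.\nAction: lookup(capital of France)"
    answer = observations[-1].removeprefix("Observation: ")
    return f"Thought: The observation settles the question.\nAnswer: {answer}"


def run_loop(question: str, model: Callable[[list[str]], str], max_steps: int = 3) -> str:
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        step = model(transcript)           # Thought (+ Action or Answer)
        transcript.append(step)
        if "Answer:" in step:
            return step.split("Answer:", 1)[1].strip()
        # Action: take the argument between the parentheses of the lookup call.
        arg = step.split("Action: lookup(", 1)[1].rsplit(")", 1)[0]
        # Observation: the tool result becomes context for the next Thought.
        transcript.append(f"Observation: {lookup(arg)}")
    return "no answer"
```

Calling `run_loop("What is the capital of France?", scripted_model)` takes one Thought/Action/Observation round and then answers from the observation.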

Diagram

Question → [Thought: I need to search for X]
                ↓
           [Action: search('X')]
                ↓
           [Observation: Found Y and Z]
                ↓
           [Thought: Now I know Y. I also need W]
                ↓
           [Action: lookup('W')]
                ↓
           [Observation: W is ...]
                ↓
           [Thought: I have enough info. The answer is ...]
                ↓
              Answer

When to Use

Tasks that require gathering information from external sources
Interactive problem-solving where each step depends on previous results
Question-answering over knowledge bases or APIs
Any task combining reasoning with tool use

When NOT to Use

Pure reasoning tasks with no need for external information
When all needed information is already in the prompt
Simple tool calls that don't require reasoning about results

Pros & Cons

Pros

Grounds reasoning in real observations, which can reduce hallucination
Transparent: you can inspect the agent's thought process
Flexible: works with any set of tools
Handles tasks that require dynamic, multi-step information gathering

Cons

Can get stuck in loops or make unnecessary tool calls
Reasoning overhead adds latency
Quality depends on tool quality and availability
Hard to predict the number of iterations needed
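One way to soften the first drawback is to refuse an action once the agent has issued it too many times. The sketch below is an assumption about how such a guard could look, not part of the ReAct paper; `RepeatGuard` and its `max_repeats` parameter are hypothetical names.

```python
from collections import Counter


class RepeatGuard:
    """Flags an action once the same (tool, argument) pair exceeds a repeat limit."""

    def __init__(self, max_repeats: int = 2) -> None:
        self.max_repeats = max_repeats
        self.counts = Counter()

    def allow(self, tool: str, argument: str) -> bool:
        # Normalize whitespace and case so trivially different calls count together.
        key = (tool, argument.strip().lower())
        self.counts[key] += 1
        return self.counts[key] <= self.max_repeats
```

When `allow` returns `False`, the loop could feed back an observation such as "You already tried this action" instead of calling the tool again.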

Implementation Steps

1. Define the available tools (search, calculator, API, etc.)
2. Create a system prompt that instructs the model to use Thought/Action/Observation format
3. Implement tool execution: parse the action, execute it, return the observation
4. Build the loop: feed observations back as context for the next thought
5. Set a maximum iteration count to prevent infinite loops
6. Add a mechanism for the agent to signal it has a final answer
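Steps 1 and 3 might look like the sketch below: a calculator tool that evaluates arithmetic through Python's `ast` module rather than `eval`, plus a dispatcher that turns failures into observations the model can read. The names `safe_calculate` and `run_tool` are hypothetical, and the supported operators are a deliberately small subset.

```python
import ast
import operator

# Only these operators are allowed; anything else is rejected.
_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_calculate(expression: str) -> str:
    """Evaluate +, -, *, / on numeric literals; return an error string otherwise."""
    def evaluate(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](evaluate(node.operand))
        raise ValueError("unsupported expression")

    try:
        return str(evaluate(ast.parse(expression, mode="eval").body))
    except (ValueError, SyntaxError, ZeroDivisionError) as exc:
        return f"Error: {exc}"


# Tool registry (step 1): maps tool names to callables that take one string.
TOOLS = {"calculate": safe_calculate}


def run_tool(name: str, argument: str) -> str:
    """Execute a parsed action (step 3); unknown tools become an error observation."""
    tool = TOOLS.get(name)
    if tool is None:
        return f"Error: unknown tool '{name}'"
    return tool(argument)
```

Returning errors as strings, rather than raising, lets the loop hand them back as observations so the agent can correct itself on the next thought.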

Real-World Example

Research Assistant

User asks: 'What were Anthropic's key papers in 2024?'

Thought: I should search for Anthropic publications.
Action: web_search('Anthropic research papers 2024')
Observation: Results mention Constitutional AI, Claude 3, and Sleeper Agents.
Thought: Let me get details on each.
Action: web_search('Anthropic sleeper agents paper')
...

The agent continues until it can synthesize a comprehensive answer.

Prompt: ReAct System Prompt

You have access to these tools:
- search(query): Search the web for information
- calculate(expression): Evaluate a math expression

Use this exact format for EVERY step:

Thought: reason about what to do next
Action: tool_name(argument)
Observation: [tool result will appear here]

...repeat Thought/Action/Observation as needed...

Thought: I now have enough information
Answer: [your final answer]

Question: What is the GDP per capita of the country
that hosted the 2024 Summer Olympics?
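If the tool list changes often, the prompt above could be generated from the same registry the agent executes, so the two stay in sync. This is a sketch under that assumption; `TOOL_DESCRIPTIONS` and `build_system_prompt` are hypothetical names, and the wording simply mirrors the prompt shown above.

```python
# Hypothetical registry: tool name -> (parameter name, description).
TOOL_DESCRIPTIONS = {
    "search": ("query", "Search the web for information"),
    "calculate": ("expression", "Evaluate a math expression"),
}


def build_system_prompt(tools: dict[str, tuple[str, str]]) -> str:
    """Render the ReAct system prompt with one bullet per registered tool."""
    tool_lines = "\n".join(
        f"- {name}({param}): {description}"
        for name, (param, description) in tools.items()
    )
    return (
        "You have access to these tools:\n"
        f"{tool_lines}\n\n"
        "Use this exact format for EVERY step:\n\n"
        "Thought: reason about what to do next\n"
        "Action: tool_name(argument)\n"
        "Observation: [tool result will appear here]\n\n"
        "...repeat Thought/Action/Observation as needed...\n\n"
        "Thought: I now have enough information\n"
        "Answer: [your final answer]"
    )
```

The user's question is then sent as a separate message rather than being baked into the system prompt.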

Python: ReAct Agent Loop

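from openai import OpenAI
import re

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder tools. eval() is unsafe on model-generated input; here it runs with
# builtins removed, but a real agent should use a proper expression parser.
TOOLS = {
    "search": lambda q: f"Results for '{q}'...",
    "calculate": lambda e: str(eval(e, {"__builtins__": {}}, {})),
}

SYSTEM_PROMPT = (
    "Tools: search(query), calculate(expression). "
    "Use Thought/Action/Observation format. End with Answer:"
)

def react_agent(question: str, max_steps: int = 5) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]

    for _ in range(max_steps):
        # Stop before the model writes its own Observation; the tool supplies it.
        response = client.chat.completions.create(
            model="gpt-4o", messages=messages, stop=["Observation:"]
        )
        text = response.choices[0].message.content or ""
        messages.append({"role": "assistant", "content": text})

        if "Answer:" in text:
            return text.split("Answer:")[-1].strip()

        # Parse the action and execute the tool; failures become observations
        # so every iteration adds new context instead of repeating the same call.
        action_match = re.search(r"Action: (\w+)\((.*)\)", text)
        if not action_match:
            observation = "No valid Action found. Use Action: tool_name(argument) or give an Answer."
        else:
            tool, arg = action_match.group(1), action_match.group(2).strip("'\"` ")
            if tool not in TOOLS:
                observation = f"Unknown tool '{tool}'. Available: {', '.join(TOOLS)}"
            else:
                try:
                    observation = TOOLS[tool](arg)
                except Exception as exc:  # report tool failures to the model
                    observation = f"Tool error: {exc}"
        messages.append({"role": "user", "content": f"Observation: {observation}"})

    return "Max steps reached"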

References

ReAct: Synergizing Reasoning and Acting in Language Models. Yao et al., 2022