Interactive Demo

The Agentic Loop

AI agents work by repeating a fundamental cycle: reason about the task, take an action, observe the result, and decide what to do next. This is the agentic loop.

Scenario
"Find the current weather in Tokyo and recommend what to wear today."

The Three Phases

Every iteration of the agentic loop follows the same three-step pattern, regardless of task complexity.

Think

The agent reasons about its current state and decides what to do next. It considers the goal, what it has learned so far, and which tools are available.

"I need weather data. I should call the weather API for Tokyo."

Act

The agent executes an action -- calling a tool, running code, searching the web, or writing a file. This is where the agent interacts with the external world.

get_weather(city="Tokyo", units="celsius")

Observe

The agent reads the result of its action and integrates the new information. It then decides whether the task is complete or another iteration is needed.

Result: 14C, partly cloudy, 60% humidity, wind 12 km/h
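
The three phases above can be sketched as a single pass of the loop. This is only an illustration under stated assumptions, not the demo's actual implementation: `get_weather` is a stub that returns the canned reading from the example, `think` is a hard-coded stand-in for a model call, and `TOOLS` and `run_one_pass` are hypothetical names.

```python
# Hypothetical stub: a real tool would call a weather service.
def get_weather(city: str, units: str = "celsius") -> str:
    canned = {"Tokyo": "14C, partly cloudy, 60% humidity, wind 12 km/h"}
    return canned.get(city, f"no data for {city}")

# Registry mapping tool names to callables, so an action can be looked up by name.
TOOLS = {"get_weather": get_weather}

def think(goal: str, notes: list[str]) -> dict:
    """Stand-in for the model: pick the next action from what is known so far."""
    if not notes:
        return {"tool": "get_weather", "args": {"city": "Tokyo", "units": "celsius"}}
    return {"done": True}

def run_one_pass(goal: str, notes: list[str]) -> bool:
    """Run one Think-Act-Observe pass; return True when no further action is needed."""
    decision = think(goal, notes)                          # Think
    if decision.get("done"):
        return True
    result = TOOLS[decision["tool"]](**decision["args"])   # Act
    notes.append(f"Result: {result}")                      # Observe
    return False

notes: list[str] = []
goal = "Find the current weather in Tokyo and recommend what to wear today."
first = run_one_pass(goal, notes)    # gathers the weather reading
second = run_one_pass(goal, notes)   # nothing left to look up
```

The first pass records the observation in `notes`; the second pass sees that observation and decides no further tool call is needed.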

How It Comes Together

1. User Prompt: The user describes a task in natural language.
2. Loop Begins: The agent enters the Think-Act-Observe cycle.
3. Iterations: Each loop gathers information and makes progress.
4. Final Answer: The agent delivers a complete response to the user.

Deep Dive

Go beyond the demo. Explore the patterns, code, and lessons behind building reliable AI agents.

The ReAct (Reasoning + Acting) pattern underlies many modern AI agents. Introduced by Yao et al. in 2022, it interleaves chain-of-thought reasoning with concrete tool use. Instead of generating a plan all at once and hoping for the best, a ReAct agent reasons one step at a time, executes that step, and feeds the result back into its next round of reasoning. This tight loop lets the agent recover from mistakes, adapt to unexpected data, and build up answers incrementally.

The pattern works because it mirrors how humans solve unfamiliar problems: we think about what to do, try something, look at the outcome, and adjust. By making each of these stages explicit and observable, ReAct agents become far more debuggable than end-to-end black-box systems. You can inspect the agent's reasoning trace, see exactly which tool calls it made and why, and pinpoint the step where something went wrong.

A key insight is that the "Think" step is not optional filler -- it is where the agent decides which tool to call and why, formulates hypotheses about what the result might look like, and maintains a running mental model of progress toward the goal. Skipping the reasoning step leads to agents that thrash between tools without converging on a solution.

Below is Python pseudocode showing the essential structure of a ReAct agent. The outer for loop drives the iterations, capped at max_iterations, and each pass moves through Think, Act, and Observe before the agent decides whether to continue or deliver a final answer.

import llm, tools  # placeholder modules standing in for an LLM client and a tool registry

def react_agent(user_prompt, max_iterations=10):
    history = [{"role": "user", "content": user_prompt}]

    for i in range(max_iterations):
        # ---- THINK ----
        thought = llm.generate(
            system="You are a helpful agent. Reason step-by-step.",
            messages=history,
        )
        history.append({"role": "assistant", "content": thought})

        # Check if the agent decided it is done
        if thought.startswith("FINAL ANSWER:"):
            return thought.removeprefix("FINAL ANSWER:").strip()

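        # ---- ACT ----
        # parse_tool_call: a separate helper that extracts the tool name and
        # a dict of keyword arguments from the model's text.
        tool_name, tool_args = parse_tool_call(thought)
        result = tools.execute(tool_name, **tool_args)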

        # ---- OBSERVE ----
        observation = f"Tool `{tool_name}` returned:\n{result}"
        history.append({"role": "tool", "content": observation})

    return "Reached max iterations without a final answer."
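
The pseudocode calls `parse_tool_call` without showing it. One possible sketch follows, assuming a convention that is not part of the original: the model ends its thought with a line of the form `ACTION: <tool_name> <JSON arguments>`. The line format and the `ACTION_RE` name are illustrative choices; real systems often rely on the structured tool-call output of their LLM API instead of parsing free text.

```python
import json
import re

# Assumed convention (not from the original): the model's thought contains a line
#   ACTION: <tool_name> <JSON object of keyword arguments>
ACTION_RE = re.compile(r"^ACTION:\s*(\w+)\s*(\{.*\})?\s*$", re.MULTILINE)

def parse_tool_call(thought: str) -> tuple[str, dict]:
    """Extract (tool_name, tool_args) from the model's text."""
    match = ACTION_RE.search(thought)
    if match is None:
        raise ValueError("no ACTION line found in model output")
    tool_name = match.group(1)
    raw_args = match.group(2) or "{}"   # a tool may take no arguments
    tool_args = json.loads(raw_args)    # raises a ValueError subclass on malformed JSON
    return tool_name, tool_args

name, args = parse_tool_call(
    'I need weather data.\nACTION: get_weather {"city": "Tokyo", "units": "celsius"}'
)
```

Raising `ValueError` on unparseable output gives the caller one exception type to catch and report back to the model.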

Notice how the history list accumulates context. Each thought and observation becomes part of the prompt for the next iteration, giving the agent a growing memory of what it has tried and learned. In production systems you would add error handling around tool execution, token-budget management to keep the context window from overflowing, and guardrails to prevent the agent from calling dangerous tools without confirmation.
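
As a rough illustration of token-budget management, the sketch below trims the history before each model call. It rests on loud assumptions: `estimate_tokens` uses a crude four-characters-per-token heuristic rather than a real tokenizer, and `trim_history` is a hypothetical helper that always keeps the first message (the user's task) plus as many of the newest messages as fit.

```python
def estimate_tokens(text: str) -> int:
    # Crude assumption: about 4 characters per token. Swap in a real tokenizer.
    return max(1, len(text) // 4)

def trim_history(history: list[dict], budget: int) -> list[dict]:
    """Keep the first message plus the most recent messages that fit the budget."""
    if not history:
        return []
    first, rest = history[0], history[1:]
    remaining = budget - estimate_tokens(first["content"])
    kept: list[dict] = []
    for message in reversed(rest):      # walk from newest to oldest
        cost = estimate_tokens(message["content"])
        if cost > remaining:
            break                       # older messages are dropped
        kept.append(message)
        remaining -= cost
    return [first] + kept[::-1]         # restore chronological order

history = [
    {"role": "user", "content": "a" * 40},
    {"role": "assistant", "content": "b" * 40},
    {"role": "tool", "content": "c" * 40},
    {"role": "assistant", "content": "d" * 40},
]
trimmed = trim_history(history, budget=30)
```

Dropping the oldest messages is the simplest policy; summarizing them instead preserves more context at the cost of an extra model call.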

Set a hard iteration cap. Without a maximum number of loops, an agent can enter infinite cycles -- especially when it encounters ambiguous tool output and keeps retrying the same call. A reasonable default is 5 to 15 iterations depending on task complexity. When the cap is reached, the agent should return the best partial answer it has rather than silently failing.
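
A minimal sketch of a cap that returns partial results might look like the following. The `step` callable and the shape of the returned dict are assumptions made for illustration: `step` appends whatever it learned to `notes` and returns a final answer, or `None` if more work is needed.

```python
from typing import Callable, Optional

def run_with_cap(
    step: Callable[[list[str]], Optional[str]],
    max_iterations: int = 10,
) -> dict:
    """Call `step` until it returns a final answer or the cap is hit."""
    notes: list[str] = []
    for iteration in range(1, max_iterations + 1):
        answer = step(notes)
        if answer is not None:
            return {"status": "complete", "answer": answer, "iterations": iteration}
    # Cap reached: surface what was gathered instead of failing silently.
    partial = "; ".join(notes) if notes else "no findings"
    return {"status": "partial", "answer": partial, "iterations": max_iterations}

# Usage: a step that never finishes, like an agent stuck retrying the same call.
def stuck_step(notes: list[str]) -> Optional[str]:
    notes.append("weather API returned ambiguous data")
    return None

outcome = run_with_cap(stuck_step, max_iterations=3)
```

Returning an explicit `status` field lets the caller tell a finished answer from a best-effort one.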

Make tool descriptions precise. The quality of the agent's "Think" step depends heavily on how well it understands the tools at its disposal. Vague or overlapping tool descriptions lead to wrong tool selection, wasted iterations, and confused reasoning. Treat tool descriptions like API documentation: specify inputs, outputs, edge cases, and when not to use the tool.
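
For illustration, here is what a precise description of the `get_weather` tool could look like. The field layout is an assumption loosely modeled on JSON Schema, not any particular vendor's required format, and `undocumented_parameters` is a hypothetical lint helper.

```python
# Illustrative tool description; field names are assumptions, not a vendor format.
GET_WEATHER_TOOL = {
    "name": "get_weather",
    "description": (
        "Return current conditions for one city. Use for present-day weather "
        "only; do not use for forecasts or historical data."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "City name in English, e.g. 'Tokyo'.",
            },
            "units": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Temperature unit for the result.",
            },
        },
        "required": ["city"],
    },
    "returns": "Text summary: temperature, sky condition, humidity, wind speed.",
    "errors": "Returns an error message if the city is unknown or the service times out.",
}

def undocumented_parameters(tool: dict) -> list[str]:
    """List parameters that lack a description, as a cheap check for vague tools."""
    props = tool["parameters"]["properties"]
    return [name for name, spec in props.items() if not spec.get("description")]

missing = undocumented_parameters(GET_WEATHER_TOOL)
```

The description states inputs, outputs, failure behavior, and when not to use the tool, which are the same items the paragraph above lists.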

Observe more than the happy path. Real-world tool calls fail: APIs time out, files are missing, data is malformed. Robust agents include the error message in the observation and let the LLM reason about recovery strategies. This is where the loop really shines -- a single failed call does not end the task; the agent simply thinks about what went wrong and tries a different approach on the next iteration.
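
One way to feed failures back into the loop is to wrap tool execution so that an exception becomes an observation instead of a crash. This is a sketch under assumptions: `observe` is a hypothetical wrapper, and `read_file` is a deliberately failing stub used only to show the error path.

```python
def observe(tool_name: str, tool_fn, **tool_args) -> str:
    """Run a tool and always return text the model can reason about."""
    try:
        result = tool_fn(**tool_args)
    except Exception as exc:  # broad on purpose: any failure becomes an observation
        return f"Tool `{tool_name}` failed with {type(exc).__name__}: {exc}"
    return f"Tool `{tool_name}` returned:\n{result}"

# Hypothetical failing tool, used only to demonstrate the error path.
def read_file(path: str) -> str:
    raise FileNotFoundError(f"{path} does not exist")

failure = observe("read_file", read_file, path="notes.txt")
success = observe("echo", lambda text: text, text="hello")
```

Both strings can be appended to the history in the same way, so the next Think step sees the error text and can choose a different approach.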