Concept 01

What Are AI Agents? LLM + Loop + Tools

An LLM in a single call answers a question. An agent completes a multi-step task autonomously. The difference is the loop.

AGENT ARCHITECTURE
════════════════════════

User Task: "Research the top 3 competitors of Stripe and compare their pricing"
                    │
                    ▼
┌─────────────────────────────────────────┐
│               AGENT LOOP                │
│                                         │
│  ┌─────────┐     ┌──────────────────┐   │
│  │   LLM   │────▶│ DECIDE:          │   │
│  │(planner)│     │  Which tool?     │   │
│  └─────────┘     │  What arguments? │   │
│       ▲          └──────┬───────────┘   │
│       │                 │               │
│       │          ┌──────▼───────────┐   │
│       │          │ EXECUTE TOOL     │   │
│       │          │  web_search()    │   │
│  TOOL │          │  calculator()    │   │
│ RESULT│          │  code_runner()   │   │
│       └──────────│  database_query()│   │
│                  └──────────────────┘   │
│                                         │
│   Continue until task is complete       │
└─────────────────────────────────────────┘
                    │
                    ▼
   Final Answer with research and comparison

The loop is what makes agents powerful and dangerous. Each iteration: the LLM sees the current state (task + tool results so far), decides what to do next (use a tool or return final answer), executes, and repeats. This continues until the LLM decides it has enough information to answer.
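
Stripped to its essentials, that loop is only a few lines. In this sketch, decide and execute are placeholder callables standing in for the LLM call and tool dispatch that later concepts implement in full:

```python
def run_loop(task: str, decide, execute, max_steps: int = 10):
    """Minimal agent loop: decide -> act -> observe, until a final answer."""
    history = [("task", task)]
    for _ in range(max_steps):
        action = decide(history)            # in a real agent: an LLM call
        if action["type"] == "final":
            return action["answer"]
        result = execute(action["tool"], action["args"])  # run the tool
        history.append((action["tool"], result))          # observe the result
    return None  # step limit hit without an answer
```

Everything else in this chapter is detail layered onto this skeleton: how decide is phrased as a function-calling API request, and how execute maps tool names to real functions.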

Agents Are Non-Deterministic

Unlike regular API calls, agents can take variable numbers of steps, use tools in unexpected orders, and get into infinite loops. Always implement a maximum step limit and comprehensive logging. Never deploy an agent to production without guardrails.

Concept 02

Tool Definition — The JSON Schema the LLM Uses to Understand Tools

A "tool" (also called a "function") is defined as a JSON schema that tells the LLM what the function does, what arguments it takes, and what each argument means. The LLM uses this schema to decide when and how to call the tool.

from openai import OpenAI

client = OpenAI()

# Tool definitions — JSON schemas that describe your functions
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for current information. Use this when you need "
                           "up-to-date information that might not be in your training data.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query. Be specific for better results.",
                    },
                    "num_results": {
                        "type": "integer",
                        "description": "Number of results to return (1-10). Default: 5",
                        "default": 5,
                    }
                },
                "required": ["query"],
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Evaluate a mathematical expression. "
                           "Use for arithmetic, percentages, and financial calculations.",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "A valid Python math expression. E.g., '(150 + 200) * 1.18'",
                    }
                },
                "required": ["expression"],
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "Get the current date and time. Use when the task requires knowing today's date.",
            "parameters": {
                "type": "object",
                "properties": {
                    "timezone": {
                        "type": "string",
                        "description": "IANA timezone name. E.g., 'America/New_York', 'Asia/Kolkata'",
                        "default": "UTC"
                    }
                },
                "required": [],
            }
        }
    },
]

Concept 03

Function Calling with OpenAI — The API Mechanics

import json

def call_llm_with_tools(messages: list, tools: list | None = None) -> dict:
    """
    Make an LLM API call with tool support.
    Returns the full response object — handle tool_calls or content.
    """
    kwargs = {
        "model": "gpt-4o",  # Function calling requires gpt-4+ quality
        "messages": messages,
        "temperature": 0,
    }

    if tools:
        kwargs["tools"] = tools
        kwargs["tool_choice"] = "auto"  # Let the model decide whether to use tools
        # Other options: "none" (never use tools), {"type": "function", "function": {"name": "search_web"}}

    response = client.chat.completions.create(**kwargs)
    choice = response.choices[0]

    return {
        "message": choice.message,
        "finish_reason": choice.finish_reason,
        # finish_reason will be "tool_calls" if the model wants to use a tool
        # finish_reason will be "stop" if the model has a final answer
    }

# Example: Single tool call
messages = [
    {"role": "user", "content": "What is 15% tip on a $84.50 restaurant bill?"}
]

result = call_llm_with_tools(messages, TOOLS)
print("Finish reason:", result["finish_reason"])

if result["finish_reason"] == "tool_calls":
    for tool_call in result["message"].tool_calls:
        print("Tool requested:", tool_call.function.name)
        print("Arguments:", tool_call.function.arguments)

Concept 04

Tool Dispatch — The execute_tool Function

import datetime
import pytz

# Actual Python implementations of your tools
def search_web(query: str, num_results: int = 5) -> str:
    """
    Real web search. In production, use SerpAPI, Brave Search API, or Tavily.
    This is a mock for demonstration.
    """
    # In production:
    # import requests
    # response = requests.get(
    #     "https://api.tavily.com/search",
    #     json={"query": query, "max_results": num_results},
    #     headers={"Authorization": f"Bearer {os.environ['TAVILY_API_KEY']}"}
    # )
    # return json.dumps(response.json()["results"])

    # Mock response for demo
    return f"[Search results for '{query}']: Found {num_results} results about {query}."

def calculate(expression: str) -> str:
    """Safely evaluate a mathematical expression."""
    try:
        # eval with empty __builtins__ and a small allow-list reduces risk,
        # but it is NOT fully safe; prefer an AST-based evaluator in production
        allowed_names = {
            "abs": abs, "round": round, "min": min, "max": max,
            "sum": sum, "pow": pow,
        }
        result = eval(expression, {"__builtins__": {}}, allowed_names)
        return f"Result: {result}"
    except Exception as e:
        return f"Error evaluating expression: {e}"

def get_current_time(timezone: str = "UTC") -> str:
    """Get current time in specified timezone."""
    try:
        tz = pytz.timezone(timezone)
        now = datetime.datetime.now(tz)
        return now.strftime("%Y-%m-%d %H:%M:%S") + f" {timezone}"
    except Exception as e:
        return f"Error: {e}"

# The dispatcher — maps tool names to Python functions
TOOL_IMPLEMENTATIONS = {
    "search_web": search_web,
    "calculate": calculate,
    "get_current_time": get_current_time,
}

def execute_tool(tool_call) -> str:
    """
    Execute a tool call from the LLM.
    Returns the result as a string for inclusion in the next API call.
    """
    function_name = tool_call.function.name
    function_args = json.loads(tool_call.function.arguments)

    print(f"  [TOOL] Calling: {function_name}({function_args})")

    if function_name not in TOOL_IMPLEMENTATIONS:
        return f"Error: Tool '{function_name}' not found"

    try:
        result = TOOL_IMPLEMENTATIONS[function_name](**function_args)
        print(f"  [TOOL] Result: {str(result)[:100]}...")
        return str(result)
    except Exception as e:
        return f"Tool error: {e}"

Concept 05

The ReAct Loop — Reason + Act Until Done

ReAct (Reasoning + Acting) is the most common agent pattern. The LLM reasons about what to do, takes an action (tool call), observes the result, reasons again, and continues until the task is complete. Here's the complete loop implementation:

def run_agent(
    task: str,
    system_prompt: str | None = None,
    tools: list = TOOLS,
    max_steps: int = 10,  # CRITICAL: always limit steps
    verbose: bool = True,
) -> dict:
    """
    The ReAct agent loop.
    Runs until the model produces a final answer or hits max_steps.
    """
    if system_prompt is None:
        system_prompt = """You are a helpful AI assistant with access to tools.
Use tools when you need external information or computation.
Think step by step. When you have enough information to answer, respond directly."""

    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": task},
    ]

    steps = []
    total_tokens = 0

    for step in range(max_steps):
        if verbose:
            print(f"\n[Step {step + 1}/{max_steps}]")

        # Ask the LLM what to do next
        result = call_llm_with_tools(messages, tools)
        message = result["message"]
        finish_reason = result["finish_reason"]

        # Track the step
        step_info = {"step": step + 1, "finish_reason": finish_reason}

        if finish_reason == "stop":
            # LLM has a final answer — we're done
            final_answer = message.content
            step_info["type"] = "final_answer"
            step_info["answer"] = final_answer
            steps.append(step_info)

            if verbose:
                print(f"  [DONE] Final answer: {final_answer[:200]}...")

            return {
                "answer": final_answer,
                "steps": steps,
                "step_count": step + 1,
                "success": True,
            }

        elif finish_reason == "tool_calls":
            # LLM wants to use tools
            step_info["type"] = "tool_calls"
            step_info["tools"] = []

            # Add the assistant's tool call message to history
            messages.append({
                "role": "assistant",
                "content": message.content,
                "tool_calls": [
                    {
                        "id": tc.id,
                        "type": "function",
                        "function": {
                            "name": tc.function.name,
                            "arguments": tc.function.arguments,
                        }
                    }
                    for tc in message.tool_calls
                ]
            })

            # Execute each requested tool
            for tool_call in message.tool_calls:
                tool_result = execute_tool(tool_call)
                step_info["tools"].append({
                    "name": tool_call.function.name,
                    "result": tool_result[:200],
                })

                # Add tool result to conversation
                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": tool_result,
                })

            steps.append(step_info)

        else:
            # Unexpected finish reason
            steps.append({"step": step + 1, "type": "unexpected", "finish_reason": finish_reason})
            break

    # Hit max_steps without finishing
    return {
        "answer": "Agent reached maximum steps without completing the task.",
        "steps": steps,
        "step_count": max_steps,
        "success": False,
    }

Concept 06

Building a Research Agent Step by Step

def build_research_agent():
    """
    A research agent that can search the web, do calculations, and synthesize findings.
    """
    research_tools = [
        TOOLS[0],  # search_web
        TOOLS[1],  # calculate
        {
            "type": "function",
            "function": {
                "name": "summarize_findings",
                "description": "When you have gathered all necessary information, "
                               "call this to compile your research into a final report.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "title": {"type": "string", "description": "Report title"},
                        "findings": {
                            "type": "array",
                            "items": {"type": "string"},
                            "description": "List of key findings"
                        },
                        "conclusion": {"type": "string", "description": "Summary conclusion"}
                    },
                    "required": ["title", "findings", "conclusion"]
                }
            }
        }
    ]

    system_prompt = """You are a professional researcher. For any research task:
1. Search for information on the specific topics
2. Use 2-3 targeted searches to gather comprehensive information
3. Do any required calculations
4. Compile your findings into a structured report using summarize_findings
Be thorough but efficient — don't search for the same thing twice."""

    return lambda task: run_agent(task, system_prompt, research_tools, max_steps=8)


# Run the research agent
agent = build_research_agent()

result = agent(
    "Research the pricing models of Stripe, Braintree, and Square for payment processing. "
    "Calculate the fee for processing $10,000/month in transactions on each platform."
)

print("\n" + "="*60)
print("RESEARCH COMPLETE")
print("="*60)
print(result["answer"])
print(f"\nCompleted in {result['step_count']} steps")

Concept 07

Agent Failure Modes and How to Prevent Them

Failure 1: Infinite Loops

What happens: The agent gets stuck calling the same tool repeatedly because each result doesn't satisfy its reasoning.
Fix: Always enforce max_steps. Log all tool calls and add a check: if the same tool is called with the same arguments 3 times, break the loop and return what you have.
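
The repeated-call check can be a small closure threaded into the loop; make_loop_guard here is a sketch, not part of run_agent above:

```python
from collections import Counter

def make_loop_guard(max_repeats: int = 3):
    """Return a closure that counts (tool, args) pairs and flags repeats."""
    seen = Counter()

    def repeating(tool_name: str, arguments: str) -> bool:
        seen[(tool_name, arguments)] += 1
        return seen[(tool_name, arguments)] >= max_repeats

    return repeating
```

In run_agent, call repeating(tool_call.function.name, tool_call.function.arguments) for each tool call and break out of the loop with a partial answer when it returns True.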

Failure 2: Hallucinated Tool Calls

What happens: The model calls a tool that doesn't exist, or calls a real tool with made-up arguments that look plausible.
Fix: Validate tool names against your registry before dispatching. Return a clear error message (not a crash) when validation fails. The model usually corrects itself on the next step.

Failure 3: Wrong Tool Selection

What happens: The agent picks the wrong tool for the task, e.g. reaching for calculate when it actually needs a web search.
Fix: Write extremely precise tool descriptions. Include examples of when to use each tool. Add an explicit note: "Use search_web for factual/current info. Use calculate only for math expressions."

Failure 4: Ignoring Tool Results

What happens: The agent calls a tool, gets the result, then generates an answer that contradicts or ignores the result.
Fix: Add to your system prompt: "You MUST incorporate tool results into your reasoning. Never contradict information returned by a tool."

CP-07 Summary
  • An agent is LLM + loop + tools: it iterates until the task is done
  • Tools are defined as JSON schemas; the LLM decides when and how to call them
  • The ReAct loop: Reason → Act → Observe → Repeat until done
  • Always set max_steps — agents can and do loop forever without a limit
  • Tool dispatch maps LLM function names to Python functions
  • Log every tool call and result for debugging — agent behavior is hard to predict