AI Agents Fundamentals: Build Your First Agent from Scratch
Master AI agents from the ground up. Learn the agent loop, build a working agent in pure Python, and understand the foundations that power LangGraph and CrewAI.
Moshiour Rahman
AI Agents Mastery Series
This is Part 1 of our comprehensive AI Agents series. By the end of this series, you’ll go from zero to building production-ready multi-agent systems.
| Part | Topic | Level |
|---|---|---|
| 1 | Fundamentals - Build from Scratch | Beginner |
| 2 | LangGraph Deep Dive | Intermediate |
| 3 | Local LLMs with Ollama | Intermediate |
| 4 | Tool-Using Agents | Intermediate |
| 5 | Multi-Agent Systems | Advanced |
| 6 | Production Deployment | Advanced |
What Are AI Agents?
AI agents are autonomous programs that use Large Language Models (LLMs) to perceive, reason, and act to accomplish goals. Unlike simple chatbots that just respond, agents can:
- Break complex tasks into steps
- Use external tools (search, code, APIs)
- Remember context across interactions
- Make decisions and course-correct
The Agent Loop
Every AI agent follows the same fundamental loop:
```
┌─────────────────────────────────────────┐
│                                         │
│  ┌──────────┐   ┌───────┐   ┌─────┐     │
│  │ PERCEIVE │-->│ THINK │-->│ ACT │     │
│  └──────────┘   └───────┘   └─────┘     │
│       ^                        │        │
│       └────────────────────────┘        │
│                                         │
└─────────────────────────────────────────┘
```
| Phase | What Happens | Example |
|---|---|---|
| Perceive | Agent receives input + context | User asks “What’s the weather in Tokyo?” |
| Think | LLM reasons about what to do | “I need to call a weather API” |
| Act | Execute action or respond | Calls weather tool, returns result |
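The table above can be sketched as a few lines of plain Python. This is a toy illustration, not the implementation we build later: `fake_llm` and `weather_tool` are hypothetical stand-ins for a real model call and a real API.

```python
# Toy perceive -> think -> act loop; `fake_llm` stands in for a real LLM call.
def fake_llm(context: list) -> dict:
    """Pretend to reason: call the weather tool once, then answer."""
    if any("OBSERVATION" in m for m in context):
        return {"type": "answer", "text": "It is 72F and partly cloudy in Tokyo."}
    return {"type": "action", "tool": "weather_tool", "arg": "Tokyo"}

def weather_tool(city: str) -> str:
    return f"72F, partly cloudy in {city}"   # canned result for the demo

def agent_loop(goal: str, max_iterations: int = 5) -> str:
    context = [f"GOAL: {goal}"]              # perceive: input + accumulated context
    for _ in range(max_iterations):
        decision = fake_llm(context)         # think: decide the next step
        if decision["type"] == "action":     # act: run a tool, record what we saw
            result = weather_tool(decision["arg"])
            context.append(f"OBSERVATION: {result}")
        else:
            return decision["text"]          # act: respond and stop
    return "Max iterations reached."

print(agent_loop("What's the weather in Tokyo?"))
```

The loop terminates either when the "LLM" decides to answer or when the iteration budget runs out, which is exactly the shape of the real agent we build below.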
Why Agents Matter in 2025
The numbers tell the story:
| Statistic | Source |
|---|---|
| 25% of YC W25 startups have 95% AI-generated code | Y Combinator |
| 70,000+ new AI projects on GitHub in 2024 | GitHub Octoverse |
| AI mentioned in 25% of job postings | HN Hiring Trends |
Agents are the next evolution—they don’t just generate code, they execute workflows autonomously.
Build Your First Agent (Pure Python)
Let’s build a working agent from scratch—no frameworks, just Python and an LLM API. This teaches you what’s happening under the hood.
Project Setup
```bash
mkdir ai-agent-scratch && cd ai-agent-scratch
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install openai python-dotenv
```
Create .env:
```
OPENAI_API_KEY=your-key-here
```
The Simplest Possible Agent
```python
# simple_agent.py
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()
client = OpenAI()

def simple_agent(goal: str) -> str:
    """The simplest possible AI agent - just thinks and responds."""
    messages = [
        {
            "role": "system",
            "content": """You are a helpful AI agent.
Think step-by-step about how to accomplish the user's goal.
Explain your reasoning, then provide your answer."""
        },
        {"role": "user", "content": goal}
    ]
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        temperature=0.7
    )
    return response.choices[0].message.content

# Test it
if __name__ == "__main__":
    result = simple_agent("Explain quantum computing in simple terms")
    print(result)
```
This works, but it’s just a chatbot. Real agents need tools.
Adding Tools: The ReAct Pattern
The ReAct (Reasoning + Acting) pattern is how agents decide when to think vs. when to use tools.
```
Thought: I need to find the current weather
Action: weather_tool("Tokyo")
Observation: 72°F, partly cloudy
Thought: Now I have the information
Action: respond("The weather in Tokyo is 72°F and partly cloudy")
```
Building a Tool-Using Agent
```python
# react_agent.py
import json
import re
from datetime import datetime

from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()
client = OpenAI()

# Define our tools
TOOLS = {
    "get_current_time": {
        "description": "Get the current date and time",
        "parameters": []
    },
    "calculate": {
        "description": "Perform mathematical calculations",
        "parameters": ["expression"]
    },
    "search_knowledge": {
        "description": "Search for information about a topic",
        "parameters": ["query"]
    }
}

def get_current_time() -> str:
    """Returns current date and time."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

def calculate(expression: str) -> str:
    """Safely evaluate math expressions."""
    try:
        # Only allow safe math operations
        allowed = set('0123456789+-*/.() ')
        if all(c in allowed for c in expression):
            return str(eval(expression))
        return "Error: Invalid expression"
    except Exception as e:
        return f"Error: {str(e)}"

def search_knowledge(query: str) -> str:
    """Simulated knowledge search (replace with real API)."""
    knowledge_base = {
        "python": "Python is a high-level programming language known for readability.",
        "ai agents": "AI agents are autonomous programs that use LLMs to accomplish goals.",
        "langgraph": "LangGraph is a framework for building stateful AI agents as graphs.",
    }
    query_lower = query.lower()
    for key, value in knowledge_base.items():
        if key in query_lower:
            return value
    return f"No specific information found for: {query}"

def execute_tool(tool_name: str, args: dict) -> str:
    """Execute a tool and return the result."""
    if tool_name == "get_current_time":
        return get_current_time()
    elif tool_name == "calculate":
        return calculate(args.get("expression", ""))
    elif tool_name == "search_knowledge":
        return search_knowledge(args.get("query", ""))
    return f"Unknown tool: {tool_name}"

def create_system_prompt() -> str:
    """Create the system prompt with tool descriptions."""
    tools_desc = "\n".join([
        f"- {name}: {info['description']} (params: {info['parameters']})"
        for name, info in TOOLS.items()
    ])
    return f"""You are an AI agent that can use tools to accomplish tasks.

Available tools:
{tools_desc}

When you need to use a tool, respond EXACTLY in this format:
THOUGHT: [your reasoning]
ACTION: [tool_name]
PARAMS: {{"param_name": "value"}}

When you have the final answer, respond with:
THOUGHT: [your reasoning]
ANSWER: [your final response to the user]

Always think step-by-step. Use tools when needed."""

def parse_agent_response(response: str) -> dict:
    """Parse the agent's response to extract thoughts, actions, or answers."""
    result = {"type": None, "thought": None, "action": None, "params": None, "answer": None}

    # Extract thought
    thought_match = re.search(r'THOUGHT:\s*(.+?)(?=ACTION:|ANSWER:|$)', response, re.DOTALL)
    if thought_match:
        result["thought"] = thought_match.group(1).strip()

    # Check for action
    action_match = re.search(r'ACTION:\s*(\w+)', response)
    if action_match:
        result["type"] = "action"
        result["action"] = action_match.group(1)

    # Extract params
    params_match = re.search(r'PARAMS:\s*({.+?})', response, re.DOTALL)
    if params_match:
        try:
            result["params"] = json.loads(params_match.group(1))
        except json.JSONDecodeError:
            result["params"] = {}

    # Check for answer
    answer_match = re.search(r'ANSWER:\s*(.+?)$', response, re.DOTALL)
    if answer_match:
        result["type"] = "answer"
        result["answer"] = answer_match.group(1).strip()

    return result

class ReActAgent:
    """A ReAct-style AI agent that can use tools."""

    def __init__(self, max_iterations: int = 5):
        self.max_iterations = max_iterations
        self.conversation_history = []

    def run(self, goal: str) -> str:
        """Run the agent to accomplish a goal."""
        print(f"\n{'='*60}")
        print(f"GOAL: {goal}")
        print('='*60)

        # Initialize conversation
        self.conversation_history = [
            {"role": "system", "content": create_system_prompt()},
            {"role": "user", "content": f"Goal: {goal}"}
        ]

        for iteration in range(self.max_iterations):
            print(f"\n--- Iteration {iteration + 1} ---")

            # Get agent response
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=self.conversation_history,
                temperature=0
            )
            agent_message = response.choices[0].message.content
            self.conversation_history.append({"role": "assistant", "content": agent_message})

            # Parse the response
            parsed = parse_agent_response(agent_message)
            if parsed["thought"]:
                print(f"THOUGHT: {parsed['thought']}")

            # If agent wants to take an action
            if parsed["type"] == "action":
                print(f"ACTION: {parsed['action']}")
                print(f"PARAMS: {parsed['params']}")

                # Execute the tool
                observation = execute_tool(parsed["action"], parsed["params"] or {})
                print(f"OBSERVATION: {observation}")

                # Add observation to conversation
                self.conversation_history.append({
                    "role": "user",
                    "content": f"OBSERVATION: {observation}"
                })

            # If agent has final answer
            elif parsed["type"] == "answer":
                print(f"\nFINAL ANSWER: {parsed['answer']}")
                return parsed["answer"]

        return "Max iterations reached without finding an answer."

# Test the agent
if __name__ == "__main__":
    agent = ReActAgent(max_iterations=5)

    # Test 1: Math calculation
    agent.run("What is 15% of 250, and what time is it?")

    # Test 2: Knowledge search
    agent.run("What is LangGraph and why should I learn it?")
```
Running the Agent
```bash
python react_agent.py
```
Expected output:
```
============================================================
GOAL: What is 15% of 250, and what time is it?
============================================================

--- Iteration 1 ---
THOUGHT: I need to calculate 15% of 250 and get the current time.
ACTION: calculate
PARAMS: {"expression": "250 * 0.15"}
OBSERVATION: 37.5

--- Iteration 2 ---
THOUGHT: Now I need to get the current time.
ACTION: get_current_time
PARAMS: {}
OBSERVATION: 2024-12-03 14:32:15

--- Iteration 3 ---
THOUGHT: I have both pieces of information now.
ANSWER: 15% of 250 is 37.5, and the current time is 2024-12-03 14:32:15

FINAL ANSWER: 15% of 250 is 37.5, and the current time is 2024-12-03 14:32:15
```
Understanding What We Built
Let’s break down the key components:
1. Tool Registry
```python
TOOLS = {
    "tool_name": {
        "description": "What it does",
        "parameters": ["param1", "param2"]
    }
}
```
Tools are functions the agent can call. The descriptions help the LLM understand when to use each tool.
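A common variation on the if/elif dispatch in `execute_tool` is to store the callables in the registry itself. The sketch below is an alternative design, not the article's implementation; `TOOL_FUNCS` and the trimmed-down tool bodies are illustrative.

```python
# Registry mapping tool names directly to callables, so dispatch needs no if/elif chain.
from datetime import datetime

def get_current_time() -> str:
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

def calculate(expression: str) -> str:
    # Same character allow-list as the article's version
    allowed = set("0123456789+-*/.() ")
    return str(eval(expression)) if all(c in allowed for c in expression) else "Error"

TOOL_FUNCS = {
    "get_current_time": get_current_time,
    "calculate": calculate,
}

def execute_tool(tool_name: str, args: dict) -> str:
    func = TOOL_FUNCS.get(tool_name)
    if func is None:
        return f"Unknown tool: {tool_name}"
    return func(**args)   # keys in PARAMS must match the function's parameter names

print(execute_tool("calculate", {"expression": "250 * 0.15"}))  # -> 37.5
```

The upside is that adding a tool is a one-line registry change; the downside is that a malformed PARAMS dict raises a `TypeError`, so a production version would validate `args` against the function signature first.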
2. The System Prompt
The prompt teaches the agent:
- What tools are available
- How to format tool calls
- When to provide final answers
3. The Agent Loop
```python
for iteration in range(max_iterations):
    response = llm.generate(conversation)
    parsed = parse_response(response)
    if parsed.is_action:
        observation = execute_tool(parsed.action)
        conversation.add(observation)
    elif parsed.is_answer:
        return parsed.answer
```
This is the Perceive → Think → Act loop in code.
Agent Memory: Short-Term vs Long-Term
Our agent already has short-term memory via conversation_history. Let’s add long-term memory:
```python
# memory_agent.py
import json
from datetime import datetime
from pathlib import Path

from react_agent import ReActAgent

class AgentMemory:
    """Persistent memory for AI agents."""

    def __init__(self, memory_file: str = "agent_memory.json"):
        self.memory_file = Path(memory_file)
        self.memories = self._load_memories()

    def _load_memories(self) -> dict:
        """Load memories from disk."""
        if self.memory_file.exists():
            return json.loads(self.memory_file.read_text())
        return {"facts": [], "conversations": []}

    def _save_memories(self):
        """Save memories to disk."""
        self.memory_file.write_text(json.dumps(self.memories, indent=2))

    def remember_fact(self, fact: str):
        """Store a fact for later recall."""
        if fact not in self.memories["facts"]:
            self.memories["facts"].append(fact)
            self._save_memories()

    def recall_facts(self, query: str, limit: int = 5) -> list:
        """Recall relevant facts (simple keyword matching)."""
        query_words = set(query.lower().split())
        scored_facts = []
        for fact in self.memories["facts"]:
            fact_words = set(fact.lower().split())
            overlap = len(query_words & fact_words)
            if overlap > 0:
                scored_facts.append((overlap, fact))
        scored_facts.sort(reverse=True)
        return [fact for _, fact in scored_facts[:limit]]

    def save_conversation(self, goal: str, result: str):
        """Save a conversation summary."""
        self.memories["conversations"].append({
            "goal": goal,
            "result": result,
            "timestamp": datetime.now().isoformat()
        })
        self._save_memories()

# Integrate with our agent
class MemoryReActAgent(ReActAgent):
    """ReAct agent with persistent memory."""

    def __init__(self, max_iterations: int = 5):
        super().__init__(max_iterations)
        self.memory = AgentMemory()

    def run(self, goal: str) -> str:
        # Recall relevant memories and append them to the goal
        relevant_facts = self.memory.recall_facts(goal)
        augmented_goal = goal
        if relevant_facts:
            memory_context = "\n\nRelevant memories:\n" + "\n".join(f"- {f}" for f in relevant_facts)
            augmented_goal = goal + memory_context
        # Run the agent
        result = super().run(augmented_goal)
        # Save the original goal (not the augmented one) for future sessions
        self.memory.save_conversation(goal, result)
        return result
```
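The keyword-overlap scoring in `recall_facts` is crude but easy to reason about: each stored fact is ranked by how many words it shares with the query. Here is the same idea as a standalone sketch (the `recall` function and the sample facts are illustrative, not part of the article's files):

```python
# Standalone demo of the keyword-overlap scoring used by recall_facts.
def recall(facts: list, query: str, limit: int = 5) -> list:
    query_words = set(query.lower().split())
    scored = []
    for fact in facts:
        overlap = len(query_words & set(fact.lower().split()))
        if overlap > 0:
            scored.append((overlap, fact))
    scored.sort(reverse=True)                 # highest word overlap first
    return [fact for _, fact in scored[:limit]]

facts = [
    "The user prefers answers in metric units",
    "The user is learning LangGraph",
    "LangGraph models agents as state graphs",
]
print(recall(facts, "tell me about LangGraph agents"))
```

A real system would typically replace word overlap with embedding similarity so that "weather" can match "forecast", but the retrieve-then-inject pattern stays the same.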
Key Concepts Summary
| Concept | Description | Our Implementation |
|---|---|---|
| Agent Loop | Perceive → Think → Act cycle | for iteration in range(max_iterations) |
| Tools | Functions the agent can call | TOOLS dictionary + execute_tool() |
| ReAct | Reasoning + Acting pattern | Parse THOUGHT/ACTION/ANSWER |
| Short-term Memory | Current conversation context | conversation_history list |
| Long-term Memory | Persistent knowledge storage | AgentMemory class |
Common Pitfalls
| Problem | Cause | Solution |
|---|---|---|
| Agent loops forever | No termination condition | Set max_iterations |
| Tool calls fail | Bad parameter parsing | Validate JSON strictly |
| Hallucinated tools | LLM invents fake tools | List tools explicitly in prompt |
| Context overflow | Too much history | Summarize old messages |
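For the context-overflow pitfall, the simplest mitigation is to cap the history: keep the system prompt plus the most recent turns. The sketch below shows the truncation half of the fix (`trim_history` is a hypothetical helper; a fuller solution would summarize the dropped turns with another LLM call rather than discard them):

```python
# Keep the system prompt plus the last `keep` messages; drop (or summarize) the rest.
def trim_history(messages: list, keep: int = 10) -> list:
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep:]

history = [{"role": "system", "content": "You are an agent."}]
history += [{"role": "user", "content": f"msg {i}"} for i in range(50)]

trimmed = trim_history(history, keep=10)
print(len(trimmed))           # 11: the system prompt plus the last 10 messages
print(trimmed[1]["content"])  # msg 40
```

In the `ReActAgent` loop you would call this on `self.conversation_history` before each model request, so the context stays bounded no matter how many tool calls the agent makes.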
What’s Next?
In Part 2, we’ll upgrade from our custom implementation to LangGraph, a widely used framework for building production AI agents. You’ll learn:
- State machines for complex workflows
- Built-in tool handling
- Conditional routing
- Checkpointing and persistence
Continue to Part 2: LangGraph Deep Dive →
Full Code Repository
All code from this series is available on GitHub:
```bash
git clone https://github.com/Moshiour027/ai-agents-mastery.git
cd ai-agents-mastery/01-fundamentals
pip install -r requirements.txt
python react_agent.py
```
Summary
| Topic | Key Takeaway |
|---|---|
| What are agents | Autonomous LLM programs that perceive, think, and act |
| Agent loop | Continuous cycle until goal is achieved |
| Tools | External functions that extend agent capabilities |
| ReAct pattern | Structured thinking before acting |
| Memory | Short-term (conversation) + long-term (persistent) |
You’ve built a working AI agent from scratch. This foundation will help you understand frameworks like LangGraph and CrewAI at a deeper level—you’ll know exactly what’s happening under the hood.