Build an AI Agent from Scratch with Node.js and Ollama on Ubuntu

The term “AI agent” sounds complicated. Marketing pages make it sound like magic, and frameworks bury the idea under layers of abstraction. The truth is far simpler, and once you see it written out in code you will never be confused by the term again.

An AI agent is just a language model running in a loop. You give it a task and a set of tools. The model decides which tool to call, your code runs that tool and feeds the result back, and the loop repeats until the model decides the task is finished. That is the whole idea. No framework required.

In this tutorial you will build a working agent from scratch in plain Node.js, using Ollama to run the model locally. The agent accepts a task from the command line, has access to real tools that read and list files on your machine, and keeps working until it produces a final answer. By the end you will understand exactly what happens inside every agent framework, because you will have written the core loop yourself.

This tutorial is for developers who can read JavaScript and have used a terminal, but who have never built an agent before. No machine learning background is needed.

What an Agent Really Is

A plain LLM call is a single shot. You send text, you get text back, and that is the end of the conversation turn. The model cannot look anything up, run code, or check a file. It only knows what was in its training data and what you typed.

An agent adds two things on top of that single call:

Tools. These are normal functions in your code that the model is allowed to ask for. A tool might read a file, query a database, call an API, or run a shell command. You describe each tool to the model so it knows what the tool does and what arguments it takes.

A loop. Instead of calling the model once, you call it repeatedly. On each turn the model can either answer the user directly or ask to use a tool. If it asks for a tool, your code runs the tool, appends the result to the conversation, and calls the model again. When the model finally responds with plain text instead of a tool request, the loop ends.

Here is the entire flow in plain language:

1. Send the user’s task, plus the list of available tools, to the model.
2. The model replies. Either it gives a final answer, or it requests one or more tool calls.
3. If it requested tools, run them, add the results to the conversation, and go back to step 1, this time sending the whole conversation so far.
4. If it gave a final answer, print it and stop.
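
The four steps above can be sketched without any real model. In this minimal sketch, fakeModel, demoTools, and runLoop are hypothetical names invented for illustration: the "model" is a scripted function that requests one tool and then answers once it sees a tool result. Only the control flow is meant to carry over to the real agent.

```javascript
// A scripted stand-in for the model (hypothetical): it requests one tool,
// then answers once a tool result is present in the conversation.
function fakeModel(messages) {
  const toolResult = messages.find((m) => m.role === "tool");
  if (!toolResult) {
    return {
      role: "assistant",
      content: "",
      tool_calls: [{ function: { name: "get_time", arguments: {} } }],
    };
  }
  return { role: "assistant", content: `The time is ${toolResult.content}.` };
}

// A single hypothetical tool with a fixed return value.
const demoTools = {
  get_time: () => "12:00",
};

function runLoop(task, model, tools, maxSteps = 5) {
  const messages = [{ role: "user", content: task }];
  for (let step = 0; step < maxSteps; step++) {
    const reply = model(messages); // steps 1 and 2: send, then read the reply
    messages.push(reply);
    if (!reply.tool_calls || reply.tool_calls.length === 0) {
      return reply.content; // step 4: a final answer ends the loop
    }
    for (const call of reply.tool_calls) {
      // step 3: run each requested tool and record its result
      const result = tools[call.function.name](call.function.arguments);
      messages.push({ role: "tool", content: String(result) });
    }
  }
  return "Stopped: too many steps.";
}

console.log(runLoop("What time is it?", fakeModel, demoTools));
// prints: The time is 12:00.
```

Swapping fakeModel for a real model call is the only structural change the rest of the tutorial makes to this loop.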

That repeating decision is what makes the behavior feel intelligent. The model is not following a script you wrote. It is choosing, turn by turn, what to do next based on what it has learned so far. If a RAG pipeline is about feeding the model the right context once, an agent is about letting the model gather its own context over several steps. If you have not seen the single-shot retrieval pattern yet, the end-to-end RAG pipeline guide is a good companion to this one.

How Tool Calling Works Under the Hood

You might wonder how a text model can “call” a function. It cannot, really. What happens is a structured agreement between you and the model.

When you send a request to Ollama, you include a list of tool definitions. Each definition is a JSON description: the tool name, what it does, and the arguments it accepts. The model, which has been trained to recognize these definitions, can respond with a special structured output that says “I want to call the tool read_file with the argument filename set to notes.txt.”

Your code receives that structured response, matches the name to a real JavaScript function, runs it, and sends the return value back to the model as a new message. The model never touches your filesystem directly. It only emits a request, and your code decides whether and how to honor it. This separation is important, and we will return to it in the best practices section.
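
To make the agreement concrete, here is a rough sketch of what such a structured reply looks like and how code might match it to a function. The object shows only the fields this tutorial relies on (tool_calls, function.name, function.arguments); real replies carry more. The registry and dispatch names are hypothetical, and the read_file stand-in returns a placeholder instead of touching the disk.

```javascript
// Roughly what an assistant message with a tool request looks like
// (only the fields used in this tutorial are shown).
const exampleReply = {
  role: "assistant",
  content: "",
  tool_calls: [
    { function: { name: "read_file", arguments: { filename: "notes.txt" } } },
  ],
};

// Hypothetical registry mapping tool names to ordinary functions.
const registry = {
  read_file: ({ filename }) => `(contents of ${filename})`,
};

// The model only names a tool; this code decides whether to run anything.
function dispatch(call, tools) {
  const name = call.function.name;
  // Object.hasOwn ignores inherited names such as "toString".
  if (!Object.hasOwn(tools, name) || typeof tools[name] !== "function") {
    return `Unknown tool: ${name}`;
  }
  return tools[name](call.function.arguments);
}

for (const call of exampleReply.tool_calls) {
  console.log(dispatch(call, registry)); // prints: (contents of notes.txt)
}
```

Because the lookup happens in your code, a tool name the model invents simply produces an "Unknown tool" message instead of running anything.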

Prerequisites

Ubuntu 20.04, 22.04, or 24.04
Node.js 18 or newer (check with node --version)
Ollama installed and running

If you do not have Node.js yet:

curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y nodejs

If you do not have Ollama yet:

curl -fsSL https://ollama.com/install.sh | sh

For agents you want a model that was trained for tool calling. Not every model supports it well. llama3.1 is a reliable choice and runs comfortably on a machine with 8 GB of RAM. Pull it now:

ollama pull llama3.1

Confirm Ollama is serving on its default port:

curl http://localhost:11434/api/tags

You should see a JSON list that includes llama3.1. If the request hangs or the connection is refused, start the service with sudo systemctl start ollama and try again.

Step 1: Set Up the Project

Create a project folder and initialize it. We use the official ollama package for Node.js, which gives us a clean way to send tool definitions and read tool calls back.

mkdir ~/ollama-agent && cd ~/ollama-agent
npm init -y
npm install ollama

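Open package.json and add a "type" field next to the fields npm generated, so we can use modern import syntax. Do not replace the whole file; only the new field is shown here. Running npm pkg set type=module has the same effect.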

{
  "type": "module"
}

Create a sandbox folder with a sample file so the agent has something real to work with later:

mkdir workspace
echo "The deployment runs on three nodes behind an Nginx load balancer." > workspace/notes.txt
echo "Backups happen nightly at 02:00 using restic." >> workspace/notes.txt

Step 2: Define the Tools

Tools are two things working together: a JSON description that the model reads, and a JavaScript function that actually does the work. Keep them in one file so they never drift apart.

Create tools.js:

import fs from "fs/promises";
import path from "path";

// All file access is locked to this folder for safety.
const ROOT = path.resolve("./workspace");

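function safePath(userPath) {
  const resolved = path.resolve(ROOT, userPath || ".");
  // Compare against ROOT plus a separator so that a sibling folder such as
  // "workspace-old" cannot slip past a plain prefix check.
  if (resolved !== ROOT && !resolved.startsWith(ROOT + path.sep)) {
    throw new Error("Access outside the workspace is not allowed");
  }
  return resolved;
}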

// The actual implementations.
export const toolFunctions = {
  async list_files({ directory }) {
    const target = safePath(directory);
    const entries = await fs.readdir(target, { withFileTypes: true });
    return entries
      .map((e) => (e.isDirectory() ? e.name + "/" : e.name))
      .join("\n");
  },

  async read_file({ filename }) {
    const target = safePath(filename);
    return await fs.readFile(target, "utf-8");
  },
};

// The descriptions the model reads to decide what to call.
export const toolSchemas = [
  {
    type: "function",
    function: {
      name: "list_files",
      description: "List the files and folders inside a directory.",
      parameters: {
        type: "object",
        properties: {
          directory: {
            type: "string",
            description: "Directory to list. Defaults to the workspace root.",
          },
        },
        required: [],
      },
    },
  },
  {
    type: "function",
    function: {
      name: "read_file",
      description: "Read the full text content of a file.",
      parameters: {
        type: "object",
        properties: {
          filename: {
            type: "string",
            description: "Name of the file to read, relative to the workspace.",
          },
        },
        required: ["filename"],
      },
    },
  },
];

Two details matter here. First, every path goes through safePath, which refuses anything outside the workspace folder. The model can request any path it wants, but your code is the gatekeeper. Second, the description fields are written for the model to read. Clear descriptions are what let the model pick the right tool, so treat them as part of your prompt, not as throwaway comments.
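
Prefix-based containment checks like the one in safePath have an edge case worth testing explicitly: a sibling folder whose name merely starts with the workspace name. The sketch below uses hypothetical paths and helper names (naiveInside, strictInside) and assumes the paths are absolute, already resolved, and POSIX-style, so it needs no imports.

```javascript
// Paths here are assumed to be absolute, already resolved, and POSIX-style.
const ROOT_DIR = "/home/user/ollama-agent/workspace"; // hypothetical location

// Naive containment check: a plain string prefix test.
function naiveInside(root, candidate) {
  return candidate.startsWith(root);
}

// Stricter check: the candidate is the root itself or sits below "root/".
function strictInside(root, candidate) {
  return candidate === root || candidate.startsWith(root + "/");
}

const sibling = "/home/user/ollama-agent/workspace-backup/secret.txt";
const inside = "/home/user/ollama-agent/workspace/notes.txt";

console.log(naiveInside(ROOT_DIR, sibling)); // true (wrongly accepted)
console.log(strictInside(ROOT_DIR, sibling)); // false
console.log(strictInside(ROOT_DIR, inside)); // true
```

In real code the separator should come from path.sep rather than a hard-coded slash. Neither check follows symbolic links, so a link inside the workspace that points elsewhere would still pass; resolving the path with fs.realpath before comparing is one way to close that gap.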

Step 3: Write the Agent Loop

This is the heart of the tutorial. Create agent.js:

import ollama from "ollama";
import { toolFunctions, toolSchemas } from "./tools.js";

const MODEL = "llama3.1";
const MAX_STEPS = 10;

async function runAgent(task) {
  const messages = [
    {
      role: "system",
      content:
        "You are a helpful assistant with access to tools that read files. " +
        "Use the tools to gather information before answering. " +
        "When you have enough information, give a clear final answer.",
    },
    { role: "user", content: task },
  ];

  for (let step = 1; step <= MAX_STEPS; step++) {
    const response = await ollama.chat({
      model: MODEL,
      messages,
      tools: toolSchemas,
    });

    const message = response.message;
    messages.push(message); // remember what the model said

    // No tool calls means the model is done.
    if (!message.tool_calls || message.tool_calls.length === 0) {
      return message.content;
    }

    // Run every tool the model asked for, in order.
    for (const call of message.tool_calls) {
      const name = call.function.name;
      const args = call.function.arguments;
      console.log(`  [tool] ${name}(${JSON.stringify(args)})`);

      let result;
      try {
        const fn = toolFunctions[name];
        result = fn ? await fn(args) : `Unknown tool: ${name}`;
      } catch (err) {
        result = `Error: ${err.message}`;
      }

      // Feed the tool result back into the conversation.
      messages.push({ role: "tool", content: String(result) });
    }
  }

  return "Stopped: reached the maximum number of steps.";
}

// Accept the task from the command line.
const task = process.argv.slice(2).join(" ");
if (!task) {
  console.error('Usage: node agent.js "your task here"');
  process.exit(1);
}

console.log(`Task: ${task}\n`);
const answer = await runAgent(task);
console.log(`\nAnswer:\n${answer}`);

Read the loop carefully, because at its core nearly every agent framework is a more elaborate version of these few dozen lines.

The messages array is the agent’s memory. It starts with a system prompt that explains the agent’s job, followed by the user’s task. On each turn we call ollama.chat with the whole history and the tool schemas. We push the model’s reply onto the history no matter what, so it remembers its own decisions.

The single most important line is the check for tool_calls. If the model replied with plain content and no tool requests, the task is finished and we return the answer. If it requested tools, we run each one, capture the output or the error, and append it to the history as a tool message. Then the loop continues, the model sees the new tool results, and it decides what to do next.

The MAX_STEPS guard prevents an infinite loop. A confused model could otherwise keep calling tools forever, so we cap the number of turns. This is a safety net you should never remove.

Step 4: Run the Agent

Give the agent a task that forces it to use its tools:

node agent.js "What does the notes file say about backups?"

You will see output similar to this:

Task: What does the notes file say about backups?

  [tool] list_files({})
  [tool] read_file({"filename":"notes.txt"})

Answer:
The notes file says that backups happen nightly at 02:00 using restic.

Watch what the model did. You never told it the filename. It first called list_files to discover what was available, saw notes.txt, then called read_file to read it, and only then answered. Each of those was a separate trip through the loop. That sequence of decisions, made by the model and not by your code, is the agent behavior everyone talks about.

Try a task that needs more reasoning:

node agent.js "Summarize everything you can learn about the infrastructure from the files in the workspace."

The agent will list the directory, read each file it finds, and then write a summary that combines them. Add more files to workspace and run it again. The code does not change. The agent simply does more work.

Common Mistakes and Troubleshooting

The model answers without calling any tools. Smaller or older models sometimes ignore tools and guess instead. Make sure you pulled a tool-capable model like llama3.1, and strengthen the system prompt with an explicit instruction such as “Do not guess. Always read the relevant file before answering.” If a model still refuses to use tools reliably, switch models. Tool support varies a lot.

message.tool_calls is undefined. This is normal and expected on the final turn, when the model gives its answer. Your loop must treat a missing tool_calls as the signal to stop, which the code above already does. Only treat it as a bug if it happens on the very first turn for a task that clearly needs a tool.

The arguments come back as a string instead of an object. Ollama’s native /api/chat endpoint, which the ollama Node package uses, returns tool arguments as a real object. If you call Ollama through its OpenAI-compatible endpoint (/v1/chat/completions) instead, the arguments arrive as a JSON string, following the OpenAI format, and you must parse them yourself with JSON.parse. Stick with the package to avoid this.
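
If tool calls might reach your loop from more than one client or endpoint, one option is to normalize the arguments before dispatching. The helper below is a hypothetical sketch named normalizeArgs; it is not part of the ollama package. It accepts either shape and throws on malformed input, so the loop’s existing try/catch can report the problem back to the model.

```javascript
// Accept tool arguments as either an object or a JSON-encoded string.
function normalizeArgs(rawArgs) {
  if (rawArgs === undefined || rawArgs === null) return {};
  if (typeof rawArgs === "string") {
    let parsed;
    try {
      parsed = JSON.parse(rawArgs);
    } catch {
      throw new Error("Tool arguments were not valid JSON");
    }
    if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
      throw new Error("Tool arguments must be a JSON object");
    }
    return parsed;
  }
  return rawArgs; // already an object
}

console.log(normalizeArgs('{"filename":"notes.txt"}').filename); // notes.txt
console.log(normalizeArgs({ filename: "notes.txt" }).filename); // notes.txt
```

In the agent loop this would wrap the line that reads call.function.arguments, inside the try block so that a parse failure becomes an error message for the model.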

The agent loops until it hits MAX_STEPS. This usually means a tool keeps returning an error that the model does not know how to recover from, so it tries again and again. Print every tool result while debugging. Often the fix is a clearer error message from the tool itself, since the model reads that message and decides what to do next.
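
One way to spot this situation while debugging is to count exact repeats of the same tool call. The sketch below is an assumption-laden illustration: createRepeatDetector is a hypothetical helper, and it treats two calls as identical only when the tool name and the JSON form of the arguments match exactly, so arguments with the same keys in a different order count as different calls.

```javascript
// Hypothetical helper: remembers which tool calls have been made and reports
// how many times the exact same call has been seen so far.
function createRepeatDetector() {
  const seen = new Map();
  return function countCall(name, args) {
    // Key order inside args affects the key, which is fine for a debug aid.
    const key = `${name}:${JSON.stringify(args ?? {})}`;
    const count = (seen.get(key) ?? 0) + 1;
    seen.set(key, count);
    return count;
  };
}

const countCall = createRepeatDetector();
console.log(countCall("read_file", { filename: "missing.txt" })); // 1
console.log(countCall("read_file", { filename: "missing.txt" })); // 2
console.log(countCall("list_files", {})); // 1
```

When the count goes above 1, the loop could log a warning, or return a hint to the model such as "You already made this exact call; try different arguments" in place of running the tool again.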

Connection refused on port 11434. Ollama is not running. Start it with sudo systemctl start ollama, or run ollama serve in a separate terminal, then retry.

Best Practices

Never let the model touch raw resources. The model only emits requests. Your code decides what to honor. The safePath guard in this tutorial is a small example of a large principle: validate every tool argument as if it came from an untrusted user, because in effect it did. Be especially careful before giving an agent a tool that runs shell commands, deletes files, or makes network calls.
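
As a sketch of what validating tool arguments could look like, the hypothetical validateArgs helper below checks a call against the parameters object from a tool schema. It is deliberately limited: it understands only required fields and the primitive types string, number, and boolean, and it is not a full JSON Schema validator.

```javascript
const PRIMITIVES = new Set(["string", "number", "boolean"]);

// Hypothetical validator: returns a list of problems, empty when args look fine.
function validateArgs(parameters, args) {
  if (args === null || typeof args !== "object" || Array.isArray(args)) {
    return ["arguments must be an object"];
  }
  const errors = [];
  const properties = parameters.properties ?? {};
  for (const key of parameters.required ?? []) {
    if (!Object.hasOwn(args, key)) {
      errors.push(`missing required argument: ${key}`);
    }
  }
  for (const [key, value] of Object.entries(args)) {
    if (!Object.hasOwn(properties, key)) {
      errors.push(`unexpected argument: ${key}`);
      continue;
    }
    const expected = properties[key].type;
    // Types outside PRIMITIVES (array, object, integer) are not checked here.
    if (PRIMITIVES.has(expected) && typeof value !== expected) {
      errors.push(`${key} must be a ${expected}`);
    }
  }
  return errors;
}

// Same shape as the read_file parameters in tools.js.
const readFileParams = {
  type: "object",
  properties: { filename: { type: "string" } },
  required: ["filename"],
};

console.log(validateArgs(readFileParams, { filename: "notes.txt" }).length); // 0
console.log(validateArgs(readFileParams, { path: 42 }).join("; "));
// missing required argument: filename; unexpected argument: path
```

Returning the error list to the model as the tool result, instead of running the tool, gives it a chance to correct the call on the next turn.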

Write tool descriptions like prompts. The model chooses tools based entirely on their description text and parameter names. Vague descriptions lead to wrong tool choices. Spend real effort here. A good description states what the tool does, when to use it, and what it returns.

Keep tools small and single-purpose. One tool that reads a file and one that lists a directory are easier for the model to use correctly than a single tool with a mode flag. Small tools also keep your error handling simple.

Always cap the loop. The MAX_STEPS limit is not optional. Without it a malfunctioning model can run forever and rack up compute. In production you may also want a wall-clock timeout and a budget on total tokens.
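
A wall-clock limit and a token budget can sit alongside MAX_STEPS. The sketch below is one possible shape, with a hypothetical createBudget helper. It assumes the chat response exposes prompt_eval_count and eval_count, as Ollama responses do, and treats missing fields as zero. The now parameter exists only so the clock can be replaced in tests.

```javascript
// Hypothetical budget tracker combining a wall-clock deadline with a cap on
// total tokens across all model calls in one agent run.
function createBudget({ maxMs, maxTokens, now = Date.now }) {
  const startedAt = now();
  let tokensUsed = 0;
  return {
    // Call after each model response to add its token counts.
    record(response) {
      tokensUsed += (response.prompt_eval_count ?? 0) + (response.eval_count ?? 0);
    },
    // Returns a reason string when a limit is exceeded, otherwise null.
    exceeded() {
      if (now() - startedAt > maxMs) return "time limit reached";
      if (tokensUsed > maxTokens) return "token budget exhausted";
      return null;
    },
  };
}

const budget = createBudget({ maxMs: 60_000, maxTokens: 1_000 });
budget.record({ prompt_eval_count: 400, eval_count: 100 });
console.log(budget.exceeded()); // null
budget.record({ prompt_eval_count: 500, eval_count: 150 });
console.log(budget.exceeded()); // token budget exhausted
```

In the agent loop, record would be called right after ollama.chat returns and exceeded would be checked before the next turn. Note that this only stops the loop between turns; it does not interrupt a model call that is already in flight.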

Return errors as text, not exceptions. Notice that the loop catches tool errors and feeds the message back to the model instead of crashing. This lets the agent recover, for example by trying a different filename after a “file not found” result. An agent that crashes on the first error is far less useful than one that reads the error and adjusts.
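
The error text itself can steer the recovery. As a rough sketch, the hypothetical runToolSafely wrapper below turns a missing-file error into a message that suggests the next step. The brokenRead function is a stand-in that fails with the same ENOENT code Node.js file operations use for a missing file.

```javascript
// Hypothetical wrapper: runs a tool and always resolves to a string, turning
// exceptions into messages the model can act on.
async function runToolSafely(fn, args) {
  try {
    return String(await fn(args));
  } catch (err) {
    if (err?.code === "ENOENT") {
      return "Error: file not found. Call list_files to see which files exist.";
    }
    return `Error: ${err?.message ?? String(err)}`;
  }
}

// A stand-in tool that fails the way a read of a missing file would.
async function brokenRead() {
  const err = new Error("ENOENT: no such file or directory");
  err.code = "ENOENT";
  throw err;
}

runToolSafely(brokenRead, { filename: "nope.txt" }).then(console.log);
// prints: Error: file not found. Call list_files to see which files exist.
```

The hint names a tool the model already knows about, which tends to be more useful to it than a raw system error string.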

Pin a temperature for predictable behavior. Agents that make decisions usually benefit from a low temperature so they stay focused. You can set it in the ollama.chat call with options: { temperature: 0.2 }. The trade-offs are explained in the guide to temperature, top-p, and top-k.
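
For reference, here is where the setting goes. The buildChatRequest helper is hypothetical and only assembles the object that would be passed to ollama.chat; it makes no network call itself.

```javascript
// Hypothetical helper that builds the argument object for ollama.chat.
function buildChatRequest(messages, tools, temperature = 0.2) {
  return {
    model: "llama3.1",
    messages,
    tools,
    options: { temperature }, // sampling settings live under "options"
  };
}

const request = buildChatRequest([{ role: "user", content: "hi" }], []);
console.log(JSON.stringify(request.options)); // {"temperature":0.2}
```

In agent.js the same effect comes from adding the options field directly to the existing ollama.chat call.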

Conclusion

You have built a real AI agent in roughly 130 lines of plain Node.js across two files, with no agent framework anywhere in sight. It takes a task from the command line, decides on its own which tools to call, reads files through those tools, feeds the results back to itself, and keeps going until the task is done.

More importantly, you now know the secret that the word “agent” hides: it is an LLM in a loop that calls tools until it finishes. Every framework you will ever use is built on exactly the loop you wrote in agent.js. The schemas might be generated for you, the memory might be stored in a database, and the tools might number in the dozens, but the core never changes.

From here, the natural next step is to add new tools: a write_file tool to let the agent make changes, a tool that calls an external API, or a tool that runs a carefully sandboxed shell command. You can also turn the single-task script into an interactive chat by wrapping runAgent in a readline loop so the conversation continues across multiple prompts. The pattern stays the same. You add capabilities by adding tools, and the agent figures out the rest.