Engineering7 min read

Deep Dive: Building a Self-Hosted AI Agent with Ollama and Open WebUI

Run local AI like ChatGPT entirely offline. Ollama + Open WebUI gives you a self-hosted, private, multi-model interface with powerful customization. This guide shows you how to install, configure, and build your own agent step-by-step. No cloud. No limits.

Tega Adeyemi
Tega Adeyemi
Deep Dive: Building a Self-Hosted AI Agent with Ollama and Open WebUI

In the fast-evolving world of self-hosted AI, combining the model management capabilities of Ollama with the interactive power of Open WebUI creates an ecosystem where you run large language models (LLMs) entirely offline. This guide will walk you through not only the installation and basic setup but also advanced configurations, troubleshooting, and custom extensions to make your AI agent truly unique.

Self-hosted is the easy half; making an agent reliable enough to ship is the harder half we teach in Cohorte's Building Accountable AI Agents course (E3).

1. Why Choose Ollama and Open WebUI?

Presentation Benefits

Real-World Use Cases

2. Supported Models and Advanced Options

Ollama acts as your model manager, letting you easily pull models from its library. Popular models include:

With Open WebUI’s built-in pipelines and tools integration, you can even combine multiple models or integrate functions (like web search, code execution, or data retrieval) to create richer interactions.

3. Getting Started: Installation & Setup

A. Installing via Docker

Using Docker is the quickest way to get started because it bundles dependencies and simplifies environment management. For example, to install Open WebUI bundled with Ollama (CPU-only), run:

docker run -d -p 3000:8080 \
  -v ollama:/root/.ollama \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:ollama

If your Ollama instance resides on another server, update the OLLAMA_BASE_URL environment variable:

docker run -d -p 3000:8080 \
  -e OLLAMA_BASE_URL=https://example.com \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main

Tip: If you’re using a GPU-enabled setup, replace the image tag with :cuda and add --gpus all to the command. This approach is documented in the Open WebUI Quick Start guide.

B. Manual Installation via pip and uv

For users who prefer a non-Docker approach, install via pip with Python 3.11:

pip install open-webui
open-webui serve

For robust environment management, the recommended method is to use the uv runtime manager. On macOS/Linux, for example:

DATA_DIR=~/.open-webui uvx --python 3.11 open-webui@latest serve

This method isolates dependencies and minimizes conflicts, a practice highlighted in the official Open WebUI documentation (​docs.openwebui.com).

C. Configuring Advanced Networking

If you plan to expose your Open WebUI interface externally:

server {
    listen 80;
    server_name openwebui.example.com;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

hen use Certbot to obtain SSL certificates and secure your setup (see detailed steps in various guides like those on Vultr Docs docs.vultr.com).

4. Running Your First Model

Once installed, access Open WebUI at http://localhost:3000. During the first run, you’ll need to:

  1. Create an Administrator Account: Follow the on-screen registration process.
  2. Download a Model: Click on the settings icon, navigate to “Models”, and select a model (e.g., gemma:2b or llama2). Open WebUI will prompt you to download the model from Ollama (details are available in the Getting Started guide).
  3. Test the Model: In the chat window, select your model and enter a prompt like “What is the future of AI?” to see it in action.

5. Building a Custom AI Agent: A Step-by-Step Example

A. Basic Command-Line Agent

Below is an extended example in Python to create a simple agent. This agent sends a prompt to an Ollama model and retrieves the response:

import subprocess

def run_model(prompt: str, model: str = "llama2") -> str:
    """
    Run the specified model via Ollama and return its response.

    :param prompt: The prompt to send to the model.
    :param model: The model tag (default: "llama2").
    :return: The model's response.
    """
    # Construct the command to run the model using Ollama CLI
    command = ["ollama", "run", model]
    try:
        # Launch the process and provide the prompt as input
        process = subprocess.Popen(
            command,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True
        )
        stdout, stderr = process.communicate(input=prompt, timeout=30)
        if process.returncode != 0:
            return f"Error: {stderr.strip()}"
        return stdout.strip()
    except Exception as e:
        return f"Exception occurred: {e}"

# Example usage
if __name__ == "__main__":
    user_prompt = "Tell me a creative short story about the future of AI."
    response = run_model(user_prompt)
    print("Agent Response:", response)

Deep Dive:

B. Extending the Agent with Tools and Pipelines

For advanced users, integrate your agent with Open WebUI’s native tools. For instance, add a web search capability:

from duckduckgo_search import DDGS

def search_web(query: str) -> str:
    """
    Search the web using DuckDuckGo and return top 3 results.
    
    :param query: The search query.
    :return: A formatted string of results.
    """
    try:
        results = DDGS().text(query, max_results=3)
        return "\n".join([f"Title: {r['title']}\nURL: {r['href']}" for r in results])
    except Exception as e:
        return f"Web search error: {e}"

# Example integration
if __name__ == "__main__":
    search_query = "latest trends in AI"
    search_results = search_web(search_query)
    print("Search Results:\n", search_results)

Such tools can be integrated into Open WebUI as part of a larger pipeline, allowing your AI agent to augment its responses with live data.

6. Troubleshooting & Advanced Customization

Common Issues and Fixes

docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Customizing the Interface

7. Final Thoughts and Future Directions

Combining Ollama with Open WebUI empowers you with a fully customizable, local AI platform that adapts to both personal and enterprise needs. Here are a few takeaways:

Looking Ahead

Whether you’re an individual developer or part of a large organization, this deep dive into using Ollama with Open WebUI offers the insight needed to build robust, self-hosted AI applications. Experiment, extend, and enjoy the journey of creating your own AI assistant!

Happy coding and exploring your AI ecosystem!

Tega AdeyemiMarch 31, 2025