How to Build a Local AI Agent Using DeepSeek and Ollama: A Step-by-Step Guide

Learn how to set up DeepSeek with Ollama to run AI models locally, ensuring privacy, cost efficiency, and fast inference. This guide walks you through installation, setup, and building a simple AI agent with practical code examples.

DeepSeek is a family of open-source language models focused on high-quality reasoning and coding tasks. When paired with Ollama, a tool designed to run AI models locally, it lets developers deploy state-of-the-art models, such as the distilled DeepSeek-R1 variants, on their own machines. This setup not only enhances data privacy but also reduces dependency on cloud APIs, leading to lower latency and lower operational costs.

1. Overview of the Framework

DeepSeek distills the reasoning ability of its full R1 model into smaller Qwen- and Llama-based variants, bringing advanced reasoning capabilities to devices with limited hardware resources. Ollama acts as the model manager by:

Simplifying Model Downloads: It provides an easy-to-use CLI to pull models directly.

Local Inference: Running models locally means that data never leaves your device.

API Integration: Ollama exposes a simple REST API and a Python client for interacting with your models (see the snippet after this list).
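
For example, the Python client mirrors the CLI. A minimal sketch, assuming you have installed the client with pip install ollama and the Ollama server is running:

import ollama

# Download a model programmatically (equivalent to `ollama pull deepseek-r1:1.5b` on the CLI).
ollama.pull("deepseek-r1:1.5b")

# Inspect which models are installed locally.
print(ollama.list())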

This combination is ideal for developers building fast, private, and cost-efficient AI applications.

2. Benefits of Using DeepSeek with Ollama

• Privacy & Security: Run inference entirely on your own hardware; sensitive data never leaves your machine.

• Cost Efficiency: Eliminate recurring API fees. Once a model is downloaded, running it incurs no additional per-request cost.

• Performance: Local processing avoids network round-trips, cutting latency enough for real-time interaction.

• Flexibility: Choose from a range of model sizes (from 1.5B up to the full 671B parameters) to match your hardware.

3. Getting Started: Installation & Setup

Step 1: Install Ollama

Download and install Ollama from the official website or via the command line. For example, on Linux you can use:

curl -fsSL https://ollama.com/install.sh | sh

(Installation instructions may vary by OS.)
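
After installation, the Ollama server listens on http://localhost:11434 by default (on Linux you may need to start it with ollama serve if it isn't running as a service). A quick liveness check from Python, assuming the requests package is installed:

import requests

# The Ollama server responds to a plain GET on its root URL with "Ollama is running".
resp = requests.get("http://localhost:11434")
print(resp.status_code, resp.text)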

Step 2: Download and Run the DeepSeek Model

Once Ollama is installed, pull and run a distilled version of DeepSeek-R1. For demo purposes, you can start with the smaller 1.5B model:

ollama run deepseek-r1:1.5b

This command downloads the model (if it isn't already cached) and starts an interactive session locally. If your hardware supports larger models, replace 1.5b with the desired parameter size (e.g., 7b or 14b).

Step 3: Verify the Installation

To ensure the model is running, you can try a simple API call using a tool like curl:

curl http://localhost:11434/api/chat -d '{
  "model": "deepseek-r1:1.5b",
  "messages": [{"role": "user", "content": "Hello, how are you?"}],
  "stream": false
}'

You should receive a JSON response containing DeepSeek-R1's reply, confirming that the model is operational.
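
If you prefer to script this check, here is a rough Python equivalent of the curl call above, again assuming the requests package is installed:

import requests

# POST to Ollama's /api/chat endpoint; "stream": False returns one JSON object
# instead of a stream of partial responses.
payload = {
    "model": "deepseek-r1:1.5b",
    "messages": [{"role": "user", "content": "Hello, how are you?"}],
    "stream": False,
}
resp = requests.post("http://localhost:11434/api/chat", json=payload)
print(resp.json()["message"]["content"])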

4. Building a Simple AI Agent: Step-by-Step Example

Below is an example in Python that demonstrates how to build a simple chat agent using the Ollama Python library.

Code Snippet: A Basic Chat Agent

import ollama

def chat_with_deepseek(prompt):
    # Use the DeepSeek-R1 1.5B model to process the user prompt.
    response = ollama.chat(
        model="deepseek-r1:1.5b",
        messages=[{"role": "user", "content": prompt}]
    )
    return response["message"]["content"]

# Example usage
if __name__ == "__main__":
    print("DeepSeek Agent is running. Type 'exit' to quit.")
    while True:
        user_input = input("User: ")
        if user_input.lower() in ["exit", "quit"]:
            break
        answer = chat_with_deepseek(user_input)
        print("Agent:", answer)

Explanation

Importing Ollama: The script starts by importing the Ollama Python package.

chat_with_deepseek Function: This function sends a user prompt to the DeepSeek model and returns the generated response.

Interactive Loop: The main block sets up a continuous loop to interact with the AI agent until the user types an exit command.

This simple yet powerful agent can be further extended by integrating additional features such as retrieval-augmented generation, custom prompts, or even connecting to a local vector database for enhanced context.
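
As a concrete first extension, you can give the agent short-term conversational memory by accumulating the exchange in the messages list. A minimal sketch; the history list and chat_with_memory helper are illustrative names, not part of the Ollama API:

import ollama

history = []  # accumulates the conversation across turns

def chat_with_memory(prompt):
    # Send the entire conversation so the model sees earlier turns as context.
    history.append({"role": "user", "content": prompt})
    response = ollama.chat(model="deepseek-r1:1.5b", messages=history)
    reply = response["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply

Swapping this helper into the loop above gives the agent multi-turn context, at the cost of a prompt that grows with every turn; for long sessions you may want to truncate or summarize older messages.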

5. Final Thoughts

By combining DeepSeek’s advanced reasoning capabilities with Ollama’s streamlined local deployment, developers can create robust, privacy‑focused AI agents that run efficiently on local hardware. This setup not only reduces latency and costs but also provides complete control over data and inference processes. This step‑by‑step guide offers a solid foundation to get started.

Cohorte Team

March 10, 2025