Ollama

Local model manager with daemon sandboxing and one-command model pulls.

1. Installation

Prerequisites

  • macOS 13+ (Metal GPU) or Linux (kernel 5.13+)

Install via Homebrew (CLI-only, recommended)

brew install ollama

This gives you the ollama binary without the GUI app. Run as a service with brew services start ollama, or manually with ollama serve.
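For example, to hand the server to Homebrew's service manager, or to run it in the foreground for the current session:

brew services start ollama   # background service, restarts at login

ollama serve                 # foreground, stops when you close the terminal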

Or install the macOS app

brew install --cask ollama

Installs Ollama.app, which runs the server automatically as a login item.

Install the preferred stack

brew tap nvk/tap
brew install nvk/tap/agent-bondage
brew install nono

# Optional, only if you want server auth via env vars:
brew install nvk/tap/envchain-xtra

Verify

ollama --version
bondage --help
nono --version

Pull a model

ollama pull llama3.2
ollama pull qwen2.5
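
Once models are pulled, the built-in management commands show what is on disk:

ollama list            # downloaded models and their sizes
ollama show llama3.2   # print details for one model
ollama rm qwen2.5      # delete a model you no longer need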

2. nono Profile

Ollama uses a client-server architecture. ollama serve runs an HTTP server on 127.0.0.1:11434. ollama run connects to it as a client. In the preferred setup, bondage decides which mode you are launching and which nono profile applies.
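
To see the split concretely, any HTTP client can talk to the server directly. A minimal check, assuming the server is up and llama3.2 has already been pulled:

# POST a prompt to the local API and get a single non-streamed response
curl http://127.0.0.1:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'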

Server: what the sandbox allows

Resource | Access | Why
~/.ollama/ | Read + Write | Models, logs, config, keys
localhost:11434 | Network bind | API server
ollama.com / registry.ollama.ai | Network | Model pulls
Metal/Accelerate frameworks | Read | GPU compute
/tmp/ | Read + Write | Temp files during model loading

What the sandbox blocks

Resource | Why blocked
~/.ssh/, ~/.aws/, ~/.gnupg/ | Credentials
~/Documents/, ~/Desktop/ | Personal files
All other network | No lateral movement

Model storage

Models live in ~/.ollama/models/ using content-addressable storage (similar to Docker). Blobs are raw GGUF files named by SHA256 digest. Override with OLLAMA_MODELS env var.
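
A quick way to inspect the layout, and to relocate it if needed (the manifests path shown is the typical default; adjust if your version differs):

# Manifests map model names and tags to blob digests
ls ~/.ollama/models/manifests/

# Blobs are the content-addressed layers, named by digest
ls ~/.ollama/models/blobs/ | head -5

# Store models elsewhere (example path); set this before starting the server
export OLLAMA_MODELS="$HOME/llm-models"
ollama serve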

Air-gapped mode: Pull models while online, then block outbound network in the nono profile. Set OLLAMA_NO_CLOUD=true to disable cloud features. The ~/.ollama/models/ directory is portable between machines.

No config files: Ollama is configured almost entirely via env vars. The only config file is ~/.ollama/server.json with a single option: {"disable_ollama_cloud": true}.
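
A sketch of an air-gapped launch using only the knobs mentioned above (OLLAMA_NO_CLOUD and server.json are taken from the notes; confirm they exist in your Ollama version):

# Option 1: the single config file
mkdir -p ~/.ollama
echo '{"disable_ollama_cloud": true}' > ~/.ollama/server.json

# Option 2: per-launch environment variable
export OLLAMA_NO_CLOUD=true
ollama serve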

3. Optional envchain-xtra

envchain-xtra is optional for Ollama. There are no mandatory API keys for local inference. One scenario where it helps, and the common case where it does not:

Enable server authentication

Ollama has an experimental OLLAMA_AUTH environment variable. If you expose the server beyond localhost, protect it:

envchain --set ollama OLLAMA_AUTH
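
To inject the stored secret at launch, run the server through envchain; bondage can do this for you when a profile sets use_envchain = true, but a manual launch looks roughly like this:

# Run the server with OLLAMA_AUTH loaded from the "ollama" keychain namespace
envchain ollama ollama serve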

When you don't need envchain

For the common case — running Ollama locally on 127.0.0.1 — there are no secrets to protect. Skip envchain and use nono alone.

No HuggingFace token: Unlike llama.cpp, Ollama pulls from its own registry (ollama.com). There is no HF_TOKEN integration.

4. bondage Wrapper

For the server (ollama serve)

ollama-serve() {
  bondage exec ollama-serve ~/.config/bondage/bondage.conf -- "$@"
}

Sample stack snippets

Assuming your shared [global] block already exists in ~/.config/bondage/bondage.conf, here is a minimal profile for a local Ollama server to adapt:

# ~/.config/bondage/bondage.conf
[profile "ollama-serve"]
use_envchain = false
use_nono = true
nono_profile = ollama-serve
touch_policy = none
target_kind = native
target = /absolute/path/to/ollama
target_fp = sha256:replace-me
nono_allow_cwd = true
nono_allow_file = /dev/tty
nono_allow_file = /dev/null
nono_read_file = /dev/urandom

And the nono profile it references (nono_profile = ollama-serve):

{
  "extends": "default",
  "meta": {
    "name": "ollama-serve",
    "description": "Ollama server with writable local model state"
  },
  "policy": {
    "add_deny_access": ["/Volumes"],
    "add_allow_readwrite": [
      "$HOME/.ollama"
    ]
  },
  "workdir": {
    "access": "readwrite"
  }
}
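
The target_fp placeholder in the bondage profile above expects the SHA-256 digest of the ollama binary. One way to produce it, assuming a standard install on your PATH:

# Paste the digest into target_fp as sha256:<digest>
shasum -a 256 "$(command -v ollama)"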

For the client (ollama run)

The client just connects to the server over HTTP. It needs minimal permissions:

ollama-run() {
  bondage exec ollama-run ~/.config/bondage/bondage.conf -- "$@"
}

Usage: ollama-run llama3.2
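
If you prefer a dedicated client profile as well, a minimal sketch mirroring the server profile above (profile name, paths, and fingerprint are placeholders to adapt):

# ~/.config/bondage/bondage.conf
[profile "ollama-run"]
use_envchain = false
use_nono = true
nono_profile = ollama-run
touch_policy = none
target_kind = native
target = /absolute/path/to/ollama
target_fp = sha256:replace-me
nono_allow_cwd = true
nono_allow_file = /dev/tty
nono_allow_file = /dev/null
nono_read_file = /dev/urandom

The matching nono profile can be stricter than the server's: the client only talks to 127.0.0.1:11434 and never writes to ~/.ollama/.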

Combined wrapper

If you want a single ollama command, keep the shell convenience thin and move the real launch policy into profile config:

ollama() {
  bondage exec ollama-run ~/.config/bondage/bondage.conf -- "$@"
}

Reload your shell:

source ~/.zshrc

5. Verification

Test the server

bondage verify ollama-serve ~/.config/bondage/bondage.conf
ollama-serve

# In another terminal, verify it's running
curl http://localhost:11434/
# Should return: "Ollama is running"
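
You can also query the API directly; listing local models confirms the server can read its model store:

# List models the server knows about
curl http://localhost:11434/api/tags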

Test a model

ollama-run llama3.2
# Should start an interactive chat session

Confirm model storage

# List downloaded models
ollama list

# Check storage location
ls ~/.ollama/models/blobs/ | head -5

Troubleshooting

Symptom | Cause | Fix
"connect: connection refused" | Server not running | Run ollama-serve first
"permission denied" on model pull | nono blocking writes to ~/.ollama/ | Ensure $HOME/.ollama is in add_allow_readwrite in the ollama-serve nono profile
No GPU acceleration | Metal not available in sandbox | nono allows Metal by default on macOS; check the profile extends default
Port 11434 already in use | Ollama.app running in background | Quit Ollama.app or disable its login item
Shell name (convenience) → bondage (launch policy) → [envchain-xtra] (optional secrets) → nono (kernel sandbox) → Ollama (local daemon/client)