
ollama

Official

Overview

Official Ollama images for Nvidia or AMD GPUs. Intel Arc GPUs could potentially work with Vulkan enabled.

NOTE: Extra Parameters needed for Nvidia: --gpus=all
Additionally, you can add a Variable called CUDA_VISIBLE_DEVICES to only use specific Nvidia GPU(s) by ID. Example: 0 or 0,1

Extra Parameters needed for AMD: --device='/dev/kfd' --device='/dev/dri'
Additionally, you can add a Variable called ROCR_VISIBLE_DEVICES to only use specific AMD GPU(s) by UUID. Example: GPU-67246603a11f0a21 or GPU-XXXXXXX,GPU-YYYYYYY

Docs: https://docs.ollama.com/
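
The vendor-specific parameters above can be collected in one place. The following is a minimal sketch, assuming a hypothetical helper named gpu_run_args; only the flags and variable names come from the notes above, and passing the variable with -e is an illustrative stand-in for adding a Variable in the Unraid template.

```python
from typing import List, Optional


def gpu_run_args(vendor: str, devices: Optional[List[str]] = None) -> List[str]:
    """Return the extra `docker run` arguments for a GPU vendor.

    Hypothetical helper: the flags and variable names are taken from the
    template notes; the function itself is only an illustration.
    """
    vendor = vendor.lower()
    if vendor == "nvidia":
        args = ["--gpus=all"]
        visible_var = "CUDA_VISIBLE_DEVICES"  # GPU IDs, e.g. 0 or 0,1
    elif vendor == "amd":
        args = ["--device=/dev/kfd", "--device=/dev/dri"]
        visible_var = "ROCR_VISIBLE_DEVICES"  # GPU UUIDs, e.g. GPU-XXXXXXX
    else:
        raise ValueError(f"unsupported vendor: {vendor!r}")
    if devices:
        # Restrict the container to the listed GPUs only.
        args += ["-e", f"{visible_var}={','.join(devices)}"]
    return args


print(" ".join(gpu_run_args("nvidia", ["0", "1"])))
print(" ".join(gpu_run_args("amd")))
```

Leaving the device list empty keeps every GPU visible, which matches the behavior when neither variable is set.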

Install Ollama on Unraid in a few clicks.

Find Ollama in Community Apps on your Unraid server, review the template, and click Install. Unraid handles the Docker app or plugin setup from the published template.

1. Open the Apps tab on your Unraid server.
2. Search Community Apps for Ollama.
3. Review the template variables and paths.
4. Click Install.

Requirements

Nvidia-Driver plugin (nVidia Support)

Radeon-TOP plugin (AMD Support)

AI Other

Links

Support: hub.docker.com
Docker Hub: hub.docker.com
Template: raw.githubusercontent.com

Details

Repository

ollama/ollama

Registry

https://hub.docker.com/r/ollama/ollama/

Last Updated: 2026-06-25

First Seen: 2023-11-08

Runtime arguments

Web UI: http://[IP]:[PORT:11434]/
Network: bridge
Shell: bash
Privileged: false

Template configuration

Data (Path, rw)

Target: /root/.ollama
Default: /mnt/user/appdata/ollama
Value: /mnt/user/appdata/ollama

API Interface Port (Port, tcp)

Port number that Ollama listens on.

Target: 11434
Default: 11434
Value: 11434

OLLAMA_HOST (Variable)

IP address and port the server binds to. Set to 127.0.0.1:11434 for internal-only access.

Default: 0.0.0.0:11434
Value: 0.0.0.0:11434
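
To check that the API answers on the configured address, a client can list the installed models. This is a minimal sketch using only the Python standard library and Ollama's GET /api/tags endpoint; the helper names and the example IP address are assumptions, not part of the template.

```python
import json
import urllib.request
from typing import List


def base_url(ollama_host: str = "0.0.0.0:11434", server_ip: str = "127.0.0.1") -> str:
    """Turn an OLLAMA_HOST bind value into a URL a client can call.

    0.0.0.0 means "all interfaces" and is not a destination, so it is
    replaced by server_ip (an assumed, caller-supplied address).
    """
    host, sep, port = ollama_host.rpartition(":")
    if not sep:
        # No port given: fall back to the template's default port.
        host, port = ollama_host, "11434"
    if host in ("", "0.0.0.0"):
        host = server_ip
    return f"http://{host}:{port}"


def list_models(url: str, timeout: float = 5.0) -> List[str]:
    """Return the names of the locally installed models."""
    with urllib.request.urlopen(f"{url}/api/tags", timeout=timeout) as resp:
        return [m["name"] for m in json.load(resp).get("models", [])]
```

For example, list_models(base_url("0.0.0.0:11434", "192.168.1.10")) would query a server at the placeholder address 192.168.1.10; substitute the address of your own Unraid host.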

OLLAMA_ORIGINS (Variable)

Comma-separated list of allowed CORS origins.

Default: *
Value: *

OLLAMA_KEEP_ALIVE (Variable)

How long a model stays in VRAM, e.g. 60m or 24h (Set to -1 for infinite, 0 for none).

Default: 5m
Value: 5m
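
The accepted values can be illustrated with a small parser. This is a simplified sketch, not Ollama's own parsing code: it assumes a single number with an optional s, m, or h suffix, treats a bare number as seconds, and does not handle compound durations such as 1h30m. The function name is hypothetical.

```python
import re
from typing import Optional

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def keep_alive_seconds(value: str) -> Optional[int]:
    """Convert a keep-alive setting to seconds.

    Returns None when the value is negative, meaning the model stays
    loaded indefinitely; 0 means it is unloaded right after a request.
    """
    value = value.strip()
    match = re.fullmatch(r"(-?\d+)([smh]?)", value)
    if not match:
        raise ValueError(f"unsupported keep-alive value: {value!r}")
    number, unit = int(match.group(1)), match.group(2) or "s"
    seconds = number * _UNIT_SECONDS[unit]
    return None if seconds < 0 else seconds


for setting in ("5m", "60m", "24h", "0", "-1"):
    print(setting, "->", keep_alive_seconds(setting))
```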

OLLAMA_LOAD_TIMEOUT (Variable)

Timeout for stall detection during model loads.

Default: 5m
Value: 5m

OLLAMA_NUM_PARALLEL (Variable)

Max number of parallel requests a single model can handle.

Default: 1
Value: 1

OLLAMA_CONTEXT_LENGTH (Variable)

Default context window (tokens) if not specified by the model.

Default: 4096
Value: 4096

OLLAMA_KV_CACHE_TYPE (Variable)

Quantization type for the K/V cache, e.g. f16, q8_0, q4_0.

Default: f16
Value: f16
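
The effect of the cache type and the context length on memory can be estimated with simple arithmetic. The sketch below is a rough estimate, not a measurement: it assumes a standard transformer K/V cache (one key and one value vector per layer, per K/V head, per token) and the GGML block sizes for q8_0 (34 bytes per 32 values) and q4_0 (18 bytes per 32 values). The model dimensions in the example are hypothetical.

```python
# Approximate storage per cached value for each cache type.
BYTES_PER_ELEMENT = {
    "f16": 2.0,
    "q8_0": 34 / 32,  # 32 int8 values plus one f16 scale per block
    "q4_0": 18 / 32,  # 32 4-bit values plus one f16 scale per block
}


def kv_cache_bytes(context_length: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, cache_type: str = "f16") -> int:
    """Estimate the K/V cache size in bytes for one sequence."""
    # Factor 2: one key vector and one value vector per position.
    elements = 2 * n_layers * n_kv_heads * head_dim * context_length
    return int(elements * BYTES_PER_ELEMENT[cache_type])


# Hypothetical model: 32 layers, 8 K/V heads, head dimension 128,
# with the template's default context length of 4096 tokens.
for cache_type in ("f16", "q8_0", "q4_0"):
    size = kv_cache_bytes(4096, 32, 8, 128, cache_type)
    print(f"{cache_type}: {size / 2**20:.0f} MiB")
```

Under these assumptions the cache scales linearly with the context length, so doubling OLLAMA_CONTEXT_LENGTH doubles the estimate for any cache type.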

OLLAMA_MODELS (Variable)

The path where model weights and blobs are stored.

Default: /root/.ollama/models
Value: /root/.ollama/models

OLLAMA_MAX_LOADED_MODELS (Variable)

Maximum number of models loaded per GPU at once (Set to 0 for infinite).

Default: 0
Value: 0

OLLAMA_MAX_QUEUE (Variable)

Max requests that can wait in line when the server is busy.

Default: 512
Value: 512
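
When the queue is full, Ollama rejects further requests with HTTP 503, so clients may want to retry. The following is a minimal sketch of exponential backoff; the function name, the send callable, and the delay values are assumptions chosen for illustration.

```python
import time
from typing import Callable, Tuple


def call_with_retry(send: Callable[[], Tuple[int, str]],
                    max_attempts: int = 5,
                    base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send() until it returns a status other than 503.

    send is a hypothetical callable that performs one request and
    returns (status_code, body). The wait doubles after each 503.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    # Still overloaded after the last attempt: hand the 503 to the caller.
    return status, body
```

Passing sleep as a parameter keeps the helper easy to test, because a test can record the delays instead of waiting.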

OLLAMA_DEBUG (Variable)

Log detail level: 0 for INFO, 1 for DEBUG, 2 for TRACE.

Default: 0|1|2

OLLAMA_GPU_OVERHEAD (Variable)

Reserved VRAM (in bytes) to leave empty on each GPU.

Default: 0
Value: 0

OLLAMA_FLASH_ATTENTION (Variable)

Enables experimental Flash Attention optimizations.

Default: false|true

OLLAMA_SCHED_SPREAD (Variable)

If true, always spreads model layers across all visible GPUs.

Default: false|true

OLLAMA_MULTIUSER_CACHE (Variable)

Optimizes prompt caching when multiple users share a model.

Default: false|true

OLLAMA_NOPRUNE (Variable)

If true, does not delete unused model blobs on startup.

Default: false|true

OLLAMA_NOHISTORY (Variable)

Disables the readline history in the interactive CLI.

Default: false|true

OLLAMA_NEW_ENGINE (Variable)

Enables the experimental new Ollama engine.

Default: false|true

OLLAMA_VULKAN (Variable)

Enables experimental Vulkan hardware acceleration.

Default: false|true

HTTP_PROXY (Variable)

Proxy for downloading models over HTTP.

HTTPS_PROXY (Variable)

Proxy for downloading models over HTTPS.

NO_PROXY (Variable)

Comma-separated list of hosts/IPs that bypass the proxy.
