
ollama

Official

Docker application from joly0's Repository

Overview

Official Ollama images for Nvidia or AMD GPUs. Intel Arc GPUs could potentially work with Vulkan enabled.

NOTE: Extra Parameters needed for Nvidia: --gpus=all
Additionally, you can add a Variable called CUDA_VISIBLE_DEVICES to use only specific Nvidia GPU(s) by ID. Example: 0 or 0,1

Extra Parameters needed for AMD: --device='/dev/kfd' --device='/dev/dri'
Additionally, you can add a Variable called ROCR_VISIBLE_DEVICES to use only specific AMD GPU(s) by UUID. Example: GPU-67246603a11f0a21 or GPU-XXXXXXX,GPU-YYYYYYY

Docs: https://docs.ollama.com/
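To make the GPU notes above concrete, here is a minimal sketch that assembles a roughly equivalent `docker run` argument list. The helper name `ollama_run_args` and the container name `ollama` are illustrative assumptions. The flags and variable names come from the description above, and the port and data path are the template defaults listed on this page.

```python
def ollama_run_args(vendor, devices=None, image="ollama/ollama"):
    """Build a `docker run` argv for the Ollama image (hypothetical helper).

    vendor:  "nvidia" or "amd"
    devices: optional GPU selector, e.g. "0,1" (Nvidia IDs) or a
             comma-separated list of AMD GPU UUIDs
    image:   image reference; the listing names ollama/ollama. Assumption:
             AMD setups may need a ROCm-specific tag, check the Ollama docs.
    """
    args = [
        "docker", "run", "-d", "--name", "ollama",
        "-p", "11434:11434",                            # API port from the template
        "-v", "/mnt/user/appdata/ollama:/root/.ollama",  # data path from the template
    ]
    if vendor == "nvidia":
        args.append("--gpus=all")
        if devices:
            args += ["-e", f"CUDA_VISIBLE_DEVICES={devices}"]
    elif vendor == "amd":
        args += ["--device=/dev/kfd", "--device=/dev/dri"]
        if devices:
            args += ["-e", f"ROCR_VISIBLE_DEVICES={devices}"]
    else:
        raise ValueError(f"unsupported vendor: {vendor!r}")
    args.append(image)
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(ollama_run_args("nvidia", devices="0,1")))
```

On Unraid itself these values normally go into the template's Extra Parameters and Variable fields rather than a hand-written command; the sketch only shows how the pieces fit together.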

Requirements

Nvidia-Driver plugin (Nvidia support)

Radeon-TOP plugin (AMD support)

Runtime arguments

Web UI: http://[IP]:[PORT:11434]/
Network: bridge
Shell: bash
Privileged: false

Template configuration

Data (Path, rw)

Target: /root/.ollama
Default: /mnt/user/appdata/ollama
Value: /mnt/user/appdata/ollama

API Interface Port (Port, tcp)

Port number that Ollama listens on.

Target: 11434
Default: 11434
Value: 11434

OLLAMA_HOST (Variable)

IP address and port the server binds to. Set to 127.0.0.1:11434 for internal-only access.

Default: 0.0.0.0:11434
Value: 0.0.0.0:11434
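A quick way to confirm that the server is reachable on the bound address is to list the installed models over HTTP. The sketch below assumes the standard Ollama REST endpoint `/api/tags`; the helper names and the example IP address are placeholders.

```python
import json
import urllib.request


def api_url(host, port=11434, path="/api/tags"):
    """Return the URL of an Ollama API endpoint on the given host and port."""
    return f"http://{host}:{port}{path}"


def list_models(host, port=11434, timeout=5.0):
    """Return the names of the models the server reports as installed."""
    with urllib.request.urlopen(api_url(host, port), timeout=timeout) as resp:
        data = json.load(resp)
    return [model["name"] for model in data.get("models", [])]


if __name__ == "__main__":
    # 192.168.1.10 is a placeholder; use the Unraid server's own IP address.
    print(list_models("192.168.1.10"))
```

If OLLAMA_HOST is set to 127.0.0.1:11434, a call like this from another machine is expected to fail, which is the point of that setting.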

OLLAMA_ORIGINS (Variable)

Comma-separated list of allowed CORS origins.

Default: *
Value: *

OLLAMA_KEEP_ALIVE (Variable)

How long a model stays loaded in VRAM, e.g. 60m or 24h (set to -1 for infinite, 0 for none).

Default: 5m
Value: 5m
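The server-wide keep-alive can also be overridden per request. The sketch below builds a request body for the `/api/generate` endpoint using the API's `keep_alive` field; the helper name is hypothetical, and the model name in the usage example is only a placeholder.

```python
import json


def generate_payload(model, prompt, keep_alive=None):
    """Build a JSON body for POST /api/generate.

    keep_alive: optional per-request override such as "24h", -1, or 0.
                When omitted, the server's OLLAMA_KEEP_ALIVE value applies.
    """
    body = {"model": model, "prompt": prompt, "stream": False}
    if keep_alive is not None:
        body["keep_alive"] = keep_alive
    return json.dumps(body).encode("utf-8")


if __name__ == "__main__":
    # "llama3.2" is a placeholder; use a model that is actually installed.
    print(generate_payload("llama3.2", "Hello", keep_alive="24h").decode())
```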

OLLAMA_LOAD_TIMEOUT (Variable)

Timeout for stall detection during model loads.

Default: 5m
Value: 5m

OLLAMA_NUM_PARALLEL (Variable)

Maximum number of parallel requests a single model can handle.

Default: 1
Value: 1

OLLAMA_CONTEXT_LENGTH (Variable)

Default context window (in tokens) if not specified by the model.

Default: 4096
Value: 4096

OLLAMA_KV_CACHE_TYPE (Variable)

Quantization type for the K/V cache, e.g. f16, q8_0, q4_0.

Default: f16
Value: f16
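Context length, parallelism, and K/V cache type together determine how much VRAM the cache needs. The sketch below gives a rough estimate only: the model dimensions are hypothetical inputs, the bytes-per-element figures follow the GGML block layouts for q8_0 and q4_0, and the multiplication by the number of parallel requests is an assumption based on how the Ollama FAQ describes parallel contexts. As far as that FAQ describes it, quantized cache types also require Flash Attention to be enabled.

```python
# Approximate storage cost per cached element.
# q8_0 packs 32 values into 34 bytes, q4_0 packs 32 values into 18 bytes.
BYTES_PER_ELEMENT = {"f16": 2.0, "q8_0": 34 / 32, "q4_0": 18 / 32}


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_length,
                   cache_type="f16", num_parallel=1):
    """Estimate K/V cache size in bytes for one loaded model.

    The leading factor 2 accounts for storing both keys and values.
    Real engines may allocate differently (e.g. sliding-window layers).
    """
    elements = (2 * n_layers * n_kv_heads * head_dim
                * context_length * num_parallel)
    return int(elements * BYTES_PER_ELEMENT[cache_type])


if __name__ == "__main__":
    # Hypothetical model shape: 32 layers, 8 K/V heads, head dimension 128.
    for cache_type in ("f16", "q8_0", "q4_0"):
        size = kv_cache_bytes(32, 8, 128, 4096, cache_type)
        print(f"{cache_type}: {size / 2**20:.0f} MiB")
```

For this hypothetical shape and the default 4096-token context, f16 works out to 512 MiB, and doubling OLLAMA_NUM_PARALLEL doubles the estimate.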

OLLAMA_MODELS (Variable)

The path where model weights and blobs are stored.

Default: /root/.ollama/models
Value: /root/.ollama/models

OLLAMA_MAX_LOADED_MODELS (Variable)

Maximum number of models loaded per GPU at once (set to 0 for infinite).

Default: 0
Value: 0

OLLAMA_MAX_QUEUE (Variable)

Maximum number of requests that can wait in line when the server is busy.

Default: 512
Value: 512
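Clients should be prepared for requests to be rejected once the queue is full. The sketch below retries with exponential backoff, assuming the server signals overload with HTTP 503 as the Ollama FAQ describes; verify the status code against your version. The `send` callable and all parameter names are hypothetical.

```python
import time


def call_with_retry(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a status other than 503.

    send:  callable returning a (status_code, body) tuple
    sleep: injectable delay function, which makes the helper easy to test
    Returns the last (status_code, body) received.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503:
            return status, body
        if attempt < max_attempts - 1:
            # Wait 0.5 s, 1 s, 2 s, ... before trying again.
            sleep(base_delay * 2 ** attempt)
    return status, body


if __name__ == "__main__":
    # Simulated server: busy twice, then successful.
    replies = iter([(503, ""), (503, ""), (200, "ok")])
    print(call_with_retry(lambda: next(replies), sleep=lambda s: None))
```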

OLLAMA_DEBUG (Variable)

Log detail level: 0 for INFO, 1 for DEBUG, 2 for TRACE.

Default: 0|1|2

OLLAMA_GPU_OVERHEAD (Variable)

Reserved VRAM (in bytes) to leave empty on each GPU.

Default: 0
Value: 0
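Because the value is given in raw bytes, it is easy to be off by a factor of 1024. A small conversion helper (hypothetical name) avoids that:

```python
def gpu_overhead_bytes(mib):
    """Convert a VRAM reservation in MiB to the byte count the variable expects."""
    if mib < 0:
        raise ValueError("reservation must not be negative")
    return int(mib * 1024 * 1024)


if __name__ == "__main__":
    # Reserve 512 MiB on each GPU.
    print(f"OLLAMA_GPU_OVERHEAD={gpu_overhead_bytes(512)}")
```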

OLLAMA_FLASH_ATTENTION (Variable)

Enables experimental Flash Attention optimizations.

Default: false|true

OLLAMA_SCHED_SPREAD (Variable)

If true, always spreads model layers across all visible GPUs.

Default: false|true

OLLAMA_MULTIUSER_CACHE (Variable)

Optimizes prompt caching when multiple users share a model.

Default: false|true

OLLAMA_NOPRUNE (Variable)

If true, does not delete unused model blobs on startup.

Default: false|true

OLLAMA_NOHISTORY (Variable)

Disables the readline history in the interactive CLI.

Default: false|true

OLLAMA_NEW_ENGINE (Variable)

Enables the experimental new Ollama engine.

Default: false|true

OLLAMA_VULKAN (Variable)

Enables experimental Vulkan hardware acceleration.

Default: false|true

HTTP_PROXY (Variable)

Proxy for downloading models over HTTP.

HTTPS_PROXY (Variable)

Proxy for downloading models over HTTPS.

NO_PROXY (Variable)

Comma-separated list of hosts/IPs that bypass the proxy.

Links

Template | Support | Docker Hub

Details

Repository

ollama/ollama

Registry

https://hub.docker.com/r/ollama/ollama/

Last updated: 2026-05-27

First seen: 2023-11-08

Run Ollama on Unraid.

Ollama is listed in Community Apps for Unraid OS. Explore Unraid to build a flexible home server, NAS, or homelab.
