
ollama

Official

Docker app from joly0's Repository

Overview

Official Ollama images for Nvidia or AMD GPUs. Intel Arc GPUs could potentially work with Vulkan enabled.

NOTE: Extra Parameters needed for Nvidia: --gpus=all
Additionally, you can add a Variable called CUDA_VISIBLE_DEVICES to only use specific Nvidia GPU(s) by ID. Example: 0 or 0,1

Extra Parameters needed for AMD: --device='/dev/kfd' --device='/dev/dri'
Additionally, you can add a Variable called ROCR_VISIBLE_DEVICES to only use specific AMD GPU(s) by UUID. Example: GPU-67246603a11f0a21 or GPU-XXXXXXX,GPU-YYYYYYY

Docs: https://docs.ollama.com/
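
The Extra Parameters above map onto a plain docker run command. The following is a minimal sketch, not the template's own logic: the function name ollama_run_args is hypothetical, and the ollama/ollama:rocm tag for AMD is an assumption taken from Ollama's Docker instructions (this page lists only ollama/ollama).

```python
import shlex


def ollama_run_args(vendor, device_ids=None, name="ollama",
                    data_path="/mnt/user/appdata/ollama", port=11434):
    """Assemble a `docker run` argument list for the given GPU vendor.

    vendor: "nvidia" or "amd". device_ids: optional list of GPU IDs
    (Nvidia) or UUIDs (AMD) to restrict the container to.
    """
    args = ["docker", "run", "-d", "--name", name,
            "-v", f"{data_path}:/root/.ollama",
            "-p", f"{port}:11434"]
    if vendor == "nvidia":
        args.append("--gpus=all")
        if device_ids:
            args += ["-e", "CUDA_VISIBLE_DEVICES=" + ",".join(device_ids)]
        image = "ollama/ollama"
    elif vendor == "amd":
        args += ["--device=/dev/kfd", "--device=/dev/dri"]
        if device_ids:
            args += ["-e", "ROCR_VISIBLE_DEVICES=" + ",".join(device_ids)]
        # Assumption: the ROCm build is published under the `rocm` tag.
        image = "ollama/ollama:rocm"
    else:
        raise ValueError(f"unsupported vendor: {vendor!r}")
    args.append(image)
    return args


# Print a copy-pasteable command line for two Nvidia GPUs.
print(shlex.join(ollama_run_args("nvidia", ["0", "1"])))
```

On Unraid these flags normally go into the template's Extra Parameters field instead of being typed by hand; the sketch only shows how the pieces fit together.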

Requirements

Nvidia-Driver plugin (nVidia Support)

Radeon-TOP plugin (AMD Support)

Runtime arguments

Web UI: http://[IP]:[PORT:11434]/
Network: bridge
Shell: bash
Privileged: false
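
To confirm the container is reachable at the Web UI address, a quick HTTP check is usually enough. This is a sketch using only the Python standard library; the helper names and the example IP are hypothetical, and it assumes the server answers a plain GET on the root path with HTTP 200.

```python
import urllib.error
import urllib.request


def ollama_base_url(ip, port=11434):
    """Build the base URL from the template's [IP] and [PORT] placeholders."""
    return f"http://{ip}:{port}/"


def is_ollama_up(ip, port=11434, timeout=3.0):
    """Return True if a GET on the base URL succeeds with HTTP 200."""
    try:
        with urllib.request.urlopen(ollama_base_url(ip, port),
                                    timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar errors.
        return False


# Example (hypothetical address): is_ollama_up("192.168.1.10")
```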

Template configuration

Data (Path, rw)

Target: /root/.ollama
Default: /mnt/user/appdata/ollama
Value: /mnt/user/appdata/ollama

API Interface Port (Port, tcp)

Port number that Ollama listens on.

Target: 11434
Default: 11434
Value: 11434

OLLAMA_HOST (Variable)

IP address and port the server binds to. Set to 127.0.0.1:11434 for internal-only access.

Default: 0.0.0.0:11434
Value: 0.0.0.0:11434

OLLAMA_ORIGINS (Variable)

Comma-separated list of allowed CORS origins.

Default: *
Value: *

OLLAMA_KEEP_ALIVE (Variable)

How long a model stays loaded in VRAM, e.g. 60m or 24h (set to -1 to keep it loaded indefinitely, 0 to unload it immediately).

Default: 5m
Value: 5m
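
To make the value format concrete, here is a small sketch that converts a keep-alive setting into seconds. It is a simplified stand-in for the server's own parsing, not a copy of it: it accepts only a single unit (s, m or h), and treating a bare number as seconds is an assumption. The function name keep_alive_seconds is hypothetical.

```python
import re

# Seconds per supported unit suffix.
_UNITS = {"s": 1, "m": 60, "h": 3600}


def keep_alive_seconds(value):
    """Convert a keep-alive setting to seconds.

    Returns None for negative values (keep the model loaded indefinitely)
    and 0 for "unload immediately".
    """
    match = re.fullmatch(r"(-?\d+)([smh]?)", str(value).strip())
    if match is None:
        raise ValueError(f"unsupported keep-alive value: {value!r}")
    number = int(match.group(1))
    unit = match.group(2) or "s"  # assumption: bare numbers are seconds
    if number < 0:
        return None
    return number * _UNITS[unit]


print(keep_alive_seconds("5m"))   # 300
print(keep_alive_seconds("24h"))  # 86400
print(keep_alive_seconds("-1"))   # None
```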

OLLAMA_LOAD_TIMEOUT (Variable)

Timeout for stall detection during model loads.

Default: 5m
Value: 5m

OLLAMA_NUM_PARALLEL (Variable)

Maximum number of parallel requests a single model can handle.

Default: 1
Value: 1

OLLAMA_CONTEXT_LENGTH (Variable)

Default context window (in tokens) if not specified by the model.

Default: 4096
Value: 4096
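
The server-wide default can also be overridden for a single request. The sketch below builds a request body for the /api/generate endpoint with a per-request num_ctx option, as I understand the Ollama REST API; the helper name generate_payload and the model name are hypothetical.

```python
import json


def generate_payload(model, prompt, num_ctx=None):
    """Build a JSON-serializable body for POST /api/generate.

    num_ctx, when given, requests a context window of that many tokens
    for this call only; otherwise the server default applies.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    if num_ctx is not None:
        payload["options"] = {"num_ctx": num_ctx}
    return payload


# Hypothetical model name; substitute one you have pulled.
body = generate_payload("llama3.2", "Why is the sky blue?", num_ctx=8192)
print(json.dumps(body, indent=2))
```

Larger context windows need more memory, so raising num_ctx on a GPU that is already nearly full may push part of the model onto the CPU.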

OLLAMA_KV_CACHE_TYPE (Variable)

Quantization type for the K/V cache, e.g. f16, q8_0, q4_0.

Default: f16
Value: f16
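
A rough estimate shows why the cache type matters. The sketch below computes the K/V cache size for one sequence as 2 (keys and values) x layers x KV heads x head dimension x context length, scaled by the storage cost of each type. The block sizes (34 bytes per 32 values for q8_0, 18 bytes per 32 values for q4_0) follow the ggml formats as I understand them, and the model dimensions in the example are hypothetical. Real usage will differ because of runtime overheads.

```python
# Bytes needed to store one block of 32 cache values in each format.
BLOCK_BYTES = {"f16": 64, "q8_0": 34, "q4_0": 18}
BLOCK_SIZE = 32


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len,
                   cache_type="f16"):
    """Estimate K/V cache size in bytes for a single sequence."""
    elements = 2 * n_layers * n_kv_heads * head_dim * context_len
    return elements * BLOCK_BYTES[cache_type] // BLOCK_SIZE


# Hypothetical model: 32 layers, 8 KV heads, head dimension 128.
for cache_type in ("f16", "q8_0", "q4_0"):
    size = kv_cache_bytes(32, 8, 128, 4096, cache_type)
    print(f"{cache_type}: {size / 2**20:.0f} MiB")
```

For this hypothetical model at a 4096-token context, the estimate comes to 512 MiB for f16, 272 MiB for q8_0 and 144 MiB for q4_0.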

OLLAMA_MODELS (Variable)

The path where model weights and blobs are stored.

Default: /root/.ollama/models
Value: /root/.ollama/models

OLLAMA_MAX_LOADED_MODELS (Variable)

Maximum number of models loaded per GPU at once (set to 0 for no limit).

Default: 0
Value: 0

OLLAMA_MAX_QUEUE (Variable)

Maximum number of requests that can wait in the queue when the server is busy.

Default: 512
Value: 512

OLLAMA_DEBUG (Variable)

Log detail level: 0 for INFO, 1 for DEBUG, 2 for TRACE.

Default: 0|1|2

OLLAMA_GPU_OVERHEAD (Variable)

Amount of VRAM (in bytes) to reserve and leave unused on each GPU.

Default: 0
Value: 0
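
Because the value is given in raw bytes, it is easy to be off by a factor of 1024. A tiny helper for the conversion, as a sketch (the name vram_reserve_bytes is hypothetical):

```python
def vram_reserve_bytes(mib):
    """Convert a reservation in MiB to the byte count the variable expects."""
    return int(mib * 1024 * 1024)


print(vram_reserve_bytes(512))   # 536870912
print(vram_reserve_bytes(1024))  # 1073741824
```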

OLLAMA_FLASH_ATTENTION (Variable)

Enables experimental Flash Attention optimizations.

Default: false|true

OLLAMA_SCHED_SPREAD (Variable)

If true, always spreads model layers across all visible GPUs.

Default: false|true

OLLAMA_MULTIUSER_CACHE (Variable)

Optimizes prompt caching when multiple users share a model.

Default: false|true

OLLAMA_NOPRUNE (Variable)

If true, does not delete unused model blobs on startup.

Default: false|true

OLLAMA_NOHISTORY (Variable)

Disables the readline history in the interactive CLI.

Default: false|true

OLLAMA_NEW_ENGINE (Variable)

Enables the experimental new Ollama engine.

Default: false|true

OLLAMA_VULKAN (Variable)

Enables experimental Vulkan hardware acceleration.

Default: false|true

HTTP_PROXY (Variable)

Proxy for downloading models over HTTP.

HTTPS_PROXY (Variable)

Proxy for downloading models over HTTPS.

NO_PROXY (Variable)

Comma-separated list of hosts/IPs that bypass the proxy.

Links

Template, Support, Docker Hub

Details

Repository

ollama/ollama

Registry

https://hub.docker.com/r/ollama/ollama/

Last updated: 2026-05-27

First seen: 2023-11-08

Run Ollama on Unraid.

Ollama is available in Community Apps for Unraid OS. Explore Unraid to build a flexible home server, NAS, or homelab.
