ollama-intel-gpu

Docker app from SpaceInvaderOne's Repository

Overview

Ollama for Intel Arc GPUs (B580, A770, A750, etc.), powered by Intel IPEX-LLM. It is a drop-in replacement for the standard Ollama container and exposes the same API on port 11434. Requires an Intel Arc GPU.
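Because the container exposes the standard Ollama API, an existing Ollama client should work unchanged. The following is a minimal sketch using only the Python standard library and Ollama's /api/generate endpoint. The base URL and the model name are assumptions: substitute your server's address and a model you have already pulled.

```python
import json
import urllib.request

# Assumption: the container is reachable at this address. Replace the host
# with your Unraid server's IP; 11434 is the template's default port.
BASE_URL = "http://localhost:11434"


def build_generate_request(prompt, model, base_url=BASE_URL):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model, base_url=BASE_URL, timeout=300):
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_generate_request(prompt, model, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["response"]


# Example call ("llama3.2" is a placeholder model name):
#   print(generate("Why is the sky blue?", model="llama3.2"))
```

The first request after a model has been unloaded can be slow while the model is loaded back into VRAM, which is why the sketch uses a generous timeout.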

Requirements

Intel Arc GPU (B580, A770, A750, or other Arc series) with the kernel driver loaded on the host.
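One rough way to check the driver requirement is to look for DRM render nodes on the host, since the template passes /dev/dri into the container. The sketch below assumes the usual Linux layout, where a loaded GPU driver creates entries such as renderD128 under /dev/dri. The function name is illustrative, and a render node alone does not prove the device is an Arc card (an integrated GPU creates one too).

```python
from pathlib import Path


def find_render_nodes(dri_dir="/dev/dri"):
    """Return the sorted render node names (renderD128, ...) under dri_dir.

    An empty list suggests that no GPU kernel driver is loaded, or that the
    directory does not exist on this machine.
    """
    directory = Path(dri_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("renderD*"))


# Example call on the Unraid host:
#   nodes = find_render_nodes()
#   print(nodes or "no render nodes found; check that the GPU driver is loaded")
```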

Runtime arguments

Web UI: http://[IP]:[PORT:11434]/
Network: bridge
Shell: bash
Privileged: false
Extra Params: --device=/dev/dri

Template configuration

Model Storage (Path, rw)

Path on the host for persistent model storage. Models are large (4-20 GB each).

Target: /root/.ollama
Default: /mnt/user/appdata/ollama-intel-gpu
Value: /mnt/user/appdata/ollama-intel-gpu

Ollama API Port (Port, tcp)

Port for the Ollama API.

Target: 11434
Default: 11434
Value: 11434

OLLAMA_ORIGINS (Variable)

Allowed origins for CORS. Set to * to allow Open WebUI and other frontends to connect.

Default: *
Value: *

ONEAPI_DEVICE_SELECTOR (Variable)

Select which Intel GPU to use. Use level_zero:0 for the first GPU. Change only if you have multiple Intel GPUs.

Default: level_zero:0
Value: level_zero:0

OLLAMA_NUM_PARALLEL (Variable)

Number of parallel inference requests. Set to 1 for 12 GB VRAM cards (B580). Increase only if you have more VRAM.

Default: 1
Value: 1

OLLAMA_NUM_CTX (Variable)

Context window size in tokens. Larger values use more VRAM. The default of 4096 is a good balance for 12 GB cards.

Default: 4096
Value: 4096
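
To see why the context size and the number of parallel requests both affect VRAM, a back-of-the-envelope estimate of the key/value cache can help. This is a rough sketch, not a measurement of this container. It assumes a standard transformer KV cache stored in 16-bit values and that each parallel request gets its own context-sized cache. The model shape in the example (32 layers, 8 KV heads, head dimension 128) is an illustrative figure for an 8B-class model, not something taken from this listing.

```python
def kv_cache_bytes(num_ctx, num_parallel, n_layers, n_kv_heads, head_dim,
                   bytes_per_value=2):
    """Estimate KV cache size in bytes.

    Each layer stores one key and one value vector (hence the factor 2) of
    n_kv_heads * head_dim values per token, for every token of every
    parallel request's context. Model weights are not included.
    """
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * num_ctx * num_parallel


# Illustrative model shape (assumption): 32 layers, 8 KV heads, head_dim 128.
GIB = 1024 ** 3
default_cache = kv_cache_bytes(4096, 1, 32, 8, 128)  # 0.5 GiB
doubled_ctx = kv_cache_bytes(8192, 1, 32, 8, 128)    # 1.0 GiB
two_parallel = kv_cache_bytes(4096, 2, 32, 8, 128)   # 1.0 GiB
```

Under these assumptions, doubling either setting doubles the cache, and the cache comes on top of the model weights themselves, which is consistent with the advice to keep both values low on 12 GB cards.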

OLLAMA_KEEP_ALIVE (Variable)

How long to keep a model loaded in VRAM after the last request. Use 5m for 5 minutes, -1 for forever, 0 to unload immediately.

Default: 5m
Value: 5m
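
Outside the Unraid template, the settings above map roughly onto a plain docker run command. The sketch below assembles that command from the template defaults. It is an approximation rather than the exact command Unraid generates: the -d flag, the container name, and the omitted image tag (which makes Docker fall back to latest) are assumptions, and the function name is illustrative.

```python
import shlex

# Repository name from this listing; no tag is given there, so none is set.
IMAGE = "spaceinvaderone/ollama-intel-gpu"

# Template defaults for the container's environment variables.
DEFAULT_ENV = {
    "OLLAMA_ORIGINS": "*",
    "ONEAPI_DEVICE_SELECTOR": "level_zero:0",
    "OLLAMA_NUM_PARALLEL": "1",
    "OLLAMA_NUM_CTX": "4096",
    "OLLAMA_KEEP_ALIVE": "5m",
}


def build_docker_run(host_models_dir="/mnt/user/appdata/ollama-intel-gpu",
                     host_port=11434, env=None, name="ollama-intel-gpu"):
    """Return a docker run argument list mirroring the template settings."""
    settings = {**DEFAULT_ENV, **(env or {})}
    argv = [
        "docker", "run", "-d",            # -d (detached) is an assumption
        "--name", name,                   # container name is an assumption
        "--network", "bridge",
        "--device=/dev/dri",              # Extra Params: pass the GPU through
        "-p", f"{host_port}:11434",       # Ollama API port
        "-v", f"{host_models_dir}:/root/.ollama:rw",  # model storage
    ]
    for key, value in settings.items():
        argv += ["-e", f"{key}={value}"]
    argv.append(IMAGE)
    return argv


# Example: print a shell-safe command line with a larger context window.
#   print(shlex.join(build_docker_run(env={"OLLAMA_NUM_CTX": "8192"})))
```

Using shlex.join matters here because the OLLAMA_ORIGINS value of * would otherwise be expanded by the shell.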

Download Statistics

Total Downloads: 876

Details

Repository: spaceinvaderone/ollama-intel-gpu
Last Updated: 2026-03-27
First Seen: 2026-04-03

ollama-intel-gpu is listed in Community Apps for Unraid OS.