Docling-Serve
Docker app from xxBeanSproutxx's Repository
Overview
Docling is an open-source toolkit (from IBM Research) that converts documents (PDF, DOCX, images, HTML, etc.) into structured Markdown or JSON. It is well suited to retrieval-augmented generation (RAG) pipelines and local document processing.
Highlights
- Multi-format parsing with layout understanding and table extraction.
- Simple API + optional Web UI.
- Runs locally on your Unraid box; keep your data private.
Default Endpoints
- API: http://[IP]:[PORT:5001]
- Docs: http://[IP]:[PORT:5001]/docs
- Web UI: http://[IP]:[PORT:5001]/ui (set DOCLING_SERVE_ENABLE_UI=1)
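As a quick way to confirm the container is up, the endpoints above can be built and probed from any machine on the network. The following is a minimal sketch, not part of the template: the host address in the usage comment is a hypothetical placeholder, and the helper names `endpoint_urls` and `is_reachable` are made up for illustration.

```python
from urllib.error import URLError
from urllib.request import urlopen


def endpoint_urls(host: str, port: int = 5001) -> dict:
    """Build the three default endpoint URLs listed above."""
    base = f"http://{host}:{port}"
    return {"api": base, "docs": f"{base}/docs", "ui": f"{base}/ui"}


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


# Usage (hypothetical host address; replace with your Unraid server's IP):
#   urls = endpoint_urls("192.168.1.50")
#   print(is_reachable(urls["docs"]))
```

If the `/ui` URL fails while `/docs` succeeds, the most likely cause is that DOCLING_SERVE_ENABLE_UI is not set to 1.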
First-Run Model Download
- On a fresh install the models directory will be empty. Docling must download RapidOCR and other artifacts on first boot.
- Make sure DOCLING_SERVE_ENABLE_REMOTE_SERVICES is set to true for the very first start so downloads can reach upstream model hosts (e.g. modelscope.cn).
- After the first successful start, once the model cache is populated, you may set DOCLING_SERVE_ENABLE_REMOTE_SERVICES back to false if you prefer a fully local-only deployment.
- Keep DOCLING_SERVE_LOAD_MODELS_AT_BOOT=true so any download failures show up immediately in the startup logs rather than at the first OCR request.
Persistent Paths
- Models/artifacts are persisted in appdata so restarts do not re-download everything.
- If logs show "artifacts_path is set to an invalid directory", verify the models path exists and matches DOCLING_SERVE_ARTIFACTS_PATH.
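When the invalid-directory message appears, two things are worth checking: that the host-side models directory exists and is writable, and that the container-side mount target equals DOCLING_SERVE_ARTIFACTS_PATH. The sketch below illustrates both checks; the function names are hypothetical, and the host check is meant to be run on the Unraid host itself.

```python
import os
from pathlib import Path, PurePosixPath


def check_models_dir(host_path: str) -> list:
    """Return a list of problems found with the host-side models directory."""
    problems = []
    path = Path(host_path)
    if not path.is_dir():
        problems.append(f"{path} does not exist or is not a directory")
    elif not os.access(path, os.W_OK):
        problems.append(f"{path} is not writable by the current user")
    return problems


def paths_match(mount_target: str, artifacts_env: str) -> bool:
    """Compare the container mount target with DOCLING_SERVE_ARTIFACTS_PATH.

    PurePosixPath ignores a trailing slash, so both spellings compare equal.
    """
    return PurePosixPath(mount_target) == PurePosixPath(artifacts_env)


# Usage with the template defaults:
#   check_models_dir("/mnt/user/appdata/docling/models")
#   paths_match("/opt/app-root/src/.cache/docling/models",
#               "/opt/app-root/src/.cache/docling/models")
```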
Requirements
**CPU-only deployments**
- Select the `cpu` branch (`quay.io/docling-project/docling-serve-cpu`) for CPU-only operation.
- `DOCLING_DEVICE=cpu` is a runtime hint; it does NOT replace choosing the CPU image branch.
**GPU deployments (NVIDIA, optional)**
- Install the NVIDIA Driver plugin and reboot.
- Add `--gpus all` in Extra Parameters.
- If GPU is detected but jobs still run on CPU, try branch `cu126` (better compatibility on some older driver stacks) or update NVIDIA drivers.
- Optional: set `DOCLING_DEVICE=cuda` (or `cuda:0`) to force GPU execution.
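The CPU and GPU choices above boil down to three settings: the image, the Extra Parameters, and DOCLING_DEVICE. The sketch below collects them in one place. It assumes that the default image (`quay.io/docling-project/docling-serve`) is the GPU-capable one; the function name `device_settings` is hypothetical, and the `cu126` branch is left out because its exact image name is not given here.

```python
from typing import Optional

CPU_IMAGE = "quay.io/docling-project/docling-serve-cpu"
# Assumption: the default image is the one used for NVIDIA GPU deployments.
GPU_IMAGE = "quay.io/docling-project/docling-serve"


def device_settings(use_gpu: bool, device: Optional[str] = None) -> dict:
    """Return the image, Extra Parameters, and env vars for a CPU or GPU deployment."""
    if use_gpu:
        return {
            "image": GPU_IMAGE,
            "extra_params": ["--gpus", "all"],
            # "cuda" or "cuda:0" forces GPU execution.
            "env": {"DOCLING_DEVICE": device or "cuda"},
        }
    return {
        "image": CPU_IMAGE,
        "extra_params": [],
        # A runtime hint only; the CPU image is what makes this CPU-only.
        "env": {"DOCLING_DEVICE": "cpu"},
    }


print(device_settings(use_gpu=False))
print(device_settings(use_gpu=True, device="cuda:0"))
```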
Runtime Parameters
- Web UI: http://[IP]:[PORT:5001]/ui
- Network: bridge
- Shell: sh
- Privileged: false
Template Configuration
Docling Serve API/UI port
- Target: 5001
- Default: 5001
Persistent Docling model artifacts (required; must exist and be writable; must match DOCLING_SERVE_ARTIFACTS_PATH)
- Target: /opt/app-root/src/.cache/docling/models
- Default: /mnt/user/appdata/docling/models
Persistent HuggingFace cache
- Target: /opt/app-root/src/.cache/huggingface
- Default: /mnt/user/appdata/docling/huggingface_cache
Caches OCR models
- Target: /opt/app-root/.EasyOCR
- Default: /mnt/user/appdata/docling/easyocr_cache
Enable the /ui playground (1/0)
- Target: DOCLING_SERVE_ENABLE_UI
- Default: 1
Preload/download models at startup (recommended for easier first-run diagnostics).
- Target: DOCLING_SERVE_LOAD_MODELS_AT_BOOT
- Default: true
Allow remote model/service calls. Enabled by default so first-run model downloads work automatically. Set to false after initial setup if you prefer fully offline operation.
- Target: DOCLING_SERVE_ENABLE_REMOTE_SERVICES
- Default: true
Directory used by Docling to load/store model artifacts
- Target: DOCLING_SERVE_ARTIFACTS_PATH
- Default: /opt/app-root/src/.cache/docling/models
Runtime device: auto, cpu, cuda, cuda:0, mps. Use 'auto' to let Docling decide. For CPU-only, select the cpu image branch above.
- Target: DOCLING_DEVICE
- Default: auto
VLM used for image-to-text
- Target: DOCLING_SERVE_IMAGE_TO_TEXT_MODEL
- Default: HuggingFaceTB/SmolVLM-256M-Instruct
Figure/diagram classifier
- Target: DOCLING_SERVE_PICTURE_CLASSIFICATION_MODEL
- Default: ds4sd/DocumentFigureClassifier
- Default: 99
- Default: 100
- Default: all
- Default: compute,utility
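For readers who want to see how the template values fit together outside the Unraid UI, the sketch below assembles the port, volume, and environment defaults listed above into a `docker run` argument list. It is an illustration only: Unraid generates its own command from the template, the container name is a hypothetical placeholder, and the unlabeled defaults at the end of the list are omitted because their variable names are not shown.

```python
import shlex

IMAGE = "quay.io/docling-project/docling-serve"
PORT = 5001

# Host path -> container path, taken from the template defaults above.
VOLUMES = {
    "/mnt/user/appdata/docling/models": "/opt/app-root/src/.cache/docling/models",
    "/mnt/user/appdata/docling/huggingface_cache": "/opt/app-root/src/.cache/huggingface",
    "/mnt/user/appdata/docling/easyocr_cache": "/opt/app-root/.EasyOCR",
}

# Environment variable -> default value, taken from the template defaults above.
ENV = {
    "DOCLING_SERVE_ENABLE_UI": "1",
    "DOCLING_SERVE_LOAD_MODELS_AT_BOOT": "true",
    "DOCLING_SERVE_ENABLE_REMOTE_SERVICES": "true",
    "DOCLING_SERVE_ARTIFACTS_PATH": "/opt/app-root/src/.cache/docling/models",
    "DOCLING_DEVICE": "auto",
    "DOCLING_SERVE_IMAGE_TO_TEXT_MODEL": "HuggingFaceTB/SmolVLM-256M-Instruct",
    "DOCLING_SERVE_PICTURE_CLASSIFICATION_MODEL": "ds4sd/DocumentFigureClassifier",
}


def docker_run_args(name: str = "docling-serve") -> list:
    """Assemble a docker run argument list (the container name is a placeholder)."""
    args = ["docker", "run", "-d", "--name", name,
            "--network", "bridge", "-p", f"{PORT}:{PORT}"]
    for host_path, target in VOLUMES.items():
        args += ["-v", f"{host_path}:{target}"]
    for key, value in ENV.items():
        args += ["-e", f"{key}={value}"]
    args.append(IMAGE)  # the image name must come after all options
    return args


# Print the command as a single shell-quoted line for inspection.
print(shlex.join(docker_run_args()))
```

Note that the models volume target and DOCLING_SERVE_ARTIFACTS_PATH carry the same container path, which is the match the template requires.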
Details
quay.io/docling-project/docling-serve
Run Docling-Serve on Unraid.
Docling-Serve is listed in Community Applications for Unraid OS. Explore Unraid to build a flexible home server, NAS, or homelab.