
ACE-Step

Application

Overview

ACE-Step 1.5 - AI Music Generation. Generate full songs with vocals, instrumentals, and lyrics using a Diffusion Transformer. Supports text-to-music, remixing, cover generation, and LoRA fine-tuning. Requires an NVIDIA GPU with CUDA support.

FIRST RUN: Models (~10GB) are downloaded automatically on first start. This may take several minutes depending on your internet speed. Subsequent starts reuse the downloaded models and skip this step.

SETTINGS GUIDE:

DiT Model - The core music generation model.

turbo (default): Fast generation in 8 steps. Best for most users.
turbo-rl: Turbo with reinforcement learning refinement.
sft: Higher quality, 50 steps (slower).
base: 50 steps with all features (extract, lego, complete).

Language Model - Controls lyrics understanding and chain-of-thought reasoning.

1.7B (default): Best balance of quality and VRAM. Recommended for 12-16GB GPUs.
0.6B: For GPUs with less than 12GB VRAM.
4B: Highest quality lyrics understanding. Requires 24GB+ VRAM.
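
The VRAM guidance above can be sketched as a small lookup. This is an illustrative helper only: the function name `recommend_lm_size` is hypothetical and not part of the container, and the guide does not say what to pick between 16GB and 24GB, so the sketch assumes the 1.7B default applies there.

```python
def recommend_lm_size(vram_gb: float) -> str:
    """Map GPU VRAM (in GB) to a language model size, following the guide.

    Hypothetical helper for illustration; not part of ACE-Step itself.
    """
    if vram_gb < 12:
        # "0.6B: For GPUs with less than 12GB VRAM."
        return "0.6B"
    if vram_gb >= 24:
        # "4B: ... Requires 24GB+ VRAM."
        return "4B"
    # 12-16GB is the recommended range for 1.7B. The 16-24GB range is not
    # covered by the guide, so falling back to the default is an assumption.
    return "1.7B"


if __name__ == "__main__":
    for vram in (8, 12, 16, 24):
        print(f"{vram}GB -> {recommend_lm_size(vram)}")
```

The 4B model remains optional on 24GB+ cards; the default 1.7B model also runs there and leaves more VRAM free.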

Enable LLM - Whether to load the language model.

auto (default): Detects based on your GPU VRAM.
false: DiT-only mode. Faster startup, uses less VRAM, but disables thinking/sample features.
true: Force enable.

LM Backend - Engine for the language model.

pt (default): PyTorch native. Works on all GPUs including RTX 50-series.
vllm: Faster inference but may crash on RTX 50-series (Blackwell) GPUs.
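
As a rough sketch of the backend choice above, the helper below picks `pt` when the GPU name looks like a consumer RTX 50-series card and `vllm` otherwise. The function name `choose_lm_backend` and the name-matching heuristic are assumptions for illustration; the pattern only covers consumer model numbers such as 5060 through 5090 and will not recognise other Blackwell cards.

```python
import re

# Assumed heuristic: consumer RTX 50-series names look like "RTX 5070 Ti".
# The pattern deliberately does not match "RTX 5000", which is used by
# older workstation cards.
_RTX_50_SERIES = re.compile(r"RTX\s*50[5-9]0", re.IGNORECASE)


def choose_lm_backend(gpu_name: str) -> str:
    """Return "pt" for RTX 50-series GPUs, otherwise "vllm".

    Hypothetical helper for illustration; not part of ACE-Step itself.
    """
    if _RTX_50_SERIES.search(gpu_name):
        # vllm "may crash on RTX 50-series (Blackwell) GPUs", so stay on
        # the PyTorch backend.
        return "pt"
    return "vllm"


if __name__ == "__main__":
    print(choose_lm_backend("NVIDIA GeForce RTX 5090"))
    print(choose_lm_backend("NVIDIA GeForce RTX 4090"))
```

When in doubt, leaving the setting at `pt` is the safe choice, since the guide states it works on all GPUs.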

CPU Offloading - Moves models between GPU and CPU to save VRAM.

auto (default): Offloads if GPU has less than 20GB VRAM.
false: Keep all models on GPU. Faster generation but uses ~12GB VRAM at idle.
true: Always offload. Slower but frees VRAM for other containers.
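
The three offloading modes can be summarised in a few lines. This is a minimal sketch of the behaviour described above, not the container's actual logic; the function name `resolve_cpu_offload` and its string-valued `setting` parameter are hypothetical.

```python
def resolve_cpu_offload(setting: str, vram_gb: float) -> bool:
    """Decide whether models should be offloaded to CPU.

    Hypothetical helper for illustration; not part of ACE-Step itself.
    """
    if setting == "true":
        # Always offload: slower, but frees VRAM for other containers.
        return True
    if setting == "false":
        # Keep all models on the GPU: faster, but ~12GB VRAM at idle.
        return False
    if setting == "auto":
        # "Offloads if GPU has less than 20GB VRAM."
        return vram_gb < 20
    raise ValueError(f"unknown CPU offloading setting: {setting!r}")


if __name__ == "__main__":
    for vram in (12, 20, 24):
        print(f"auto on {vram}GB -> offload={resolve_cpu_offload('auto', vram)}")
```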

UI Language - Web interface language: English, Chinese, or Japanese.


Details

Repository

spaceinvaderone/ace-step:latest

Registry

https://hub.docker.com/r/spaceinvaderone/ace-step/

Last Updated: 2026-02-22

First Seen: 2026-02-22
