ACE-Step

Application

Overview

ACE-Step 1.5 - AI Music Generation. Generate full songs with vocals, instrumentals, and lyrics using a Diffusion Transformer. Supports text-to-music, remixing, cover generation, and LoRA fine-tuning. Requires an NVIDIA GPU with CUDA support.

FIRST RUN: Models (~10GB) are downloaded automatically on first start. This may take several minutes depending on your internet speed. Subsequent starts reuse the downloaded models and skip this step.

SETTINGS GUIDE:

DiT Model - The core music generation model.

  • turbo (default): Fast generation in 8 steps. Best for most users.
  • turbo-rl: Turbo with reinforcement learning refinement.
  • sft: Higher quality, 50 steps (slower).
  • base: 50 steps with all features (extract, lego, complete).
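
To put the step counts above in perspective, here is a minimal Python sketch of the arithmetic. The names `DIT_STEPS` and `relative_step_cost` are hypothetical and not part of ACE-Step. The sketch assumes each diffusion step costs about the same across variants, which this guide does not state, so treat the ratio as a rough estimate only. The guide gives no step count for turbo-rl, so it is left unset.

```python
from typing import Dict, Optional

# Step counts as listed in the settings guide.
# turbo-rl has no stated step count, so it is recorded as None.
DIT_STEPS: Dict[str, Optional[int]] = {
    "turbo": 8,
    "turbo-rl": None,
    "sft": 50,
    "base": 50,
}


def relative_step_cost(variant: str, baseline: str = "turbo") -> Optional[float]:
    """Return how many times more steps `variant` runs than `baseline`.

    Assumes equal cost per step (an assumption, not a documented fact).
    Returns None when either step count is unknown.
    """
    steps = DIT_STEPS[variant]
    base_steps = DIT_STEPS[baseline]
    if steps is None or base_steps is None:
        return None
    return steps / base_steps


if __name__ == "__main__":
    for name in DIT_STEPS:
        print(name, relative_step_cost(name))
```

Under that assumption, sft and base run 50 / 8 = 6.25 times as many steps as turbo.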

Language Model - Controls lyrics understanding and chain-of-thought reasoning.

  • 1.7B (default): Best balance of quality and VRAM. Recommended for 12-16GB GPUs.
  • 0.6B: For GPUs with less than 12GB VRAM.
  • 4B: Highest quality lyrics understanding. Requires 24GB+ VRAM.
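
The VRAM guidance above can be read as a simple lookup. The following Python sketch is illustrative only: `suggest_lm_size` is a hypothetical helper, not something the container provides. The guide does not cover GPUs between 16GB and 24GB, so the sketch assumes the default 1.7B for that range.

```python
def suggest_lm_size(vram_gb: float) -> str:
    """Suggest a Language Model setting from GPU VRAM in GB.

    Thresholds follow the settings guide. The 16-24GB range is not
    covered there, so falling back to the default 1.7B is an assumption.
    """
    if vram_gb < 0:
        raise ValueError("vram_gb must be non-negative")
    if vram_gb < 12:
        return "0.6B"   # guide: for GPUs with less than 12GB VRAM
    if vram_gb >= 24:
        return "4B"     # guide: requires 24GB+ VRAM
    return "1.7B"       # guide: default, recommended for 12-16GB GPUs


if __name__ == "__main__":
    for gb in (8, 12, 16, 24):
        print(gb, suggest_lm_size(gb))
```

The 4B model is listed as the highest-quality option rather than a requirement for large GPUs, so 1.7B remains a valid choice at 24GB and above.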

Enable LLM - Whether to load the language model.

  • auto (default): Decides whether to load the language model based on your GPU's VRAM.
  • false: DiT-only mode. Faster startup, uses less VRAM, but disables thinking/sample features.
  • true: Force enable.

LM Backend - Engine for the language model.

  • pt (default): PyTorch native. Works on all GPUs including RTX 50-series.
  • vllm: Faster inference but may crash on RTX 50-series (Blackwell) GPUs.

CPU Offloading - Moves models between GPU and CPU to save VRAM.

  • auto (default): Offloads if GPU has less than 20GB VRAM.
  • false: Keep all models on GPU. Faster generation but uses ~12GB VRAM at idle.
  • true: Always offload. Slower but frees VRAM for other containers.
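
The three CPU Offloading values above reduce to one decision. This Python sketch shows that logic as described in the guide; `should_offload` is a hypothetical name, and the container's actual implementation may differ.

```python
def should_offload(setting: str, vram_gb: float) -> bool:
    """Resolve the CPU Offloading setting to a yes/no decision.

    Mirrors the guide: "auto" offloads when the GPU has less than
    20GB VRAM, "true" always offloads, "false" never does.
    """
    value = setting.strip().lower()
    if value == "true":
        return True
    if value == "false":
        return False
    if value == "auto":
        return vram_gb < 20
    raise ValueError(f"unknown CPU Offloading setting: {setting!r}")


if __name__ == "__main__":
    print(should_offload("auto", 16))   # a 16GB GPU
    print(should_offload("auto", 24))   # a 24GB GPU
```

Because "false" keeps roughly 12GB of VRAM in use at idle according to the guide, "true" may be the better choice when the GPU is shared with other containers, even on a card with 20GB or more.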

UI Language - Web interface language: English, Chinese, or Japanese.

Download Statistics

1,191
Total Downloads

Details

Repository
spaceinvaderone/ace-step:latest
Last Updated
2026-02-22
First Seen
2026-02-22