
Z-Image Turbo LoRA Training: Complete Guide with De-Distillation Adapter
Learn how to train custom LoRA models for Z-Image Turbo using the Ostris AI Toolkit, including the de-distillation adapter technique, dataset preparation, and training parameters.
Training custom LoRA models for Z-Image Turbo requires a specific approach due to its distilled architecture. This guide covers the de-distillation adapter technique developed by Ostris, training parameters, and deployment steps.
Quick Start: If you prefer a web-based solution, use our LoRA training tool to train custom Z-Image Turbo LoRAs directly in your browser—no local GPU required.
Understanding the De-Distillation Challenge
Z-Image Turbo is a step-distilled model—it achieves fast 8-step generation through a distillation process. Standard LoRA training breaks this distillation quickly, resulting in:
- Loss of speed benefits (no longer 8-step capable)
- Unpredictable quality degradation
- Artifacts in generated outputs
The de-distillation training adapter created by Ostris solves this problem.
How the Training Adapter Works
According to the Hugging Face engineering blog:
- Adapter generation: Thousands of images were generated using Z-Image Turbo at various sizes and aspect ratios
- Controlled distillation breakdown: A LoRA was trained on these images at a low learning rate (1e-5), allowing the distillation to break down in a controlled manner
- Training on top: Your custom LoRA trains on top of this adapter, learning only your new content
- Adapter removal: At inference time, remove the training adapter—your LoRA preserves distilled speed
This approach lets you train style, character, or concept LoRAs while maintaining 8-step Turbo generation.
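The steps above can be pictured with a toy numeric sketch. It assumes that LoRA contributions behave as additive weight deltas on a frozen base, and it uses made-up 2x2 matrices in place of real layer weights; every name and value is hypothetical and chosen only for illustration.

```python
# Toy illustration only: 2x2 nested lists stand in for one layer's weights.

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

base = [[1.0, 0.0], [0.0, 1.0]]            # distilled Turbo weights (frozen)
adapter_delta = [[0.1, 0.0], [0.0, -0.1]]  # training adapter: absorbs the distillation breakdown
user_delta = [[0.0, 0.2], [0.3, 0.0]]      # your LoRA: ideally only the new concept

# During training both deltas are active, so your LoRA only has to fit the new content.
training_weights = add(add(base, adapter_delta), user_delta)

# At inference the adapter is dropped and only your LoRA sits on the distilled base.
inference_weights = add(base, user_delta)
```

In this picture, removing the adapter at inference simply means leaving `adapter_delta` out of the sum, which is why the distilled base is untouched apart from your own LoRA.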
Training Methods
Method 1: Web-Based Training (Recommended)
For those without local GPU resources, zimageturbo.com offers Z-Image Turbo LoRA training directly in your browser:
Benefits of our platform:
- No GPU required—training runs on cloud infrastructure
- Optimized parameters pre-configured for Z-Image Turbo
- Automatic de-distillation adapter integration
- Download trained LoRA weights directly
Method 2: Local Training (AI Toolkit)
For local training, use the Ostris AI Toolkit.
Hardware Requirements:
- Minimum: 12GB VRAM (RTX 3080 12GB, RTX 4070)
- Recommended: 16GB+ VRAM (RTX 4090, RTX 3090)
- CPU offloading available for lower VRAM systems
Installation:
git clone https://github.com/ostris/ai-toolkit
cd ai-toolkit
pip install -r requirements.txt
Download Training Adapter:
# Download from Hugging Face
huggingface-cli download ostris/zimage_turbo_training_adapterV2
Use V2 of the adapter for refined results.
Dataset Preparation
Quality training data is essential for good LoRA results.
Image Requirements
- Count: 5-15 images for characters, 15-25 for styles
- Resolution: 1024×1024 minimum (matches Z-Image Turbo's native resolution)
- Format: PNG or JPEG
- Variety: Include different angles, lighting, and contexts
Caption Files
Create a text file for each image with the same name:
training_images/
├── image001.png
├── image001.txt
├── image002.png
├── image002.txt
Caption format example:
a portrait of [trigger_word], detailed face, natural lighting
Use a consistent trigger word that will activate your LoRA during generation.
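Before training, it can help to confirm that every image has a matching caption and that each caption mentions the trigger word. The following is a minimal sketch using only the standard library; the function name `check_captions` and the set of accepted extensions are assumptions made for this example, not part of AI Toolkit.

```python
from pathlib import Path

# Image extensions accepted by this check (PNG or JPEG, per the requirements above)
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}

def check_captions(folder, trigger_word):
    """Return (missing, untagged): images without a .txt caption,
    and caption files that never mention the trigger word."""
    missing, untagged = [], []
    for image in sorted(Path(folder).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip captions and any other non-image files
        caption = image.with_suffix(".txt")
        if not caption.exists():
            missing.append(image.name)
        elif trigger_word not in caption.read_text(encoding="utf-8"):
            untagged.append(caption.name)
    return missing, untagged
```

Calling `check_captions("training_images", "my_trigger")` returns two lists; both should be empty before you start a run.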
Folder Structure
my_lora_training/
├── config.yaml
├── images/
│   ├── 001.png
│   ├── 001.txt
│   ├── 002.png
│   └── 002.txt
└── output/
Training Parameters
Based on community-tested configurations from GitHub issues and tutorials:
Recommended Settings
| Parameter | Value | Notes |
|---|---|---|
| Steps | 2,000-5,000 | Start with 3,000 for testing |
| Learning Rate | 5e-5 to 1e-4 | Lower for fine details |
| LoRA Rank (r) | 8-16 | Higher = more capacity |
| Resolution | 1024×1024 | Match base model |
| Batch Size | 1-2 | Adjust based on VRAM |
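
To relate the step counts in the table to dataset size, a rough calculation of how often each image is seen can be useful. This sketch assumes uniform sampling and no gradient accumulation; `passes_per_image` is a hypothetical helper, not an AI Toolkit function.

```python
def passes_per_image(steps, batch_size, image_count):
    """Approximate number of times each image is seen during training."""
    return steps * batch_size / image_count

# 3,000 steps at batch size 1 on a 15-image character set
print(passes_per_image(3000, 1, 15))  # 200.0
# The same run on a 25-image style set
print(passes_per_image(3000, 1, 25))  # 120.0
```

Smaller datasets are revisited far more often at the same step count, which is one reason to test checkpoints early rather than assume the full run is needed.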
Training Adapter Selection
Two options in AI Toolkit:
- Z-Image Turbo W/ Training Adapter
  - Preserves 8-step Turbo speed
  - Best for shorter runs (< 5,000 steps)
  - Remove adapter at inference
- Z-Image De-Turbo (De-Distilled)
  - No adapter needed at inference
  - Suitable for longer training runs
  - Slightly slower generation
For most use cases, the first option (with training adapter) is recommended.
Running Training
Basic AI Toolkit Command
python run.py config.yaml
Sample config.yaml
# Simplified example; compare key names with the example configs shipped with AI Toolkit
job: extension
config:
  name: my_z_image_lora
  process:
    - type: sd_trainer
      training_folder: ./images
      base_model: Tongyi-MAI/Z-Image-Turbo
      training_adapter: ostris/zimage_turbo_training_adapterV2
      resolution: 1024
      train_batch_size: 1
      learning_rate: 1e-4
      max_train_steps: 3000
      network:
        type: lora
        rank: 16
        alpha: 16
      save_steps: 500
      output_dir: ./output
Adjust paths and parameters based on your dataset and hardware.
Training Time Estimates
Based on user reports:
| GPU | 3,000 Steps | 5,000 Steps |
|---|---|---|
| RTX 4090 | ~1 hour | ~1.5 hours |
| RTX 3090 | ~1.5 hours | ~2.5 hours |
| RTX 4070 Ti | ~2 hours | ~3.5 hours |
| RTX 3080 (12GB) | ~3 hours | ~5 hours |
Enable Low VRAM mode if you encounter memory errors on 12GB cards.
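For step counts that are not in the table, a rough estimate can be derived from the per-step cost the reported figures imply. The sketch below simply averages the two data points per GPU; it assumes time scales linearly with steps, and the figures are the approximate user reports above, not benchmarks.

```python
# Approximate hours from the table above: (hours for 3,000 steps, hours for 5,000 steps)
REPORTED_HOURS = {
    "RTX 4090": (1.0, 1.5),
    "RTX 3090": (1.5, 2.5),
    "RTX 4070 Ti": (2.0, 3.5),
    "RTX 3080 (12GB)": (3.0, 5.0),
}

def seconds_per_step(gpu):
    """Average the per-step cost implied by the two reported data points."""
    hours_3k, hours_5k = REPORTED_HOURS[gpu]
    return (hours_3k * 3600 / 3000 + hours_5k * 3600 / 5000) / 2

def estimate_hours(gpu, steps):
    """Rough wall-clock estimate for an arbitrary step count."""
    return seconds_per_step(gpu) * steps / 3600

# Example: a short 2,000-step test run on an RTX 3090
print(round(estimate_hours("RTX 3090", 2000), 2))
```

Treat the result as an order-of-magnitude guide; resolution, Low VRAM mode, and sampling during training all change the real figure.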
Using Your Trained LoRA
In ComfyUI
- Copy LoRA file to ComfyUI/models/loras/
- Add "Load LoRA" node to workflow
- Connect to Z-Image Turbo model loader
- Set LoRA strength (0.5-1.0)
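The copy step can be scripted if you retrain often. This is a minimal sketch using the standard library; `install_lora` is a hypothetical helper name, and the ComfyUI root path is whatever your installation uses.

```python
import shutil
from pathlib import Path

def install_lora(lora_file, comfyui_root):
    """Copy a trained LoRA into <comfyui_root>/models/loras/ and return the new path."""
    target_dir = Path(comfyui_root) / "models" / "loras"
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    return Path(shutil.copy2(lora_file, target_dir))
```

For example, `install_lora("output/my_lora.safetensors", "ComfyUI")` places the file where the "Load LoRA" node looks for it; refresh or restart ComfyUI afterwards so the new file appears in the node's list.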
In Python (Diffusers)
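from diffusers import ZImagePipeline
import torch

# Load the base Turbo model in bfloat16
pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",
    torch_dtype=torch.bfloat16,
)
# Apply your trained LoRA (without the training adapter)
pipe.load_lora_weights("./my_lora.safetensors")
pipe.to("cuda")

# Generate with trigger word
image = pipe(
    prompt="a portrait of [trigger_word], detailed face",
    num_inference_steps=9,  # per the model card, this results in 8 DiT forward passes
    guidance_scale=0.0,     # Turbo runs without classifier-free guidance
).images[0]
Remember to include your trigger word in prompts.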
Troubleshooting
LoRA Not Applying
- Verify file path is correct
- Check LoRA was trained on Z-Image Turbo (not FLUX or SD)
- Increase LoRA strength
Quality Degradation
- Training too long (distillation breakdown)
- Learning rate too high
- Reduce steps or use V2 training adapter
VRAM Errors
- Enable Low VRAM mode in AI Toolkit
- Reduce batch size to 1
- Use gradient checkpointing
- Consider web-based training on zimageturbo.com
Artifacts in Output
- Remove the training adapter at inference if you trained with the "Z-Image Turbo W/ Training Adapter" option
- Check LoRA strength (reduce if artifacts appear)
- Verify training completed without errors
Best Practices
- Start small: Test with 1,000 steps before committing to longer runs
- Caption carefully: Good captions improve LoRA quality significantly
- Use V2 adapter: The refined V2 training adapter produces better results
- Monitor checkpoints: Save every 500 steps to find optimal training point
- Test incrementally: Generate samples at each checkpoint to avoid overtraining
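To compare checkpoints in order, you can list the saved files sorted by training step. The sketch below assumes checkpoints are `.safetensors` files whose names end in a step number (for example `my_z_image_lora_000000500.safetensors`); that naming pattern is an assumption, so adjust the regular expression to match what your output folder actually contains.

```python
import re
from pathlib import Path

def checkpoints_by_step(output_dir):
    """Return (step, path) pairs for .safetensors files whose stem ends in digits,
    sorted from earliest to latest step."""
    found = []
    for path in Path(output_dir).glob("*.safetensors"):
        match = re.search(r"(\d+)$", path.stem)
        if match:  # files without a trailing step number are skipped
            found.append((int(match.group(1)), path))
    return sorted(found, key=lambda pair: pair[0])
```

Generating the same prompt and seed against each returned path makes it easier to spot the step at which quality peaks before overtraining sets in.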
Related Resources
- Train your LoRA on zimageturbo.com — No GPU required
- Z-Image Turbo vs FLUX Comparison — Understand the technical differences
- ComfyUI Workflow Guide — Set up Z-Image Turbo locally
- Pricing Plans — View training credits and subscription options