LTX-2 Technical Optimization: Benchmarking FP8, NVFP4, and Schedulers
While our initial LTX-2 technical guide covers the deployment fundamentals, getting professional results from Lightricks’ model requires a granular understanding of quantization formats. With an asymmetric 19-billion-parameter architecture (14B video, 5B audio), the choice between FP8 and NVFP4 is a direct trade-off between inference speed and latent fidelity.
Blackwell and Ada Architectures: The NVFP4 Stakes
NVIDIA’s Blackwell architecture (RTX 50 series) introduces native NVFP4 support, while Ada (RTX 40 series) accelerates FP8 natively but has no hardware FP4 path. Contrary to the myth of “near-lossless” quality, LTX-2 benchmarks reveal visible degradation in NVFP4 compared to FP16 or FP8, particularly in complex textures and audio clarity.
Quality Benchmarks: Fidelity vs. Speed
Rounding errors in the 4-bit format reduce sampling precision, which shows up as over-smoothed faces and metallic-sounding audio.
| Format | Quality vs. FP16 | Speed Gain (RTX 5090) | LTX-2 Specific Impact |
|---|---|---|---|
| FP16 / BF16 | 100% (Baseline) | 1x | Maximum detail, crystal-clear audio. |
| FP8 | 90–95% | 1.5–2x | Near-original, excellent balance. |
| NVFP4 | 75–85% | 2–4x | Simplified visuals, “robotic” audio. |
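
To make the rounding loss concrete, here is a minimal Python sketch of 4-bit block quantization. It is an illustration, not the actual NVFP4 kernel: it assumes the E2M1 magnitude grid and 16-value blocks, and it keeps one full-precision scale per block, whereas the real format stores its scales in reduced precision. The names `quantize_block` and `fake_quantize` are hypothetical.

```python
import random

# Simplified 4-bit block quantization (illustrative only, not the NVFP4 kernel).
# Assumptions: E2M1 magnitude grid, 16-value blocks, one float scale per block.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK = 16


def quantize_block(values):
    """Round one block onto the 4-bit grid and return the dequantized values."""
    peak = max(abs(v) for v in values)
    if peak == 0.0:
        return [0.0] * len(values)
    scale = peak / E2M1_GRID[-1]  # map the block's largest magnitude onto 6.0
    out = []
    for v in values:
        # Pick the grid point closest to the scaled magnitude, then restore the sign.
        nearest = min(E2M1_GRID, key=lambda g: abs(abs(v) / scale - g))
        out.append((nearest if v >= 0 else -nearest) * scale)
    return out


def fake_quantize(values):
    """Quantize and dequantize a flat list of values block by block."""
    out = []
    for i in range(0, len(values), BLOCK):
        out.extend(quantize_block(values[i:i + BLOCK]))
    return out


def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(1024)]  # stand-in for model weights
restored = fake_quantize(weights)
print(f"mean absolute rounding error: {mean_abs_error(weights, restored):.4f}")
```

Each block has only eight magnitudes to choose from, so small differences between neighbouring values collapse onto the same grid point. That collapse is the mechanism behind the smoothing described above.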


Comprehensive LTX-2 Version Comparison
The following table helps you select the optimal version based on your workflow, from initial drafts to final production.
| Version | Visual Quality | Audio Quality | Time (RTX 5090) | VRAM | Ideal Use Case |
|---|---|---|---|---|---|
| Full BF16 | Maximal | Maximal | 100s+ | ~38 GB | Absolute reference. |
| Full FP8 | High | Excellent | ~50s | ~27 GB | High-quality production. |
| Full NVFP4 | Medium | Medium | ~40s | ~20 GB | Balanced speed. |
| Distilled | Medium | Medium | ~2–5s | ~19 GB | Ultra-fast prototyping. |
Official weights are hosted on the LTX-2 Hugging Face Model Card.
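
The comparison table can be encoded as a small lookup that picks the highest-fidelity version fitting in the available VRAM. This is only a sketch: the figures are the approximate values from the table (lower bounds where a range or "+" is given), and `pick_variant` and its `headroom_gb` margin are hypothetical, not part of any LTX-2 tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    vram_gb: float   # approximate VRAM requirement from the table
    seconds: float   # approximate RTX 5090 generation time from the table


# Ordered from highest to lowest fidelity, as in the table.
VARIANTS = [
    Variant("Full BF16", 38, 100),
    Variant("Full FP8", 27, 50),
    Variant("Full NVFP4", 20, 40),
    Variant("Distilled", 19, 2),
]


def pick_variant(free_vram_gb, headroom_gb=2.0):
    """Return the highest-fidelity variant that fits, or None if nothing fits.

    headroom_gb is an assumed safety margin for activations and other buffers.
    """
    for variant in VARIANTS:
        if variant.vram_gb + headroom_gb <= free_vram_gb:
            return variant
    return None


def speedup_vs_bf16(variant):
    """Speed-up relative to the BF16 reference time in the table."""
    return VARIANTS[0].seconds / variant.seconds


choice = pick_variant(32)  # an RTX 5090 has 32 GB of VRAM
print(choice.name, f"{speedup_vs_bf16(choice):.1f}x faster than BF16")
```

With 32 GB, the BF16 version does not fit, so the lookup falls through to Full FP8. That matches the table's positioning of FP8 as the production choice on a 5090.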
Deep Dive: Schedulers and Sharpness in NVFP4
In NVFP4, the loss of precision makes the model highly sensitive to the choice of scheduler. Fine textures tend to smear under aggressive quantization.
Scheduler Impact on Textures
- Euler / Euler Ancestral: Recommended for NVFP4 as they maintain a solid structure despite compression.
- DPM++ SDE: Avoid in NVFP4; it amplifies quantization noise, creating unstable artifacts. Reserve this for FP8.
- UniPC: The best speed/sharpness compromise for 4-bit, avoiding excessive contour smoothing.
Sharpness Optimization
To compensate for NVFP4’s visual simplification, increase the step count to 24–28 and strengthen the audio guidance (Modality-CFG, st = 8 or higher). Stronger audio guidance indirectly stabilizes visual details by keeping the latents tightly synchronized with the audio frequencies.
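
The NVFP4 recommendations in this section can be collected into a simple settings check. This is a sketch under stated assumptions: the scheduler labels are plain strings chosen for illustration and do not correspond to identifiers in any particular UI or library, and `review_nvfp4_settings` is a hypothetical helper.

```python
# NVFP4 guidance from this article, expressed as data.
# Scheduler labels are illustrative strings, not identifiers from a specific tool.
NVFP4_PREFERRED = {"euler", "euler ancestral", "unipc"}
NVFP4_AVOID = {"dpm++ sde"}
NVFP4_MIN_STEPS = 24
NVFP4_MAX_STEPS = 28
NVFP4_MIN_AUDIO_GUIDANCE = 8.0


def review_nvfp4_settings(scheduler, steps, audio_guidance):
    """Return a list of warnings for an NVFP4 run; an empty list means no issues."""
    warnings = []
    name = scheduler.strip().lower()
    if name in NVFP4_AVOID:
        warnings.append(f"{scheduler}: amplifies quantization noise, reserve it for FP8")
    elif name not in NVFP4_PREFERRED:
        warnings.append(f"{scheduler}: not covered by the NVFP4 recommendations")
    if steps < NVFP4_MIN_STEPS:
        warnings.append(f"steps={steps}: raise to {NVFP4_MIN_STEPS}-{NVFP4_MAX_STEPS}")
    if audio_guidance < NVFP4_MIN_AUDIO_GUIDANCE:
        warnings.append(f"audio guidance={audio_guidance}: raise to 8 or higher")
    return warnings


for line in review_nvfp4_settings("DPM++ SDE", steps=20, audio_guidance=4.0):
    print("warning:", line)
```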
FAQ
Why is my RTX 5090 slow in NVFP4?
The NVFP4 performance gains require CUDA 13 or later. On older CUDA libraries, NVFP4 can run 30–40% slower than expected.
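
A quick way to verify this is to check which CUDA version your PyTorch build was compiled against. The sketch below assumes a PyTorch-based setup and reads `torch.version.cuda`. The helper names are hypothetical, and the threshold of 13 simply restates the requirement above.

```python
def cuda_major(version):
    """Parse the major version from a string such as '12.8' or '13.0'."""
    return int(version.split(".")[0])


def nvfp4_toolkit_ok(cuda_version, required_major=13):
    """True if the reported CUDA version meets the requirement stated above."""
    if not cuda_version:
        return False
    return cuda_major(cuda_version) >= required_major


def detect_cuda_version():
    """Return the CUDA version PyTorch was built with, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda  # None on CPU-only builds


if __name__ == "__main__":
    version = detect_cuda_version()
    if nvfp4_toolkit_ok(version):
        print(f"CUDA {version}: meets the NVFP4 requirement")
    else:
        print(f"CUDA {version}: NVFP4 may run slower than expected")
```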
Is the GGUF format relevant on an RTX 5090?
No. GGUF is a community solution for low-VRAM cards. On a 5090, always prefer native FP8 or NVFP4.
How can I improve degraded audio in NVFP4?
Increase the st value in audio guidance to compensate for latent precision loss. Native audio remains 24 kHz stereo regardless of the format.
