AI Model Guide
NGMC integrates multiple AI models for image, video, and audio generation. Each model has unique capabilities, strengths, and limitations. This guide helps you choose the right model and get the most out of it.
Image Models
Generate and edit still images from text prompts and reference images.
| Model | Speed | Max Resolution | Reference Images | Key Strength |
|---|---|---|---|---|
| GPT Image 2 | ~45s | 4K (3840x3840) | Yes | Prompt-faithful generation + edits, three quality tiers |
| Nano Banana Pro | ~45s | 4K (6336x5504) | Up to 6 | Professional quality, multi-turn editing |
| Nano Banana 2 | ~15s | 4K (4096x4096) | Up to 14 | Fastest, most reference images, 0.5K draft mode |
Video Models
Generate videos from text, images, or extend existing clips.
| Model | Speed | Max Resolution | Duration | Audio | Key Strength |
|---|---|---|---|---|---|
| Veo 3.1 | ~2min | 4K | 4 / 6 / 8s | Native audio + dialogue | Highest quality, 4K, video extension |
| Veo 3.1 Fast | ~45s | 720p | 4 / 6 / 8s | Native audio + dialogue | Fast iteration, video extension |
| Seedance 2.0 | ~2min | 1080p | 4–15s (continuous) | Optional native audio | Cinematic quality, multimodal references, mixed image + video + audio inputs |
| Seedance 2.0 Fast | ~45s | 1080p | 4–15s (continuous) | Optional native audio | Fast Seedance variant with the same capability surface |
Audio Models
Generate sound effects, music, and ambient audio.
| Model | Speed | Duration | Key Strength |
|---|---|---|---|
| Lyria 3 (Clip) | ~30s | Up to 30s | Fast iteration for SFX and short music cues |
| Lyria 3 (Pro) | ~1min | Up to 2min | Higher-quality music and ambient soundscapes |
Referencing Images in Prompts
You can use @-mentions in your prompt to tell the model exactly how to use each reference image.
How it works
Type @ in the prompt textarea to open the reference picker. Select an image, and a mention like @[Image 1] is inserted at that position. At generation time, the model sees the actual image inline at that exact point in your text, giving it precise context.
Example:
Add one more ragdoll cat based on @[Image 2], and replace the background
with the scenery from @[Image 1].
The model receives Image 2 right after "based on" and Image 1 right after "scenery from", so it understands which image to use for which purpose.
Without @-mentions
If you don't use @, all reference images are appended after your prompt text. The model still sees them but has to infer how to use each one. For simple cases (single reference, style transfer) this works fine. Use @ when you need the model to distinguish between multiple images.
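For intuition, here is a minimal sketch of the interleaving idea: a prompt containing @[Image N] mentions is split into an ordered sequence of text and image-reference parts, so each image lands exactly where it was mentioned. The names (`PromptPart`, `interleavePrompt`) are illustrative, not part of the NGMC API.

```typescript
// Illustrative sketch only — not the actual NGMC implementation.
type PromptPart =
  | { kind: "text"; text: string }
  | { kind: "image"; index: number };

function interleavePrompt(prompt: string): PromptPart[] {
  const parts: PromptPart[] = [];
  const mention = /@\[Image (\d+)\]/g;
  let last = 0;
  for (const m of prompt.matchAll(mention)) {
    // Text before this mention becomes a text part.
    if (m.index! > last) {
      parts.push({ kind: "text", text: prompt.slice(last, m.index!) });
    }
    // The mention itself becomes an inline image part.
    parts.push({ kind: "image", index: Number(m[1]) });
    last = m.index! + m[0].length;
  }
  if (last < prompt.length) {
    parts.push({ kind: "text", text: prompt.slice(last) });
  }
  return parts;
}
```

Run on the ragdoll-cat example above, this yields five parts: text, Image 2, text, Image 1, text — which is the ordering the model receives.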
Provider behavior
| Provider | @-mention behavior |
|---|---|
| Image models (Nano Banana Pro/2) | Full interleaving — images placed inline at the mentioned position |
| Video models (Veo 3.1/Fast, Seedance 2.0/Fast) | Mentions are resolved to display names in the prompt text; images are passed as structured reference inputs |
| Audio models (Lyria 3 Clip/Pro) | Not applicable (audio references only) |
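To illustrate the video-model row above: mentions are rewritten to plain display names in the prompt text, while the referenced images are collected separately as structured reference inputs. This is a sketch with assumed names — `resolveForVideo` and its return shape are illustrative, not the NGMC API.

```typescript
// Illustrative sketch of the video-model path — not the actual NGMC code.
function resolveForVideo(prompt: string): { text: string; referenceIndexes: number[] } {
  const referenceIndexes: number[] = [];
  // Replace each @[Image N] mention with its display name, and remember
  // which images were referenced so they can be passed as structured inputs.
  const text = prompt.replace(/@\[Image (\d+)\]/g, (_match, n) => {
    referenceIndexes.push(Number(n));
    return `Image ${n}`;
  });
  return { text, referenceIndexes };
}
```

The model then receives a clean text prompt ("…the lighting of Image 1…") plus the images as a separate reference list, rather than images interleaved mid-sentence.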
Choosing a Model
- Iterating on ideas? Start with Nano Banana 2 (images) or a Fast video variant (Veo 3.1 Fast, Seedance 2.0 Fast) for speed.
- Final output quality? Use Nano Banana Pro (images), Veo 3.1, or Seedance 2.0 (video) for production.
- Need many reference images? Nano Banana 2 supports up to 14 reference images; Seedance 2.0 accepts up to 9.
- Mixed reference inputs (image + video + audio)? Seedance 2.0 accepts all three modalities in the same run.
- Need dialogue in video? Veo 3.1 / Fast generate speech from quoted text in your prompt.
- Custom duration between 4 and 15 seconds? Seedance 2.0 accepts any integer duration in that range; Veo is fixed to 4 / 6 / 8 seconds.