
Visuals: Create Spotify Canvas & Loop Videos

Learn how to create Spotify Canvas-style loops in Visuals, choose the right models, and write prompts that produce clean, usable motion.

Written by Sebastian Mourra
Updated yesterday

Canvas & Loop Videos lets you generate short videos for Spotify Canvas and for social. You can generate from scratch, generate from a song, or animate an existing image (like your cover art or an artist photo).


What you can create

Use this flow for:

  • Spotify Canvas loops (short, vertical loops)

  • Cover art → motion visual (animate your cover art)

  • Photo → motion (animate an artist photo)

  • Short music visuals for TikTok/Reels/IG (non-loop videos also work)


How this works (so the settings make sense)

In most cases, Visuals does two steps for you:

  1. It generates (or uses) a base image

  2. It turns that image into a video using your video model

That’s why you’ll see both:

  • Image Generation Model

  • Video Generation Model

At the time of writing, you use one prompt for the overall generation (you can’t separately prompt the base image vs the video). If you want exact control over the base image, upload your own Reference Image.


Step-by-step (quick)

  1. Go to Create → Spotify Canvas & Loop Videos

  2. Choose your starting point:

  • Inspired by (generate a new visual from your prompt)

  • Based on (animate an image you provide)

  3. Optional: Select a song for context

  4. Optional: Upload a reference image (cover art, artist photo, logo, style)

  5. Write your prompt

  6. Choose your models, duration, and aspect ratio

  7. Click Generate

Your results are saved in My Library.


The biggest thing that improves results

Most “bad” results happen because the prompt describes the scene, but not the motion.

Good Canvas prompts do both:

  • What we’re looking at

  • What should move (and how)

And they also clearly say what you don’t want:

  • No text

  • No warped faces

  • No fast camera moves


Start here: realistic vs abstract

A lot of people want visuals that feel real: real places, real people, real motion.

Below are two approaches. Both work. Pick the one that matches your goal.

Realistic (most common)

Use this when you want a believable scene.

Examples:

  • A real-world environment (street, studio, rooftop, beach)

  • A realistic performance moment

  • A cinematic “music video” style shot

Abstract (great for mood)

Use this when you want vibe, texture, and motion.

Examples:

  • Light streaks, particles, liquid metal, neon smoke

  • Dreamy gradients and slow movement


Prompt formula (simple)

For Luma Ray-2 and Runway Gen-4 Turbo (best for single scenes)

Write prompts like this:

  1. Scene (what we see)

  2. Motion instructions (what moves, what stays still)

  3. Quality guardrails (no text, no distortions)

Template:

  • Create a short video of [scene].

  • Animate [what moves] with [speed/style]. Keep [what stays still] unchanged.

  • No text. No logos unless provided. No warping or distortion.
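
If you keep prompts in a script or spreadsheet, the template above can be filled in programmatically. This is a minimal Python sketch, not a Visuals API: the function name build_single_scene_prompt and its parameters are hypothetical, and the returned text is simply what you would paste into the prompt box.

```python
def build_single_scene_prompt(scene, moves, style, stays_still):
    """Assemble a single-scene prompt: scene, then motion, then guardrails.

    Hypothetical helper for illustration only. It mirrors the three-part
    template in this article and returns plain text.
    """
    parts = [
        f"Create a short video of {scene}.",
        f"Animate {moves} with {style}. Keep {stays_still} unchanged.",
        "No text. No logos unless provided. No warping or distortion.",
    ]
    return " ".join(parts)


prompt = build_single_scene_prompt(
    scene="a rooftop at dusk with city lights in the distance",
    moves="the clouds and the city lights",
    style="slow, subtle motion",
    stays_still="the artist in the foreground",
)
print(prompt)
```

Keeping the guardrail sentence fixed means every prompt you build this way ends with the same "what you don't want" instructions.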

For Veo 3.1 and Veo 3.1 Fast (best for multi-scene + audio)

These models can handle:

  • Multiple scenes

  • Sound effects

  • Voiceover (optional)

Template:

  • Create a short video for this song.

  • Scene plan: [Scene 1], then [Scene 2], then [Scene 3] (optional).

  • Camera + motion: [how it moves].

  • Audio: [enable/disable]. If enabled, include [voiceover + SFX].

  • No text.

Important:

If you don’t need audio, disable it to save credits.
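
The multi-scene template above can be sketched the same way. The helper below is hypothetical (build_veo_prompt is not part of Visuals) and only produces prompt text. The Audio line in the prompt mirrors the template; the Audio toggle in the settings is a separate control that you still need to set yourself.

```python
def build_veo_prompt(scenes, camera_motion, audio_enabled=False, audio_notes=""):
    """Assemble a multi-scene prompt following the Veo template.

    Hypothetical helper for illustration. `scenes` is a list of short
    scene descriptions, joined in order with "then".
    """
    if not scenes:
        raise ValueError("Provide at least one scene.")
    scene_plan = ", then ".join(scenes)
    if audio_enabled and audio_notes:
        audio_line = f"Audio: enabled. Include {audio_notes}."
    elif audio_enabled:
        audio_line = "Audio: enabled."
    else:
        audio_line = "Audio: disabled."
    lines = [
        "Create a short video for this song.",
        f"Scene plan: {scene_plan}.",
        f"Camera + motion: {camera_motion}.",
        audio_line,
        "No text.",
    ]
    return "\n".join(lines)


veo_prompt = build_veo_prompt(
    scenes=["an empty studio at night", "the artist walking to the mic"],
    camera_motion="slow push-in, steady and smooth",
)
print(veo_prompt)
```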


Prompt examples (realistic)

Artist photo → subtle motion (safe and usable)

  • Create a short vertical video using the person in my reference photo as the main subject. Maintain their identity and facial features. Keep their face and body unchanged. Animate only the background with subtle light movement and gentle floating dust. Add a slow, steady camera drift. No text. No warping.

Cover art → motion visual

  • Use my reference cover art as the base. Keep the main subject unchanged. Animate soft fog, lighting glow, and subtle movement in the background. Add a gentle camera drift. No text. No distortion.

Real environment

  • Create a cinematic night street scene with neon reflections and light rain. Animate gentle rain and soft reflections, with slow camera drift. Keep everything realistic. No text.


Prompt examples (abstract)

  • Create a dark, high-contrast visual with neon particles flowing in slow waves. Animate subtle pulsing light and gentle motion that feels smooth and continuous. No text. No harsh flicker.


Choosing your Video Generation Model

Of all the settings, your video model has the biggest effect on how the motion looks.

Luma Ray-2 (Loop)

Use this when:

  • You specifically want a Spotify Canvas loop

  • You want to use Seamless Loop

Notes:

  • Luma models don’t support audio generation.

Luma Ray-2 Fast

Use this when:

  • You want faster iteration on motion concepts

  • You don’t need audio

Veo 3.1 Fast (recommended for many users)

Use this when:

  • You want strong results quickly

  • You want real-looking scenes and motion

  • You may want multi-scene output

Audio:

  • Veo models can generate audio.

  • If you don’t need audio, turn it off to save credits.

Veo 3.1

Use this when:

  • You want the highest quality Veo output

  • You want to include voiceover and sound effects

Runway Gen-4 Turbo

Use this when:

  • You want an alternative motion style

Tip:

  • For most users, Veo 3.1 Fast + Luma Ray-2 are the best starting points.
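
One way to read the guidance above is as a small decision rule. The sketch below is an interpretation, not an official selector: the function suggest_video_model and its flags are hypothetical, and it only covers the three most common choices (Luma Ray-2 Fast and Runway Gen-4 Turbo are left out for brevity).

```python
def suggest_video_model(needs_loop=False, needs_audio=False, highest_quality=False):
    """Return (model name, note) based on this article's guidance.

    Hypothetical helper. The priority order (loop first, then quality)
    is an assumption made for this example.
    """
    if needs_loop:
        note = "Enable Seamless Loop."
        if needs_audio:
            # Luma models don't support audio generation.
            note += " Luma models don't generate audio, so add sound in post."
        return "Luma Ray-2 (Loop)", note
    if needs_audio:
        audio_note = "Keep the Audio toggle on."
    else:
        audio_note = "Turn audio off to save credits."
    model = "Veo 3.1" if highest_quality else "Veo 3.1 Fast"
    return model, audio_note


print(suggest_video_model(needs_loop=True))
print(suggest_video_model(needs_audio=True, highest_quality=True))
print(suggest_video_model())
```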


Choosing your Image Generation Model (optional)

Visuals may generate a base image before creating the video.

You can choose:

  • Nano Banana (Default) (5 credits)

  • Nano Banana Pro (10–20 credits)

  • FLUX Kontext Max (Style Transfer) (10 credits)

  • OpenAI GPT-Image-1 (General) (20 credits)

Simple guidance:

  • Use Nano Banana for fast iteration

  • Use Nano Banana Pro when you want better consistency and cleaner details (especially for faces and text)

  • Use FLUX Kontext Max if you’re doing style transfer with strong reference images

  • Use GPT-Image-1 if you want to experiment, but it may be less consistent for faces

Reminder:

  • If you upload your own Reference Image, Visuals can use it as the base image, so generating one is optional.
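
To budget for iteration, you can multiply the image-model costs listed above by the number of attempts you expect. This is a rough sketch under two assumptions: each generation creates one base image at the listed price, and video-model credits (not listed here) are charged separately. The names IMAGE_MODEL_CREDITS and image_credit_range are hypothetical.

```python
# (low, high) credits per base image, taken from the list in this article.
IMAGE_MODEL_CREDITS = {
    "Nano Banana": (5, 5),
    "Nano Banana Pro": (10, 20),
    "FLUX Kontext Max": (10, 10),
    "OpenAI GPT-Image-1": (20, 20),
}


def image_credit_range(model, generations):
    """Return the (low, high) image credits for a number of generations.

    Assumes one base image per generation. Video credits are not included.
    """
    low, high = IMAGE_MODEL_CREDITS[model]
    return low * generations, high * generations


print(image_credit_range("Nano Banana", 4))      # (20, 20)
print(image_credit_range("Nano Banana Pro", 4))  # (40, 80)
```

The credit cost shown in the app before generation remains the number to trust.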


Duration and aspect ratio (model-dependent)

Visuals offers preset durations and aspect ratios depending on the video model you pick.

Examples you may see:

  • Veo models: 4s / 6s / 8s (often limited aspect ratios like 9:16 and 16:9)

  • Luma models: 5s / 9s (with more aspect ratio options)

Spotify guidance:

  • Spotify recommends that a Canvas run 3–8 seconds.

  • If you generate 9 seconds, you may need to trim it to 8 seconds for Spotify.
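
If you trim outside Visuals, the check is simple: anything longer than 8 seconds needs cutting. The sketch below builds an ffmpeg command as an argument list without running it. It assumes you have ffmpeg installed; trim_command and the file names are hypothetical. Note that trimming the end of a seamless loop can mean the last frame no longer matches the first.

```python
SPOTIFY_MAX_SECONDS = 8  # upper end of the 3 to 8 second range cited above


def trim_command(src, dst, clip_seconds, max_seconds=SPOTIFY_MAX_SECONDS):
    """Return an ffmpeg argument list that trims the clip, or None.

    Returns None when the clip already fits. The "-t" output option
    tells ffmpeg to stop writing after the given duration.
    """
    if clip_seconds <= max_seconds:
        return None
    return ["ffmpeg", "-i", src, "-t", str(max_seconds), dst]


print(trim_command("canvas_9s.mp4", "canvas_8s.mp4", clip_seconds=9))
print(trim_command("canvas_5s.mp4", "canvas_5s_out.mp4", clip_seconds=5))
```

You could pass the returned list to subprocess.run, or make the same cut in any video editor.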


Seamless Loop (Spotify Canvas Ready)

If you want a true loop:

  • Choose Luma Ray-2 (Loop)

  • Enable Seamless Loop

Seamless Loop can sometimes create unwanted effects. If it does:

  • Try turning Seamless Loop off and generating again

  • Try a longer duration (when available), which gives the model more time to create smooth motion

Why this happens:

To make a loop, the video needs to end in a way that matches the first frame. If the motion is too intense, or the duration is too short, the model can “rush” the transition.


Audio (Veo models only)

If you choose a Veo model, you may see an Audio toggle.

Use audio when:

  • You want sound effects (SFX)

  • You want voiceover

Turn audio off when:

  • You only need visuals and you’ll add music in post

  • You want to save credits


Avoid text (important)

Most video models do not render text cleanly.

Unless you truly need text, include:

  • No text

If you need a logo in the video:

  • Upload the logo as a reference image

  • Say exactly how it should appear (small, clean, readable)


Social use case (Reels/TikTok)

These videos are also great as short clips for Reels/TikTok with your music behind them.

If you want on-screen text, it’s usually best to add it in your editor after export.

Hook ideas you can pair with your visuals:

  • New music out now

  • If you’ve ever felt this… this song is for you

  • This is what [emotion] sounds like

  • I made this song at my lowest point

  • Turn the volume up for this part


Important: review your prompt before you generate

There’s a common quirk to watch for.

Selecting settings (genre, mood, style) can sometimes rewrite or override parts of your prompt, whether you choose them before or after editing it.

Before you click Generate, quickly re-read the prompt to make sure it still says what you intend.
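
If you draft prompts outside the app, a quick check like the one below can flag instructions that went missing after a settings change. It is a minimal sketch: missing_phrases is a hypothetical helper that does a simple case-insensitive substring match, so it will not catch rewordings.

```python
def missing_phrases(prompt, required):
    """Return the required phrases that no longer appear in the prompt.

    Matching is a case-insensitive substring check, nothing smarter.
    """
    lowered = prompt.lower()
    return [phrase for phrase in required if phrase.lower() not in lowered]


required = ["no text", "keep the face unchanged", "slow camera drift"]
final_prompt = "Create a night street scene. Add a slow camera drift. No text."
print(missing_phrases(final_prompt, required))  # ['keep the face unchanged']
```

Copy the prompt text from the app after choosing settings, run the check, and re-add anything it reports.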


Troubleshooting: fix bad results fast

Faces or subjects get distorted

This can happen even without looping.

Try:

  • Add: Maintain identity and facial features

  • Add: Keep the face unchanged, animate only the background

  • Reduce motion intensity

  • Generate again (video models can vary between tries)

Motion is too chaotic

  • Add: subtle motion only

  • Remove extra motion instructions

  • Ask for one clear motion type (fog, lighting, camera drift)

The model ignored my instructions

  • Write motion instructions as full sentences

  • Say what stays still

  • Include what you don’t want (no text, no warping)

My loop has weird artifacts

  • Try Seamless Loop OFF

  • Try a different duration

  • Make motion more subtle


Credits (quick note)

Spotify Canvas & Loop Videos uses Visuals Credits when you generate.

You’ll always see the credit cost before generation runs.


FAQs

Can I turn my cover art into a Spotify Canvas loop?
Yes. Upload your cover art as a reference image, choose Luma Ray-2 (Loop), enable Seamless Loop, and describe what should move.

Which model should I start with?
For most users: start with Veo 3.1 Fast for strong motion concepts, and use Luma Ray-2 (Loop) when you need a true Spotify Canvas loop.

Why is audio not available sometimes?
Luma models don’t support audio generation. Veo models do, and you can usually turn audio on or off.

Why did Seamless Loop create weird artifacts?
Seamless Loop can sometimes create unwanted effects because the video has to match the first frame. Try Seamless Loop OFF, reduce motion, or try a different duration.

How do I keep my face from warping?
Use a strong reference image and say: Maintain identity and facial features. Keep the face unchanged. Animate only the background.

What should I include if something fails or errors out?
Use the in-app chat bubble to message us in Intercom and include a screenshot and the error message.
