
Visuals: Create Cover Art & Images

Learn how to generate cover art and images in Visuals, and how to get results you’ll actually want to use.

Written by Sebastian Mourra
Updated yesterday

Visuals Cover Art & Images lets you generate artwork from scratch, use a song for inspiration, and use reference images (like artist photos) to guide style and subject.


What you can create

This creation flow is used for:

  • Single / album cover art

  • Promo artwork (release announcements, teasers, social graphics)

  • Artist imagery (portraits, concepts, themed photos)

  • General image generation (thumbnails, posters, artwork, infographics)

If you want to turn an image into motion, use Spotify Canvas & Loop Videos (image-to-motion) after you generate your cover.


The part most people get wrong

Getting to the Create page is easy.

The hard part is getting an output that matches what’s in your head.

Most “bad results” come from one of these:

  • The prompt is too vague

  • The prompt asks for too many things at once

  • The reference images are unclear (or fighting each other)

  • The user changes 5 settings at once and can’t tell what helped

This article is focused on fixing that.


Step-by-step (quick)

  1. Go to Create > Cover Art & Images

  2. Choose your direction:

  • Generate from scratch (prompt-only)

  • Generate from a song (select a track for song-aware outputs)

  • Generate with reference images (style and/or subject guidance)

  3. Add your inputs (prompt, song, reference images)

  4. Pick your settings (style, mood, model)

  5. Click Generate

Your results are saved in My Library.


Option A: Generate from scratch (prompt-only)

This is the fastest option and works great when you already know what you want.

Use this when you want to generate almost anything, like:

  • Cover art concepts (abstract or photoreal)

  • Characters or mascots

  • Avatar-style figures for content

  • YouTube thumbnails, artwork, or graphics

  • Visual concepts for a release (multiple directions quickly)

Best practice:

  • Write a clear prompt (see the Prompt Formula below)

  • Keep your first generation simple

  • Iterate 1–2 changes at a time


Option B: Generate from a song (song-aware cover art)

Visuals supports music uploads and analyzes your track.

That analysis can guide the vibe of your cover art (tone, energy, pacing).

Use this when:

  • You want cover art that matches a specific track

  • You want the AI to follow the song’s vibe without guessing

  • You don’t have a strong visual direction yet

Tip:

Song-aware generations can sometimes interpret the song too literally.

If you already have a creative direction, write it clearly, then add a line like:

  • This is my rough visual vision. Also use the context from the song to guide the mood and style.

Keep the song selected, then iterate your prompt until it matches your direction.


Option C: Use reference images (how to use artist photos correctly)

Reference images are the fastest way to get consistent, usable results.

1) Decide what your reference image is for

Reference images usually do one of two jobs:

  • Subject reference (who or what should appear)

  • Style reference (the look and feel)

If you upload an artist photo and you want that person in the image, say it clearly.

Examples:

  • Create cover art using the person in my reference photo as the main subject. Maintain their identity and facial features.

  • Keep the same outfit as the reference photo.

  • Use the person in my reference photo as the subject, but change the outfit to a futuristic stage outfit.

Important:

  • Avoid using the full name of a publicly known artist/person in your prompt. If you need a specific person, use reference photos and describe the vibe instead.

2) Using multiple reference images

You can upload multiple references to combine:

  • A person (subject)

  • A style (lighting, colors, texture)

  • An object (car, instrument, symbol)

Best practice:

  • Keep references consistent (don’t mix 5 different styles)

  • If results get messy, remove references until it improves, then add back one at a time

3) Common mistakes with reference images

  • Uploading a low-quality selfie and expecting a magazine cover

  • Uploading multiple faces and not saying which one is the subject

  • Expecting the AI to perfectly recreate a person without asking for that in the prompt

  • Mixing references that don’t match (cartoon + studio portrait + anime)


The Prompt Formula (simple and repeatable)

A strong prompt usually has 4 parts:

  1. Subject (what’s on screen)

  2. Setting (where it is)

  3. Style (how it should look)

  4. Composition (how it’s framed)

Here’s a template you can copy. It adds an optional fifth line for lighting and color:

  • Subject: [who/what]

  • Setting: [where]

  • Style: [photoreal / illustration / collage / 3D / minimalist]

  • Composition: [close-up portrait / centered / wide / negative space]

  • Lighting/Color: [warm / cool / neon / film grain / high contrast]
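
If you draft prompts outside the app, the template above can be sketched as a small helper. This is only an illustration: `build_prompt` and its parameter names are hypothetical and not part of Visuals, and the example values are made up. It joins the four required parts, plus the optional lighting/color line, into one plain-language prompt you can paste into the prompt box.

```python
def build_prompt(subject, setting, style, composition, lighting_color=None):
    """Join the Prompt Formula parts into one plain-language prompt.

    Hypothetical helper for drafting prompts; not a Visuals API.
    """
    parts = [
        f"Subject: {subject}",
        f"Setting: {setting}",
        f"Style: {style}",
        f"Composition: {composition}",
    ]
    # Lighting/Color is the optional fifth line of the template.
    if lighting_color:
        parts.append(f"Lighting/Color: {lighting_color}")
    return ". ".join(parts) + "."


# Example values are invented for illustration.
prompt = build_prompt(
    subject="a lone saxophone player",
    setting="an empty rooftop at dusk",
    style="photoreal",
    composition="centered, lots of negative space",
    lighting_color="warm, film grain",
)
print(prompt)
```

Keeping the parts separate like this makes it easy to change one part at a time between generations.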

Prompt examples (good)

Artist photo cover (clean and usable)

  • Create an album cover using the person in my reference photo as the main subject. Maintain their identity and facial features. Use dramatic side lighting, a black background, and a high-contrast cinematic look. Frame it like a real album cover and leave negative space.

Minimalist single cover

  • Create a minimalist single cover with a black background and one red symbol centered. Make it clean, modern, and high contrast, with subtle texture and lots of negative space.

Retro collage cover

  • Create an album cover in a vintage collage style. Use torn paper textures, warm film grain, and a dreamy mood. Keep the subject centered and make the composition feel intentional, not messy.

Prompt examples (too vague)

  • Make a cool cover art

  • Make it cinematic

  • Make it modern

If you’re using genre/mood/style buttons and it’s not working, write exactly what you want in plain language.


Getting consistent results (the “pro” workflow)

If you want a consistent brand across a release or catalog:

  1. Pick 1–3 strong reference images

  2. Reuse the same prompt structure

  3. Keep the same style choices

  4. Change 1–2 things at a time (not everything at once)

This is how you stop wasting generations.
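
If you keep notes on each generation, a quick comparison can confirm that you changed only one or two things. The sketch below is a hypothetical note-keeping aid, not a Visuals feature: the setting names and values are assumptions chosen to mirror the options described in this article.

```python
def changed_settings(previous, current):
    """Return the sorted names of settings whose values differ between two runs."""
    keys = set(previous) | set(current)
    return sorted(k for k in keys if previous.get(k) != current.get(k))


# Hypothetical notes for two generations.
run_1 = {"model": "Nano Banana", "style": "photoreal", "mood": "dreamy", "prompt": "v1"}
run_2 = {"model": "Nano Banana", "style": "collage", "mood": "dreamy", "prompt": "v2"}

changes = changed_settings(run_1, run_2)
print(changes)  # ['prompt', 'style']
if len(changes) > 2:
    print("More than 2 changes at once; it will be hard to tell what helped.")
```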


Important: review your prompt before you generate

There’s a common quirk to watch for.

If you select settings (genre, mood, style) and also edit your prompt, in either order, those settings can sometimes rewrite or override parts of your prompt.

Before you click Generate, quickly re-read the prompt to make sure it still says what you intend.


Choosing an image model (simple guide)

In Cover Art & Images, you can choose between these models:

  • Nano Banana (default)

  • Nano Banana Pro

  • GPT-Image

  • Flux Kontext Max

Here’s the simplest way to choose.

Nano Banana (default)

Use this when you want fast iteration.

  • Best for: rapid concepting, exploring directions, quick retries

  • Speed: usually very fast (often under ~10 seconds)

  • Strengths: strong character consistency, good quality, best “first pass” model

Nano Banana Pro

Use this when you want your final output.

  • Best for: highest-quality cover art, best text/spelling, best consistency when using your own photos

  • Speed: slower than Nano Banana (often ~30–60 seconds)

  • Best workflow: iterate with Nano Banana, then remix/refine with Nano Banana Pro once you like the direction

GPT-Image

Use this when you want to try a different look, but avoid it for projects that require consistent faces.

Flux Kontext Max

Use this if you want another high-quality option and you care about prompt adherence and typography.

  • Strengths: strong prompt following and typography

  • Speed: varies by provider and settings. Some providers advertise single-digit to low double-digit seconds.

Recommendation:

  • Start with Nano Banana while you dial in the prompt

  • Switch to Nano Banana Pro when you want the final output

Reminder:

  • Nano Banana is the default model in Visuals unless you manually change it.


Titles, logos, and text

If your cover needs a lot of text or perfect spelling, use Nano Banana Pro.

If you want a logo inside the image:

  • Upload the logo as a reference image

  • Say it clearly in your prompt (example: “Include the logo from my reference image on the cover, clean and readable”)

Tip:

  • If you created a great concept in Nano Banana, you can remix it in Nano Banana Pro to clean up spelling and polish the final output.


Troubleshooting: Fix bad results fast

The output doesn’t match my idea

  • Make the subject more specific

  • Remove extra ideas from the prompt

  • Add a reference image

The person looks wrong

  • Use a better subject photo (clear face, good lighting)

  • Upload 2–3 photos of the same person (different angles/poses) so the model understands their identity

  • Say it clearly: “Use the person in my reference photo as the main subject. Maintain their identity and facial features.”

  • Don’t upload multiple faces unless you want multiple people

The style keeps changing

  • Reuse the same references

  • Keep your prompt structure consistent

  • Avoid mixing styles (realistic + anime + cartoon)

The cover is too busy

Add one of these lines:

  • Minimal background

  • Clean composition

  • Lots of negative space

  • Centered subject

The image has unwanted text or marks

We don’t currently have a separate “Avoid” or “Negative” field, so include it directly in your prompt.

Add lines like:

  • No text

  • No watermark

  • No extra logos

  • No random letters

Tip:

What you don’t want is just as important as what you do want.


Advanced tip: get multiple ideas in one generation

If you want lots of cover art options without spending credits on many generations, you can ask for multiple concepts in one image.

Example workflow:

  1. Set aspect ratio to 9:16

  2. Prompt: “Create a 3×3 grid of 9 different cover art concepts for this song. Each tile should be a different pose/composition, but keep the style consistent. Make each tile clean and readable.”

  3. Generate once to get 9 options

  4. If you like one tile, upload that grid image as a reference image and say which tile to extract and remake as a single high-res cover
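
To say which tile you mean, it helps to number tiles left to right, top to bottom. The sketch below works out the row, column, and pixel area of a tile. It is an illustration only: `tile_info` is a hypothetical helper, and the 1080×1920 image size is an assumed example of a 9:16 output, not a documented Visuals resolution.

```python
def tile_info(tile, image_width, image_height, rows=3, cols=3):
    """Locate tile number `tile` (1-based, left to right, top to bottom) in a grid."""
    if not 1 <= tile <= rows * cols:
        raise ValueError("tile number is outside the grid")
    row, col = divmod(tile - 1, cols)  # zero-based row and column
    tile_w, tile_h = image_width // cols, image_height // rows
    return {
        "row": row + 1,
        "column": col + 1,
        "size": (tile_w, tile_h),
        # (left, top, right, bottom) in pixels
        "box": (col * tile_w, row * tile_h, (col + 1) * tile_w, (row + 1) * tile_h),
    }


info = tile_info(6, 1080, 1920)  # assumed 9:16 image size
print(info["row"], info["column"])  # 2 3
print(info["size"])                 # (360, 640)
print(f"Extract the tile in row {info['row']}, column {info['column']} "
      "and remake it as a single high-res cover.")
```

In a 3×3 grid each tile keeps the proportions of the full image, so tiles from a 9:16 grid are also 9:16.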


Privacy and sharing

Everything you generate is private by default.

If you want, you can choose to make something public to share it and get visibility.


Credits (quick note)

Cover Art & Images uses Visuals Credits when you generate.

You’ll always see the credit cost before generation runs.

If you need help with credits, see: Visuals Credits: How they work.


FAQs

Can I use Visuals Cover Art & Images for non-music designs (thumbnails, infographics, general artwork)?
Yes. You can use it as a general image generator for artwork, thumbnails, infographics, and more.

Do I need to upload a song to create cover art?
No. You can generate from a prompt and reference images. Uploading a song helps when you want song-aware outputs.

How do I make sure my artist photo is used in the image?
Upload the photo as a reference image and say it clearly in your prompt (example: “Use the person in my reference photo as the main subject”).

Why does the text on my cover look misspelled?
AI images often misspell text. If your cover needs accurate spelling, use Nano Banana Pro. You can also use text fields if available, or add text after export.

Can I keep my images private?
Yes. Everything you generate is private by default. You can choose to make an image public if you want.

What should I do if something fails or errors out?
Use the in-app chat bubble to message us in Intercom and include a screenshot and the error message.
