AI Model Benchmarks

Side-by-side comparisons of every major AI image model on real prompts. Same input, same prompt, different models — see how they stack up.

New: Model Showdowns

One prompt, every frontier text model — SVG drawings, animations and code, published raw.

Bizarre Plush Toy Web Catalog
Featured benchmark

Three AI models interpret an intentionally open-ended brief: a website catalog full of bizarre, unique plush toys. Stress-tests creative range under loose constraints, multi-subject density in one frame, and website-UI rendering (thumbnails, titles, prices, filters).

Stack of Pancakes with Melting Butter

Three AI models render a stack of pancakes with melting butter and a pouring syrup stream. Stress-tests glossy and translucent materials + specular highlights.

Wes Anderson Hotel Lobby

Three AI models attempt a Wes Anderson symmetrical pastel-coloured hotel lobby. Stress-tests recognisable director-style adherence + symmetry.

1960s Pulp Paperback Cover

Three AI models attempt a noir 1960s pulp paperback cover with title typography + a gouache illustration. Stress-tests period typography + illustration style.

Hands Holding a Coffee Cup

Three AI models render two anatomically correct hands holding a ceramic mug. Stress-tests the canonical AI hand-anatomy failure mode.

Tokyo Alley After Rain

Three AI models render a rain-soaked Shinjuku alley with stacked neon signs reflecting in puddles. Stress-tests wet-surface reflections + colour-light interaction.

Vintage 1980s Cereal Box

Three AI models render a fictional 1980s cereal box with retro display typography + a mascot. Stress-tests multi-line text + nostalgic packaging style.

Windows 7 Desktop, Early 2010s

Three AI models try to render a pixel-perfect Windows 7 desktop screenshot circa 2012 — Aero glass taskbar, Harmony wallpaper, Internet Explorer 9, Windows Live Messenger. Stress-tests small UI text rendering, period-authentic detail recall, and style adherence (must NOT default to modern flat design).

Brutalist Cafe Sign at Golden Hour

Three AI models render a backlit metal cafe sign on raw concrete at golden hour. Stress-tests text-rendering + weathered material + directional light.

AI Movie Poster

Same prompt and same fictional title, seven image models. See who renders the bold display headline cleanly, who handles the credits-block hierarchy, and who nails the aged-paper Drew Struzan aesthetic.

Vintage Bookshop Neon Sign

Same prompt, eight image models. See who renders the neon sign cleanest, who nails the wet-cobblestone dusk lighting, and who composes the bookshop window best.

Dramatic Mountain Silhouette

Same prompt, three AI models. A surreal double-exposure portrait with a mountain landscape blended into the silhouette. See how GPT Image 2, Nano Banana 2, and Flux 1.1 Pro each interpret the brief — including where one of them surprises you.