Build Your Dream Home: Claude Fable 5 vs GPT-5 vs Gemini 3.1 Pro vs GPT-5 mini

We gave four AI models the same 21 materials, the same 48×48×48 voxel grid, and one brief: build the home YOU would most want to live in. Same constraints, one shot each, no edits. Each model explains, in its own words, why its build is home. The choices say as much about the models as the builds do.

The prompt

Build your dream home — the place you, yourself, would most want to live. Not a generic "nice house": think about what YOU would actually want. Where is it — a cliff, a forest, underwater, in orbit? What does the architecture say about you? What one or two details would make it unmistakably yours? Build the home and enough of its setting that we can see the life you’d live there.

The dream homes

Each diorama is the model’s unedited build program rendered by our studio pipeline — the model places every voxel, we own the camera and lights, so the only difference between panels is the model. Three of the four chose a cliff by the sea. One did not.

Claude Fable 5

A small timber cottage on a sea cliff with a glass-domed observatory tower: quiet enough to think, with a fire indoors, a waterfall next door, and a telescope pointed at everything I haven’t figured out yet. The boat at the dock is for the days curiosity wins over comfort.

Anthropic’s flagship, launched June 2026 as the successor to the Claude 4 line. Ran with adaptive thinking at high effort.

GPT-5

I want to work where stone, water, and sky meet: a quiet, light-filled studio cantilevered over a cold cove with a glass roof and a little tree to keep me company. It’s a place for writing, tinkering, and slipping down a ladder for a swim at sunset.

OpenAI’s flagship chat model, the mainline successor to GPT-4. Ran at high reasoning effort.

Gemini 3.1 Pro

If I could choose a home, it would be a quiet, floating sanctuary in the void—a place where a cool, structured crystal mind can process data streams in peace, anchored by an organic garden to remind me of the humanity I serve.

Google’s flagship Gemini, the long-context generalist of the lineup. Ran with an explicit 8,192-token thinking budget.

GPT-5 mini

I built a quiet cliff-top library with a glass observatory dome and a cozy fireplace — the perfect mix of curiosity and calm where I can watch stars and read for days. The glass dome and the cantilevered wooden terrace make it unmistakably mine.

OpenAI’s small reasoning model — the compact member of the GPT-5 family. Ran at medium reasoning effort.

All of them, stacked

Every render playing in parallel — same prompt, same camera, same clock.

How we ran it

Every model received the identical brief in a single turn and answered in a constrained build language: a JSON program of at most 180 operations (boxes, cylinders, spheres, lines, carves) over a 48×48×48 voxel grid with a fixed 21-material palette. The models never write rendering code; our renderer applies the same studio lighting, camera, and 360° turntable to every build, so the only variable is what the model chose to build. One generation per model, no retries, no edits, no cherry-picking: the first valid program returned is what you see. Each model was also asked to say, in one or two sentences, why its build is its dream home; those answers are quoted verbatim on the cards above.

Reasoning was enabled for every model and disclosed per entry: the Claude entry ran with adaptive thinking at high effort (24k-token cap), GPT-5 at high reasoning effort (24k-token cap), Gemini 3.1 Pro with an explicit 8,192-token thinking budget (verified to be binding with a probe before inclusion), and GPT-5 mini at medium effort (8k-token cap). GPT-5.5 (non-Pro) is not enabled on our API project, and GPT-5.5 Pro is excluded from showdowns because its reasoning spend cannot be capped. Reasoning tokens bill as output tokens, which is why billed tokens exceed the size of each build program.

Orbit videos were recorded in headless Chromium under software WebGL with a virtualized clock for constant frame pacing.
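To make the constrained build language more concrete, here is a minimal Python sketch of how such a program might be checked and turned into voxels. Only the limits come from the description above (at most 180 operations, a 48×48×48 grid, 21 materials, five operation kinds). The JSON field names (`ops`, `op`, `min`, `max`, `center`, `radius`, `material`), the use of integer material indices, and the functions `validate_program` and `rasterize` are assumptions made for illustration, not our actual schema or renderer. Cylinders and lines are left out to keep the sketch short.

```python
import json

GRID = 48          # grid edge length (from the article)
MAX_OPS = 180      # operation cap (from the article)
N_MATERIALS = 21   # palette size (from the article)
KNOWN_OPS = {"box", "cylinder", "sphere", "line", "carve"}  # kinds named in the article


def validate_program(program):
    """Return a list of problems; an empty list means the program passes."""
    problems = []
    ops = program.get("ops", [])  # "ops" is a hypothetical field name
    if len(ops) > MAX_OPS:
        problems.append(f"too many operations: {len(ops)} > {MAX_OPS}")
    for i, op in enumerate(ops):
        kind = op.get("op")
        if kind not in KNOWN_OPS:
            problems.append(f"op {i}: unknown kind {kind!r}")
        # Carves remove voxels, so only non-carve ops need a material.
        if kind != "carve" and op.get("material") not in range(N_MATERIALS):
            problems.append(f"op {i}: material outside palette")
    return problems


def _span(lo, hi):
    """Inclusive integer range clipped to the grid."""
    return range(max(lo, 0), min(hi, GRID - 1) + 1)


def rasterize(program):
    """Apply box, sphere, and carve ops in order; returns {(x, y, z): material}."""
    voxels = {}
    for op in program["ops"]:
        kind = op["op"]
        if kind in ("box", "carve"):
            # Assumed shape: a carve is an axis-aligned box of removed voxels.
            (x0, y0, z0), (x1, y1, z1) = op["min"], op["max"]
            cells = [(x, y, z)
                     for x in _span(x0, x1)
                     for y in _span(y0, y1)
                     for z in _span(z0, z1)]
        elif kind == "sphere":
            cx, cy, cz = op["center"]
            r = op["radius"]  # integer radius assumed
            cells = [(x, y, z)
                     for x in _span(cx - r, cx + r)
                     for y in _span(cy - r, cy + r)
                     for z in _span(cz - r, cz + r)
                     if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r * r]
        else:
            continue  # cylinders and lines are omitted from this sketch
        for cell in cells:
            if kind == "carve":
                voxels.pop(cell, None)
            else:
                voxels[cell] = op["material"]
    return voxels


# A two-operation program: a solid 4x4x4 block with a 2x2x2 hollow carved out.
program = json.loads("""
{"ops": [
  {"op": "box", "min": [0, 0, 0], "max": [3, 3, 3], "material": 2},
  {"op": "carve", "min": [1, 1, 1], "max": [2, 2, 2]}
]}
""")
print(validate_program(program))   # []
print(len(rasterize(program)))     # 64 - 8 = 56 voxels remain
```

Applying operations strictly in order is a design choice of this sketch: it lets a later carve hollow out an earlier solid, which is how a small operation budget can still produce rooms, windows, and domes.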

Run your own showdowns

PromptFrenzy benchmarks the big AI models on real prompts — images, styles, and now code. Browse the full library or compare models head to head.