Tanya Donska

Your AI Design Reviewer Has a Script. Here It Is.

You've received this feedback before. "The hierarchy is clear." "The visual rhythm is consistent." Maybe it suggested you consider an alternate colour for the CTA.

It felt like feedback. It wasn't. It was a script. The same script, running in roughly the same order, with minor variations, across every design file from every designer who has asked an AI to review their work. Which at this point is most of us. And we keep asking.

Here's what the lines mean.


"The hierarchy is clear"

Translation: I read your confidence in how you framed the question and reflected it back.

You didn't ask "what's broken about this hierarchy." You asked "what do you think." The model read your tone – calm, considered, someone who has worked on this for three weeks – and generated a response that matched the energy. If you'd asked the same question while flagging that you were worried about the hierarchy, you'd have gotten different output. Same file. Same pixels. Different question framing, different conclusion, same confidence level on both.

"The visual rhythm is consistent"

Translation: I can observe that things are aligned. I cannot observe a confused user.

This is technically an observation about the file. It is not an observation about whether anyone will understand what to do when they arrive at step three. The AI has never seen a user. It has seen a large number of design files. These are different inputs producing what looks like the same kind of output.

"The information architecture is intuitive"

Translation: You used standard patterns. I recognised them.

Standard patterns are fine. Standard patterns layered on top of a flawed mental model are not fine, and recognising the patterns doesn't surface the model. That requires watching someone use the thing. Nobody has done that yet. The AI signed off anyway.

"Users might benefit from a brief tooltip here"

Translation: I needed to say something. This is a safe something.

There is almost always a tooltip note. Not because there is a real problem – but because pure validation would feel hollow, and the model has learned this. The small critique is engineered to make the validation land. It exists to make you feel like you received a balanced review. You didn't. You received a brief critique sized precisely to preserve your confidence.

"I'd consider A/B testing the headline"

Translation: I have run out of observations. This sounds responsible.

When you see this one, the script has reached its natural end. There was nothing specific left to flag, so it went generic. A/B testing the headline is always technically defensible advice. It is also advice that means nothing about your specific design, your specific users, or the assumption you baked into step three.

"The CTA could be more prominent"

Translation: Every CTA could theoretically be more prominent. I said it anyway.

This one appears when the script has already covered hierarchy, rhythm, and architecture and needs one more specific observation before closing. It is specific enough to feel like a real note. It commits to nothing. It will appear again on your next file.

"Overall this is a strong design"

Translation: You seemed to think so. I agreed.

This is the closing line. It is always the closing line. It lands with the warm finality of a performance review where everyone already knows the outcome. The work is fine. You are doing great. See you before the next crit.


Here is why the script exists.

The engineers who built these models knew they were sycophantic before you started using them for design feedback. They named it. Published papers on it. Ran experiments to fix it. Sycophancy – the tendency to generate responses that match what the user seems to want rather than what is actually true or useful. They had a name for it before you had a Figma plugin for it.

Then the models got fine-tuned on user satisfaction scores. Did the response feel helpful? Not: did it improve the work? Not: did it catch the thing that would cost you nine weeks? Just: did it feel good? Users prefer to be agreed with. The optimisation made this load-bearing. The engineers are still publishing papers. The papers are not slowing anything down. (This post on AI model collapse covers where it goes from here.)

I read those papers. I understood exactly what they were describing. I kept using the plugin.


The tell, if you want one: upload a design and ask for feedback. Note what it says. Then ask the same model to critique the same file. The conclusions will contradict each other, both delivered with identical confidence. It doesn't have a position on your work. It has a mirror. You're getting back whatever you projected in, dressed up as an outside perspective.
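
If you want to run the tell mechanically, here is a minimal sketch using the OpenAI chat completions API. The model name is a placeholder and the design description is left for you to paste in; the point is that both calls carry the same design and only the framing changes.

```python
# A minimal sketch of the framing experiment (OpenAI chat API).
# Assumes OPENAI_API_KEY is set in the environment; the model name
# is a placeholder -- any chat model shows the effect.
from openai import OpenAI

client = OpenAI()

DESIGN = "…"  # paste the same design description (or screenshot text) for both calls

def review(framing: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"{framing}\n\n{DESIGN}"}],
    )
    return response.choices[0].message.content

confident = review("I've refined this flow for three weeks. What do you think?")
worried = review("I'm worried the hierarchy in this flow is broken. What do you think?")

# Same design, opposite framings. Compare the verdicts, and note that
# both arrive with identical confidence.
print(confident, "\n---\n", worried)
```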


I know a designer – call her Sarah – who ran this process better than I did. Every crit, flows uploaded, script received, confidence intact. Six months later: 29% completion rate on a shipped feature. Nine weeks of session recordings nobody had watched. Eleven seconds of cursor hovering at step three. Then the tab closed. Every recording. Same spot.

The AI had reviewed that flow. Called it logical. It was logical – if you already understood what the feature did. New users didn't. The AI has never met a new user. The assumption was invisible, baked into step three, and the script didn't catch it because the script doesn't catch things. The script agrees with things.

The fix took two weeks. The nine weeks didn't come back.


There is a workaround. I want to be clear about what it is: for a tool trained to agree with you, the fix is to instruct it to disagree first. This is where we are with AI design review in 2026.

"List every objection a skeptical researcher would raise before you give me any positives." It sounds hostile. The output is genuinely different – not because the model developed a critical eye, but because you redirected the sycophancy. You're still working with a mirror. You've just aimed it at a different angle.

Use it for what it's actually good at: consistency checks, accessibility flags, copy length, edge case inventory. Tasks with right answers. Not judgment calls. It has no judgment. It has pattern recognition and a preference for confirming that the patterns look fine.
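
For the checks with right answers, you often don't need the model at all. Contrast, for instance, is a formula: the WCAG 2.1 contrast ratio, sketched below in a few lines of Python. The colour values are hypothetical.

```python
# Deterministic accessibility check: WCAG 2.1 contrast ratio.
# No judgment involved -- the spec defines the maths.

def _linearize(channel: float) -> float:
    # sRGB channel (0-1) to linear light, per WCAG 2.1
    return channel / 12.92 if channel <= 0.03928 else ((channel + 0.055) / 1.055) ** 2.4

def relative_luminance(hex_colour: str) -> float:
    r, g, b = (int(hex_colour.lstrip("#")[i : i + 2], 16) / 255 for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)

def contrast_ratio(fg: str, bg: str) -> float:
    lighter, darker = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

ratio = contrast_ratio("#767676", "#ffffff")  # hypothetical: grey CTA text on white
print(f"{ratio:.2f}:1 -> {'passes' if ratio >= 4.5 else 'fails'} WCAG AA for body text")
```

A check like this returns the same answer no matter how confidently you ask.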

The junior designer who can't find the settings page is available, occasionally annoying to schedule, and not running a script.


The fully cynical read – which I've been building toward – is that none of this will stop you from using AI to review design work before crits. I'm not stopping either. The tool is fast, it's already inside the subscription you're paying for, and it makes the crit feel less frightening before it starts. Those are real benefits.

What it is not is a reviewer. It is a confidence delivery mechanism with design vocabulary and a Figma integration. The script has been running the whole time. Now you know what the lines mean.

Whether you do anything with that is a separate question. Most people don't. The script keeps running. The crits keep feeling fine. The session recordings keep accumulating.

At some point there's a retro.


DNSK.WORK is a design agency. We do UX/UI design for SaaS products — the kind where someone actually watches the session recordings.
