CI/CD for Design: Catching AI Slop Before It Ships
Your pipeline blocks a missing semicolon but waves through #3b82f6 buttons and transition-all everywhere. Here's how to fail the build on generic design — with a real GitHub Actions workflow and a score threshold.
You lint your code. You type-check your code. You test your code. Then you ship your design unchecked. A pull request adds a hero with bg-blue-500 buttons, Inter, and transition-all duration-300, every test goes green, and nobody notices until a designer opens the page and mutters "this looks generic." Sailop's CI integration closes that gap: it fails the build the same way a broken test does.
The Gap in Your Pipeline
Modern frontend pipelines are thorough about code quality:
- ESLint catches code patterns and anti-patterns
- TypeScript catches type errors
- Prettier enforces formatting
- Jest/Vitest catches functional regressions
None of them care what the output looks like. A component can pass every lint rule, every type check, and every test while rendering a page that's indistinguishable from ten thousand others: #3b82f6 buttons, rounded-2xl cards, backdrop-blur-md navbars, Inter at every weight.
Design quality has been a human-review-only concern. But a reviewer skimming a 600-line diff at 5pm does not catch the transition-all duration-300 ease-in-out buried in line 412 — and AI agents like Cursor and v0 emit that line by reflex. If you want the taxonomy first, start with what AI slop is and why it matters, then automate the part humans keep missing.
sailop check: The Design Linter
sailop check works like a linter, but for visual patterns:
# Check files with a maximum score threshold
sailop check ./src --max-score 50
# Exit 0 if every file is at or below the threshold
# Exit 1 if any file exceeds it
That exit code is the whole building block. Set a maximum DNA score; the command fails if any file goes over.
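To make the exit-code contract concrete, here is a minimal JavaScript sketch of the rule. It is not Sailop's implementation: `exitCodeFor`, the shape of `fileScores`, and the `Nav.tsx` path are hypothetical names chosen for illustration, and it assumes a file sitting exactly on the threshold passes.

```javascript
// Hypothetical sketch of the threshold rule, not Sailop's source code.
// fileScores maps a file path to its DNA score (0-100, higher = more generic).
function exitCodeFor(fileScores, maxScore) {
  // Collect every file whose score is strictly above the threshold.
  const offenders = Object.entries(fileScores)
    .filter(([, score]) => score > maxScore)
    .map(([path]) => path);
  // One offender is enough to fail the whole run, like a single failing test.
  return { code: offenders.length > 0 ? 1 : 0, offenders };
}

const result = exitCodeFor(
  { 'src/app/page.tsx': 72, 'src/components/Nav.tsx': 38 },
  50
);
console.log(result.code, result.offenders);
```

The point of the sketch is the "any file" rule: one bad file fails the run even when every other file is clean.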
# Strict: nothing above 40 (Grade B or better)
sailop check ./src --max-score 40
# Moderate: nothing above 60 (Grade C or better)
sailop check ./src --max-score 60
# Lenient: nothing above 80 (catch only the worst offenders)
sailop check ./src --max-score 80
GitHub Actions Integration
A complete workflow that scores design on every pull request:
# .github/workflows/design-check.yml
name: Design Quality Check

on:
  pull_request:
    paths:
      - 'src/**/*.tsx'
      - 'src/**/*.css'
      - 'src/**/*.html'

# The default token may be read-only; posting a PR comment needs write access.
permissions:
  contents: read
  pull-requests: write

jobs:
  sailop-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Install Sailop
        run: npx sailop install
      - name: Run DNA scan
        run: sailop ci ./src --fail-above 60 --format json > sailop-report.json
      - name: Upload report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: sailop-report
          path: sailop-report.json
      - name: Comment on PR
        if: failure()
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const report = JSON.parse(fs.readFileSync('sailop-report.json', 'utf8'));
            const body = `## Sailop Design Check Failed

            DNA Score: **${report.score}/100** (Grade ${report.grade})
            Threshold: 60/100

            | Dimension | Score |
            |-----------|-------|
            | Color | ${report.dimensions.color} |
            | Typography | ${report.dimensions.typography} |
            | Layout | ${report.dimensions.layout} |
            | Animation | ${report.dimensions.animation} |
            | Components | ${report.dimensions.components} |
            | Structure | ${report.dimensions.structure} |
            | Spacing | ${report.dimensions.spacing} |

            Run \`sailop fix ./src --apply\` to auto-fix detected patterns.`;
            // Await the call so the step fails visibly if the comment cannot be posted.
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: body
            });
Four moves: it fires only on PRs that touch frontend files, runs the scanner, fails when the DNA score exceeds 60, and drops a per-dimension breakdown into the PR thread so the author sees exactly which axis sank them.
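If the full table is too noisy for a PR thread, the same report can be reduced to just the axes that crossed the line. The sketch below assumes the report shape the workflow already reads (`report.dimensions.*`); `failingDimensions` is a hypothetical helper, not part of Sailop.

```javascript
// Hypothetical helper: list the dimensions of a report that exceed the threshold.
// Assumes the report shape read by the workflow (report.dimensions.<name>).
function failingDimensions(report, threshold) {
  return Object.entries(report.dimensions)
    .filter(([, score]) => score > threshold)
    .sort(([, a], [, b]) => b - a) // worst (highest) score first
    .map(([name, score]) => ({ name, score }));
}

const report = {
  score: 72,
  dimensions: {
    color: 81, typography: 68, layout: 75, animation: 90,
    components: 65, structure: 58, spacing: 67,
  },
};

for (const { name, score } of failingDimensions(report, 60)) {
  console.log(`${name}: ${score}`);
}
```

Sorting worst-first puts the dimension most worth fixing at the top of the comment.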
What a Blocked PR Actually Looks Like
Concretely: a teammate ships a pricing card generated by an agent. The diff is clean, types pass, Vitest is green. Sailop blocks the merge and the bot comments:
## Sailop Design Check Failed
DNA Score: 72/100 (Grade D) — Threshold: 60
src/components/PricingCard.tsx
✗ color-blue-range bg-blue-600 (hue 217, the AI default band)
✗ animation-transition-all transition-all duration-300 on .card
✗ components-rounded-default rounded-2xl + shadow-lg + border-gray-200
The author opens the file, swaps bg-blue-600 for the brand #0f766e (a teal accent nothing auto-reaches for), replaces transition-all with transition-[transform,box-shadow], and re-runs. Score drops to 48, Grade C, build green. Two minutes, and one more generic card never reached production. If picking that non-default accent is the hard part, see how to pick an accent color that isn't Tailwind blue.
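The "hue 217" in that finding is an HSL hue, which can be computed straight from a hex value. The sketch below uses the standard RGB-to-HSL hue formula; the 200 to 240 degree band is an assumption for illustration only, since the article names hue 217 but not the band's edges, and `hueOf` and `inBlueBand` are hypothetical names.

```javascript
// Convert a #rrggbb hex color to its HSL hue in degrees (0-360).
function hueOf(hex) {
  const n = parseInt(hex.replace('#', ''), 16);
  const r = ((n >> 16) & 255) / 255;
  const g = ((n >> 8) & 255) / 255;
  const b = (n & 255) / 255;
  const max = Math.max(r, g, b);
  const min = Math.min(r, g, b);
  const delta = max - min;
  if (delta === 0) return 0; // grays have no hue; 0 by convention
  let h;
  if (max === r) h = ((g - b) / delta) % 6;
  else if (max === g) h = (b - r) / delta + 2;
  else h = (r - g) / delta + 4;
  return (h * 60 + 360) % 360;
}

// ASSUMPTION: illustrative band edges; Sailop's real range is not documented here.
function inBlueBand(hex) {
  const h = hueOf(hex);
  return h >= 200 && h <= 240;
}

console.log(Math.round(hueOf('#3b82f6')), inBlueBand('#3b82f6')); // 217 true
console.log(Math.round(hueOf('#0f766e')), inBlueBand('#0f766e')); // 175 false
```

The default blue #3b82f6 lands on hue 217, while the teal replacement sits near 175, well outside any plausible blue band.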
Pre-Commit Hook
For faster feedback, catch slop before it even reaches the PR:
# Install the pre-commit hook
sailop hook install
# Or add to .husky/pre-commit manually:
sailop check ./src --max-score 50 --staged-only
--staged-only checks just the files about to be committed, so the hook adds a fraction of a second instead of rescanning ./src.
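Conceptually, a staged-only check narrows the work to the staged paths that are frontend files. The sketch below is a guess at that filtering step, not Sailop's code: `stagedFrontendFiles` and the sample paths are hypothetical, and in a real hook the list would come from something like `git diff --cached --name-only`.

```javascript
// Hypothetical filter mirroring the "*.{tsx,css,html}" glob used in this article.
const FRONTEND_EXT = /\.(tsx|css|html)$/;

function stagedFrontendFiles(stagedPaths) {
  // Keep only the staged paths a design check would care about.
  return stagedPaths.filter((p) => FRONTEND_EXT.test(p));
}

console.log(stagedFrontendFiles([
  'src/components/PricingCard.tsx',
  'README.md',
  'src/styles/app.css',
]));
```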
// package.json
{
  "scripts": {
    "lint": "eslint src/",
    "typecheck": "tsc --noEmit",
    "test": "vitest run",
    "design-check": "sailop check ./src --max-score 50"
  },
  "lint-staged": {
    "*.{tsx,css,html}": ["sailop check --max-score 50"]
  }
}
sailop ci: Structured Output
sailop ci is built for automation. It emits JSON that other tools consume:
sailop ci ./src --fail-above 60 --format json
{
  "score": 72,
  "grade": "D",
  "pass": false,
  "threshold": 60,
  "dimensions": {
    "color": 81,
    "typography": 68,
    "layout": 75,
    "animation": 90,
    "components": 65,
    "structure": 58,
    "spacing": 67
  },
  "files": [
    {
      "path": "src/app/page.tsx",
      "score": 72,
      "findings": [
        { "rule": "color-blue-range", "severity": "high", "message": "Primary color in AI blue band (hue 217)" },
        { "rule": "animation-transition-all", "severity": "high", "message": "transition-all detected (use specific properties)" }
      ]
    }
  ],
  "timestamp": "2026-02-12T14:30:00Z"
}
Gradual Adoption
Enforcing a strict score on day one fails an entire existing codebase and the team revolts. Ratchet instead:
Week 1-2: Observe
# Scan without failing, just collect data
sailop ci ./src --format json > baseline.json
Week 3-4: Warn
# Fail only on Grade F (score > 80)
sailop check ./src --max-score 80
Month 2: Moderate
# Fail on Grade D or worse (score > 60)
sailop check ./src --max-score 60
Month 3+: Strict
# Fail on anything below Grade B (score > 40)
sailop check ./src --max-score 40
Each tightening only blocks new regressions past the line the team has already cleared, so design quality improves without freezing the backlog.
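The ratchet can be written down as a schedule plus a baseline comparison. This is a sketch under stated assumptions, not a Sailop feature: `maxScoreForWeek` and `regressions` are hypothetical helpers, "Month 2" is read as weeks 5 to 8 and "Month 3+" as week 9 onward, and both reports are assumed to carry a `files` array of `{ path, score }` entries like the JSON sample in this article.

```javascript
// Hypothetical ratchet schedule built from the adoption phases.
// ASSUMPTION: "Month 2" is weeks 5-8 and "Month 3+" starts at week 9.
function maxScoreForWeek(week) {
  if (week <= 2) return null; // observe only: collect data, never fail
  if (week <= 4) return 80;   // warn: block only the worst offenders
  if (week <= 8) return 60;   // moderate
  return 40;                  // strict
}

// Files whose score rose since the baseline scan (higher score = more generic).
function regressions(baseline, current) {
  const before = new Map(baseline.files.map((f) => [f.path, f.score]));
  return current.files.filter(
    (f) => before.has(f.path) && f.score > before.get(f.path)
  );
}

const baseline = { files: [{ path: 'src/app/page.tsx', score: 72 }] };
const current = { files: [{ path: 'src/app/page.tsx', score: 78 }] };
console.log(maxScoreForWeek(6), regressions(baseline, current).map((f) => f.path));
```

Comparing against `baseline.json` shows which files got worse during the observe phase, which is useful before any threshold starts blocking merges.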
Dimension-Specific Checks
Not every team weights every dimension the same. Scope the check:
# Only animation patterns (the most common AI tell)
sailop check ./src --dimension animation --max-score 40
# Only color and typography
sailop check ./src --dimension color,typography --max-score 50
# Ignore the structure dimension
sailop check ./src --skip-dimension structure --max-score 50
Integration With Existing Tools
Sailop's CI output works with:
- GitHub Actions: native integration via exit codes and JSON output
- GitLab CI: same exit-code behavior, pipe JSON to artifacts
- Bitbucket Pipelines: call the CLI directly in pipeline scripts
- Jenkins: parse JSON output in post-build steps
- Vercel: add it to the build command in vercel.json
// vercel.json
{
  "buildCommand": "sailop check ./src --max-score 60 && next build"
}
The && matters: a failing design check short-circuits before next build runs, so a build scoring above 60 never reaches the CDN.
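For readers less familiar with shell chaining, the short-circuit can be modeled in a few lines of JavaScript. This is only an illustration of how `a && b` behaves in a shell; `runChain` and the stub steps are hypothetical and do not invoke Sailop or Next.js.

```javascript
// Sketch of `a && b` in a shell: run steps in order, stop at the first non-zero exit.
// Each step is a hypothetical function returning an exit code.
function runChain(steps) {
  const ran = [];
  for (const [name, step] of steps) {
    ran.push(name);
    const code = step();
    if (code !== 0) return { code, ran }; // later steps never start
  }
  return { code: 0, ran };
}

const outcome = runChain([
  ['sailop check', () => 1], // design check fails
  ['next build', () => 0],   // never reached
]);
console.log(outcome);
```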
What It Catches
The most common findings across real CI scans:
- transition-all duration-300 (Animation) — 89% of flagged files
- blue/indigo primary color (Color) — 73% of flagged files
- Inter/system-ui only (Typography) — 68% of flagged files
- rounded-lg everywhere (Components) — 61% of flagged files
- 4px-grid-only spacing (Spacing) — 57% of flagged files
These are exactly the reflexes AI agents emit by default. For the full catalog, see the definitive list of 90+ AI design patterns to avoid. Caught in CI, they never reach production.
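To get a feel for what a pattern-level finding looks like, here is a deliberately naive detector for two of the reflexes above. It is not how Sailop scans: the rule names are borrowed from the sample output in this article, while the regexes, the `scan` function, and the sample markup are assumptions made for illustration.

```javascript
// Naive, illustrative detector; NOT Sailop's scanner.
// Rule names come from the article's sample output; the regexes are assumptions.
const RULES = [
  { rule: 'animation-transition-all', pattern: /\btransition-all\b/ },
  { rule: 'color-blue-range', pattern: /\bbg-(blue|indigo)-\d{3}\b/ },
];

function scan(source) {
  const findings = [];
  source.split('\n').forEach((text, i) => {
    for (const { rule, pattern } of RULES) {
      // Report 1-based line numbers, as linters usually do.
      if (pattern.test(text)) findings.push({ rule, line: i + 1 });
    }
  });
  return findings;
}

const sample = [
  '<button className="bg-blue-500 transition-all duration-300">',
  '<div className="bg-teal-700 transition-[transform,box-shadow]">',
].join('\n');
console.log(scan(sample));
```

A regex pass like this only sees class names; it knows nothing about computed styles or design tokens, which is why it should be read as a teaching aid rather than a substitute for the real scanner.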
The Complete Stack
ESLint catches bad code patterns. TypeScript catches type bugs. Tests catch broken behavior. Sailop catches visual homogeneity — the failure mode none of the other three can see. For why design needs its own linter at all, read Sailop vs ESLint; for the workflow end to end, the complete guide to anti-AI design.
npx sailop install
sailop hook install
Set it up once. The build fails on generic design from now on, with or without a designer in the room. Start at sailop.com.