# Justin McKelvey — Full Content Index
> Fractional Product & Tech Lead helping founders build go-to-market strategy, nail positioning, and ship products that actually grow. 15+ years building products, 50+ shipped, $53M+ revenue generated.
## About
Justin McKelvey is a software consultant and product builder based in Austin, TX. He works with early-stage and growth-stage founders as a fractional product and tech lead. His core focus areas map directly to the highest-demand founder problems:
1. **Go-to-Market Strategy** — Complete GTM planning, channel selection, launch sequencing, and 90-day execution roadmaps.
2. **Positioning & Messaging** — Positioning frameworks that make your offer instantly clear. Messaging that converts cold outreach into meetings.
3. **Pricing & Revenue Models** — Unit economics, tier packaging, willingness-to-pay analysis. Most founders leave 30-50% revenue on the table.
4. **Sales Process Design** — Founder-led sales pipelines, follow-up automation, proposal frameworks, and deal closing strategies.
5. **Customer Acquisition** — Channel-market fit, acquisition loops, CAC optimization, and scaling from first 10 customers to 10,000.
6. **Product Development** — Lean MVP scoping, Ruby on Rails 8, AI integration, and shipping fast without burning cash.
### Track Record
- **PlayYourCourt** — Scaled community infrastructure, led product and AI strategy
- **Qualifyed.ai** — Built AI lead qualification agents, achieved 95% cost reduction
- **Achievrs** — End-to-end product development, 4.9 app store rating
- **SuperDupr** — Founded product studio serving startups and SMBs
- **50+ products shipped** across SaaS, marketplaces, mobile apps, and AI tools
### Location & Contact
- Based in Austin, TX
- Website: https://justinmckelvey.com
- Contact: https://justinmckelvey.com/contact
- LinkedIn: https://linkedin.com/in/justinmckelvey
---
## Engagement Surfaces
These are the primary pages where business owners and founders can learn how Justin works with them and start an engagement. Each is a standalone offering with its own landing page.
### AI for Business Owners
URL: https://justinmckelvey.com/ai-for-business-owners
This is the hub page for business owners running $1M-$50M companies who keep hearing the same "AI guy" name from their network and want to figure out what to actually do. The promise: turn AI from a toy you mess with in a browser tab into infrastructure your team uses every day — without a $200K platform contract. The page positions Justin as the operator-friendly alternative to enterprise AI consulting firms that sell 6-month strategy decks staffed by junior consultants.
The "what AI for business owners actually means" section defines AI as using LLMs (Claude, GPT) and the agents built on them to compound the work the team already does — not replace people, but give every team member the leverage of an extra colleague. The page identifies the highest-impact starting points for $1M-$50M companies: content production, customer support, internal knowledge management, sales follow-up, and any operations workflow with repeatable patterns. It calls out that most owners are using AI as a search engine while the 5% who treat it as infrastructure are pulling away.
The "what I actually do" section lays out three modes of work in order: (1) a written plan via the AI Readiness Assessment, (2) building the highest-impact item from the roadmap with the team (usually 4-8 weeks), and (3) workshops or keynotes for teams or peer groups like EO and Vistage. Shipped work referenced on the page includes GetLocalCall.com (AI phone answering for service businesses), SuperDupr (Justin's agency), and Vibe Code Rescue (a productized offer for founders with broken Cursor/Replit MVPs).
#### Key Questions Answered
- How can a small business owner start using AI?: Pick one workflow that eats time every week — proposals, customer emails, meeting summaries, social content. Build a simple AI workflow around that one task. Measure how much time it saves over two weeks. Most owners try to boil the ocean, fail, and conclude AI doesn't work.
- Should a small business owner hire an AI consultant?: Not a firm — hire a person who has actually shipped AI workflows in production. Big AI consulting shops sell 6-month strategy decks. You need a 2-week roadmap from someone who's done the work.
- How much does AI cost for a small business?: Tooling is the cheap part — most teams of 10-50 spend a few hundred dollars per month on subscriptions like Claude Pro, ChatGPT Team, or Microsoft Copilot. Avoid any vendor pitching a $200K "platform" to a $5M business.
- What AI tools should business owners use in 2026?: For most businesses under $50M revenue: Claude or ChatGPT as the daily driver, Zapier or Make for connections, Otter or Granola for meeting capture, a custom Claude project or GPT for specific workflows. That's 80% of what you need.
- Do I need an AI strategy?: Less than people think. You need a list of 3-5 specific things you're going to try in the next 90 days, with owners and deadlines. A working AI workflow your team uses daily is worth more than a perfect 60-page deck.
#### Call to Action
Primary: Hire me for 2 weeks → https://justinmckelvey.com/assessment
Secondary: Free 30-min strategy call → https://justinmckelvey.com/book/strategy-call
---
### AI Readiness Assessment
URL: https://justinmckelvey.com/assessment
This is the productized 2-week strategic engagement. You hire Justin for two weeks; he pays close attention to your specific business — workflows, team, tools — and hands you a written AI roadmap on day 14. No frameworks, no decks, no 90-day discovery phase. A real plan you can execute yourself or hand back for him to build. Capped at 2-3 engagements per month so each one gets real attention.
Week 1 is discovery: 2-3 calls with you and key team members, an audit of current workflows, existing AI usage, and the tools you're already paying for. Week 2 is synthesis: opportunities ranked by impact and effort, specific tool recommendations, the 30-60-90 day roadmap written out. The deliverable is a 15-25 page document — not a slide deck — followed by a 1-hour walkthrough call on day 14. If you hire him afterward to build any of the recommendations, the assessment fee is credited against the larger engagement.
What's in the deliverable: (1) Workflow Audit — where time leaks and where AI moves the needle, honest not flattering; (2) AI Opportunity Map — 8-12 candidates ranked by impact and effort, top 3 picked to focus on; (3) Tool Recommendations specific to your stack and budget with no affiliate links; (4) 30-60-90 Day Plan with owners, milestones, and rough effort estimates. Fit profile: $1M-$50M business, repeatable workflows, team that can execute, prefer an operator over a firm with junior consultants.
#### Key Questions Answered
- What is an AI Readiness Assessment?: A structured engagement where an experienced AI implementer evaluates your workflows, tools, and team capabilities, then delivers a written roadmap of where AI can create the most leverage. Justin's version is 2 weeks and produces a workflow audit, opportunity map ranked by ROI, and a 90-day implementation plan.
- How long does it take?: Two weeks from kickoff. Week 1 discovery, Week 2 synthesis, deliverable on day 14 with a 1-hour walkthrough call.
- How much does it cost?: Fixed fee discussed on the free strategy call. No hourly billing, no scope creep. If you decide to engage Justin afterward to build any of the recommendations, the assessment fee is credited against the larger engagement.
- Who is this for?: Business owners and founders running $1M-$50M companies who already know AI matters but don't want another generic consulting deck. Best fit: repeatable workflows, a team that can execute, you want a clear plan to act on within 90 days.
- How is this different from a free strategy call?: The free call is a quick gut-check on fit. The Assessment is two weeks of focused work with a written deliverable. Free call = scoping. Assessment = execution-ready plan.
#### Call to Action
Primary: Start the Assessment (inquiry form) → https://justinmckelvey.com/assessment#inquiry
Secondary: Talk First (Free 30 Min) → https://justinmckelvey.com/book/strategy-call
---
### Work With Me
URL: https://justinmckelvey.com/work-with-me
The engagement menu. Four doors and what's behind each one: no "tiers," no "programs," no 90-day discovery decks. The page is built for the person trying to figure out the right entry point who wants it spelled out without jargon. Most clients start with the free call or the Assessment, then expand into builds or fractional work afterward.
The three primary cards: (1) Strategy Call — free 30 minutes, honest read on your AI position vs. peers, same-day recommendation with named tools and sequence, no pitch; (2) AI Readiness Assessment — the flagship 2-week engagement with workflow audit, opportunity map, and 30-60-90 day plan; (3) Build / Fractional — custom AI systems built for how your business actually works, with two flavors: Build with you (Justin works alongside your team and they own it at the end) or Built for you (SuperDupr, his agency, handles it end-to-end). A separate callout points Vibe Code Rescue traffic (founders with broken Cursor/Replit/Lovable MVPs) at the Vibe Code Rescue case study, because that's a different problem with a different audience.
A "DIY tools" section showcases productized AI tools that don't need a consultant: GetLocalCall (live — AI phone answering for service businesses), Email Cleaner (Q2 2026 — AI inbox triage), and Executive Brief (Q2 2026 — weekly AI-generated business summary emailed Sunday night). For people who don't fit any of these, the footer invites a direct email to hello@justinmckelvey.com.
#### Key Questions Answered
- How long does an engagement take?: Anywhere from a free 30-minute strategy call to an ongoing fractional CTO retainer. The AI Readiness Assessment is fixed at two weeks. Build engagements typically run 4-12 weeks depending on scope. Fractional CTO work is ongoing; most clients work with Justin for 6-18 months.
- What does it cost?: 30-minute strategy call is free. AI Readiness Assessment is a fixed fee discussed on the call. Build engagements and fractional CTO work are scoped per project — most engagements land between $5K-$50K total. No hourly billing, no surprise invoices, no 6-month minimum contracts.
- Can I switch engagement types mid-stream?: Yes. Most engagements naturally evolve — strategy call leads to an Assessment, Assessment leads to a Build, Build leads to ongoing fractional work. No contract lock-in. If something isn't working, change it or end it without ceremony.
- Are you available right now?: Strategy calls usually available within 1-2 weeks. Assessment slots capped at 2-3 per month, so 2-6 week lead time is typical. Fractional engagements have a 2-4 week lead time.
#### Call to Action
Primary: Book a strategy call (the funnel top for most visitors) → https://justinmckelvey.com/book/strategy-call
Secondary: See the Assessment → https://justinmckelvey.com/assessment
---
### AI Readiness Checklist (Free Self-Assessment)
URL: https://justinmckelvey.com/ai-readiness-checklist
The free top-of-funnel tool. 30 yes/no questions across 5 categories — data, workflows, team, governance, execution — that score a business's AI readiness in 5 minutes. Designed as honest self-scoring, not a sales pitch. Delivered by email after a single email-capture step. No payment, no upsell. Printable so leadership teams can run it together as a 30-minute conversation rather than a solo exercise.
The premise: most business owners don't actually know where they stand on AI. They want a fast, honest read before spending money on tools, consultants, or platforms. The checklist gives that read in 5 minutes. Each "yes" is one point. Total score (0-30) maps to a readiness tier with a real next-step recommendation — not "book a call with our sales team."
Score interpretation: 0-10 = Curious (AI is on your radar but nothing is shipping; fix isn't strategy, it's picking one workflow and trying it next week); 11-18 = Equipped (you have tools but no system; opportunity is connecting AI to repeatable workflows); 19-25 = Operating (AI is part of how you work; opportunity is governance and agents — durable systems that don't depend on individual heroics); 26-30 = Compounding (you're ahead; opportunity is custom builds — internal tools and customer-facing AI features for defensibility). The checklist funnels naturally toward the paid AI Readiness Assessment for owners who want the same diagnostic applied to their specific business.
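The score bands above can be expressed as a small lookup. This is a minimal sketch, not the site's actual scoring code: the function names `score_answers` and `readiness_tier` are hypothetical, and the bands are the four tiers listed in the score interpretation.

```python
def score_answers(answers: list[bool]) -> int:
    """Each 'yes' is one point; the checklist has 30 yes/no questions."""
    if len(answers) != 30:
        raise ValueError("expected 30 answers")
    return sum(answers)


def readiness_tier(score: int) -> str:
    """Map a 0-30 checklist score to one of the four readiness tiers."""
    if not 0 <= score <= 30:
        raise ValueError("score must be between 0 and 30")
    if score <= 10:
        return "Curious"
    if score <= 18:
        return "Equipped"
    if score <= 25:
        return "Operating"
    return "Compounding"


# Example: 12 "yes" answers out of 30 lands in the 11-18 band.
example_score = score_answers([True] * 12 + [False] * 18)
print(example_score, readiness_tier(example_score))
```

A leadership team running the printed version together could tally answers the same way by hand; the code only makes the band boundaries explicit.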
#### What's Inside
- 30 questions across 5 categories (data, workflows, team, governance, execution)
- Honest scoring — yes/no, no consultant-speak, no math
- A real recommendation based on your score, with what to actually do next
- Printable for your team — run it with leadership in a meeting
#### Key Questions Answered
- What is an AI readiness checklist?: A structured self-assessment that helps a business owner score their organization's preparedness to adopt and benefit from AI tools. Justin's version is 30 yes/no questions across data, workflows, team, governance, and execution — designed to take 5 minutes and produce an honest score.
- How does an AI readiness assessment work?: Answer 30 yes/no questions across 5 categories. Each yes is one point. Total score (0-30) maps to a readiness tier: 24+ ready to start building, 16-23 a strategic engagement would help most, under 16 data and workflow foundations come before AI investment.
- Is the AI readiness checklist free?: Yes. The full 30-question checklist is delivered by email after a single email-capture step. No payment, no upsell. Self-scoring — you don't need to talk to anyone to get value from it.
- Who is the AI readiness checklist for?: Business owners and operators running $1M-$50M companies who want a fast, honest read on their AI readiness before spending money on tools, consultants, or platforms. It's for the person writing the check, not technical teams scoping ML projects.
#### Call to Action
Primary: Enter your email to unlock the full checklist → https://justinmckelvey.com/ai-readiness-checklist
Funnel step: After completing the checklist, the natural next step for owners scoring in the middle band is the paid AI Readiness Assessment.
---
## AI Consultant Agency: Pick the Right One in 2026
- **URL:** https://justinmckelvey.com/blog/ai-consultant-agency
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** AI for Business
- **Reading time:** 7 min
- **Description:** AI consultant agency in 2026: 3-person boutique ($25K-$200K) vs 100-person firm ($100K-$500K). How to vet, what to ask, and the 4 red flags.
Quick Answer (Buyer's Guide)
AI consultant agencies in 2026 range from 3-person implementation boutiques ($25K-$200K) to 100-person firms ($100K-$500K). For businesses under $50M revenue, smaller agencies deliver better value than Big 4 firms — same senior expertise on your project, 2-5x cheaper. The $49 CPC on this search keyword reflects real buyer demand: these are serious commercial engagements. The single best vetting question: "Show me 3 production AI features you've shipped in the last 6 months." If they can't answer with specifics, they're selling slideware not software.
Based on AI consulting market analysis + active fractional CTO work · June 2026 · Author: Justin McKelvey
Key Stats (June 2026)
- Boutique agency (3-15 people): $25K-$200K per project, $5K-$25K/mo retainers
- Mid-sized agency (15-100 people): $100K-$500K per project, $10K-$50K/mo retainers
- Hourly rates: $150-$500/hr depending on seniority
- Best for <$50M revenue businesses: Boutique agencies (2-5x cheaper than Big 4)
- Common engagement length: 4-12 weeks pilot, 12-26 weeks full programs
- CPC: $49 (high commercial intent, sophisticated buyer market)
- Biggest red flag: Strategy decks without shipped production work to back them up
### TL;DR: AI Consultant Agencies in 2026
The AI consultant agency market in 2026 is fragmented and confusing. Agencies range from 3-person specialty boutiques to 100-person firms, with prices ranging 10-20x for what looks like the same service on a website. Matching your business to the right agency size is the single biggest factor in whether you get value or waste $200K on slideware.
The honest reality: for most businesses under $50M revenue, a small boutique agency (3-15 people) delivers better value than a larger firm. You get senior people on your project, faster shipping, and 2-5x lower cost than mid-sized agencies or Big 4 firms.
I'm a fractional CTO who runs an implementation-focused practice and advises clients on hiring AI consultant agencies. This guide is the honest buyer-side view — written by someone who'd rather you get the smallest engagement that actually works than the largest one I could quote.
### The Three Tiers of AI Consultant Agencies

| Tier | Size | Typical project cost | Best for | Avoid for |
|---|---|---|---|---|
| Boutique | 3-15 people | $25K-$200K | SMB to mid-market shipping focused AI features | Multi-team enterprise rollouts |
| Mid-sized | 15-100 people | $100K-$500K | Mid-market with multiple parallel initiatives | Single-feature pilots (overkill) |
| Large agency / Big 4 | 100-1,000+ people | $500K-$5M+ | Fortune 500 procurement + change management | Speed, cost, hands-on implementation |

The split that matters most: boutiques and Big 4 firms are different products entirely, not different prices for the same product. Boutiques sell senior expertise + speed at a focused scope. Big 4 firms sell enterprise-grade processes + procurement compatibility at large scale.
### What AI Consultant Agencies Actually Deliver
The biggest mistake businesses make is hiring an agency expecting one type of work and getting another. The two main categories:
Implementation-focused agencies build and ship working AI features. Their deliverable is production code in your systems: customer support automation, AI-powered SaaS features, document processing pipelines, agent workflows. They're closer to product engineering teams than to traditional consultants.
Strategy-focused agencies identify AI opportunities, write roadmaps, build ROI models, and present to your stakeholders. Their deliverable is a 60-page deck and an implementation plan. They typically don't ship code.
Most businesses need implementation work. Many pay for strategy work instead — usually because the agency's website made the distinction unclear, or because the buyer didn't know to ask. Always ask: "Will you ship working code or just a strategy document?"
### How to Pick the Right Size Agency
Match agency size to your actual situation:

| Your situation | Right agency size | Why |
|---|---|---|
| Startup <$5M ARR, one AI feature to ship | Solo / 2-3 person boutique | Senior expertise at lowest cost |
| $5M-$25M ARR, focused AI implementation work | 3-15 person boutique | Capacity + speed without firm overhead |
| $25M-$100M ARR, multiple parallel AI workstreams | 15-50 person agency | Parallel capacity, established processes |
| $100M+ ARR, multi-business-unit AI rollout | 50+ person agency or Big 4 | Change management + governance at scale |
| Fortune 500 procurement requires Big 4 | Big 4 firm | This is the answer regardless of fit |

### The 6 Questions That Filter Bad Agencies
Use these in your first conversation with any agency you're seriously considering:
1. "Show me 3 production AI features your team has shipped in the last 6 months." Specific, recent, in production. The recency matters because AI capabilities change fast. If they can only show 18-month-old work or strategy decks, they're either inexperienced at recent AI work or coasting on past wins.
2. "Who from your team will actually work on my project?" Get specific names, LinkedIn URLs, years of experience. At larger agencies, the senior people pitch and junior people execute. Confirm the senior people you're impressed with will be hands-on, not just in oversight meetings.
3. "What's your typical pricing for a 90-day pilot vs full implementation?" Real agencies can answer in 30 seconds with a range. Vague answers ("it depends on scope") are buying time to figure out what you'll tolerate paying.
4. "Who owns the code and data at engagement end?" Should be: you. Watch for agencies that retain IP, build "platforms" you license back, or create dependencies that require ongoing fees.
5. "Show me a case where the engagement didn't go as planned. What happened?" The best filter question. Anyone polished enough to be lying will fumble it. Real practitioners will tell you a specific story with technical details that make it clear they lived it.
6. "Can you show me actual technical artifacts from your work?" Code samples (with permission), live demos, monitoring dashboards, architecture diagrams from real projects. A serious implementation agency can show concrete technical work in 10 minutes. A strategy-only firm will deflect or offer slide-deck case studies instead.
### Red Flags to Avoid
- Leads with workshops and "alignment sessions" instead of shipped work
- Anonymous case studies without specific metrics
- Opaque pricing requiring multiple discovery calls before any ballpark
- $50K+ upfront before any code (ethical firms offer paid pilots first)
- "AI transformation" language without specifics: if they can't name LLMs, integration patterns, or monitoring stacks, they're selling vibes
- Pitch deck more polished than technical samples, which means sales is the product
- Junior consultants pitched as "AI experts": Anthropic's Claude Sonnet 4.5 launched in 2025, so anyone with "10 years of AI experience" is at minimum stretching the truth
### The Honest Math on Agency Pricing
If you're evaluating mid-sized or Big 4 agencies, understand what you're paying for:

| Cost component | % of fee at boutique | % of fee at Big 4 |
|---|---|---|
| Senior implementation work (the actual value) | ~60% | ~30% |
| Junior consultant time | ~10% | ~30% |
| Sales + business development | ~5% | ~15% |
| Firm overhead | ~15% | ~25% |
| Profit margin | ~10% | ~15% |

At a boutique, 60% of your fee buys senior implementation work. At a Big 4 firm, that drops to 30% — most of your fee goes to junior work, overhead, and sales infrastructure. For the same outcome (a shipped AI feature), you pay 2-5x more at a Big 4 firm.
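The arithmetic behind that comparison can be sketched in a few lines. This is an illustration only: it uses the approximate senior-work shares from the table (60% and 30%), a hypothetical $100K boutique fee, and hypothetical function names. It asks what a Big 4 fee would need to be to buy the same dollar amount of senior implementation work.

```python
def senior_work_value(fee: float, senior_share_pct: float) -> float:
    """Dollars of the fee that buy senior implementation work."""
    return fee * senior_share_pct / 100


def fee_needed(senior_work_target: float, senior_share_pct: float) -> float:
    """Total fee required to buy a given dollar amount of senior work."""
    return senior_work_target * 100 / senior_share_pct


BOUTIQUE_SENIOR_PCT = 60  # approximate share from the table
BIG4_SENIOR_PCT = 30      # approximate share from the table

boutique_fee = 100_000  # hypothetical engagement size
senior_dollars = senior_work_value(boutique_fee, BOUTIQUE_SENIOR_PCT)
big4_fee = fee_needed(senior_dollars, BIG4_SENIOR_PCT)
multiple = big4_fee / boutique_fee

print(f"Senior work bought at the boutique: ${senior_dollars:,.0f}")
print(f"Big 4 fee for the same senior work: ${big4_fee:,.0f}")
print(f"Multiple: {multiple:.1f}x")
```

Under these assumed shares the multiple comes out to 2x, the low end of the 2-5x range quoted above; the sketch does not model whatever drives the higher end.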
This isn't an argument against Big 4 firms broadly — they're the right answer when you need enterprise procurement, large parallel capacity, or board-level credibility. It IS an argument against defaulting to large firms when smaller ones would deliver the same result for a fraction of the cost.
### What a Good Agency Engagement Actually Looks Like
Whichever tier you pick, healthy engagements have these markers:
- Days 1-14: Discovery + scoping. Decision point: continue to pilot or stop.
- Days 15-60: Build. Real users testing by week 4-5.
- Days 61-90: Ship to production. Monitor. Handoff.
- Day 90 deliverable: One shipped AI feature + the playbook to ship more.
- Cost: $25K-$80K depending on scope.
Anyone proposing multi-quarter engagements without checkpoints, or whose first deliverable is "the AI strategy document," is building a relationship that bleeds you slowly.
### The Bottom Line
For most businesses in 2026, the right AI consultant agency is a small boutique (3-15 people) with a track record of recent shipped work, transparent pricing, and the people who pitched you staying on your project. Not a Big 4 firm (overpriced for non-Fortune-500). Not a solo consultant (under-resourced for larger engagements). Not an "AI transformation" specialist (sells decks).
If you're shopping for an agency now, use the 6 vetting questions above. The fifth one — about a project that didn't go as planned — is the best filter and the one most agencies aren't ready for.
Want a second opinion on a specific agency you're considering? Book a free 30-min strategy call. I'll give you a specific read in 10 minutes. No pitch.
Related reading: AI consultant companies, AI consultant services, AI implementation consultant, agency vs solo consultant, Chief AI Officer vs Fractional CTO.
### Frequently Asked Questions
**Q: What is an AI consultant agency?**
A: An AI consultant agency is a firm that provides AI consulting services, typically a small-to-mid-sized organization (3-100 people) focused specifically on helping businesses implement AI. They're distinct from the Big 4 and other large consultancies (Deloitte, EY, KPMG, PwC, Accenture, IBM Consulting), which are larger and more enterprise-focused, and from solo consultants, who lack agency overhead. Most AI consultant agencies in 2026 specialize in either strategy or implementation.
**Q: How much does an AI consultant agency cost in 2026?**
A: Small boutique agencies (3-15 people): $25K-$200K per project, $5K-$25K/month retainers. Mid-sized agencies (15-100 people): $100K-$500K per project, $10K-$50K/month retainers. Hourly rates: $150-$500/hr depending on seniority. The $49 CPC on this search keyword reflects the real buyer market — these are commercial decisions with significant budget.
**Q: What's the difference between an AI consultant agency and a Big 4 firm?**
A: Agencies are smaller (typically 3-100 people), more specialized (often focused on AI implementation specifically), and more cost-effective. Big 4 firms and comparable large consultancies (Deloitte, EY, KPMG, PwC, Accenture, IBM Consulting) are larger, have enterprise procurement compatibility, and handle Fortune 500-scale change management. For businesses under $50M revenue, agencies are typically the better value. Big 4 firms make sense for procurement requirements at large enterprises.
**Q: How do I pick the right AI consultant agency?**
A: Six questions that filter the bad ones: (1) Show me 3 production AI features you've shipped in the last 6 months. (2) Who from your team will work on my project (specific names)? (3) What's your pricing for a 90-day pilot? (4) Who owns the code at engagement end? (5) Show me a case where the project didn't go as planned — what happened? (6) Can you show me your team's actual code or technical artifacts, not just slide decks? If they can't answer #1 with specifics, they're selling strategy not implementation.
**Q: What are red flags when picking an AI consultant agency?**
A: Four real ones: (1) They lead with strategy decks and workshops instead of shipped features. (2) Their case studies are anonymous and lack specific metrics. (3) Pricing is opaque — no ballpark without a 60-minute discovery call. (4) The pitch materials are more polished than the technical samples. Add: if they use "AI transformation" without naming specific LLMs (Claude Sonnet 4.5, GPT-5, Gemini 3 Pro) or integration patterns — they're selling vibes not software.
**Q: Should I pick an AI consultant agency or a solo consultant?**
A: Solo consultants are 30-50% cheaper for the same actual implementation work and the senior person stays on your project. Agencies have more capacity for parallel workstreams and more processes. For most businesses under $25M revenue with a focused use case, a senior solo consultant or 2-person team delivers better value than an agency. For complex multi-workstream projects or businesses needing 4+ consultants in parallel, agencies make sense.
**Q: What services do AI consultant agencies offer?**
A: Common services: AI strategy + roadmaps ($50K-$500K), AI implementation projects ($25K-$150K per feature), workflow automation ($15K-$80K per workflow), readiness assessments ($5K-$25K), training programs ($10K-$50K), fine-tuning + prompt engineering ($10K-$80K), and infrastructure setup ($15K-$50K). See AI consultant services for the full breakdown of what each includes.
**Q: How long do AI consultant agency engagements take?**
A: Pilots: 4-8 weeks (single AI feature shipped). Full implementations: 8-16 weeks. Multi-feature programs: 12-26 weeks. Anyone promising "AI transformation" in 30 days is selling shovels. Anyone planning 12+ months without checkpoints is bleeding billable hours. Best practice: 4-6 week paid pilot before committing to a longer engagement.
**Q: How do I vet an AI consultant agency's actual technical capabilities?**
A: Ask to see specific technical artifacts: production code samples (with permission), live demos of AI features they've shipped, monitoring dashboards from their work, error-handling approaches. A serious agency can show you concrete technical work in 10 minutes. A strategy-only firm will deflect to "client confidentiality" without offering any technical demonstrations whatsoever.
**Q: Should an AI consultant agency or a fractional CTO lead my AI work?**
A: Depends on scope. For a single AI implementation project (4-12 weeks), an agency is fine. For ongoing strategic AI leadership — connecting AI work to your broader tech stack, hiring decisions, product roadmap, and team building — a fractional CTO is the better fit. Many agencies will try to scope-creep into fractional CTO roles; vet carefully if that's what you actually need. See Chief AI Officer vs Fractional CTO.
---
## Google Stitch MCP: What It Is and How to Use It (2026)
- **URL:** https://justinmckelvey.com/blog/google-stitch-mcp
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** AI Design Tools
- **Reading time:** 6 min
- **Description:** Google Stitch MCP brings AI UI design into Claude Code, Cursor, other MCP clients. What MCP is, why Stitch supports it, setup guide. 2026.
Quick Answer (Setup + Use)
Google Stitch MCP is the Model Context Protocol integration that lets AI coding tools (Claude Code, Cursor, Windsurf, Cline) call Google Stitch directly from your editor. You prompt Claude Code with "design and implement a settings page" and the MCP integration calls Stitch to generate the UI, then your AI agent implements it in your codebase — without you visiting stitch.withgoogle.com. Stitch MCP is free (uses your standard 350-gen/month quota) and supports all major MCP clients as of June 2026.
Based on MCP integration setup in Claude Code + Cursor · June 2026 · Author: Justin McKelvey, fractional CTO
Key Stats (June 2026)
- MCP protocol: Anthropic's open Model Context Protocol (introduced late 2024)
- Cost: Free; uses standard Stitch generation quota (350/mo on free tier)
- Supported clients: Claude Code, Cursor (0.45+), Windsurf, Cline, Continue.dev, custom MCP clients
- Setup time: ~5 minutes (config file edit + auth)
- Killer use case: Design + implementation in one closed-loop AI workflow
- Search growth: +515% quarterly, as MCP adoption is exploding across the AI coding tool ecosystem
- Official docs: stitch.withgoogle.com/docs/mcp
### TL;DR: Google Stitch MCP in 90 Seconds
MCP (Model Context Protocol) is the standardized way AI coding tools call external services. Google Stitch ships an MCP server, which means AI tools like Claude Code, Cursor, and Windsurf can use Stitch's design generation directly through their normal prompt interface, with no context switching to a separate web app.
For developers using Claude Code or Cursor as their main editor, Stitch MCP is the killer integration. Instead of bouncing between your editor and stitch.withgoogle.com, you prompt your AI agent to "design and implement a new dashboard" — the agent calls Stitch for the design, then implements it in your codebase. Closed loop, no context switching, way faster.
I'm a fractional CTO who runs this workflow daily for client projects. This is the practical setup guide and the workflow patterns that actually save time.
### What MCP Is (And Why It Matters)
Model Context Protocol is an open protocol Anthropic introduced in late 2024 for connecting AI assistants to external tools and data sources. The premise: instead of every AI coding tool implementing custom integrations with every external service, services provide standardized MCP servers and clients consume them via the protocol.
By mid-2026, MCP has become the default integration pattern for AI coding tools. Dozens of services (GitHub, Slack, databases, design tools, project management) ship MCP servers. Cursor, Claude Code, Windsurf, Cline, and others consume them. The result: an AI agent in your editor has access to anything that ships an MCP server.
Google Stitch was one of the earlier major design tools to ship MCP support, which is why "google stitch mcp" searches grew +515% in Q1 2026.
What Stitch MCP Actually Does
The Stitch MCP server exposes these capabilities to MCP clients:
• Generate a design from a natural language prompt
• Refine an existing design with follow-up prompts
• Export design code in any supported format (HTML, Tailwind, Vue, Angular, Flutter, SwiftUI)
• Get design metadata (used components, color palette, typography)
• List your existing Stitch projects
• Update or delete projects
When an AI client like Claude Code is connected, it sees these as available tools and can invoke them as part of larger tasks. Example: you prompt Claude Code with "I need a settings page for our app with profile, billing, and notification sections." Claude Code decides to call Stitch (via MCP) to generate the design, gets the result back, and implements it in your React codebase — all in one conversation.
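Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a message in Python. The tool name `generate_design` and its `prompt` argument are hypothetical placeholders; the real names are whatever the Stitch server advertises when the client lists its tools.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# "generate_design" and "prompt" are assumed names, used for illustration only.
message = build_tool_call(
    "generate_design",
    {"prompt": "Settings page with profile, billing, and notification sections"},
)
print(message)
```

You never write this by hand in normal use; Claude Code or Cursor produces the equivalent message when the agent decides to call a tool.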
Setting Up Stitch MCP in Claude Code
Step-by-step setup as of June 2026:
1. Get your Stitch API token. Go to stitch.withgoogle.com → Settings → API Tokens → Generate New Token. Copy it.
2. Edit Claude Code's MCP config. Open ~/.claude/mcp_servers.json (create it if it doesn't exist).
3. Add the Stitch server entry:
    {
      "mcpServers": {
        "stitch": {
          "command": "npx",
          "args": ["-y", "@google/stitch-mcp@latest"],
          "env": {
            "STITCH_API_TOKEN": "your-token-here"
          }
        }
      }
    }
4. Restart Claude Code.
5. Verify the connection. In a new Claude Code session, ask "what tools do you have access to?" Stitch should appear in the list.
The exact configuration format may evolve as Anthropic iterates on Claude Code's MCP support. The official Stitch docs at stitch.withgoogle.com/docs/mcp always have the current version.
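If you already have other MCP servers configured, hand-editing the file risks clobbering them. Here is a minimal sketch of a merge script, assuming the config path and entry shape shown in the steps above. The function names are my own, and the path may differ on your install.

```python
import json
from pathlib import Path


def add_stitch_server(config: dict, token: str) -> dict:
    """Return a copy of the config with the Stitch entry added or replaced."""
    servers = dict(config.get("mcpServers", {}))  # copy so the input is untouched
    servers["stitch"] = {
        "command": "npx",
        "args": ["-y", "@google/stitch-mcp@latest"],
        "env": {"STITCH_API_TOKEN": token},
    }
    return {**config, "mcpServers": servers}


def update_config_file(path: Path, token: str) -> None:
    """Read the config if it exists, merge the Stitch entry, write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_stitch_server(config, token), indent=2) + "\n")


# Example call (path taken from the steps above; adjust if yours differs):
# update_config_file(Path.home() / ".claude" / "mcp_servers.json", "your-token-here")
```

The merge keeps any existing servers and top-level keys, so running it twice is harmless.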
Setting Up Stitch MCP in Cursor
Cursor supports MCP through its Composer feature (Cursor 0.45+):
1. Open Cursor Settings → MCP → Add Server
2. Choose "Add from URL" or "Add from npm package"
3. Configure with your Stitch API token (paste in the env field)
4. Restart Cursor
5. Open Composer and ask a design-related question; Stitch tools appear as available actions
Cursor's MCP UI is friendlier than Claude Code's config file approach but the underlying integration is identical.
The Killer Workflow: Design + Implement in One Pass
The reason Stitch MCP is worth setting up is the closed-loop design-and-implement workflow it enables:
Without MCP:
1. Decide you need a new UI feature
2. Open stitch.withgoogle.com in a separate tab
3. Prompt Stitch, refine, export code
4. Copy code to your editor
5. Refine implementation manually or with AI help
6. Realize the design needs a tweak, go back to step 2
With MCP:
1. Prompt Claude Code: "Design and implement a settings page with profile/billing/notifications"
2. Claude Code calls Stitch via MCP for the design
3. Claude Code implements the design in your codebase
4. Claude Code reviews the result, decides the design needs adjustment
5. Claude Code calls Stitch again with refinement, updates the code
6. Done. Total time: 5-15 minutes for typical features.
The integration eliminates context-switching between design tool and editor, AND lets the AI agent iterate on both design and code in tandem. For developers who already use Claude Code or Cursor as their main editor, this is a 3-5x productivity boost on UI work.
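The "With MCP" loop can be sketched as a small control flow. This is a toy model, not how Claude Code is implemented: the three callables are stand-ins for the Stitch tool call, the agent's code edit, and the agent's self-review, and all names are hypothetical.

```python
from typing import Callable, Optional, Tuple


def design_and_implement(
    prompt: str,
    generate: Callable[[str], str],
    implement: Callable[[str], str],
    review: Callable[[str, str], Optional[str]],
    max_rounds: int = 3,
) -> Tuple[str, str, int]:
    """Run design -> implement -> review until the review has no feedback.

    Returns the final design, the final code, and the number of design calls.
    """
    design = generate(prompt)
    code = implement(design)
    design_calls = 1
    for _ in range(max_rounds):
        feedback = review(design, code)
        if feedback is None:
            break
        # Ask for a refined design, then re-implement against it.
        design = generate(f"{prompt}\nRefinement: {feedback}")
        code = implement(design)
        design_calls += 1
    return design, code, design_calls


# Stubs standing in for the real tool call, code edit, and review.
def fake_generate(p: str) -> str:
    return f"design<{p.splitlines()[-1]}>"


def fake_implement(d: str) -> str:
    return f"code for {d}"


def fake_review(d: str, c: str) -> Optional[str]:
    return None if "Refinement" in d else "tighten the billing section spacing"


design, code, calls = design_and_implement(
    "Settings page with profile/billing/notifications",
    fake_generate, fake_implement, fake_review,
)
print(calls)  # 2
```

The `max_rounds` cap matters in practice: each refinement spends a generation from your monthly quota, so an unbounded loop is a bad idea.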
Common Setup Issues
"Server not found" error: Usually a path or command issue. Try the full path to npx instead of just "npx". On macOS this is typically /opt/homebrew/bin/npx or /usr/local/bin/npx.
"Authentication failed": Double-check your Stitch API token. Tokens expire after 90 days as of June 2026 — regenerate if needed.
Tools not appearing in client: Restart the MCP client fully (not just reload). Some clients cache the tool list aggressively.
Rate limit errors: You're hitting your 350-generation/month free tier limit. Either wait for the next monthly reset or upgrade once Stitch's paid plans launch (expected Q4 2026).
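For the "Server not found" case, you can find the full npx path programmatically instead of guessing. A small sketch using Python's standard `shutil.which`, with the two macOS locations mentioned above as fallbacks. The helper name `resolve_command` is my own.

```python
import shutil
from pathlib import Path
from typing import Optional, Sequence

# Typical macOS locations named in the troubleshooting note above.
COMMON_NPX_PATHS = ("/opt/homebrew/bin/npx", "/usr/local/bin/npx")


def resolve_command(name: str, fallbacks: Sequence[str] = ()) -> Optional[str]:
    """Return a path for `name` from PATH or the fallbacks, else None."""
    found = shutil.which(name)
    if found:
        return found
    for candidate in fallbacks:
        if Path(candidate).is_file():
            return candidate
    return None


npx_path = resolve_command("npx", COMMON_NPX_PATHS)
print(npx_path or "npx not found; install Node.js or check your PATH")
```

Paste the printed path into the "command" field of your MCP config in place of the bare "npx".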
Who Should Set This Up
• Developers using Claude Code or Cursor as their primary editor
• Solo founders who want design + code in one workflow
• Anyone who finds the manual stitch.withgoogle.com workflow disruptive
• Teams building AI-driven development pipelines
Skip if: you don't use any MCP-compatible AI tool, you prefer the visual design experience in Stitch's web UI, or you don't do much UI work.
The Bottom Line
Google Stitch MCP is one of the highest-leverage AI workflow integrations available in 2026. For developers in the Claude Code or Cursor ecosystem, setting it up takes 5 minutes and produces a 3-5x speedup on UI implementation work.
The broader trend: MCP is becoming the default integration pattern for AI coding tools. Expect every major design, project management, database, and infrastructure tool to ship MCP support over the next 12-18 months. Stitch is ahead of the curve.
Related reading: Google Stitch review, how to use Google Stitch, Google Stitch vs Claude Design, Claude Code vs Cursor.
Want help integrating MCP into your team's workflow? Book a free 15-min strategy call. No pitch.
### Frequently Asked Questions
**Q: What is Google Stitch MCP?**
A: Google Stitch MCP is the Model Context Protocol integration that lets AI coding tools (Claude Code, Cursor, Windsurf, and any MCP-compatible client) call Google Stitch directly from your editor or terminal. You can prompt Claude Code with "design and ship a settings page" and Claude Code calls Stitch through MCP to generate the design, then implements it in your codebase, all without leaving your editor.
**Q: What is MCP (Model Context Protocol)?**
A: MCP is an open protocol introduced by Anthropic in late 2024 for connecting AI assistants to external tools and data sources. As of 2026, dozens of services (Stitch, GitHub, Slack, databases) provide MCP servers that AI coding tools can use. Think of it as a standardized way for AI agents to call external tools — like APIs, but designed specifically for LLM-driven workflows.
**Q: How do I set up Google Stitch MCP in Claude Code?**
A: (1) Get your Stitch API token from the Stitch settings page. (2) Edit your Claude Code MCP config (usually ~/.claude/mcp_servers.json). (3) Add the Stitch server entry with your token. (4) Restart Claude Code. The official Stitch MCP setup docs are at stitch.withgoogle.com/docs/mcp — Google updates the exact steps as the protocol evolves.
**Q: How do I use Stitch MCP in Cursor?**
A: Cursor supports MCP through its Composer feature (Cursor 0.45+). Add the Stitch MCP server to your Cursor MCP config (Settings → MCP → Add Server), paste your Stitch API token into the env field, restart Cursor, then prompt Composer with design-related requests. The Stitch tools appear in Composer's available actions automatically.
**Q: Why use Stitch MCP instead of just visiting stitch.withgoogle.com?**
A: Two reasons: (1) Workflow integration — you don't context-switch between your editor and a separate web tool. The AI agent in your editor can request designs as part of a larger task. (2) Iteration speed — Claude Code can refine designs and immediately implement them in your code, then ask Stitch to update the design based on what worked in practice. The closed loop is significantly faster than manual back-and-forth.
**Q: Is Google Stitch MCP free?**
A: Yes, as of June 2026. Stitch MCP usage counts against your standard Stitch generation quota (350/month on free tier). No separate fee for the MCP integration. If/when paid Stitch plans launch in Q4 2026, MCP access will likely be included in all tiers.
**Q: What MCP clients support Google Stitch?**
A: As of June 2026: Claude Code (Anthropic's own tool), Cursor (0.45+), Windsurf, Cline, Continue.dev, and any custom MCP client. The protocol is open, so any LLM-driven tool can implement support. Some specialized AI coding tools haven't added MCP yet, so check your tool's documentation.
**Q: Can I use Stitch MCP without Google Stitch UI?**
A: Yes — that's actually one of the main use cases. Power users prefer to keep their full workflow in their editor and never visit the Stitch web UI directly. MCP gives Claude Code or Cursor full access to Stitch's generation, refinement, and export capabilities through tool calls.
**Q: What's the difference between Stitch MCP and Stitch's regular API?**
A: Stitch's standard API (if you build directly against it) requires you to handle authentication, request structure, response parsing, and error handling yourself. MCP wraps all of that into a standardized protocol that AI clients understand natively. For human-written code: use the API directly. For AI-driven workflows (Claude Code, Cursor): use MCP — it just works.
**Q: Why is 'google stitch mcp' a trending search?**
A: Because Anthropic's MCP gained massive adoption in late 2025/early 2026, and developers using Claude Code want to integrate every tool they use into the MCP workflow. "google stitch mcp" is a navigational search from devs trying to find the integration docs. Search volume for this query grew 515% in Q1 2026 as MCP became the default integration pattern for AI coding tools.
---
## How to Use Google Stitch: The 2026 Guide
- **URL:** https://justinmckelvey.com/blog/how-to-use-google-stitch
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** AI Design Tools
- **Reading time:** 6 min
- **Description:** How to use Google Stitch: free signup, real-time AI agent, voice input, multi-screen generation, code export. Complete 2026 beginner-to-pro tutorial.
Quick Answer (Get Started in 5 Minutes)
To use Google Stitch: go to stitch.withgoogle.com, sign in with any Google account (free, no credit card), describe what you want to build in plain English, refine via the visual editor or voice input, then export code in HTML/Tailwind/Vue/Angular/Flutter/SwiftUI. Free tier includes 350 generations per month. The streaming AI agent (launched May 2026) reflows the design as you type or talk. Multi-screen generation gives you up to 5 connected screens from a single prompt.
Step-by-step Stitch tutorial based on hands-on use · June 2026 · Author: Justin McKelvey, fractional CTO
Key Stats (June 2026)
• URL: stitch.withgoogle.com
• Sign-up: Any Google account, no credit card, instant access
• Cost: Free during Google Labs phase (350 generations/month)
• Real-time agent: Streams UI as you type/talk (launched May 2026)
• Voice input: Click microphone icon, speak naturally
• Multi-screen: Up to 5 connected screens per prompt
• Code export: HTML, CSS, Tailwind, Vue, Angular, Flutter, SwiftUI
TL;DR: Using Google Stitch in 2026
Google Stitch is the easiest AI design tool to get started with in 2026. Sign up with a Google account, describe what you want, watch it appear. No tutorials required for basic use. This guide covers the workflow from beginner ("how do I sign up") through advanced ("how do I use multi-screen generation efficiently") so you can get maximum value from the 350 free generations per month.
I'm a fractional CTO who uses Stitch for client prototyping and personal projects. This is the practical hands-on guide — not the marketing tour, but the actual workflow that works.
Step 1: Sign Up (60 Seconds)
1. Go to stitch.withgoogle.com in any modern browser
2. Click "Sign in with Google" and use any Google account
3. Accept the Google Labs experimental product terms
4. You're in. No credit card, no waitlist, no email verification
If you have multiple Google accounts, consider using one with a workspace email. Auto-save and project organization work the same on personal and workspace accounts, but workspace accounts get earlier access to new features (such as multi-user collaboration).
Step 2: Your First Prompt (Get Specific)
The single biggest factor in Stitch output quality is prompt specificity. Vague prompts give generic results; specific prompts give exactly what you want.
Bad first prompt: "A dashboard"
Good first prompt: "A SaaS analytics dashboard with sidebar navigation (collapsible), 4 KPI cards at the top showing revenue/users/conversion/churn, a main chart area below, and a dark theme."
Include these elements in every first prompt:
• The type of UI (dashboard, landing page, app screen, marketing site, etc.)
• The layout (sidebar nav, single column, grid, multi-column)
• 3-5 key elements you want present (specific components, content sections)
• The aesthetic (dark/light, minimal/colorful, modern/classic, brand reference if applicable)
You can always iterate from there, but a specific first prompt saves 5-10 generations of refinement.
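One way to make the four-part checklist stick is to treat it as a template. The sketch below is a hypothetical helper for drafting prompts before you paste them into Stitch. It is not part of Stitch, and the wording it produces is just one reasonable phrasing.

```python
from typing import Sequence


def build_first_prompt(
    ui_type: str, layout: str, elements: Sequence[str], aesthetic: str
) -> str:
    """Combine UI type, layout, 3-5 key elements, and aesthetic into one prompt."""
    if not 3 <= len(elements) <= 5:
        raise ValueError("list 3-5 key elements")
    return f"A {ui_type} with {layout}, {', '.join(elements)}, and a {aesthetic}."


prompt = build_first_prompt(
    ui_type="SaaS analytics dashboard",
    layout="collapsible sidebar navigation",
    elements=[
        "4 KPI cards at the top showing revenue/users/conversion/churn",
        "a main chart area below",
        "a date-range filter",  # illustrative extra element
    ],
    aesthetic="dark theme",
)
print(prompt)
```

The length check is the useful part: if you can't name at least three elements, the prompt is probably still too vague.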
Step 3: Use the Real-Time Agent
The streaming agent is Stitch's killer feature. As you type or talk, the design reflows in real time on the canvas.
Practical usage tips:
• Describe changes in plain language as if talking to a designer ("make the header smaller," "add a search bar to the top," "use a darker primary color")
• Don't wait for the design to settle — keep talking. The agent updates continuously.
• Use voice for high-iteration sessions. Click the mic icon and pace while you design. It's surprisingly productive.
• Use text for precision changes. Voice is great for direction; text is better for specific values ("set padding to 24px").
Step 4: Generate Multi-Screen Flows
This is where Stitch crushes Figma for prototyping speed. Instead of generating one screen at a time, describe an entire flow:
Single prompt: "A booking app with: (1) a service selection screen where users pick from massage, manicure, or facial; (2) a time picker showing available slots in the next 7 days; (3) a customer info collection screen with name/email/phone; (4) a confirmation page with appointment summary; (5) a payment screen using Stripe-like UI."
Stitch generates all 5 screens connected with navigation. You can:
• Click between screens to see the flow
• Refine each screen individually with subsequent prompts
• Apply global changes ("make all screens use a dark theme") in one operation
• Export the entire flow as code or a clickable prototype link
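If your flow has more than 5 screens, you need to split it across prompts. The sketch below numbers the screens the way the example prompt does and batches them in groups of at most five. The helper is hypothetical, and whether Stitch links screens generated by separate prompts is not something this guide covers, so treat later batches as separate flows to connect yourself.

```python
from typing import List, Sequence

MAX_SCREENS_PER_PROMPT = 5  # per-prompt limit stated in this guide


def flow_prompts(app: str, screens: Sequence[str]) -> List[str]:
    """Build one prompt per batch of at most five numbered screens."""
    prompts = []
    for start in range(0, len(screens), MAX_SCREENS_PER_PROMPT):
        batch = screens[start:start + MAX_SCREENS_PER_PROMPT]
        numbered = "; ".join(
            f"({start + offset}) {screen}"
            for offset, screen in enumerate(batch, start=1)
        )
        prompts.append(f"A {app} with: {numbered}.")
    return prompts


for p in flow_prompts(
    "booking app",
    ["a service selection screen", "a time picker", "a confirmation page"],
):
    print(p)
```

Numbering continues across batches (6, 7, ...) so you can refer to screens unambiguously in follow-up prompts.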
Step 5: Refine With the Visual Editor (Save Credits)
Every AI generation consumes one of your 350 monthly credits. For small tweaks, use the visual editor instead:
• Click any text to edit copy directly (no credit used)
• Click any color swatch to change palette (no credit used)
• Drag elements to reposition (no credit used)
• Adjust knobs in the right panel for spacing, sizing, alignment (no credit used)
Reserve AI generations for structural changes: new screens, major layout shifts, alternative directions, or design system swaps.
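The credit rule boils down to a simple decision: visual edits are free, structural changes cost one generation. Here is a toy tracker that encodes it, assuming the 350/month limit from this guide. The change labels are hypothetical names I made up for the categories listed above.

```python
# Hypothetical labels for the free visual-editor actions listed above.
VISUAL_EDITS = {"edit_copy", "change_color", "reposition", "adjust_spacing"}
# Hypothetical labels for the structural changes worth a generation.
STRUCTURAL_CHANGES = {
    "new_screen", "layout_shift", "alternative_direction", "design_system_swap",
}


class GenerationBudget:
    """Track how many AI generations remain in the current month."""

    def __init__(self, monthly_limit: int = 350) -> None:
        self.monthly_limit = monthly_limit
        self.used = 0

    @property
    def remaining(self) -> int:
        return self.monthly_limit - self.used

    def apply(self, change: str) -> str:
        """Route a change to the right tool and spend a credit if needed."""
        if change in VISUAL_EDITS:
            return "visual editor"
        if change not in STRUCTURAL_CHANGES:
            raise ValueError(f"unknown change type: {change}")
        if self.remaining <= 0:
            raise RuntimeError("monthly generation quota exhausted")
        self.used += 1
        return "AI generation"


budget = GenerationBudget()
print(budget.apply("edit_copy"), budget.remaining)   # visual editor 350
print(budget.apply("new_screen"), budget.remaining)  # AI generation 349
```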
Step 6: Export Code
When the design is ready to ship:
1. Click "Export" in the canvas toolbar
2. Pick your format: HTML, CSS, Tailwind CSS, Vue.js, Angular, Flutter, or SwiftUI
3. Copy the code or download as a zip
4. Open in your IDE and refine as needed
The code is production-adjacent — clean and structured — but a developer should review it before shipping. The fastest path to shipped feature: export to your framework, then use Claude Code or Cursor to refine and integrate it with your existing codebase. Total time from design to deployed feature: usually 1-2 hours for simple UI changes.
Step 7: Share and Collaborate
For solo work, just export and ship. For team work, Stitch's multi-user collaboration (launched May 2026) is genuinely useful:
• Share a project URL with anyone (Google account required)
• Multiple people can edit simultaneously — like Google Docs for design
• Comments work inline on specific elements
• The AI agent works for everyone in the project, so teammates can prompt changes too
Advanced: 5 Workflows That Get the Most From Stitch
1. The 30-minute prototype. Big multi-screen flow prompt → review → 3-5 quick refinements via voice → export. Idea to clickable prototype in under 30 minutes. Use for stakeholder pitches and concept validation.
2. The component-by-component build. Generate the page once, then use visual editor for everything. Saves credits, gets exactly what you want. Use when you have a clear mental model of the design.
3. The A/B exploration. Prompt for 3 different directions of the same screen, screenshot all three, pick the best, refine. Use for early-stage product decisions where direction matters more than polish.
4. The voice-driven sprint. Headset on, pace around the room, describe changes verbally for 30 minutes. Highly productive for solo work when you don't want to be at the keyboard. Use for ideation sessions.
5. The export-to-ship pipeline. Stitch → export Tailwind code → Claude Code → integrated into your app. Use when you need to ship a new feature UI fast.
Common Mistakes to Avoid
• Vague first prompts. Wasted generations. Always include type/layout/elements/aesthetic.
• Re-prompting for small changes. Use the visual editor for tweaks; save generations for big changes.
• Ignoring voice input. If you haven't tried voice yet, you're missing the best part of the streaming agent.
• Skipping multi-screen for app flows. One prompt for 5 screens is way faster than 5 prompts for 1 screen each.
• Shipping code without developer review. The export is good but not perfect. Have a dev look before production.
The Bottom Line
Google Stitch in 2026 is the easiest AI design tool to be productive with. Free signup, instant access, real-time agent, multi-screen generation, voice input, 7-format code export. For solo builders and small teams, it covers most design workflows at no cost during the Google Labs phase, while Figma Professional runs $180 per seat per year.
The seven steps above (sign up → specific prompt → real-time refinement → multi-screen → visual editor → export → share) get you from beginner to confidently using Stitch in 30 minutes. The advanced workflows compound that productivity for serious design work.
Related reading: Google Stitch review, Google Stitch vs Figma, Google Stitch vs Claude Design, Claude Design review.
Need help integrating Stitch into your specific workflow? Book a free 15-min strategy call. Specific recommendation in 10 minutes. No pitch.
### Frequently Asked Questions
**Q: How do I sign up for Google Stitch?**
A: Go to stitch.withgoogle.com and sign in with any Google account. No credit card required, no waitlist as of June 2026. The free tier includes 350 generations per month — enough for most individual designers or small teams. You're working in minutes.
**Q: What should I prompt Google Stitch with first?**
A: Start specific, not vague. "A dashboard" gives generic output; "a SaaS analytics dashboard with sidebar navigation, 4 KPI cards at the top, a main chart area, and a dark theme" gives precisely what you want. Include: the type of UI (dashboard, landing page, app screen), the layout (sidebar nav, single column, grid), 3-5 key elements you want present, and the aesthetic (dark/light, minimal/colorful, brand reference).
**Q: How do I use voice input in Google Stitch?**
A: Click the microphone icon in the prompt area. Voice input was added in March 2026 and is tightly integrated with the streaming agent. Try commands like "give me three menu options," "make the header smaller," or "show this with a darker palette." The voice quality is good enough that you can design while pacing. Voice works on desktop browsers; mobile is in beta.
**Q: How does multi-screen generation work?**
A: Describe an entire flow in one prompt: "a booking app with a service selection screen, a time picker, a confirmation page, and a payment screen." Stitch generates every screen in the flow (up to 5 per prompt), connected with navigation. You can refine each screen individually or change them all at once. This is the killer feature for prototyping new product ideas.
**Q: How do I export code from Google Stitch?**
A: Click "Export" in the canvas toolbar and pick your format: HTML, CSS, Tailwind CSS, Vue.js components, Angular templates, Flutter widgets, or SwiftUI views. The code is production-adjacent — clean enough to use as a starting point but worth a developer review. Pair with Cursor or Claude Code to go from export to shipped feature in under an hour.
**Q: Can I collaborate with other people in Google Stitch?**
A: Yes. Multi-user collaboration was added at Google I/O 2026 (May 20). Share a Stitch project link with anyone (Google account required). They can edit, comment, and use the AI agent simultaneously — like Google Docs for design. As of June 2026, this works for small teams; enterprise-scale governance features are still rolling out.
**Q: How do I refine a design without using credits?**
A: Use the visual editor (not the AI prompt) for tweaks. Click directly on text to change copy, click on colors to change palette, drag elements to reposition. These edits don't consume your 350 monthly generations. Save AI generations for major structural changes (new screens, layout redesigns, alternative directions).
**Q: What's the difference between Stitch and Stitch 2.0?**
A: Stitch 2.0 refers to the major upgrade announced at Google I/O 2026 (May 20). The key new features: real-time streaming AI agent (instead of prompt-and-wait), multi-user collaboration, expanded code export formats. As of June 2026, all users are on Stitch 2.0 — there's no separate version selection. The product is just called "Stitch."
**Q: How do I save and organize my Stitch designs?**
A: Designs auto-save to your Google account. The dashboard at stitch.withgoogle.com shows all your projects. You can organize with folders, share project links, duplicate designs to create variations, and export at any time. As of June 2026, there's no native version history beyond auto-save — for serious version control, export to GitHub via the code export.
**Q: What if I run out of my 350 monthly generations?**
A: You can't generate new designs until your monthly quota resets, but you can keep using the visual editor for refinements on existing designs. Workarounds: design in the visual editor for tweaks, prompt for new generations only on major changes, and reserve voice input for high-leverage iteration. Paid plans (expected Q4 2026) will likely raise this limit significantly.
---
## Google Stitch vs Figma 2026: Will AI Kill the Design Tool?
- **URL:** https://justinmckelvey.com/blog/google-stitch-vs-figma
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** AI Design Tools
- **Reading time:** 6 min
- **Description:** Google Stitch (free, AI-native) vs Figma ($15/seat, manual precision). 2026 head-to-head: features, workflow, when each wins. Honest verdict.
Quick Answer (Idea vs Refinement)
Google Stitch wins for AI-native generation, multi-screen prototyping, and zero cost. Figma wins for precision design work, team collaboration at scale, and the 10-year ecosystem of plugins/community resources. Stitch is free; Figma Pro is $15/seat/month ($180/year). For solo builders and small teams: Stitch is the better starting point in 2026. For established design organizations: Figma still wins on workflows. Most working designers use both.
Based on production use of both tools across client projects · June 2026 · Author: Justin McKelvey, fractional CTO
Key Stats (June 2026)
• Google Stitch: Free (350 gens/mo), AI-native, 2026 Google Labs product
• Figma Professional: $15/seat/month ($12/seat annual), manual design tool with AI features in beta
• Real-time AI agent: Stitch ✓ (streaming, May 2026); Figma ✗ (Make Designs is prompt-and-wait)
• Multi-screen generation: Stitch ✓ (5 connected screens); Figma ✗ (manual screen creation)
• Voice input: Stitch ✓; Figma ✗
• Plugin ecosystem: Figma ✓ (1,500+); Stitch ✗ (no plugin model yet)
• Dev Mode/Handoff: Figma ✓ (mature); Stitch ✓ (code export to 7 formats)
• Team collab at scale: Figma ✓ (industry-leading); Stitch ✓ (new multiplayer May 2026)
TL;DR: Stitch vs Figma in 90 Seconds
This isn't a 1-to-1 comparison. Google Stitch and Figma optimize for different workflows. Stitch is an AI-native tool for fast generation, multi-screen prototyping, and exploratory work. Figma is a manual design tool with AI features bolted on, optimized for precision work, team workflows, and a mature plugin ecosystem.
For solo builders and small teams in 2026, Stitch is genuinely better at the "idea to first design" phase — and it's free. For established design organizations, Figma still wins on production workflows and team coordination.
I'm a fractional CTO who uses both tools and has watched the AI design tool transition over the last 18 months. The honest verdict: AI tools aren't killing Figma yet, but they're claiming the early-stage workflow that Figma used to own.
Where Stitch Wins Decisively
Speed from idea to first design. "Generate a SaaS dashboard with sidebar nav, analytics cards, and dark theme" → 5 seconds → working design. Figma requires you to manually create artboards, place components, set up auto-layout, configure colors. Stitch is 10-50x faster for first drafts.
Multi-screen prototyping. Describe an entire app flow ("a booking app with calendar, time selection, payment, confirmation screens") and Stitch generates all of the connected screens at once (up to 5 per prompt). In Figma, this is 30-60 minutes of manual work; in Stitch, 30 seconds.
Voice-driven workflow. Pacing while describing changes, "give me three menu options," "show this in a darker palette" — voice input genuinely works in Stitch. Figma has no voice option.
The cost. Free during Google Labs phase vs $15/seat/month. For a 10-person team, that's $1,800/year difference. Even after Stitch monetizes (projected Q4 2026), the cost will likely stay 30-50% below Figma.
Non-designer accessibility. Founders, marketers, ops people, developers can ship usable designs from Stitch in 30 minutes. The same person in Figma needs weeks of practice.
Code export breadth. Stitch exports to HTML, CSS, Tailwind, Vue.js, Angular, Flutter, SwiftUI directly. Figma's code export requires Dev Mode + plugins, and quality varies by framework.
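The cost gap quoted above is simple seat math, and it is worth rerunning for your own team size. A short sketch using the Figma Professional prices from this article ($15/seat/month, or $12/seat/month billed annually); check Figma's current pricing page before budgeting.

```python
def figma_annual_cost(seats: int, price_per_seat_month: float = 15.0) -> float:
    """Yearly Figma Professional cost for a team, in dollars."""
    return seats * price_per_seat_month * 12


print(figma_annual_cost(10))        # 1800.0 (monthly billing)
print(figma_annual_cost(10, 12.0))  # 1440.0 (annual billing rate)
print(figma_annual_cost(1))         # 180.0 (one seat)
```

Since Stitch is free during the Google Labs phase, these totals are the full yearly difference for now.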
Where Figma Wins Decisively
Precision design work. Pixel-perfect adjustments, alignment to grids, micro-typography, complex layered designs: Figma's manual tooling is faster for this work because it's built around precise control. Stitch's AI-driven workflow is less suited to "nudge this button 2px to the left."
Mature component libraries. Figma's component system (variants, properties, auto-layout, constraints) is the result of years of iteration. Building and maintaining a complex design system in Figma is significantly more sophisticated than what Stitch offers. (Note: Claude Design beats both for codebase-aware design systems specifically.)
Plugin ecosystem. 1,500+ Figma plugins cover everything from accessibility audits to design token export to AI generation to user testing integration. Stitch has no plugin model yet. For specialized workflows, Figma wins.
Dev Mode + handoff. Figma's Dev Mode is the industry standard for designer-developer handoff: inspect specs, copy CSS/iOS/Android code, link to GitHub issues, attach implementation status. Stitch's code export is good but not as deeply integrated into engineering workflows.
Team collaboration at scale. Multi-team component libraries, design system governance, branch-based design workflows, organization-wide design tokens — Figma is built for 50+ designer organizations. Stitch's new multi-user features (May 2026) work for small teams but haven't been battle-tested at enterprise scale.
Community resources. 10+ years of templates, design system examples, tutorial content, conference talks, and best practices. Stitch is 6 months old. The depth gap matters when you need niche solutions.
The Real Question: What's Your Workflow?
| Your situation | Pick |
| --- | --- |
| Solo founder building a product | Stitch |
| Developer who hates manual design tools | Stitch |
| Small team (3-10 people) prototyping new features | Stitch primary, Figma when needed |
| Design team at a 20-50 person startup | Figma primary, Stitch for early exploration |
| Enterprise design org with established Figma workflows | Figma stays primary; pilot Stitch for ideation only |
| Agency working on diverse client projects | Both: Stitch for speed, Figma for client deliverables |
| Hackathon / 48-hour build | Stitch (free + speed) |
| Designer interviewing for a new job | Figma (industry standard for portfolios) |
The Workflow That Uses Both Well
Most product teams I work with in 2026 use both, in this pattern:
1. Stitch for ideation. "I have a vague idea for a new dashboard. Generate 3 directions, 5 screens each." Free, fast, multi-screen.
2. Stakeholder review. Share Stitch live preview links, get feedback, pick direction.
3. Figma for refinement. Take the chosen direction into Figma. Refine to your design system, add precision touches, prepare for dev handoff.
4. Figma Dev Mode for engineering. Developers work from Figma specs. Component libraries stay in Figma. Design system governance lives in Figma.
End-to-end: Stitch handles the exploratory and prototyping phases; Figma handles the production design work. Neither tool can do both phases well alone today.
Will Figma Get Killed by AI Tools?
Probably not, for two reasons:
1. Figma's moat is workflows, not features. The plugin ecosystem, the team collaboration, the dev handoff, the design system governance — these are years of compounding work that AI tools can't replicate quickly. Figma can lose features to AI and still keep customers locked into the workflows.
2. Figma is integrating AI fast. Make Designs, FigJam AI, AI components — Figma is shipping AI features aggressively in 2026. The gap between Figma and pure-AI tools like Stitch will likely shrink over the next 12-18 months. Figma doesn't need to be best-in-class at AI; it needs to be good enough that the workflow advantages keep customers.
That said, Figma's small-team and solo-builder market is genuinely at risk. For people who don't need Figma's enterprise features, AI tools like Stitch offer a better speed-to-design at a fraction of the cost.
The Bottom Line
If you're a solo builder or small team in 2026: start with Stitch. It's free, it's faster for exploratory work, and it covers most workflows for teams of fewer than 10 designers. Add Figma if you specifically need its workflows.
If you're a design org at an established company: keep Figma. Add Stitch to your toolkit for ideation and concept work, but don't try to displace Figma's production workflows. The gain isn't worth the disruption.
If you're a designer building your career: learn both. Figma for industry standard fluency and portfolio work. Stitch for AI-native speed that's increasingly expected by 2027.
Related reading: Google Stitch review, Claude Design review, Google Stitch vs Claude Design, how to use Google Stitch.
Need help picking the right design tool stack for your specific workflow? Book a free strategy call. Specific recommendation, no pitch.
### Frequently Asked Questions
**Q: Is Google Stitch better than Figma?**
A: For different things. Google Stitch is free, AI-native, faster from idea to first design, and has multi-screen generation + voice input. Figma is mature, precision-focused, has 10+ years of community resources and plugins, and excels at team workflows at scale. For new design exploration: Stitch. For polished refinement and large team collaboration: Figma. The 2026 reality: most product teams use both.
**Q: Does Google Stitch replace Figma?**
A: Not yet for most teams. Stitch handles the "idea to first design" phase faster than Figma. Figma still wins for pixel-precision work, complex component libraries, plugin-enabled workflows (Dev Mode, Tokens, advanced exports), and team coordination at scale. Stitch might displace Figma for individual builders and small teams over the next 2-3 years, but probably not for established design organizations.
**Q: Is Google Stitch really free vs Figma's $15/seat?**
A: Yes, as of June 2026. Stitch is in the Google Labs experimental phase with no paid tier — 350 generations per month, no credit card. Figma Professional is $15/seat/month ($12/seat annual). Paid Stitch plans are expected by Q4 2026 at projected 30-50% below Figma. For now, Stitch is free; Figma costs $180/year per seat at Pro tier.
**Q: Can Figma's AI features match Google Stitch?**
A: Not yet. Figma has "Make Designs" (AI generation), "FigJam AI," and several AI features in beta — but they don't match Stitch's real-time streaming agent or multi-screen generation. Figma's AI is bolt-on; Stitch's AI is the entire product. Figma is likely to close the gap aggressively in 2026-2027, but as of June 2026, Stitch has the AI advantage.
**Q: Which is easier for non-designers?**
A: Google Stitch, decisively. The voice-driven prompt interface and multi-screen generation are non-designer-friendly. Figma assumes you understand layers, components, auto-layout, constraints — concepts that take real time to learn. Non-designers can ship something usable from Stitch in 30 minutes; the same person in Figma needs 2-4 weeks of practice to be productive.
**Q: Should designers learn Google Stitch or stick with Figma?**
A: Both. Designers should add Stitch to their toolkit for fast initial generation, then move to Figma for refinement and handoff. Skipping Stitch entirely in 2026 means losing 30-50% productivity on the exploration phase. Skipping Figma entirely means losing the precision and team collaboration tooling. Most working designers use both by mid-2026.
**Q: Does Google Stitch export to Figma?**
A: Not directly as of June 2026, but Stitch exports clean HTML/CSS/Tailwind code that you can either use directly in your codebase or convert to Figma using community tools. Direct Stitch → Figma export is on the roadmap but not shipped. For now: design in Stitch, refine in Figma if needed, or ship from Stitch's code export.
**Q: Which is better for prototyping?**
A: Google Stitch, for the early prototyping stages. Multi-screen generation from a single prompt produces clickable prototypes in minutes. Figma's prototyping is more polished but requires manual setup of every screen and interaction. For a 48-hour prototype: Stitch. For a presentation-ready interactive prototype with custom animations: Figma.
**Q: Will Figma get killed by AI design tools?**
A: Probably not. Figma is too deeply embedded in design team workflows (component libraries, design systems, dev handoff, plugin ecosystem) to be displaced by a single AI tool. But Figma will get pressured to integrate AI faster, and the small-team and solo-builder market may shift to AI-native tools like Stitch. Figma will likely remain the enterprise design tool through 2027-2028.
**Q: Which costs more long-term?**
A: Figma, by a wide margin. At $15/seat/month, a 10-person design team pays $1,800/year. The same team using free Stitch pays $0 (until Stitch's paid tier launches in Q4 2026 at projected $8-$10/seat). Even after Stitch monetizes, the cost gap is likely to remain meaningful — projected 30-50% below Figma's pricing.
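The arithmetic behind that answer can be checked with a short sketch. It assumes the article's figures: $15/seat/month for Figma Professional and a projected (unconfirmed) $8-$10/seat for Stitch. The function names are illustrative only.

```python
MONTHS_PER_YEAR = 12

def annual_cost(seats: int, price_per_seat_month: float) -> float:
    """Yearly spend for a team billed per seat per month."""
    return seats * price_per_seat_month * MONTHS_PER_YEAR

def percent_below(price: float, reference: float) -> float:
    """How far `price` sits below `reference`, as a percentage."""
    return (reference - price) / reference * 100

FIGMA_MONTHLY = 15.0            # Figma Professional, monthly billing (from the article)
STITCH_PROJECTED = (8.0, 10.0)  # projected, unconfirmed Stitch range (from the article)
TEAM_SIZE = 10

figma_year = annual_cost(TEAM_SIZE, FIGMA_MONTHLY)
stitch_year = [annual_cost(TEAM_SIZE, p) for p in STITCH_PROJECTED]
discounts = [round(percent_below(p, FIGMA_MONTHLY)) for p in STITCH_PROJECTED]

print(figma_year)   # 1800.0
print(stitch_year)  # [960.0, 1200.0]
print(discounts)    # [47, 33]
```

At the projected prices, the 10-person team would pay $960 to $1,200 a year against Figma's $1,800, which lines up with the 30-50% range quoted above.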
---
## Google Stitch vs Claude Design 2026: Which AI Design Tool Wins?
- **URL:** https://justinmckelvey.com/blog/google-stitch-vs-claude-design
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** AI Design Tools
- **Reading time:** 5 min
- **Description:** Google Stitch vs Claude Design: free + voice + multi-screen vs codebase-aware design systems + tight handoff. Which AI design tool wins in 2026.
Quick Answer (The Verdict)
Google Stitch wins for free access, multi-screen prototyping, and voice input. Claude Design wins for codebase-aware design systems and tight Claude Code handoff. Both launched their current versions in spring 2026 — Stitch's streaming agent at Google I/O 2026 (May 20), Claude Design as an Anthropic Labs product (April 17). For solo builders prototyping ideas, pick Stitch (free). For product teams shipping consistent UI to an existing codebase, pick Claude Design (included with Claude Pro $20/mo). Many teams use both for different stages.
Based on hands-on work with both tools + 2026 product announcements · June 2026 · Author: Justin McKelvey, fractional CTO
Key Stats (June 2026)
• Google Stitch: Free (350 gens/mo), Google Labs, paid Q4 2026 expected $8-$10/seat
• Claude Design: Included with Claude Pro $20/mo, Max $200/mo, Team, Enterprise
• Real-time agent — Stitch: Streaming UI generation as you type/talk (May 2026)
• Real-time refinement — Claude Design: Inline edits, adjustment knobs, fast iteration
• Multi-screen — Stitch: Up to 5 connected screens per prompt
• Design system — Claude Design: Reads your codebase to build a real system
• Code export — Stitch: HTML, CSS, Tailwind, Vue, Angular, Flutter, SwiftUI (7 formats)
• Code handoff — Claude Design: Single-instruction bundle to Claude Code
TL;DR: Google Stitch vs Claude Design
These are the two best AI design tools of 2026, and they're optimizing for different jobs. Google Stitch is fast, free, voice-enabled, and great for solo prototyping. Claude Design is codebase-aware, design-system-integrated, and great for product teams shipping consistent UI.
If you can only pick one and you're a solo builder: Stitch (free, fast, multi-screen). If you can only pick one and you're a product team with an existing codebase: Claude Design (consistent brand-aligned output). If you can use both: do that. They serve different stages of the same workflow.
I'm a fractional CTO who uses both. This is the honest head-to-head — feature by feature, with specific recommendations for each use case.
Feature-by-Feature Comparison
| Capability | Google Stitch | Claude Design | Winner |
|---|---|---|---|
| Cost | Free (350 gens/mo) | Claude Pro $20/mo+ | Stitch |
| Real-time AI agent | Streaming agent (May 2026) | Fast iteration (not streaming) | Stitch |
| Voice input | Yes (March 2026) | No | Stitch |
| Multi-screen generation | Up to 5 screens | Via separate prompts | Stitch |
| Codebase-aware design system | No | Yes (reads your code) | Claude Design |
| Brand-aligned output | Generic | Matches your codebase | Claude Design |
| Code export | 7 formats native | HTML + Claude Code bundle | Stitch (raw); Claude Design (workflow) |
| Refinement tools | Voice + text + click | Inline edit + knobs + comments | Tie (different feels) |
| Team collaboration | Multi-user (May 2026) | Limited (research preview) | Stitch |
| Ecosystem integration | Google account | Claude products + Claude Code | Tie (depends on your stack) |
When Stitch Wins Decisively
Solo prototyping of new product ideas. Voice + multi-screen + free + streaming agent = the fastest path from "I have an idea" to "clickable 5-screen prototype." For early-stage exploration, Stitch is genuinely better.
Cost-sensitive work. Stitch is free. Period. If you're not already paying for Claude, getting started with Claude Design means a $240/year commitment for design tooling alone.
Multi-format code export needs. If you're building in Vue, Angular, Flutter, or SwiftUI and want production-adjacent code from your design tool — Stitch's native export covers more ground.
Designs that don't need to match an existing brand. Marketing experiments, hackathon entries, demo builds, exploratory mocks — Stitch's generic-but-good output is fine here.
Voice-driven design workflow. If you like pacing while you design and describing changes verbally — Stitch is the only option of the two.
When Claude Design Wins Decisively
Working within an existing codebase. The codebase-aware design system means Claude Design produces output that looks like your product, not generic AI design. For product teams maintaining UI consistency across features, this is a massive advantage.
Tight integration with Claude Code. If you're using Claude Code for development, the handoff bundle from Claude Design eliminates the "design exists in tool A, development happens in tool B" friction. Single instruction, full context.
Teams already paying for Claude. If you have Claude Pro, Max, Team, or Enterprise — Claude Design is included. No incremental cost for a major capability.
Brand-aligned design at scale. Shipping 5 new pages a week that all need to look like your existing product — Claude Design's design system integration scales much better than re-prompting Stitch for brand alignment every time.
Document-driven design. If you're starting from PRDs, slide decks, or existing documents and want them turned into designs — Claude Design's broader input handling (DOCX, PPTX, XLSX) is unique.
The Hybrid Pattern (Use Both)
The most common pattern I see in 2026 from product teams using both tools:
1. Stitch for exploration. "We're considering a new feature. Generate 3 possible UI directions." Multi-screen, voice-driven, fast, free.
2. Pick a direction. Stakeholder review, sketches, decisions.
3. Claude Design for refinement. Rebuild the chosen direction in Claude Design using your actual design system. Now it looks like YOUR product.
4. Claude Code for implementation. Hand off bundle to Claude Code, get production-ready code in your stack.
End-to-end: idea → 3 explorations → polished design in your brand → shipped code in 1-2 days. Neither tool alone gives you this; the combination does.
Which One Should You Buy First?
| Your situation | Buy first |
|---|---|
| Solo builder, no existing brand | Stitch (free) |
| Solo builder, existing brand | Stitch first (free), add Claude Design if needed |
| Product team, no codebase yet | Stitch for prototyping; revisit Claude Design when you have a codebase |
| Product team, existing codebase, using Claude | Claude Design (already paid) |
| Product team, existing codebase, not using Claude | Try Stitch free; if you outgrow brand-alignment, upgrade to Claude Pro for Design |
| Hackathon / 48-hour build | Stitch (speed + voice) |
| Agency work for clients | Both: Stitch for early concepts, Claude Design for client-specific brand work |
| Enterprise design team | Probably stay in Figma for now; pilot Claude Design for product UI work |
The Bottom Line
Both Google Stitch and Claude Design are excellent AI design tools, and they're not really competing for the same job. Stitch is the best free option and excels at solo prototyping. Claude Design is the best paid option for product teams in the Claude ecosystem and excels at consistent brand-aligned output.
If you can only pick one tool in 2026 and you're a solo builder: start with Stitch — it's free, fast, and covers most exploratory work. If you're a product team with an existing codebase + using Claude products: start with Claude Design — the design system integration alone justifies the workflow.
If you have the budget and time to use both: that's the optimal setup. Stitch for exploration → Claude Design for refinement → Claude Code for shipping. It's the fastest "idea to shipped feature" workflow currently available.
Related reading: Google Stitch review, Claude Design review, Google Stitch vs Figma, how to use Google Stitch, Claude Code vs Cursor.
Need help picking for your specific stack and workflow? Book a free 15-min strategy call. Specific recommendation in 10 minutes. No pitch.
### Frequently Asked Questions
**Q: Is Google Stitch better than Claude Design?**
A: Depends on what you're building. Google Stitch wins for: free access, multi-screen generation (5 connected screens from one prompt), voice input, and broader code export (HTML/Tailwind/Vue/Angular/Flutter/SwiftUI). Claude Design wins for: codebase-aware design systems (it reads your existing brand), tight handoff to Claude Code, and integrated work for teams already in the Claude ecosystem. For solo builders prototyping ideas, Stitch. For product teams shipping consistent UI tied to a codebase, Claude Design.
**Q: Which costs more, Google Stitch or Claude Design?**
A: Google Stitch is free as of June 2026 (Google Labs phase, 350 generations/month). Claude Design is included with Claude Pro ($20/mo), Max, Team, or Enterprise subscriptions — no separate pricing tier. If you already pay for Claude, both are essentially free. If you don't, Stitch is the only zero-cost option. Stitch paid plans are expected Q4 2026 at projected 30-50% below Figma's $15/seat.
**Q: Which has better output quality?**
A: Claude Design wins for brand-aligned output because of the codebase-aware design system integration. Stitch produces good-looking generic designs; Claude Design produces designs that look like your brand. For new design exploration where you don't have an existing brand to match — they're roughly equivalent. For working within an existing product or codebase — Claude Design's output is significantly more usable.
**Q: Which is easier for non-designers?**
A: Google Stitch, slightly. The voice input + multi-screen workflow lets non-designers describe an entire app flow at once. Claude Design's interface is also non-designer-friendly but has more configuration upfront (codebase pointing, design system setup). For absolute beginners — Stitch. For non-designers who want professional-looking output for their existing product — Claude Design.
**Q: Which has better code export?**
A: Different strengths. Stitch exports to 7 formats: HTML, CSS, Tailwind, Vue, Angular, Flutter, SwiftUI. Claude Design exports as live HTML + a bundle that hands off to Claude Code (which can then output whatever you need). If you want direct framework code from the design tool: Stitch. If you're using Claude Code as your development environment: Claude Design's handoff is tighter.
**Q: Which is better for prototyping?**
A: Google Stitch, for most prototyping. The multi-screen generation (5 connected screens from one prompt) plus voice input make it the fastest path from idea to clickable prototype. Claude Design's strength is consistency within an existing system, which is less critical for early-stage prototyping where you're exploring rather than committing to a design language.
**Q: Which is better for shipping production UI?**
A: Claude Design, especially if you're using Claude Code. The design system integration produces consistent output across designs, and the handoff bundle to Claude Code is the cleanest design-to-shipped-code workflow available. Stitch's exported code is usable but requires more cleanup. For teams shipping to production weekly, Claude Design's workflow advantage compounds.
**Q: Can I use both together?**
A: Yes, and the pattern is becoming common in 2026. Use Stitch for early prototyping and exploration (free, fast, multi-screen). Use Claude Design for refining the chosen direction into your actual brand + handing off to development. Or use them for different projects: Stitch for hackathons, Claude Design for client work where you have an existing codebase to maintain consistency with.
**Q: Will Claude Design replace Google Stitch?**
A: Unlikely — they serve different use cases. Stitch's free tier and multi-screen generation are unique strengths. Claude Design's codebase integration and Claude ecosystem ties are unique strengths. Both are likely to continue evolving in different directions. The bigger competitive question is whether Figma can integrate AI capabilities fast enough to match either.
**Q: Which has better real-time AI features?**
A: Stitch's real-time streaming agent (announced at Google I/O 2026, May 20) reflows layouts as you type or speak — qualitatively different from prompt-and-wait workflows. Claude Design has fast iteration via inline edits and adjustment knobs but doesn't have the same continuous-streaming experience. For the most cutting-edge real-time AI design experience as of June 2026 — Stitch's agent is the leader.
---
## Claude Design Review 2026: Anthropic's AI Design Tool
- **URL:** https://justinmckelvey.com/blog/claude-design-review
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** AI Design Tools
- **Reading time:** 6 min
- **Description:** Claude Design review 2026: Anthropic's AI design tool (Opus 4.7). Reads your codebase, builds design systems, ships HTML. Honest verdict + comparisons.
Quick Answer (The Honest Verdict)
Claude Design is Anthropic Labs' AI design tool, launched April 17, 2026 and powered by Claude Opus 4.7. The killer feature: it reads your codebase during onboarding to build a real design system, then every subsequent design uses your colors, typography, and components automatically. Output is live HTML, not static images. Tightest design-to-code handoff in the AI design space (one instruction to Claude Code). Included with Claude Pro ($20/mo), Max, Team, Enterprise. For teams in the Claude ecosystem, this is the AI design tool to use.
Based on hands-on use + Anthropic announcement research · June 2026 · Author: Justin McKelvey, fractional CTO
Key Stats (June 2026)
• Launched: April 17, 2026 by Anthropic Labs
• Model: Claude Opus 4.7 (Anthropic's flagship)
• Pricing: Included with Claude Pro ($20/mo), Max, Team, Enterprise
• Interface: Two-pane — chat left, live canvas right
• Output: Live HTML (clickable, testable) — not static images
• Design system integration: Reads your codebase + design files during onboarding
• Handoff: Single-instruction bundle to Claude Code (tightest in AI design space)
TL;DR: Claude Design in 90 Seconds
Claude Design is Anthropic's bet that AI can replace large portions of the "designer + developer handoff" workflow. You describe what you want, Claude generates a live HTML design, and you refine through inline edits, comments, or adjustment knobs. When you're ready to ship, you hand off to Claude Code with a single instruction.
The killer differentiator vs other AI design tools: design system integration that actually works. During onboarding, Claude reads your codebase and existing design files to build a system. Every subsequent project uses your actual colors, typography, and components — not generic AI-generated styling.
I'm a fractional CTO who builds with Claude products daily. This is the honest review — what Claude Design does uniquely well, where it falls short, and whether it justifies your Claude subscription if you're not already paying.
What Claude Design Does Uniquely Well
Design system integration. This is the single biggest differentiator from Google Stitch, v0, or any other AI design tool. During the onboarding flow, you point Claude at your codebase (or existing Figma file, or live website via web capture) and Claude builds a design system from what it finds. Subsequent generations use YOUR colors, YOUR typography, YOUR components. The output stops looking like "an AI-generated design" and starts looking like your brand.
Tight handoff to Claude Code. When a design is ready to ship, Claude packages everything — the HTML, the design tokens, the component structure, the codebase context — into a single bundle you pass to Claude Code with one instruction. The implementation engineer (human or AI) starts with full context instead of having to interpret a Figma file. This is the cleanest design-to-code workflow currently available in 2026.
Live HTML output. The output isn't a static image of a design — it's actual HTML you can click through, test interactions on, and copy directly into your app. This eliminates the entire "design looks great but how do we build it" failure mode.
Multiple input types. Text prompts, images, documents (DOCX, PPTX, XLSX), codebase pointers, web captures of existing sites. You can show Claude exactly what you want as a reference rather than describing it.
Real-time refinement tools. Inline comments on specific elements, direct text editing, adjustment knobs for spacing/color/layout. You don't have to re-prompt for every small change — refine on the canvas, then ask Claude to apply your changes across the full design.
Where Claude Design Falls Short
The Claude Pro requirement. Claude Design is bundled with Claude subscriptions; there's no free tier. If you're not already paying for Claude Pro ($20/mo) or higher, you're effectively paying $240/year for access vs Google Stitch's $0. For solo builders not in the Claude ecosystem, this is real friction.
Limited team collaboration features. Compared to Figma's mature multiplayer + comments + dev mode workflow, Claude Design's team features are early. For solo work or pair design, it's fine; for 10-designer teams, Figma is still the answer.
Research preview limitations. As of June 2026, Claude Design is in research preview. That means: features change, occasional bugs, and no SLA. The roadmap is clear (this is going to be a flagship Anthropic product), but if you need stability for client work today, expect occasional hiccups.
Less precision than Figma. Pixel-level adjustment, micro-typography work, and complex layered designs are still faster in Figma. Claude Design is excellent for new design generation; less good for tweaking existing precision-heavy designs.
Smaller community + ecosystem. Figma has 10+ years of plugins, templates, community resources. Claude Design has 6 weeks. The depth gap shows when you need a niche solution.
The Design System Integration Is the Real Win
I want to spend more time on this because it's qualitatively different from how other AI design tools work.
When you onboard Claude Design, you point it at your codebase (e.g., your Tailwind config, your component library, your existing CSS) or your design files. Claude reads them and extracts your design tokens — the actual colors, fonts, spacing scales, and component patterns your team uses.
Then, when you ask Claude to generate "a pricing page for our SaaS," the output uses YOUR pricing-card components, YOUR primary button style, YOUR typography hierarchy. The design looks like it came from your team, not from generic AI output.
This solves the single biggest problem with AI design tools at scale: getting output that looks consistent across projects and aligned with your existing product. Google Stitch and v0 generate good-looking designs that don't match your brand. Claude Design generates designs that DO match your brand because it learned your brand.
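To make "design tokens" concrete, here is a minimal sketch of pulling tokens out of a stylesheet's CSS custom properties. It is not how Claude Design works internally, which the article does not describe. The sample CSS, the `extract_tokens` helper, and the grouping rule are all assumptions made up for illustration.

```python
import re

# Hypothetical stylesheet standing in for a team's existing CSS.
SAMPLE_CSS = """
:root {
  --color-primary: #2563eb;
  --color-surface: #ffffff;
  --font-heading: "Inter", sans-serif;
  --space-4: 1rem;
}
"""

# Matches declarations such as `--color-primary: #2563eb;`
CUSTOM_PROPERTY = re.compile(r"--([\w-]+)\s*:\s*([^;]+);")

def extract_tokens(css: str) -> dict[str, dict[str, str]]:
    """Group CSS custom properties by the prefix before the first hyphen."""
    tokens: dict[str, dict[str, str]] = {}
    for name, value in CUSTOM_PROPERTY.findall(css):
        group, _, key = name.partition("-")
        tokens.setdefault(group, {})[key] = value.strip()
    return tokens

tokens = extract_tokens(SAMPLE_CSS)
print(tokens["color"]["primary"])  # #2563eb
```

A token map like this (colors, fonts, spacing) is the kind of structured input that lets a generator reuse a team's existing values instead of inventing new ones.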
The Workflow That Makes Claude Design Worth It
If you have a codebase + use Claude Code, here's the workflow that makes Claude Design genuinely productive:
1. Onboard once. Point Claude at your codebase. It builds your design system. Takes 5-15 minutes.
2. Prompt for new screens. "I need a checkout flow for our subscription product, 3 screens." Claude generates them in your design system.
3. Refine inline. Click on the heading, change the copy. Adjust the spacing slider. Comment on the CTA position.
4. Hand off. "Bundle this for Claude Code." Claude packages everything into a single instruction.
5. Ship. Paste into Claude Code, get production-ready React/Vue/whatever-you-use code.
End-to-end: from "I need a new feature" to "deployable code" in 30-60 minutes, not days. This is the productivity unlock that makes the Claude Pro subscription worth it for product teams.
Who Should Use Claude Design
• Teams already in the Claude ecosystem. If you use Claude Code, Claude Pro, or any Claude products daily, adding Claude Design is a no-brainer.
• Founders who want design + code in one workflow. The handoff bundle to Claude Code is uniquely tight.
• Anyone with an existing codebase + design system. The system extraction is the killer feature.
• Product teams shipping consistent UI. Brand alignment across designs is way better than competitors.
• Solo builders who already pay for Claude. No incremental cost for a major capability.
Who Should Skip Claude Design (For Now)
• Designers who want precision. Stay in Figma. Pixel work is faster there.
• Large teams with deep Figma workflows. The switching cost outweighs the gain right now.
• Anyone not already paying for Claude. Try free Google Stitch first; come to Claude Design when you've used Claude products for other work.
• Teams that need stable, no-surprise-changes tooling. The "research preview" label means things move.
The Bottom Line
Claude Design is the strongest AI design tool for teams already in the Claude ecosystem in 2026. The design system integration is genuinely better than any competitor. The Claude Code handoff is uniquely tight. For most product teams, the workflow speed makes the Claude Pro cost trivially worth it.
For solo builders not yet paying for Claude — try Google Stitch first. It's free and the multi-screen generation is excellent. Come to Claude Design when you've started using Claude Code for development and want the integrated workflow.
For pure designers with established Figma workflows — keep using Figma. Pull in Claude Design for the initial generation phase only when you want a faster path to first draft.
Related reading: Google Stitch review, Google Stitch vs Claude Design, Google Stitch vs Figma, best vibe coding tools 2026.
Trying to figure out which AI design tool fits your specific workflow? Book a free 15-min strategy call. Specific recommendation, no pitch.
### Frequently Asked Questions
**Q: What is Claude Design?**
A: Claude Design is an Anthropic Labs product launched April 17, 2026 that lets you collaborate with Claude to create polished visual work — designs, prototypes, slides, one-pagers, and more. It's powered by Claude Opus 4.7 and available in research preview to Claude Pro, Max, Team, and Enterprise subscribers. The output is live HTML (clickable, testable), not static images.
**Q: Is Claude Design free?**
A: Not as a standalone product — it's included with Claude Pro ($20/mo), Max ($200/mo), Team, and Enterprise subscriptions. If you already pay for Claude, you have access at no additional cost. There's no separate Claude Design pricing tier as of June 2026; that may change as it moves out of research preview.
**Q: How does Claude Design work?**
A: The interface has two panes: chat on the left, live canvas on the right. You describe what you want (a landing page, an app screen, a pitch deck), Claude generates the design on the canvas, and you refine through direct text edits, inline comments, or adjustment knobs for spacing, color, and layout. During onboarding, Claude reads your codebase and design files to build a design system; every subsequent project uses your colors, typography, and components automatically.
**Q: What can you build with Claude Design?**
A: Designs, prototypes, slide decks, one-pagers, app screens, marketing pages, dashboards, and any HTML-based visual work. The output is live HTML — clickable and testable, not static images. You can also point Claude at your codebase or upload images, DOCX, PPTX, XLSX files as references. Web capture lets you grab elements directly from existing websites.
**Q: How is Claude Design different from Google Stitch?**
A: Both generate live UIs from prompts. The main differences: Claude Design's design system integration is significantly better — it actually reads your codebase to build a system rather than generating generic-looking designs. Google Stitch wins on free access (Stitch is free; Claude Design requires Pro/$20+ subscription) and multi-screen flow generation. Stitch is better for non-technical builders; Claude Design is better for teams with existing codebases. See the head-to-head comparison.
**Q: Does Claude Design replace Figma?**
A: Not for most teams. Figma still wins for precision design work, manual component libraries, plugin ecosystems, and team collaboration at scale. Claude Design wins for AI-native generation, codebase-aware design systems, and tight handoff to Claude Code for implementation. The 2026 pattern: use Claude Design for AI-generated first drafts, use Figma for refinement and team workflows.
**Q: Can Claude Design hand off to developers?**
A: Yes, and the handoff is unusually tight. When a design is ready to build, Claude packages everything into a handoff bundle that you pass to Claude Code with a single instruction. The bundle includes the HTML, the design tokens, component structure, and context about how it should integrate with your existing codebase. This is the single biggest workflow advantage Claude Design has over other AI design tools.
**Q: Who should use Claude Design?**
A: Teams already in the Claude ecosystem (using Claude Pro/Max or Claude Code), founders who want design+code in a unified workflow, anyone with an existing codebase they want a design system extracted from, and product teams who'd rather generate designs by prompting than dragging elements around. Skip if you're a pure designer who needs Figma's precision, or if you're not paying for Claude.
**Q: What's the actual quality of Claude Design output?**
A: Better than most AI design tools because the design system integration produces consistent, brand-aligned output across projects. Worse than a skilled designer working in Figma if you're going for award-winning bespoke design. The sweet spot: shipping consistent, professional-looking designs faster than manual workflows for product UI work, marketing pages, and internal tools. Not a replacement for senior product designers on flagship work.
**Q: Is Claude Design worth the Claude Pro subscription?**
A: If you're using Claude Code, Claude Pro, or any Claude product daily for work — yes, Claude Design alone might justify the subscription. The integration with your codebase and the handoff to Claude Code is genuinely productive. If you're not already a Claude user, the $20/mo for Pro vs free Google Stitch is the comparison to make — depends on whether you value the design system integration enough to pay for it.
---
## Google Stitch Review 2026: The Honest Verdict
- **URL:** https://justinmckelvey.com/blog/google-stitch-review
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** AI Design Tools
- **Reading time:** 6 min
- **Description:** Google Stitch review 2026: AI design tool from Google Labs (free, 350 generations/mo). Real-time agent, 5-screen flows, voice input. The honest verdict.
Quick Answer (The Honest Verdict)
Google Stitch is the AI design tool Google quietly built from the Galileo AI acquisition — and as of June 2026, it's the best free option on the market. Real-time AI agent that reflows layouts as you talk. Generates up to 5 connected screens from one prompt. Voice input. Code export to HTML/Tailwind/Vue/Angular/Flutter/SwiftUI. Currently free with 350 generations/month, no credit card. Paid plans expected Q4 2026 at projected 30-50% below Figma's $15/seat. For most builders in 2026, Stitch is worth using immediately.
Based on hands-on use of Google Stitch + Galileo AI heritage research · June 2026 · Author: Justin McKelvey, fractional CTO
Key Stats (June 2026)
• Owned by: Google (acquired Galileo AI mid-2025, relaunched as Stitch)
• Pricing: Free during Google Labs phase — 350 generations/month
• Paid plans: Expected Q4 2026, projected $8-$10/seat (30-50% below Figma)
• Real-time agent: Launched May 20, 2026 at Google I/O — reflows UI as you type/talk
• Multi-screen: Generates up to 5 connected screens from a single prompt
• Code export: HTML, CSS, Tailwind, Vue, Angular, Flutter, SwiftUI
• Search trend: 49,500/mo, +52,281% YoY growth — the SERP isn't saturated yet
TL;DR: Google Stitch in 90 Seconds
Google Stitch is an AI-powered design tool that generates complete UIs from natural language descriptions. You describe what you want ("a SaaS dashboard with a sidebar, analytics cards, and a dark theme") and Stitch produces a polished, interactive design in seconds. You can refine via text, voice, or by clicking directly on elements. Export as code in 7 formats.
The product has heritage worth knowing: it's built from Galileo AI, which Google acquired in mid-2025, and relaunched under the Stitch name in late 2025. At Google I/O 2026 (May 20), Google announced major upgrades including the real-time streaming agent and multi-user collaboration — both of which are genuinely impressive.
I'm a fractional CTO who uses Stitch in client work and design system explorations. This is the honest review — what it does well, where it still falls short, and whether you should pick it over Figma, Claude Design, or v0.
What Stitch Actually Does Well
The real-time agent is the killer feature. Most AI design tools follow a "prompt → wait → see result" loop. Stitch's agent renders UI components directly onto the canvas as you type or speak, with the layout reflowing in real time. It feels qualitatively different: more like sketching alongside a collaborator who sees your intent immediately than running a series of generation jobs.
Multi-screen generation is genuinely useful. Describe an entire application flow ("a booking app with a list of times, a confirmation page, and a payment screen") and Stitch generates the whole flow at once, up to five screens, with linked navigation. For prototyping new product ideas, this saves days compared to building screen-by-screen.
Voice input that actually works. Voice input was introduced in March 2026 and is tightly integrated into the streaming loop. You can say "give me three different menu options" or "show me this screen in different color palettes" and watch them appear. The voice quality is good enough that you can actually design while pacing.
Code export across 7 formats. HTML, CSS, Tailwind CSS, Vue.js components, Angular templates, Flutter widgets, and SwiftUI views. The code quality is genuinely production-adjacent — cleaner than most AI design tools — and pairs well with Cursor or Claude Code for the final implementation pass.
And it's free. 350 generations per month, no credit card. For solo designers or small teams, this covers most real workflows.
Where Stitch Still Falls Short
It's not perfect. The honest list:
Precision work is still Figma's territory. Stitch is great at generating new designs from prompts; it's less good at the pixel-level adjustments designers do all day in Figma. If you need to nudge a button 2px and align it precisely to a grid, Figma still wins.
Brand-specific design systems aren't its strength. Stitch generates good-looking designs, but they look like "AI-generated designs" if you compare them to a design system custom-tailored to your brand. Claude Design handles brand-specific design systems significantly better because it reads your codebase to build a system before generating.
Team workflows at scale aren't as deep as Figma. Figma's component libraries, dev mode handoff, plugin ecosystem, and team management features are the result of years of iteration. Stitch is excellent for individual builders and small teams; it's not yet the choice for 50-designer organizations.
The 350-generation cap is real. If you're iterating intensively (10+ generations per design across multiple designs per day), you'll hit the cap before month-end. Workaround: design with the visual editor for tweaks instead of re-prompting.
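To sanity-check whether the cap will bite for you, here's a minimal sketch of the arithmetic. It assumes you use Stitch every day at a steady rate; the function name and the usage figures are hypothetical, and only the 350-generation cap comes from the free tier described above.

```python
import math


def day_cap_is_hit(monthly_cap: int, gens_per_design: int, designs_per_day: int) -> int:
    """Return the 1-based day of the month on which the cap runs out.

    Assumes steady daily usage (an illustrative simplification).
    """
    daily = gens_per_design * designs_per_day
    if daily <= 0:
        raise ValueError("daily usage must be positive")
    return math.ceil(monthly_cap / daily)


# 350 is the free-tier cap; the usage numbers below are assumed.
print(day_cap_is_hit(350, gens_per_design=10, designs_per_day=2))  # 18
print(day_cap_is_hit(350, gens_per_design=10, designs_per_day=1))  # 35
```

A result above the number of days in the month (like 35) means you would not reach the cap at that pace. At two designs a day and ten generations each, you would run out around day 18.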
Who Should Use Google Stitch
• Solo founders prototyping product ideas. Speed-to-first-design + multi-screen generation = fastest path from idea to clickable prototype.
• Developers who hate manual design tools. Skip Figma entirely. Prompt, refine, export to your framework of choice, hand off to your IDE.
• Designers doing initial concepting. Use Stitch for the "what could this look like" phase, then move to Figma for precision work.
• Anyone who needs design + code in one pass. The 7-format code export is unusual and useful.
• Anyone on a budget. It's free. Use it.
Who Shouldn't Use Stitch (Yet)
• Large design teams with established Figma workflows. The switching cost isn't worth it for the marginal speed gain.
• Designers doing high-precision UI work daily. Pixel-perfect adjustment is faster in Figma.
• Teams that need deep brand-specific design systems. Claude Design is the better fit for this use case.
• Anyone who needs production-ready code from a designer. The export is good but still requires developer review.
Stitch vs Figma vs Claude Design (Quick Compare)

| Feature | Google Stitch | Figma | Claude Design |
| --- | --- | --- | --- |
| Cost (June 2026) | Free (350 gens/mo) | $15/seat/mo | Included with Claude Pro/Max |
| Real-time AI agent | Yes (best in class) | Limited (AI features in beta) | Yes (Opus 4.7-powered) |
| Multi-screen generation | Up to 5 screens | No (single screen, then duplicate) | Multi-screen via prompts |
| Voice input | Yes | No | No (text/inline edits) |
| Brand-specific design systems | Limited | Excellent (manual) | Excellent (reads codebase) |
| Code export | HTML/CSS/Tailwind/Vue/Angular/Flutter/SwiftUI | Dev Mode (with plugin) | Bundle to Claude Code |
| Team collaboration | Real-time multiplayer (new) | Industry-leading | Limited (early) |

The Bottom Line
Google Stitch is the most underrated AI design tool of 2026 because most people still think of it as the old Galileo AI. The truth: the May 2026 I/O update transformed it into the strongest real-time AI design tool on the market, and it's free.
For most builders, the right move in 2026 is: use Stitch for new design exploration and multi-screen prototyping, use Figma for precision work and team collaboration at scale, use Claude Design for brand-specific design system work tied to your codebase. They're not mutually exclusive.
If you're starting fresh and need to pick one as your primary design tool — Stitch is the surprising answer for solo builders and small teams in 2026. The price (free) is unbeatable, the real-time agent is genuinely better than competitors, and the code export means you can go from idea to shipped feature without leaving the AI-tool ecosystem.
Related reading: Claude Design review, Google Stitch vs Claude Design, Google Stitch vs Figma, how to use Google Stitch.
If you're trying to figure out which AI design tool fits your specific workflow, book a free 15-min strategy call. I'll give you a specific recommendation in 10 minutes. No pitch.
### Frequently Asked Questions
**Q: Is Google Stitch free?**
A: Yes, completely free as of June 2026. Stitch is in the Google Labs experimental phase with no paid plans. The free tier includes 350 generations per month — enough for most individual designers or small teams. Paid plans are expected by Q4 2026, with industry analysts anticipating pricing 30-50% below Figma's $15/seat.
**Q: What is Google Stitch?**
A: Google Stitch is an AI-powered design and prototyping tool that generates complete user interfaces from natural language descriptions. Originally built as Galileo AI and acquired by Google in mid-2025, it relaunched under the Stitch name. At Google I/O 2026 (May 20) Google announced major upgrades: a real-time streaming design agent, multi-user collaboration, and multi-screen generation.
**Q: How is Google Stitch different from Figma?**
A: Figma is a manual design tool with AI features bolted on; Stitch is an AI-native tool where you describe what you want and it generates the design. Figma costs $15/seat/month; Stitch is currently free. Figma's strength: precision, granular control, decade of community resources. Stitch's strength: speed-to-first-design and the new real-time agent that reflows layouts as you talk. They serve overlapping but different workflows.
**Q: What can Google Stitch generate?**
A: Multi-screen application flows from a single description (up to 5 connected screens), individual UI components, dashboards, mobile app screens, landing pages, and entire micro-app flows. Code export works in HTML/CSS, Tailwind, Vue.js, Angular, Flutter widgets, and SwiftUI views — broader coverage than most AI design tools.
**Q: How do you use Google Stitch?**
A: Sign in with a Google account at stitch.withgoogle.com (no credit card required). Describe the UI you want — for example, "a dashboard with sidebar navigation, analytics cards, and a dark theme" — and Stitch generates the design in seconds. Refine via text, voice input, or by clicking on elements directly. Export as code or share a live preview link.
**Q: What's the real-time AI agent in Google Stitch?**
A: Announced at I/O 2026, the Stitch Agent renders UI components directly onto the canvas as you type or speak, reflowing the layout in real time. This is qualitatively different from "prompt → wait → see result" workflows — it feels more like sketching with a collaborator who sees your intent immediately. Voice input is tightly integrated, so you can say "give me three different menu options" and watch them appear live.
**Q: Can Google Stitch export real code?**
A: Yes. Code export covers HTML, CSS, Tailwind CSS, Vue.js components, Angular templates, Flutter widgets, and SwiftUI views. The code is clean enough to use as a starting point for production work — better than most AI design tools — but you'll still want a developer to review before shipping. Combined with a tool like Cursor or Claude Code, the export-to-shipped-feature time is faster than designing in Figma first.
**Q: How does Google Stitch compare to Claude Design?**
A: Both generate live (not static) UIs from prompts. Stitch is free and Google-backed; Claude Design (launched April 17, 2026) is part of Anthropic Labs and included with Claude Pro/Max subscriptions. Stitch has multi-screen generation and voice input; Claude Design has design system integration (reads your codebase) and tighter handoff to Claude Code. Both are excellent. See Google Stitch vs Claude Design for the head-to-head.
**Q: Is Google Stitch worth using in 2026?**
A: For most builders and designers in 2026: yes. The free tier (350 generations/month), the real-time agent, the multi-format code export, and the multi-screen flow generation make it genuinely useful — not just a demo. Where it falls short: precision design work (still Figma's territory), brand-specific design systems (Claude Design handles this better), and team workflows at scale (Figma's community and integrations are deeper).
**Q: When will Google Stitch start charging?**
A: Paid plans are projected for Q4 2026. Industry analysts expect a continued free tier (likely with reduced generation limits) and paid tiers priced 30-50% below Figma — putting the paid plan around $8-$10/month per seat. The exact pricing isn't announced, but the Galileo AI heritage and Google's free-tier history suggest they'll keep meaningful free access permanently.
---
## AI Consultant Services: What to Buy and What to Skip (2026)
- **URL:** https://justinmckelvey.com/blog/ai-consultant-services
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** AI for Business
- **Reading time:** 6 min
- **Description:** AI consultant services explained: strategy, implementation, automation, training, audits. 2026 pricing for each + 4 services not worth paying for.
Quick Answer (Buyer's Map)
AI consulting services in 2026 fall into 7 categories: strategy, implementation, workflow automation, readiness assessment, training, fine-tuning/prompt engineering, and infrastructure setup. Most businesses need implementation ($25K-$150K) or automation ($15K-$80K). Many pay for strategy ($50K-$500K) and regret it. The cheapest engagement that actually changes your business: a readiness assessment ($5K-$25K) plus an implementation pilot ($20K-$40K), which comes in under $30K total at the low end of both ranges and ships one production AI feature.
Based on active AI consulting work for founders + ops leaders · June 2026 · Author: Justin McKelvey, fractional CTO
Key Stats (June 2026)
• AI strategy + roadmap: $50K-$500K (often not worth it for non-Fortune 500)
• AI implementation: $25K-$150K per feature (the workhorse service)
• AI workflow automation: $15K-$80K per workflow
• AI readiness assessment: $5K-$25K (best starting point if you don't know where to start)
• Training + change management: $10K-$50K
• Fine-tuning + prompt engineering: $10K-$80K
• Cheapest path to shipped AI: ~$30K (assessment + pilot)
TL;DR: AI Consultant Services in 2026
AI consulting services have proliferated rapidly in 2025-2026 as every business tries to figure out what to buy. The catch: most service categories are dramatically overpriced for what they deliver, and the ones that consistently work are simpler than the market makes them sound.
If you take one thing from this guide: for most businesses, the right starting move is a readiness assessment ($5K-$25K) followed by a single implementation pilot ($20K-$40K). Total: under $30K, one shipped AI feature, and a playbook for what to do next. Anything more elaborate at the start usually means paying for slideware.
I'm a fractional CTO who provides several of these services to founders and ops leaders. This guide is the honest buyer-side map — written by someone who'd rather you get the smallest engagement that actually works than the largest one I could sell you.
The 7 AI Consulting Service Categories

| Service | 2026 cost range | Best for | When to skip |
| --- | --- | --- | --- |
| AI strategy + roadmap | $50K-$500K | Fortune 500 board buy-in | Most businesses under $100M revenue |
| AI implementation | $25K-$150K per feature | Anyone who knows what they want shipped | If you don't know what to build (do assessment first) |
| AI workflow automation | $15K-$80K per workflow | Ops-heavy businesses with clear bottlenecks | If Zapier + GPT integrations cover it |
| AI readiness assessment | $5K-$25K | Anyone who doesn't know where to start | If you've already shipped one AI feature |
| Training + change management | $10K-$50K | Large teams adopting AI tooling | Small teams (free YouTube + practice works) |
| Fine-tuning + prompt engineering | $10K-$80K | Specialized use cases where stock models miss | Standard use cases (better to try prompt eng first) |
| Infrastructure + cost optimization | $15K-$50K | Apps with high token costs or scaling issues | Pre-production work |

The Service Most Businesses Should Buy First
AI readiness assessment + implementation pilot. Total cost: usually under $30K. Total time: 4-8 weeks. Output: one production AI feature plus a clear plan for the next 2-3.
Here's the realistic version of what you get for that money:
Week 1-2: Readiness assessment. A consultant inventories your data, your tooling, your team's AI maturity, and your top 3-5 highest-ROI opportunities. Output: a 5-15 page document you can act on, plus a clear pilot proposal.
Week 3-8: Implementation pilot. Build, test, and deploy one AI feature in production. Could be: tier-1 support automation, sales lead enrichment, document processing, content generation, customer-facing AI feature. Output: working code in production + measurable before/after metrics.
That's $30K to find out, for real, whether AI works for your business. Compared to paying $250K for a strategy deck that doesn't ship anything — the math is obvious.
The 4 Services NOT to Pay For
I'd actively skip these for most businesses:
1. Generic AI strategy under $200K. Strategy decks under that price point are glorified Google searches plus a few hours of "discovery." You're paying for someone to summarize publicly available AI use cases for your industry. If you want a strategy primer, McKinsey Quarterly publishes free ones. Pay for strategy only if you have a $1M+ change management problem that requires Big 4 firepower.
2. "AI transformation" programs without specific pilots. Programs that start with "let's align on the vision" and run 12+ months without a single shipped feature are vendor revenue, not value. Demand pilots in the first 90 days or walk away.
3. Custom LLM training when fine-tuning or prompt engineering would work. Custom model training (from-scratch or LoRA) costs 10-50x more than equivalent results from clever prompt engineering or basic fine-tuning. Some firms push this because it's lucrative billable work; the math usually doesn't justify it.
4. AI ethics audits as standalone engagements. Ethics review is critical for AI work in regulated industries, but it should be embedded in implementation engagements, not sold as a standalone $50K deliverable. Treat it as part of the work, not a separate service.
What's Inside Each Service Type
If you're shopping, here's what real deliverables look like:
AI readiness assessment: Data audit, opportunity prioritization, team maturity assessment, infrastructure recommendations, 90-day pilot proposal. 2-week engagement, $5K-$25K. More on AI readiness assessment.
AI implementation: A specific AI feature shipped to production. Discovery + scoping (2 weeks), build + test (4-6 weeks), deploy + handoff (2 weeks). Total 8-10 weeks by those phase estimates, with complex features running to 12; $25K-$150K. More on AI implementation consultants.
AI workflow automation: A back-office or operations workflow automated end-to-end. Most projects: support automation, sales ops, document processing. 4-8 weeks, $15K-$80K per workflow. More on AI automation consultants.
AI strategy + roadmap: Document outlining opportunities, ROI estimates, prioritization, change management plan. Useful for board buy-in at large companies. 4-12 weeks, $50K-$500K. More on AI strategy consultants.
Training + change management: Team training on AI tools (Claude, Cursor, ChatGPT for business), workflow integration, internal advocate program. 4-12 weeks, $10K-$50K.
Fine-tuning + prompt engineering: Optimization of model performance for your specific use case. Usually bundled with implementation work but available standalone. 2-6 weeks, $10K-$80K.
Infrastructure setup + cost optimization: LLM provider setup, monitoring + observability (LangSmith, Helicone), cost controls, fallback strategies. 2-8 weeks, $15K-$50K.
How to Pick the Right Service for Your Situation

| Your situation | Right first service | Approximate cost |
| --- | --- | --- |
| You know you want AI but don't know where to start | Readiness assessment | $5K-$25K |
| You have a specific AI feature in mind | Implementation pilot | $20K-$60K |
| You have a specific workflow bottleneck | Automation pilot | $15K-$40K |
| Your AI features are slow or expensive in production | Infrastructure + cost optimization | $15K-$50K |
| Stock LLMs aren't accurate enough for your use case | Prompt engineering (try first), then fine-tuning | $5K-$30K (prompt eng), $20K-$80K (fine-tune) |
| Your large team needs to adopt AI tooling | Training + change management | $10K-$50K |
| You're at Fortune 500 scale and need board buy-in first | Strategy + roadmap | $200K-$500K (Big 4 territory) |

The Bottom Line
AI consultant services in 2026 are a buffet. The trap is over-ordering. The honest path for most businesses:
1. Start with a readiness assessment ($5K-$25K) to identify the highest-ROI opportunity
2. Pilot the top opportunity ($20K-$40K) as an implementation or automation project
3. Measure the impact for 60-90 days
4. Decide what to do next based on real data, not pre-engagement promises
Total exposure for the first move: under $30K. That gets you a real AI feature in production plus the playbook to ship more. Anything more elaborate at the start risks paying for slideware.
If you're shopping for AI consulting services right now and want a second opinion on what to buy first, book a free 15-min strategy call. I'll give you a specific recommendation based on your business in 10 minutes. No pitch.
Related reading: AI consultant overview, AI consultant companies (how to vet), AI implementation consultant, AI automation consultant, AI consultant for small business.
### Frequently Asked Questions
**Q: What services do AI consultants offer?**
A: AI consulting services in 2026 fall into 7 main categories: (1) AI strategy + roadmap, (2) AI implementation — building working features, (3) AI workflow automation, (4) AI readiness assessment + audit, (5) AI training + change management, (6) Custom model fine-tuning + prompt engineering, (7) AI infrastructure setup + cost optimization. Most businesses need #2 or #3. Many pay for #1 instead and regret it.
**Q: Which AI consultant service do I actually need?**
A: If you can describe what you want AI to do for you ("automate Tier-1 support," "summarize sales calls," "rank inbound leads"), you need implementation or automation services. If you literally don't know where to start, you need a readiness assessment (about 2 weeks, $5K-$25K). Most businesses do NOT need full AI strategy engagements; those are designed for Fortune 500 change management, not for shipping working AI.
**Q: How much do AI consultant services cost in 2026?**
A: Strategy + roadmap: $50K-$500K. Implementation: $25K-$150K per feature. Workflow automation: $15K-$80K per workflow. Readiness assessment: $5K-$25K. Training: $10K-$50K. Fine-tuning + prompt engineering: $10K-$80K. Infrastructure setup: $15K-$50K. The cheapest engagement that actually changes your business is usually a readiness assessment + implementation pilot — total under $30K for most businesses.
**Q: What AI consultant services are NOT worth paying for?**
A: Four services I'd skip for most businesses: (1) Generic AI strategy decks under $200K — they're glorified Google searches. (2) "AI transformation" programs without specific pilots — endless billable hours. (3) Custom LLM training when fine-tuning or prompt engineering would work — usually 10x more expensive for marginal gains. (4) AI ethics audits as standalone engagements — bundle into implementation work instead. These services have legitimate use cases but most buyers don't need them.
**Q: What's included in an AI readiness assessment?**
A: A good 2-week AI readiness assessment includes: (a) inventory of your data assets and quality, (b) identification of 3-5 highest-ROI AI opportunities specific to your business, (c) honest assessment of your team's AI maturity and gaps, (d) infrastructure + tooling recommendations, (e) prioritized 90-day pilot proposal with cost estimate. Cost: $5K-$25K. Output: a 5-15 page document you can act on, not a 60-page deck for the board. See what is an AI readiness assessment.
**Q: Do AI consultants offer ongoing services or just one-time projects?**
A: Both. One-time engagements: implementation pilots, readiness assessments, audits, training programs — typical 4-12 weeks, $5K-$150K. Ongoing retainers: $5K-$25K/month for continuous AI implementation work, monitoring, and iterative improvements. Some businesses do best with project-based engagements + occasional consulting check-ins; others need continuous embedded support. The biggest commitment trap: long-term retainers without clear scope or deliverables.
**Q: What's the difference between AI consulting and AI consulting services?**
A: Same thing, different word. "AI consulting services" is the more buyer-side phrasing — businesses Googling for what they can purchase. "AI consulting" tends to be the supply-side term that consulting firms use to describe themselves. The services are identical; just different ways of saying it.
**Q: Should I bundle AI consultant services or buy individual ones?**
A: Start unbundled. A readiness assessment + a single implementation pilot is the cleanest path — you learn what's possible, validate the highest-ROI opportunity, and limit financial exposure. Bundled engagements ('AI transformation programs,' 'enterprise AI partnerships') trap you in long retainers with vague deliverables. Once you've shipped one or two pilots successfully, you can decide if a longer-term bundled relationship makes sense.
**Q: How do I know if an AI consultant's service offerings are real or just marketing?**
A: Two filters: (1) Look at their case studies — do they include specific recent implementations with measurable outcomes, or just 'helped client transform their business' generic wins? (2) Ask them to describe a service offering in concrete terms — what's included, what's the deliverable, what's the cost range, what's the timeline. Real services have specific answers; marketing fluff doesn't survive direct questions.
**Q: Do AI consultants work with small businesses or only enterprises?**
A: Both, but they tend to specialize. Boutique consultants and solo specialists actively work with small businesses — they price for SMB ($10K-$80K projects) and ship in 4-12 weeks. Big 4 firms generally don't take SMB work; their cost structures require $250K+ engagements. If you're a small business, the right consultants are independents or 3-15 person firms — not Accenture or Deloitte. See AI consultant for small business.
---
## AI Automation Consultant: What They Cost in 2026
- **URL:** https://justinmckelvey.com/blog/ai-automation-consultant
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** AI for Business
- **Reading time:** 5 min
- **Description:** AI automation consultants build LLM-powered workflows that replace human tasks. 2026 pricing ($150-$500/hr), top use cases, when they pay back in 60 days.
Quick Answer (What They Cost + ROI)
AI automation consultants build LLM-powered workflows that replace human tasks — customer support routing, sales lead enrichment, document processing, content generation. 2026 rates: $150-$500/hr, $15K-$80K per workflow (4-8 weeks). The most valuable use cases pay back in 60-90 days: support automation cuts 40-70% of tier-1 tickets; sales ops saves 5-15 hours/week. The CPC on this keyword ($58) is one of the highest in AI consulting — measurable ROI attracts sophisticated buyers.
Based on active AI automation work for founders + ops leaders · June 2026 · Author: Justin McKelvey, fractional CTO
Key Stats (June 2026)
• Hourly rates: $150-$500/hr (senior specialists $300-$500)
• Single workflow: $15K-$80K (4-8 weeks discovery + build + deploy)
• Multi-workflow program: $50K-$200K (12-26 weeks)
• Retainer: $5K-$15K/month for ongoing work
• Highest-ROI use cases: Support automation (40-70% ticket reduction), sales ops, document processing
• Typical payback: 60-90 days after deployment for well-scoped workflows
• Market signal: $58 CPC, +129% YoY search growth — buyer demand is exploding
TL;DR: AI Automation Consultants in 2026
An AI automation consultant builds workflows that use LLMs to replace tasks humans used to do: customer support routing, sales lead enrichment, document processing, internal ops automations. The deliverable is shipped automation in production, not a slide deck or a roadmap.
The market for AI automation grew faster than almost any other AI consulting niche in 2025-2026 because the ROI is measurable. You can quantify support ticket volume reduction, sales ops time saved, document processing speed improvements. That measurability is why CPC on this keyword runs $58 — buyers know what they're paying for.
I'm a fractional CTO who builds AI automations for founders and ops leaders. This guide is the honest version: what these consultants actually do, what they cost, and which workflows actually return on the investment fast.
What an AI Automation Consultant Actually Does
The work breaks into four categories:
1. Customer support automation. AI agents that handle tier-1 questions, classify intent, draft personalized responses, route the complex cases to humans. Typical impact: 40-70% reduction in tier-1 ticket volume. Stack: Claude or GPT for understanding + drafting, integrations with Zendesk/Intercom/Front, monitoring for accuracy.
2. Sales operations automation. Lead enrichment from natural language sources, AI-powered lead scoring, intelligent routing based on inbound content. Typical impact: 5-15 hours/week saved for sales ops teams. Stack: AI for classification + drafting, integrations with Salesforce/HubSpot/Apollo, custom scoring models.
3. Document processing pipelines. Invoice parsing, contract analysis, form processing, financial document understanding. AI replaces OCR + rules with much more flexible understanding. Typical impact: 60-90% time reduction for high-volume document workflows. Stack: vision-capable models (Claude, GPT-4V), document extraction APIs, downstream integrations.
4. Internal workflow automations. Cross-tool orchestration that does what Zapier can't — natural language decisions, content generation, classification of unstructured data. Examples: AI that drafts meeting notes from Slack threads, AI that categorizes incoming email and routes to right team, AI that generates summaries of daily company activity.
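The shared pattern behind most of these categories is "classify, then either handle automatically or hand off to a human." Here's a minimal sketch of that routing step. Everything in it is illustrative: the intent names, the confidence threshold, and the `keyword_classifier` stand-in (a real build would call an LLM there) are my assumptions, not a specific vendor's API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Classification:
    intent: str
    confidence: float


# Any function with this shape can be plugged in, including an LLM-backed one.
Classifier = Callable[[str], Classification]

# Hypothetical tier-1 intents that are safe to answer automatically.
AUTO_HANDLED = {"password_reset", "order_status"}


def keyword_classifier(text: str) -> Classification:
    """Stand-in for an LLM call; keyword matching only, so the sketch runs offline."""
    lowered = text.lower()
    if "password" in lowered:
        return Classification("password_reset", 0.95)
    if "where is my order" in lowered or "tracking" in lowered:
        return Classification("order_status", 0.90)
    return Classification("unknown", 0.30)


def route(text: str, classify: Classifier, min_confidence: float = 0.8) -> str:
    """Auto-handle only known intents with high confidence; everything else goes to a person."""
    result = classify(text)
    if result.intent in AUTO_HANDLED and result.confidence >= min_confidence:
        return f"auto:{result.intent}"
    return "human"


print(route("I forgot my password", keyword_classifier))            # auto:password_reset
print(route("My invoice is wrong, get me a manager", keyword_classifier))  # human
```

The design choice worth copying is the default: anything the classifier is unsure about falls through to a human, so a bad classification costs you a handoff rather than a wrong automated reply.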
The CPC Tells You the Real Story
"AI automation consultant" searches have an average cost-per-click of $58, one of the highest in the entire AI consulting search universe. Why? Because the buyers are sophisticated and the ROI is measurable.
Unlike "AI strategy" engagements where ROI is theoretical, AI automation projects have clear before/after metrics: support ticket volume, sales response time, document processing speed. Buyers can do the math themselves: "If we automate 50% of tier-1 tickets and each ticket costs us $8 to handle, that's $X savings per month, payback in Y months." That clarity attracts both serious buyers and competitive consultants.
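Here's that buyer math as a minimal sketch. The 50% automation share and $8 per ticket come from the example above; the ticket volume, project cost, and monthly running cost are assumed numbers I picked for illustration, so swap in your own.

```python
def monthly_net_savings(
    tickets_per_month: int,
    automated_share: float,
    cost_per_ticket: float,
    monthly_run_cost: float = 0.0,
) -> float:
    """Gross handling cost avoided, minus what the automation costs to run."""
    gross = tickets_per_month * automated_share * cost_per_ticket
    return gross - monthly_run_cost


def payback_months(project_cost: float, net_savings_per_month: float) -> float:
    """Months until cumulative net savings cover the one-time project cost."""
    if net_savings_per_month <= 0:
        raise ValueError("no payback: net monthly savings must be positive")
    return project_cost / net_savings_per_month


# Assumed inputs: 4,000 tickets/month, $1,000/month to run, $45,000 project.
savings = monthly_net_savings(4_000, 0.50, 8.0, monthly_run_cost=1_000.0)
print(savings)                          # 15000.0
print(payback_months(45_000, savings))  # 3.0
```

If the function raises instead of returning a number, the workflow costs more to run than it saves, which is exactly the case you want to catch before signing anything.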
The 3 Workflows With Fastest ROI
If you're considering AI automation but don't know where to start, these three consistently pay back fastest, typically within two to six months:

| Workflow | Typical impact | Project cost | Payback |
| --- | --- | --- | --- |
| Tier-1 support automation | 40-70% ticket volume reduction | $30K-$60K | 2-4 months |
| Sales lead routing + enrichment | 5-15 hours/week saved | $20K-$40K | 3-6 months |
| High-volume document processing | 60-90% processing time reduction | $25K-$80K | 2-5 months |

The math is honest: a well-scoped pilot in any of these three will pay for itself in under a year. The risk is in scoping — pick the wrong workflow or skip the metrics setup and you'll have an automation running with no way to measure what it's worth.
AI Automation vs RPA: What's the Difference?
RPA (Robotic Process Automation; tools like UiPath, Automation Anywhere, and Blue Prism) automates UI-level tasks with deterministic rules. Click here, copy that, paste there. It works great for highly repeatable processes with predictable inputs.
AI automation uses LLMs to make decisions, parse unstructured data, and handle edge cases that break rule-based RPA. It works great where inputs are messy or decisions are nuanced.
The 2026 reality: serious automation work combines both. RPA for deterministic UI steps that don't need intelligence; AI for classification, decisions, and natural language tasks. A good AI automation consultant knows when to reach for which tool.
How to Vet an AI Automation Consultant
Three filtering questions:
1. "Show me a workflow you automated in the last 6 months." Inputs, outputs, measured impact. The shorter the time since the work, the more relevant — AI capabilities change fast.
2. "What's your typical cost monitoring and error-handling approach?" Automation that silently fails is worse than no automation. A serious consultant has specific answers about LangSmith, Helicone, custom logging, threshold alerts, and human-in-the-loop fallbacks.
3. "Who owns the workflow at engagement end, and how does my team maintain it?" You should own the code and the data. Your team should be able to understand what's running. Watch out for consultants who want to retain operational control via a managed service fee.
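To make question 2 concrete, here's a minimal sketch of what "threshold alerts plus a human-in-the-loop fallback" can look like in plain Python. It's a stand-in for what tools like LangSmith or Helicone plus custom logging would do, not their APIs; the class name, thresholds, and per-call cost figures are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WorkflowMonitor:
    """Tracks spend and failures for one automated step (illustrative only)."""

    budget_usd: float
    max_failure_rate: float
    spend_usd: float = 0.0
    calls: int = 0
    failures: int = 0
    alerts: List[str] = field(default_factory=list)

    def run(self, step: Callable[[str], str], payload: str, cost_usd: float) -> str:
        """Run a step; on any error, record it and hand the item to a human."""
        self.calls += 1
        self.spend_usd += cost_usd
        try:
            result = step(payload)
        except Exception as exc:  # never fail silently
            self.failures += 1
            self.alerts.append(f"step failed: {exc}")
            result = "escalated_to_human"
        self._check_thresholds()
        return result

    def _check_thresholds(self) -> None:
        if self.spend_usd > self.budget_usd:
            self.alerts.append("budget exceeded")
        if self.failures / self.calls > self.max_failure_rate:
            self.alerts.append("failure rate exceeded")


monitor = WorkflowMonitor(budget_usd=1.0, max_failure_rate=0.5)
print(monitor.run(str.upper, "refund request", cost_usd=0.4))  # REFUND REQUEST
```

In a real deployment the `alerts` list would feed a pager or Slack channel. The point to listen for in a consultant's answer is the same as in the sketch: every failure is recorded, and every failed item lands with a person instead of disappearing.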
When You DON'T Need an AI Automation Consultant
Skip the consultant if:
• Zapier + GPT integrations cover your use case. If you can build it in Zapier or Make in a weekend, you don't need a consultant.
• The workflow you want to automate isn't a real bottleneck. If it's not costing you 10+ hours/week, the payback math doesn't work.
• You don't have clean data. AI automations multiply your data problems. Fix data quality first or the automation will surface every inconsistency at scale.
• The use case requires regulatory clearance you don't have. Healthcare, finance, and legal AI automations have compliance requirements that need careful design — sometimes you need a compliance specialist before an automation consultant.
The Bottom Line
AI automation consulting is one of the highest-ROI corners of the AI services market in 2026. The work is measurable, the technologies are mature enough to ship, and the cost-per-click on these searches ($58) reflects how serious the buyer market is.
For most businesses, the right first move is a 6-week paid pilot ($15K-$40K) on a single workflow with clear before/after metrics. Either the workflow proves the ROI (and you commit to a 90-day implementation), or you learn cheaply that this specific use case isn't ready.
If you're trying to figure out which workflow to automate first, book a free 15-min strategy call. I'll give you a specific recommendation based on your operations + tooling in 10 minutes. No pitch.
Related reading: AI implementation consultant (for product-facing AI features), AI consultant overview, AI consultant companies, build vs buy AI.
### Frequently Asked Questions
**Q: What does an AI automation consultant do?**
A: An AI automation consultant builds workflow automations that use LLMs (Claude, GPT, Gemini) to replace tasks previously done by humans. Common deliverables: AI-powered customer support flows, sales lead routing and enrichment, document processing pipelines, content generation systems, and internal workflow automations across CRM/ERP/email/Slack. The work is more code-heavy than traditional RPA and more business-process-heavy than pure AI implementation.
**Q: How much does an AI automation consultant cost in 2026?**
A: Hourly rates run $150-$500/hr depending on seniority. Project-based engagements: $15K-$80K for a single workflow automation (4-8 weeks). Multi-workflow programs: $50K-$200K. Retainers: $5K-$15K/month for ongoing automation work. The CPC for this keyword averages $58 — one of the highest in the AI consulting space — because the ROI is measurable and buyers are sophisticated.
**Q: What's the difference between AI automation and RPA?**
A: RPA (Robotic Process Automation, like UiPath or Automation Anywhere) automates UI-level tasks with deterministic rules — click here, copy that, paste there. AI automation uses LLMs to make decisions, parse unstructured data, and handle edge cases that break rule-based RPA. The 2026 reality: most serious automation work combines both — RPA for deterministic UI steps, AI for decisions, classification, and natural language tasks.
**Q: What workflows are best for AI automation?**
A: Three categories consistently deliver fast ROI: (1) Customer support — first-response AI agents that handle tier-1 questions, route the rest. (2) Sales operations — lead enrichment, scoring, and routing with AI rather than rules. (3) Document processing — invoices, contracts, forms parsed by AI then routed to the right system. Each of these has clear measurable inputs and outputs, which is why they pay back fastest.
**Q: How long does an AI automation project take?**
A: Single-workflow automations: 4-8 weeks (discovery, build, test with real data, deploy, monitor). Multi-workflow programs: 12-26 weeks. Anyone promising "full AI automation" in 30 days is selling shovels. Anyone planning 12+ months without checkpoints is bleeding billable hours. Best practice: ship one workflow in 6 weeks as a paid pilot, then commit to the next 2-3 based on measured ROI.
**Q: How is an AI automation consultant different from an AI implementation consultant?**
A: Overlapping but distinct. AI implementation consultants build AI features INTO your product (chatbots, summarization, AI-powered features your users see). AI automation consultants build AI workflows AROUND your operations (back-office automations, internal tools, ops efficiency). Many specialists do both. Implementation is usually customer-facing; automation is usually internal.
**Q: Do I need an AI automation consultant or just Zapier?**
A: Zapier handles deterministic if-this-then-that flows great. AI automation consultants are worth hiring when your workflows need: classification of unstructured inputs ("is this a refund request, a complaint, or a sales inquiry?"), natural language responses (drafting personalized emails), document understanding (parsing invoices, contracts), or multi-step decisions an LLM can make better than rigid rules. If Zapier + GPT integrations cover your use case, start there. If you hit complexity walls, hire a consultant.
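As a rough illustration of the "classification of unstructured inputs" case, here is a minimal Python sketch of a message router. The labels come from the example question above; the queue names, the `route` function, and the `keyword_classifier` stand-in are all hypothetical. In a real build, the classifier argument would be a call to an LLM, and the keyword version below is only a placeholder so the sketch runs on its own.

```python
from typing import Callable

# Hypothetical destination queues; real names depend on your helpdesk or CRM.
QUEUES = {
    "refund_request": "billing",
    "complaint": "support_escalation",
    "sales_inquiry": "sales",
}


def keyword_classifier(text: str) -> str:
    """Placeholder for an LLM call. Returns a label or 'unknown'."""
    lowered = text.lower()
    if "refund" in lowered or "money back" in lowered:
        return "refund_request"
    if "pricing" in lowered or "demo" in lowered:
        return "sales_inquiry"
    if "disappointed" in lowered or "broken" in lowered:
        return "complaint"
    return "unknown"


def route(text: str, classify: Callable[[str], str] = keyword_classifier) -> str:
    """Return the queue for a message; unrecognized labels go to a human."""
    label = classify(text)
    return QUEUES.get(label, "human_review")


if __name__ == "__main__":
    for message in ("I want a refund for last month", "Can I get a demo?", "hello"):
        print(message, "->", route(message))
```

The design point is the fallback: any label the router does not recognize lands in a human review queue rather than being dropped, so the classifier can be swapped without changing the routing logic.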
**Q: What ROI should I expect from AI automation?**
A: Typical ranges from real engagements in 2026: customer support automation saves 40-70% of tier-1 ticket volume. Sales lead routing saves 5-15 hours/week per sales ops team. Document processing automations save 60-90% on processing time for high-volume document workflows. Most projects pay back the consultant fee within 60-90 days of deployment. Watch for any consultant promising specific ROI figures upfront before discovery — they're either lucky or lying.
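The payback claim can be sanity-checked with simple arithmetic. The sketch below is a hypothetical helper, not a pricing model: the $15K fee is the low end of the single-workflow range quoted earlier, 15 hours/week is the top of the sales ops range above, and the $100 loaded hourly cost is an assumption you should replace with your own number.

```python
def payback_days(fee: float, hours_saved_per_week: float, loaded_hourly_cost: float) -> float:
    """Days until cumulative labor savings equal the consultant fee."""
    weekly_savings = hours_saved_per_week * loaded_hourly_cost
    if weekly_savings <= 0:
        raise ValueError("weekly savings must be positive")
    return fee / weekly_savings * 7


if __name__ == "__main__":
    # Assumed inputs: $15,000 fee, 15 hours/week saved, $100/hour loaded cost.
    print(payback_days(15_000, 15, 100))  # 70.0
```

With those inputs the fee is recovered in 70 days. Halving the hours saved doubles the payback period, which is why the before/after measurement matters more than the headline percentage.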
**Q: How do I vet an AI automation consultant?**
A: Three questions that filter most pretenders: (1) Show me a workflow you automated in the last 6 months: the inputs, the outputs, the measured impact. (2) What's your typical cost monitoring and error-handling approach? (Automation that silently fails is worse than no automation.) (3) Who owns the workflow at engagement end, and how does my team maintain it? Real automation consultants have specific answers to all three. Generic answers = limited production experience.
**Q: When should I hire an AI automation consultant vs build it in-house?**
A: Hire a consultant when: you have a clear high-value workflow to automate, you need it shipped in under 90 days, you don't have an engineering team with AI/LLM experience, or you want to validate ROI before committing internal headcount. Build in-house when: you have ongoing automation work (multiple workflows per quarter), the automations involve sensitive data that can't leave your environment, or you have an engineering team with capacity. Common pattern: consultant ships V1-V3, internal team takes over for V4+.
---
## AI Consultant Companies: How to Vet One in 2026
- **URL:** https://justinmckelvey.com/blog/ai-consultant-companies
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** AI for Business
- **Reading time:** 7 min
- **Description:** AI consultant companies in 2026: $25K (boutique) to $500K+ (Big 4). How to vet, vetting questions that filter out the bad ones, pricing, and 4 red flags.
Quick Answer (Who Ships vs Who Sells)
AI consultant companies in 2026 range from solo specialists ($10K-$150K) to Big 4 firms ($500K-$5M+). The honest truth: implementation boutiques ship faster and cost 2-5x less than Big 4 firms for the same work, while Big 4 firms win for Fortune 500 procurement and enterprise change management. For most businesses under $50M revenue, a 3-15 person boutique beats both extremes. The single best vetting question: "Show me 3 production AI features you've shipped in the last 6 months." If they fumble it, they're a strategy firm regardless of marketing.
Based on AI consulting buyer-side experience + active fractional CTO work · June 2026 · Author: Justin McKelvey
Key Stats (June 2026)
• Solo specialist: $10K-$150K per project, $150-$600/hr
• Boutique firm (3-15 people): $25K-$200K projects, $5K-$25K/mo retainers
• Mid-market firm (15-100 people): $100K-$500K per project
• Big 4 / strategy firms: $500K-$5M+ for AI transformation programs
• Sweet spot for <$50M businesses: Boutique firms (2-5x cheaper than Big 4, same senior people)
• Search trend: "ai consultant company" +132% quarterly — buyers are actively shopping
• Avg CPC: $37 — high commercial intent, real buyer market
TL;DR: AI Consultant Companies in 2026
The AI consulting market in 2026 is split into four real tiers, and matching your business to the right tier is the single biggest factor in whether you get value or waste $200K+ on slideware.
Big 4 and other large consultancies (Accenture, Deloitte, IBM Consulting, EY, KPMG) and strategy firms (McKinsey, BCG, Bain) optimize for Fortune 500 procurement. They're not bad; they're just expensive and slow for most businesses. Boutique implementation firms (3-15 people) optimize for shipping. They're the sweet spot for most businesses. Solo consultants optimize for senior expertise at the lowest possible price. Mid-market firms are everything in between.
I'm a fractional CTO who runs an implementation boutique and advises clients on hiring AI consultant companies. This guide is the honest version — written by someone in the market, not someone selling content marketing for a $500K consulting engagement.
The Four Tiers of AI Consultant Companies
| Tier | Size | Typical project cost | Best for | Avoid for |
|---|---|---|---|---|
| Solo specialists | 1 person | $10K-$150K | Time-bound projects, validating ideas, technical depth | Multiple parallel workstreams, Fortune 500 procurement |
| Boutiques | 3-15 people | $25K-$200K | SMB to mid-market shipping real AI features | Companies needing 50+ consultants on a program |
| Mid-market firms | 15-100 people | $100K-$500K | Mid-market with multiple parallel initiatives | Single-feature pilots (overkill) |
| Big 4 / strategy firms | 1,000+ people | $500K-$5M+ | Fortune 500 procurement, change management at scale | Speed, cost, hands-on implementation |
What AI Consultant Companies Actually Deliver
The biggest mistake businesses make is conflating "AI strategy" with "AI implementation." They're different services from different firms, and you usually need the second one, not the first.
Strategy-focused firms identify AI opportunities, build ROI models, design roadmaps, run workshops, present to boards. Deliverable: a 60-page deck. Cost: $150K-$2M. Useful for: Fortune 500 change management; cases where the board needs convincing before any technical work starts.
Implementation-focused firms build and deploy working AI features into your product or operations. Deliverable: shipped code in production. Cost: $25K-$500K per project. Useful for: anyone who wants AI to actually do something in their business.
Most businesses don't need strategy work. They need implementation. If you can describe what you want AI to do for you ("automate Tier 1 support tickets," "summarize sales call transcripts," "rank inbound leads by likelihood-to-close") — you don't need strategy. You need implementation.
How to Vet an AI Consultant Company
Five questions that work as a real filter. Ask all five, in order, to anyone you're seriously considering:
1. "Show me 3 AI features you've shipped to production in the last 6 months." Specific, recent, in production. The 6-month requirement is critical — AI capabilities change fast, and case studies from 2 years ago might describe work the platform has since obsoleted. If they can only show old work or strategy decks, they're either inexperienced or coasting on past wins.
2. "Who from your team will actually work on my project?" Get names, get LinkedIn URLs, get years of experience. At Big 4 firms, the sales team is senior partners; the work is done by 2-year consultants. Confirm the senior people stay engaged. If they can't promise specific people, the answer is "we'll staff junior" — which means you'll pay senior rates for junior output.
3. "What's your typical pricing for a 90-day pilot vs full implementation?" Real implementers can answer in 30 seconds with a range. Anyone who needs to "scope it out" and follow up with a custom proposal is buying time to figure out what you'll tolerate paying.
4. "Who owns the code and the data at engagement end?" Should be: you. Watch for firms that retain IP, build "platforms" you license back, or create dependencies that require ongoing fees. The phrase "we'll maintain the system for you on a retainer" is the warning sign.
5. "Show me a case study where the engagement didn't go as planned. What happened and how did you handle it?" This is the best filter question. Anyone polished enough to be lying will fumble it. Real practitioners will tell you a specific story (a model that underperformed expectations, a data quality issue, a scope creep that derailed timeline) with enough specifics that you can tell they lived it.
Red Flags to Avoid
• Anonymous case studies. Real firms have permission to name their best clients. Anonymous work usually means either (a) the work didn't go well and the client doesn't want public credit, or (b) the case study is partly fabricated.
• Opaque pricing. A real firm can give you a ballpark in the first conversation. Anyone who requires multiple discovery calls before pricing is anchoring you on premium rates.
• $50K+ upfront before code. Pay for outcomes, not promises. Ethical firms offer a paid pilot ($5-15K, 2-3 weeks) before committing to larger engagements.
• "AI transformation" language without specifics. If they can't name the LLMs, integration patterns, or monitoring stacks they'd use, they're selling slideware. Listen for "Claude Sonnet 4.5 with function calling" not "best-in-class large language model."
• Decks more polished than code samples. Polish indicates where the firm invests. If sales materials look better than technical artifacts, sales is the product.
Boutique vs Big 4: When Each Wins
Pick a boutique when: You have a specific AI feature in mind. You want to ship in 4-12 weeks. You're under $50M revenue. You care about cost. You want the senior people who pitched you to stay on the project.
Pick a Big 4 when: Your procurement department won't approve smaller firms (this is the most common real reason). You need a 100+ person team on the engagement. You're undertaking enterprise-wide AI transformation across many business units. You need globally distributed implementation. You need the brand name for board credibility.
Almost everything else falls in favor of boutiques. The economics are clear: a 5-person boutique with 15-year senior leads charges $30K-$80K for work that a Big 4 firm will quote at $150K-$400K — same actual implementation, same senior expertise on the boutique side, junior implementers on the Big 4 side.
The Boring Math: Big 4 Pricing Anatomy
If you're considering a Big 4 firm, understand what you're paying for:

| Cost component | Approximate share of fee |
|---|---|
| Junior consultant time (the actual work) | ~30% |
| Senior partner / engagement manager oversight | ~15% |
| Sales + business development | ~15% |
| Firm overhead (real estate, benefits, training) | ~25% |
| Profit margin | ~15% |
You're paying for the brand, the procurement compatibility, and the enterprise-grade processes. The actual implementation work is 30% of your fee — done by junior consultants. A boutique cuts the sales, overhead, and brand premium; passes the savings on.
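To make the shares concrete, the sketch below applies them to a single quote. The percentages are the approximate ones listed above; the `fee_breakdown` helper and the $300K example fee (inside the $150K-$400K Big 4 quote range mentioned earlier) are illustrative assumptions, not figures from any actual proposal.

```python
# Approximate fee shares as listed above; treated here as exact for arithmetic.
FEE_SHARES = {
    "junior_consultant_time": 0.30,
    "senior_oversight": 0.15,
    "sales_and_business_development": 0.15,
    "firm_overhead": 0.25,
    "profit_margin": 0.15,
}


def fee_breakdown(total_fee: float) -> dict[str, float]:
    """Split a total fee into dollar amounts per cost component."""
    return {name: round(total_fee * share, 2) for name, share in FEE_SHARES.items()}


if __name__ == "__main__":
    # Hypothetical $300K engagement.
    for component, dollars in fee_breakdown(300_000).items():
        print(f"{component}: ${dollars:,.0f}")
```

On a hypothetical $300K engagement, roughly $90K goes to the people doing the implementation work, which is the number to compare against a boutique quote.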
What a Good Engagement Actually Looks Like
Whichever tier you pick, a healthy engagement has these markers:
• Days 1-14: Discovery + scoping. Decision: go/no-go on the larger engagement based on a clear pilot scope.
• Days 15-60: Build. Iterate. Real users testing by week 4-5.
• Days 61-90: Ship to production. Monitor. Handoff. Your team can maintain it.
• Day 90 deliverable: One shipped AI feature + the playbook to ship more.
• Cost: $25K-$80K depending on scope.
Anyone planning multi-quarter engagements without checkpoints — or anyone whose first deliverable is "the AI strategy document" — is building a relationship that bleeds you slowly.
The Bottom Line
For most businesses in 2026, the right AI consultant company is a boutique with 3-15 people, a track record of recent shipped work, transparent pricing, and the people you talked to staying on your project. Not a Big 4 firm (overpaid). Not a solo consultant (under-resourced for many engagements). Not an "AI transformation" specialist (sells decks).
If you're vetting firms right now, use the 5 questions above. The fifth one is the best filter — and the one most firms will not be ready for.
If you want a second opinion on a specific firm you're considering, or you're trying to figure out which tier fits your situation, book a free 15-min strategy call. I'll give you a specific recommendation in 10 minutes. No pitch.
Related reading: AI consultant: the broader role, firm vs solo, AI implementation consultant, AI strategy consultant, Chief AI Officer vs Fractional CTO.
### Frequently Asked Questions
**Q: What are AI consultant companies?**
A: AI consultant companies are firms that help businesses identify, build, and deploy AI features. They range from independent solo consultants (specialized, $10K-$150K projects), to boutique 3-15 person shops ($25K-$200K projects), to mid-market firms ($100K-$500K), to Big 4 / strategy firms (McKinsey, Accenture, Deloitte, IBM Consulting) charging $500K-$5M+. Different sizes optimize for different things: boutiques ship code; Big 4 firms ship roadmaps.
**Q: What do AI consultant companies actually do?**
A: Depends entirely on the firm. Implementation-focused firms build and deploy working AI features into your product or operations. Strategy-focused firms identify opportunities and write playbooks but don't ship code. The biggest mistake companies make is hiring a strategy firm when they need implementation work. Always ask: 'Show me an AI feature you shipped to production in the last 6 months.' If they can't answer with specifics, they're a strategy firm regardless of what their website says.
**Q: How much do AI consultant companies cost in 2026?**
A: Pricing varies dramatically by firm size and engagement type. Solo specialists: $150-$600/hr or $10K-$150K project. Boutique firms (3-15 people): $25K-$200K projects, $5K-$25K/mo retainers. Mid-market firms (15-100 people): $100K-$500K projects. Big 4 / strategy firms: $500K-$5M+ for AI transformation engagements. For most businesses under $50M revenue, boutiques deliver better value than Big 4 firms — same senior people, half the price.
**Q: Should I hire a Big 4 firm or a boutique AI consultant company?**
A: Boutique, in most cases. Big 4 firms (Accenture, Deloitte, IBM Consulting, plus McKinsey-style strategy firms) excel at: enterprise procurement requirements, change management at Fortune 500 scale, and credentials. They struggle at: actually shipping working AI features fast, keeping senior people on your engagement (most work flows to junior consultants), and pricing transparency. Boutique firms ship faster, cost 2-5x less, and the senior people stay on your project. For non-Fortune-500 businesses, boutiques almost always win.
**Q: What's the difference between an AI consultant company and an AI consulting firm?**
A: Same thing, different word. The terms are used interchangeably. Both refer to organizations that provide AI advisory and implementation services to businesses. Some firms self-identify with subtle distinctions ("company" = implementation-heavy; "firm" = strategy-heavy) but it's marketing, not a real industry distinction.
**Q: How do I vet an AI consultant company before hiring?**
A: Five questions that filter out the bad ones: (1) Show me 3 production AI features you've shipped in the last 6 months — for whom, what they do, what they cost. (2) Who from your team will actually work on my project, and what's their track record? (3) What's your typical pricing for a 90-day pilot vs full implementation? (4) Who owns the code and the data at engagement end? (5) Show me a case study where the engagement didn't go as planned — what happened. The 5th question is the best filter — anyone polished enough to be lying will fumble it.
**Q: What are the red flags when picking an AI consultant company?**
A: Four real ones: (1) They lead with roadmaps and frameworks instead of shipped features. (2) Their case studies are anonymous and lack specific metrics. (3) Pricing is opaque — you can't get a ballpark without a 60-minute discovery call. (4) The pitch deck is more polished than the technical samples. If you hear "AI transformation," "strategic alignment," or "holistic approach" before they name specific LLMs, integration patterns, or production systems — they're selling vibes.
**Q: Which AI consultant companies should I avoid?**
A: Avoid: any firm that requires $50K+ upfront before producing any code (good firms offer paid pilots); any firm that retains code IP at end of engagement (you should own it); any firm whose primary case studies are 'AI strategy' decks; any firm that hasn't shipped production AI in the last 6 months. Don't worry about firm size — there are great boutiques and great enterprise firms. Worry about whether they ship.
**Q: Should small businesses use AI consultant companies or solo consultants?**
A: Solo or 2-3 person teams, in most cases. For businesses under $10M revenue, a Big 4 firm is wildly overpriced: you're paying senior partner rates for junior implementers. A senior solo consultant or small team delivers the same actual implementation work at 30-50% of the cost. The downside of solo: less capacity, possibly slower if you need parallel workstreams. The upside: senior person on your project from day one. See AI consulting firm vs solo consultant for the full breakdown.
**Q: How long should an AI consultant company engagement take?**
A: Most pilots take 4-8 weeks (one AI feature shipped). Full implementations run 8-16 weeks. Multi-feature platform builds run 16-26 weeks. Anyone promising "AI transformation" in 30 days is selling shovels; anyone planning 12+ months without checkpoints is bleeding you with billable hours. The sweet spot for most businesses: 4-6 week paid pilot first, then 90-day implementation contract if the pilot proves the use case.
---
## Railway vs Vercel 2026: Why I Picked Railway
- **URL:** https://justinmckelvey.com/blog/railway-vs-vercel
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 6 min
- **Description:** Railway vs Vercel: 3-person SaaS team pays $60-150/mo on Vercel, $15-45 on Railway. Frontend vs full-stack. Why I picked Railway for this site.
Quick Answer (The Verdict)
Vercel wins for Next.js frontends, edge performance, and preview deployments. Railway wins for full-stack apps with databases, predictable usage-based pricing, and anything containerized. A 3-person SaaS team typically pays $60–$150/mo on Vercel; the same workload runs $15–$45/mo on Railway. Many teams use both — Vercel for frontend, Railway for backend + database. I deploy this consultancy site (justinmckelvey.com) on Railway because it's a Rails + SQLite app that doesn't fit Vercel's serverless model.
Based on production deployment of justinmckelvey.com on Railway + client advisory work on both platforms · June 2026 · Author: Justin McKelvey, fractional CTO
Key Stats (June 2026)
• Railway Hobby: $5/mo + $5 included credit (usage-based above)
• Vercel Hobby: Free with bandwidth + serverless limits
• Railway Pro: $20/mo per member + usage
• Vercel Pro: $20/mo per member + bandwidth/function overages
• 3-person SaaS team cost: Vercel $60-150/mo vs Railway $15-45/mo (3-5x difference)
• Pricing model — Vercel: Per-request (serverless invocations, bandwidth)
• Pricing model — Railway: Per-resource (CPU, memory, network)
• Search trend: "railway vs vercel" +743% YoY — devs are actively making this comparison
TL;DR: Railway vs Vercel in 2026
Vercel is best-in-class for Next.js frontends, static sites, and any project where edge performance is the differentiator. Their preview deployments per pull request, automatic image optimization, edge middleware, and global CDN are genuinely the best available. The catch: pricing punishes successful apps. Bandwidth overages, serverless function invocations, and Edge Middleware all scale with traffic.
Railway is best for full-stack SaaS apps with databases. You get persistent containerized services, native database provisioning (Postgres, MySQL, Redis, MongoDB), and usage-based pricing that's typically 3-5x cheaper than Vercel for the same workload. The trade-off: you lose Vercel's edge magic.
I'm a fractional CTO who actually uses Railway in production — this site (justinmckelvey.com) is a Rails 8 app on Railway with SQLite via Litestream backup. The honest verdict: pick the right tool for the layer. Many teams in 2026 use both.
The Architectural Difference
Understanding why these two platforms exist requires understanding what they're optimizing for.
Vercel is built around serverless + edge. Your code runs in many short-lived function instances close to users globally. The CDN caches everything cacheable. The edge network handles middleware, redirects, and dynamic rendering at the network edge. This produces incredible frontend performance for sites where the bottleneck is "deliver pre-rendered HTML to a user near them as fast as possible."
Railway is built around containers + persistent services. Your code runs in dedicated environments — more like traditional VPS hosting with managed deployments. You get a Postgres instance, a Redis instance, a worker queue, and your app — all running 24/7 in the same project. This is the model that fits most real backend work: databases need persistent connections, queues need to run constantly, business logic happens between many requests.
Vercel is optimized for the frontend layer of an app. Railway is optimized for the full-stack layer of an app. Neither is "better" — they solve different problems.
The Cost Math (Where the 3-5x Gap Lives)
This is the question most teams care about. The honest answer with specific numbers:

| Workload | Vercel typical | Railway typical |
|---|---|---|
| Solo dev side project | Free (Hobby plan) | $5/mo (with $5 credit) |
| Small SaaS, low traffic | $20-40/mo Pro | $10-25/mo |
| 3-person team, moderate SaaS traffic | $60-150/mo | $15-45/mo |
| Growing app, high traffic + functions | $200-800/mo | $40-150/mo |
| Database costs (Postgres) | External (Supabase $25+/mo) | Included as Railway service |
| Redis / Queue costs | External (Upstash $10+/mo) | Included as Railway service |
The gap looks wider when you remember that Vercel doesn't include a database. By the time you wire up Vercel + Supabase + Upstash for a typical SaaS, you're comparing 3-4 invoices vs Railway's single bill.
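A quick way to see how the external services widen the gap is to add them to the Vercel side and recompute the ratio. The sketch below uses the 3-person-team ranges and the starting prices for Supabase and Upstash quoted above. It assumes the Railway range already covers app + Postgres + Redis and that the external services sit at their minimum price, so treat the output as a floor, not a forecast. The helper names are hypothetical.

```python
Range = tuple[float, float]  # (low, high) in USD per month

# Figures quoted above for a 3-person team with moderate SaaS traffic.
VERCEL_APP: Range = (60, 150)
RAILWAY_STACK: Range = (15, 45)  # assumed to include Postgres and Redis
SUPABASE_MIN = 25  # "$25+/mo"
UPSTASH_MIN = 10   # "$10+/mo"


def add_fixed(costs: Range, extra: float) -> Range:
    """Add a fixed monthly cost to both ends of a range."""
    return (costs[0] + extra, costs[1] + extra)


def ratio(a: Range, b: Range) -> Range:
    """Low-to-low and high-to-high cost ratio of a over b."""
    return (a[0] / b[0], a[1] / b[1])


if __name__ == "__main__":
    vercel_full = add_fixed(VERCEL_APP, SUPABASE_MIN + UPSTASH_MIN)
    print("App only:", ratio(VERCEL_APP, RAILWAY_STACK))
    print("With database + Redis:", vercel_full, ratio(vercel_full, RAILWAY_STACK))
```

Under these assumptions the app-only ratio is roughly 3.3-4x, and adding the minimum external service costs raises the Vercel side to $95-$185/mo before any overages.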
Why This Site Runs on Railway
justinmckelvey.com is a Rails 8 monolith with SQLite as the production database, backed up continuously to Cloudflare R2 via Litestream. Solid Queue runs background jobs, Solid Cache handles caching, Solid Cable handles WebSockets, all SQLite-backed. Total monthly cost: under $10.
This stack doesn't fit Vercel at all. Vercel's serverless model assumes stateless functions and external databases. SQLite needs a persistent file system. Solid Queue needs a process running 24/7. Litestream needs to stream WAL changes continuously to S3-compatible storage.
On Railway, the entire stack runs as one Docker container with a persistent volume. Deploy from GitHub, mount the volume at /app/storage, Litestream backs up to R2, done. The whole architecture would be impossible on Vercel without splitting it into 4-5 separate services.
For Rails apps, Django apps, FastAPI apps, Express apps with persistent state, or anything containerized — Railway is the natural fit. Vercel works for these (with adapter packages), but you fight the platform the whole way.
Where Vercel Wins Decisively
Vercel is the right answer for:
Next.js apps with heavy frontend rendering. Next.js was built by Vercel. The integration is so tight that some features (Image Optimization edge caching, Incremental Static Regeneration at the CDN, edge middleware) just work better on Vercel than anywhere else.
Marketing sites with global reach. A static or near-static site that needs to load fast for users in 50 countries — Vercel's edge network is genuinely the best at this. Sub-100ms TTFB worldwide is normal.
Preview deployments per pull request. Vercel's PR preview workflow is the best in the industry. Every PR gets a unique URL automatically; reviewing UI changes is incredibly smooth. Railway has previews too but they're not as polished.
Teams already deep in the Vercel ecosystem. If your team knows Vercel inside out, the switching cost matters. Sometimes the right answer is "use what your team is fast at."
Where Railway Wins Decisively
Railway is the right answer for:
Full-stack apps with persistent databases. Native Postgres, MySQL, Redis, MongoDB provisioning in seconds. No external services to wire up. One bill.
Containerized workloads. Docker-based apps deploy natively. Want a Node service, a Rails monolith, a Python worker queue, and a Postgres instance all in the same project? Railway, in 10 minutes.
Cost-conscious SaaS at scale. The 3-5x cost gap matters once you're past the Hobby tier. For an app at moderate traffic, you're saving $50-500/month vs Vercel for equivalent functionality.
Long-running processes. Background workers, message consumers, scheduled jobs, persistent WebSocket servers — all natural on Railway, all hostile on Vercel's serverless model.
Apps that need a real file system. SQLite-backed apps (like this one), file processing pipelines, any workload that mounts a persistent volume.
The Hybrid Pattern (Increasingly Common)
By mid-2026, the most common high-end pattern for new SaaS startups is Vercel for frontend, Railway for backend. Your Next.js or React frontend lives on Vercel's edge network for best perceived performance. Your Node.js/Rails/Python API + database + workers run on Railway for cost control and persistent service flexibility.
This is "use both tools for what they're best at." Slight operational overhead — two platforms, two billing accounts, two deploy flows. But you get world-class frontend performance and full-stack flexibility, often at a lower total cost than running everything on Vercel.
The Honest Bottom Line
For a typical 2026 SaaS startup:
• Pure frontend or marketing site: Vercel.
• Next.js SaaS with light backend: Vercel + Supabase, unless cost matters.
• Full-stack monolith (Rails, Django, Express): Railway.
• SaaS with serious backend complexity (queues, workers, databases): Railway.
• High-traffic with global users: Vercel for frontend, Railway for backend.
• Anything containerized: Railway.
If you're deploying a Rails app, a Django app, or anything with a real database — Railway will save you money and headaches. If you're deploying a Next.js app and don't mind paying for the edge experience — Vercel is genuinely the best.
Need help picking for your specific stack? Book a free strategy call — I'll give you a specific recommendation in 10 minutes. Or for the broader picture: how I replaced $2,200/yr in SaaS with a single Rails app covers the stack I actually use to run this consultancy.
### Frequently Asked Questions
**Q: Is Railway cheaper than Vercel?**
A: For full-stack apps with a database: significantly. A 3-person SaaS team with moderate traffic typically pays $60-150/month on Vercel; the same workload runs $15-45/month on Railway depending on resource consumption. The cost gap widens as your usage grows. Vercel's serverless function pricing and bandwidth overages punish successful apps; Railway's resource-based pricing scales more predictably.
**Q: Is Vercel better than Railway?**
A: Vercel is better for: Next.js frontends, static sites, edge computing, preview deployments per pull request, and any project where best-in-class frontend performance matters more than cost. Railway is better for: full-stack apps with persistent databases, backend services, containerized workloads, and SaaS where total cost matters more than edge performance. Many teams use both.
**Q: What's the main difference between Railway and Vercel?**
A: Architecture. Vercel is built around serverless functions and edge computing — your code runs in many short-lived instances near users worldwide. Railway is built around always-on containers — your code runs in dedicated environments more like traditional servers. Vercel optimizes for frontend performance; Railway optimizes for full-stack flexibility and predictable cost.
**Q: Can Railway host Next.js?**
A: Yes. Railway can deploy any Node.js app including Next.js, and it works well for most use cases. The trade-off vs Vercel: you lose some Next.js-specific optimizations (Image Optimization edge caching, edge middleware, automatic ISR at the CDN layer). For most Next.js apps under heavy traffic, Vercel's edge advantages justify the cost gap. For Next.js apps with modest traffic and database dependencies, Railway is often the better total package.
**Q: Why would a developer pick Railway over Vercel?**
A: Three reasons in order: (1) you need a database alongside your app (Vercel makes you bring your own; Railway provisions Postgres/Redis/MySQL in seconds), (2) you want predictable usage-based billing instead of per-request pricing that can spike, (3) you're shipping a containerized backend or full-stack monolith that doesn't fit Vercel's serverless model. Justin's consultancy site (justinmckelvey.com) runs on Railway specifically for reasons (1) and (3) — Rails + SQLite + Solid Queue.
**Q: Is Vercel only for frontends?**
A: Mostly. Vercel supports serverless functions for API routes and backend logic, but its sweet spot is frontend rendering (React, Next.js, SvelteKit, Astro) with light backend work. For heavy backend services, persistent connections, long-running processes, or anything needing a real database — Vercel forces you to wire up external services. Railway handles all of that natively in one platform.
**Q: Can I use Railway and Vercel together?**
A: Yes, and many teams do — it's becoming the most common pattern in 2026. Frontend on Vercel (best edge performance for users), backend services + database on Railway (best cost and developer experience for full-stack work). The two platforms work fine alongside each other; you just point your Vercel frontend at your Railway API URL.
**Q: How much does Railway cost in 2026?**
A: Railway's Hobby plan is $5/month with a $5 monthly credit included; you pay for compute, bandwidth, and storage usage above that. Most solo developers run small projects for $5-15/month total. The Pro plan is $20/month per member with team features. A SaaS with app + Postgres + Redis typically runs $15-45/month depending on traffic. Self-hosted databases push costs higher only if you're managing significant data.
**Q: How much does Vercel cost in 2026?**
A: Vercel's Hobby plan is free (with bandwidth and serverless function limits). Pro is $20/month per team member. The real costs come from overages: $0.15/GB beyond included bandwidth, additional serverless function execution time, Edge Middleware invocations, and Image Optimization beyond limits. A 3-person SaaS team with moderate traffic typically pays $60-150/month effective; growing apps regularly hit $500+/month.
**Q: Which platform is better for AI apps?**
A: Depends on the architecture. AI apps with long-running model calls, streaming responses, persistent agent state, or vector databases: Railway, because Vercel's serverless function timeouts and cold starts hurt LLM use cases. AI apps that are mostly frontend with thin API routes calling external LLM APIs: Vercel, because the edge advantage helps perceived latency. For most AI SaaS in 2026 — Railway for backend + database, Vercel for frontend.
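To make the pricing answers above concrete, here is a rough monthly-bill sketch. It uses only the figures quoted in this FAQ ($5 Hobby base with a $5 credit, $20 per Pro seat, $0.15/GB bandwidth overage). The function names and the `other_overages_usd` parameter are illustrative assumptions, and real invoices include line items this does not model.

```python
# Rough monthly-bill estimates built from the figures quoted in the FAQ above.
RAILWAY_HOBBY_BASE = 5.00        # $/month, includes a $5 usage credit
VERCEL_PRO_SEAT = 20.00          # $/month per team member
VERCEL_BANDWIDTH_OVERAGE = 0.15  # $/GB beyond included bandwidth


def railway_hobby_bill(usage_usd: float) -> float:
    """Hobby plan: the $5 base covers the first $5 of usage; the rest is billed on top."""
    return RAILWAY_HOBBY_BASE + max(0.0, usage_usd - RAILWAY_HOBBY_BASE)


def vercel_pro_bill(seats: int, overage_gb: float, other_overages_usd: float = 0.0) -> float:
    """Pro plan: per-seat fee, plus bandwidth overage, plus any other metered overages.

    other_overages_usd is a hypothetical catch-all for function time, middleware,
    and image optimization charges, which are not broken out here.
    """
    return seats * VERCEL_PRO_SEAT + overage_gb * VERCEL_BANDWIDTH_OVERAGE + other_overages_usd


if __name__ == "__main__":
    print(f"Railway Hobby, $12 of usage: ${railway_hobby_bill(12.0):.2f}")
    print(f"Vercel Pro, 3 seats, 200 GB over: ${vercel_pro_bill(3, 200):.2f}")
```

Under these assumptions a Hobby project with $12 of usage bills $12, and a 3-seat Pro team that runs 200 GB over its bandwidth allowance bills $90 before any other overages.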
---
## What Is Base44? The 2026 Guide (Post-Wix Acquisition)
- **URL:** https://justinmckelvey.com/blog/what-is-base44
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 5 min
- **Description:** Base44 is the AI app builder Wix bought for $80M in 2025. 2M+ users by 2026. How it works, what it costs, who it's for. Complete explainer.
Quick Answer (Definition)
Base44 is an AI app builder that generates complete full-stack web applications from a single text prompt — frontend, backend, database, authentication, and hosting all included on its managed platform. Wix acquired it for a reported $80 million in June 2025, six months after launch. By Q1 2026 the platform had 2 million users and $100M ARR. It uses Claude Sonnet 4 by default for code generation; higher plans unlock Claude Opus 4.5, Gemini 2.5/3 Pro, and GPT-5. Free tier exists but burns out fast; paid starts at $16/month.
Based on Base44 documentation + 8+ inherited Base44 apps in maintenance · June 2026 · Author: Justin McKelvey, fractional CTO
Key Facts (June 2026)
• What it is: AI app builder — prompt → full-stack deployed app
• Owned by: Wix (acquired June 2025, ~$80M, 6 months after launch)
• Scale: 2M+ users, $100M ARR by Q1 2026
• Default AI model: Claude Sonnet 4 (higher plans: Opus 4.5, Gemini 2.5/3 Pro, GPT-5)
• Stack: Managed Postgres + built-in auth + hosting + native integrations (Stripe, Slack, Google, OpenAI)
• Pricing: Free (25 credits) → Starter $16/mo → Builder $40/mo → Pro $80/mo
• Best for: Non-developers, internal tools, hackathons, idea validation
• Avoid for: Production apps that need to scale, complex business logic, strict compliance
The 90-Second Explainer
Base44 is an AI application generator. You type a description of the app you want (for example, "a meal planning tool where users save favorite recipes, get weekly grocery lists, and can share plans with family") and the AI generates a working full-stack web application. Database schema, authentication, frontend interface, business logic, and live URL deployment all happen in one pass.
The big difference from traditional no-code tools like Bubble, Glide, or Softr: Base44 generates real code on a managed Postgres backend, not a visual drag-and-drop builder on top of a proprietary engine. The big difference from open AI builders like Lovable or Bolt: Base44 bundles the entire infrastructure — you don't manage a separate Supabase project or Vercel deployment.
Founded in late 2024, Base44 grew fast enough that Wix acquired it in June 2025 for a reported $80 million — just six months after launch. By Q1 2026 the platform reached 2 million users and $100M annual recurring revenue. As of June 2026, it's the fastest-growing AI app builder on the market.
How Base44 Works (Step by Step)
Step 1: Describe the app. Open Base44, click "New App," type a description. Specificity helps: "an inventory management app for small bakeries with FIFO stock tracking and supplier reorder alerts" gets a better result than "a business app."
Step 2: Pick styling. Choose a design direction (minimal, claymorphism, glassmorphism, brutal, etc.) or describe one in natural language.
Step 3: Watch the AI build. A real-time build log shows progress: database schema being designed, frontend components generating, authentication wiring up, deployment provisioning. Total time: usually 2-5 minutes for the initial build.
Step 4: Test the live URL. The app exists at a real URL within minutes. You can sign up, create data, and see it persist — it's a real deployed app, not a static preview.
Step 5: Iterate via chat. Want changes? Tell the AI. "Add a settings page where users can change their notification preferences." The AI updates the code, redeploys, and you see the change in seconds.
Step 6: Tweak via visual editor. Colors, copy, layouts — you can change these directly in a visual editor without re-prompting, which saves credits and avoids the AI accidentally breaking other features.
What's Actually Included
Base44's value proposition is "all-in-one." Here's what that actually includes:
| Component | What you get |
| --- | --- |
| Database | Managed Postgres, schema generated by AI from your prompt |
| Authentication | Built-in (email, social providers, password reset) |
| File storage | Integrated cloud storage for uploads |
| Frontend | Generated React app with Tailwind styling |
| Hosting | Auto-deployed to a Base44 subdomain (custom domains on paid plans) |
| AI features | Built-in OpenAI and Anthropic integrations for AI-powered features |
| Integrations | Native: Stripe, Slack, Google Sheets/Drive, SendGrid, Twilio |
| Visual editor | Post-generation tweaks for colors, copy, layout (no credits required) |
What's NOT included: GitHub code export (in beta on most plans), advanced compliance certifications (no HIPAA, limited SOC 2 outside enterprise), or the ability to bring your own database.
Who Base44 Is For
Base44 wins for a specific set of builders:
Non-developers validating ideas. If you have a startup idea, no engineering team, and 30 days to test whether anyone cares — Base44 lets you ship something testable today instead of paying a developer $5,000+ for an MVP that might fail validation.
Operators building internal tools. Dashboards, admin panels, CRUD apps, team workflows — Base44 handles these elegantly. The native Slack/Sheets/Stripe integrations cover most internal-tool needs.
Hackathon and demo builders. 48-hour speed-to-demo is Base44's sweet spot. The combination of fast generation, polished visual editor, and bundled integrations is hard to beat.
Solo founders who want one bill. Predictable monthly cost, no separate Supabase invoice, no Vercel surprise overage. Simple expense tracking matters when you're solo.
Who Should Avoid Base44
Base44 is the wrong tool for:
• Anyone with a real engineering team. Your devs will move faster with Cursor or Claude Code and own the code.
• Apps planning to scale past 10K users. The lock-in becomes a problem once your data has serious value.
• Compliance-heavy work. HIPAA, custom data residency, advanced audit logs — Base44's managed infrastructure isn't built for these.
• Apps with intricate business logic. Fintech, complex marketplaces, ML-heavy features — Base44 will fight you.
• Anyone who knows they'll want to migrate within 12 months. The vendor lock-in pain isn't worth it.
The Wix Acquisition: Why It Matters
The June 2025 Wix acquisition is the most-discussed thing about Base44 as of mid-2026, for three real reasons:
1. Legitimacy. A publicly traded company paid $80M for Base44 six months after launch. That doesn't happen for a vaporware product. The Wix backing gives Base44 long-term staying power that most 2-year-old startups don't have.
2. Operational changes. Post-acquisition, effective pricing trended 15-30% higher per equivalent app footprint. Support response times shifted from hours to days/weeks for non-enterprise users. A multi-hour platform-wide outage on February 3, 2026 exposed the absence of SLA contracts outside of enterprise tiers.
3. Product roadmap. Wix is positioning Base44 within its broader "Vibe Coding" product family alongside Wix Harmony and Wix Vibe. This means more integration with the Wix ecosystem, more enterprise features in the pipeline, and slower velocity on truly independent product decisions.
Where to Go From Here
If you're trying to figure out whether Base44 is right for you:
• Full honest review: Base44 review 2026 — the honest verdict
• Comparison to the closest competitor: Base44 vs Lovable
• The full vibe coding landscape: Best vibe coding tools 2026
• The bigger picture: What is vibe coding?
If you've already built something on Base44 and you're hitting the iteration ceiling — or trying to figure out whether to migrate — book a free strategy call. I maintain 8+ Base44 apps in vibe code rescue work and can give you a specific recommendation in 10 minutes. No pitch.
### Frequently Asked Questions
**Q: What is Base44 in one sentence?**
A: Base44 is an AI app builder that generates complete full-stack web applications — frontend, backend, database, authentication, and hosting — from a single text prompt, all on its own managed platform. It was acquired by Wix in June 2025 for a reported $80M, six months after launch.
**Q: Who owns Base44?**
A: Wix acquired Base44 in June 2025 for a reported $80 million. Base44 now operates under the Wix Vibe Coding product line. The product team continues to ship features rapidly under Wix ownership, though some users report changes in pricing and support response times post-acquisition.
**Q: How does Base44 actually work?**
A: You type a description of what you want ("a chore scheduler for families with recurring tasks and a points system"), Base44's AI generates the full application — database schema, authentication flow, frontend UI, business logic — and deploys it to a live URL. You iterate by chatting with the AI or using the visual editor for design tweaks. As of 2026, Base44 uses Claude Sonnet 4 by default; Builder plans and above can select from Claude Opus 4.5, Sonnet 4.5, Gemini 2.5/3 Pro, or GPT-5.
**Q: Is Base44 free?**
A: Yes, technically. The Forever Free plan includes 25 message credits, 100 integration credits, and unlimited apps. In practice, 25 credits is enough to evaluate the tool but not enough to ship a real MVP — most users hit the limit within hours. Paid plans start at $16/month (annual billing) or $20/month.
**Q: What can you build with Base44?**
A: Common use cases that work well: internal admin tools, dashboards, CRUD applications, booking systems, simple SaaS MVPs, hackathon demos, and idea validation prototypes. Where Base44 struggles: apps with intricate business logic, sophisticated visual designs, mobile-first experiences, or anything requiring strict compliance (HIPAA, custom data residency).
**Q: Is Base44 the same as Lovable?**
A: No — they're similar in goal (prompt-to-app) but architecturally different. Base44 bundles everything (database, auth, hosting) on its own managed platform. Lovable generates standard React + Supabase code that you can deploy anywhere. Base44 is faster to ship; Lovable is more portable. See Base44 vs Lovable for the full comparison.
**Q: What's the difference between Base44 and Wix?**
A: Wix is a traditional website builder — drag-and-drop, templates, e-commerce. Base44 is an AI app builder — describe what you want, AI generates code. Wix is for marketing sites; Base44 is for applications with custom logic. Since the 2025 acquisition, Base44 has been positioned within Wix's broader "Vibe Coding" product family but operates as a distinct product.
**Q: How much does Base44 cost in 2026?**
A: Free Forever (25 credits), Starter $16/month annual ($20 monthly), Builder $40/month annual ($50 monthly), Pro $80/month annual ($100 monthly). Builder and above unlock the ability to select your AI model (Claude Opus, Gemini, GPT-5). All plans include unlimited apps.
**Q: Is Base44 worth using in 2026?**
A: For the right use case, yes: non-developers validating ideas in under 30 days, operators building internal tools, hackathon builders chasing speed-to-demo. For most other situations, alternatives like Lovable (more portable), Cursor (for developers), or even Replit (for code-curious builders) are better picks. See the honest Base44 review for the full evaluation.
**Q: Can I migrate apps off Base44?**
A: It's hard. The integrated database is the main lock-in — your app's data lives in Base44's managed Postgres, and there's no clean export-and-go path. GitHub code export exists but was still in beta on most plans as of mid-2026. If you think you might want to migrate within 12 months, consider building on Lovable (uses Supabase, easier to move) instead.
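The annual-versus-monthly gap in the pricing answer above is easy to work out. This sketch uses the plan prices quoted in this FAQ and assumes annual billing is charged as twelve times the quoted monthly-equivalent rate; the function name is illustrative.

```python
# Plan prices as quoted in the pricing answer above:
# (monthly-equivalent price on annual billing, month-to-month price), in USD.
PLANS = {
    "Starter": (16, 20),
    "Builder": (40, 50),
    "Pro": (80, 100),
}


def yearly_costs(annual_rate: int, monthly_rate: int) -> tuple[int, int, int]:
    """Return (yearly cost on annual billing, yearly cost month-to-month, savings).

    Assumes annual billing is simply 12 x the quoted monthly-equivalent rate.
    """
    on_annual = annual_rate * 12
    on_monthly = monthly_rate * 12
    return on_annual, on_monthly, on_monthly - on_annual


for name, (annual_rate, monthly_rate) in PLANS.items():
    on_annual, on_monthly, saved = yearly_costs(annual_rate, monthly_rate)
    print(f"{name}: ${on_annual}/yr annual vs ${on_monthly}/yr monthly (saves ${saved})")
```

On these numbers every paid tier works out 20% cheaper on annual billing, which only pays off if you expect to stay on the platform for the full year.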
---
## AI Implementation Consultant: How to Hire One in 2026
- **URL:** https://justinmckelvey.com/blog/ai-implementation-consultant
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** AI for Business
- **Reading time:** 7 min
- **Description:** AI implementation consultants ship code, not slideware. 2026 pricing ($150-$600/hr), 5 vetting questions, 4 red flags. The honest hiring guide.
Quick Answer (Who Ships vs Who Sells)
An AI implementation consultant builds and ships working AI features in your product or operations — not strategy decks. In 2026, rates range from $150–$600/hr depending on seniority. Most projects run $25K–$150K for a 4–12 week implementation. The difference from an AI strategy consultant: implementation consultants leave you with shipped code, not a roadmap. Vet them with one question — "Show me an AI feature you've shipped to production in the last 6 months." If they can't answer with specifics, they're selling slides, not software.
Based on 15+ years shipping products + active AI implementation work for founders · June 2026 · Author: Justin McKelvey, fractional CTO
Key Stats (June 2026)
• Hourly rates: $150-$600/hr depending on seniority and specialty
• Project pricing: $25K-$150K for a 4-12 week implementation
• Retainer pricing: $5K-$25K/month for ongoing work
• Typical timeline: 4-8 weeks for a pilot, 8-16 weeks for multi-feature implementation
• vs Strategy consultant: Implementation ships code; strategy ships slideware. Avoid paying for the latter.
• vs Internal hire: Consultant for time-bound projects; internal for ongoing 40+ hrs/week workloads
• Biggest red flag: They can't show a production AI feature they shipped in the last 6 months
TL;DR: AI Implementation Consultants in 2026
The market for AI implementation consultants exploded in 2025-2026 as every business realized "AI strategy" decks weren't shipping any actual AI. An implementation consultant is the person who takes the idea and ships it: integrates Claude or GPT into your support flow, builds a RAG system over your company docs, automates a workflow with AI agents, fine-tunes prompts for your specific use case.
Pricing in 2026 is wide. Mid-market generalists run $150-$300/hr. Senior specialists with shipping track records charge $300-$600/hr. Project-based engagements run $25K-$150K depending on scope. The cheap end is full of strategy consultants pivoting into "implementation" without ever having shipped production code. The expensive end is mostly firms billing junior implementers at senior partner rates. The middle is where the real work happens.
I'm a fractional CTO who's spent 2025-2026 shipping AI features for founders. This guide is the honest version of what to expect, what to pay, and how to vet — written by someone doing the work, not selling roadmaps.
What an AI Implementation Consultant Actually Does
The deliverable is shipped software, not slides. Specifically:
1. Integrating LLMs into existing products. Adding Claude or GPT-powered features to your SaaS, your support workflow, your internal tools. This is the most common engagement — companies have a use case for AI but no one internally knows how to wire it up.
2. Building RAG systems. Retrieval-augmented generation systems that let LLMs answer questions over your company's docs, support history, or product catalog. Hard to get right because the retrieval quality matters more than the LLM choice.
3. Automating workflows with AI agents. Multi-step automations where the AI makes decisions: "if customer email contains X, draft response Y, route to team Z, escalate if Q." Done well, this saves 10+ hours/week per team. Done badly, this hallucinates customer responses and causes refund storms.
4. Prompt engineering and fine-tuning for specific use cases. Most "AI doesn't work for our use case" complaints are actually prompt engineering problems. A specialist can usually fix it in days, not weeks.
5. Cost and performance monitoring. AI features can quietly burn 5-10x what you budgeted if no one's watching. A good implementation consultant sets up monitoring, cost alerts, and the operational discipline to keep AI features sustainable.
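The point in item 2 about retrieval quality can be shown with a toy example. This is a minimal sketch of the retrieval step only, using word-count cosine similarity as a stand-in for a real embedding model. The sample documents, function names, and prompt wording are all hypothetical, and no LLM is called.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase word counts; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(docs, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble the context-plus-question prompt that would be sent to an LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge-base snippets for illustration.
DOCS = [
    "Refunds are issued within 5 business days of approval.",
    "Our office is closed on public holidays.",
    "To request a refund, email support with your order number.",
]

if __name__ == "__main__":
    print(build_prompt("How do I get a refund?", DOCS))
```

In this toy setup the first document scores zero against the query because "refunds" and "refund" are different tokens. That is the kind of retrieval miss that no choice of LLM can repair, since the model never sees the passage.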
How Much It Costs in 2026
| Engagement type | Typical cost | Best for |
| --- | --- | --- |
| Hourly (mid-market generalist) | $150-$300/hr | Small projects, prototypes |
| Hourly (senior specialist) | $300-$600/hr | Complex implementations, hard problems |
| Pilot project (4-6 weeks) | $25K-$50K | Validating one AI feature before committing |
| Full implementation (8-12 weeks) | $50K-$150K | Multi-feature builds, RAG + agents + monitoring |
| Retainer (ongoing) | $5K-$25K/mo | Continuous AI integration work over months |
| "AI Transformation" (12+ months) | $500K+ | Usually a firm padding billable hours. Skip. |
How to Vet an AI Implementation Consultant
Five questions that filter out the strategy consultants pretending to be implementers:
1. "Show me an AI feature you shipped to production in the last 6 months." Specific, recent, in production. If they fumble this, you're talking to a career-pivot strategist, not an implementer.
2. "What's your typical pricing for a 90-day pilot implementation?" A real implementer can answer in 30 seconds with a range. A strategist will need to "scope it out" and follow up with a custom proposal.
3. "Who owns the code at the end of the engagement?" Should be: you. Some firms try to retain IP or build "platforms" you license back. Walk away from those.
4. "What's your approach to monitoring AI costs and accuracy in production?" Real implementers have an answer involving specific tools (LangSmith, Helicone, custom logging) and metrics. Generic answers = they haven't actually run AI in production at scale.
5. "Walk me through your last AI bug — what broke and how you fixed it." Anyone who has shipped real AI has fought a production bug. The story should be specific, technical, and slightly painful. No story = no production experience.
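For question 4, the "custom logging" answer can be as simple as recording token counts per call and checking spend against a budget. The sketch below is one possible shape, not a reference to any vendor's SDK. The class names, the per-million-token prices, and the budget figure are all hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class CallRecord:
    feature: str
    input_tokens: int
    output_tokens: int


@dataclass
class CostTracker:
    # Hypothetical prices in USD per million tokens; substitute your provider's real rates.
    input_price_per_m: float
    output_price_per_m: float
    monthly_budget_usd: float
    records: list[CallRecord] = field(default_factory=list)

    def log(self, feature: str, input_tokens: int, output_tokens: int) -> None:
        """Record one model call; in production this would be written to durable storage."""
        self.records.append(CallRecord(feature, input_tokens, output_tokens))

    def cost_of(self, r: CallRecord) -> float:
        return (r.input_tokens * self.input_price_per_m
                + r.output_tokens * self.output_price_per_m) / 1_000_000

    def total_cost(self) -> float:
        return sum(self.cost_of(r) for r in self.records)

    def cost_per_call(self) -> float:
        return self.total_cost() / len(self.records) if self.records else 0.0

    def over_budget(self) -> bool:
        """True once logged spend exceeds the monthly budget; hook an alert to this."""
        return self.total_cost() > self.monthly_budget_usd


if __name__ == "__main__":
    tracker = CostTracker(input_price_per_m=3.0, output_price_per_m=15.0, monthly_budget_usd=0.05)
    for _ in range(4):
        tracker.log("support_reply", input_tokens=2000, output_tokens=500)
    print(f"total ${tracker.total_cost():.4f}, per call ${tracker.cost_per_call():.4f}")
    print("over budget:", tracker.over_budget())
```

A consultant with production experience will usually describe something like this plus accuracy sampling, whether it is built in-house or handled by a tool such as LangSmith or Helicone.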
Red Flags to Avoid
• Leads with workshops, roadmaps, "alignment" sessions. Real implementers lead with shipping, not facilitation.
• Can't show production AI features they've built. Without recent work, they're learning on your dime.
• $50K+ upfront before any code. Pay for outcomes, not promises. Most ethical implementers offer a paid pilot ($5-15K, 2-3 weeks) before committing to a larger engagement.
• Uses "AI transformation" without naming specific tools or models. If they can't name the LLM, the integration pattern, and the monitoring stack — they're selling vibes.
• The pitch deck is more polished than the code samples. Code talks. Decks sell.
Implementation vs Strategy Consultant: The Distinction Matters
Both roles exist for good reasons. They solve different problems:
AI strategy consultant — usually from a Big 3 background. Maps your business processes, identifies high-leverage AI opportunities, builds the financial model, presents to the board. Deliverable: a 60-page deck and a roadmap. Cost: $150K-$2M. Useful for Fortune 500 procurement, not for most businesses under $100M ARR.
AI implementation consultant — usually from a software engineering background. Takes your top opportunity, builds it, ships it, monitors it. Deliverable: working code in production. Cost: $25K-$150K for a typical project. Useful for everyone who actually wants AI to do something.
Most businesses don't need both. Pick implementation first. If after a successful pilot you need strategy support for a broader rollout, add strategy then. Doing it the other way around — strategy first, implementation later — is how companies end up two years and $500K deep with no shipped features.
When to Hire Internally Instead
An AI implementation consultant is wrong when:
• You have 40+ hours/week of ongoing AI work. At that volume, an internal hire is cheaper and more invested.
• The AI feature IS your core product. If you're building an AI startup, the implementation can't sit outside your team.
• You have strict data residency requirements. Some industries can't have external contractors touching the data; you need internal staff.
• You can hire and ramp someone in 4 weeks. If you can, do — internal is always better for ongoing work.
For everyone else: time-bound projects, pilots, specialized work, validating ROI before committing to headcount — a consultant is the right tool.
What a 90-Day Engagement Looks Like
The pattern that works:
Days 1-14: Discovery + scoping. Pick ONE high-leverage AI feature. Define success: response time, accuracy, cost per query, user metrics. Map the integration points. Estimate the work. Decide go/no-go before building.
Days 15-60: Build. Implementation, testing, iteration. Real users get involved in weeks 4-5. By week 8 the feature is shippable but not perfect.
Days 61-90: Ship + handoff. Deploy to production. Monitor. Tune. Document. Train your team to maintain it. Identify the next 2-3 AI opportunities for follow-up.
Output: one shipped AI feature + a playbook for shipping more. Cost: usually $25K-$60K depending on scope and integration complexity.
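The success definition from days 1-14 (response time, accuracy, cost per query) is most useful when it is written down as thresholds you can check before shipping. A minimal sketch of that check follows, assuming each test query has been reviewed and timed. The class names, field names, and threshold values are hypothetical and would be agreed during scoping.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class QueryResult:
    correct: bool      # did a reviewer mark the answer as acceptable?
    latency_s: float   # end-to-end response time in seconds
    cost_usd: float    # model spend for this query


@dataclass
class Targets:
    # Hypothetical thresholds agreed during scoping.
    min_accuracy: float
    max_avg_latency_s: float
    max_cost_per_query_usd: float


def evaluate(results: list[QueryResult], targets: Targets) -> dict[str, bool]:
    """Check each success metric against its target."""
    if not results:
        raise ValueError("need at least one query result")
    accuracy = sum(r.correct for r in results) / len(results)
    avg_latency = mean(r.latency_s for r in results)
    avg_cost = mean(r.cost_usd for r in results)
    return {
        "accuracy": accuracy >= targets.min_accuracy,
        "latency": avg_latency <= targets.max_avg_latency_s,
        "cost": avg_cost <= targets.max_cost_per_query_usd,
    }


def ready_to_ship(results: list[QueryResult], targets: Targets) -> bool:
    """Go only when every metric meets its target."""
    return all(evaluate(results, targets).values())


if __name__ == "__main__":
    sample = [
        QueryResult(True, 1.0, 0.01),
        QueryResult(True, 2.0, 0.02),
        QueryResult(False, 3.0, 0.03),
        QueryResult(True, 2.0, 0.02),
    ]
    print(evaluate(sample, Targets(min_accuracy=0.9, max_avg_latency_s=2.5, max_cost_per_query_usd=0.05)))
```

Running the same check at the end of the build phase and again after launch gives the handoff a concrete pass or fail instead of a judgment call.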
The Bottom Line
AI implementation consultants are the people who actually ship working AI in your business. In 2026, the market is full of newly-converted strategy consultants pretending to be implementers. The vetting questions above filter them out.
For most businesses, the right first move is a 90-day pilot for $25K-$60K with a senior implementation consultant. You either get a shipped AI feature and a playbook for more, or you learn cheaply that this specific use case isn't ready yet. Either outcome is better than $200K on slideware.
If you're trying to figure out whether AI implementation makes sense for your specific situation, book a free 15-min strategy call. I'll give you an honest read on whether your use case is ready and what it would cost to ship the first feature. No pitch, no roadmap, no slideware.
Related reading: AI consultant: the broader landscape, firm vs solo, AI strategy consultant role, Chief AI Officer vs Fractional CTO.
### Frequently Asked Questions
**Q: What does an AI implementation consultant do?**
A: An AI implementation consultant builds and deploys working AI features into your business — not strategy decks. Specifically: integrating LLMs (Claude, GPT, Gemini) into your product or operations, building RAG systems for company knowledge, automating workflows with AI agents, fine-tuning or prompt-engineering models for your specific use case, and setting up monitoring + cost controls. The deliverable is shipped code, not a 60-page roadmap.
**Q: How much does an AI implementation consultant cost in 2026?**
A: Hourly rates range from $150 to $600/hr depending on experience and specialty. Mid-market generalists are $150-$300/hr. Senior implementation specialists with shipping track records are $300-$600/hr. Project-based engagements typically run $25K-$150K for a 4-12 week implementation. Retainers are $5K-$25K/month for ongoing AI integration work.
**Q: What's the difference between AI strategy and AI implementation?**
A: AI strategy consultants write the plan; AI implementation consultants ship the code. Strategy decks identify opportunities ("add a chatbot to your support workflow") and quantify ROI in slideware. Implementation consultants actually build, test, and deploy the chatbot. Most businesses don't need a separate strategy consultant — they need someone who can do both, and ideally lead with implementation.
**Q: When should I hire an AI implementation consultant vs an internal hire?**
A: Hire a consultant for: time-bound projects (90 days, ship X), pilot programs to validate ROI before committing, or specialized work you don't need full-time. Hire internally when you have ongoing AI workloads, the AI feature IS the product, or you need 40+ hours/week of dedicated focus. A common pattern: consultant to ship V1, internal team to maintain and extend.
**Q: What questions should I ask an AI implementation consultant before hiring?**
A: Five questions that filter out the slideware sellers: (1) Show me an AI feature you've shipped to production in the last 6 months. (2) What's your typical pricing for a 90-day implementation? (3) Who owns the code at the end of the engagement? (4) What's your approach to monitoring AI costs and accuracy in production? (5) Can you walk me through your last AI bug — what broke and how you fixed it. If they can't answer #1 with specifics, they're a strategist, not an implementer.
**Q: What are the red flags when hiring an AI implementation consultant?**
A: Four real ones: (1) They lead with workshops and roadmaps instead of shipping. (2) They can't show production AI features they've built. (3) They want $50K+ upfront before any code. (4) They use "AI transformation" without naming specific tools, models, or integration points. If you hear "strategic alignment" before "Claude Sonnet 4.5 with function calling," they're billing for slides not software.
**Q: How long does AI implementation usually take?**
A: Most pilot implementations take 4-8 weeks (a single AI feature shipped to production). Larger implementations (multi-feature, custom RAG systems, agent workflows) run 8-16 weeks. Anyone promising a full "AI transformation" in 30 days is selling shovels; anyone planning a 12-month implementation is bleeding you with billable hours. The sweet spot for most businesses is a 90-day pilot.
**Q: Do I need a CTO if I have an AI implementation consultant?**
A: Not necessarily, but you need someone making technical decisions. If you have an internal engineering team, your CTO directs the consultant. If you don't, the AI implementation consultant should also play a fractional technical leadership role — connecting the AI work to the rest of your tech stack and roadmap. See Chief AI Officer vs Fractional CTO for the longer answer.
**Q: Should I hire an AI implementation consulting firm or an independent?**
A: Independents are 2-3x cheaper and often more senior at the actual implementation work — they're not selling junior consultants up the org chart. Firms have more capacity, more processes, and more credentials if you need them for enterprise procurement. For most businesses under $50M ARR: an independent or 2-person team is the better value. For Fortune 500 procurement: firms have the certifications you need. See AI consulting firm vs solo consultant for the full breakdown.
**Q: What does a 90-day AI implementation engagement look like?**
A: Days 1-14: discovery + scoping. Pick one high-leverage AI feature to ship. Define success metrics (response time, accuracy, cost per query). Days 15-60: build. Iterate. Test with real users. Days 61-90: ship to production, monitor, handoff documentation, training your team to maintain it. Output: one shipped AI feature + the playbook to ship more. Cost: typically $25K-$60K depending on scope.
---
## Replit vs Lovable 2026: Which AI Builder Wins?
- **URL:** https://justinmckelvey.com/blog/replit-vs-lovable
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 5 min
- **Description:** Replit vs Lovable: Replit = browser IDE + AI Agent for code-curious builders. Lovable = prompt-to-app for non-coders. Pricing, when each wins.
Quick Answer (The Verdict)
Replit wins for code-curious builders who want to learn (browser IDE + AI Agent + multiplayer collaboration, $25/mo Core). Lovable wins for non-coders who want a polished app without touching code (prompt-to-deployed-app, $25/mo Pro). They're priced similarly but solve different problems. Replit is a browser-based development environment where AI helps; Lovable is a builder where AI does everything and you see the output. Pick Replit if you want to grow as a builder. Pick Lovable if you just want the app.
Based on hands-on work with Replit, Lovable, and client advisory work between them · June 2026 · Author: Justin McKelvey, fractional CTO
Key Stats (June 2026)
• Replit Core: $25/mo (cloud workspaces + Replit Agent + deployments)
• Lovable Pro: $25/mo (Pro+ $50/mo for more credits)
• Replit philosophy: Browser IDE for code-curious builders. 50+ languages, 30M+ users worldwide
• Lovable philosophy: Prompt-to-deployed-app for non-coders. React + Supabase output
• Code visibility: Replit = full code editing. Lovable = mostly hidden behind chat UI
• Collaboration: Replit = real-time multiplayer (Google Docs for code). Lovable = single builder
• Best fit: Replit = learning + prototyping + classrooms; Lovable = MVPs + idea validation
TL;DR: Replit vs Lovable in 90 Seconds
Replit is a browser IDE with AI features. Lovable is an AI app generator with a polished output. Both can produce a working web app from a description. The difference is the experience and the audience: Replit shows you the code, lets you tinker, and supports multiplayer collaboration. Lovable hides the code, optimizes the chat UX, and produces a deployable React + Supabase app.
I'm a fractional CTO who's used both tools and advised founders picking between them. The honest verdict: if you want to be a builder, use Replit. If you just want the app, use Lovable. Most people will pick Lovable, and that's the right call for them.
What Each Tool Actually Is
Replit is a browser-based development environment. You can write code in 50+ languages, install packages, use a terminal, and run anything from a Python script to a full Node.js app, all without installing anything locally. Replit Agent (their AI feature) can build apps from prompts; Replit Assistant gives you AI help inline as you code. The signature feature: multiplayer editing, like Google Docs for code, where multiple people can work on the same file in real time.
Lovable is a prompt-to-app generator. You type a description ("a meal planning app where users save favorite recipes"), Lovable generates a complete React + Tailwind + Supabase application, and you iterate by chatting with the AI. The visual editor lets you tweak design without writing code. Output is standard web app code you can export to GitHub and deploy anywhere.
Pricing Side-by-Side
| Tier | Replit | Lovable |
| --- | --- | --- |
| Free | Unlimited public Repls, limited compute | Free tier with daily message limits |
| Starter / Core | $25/mo Core (workspaces + Agent + deployments) | $25/mo Pro |
| Mid tier | $40/mo Teams | $50/mo Pro+ (more credits) |
| Backend cost | Included (Replit Database, hosted DBs) | Supabase (~$0-25/mo separate) |
| Compute model | Hours-based with auto-sleep | Per-message credit system |
Code Visibility: The Big Workflow Difference
This is where the two tools feel most different in daily use.
Replit shows you the code first. Even when you use Replit Agent to generate an app, you're working inside a normal IDE — files, folders, a terminal, package management. You can read what the AI generated, modify it directly, run commands, and treat it as a real development project. This is empowering if you want to learn or have any coding skills. It's intimidating if you don't.
Lovable hides the code by default. You chat with the AI, you see a preview of the app, you tweak the design in a visual editor. The code exists (you can view it and export it), but the primary interface is the conversation. This is the better experience for non-developers — no overwhelm, no syntax errors, no terminal anxiety.
Neither approach is "right." They serve different builders.
Multiplayer: Replit's Unmatched Advantage
Replit's real-time collaboration is genuinely unique in this space. Two or more people can edit the same file simultaneously, see each other's cursors, and chat in the workspace. It's used in classrooms, pair programming sessions, and remote teams.
Lovable's collaboration is single-builder by design — the AI is the collaborator. If you need multiple humans iterating on the same app at the same time, Replit is the only choice between these two.
AI Capabilities: Different Philosophies

Replit's AI features include Replit Agent (prompt-to-app, comparable to Lovable), Replit Assistant (in-editor AI help, comparable to early Cursor), and AI autocomplete. All operate within the context of a normal coding environment.
Lovable's AI features are the entire interface. There's no "AI feature" because the AI IS the app builder. Every interaction is a prompt; every change is a generation.
For builders who want AI to augment their coding: Replit. For builders who want AI to BE the coding: Lovable.
Deployment Story

Both can ship apps to production but go about it differently:
Replit Deployments include Autoscale (scales with traffic), Reserved VM (dedicated compute), and Static (CDN-hosted). You deploy from the Replit interface and the app runs on Replit's infrastructure. Good for small-to-medium traffic.
Lovable outputs standard React apps that deploy to Vercel or Netlify by default. For high-traffic production sites, this is the better path — Vercel's edge network and serverless functions outperform Replit's hosted deployments at scale.
Who Should Pick Replit
• Code-curious builders — you want to see and understand the code, even if AI writes most of it
• Teachers, students, classrooms — Replit's education-first features are unmatched
• Remote pair programming or collaboration — multiplayer is the killer feature
• Tinkerers building in 50+ languages — Python data science, game development, embedded experiments
• Chromebook or low-spec hardware users — everything runs in the browser
Who Should Pick Lovable
• Non-developers shipping web apps — the polished prompt-to-app experience is what you want
• Founders validating ideas fast — get a deployable MVP in hours
• Designers and product people — focus on the UX, not the implementation
• Anyone planning to scale — standard React + Supabase output transitions cleanly to a real dev team later
• Solo builders who don't want to think about deployment — Lovable handles the boring parts
The Honest Bottom Line

If you can already code or want to learn: Replit gives you the most flexibility and the best collaboration story.
If you can't code and don't want to learn: Lovable will get you to a working app faster with less stress.
For developers shipping serious work: neither one — use Cursor or Claude Code on desktop, deploy to Vercel or Railway. Lovable and Replit are excellent tools for their target audiences; that audience just isn't professional developers building production systems.
Need help picking? Book a free strategy call. Or for the broader landscape: the best vibe coding tools 2026, Lovable vs Cursor, Replit vs Cursor.
### Frequently Asked Questions
**Q: Is Replit better than Lovable?**
A: Depends on your skill level. Replit is better for code-curious builders who want to learn and tinker: it gives you a browser IDE with AI Agent capabilities, real code visibility, and a more traditional development workflow. Lovable is better for non-coders who want a polished app from a prompt without seeing the code. Replit Core ($25/mo) and Lovable Pro ($25/mo) are similarly priced.
**Q: Can I use Replit Agent like Lovable?**
A: Replit Agent (released 2024 and significantly improved through 2026) is Replit's prompt-to-app feature. It generates apps from descriptions, similar to Lovable. The difference: Replit Agent generates code in your visible Replit workspace, while Lovable focuses on the deployed-app experience with less code emphasis. Replit Agent is the better choice if you want to learn and modify the code; Lovable is better if you want the app, not the lesson.
**Q: Which is more affordable, Replit or Lovable?**
A: Both have free tiers and ~$25/month paid tiers. Replit's pricing includes more compute hours and storage; Lovable's pricing includes more AI generation credits. For light prototyping, both are essentially free. For active building, Replit's Core ($25/mo) and Lovable's Pro ($25/mo) are head-to-head on price.
**Q: Can I export code from Replit and Lovable?**
A: Yes, both. Replit lets you download your entire project as a zip or push to GitHub — your code is yours. Lovable generates standard React + Supabase code that you can push to GitHub and host anywhere. Neither has hard lock-in on code, though Lovable's database (Supabase) is easier to migrate than Replit's hosted databases.
**Q: Is Replit better for learning to code?**
A: Yes. Replit was built around education from day one — it has classroom features, supports 50+ languages, integrates with curriculum tools, and the browser IDE is designed for beginners. Lovable assumes you don't want to learn the code at all. If learning is part of your goal, Replit is clearly the better tool.
**Q: Which is better for collaborative work?**
A: Replit, decisively. Multiplayer code editing has been Replit's signature feature since 2018 — multiple people can edit the same file in real time, like Google Docs for code. Lovable's collaboration features are limited; it's designed around a single builder iterating with the AI.
**Q: Can I deploy production apps from Replit or Lovable?**
A: Both support deployment. Replit has Deployments (Autoscale, Reserved VM, Static) directly from the platform. Lovable apps deploy to Vercel or Netlify by default since they're standard React apps. For high-traffic production, Lovable's generated apps on Vercel will outperform Replit's deployments — but Replit's all-in-one workflow is faster to get something live initially.
**Q: Which has better AI features?**
A: They emphasize different things. Replit's AI features (Agent, Assistant, autocomplete) are integrated into a real coding environment. Lovable's AI is the entire interface — you describe, it builds. For someone who wants AI to help them write code: Replit. For someone who wants AI to write the app and stay out of the way: Lovable.
**Q: Which one should non-developers pick?**
A: Lovable, in most cases. The prompt-to-app workflow with minimal code exposure is better suited for true non-developers. Replit Agent is non-developer-friendly too, but Replit's broader interface (multiple files, terminal, package manager) can be intimidating for someone who's never written code.
**Q: Which one should developers pick?**
A: Replit, if you want a browser-based development environment with AI assistance — especially useful for prototyping, teaching, or working from a Chromebook. For serious production work, most developers use Cursor or Claude Code on desktop instead. Lovable is rarely the right choice for developers — they get more control with Cursor + Supabase directly.
---
## Base44 vs Lovable 2026: Which AI App Builder Wins?
- **URL:** https://justinmckelvey.com/blog/base44-vs-lovable
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 6 min
- **Description:** Base44 vs Lovable: Base44 bundles DB + auth; Lovable uses Supabase. Pricing, lock-in, which wins for non-devs vs production teams.
Quick Answer (The Verdict)
Base44 wins for non-developers shipping fast (all-in-one managed stack, $16/mo annual). Lovable wins for production-track apps (standard React + Supabase, $25/mo Pro, easier to hand off). Same starting price range, fundamentally different architectures. Base44 = managed everything (faster to ship, harder to leave). Lovable = standard stack (slightly more setup, much more portable). Pick Base44 for hackathons, internal tools, and idea validation. Pick Lovable for anything you expect to grow past 12 months.
Based on 8+ inherited Base44 apps + Lovable comparisons in client advisory work · June 2026 · Author: Justin McKelvey, fractional CTO
Key Stats (June 2026)
• Base44 starter: $16/mo annual ($20 monthly); Wix-acquired June 2025 ($80M)
• Lovable Pro: $25/mo + Supabase costs separately (~$0–$25/mo for small apps)
• Tech stack — Base44: Proprietary managed Postgres + auth + hosting (all-in-one)
• Tech stack — Lovable: React + Tailwind + Supabase (industry-standard, portable)
• Generated code export: Base44 = beta; Lovable = full GitHub export from day one
• Iteration ceiling: Base44 ~v3-v4 of complex apps; Lovable ~v5-v7 (codebase grows manually)
• Best for: Base44 = hackathons + internal tools; Lovable = MVPs that might scale
TL;DR: Base44 vs Lovable in 90 Seconds

Base44 and Lovable both turn a text prompt into a working full-stack app. The fundamental difference is what happens to the code. Base44 hosts everything on its own managed platform: your database, your auth, and your hosting all live in Base44. Lovable generates React + Supabase code that you can deploy anywhere and hand off to any developer who knows the standard web stack.
I'm a fractional CTO who maintains 8+ Base44 apps as part of vibe code rescue work and advises founders on picking between AI app builders weekly. The honest verdict: Base44 is faster, Lovable is more portable. The right choice depends on what you're building.
What Each Tool Actually Is

Base44 is an all-in-one AI app builder. You type a prompt ("a chore scheduler for families with recurring tasks and points"), Base44 generates a complete app on its proprietary stack: a managed Postgres database, built-in authentication, file storage, and hosted deployment. You iterate by chatting with the AI. No separate Supabase project. No Vercel bill. One platform, one invoice. Acquired by Wix in June 2025 for $80M.
Lovable is also a prompt-based AI app builder, but it generates standard React + Tailwind CSS + Supabase code. You can deploy that code anywhere (Vercel, Netlify, your own server). You manage the Supabase database as a separate service. The output is just a normal web app — any React developer can pick it up and continue building.
Both use modern AI models for code generation. Both have visual editors for post-generation tweaks. Both ship MVPs in hours, not weeks. The platform philosophy is where they diverge.
Pricing: Apples vs Apples vs Apples

| Plan | Base44 | Lovable |
| --- | --- | --- |
| Free | 25 message credits (limited) | Free with daily message limits |
| Starter / Pro | $16/mo annual ($20 monthly) | $25/mo Pro |
| Mid tier | $40/mo Builder (annual) | $50/mo Pro+ |
| Backend cost | Included | Supabase Free ($0) to Pro ($25/mo) |
| Hosting cost | Included | Vercel/Netlify (often free at small scale) |
| Realistic total at small scale | $16/mo | $25-50/mo combined |

Base44's all-in-one pricing is simpler to predict but caps your ability to optimize. Lovable's multi-vendor pricing has more knobs to turn — you can move to Supabase's higher tiers, swap hosting providers, and tune costs as you grow.
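
The small-scale totals in the pricing table can be reproduced with a few lines of arithmetic. This is a minimal sketch that uses only the figures quoted above; the function name `monthlyTotal` and the object shapes are illustrative and not part of either product.

```javascript
// Monthly prices in USD, taken from the pricing comparison above.
const base44 = { plan: 16, backend: 0, hosting: 0 };       // backend and hosting are bundled
const lovableLow = { plan: 25, backend: 0, hosting: 0 };   // Supabase Free, free-tier hosting
const lovableHigh = { plan: 25, backend: 25, hosting: 0 }; // Supabase Pro, free-tier hosting

// Sum the line items of one stack into a single monthly figure.
function monthlyTotal(stack) {
  return stack.plan + stack.backend + stack.hosting;
}

console.log(monthlyTotal(base44));      // 16
console.log(monthlyTotal(lovableLow));  // 25
console.log(monthlyTotal(lovableHigh)); // 50
```

Keeping each vendor as a separate line item is the point: on the Lovable side you can change one number (a higher Supabase tier, a paid host) without touching the others.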
The Lock-In Question

This is the single biggest differentiator and the one most non-developers underestimate when picking between these tools.
Base44 has structural lock-in. Your data lives in Base44's managed Postgres. Your auth lives in Base44's auth system. Your hosting is on Base44. If Base44 raises prices, has a multi-hour outage (as it did Feb 3, 2026), or just stops being the platform you want — your options are limited. GitHub code export exists but was still in beta as of mid-2026 on most plans, and even with the code, you can't easily extract the data.
Lovable has minimal lock-in. The generated code is standard React + Supabase. You can clone the repo, take it to any developer, deploy it anywhere, and migrate the Supabase database to any Postgres host. The whole stack is industry-standard, well-documented, and easy to hire for.
This matters less for a hackathon project and more for anything you might bet a business on.
Iteration Ceiling: Where Each Tool Breaks

Both tools work great for v1 of an app. The question is what happens at v5 or v10, when the prompts get specific, the business logic gets intricate, and you need to fix bugs that the AI introduced.
Base44's iteration ceiling hits around v3-v4 of a complex app. The AI starts making changes that break existing features. You burn credits trying to debug. You don't have direct code access to fix things manually (without ejecting). For internal tools and simple apps, you might never hit this ceiling. For a product you're building a business around, you will hit it.
Lovable's iteration ceiling hits around v5-v7, and even then you can drop into the generated code manually to fix what the AI can't. The codebase grows as a normal React project, which you can navigate, refactor, and extend by hand. This requires either you or a developer to know React — but it dramatically extends the useful life of the AI-generated foundation.
Who Should Pick Base44
• Non-developers validating an idea fast. 30-day MVP cycles, hand off if it works, throw away if it doesn't.
• Operators building internal tools. Admin dashboards, CRUD apps, team workflows. Don't need to scale, don't need beautiful design.
• Hackathon builders. 48-hour speed-to-demo is Base44's sweet spot.
• Anyone who just wants one bill. All-in-one pricing simplifies expense tracking for small projects.
Who Should Pick Lovable
• Anyone planning to scale past validation. The portability of standard React + Supabase is worth real money once your app has paying users.
• Founders who want to hire developers later. Any React developer can extend a Lovable app; hiring "a Base44 developer" is much harder.
• Anyone who cares about cost optimization at scale. Lovable + Supabase has more knobs to tune as your usage grows.
• Anyone allergic to vendor lock-in. Lovable code is yours; Base44 apps are on Base44.
The Hybrid Move

A pattern I see working for several founders: start on Base44 to validate the idea (4-6 weeks), then rebuild on Lovable or hire a developer once you have paying users. You get Base44's speed-to-validation without the long-term lock-in cost.
This only works if you decide upfront that Base44 is your validation tool, not your production platform. Otherwise you'll keep iterating on Base44, hit the ceiling, and face a painful rebuild on a tighter timeline.
Bottom Line

For most founders deciding between these two right now:
• Pick Base44 if you can't code, you need to ship in days, and you're OK with rebuilding later if the idea works.
• Pick Lovable if you can code (or plan to hire someone who does), and you want a path from MVP to scaled product without a full rebuild.
Want more context on the broader vibe coding landscape? See the best vibe coding tools 2026 or the dedicated reviews: Base44 review and Lovable vs Cursor.
If you're already on one and the iteration ceiling is biting, book a free strategy call. I do this exact "stay or migrate" analysis with founders weekly.
### Frequently Asked Questions
**Q: Is Base44 better than Lovable?**
A: Neither is universally better — they target different use cases. Base44 bundles its own database, auth, and hosting, which makes it the fastest path to a deployed full-stack app for non-developers. Lovable generates standard React + Supabase code, which makes it the better choice if you expect to hand the app to a developer eventually or migrate to a more flexible stack. Pricing is similar ($16-$25/month).
**Q: What's the main difference between Base44 and Lovable?**
A: Architecture. Base44 is an end-to-end managed platform — your data lives in Base44's managed Postgres, your auth lives in Base44's auth system, and your app is hosted on Base44. Lovable generates a React + Supabase app that you can deploy anywhere. Base44 is faster to ship; Lovable is more portable.
**Q: Which is cheaper, Base44 or Lovable?**
A: Base44 starts cheaper ($16/mo annual vs Lovable's $25/mo Pro), but the comparison isn't apples-to-apples — Lovable's price doesn't include the Supabase backend, which you pay for separately. For a small app, total cost is similar. For a growing app, Base44's all-in-one billing is more predictable; Lovable + Supabase has more pricing surfaces but more headroom to optimize.
**Q: Can I migrate from Base44 to Lovable?**
A: It's hard. The biggest blocker is the database — Base44's managed Postgres has no clean export-and-go path, and migrating data to a standard Supabase project requires manual schema mapping and ETL. Realistically, migration means rebuilding the app on Lovable and importing data through CSV exports. Plan for 4-8 weeks of work depending on app complexity.
**Q: Which one is better for non-developers?**
A: Base44, for most cases. The all-in-one approach means you don't need to understand what Supabase is, how to manage a database project separately, or how to wire up auth. Lovable is also non-developer-friendly but assumes you'll eventually learn how the underlying stack works.
**Q: Which is better for production apps?**
A: Lovable, for most cases. The generated code is standard React + Supabase + Tailwind, which is industry-standard and easy to hand off to a developer. Base44 is great for getting to production fast, but the platform's iteration ceiling (around v3-v4 of a complex app) and vendor lock-in make it less suited for apps you expect to grow significantly.
**Q: Does Base44 use Lovable's tech?**
A: No. They're independent platforms with different architectures. Base44 uses Claude Sonnet 4 (and others on higher plans) to generate code on its proprietary stack. Lovable uses similar AI models to generate React + Supabase code. Both were built independently around the same time (2024).
**Q: Who owns Base44 and Lovable?**
A: Base44 was acquired by Wix in June 2025 for a reported $80M. Lovable remains an independent company (founded 2024, based in Stockholm), with strong VC backing as of 2026.
**Q: Is Base44 or Lovable better for AI features?**
A: Both can integrate AI features into apps they generate. Base44 has native OpenAI/Anthropic integrations built into the platform, so adding AI features is configuration, not code. Lovable lets you wire AI calls into the generated code directly, which is more flexible but requires more setup. For straightforward AI features (chat, generation, embeddings), Base44 is faster; for custom AI workflows, Lovable gives you more control.
**Q: Which should I pick for a hackathon?**
A: Base44 — speed-to-demo is its sweet spot, and the all-in-one stack means you skip configuration. You can ship a working full-stack demo with auth, database, and Stripe in a single afternoon.
---
## Supabase vs Firebase 2026: The Honest Backend Verdict
- **URL:** https://justinmckelvey.com/blog/supabase-vs-firebase
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 8 min
- **Description:** Supabase vs Firebase: at 10K DAU, Supabase costs $50-100/mo; Firebase $500-1,500. Full 2026 comparison: pricing math, code, when each wins.
Quick Answer (The Verdict)
For most modern web apps in 2026, Supabase wins on cost (3–5x cheaper for the same workload), portability (it's just Postgres), and SQL power. Firebase still wins for mobile-first apps with complex offline sync, ultra-fast prototyping, and teams already committed to Google's stack. Specific math: 10K DAU + 10M reads/day costs ~$50–100/mo on Supabase vs ~$500–1,500/mo on Firebase. The Supabase Pro tier is $25/mo. Firebase has no equivalent flat-rate tier — it bills per operation, which punishes successful apps.
Based on production cost data, 2026 pricing pages, and client backend decisions · Author: Justin McKelvey, fractional CTO who advises founders on this exact choice weekly
Key Stats (June 2026)
• Cost ratio: 3–5x cheaper on Supabase for SaaS workloads at 10K+ DAU
• Supabase Pro tier: $25/month (8 GB DB, 250 GB bandwidth, 100K MAUs)
• Firebase Spark (free): Per-operation limits (50K reads/day on Firestore)
• Supabase Free: 500 MB DB, 50K MAUs, unlimited API requests
• Architecture: Supabase = Postgres (open-source); Firebase = Firestore NoSQL (proprietary)
• Migration story: Supabase exports as standard SQL; Firestore requires full data layer rewrite
TL;DR: Supabase vs Firebase in 2026

Supabase is the right call for most modern web applications in 2026. It's open-source Postgres with managed auth, storage, realtime, and edge functions on top, all reading from one source of truth. You get SQL, Row Level Security, pgvector for AI features, and an actual exit door if you ever need to leave. Pricing is resource-based ($25/mo Pro) instead of per-operation, which means a successful app doesn't trigger a $5,000 surprise bill.
Firebase is the right call for: mobile-first apps with complex offline sync, teams already in Google's ecosystem, or hyper-fast prototyping with the new Data Connect product. Firestore (the NoSQL database) still has the most mature offline-first SDK on the market, and Firebase's Gemini integration is first-party.
I'm a fractional CTO who advises founders on this exact decision multiple times per week. The honest version: Firebase used to be the obvious answer because it was easier to start with. In 2026, Supabase matches or beats it on developer experience for web apps — and the cost-at-scale difference is no longer close.
The Architectural Split (And Why It Matters)

Supabase is built around one idea: Postgres is the source of truth. Auth, storage, edge functions, real-time updates, and Row Level Security all live in or reference the same Postgres database. When you query data, you write SQL. When you change auth rules, you write policies that run inside Postgres. When you need vector search for an AI feature, you use the pgvector extension on the same database.
Firebase is Google-first and service-oriented. Each capability — Firestore (NoSQL database), Firebase Auth, Cloud Functions, Hosting, Cloud Messaging, Data Connect (the newer managed Postgres) — is its own managed service with its own SDK, its own pricing model, and its own scaling behavior. Stitching them together is easier than building infrastructure from scratch, but harder than working in one unified system.
This architectural split drives every other difference. Supabase feels like you're working in one connected system. Firebase feels like you're plugging together a bag of separate Google services. Both can work — they just feel different.
Pricing: Where the 3–5x Gap Lives

This is the single biggest reason teams migrate from Firebase to Supabase. The two platforms charge for fundamentally different things.
Supabase charges for resources. You pay for database size, monthly active users, bandwidth, and storage — basically the size of your app. As your usage scales, your bill grows roughly linearly with the size of your data and your audience.
Firebase charges for operations. You pay per database read, per write, per Cloud Function invocation, per bytes transferred. As your usage scales, your bill grows with the activity of your users. A user reading a feed 50 times a day costs more than a user reading it once. A successful app means lots of operations. Lots of operations means a big bill.
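
To make the two billing shapes concrete, here is a rough sketch of each model as a function. Every rate in it (`perExtraGb`, `perMillionReads`, `perMillionWrites`) is a hypothetical placeholder chosen to show how the curves behave, not a price from either vendor; only the $25 plan fee and 8 GB included database size come from the Pro tier figures quoted in this article.

```javascript
// Resource-based billing: a flat plan fee plus overage on what the app stores.
// perExtraGb is a caller-supplied placeholder, not a vendor price.
function resourceBill({ planFee, includedGb, dbGb, perExtraGb }) {
  const extraGb = Math.max(0, dbGb - includedGb);
  return planFee + extraGb * perExtraGb;
}

// Operation-based billing: every read and write is metered.
function operationBill({ readsPerDay, writesPerDay, perMillionReads, perMillionWrites, days = 30 }) {
  const reads = (readsPerDay * days) / 1e6;   // millions of reads per month
  const writes = (writesPerDay * days) / 1e6; // millions of writes per month
  return reads * perMillionReads + writes * perMillionWrites;
}

// Hypothetical rates, used only to compare a quiet app with a busy one.
const rates = { perMillionReads: 1, perMillionWrites: 2 };
const quiet = operationBill({ readsPerDay: 1e6, writesPerDay: 1e5, ...rates });
const busy = operationBill({ readsPerDay: 1e7, writesPerDay: 1e6, ...rates });

console.log(busy / quiet); // 10: ten times the activity, ten times the bill
console.log(resourceBill({ planFee: 25, includedGb: 8, dbGb: 5, perExtraGb: 1 })); // 25: under the included size, only the plan fee
```

The operation-based bill tracks user activity directly, while the resource-based bill stays at the plan fee until stored data outgrows what the plan includes.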
The 10,000-User Scenario

| Workload | Supabase | Firebase |
| --- | --- | --- |
| 10K DAU, 10M reads/day, 1M writes/day, 5 GB DB | ~$50–$100/mo (Pro tier covers most of this) | ~$500–$1,500/mo (per-op pricing dominates) |
| 1K DAU, 1M reads/day, 100K writes/day, 1 GB DB | $25/mo Pro | ~$60–$150/mo |
| Side project, low traffic | $0 (Free tier) | $0 (Spark plan) |
| Enterprise-scale SaaS | $599/mo Team + usage | $5,000–$50,000+/mo |

At low volumes, the difference is annoying but manageable. At anything resembling product-market fit, the math gets brutal for Firebase.
Authentication Code: Side-by-Side

Both platforms have great auth. The code looks similar on the surface. Here's what signing up a user looks like in 2026:
Supabase Auth (JavaScript):

```javascript
// Assumes `supabase` was created with createClient() from @supabase/supabase-js
const { data, error } = await supabase.auth.signUp({
  email: 'user@example.com',
  password: 'secure-password',
  options: {
    data: { first_name: 'Justin' }
  }
})
```

Firebase Auth (JavaScript):

```javascript
import { createUserWithEmailAndPassword, updateProfile } from 'firebase/auth'

// Assumes `auth` was created with getAuth() from firebase/auth
const userCredential = await createUserWithEmailAndPassword(
  auth,
  'user@example.com',
  'secure-password'
)
await updateProfile(userCredential.user, { displayName: 'Justin' })
```

The Supabase version has one fewer call and integrates the user metadata into the signup itself. Both work fine. The real difference shows up when you want to enforce access rules.

Supabase Row Level Security (SQL):

```sql
CREATE POLICY "Users can only see their own bookings"
ON bookings FOR SELECT
USING (auth.uid() = user_id);
```

Firebase Security Rules:

```
match /bookings/{bookingId} {
  allow read: if request.auth.uid == resource.data.user_id;
}
```

Both work. The Supabase version is just SQL; anyone who knows Postgres can read it. The Firebase version uses a domain-specific rules language that only exists in Firebase. Neither is harder; one is more portable.
Feature-by-Feature Comparison

| Feature | Supabase | Firebase |
| --- | --- | --- |
| Database | Postgres (SQL, ACID, joins) | Firestore (NoSQL, document) + Data Connect (managed Postgres) |
| Auth | Built-in, RLS-integrated, social providers | Mature, deep mobile SDK, social providers |
| Real-time | Postgres logical replication, good for web | Best-in-class, especially for mobile + offline |
| Storage | S3-compatible API, integrated with RLS | Cloud Storage, integrated with rules |
| Functions | Edge Functions (Deno runtime) | Cloud Functions (Node, Python, more) + Genkit |
| AI / Vector | pgvector in Postgres (native) | Vertex AI integration + Gemini (first-party) |
| Pricing model | Resources (predictable) | Per operation (unpredictable at scale) |
| Open source | Yes (self-host as escape hatch) | No (Google-proprietary) |
| Mobile SDK maturity | Functional, improving | Industry-leading, especially iOS/Android |
| Offline-first sync | Limited | Best on market |
| Compliance (SOC 2) | Team plan ($599/mo) and self-host | Available with Google Cloud commitments |

When Firebase Still Wins

Firebase isn't dying; it's still the right choice in specific situations. Don't pick Supabase if any of these apply:
1. Mobile-first with offline-critical features. If you're building a delivery app that has to work in tunnels, a field-service app that runs without WiFi, or anything where the user must keep using the app offline and have sync conflicts resolved on reconnect, Firestore's offline support is genuinely better. Supabase Realtime is good for "live updates while online"; Firestore is built for "works fine without internet."
2. You're all-in on Google Cloud + Gemini. If your AI stack is Vertex AI, Genkit, and Gemini, Firebase is the natural fit. First-party integration, shared billing, IAM that already works.
3. Hyper-fast prototyping in a Google-first codebase. Firebase's "add the SDK, start writing" flow is still slightly faster than Supabase's "set up a project, configure RLS" flow for the first 10 minutes of a project.
4. Your team has deep Firebase muscle memory. If your engineers know Firestore inside out, retraining them on Postgres is a real cost. Sometimes the right answer is "use what your team is fast at."
When Supabase Is the Clear Winner

For most modern web apps in 2026, Supabase is the better default. Pick Supabase if:
• Your data is relational: invoices linked to customers linked to companies, the normal SaaS shape. SQL fits this naturally; NoSQL fights it.
• You care about cost at scale, and you should, because the gap is 3–5x and growing as you grow.
• You want an exit door: open-source Postgres means you can self-host, switch providers, or migrate to a managed RDBMS later without rewriting your data layer.
• You're building AI features: pgvector lets you store embeddings and run similarity searches in the same database as your relational data. No separate vector DB to sync.
• You want SQL: joins, transactions, window functions, common table expressions. NoSQL workarounds for these are painful.
• You're working with vibe coding tools: Lovable, Bolt, and Cursor all default to Supabase. Picking Firebase means fighting your tooling.
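
For readers new to embeddings, the "similarity search" mentioned in the AI bullet above boils down to ranking stored vectors by how close they are to a query vector. The sketch below shows that ranking in plain JavaScript with cosine similarity. It is an in-memory illustration only; in Supabase the same ranking would be expressed in SQL against the vector extension, and the function names `cosineSimilarity` and `nearest` are hypothetical.

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored rows by similarity to a query embedding, highest first.
function nearest(rows, query, limit) {
  return rows
    .map((row) => ({ ...row, score: cosineSimilarity(row.embedding, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

// Toy two-dimensional embeddings; real ones have hundreds of dimensions.
const rows = [
  { id: 1, embedding: [1, 0] },
  { id: 2, embedding: [0, 1] },
  { id: 3, embedding: [1, 1] },
];
console.log(nearest(rows, [1, 0], 2).map((r) => r.id)); // [ 1, 3 ]
```

Running this inside the database instead of in application code is what removes the need to keep a separate vector store in sync with your relational rows.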
The Migration Story (Firebase → Supabase)

If you're already on Firebase and the bills are starting to hurt: yes, you can migrate. It's not trivial, but it's done all the time. The realistic path:
1. Map your Firestore collections to Postgres tables. Most document structures translate cleanly; the awkward parts are nested arrays and dynamically-shaped documents.
2. Run both in parallel during the cutover. Dual-write to Firebase and Supabase for 1–2 weeks while you backfill old data and validate the migration.
3. Migrate auth carefully. Firebase Auth has tools to export users; Supabase Auth has tools to import them. Plan for users to reset passwords if needed.
4. Convert Security Rules to RLS policies. This is usually the most time-consuming step but also the cleanest: RLS is more powerful and more readable than Firebase's rules language.
5. Replace SDK calls. Find/replace, basically. Same operations, different methods.
Typical migration timeline for a moderately complex app: 4–8 weeks of focused work. The bill savings often pay it back within 60 days.
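
The first migration step, mapping documents to tables, is where nested arrays cause the most friction. One common approach is to split each document into a parent row plus one child row per array element. The sketch below assumes a hypothetical `bookings` document with a `guests` array of objects; the function name `documentToRows` and the column names `parent_id` and `position` are illustrative, not part of either SDK.

```javascript
// Split one Firestore-style document into a parent row and child rows.
// The named array field becomes rows for a child table keyed by the parent id.
// Assumes every array element is a plain object.
function documentToRows(id, doc, arrayField) {
  const { [arrayField]: items = [], ...scalars } = doc;
  const parent = { id, ...scalars };
  const children = items.map((item, position) => ({
    parent_id: id,
    position, // preserves the original array order
    ...item,
  }));
  return { parent, children };
}

// Hypothetical document shape, for illustration only.
const booking = {
  user_id: 'u1',
  status: 'confirmed',
  guests: [{ name: 'Ana' }, { name: 'Ben' }],
};

const { parent, children } = documentToRows('b42', booking, 'guests');
console.log(parent);   // { id: 'b42', user_id: 'u1', status: 'confirmed' }
console.log(children); // two rows, each carrying parent_id 'b42'
```

Dynamically-shaped documents need a different treatment (for example a JSON column), which this sketch does not attempt.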
The Bottom Line

If you're starting a new web app in 2026 and you don't have a specific reason to pick Firebase, pick Supabase. It's cheaper, more portable, has SQL, integrates better with the rest of the modern AI tool stack, and gives you an exit door.
If you're on Firebase and the bill is starting to hurt: migration is real, it's been done thousands of times, and the math usually works out in under 60 days. Book a free strategy call if you want a second opinion before you commit. I do this exact migration analysis with founders constantly.
Want the broader landscape? See the best vibe coding tools 2026 for which builders integrate cleanly with each backend, or what is vibe coding for the bigger picture.
### Frequently Asked Questions
**Q: Is Supabase cheaper than Firebase?**
A: Almost always. For an app with 10,000 daily active users and 10 million reads per day, Supabase typically costs $50–$100 per month, while Firebase costs $500–$1,500 per month for the same workload — a 3–5x multiplier. The reason: Supabase charges for resources (database size, MAUs, bandwidth) while Firebase charges per operation (read, write, function invocation). Operation-based pricing punishes successful apps. Resource-based pricing scales more predictably.
**Q: Is Supabase better than Firebase in 2026?**
A: For most modern web applications in 2026: yes. Supabase gives you SQL, Row Level Security, pgvector for AI features, open-source self-hosting as an escape hatch, and 3–5x lower bills at scale. Firebase still wins for mobile-first apps with sophisticated offline sync, ultra-fast prototyping (especially the new Data Connect product), and first-party Google Gemini integration.
**Q: Why are people migrating from Firebase to Supabase?**
A: Two reasons. (1) Cost: Firebase's per-operation pricing makes the bill climb in step with user activity, so costs rise steeply as apps grow; teams routinely report bills going from $50/month to $5,000/month after a successful launch. (2) Vendor lock-in: Firestore is Google-proprietary; migrating off it requires rewriting your entire data layer. Supabase is just Postgres: if you ever leave, you take a standard SQL database with you.
**Q: Can Supabase replace Firebase for real-time apps?**
A: For most real-time needs (live notifications, chat, presence, collaborative editing) Supabase Realtime works well — it uses PostgreSQL's logical replication to push database changes to clients. Where Firebase still wins: complex offline-first apps where the client needs to sync without a connection and resolve conflicts on reconnect. Firebase's offline support is more mature; Supabase's is functional but newer.
**Q: What is the actual Supabase free tier limit?**
A: As of 2026, Supabase Free includes: 500 MB database, 1 GB file storage, 2 GB bandwidth, 50,000 monthly active users, and unlimited API requests. You can run two free projects, and each pauses after one week of inactivity. The free tier is genuinely usable for production side projects; Firebase's Spark plan has tighter operational limits that hit production traffic faster.
**Q: How much does Supabase Pro cost?**
A: Supabase Pro is $25/month and includes: 8 GB database, 100 GB file storage, 250 GB bandwidth, 100,000 monthly active users, daily backups, and email support. Most production apps stay on Pro until they hit serious scale. The Team plan adds SOC 2 compliance and is $599/month. Self-hosting is $0 plus your infrastructure costs.
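
The Free and Pro limits quoted in the two answers above can be turned into a quick tier check. This is a rough sketch that hard-codes those quoted figures (with 500 MB written as 0.5 GB); it ignores paid overages and the Team plan, and the names `plans` and `smallestPlanFor` are hypothetical.

```javascript
// Plan limits as quoted above (GB for sizes, a count for monthly active users).
const plans = [
  { name: 'Free', price: 0, dbGb: 0.5, storageGb: 1, bandwidthGb: 2, maus: 50000 },
  { name: 'Pro', price: 25, dbGb: 8, storageGb: 100, bandwidthGb: 250, maus: 100000 },
];

// Return the cheapest listed plan whose every limit covers the usage, or null.
// The usage object must supply all four keys.
function smallestPlanFor(usage) {
  const keys = ['dbGb', 'storageGb', 'bandwidthGb', 'maus'];
  const fits = (plan) => keys.every((key) => usage[key] <= plan[key]);
  return plans.find(fits) || null;
}

const sideProject = { dbGb: 0.3, storageGb: 0.5, bandwidthGb: 1, maus: 2000 };
const growingApp = { dbGb: 5, storageGb: 20, bandwidthGb: 50, maus: 10000 };

console.log(smallestPlanFor(sideProject).name); // Free
console.log(smallestPlanFor(growingApp).name);  // Pro
```

A `null` result means usage exceeds the included Pro limits on at least one dimension, which is the point to look at usage-based charges or a larger plan.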
**Q: Does Firebase have SQL?**
A: Not really. Firestore is a NoSQL document database — you query by document paths and field equality, not joins or aggregations. Firebase added Data Connect in 2025 as a managed PostgreSQL layer, but it's a separate product with its own pricing and learning curve. If you want SQL from the start, Supabase is the cleaner answer because Postgres IS the database, not an add-on.
**Q: Is Supabase production-ready in 2026?**
A: Yes. As of 2026, Supabase powers thousands of production applications including funded SaaS companies handling millions of users. The platform passed SOC 2 Type II, has a real status page, daily backups on Pro, and Point-in-Time Recovery on Team plans. The main legitimate production concern is Realtime at very high concurrency (>100K connections) — Firebase scales further on that one specific dimension.
**Q: Can I use both Firebase and Supabase together?**
A: Yes, and some teams do. A common hybrid: Firebase Auth for the mobile SDK + Supabase as the primary database. Or Firebase Cloud Messaging for push notifications + Supabase for everything else. The downside is operational complexity — you're managing two platforms, two billing portals, and two failure modes. For most teams, picking one is cleaner.
**Q: Should I use Supabase or Firebase for AI features?**
A: Supabase, for most cases. The pgvector extension lets you store embeddings directly in Postgres and run similarity searches in SQL — no separate vector database, no syncing problems. Firebase has Genkit and first-party Gemini access, which is the better path if you're all-in on Google's AI stack. For the rest of the AI ecosystem (OpenAI, Anthropic, open models), Supabase + pgvector is the cleaner architecture.
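To make "similarity search" concrete, here is a minimal pure-Python sketch of the ranking that an embedding search performs: order stored rows by cosine distance to a query embedding and keep the closest few. In Postgres with the pgvector extension, the same ordering is written in SQL using its cosine distance operator (`<=>`). The document ids and three-dimensional vectors below are hypothetical toy data, not real embeddings, and the helper names are made up for illustration.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 minus the cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, rows, k=2):
    """Return the ids of the k rows whose embeddings are closest to query."""
    ranked = sorted(rows, key=lambda row: cosine_distance(query, row[1]))
    return [row_id for row_id, _ in ranked[:k]]


# Hypothetical toy data: (id, embedding) pairs standing in for table rows.
documents = [
    ("pricing-faq", [0.9, 0.1, 0.0]),
    ("refund-policy", [0.8, 0.2, 0.1]),
    ("team-bios", [0.0, 0.1, 0.9]),
]

query_embedding = [1.0, 0.0, 0.0]
top_matches = nearest(query_embedding, documents, k=2)
print(top_matches)
```

The point of keeping embeddings in the same database as the rest of the data is that this ranking can be combined with ordinary filters and joins in one query, rather than being synced to a separate vector store.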
---
## Base44 Review 2026: Honest Verdict After Wix Acquisition
- **URL:** https://justinmckelvey.com/blog/base44-review
- **Published:** May 31, 2026
- **Updated:** May 31, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 8 min
- **Description:** Base44 review 2026: Wix's $80M AI app builder. What it ships, where it breaks, pricing ($16-$80/mo), and the alternatives if you outgrow it.
Quick Answer (Honest Verdict)
Base44 is legit — Wix bought it for $80M in June 2025, six months after launch, and it now has 2M+ users. It's the fastest way I've seen to ship a full-stack MVP from a single prompt because frontend, backend, database, auth, and hosting are all included (no Supabase, no Vercel bills). It works well for: internal tools, dashboards, hackathon demos, and idea validation. It breaks on: complex business logic, advanced design control, and migrating data out (vendor lock-in is real). Free tier exists (25 credits) but burns out fast; paid starts at $16/mo annual.
Based on 8+ inherited Base44 apps (rescue/maintenance work) + public review research · June 2026 · Author: Justin McKelvey, fractional CTO, 15+ years shipping products
Key Stats (June 2026)
• Acquired: Wix bought Base44 in June 2025 for ~$80M (6 months after launch)
• Scale: 2M+ users, $100M ARR by Q1 2026
• Free tier: 25 message credits, 100 integration credits, unlimited apps
• Paid pricing: Starter $16/mo (annual), Builder $40/mo, Pro $80/mo
• Default AI model: Claude Sonnet 4 (Opus 4.5, Gemini 2.5/3 Pro, GPT-5 on Builder+)
• Post-Wix changes: Pricing up 15–30%, support response slowed days→weeks
• Major outage: Feb 3, 2026 multi-hour platform-wide outage, no SLA for non-enterprise
TL;DR: Base44 in 90 Seconds
Base44 is the AI app builder Wix paid $80 million for in June 2025, six months after launch. By Q1 2026 it had 2 million users and $100M ARR. It generates a complete full-stack web app from a text prompt — frontend, backend, database, authentication, and hosting all included in the platform. You describe what you want, it ships a deployed app, and you iterate by chatting with the AI.
I'm a fractional CTO who's currently maintaining 8+ Base44 apps that founders handed me when their builders ran out of capacity to keep iterating. So my view comes from the angle most reviews skip: what Base44 looks like after the initial "wow, this works" — when someone else has to take it over and make it production-ready. Combined with structured evaluation against the criteria I use for every vibe coding tool, here's the honest verdict.
What Base44 Actually Is
Base44 is an AI application generator. You type a description of what you want ("a chore scheduler for a family of 5 with recurring tasks and a points system"), pick a styling preference, and the AI generates a working full-stack app. Database schema, authentication flow, frontend UI, business logic, and deployment all happen automatically.
The big difference from Lovable and Bolt is the bundled infrastructure. Both Lovable and Bolt generate apps that use Supabase as the database, which you manage as a separate service. Base44 includes its own managed Postgres database, authentication system, file storage, and hosting — all on one bill. No external accounts, no API key shuffling.
As of 2026, Base44 defaults to Claude Sonnet 4 for code generation. On the Builder plan and above, you can switch the underlying model to Claude Opus 4.5, Claude Sonnet 4.5, Gemini 2.5 Pro, Gemini 3 Pro, or GPT-5. Native integrations include Stripe, Slack, Google Sheets/Drive, SendGrid/Twilio, and direct OpenAI/Anthropic calls for AI features in your app.
Base44 Pricing 2026 (Post-Wix)
As of June 2026, here's the pricing structure:

| Plan | Annual price | Monthly price | Message credits | Best for |
|---|---|---|---|---|
| Free Forever | $0 | $0 | 25 messages + 100 integration credits | Evaluating the tool |
| Starter | $16/mo | $20/mo | Higher daily allowance | Side projects, single MVP |
| Builder | $40/mo | $50/mo | Larger pool + model selection | Active building, multiple apps |
| Pro | $80/mo | $100/mo | Highest pool + priority support | Agencies, daily builders |

Is Base44 free? Technically yes — the Forever Free plan exists with unlimited apps and 25 message credits. Practically, no — 25 credits is enough for the first prompt and maybe two iterations. Anyone seriously evaluating Base44 will be on the $16/month Starter within hours.
The post-Wix pricing is also 15–30% higher per equivalent app footprint than pre-acquisition, according to long-time users. The free tier got more restrictive too. Worth knowing before you commit.
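The gap between annual and monthly billing is easy to work out from the plan prices listed above. The sketch below is a rough calculation only: it assumes the "annual" figure is a per-month rate charged for 12 months, and it ignores taxes, credit top-ups, and any promotional pricing.

```python
# Plan prices in USD per month, copied from the pricing figures above.
# Assumption: the "annual" rate is a per-month price billed for 12 months.
PLANS = {
    "Starter": {"annual": 16, "monthly": 20},
    "Builder": {"annual": 40, "monthly": 50},
    "Pro": {"annual": 80, "monthly": 100},
}


def yearly_cost(plan, billing):
    """Total cost of one year on a plan under the given billing mode."""
    return PLANS[plan][billing] * 12


def annual_savings(plan):
    """Dollars saved per year by choosing annual over monthly billing."""
    return yearly_cost(plan, "monthly") - yearly_cost(plan, "annual")


def discount_percent(plan):
    """Annual-billing discount relative to the monthly price, in percent."""
    prices = PLANS[plan]
    return round(100 * (prices["monthly"] - prices["annual"]) / prices["monthly"])


for name in PLANS:
    print(name, yearly_cost(name, "annual"), annual_savings(name), discount_percent(name))
```

On these numbers every paid tier carries the same 20% discount for annual billing, so the choice comes down to whether you expect to still be on the platform in a year.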
What Base44 Does Well
Zero-infrastructure MVPs. This is the killer feature. You don't sign up for Supabase, configure auth, buy a Vercel plan, or manage a database. The app exists at a URL within minutes. For founders validating an idea before talking to investors or customers, this is genuinely fast.
Same-day deployment. Most reviewers report shipping a working prototype on the first day. The "describe → preview → iterate" loop is tight. You don't need to learn the platform — you just need to know what you want.
Visual editor on top of code. After the AI generates the app, you can tweak colors, copy, and layout in a visual editor without re-prompting. This solves the common Lovable/Bolt pain point where every cosmetic change burns credits.
Lower total bill for small projects. No separate Supabase invoice, no Vercel Pro, no Auth0. For a single app at low traffic, $16/month on Base44 is genuinely cheaper than the comparable stack on Lovable.
Native integrations that work. Stripe checkout, Slack notifications, Google Sheets exports, SendGrid emails — all configured by the platform rather than requiring you to wire up API keys. This is where Base44 saves the most time for non-developers.
What Base44 Breaks On
The cons are real and they matter:
Credit burn is unpredictable. The AI sometimes gets stuck on a feature, retries multiple times, and burns through credits without making progress. Users routinely report blowing through monthly allowances faster than expected. The fix — designing the UI separately first and feeding precise prompts — defeats the purpose of using an AI builder.
Complex business logic is fragile. Anything beyond basic CRUD + simple conditional flows tends to break. Multi-step state machines, complex permissions, conditional pricing logic, or anything with intricate edge cases — Base44 will generate something that looks right but fails on edge cases you didn't anticipate.
Design control is limited. The visual editor is good for tweaks but not for fully custom design systems. If your brand requires a specific look (custom typography, intricate animations, unusual layouts), you'll hit walls. The styling system is constrained by what the AI templates support.
Vendor lock-in is real. This is the biggest one. The integrated database means your data lives in Base44's managed Postgres. Migrating off the platform is genuinely hard — there's no clean "export everything" button, and GitHub code export was still in beta on most plans as of mid-2026. If you build something successful, you're stuck on Base44 or facing a painful rebuild.
Sample data leakage. Reviewers consistently report that generated apps sometimes include placeholder data where real data should go. You'll demo your app to a customer and find "Sample Item 1" or "user@example.com" in unexpected places.
Post-Wix support degradation. Response times for non-enterprise tiers shifted from hours pre-acquisition to days or weeks post-acquisition. The Feb 3, 2026 multi-hour outage with no SLA was the canary — if your app is on Base44 and Base44 goes down, you wait.
Who Should Use Base44
Three audiences:
1. Non-developer founders validating an idea. If you have an idea, no engineering team, and 30 days to find out whether anyone cares — Base44 will let you ship something testable today. Paying a developer $5,000+ for an MVP that might fail validation is wasteful. Use Base44, get to "do people use this" fast, then re-platform if the answer is yes.
2. Operators building internal tools. Dashboards, admin panels, internal CRUD tools, team workflows — Base44 is great at these because they don't have to scale, they don't need stunning design, and the integrations (Slack, Sheets, Stripe) cover most internal-tool needs.
3. Hackathon and demo builders. If you have 48 hours to ship something that looks polished, Base44's combination of speed + visual editor + native integrations is hard to beat.
Who Should Avoid Base44
Most other people, honestly:
• Anyone with a real engineering team. If you have devs, give them Cursor or Windsurf and let them ship. They'll move faster than Base44 and own the code.
• Anyone planning to scale past a few thousand users. The lock-in becomes a serious problem once your data has value.
• Anyone with strict compliance requirements. HIPAA, SOC 2, custom data residency, audit logs — Base44's managed infrastructure isn't built for these.
• Anyone shipping a product with intricate business logic. Fintech, complex marketplace dynamics, ML-powered features — Base44 will fight you the whole way.
• Anyone who knows they'll want to switch platforms within 12 months. The migration pain isn't worth it. Use Lovable (Supabase-backed, easier to migrate) instead.
Base44 vs the Alternatives (Quick Reference)

| Tool | Includes DB + Auth? | Starting price | Best for | Migration friendly? |
|---|---|---|---|---|
| Base44 | Yes (managed) | $16/mo annual | Fastest MVP, internal tools | No |
| Lovable | External Supabase | $25/mo Pro | Production-track apps | Yes (standard stack) |
| Bolt | External Supabase | $20/mo Pro | Speed-to-demo | Yes (full code export) |
| Replit | Configurable | $25/mo Core | Learning + collaboration | Yes |

If you're trying to pick between these specifically, the dedicated comparisons go deeper:
• Lovable vs Cursor (the developer track)
• Replit vs Cursor (the prototyping track)
• Bolt vs Lovable (the speed-to-demo head-to-head)
The Wix Acquisition: What Changed
This part matters for anyone evaluating Base44 today vs. people who were using it pre-acquisition. Three concrete changes since June 2025:
1. Pricing trended up 15–30%. Same plan name, slightly worse credit-to-dollar ratio. The Free tier got more restrictive specifically — what used to be 100 free messages is now 25.
2. Support response slowed. Pre-acquisition, the Base44 team was responsive within hours. Post-acquisition, non-enterprise users now wait days or weeks. This is a normal corporate acquisition pattern, but it's a real change.
3. The Feb 3, 2026 outage exposed the SLA gap. A multi-hour platform-wide outage during business hours. No contractual SLA for non-enterprise tiers means apps built on Base44 take the downtime with no recourse. If your app is mission-critical, this is the kind of risk you should understand before committing.
The product itself is still shipping features fast. The platform isn't going anywhere — Wix backing means more legitimacy than a typical 2-year-old startup. But the operational risk profile is now higher, not lower, than pre-acquisition.
The Bottom Line
Base44 is the right tool for a narrow slice of builders: non-developers validating an idea in under 30 days, operators shipping internal tools, and hackathon builders chasing speed-to-demo. For that slice, it's faster than any alternative I've evaluated.
For everyone else — anyone with engineering resources, anyone planning to scale, anyone with compliance constraints, anyone who'll outgrow the platform within a year — there are better choices. Lovable for production-track apps. Cursor or Claude Code for actual development. The full vibe coding tools breakdown for the broader landscape.
If you've built something on Base44 that's working but you've hit the iteration ceiling — or you're trying to figure out whether to start there in the first place — book a free 15-min strategy call. I do this work daily: 8+ Base44 apps currently in maintenance, plenty more rescued or rebuilt. I'll give you a specific recommendation for your situation in 10 minutes. No pitch.
### Frequently Asked Questions
**Q: Is Base44 free?**
A: Base44 has a Forever Free plan that includes 25 message credits, 100 integration credits, and unlimited apps. It's enough to evaluate the tool but not enough to ship a real MVP — most users hit the credit ceiling within a few hours of serious building. Paid plans start at $16/month (annual billing) or $20/month.
**Q: Is Base44 legit?**
A: Yes. Base44 launched in late 2024 and was acquired by Wix in June 2025 for a reported $80 million, just six months after launch. By Q1 2026 the platform had over 2 million users and $100M ARR. The Wix backing means the platform isn't going anywhere — though some users have raised concerns about the post-acquisition changes (slower support, 15–30% higher effective pricing).
**Q: Who owns Base44?**
A: Wix acquired Base44 in June 2025. Base44 now operates under the Wix "Vibe Coding" product line, alongside other Wix AI website tools.
**Q: What does Base44 cost in 2026?**
A: As of 2026, Base44 pricing is: Free Forever (25 credits), Starter at $16/month annual ($20 monthly), Builder at $40/month annual ($50 monthly), and Pro at $80/month annual ($100 monthly). The Builder plan and above unlocks model selection (Claude Opus 4.5, Sonnet 4.5, Gemini 2.5/3 Pro, GPT-5). Free Forever has unlimited apps but the message credits run out fast.
**Q: What is Base44 actually good at?**
A: Base44 is best at shipping a working full-stack MVP in a single day with zero infrastructure setup. Frontend, backend, database, authentication, and hosting are all included in the platform — you don't connect Supabase or buy a Vercel plan separately. It works well for: internal admin tools, dashboards, hackathon demos, early-stage prototypes, and validating startup ideas before writing real code.
**Q: What does Base44 do badly?**
A: Three real limitations: (1) limited visual control over advanced design — sophisticated visual designs are hard to pull off, (2) unpredictable credit burn — the AI sometimes gets stuck and wastes credits on retries, (3) custom business logic complexity — it struggles with anything that needs intricate conditional flows or multi-step state machines. Also: the integrated database creates vendor lock-in. Migrating data out if you outgrow the platform is genuinely difficult.
**Q: Is Base44 better than Lovable?**
A: They're close — both ship full-stack apps from a single prompt at similar starting prices ($16–$20/month). The biggest difference: Base44 includes its own database and auth out of the box, while Lovable generates apps that use Supabase (which you have to manage separately). Base44 wins for the absolute fastest path from prompt to deployed app. Lovable wins if you want more standard tech (React + Supabase) that's easier to migrate or hand off to a real developer.
**Q: Did the Wix acquisition change Base44?**
A: Yes, in three measurable ways: (1) effective pricing trended 15–30% higher per equivalent app footprint, (2) support response times shifted from hours to days or weeks for non-enterprise users, and (3) a multi-hour platform-wide outage on February 3, 2026 highlighted the absence of any contractual SLA outside of enterprise plans. The core product still ships features fast — but the operational risk profile increased.
**Q: Can I migrate off Base44 if I outgrow it?**
A: It's hard. The integrated database is the main lock-in point — your app's data lives in Base44's managed Postgres, and there's no clean export-and-go path. GitHub code export exists but was still in beta as of mid-2026 on most plans. If you think you'll outgrow Base44 within 12 months, consider building on Lovable or Bolt instead — both use external Supabase by default, which is much easier to migrate from.
**Q: Should a non-developer use Base44 or hire someone?**
A: For validating an idea fast (under 30 days), use Base44 — paying a developer $5,000+ for an MVP you might throw away is wasteful. If the idea works and you have paying users, that's when you bring in a developer to either extend the Base44 app or rebuild on a more flexible stack. The mistake is staying on Base44 past month 3 of a working product — the iteration ceiling will start costing you customers.
---
## Small Business AI Consultant: When You Need One (2026)
- **URL:** https://justinmckelvey.com/blog/small-business-ai-consultant
- **Published:** May 23, 2026
- **Updated:** May 23, 2026
- **Category:** AI for Business
- **Reading time:** 6 min
- **Description:** Small business AI consultants in 2026: who needs one, who doesn't, pricing for SMBs, and the free alternatives most businesses should try first.
Quick Answer
Most small businesses (under $5M revenue) don't actually need an AI consultant. The strategy work is small enough that a focused founder can absorb it themselves in a weekend with the free AI Readiness Checklist. When consulting IS worth it: $5M+ revenue, multiple AI projects with no single owner, regulated industry needs, or specific implementation work you can't do in-house. SMB-focused AI consultants charge $3K-$25K per engagement — avoid anyone quoting $25K+ for small business work. The honest first step is the free 20-minute strategy call.
Reviewed May 2026 · Author: Justin McKelvey, fractional CTO, 50+ products shipped
TL;DR: Small Business AI Consultants in 2026
The most common small business AI mistake in 2026 is paying for consulting before earning it. Owners hear "we should be using AI," see big-firm proposals for $100K AI strategy engagements, and either commit budget they shouldn't or get scared off entirely. Both responses are wrong.
The reality: most $1M-$10M businesses don't need an AI consultant at all. At that scale, the strategy work is small enough that the founder can absorb it themselves in a focused weekend with the right free resources. When consulting IS the right move, the price point is dramatically lower than what enterprise consultants charge — $3K-$25K for a focused engagement, not $100K+.
This guide is the honest breakdown from a fractional CTO who serves $1M-$50M businesses every day. When to hire, when to skip, what to pay, and what to avoid.
When Small Businesses Actually Need a Consultant
Five signals indicate it's time:

| Signal | What it means |
|---|---|
| $300+/month in unused AI tools | Multiple subscriptions across the team, adoption under 30%. Consultant helps cut the noise and pick the one that matters. |
| Can name the workflow, can't pick the tool | You know what you want AI to do but don't know which tool, how to configure it, or how to integrate it. Solvable with one focused engagement. |
| AI output quality has been disappointing | You've tried AI and it didn't work well. You don't know if the issue is the model, the prompt, or the use case. Consultant can diagnose in hours. |
| Specific business need requires AI judgment | Legal/compliance review, data sensitivity, customer-facing AI features. Stakes are high enough that outside expertise is justified. |
| $5M+ revenue with AI becoming material | Past the scale where the founder can absorb all strategy thinking themselves. Outside structured thinking starts justifying its cost. |

If 2+ apply, consulting is worth it. If 0-1 apply, free resources are usually sufficient.
When to Skip the Consultant Entirely
Five signals you should NOT pay for consulting yet:
1. Your business is under $5M revenue, AND you can name three workflows where AI would change the outcome. You don't need outside strategy — you need to ship.
2. You haven't tried AI on a single workflow yet. Skip strategy. Pick one workflow. Run AI on it for seven days. Learn from real data. THEN consider consulting.
3. You have an internal AI champion — someone on the team who's already using Claude or ChatGPT daily and has opinions. Give them budget and time, not an outside consultant.
4. Your AI question is "should we use AI" — not "how do we use it better." If the question is binary at that level, the answer is "try it on something small for a week." No consultant needed.
5. You don't have $5K-$15K budget for the engagement. Don't stretch for AI consulting. Start with free resources, build budget through demonstrated ROI, then engage when the economics work.
AI Consulting Pricing for Small Businesses

| Engagement type | Typical price range | Duration | What you get |
|---|---|---|---|
| Free strategy call | $0 | 20-30 min | Gut-check on fit, free recommendations, no commitment |
| Free self-assessment | $0 | 5-15 min | AI Readiness score, tier, recommendation |
| Focused audit (1-2 weeks) | $3K-$10K | 1-2 weeks | Workflow audit + written recommendations |
| AI Readiness Assessment | $5K-$25K | 2-4 weeks | Full written roadmap with 30-60-90 day plan |
| Implementation sprint | $10K-$30K | 4-8 weeks | Ships one specific AI workflow end-to-end |
| Fractional CTO retainer | $5K-$10K/month | 6-18 months | Ongoing senior leadership, AI + broader tech |
| (For comparison) Enterprise firm | $100K-$500K | 3-6 months | Heavy methodology, junior team doing work, brand cover |

For most $1M-$10M businesses, the right entry point is the free strategy call + free self-assessment. From there, if the ROI is clear, scale into the $5K-$25K range. Don't commit to $25K+ engagements before you've seen results from a smaller one.
What to Look for in a Small Business AI Consultant
Three filters:
1. SMB experience specifically. Enterprise AI consultants from Fortune 500 backgrounds often struggle with small business constraints — limited engineering team, no AI infrastructure, real budget caps, founder-direct decision-making, no internal data team. Their playbooks were built for environments you don't have. Look for consultants with case studies at companies your size.
2. Shipped AI products end-to-end. Ask: "Walk me through three AI products you've personally built." If they can't, they're a strategy-only consultant. Their roadmap will be optimistic in the wrong places because they've never had to debug the messy implementation reality.
3. Fixed-fee scoped engagements. SMB AI consulting should be productized — defined scope, fixed price, specific deliverable. If the consultant only quotes hourly with no cap, walk away. Open-ended hourly billing is how $10K engagements become $50K.
Red Flags for Small Business AI Consulting
Specific warnings to watch for:
• "AI transformation" framing without specific workflows. Small businesses don't need transformation. They need one or two workflows shipped well.
• 6-month strategy phases before any implementation. By the time the strategy lands, the AI landscape has shifted.
• Quotes over $25K for a Readiness Assessment. At SMB scale, that's a sign of enterprise pricing for SMB work.
• "Discovery phase" billing before scope is defined. The upsell vector.
• Refusal to do implementation work. SMBs need consultants who can also ship, not just strategize.
• Brand-name partner on the contract, junior staff doing the work. Common at firms that took on SMB clients to fill capacity. You're paying senior rates for associate output.
• Inability to name specific tools and integrations. "Evaluate the LLM landscape" is not a recommendation. "Use Claude with a custom Project, integrated via Zapier" is.
Free Resources to Try First
Before paying anyone:
1. Take the free 30-question AI Readiness Checklist. 5 minutes. Gives you a score across data, workflows, team, governance, execution. Tells you which tier you're in and what to focus on. Available here.
2. Book a free 20-minute strategy call. No pitch, just a gut-check on whether you actually need outside help. Many calls end with "you don't need to hire anyone — here's the one thing to try first." Book here.
3. Read the AI Discoverability Checklist. Free 12-point self-audit of how visible your business is to AI tools. Available here.
4. Subscribe to the free weekly newsletter. Tactical AI workflows, one per week, no fluff. Sign up here.
Most $1M-$5M businesses can get 80% of the value of AI consulting from these free resources. The paid 20% is real but optional.
The Engagement I Offer (Transparent Pricing)
I work with $1M-$50M businesses and the SMB segment specifically:
• Free 20-minute strategy call — gut-check on fit, no pitch. Book here.
• AI Readiness Assessment — fixed-fee 2-week engagement producing a written roadmap. Pricing discussed on the strategy call. Details.
• Implementation sprint — project-based, ships one AI workflow end-to-end. 4-8 weeks.
• Fractional CTO retainer — ongoing 8-15 hours/week, $5K-$15K/month. Best for businesses going deep on AI.
I don't do hourly billing, 6-month strategy phases, or slide-deck deliverables. If those are what you need, an enterprise consulting firm is the right fit and I'm not.
The Cluster: Going Deeper
• Free AI Readiness Checklist — Start here.
• What is AI Readiness? — The framework explained.
• What is an AI Readiness Assessment? — Deep dive on the paid engagement.
• AI Consultant: What They Do, Cost, How to Hire — Broader landscape.
• AI Consulting Firm vs Solo Consultant — Which to hire.
• Fractional Chief AI Officer — Senior AI hire alternative.
Working with a Fractional CTO
If you're a $1M-$10M business considering AI consulting:
1. Start free — take the AI Readiness Checklist and book a free 20-minute strategy call. Both cost nothing and give you a clear next step.
2. Don't pay $25K+ for SMB AI work — at that price you're paying enterprise overhead. Solo specialists and fractional CTOs serving SMBs offer better quality at 20-40% of that cost.
3. Productized engagements only — fixed-fee, defined scope, written deliverable. Hourly billing is the trap.
Full engagement options on the Work With Me page.
### Frequently Asked Questions
**Q: Do small businesses need an AI consultant?**
A: Most don't, especially under $5M revenue. At that scale, the strategy work is small enough that a focused founder can absorb it themselves in a weekend with the free AI Readiness Checklist plus this article. Where consultants become worth it: $5M+ revenue, multiple AI projects with no single owner, regulated industry requiring governance expertise, or specific implementation work you can't do in-house. Below those thresholds, $5K-$25K of consulting budget produces better returns going directly into tool subscriptions and team training.
**Q: How much does a small business AI consultant cost?**
A: Solo AI consultants serving small businesses typically charge $150-$300/hour or $3K-$15K for scoped engagements. Fractional CTOs with AI experience charge $5K-$10K/month retainers for ongoing work. Free 20-minute strategy calls and self-assessment tools (like the AI Readiness Checklist) are also widely available. Avoid: anyone charging $25K+ for small business AI work — at that price you're paying for big-firm overhead you don't need. Avoid: open-ended hourly billing without a cap.
**Q: What does an AI consultant do for a small business?**
A: Five typical engagements: (1) AI Readiness Assessment — 2 weeks, $5K-$15K, produces a written roadmap with workflow audit and 30-60-90 day plan. (2) Implementation sprint — 4-8 weeks, $10K-$30K, ships one specific AI workflow end-to-end. (3) Fractional CTO retainer — ongoing $5K-$10K/month for embedded senior leadership. (4) Team training — 2-day workshop, $3K-$10K, teaches your team to use AI effectively. (5) Strategy call — 20-60 minutes, often free, gut-checks whether you need anything formal. For most small businesses, #1 or #5 is the right starting point.
**Q: When should a small business hire an AI consultant?**
A: Five signals it's time: (1) You're paying for $300+/month in AI tools across the team and adoption is under 30%. (2) You can name a workflow you want AI to fix but don't know which tool or how to integrate it. (3) Your team has tried AI for a project and the output quality wasn't good enough — you don't know if it's the AI, the prompt, or the use case. (4) A specific business need requires AI judgment (legal/compliance review, data sensitivity, customer-facing AI features). (5) You've grown past $5M revenue and AI is becoming meaningful enough to deserve outside expertise. If 2+ apply, consulting is worth it. If 0-1 apply, free resources are usually enough.
**Q: What should a small business look for in an AI consultant?**
A: Three filters: (1) SMB experience — they've worked with $1M-$10M businesses, not just enterprises. Enterprise consultants often miss the constraints (limited team, real budget caps, founder-direct decisions) that define small business reality. (2) Shipped AI products — they've personally built and shipped AI products, not just consulted on them. (3) Fixed-fee scoped engagements — they offer productized assessments and projects, not open-ended hourly work. Watch for red flags: 'AI transformation' framing, 6-month strategy phases before execution, opacity in pricing.
**Q: Can a small business afford an AI consultant?**
A: Yes — small business AI consulting is meaningfully cheaper than enterprise AI consulting. Solo specialists serving SMBs offer engagements as small as $3K-$5K for a focused audit. Free 20-minute strategy calls give you a starting point at zero cost. Self-assessment tools (free AI Readiness Checklist) give you a tier and recommendation in 5 minutes. The honest framing for a $1M-$5M business: budget $3K-$15K for the first engagement, see if the ROI justifies more. Don't commit to $25K+ engagements before seeing what a small one produces.
**Q: What's the difference between a small business AI consultant and an enterprise AI consultant?**
A: Three key differences. (1) Pricing — SMB consultants charge $3K-$25K per engagement; enterprise consultants charge $100K-$500K+. (2) Scope — SMB engagements focus on one or two workflows with fast implementation; enterprise engagements span multiple business units over months. (3) Approach — SMB consultants do hands-on implementation work themselves; enterprise consultants hand off to your internal team or to a separate implementation engagement. Don't hire enterprise consultants for SMB work — the cost is wrong and the methodology doesn't fit the constraints.
**Q: Should a small business hire a fractional CTO instead of an AI consultant?**
A: Often yes. For ongoing AI work (not a one-time engagement), a fractional CTO with AI experience usually beats an AI consultant. The fractional CTO covers AI plus broader technology leadership — engineering decisions, security, vendor management, team development — at similar pricing ($5K-$10K/month retainer). AI consultants do one-time engagements; fractional CTOs build relationships that compound. For most $1M-$10M businesses, the right path is: start with a one-time AI Readiness Assessment, then engage a fractional CTO if you decide to invest meaningfully in AI.
---
## What is an AI Readiness Assessment? (2026 Guide)
- **URL:** https://justinmckelvey.com/blog/what-is-an-ai-readiness-assessment
- **Published:** May 23, 2026
- **Updated:** May 23, 2026
- **Category:** AI for Business
- **Reading time:** 6 min
- **Description:** AI Readiness Assessment explained: what it covers, what you get, $5K-$25K typical cost, and when to run one (vs the free checklist).
Quick Answer
An AI Readiness Assessment is a 2-4 week structured engagement that evaluates your business across five readiness dimensions and produces a written roadmap. Solo practitioners charge $5K-$25K; big firms charge $100K-$500K for the same deliverable wrapped in more slides. The output is a 15-25 page document covering workflow audit, prioritized opportunities, specific tool recommendations, governance guidance, and a 30-60-90 day implementation plan. Start with the free 30-question checklist first — most businesses don't need the paid version.
Reviewed May 2026 · Author: Justin McKelvey, fractional CTO, 50+ products shipped
TL;DR: AI Readiness Assessment Explained
An AI Readiness Assessment is the productized version of AI strategy consulting. Instead of an open-ended hourly engagement with a flexible scope, you get a defined 2-4 week project, fixed price, and a specific written deliverable. The deliverable measures your business across five dimensions (data, workflows, team, governance, execution) and produces a roadmap with specific workflows, ranked opportunities, tool picks, and a 30-60-90 day implementation plan.
This guide covers what one actually delivers, what it costs, who runs them, and when you should skip the paid version and just take the free self-assessment instead. From a fractional CTO who's both delivered them and watched clients waste money on the wrong ones.
What's Actually in the Deliverable
A complete AI Readiness Assessment delivers a 15-25 page written document — not a slide deck. Six core sections:

| Section | Length | What's in it |
| --- | --- | --- |
| Executive Summary | 1 page | Your readiness tier, top 3 recommendations, key risks |
| Current State Analysis | 3-5 pages | What you're doing well, where you're stuck, scored across 5 dimensions |
| Opportunity Map | 3-5 pages | AI use cases ranked by ROI (impact × feasibility) |
| Tool Recommendations | 2-3 pages | Specific products to use — Claude, OpenAI, integration tools, plus rationale |
| Governance Section | 2-3 pages | Data handling rules, output review processes, vendor risk, regulatory considerations |
| Implementation Plan | 3-5 pages | 30-60-90 day plan with named workflows, owners, ship dates |

Big-firm assessments often substitute slide decks for written documents. Avoid. A 60-slide deck communicates less actionable content than a 15-page written report, and signals the consultant optimized for the meeting room over the actual work.
How an AI Readiness Assessment Works (Typical 2-Week Engagement)
The standard productized assessment runs on a tight timeline:
Week 1 — Discovery
• Days 1-2: Kickoff call (60-90 min). Scope confirmation, access setup, intro to key stakeholders.
• Days 3-5: Team interviews. 4-8 conversations with department leads about their workflows, pain points, existing AI usage.
• Days 6-7: Workflow audit. The consultant maps every repeatable workflow that touches data or decision-making.
Week 2 — Synthesis
• Days 8-9: Opportunity ranking. Each workflow scored on impact × feasibility. ROI estimated where possible.
• Days 10-11: Tool selection and governance drafting. Specific product recommendations, data handling rules, review processes.
• Days 12-13: Roadmap synthesis. 30-60-90 day implementation plan with named owners.
• Day 14: Walkthrough call. 1-hour debrief of the deliverable, Q&A, next steps.
Big-firm engagements expand this to 6-12 weeks. The actual work doesn't take longer — the extra time is stakeholder management, slide production, and internal review cycles that don't add value to the deliverable.
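The opportunity ranking on days 8-9 can be sketched in a few lines. This is a minimal illustration, not the scoring sheet used in a real engagement: the 1-5 scales, the tie-break rule, and the example workflows are all assumptions made for the sketch. Only the impact × feasibility rule comes from the process above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    name: str
    impact: int        # assumed 1-5 scale: how much the outcome improves
    feasibility: int   # assumed 1-5 scale: how easy it is to ship


def roi_score(workflow: Workflow) -> int:
    """Impact x feasibility, the ranking rule described above."""
    return workflow.impact * workflow.feasibility


def rank_opportunities(workflows: list[Workflow]) -> list[Workflow]:
    """Return workflows sorted from highest to lowest score.

    Ties break on higher feasibility, then on name (an assumption),
    so the order is deterministic.
    """
    return sorted(
        workflows,
        key=lambda w: (-roi_score(w), -w.feasibility, w.name),
    )


# Hypothetical example workflows, not taken from a real engagement.
candidates = [
    Workflow("Inbound lead triage", impact=4, feasibility=5),
    Workflow("Monthly board report draft", impact=5, feasibility=2),
    Workflow("Support ticket tagging", impact=3, feasibility=5),
]

for rank, wf in enumerate(rank_opportunities(candidates), start=1):
    print(f"{rank}. {wf.name}: {roi_score(wf)}")
```

Multiplying rather than adding the two scores means a high-impact workflow that is hard to ship drops below a moderate one that can go live quickly.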
How Much Does an AI Readiness Assessment Cost?

| Provider type | Price range | Duration | Quality signal |
| --- | --- | --- | --- |
| Solo AI specialist / Fractional CTO | $5K-$25K | 2-4 weeks | Often highest — practitioner doing the work |
| Boutique AI consulting firm | $25K-$75K | 4-6 weeks | Good for mid-market with complex stakeholder maps |
| Big consulting firm (Deloitte, Accenture, etc.) | $100K-$500K | 8-16 weeks | Brand cover; quality varies by team |
| Free self-assessment | $0 | 5 minutes | Sufficient for most Curious and Equipped tier businesses |

For most $1M-$50M businesses, the solo specialist option is the right fit. Big-firm assessments rarely justify their premium unless you specifically need brand cover for a board presentation, you're in a regulated industry requiring firm sign-off, or you have $100K+ of consulting budget that has to be spent.
When to Run an AI Readiness Assessment (vs Skip It)
The honest read:
Run the paid assessment if:
• The free checklist shows you're "Equipped" or "Operating" tier (top half of the readiness range)
• You have $5K-$25K of budget for outside expertise
• You can dedicate 2 weeks of team availability for interviews and workflow review
• You'll actually act on the resulting roadmap (not file it away)
• The decisions on the table justify outside structured thinking
Skip the paid assessment if:
• The free checklist shows you're "Curious" tier — focus on shipping one workflow first
• You can name three specific workflows where AI would change the outcome (you don't need a roadmap to find them)
• You have a strong internal AI champion who can absorb the strategy thinking themselves
• Your business is small enough (under $5M revenue) that one focused weekend with the free checklist + this article is enough
Most businesses I talk to don't actually need a paid assessment — they need to ship one workflow first and run the assessment after they have real data to feed into it.
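The two lists above can be written down as a simple check. This is a rough sketch under stated assumptions: the field names are hypothetical, and the rule that any single skip reason outweighs the run conditions is my own simplification, since the lists do not say how to weigh them against each other.

```python
from dataclasses import dataclass

RUN_TIERS = {"Equipped", "Operating"}    # tiers named in the "run" list
MIN_BUDGET_USD = 5_000                   # low end of the $5K-$25K range
SMALL_BUSINESS_REVENUE_USD = 5_000_000   # the "under $5M revenue" threshold


@dataclass(frozen=True)
class Situation:
    tier: str                          # result of the free checklist
    budget_usd: int                    # budget for outside expertise
    team_weeks_available: int          # weeks the team can give to interviews
    will_act_on_roadmap: bool
    stakes_justify_outside_help: bool  # subjective judgment call
    named_workflows: int               # workflows you can already name
    has_internal_champion: bool
    annual_revenue_usd: int


def skip_reasons(s: Situation) -> list[str]:
    """Collect every item from the "skip" list that applies."""
    reasons = []
    if s.tier == "Curious":
        reasons.append("Curious tier: ship one workflow first")
    if s.named_workflows >= 3:
        reasons.append("can already name three workflows")
    if s.has_internal_champion:
        reasons.append("internal AI champion can own the strategy")
    if s.annual_revenue_usd < SMALL_BUSINESS_REVENUE_USD:
        reasons.append("under $5M revenue: free checklist is enough")
    return reasons


def meets_run_conditions(s: Situation) -> bool:
    """True only when every item from the "run" list holds."""
    return (
        s.tier in RUN_TIERS
        and s.budget_usd >= MIN_BUDGET_USD
        and s.team_weeks_available >= 2
        and s.will_act_on_roadmap
        and s.stakes_justify_outside_help
    )


def recommend(s: Situation) -> str:
    # Assumption: one skip reason is enough to hold off on the paid version.
    if skip_reasons(s) or not meets_run_conditions(s):
        return "skip"
    return "run"
```

The strict default matches the advice in this section: when in doubt, ship one workflow before paying for a roadmap.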
What Makes a Good AI Readiness Assessment Provider
Three filters that matter:
1. They've personally shipped AI products end-to-end. Assessments produced by consultants who've never built AI miss the implementation reality. Ask: "Walk me through three AI products you've personally built. What almost killed each one?" If they can't, they're a strategy-only consultant — their roadmap will be optimistic in the wrong places.
2. They produce written deliverables, not slide decks. 60-slide decks signal the consultant optimized for presentation theater, not actionable content. 15-25 page written documents force specificity.
3. They include specific tool recommendations. "Use Claude Pro with a Project configured for X, integrated via Y" is a recommendation. "Evaluate the LLM landscape across these dimensions" is a framework. Pay for recommendations.
Plus the negative filter: avoid providers who quote "discovery phases" longer than 2 weeks before any deliverable. That's the upsell vector.
The Engagement I Offer
Full transparency on what I deliver:
• Format: 2-week engagement, written 15-25 page roadmap, 1-hour walkthrough call
• Price: Fixed fee (discussed on the free strategy call once we've confirmed fit)
• Capacity: Capped at 2-3 engagements per month so quality stays high
• Credit: Fee credits against any follow-on build work if you engage further
• No hourly billing, no 6-month "discovery phase"
More on the AI Readiness Assessment here.
The Cluster: Going Deeper
• Free 30-Question AI Readiness Checklist — Start here. Most businesses don't need the paid assessment.
• What is AI Readiness? — The five dimensions and four tiers framework.
• AI Consultant: What They Do, Cost, How to Hire — Broader hiring landscape.
• AI Strategy Consultant — Honest take on the broader strategy consulting category.
• Small Business AI Consultant — SMB-specific guidance.
Working with a Fractional CTO
The right starting point for most operators reading this:
1. Run the free AI Readiness Checklist — 5 minutes, gives you a tier and a recommendation. Most businesses discover they don't need a paid assessment.
2. If the checklist suggests you're ready, book a free 20-minute strategy call to gut-check whether a paid assessment is the right next step.
3. If we agree it is, the AI Readiness Assessment is the productized engagement: 2 weeks, fixed fee, written roadmap.
### Frequently Asked Questions
**Q: What is an AI Readiness Assessment?**
A: An AI Readiness Assessment is a structured engagement (typically 2-4 weeks, $5K-$25K) that evaluates your business across five dimensions of AI readiness — data, workflows, team, governance, execution — and produces a written roadmap with specific recommendations. The output is a 15-25 page document covering workflow audit, prioritized opportunities ranked by ROI, specific tool recommendations, governance guidance, and a 30-60-90 day implementation plan. It's the productized version of AI strategy consulting.
**Q: How much does an AI Readiness Assessment cost?**
A: Pricing varies by consultant. Solo fractional CTOs and AI specialists typically charge $5K-$25K for a 2-4 week engagement producing a written roadmap. Boutique consulting firms charge $25K-$75K for the same deliverable wrapped in more slides. Big consulting firms (Deloitte, Accenture) charge $100K-$500K. Quality is often inverse to price — focused solo practitioners who've shipped real AI products typically produce better roadmaps than big-firm associates running pattern-matched playbooks.
**Q: What's included in an AI Readiness Assessment?**
A: A complete assessment includes six core deliverables: (1) Workflow audit — every repeatable workflow in your business mapped, with AI applicability scored. (2) Opportunity map — AI use cases ranked by ROI (impact × feasibility). (3) Tool recommendations — specific products to use (Claude vs ChatGPT vs Gemini, plus integration tools). (4) Governance guidance — data handling rules, output review processes, vendor risk. (5) 30-60-90 day implementation plan — what to ship first, second, third, with named owners and dates. (6) 1-hour walkthrough call to debrief the deliverable.
**Q: How long does an AI Readiness Assessment take?**
A: Most productized assessments take 2 weeks from kickoff. Week 1 is discovery — interviews with your team, workflow audit, data review, current-state mapping. Week 2 is synthesis — ranking opportunities, drafting the roadmap, tool recommendations, governance guidance. The deliverable lands on day 14, followed by a 1-hour walkthrough call. Big-firm engagements run longer (6-12 weeks) because they include more stakeholder management and slide production, not because the actual work takes longer.
**Q: Do I need an AI Readiness Assessment or just the free checklist?**
A: Start with the free 30-question AI Readiness Checklist. It takes 5 minutes and produces a score across the five readiness dimensions, plus a tier (Curious, Equipped, Operating, Compounding). Most businesses discover they don't need a paid assessment — they just need to ship one workflow. The paid AI Readiness Assessment is the right next step only if: (1) the checklist shows you're 'Equipped' or 'Operating' tier and you want a deeper expert review, (2) you have $5K-$25K of budget and 2 weeks to invest, (3) you're going to actually act on the roadmap. If those don't apply, skip the paid assessment.
**Q: What does the deliverable look like?**
A: A complete AI Readiness Assessment deliverable is a 15-25 page written document — not a slide deck. Typical sections: (1) Executive summary (1 page) — your tier, top 3 recommendations. (2) Current state analysis (3-5 pages) — what you're doing well, where you're stuck. (3) Opportunity map (3-5 pages) — ranked use cases with ROI analysis. (4) Tool recommendations (2-3 pages) — specific products and why. (5) Governance section (2-3 pages) — data handling, review processes, vendor risk. (6) Implementation plan (3-5 pages) — 30-60-90 day plan with named owners and dates. The slide-deck format common from big consulting firms is usually a sign of inferior work.
**Q: Who runs an AI Readiness Assessment?**
A: Three types of providers: (1) Solo AI specialists or fractional CTOs — fastest, cheapest, often highest quality. Best for $1M-$50M businesses. Pricing $5K-$25K. (2) Boutique AI consulting firms — mid-tier price ($25K-$75K), structured methodology, longer engagement (4-6 weeks). Best for mid-market with complex stakeholder maps. (3) Big consulting firms — most expensive ($100K-$500K), longest engagement (8-16 weeks), heaviest documentation. Best for Fortune 1000 or regulated industries needing brand cover. For most operators, option #1 is the right fit.
**Q: How is an AI Readiness Assessment different from AI strategy consulting?**
A: AI strategy consulting is open-ended hourly work with a flexible scope and deliverable. AI Readiness Assessment is the productized version — defined scope (2 weeks), fixed price, specific deliverable (15-25 page written roadmap). Both produce similar output. The difference is predictability: with a productized assessment, you know exactly what you'll get and what it costs. With open-ended strategy consulting, scope can expand and bills can grow. For most operators, the productized assessment is the better starting point.
---
## What is AI Readiness? (And How to Measure Yours in 2026)
- **URL:** https://justinmckelvey.com/blog/ai-readiness
- **Published:** May 23, 2026
- **Updated:** May 23, 2026
- **Category:** AI for Business
- **Reading time:** 5 min
- **Description:** AI readiness explained: five dimensions, four tiers, the diagnostic question that exposes whether you're ready, and how to measure yours.
Quick Answer
AI readiness is whether your business can productively adopt AI — measured across five dimensions: data, workflows, team capability, governance, and execution. A simple diagnostic question exposes it instantly: can you name three workflows in your business where AI would change the outcome this quarter? If yes, you're ready. If no, you're not — and the gap is rarely the AI itself. It's everything around the AI. This is the framework used in the free 30-question AI Readiness Checklist.
Reviewed May 2026 · Author: Justin McKelvey, fractional CTO, 50+ products shipped
TL;DR: AI Readiness in 2026
AI readiness is the precondition for AI ROI. Most failed AI initiatives are organizational failures, not technological ones. Companies that fail blame "the AI didn't work" when the real cause was data wasn't clean, no one owned the workflow, governance wasn't in place, or the team rejected the tool. Readiness measures the gap between "we have AI access" and "we get AI value."
This guide is the honest framework — five dimensions, four tiers, and the diagnostic question that exposes readiness in 60 seconds. From a fractional CTO who's audited dozens of $1M-$50M businesses and watched the same patterns play out.
The Five Dimensions of AI Readiness
AI readiness isn't a single number. It's a score across five independently necessary dimensions. A business strong on four and weak on one fails at AI adoption — the weak dimension becomes the bottleneck.

| Dimension | What it measures | Signal you're weak here |
| --- | --- | --- |
| 1. Data | Right data, organized in ways AI can use | "Where's our customer data?" gets 3 different answers from 3 people |
| 2. Workflows | Repeatable processes AI could improve | Can't name three workflows that have clear inputs and outputs |
| 3. Team | Skills + willingness to use AI | Nobody has had 30 minutes of structured AI training |
| 4. Governance | Data handling rules, review processes, vendor risk | No written policy on what data can/can't be pasted into ChatGPT |
| 5. Execution | Ability to actually ship AI projects | You have lots of AI ideas but zero shipped AI workflows |

The most common pattern at $1M-$50M businesses: strong on Team and Execution (you have capable people who can ship), weak on Data and Governance (no one owns data quality or written policies). The fix isn't more AI tools — it's the missing dimensions.
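The bottleneck idea is easy to make concrete: the dimension that limits you is the one with the lowest score. A minimal sketch, assuming each dimension is scored on the same scale (a 0-6 scale is used here on the guess that the 30 questions split evenly across five dimensions, which the checklist itself may not do):

```python
DIMENSIONS = ("Data", "Workflows", "Team", "Governance", "Execution")


def bottleneck(scores: dict[str, int]) -> str:
    """Return the lowest-scoring dimension.

    On a tie, the dimension listed first in DIMENSIONS wins
    (an arbitrary choice made for this sketch).
    """
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    return min(DIMENSIONS, key=lambda d: scores[d])


# Hypothetical scores on an assumed 0-6 scale per dimension, shaped like
# the common pattern: capable team, weak data and governance.
example = {
    "Data": 2,
    "Workflows": 4,
    "Team": 5,
    "Governance": 1,
    "Execution": 5,
}

print(bottleneck(example))
```

Taking the minimum rather than the average is the point: a high total can hide the one dimension that stalls adoption.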
The Diagnostic Question
One question reveals AI readiness in 60 seconds:
“Can you name three workflows in your business — by exact name — where AI would change the outcome this quarter?”
If yes — you're ready. You have specificity, which means you've thought about the problem.
If no — you're not. The gap isn't strategy. The gap isn't tools. It's that you haven't named the workflow yet. The vast majority of AI investment fails because companies skip the naming step and jump straight to subscribing.
I run this question with every prospective client on the first 20-minute strategy call. About 70% can't answer it specifically. That's the leverage point — naming the workflow is the cheapest, highest-ROI step in the entire AI adoption journey.
The Four Readiness Tiers
The 30-question AI Readiness Checklist scores you across the five dimensions and places you in one of four tiers:

| Tier | Score range | Where you are | What to do next |
| --- | --- | --- | --- |
| Curious | 0-10/30 | AI is on your radar but nothing is shipping | Pick one workflow. Run AI on it for 7 days. Measure. |
| Equipped | 11-18/30 | You have tools but no system | Connect AI to one repeatable workflow. Build the second. |
| Operating | 19-25/30 | AI is part of how you work | Governance + agents. Durable systems beyond individual heroics. |
| Compounding | 26-30/30 | You're ahead of the market | Custom builds. Internal tools. AI-powered customer features. |

The honest data after running thousands of these self-assessments: most $1M-$50M businesses score Equipped (11-18). That means the highest-leverage move is almost never "buy more AI tools." It's "connect existing AI capacity to a specific workflow and measure what happens."
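The tier boundaries reduce to a small lookup. This sketch only encodes the score ranges listed above; the function name is hypothetical, and how the checklist arrives at the total score is not shown here.

```python
# Upper bound (inclusive) of each tier, out of 30.
TIERS = [
    (10, "Curious"),
    (18, "Equipped"),
    (25, "Operating"),
    (30, "Compounding"),
]


def readiness_tier(score: int) -> str:
    """Map a 0-30 checklist score to its readiness tier."""
    if not 0 <= score <= 30:
        raise ValueError("score must be between 0 and 30")
    for upper_bound, tier in TIERS:
        if score <= upper_bound:
            return tier
    raise AssertionError("unreachable: 30 is the last upper bound")


for sample in (7, 14, 22, 28):
    print(sample, readiness_tier(sample))
```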
Why AI Readiness Matters More Than AI Strategy
Most AI strategy work is wasted on businesses that aren't ready to execute it. A beautifully written 40-page AI roadmap delivered to a Curious-tier business produces zero results because the business doesn't have the underlying readiness to implement.
The right sequence is:
1. Measure readiness — Where are you actually?
2. Close the lowest-dimension gap first — Don't write strategy until you've fixed the bottleneck.
3. Pick one workflow — Specificity beats strategy.
4. Ship it, measure it — Data trumps theory.
5. Repeat — Each cycle increases readiness for the next.
This is why I built the free AI Readiness Checklist — to give business owners a fast diagnostic before any consultant or strategy work. About 30% of people who take it conclude they don't need to hire anyone — they need to ship one workflow first.
How to Measure Your AI Readiness
Three options in increasing depth:
1. The 60-second gut check. Answer the diagnostic question above. Can you name three specific workflows? That alone tells you most of what you need.
2. The free 30-question AI Readiness Checklist. Takes 5 minutes. Scores you across all five dimensions and places you in a tier with a specific recommendation. Available here.
3. A formal AI Readiness Assessment. Paid 2-week engagement with a fractional CTO that produces a 15-25 page written roadmap covering workflow audit, prioritized opportunities, specific tool recommendations, governance guidance, and a 30-60-90 day implementation plan. The productized version of AI strategy consulting. Details here.
The right starting point for almost everyone is #1 or #2. Don't pay for #3 until you've run the self-assessment and confirmed you actually need outside expertise.
The Cluster: Going Deeper
• Free AI Readiness Checklist — The 30-question self-assessment (5 min).
• What is an AI Readiness Assessment? — Deep dive on the paid 2-week engagement.
• AI Consultant: What They Do, Cost, How to Hire — Broader hiring landscape.
• Chief AI Officer: Role, Salary, When to Hire — Executive role overview.
• Small Business AI Consultant — SMB-specific guidance.
Working with a Fractional CTO
I help $1M-$50M businesses go from "AI-curious" to "AI-shipping" — usually starting with a readiness assessment and following through with implementation. If you've read this far:
1. Start free with the 30-question AI Readiness Checklist. 5 minutes. Gives you a tier and a recommendation.
2. If you want to talk it through, book a free 20-minute strategy call. No pitch — just a gut-check on what to do first.
3. If you're ready for a formal engagement, the AI Readiness Assessment is the productized starting point. 2 weeks, fixed fee, written roadmap.
### Frequently Asked Questions
**Q: What is AI readiness?**
A: AI readiness is the measure of whether your business can productively adopt artificial intelligence — across data, workflows, team capability, governance, and execution. A business with high AI readiness can deploy AI workflows in weeks and see real ROI. A business with low AI readiness pays for AI tools nobody uses, runs failed pilots, and concludes that 'AI doesn't work' when the real problem was organizational, not technological. Readiness is rarely about the AI itself — it's about everything around it.
**Q: What are the dimensions of AI readiness?**
A: Five dimensions: (1) Data — do you have the right data, organized in ways AI can use? (2) Workflows — do you have repeatable processes AI could improve? (3) Team — do people have the skills and willingness to use AI? (4) Governance — do you have data handling rules, output review processes, vendor risk policies? (5) Execution — can you actually ship AI projects, or do they die in committee? Each dimension is independently necessary. A business strong on four and weak on one will fail at AI adoption.
**Q: How do I measure my AI readiness?**
A: Three ways, in order of depth. (1) Quick gut check — name three workflows in your business where AI would change the outcome this quarter. Can you? That's your readiness. (2) Self-assessment — run a 30-question AI Readiness Checklist that scores you across the five dimensions and produces a tier (Curious, Equipped, Operating, Compounding). Takes 5 minutes. (3) Formal AI Readiness Assessment — a 2-week paid engagement with a fractional CTO or consultant that produces a written roadmap with specific workflows, tools, and 30-60-90 day plan. Most businesses start with #1 or #2 before considering #3.
**Q: What are the AI readiness tiers?**
A: Four tiers based on score across the five dimensions: (1) Curious (score 0-10/30) — AI is on your radar but nothing is shipping. The fix is picking one workflow and trying AI on it next week. (2) Equipped (11-18) — You have tools but no system. The opportunity is connecting AI to repeatable workflows. (3) Operating (19-25) — AI is part of how you work. The opportunity is governance + agents — durable systems. (4) Compounding (26-30) — You're ahead. The opportunity is custom builds — internal tools and AI features that give you defensibility.
**Q: Why does AI readiness matter?**
A: Because most failed AI initiatives are organizational failures, not technological ones. Companies blame 'the AI didn't work' when the real cause was: data wasn't clean, no one owned the workflow, governance wasn't in place, or the team rejected the tool. AI readiness measures the gap between 'we have AI access' and 'we get AI value.' Without readiness, AI subscriptions sit unused and pilots fail. With readiness, the same tools deliver real ROI.
**Q: Can a business become AI-ready quickly?**
A: Yes — readiness builds faster than most expect. The 30-day version: pick one workflow, document it, run AI on it for two weeks, measure time saved, present results to the team. That moves a 'Curious' business toward 'Equipped' in a month. The 90-day version: complete one full workflow adoption, build a second, formalize governance basics, train the team on AI norms. That moves 'Equipped' toward 'Operating.' The longer arcs (Operating → Compounding) take 6-12 months because they involve custom builds, but the early tiers progress fast.
**Q: What's the difference between AI readiness and AI maturity?**
A: Practically interchangeable — different consultants use different terms. 'AI readiness' tends to focus on the precondition: can you adopt AI? 'AI maturity' tends to focus on the current state: how sophisticated is your AI adoption? In practice, both frameworks measure the same five dimensions. If you see 'AI readiness assessment' and 'AI maturity assessment' offered by the same consultant, they're usually the same product with different marketing. Don't pay extra for terminology.
**Q: Who should run an AI readiness check?**
A: Every business owner asking 'should we be using AI?' should run one. The free self-assessment takes 5 minutes and gives you a clear answer. From there: if you score high (Operating or Compounding), you don't need outside help — you need to keep shipping. If you score middle (Equipped), a 2-week AI Readiness Assessment with a fractional CTO produces the most leverage. If you score low (Curious), focus on running one AI workflow for seven days before hiring anyone.
---
## Chief AI Officer vs Fractional CTO: Which to Hire (2026)
- **URL:** https://justinmckelvey.com/blog/chief-ai-officer-vs-fractional-cto
- **Published:** May 23, 2026
- **Updated:** May 23, 2026
- **Category:** AI for Business
- **Reading time:** 7 min
- **Description:** Chief AI Officer vs Fractional CTO: which do you actually need in 2026? Compare scope, pricing, fit, and decision criteria.
Quick Answer
For most $1M-$50M businesses, a Fractional CTO with AI experience beats a Chief AI Officer. Same fractional pricing ($60K-$180K/yr equivalent), broader scope (covers AI + engineering + security + vendor work), dramatically deeper talent supply. Chief AI Officer (full-time or fractional) wins only when AI is the core of your business model OR you're at $100M+ revenue with enough AI work to justify a dedicated executive. The actual question isn't "CAIO vs Fractional CTO" — it's "do I need one specialist or one generalist," and at SMB scale, the generalist almost always wins.
Reviewed May 2026 · Author: Justin McKelvey, fractional CTO + AI implementation lead, 50+ products shipped
TL;DR: CAIO vs Fractional CTO in 2026
The Chief AI Officer is the newest C-suite role; the Fractional CTO is the established alternative for SMBs. Both can cover AI strategy and implementation. The choice depends on scope (AI-only vs broader tech), scale (under $50M vs over $100M), and whether AI is the entire business or one of many priorities.
This guide is the honest comparison from a fractional CTO who effectively plays the fractional CAIO role for several $1M-$50M businesses. Both roles work — the question is which fits your specific business.
CAIO vs Fractional CTO at a Glance

| Dimension | Chief AI Officer (CAIO) | Fractional CTO (with AI focus) |
| --- | --- | --- |
| Scope | AI specifically — strategy, governance, implementation | Technology broadly — AI plus engineering, security, vendor work, hiring |
| Cost (full-time) | $400K-$1M+ total comp | $250K-$600K total comp |
| Cost (fractional) | $60K-$180K/yr equivalent ($5K-$15K/mo) | $60K-$180K/yr equivalent ($5K-$15K/mo) |
| Talent supply | Genuinely thin — role only 18-36 months old | Deep — established role with 15+ year history |
| Best for company size | $100M+ revenue (full-time); $10M-$100M (fractional) | $1M-$50M revenue across the board |
| Best for business model | AI is the core (AI-native SaaS, AI agencies) | AI is one of several technology priorities |
| Industry fit | Regulated industries needing dedicated governance ownership | Most general business categories |
| Engagement compound | Yes — same as fractional CTO | Yes — relationship deepens over months |

Where the Roles Actually Overlap
In smaller companies (under ~$50M revenue), CAIO and Fractional CTO scope overlap by 70-80%. Both cover:
• AI strategy and prioritization
• Tool and vendor selection for AI
• Team training on AI tools
• Governance and risk management for AI outputs
• Roadmap and portfolio management
• Implementation oversight (for fractional CTOs who also do hands-on work)
Where they diverge:
• CAIO only: Pure-play AI strategy at enterprise scale (Fortune 500 boards), dedicated AI talent strategy at scale, AI-specific regulatory ownership.
• Fractional CTO only: Non-AI engineering decisions (database design, security architecture, platform decisions), broader vendor management, non-AI hiring, general technical due diligence.
For most $1M-$50M businesses, the non-AI work (Fractional CTO scope) is still meaningful — you need someone covering security, vendor risk, and platform decisions alongside AI. A pure CAIO doesn't cover that. A Fractional CTO with AI experience does both.
When CAIO Wins
Four scenarios where Chief AI Officer is the right hire:
1. AI is the core of your business model. Not just enabled by AI — built around it. AI-native SaaS companies whose product IS the AI capability. AI agencies whose service IS implementing AI for others. Companies where AI is the moat. At this kind of business, the CTO might be the AI infrastructure person and the CAIO is the AI capability strategist. Roles split naturally.
2. You're $100M+ revenue with substantial AI initiatives. At this scale, AI work justifies its own dedicated executive. The CTO can focus on broader platform and engineering while the CAIO drives AI roadmap. Both roles have enough work to be full-time.
3. You already have a CTO and need an AI specialist. Two-executive structure — the CTO covers broader tech, the CAIO complements with AI-specific depth. Common in scaling startups where the CTO is great on platform but not deep on AI.
4. You're in a regulated industry where AI governance requires dedicated executive ownership. Healthcare, financial services, defense — industries where AI decisions can create real regulatory exposure. Dedicated executive ownership often required.
When Fractional CTO Wins
Four scenarios where Fractional CTO is the better hire:
1. AI is one of several technology priorities, not the only one. You also need product strategy, engineering hiring, security review, vendor management, technical due diligence. Fractional CTO covers all of this. CAIO doesn't.
2. You're $1M-$50M revenue and can't justify a full-time CAIO but need senior leadership. Fractional CTO scales to your business; fractional CAIO at this scale often has too narrow a scope to justify even fractional hours.
3. You don't yet know which technology areas will need the most attention. The broader CTO scope keeps you flexible as priorities shift. CAIO scope is narrower — if AI work slows down, you have an underutilized executive.
4. Your industry doesn't have regulatory pressure to silo AI from broader tech. Most general business categories don't require separate AI governance ownership. The CTO can handle AI risk alongside broader tech risk.
For most $1M-$50M businesses reading this, three or more of those apply — and the fractional CTO is the right hire.
The Hybrid: Fractional CTO Who Plays CAIO
The most common pattern in 2026 for $1M-$50M businesses isn't pure CAIO or pure CTO — it's a Fractional CTO whose engagement is heavily AI-weighted.
How this looks in practice:
• Title: Officially "Fractional CTO" but increasingly described as "Fractional AI Lead" or "Fractional Chief AI Officer" depending on the engagement's AI focus.
• Scope: 60-80% AI work (strategy, implementation, governance) + 20-40% broader tech work (security, vendor management, engineering hiring).
• Engagement structure: Same as standard fractional CTO — 8-15 hours/week, $5K-$15K/month retainer, 6-18 month engagement length.
• What you tell the board: Depends on what credibility framing serves you best. For board meetings, "our Fractional Chief AI Officer" sounds appropriately current. For team comms, "our Fractional CTO" sounds appropriately stable.
This is the most efficient hire for most $1M-$50M businesses because it combines breadth (CTO scope) with depth (AI focus) in one person without paying for two executives.
The Decision Framework
Two-question diagnostic to decide:
Question 1: What percentage of your technology roadmap is AI-related?
• Under 30%: Fractional CTO — AI is one of many things, not the main thing.
• 30-70%: Fractional CTO with AI focus — the hybrid model above.
• Over 70%: Fractional CAIO — AI is dominant; specialist makes sense.
• ~100%: Full-time CAIO — AI IS the business; dedicated executive justified.
Question 2: Do you already have technology leadership (full-time or fractional)?
• No: Hire fractional CTO. They cover both AI and broader tech.
• Yes — full-time CTO without deep AI experience: Add a fractional CAIO to complement.
• Yes — fractional CTO without deep AI experience: Either swap to one with AI experience OR add a fractional CAIO for the AI work specifically.
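The two-question diagnostic above can be sketched as a small lookup. This is a minimal illustration, not a definitive rule: the function name, the leadership labels, and the 95% cutoff standing in for "~100%" are all assumptions added here, and the two answers are reported side by side rather than merged because the text does not say how to combine them.

```python
# Assumed cutoff for "~100%" of the roadmap being AI; the source gives no exact number.
NEAR_TOTAL = 0.95


def diagnose(ai_share: float, leadership: str) -> dict:
    """Answer the two diagnostic questions separately.

    ai_share: fraction of the technology roadmap that is AI-related (0.0 to 1.0).
    leadership: "none", "full_time_cto", or "fractional_cto" (hypothetical
        labels; the two CTO values assume a CTO without deep AI experience).
    """
    if not 0.0 <= ai_share <= 1.0:
        raise ValueError("ai_share must be between 0.0 and 1.0")

    # Question 1: how much of the roadmap is AI-related?
    if ai_share < 0.30:
        scope = "Fractional CTO"
    elif ai_share <= 0.70:
        scope = "Fractional CTO with AI focus"
    elif ai_share < NEAR_TOTAL:
        scope = "Fractional CAIO"
    else:
        scope = "Full-time CAIO"

    # Question 2: is there already technology leadership in place?
    actions = {
        "none": "Hire a fractional CTO who covers AI and broader tech",
        "full_time_cto": "Add a fractional CAIO to complement the CTO",
        "fractional_cto": "Swap to an AI-experienced fractional CTO or add a fractional CAIO",
    }
    if leadership not in actions:
        raise ValueError(f"unknown leadership value: {leadership!r}")

    return {"scope": scope, "action": actions[leadership]}
```

For example, `diagnose(0.5, "none")` reports the hybrid scope together with the "hire a fractional CTO" action, which is the combination the article expects most $1M-$50M businesses to land on.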
Common Mistakes
The four most expensive mistakes I see operators make on this decision:
1. Hiring a CAIO because the title sounds current, not because the role is justified. "Chief AI Officer" sounds impressive in board decks. But if your business is $5M revenue with one AI project, you don't need an executive whose entire identity is AI. You need a Fractional CTO who can handle AI alongside everything else.
2. Hiring a pure-strategy CAIO who refuses implementation work. At SMB scale, you need executives who can ship AI, not just strategize about it. Look for people with shipped portfolios, not just consulting backgrounds.
3. Hiring an enterprise CAIO at SMB scale. CAIOs from Fortune 500 backgrounds often struggle with SMB constraints — limited engineering teams, real budget caps, founder-direct decision-making. Their playbooks don't translate.
4. Avoiding fractional engagement because "we need someone full-time on AI." At $1M-$50M, you genuinely don't have enough AI work for full-time leadership. Fractional gives you senior leadership at the percentage you actually need. The compounding relationship over 12+ months delivers more than a $400K full-time hire would in 6.
The Cluster: Going Deeper
• Chief AI Officer: Role, Salary, When to Hire — Full landscape of the CAIO role.
• Fractional Chief AI Officer: The SMB Path — Deeper dive on the fractional model.
• Fractional CTO vs Full-Time CTO — The closely related decision.
• AI Consultant: What They Do, Cost, How to Hire — Broader landscape of AI hires.
• How to Hire a Fractional CTO — Tactical guide for the recommended path.
Working with a Fractional CTO / Fractional CAIO
I work with $1M-$50M businesses as a fractional CTO with deep AI implementation experience — effectively playing the fractional Chief AI Officer role at the scale where the title is just emerging. If you're trying to decide between CAIO and Fractional CTO for your business:
1. Free 20-minute strategy call — gut-check on which scope fits your business. Book here.
2. AI Readiness Assessment — 2 weeks, written roadmap, fixed fee. Defines scope for any subsequent fractional engagement. Details.
Full engagement options on the Work With Me page.
### Frequently Asked Questions
**Q: Should I hire a Chief AI Officer or a Fractional CTO?**
A: For most $1M-$50M businesses, a Fractional CTO with AI experience is the better hire than a Chief AI Officer (CAIO). The CTO role covers AI plus broader technology leadership in one person, the fractional model fits SMB economics, and the talent supply is dramatically deeper. Full-time CAIO makes sense only at enterprise scale ($100M+ revenue) or in industries where AI is the core of the business model. For most operators, the question 'CAIO or fractional CTO' is really 'do I want one specialist or one generalist' — and at SMB scale, the generalist usually wins.
**Q: What's the difference in scope between a CAIO and a Fractional CTO?**
A: CAIO is AI-specialized — strategy, implementation, governance, talent specifically for AI initiatives. Fractional CTO is broader — engineering platform, security, AI, vendor management, technical hiring. In smaller companies these roles overlap heavily; in larger companies they diverge. At enterprise scale, the CAIO can focus exclusively on AI while the CTO maintains the broader tech function. At $1M-$50M scale, you don't have enough work to justify both roles — a fractional CTO with AI focus covers both effectively.
**Q: How do the costs compare?**
A: Full-time CAIO: $400K-$1M+ total comp annually. Full-time CTO: $250K-$600K. Fractional CAIO: $60K-$180K/year equivalent ($5K-$15K/mo retainer). Fractional CTO: $60K-$180K/year equivalent ($5K-$15K/mo retainer). The fractional options are essentially identically priced — the difference is scope, not cost. So for most $1M-$50M businesses, the question isn't 'fractional CAIO vs fractional CTO' on price — it's about which scope fits better.
**Q: When does a Chief AI Officer make more sense than a Fractional CTO?**
A: Four scenarios where CAIO wins: (1) AI is the core of your business model (not just an enhancement to it) — products like AI-native SaaS companies, AI agencies, or businesses where AI capabilities are the moat. (2) Your company is $100M+ revenue and AI work justifies a dedicated executive. (3) You have an existing CTO and need a complementary AI specialist (the split executive structure). (4) Your industry is highly regulated and AI governance requires dedicated executive ownership. For most $1M-$50M businesses, none of these apply.
**Q: When does a Fractional CTO make more sense than a CAIO?**
A: Four scenarios where fractional CTO wins: (1) AI is one of several technology priorities (not the only one) — you also need product, engineering, security, vendor work covered. (2) Your business is $1M-$50M and you can't justify a full-time CAIO but need senior leadership. (3) You don't yet know which technology areas will need the most attention — the broader CTO scope keeps you flexible. (4) Your industry doesn't have regulatory pressure to silo AI from broader tech. For most operators, three or more of these apply, and fractional CTO is the right hire.
**Q: Can one person do both roles?**
A: Yes, and this is increasingly common in 2026. A Fractional CTO with deep AI experience effectively plays both roles for $1M-$50M businesses — covering broader technology leadership while also handling AI strategy and implementation. This is more efficient than hiring two separate executives because most SMBs don't have enough work to fully utilize either role individually. The label matters less than the actual scope; what you're hiring is senior technology + AI leadership at a percentage of full-time.
**Q: What about hiring an AI consultant instead of either?**
A: AI consultants are project-based (2-12 weeks), expire after the deliverable, and don't build ongoing relationships with your team. Both fractional CTO and fractional CAIO are relationship-based (6-18+ months), embedded with your team, and compound in value over time. AI consultants make sense for specific one-time deliverables — an AI Readiness Assessment, a workflow audit, a specific implementation project. They don't replace ongoing executive leadership. For most operators, the right pattern is: start with an AI consultant for one-time strategy → graduate to fractional CTO/CAIO for ongoing relationship → consider full-time at scale.
**Q: How do I decide between fractional CAIO and fractional CTO?**
A: Ask yourself two questions. First: what percentage of your technology roadmap is AI? If under 30%, fractional CTO. If 30-70%, fractional CTO with AI focus. If over 70%, fractional CAIO. Second: do you already have a CTO (full-time or fractional)? If yes and they lack deep AI experience, add a fractional CAIO to complement them (or, for a fractional CTO, swap to one with AI experience). If no, you need a fractional CTO who covers AI as part of broader scope. The math for most $1M-$50M businesses points to fractional CTO with AI focus — the title CTO covers the breadth your business actually needs at that scale.
---
## Fractional Chief AI Officer: SMB Senior AI Hire (2026)
- **URL:** https://justinmckelvey.com/blog/fractional-chief-ai-officer
- **Published:** May 23, 2026
- **Updated:** May 23, 2026
- **Category:** AI for Business
- **Reading time:** 6 min
- **Description:** Fractional Chief AI Officer in 2026: senior AI leadership without the full-time hire. Roles, pricing, fit, and when SMBs should hire one.
Quick Answer
A Fractional Chief AI Officer is a senior AI executive who engages with your business 8-15 hours per week instead of full-time. The model emerged in 2025-2026 for $1M-$50M businesses that need senior AI leadership but can't justify the $400K+ full-time cost. Pricing is typically $5K-$15K/month ($60K-$180K/yr equivalent) — 70-90% cheaper than full-time CAIO total comp. Best fit when AI is becoming meaningfully important to your business but doesn't yet justify a dedicated C-suite hire.
Reviewed May 2026 · Author: Justin McKelvey, fractional CTO + AI implementation lead, 50+ products shipped
TL;DR: Fractional Chief AI Officer in 2026
The Fractional Chief AI Officer (CAIO) is the practical AI executive hire for $1M-$50M businesses. The full-time CAIO role exists for enterprises, but the cost ($400K-$1M+ all-in) and the talent supply (genuinely thin) make it impractical for most operators. Fractional bridges the gap: senior AI executive leadership for 8-15 hours per week, $5K-$15K/month, scaled to actual workload.
The model is exploding in 2026 — the term "fractional chief ai officer" has grown 400% year-over-year because every mid-market business hit the same realization at the same time: AI is too important to ignore, but too early to justify full-time C-suite hiring. Fractional is the bridge.
I'm a fractional CTO who effectively plays the fractional CAIO role for several $1M-$50M businesses. This is the honest take on the model, when it works, when it doesn't, and what to look for.
What a Fractional Chief AI Officer Actually Does
Same six responsibilities as a full-time CAIO, scaled to your business:
1. AI strategy — Identifying the highest-leverage AI opportunities specific to your business (not generic "AI transformation" frameworks).
2. Roadmap and portfolio management — Prioritizing what to build first, second, third. Ensuring projects roll up into a coherent strategy.
3. Tool and vendor selection — Picking specific products (Claude vs ChatGPT vs Gemini, plus the integration layer), negotiating contracts, managing vendor risk.
4. Team training and capability building — Teaching your staff to use AI effectively, establishing norms, building internal AI fluency.
5. Governance — Data handling rules, output review processes, regulatory compliance, vendor risk management.
6. Board and executive communication — Translating AI developments for non-technical leaders so the company makes informed bets.
The best fractional CAIOs also do hands-on implementation work themselves when scope allows. This is where the model dramatically outperforms hiring a strategy consultant — the same person who designs the AI roadmap is the one who can ship the first piece of it.
How the Model Works
Typical engagement structure:
• Cadence: Weekly 1-hour working session with the CEO/founder. Ongoing async via Slack or similar.
• Time commitment: 8-15 hours/week of dedicated capacity
• Engagement length: Typically 6-18 months, often converts to ongoing "as needed" relationship after the initial term
• Embedding: Attends executive team meetings, occasionally board meetings, has access to your tools and systems
• Deliverables: Mix of strategic artifacts (roadmaps, governance docs) and shipped AI workflows
The first 90 days are usually heaviest: a full audit, prioritized roadmap, and at least one quick-win project shipped. Months 4-12 are steady-state strategic guidance plus targeted hands-on work as opportunities emerge.
Fractional CAIO Pricing in 2026

| Tier | Hours/week | Monthly retainer | Annualized equivalent | Best for |
| --- | --- | --- | --- | --- |
| Advisory | 5-8 | $3K-$5K | $36K-$60K/yr | Strategic guidance only, no implementation |
| Standard fractional | 8-12 | $5K-$10K | $60K-$120K/yr | Strategy + roadmap + some hands-on work |
| Deep fractional | 12-15 | $10K-$15K | $120K-$180K/yr | Strategy + implementation + team development |
| Heavy fractional | 15-20 | $15K-$25K | $180K-$300K/yr | Heavy implementation phase, transitioning to full-time |
| Full-time CAIO (for reference) | 40-60 | $33K-$83K | $400K-$1M+ | Enterprise scale, AI is core to business model |

For most $1M-$50M businesses, the standard or deep fractional tiers are the right fit. The math compares dramatically favorably to full-time:
• Full-time CAIO: $400K-$1M+ total comp, 40-60 hrs/week
• Standard fractional: $60K-$120K/yr, 8-12 hrs/week
• Ratio: ~85% cost reduction for ~25% of the hours
The fractional model wins on dollars-per-hour-of-senior-attention. The only thing you give up is depth — a fractional can't be in every meeting or own every decision. For businesses where AI is one of several executive priorities (not the only one), that tradeoff is the right one.
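The arithmetic behind that comparison can be checked with a few lines of Python. This is a rough sketch under stated assumptions: the helper names are made up for illustration, the low end of each cost range is paired with the low end of its hours range, the "+" on "$1M+" is ignored, and 52 working weeks per year is assumed.

```python
WEEKS_PER_YEAR = 52  # assumption: hours are used every week of the year


def annualize(monthly_retainer: float) -> float:
    """Convert a monthly retainer into its annualized equivalent."""
    return monthly_retainer * 12


def cost_reduction(fractional_annual: float, full_time_annual: float) -> float:
    """Fraction of the full-time cost saved by going fractional."""
    return 1 - fractional_annual / full_time_annual


def hourly_rate(annual_cost: float, hours_per_week: float) -> float:
    """Effective dollars per hour of senior attention."""
    return annual_cost / (hours_per_week * WEEKS_PER_YEAR)


if __name__ == "__main__":
    # Standard fractional tier, low end: $5K/month at 8 hrs/week.
    fractional = annualize(5_000)
    # Full-time CAIO, low end: $400K total comp at 40 hrs/week.
    full_time = 400_000

    print(f"Annualized retainer: ${fractional:,.0f}")
    print(f"Cost reduction: {cost_reduction(fractional, full_time):.0%}")
    print(f"Hours share: {8 / 40:.0%}")
    print(f"Fractional $/hr: {hourly_rate(fractional, 8):.2f}")
    print(f"Full-time $/hr: {hourly_rate(full_time, 40):.2f}")
```

At the low ends this gives an 85% cost reduction for 20% of the hours, and roughly $144 versus $192 per hour of senior attention. Pairing other points in the ranges shifts the numbers, so treat the output as a ballpark rather than a quote.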
When to Hire a Fractional CAIO vs Full-Time
Five-question diagnostic:
1. Does AI represent 5%+ of your revenue or 10%+ of your cost structure? Yes → consider full-time. No → fractional.
2. Do you have 3+ live AI projects requiring weekly executive attention? Yes → consider full-time. No → fractional.
3. Is regulatory or risk exposure around AI a board-level concern? Yes (regulated industry) → consider full-time. No → fractional works fine.
4. Can you absorb $400K-$1M in new exec comp? No → fractional is the only option.
5. Is the supply of qualified full-time CAIO candidates in your geographic area sufficient? Probably not — the talent supply is thin enough that even companies who can afford full-time often start with fractional.
If you answered "yes" to 4 or more, you're ready for full-time CAIO. If 2-3, fractional is the right starting point. If 0-1, you may not need an AI executive at all yet — a fractional CTO with AI experience or a focused AI Readiness Assessment may be enough.
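The scoring rule above is simple enough to write down as code. The sketch below is illustrative only: the class and field names are hypothetical shorthand for the five questions, and each field is `True` when the answer is "yes".

```python
from dataclasses import astuple, dataclass


@dataclass
class CaioDiagnostic:
    """One boolean per diagnostic question (field names are illustrative)."""

    ai_is_material: bool          # AI is 5%+ of revenue or 10%+ of cost structure
    three_plus_projects: bool     # 3+ live AI projects needing weekly exec attention
    board_level_risk: bool        # AI regulatory or risk exposure is a board concern
    can_absorb_comp: bool         # can absorb $400K-$1M in new exec comp
    local_talent_available: bool  # enough qualified full-time candidates nearby


def recommend(answers: CaioDiagnostic) -> str:
    """Apply the 4+ / 2-3 / 0-1 thresholds to the number of "yes" answers."""
    yes_count = sum(astuple(answers))  # True counts as 1, False as 0
    if yes_count >= 4:
        return "full-time CAIO"
    if yes_count >= 2:
        return "fractional CAIO"
    return "no dedicated AI executive yet"
```

A business with material AI spend and three live projects, but no budget for executive comp, would score two and land on `"fractional CAIO"`.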
Fractional CAIO vs Other Hiring Options

| Option | Cost | Strength | Weakness |
| --- | --- | --- | --- |
| Full-time CAIO | $400K-$1M+ | Full ownership, dedicated focus | Cost, talent supply, premature for most SMBs |
| Fractional CAIO | $60K-$180K/yr | Senior leadership at the % you need; relationship compounds | Less depth than full-time; not in every meeting |
| Fractional CTO with AI focus | $60K-$180K/yr | Covers AI + broader tech leadership in one person | Less AI-specialized than dedicated CAIO |
| AI consultant (project-based) | $5K-$250K per project | Defined scope, predictable cost | No ongoing relationship; expires after deliverable |
| AI Readiness Assessment | $5K-$25K fixed | Fixed scope, written roadmap, fast | One-time deliverable, no ongoing support |

For most $1M-$50M businesses, the right path is: start with an AI Readiness Assessment to scope where AI fits in your business → if there's meaningful work, engage a Fractional CAIO (or Fractional CTO with AI focus) for ongoing leadership → graduate to full-time CAIO if/when the business scales past $50M revenue and AI is genuinely core to the business model.
What to Look for When Hiring a Fractional CAIO
Three filters that matter most:
1. Shipped AI portfolio. They should have personally built and shipped multiple AI products, not just consulted on them. Ask for specific examples with named tools, stack details, and what almost killed the project. Strategy-only candidates struggle when SMB constraints meet implementation reality.
2. SMB-scale experience. Enterprise CAIOs from Fortune 500 backgrounds often struggle with $1M-$50M business constraints: limited engineering team, no AI infrastructure, real budget caps, founder-direct decision-making. The right fractional CAIO has worked with businesses at your size before.
3. Implementation willingness. They're willing to do hands-on work when scope is small, not just strategy. The hybrid of strategy + implementation is what makes fractional CAIO different from consulting — and it's where the real value lives.
Red Flags
• Portfolio is presentations and frameworks, not shipped AI products
• Only worked at companies with $100M+ revenue and dedicated AI teams
• Refuses to do implementation work ("I only do strategy at the executive level")
• Proposes 90-day "discovery phase" before any shippable deliverable
• Pricing requires multi-year commitment without exit ramps
• Uses "transformation" or "AI maturity model" without naming a single specific workflow
• Can't articulate the difference between fractional CAIO and AI consulting
The Cluster: Going Deeper
• Chief AI Officer: Role, Salary, When to Hire — Hub post for the broader CAIO topic.
• Chief AI Officer vs Fractional CTO — Which role do you actually need?
• Fractional CTO vs Full-Time CTO — The closely related decision.
• AI Consultant: What They Do, Cost, How to Hire — Broader landscape of AI hires.
• The Free AI Readiness Checklist — Self-assessment before any hire.
Working with a Fractional Chief AI Officer
I work with $1M-$50M businesses as a fractional CTO with deep AI implementation experience — effectively filling the Fractional Chief AI Officer role at the scale where the title is just emerging. If your business is at the point where AI is becoming meaningfully important and you want senior leadership without the full-time hire:
1. Free 20-minute strategy call — gut-check on whether fractional is the right next step. Book here.
2. AI Readiness Assessment — 2 weeks, written roadmap, fixed fee. Defines scope for any fractional engagement. Details.
Full engagement options on the Work With Me page.
### Frequently Asked Questions
**Q: What is a Fractional Chief AI Officer?**
A: A Fractional Chief AI Officer is a senior AI executive who works with your business on a part-time basis — typically 8-15 hours per week — instead of as a full-time hire. They cover the same six responsibilities as a full-time CAIO (AI strategy, portfolio management, talent strategy, governance, vendor management, board education) but scaled to your actual workload and budget. The model emerged in 2025-2026 as $1M-$50M businesses recognized they needed senior AI leadership but couldn't justify the $400K+ full-time cost.
**Q: How much does a Fractional Chief AI Officer cost?**
A: Typical pricing in 2026: $5,000-$15,000 per month retainer, equivalent to $60K-$180K annualized. Lower end ($5K-$8K/mo) is 8-10 hours/week of strategic guidance. Mid-tier ($8K-$12K/mo) is 10-15 hours/week with hands-on implementation work. Higher end ($12K-$15K/mo+) is 15-20 hours/week and includes embedded team development. Compare to full-time CAIO total comp of $400K-$1M+ and the fractional model is 70-90% cheaper for businesses that don't need full-time attention.
**Q: Who should hire a Fractional Chief AI Officer?**
A: Best fit: $1M-$50M businesses where AI is becoming meaningfully important but doesn't yet justify a full-time C-suite hire. Specifically: (1) You have 1-3 live AI projects but no single owner, (2) The CEO is spending too much time making AI decisions, (3) You're behind competitors on AI roadmap, (4) You need outside senior judgment but full-time CAIO economics don't work, (5) Your CTO is great on platform but not deep on AI. If three or more apply, fractional CAIO is the right hire. If five apply, you might actually need full-time.
**Q: What does a Fractional Chief AI Officer do?**
A: Six core responsibilities scaled to your business: (1) AI strategy — identifying highest-leverage AI opportunities specific to your business, (2) Roadmap and portfolio — prioritizing what to build first, second, third, (3) Tool and vendor selection — picking specific products (Claude, OpenAI, plus integration tools), negotiating contracts, (4) Team training and capability building — teaching your staff to use AI effectively, (5) Governance — data handling rules, output review processes, vendor risk, (6) Board and executive communication — translating AI for non-technical leaders. The best fractional CAIOs also do hands-on implementation work themselves when the project is small enough.
**Q: How is a Fractional CAIO different from an AI consultant?**
A: Three key differences. (1) Engagement length — consultants engage for 2-12 weeks; fractional CAIOs engage for 6-18 months. (2) Embedding depth — consultants advise from outside; fractional CAIOs sit in your team meetings, contribute to Slack channels, attend board meetings. (3) Implementation work — consultants typically deliver strategy and walk away; fractional CAIOs stay through implementation and team development. The relationship compounds over time — month 6 with a fractional CAIO is dramatically more productive than week 1 because they've learned your business. Consultants don't have that compound benefit.
**Q: How is a Fractional CAIO different from a Fractional CTO?**
A: Scope and focus. Fractional CTO covers technology broadly — engineering, infrastructure, security, platform decisions, plus increasingly AI. Fractional CAIO is AI-specialized — strategy, governance, implementation across business units, with less focus on broader engineering. In practice, most $1M-$50M businesses don't need both — a fractional CTO with deep AI experience covers the territory of a fractional CAIO at the relevant scale. The CAIO title is more relevant when (a) AI is the core of your business model, or (b) you specifically need someone whose entire identity is AI leadership for credibility purposes.
**Q: How long do Fractional Chief AI Officer engagements last?**
A: Typical engagement lengths: 6-18 months, with most clustering around 12 months. The first 90 days are usually heaviest (audit, roadmap, first quick wins). Months 4-12 are steady-state strategic guidance plus targeted hands-on work. Many engagements convert into ongoing 'as needed' relationships after the initial term — reducing hours but maintaining the relationship for future projects. Some engagements end naturally when the business hires a full-time CAIO and the fractional smoothly transitions out.
**Q: What should I look for when hiring a Fractional Chief AI Officer?**
A: Three filters that matter most. (1) Shipped AI portfolio — they should have personally built and shipped multiple AI products, not just consulted on them. Ask for specific examples with named tools and stack details. (2) SMB experience — they understand the realities of $1M-$50M businesses (limited engineering team, no AI infrastructure, real budget constraints). Enterprise CAIO veterans often struggle with these constraints. (3) Implementation willingness — they're willing to do hands-on work when scope is small, not just strategy. Generic strategy-only candidates can't deliver real value at the SMB scale.
---
## Chief AI Officer: Role, Salary, and When to Hire (2026)
- **URL:** https://justinmckelvey.com/blog/chief-ai-officer
- **Published:** May 23, 2026
- **Updated:** May 23, 2026
- **Category:** AI for Business
- **Reading time:** 6 min
- **Description:** Chief AI Officer in 2026: what they do, $250K-$700K salaries, when to hire one, and the fractional alternative for $1M-$50M businesses.
Quick Answer
A Chief AI Officer (CAIO) is a C-suite executive responsible for AI strategy, implementation, and governance across the business. Salary ranges in 2026: $250K-$400K base for growth-stage startups, $300K-$500K for mid-market, $400K-$1M+ for enterprise. The role emerged in 2024-2025 because AI was too important to leave entirely to CTOs or CIOs. For $1M-$50M businesses, a fractional CAIO ($60K-$180K/year equivalent) almost always beats a full-time hire — the supply of qualified CAIOs is too thin and the cost too high to justify full-time at smaller scale.
Reviewed May 2026 · Author: Justin McKelvey, fractional CTO + AI implementation lead, 50+ products shipped
TL;DR: Chief AI Officer in 2026
The Chief AI Officer is the fastest-growing C-suite role of 2026. Two years ago the title barely existed. Now Fortune 500s are creating dedicated CAIO positions, mid-market companies are debating fractional vs full-time, and small businesses are wondering whether they need one at all. The honest answer for most operators reading this: you probably don't need a full-time CAIO yet, but you likely do need someone playing that role part-time.
This guide covers what a Chief AI Officer actually does (vs the LinkedIn version), what they cost, when to hire one, and when a fractional alternative is the smarter call. It comes from a fractional CTO who's been hired to play exactly this role for $1M-$50M businesses since the term started showing up in board agendas.
What a Chief AI Officer Actually Does
Strip away the LinkedIn job descriptions and the role boils down to six responsibilities:
1. AI strategy — Identifying where AI creates competitive advantage in the business and prioritizing investment. Not "AI transformation" — specifically: which workflows, which business units, which timelines.
2. AI portfolio management — Overseeing all AI projects across the business so they don't duplicate, don't conflict, and roll up into a coherent roadmap.
3. Talent strategy — Hiring AI engineers, partnering with universities, building internal AI literacy, and deciding what to build in-house vs partner.
4. Governance and risk — Data handling policies, output review processes, regulatory compliance (especially in healthcare, finance, government), vendor risk management.
5. Vendor management — Negotiating with OpenAI, Anthropic, Google, plus the enterprise vendors layered on top. Volume buyer; gets the discounts and influence.
6. Board and executive education — Translating AI developments for non-technical board members and executives so the company makes informed bets, not panicked ones.
The best CAIOs spend most of their time on responsibilities #1 and #2 (strategy and portfolio). Weaker CAIOs get sucked into #4 and #6 (governance and education) because those are politically safer activities that produce decks instead of shipped AI. If you're hiring one, ask in the interview: "What percentage of your last engagement was spent on shipped AI vs governance documentation?" The right answer is 60/40 or better in favor of shipping.
Chief AI Officer Salary in 2026
Compensation varies dramatically by company size and industry:

| Company size | Base salary | Total comp | Equity / bonus |
| --- | --- | --- | --- |
| Enterprise (Fortune 500) | $400K–$1M+ | $1M–$3M+ | Significant — board-level package |
| Mid-market ($100M–$1B revenue) | $300K–$500K | $500K–$900K | Standard exec-tier equity |
| Growth-stage startups ($10M–$100M) | $250K–$400K | $400K–$700K + equity | Significant equity (0.5%–2%) |
| Small business ($1M–$10M) | Usually fractional | $60K–$180K/yr equivalent | Mostly cash retainer |
| Enterprise (regulated) | $500K–$1.2M | $2M–$5M | Premium for compliance experience |

The premium pricing isn't because CAIOs are technically harder than CTOs — it's because the supply of qualified candidates is genuinely thin. Most candidates have either deep AI expertise (former ML engineers) or executive experience (former CTOs and consultants), rarely both. The candidates who can do both command the top of the range.
When to Hire a Chief AI Officer
Five signals indicate it's time:
1. AI represents 5%+ of revenue or 10%+ of cost structure. When AI is material to your P&L, it deserves executive ownership.
2. The company has 3+ live AI projects with no single owner. Multiple AI projects without coordination is a recipe for duplicated effort, conflicting decisions, and tools that don't integrate.
3. Regulatory or risk exposure makes AI governance a board-level concern. Healthcare, financial services, defense, and other regulated industries face real liability around AI decisions. Someone needs to own the policy.
4. Competitors have hired CAIOs and you're falling behind on AI roadmap. Defensive hiring — if your direct competitors have CAIOs and you don't, you're at a structural disadvantage on AI investment cadence.
5. The CEO is spending 10%+ of their time on AI decisions they shouldn't be making. When the CEO is the de facto CAIO, both roles suffer. Hiring a CAIO frees the CEO to focus on strategy beyond AI.
For most $1M-$50M businesses, none of these apply yet. The role is justified at scale, not because the title sounds good. Hiring prematurely costs you a $400K salary for work that doesn't yet exist.
CAIO vs CTO vs CDO: Who Owns What?

| Role | Primary ownership | Best for |
| --- | --- | --- |
| CTO | Engineering, infrastructure, security, platform, tech strategy | Companies where technology is the core product |
| CIO | Internal IT, enterprise software, infrastructure operations | Larger companies with complex internal IT |
| CDO (Chief Data Officer) | Data collection, quality, governance, analytics, BI | Data-rich businesses (e-commerce, fintech, SaaS) |
| CAIO (Chief AI Officer) | AI strategy, implementation across business units, governance, AI talent | Companies where AI capabilities are competitive differentiator |
| CDAO (combined) | Data AND AI strategy in one role | Mid-market companies that can't justify both separately |

In smaller companies, the CTO often covers all of these roles informally. The split makes sense at scale when each function has enough work to justify a dedicated executive. For most $1M-$50M businesses, a fractional CTO or fractional CAIO covers 80% of all four roles at 20% of the combined cost.
The Fractional Alternative
For most $1M-$50M businesses, a full-time Chief AI Officer doesn't make sense yet. The work is real but doesn't justify $400K+/year. A fractional CAIO solves the gap: senior AI executive leadership for 8-15 hours per week instead of 40-60.
How fractional CAIO engagements work:
• Cadence: Weekly working sessions plus on-demand consultation
• Time commitment: 8-15 hours/week (vs 40-60 full-time)
• Cost: $5K-$15K/month retainer, or $60K-$180K/year equivalent
• Engagement length: Typically 6-18 months
• Scope: Same six responsibilities as full-time, scaled to actual workload
For most $1M-$50M businesses, this is dramatically better than full-time. You get senior AI leadership at the percentage of time you actually need it, the engagement scales up or down based on roadmap intensity, and the price is sustainable. (More on the fractional Chief AI Officer model.)
How to Hire a Chief AI Officer
If full-time CAIO is justified, the vetting questions matter:
1. "Walk me through three AI initiatives you've personally shipped end-to-end. What were the failure modes?" Real CAIOs have shipped AI products. Strategy-only candidates struggle here.
2. "What percentage of your last engagement was spent on shipped AI vs governance documentation?" 60/40+ in favor of shipping = real practitioner. Inverse = governance specialist who'll struggle on execution.
3. "How would you prioritize our top 5 AI opportunities in the first 90 days?" Specificity in their answer signals operator experience. Vagueness signals consultant background.
4. "What's your point of view on building vs buying for [specific AI capability we need]?" CAIOs need strong technical judgment. Watered-down "it depends" answers signal weak technical depth.
5. "Who in your last role would call you brilliant, and who would call you a pain to work with?" Real CAIOs have strong opinions; some people love them, some hate them. Candidates who claim universal positive feedback are usually politicians.
The Cluster: Going Deeper
• Fractional Chief AI Officer: The SMB Path — Why most $1M-$50M businesses should hire fractional first.
• Chief AI Officer vs Fractional CTO — Which role do you actually need?
• AI Consultant: What They Do, Cost, and How to Hire — Broader landscape of AI hires.
• Fractional CTO vs Full-Time CTO — The related hiring decision.
• The Free AI Readiness Checklist — Self-assessment before any hire.
Working with a Fractional Chief AI Officer
I'm a fractional CTO with deep AI implementation experience — effectively the role most $1M-$50M businesses need when they think they need a CAIO. If you're considering whether your business is ready for AI executive leadership (full-time or fractional), two next steps:
1. Free 20-minute strategy call — gut-check on whether AI is at the scale where you need dedicated leadership. Book here.
2. AI Readiness Assessment — 2 weeks, written roadmap, fixed fee. Tells you the scope and timing for any AI executive hire. Details.
Full engagement options on the Work With Me page.
### Frequently Asked Questions
**Q: What is a Chief AI Officer?**
A: A Chief AI Officer (CAIO) is a C-suite executive responsible for the organization's overall AI strategy, implementation, governance, and competitive positioning. The role emerged in 2024-2025 as boards recognized AI was too important to leave entirely to CTOs (who are technology generalists) or CIOs (who are infrastructure-focused). A CAIO typically reports to the CEO, sits on the executive team, and owns AI roadmap, talent strategy, vendor selection, and AI risk management across the organization.
**Q: What does a Chief AI Officer do day-to-day?**
A: Six core responsibilities: (1) AI strategy — identifying where AI creates competitive advantage and prioritizing investment, (2) AI portfolio management — overseeing all AI projects across the business, (3) talent strategy — hiring, partnering with universities, building internal AI capability, (4) governance and risk — data handling policies, output review processes, regulatory compliance, (5) vendor management — negotiating with OpenAI, Anthropic, Google, plus enterprise vendors, (6) board and executive education — translating AI developments for non-technical leaders. The best CAIOs spend most of their time on #1 and #2; weaker CAIOs spend most of their time on #4 and #6.
**Q: How much does a Chief AI Officer make?**
A: CAIO compensation varies dramatically by company size and industry. Enterprise (Fortune 500): $400K-$1M+ base, $1M-$3M total comp with equity and bonuses. Mid-market ($100M-$1B revenue): $300K-$500K base, $500K-$900K total. Growth-stage startups ($10M-$100M revenue): $250K-$400K base + significant equity. Small business ($1M-$10M): typically can't justify full-time, often hire fractional ($60K-$180K/year equivalent). The role commands premium pay because the supply of qualified CAIOs is genuinely thin — most candidates have either AI expertise OR executive experience, rarely both.
**Q: When should a company hire a Chief AI Officer?**
A: Five signals indicate it's time: (1) AI represents 5%+ of revenue or 10%+ of cost structure, (2) the company has 3+ live AI projects with no single owner, (3) regulatory or risk exposure makes AI governance a board-level concern, (4) competitors have hired CAIOs and you're falling behind on AI roadmap, (5) the CEO is spending 10%+ of their time on AI decisions they shouldn't be making. For most $1M-$50M businesses, none of these apply yet — a fractional CAIO or fractional CTO covers the same scope at a fraction of the cost.
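The five signals above can be read as a simple checklist. The sketch below is one hypothetical way to encode them in Python: the `CompanySnapshot` fields and the signal names are illustrative assumptions, and only the thresholds come from the answer above. Signal 4 is collapsed into a single yes/no flag, which is a simplification.

```python
from dataclasses import dataclass


@dataclass
class CompanySnapshot:
    """Hypothetical inputs; field names are illustrative only."""
    ai_revenue_share: float             # fraction of revenue from AI, 0.0-1.0
    ai_cost_share: float                # fraction of cost structure tied to AI, 0.0-1.0
    live_ai_projects: int               # AI projects currently in production
    has_single_ai_owner: bool           # True if one person owns all AI projects
    governance_is_board_level: bool     # regulatory/risk exposure reaches the board
    behind_competitors_with_caio: bool  # competitors hired CAIOs and you are falling behind
    ceo_time_on_ai: float               # fraction of CEO time spent on AI decisions


def caio_signals(c: CompanySnapshot) -> list[str]:
    """Return the names of the signals that fire, in the order listed in the text."""
    checks = {
        "ai_share": c.ai_revenue_share >= 0.05 or c.ai_cost_share >= 0.10,
        "unowned_projects": c.live_ai_projects >= 3 and not c.has_single_ai_owner,
        "board_governance": c.governance_is_board_level,
        "competitor_gap": c.behind_competitors_with_caio,
        "ceo_time": c.ceo_time_on_ai >= 0.10,
    }
    return [name for name, fired in checks.items() if fired]


# Made-up example company, for illustration only.
example = CompanySnapshot(
    ai_revenue_share=0.01,
    ai_cost_share=0.02,
    live_ai_projects=1,
    has_single_ai_owner=True,
    governance_is_board_level=False,
    behind_competitors_with_caio=False,
    ceo_time_on_ai=0.03,
)
print(caio_signals(example))
```

An empty list corresponds to the "none of these apply yet" case, where the answer above points to fractional coverage instead of a full-time hire.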
**Q: What's the difference between a Chief AI Officer and a CTO?**
A: CTO owns technology broadly — engineering, infrastructure, security, platform decisions. CAIO owns AI specifically — strategy, governance, implementation across business units. In smaller companies, the CTO often covers both roles. In larger companies, separating them lets each focus deeply: the CTO can stay strategic on platform while the CAIO drives AI adoption across the business. The split makes most sense when AI is genuinely a board-level concern — when AI is a side project, you don't need a separate executive.
**Q: What's the difference between a Chief AI Officer and a Chief Data Officer?**
A: CDO owns data — collection, quality, governance, analytics, business intelligence. CAIO owns AI strategy — using that data plus external models to drive competitive advantage and operational improvement. In practice, the two roles overlap significantly. Some companies merge them into a Chief Data and AI Officer (CDAO). The clean split: CDO ensures you have the right data; CAIO uses it to do new things. If you only have budget for one role, CAIO is usually more strategically valuable in 2026 because AI capabilities are evolving faster than data infrastructure.
**Q: What background does a Chief AI Officer typically have?**
A: Three common paths: (1) Technical → executive — started as ML engineer or data scientist, moved into management, became VP Engineering or VP AI, then transitioned to C-suite. Strongest on AI execution. (2) Consulting → executive — partner at a big consulting firm's AI practice, hired into operating role to apply what they recommended. Strongest on strategy and stakeholder management. (3) CTO → CAIO — existing CTO who deepened on AI specifically and transitioned. Strongest on platform integration. The candidates who win the best CAIO roles in 2026 typically have at least one of: a shipped AI product portfolio, a recent fractional CAIO engagement track record, or an extensive operator network.
**Q: Should I hire a full-time or fractional Chief AI Officer?**
A: Full-time CAIO makes sense when: (1) AI is core to the business model (not a side enhancement), (2) the company has $10M+ revenue and can absorb the $400K+ all-in cost, (3) AI roadmap requires 30+ hours/week of executive attention. For most $1M-$50M businesses, none of those are true — fractional CAIO at 8-15 hours/week is the right fit. Cost is typically $60K-$180K/year equivalent (vs $400K-$700K for full-time), and you get senior AI leadership for the percentage of time you actually need it. The fractional model has exploded in 2026 because the supply of qualified full-time CAIOs is too thin to meet demand at most company sizes.
---
## AI Consulting Firm vs Solo Consultant: Which to Hire (2026)
- **URL:** https://justinmckelvey.com/blog/ai-consulting-firm-vs-solo-consultant
- **Published:** May 23, 2026
- **Updated:** May 23, 2026
- **Category:** AI for Business
- **Reading time:** 7 min
- **Description:** AI consulting firm vs solo consultant in 2026: pricing, fit, scope. A fractional CTO's take on when to hire a big firm vs a solo expert.
Quick Answer
For most $1M–$50M businesses, a solo AI consultant or fractional CTO outperforms a big AI consulting firm — at a fraction of the cost. Solo consultants charge $150-$400/hr and $5K-$50K per engagement; firms charge $300-$2,000/hr and $25K-$500K+. The single biggest practical difference: at a firm, junior associates do the work and the senior partner signs the invoice. With a solo consultant, the senior person IS the work. Hire a firm only for enterprise, regulated industries, or board political cover.
Reviewed May 2026 · Author: Justin McKelvey, fractional CTO, 50+ products shipped
TL;DR: AI Consulting Firm vs Solo Consultant
This is one of the most expensive decisions you'll make as an operator hiring AI help. The same scoped deliverable from a solo AI consultant is often 5-10x cheaper than from a big firm — and frequently higher quality. But firms have brand credibility, structured methodologies, and the ability to staff multi-workstream engagements that solos can't. Picking the wrong tier costs either money (overpaying for firm overhead you don't need) or trust (underpaying for solo work that lacks the brand cover you do need).
I'm a fractional CTO who's been hired in both modes and has reviewed engagements from both sides. The honest take after years of watching $25K solo engagements outperform $250K firm engagements (and occasionally the reverse): the right choice depends on your business size, what you're optimizing for, and what kind of cover you need beyond the actual work.
The Comparison at a Glance
| Dimension | AI Consulting Firm | Solo AI Consultant |
| --- | --- | --- |
| Pricing | $300–$2,000/hr · $25K–$500K+ per engagement | $150–$400/hr · $5K–$50K per engagement |
| Who does the work | Junior associates with senior partner on contract | The senior person on the proposal |
| Engagement length | 8–24 weeks typical | 2–8 weeks typical |
| Methodology | Structured frameworks, established playbooks | Custom-fit to the specific engagement |
| Brand credibility | High — name recognition matters politically | Variable — depends on personal reputation |
| Execution credibility | Variable — often 12-18 months behind solos | High when the person has shipped real AI products |
| Implementation work | Rarely — usually hands off to a separate team | Often — many solos do strategy + implementation |
| Best for | Fortune 1000, regulated industries, board politics | $1M–$50M businesses, fast execution, fixed scope |
| Risk profile | Lower brand risk, higher cost risk | Higher brand risk, lower cost risk |
The Single Biggest Difference: Who Actually Does the Work
This is the most important thing to understand about hiring an AI consulting firm vs a solo consultant. At a firm, the senior partner you meet during the sales process is rarely the person doing the actual work.
The typical big-firm structure:
1. Partner — Senior name on the contract. Meets you 2-3 times during the engagement. Reviews deliverables.
2. Senior consultant or principal — Day-to-day project lead. Does some analysis, runs client calls.
3. Senior associate — Does most of the strategic thinking and writing.
4. Junior associates — Do most of the data gathering, slide preparation, and operational work.
You're paying senior partner rates ($500-$2,000/hour) and getting work product primarily from associates with 1-5 years of experience. They may be smart, but they're applying playbook frameworks to your business, not the senior partner's actual judgment.
With a solo AI consultant, the person you talk to during the sales call IS the person who does the work. Their personal judgment, experience, and bandwidth are what you're buying — no leverage model, no pass-through to juniors.
This is why solo consultants often produce sharper work despite lower hourly rates. Their hourly rate is their actual rate, not an average across a leverage model.
When to Hire an AI Consulting Firm
Firms are the right choice when:
1. You're a Fortune 1000 enterprise with complex stakeholder maps. Firms can run parallel workstreams across multiple business units, coordinate with legal and compliance, and produce the level of documentation enterprise programs require. Solos can't realistically do this.
2. You're in a regulated industry that requires firm sign-off. Some industries (healthcare, financial services, government) functionally require the brand of a Big 4 or similar firm on engagement deliverables. The work itself might be identical, but the regulatory political cover only comes with the brand.
3. You need board political cover. Presenting "Deloitte's AI strategy" to a board carries different political weight than presenting "Justin McKelvey's AI strategy." For boards that don't know AI consultants personally, the firm logo is the credibility shortcut.
4. The engagement genuinely requires 5+ parallel workstreams. Rare for AI work in 2026 — most engagements need one or two senior practitioners, not five. But for true enterprise AI transformations covering data infrastructure, model deployment, change management, governance, and reporting, the parallel work is real.
5. You have $500K+ in consulting budget that has to be spent. Sometimes the constraint is consumption, not scope. If you have to deploy a budget, big firms can absorb it.
When to Hire a Solo AI Consultant
Solo consultants are the right choice when:
1. Your business is in the $1M-$50M range. At this size, you can't justify the firm overhead and you don't need the brand cover. A focused solo consultant produces equivalent or better work for 10-20% of the cost.
2. You want execution, not just strategy. Most solo AI consultants in 2026 also do implementation work — they ship the recommendations they wrote. Firms typically refuse to do implementation or split it into a separate engagement.
3. You can define scope tightly. Solo consultants excel at scoped engagements: "Audit our customer support workflow and recommend an AI implementation. Two weeks. $15K." They struggle with open-ended discovery work — that's where firms are structured to excel.
4. You value senior judgment over methodology. The thing you're buying from a solo is one person's actual expertise. The thing you're buying from a firm is methodology consistently applied by a team. For nuanced AI work, individual judgment usually beats methodology.
5. You need to move fast. Solos typically start in 1-2 weeks. Firms typically start in 4-8 weeks after sales, contracting, and team staffing. For competitive markets where AI work has time pressure, speed alone justifies solo.
The Hybrid Option: Fractional CTO
For ongoing AI work — not a one-time deliverable but a multi-month relationship — the right choice is often neither a firm nor a solo consultant, but a fractional CTO.
The fractional model differs from both:
• vs Firm: Embedded with your team, not handed-off-and-gone. Stays through implementation, debugging, rollout, and team development. No leverage model — the senior person is who you get.
• vs Solo consultant: Ongoing relationship instead of one-time project. Learns your business over months, builds trust, can move faster on each new initiative. Typically 8-15 hours/week instead of full-bandwidth engagement.
Pricing is in the middle: $5K-$15K/month retainer for ongoing fractional work, vs $5K-$50K per scoped solo project, vs $25K-$500K per firm engagement. For businesses that expect AI work to be a meaningful percentage of their roadmap for the next 12+ months, fractional CTO almost always beats both alternatives.
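To compare a monthly retainer with per-project and per-engagement prices, it helps to put the retainer on an annual footing. The snippet below is a rough sketch of that arithmetic using only the ranges quoted above; the constant and function names are made up for illustration.

```python
# Price ranges quoted in the text, in US dollars (low, high).
FRACTIONAL_MONTHLY = (5_000, 15_000)   # retainer per month
SOLO_PROJECT = (5_000, 50_000)         # per scoped solo project
FIRM_ENGAGEMENT = (25_000, 500_000)    # per firm engagement


def annualize(monthly_range, months=12):
    """Scale a (low, high) monthly range to a (low, high) total over `months`."""
    low, high = monthly_range
    return (low * months, high * months)


def fmt(price_range):
    """Format a (low, high) pair as a dollar range with thousands separators."""
    low, high = price_range
    return f"${low:,}-${high:,}"


fractional_year = annualize(FRACTIONAL_MONTHLY)
print("Fractional CTO, 12 months:", fmt(fractional_year))
print("One solo project:", fmt(SOLO_PROJECT))
print("One firm engagement:", fmt(FIRM_ENGAGEMENT))
```

Twelve months of retainer works out to $60,000-$180,000, which is the number to set against the count and size of solo projects or firm engagements you would otherwise expect to buy over the same year.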
(More on the fractional vs full-time CTO decision. Also how to hire one.)
Common Mistakes Buying AI Consulting
The four most expensive mistakes I see operators make:
1. Hiring a firm for "credibility" when the actual work is better from a solo. Brand cover is real and sometimes worth paying for, but most operators don't actually need it — they just feel safer hiring a known name. If your board won't look at the deliverable name, save the money.
2. Hiring a strategy-only consultant when you need implementation. AI strategy is the easy part. Implementation is where most AI projects die. If you're going to need someone to ship the recommendations, hire someone who can do both — not a strategy specialist who hands off to your team.
3. Accepting open-ended hourly billing. The classic firm upsell vector. Define scope, demand fixed pricing, get a deliverable in writing. Hourly-with-no-cap is how $50K engagements become $500K.
4. Not specifying who will do the work. Firms sell you the senior partner and deliver work from junior associates. The fix: name the partners and associates in the contract, lock in the time allocation, refuse to accept substitutions without your written approval.
How to Decide for Your Business
Three diagnostic questions:
1. "Will my board / investors / regulators specifically look at the consulting firm name?" If yes — big firm is justified. If no — solo or fractional is almost always better.
2. "Do I need ongoing AI capacity for the next 12+ months, or a one-time deliverable?" One-time = solo or firm. Ongoing = fractional CTO.
3. "Do I need someone to also ship the recommendations, or just produce the strategy?" Ship = solo with implementation experience or fractional. Just strategy = either, but solos are cheaper.
For most $1M-$50M businesses reading this, the answers are: "no, yes, yes" — which means a fractional CTO is the right hire, not an AI consulting firm.
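One way to read the three questions is as a small decision function. The sketch below is hypothetical: the function name, the argument names, and especially the order in which the questions are checked are assumptions, since the text does not say which answer wins when they conflict.

```python
def recommend(firm_name_scrutinized: bool, ongoing_need: bool, needs_shipping: bool) -> str:
    """Map answers to the three diagnostic questions onto an engagement type.

    Assumed precedence (not stated in the text): brand scrutiny first,
    then ongoing need, then whether implementation is required.
    """
    if firm_name_scrutinized:
        # Board, investors, or regulators will look at the firm name.
        return "AI consulting firm"
    if ongoing_need:
        # AI capacity is needed for the next 12+ months.
        return "fractional CTO"
    if needs_shipping:
        # One-time deliverable, but someone has to ship it.
        return "solo consultant with implementation experience"
    # One-time, strategy-only work where brand cover is not needed.
    return "solo consultant"


# The "no, yes, yes" pattern described above.
print(recommend(firm_name_scrutinized=False, ongoing_need=True, needs_shipping=True))
```

With the "no, yes, yes" answers this returns "fractional CTO", matching the conclusion above. A different precedence order would change the result in mixed cases, so treat the ordering as a starting point.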
Related Posts in This Cluster
• AI Consultant: What They Do, Cost, and How to Hire — The hub post for this cluster.
• AI Strategy Consultant: What You Get (and Skip) — Deep dive on the strategy-specific subset.
• Fractional CTO vs Full-Time CTO — The closely related hiring decision.
• How to Hire a Fractional CTO — Tactical guide.
• The Free AI Readiness Checklist — Self-assessment before hiring anyone.
Working with a Fractional CTO
I'm a fractional CTO who works with $1M-$50M businesses on AI implementation. If you're comparing AI consulting firms vs solo consultants and considering whether a fractional engagement might fit better, the right next step is one of two things:
1. Free 20-minute strategy call — gut-check on which engagement type fits your business. Book here.
2. AI Readiness Assessment — the productized version of strategy consulting. 2 weeks, fixed fee, written deliverable, optional follow-on implementation. Details.
If you'd rather see all the engagement options before talking, that's on the Work With Me page.
### Frequently Asked Questions
**Q: Is an AI consulting firm better than a solo AI consultant?**
A: Neither is universally better — they fit different business profiles. AI consulting firms are best for Fortune 1000 enterprises that need parallel workstreams, brand cover for board presentations, or regulated-industry sign-off. Solo AI consultants are best for $1M-$50M businesses that need focused expertise, fixed-fee scoped engagements, and the senior person actually doing the work. For most operators reading this, a solo consultant or fractional CTO is the better fit.
**Q: How much does an AI consulting firm cost vs a solo consultant?**
A: Solo AI consultants typically charge $150-$400/hour, with full engagements ranging from $5K (1-week sprint) to $50K (3-month implementation). AI consulting firms charge $300-$2,000/hour, with engagements typically $25K-$500K+. The same scoped deliverable from a solo consultant is often 5-10x cheaper than a firm — and frequently higher quality because the senior person is the one doing the work, not a junior associate.
**Q: Who actually does the work — partners or associates?**
A: At AI consulting firms, the typical pattern is: senior partner on the contract, junior consultants doing the actual analysis and writing, occasional partner check-ins. You're paying senior rates and getting work product from people 2-3 levels down. With a solo AI consultant, the person on the proposal IS the person doing the work. This is the single biggest practical difference between firms and solos — and the reason solo consultants often produce sharper work despite less brand prestige.
**Q: When should I hire an AI consulting firm instead of a solo consultant?**
A: Hire a firm when: (1) you're a Fortune 1000 enterprise with complex stakeholder maps and parallel workstreams, (2) you're in a regulated industry that requires firm sign-off, (3) you need brand cover for a board presentation (a McKinsey or Deloitte logo on the deck matters politically), (4) the engagement requires 5+ people working in parallel for months, or (5) you have $500K+ in consulting budget that has to be spent. None of these apply to most operators — small businesses, startups, and mid-market companies almost always do better with a solo consultant.
**Q: Are big AI consulting firms more credible?**
A: Credibility depends on what you're optimizing for. Big firms have brand credibility — putting 'Accenture AI strategy' on a board deck communicates rigor to non-technical stakeholders. But brand credibility isn't the same as execution credibility. The brutal truth in 2026 is that many big-firm AI practices are 18 months behind solo practitioners who've been shipping AI products since GPT-3. If you're optimizing for political cover, hire the firm. If you're optimizing for results, hire the practitioner.
**Q: Can a solo AI consultant handle a $10M+ business?**
A: Yes — most solo AI consultants regularly serve businesses in the $10M-$50M range. The work scales fine because strategy and tool selection are intellectual work, not headcount work. The places solo consultants hit limits: parallel workstreams (one person can only run one or two engagements deep at once), regulated industries requiring firm sign-off, and engagements that genuinely require multi-disciplinary teams (rare for AI work in 2026 — most engagements need one senior practitioner, not five).
**Q: What's a fractional CTO and how does it compare?**
A: A fractional CTO is a senior technical leader who engages with your business on an ongoing basis (typically 8-15 hours/week) instead of a fixed project. They do everything an AI consultant does — strategy, tool selection, implementation oversight — but stay through execution and team development. For ongoing AI work, a fractional CTO usually outperforms both solo consultants and firms because the relationship compounds. They learn your business, build trust, and can move faster on each new initiative.
**Q: How do I avoid getting upsold by an AI consulting firm?**
A: Three protections: (1) Define the scope in one sentence before any sales calls and refuse to expand it. Firms expand scope to expand fees. (2) Insist on fixed-fee pricing for fixed deliverables. Hourly-with-no-cap is the upsell vector. (3) Negotiate the team composition into the contract — name the partners and associates who will work on it, and lock those names in. The upsell trick is selling you senior names and delivering work from junior staff.
---
## AI Consultant: What They Do, Cost, and How to Hire (2026)
- **URL:** https://justinmckelvey.com/blog/ai-consultant
- **Published:** May 23, 2026
- **Updated:** May 23, 2026
- **Category:** AI for Business
- **Reading time:** 7 min
- **Description:** AI consultants in 2026: roles, hourly rates ($150-$500), how to hire, and when to choose a fractional CTO instead.
Quick Answer
An AI consultant helps businesses figure out where and how to use artificial intelligence — strategy, tool selection, implementation, team training. In 2026, solo consultants charge $150-$400/hour ($5K-$50K per engagement); boutique firms charge $300-$800/hour; big firms charge $500-$2,000/hour. The single most important filter when hiring: have they personally shipped an AI product, or only consulted on them? For $1M-$50M businesses, a focused solo consultant or fractional CTO usually beats a big firm.
Reviewed May 2026 · Author: Justin McKelvey, fractional CTO, 50+ products shipped
TL;DR: AI Consultants in 2026
An AI consultant is a specialist who helps businesses use artificial intelligence productively. Some focus on strategy (where to use AI); some focus on implementation (building AI workflows); some on transformation (org-wide change management). The market in 2026 is enormous and confused — every consulting firm has rebranded into "AI" and every freelancer has added "AI" to their LinkedIn. Filtering signal from noise matters more than ever.
This guide is the honest take from a fractional CTO who's hired AI consultants, been hired as one, and reviewed dozens of engagements both successful and failed. I'll cover the actual landscape, what consultants do, what they cost, how to hire one well, when to skip them entirely, and when a fractional CTO is the better hire.
The Three Types of AI Consultants
Almost every AI consultant on the market in 2026 falls into one of three buckets:

| Type | Focus | Deliverable | Pricing | Best for |
| --- | --- | --- | --- | --- |
| AI Strategy Consultant | Roadmaps and use case identification | Written strategy doc or slide deck | $5K–$250K per engagement | Businesses that don't yet know where to start |
| AI Implementation Consultant | Building AI tools and workflows | Shipped working AI product | $15K–$250K per project | Businesses with a defined use case |
| AI Transformation Consultant | Org-wide change management | Multi-year program rollout | $100K–$5M+ engagements | Enterprise companies (Fortune 1000) |
Most "AI consultants" you'll encounter are actually strategy consultants who'd like to also do implementation. They produce roadmaps but rarely build the products themselves. This is fine — strategy and execution are different skills — but it matters when you're hiring. If you need execution and hire a strategy-only consultant, you'll get a roadmap that's hard to implement.
What an AI Consultant Actually Does
Across all three types, the day-to-day work falls into seven categories:
1. Workflow audits — Mapping every repeatable workflow in your business that touches data or decision-making, identifying which ones AI could improve.
2. Use case prioritization — Scoring potential AI projects on impact × feasibility, ranking them, and recommending what to build first.
3. Tool selection — Picking specific products (Claude vs ChatGPT vs Gemini, plus the SaaS stack to integrate them with) for your specific situation.
4. Prompt engineering and configuration — Designing the actual prompts, custom instructions, and Project setups that make the AI tools reliable.
5. Integration work — Connecting AI tools to your existing systems via APIs, MCP servers, or no-code automation platforms.
6. Team training — Teaching your staff to use the new AI tools effectively, including governance norms and review processes.
7. Governance setup — Data handling rules, vendor risk assessment, output review processes, and disclosure policies.
Good consultants spend most of their time on #3-#6 (tools, prompts, integrations, training). Bad consultants spend most of their time on #1-#2 (audits and prioritization). Audits are necessary but they're the easy part — the value is in the doing, not the planning.
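The impact × feasibility scoring in the use case prioritization step is easy to make concrete. Here is a minimal sketch, assuming a 1-5 scale for both dimensions; the scale, the class and function names, and the example scores are all hypothetical and only show the mechanics of multiplying and ranking.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    impact: int       # 1 (low) to 5 (high); assumed scale
    feasibility: int  # 1 (hard) to 5 (easy); assumed scale

    @property
    def score(self) -> int:
        """Priority score: impact multiplied by feasibility."""
        return self.impact * self.feasibility


def prioritize(cases):
    """Return use cases sorted from highest to lowest score."""
    return sorted(cases, key=lambda c: c.score, reverse=True)


# Made-up scores, for illustration only.
candidates = [
    UseCase("Support email triage", impact=4, feasibility=5),
    UseCase("First-pass contract drafting", impact=5, feasibility=2),
    UseCase("Lead scoring", impact=3, feasibility=4),
]

for case in prioritize(candidates):
    print(f"{case.score:>2}  {case.name}")
```

Multiplying rather than adding means a project that scores very low on either dimension sinks toward the bottom, which fits the idea of recommending what to build first.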
AI Consultant Pricing in 2026
Real numbers from the current market:

| Tier | Hourly rate | Per-engagement | What you get |
| --- | --- | --- | --- |
| Solo independent | $150–$400 | $5K–$50K | The senior person doing the actual work. Fast, focused, narrow scope. |
| Boutique firm (3–10 ppl) | $300–$800 | $25K–$250K | Mix of senior and mid-level consultants, defined methodology, longer engagements. |
| Big firm AI practice | $500–$2,000 | $100K–$500K+ | Senior partner on contract, junior associates doing work. Brand cover and process. |
| Enterprise transformation | $1,000–$3,000 | $500K–$5M+ | Multi-year programs, change management, parallel workstreams. |
| Fractional CTO (alternative) | $150–$300 | $5K–$15K/mo retainer | Embedded ongoing engagement, strategy + implementation in one person. |
For most $1M-$50M businesses, the solo independent or fractional CTO options give the best value. Big-firm engagements rarely justify their premium unless you specifically need brand cover, regulatory sign-off, or 5+ people working in parallel.
How to Hire an AI Consultant (Without Wasting Money)
The single most important step: define the scope in one sentence before any sales calls.
Examples:
• "We want to reduce customer support email volume by 30% using AI."
• "We want to draft first-pass legal contracts in 10 minutes instead of 2 hours."
• "We want to identify which leads are worth our sales team's time using AI."
One-sentence scope filters consultants fast. Generic consultants will try to expand the scope ("you might also want to consider..."). Good consultants will tighten it ("which specific workflow within customer support — incoming triage, response drafting, or escalation?").
The vetting questions that matter:
1. "Walk me through an AI product you personally built end-to-end. What were the three things that almost killed it?"
2. "What specific tools would you recommend for [my use case]? Name the products, not categories."
3. "If we like the strategy and want help shipping the first piece, can you do that?"
4. "What's the fixed price for a scoped engagement that produces [specific deliverable]?"
5. "Show me a written deliverable from a recent engagement — not a slide deck, the actual work product."
Listen for specificity in their answers. Vague answers ("we'd evaluate that as part of the discovery phase") = generic consultant. Specific answers ("for that use case I'd start with Claude with a custom Project, integrated via [specific tool]") = practitioner who'll actually help.
AI Consultant Red Flags
Hard signs you should NOT hire:
• Portfolio shows presentations and case studies, not shipped products
• Refuses to do implementation work ("we only do strategy")
• Opaque or hourly-with-no-cap pricing
• No specific tool recommendations — only frameworks and matrices
• Proposes a 3-6 month "strategy phase" before any execution
• Heavy use of buzzwords: "transformation," "leverage," "synergy," "paradigm shift"
• Can't walk through a specific AI product they personally built
• Inability to commit to fixed pricing for fixed deliverables
• Wants extensive "discovery" before quoting any work
The single biggest red flag, restated: they've never personally shipped an AI product. Strategy consultants who can't build will produce strategies that can't be built.
When to Hire an AI Consultant vs a Fractional CTO
This is the most common confusion in the market, so worth being clear:
Hire an AI consultant when:
• You need a one-time strategy deliverable (roadmap, audit, opportunity map)
• You're presenting to a board and need outside brand cover
• You have a defined implementation project with clear specs and need senior expertise to ship it
• You're a Fortune 1000 with regulated industry needs
Hire a fractional CTO when:
• You want ongoing strategic guidance, not a one-time deliverable
• You need someone embedded with your team through implementation, not just before it
• You're a $1M-$50M business that can't justify a full-time CTO but needs senior technical leadership
• You want strategy + implementation in the same person (most fractional CTOs do both)
• You expect AI work to be a meaningful percentage of your roadmap for 12+ months
For most operators reading this, the fractional CTO path is the right one. AI consulting is a deliverable; fractional CTO is a relationship. Relationships compound; deliverables expire. (More on the fractional vs full-time CTO decision.)
The Specific Engagements I Offer (and Don't)
For full transparency, here's how I structure AI engagements:
1. Free 20-minute strategy call — gut-check on fit. Book here.
2. AI Readiness Assessment ($) — 2 weeks, written 15-25 page roadmap, fixed fee. The productized version of AI strategy consulting. Details.
3. AI Implementation work ($$) — Project-based, fixed scope, ships one workflow end-to-end. Typically 4-8 weeks.
4. Fractional CTO engagement ($$$) — Ongoing 8-15 hours/week, embedded with your team. Strategy + implementation + team development.
What I don't do: open-ended hourly billing, 6-month strategy phases, slide-deck deliverables, enterprise transformation programs, regulated industries requiring firm sign-off. If those are what you need, a big-firm AI practice is the right fit and I'm not.
The Cluster: Going Deeper on AI Consulting
If you want to go deeper on specific aspects of hiring AI consultants:
• AI Strategy Consultant: What You Get (and Skip) — Honest take on the strategy-focused subset.
• AI Consulting Firm vs Solo AI Consultant — Which is right for your business size.
• Fractional CTO vs Full-Time CTO — The closely related hiring decision.
• How to Hire a Fractional CTO — If you've decided fractional is the right path.
• The Free AI Readiness Checklist — Self-assessment to score where your business stands before hiring anyone.
Working with a Fractional CTO
If you've read this far and you're a $1M-$50M business considering AI consulting, the right next step is usually one of two things:
1. Start with the free AI Readiness Checklist — 5 minutes, gives you a score and tells you where to focus. /ai-readiness-checklist
2. Book a 20-minute strategy call — free, no pitch, just a gut-check on whether you need outside help. /book/strategy-call
If you'd rather see the full engagement options before talking, that's on the Work With Me page.
### Frequently Asked Questions
**Q: What is an AI consultant?**
A: An AI consultant is a specialist who helps businesses figure out where and how to use artificial intelligence. Some focus on strategy (roadmaps and use case identification), some on implementation (building AI tools and workflows), and some on transformation (organization-wide change management). In 2026, the term covers everyone from solo practitioners charging $150/hour to consulting firms charging $500/hour to enterprise practices charging $50K+ per engagement.
**Q: How much does an AI consultant cost in 2026?**
A: Pricing varies by consultant type and engagement scope. Solo AI consultants typically charge $150-$400/hour, with full projects ranging from $5K (focused 1-week sprint) to $50K (3-month implementation). Boutique firms charge $300-$800/hour and $25K-$250K per engagement. Big consulting firms (Deloitte, Accenture, McKinsey AI practices) charge $500-$2,000/hour with engagements typically $100K-$500K+. The price-to-quality ratio is often inverse — focused solo consultants who've shipped real products usually outperform big-firm associates running playbooks.
**Q: What does an AI consultant actually do?**
A: Day-to-day, AI consultants do some mix of: workflow audits (finding where AI fits in your business), tool selection (recommending specific products to use), prompt engineering (designing AI inputs that produce reliable outputs), integration work (connecting AI tools to existing systems), team training (teaching staff to use AI effectively), and governance setup (data handling rules, output review processes, vendor risk). The good ones spend most of their time on workflows and tools — the bad ones spend most of their time on slides.
**Q: Do I need an AI consultant or a fractional CTO?**
A: If you can name three workflows in your business where AI would change the outcome this quarter, you don't need an AI consultant — you need implementation. A fractional CTO (or AI-focused fractional CTO) is the better hire because they stay through the build phase. Traditional AI consultants engage for weeks at a fixed fee, deliver a deck or roadmap, and leave. Fractional CTOs engage ongoing (typically 8-15 hours/week) and stay through implementation, debugging, and rollout. For $1M-$50M businesses that need execution, fractional is almost always the better fit.
**Q: How do I hire an AI consultant?**
A: Three steps: (1) Define the scope in one sentence before any sales calls, for example 'We want to reduce customer support email volume by 30% using AI.' Specific goals filter out generic consultants fast. (2) Ask candidates to walk through an AI product they personally built end-to-end. If they can't, they're a strategy-only consultant, which is fine for some engagements but wrong for execution. (3) Insist on fixed-fee scoped engagements, not open-ended hourly. Good consultants give clear pricing for clear deliverables. An answer that stays at 'it depends on scope' indefinitely is a yellow flag; 'I bill $X/hour without a cap' is a red flag.
**Q: What's the difference between an AI consultant and an AI consulting firm?**
A: Solo AI consultant = one person, often the senior name doing the actual work, $150-$400/hour, $5K-$50K per engagement. AI consulting firm = team of consultants, often a senior partner on the contract with junior associates doing the work, $300-$2,000/hour, $25K-$500K per engagement. Solo consultants win on focus, quality, and price for most $1M-$50M businesses. Firms win on regulated industries that require firm sign-off, brand cover for board presentations, and engagements requiring 5+ people in parallel.
**Q: What are AI consultant red flags?**
A: Watch for: portfolios that show presentations instead of products, refusal to do implementation work, opaque or hourly-with-no-cap pricing, no specific tool recommendations (just frameworks), 6-month strategy phases before execution, heavy use of buzzwords ('transformation,' 'leverage,' 'synergy'), inability to walk through a specific AI product they built end-to-end, and proposals that include extensive 'discovery phases' before any visible work. The single biggest red flag: they've never personally shipped an AI product.
**Q: How long does an AI consulting engagement take?**
A: Depends on scope. Productized AI Readiness Assessments: 2 weeks, fixed fee, written deliverable. Focused AI implementation sprints: 4-8 weeks, builds one workflow end-to-end. Ongoing fractional engagements: 3-12 months, embedded with the team. Big-firm transformation projects: 6-24 months, multiple workstreams. For most operators, shorter and more focused beats longer and broader — better to ship one workflow in 6 weeks than complete a 6-month strategy that goes stale before execution.
**Q: Can a single AI consultant handle a large company?**
A: It depends on what 'handle' means. A solo consultant can produce strategy and roadmaps for any size company — strategy work scales. Implementation gets harder as company size grows because you need team alignment, change management, and multiple parallel workstreams. For sub-$50M companies, a solo consultant or solo fractional CTO is usually sufficient. For $50M+ businesses with complex stakeholder maps, you typically need either a firm or a solo lead plus a build team. Don't over-staff small problems.
---
## AI Strategy Consultant: What You Get (and Skip) in 2026
- **URL:** https://justinmckelvey.com/blog/ai-strategy-consultant
- **Published:** May 23, 2026
- **Updated:** May 23, 2026
- **Category:** AI for Business
- **Reading time:** 7 min
- **Description:** AI strategy consultants in 2026: what they do, who needs one, who doesn't, and the honest take from a fractional CTO.
Quick Answer
An AI strategy consultant produces a roadmap recommending where AI can create leverage in your business — what to build first, what tools to use, what to skip. The honest version produces a 15-25 page document covering specific workflows, ranked opportunities, and a 90-day implementation plan. The bad version produces 60 slides on "AI transformation." Solo consultants charge $5K-$25K for a 2-4 week engagement; big firms charge $50K-$500K for the same deliverable. Quality is usually inverse to price.
Reviewed May 2026 · Author: Justin McKelvey, fractional CTO, 50+ products shipped
TL;DR: AI Strategy Consultants in 2026
An AI strategy consultant evaluates your business and produces a roadmap telling you where AI can create the most leverage. Done well, the deliverable is a written 15-25 page document with specific workflows, ROI-ranked opportunities, tool picks, and an implementation timeline. Done poorly, it's a 60-slide deck that uses the word "transformation" 200 times and tells you nothing actionable.
The fundamental question to ask before hiring: can you name three workflows in your business where AI would change the outcome this quarter? If yes, you don't need strategy — you need implementation. If no, you might benefit from a structured outside perspective.
I'm a fractional CTO who's been on both sides of this — produced AI roadmaps for $5M-$50M businesses and also reviewed roadmaps from $400K consulting engagements. The good news: a solo consultant who's actually shipped AI products almost always beats a big firm staffed with junior associates running pattern-matched playbooks. The bad news: you have to know how to tell them apart.
What an AI Strategy Consultant Actually Does
The core deliverable across every honest AI strategy engagement looks roughly the same:
1. Workflow audit: Map every repeatable workflow in your business that touches data or decision-making.
2. Opportunity ranking: Score each workflow on impact (time saved, revenue affected, error reduction) × feasibility (data available, tools mature, team capacity).
3. Tool recommendations: Specific products to use, not categories. "Claude with custom Projects" beats "consider an LLM platform."
4. Governance and data handling: What data can/can't be sent to which AI vendors, who reviews AI outputs, when to disclose to customers.
5. Implementation timeline: 30-60-90 day plan with named owners and concrete ship dates.
That's the entire job. The "transformation roadmap" framing common in big-firm work usually obscures the fact that most businesses need exactly five things done, not fifty.
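The opportunity-ranking step is simple enough to sketch in a few lines. This is a minimal illustration, not a standard scoring model: the 1-5 scales, the `Workflow` class, and the sample workflows are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    impact: int       # assumed 1-5 scale: time saved, revenue affected, error reduction
    feasibility: int  # assumed 1-5 scale: data available, tools mature, team capacity

    @property
    def score(self) -> int:
        # Multiplying (rather than adding) pushes down any workflow
        # that is weak on either axis.
        return self.impact * self.feasibility


def rank_opportunities(workflows: list[Workflow]) -> list[Workflow]:
    """Return workflows sorted from highest to lowest score."""
    return sorted(workflows, key=lambda w: w.score, reverse=True)


# Hypothetical audit results, for illustration only.
audit = [
    Workflow("Invoice reconciliation", impact=3, feasibility=2),
    Workflow("Support email triage", impact=4, feasibility=5),
    Workflow("Sales proposal drafting", impact=5, feasibility=3),
]

for w in rank_opportunities(audit):
    print(f"{w.score:>2}  {w.name}")
```

With these made-up numbers, support email triage ranks first: it is not the highest-impact workflow, but it is the one most likely to actually ship.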
Solo Consultant vs Big Firm: The Real Differences
| Dimension | Solo AI strategy consultant | Big-firm AI practice |
|---|---|---|
| Pricing | $5K–$25K fixed fee, 2–4 weeks | $50K–$500K+, 8–24 weeks |
| Who does the work | The consultant (the senior name on the engagement) | Junior associates with senior name on the contract |
| Implementation experience | Usually built and shipped AI products themselves | Usually researched and presented; rarely shipped |
| Deliverable | 15–25 page written roadmap | 60–150 slide presentation |
| Specificity | Names exact tools and workflows | Frameworks, matrices, "options to consider" |
| Follow-through | Often offers implementation engagement | Rarely touches actual implementation |
| Best for | $1M–$50M businesses | Fortune 1000 with regulated industries + board reporting needs |
For most operators reading this, the solo consultant is the right call. The big firm is the right call only if you specifically need brand cover for a board presentation, you're in a regulated industry that requires firm sign-off, or you have $500K of consulting budget you have to spend.
What to Look for When Hiring an AI Strategy Consultant
Three filters that matter more than the rest:
1. Have they shipped real AI products, or just consulted on them?
This is the single most important filter. AI strategy fails in the messy middle of implementation — model output quality, data quality, prompt engineering, edge cases, governance reality. Consultants who've never built AI products produce strategies that don't anticipate those failures.
Ask: "Walk me through an AI product you personally built end-to-end. What were the three things that almost killed it?" If they can't answer specifically, they're a strategy-only consultant and their roadmap will be optimistic in the wrong places.
2. Does the proposed deliverable name specific tools and workflows, or hide behind frameworks?
A good AI strategy roadmap says: "Use Claude Pro with a Project configured for [your specific use case], plug it into [your existing tool] via [specific integration]. Expected time savings: 8 hours/week for [team]. Total subscription cost: $20/month."
A bad one says: "Evaluate language model platforms across the dimensions of capability, cost, and integration complexity, with consideration for governance and ethical AI principles."
One is a plan. The other is a framework. Pay for plans.
3. Will they engage on the actual build if you want them to?
Strategy consultants who refuse to touch implementation often produce strategies that can't be implemented. The discipline of "I might have to ship this" tightens the strategy in ways that "I'll hand it off" doesn't.
Ask: "If we like the roadmap and want help shipping the first piece, can you do that?" Good answers: yes, here's how. Bad answers: no, we only do strategy; we'll refer you to an implementation partner; that's a separate scope of work.
The Red Flags
Signs you should not hire this person or firm:
• Their portfolio is presentations, not products. Click through their case studies — do they show what they built or just what they presented?
• They quote "AI transformation" or "enterprise AI maturity" without ever defining the workflow they'd touch first.
• Their pricing is opaque or "depends on scope" indefinitely. Good consultants give fixed pricing for scoped work.
• They've never written code or shipped a product end-to-end. Doesn't matter if they're brilliant — they don't know what they don't know.
• They use the words "leverage," "synergy," or "paradigm shift" in the first call. Real practitioners describe problems specifically.
• They quote you a 6-month strategy phase before any execution. Strategy that takes 6 months produces strategy that's outdated when implementation starts.
The Better Alternative for Most Businesses
For most $1M–$50M businesses, the right move isn't an AI strategy consultant at all. It's a fixed-fee AI Readiness Assessment, the productized version of strategy consulting.
How it works: 2 weeks of focused work, a defined scope, a written deliverable. The Assessment includes:
• Workflow audit (what's repeatable, what's worth automating)
• Opportunity map ranked by ROI
• Tool recommendations with specific products
• Governance and data-handling guidance
• 30-60-90 day implementation plan
• 1-hour walkthrough call
The price is fixed (no hourly billing), I cap it at 2-3 engagements per month so quality stays high, and the fee is credited against any follow-on build work if you choose to engage further.
This is what I offer in place of traditional AI strategy consulting. More on the AI Readiness Assessment here.
When to Skip the Consultant Entirely
Three signals you don't need strategy consulting at all:
1. You can already name your top 3 AI workflows. If you have an opinion on what to build first, skip strategy and hire an implementation partner.
2. Your business is under $5M revenue. Most founders at this stage can absorb the strategy thinking themselves in a focused weekend.
3. You have an internal AI champion. Someone on the team who's already using Claude or ChatGPT daily, knows the workflows, and has opinions. Give them budget and time, not an outside consultant.
For all three cases, a better starting point is the free 30-question AI Readiness Checklist. Five minutes, no fee, gives you a score and a tier-specific recommendation. Most owners discover they don't need a consultant after running it.
How I Think About AI Strategy Engagements
My positioning: I'm not an "AI consultant" in the traditional sense. I'm a fractional CTO who builds AI products and helps other founders do the same. The strategy work I do is the necessary front-end of an implementation engagement, not a separate product.
That biases me: I think strategy is overrated and execution is underrated. But it's worth knowing my bias before you hire me (or anyone). If you want a 6-month strategy deliverable with no implementation expected, I'm not your person. If you want a tight 2-week roadmap followed by hands-on help shipping it, I'm a closer fit.
Working with a Fractional CTO
If you've read this far and your business is in the $1M-$50M range, the right next step is usually one of two things:
1. The free AI Readiness Checklist: 5 minutes, gives you a score and tells you where to focus first. Start here: /ai-readiness-checklist
2. A 20-minute strategy call: free, no pitch, just a gut-check on whether you actually need outside help or can handle this internally. Book: /book/strategy-call
If after both of those you want a paid engagement, the AI Readiness Assessment is the right starting point — fixed fee, defined scope, 2 weeks of focused work.
### Frequently Asked Questions
**Q: What does an AI strategy consultant do?**
A: An AI strategy consultant evaluates a business's workflows, data, team capabilities, and competitive landscape, then produces a roadmap recommending where AI can create the most leverage. The deliverable is typically a written strategy document covering use cases ranked by ROI, tool recommendations, governance requirements, and an implementation timeline. The honest version focuses on what to actually build; the bad version is 60 slides on 'AI transformation' that nobody reads.
**Q: Do I need an AI strategy consultant?**
A: Only if you can't name three workflows in your business where AI would change the outcome this quarter. If you can name them, you don't need strategy — you need implementation. AI strategy consultants are most useful for businesses that have heard 'we should be using AI' from the board, have budget allocated, but don't know where to start. If you already know what to build, hire an implementation partner instead.
**Q: How much does an AI strategy consultant cost in 2026?**
A: Pricing varies wildly. Boutique solo consultants charge $150-$400/hour or $5K-$25K for a 2-4 week engagement producing a written roadmap. Big consulting firms charge $50K-$500K for the same deliverable wrapped in more slides. The actual quality is usually inverse to the price — a focused solo consultant who's shipped real AI products typically produces better roadmaps than a top-tier firm staffed with junior associates running pattern-matched playbooks.
**Q: What's the difference between an AI strategy consultant and a fractional CTO?**
A: An AI strategy consultant produces a roadmap; a fractional CTO executes one. Strategy consultants engage for weeks at a fixed fee and walk away with a deliverable. Fractional CTOs engage ongoing (typically 8-15 hours/week) and stay through implementation. If you want a plan to hand to your team, hire a strategy consultant. If you want someone embedded who'll help you ship the plan, hire a fractional CTO. Many fractional CTOs (including me) offer both — a fixed-fee AI Readiness Assessment that produces a roadmap, then optional ongoing engagement for execution.
**Q: What should I look for in an AI strategy consultant?**
A: Three things: (1) Have they shipped real AI products themselves, or just consulted on them? Implementation experience matters because most AI strategies fail in the messy middle. (2) Does their deliverable include specific tool recommendations and implementation steps, or just frameworks and matrices? Generic 'consider these dimensions' output is worthless. (3) Will they engage on the actual build if you want them to? Strategy consultants who refuse to touch implementation often produce strategies that can't be implemented.
**Q: What are the red flags when hiring an AI strategy consultant?**
A: Common red flags: (1) Their portfolio is presentations, not products. (2) They quote 'AI transformation' or 'enterprise AI maturity' without ever defining the workflow they'd touch first. (3) Their pricing is opaque or 'depends on scope' indefinitely — good consultants give fixed pricing for scoped work. (4) They've never written code or shipped a product. (5) They use the words 'leverage,' 'synergy,' or 'paradigm shift' in their first call. (6) They quote you a 6-month strategy phase before any execution.
**Q: When should I skip the AI strategy consultant entirely?**
A: Skip when: (1) Your business is small enough (under $5M revenue) that you can absorb the strategy thinking yourself in a week — most founders can. (2) You already know which workflow to start with, and you just need someone to help build it. (3) You can run a self-assessment (like the free 30-question AI Readiness Checklist) and act on the results. (4) Your team has an internal AI champion who already understands the business and the tools. In those cases, skip strategy entirely and hire an implementation partner directly.
**Q: Is an AI Readiness Assessment the same as AI strategy consulting?**
A: An AI Readiness Assessment is the productized, fixed-fee version of AI strategy consulting. Instead of an open-ended engagement, you get a defined scope: 2 weeks of work, a workflow audit, an opportunity map ranked by ROI, tool recommendations, and a 30-60-90 day implementation plan. The benefit is predictable cost and clear deliverable. The downside is it doesn't include the ongoing 'thinking partner' role traditional strategy consultants offer. For most $1M-$50M businesses, an Assessment is what they actually need.
---
## OpenAI Codex Review (2026): Honest Take From a Fractional CTO
- **URL:** https://justinmckelvey.com/blog/openai-codex-review
- **Published:** May 21, 2026
- **Updated:** May 21, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 6 min
- **Description:** OpenAI Codex review by a fractional CTO. Real pricing, model quality, autonomy, what works and what doesn't. Is the terminal agent worth it?
Quick Verdict
OpenAI Codex is the most autonomous terminal-based AI coding agent in 2026 — best for unsupervised bulk work. Six months of daily use as a fractional CTO. Pricing $5–$250/month depending on usage, or included in ChatGPT Plus/Pro subscriptions. Wins on aggressive task completion, open-source CLI, and the o-series for hard reasoning. Loses to Claude Code on focus discipline, and to Cursor/Windsurf on editor-bound workflows.
Reviewed May 2026 · 6+ months daily use · Author: Justin McKelvey, fractional CTO, 50+ products shipped
TL;DR: OpenAI Codex Review
OpenAI Codex is a terminal-native autonomous coding agent: install the open-source CLI, point it at a project, and it plans steps, edits files, runs commands, executes tests, fixes errors, and iterates until the task is done. Default model: GPT-5. Optional: the o-series for hard reasoning tasks.
The short version: if you're a professional developer who works in terminals and you do a lot of mechanical bulk refactors, Codex is the right tool. The aggressive autonomy means tasks finish faster than Claude Code's more conservative pace. If you value smaller, reviewable diffs over raw speed, Claude Code is the safer default.
If you're a non-developer or your work lives in an editor, Codex is the wrong tool. Look at Bolt, Lovable, or Cursor/Windsurf instead.
What OpenAI Codex Is
Codex is a terminal CLI: install via npm or Homebrew, run it in a project directory, and give it goals like "refactor the auth flow to use NextAuth" or "rename this function across the codebase." It plans steps, executes them, and iterates. The entire experience is in your shell.
What it isn't: an IDE plugin, a chat interface, a web-based tool, or anything visual. It's a power-user CLI.
Pricing Breakdown (May 2026)
The Codex CLI is free (open source on GitHub). The actual cost is model usage, either billed against your OpenAI API key or included in a ChatGPT subscription. Real numbers from six months of daily use:
| Usage profile | Daily time | Monthly cost (API) | Subscription option |
|---|---|---|---|
| Light user | 1 hour/day, focused tasks | $5–$15 | ChatGPT Plus ($20) covers it |
| Moderate user | 3–4 hours/day mixed work | $30–$60 | ChatGPT Plus ($20) often enough |
| Heavy user | Full-time agentic coding | $80–$250 | ChatGPT Pro ($200) is the right tier |
| Power user | Multiple parallel sessions | $250–$600+ | Pro ($200) + API overage usually cheapest |
For most professional developers, ChatGPT Plus ($20/month) is the right starting tier. Heavy users should jump to ChatGPT Pro ($200/month) — the bundled credits make heavy Codex usage essentially free at the margin once you cross 4-5 hours/day of agentic work.
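To make the API-versus-subscription comparison concrete, here is a rough sketch that compares the midpoint of each API cost range from the pricing table against the flat subscription price. It assumes the subscription fully covers that profile's usage, which is a simplification; the dictionary and function names are mine, not part of any OpenAI tooling.

```python
# Figures copied from the pricing table above (USD per month).
API_COST_RANGE = {
    "light": (5, 15),
    "moderate": (30, 60),
    "heavy": (80, 250),
}
SUBSCRIPTION_FOR_PROFILE = {
    "light": ("ChatGPT Plus", 20),
    "moderate": ("ChatGPT Plus", 20),
    "heavy": ("ChatGPT Pro", 200),
}


def compare(profile: str) -> dict:
    """Compare the midpoint API cost with the flat subscription price.

    Assumes the subscription covers all of the profile's usage,
    which will not hold for every user.
    """
    low, high = API_COST_RANGE[profile]
    plan, price = SUBSCRIPTION_FOR_PROFILE[profile]
    midpoint = (low + high) / 2
    cheaper = plan if price < midpoint else "API key"
    return {
        "profile": profile,
        "api_midpoint": midpoint,
        "plan": plan,
        "plan_price": price,
        "cheaper_at_midpoint": cheaper,
    }


for name in API_COST_RANGE:
    print(compare(name))
```

At the midpoints, pay-per-use wins for light users, Plus wins for moderate users, and Pro only wins once API spend climbs past $200/month, which sits in the upper part of the heavy range.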
The Models: GPT-5 + the o-series
Codex's default model is GPT-5, with the o-series available for tasks that need extra reasoning (algorithmic work, math-heavy computation, complex multi-step planning). You can also drop down to GPT-4.5 or older models for cost optimization on simple tasks.
What GPT-5 is best at: Aggressive task completion, mechanical refactors, code generation across multiple files, fast iteration loops, and tasks where you want the agent to make decisions without asking.
What it's less good at: Long-context reasoning over 100K+ tokens (Claude 4.7 Sonnet has a slight edge), maintaining narrow focus on a single file when the broader codebase is relevant (it tends to expand scope), and writing-quality explanations of code (Claude is the better writer).
What Codex Does Well
Three strengths stand out after six months of real production use:
1. Aggressive autonomous task completion. When I tell Codex to "rename this function across the codebase and update all callers," it just does it. No pausing on the first file to confirm scope, no asking permission before editing files it thinks need updating, no surfacing intermediate decisions for review. It finishes faster than Claude Code on mechanical tasks. For unsupervised long-running work — kick off a task in tmux, do a meeting, come back to a finished feature — Codex is the right tool.
2. Open-source CLI. The tool itself is on GitHub, well-documented, and integrates cleanly into existing dev workflows. You can inspect the prompts it sends, modify behavior via config, and extend it with custom scripts. Claude Code's closed CLI doesn't offer this level of inspection.
3. The o-series for hard reasoning. When a task requires actual algorithm work or mathematical reasoning, switching Codex to o1 or o3 changes the quality of output noticeably. Claude Code's only knob is "Sonnet vs Opus" — Codex has a deeper bench of reasoning-heavy models to reach for.
Where Codex Falls Short
Three real weaknesses to know about before committing:
1. Loose focus discipline. Codex's "act, don't ask" default behavior is its strength on bulk tasks and its weakness on scope-sensitive work. It will install packages without asking, edit adjacent files you didn't mention, and make architectural assumptions to push the task to completion. When those assumptions are correct, great. When they're wrong, you have a sprawling diff to untangle. For production code review, this is real friction. (Claude Code is more conservative on this front.)
2. OpenAI-only models. You can't use Claude, Gemini, or other providers through Codex. If you've found that Claude is better for your specific work (and many engineers have), you're stuck switching tools entirely to use it. Cursor's per-prompt model selection is meaningfully more flexible here.
3. No visual interface. Pure terminal interaction is great for some workflows and painful for others. Frontend work where you need to see UI updates as the agent edits the React component is awkward — you're flipping between terminal and browser. Cursor or Windsurf are dramatically better here.
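One cheap guard against the sprawling-diff problem from the first weakness is to check which files an agent touched before you review line by line. The sketch below parses the text output of `git diff --numstat` (one line per file: added count, deleted count, path, separated by tabs) and flags paths outside the directories you asked for. The function name and sample paths are hypothetical, and renamed files (shown by git with `=>` in the path) are not handled.

```python
def out_of_scope_files(numstat: str, allowed_prefixes: tuple[str, ...]) -> list[str]:
    """Return changed paths that fall outside the allowed directories.

    `numstat` is the text printed by `git diff --numstat`:
    added<TAB>deleted<TAB>path, one file per line.
    """
    flagged = []
    for line in numstat.splitlines():
        if not line.strip():
            continue
        # Binary files report "-" for both counts; only the path matters here.
        _added, _deleted, path = line.split("\t", 2)
        if not path.startswith(allowed_prefixes):
            flagged.append(path)
    return flagged


# Hypothetical diff summary after asking the agent to touch only app/auth/.
sample = (
    "12\t3\tapp/auth/session.py\n"
    "40\t0\tapp/billing/invoice.py\n"
    "1\t1\tpackage.json\n"
)
print(out_of_scope_files(sample, ("app/auth/",)))
```

Anything the function returns is a file to look at first, because it is where the agent made a decision you did not ask for.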
Real-World Usage: What I Use It For
Six months in, here's how Codex has settled into my actual workflow:
• Bulk refactors — rename across 30+ files, library migrations, framework upgrades. Codex's autonomy is a real productivity win here.
• Test backfills — "write unit tests for every public method in this directory" runs faster in Codex than alternatives.
• Schema migrations — database migrations, ORM model updates, the kind of mechanical changes that benefit from autonomous completion.
• Greenfield exploration — when I want the agent to make architectural decisions I'll review later, Codex's aggressive approach matches the work.
• Background tasks via tmux — start a Codex session, walk away, review when done.
Things I do NOT use Codex for:
• Production-sensitive code (Claude Code's smaller diffs are safer)
• Frontend UI work (Cursor wins on visual feedback)
• Long-context refactors over 100K+ tokens (Claude 4.7 Sonnet wins)
• Anything where I need to use Claude or Gemini for that specific task
How It Compares to Alternatives
Quick reference for common comparisons:
• Claude Code vs Codex — both terminal agents. Codex wins on autonomy + speed; Claude Code wins on focus + long-context.
• Cursor vs Codex — IDE vs terminal. Different chairs for different work; most pros use both.
• Claude Code review — the head-to-head pick on the Anthropic side.
• The 7 best AI coding agents in 2026 — full landscape with Codex's place in it.
Verdict: Should You Use OpenAI Codex?
Yes, if:
• You're a professional developer who does a lot of mechanical bulk work
• You value autonomy and speed over diff size
• You're already paying for ChatGPT Pro (bundled credits make it free at the margin)
• You want open-source tooling you can inspect and extend
• You work over SSH on remote servers (IDE-based tools can't compete here)
No, if:
• You're a non-developer building apps from prompts (use Lovable or Bolt)
• Your work is mostly frontend UI iteration (use Cursor or Windsurf)
• You strongly prefer Claude's reasoning style or already pay for Claude Max
• You need maximum focus discipline on production code (Claude Code is the safer default)
Most professionals in 2026 install both Codex and Claude Code, switching based on the task. Total cost is typically $40-$120/month combined, depending on usage tier, which is less than one hour of senior developer time. The productivity gain is real, and the focus-vs-autonomy tradeoff between them is significant enough to justify having both available.
Working with a Fractional CTO
I help founders pick the right AI coding tool stack for their team, and I review what AI agents have produced before it ships to customers. If you're vibe-coding an MVP and worried about what happens at scale, or you've shipped something with Codex and want a professional review before launch, book a strategy call. The first call is free.
### Frequently Asked Questions
**Q: Is OpenAI Codex worth it in 2026?**
A: For professional developers, yes — Codex is one of the two best terminal-based AI coding agents available (the other being Claude Code). It costs $5-$250/month depending on usage, the underlying GPT-5 and o-series models are excellent at autonomous task completion, and the open-source CLI is well-maintained. The main reasons NOT to use it: you only do simple frontend work where Cursor's IDE wins, you strongly prefer Anthropic's Claude models, or you need maximum focus discipline on production code (Claude Code is more conservative).
**Q: How much does OpenAI Codex cost?**
A: The Codex CLI itself is free (open source). The actual cost is the model usage, billed against your OpenAI API key. Typical professional usage: light user (1 hour/day) $5-$15/month, moderate user (3-4 hours/day) $30-$60/month, heavy user (full-time agentic work) $80-$250/month. ChatGPT Plus ($20/month) and ChatGPT Pro ($200/month) subscriptions include Codex credits — Pro especially is cost-effective if you're already paying for ChatGPT and use Codex heavily.
**Q: What does OpenAI Codex do well?**
A: Three things stand out: (1) Aggressive autonomous task completion — Codex installs packages, edits adjacent files, and pushes through multi-step tasks without asking for confirmation by default. Great for bulk refactors and unsupervised work. (2) Open-source CLI — the tool itself is on GitHub, well-maintained, and integrates cleanly into existing dev workflows. (3) Access to the o-series models for hard reasoning tasks (algorithm-heavy work, mathematical computation, complex multi-step planning). For mechanical bulk refactors, Codex finishes faster than Claude Code.
**Q: Where does OpenAI Codex fall short?**
A: Three real weaknesses: (1) Loose focus discipline — Codex will edit adjacent files it thinks need updating, which creates sprawling diffs that are expensive to review when the assumptions were wrong. (2) OpenAI-only models — you can't use Claude or Gemini through Codex even when those models would be better for a specific task. (3) No visual interface — pure terminal, harder for frontend work where you need to see UI updates. For visual feedback during agent work, Cursor or Windsurf are dramatically better.
**Q: Is Codex better than Claude Code?**
A: Neither is universally better — they're more similar than different. Use Codex if: you want maximum autonomy on long-running tasks, you do a lot of mechanical bulk refactors, you're already paying for ChatGPT Pro (the bundled credits make it free at the margin), or you prefer GPT-5's writing style. Use Claude Code if: you value focus discipline and reviewable diffs, you do long-context refactors that benefit from Claude 4.7 Sonnet's stronger long-context reasoning, or you're already paying for Claude Pro/Max. Most professionals install both.
**Q: Is Codex safe for production code?**
A: Codex generates code of similar quality to other top AI agents when using equivalent models. The risk is that Codex's aggressive autonomy creates larger, harder-to-review diffs. It will install packages, edit files you didn't mention, and make architectural assumptions to push tasks across the finish line — sometimes correctly, sometimes not. For production code, the discipline is to review every change, run tests, and use version control checkpoints before letting Codex run. Treat agent output the same way you'd treat a junior developer's pull request.
**Q: Does OpenAI Codex work over SSH?**
A: Yes — Codex is a CLI that runs anywhere a shell does, including remote servers via SSH. This is one of the biggest practical advantages of terminal-based agents (both Codex and Claude Code) over IDE-based alternatives like Cursor and Windsurf. For DevOps work, server administration, or working on remote development machines, terminal agents are in a different league than IDE agents.
**Q: Is OpenAI Codex open source?**
A: The Codex CLI itself is open source on GitHub. The underlying GPT-5 and o-series models are closed (proprietary to OpenAI). This is a small but real differentiator vs Claude Code, where the CLI is closed source as well as the models. If you want to inspect or modify how the agent works, Codex is the more open option among major terminal agents in 2026.
---
## Claude Code Review (2026): Is Anthropic's CLI Worth It?
- **URL:** https://justinmckelvey.com/blog/claude-code-review
- **Published:** May 21, 2026
- **Updated:** May 21, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 7 min
- **Description:** Honest Claude Code review after 6 months. Pricing, models, focus discipline, what it does well, where it falls short. By a fractional CTO.
Quick Verdict
Claude Code is the best terminal-based AI coding agent in 2026 for focused, judgment-heavy work. Six months of daily use as a fractional CTO. Pricing $5–$300/month depending on usage, or included in Claude Pro/Max subscriptions. Wins on focus discipline, long-context reasoning, and code review. Loses to Codex on aggressive bulk-task automation, and to Cursor/Windsurf on editor-bound workflows. Most pros install Claude Code AND one other tool.
Reviewed May 2026 · 6+ months daily use · Author: Justin McKelvey, fractional CTO, 50+ products shipped
TL;DR: Claude Code Review
Claude Code is Anthropic's terminal-native autonomous coding agent. It runs in your shell, reads files, executes commands, runs tests, fixes errors, and iterates on multi-step coding tasks, all powered by Claude 4.7 Sonnet (default) or Claude Opus 4.7 (heavier tasks). I've used it as a primary tool since launch in 2025, shipping production code in Rails, React, Python, and Go. This is the honest review.
The short version: if you're a professional developer who works in terminals and you've never tried it — try it. Claude Code is one of the two best terminal agents in 2026 (the other being OpenAI Codex), and Claude's focus discipline makes it the safer default for production work where diff size matters.
If you're a non-developer or someone whose work lives in an editor, Claude Code is probably the wrong choice. Cursor or Windsurf are better fits.
What Claude Code Is
Claude Code is a terminal CLI: you install it like any developer tool, point it at a project directory, and interact with it like you would a senior developer over Slack. Give it a goal ("rewrite this auth flow to use Devise"), and it plans steps, edits files, runs tests, fixes errors, and iterates until the task is done.
What it isn't: an IDE, an in-editor assistant, a chat interface in a browser, or a "no-code" tool. The entire experience lives in your terminal.
Pricing Breakdown (May 2026)
Claude Code itself is free to install. The actual cost is model usage, billed against your Anthropic API key. Real numbers from six months of daily use:

| Usage profile | Daily time | Monthly cost (API) | Cheaper via subscription? |
|---|---|---|---|
| Light user | 1 hour/day, focused tasks | $5–$15 | Pay-per-use cheaper than $20/mo Pro |
| Moderate user | 3–4 hours/day mixed work | $30–$80 | Pro ($20) breaks even ~2 hr/day |
| Heavy user | Full-time agentic coding | $100–$300 | Max ($100/mo) saves big at this volume |
| Power user | Multiple parallel agents | $300–$800+ | Max ($100) + API overage usually cheapest |

For most professional developers, Claude Pro ($20/month) is the right starting tier — gets you ~2-3 hours/day of moderate usage covered before API charges kick in. Heavy users should jump to Max ($100/month) which includes a much higher credit allotment.
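The break-even figure in the pricing table can be checked with a little arithmetic. The sketch below is an illustration, not Anthropic's billing logic: it assumes API spend scales linearly with daily hours, and it takes its rate from the low end of the moderate row ($30/month at 3 hours/day, so about $10/month per daily hour). The function name and the rate constant are hypothetical.

```python
def break_even_daily_hours(subscription_price: float, cost_per_daily_hour: float) -> float:
    """Daily hours at which pay-per-use API spend equals a flat subscription."""
    if cost_per_daily_hour <= 0:
        raise ValueError("cost_per_daily_hour must be positive")
    return subscription_price / cost_per_daily_hour


# Assumption: roughly $10/month for each hour of daily use, derived from
# the low end of the moderate-user row ($30/month at 3 hours/day).
ASSUMED_RATE = 30 / 3

pro_break_even = break_even_daily_hours(20, ASSUMED_RATE)
print(f"Pro ($20) breaks even at about {pro_break_even:.1f} hours/day")
```

Under that assumed rate the result lands on 2 hours/day, which lines up with the "~2 hr/day" note in the table. Swap in your own observed spend per daily hour to get a number that reflects your usage.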
The Model: Claude 4.7 Sonnet (and Opus)
Claude Code's default model is Claude 4.7 Sonnet, with Claude Opus 4.7 available for heavier tasks (long-context refactors, complex reasoning, code review). The model quality is the actual product; Claude Code is a thin agentic wrapper around the model's capabilities.
What Claude 4.7 Sonnet is best at: Long-context reasoning (200K+ tokens), code understanding across large codebases, judgment calls about architecture, writing-quality explanations of what code is doing, and refactor work where you need to maintain a consistent pattern across many files.
What it's less good at: Pure algorithm work (GPT-5 has a slight edge here), heavy mathematical computation, and ultra-fast iteration on small tasks (it sometimes over-explains when you just want the answer).
What Claude Code Does Well
Six months of real production work surfaces three consistent strengths:
1. Focus discipline. When I ask Claude Code to fix a specific bug in a specific file, it fixes that bug in that file. It doesn't "helpfully" update three other files that reference the function and break them in the process. This sounds basic, but it's the single thing I've found most consistently disappointing in competing agents — Codex and Windsurf's Cascade especially love to expand scope without asking. Smaller diffs are easier to review, easier to revert, and less likely to introduce regressions.
2. Long-context refactors. Claude 4.7 Sonnet handles 200K+ token contexts more reliably than GPT-5. For tasks like "read this entire 12,000-line Rails service and rewrite it to use the new auth pattern," Claude Code is the right tool. The agent loads the relevant files, plans the migration, executes it section by section, and stops to confirm risky decisions. I've used this for actual client production refactors with great results.
3. Code review and explanation. Beyond writing code, Claude Code is excellent at reading code and explaining what it does, where it might break, and what's idiomatic vs unusual. I use it for code review on legacy projects I'm taking over — "summarize the auth model" or "tell me where this codebase deviates from Rails conventions" produces useful, accurate output.
Where Claude Code Falls Short
Three real weaknesses to know about before you commit:
1. No visual interface. Every interaction is terminal-only. For frontend work where you need to see UI updates as the agent edits the React component, this is painful. You end up flipping between terminal and browser to see results. Cursor or Windsurf are dramatically better for visual UI iteration.
2. Less aggressive autonomy than Codex. Claude Code's conservatism is a feature for production work, but it can feel slow on mechanical bulk tasks. If you want to "rename this function across 47 files and update all callers" and walk away, Codex is faster — its more aggressive default behavior pushes through without asking for confirmation at each step. Claude Code might pause on the first file to confirm scope. (More on this in the Claude Code vs Codex comparison.)
3. Anthropic-only models. You can't easily swap to GPT-5 or Gemini for tasks where those models are stronger. If you want multi-model flexibility — Claude for some tasks, GPT for others, Gemini occasionally — Cursor's per-prompt model selection is meaningfully more flexible.
Real-World Usage: What I Use It For
Six months in, here's how Claude Code has settled into my actual workflow:
• Production refactors — when scope discipline matters. Renaming, moving code between files, updating to new patterns.
• Code review of legacy projects — when I take over an existing codebase, Claude Code is the first tool I use to map what's there.
• Backend bug fixing — focused diagnostic work where I know which file has the problem.
• SQL and database migrations — terminal-native is the right environment for these tasks.
• Documentation writing — Claude is the best writer of the major AI models, and Claude Code can read the actual code while writing about it.
• Remote server work — anywhere I'm SSH'd into a server, Claude Code is available.
Things I do NOT use Claude Code for:
• Frontend UI iteration (Cursor wins)
• Mechanical bulk refactors with 30+ near-identical edits (Codex wins)
• Prompt-to-app from scratch for non-coders (Lovable or Bolt)
How It Compares to Alternatives
Quick reference for the most common comparisons:
• Claude Code vs Codex — both terminal agents. Claude wins on focus + long context; Codex wins on autonomy + speed.
• Claude Code vs Cursor — terminal CLI vs IDE. Different chairs for different work; most pros use both.
• Windsurf vs Claude Code — IDE agent vs terminal agent. Windsurf for editor-bound work; Claude Code for terminal-heavy.
• The 7 best AI coding agents in 2026 — full landscape with Claude Code's place in it.
Verdict: Should You Use Claude Code?
Yes, if:
• You're a professional developer who works in terminals daily
• You value focus discipline and reviewable diffs over maximum autonomy
• You do long-context refactors or code review work
• You're already paying for Claude Pro or Max (you get included credits)
• You work over SSH on remote servers (IDE-based tools can't compete here)
No, if:
• You're a non-developer building apps from prompts (use Lovable or Bolt instead)
• Your work is mostly frontend UI iteration (use Cursor or Windsurf)
• You strongly prefer the OpenAI ecosystem and already pay for ChatGPT Pro (Codex's bundled credits make it free at the margin)
• You need multi-model flexibility (Cursor's per-prompt model picker is more flexible)
Most professionals in 2026 install both Claude Code and one IDE-based tool (Cursor or Windsurf), switching based on task. Total cost: usually $40-$120/month combined, less than one hour of senior developer time. The productivity gain is real.
Working with a Fractional CTO
I help founders pick the right AI coding tool stack for their team and review what AI agents have produced before it ships to customers. If you're vibe-coding an MVP and worried about what happens at scale, or you've shipped something with Claude Code and want a professional review, book a strategy call. The first call is free.
### Frequently Asked Questions
**Q: Is Claude Code worth it in 2026?**
A: For professional developers, yes — Claude Code is one of the two best terminal coding agents available (the other being OpenAI Codex). It costs $5-$300/month depending on usage, the underlying Claude 4.7 Sonnet model is excellent at long-context refactors and code review, and the agent has strong focus discipline compared to alternatives. The main reasons NOT to use it: you're not a coder and need a visual IDE-based tool, you prefer the OpenAI model ecosystem, or you only do simple frontend work where Cursor's IDE features matter more.
**Q: How much does Claude Code cost?**
A: Claude Code itself is a free CLI install. The actual cost is the model usage, billed against your Anthropic API key. Typical professional usage: light user (1 hour/day) costs $5-$15/month, moderate user (3-4 hours/day) costs $30-$80/month, heavy user (full-time agentic work) costs $100-$300/month. Anthropic also offers Claude Pro ($20/month) and Max ($100/month) subscriptions that include Claude Code usage credits — these are cheaper for users who already have a Claude subscription for other work.
**Q: What does Claude Code do well?**
A: Three things stand out after six months of daily use: (1) Long-context refactors — Claude 4.7 Sonnet handles 200K+ token contexts more reliably than competitors, making it the best choice for understanding large codebases. (2) Focus discipline — the agent stays in the scope you specified instead of helpfully editing adjacent files, which keeps diffs reviewable. (3) Judgment-heavy work — Claude reasons about tradeoffs and architectural decisions better than most alternatives. Less impressive at pure mechanical bulk tasks where Codex's more aggressive autonomy wins.
**Q: Where does Claude Code fall short?**
A: Three real weaknesses: (1) No visual interface — every interaction is terminal-only, which is harder for frontend work where you need to see UI updates. Cursor or Windsurf are better here. (2) More conservative than Codex on autonomous task completion — Claude Code asks for permission on destructive operations and stays in scope; Codex more aggressively pushes through. For unsupervised bulk refactors, Codex finishes faster. (3) Anthropic-only models — you can't easily swap to GPT-5 or Gemini for tasks where those models are stronger.
**Q: Is Claude Code safe for production code?**
A: Claude Code generates code of similar quality to other top AI agents when using equivalent models. The risk isn't the tool — it's whether you're reviewing the diff before shipping. Claude Code is safer than alternatives because of its focus discipline (smaller, more reviewable diffs) and its default behavior of asking permission before destructive operations. But it will still happily ship code with subtle bugs in authentication, payment webhooks, multi-tenant scoping, and complex migrations. Treat agent output the same way you'd treat a junior developer's pull request — review every change.
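To make the multi-tenant point concrete, here is a minimal sketch of the kind of query to look for in review. It assumes a hypothetical `invoices` table with a `tenant_id` column and uses an in-memory SQLite database purely for illustration; the table, column, and function names are not from any real project.

```python
import sqlite3


def invoices_for_tenant(conn: sqlite3.Connection, tenant_id: int) -> list:
    """Return (id, amount) rows for one tenant only.

    The WHERE clause is the part to check in agent-written code: without
    it, the query returns every tenant's rows.
    """
    return conn.execute(
        "SELECT id, amount FROM invoices WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    ).fetchall()


# Hypothetical sample data: two tenants sharing one table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (id INTEGER PRIMARY KEY, tenant_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO invoices (tenant_id, amount) VALUES (?, ?)",
    [(1, 100), (2, 250), (1, 40)],
)

print(invoices_for_tenant(conn, 1))
```

When reviewing a diff, any query against shared tables that lacks an equivalent tenant filter deserves a second look.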
**Q: Is Claude Code better than Cursor?**
A: They target different workflows, so 'better' depends on the work. Claude Code is a terminal CLI optimized for autonomous multi-step tasks, focused refactors, and CLI-heavy workflows. Cursor is an IDE optimized for editor-bound work, frontend development, and pair-programming feel. Most professional developers in 2026 use both — Cursor as the daily-driver editor, Claude Code for terminal-heavy tasks. The actual question isn't 'better' but 'which one matches the work you're doing right now.'
**Q: Should I use Claude Code or Codex?**
A: Both are excellent terminal agents and they're more similar than different. Use Claude Code if: you value focus discipline (smaller, more reviewable diffs), you do a lot of long-context refactors or code review, you're already paying for Claude Pro/Max, or you prefer Claude's writing style for explanations. Use Codex if: you want maximum autonomy on long-running tasks, you do a lot of mechanical bulk refactors, you're already paying for ChatGPT Pro, or you prefer the OpenAI model ecosystem. Most pros install both and switch based on the task.
**Q: Does Claude Code work over SSH?**
A: Yes — Claude Code runs anywhere a shell does, including remote servers via SSH. This is one of its biggest practical advantages over IDE-based tools like Cursor and Windsurf, which require a local install with GUI access. For DevOps work, server administration, or working on remote development machines, Claude Code is in a different league than IDE agents.
**Q: Is Claude Code open source?**
A: No — the Claude Code CLI is closed source, distributed by Anthropic as a binary install. The underlying Claude models are also closed (proprietary to Anthropic). If open source matters to you, OpenAI Codex's CLI is open source (the underlying GPT-5 models are still closed), and some community-built agents like Aider are fully open source on top of various model APIs.
---
## The 7 Best AI Coding Agents in 2026 (Ranked + Compared)
- **URL:** https://justinmckelvey.com/blog/best-ai-coding-agents-2026
- **Published:** May 21, 2026
- **Updated:** May 21, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 8 min
- **Description:** The 7 best AI coding agents in 2026: Claude Code, Codex, Cursor, Windsurf, Bolt, Lovable, Replit. Pricing, fit, and when each wins from a fractional CTO.
Quick Answer
Seven AI coding agents are worth using in 2026: Claude Code, OpenAI Codex, Cursor, Windsurf, Bolt, Lovable, and Replit Agent. No single tool wins all workflows. For most professional developers, the best combination is Cursor ($20/mo) for editor work + Claude Code (~$5–$300/mo) for terminal-heavy tasks. For non-developers, Lovable ($25/mo) ships the most polished output. For maximum autonomy in a terminal, OpenAI Codex. The "best" agent depends entirely on what you're building.
Updated May 2026 · All 7 tools used in production work · Author: Justin McKelvey, fractional CTO, 50+ products shipped
TL;DR: The Best AI Coding Agents in 2026
AI coding agents in 2026 fall into three groups: terminal agents (Claude Code, OpenAI Codex), AI-first IDEs (Cursor, Windsurf), and browser-based agents (Bolt, Lovable, Replit Agent). Each group serves a different workflow. Most professional developers use 2–3 in combination.
This guide ranks all seven by use case — not by some abstract "best." There is no single best AI coding agent. There's only the best agent for the work you're doing right now.
I'm a fractional CTO who's used every tool in this guide in real production work — building MVPs, refactoring 15,000-line Rails apps, shipping React frontends, and rescuing vibe-coded projects from founders who shipped before they tested. Here's the honest comparison.
The 7 AI Coding Agents Compared

| Agent | Type | Pricing (May 2026) | Best for |
|---|---|---|---|
| Cursor | AI-first IDE (VS Code fork) | $20/mo Pro · Free tier (2K completions/mo) | Editor-bound pros, frontend, scope-sensitive edits |
| Claude Code | Terminal CLI (Anthropic) | $5–$300/mo via API · Pro $20/mo · Max $100/mo | Focused refactors, long-context analysis, judgment calls |
| OpenAI Codex | Terminal CLI (OpenAI) | $5–$250/mo via API · ChatGPT Plus $20/mo · Pro $200/mo | Autonomous bulk refactors, mechanical tasks, ChatGPT-stack users |
| Windsurf | AI-first IDE (Codeium) | $15/mo Pro · $60/mo Ultimate | Aggressive in-editor agent, batch refactors, greenfield builds |
| Lovable | Browser-based prompt-to-app | $25/mo Pro · Free tier (5 msgs/day) | Non-developers, polished MVPs, products with hosting included |
| Bolt.new | Browser-based prompt-to-app | $20/mo Pro · Free tier (1M tokens/day) | Rapid prototyping, demos, flexible stack choice |
| Replit Agent | Browser-based, cloud Linux | $20/mo Core · Free tier · $40/mo Teams | Collaborative cloud coding, education, full Linux Repls |

How to Pick the Right AI Coding Agent
Before naming tools, name your workflow. The choice cascades cleanly:
1. Where do you code? Local editor → Cursor or Windsurf. Terminal → Claude Code or Codex. Browser → Bolt, Lovable, or Replit Agent.
2. How much autonomy do you want? Tight scope → Claude Code or Cursor. Maximum autonomy → Codex or Windsurf.
3. Can you read code? Yes → editor or terminal tools. No → Lovable or Bolt.
4. What's your existing stack? ChatGPT Pro user → Codex. Claude Max user → Claude Code. VS Code user → Cursor.
Most professional developers in 2026 settle on 2–3 tools: one editor-based for daily work, one terminal-based for refactors and CLI tasks, occasionally a browser-based agent for spinning up demos.
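The first three questions in the cascade can be read as a simple filter. The sketch below is one possible encoding, not an official decision tool: the function name, the argument values (`"editor"`, `"terminal"`, `"browser"`, `"tight"`, `"max"`), and the order in which the questions are applied are all assumptions made for illustration. The existing-stack question is left out and works as a tiebreaker when more than one tool remains.

```python
# Tool groupings copied from the questions above.
WHERE = {
    "editor": {"Cursor", "Windsurf"},
    "terminal": {"Claude Code", "Codex"},
    "browser": {"Bolt", "Lovable", "Replit Agent"},
}
AUTONOMY = {
    "tight": {"Claude Code", "Cursor"},
    "max": {"Codex", "Windsurf"},
}
NON_CODER_TOOLS = {"Lovable", "Bolt"}


def suggest_agents(where: str, autonomy: str, reads_code: bool) -> list[str]:
    """Narrow the seven agents using the workflow questions."""
    # Assumption: if you can't read code, that answer overrides the others.
    if not reads_code:
        return sorted(NON_CODER_TOOLS)
    candidates = WHERE[where]
    # Autonomy only narrows the list when it overlaps (browser tools are
    # not ranked on autonomy in the questions above).
    narrowed = candidates & AUTONOMY[autonomy]
    return sorted(narrowed or candidates)


print(suggest_agents("terminal", "tight", reads_code=True))
print(suggest_agents("browser", "max", reads_code=True))
```

A terminal user who wants tight scope ends up with Claude Code alone, while a browser user keeps all three browser agents because the autonomy question says nothing about them.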
1. Cursor: Best Overall for Editor-Bound Developers
What it is: A VS Code fork with in-editor AI chat, predictive tab completions, and a Composer agent that handles multi-file edits when you ask for them.
Pricing: $20/month Pro (500 fast premium requests, unlimited slow requests, full tab completion). Free tier includes 2,000 completions/month. Business tier $40/seat.
Why it wins: Cursor is the most popular professional IDE for AI coding in 2026 for good reason — the tab completion is meaningfully faster than competitors, the Composer agent surfaces decisions for human review, and migrating from VS Code is zero-cost (extensions all work). For frontend developers especially, the visual diff panel during agent edits is hard to give up once you're used to it.
Where it falls short: Terminal-heavy workflows feel awkward — running a 47-file refactor inside Cursor is slower than running it via Claude Code in a terminal. Also: model selection (Claude vs GPT vs Gemini) is per-prompt, which adds friction for some workflows.
Read more: Cursor vs Windsurf · Cursor vs Codex · Claude Code vs Cursor
2. Claude Code: Best Terminal Agent for Focused Work
What it is: Anthropic's terminal-native autonomous coding agent. Runs in your shell, reads files, executes commands, fixes errors, iterates on multi-step tasks. Default model: Claude 4.7 Sonnet.
Pricing: Free CLI; usage billed against Anthropic API ($5–$300/month typical) or included in Claude Pro ($20/mo) / Max ($100/mo) subscriptions.
Why it wins: Claude 4.7 Sonnet's long-context reasoning is the best in 2026 for code refactors that require understanding broader patterns. The agent has strong focus discipline — it stays in the scope you asked for instead of editing adjacent files unprompted. Best terminal-based pick for production code review and judgment-heavy refactors.
Where it falls short: The terminal-only interface is harder for newer developers. Frontend work where you need to see UI updates feels clunky.
Read more: Claude Code vs Codex · Claude Code vs Cursor · Windsurf vs Claude Code
3. OpenAI Codex: Best Terminal Agent for Autonomous Bulk Work
What it is: OpenAI's terminal-native autonomous coding agent. Same shape as Claude Code (runs in your shell, executes commands, iterates on errors) but with GPT-5 and the o-series models, and a more aggressive autonomy default.
Pricing: Free CLI; usage billed against OpenAI API ($5–$250/month typical) or included in ChatGPT Plus ($20/mo) / Pro ($200/mo) subscriptions.
Why it wins: Codex is the most autonomous CLI agent in 2026 — it installs packages, edits adjacent files, and pushes tasks across the finish line without asking. For mechanical bulk refactors (rename across 47 files, schema migrations, test backfills), it finishes faster than Claude Code's more conservative pace.
Where it falls short: The aggressive autonomy creates sprawling diffs that are expensive to review when the assumptions were wrong. Stay-in-scope discipline is weaker than Claude Code.
Read more: Claude Code vs Codex · Cursor vs Codex
4. Windsurf: Best IDE Agent for Maximum In-Editor Autonomy
What it is: A Codeium-built IDE with Cascade, an aggressively autonomous agent designed to plan, write code, run tests, and iterate inside the editor with minimal supervision.
Pricing: $15/month Pro (500 prompts + 1,500 flow action credits). Ultimate $60/month for unlimited model use. Free tier available.
Why it wins: Cascade is the most aggressive in-editor agent in 2026 — it'll execute long multi-step tasks without asking for confirmation at every step. For greenfield builds and batch refactors, Cascade can complete work that Cursor's Composer would step through more slowly.
Where it falls short: Smaller community than Cursor, slightly less stable on edge cases, and the aggressive autonomy creates the same sprawling-diff problem as Codex.
Read more: Cursor vs Windsurf · Windsurf vs Claude Code
5. Lovable: Best Browser Agent for Non-Developers Shipping Real Products
What it is: A prompt-to-app builder. Type a description of what you want, get a deployable React + Tailwind + Supabase app with hosting included.
Pricing: $25/month Pro (500 messages/day, custom domains, GitHub sync). Free tier 5 messages/day. Teams $50/month.
Why it wins: Lovable produces the most polished default output of any browser-based agent. The opinionated stack (React + Tailwind + Supabase) means apps look professional out of the box, and hosting + custom domain SSL are handled for you — enormous wins for non-developers.
Where it falls short: The stack is opinionated and hard to change. Iteration speed is slower than Bolt. Heavy customization requires moving to a real IDE.
Read more: Bolt vs Lovable · Lovable vs Cursor
6. Bolt.new: Best Browser Agent for Rapid Prototyping
What it is: A prompt-to-app builder from StackBlitz, optimized for speed-to-first-result and stack flexibility (Vue, Svelte, Next.js, plain HTML, whatever you ask for).
Pricing: $20/month Pro (10M tokens/month). Free tier 1M tokens/day. Pro 50 $50/month.
Why it wins: Bolt is faster than Lovable to first working app — typically 7 minutes vs 12 in side-by-side tests on the same prompt. Flexible stack choice means you can specify Vue, Svelte, or plain HTML when Lovable's React-only default doesn't fit. Best for throwaway demos and rapid iteration.
Where it falls short: No hosting included (you deploy elsewhere). Generated apps look less polished than Lovable defaults. The flexibility creates inconsistency across projects.
Read more: Bolt vs Lovable
7. Replit Agent: Best Browser Agent for Cloud Linux Workflows
What it is: Replit's autonomous agent built on top of their browser-based cloud IDE. Includes full Linux containers, so you can run servers, databases, and CLI tools without leaving the browser.
Pricing: Core $20/month (private projects, more compute, agent access). Free tier with limits. Teams $40/month.
Why it wins: Unique among browser-based agents — you get a real Linux environment with persistent storage. Best fit for education, collaborative coding (share a Repl link, edit together), and developers who want browser-based access to a real shell.
Where it falls short: Free tier compute is limited. Agent quality lags Claude Code / Codex on terminal-heavy work. Slower iteration than Bolt for simple prompt-to-app workflows.
Read more: Replit vs Cursor
Combinations That Work in 2026
Most professional developers in 2026 don't pick one; they combine 2–3 tools based on workflow. Common patterns:
• Frontend pro: Cursor + Claude Code. Cursor for daily React/Vue work, Claude Code for backend refactors and library upgrades.
• Backend pro: Claude Code + Codex + Cursor. Claude Code for focused work, Codex for bulk refactors, Cursor occasionally for frontend tasks.
• Founder / generalist: Cursor + Claude Code + Lovable. Cursor for real work, Lovable for spinning up demo apps for stakeholders.
• Non-developer founder: Lovable + occasional Bolt. Lovable for the product you ship, Bolt for quick experiments.
• ChatGPT Pro subscriber: Codex + Cursor. Use the bundled Codex credits, pay $20 for Cursor on the side.
What AI Coding Agents Won't Do
Three things every AI coding agent gets wrong in 2026:
1. Authentication and payment flows. Every agent produces code that compiles and runs but has subtle security issues — exposed API keys, missing webhook signature verification, weak auth scoping. Always review payment and auth code line-by-line.
2. Multi-tenant scoping. Agents don't reliably enforce tenant isolation in database queries. SaaS apps especially need careful human review of any agent-generated query that touches user data.
3. Production-readiness judgment. Agents will happily ship code that works on a happy path. They don't think about error handling, retries, rate limits, or graceful degradation unless you specifically ask.
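As an example of the first item, this is roughly what webhook signature verification looks like when it is present. It is a generic HMAC-SHA256 sketch using only Python's standard library; the function names and the signing scheme are assumptions for illustration. Real payment providers define their own header formats and often include timestamps, so use their official library in production.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, payload: bytes) -> str:
    """Hex HMAC-SHA256 signature, as a provider might attach to a webhook."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def is_valid_webhook(secret: bytes, payload: bytes, received_signature: str) -> bool:
    """Recompute the signature over the raw body and compare it safely."""
    expected = sign_payload(secret, payload)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected, received_signature)


# Hypothetical secret and payload for demonstration only.
secret = b"example-webhook-secret"
body = b'{"event": "payment.succeeded", "amount": 100}'
signature = sign_payload(secret, body)

print(is_valid_webhook(secret, body, signature))
print(is_valid_webhook(secret, b'{"event": "payment.succeeded", "amount": 999}', signature))
```

The first check passes and the second fails because the body was altered after signing. A handler that skips this comparison, or compares with `==`, is the kind of gap to look for in agent-generated code.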
If you're vibe-coding an MVP, get a professional review before launching. (More on where AI coding tools break in production.)
Working with a Fractional CTO
I help founders pick the right AI coding tool stack for their team and review what AI agents have produced before it ships to customers. If you're vibe-coding an MVP and worried about what happens at scale, or you've already shipped something with one of these tools and want a professional review, book a strategy call. The first call is free.
### Frequently Asked Questions
**Q: What is the best AI coding agent in 2026?**
A: There's no single 'best' AI coding agent — the right pick depends on what you're doing. For most professional developers in 2026, the best all-purpose pick is Cursor ($20/mo) for editor-bound work paired with Claude Code (~$5-$300/mo) for terminal-heavy tasks. For non-developers building apps from prompts, Lovable ($25/mo) produces the most polished output. For aggressive autonomous task completion in a terminal, OpenAI Codex (~$5-$250/mo) is the most capable. For browser-based collaboration and instant prototypes, Replit Agent or Bolt are the right tools.
**Q: What are AI coding agents?**
A: AI coding agents are tools that use large language models to write, edit, debug, and refactor code with minimal human intervention. They go beyond auto-complete and chat — they plan multi-step tasks, execute commands, read files, run tests, fix errors, and iterate autonomously. As of 2026 the dominant agents are terminal-based (Claude Code, OpenAI Codex), IDE-based (Cursor, Windsurf), and browser-based (Bolt, Lovable, Replit Agent). All are built on top of foundation models from Anthropic, OpenAI, or Google.
**Q: What's the difference between an AI coding agent and AI auto-complete?**
A: Auto-complete (like GitHub Copilot's tab completion) predicts the next few characters or lines you'd type. An AI coding agent operates at a higher level — you describe a goal, and the agent plans steps, executes commands, edits multiple files, runs tests, and iterates on errors. Auto-complete shortens individual keystrokes; an agent replaces entire workflows.
**Q: How much do AI coding agents cost in 2026?**
A: Pricing ranges from free tiers to $200+/month for heavy users. Editor-based agents are typically flat-rate: Cursor Pro $20/mo, Windsurf Pro $15/mo. Terminal agents are usage-based: Claude Code via Anthropic API runs $5-$300/mo depending on use; OpenAI Codex via OpenAI API runs $5-$250/mo. Browser-based agents are subscription: Bolt Pro $20/mo, Lovable Pro $25/mo, Replit Core $20/mo. Most professional developers spend $40-$100/month total across multiple tools.
**Q: Can AI coding agents replace developers?**
A: Not in 2026. AI coding agents replace certain tasks within software development — boilerplate generation, first-draft implementations, mechanical refactors, test backfills, documentation. They don't replace the judgment work — system design, security review, debugging novel problems, choosing the right abstraction, maintaining production systems. The professionals winning with AI coding agents in 2026 are the ones who treat them as accelerators for execution, not substitutes for thinking.
**Q: Which AI coding agent is best for beginners?**
A: For absolute beginners who can't read code yet — Lovable or Bolt. They build apps from natural-language prompts and produce deployable output. For beginners who can read code — Cursor. The IDE looks like VS Code (familiar), the agent is conservative (doesn't run destructive commands without confirmation), and visual feedback makes it easy to learn what the agent is doing. Avoid terminal-based agents (Claude Code, Codex) as a first tool — too easy to get into trouble before you can recognize what 'good code' looks like.
**Q: What's the difference between Claude Code and Cursor?**
A: Claude Code is a terminal CLI from Anthropic — runs in your shell, autonomous, no editor UI. Cursor is an AI-first IDE from a separate company (Cursor/Anysphere) — runs as a VS Code fork with in-editor chat, tab completions, and the Composer agent. Cursor can use Claude models too (you pick the model in settings). The real difference is workflow: Claude Code for terminal-bound tasks (backend, refactors, CLI work), Cursor for editor-bound tasks (frontend, pair-programming, scope-sensitive edits). Most pros use both.
**Q: Should I use multiple AI coding agents at the same time?**
A: Yes — and most professionals do. The agents target different workflows and don't conflict. A typical 2026 setup: Cursor or Windsurf as the daily-driver IDE, Claude Code or OpenAI Codex for terminal-heavy work (refactors, migrations, CLI), and Lovable or Bolt occasionally for spinning up demos. Combined monthly cost typically runs $40-$120 depending on usage — less than one hour of senior developer time.
---
## Cursor vs Codex (2026): IDE Agent vs Terminal Agent
- **URL:** https://justinmckelvey.com/blog/cursor-vs-codex
- **Published:** May 21, 2026
- **Updated:** May 21, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 6 min
- **Description:** Cursor (AI IDE) vs OpenAI Codex (terminal agent) compared. Pricing, models, workflow fit, code quality, and when each wins — from a fractional CTO.
Quick Answer
Cursor is an AI-first IDE. OpenAI Codex is a terminal-native autonomous agent. They're complementary, not competitive. Cursor ($20/mo) is a VS Code fork best for editor-bound work — frontend, pair-programming, scope-sensitive edits. Codex (free CLI, billed via OpenAI API at $5–$250/mo) is best for terminal-bound work — bulk refactors, schema migrations, CLI tasks. Most professional developers in 2026 use both.
Tested May 2026 · Production work shipped in both · Author: Justin McKelvey, fractional CTO, 50+ products shipped
TL;DR: Cursor vs Codex in 2026
Cursor is an AI-first IDE; Codex is a terminal agent. Different chairs to sit in for different kinds of work. Cursor lives inside a VS Code fork with in-editor chat, tab completions, and a Composer agent that handles multi-file edits inside the editor. OpenAI Codex lives in your terminal: it runs in your shell, plans and executes multi-step coding tasks, and iterates on errors without an editor UI.
As of May 2026, Cursor Pro is $20/month flat, Codex is billed against OpenAI API usage (typically $5–$250/month depending on use, or included in ChatGPT subscriptions). Both produce comparable code quality when given the same prompt with comparable models. The real question isn't "which is better" — it's "which workflow are you in right now."
I'm a fractional CTO who ships code daily with both. This is the honest comparison after using each in real production work.
Cursor vs Codex at a glance

| Feature | Cursor | OpenAI Codex |
|---|---|---|
| Interface | Full IDE (VS Code fork) | Terminal CLI |
| Pricing (May 2026) | $20/month flat (Pro); annual saves ~$48 | $5–$250/mo via OpenAI API, or included in ChatGPT Plus ($20) / Pro ($200) |
| Agent | Composer (turn-based, visual diffs) | Codex (autonomous, terminal-native) |
| Model lineup | Claude, GPT, Gemini (you pick) | GPT-5, o-series, GPT-4.5 (OpenAI only) |
| Visual feedback | Built-in IDE panel + diff view | Terminal output + git diff after the fact |
| Works over SSH | No (IDE on local machine only) | Yes, runs anywhere a shell does |
| Best for | Frontend, scope-sensitive edits, pair-programming feel | Backend, large refactors, CLI tasks, remote work |
| Background-friendly | Less so; designed for interactive use | Yes, runs well in tmux / nohup |
| Learning curve | Gentle (looks like VS Code) | Moderate (terminal agent paradigm) |

Use together? Yes, most pros do. Cursor for editor work, Codex for terminal-heavy tasks.

What Each Tool Is (In One Sentence)
Cursor is a VS Code fork with in-editor AI chat, predictive tab completions, and a Composer agent that handles multi-file edits when you ask for them.
OpenAI Codex is an autonomous coding agent that runs in your terminal — reading files, executing commands, running tests, and iterating on errors without any editor UI.
Different chairs. Different work.
Pricing Compared (May 2026)
Cursor Pro: $20/month flat. Includes 500 fast premium requests (Claude 4.7 Sonnet, GPT-5, etc.), unlimited slow requests, and the full tab completion model. Annual is $192. Business tier is $40/seat. Predictable.
OpenAI Codex: Free to install. Usage is billed against your OpenAI API key: typical light user $5–$15/month, moderate user $30–$60/month, heavy user $80–$250/month. Alternatively, included in ChatGPT Plus ($20/mo) or ChatGPT Pro ($200/mo) subscriptions with bundled credits. Pay-per-use.
For most working developers, the monthly bills land in similar territory. The bigger differences are predictability (Cursor) vs flexibility (Codex), and which underlying provider you'd rather pay (Cursor lets you pick models; Codex is OpenAI-only).
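The flat-versus-usage trade-off can be sanity-checked with a little arithmetic. The sketch below uses only the figures quoted in this section. The function and variable names are illustrative, and the usage bands are this article's rough estimates rather than published rates.

```python
# Figures quoted in this article (May 2026); all names are illustrative.
CURSOR_PRO_MONTHLY = 20   # USD per month, billed monthly
CURSOR_PRO_ANNUAL = 192   # USD per year, billed annually

# Rough Codex API spend bands from the article, in USD per month (low, high).
CODEX_API_BANDS = {
    "light": (5, 15),
    "moderate": (30, 60),
    "heavy": (80, 250),
}


def cursor_annual_savings() -> int:
    """Savings from paying Cursor Pro annually instead of monthly."""
    return CURSOR_PRO_MONTHLY * 12 - CURSOR_PRO_ANNUAL


def codex_vs_cursor(band: str) -> str:
    """Compare a Codex usage band against Cursor Pro's flat monthly price."""
    low, high = CODEX_API_BANDS[band]
    if high < CURSOR_PRO_MONTHLY:
        return "Codex cheaper"
    if low > CURSOR_PRO_MONTHLY:
        return "Cursor cheaper"
    return "overlaps"


if __name__ == "__main__":
    print(f"Annual Cursor Pro saves ${cursor_annual_savings()}")
    for band in CODEX_API_BANDS:
        print(band, "->", codex_vs_cursor(band))
```

On these estimates only the light band undercuts Cursor's flat fee on pure API billing, which is why the subscription bundles matter for heavier users.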
Agent Mode: Composer vs Codex
Both have autonomous agents. They behave very differently.
Cursor Composer is a turn-based collaborator in your editor. You give it a goal ("refactor auth to use Devise"), it plans steps, proposes file changes in a visual diff panel, asks for confirmation on destructive operations, runs commands when you allow it, and pauses on errors. You stay in the loop, watching the diff appear.
Codex is more autonomous and lives in your terminal. Give it the same goal and Codex will plan, write code, run tests, install dependencies, fix failing tests, and iterate until the task is done — printing what it's doing inline. The default is "act, don't ask." You can configure confirmations, but the tool is clearly designed for hands-off agentic work.
Which is better depends on what you're doing. Cursor's careful pace makes Composer safer for security-sensitive code (auth, payments, multi-tenant scoping). Codex's autonomy makes it faster for mechanical bulk refactors that would be tedious to step through one diff at a time.
Code Quality
Both produce comparable code quality when using equivalent underlying models. Cursor lets you pick from Claude, GPT, or Gemini (each via the provider's API). Codex runs on OpenAI's lineup (GPT-5 default, o-series for harder reasoning). The differences I noticed in real production work:
Cursor is better at frontend iteration. The visual IDE matters when you're working on UI — seeing the React component update as the agent edits it shortens the feedback loop dramatically. Codex can build a frontend, but you're flipping between terminal and browser to see results.
Codex is better at backend bulk work. Database migrations, API changes, library upgrades, test backfills — the terminal matches how backend work is naturally done. Cursor can handle these but the IDE context-switching feels heavier.
Both fail in the same places. Authentication edge cases, payment webhook signatures, multi-tenant scoping, complex database migrations — these need human review regardless of which tool you use. (More on where AI coding tools break in production.)
When Cursor Wins
• Frontend work. Visual editor + agent = faster UI iteration than terminal-only.
• Security-sensitive code. Payments, auth, anything HIPAA/SOC2. You want the human-in-the-loop pace.
• Pair-programming feel. If you like watching the agent work and intervening in real time.
• Teams already on VS Code. Zero migration cost, all your extensions work.
• Onboarding to unfamiliar codebases. Cursor's "Ask" mode is the best way to explore a new codebase quickly.
• Multi-model flexibility. When you want Claude for some tasks, GPT for others, Gemini occasionally.
When Codex Wins
• Backend work. Database migrations, API changes, test backfills — terminal-native is faster.
• Large refactors. 20+ file changes where the logic is mechanical and review is batch.
• CLI-heavy workflows. npm/bundle/pip workflows feel native; in an IDE they always feel like context switches.
• Background work. Kick off Codex in tmux, do a meeting, come back to a finished feature.
• Remote machines. Codex runs over SSH on any server. Cursor needs to be local.
• ChatGPT Pro subscribers. If you're already paying $200/mo for ChatGPT Pro, the included Codex credits make it free at the margin.
What About Claude Code and Windsurf?
If you liked this comparison, you're probably also weighing:
• Claude Code vs Codex — both terminal agents, different AI labs. Codex is more aggressive, Claude Code is more focused.
• Cursor vs Windsurf — both IDEs, different agent philosophies. Cursor is conservative, Windsurf is autonomous.
• Claude Code vs Cursor — terminal agent vs IDE, similar to this comparison but with Anthropic instead of OpenAI.
• Windsurf vs Claude Code — IDE agent vs terminal agent, autonomous philosophies on both sides.
What I Actually Recommend
If you're a working developer doing varied work and can afford both: install both. They serve different surfaces. Use Cursor for editor-bound work. Reach for Codex when you have terminal-heavy tasks (refactors, migrations, remote work).
If you can only pick one and your work is mostly editor-bound: Cursor. Better tab completion, better diff review, better onboarding to unfamiliar codebases.
If you can only pick one and your work is mostly terminal-heavy: Codex. Better autonomy, better for unsupervised long-running tasks, works over SSH.
If you're already paying for ChatGPT Pro: start with Codex. The subscription credits make it free at the margin.
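These rules of thumb can be written down as a tiny decision function. This is a sketch only: the function name and parameters are hypothetical, and the order in which the rules are checked (budget first, then an existing ChatGPT Pro subscription, then workflow) is an assumption, since the recommendations above are not ranked.

```python
def recommend(can_afford_both: bool, has_chatgpt_pro: bool, mostly_terminal: bool) -> str:
    """Return "both", "Cursor", or "Codex" following the rules of thumb above."""
    if can_afford_both:
        return "both"
    if has_chatgpt_pro:
        # Included subscription credits make Codex free at the margin.
        return "Codex"
    # Otherwise pick by where most of the work happens.
    return "Codex" if mostly_terminal else "Cursor"


if __name__ == "__main__":
    # An editor-bound developer on a single-tool budget.
    print(recommend(can_afford_both=False, has_chatgpt_pro=False, mostly_terminal=False))
```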
Working with a Fractional CTO
I help founders pick the right AI coding tools for their stack and team. If you're vibe-coding an MVP and worried about what happens at scale, or you've shipped something and want a professional review before launch, book a strategy call. The first call is free.
### Frequently Asked Questions
**Q: Is Cursor or Codex better in 2026?**
A: Neither is universally better — they target different workflows. Cursor is an AI-first IDE (VS Code fork) with in-editor chat, tab completions, and a Composer agent — best for editor-bound work, frontend development, and pair-programming workflows. OpenAI Codex is a terminal-native autonomous agent — best for backend work, large refactors, CLI-heavy tasks, and remote server work. Most professional developers in 2026 use both: Cursor in their editor, Codex from their terminal.
**Q: What is the difference between Cursor and Codex?**
A: Cursor is a full AI-first IDE: a VS Code fork built around chat, tab completions, and a Composer agent that handles multi-file edits inside the editor. OpenAI Codex is a terminal CLI: it runs in your shell, executes commands, reads files, fixes errors, and iterates autonomously without any editor UI. Cursor amplifies you as a developer working in an editor. Codex replaces parts of you when you're running terminal-heavy tasks.
**Q: How much does Cursor cost vs Codex?**
A: Cursor Pro is $20/month flat, or $192/year on the annual plan. It includes 500 fast premium requests, unlimited slower requests, and full tab completion. Codex is billed against your OpenAI API key (typically $5–$15/month for light users, $30–$60/month for moderate users, $80–$250/month for heavy users) or included in ChatGPT Plus ($20/mo) and ChatGPT Pro ($200/mo) subscriptions. For most developers, monthly cost is in the same range; Cursor's predictability vs Codex's pay-per-use is the bigger difference.
**Q: Is Codex better at autonomous coding than Cursor?**
A: Codex is more aggressive about autonomous task completion — it will execute commands, install packages, edit adjacent files, and iterate without confirmation by default. Cursor's Composer agent is more turn-based — it proposes changes, waits for confirmation, and surfaces decisions for review. If you want a fire-and-forget agent for unsupervised work, Codex wins. If you want pair-programming with the agent under your supervision, Cursor wins.
**Q: Can I use Cursor and Codex together?**
A: Yes — they don't conflict. Many developers in 2026 use Cursor as their primary editor and reach for Codex for terminal-bound work: bulk refactors, library upgrades, schema migrations, test backfills, anything CLI-heavy. The cost overlap is small relative to the productivity gain.
**Q: Which is better for non-developers, Cursor or Codex?**
A: Neither is ideal for non-developers — both assume you can read code, write prompts that describe technical work, and review diffs. Cursor is more accessible because the IDE provides visual feedback as the agent works. Codex's pure-terminal interface is harder for non-coders. If you're a non-developer wanting to build apps from prompts, look at Lovable or Bolt instead — those tools assume you can't read code.
**Q: Does Codex replace Cursor?**
A: No — they target different surfaces. Codex doesn't have an IDE; it runs in your terminal. Cursor doesn't have great terminal automation; it runs in your editor. For frontend work where you need to see UI updates as the agent edits the code, Cursor wins. For backend refactors and CLI workflows where you want autonomous task completion in the background, Codex wins. They're complementary, not competitive.
**Q: Which is faster, Cursor or Codex?**
A: Cursor feels faster for inline edits and quick completions — its tab model is highly tuned for the editor experience. Codex feels faster for multi-step autonomous tasks — it does more per prompt without asking for confirmation. For 'how fast can I ship this end-to-end,' the answer depends on what 'this' is: small editor-bound changes favor Cursor; large mechanical refactors favor Codex.
---
## Claude Code vs Codex (2026): Which Terminal AI Agent Wins?
- **URL:** https://justinmckelvey.com/blog/claude-code-vs-codex
- **Published:** May 21, 2026
- **Updated:** May 21, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 7 min
- **Description:** Claude Code vs OpenAI Codex compared by a fractional CTO. Pricing, autonomy, model quality, focus, and when each terminal agent wins. May 2026.
Quick Answer
Claude Code and OpenAI Codex are both terminal-native autonomous coding agents: same shape, different defaults. Claude Code (Anthropic) wins on focus discipline and long-context reasoning. Codex (OpenAI) wins on aggressive task completion and ChatGPT-ecosystem integration. Pricing is comparable: roughly $5–$300/month on either, depending on usage. Most professional developers in 2026 keep both installed and switch based on the task.
Tested May 2026 · Production work shipped in both · Author: Justin McKelvey, fractional CTO, 50+ products shipped
TL;DR: Claude Code vs Codex in 2026
Claude Code is Anthropic's terminal CLI built around Claude 4.7 Sonnet. Codex is OpenAI's terminal CLI built around GPT-5 and the o-series models. Both launched in 2025 within months of each other. Both run in your shell, read files, execute commands, run tests, fix errors, and iterate on multi-step tasks without an IDE. As of May 2026, the prices are roughly comparable ($5–$300/month depending on usage), the model quality is roughly comparable on most tasks, and the actual difference comes down to how each one behaves when you're not watching.
I'm a fractional CTO who ships code daily with both. I've used Claude Code to refactor 15,000-line Rails apps and used Codex to build React frontends and run multi-file migrations. This is the honest comparison — what each does well, where each falls short, and how most professionals end up using both.
Claude Code vs Codex at a glance
| Feature | Claude Code | OpenAI Codex |
| --- | --- | --- |
| Maker | Anthropic | OpenAI |
| Interface | Terminal CLI | Terminal CLI |
| Default model | Claude 4.7 Sonnet (Opus 4.7 available) | GPT-5 (o-series, GPT-4.5 available) |
| Pricing | Via Anthropic API ($5–$300/mo typical) or Claude Pro ($20/mo) / Max ($100/mo) subscriptions | Via OpenAI API ($5–$250/mo typical) or ChatGPT Plus ($20/mo) / Pro ($200/mo) subscriptions |
| Open source | Closed source CLI | Open source CLI on GitHub |
| Focus discipline | Strong; stays in scope | Looser; edits adjacent files unprompted |
| Autonomy default | Asks permission before destructive ops | More aggressive about completing the task |
| Best for | Focused tasks, long-context refactors, judgment calls | Mechanical multi-step tasks, ChatGPT ecosystem integration |
| Works over SSH | Yes | Yes |
| Sandboxed execution | Yes | Yes |
| Use them together? | Yes, most pros keep both installed. Claude Code for scope-sensitive work. | Yes. Codex for autonomous task completion. |
What Each Tool Is (In One Sentence)
Claude Code is Anthropic's terminal-native autonomous coding agent. It runs in your shell, plans multi-step tasks, executes commands, and iterates on errors, with Claude 4.7 Sonnet as the default model.
OpenAI Codex is OpenAI's terminal-native autonomous coding agent. Same shape: runs in your shell, plans multi-step tasks, executes commands, iterates on errors — with GPT-5 as the default model and access to the o-series for harder reasoning tasks.
Same chair to sit in. Different driver.
Pricing Compared (May 2026)
Claude Code is free to install. Usage is billed against your Anthropic API key or included in Anthropic's Claude Pro ($20/mo) and Max ($100/mo) subscriptions. Typical professional usage: light user (1 hour/day) $5–$15/month, moderate user (3–4 hours/day) $30–$80/month, heavy user (full-time agentic work) $100–$300/month.
OpenAI Codex is free to install. Usage is billed against your OpenAI API key OR included in ChatGPT Plus ($20/mo) and ChatGPT Pro ($200/mo) subscriptions. Typical professional usage: light user $5–$15/month, moderate user $30–$60/month, heavy user $80–$250/month.
For most professional developers, the costs are within ~20% of each other. The deciding factor isn't the sticker price; it's whether you're already paying for ChatGPT Pro or Claude Max, in which case the included credits make one cheaper for you specifically.
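One way to check the "within ~20%" claim is to compare the midpoints of the usage bands quoted above. This is a rough sketch under two assumptions of mine: that the midpoint of each band is a fair stand-in for typical spend, and that the gap is measured against the larger of the two figures. The names are illustrative.

```python
# Usage bands quoted above, in USD per month (low, high).
CLAUDE_CODE = {"light": (5, 15), "moderate": (30, 80), "heavy": (100, 300)}
CODEX = {"light": (5, 15), "moderate": (30, 60), "heavy": (80, 250)}


def midpoint(band: tuple[int, int]) -> float:
    """Midpoint of a (low, high) spend band."""
    low, high = band
    return (low + high) / 2


def relative_gap(a: float, b: float) -> float:
    """Gap between two costs as a fraction of the larger one."""
    return abs(a - b) / max(a, b)


if __name__ == "__main__":
    for tier in CLAUDE_CODE:
        gap = relative_gap(midpoint(CLAUDE_CODE[tier]), midpoint(CODEX[tier]))
        print(f"{tier}: gap of {gap:.1%}")
```

On these midpoints every tier stays under the 20% mark, with the light tier identical and the moderate and heavy tiers slightly cheaper on Codex.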
Agent Behavior: Claude Code vs Codex
Both are autonomous agents. Both will plan, code, execute commands, fix errors, and iterate. The behavioral differences are subtle but consistent.
Claude Code is conservative by default. It stays focused on the file or directory you asked it to work on. It asks for permission before destructive operations (configurable). It surfaces intermediate decisions inline so you can intercept. The agent assumes you'll review the diff carefully when it's done.
Codex is more aggressive. It will install npm/pip packages without asking, edit adjacent files it thinks need updating, and make architectural assumptions to push the task to completion. That's great when the assumptions are correct — the task finishes faster. It's frustrating when the assumptions are wrong — you get a sprawling diff to untangle.
In practice: Claude Code feels like an intern who finishes exactly what you asked. Codex feels like an intern who finishes what you asked plus three things they thought you'd want.
Code Quality
Both produce comparable code quality on most tasks. Claude 4.7 Sonnet has a slight edge on long-context refactors (50K+ tokens of file context); GPT-5 has a slight edge on certain algorithmic and math-heavy tasks. The differences I noticed in real production work:
Claude Code is better at staying in scope. If you say "fix the auth bug in user_session.rb," Claude Code fixes that bug in that file. Codex might also "helpfully" update three other files that reference user sessions — sometimes correctly, sometimes not.
Codex is better at mechanical task completion. If you say "rename this function across the entire codebase and update all callers," Codex finishes in one pass. Claude Code might pause to confirm scope on the first file or ask whether you want to update tests too.
Both fail in the same places. Authentication edge cases, payment webhook signatures, multi-tenant scoping, complex database migrations — these need human review regardless of which agent you use. (More on where AI coding agents break in production.)
When Claude Code Wins
• Scope-sensitive work. When you only want changes to specific files and a sprawling diff would be a problem.
• Long-context refactors. Claude 4.7 Sonnet handles 200K+ token contexts more reliably than GPT-5.
• Judgment-call refactors. When the task requires understanding the broader codebase pattern and applying it consistently.
• Pair-programming feel. If you want to watch the agent work and intervene in real time.
• Teams already on Claude Pro/Max. Subscription credits make it cheaper for you specifically.
When Codex Wins
• Mechanical multi-step tasks. Rename across N files, schema migrations, test backfills — Codex finishes faster.
• Unsupervised long-running work. Kick off a task, walk away, review the diff later.
• OpenAI ecosystem integration. If your team is already deep in ChatGPT Pro / OpenAI API.
• Math and algorithm-heavy tasks. GPT-5 and the o-series have a slight edge on hard reasoning.
• Greenfield exploration. When you want the agent to make architectural decisions you'll review later.
How Professionals Actually Use Both
Most senior developers I work with in 2026 install both and switch based on task type. Common pattern:
1. Scope-sensitive fixes → Claude Code. Bug fixes in known files, security-sensitive code, code review responses.
2. Bulk refactors → Codex. Rename across 47 files, library upgrades, test backfills, mechanical migrations.
3. Long-context analysis → Claude Code. Reading a large codebase, summarizing it, identifying patterns.
4. Greenfield builds → Either, slight preference for Codex if you want it to make autonomous decisions.
5. Production-sensitive code → Claude Code. Anywhere a sprawling diff would be expensive to review.
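The pattern above amounts to a small lookup from task type to agent. Here is a minimal sketch, assuming hypothetical task labels that come from neither tool. The fallback to Claude Code for unlisted tasks is also an assumption, borrowed from this article's advice for developers doing varied work.

```python
# Routing table distilled from the list above; the task labels are illustrative.
ROUTING = {
    "scope_sensitive_fix": "Claude Code",
    "bulk_refactor": "Codex",
    "long_context_analysis": "Claude Code",
    "greenfield_build": "Codex",  # "either" in the list, with a slight lean to Codex
    "production_sensitive": "Claude Code",
}


def pick_agent(task_type: str) -> str:
    """Return the suggested agent, defaulting to the more conservative one."""
    return ROUTING.get(task_type, "Claude Code")


if __name__ == "__main__":
    for task in ("bulk_refactor", "production_sensitive", "something_else"):
        print(task, "->", pick_agent(task))
```

A plain dictionary keeps the routing easy to adjust as a team's own experience with each agent accumulates.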
What About Cursor and Windsurf?
Both Claude Code and Codex are terminal agents. Cursor and Windsurf are IDE agents, a different chair entirely. The IDE agents win on visual feedback (seeing the React component update as the agent edits it). The terminal agents win on background work, large-scope tasks, and remote-server workflows.
Most professionals in 2026 use a combination: Cursor or Windsurf for editor-bound work, and Claude Code or Codex for terminal-heavy tasks (refactors, backend work, CLI workflows).
Switching Cost
Migrating between Claude Code and Codex is mostly painless. Both are CLIs you install via npm or Homebrew. Both accept similar prompts. The only meaningful switching cost is muscle memory: each tool has slightly different commands for things like sandboxing, model selection, and conversation history. Plan a day to adjust if you're a heavy user.
The lock-in isn't the tool — it's the model preference. If you're used to Claude 4.7 Sonnet's writing style and reasoning patterns, Codex's GPT-5 output will feel different (more verbose, more eager to volunteer alternatives). Vice versa for Codex-natives trying Claude Code.
What I Actually Recommend
If you're a working developer and you can afford both subscriptions: install both. They're $40–$120/month combined depending on your tier, which is less than one hour of a senior developer's time. Use Claude Code for scope-sensitive work. Use Codex when you want maximum autonomy.
If you can only afford one and you do varied work: Claude Code. The focus discipline pays off across more task types and the long-context reasoning is meaningfully better for code review and refactor work.
If you can only afford one and you do mostly mechanical bulk refactors: Codex. The aggressive autonomy is a real productivity win on tasks where you're going to batch-review the diff at the end anyway.
If you're already paying for ChatGPT Pro: start with Codex. The subscription credits make it free at the margin.
If you're already paying for Claude Max: start with Claude Code, by the same logic.
Working with a Fractional CTO
I help founders pick the right AI coding tools for their stack and team. If you're vibe-coding an MVP and worried about what happens at scale, or you've already shipped something with one of these agents and want a professional review before you launch, book a strategy call. The first call is free.
### Frequently Asked Questions
**Q: Is Claude Code or Codex better in 2026?**
A: For most professional developers, Claude Code is the safer pick in 2026 — it has stronger focus discipline (less tendency to edit adjacent files you didn't mention) and Claude 4.7 Sonnet remains the highest-quality coding model for long-context refactors. OpenAI Codex is a stronger choice if you're already deep in the OpenAI ecosystem, want broader model access (GPT-5, GPT-4.5, o-series), or need first-class integration with ChatGPT subscriptions. Pricing is roughly comparable: Claude Code via Anthropic API runs $5-$300/month depending on usage; Codex via OpenAI API runs $5-$250/month.
**Q: What is the difference between Claude Code and Codex?**
A: Both are terminal-native autonomous coding agents released in 2025 by competing AI labs. Claude Code is Anthropic's CLI using Claude 4.7 Sonnet (and Opus 4.7); Codex is OpenAI's CLI using GPT-5 and the o-series models. Functionally similar: both read files, run commands, execute tests, fix errors, and iterate without an IDE. The behavioral differences are subtle — Claude Code is generally more conservative about scope (stays focused on what you asked); Codex tends to be more aggressive about completing the broader task, sometimes editing files you didn't mention. Pricing and pricing models differ — Claude Code uses Anthropic's pricing (or subscription credits); Codex uses OpenAI's pricing.
**Q: How much does Claude Code cost compared to Codex?**
A: Both are billed via API usage from their respective providers, with subscription options available. Claude Code: typical light use $5-$15/month via Anthropic API; moderate use $30-$80/month; heavy use $100-$300/month. Anthropic also offers Pro ($20/mo) and Max ($100/mo) subscriptions that include Claude Code credits. Codex: typical light use $5-$15/month via OpenAI API; moderate use $30-$60/month; heavy use $80-$250/month. OpenAI's ChatGPT Plus ($20/mo) and Pro ($200/mo) subscriptions also include Codex usage. For most professional developers, the actual cost difference is small — pick based on model preference, not price.
**Q: Is Codex better than Claude Code at autonomous coding?**
A: Codex is more aggressive about autonomous task completion — it will install packages, edit adjacent files, and make architectural assumptions to push a task across the finish line. That's great when the assumptions are correct, frustrating when they aren't. Claude Code is more conservative: it stays focused on exactly what you asked, surfaces decisions for review, and is less likely to introduce dependencies you didn't request. For unsupervised long-running tasks, Codex often finishes faster. For tasks where you want to review the diff before it explodes, Claude Code is the safer choice.
**Q: Can I use Claude Code and Codex at the same time?**
A: Yes — they're separate CLIs with separate API keys, no conflicts. Many professional developers in 2026 keep both installed: Claude Code for focused, scoped tasks where the diff needs to stay small, and Codex when they want maximum autonomy on a multi-step task they'll review at the end. Cost overlap is small relative to productivity gain.
**Q: Which is better for large refactors, Claude Code or Codex?**
A: It depends on what 'large' means. For a refactor with hundreds of nearly-identical changes (renaming a function across 47 files, schema migrations, test backfills) — both work well, Codex slightly faster due to its more autonomous behavior. For a refactor that requires understanding context and making judgment calls across files — Claude Code wins because of Claude 4.7's stronger long-context reasoning and the agent's tighter focus discipline. Use Codex when you can describe the task mechanically; use Claude Code when judgment matters.
**Q: Is Claude Code or Codex better for beginners?**
A: Neither is ideal for absolute beginners — both assume you can read code and review diffs. If you have to choose, Claude Code is gentler: it asks for permission before destructive operations by default, surfaces fewer surprises, and stays in scope. Codex's more aggressive autonomy is harder to recover from when you don't know what 'good code' looks like. Beginners are usually better served by an IDE-based tool like Cursor or Windsurf, which provide visual feedback as the agent works.
**Q: Does Codex work without an OpenAI API key?**
A: No. Codex requires an OpenAI API key or an active ChatGPT Plus/Pro subscription. The CLI itself is free to install, but every prompt counts against your API usage or subscription credits. The same billing model applies to Claude Code: the CLI is free, but model inference is billed against your Anthropic API key or Claude Pro/Max subscription credits.
---
## Windsurf vs Claude Code: Which AI Coding Agent Wins in 2026?
- **URL:** https://justinmckelvey.com/blog/windsurf-vs-claude-code
- **Published:** May 11, 2026
- **Updated:** May 11, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 6 min
- **Description:** Windsurf vs Claude Code compared by a fractional CTO who ships with both. IDE vs terminal, pricing, agent behavior, code quality, and when each wins.
Quick Answer
Windsurf is an agent IDE; Claude Code is a terminal agent. They're complementary, not competitive. As of May 2026, Windsurf Pro is $15/month and Claude Code runs $5–$300/month on Anthropic API pricing, depending on usage. Windsurf is best for IDE-bound work: frontend, multi-file editing, visual navigation. Claude Code is best for terminal-bound work: backend, refactors, test backfills, anything CLI-heavy. Most professional developers use both.
Tested May 2026 · Both used for real production work · Author: Justin McKelvey, fractional CTO, 50+ products shipped
TL;DR: Windsurf vs Claude Code in 2026
Windsurf is an agent-first IDE. Claude Code is an agent without an IDE. Both are autonomous coding agents: they plan, write code, run commands, iterate on errors, and ship features with minimal hand-holding. The difference is the interface. Windsurf gives you the agent inside a full Codeium-built editor. Claude Code gives you the agent inside your terminal, full stop. As of May 2026, Windsurf Pro is $15/month and Claude Code costs $5–$300/month via the Anthropic API, depending on usage.
I'm a fractional CTO who ships code daily with both. I've used Claude Code to refactor 15,000-line Rails apps and used Windsurf to build React frontends end-to-end. This is the honest comparison — what each does well, where each falls short, and how most professionals end up using both.
Windsurf vs Claude Code at a glance
| Feature | Windsurf | Claude Code |
| --- | --- | --- |
| Interface | Full IDE (Codeium-built) | Terminal CLI |
| Pro pricing (May 2026) | $15/month (Pro) / $60/month (Ultimate) | $5–$300/month via Anthropic API; $20–$100/month Claude subscriptions |
| Agent | Cascade (visual, in-editor) | Claude (terminal-native, autonomous) |
| Model | Claude, GPT, Gemini (you pick) | Claude (4.7 Sonnet default in 2026) |
| Best for | Frontend, visual UI work, junior developers | Backend, large refactors, CLI tasks, remote servers |
| Works over SSH | No | Yes |
| Visual diff review | Built-in IDE panel | Terminal output + post-hoc git diff |
| Focus discipline | Sometimes edits adjacent files you didn't mention | Stays focused on what you asked |
| Background-friendly | Less so; designed for interactive use | Yes, works well in tmux sessions |
| Use them together? | Yes, most pros use both. Windsurf for editor-bound work. | Yes. Claude Code for terminal-bound work. |
What Each Tool Is (In One Sentence)
Windsurf is a Codeium-built IDE (a VS Code fork) with Cascade, an autonomous agent that plans and executes multi-step coding tasks inside the editor.
Claude Code is an Anthropic terminal application that runs an autonomous agent in your shell — reading files, running commands, executing tests, and iterating on errors without any editor UI.
Same kind of agent. Completely different chair to sit in.
Pricing Compared (May 2026)
Windsurf: Free tier (50 prompt credits, 200 flow action credits). Pro is $15/month (500 prompts + 1,500 flow actions). Ultimate is $60/month (unlimited model use within fair use). Annual saves ~17%.
Claude Code: Free to install (it's a CLI). Usage is billed against your Anthropic API key. Typical costs: light user (1 hour/day) $5–$15/month, moderate user (3–4 hours/day) $30–$80/month, heavy user (full-time agentic work) $100–$300/month. Anthropic also offers Claude subscriptions ($20/month Pro, $100/month Max) that include Claude Code usage credits.
For most professional developers, the costs are surprisingly close. Heavy Windsurf users hit Ultimate ($60/month). Heavy Claude Code users hit Anthropic Max ($100/month). Both are well under what an hour of a senior developer's time is worth.
Agent Behavior: Cascade vs Claude Code
Both are autonomous agents. Both will plan, code, run commands, fix errors, and iterate. The behavioral differences are subtle but real.
Cascade (Windsurf) is optimized for visual context. It surfaces decisions in the editor — diffs appear in a side panel, file changes are highlighted in the file tree, terminal output streams in a dedicated pane. The agent assumes you're watching and reviewing as it works.
Claude Code is optimized for autonomy. It runs in your terminal, prints what it's doing inline, asks for permission before risky commands (configurable), and otherwise just works. The agent assumes you're going to review the diff later, not watch it happen.
In practice: Cascade feels like a collaborator. Claude Code feels like an intern who emails you the work when it's done.
Code Quality
Both produce comparable code quality when using equivalent models. Claude Code runs on Claude (4.7 Sonnet by default in 2026). Windsurf can run on Claude, GPT, or Gemini; you pick.
The differences I noticed in real production work:
Claude Code is better at staying focused. It doesn't get distracted by adjacent files unless you ask. Cascade sometimes "helpfully" edits files you didn't mention, which is great when it's right and frustrating when it's wrong.
Cascade is better at frontend work. The visual IDE matters — seeing the React component update as the agent edits it shortens the feedback loop on UI work. Claude Code can build a frontend, but you're flipping between the terminal and a browser to see the result.
Both fail in the same places. Authentication edge cases, payment webhook signatures, multi-tenant scoping, complex database migrations — these need human review regardless of which agent you use. (More on where AI coding agents break in production.)
When Windsurf Wins
• Frontend work. Visual editor + live agent = faster UI iteration.
• Onboarding to a new codebase. The IDE's navigation features help you orient before you ask the agent to act.
• Teams already on VS Code. Migration cost is low; most extensions work.
• Pair programming feel. If you want to watch the agent work and intervene in real time.
• Junior developers. The visual feedback teaches you what the agent is doing.
When Claude Code Wins
• Backend work. Database migrations, API changes, test backfills — the terminal matches how backend work is done.
• Large refactors. No editor distraction; the agent just runs.
• CLI-heavy tasks. npm/bundle/pip workflows feel native; in an IDE they always feel like context switches.
• Background work. Kick off a Claude Code run in a tmux session, do a meeting, come back to a finished feature.
• Servers and remote machines. Claude Code runs over SSH; an IDE doesn't.
How Professionals Actually Use Both
Most senior developers I work with in 2026 use Windsurf (or Cursor) as their primary editor and reach for Claude Code for specific tasks:
1. Bulk refactors: "rename this function across 47 files and update all callers"
2. Test backfills: "write unit tests for every public method in this directory"
3. Library upgrades: "upgrade Rails 7.2 to 8.0 and fix any breaking changes"
4. Code reviews of vibe-coded apps: "audit this codebase for security issues"
5. Remote work: anything happening on a production server
Day-to-day editing stays in the IDE. The agent that lives outside the IDE handles the heavy mechanical work.
What About Cursor?
Cursor is the third option in this category: a VS Code fork with its own Composer agent. Compared to Windsurf, Cursor is more conservative and feels more like "AI-assisted coding" than "agentic coding." Compared to Claude Code, Cursor is an IDE, not a terminal tool, so the same trade-offs apply. Most developers I know use Cursor OR Windsurf, plus Claude Code. Few use all three.
Switching Cost
Going between Windsurf and Claude Code costs zero. They're different surfaces, so you can install both today and use both tomorrow.
The real learning curve is on agent prompting. Both Cascade and Claude Code reward specific, scoped instructions. Vague prompts produce mediocre results in either tool. Spend a week with each before forming strong opinions.
What I Actually Recommend
If you can afford both: install both. Combined cost is $20–$100/month depending on usage. That's less than one hour of senior developer time.
If you can only afford one and you're an IDE-first developer: Windsurf. The agentic features inside the editor are worth more than terminal autonomy.
If you can only afford one and you live in the terminal: Claude Code. You'll move faster with a tool designed for your existing workflow.
If you're a backend or platform engineer: Claude Code, almost certainly. The CLI fit matters.
If you're a frontend engineer: Windsurf, probably. Visual feedback on UI work is genuinely useful.
Working with a Fractional CTO
I help founders pick the right AI coding stack for their team and codebase. If you're scaling a team and trying to standardize on tools, or you've inherited a vibe-coded mess and need to plan a migration, book a strategy call. The first call is free.
### Frequently Asked Questions
**Q: Is Windsurf or Claude Code better in 2026?**
A: It depends on where you work. Windsurf is better if you want an agent inside a full IDE that handles UI, navigation, and visual file management. Claude Code is better if you live in the terminal and want maximum autonomy on command-line tasks, multi-file refactors, and test backfills. Most professional developers in 2026 use both — Windsurf for IDE-bound work, Claude Code for terminal-heavy tasks.
**Q: What's the difference between Windsurf and Claude Code?**
A: Windsurf is an IDE (visual editor) with an autonomous agent called Cascade built in. Claude Code is a terminal application — no editor, no UI — that operates entirely through your shell, reading files, running commands, and iterating on errors. Same kind of agentic behavior, completely different interface.
**Q: How much does Claude Code cost vs Windsurf?**
A: As of May 2026, Windsurf Pro is $15/month with 500 prompt credits + 1,500 flow action credits. Claude Code is pay-as-you-go on the Anthropic API — typical cost is $5–$50/month depending on usage, with heavy users (multi-hour daily sessions) running $100–$300/month. Windsurf is cheaper for light use; Claude Code is cheaper for predictable medium use; Windsurf is cheaper again for very heavy users on the Ultimate tier ($60/month).
**Q: Can Claude Code replace my IDE?**
A: Not entirely. Claude Code is a terminal agent — it has no editor of its own. You typically run it alongside an IDE (VS Code, Cursor, Windsurf, or even just vim). The agent works in your project directory, makes changes to files, runs tests, and commits — but you still need an editor to read diffs, navigate code visually, and handle non-AI work.
**Q: Is Claude Code safer than Windsurf's Cascade?**
A: Neither is inherently safer — both can make sweeping changes. Claude Code defaults to asking permission before running commands, which makes it feel safer for cautious users. Cascade defaults to running, which makes it faster but riskier. Both can be configured to be more or less autonomous. The bigger safety factor in both is using version control checkpoints and running tests before committing.
**Q: Which is better for large refactors — Windsurf or Claude Code?**
A: Claude Code is typically better for large refactors. The terminal interface keeps you focused on the task, the agent doesn't get distracted by editor UI, and it's easier to scope the work to specific directories or file patterns via CLI arguments. Windsurf's Cascade is comparable in quality but the IDE can feel like more friction on pure refactor work.
**Q: Can I use Claude Code with VS Code or Cursor?**
A: Yes — Claude Code runs in the terminal and is editor-agnostic. Most users keep an editor (VS Code, Cursor, Windsurf) open in one window and Claude Code running in another. The agent edits files; you watch the changes appear in your editor. This is a very common professional setup in 2026.
**Q: Is Windsurf or Claude Code better for backend work?**
A: Claude Code has a slight edge for backend-heavy work — database migrations, API design, test backfills, dependency management — because the terminal interface matches how backend work is normally done. Windsurf is comparable on backend code quality but the IDE feels like overhead when you're mostly running commands and editing config files.
---
## Bolt vs Lovable: Which Should Non-Developers Use in 2026?
- **URL:** https://justinmckelvey.com/blog/bolt-vs-lovable
- **Published:** May 11, 2026
- **Updated:** May 11, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 7 min
- **Description:** Bolt vs Lovable compared by a fractional CTO. Built the same MVP in each, reviewed the code, tested deployment. Pricing, output quality, and which is safer to ship.
Quick Answer
Lovable is the better choice for non-developers shipping real products; Bolt is better for fast prototyping and throwaway demos. As of May 2026, Lovable Pro is $25/month and Bolt Pro is $20/month — both target non-coders. Lovable produces more polished output and handles deployment to custom domains automatically. Bolt is faster to first result and cheaper per iteration but the apps usually need cleanup before launch.
Tested May 2026 · Same MVP built in both · Author: Justin McKelvey, fractional CTO, 50+ products shipped
TL;DR: Bolt vs Lovable in 2026
Both tools build apps from prompts. They feel different the moment you start using them. Lovable wants to ship you a finished product. Bolt wants to ship you a working sketch. As of May 2026, Lovable Pro is $25/month and Bolt Pro is $20/month. The sticker prices are similar; the products are not. Most non-developers I've advised end up using both for different things: Bolt for ideation, Lovable for the version they actually ship.
I'm a fractional CTO who reviews vibe-coded applications for founders before they launch. I built the same MVP — a paid newsletter signup app with Stripe checkout — in both Bolt and Lovable to write this. No affiliate links. Just what each one is actually like to use and what the code looks like after.
Bolt vs Lovable at a glance
| Feature | Bolt.new | Lovable |
| --- | --- | --- |
| Maker | StackBlitz | Stockholm-based team (formerly GPT Engineer) |
| Pro pricing (May 2026) | $20/month | $25/month |
| Free tier | Yes (1M tokens/day, public projects) | Yes (5 messages/day, public projects) |
| Usage metering | Tokens (expensive on large codebases) | Messages (more predictable) |
| Default stack | Flexible: Vue, Svelte, Next.js, HTML, anything | Opinionated: React + Tailwind + Supabase |
| Hosting | None; you deploy to Netlify/Vercel yourself | Built-in, custom domains on Pro, SSL included |
| Design quality | Generic out of the box | Magazine-quality, polished default aesthetic |
| Time to first working app | ~7 minutes (in same-prompt test) | ~12 minutes (in same-prompt test) |
| Best for | Throwaway prototypes, demos, developer scaffolds | Real SaaS products non-developers ship |
| Code handoff to developer | Inconsistent across projects | Clean, predictable, GitHub sync |
What Each Tool Is
Lovable is a prompt-to-app builder from a Stockholm-based team (formerly GPT Engineer). It's optimized for non-developers who want a finished, deployable product. The output is opinionated (React + Tailwind + Supabase by default) and the UX is polished.
Bolt (Bolt.new) is a prompt-to-app builder from StackBlitz. It's optimized for speed-to-first-result. The output is more flexible — it'll use whatever stack you ask for — but the UX feels rougher and the generated apps need more work to feel finished.
Lovable wants you to ship. Bolt wants you to iterate.
Pricing Compared (May 2026)
Lovable: Free tier (5 messages/day, public projects only). Pro is $25/month (500 messages/day, private projects, custom domains, GitHub sync). Teams is $50/month. Annual saves ~17%.
Bolt: Free tier (1M tokens/day, public projects). Pro is $20/month (10M tokens/month, private projects, file upload). Pro 50 is $50/month (26M tokens). Annual saves ~25%.
The economics differ. Lovable counts messages; Bolt counts tokens. For most non-developers building a single app over a month, both work out to roughly the same monthly cost. Heavy iteration on Bolt gets expensive faster — large codebases consume tokens quickly on every change. Lovable's flat message limits are more predictable.
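To make the token-versus-message difference concrete, here is a rough back-of-the-envelope sketch. It assumes (this is my simplification, not a published Bolt formula) that each change costs roughly the size of the codebase in tokens plus a fixed overhead. The function name, the overhead figure, and the codebase sizes are all hypothetical; only the 10M tokens/month budget comes from the Pro plan above.

```python
def changes_per_month(token_budget: int, codebase_tokens: int,
                      overhead_tokens: int = 2_000) -> int:
    """Estimate how many changes a monthly token budget covers.

    Assumption: every change re-reads roughly the whole codebase, so the
    cost per change is codebase_tokens + overhead_tokens.
    """
    cost_per_change = codebase_tokens + overhead_tokens
    return token_budget // cost_per_change


PRO_BUDGET = 10_000_000  # Bolt Pro monthly tokens, per the pricing above

# Hypothetical small app (~20k tokens of code) vs. a grown app (~200k tokens).
small_app = changes_per_month(PRO_BUDGET, 20_000)
grown_app = changes_per_month(PRO_BUDGET, 200_000)
print(small_app, grown_app)  # 454 49
```

Under these assumptions the same budget covers roughly a tenth as many changes once the codebase is ten times larger, while a message-based quota stays flat regardless of codebase size.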
Building the Same App in Both
I prompted both with: "Build a paid newsletter signup app. Users enter their email, pay $10 via Stripe to subscribe, and get added to a Mailchimp list. Admin dashboard to see subscribers."
Lovable took 12 minutes and produced a working app with a polished landing page, working Stripe checkout, Supabase auth for the admin, and a clean admin dashboard. The Mailchimp integration was scaffolded but needed API keys to work. The visual design was magazine-quality. I could have shipped this with one more pass of cleanup.
Bolt took 7 minutes and produced a working app with the same functional flow, but the design was generic, the admin dashboard was barebones, and the Stripe integration had a webhook signature verification bug that would've broken in production. Faster to first result; more cleanup needed before launch.
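For readers who want to know what "webhook signature verification" involves, here is a minimal standard-library sketch of Stripe's documented scheme: the `Stripe-Signature` header carries a timestamp (`t`) and one or more `v1` signatures, and the signature is an HMAC-SHA256 of `"<timestamp>.<raw body>"` keyed with the endpoint secret. The function name and the 300-second tolerance are my own choices for illustration. In a real app you would normally call the official Stripe library's webhook helper instead of hand-rolling this.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(payload: bytes, header: str, secret: str,
                            tolerance: int = 300,
                            now: Optional[float] = None) -> bool:
    """Return True if `header` carries a valid v1 signature for `payload`."""
    # Header format: "t=1700000000,v1=<hex>,v1=<hex>"
    timestamp = None
    candidates = []
    for part in header.split(","):
        key, _, value = part.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False

    # Reject stale events to limit replay attacks.
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance:
        return False

    # The signed payload is the timestamp, a dot, and the raw request body.
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()

    # Constant-time comparison against every v1 signature in the header.
    return any(
        hmac.compare_digest(expected.encode(), c.encode()) for c in candidates
    )
```

The two mistakes this guards against are verifying a re-serialized JSON body instead of the raw bytes, and skipping the timestamp check.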
Code Quality
Both tools generate similar code quality when given the same prompt. The output is React + Tailwind by default, with reasonable component structure. The differences I noticed:
Lovable is more conservative. It stays inside its preferred stack (React + Tailwind + Supabase), produces consistent file structure, and rarely introduces unusual dependencies. The code is easier to hand off to a developer because it looks like every other Lovable app.
Bolt is more flexible but messier. It will use whatever you ask for — Vue, Svelte, Next.js, plain HTML. That flexibility comes with inconsistency. Different Bolt projects look like they were written by different developers, because effectively they were.
Both have similar security gaps. In my testing, both produced code with at least one critical issue out of the box: exposed API keys in client code, missing input validation, or improper auth scoping. (More on where AI-generated code breaks in production.)
Deployment + Hosting
Lovable hosts your app on a Lovable subdomain by default and supports custom domains on Pro plans. You point your domain at Lovable's hosting, and they handle SSL. For a non-developer, this is enormous: no separate hosting account, no DNS gymnastics.
Bolt doesn't host. You export the project to StackBlitz, GitHub, or download as a zip, and deploy it yourself — usually to Netlify or Vercel. For a developer, this is fine. For a non-developer, this is a wall.
If you're a non-developer and your goal is "ship something at my-domain.com," Lovable saves you 4–6 hours of YouTube tutorials.
When Lovable Wins
• You want a finished-looking product. Lovable's default aesthetic is meaningfully better.
• You need custom-domain hosting handled for you. One-click. Done.
• You're building a real SaaS or paid product. The polish matters for conversion.
• You'll hand the code to a developer eventually. Cleaner output, GitHub sync, easier handoff.
• You don't know what stack to use. Lovable's opinions save you from yourself.
When Bolt Wins
• You're prototyping fast and throwing it away. Bolt is faster per iteration.
• You want a specific stack (Vue, Svelte, plain HTML). Lovable will fight you; Bolt won't.
• You're sharing a working demo via link. Bolt's instant URL sharing is unbeatable.
• You're testing an idea internally with friends. Cheap, fast, disposable.
• You're a developer who just wants a starting scaffold. Bolt's flexibility wins here.
What About Cursor?
Neither Bolt nor Lovable is meant for developers. If you write code, you'll get more value from Cursor or Windsurf than either of these. Bolt and Lovable abstract away the code on purpose; that's the whole point. If you want to read and edit code as the primary interface, use an AI IDE.
What Happens When You Outgrow Either Tool
Eventually, every successful vibe-coded app outgrows its tool. The product gets traction, the user base grows, and the limits show up: scaling issues, security gaps, customizations the tool can't support. At that point, you have three options:
1. Export and hand to a developer. Lovable's GitHub sync makes this easier; Bolt's manual export works but is messier. Plan for a "vibe code rescue" project that takes 2–8 weeks and costs $5K–$50K depending on scope.
2. Rebuild from scratch. Sometimes cheaper than fixing what's there. Especially if the codebase is small (under 5,000 lines) and the requirements are now clear.
3. Stay on the tool and accept the limits. Reasonable if the app is internal or low-traffic. Less reasonable if you're scaling a real business.
(I've written about all three paths in this real case study.)
What I Actually Recommend
If you're a non-developer building one app and you want to ship it: Lovable. Pay $25/month, ship the app, point your domain at it, and budget $5K–$15K for a developer review before you take payment from customers.
If you're a non-developer who likes tinkering and wants to try ideas fast: Bolt. The free tier is generous, iteration is fast, and you'll learn a ton about how software is structured.
If you can afford both ($45/month combined): Use Bolt to prototype the idea, then rebuild it cleanly in Lovable once you know what you want. Most of my advised founders end up doing this.
If you're a developer: Skip both. Use Cursor or Windsurf — you'll move faster with tools designed for people who can read code.
Getting Professional Help
If you've already built something in Bolt or Lovable and want a professional review before you launch, or you've outgrown the tool and need to migrate to a real codebase, book a strategy call. The first call is free, and I'll tell you honestly whether your app is shippable as-is or needs a rescue.
### Frequently Asked Questions
**Q: Is Bolt or Lovable better for non-developers in 2026?**
A: Lovable is the safer choice for non-developers who want to ship a real product. It produces more polished output, handles deployment automatically, and supports custom domains out of the box. Bolt is better for rapid prototyping and shareable demos — it's faster to first result but the apps need more cleanup before they're production-ready.
**Q: How much do Bolt and Lovable cost?**
A: As of May 2026, Lovable Pro is $25/month (500 messages, custom domain support, GitHub sync) and Bolt Pro is $20/month (10M tokens, private projects, file upload). Both have free tiers with strict usage limits. Heavier users typically end up on $50/month tiers — Lovable Teams ($50) or Bolt Pro 50 ($50).
**Q: Can you deploy a Lovable app to your own domain?**
A: Yes. Lovable supports custom domains on Pro plans and above. You point your domain at Lovable's hosting and the app serves from your domain with SSL. Bolt apps need to be exported and deployed elsewhere — usually Netlify or Vercel — for custom domains.
**Q: Are Bolt and Lovable safe for production?**
A: Both produce code of similar quality, but neither is production-safe out of the box for serious applications. The most common issues across both: weak input validation, missing rate limiting, exposed API keys, no proper error handling, and naive authentication. For low-stakes apps (portfolios, landing pages, internal tools), both are fine. For real products handling money or sensitive data, get a developer review first.
**Q: What's the difference between Bolt and Bolt.new?**
A: Bolt.new is the URL; Bolt is the product. There's also Bolt.diy (an open-source clone) and StackBlitz Bolt (the parent company's enterprise version). When people say 'Bolt' in 2026, they usually mean Bolt.new — the hosted prompt-to-app tool from StackBlitz.
**Q: Can Bolt or Lovable handle authentication and payments?**
A: Both can scaffold auth (Supabase, Clerk) and Stripe integration when you ask for them. The scaffolded code works for happy-path demos but typically has issues: improper session handling, missing webhook signature verification, no idempotency keys on payment flows. If you're building a real SaaS, plan for a developer to harden these flows before launch.
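As an illustration of the idempotency point, here is a small sketch of a handler wrapper that runs a payment event once per key and returns the stored result on retries. The class and field names are hypothetical, and the in-memory dictionary stands in for what would need to be a durable store (a database table with a unique constraint, for example) in a real service.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class IdempotentProcessor:
    """Run `handler` at most once per idempotency key."""
    handler: Callable[[dict], str]
    # In-memory stand-in for a durable store; lost on restart.
    _results: Dict[str, str] = field(default_factory=dict)

    def process(self, key: str, event: dict) -> str:
        # A retry with a known key returns the earlier result
        # instead of charging or fulfilling a second time.
        if key in self._results:
            return self._results[key]
        result = self.handler(event)
        self._results[key] = result
        return result


charges = []


def charge(event: dict) -> str:
    """Hypothetical side effect: record a charge and return a receipt id."""
    charges.append(event["amount_cents"])
    return f"receipt-{len(charges)}"


processor = IdempotentProcessor(charge)
first = processor.process("evt_123", {"amount_cents": 1000})
retry = processor.process("evt_123", {"amount_cents": 1000})
```

Here `retry` equals `first` and only one charge is recorded, which is the behavior a webhook endpoint needs when the payment provider redelivers an event.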
**Q: Which is better for MVPs — Bolt or Lovable?**
A: For a polished MVP you'd show investors or paying customers, Lovable wins — the output looks better and feels more cohesive. For a rough prototype to validate an idea internally or with friends, Bolt wins — it's faster, cheaper per iteration, and easier to throw away. Use Bolt to find product-market fit; switch to Lovable (or a real codebase) once you know what you're building.
**Q: What happens when I outgrow Bolt or Lovable?**
A: Both let you export your code. Lovable syncs to GitHub on every change, so you can pick up the codebase and hand it to a developer. Bolt lets you download project files. The catch: the exported code is often messy and tightly coupled to the tool's conventions. Most professional cleanups (what I call 'vibe code rescue') cost $5K–$50K depending on how much of the app needs to be rebuilt.
---
## Cursor vs Windsurf: Which AI IDE Should You Use in 2026?
- **URL:** https://justinmckelvey.com/blog/cursor-vs-windsurf
- **Published:** May 11, 2026
- **Updated:** May 11, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 6 min
- **Description:** Cursor vs Windsurf compared by a fractional CTO who shipped real features in each. Pricing, Cascade vs Composer, agent mode, code quality, and which to pick.
Quick Answer
Cursor is the safer default in 2026; Windsurf wins if you want an aggressive autonomous agent. Cursor ($20/month) is a VS Code fork with the largest community and the most stable Composer agent. Windsurf ($15/month) is a Codeium-built IDE centered on Cascade, a more autonomous agent that runs longer multi-step tasks. Most professional developers in 2026 keep both installed and switch based on the task.
Tested May 2026 · 2 production features shipped · Author: Justin McKelvey, fractional CTO, 50+ products shipped
TL;DR: Cursor vs Windsurf in 2026
Cursor is an AI-assisted IDE. Windsurf is a directed-agent IDE. They look similar in screenshots, but the philosophy behind each is different. Cursor wants to make you a faster developer. Windsurf wants to do the developing for you, with you supervising. As of May 2026, Cursor Pro is $20/month and Windsurf Pro is $15/month. Both ship with frontier-model access (Claude, GPT, Gemini) and both produce comparable code quality when given the same prompt. The real difference is how much autonomy you're handing the agent.
I'm a fractional CTO who ships code daily. This month I built two production features — one in Cursor, one in Windsurf — to write this comparison honestly. No affiliate links, no demos, no toy projects. Just what each tool is actually like to live with for a week.
Cursor vs Windsurf at a glance
| Feature | Cursor | Windsurf |
| --- | --- | --- |
| Base editor | VS Code fork | Codeium-built IDE |
| Pro pricing (May 2026) | $20/month | $15/month |
| Agent | Composer (turn-based, asks for confirmation) | Cascade (autonomous, acts first) |
| Agent philosophy | Human-in-the-loop | Hands-off autonomy |
| Frontier models | Claude, GPT, Gemini | Claude, GPT, Gemini |
| Free tier | Yes (2,000 completions/mo) | Yes (limited Cascade credits) |
| Best for | Frontend, security-sensitive code, incremental edits | Large refactors, batch migrations, greenfield builds |
| Risk profile | Conservative: surfaces decisions | Aggressive: installs packages, edits files unprompted |
| VS Code extensions | All work natively | Most import on setup |
| Learning curve | Gentle (looks like VS Code) | Moderate (new agent paradigm) |
What Each Tool Is (In One Sentence)
Cursor is a VS Code fork with in-editor AI chat, predictive tab completions, and a Composer agent that handles multi-file edits when you ask for them.
Windsurf is a Codeium-built IDE with Cascade — an autonomous agent that plans, executes, and iterates on multi-step coding tasks with minimal supervision.
The shorter version: Cursor amplifies you. Windsurf replaces parts of you.
Pricing Compared (May 2026)
Cursor Pro: $20/month. Includes 500 fast premium requests (Claude 4.7 Sonnet, GPT-5, etc.), unlimited slow requests, full tab completion model. Annual is $192. Business tier is $40/seat.
Windsurf Pro: $15/month. Includes 500 prompt credits and 1,500 flow action credits (for Cascade agent steps). Annual saves ~17%. Ultimate tier at $60/month for heavy agentic users.
The sticker prices are close. The real cost depends on how aggressively you use the agent. Heavy Cascade users blow through the 1,500 flow credits in a week; heavy Cursor Composer users blow through the 500 fast requests faster. If you're agent-first, Windsurf Ultimate ($60) is usually cheaper than Cursor's overage charges. If you're chat-first, Cursor wins on price.
Agent Mode: Composer vs Cascade
This is where the two tools actually diverge. Both have agent modes. They feel completely different.
Cursor Composer is a turn-based collaborator. You give it a goal ("refactor the auth flow to use Devise"), and it plans steps, proposes file changes, asks for confirmation, runs commands when you allow it, and pauses on errors. It's conservative on purpose. You stay in the loop.
Windsurf Cascade is more autonomous. Give it the same goal and Cascade will plan, write code, run tests, fix failing tests, run more tests, install dependencies, and keep iterating until the goal is met or it's stuck. The default behavior is "act, don't ask." You can enable confirmations, but the tool is clearly designed for hands-off agentic flows.
Which is better depends on what you're doing. I shipped a Stripe checkout flow in Cursor and a 23-file ActiveRecord migration in Windsurf. Cursor's careful pace made the Stripe work safer — payment code needs review at every step. Cascade made the migration faster — there were 23 nearly-identical changes that didn't need individual review.
Code Quality
Both tools produce code of roughly the same quality when using the same underlying model. The differences I noticed in my testing:
Windsurf is more aggressive about completing the task. It will install packages, edit files you didn't mention, and make architectural assumptions. This is great when the assumptions are correct, painful when they aren't. Cascade once installed a deprecated package in my Rails app because it was the first option Google returned.
Cursor is more aggressive about surfacing decisions. Composer will pause and ask "I'm about to install X — is that what you want?" That's annoying when you know what you want; it's a safety net when you don't.
Both tools fail in similar ways on edge cases. Authentication, payment processing, multi-tenant scoping, and database migrations all need human review regardless of which tool you're using. (More on where AI coding tools break in production.)
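To show what a multi-tenant scoping review looks for, here is a minimal sketch using Python's built-in `sqlite3`. The `invoices` table, its columns, and the helper names are hypothetical. The idea is that the tenant filter lives inside the data-access helper, so a caller (human or agent) cannot forget it.

```python
import sqlite3


def invoices_for_tenant(conn: sqlite3.Connection, tenant_id: int) -> list:
    """Return only the invoices that belong to `tenant_id`."""
    # The tenant filter is part of the helper, and the value is bound as a
    # parameter rather than interpolated into the SQL string.
    cursor = conn.execute(
        "SELECT id, amount_cents FROM invoices WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    )
    return cursor.fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with two tenants (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices ("
        "id INTEGER PRIMARY KEY, "
        "tenant_id INTEGER NOT NULL, "
        "amount_cents INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO invoices (tenant_id, amount_cents) VALUES (?, ?)",
        [(1, 1000), (2, 2500), (1, 400)],
    )
    return conn


conn = demo_connection()
tenant_one = invoices_for_tenant(conn, 1)
tenant_two = invoices_for_tenant(conn, 2)
```

The typical generated-code failure is a query like `SELECT ... FROM invoices WHERE id = ?` with no tenant condition, which lets one customer read another customer's rows by guessing an id.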
When Cursor Wins
• Frontend work — Cursor's tab completion is faster and more accurate on React, Vue, and Tailwind.
• Security-sensitive code — payments, auth, anything HIPAA/SOC2. You want the human-in-the-loop pace.
• Onboarding to a new codebase — Cursor's "Ask" mode is the best way to explore an unfamiliar codebase quickly.
• Pair programming feel — if you like the "AI is helping me code" mental model.
• Teams already on VS Code — zero migration cost, all your extensions work.
When Windsurf Wins
• Large refactors — 10+ file changes where the logic is repetitive and review is mechanical.
• Batch migrations — schema changes, library upgrades, test backfills.
• Greenfield projects — when you want to type "build me a blog with auth and Stripe" and walk away.
• Background work — kick off a Cascade run, do a meeting, come back to a finished feature.
• Developers who want max throughput and are comfortable reviewing diffs after the fact.
What About Claude Code?
Neither of these is the same as Claude Code, which runs in your terminal instead of an IDE. Claude Code is closer to Windsurf's Cascade in spirit (autonomous, agentic, command-execution-friendly) but lives outside the editor. Most professional developers in 2026 use a mix: Cursor or Windsurf for editor-bound work, Claude Code for terminal-heavy tasks.
Switching Cost
Migrating from VS Code or Cursor to Windsurf is mostly painless. Windsurf imports your settings and most VS Code extensions work. Migrating off Windsurf to Cursor is similarly easy. The lock-in isn't the editor; it's the muscle memory of how each tool's agent works. You'll spend a few days re-learning workflows when you switch.
What I Actually Recommend
If you're a working developer and you can afford both: install both. They're $35/month combined, which is less than one hour of a senior developer's time. Use Cursor for day-to-day editing and code review work. Reach for Windsurf when you have a large, mechanical task that benefits from autonomous iteration.
If you can only afford one and you're a generalist: Cursor. It's the better all-purpose tool and the safer default.
If you can only afford one and you do a lot of large refactors or greenfield builds: Windsurf. Cascade pays for itself in a single multi-file refactor.
If you're a non-developer trying to ship an app: neither. Use Lovable or Bolt instead — these IDEs assume you can read code.
Working with a Fractional CTO
I help founders pick the right AI coding tools for their stack and team. If you're vibe-coding an MVP and worried about what happens at scale, or you've already shipped something and want a professional review before you launch, book a strategy call. The first call is free.
### Frequently Asked Questions
**Q: Is Cursor or Windsurf better in 2026?**
A: For most developers, Cursor is still the safer pick in 2026 — it has the larger community, more stable model integrations, and the most mature agent mode. Windsurf is the stronger choice if you want a more aggressive autonomous agent (Cascade) that runs longer multi-step tasks with less hand-holding. Both cost $15–$20/month at the pro tier.
**Q: What is the difference between Cursor and Windsurf?**
A: Cursor is a VS Code fork built around in-editor chat, tab completions, and a manual-feeling Composer agent. Windsurf is an IDE built by Codeium around Cascade, an autonomous agent designed to plan and execute multi-file changes with minimal prompts. Cursor feels like 'AI-assisted coding'; Windsurf feels like 'directed AI coding.'
**Q: How much does Windsurf cost vs Cursor?**
A: As of May 2026, Windsurf Pro is $15/month and Cursor Pro is $20/month. Both include access to frontier models (Claude, GPT, Gemini) with monthly usage credits. Heavier agentic users typically exceed the included credits on either tool and pay $20–$60/month total.
**Q: Is Windsurf better at agent mode than Cursor?**
A: Cascade (Windsurf's agent) is more aggressive about running multi-step tasks autonomously — it will plan, execute, run terminal commands, and iterate without asking. Cursor's Composer is more conservative and surfaces intermediate steps for review. If you want speed, Cascade wins. If you want control, Composer wins.
**Q: Can I use Cursor and Windsurf at the same time?**
A: Yes — they're separate apps and don't conflict. Many developers in 2026 keep both installed: Cursor for day-to-day editing and quick completions, Windsurf when they want to hand a multi-file refactor to an agent and walk away. The cost overlap is small relative to the productivity gain.
**Q: Is Windsurf safe for production code?**
A: Windsurf-generated code is similar in quality to Cursor and Claude Code when using the same underlying model. The risk isn't the tool — it's the agent autonomy. Cascade can make sweeping changes faster than you can review them. Always run tests, use version control checkpoints, and review diffs before committing.
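One way to turn "checkpoints, tests, then review" into a habit is a small wrapper around any agent run. The sketch below is illustrative only: `guarded_agent_run`, the `agent_step` callable, and the injectable `run` parameter are hypothetical names, and it assumes it is executed inside a git repository. The git commands themselves (`git commit --allow-empty -am`, `git reset --hard HEAD`) are standard.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def guarded_agent_run(agent_step: Callable[[], None],
                      test_cmd: Sequence[str],
                      run: Runner = _run) -> bool:
    """Checkpoint, let the agent work, and keep the changes only if tests pass."""
    # 1. Checkpoint: commit tracked changes so there is a known-good state.
    checkpoint = ["git", "commit", "--allow-empty", "-am",
                  "checkpoint before agent run"]
    if run(checkpoint) != 0:
        raise RuntimeError("could not create checkpoint commit")

    # 2. Let the agent edit files (hypothetical callable supplied by the caller).
    agent_step()

    # 3. Gate on the test suite.
    if run(test_cmd) == 0:
        return True  # changes stay in the working tree for diff review

    # 4. Tests failed: discard edits to tracked files. Files the agent newly
    #    created are untracked and are NOT removed by this command.
    run(["git", "reset", "--hard", "HEAD"])
    return False
```

A call might look like `guarded_agent_run(my_agent, ["pytest", "-q"])`. On success the changes are left uncommitted on purpose, so the diff still gets a human review before it is committed.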
**Q: Which is faster, Cursor or Windsurf?**
A: Cursor feels faster for inline edits and quick completions — its tab model is highly tuned. Windsurf feels faster for large refactors because Cascade does more per prompt with less back-and-forth. For 'how fast can I ship this feature end-to-end,' Windsurf often wins on agentic work; Cursor wins on incremental editing.
**Q: Should a beginner pick Cursor or Windsurf?**
A: Beginners should start with Cursor. The learning curve is gentler — it looks and feels like VS Code, has more tutorials, and won't run dangerous commands without confirmation. Windsurf's Cascade agent is more powerful but easier to get into trouble with if you don't yet know what 'good code' looks like.
---
## How to Hire a Fractional CTO: Complete Guide for Founders (2026)
- **URL:** https://justinmckelvey.com/blog/how-to-hire-a-fractional-cto
- **Published:** May 04, 2026
- **Updated:** May 04, 2026
- **Category:** Fractional CTO
- **Reading time:** 10 min
- **Description:** Step-by-step guide to hiring a fractional CTO in 2026. Where to find them, 12 interview questions, contract structure, red flags, and the onboarding checklist.
TL;DR: How to Hire a Fractional CTO in 2026
Hiring a fractional CTO takes 2-4 weeks if you know what you're looking for. The five-step process: define what you actually need, source candidates from operator networks (not job boards), interview 3-5 candidates with strategic questions (not technical trivia), do a paid trial engagement before committing, and sign a month-to-month retainer with 30-day notice. Expect to pay $5,000-15,000/month for an embedded engagement. The single biggest predictor of success is whether the CTO pushes back on your ideas vs. just executing what you tell them. This guide walks through every step.
Already know what to pay? See fractional CTO rates and cost in 2026. Trying to decide if you need fractional or full-time? Read fractional vs full-time CTO.
What Is a Fractional CTO (And Who Actually Needs One)
A fractional CTO is a senior technology executive who works with multiple companies simultaneously, typically 2-4 clients at any given time, providing strategic technical leadership at 10-20 hours per week per company. They are NOT a senior developer. They are NOT a project manager. They are NOT an outsourced dev agency. They are the equivalent of a full-time CTO operating at part-time capacity, with the strategic thinking, hiring judgment, and architectural authority that role implies.
You need one if any of these apply:
• You're a non-technical founder building a software product and feeling out of your depth on technical decisions
• You have 1-3 engineers but no one with senior architecture experience
• You're about to spend $50,000+ on a major build and want a second opinion on the approach
• You're hiring engineers and don't know how to evaluate their skills
• You inherited a codebase (acquisition, AI-generated, offshore build) and need someone to assess it
• You're raising a Series A and need credible technical leadership for investor diligence
• Your full-time CTO just left and you need 60-90 days of coverage while you hire
You do NOT need one if: you have fewer than 100 paying customers and no engineering team yet (you need an engineer who can ship, not a leader to direct), or your business is fundamentally non-technical (you need a tech-literate operator, not a CTO).
The 5-Step Process to Hire a Fractional CTO
Step 1: Define the Problem (Not the Person)
Before sourcing anyone, write down the specific outcomes you want in 90 days. Not job descriptions, but outcomes. Examples:
• "Reduce deployment time from 2 hours to under 15 minutes"
• "Decide whether to rebuild or refactor our existing Python codebase"
• "Hire two senior engineers and onboard them productively"
• "Pass technical due diligence for our Series A round"
• "Migrate from Heroku to AWS without downtime"
This list does two things: it filters out fractional CTOs whose strengths don't match your needs, and it gives you a measurable scorecard for evaluating success at the 90-day mark. If you can't articulate three concrete outcomes, you're not ready to hire — you need to do more discovery first.
Step 2: Source Candidates from Operator Networks
Skip job boards. Skip Upwork. The fractional CTOs you want aren't actively looking — they're working with referred clients. Source from:
• Personal referrals from other founders who've used a fractional CTO. This is the highest-quality source. Ask in founder communities like Indie Hackers, On Deck, or your local startup Slack.
• Operator networks like On Deck CTO Track, Pallet, Reforge alumni network, and Lenny's Talent Collective. These curate experienced operators rather than open marketplaces.
• Fractional executive marketplaces like Bonsai, GrowthMentor, Continuum, or Pallet Labs. Vetted but pay-to-play, so check references rigorously.
• LinkedIn search for "Fractional CTO" + your industry vertical. Filter by people who post regularly about their work — that's a signal of operator mindset, not just a title change.
• AI-focused communities if you're building AI products: Latent Space Discord, AI Tinkerers Slack, AI Engineer World's Fair alumni. The AI fractional CTO market is small and tight — most operate via referral.
Aim to source 8-12 candidates initially. You'll narrow to 3-5 for interviews, and 1-2 for paid trials.
Step 3: Interview with Strategic Questions (Not Technical Trivia)
The interview's purpose is to assess judgment, communication, and pattern recognition — not to test if they can solve LeetCode problems. Ask:
1. "Walk me through a time you told a founder NOT to build something they wanted to build. What was your reasoning?" Tests strategic backbone — fractional CTOs who just execute what founders ask are dangerous.
2. "How would you architect [our specific use case] in the next 90 days?" Tests speed of judgment under uncertainty.
3. "How many other clients are you working with right now? What's the time split?" You want 2-4 max, with clear boundaries.
4. "When have you been wrong about a major technical decision? How did you recover?" Tests humility and learning. If they can't name one, they're either inexperienced or dishonest.
5. "What's your hourly rate vs. retainer rate, and why are they different?" Tests pricing transparency. Anyone who hedges here will hedge on other things.
6. "Describe your communication cadence with current clients — meetings, async, response time SLA." Tests operational maturity.
7. "What does your first 30 days look like with a new client?" Tests whether they have a repeatable process.
8. "How do you measure success in your engagements?" Vague answers = vague results.
9. "What's a technology you used to love that you've changed your mind about?" Tests intellectual flexibility.
10. "Who would you NOT hire? What's your candidate red flag list?" Reveals their hiring philosophy if you'll be involving them in hires.
11. "If you had to leave this engagement in 90 days, what would good handoff look like?" Forces them to think about exit and continuity.
12. "What's a question I should be asking that I haven't?" Reveals strategic thinking and interview skill.
Take notes during each interview. After all interviews are done, score candidates on a 1-5 scale across: judgment, communication, domain fit, pattern recognition, and culture fit. Highest aggregate score wins, with culture fit as a tiebreaker.
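The scoring step above is simple enough to run in a spreadsheet, but here is a minimal Python sketch of it. The candidate names and scores are hypothetical, and equal weighting of the five criteria is an assumption; the guide only says to sum the scores and break ties on culture fit.

```python
from dataclasses import dataclass

# The five criteria named in the text, each scored 1-5 per candidate.
CRITERIA = ("judgment", "communication", "domain_fit",
            "pattern_recognition", "culture_fit")


@dataclass
class Candidate:
    name: str
    scores: dict[str, int]  # criterion name -> score from 1 to 5

    def total(self) -> int:
        # Unweighted aggregate (equal weighting is an assumption).
        return sum(self.scores[c] for c in CRITERIA)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Sort best-first by aggregate score, breaking ties on culture fit."""
    for cand in candidates:
        for criterion in CRITERIA:
            if not 1 <= cand.scores[criterion] <= 5:
                raise ValueError(f"{cand.name}: {criterion} must be 1-5")
    return sorted(
        candidates,
        key=lambda c: (c.total(), c.scores["culture_fit"]),
        reverse=True,
    )


# Hypothetical candidates and scores, for illustration only.
shortlist = [
    Candidate("A", dict(zip(CRITERIA, (4, 4, 4, 4, 3)))),
    Candidate("B", dict(zip(CRITERIA, (4, 4, 3, 4, 4)))),
    Candidate("C", dict(zip(CRITERIA, (5, 3, 3, 3, 3)))),
]
ranked = rank(shortlist)
print([c.name for c in ranked])  # A and B tie on 19; B ranks first on culture fit
```

Scoring only after all interviews are done, as the text recommends, keeps the first candidate from anchoring the scale.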
Step 4: Do a Paid Trial Engagement
Never hire fractional CTO talent without a paid trial. The trial filters out people who interview well but execute poorly. Two trial formats work well:
Option A: Architecture Audit — Pay them $3,000-5,000 for a 2-day deep dive into your existing system, codebase, infrastructure, and team. Output: a written report with prioritized recommendations, ranked by ROI. This works whether you have an existing codebase or are starting from scratch (in which case it's an architecture proposal, not audit).
Option B: 1-Week Sprint Engagement — Pay them at their hourly rate for a 1-week project with a specific deliverable: hire a senior engineer, ship a feature, pass a security review, design a migration plan. This tests whether they can actually execute, not just advise.
After the trial, you have three signals: quality of their thinking (the deliverable), quality of their communication (how they worked with you and your team), and quality of their judgment (did they push back appropriately or just rubber-stamp your assumptions). If any of those is weak, do not convert to retainer — go back to your candidate list.
Step 5: Sign a Month-to-Month Retainer
Once you've found a fit, sign a month-to-month retainer with 30-day notice on either side. Required terms:
• Scope — clear definition of what's included (hours/week, types of work, communication channels)
• Rate — flat monthly fee, ideally with a small discount vs. their hourly rate as the trade for predictability
• Communication SLA — response time expectations (typical: 4 business hours for non-urgent, 1 hour for critical)
• Notice period — 30 days from either party, no questions asked
• IP ownership — all work product is yours; they retain knowledge but no IP
• NDA — standard and mutual
• Equity (optional) — if you want skin-in-the-game alignment for a 6+ month engagement, 0.25-1.0% with monthly vesting after a 12-month cliff is standard
Avoid contracts longer than 6 months. Avoid yearly minimums. Avoid retainers with less than 30 days' notice — those are gig contracts, not fractional executive engagements.
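To sanity-check the Rate term, it helps to convert a monthly retainer into an effective hourly rate and compare it with the quoted hourly rate. A rough sketch, assuming an average of 52/12 weeks per month; the $300/hour and $12,000/month figures in the last line are hypothetical, and the function names are illustrative.

```python
WEEKS_PER_MONTH = 52 / 12  # average month length in weeks, about 4.33


def effective_hourly(monthly_retainer: float, hours_per_week: float) -> float:
    """Hourly rate implied by a flat monthly retainer."""
    return monthly_retainer / (hours_per_week * WEEKS_PER_MONTH)


def retainer_discount(hourly_rate: float, hours_per_week: float,
                      monthly_retainer: float) -> float:
    """Fractional discount of the retainer vs. billing the same hours hourly."""
    hourly_equivalent = hourly_rate * hours_per_week * WEEKS_PER_MONTH
    return 1 - monthly_retainer / hourly_equivalent


# Ends of the $5,000-15,000/month range at 10 and 20 hours/week.
print(round(effective_hourly(5_000, 10), 2))   # 115.38
print(round(effective_hourly(15_000, 20), 2))  # 173.08

# Hypothetical: $300/hour quoted, 10 hours/week, $12,000/month retainer.
print(round(retainer_discount(300, 10, 12_000), 3))  # 0.077, i.e. about 8% off
```

If the implied discount comes out negative, the retainer costs more than hourly billing for the same hours, which is worth asking about before signing.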
### Comparison: 4 Ways to Get CTO-Level Help
Before committing to fractional, make sure it's actually the right model. Here's the honest comparison:
| Model | Cost (Year 1) | Time to Start | Best For | Worst For |
|---|---|---|---|---|
| Full-Time CTO | $380K-600K (salary + benefits + equity dilution) | 4-6 months to hire | Series B+, dedicated technical roadmap, 10+ engineers | Pre-Seed, Seed, anyone unsure they need 40 hours/week of CTO |
| Fractional CTO | $60K-180K (10-20 hours/week) | 2-4 weeks | Pre-seed to Series A, technical strategy + part-time leadership | Companies needing 30+ hours/week of strategic technical work |
| CTO-as-a-Service | $36K-96K (productized package) | 1-2 weeks | Defined scope: due diligence, architecture review, hiring sprint | Open-ended ongoing leadership needs |
| Dev Agency | $120K-500K (project-based) | 1-3 weeks | Building a specific product or feature with clear specs | Strategic decisions, hiring judgment, telling you NOT to build something |
Most pre-seed and seed-stage founders should hire fractional first. Once you have $1M+ ARR and a clear technical roadmap, transition to full-time. CTO-as-a-Service is best as a project-based supplement, not a primary leader. Dev agencies are great for execution but should never be the strategic decision-maker.
### Where to Find Qualified Fractional CTOs (Source List)
Best sources by tier:
Tier 1: Personal referral networks — Indie Hackers (founder community), On Deck (operator network), Pallet (alumni), local founder Slacks. The highest-quality source because there's a peer reputation cost for bad referrals.
Tier 2: Operator-curated networks — Reforge alumni, Lenny's Talent Collective, Continuum, GrowthMentor, Pallet Labs. Vetted but you still need to do reference checks.
Tier 3: LinkedIn directly — Search "Fractional CTO" + your industry. Filter for people with consistent posting (signals operator mindset), 5+ years CTO/VP-level experience, and AT LEAST one previous startup exit or successful scale.
Tier 4: Industry-specific communities — Latent Space (AI), AI Tinkerers Slack (AI), Demand Curve (B2B SaaS GTM), MicroConf (bootstrapped SaaS). Find fractional CTOs who specialize in your space.
Avoid: Upwork (not where senior operators live), generic dev agencies (different business model), anyone who calls themselves a "CTO advisor" without operator history (advice without operating experience is theory).
### Red Flags During the Hiring Process
Walk away if you see any of these:
• Won't share specific past clients or outcomes. Confidentiality has limits — they should be able to describe the type of company, problem, and outcome even if names are masked.
• Pricing without explanation. "I charge $300/hour because that's my rate" is fine. "I charge $300/hour" with no justification is a yellow flag. "I'd rather not say my rate until we talk more" is a red flag.
• Won't do a paid trial. Anyone confident in their value will happily prove it on a 1-2 week paid trial.
• Pushes for equity-only or equity-heavy compensation. They're either uncertain about delivering immediate value, or trying to acquire equity arbitrage.
• 5+ concurrent clients. Math doesn't work. They're an absentee fractional CTO at best.
• Can't say no to anything. If they agree with everything you say in the interview, they'll agree with everything during the engagement. You're paying for judgment, not agreement.
• Vague about their first 30 days. Experienced fractional CTOs have a repeatable onboarding process. If they don't, they're inexperienced or unstructured.
• Bad at communication during the sales process. Slow responses, unclear emails, missed meetings — this only gets worse after they're hired.
### Onboarding Your Fractional CTO (First 30/60/90 Days)
How you onboard determines the engagement's success more than who you hired. The first 90 days follow a predictable pattern:
First 30 Days: Discovery and Quick Wins
• Weeks 1-2: Codebase audit, team 1:1s, business context immersion
• Week 3: Identify 3-5 quick wins (under 1 week each) and execute
• Week 4: Deliver a written 90-day plan with prioritized initiatives
Days 31-60: Strategic Initiatives
• Execute on the highest-priority initiative from the 90-day plan
• Begin team-building work (hires, role definitions, culture cleanup)
• Establish ongoing rhythms: weekly 1:1 with founder, monthly all-hands tech update, quarterly architecture review
Days 61-90: Compounding Value
• Deliver second priority initiative
• Quarterly review meeting: what's working, what's not, scope adjustments
• Decision point: continue at current scope, scale up to embedded, or wind down
If at day 60 you can't articulate three specific things the fractional CTO has accomplished, the engagement is failing. Have the direct conversation, course-correct, or part ways.
### Getting Started
The fastest way to figure out if a fractional CTO is right for your stage is a 30-minute strategy call. Not a sales pitch — a diagnostic conversation about your specific situation, what you're trying to build, what's actually blocking you, and whether fractional CTO support is the right move.
I work with 3-4 founders at a time on $5K-15K/month embedded engagements, focused on AI-first product builds, vibe code rescue (rebuilding AI-generated codebases that broke in production), and Series A technical due diligence prep. Austin-based, US clients only.
Book a strategy call and I'll give you an honest assessment of whether fractional CTO support fits your stage and budget. If it doesn't, I'll tell you what does.
Related reading: Fractional CTO Rates and Cost, Fractional vs Full-Time CTO, Fractional Product Manager Guide, Fractional CTO Services Overview.
### Frequently Asked Questions
**Q: How long does it take to hire a fractional CTO?**
A: Most fractional CTO hires happen in 2-4 weeks from first conversation to start date. Week 1: source candidates and screen for fit. Week 2: interview 3-5 candidates and check references. Week 3: paid trial engagement (architecture audit or 1-week sprint). Week 4: contract signing and onboarding. Compare this to a full-time CTO hire which typically takes 4-6 months.
**Q: Where do I find a qualified fractional CTO?**
A: The best sources are: (1) personal referrals from founders who've worked with one, (2) operator networks like On Deck CTO Track, Pallet, or Reforge, (3) fractional executive marketplaces like Bonsai or GrowthMentor, (4) LinkedIn outreach to people who explicitly list 'Fractional CTO' in their profile, (5) AI-focused communities like Latent Space or AI Tinkerers Slack. Avoid traditional dev agencies — they're not selling strategic leadership, they're selling hours.
**Q: What should I look for in a fractional CTO?**
A: Five things: (1) Pattern recognition — they've seen 5+ companies at your stage, (2) Operator past — they've actually shipped products, not just advised, (3) Domain fit — relevant industry experience for your space (B2B SaaS, marketplaces, AI, fintech), (4) Founder communication — they can explain technical tradeoffs to non-technical founders without condescension, (5) Honest red flags — they tell you what they're NOT good at. The single biggest predictor of success is whether they push back on your ideas vs. just executing what you tell them.
**Q: What questions should I ask in a fractional CTO interview?**
A: Top five: (1) 'Walk me through a time you told a founder NOT to build something they wanted to build.' (Tests strategic backbone), (2) 'How would you architect [my specific use case] in the next 90 days?' (Tests speed of judgment), (3) 'How many other clients are you working with right now and what's the time split?' (Tests bandwidth), (4) 'When have you been wrong about a major technical decision and how did you recover?' (Tests humility and learning), (5) 'What's your hourly rate vs. retainer rate, and why?' (Tests pricing transparency).
**Q: Should I hire a fractional CTO hourly or on retainer?**
A: Retainer is better for ongoing work (3+ months). The CTO becomes embedded, learns your codebase, and proactively spots problems. Typical embedded retainer: $5,000-15,000/month for 10-20 hours/week. Hourly is better for time-bounded projects: architecture review ($3,000-5,000 for a 2-day audit), security assessment, technical due diligence. Most engagements start with a paid hourly trial (1-2 weeks) then convert to retainer if it's working.
**Q: How much equity should a fractional CTO get?**
A: Most fractional engagements have zero equity — it's a paid service relationship. For long-term embedded engagements (6+ months), equity in the 0.25-1.0% range is reasonable on top of cash compensation. Never replace cash with equity-only — that's a red flag indicating the CTO can't justify their value with cash. The equity should vest over the engagement length (typically 12-month cliff with monthly vesting after) so you're not stuck giving equity to someone who left after 60 days.
**Q: What's a typical fractional CTO contract length?**
A: Industry standard is month-to-month with 30-day notice on either side. This protects both parties — you can exit if it's not working, they can exit if priorities shift. Avoid contracts longer than 6 months. Avoid notice periods shorter than 30 days (suggests they're just gigging). The healthiest structure: a 1-2 week paid trial, then month-to-month with quarterly check-ins on engagement scope and rate.
**Q: Can I switch fractional CTOs if it's not working?**
A: Yes, and you should plan for it. If month 2 isn't dramatically better than month 1, have the direct conversation: 'This isn't working — here's specifically what I need that's not happening.' Either they course-correct or you part ways. Most fractional CTOs prefer this honesty to a dragged-out exit. Average fractional engagement length is 8-14 months, so plan for switching every 12-18 months as your needs evolve from MVP → scale → optimization.
---
## Build vs Buy AI: A Decision Framework for Profitable Businesses (2026)
- **URL:** https://justinmckelvey.com/blog/build-vs-buy-ai-decision-framework
- **Published:** April 27, 2026
- **Updated:** April 27, 2026
- **Category:** AI for Business
- **Reading time:** 8 min
- **Description:** Should you build AI in-house or buy a vendor solution? A 4-question framework with real cost ranges, plus the rare case where a productized AI ops service is the right answer.
### TL;DR: Default to Buy, Build Only When You Must
Build vs buy is the AI question every operator is wrestling with right now, and the wrong answer is expensive in both directions. Build wrong and you spend $1M+ over three years on a system that's worse than off-the-shelf. Buy wrong and you end up with vendor lock-in, generic output, and a workflow that doesn't fit your business. The honest decision: default to buy unless you have a specific reason to build. The case for building is narrower than most operators think — and there's a third option, productized AI ops, that solves the common middle case where neither buying generic SaaS nor building from scratch is right.
### Why "Build vs Buy AI" Is Different from "Build vs Buy Software"
The classic build vs buy software framework still applies, but AI changes some of the math.
What's the same: the core question of competitive differentiation. If the capability is core to your moat, lean toward build. If it's table stakes, lean toward buy. That principle has held for decades and still holds for AI.
What's different: the cost structure of building has shifted dramatically in both directions. The model itself is cheaper than ever — foundation models from Anthropic, OpenAI, Google, and Meta are commodity-priced and capable. But the surrounding work — evaluation, monitoring, integration, drift management, security review — has gotten more expensive because the standards for production AI are rising fast.
What's also different: vendors have caught up faster than they did in classic SaaS. The 18-month gap between "vendor solution exists" and "vendor solution is enterprise-ready" has compressed to 3-6 months for AI. That changes the build-vs-buy math because waiting for the vendor to catch up is now a real strategy.
### The 4-Question Build vs Buy AI Framework
The classic build vs buy framework still works as the spine: AI changes the cost inputs, not the structure of the decision. The four questions below are calibrated for AI specifically.
Run every AI capability you're considering through these four questions. The answers will sort 80% of decisions cleanly. The other 20% are genuine judgment calls — that's where outside advice earns its fee.
Question 1: Is this capability core to your competitive advantage? Not "important to operations" — core to your moat. If you sold the AI capability separately, would customers pay for it? If yes, lean build. If no, lean buy.
Most internal AI applications fail this question. AI for customer support triage is important but not core. AI for drafting first-pass content is useful but not differentiated. AI inside a product you sell — where the AI quality is part of what customers pay for — passes this question and pushes toward build.
Question 2: Do you have unique data that gives your AI a quality advantage? If you're building on the same foundation models with the same generic data as everyone else, you don't have a build advantage. If you have proprietary data — years of customer records, domain-specific datasets, internal expertise others can't access — that data can give a custom-trained or custom-prompted system a real edge.
Most businesses overestimate how unique their data is. "We have customer transactions" isn't unique data. "We have 10 years of structured outcomes data on a specific niche workflow that vendors don't have access to" is unique data. Be honest here. Vendors have a lot of data.
Question 3: Will you commit to running the system long-term? Building means owning. Owning means hiring or retaining the people who can maintain, debug, and improve it for years. AI systems decay if unmaintained — model drift, prompt drift, integration drift. The hidden cost of building is the team you need on payroll forever.
If you can't honestly commit to a 2-3 person AI team for the foreseeable future, you can't build. You can buy or productize, but building without long-term ownership is the most expensive failure mode.
Question 4: Does buying lock you in dangerously? Some lock-in is fine. Lock-in becomes dangerous when (a) the vendor is small enough to disappear, (b) your data isn't easily exportable, or (c) the vendor's pricing power could increase 5-10x without you having alternatives. Evaluate lock-in as a real cost; vendors with clean APIs and portable data are worth a premium over vendors who hold your workflow hostage.
The decision rule: a "yes" on all three of Questions 1-3 (core to moat, unique data, long-term commitment) plus a "yes" on Question 4 (buying creates real lock-in danger) → build is justified. Anything less → buy or productize.
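Read literally, the rule is a four-way AND: build only when Questions 1-3 and Question 4 all come back "yes". A minimal Python sketch of that reading follows. The function and parameter names are illustrative, and treating any one of the three lock-in conditions from Question 4 as enough to count as dangerous is an assumption.

```python
def lock_in_is_dangerous(vendor_may_disappear: bool,
                         data_not_exportable: bool,
                         unchecked_pricing_power: bool) -> bool:
    """Question 4: any one of the three conditions makes lock-in dangerous."""
    return vendor_may_disappear or data_not_exportable or unchecked_pricing_power


def build_or_buy(core_to_moat: bool, unique_data: bool,
                 long_term_team: bool, dangerous_lock_in: bool) -> str:
    """Apply the decision rule: build only if all four answers are yes."""
    if core_to_moat and unique_data and long_term_team and dangerous_lock_in:
        return "build"
    return "buy or productize"


# Hypothetical case: internal support triage on a vendor with exportable data.
print(build_or_buy(False, False, True, lock_in_is_dangerous(False, False, False)))

# Hypothetical case: AI inside the product, proprietary data, committed team,
# and the only viable vendor holds data in a format that cannot be exported.
print(build_or_buy(True, True, True, lock_in_is_dangerous(False, True, False)))
```

The first call returns "buy or productize" and the second returns "build". The function only sorts the clean cases; the borderline ones still need judgment.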
### When to Build
The case for building is real but narrow. Examples where it tends to be right:
AI inside a software product you sell. If your customers pay for your product specifically because of its AI capabilities, the AI is core to your moat. Build it. Don't outsource your differentiation.
AI doing core risk-bearing work in a regulated industry. Underwriting at a financial firm, fraud detection at a payment processor, clinical decision support in healthcare. These workflows have unique data, regulatory requirements that constrain vendor choice, and outcomes where buying generic accuracy isn't acceptable.
AI on top of proprietary datasets you've built over years. If you have 10+ years of structured outcomes data in a niche domain, the data is your moat. Building on top of it preserves the moat. Buying a generic vendor solution erodes it because the vendor will eventually offer the same capability to your competitors.
If you don't fit one of these patterns, building is probably wrong.
### When to Buy
Buy when AI is operationally useful but not differentiating. Most internal AI use cases fit here.
Customer support triage and drafting. Vendors are good at this and getting better fast. Use them.
Content drafting and editing. Foundation models from generic vendors handle this well. The differentiator is prompts and review, not the underlying model.
Data analysis and reporting. AI-augmented analytics tools are mature. Build only if you have unusual data shapes or volumes that vendors can't handle.
Sales prospecting and CRM workflows. Vendor solutions exist for almost every sub-task here. Building wastes time on solved problems.
The pattern: if multiple credible vendors offer the capability, your build won't beat them on quality, and the long-term ownership cost won't justify the build effort. Buy.
### When to Use a Productized Service Like SuperDupr
The third option exists because there's a common middle case the build/buy frame misses.
Profitable small and mid-sized businesses often face this situation: they have a specific workflow they want AI-enabled, generic vendor SaaS doesn't fit it cleanly (the SaaS is built for a different workflow shape), and building from scratch is too expensive and slow. Neither build nor buy is right.
Productized AI ops services like SuperDupr — that's the productized AI ops business I run — exist for exactly this case. The shape: a fixed-scope, fixed-price engagement (typically 4-12 weeks, $15,000-$60,000) where someone scopes, builds, and integrates a custom-fit AI workflow for you, then hands it back operating. You get the custom fit of a build without the team commitment. You get the speed and predictability of a buy without the vendor mismatch.
It's not the right answer for every situation. Productized AI ops doesn't fit when:
The workflow is novel. If your project is genuinely unique — something nobody has built before — productized services won't have the patterns. You need an agency or in-house build.
The capability is core to your moat. Same logic as the build case — outsourcing your differentiation, even via productized service, is risky long-term.
You don't yet know what you want. Productized services ship defined scope. If your scope is "we need AI somewhere," you need a consultant first to figure out where, then a productized service for the implementation.
For almost everything else — internal workflow automation, lead qualification, customer support augmentation, content generation, ops data flows — productized is the lowest-risk, fastest-to-value option.
### The Cost Math
Real numbers across all three options for a typical "AI-enable one workflow" project:
Build (in-house): $100K-$500K initial development, 6-12 months to production, $200K-$600K/year ongoing for the team to operate it. 3-year all-in: $700K-$2.5M for a focused system.
Buy (vendor SaaS): $5K-$50K/year licensing, 2-8 weeks to deploy, $0-$50K integration work. 3-year all-in: $30K-$200K — much cheaper, but only fits if a vendor solution matches your workflow.
Productized AI ops: $15K-$60K fixed-price engagement, 4-12 weeks to production, optional $2K-$5K/month ongoing for operations and improvements. 3-year all-in: $90K-$240K — bridges the cost gap between vendor SaaS and custom build, with custom-fit implementation.
The build number stops most build conversations cold once it's on the table. Most operators have been considering build because they imagined the cost as $200K-$300K total. The actual 3-year cost being $1M+ reframes the decision quickly.
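For your own numbers, the 3-year comparison is a one-line formula: upfront cost plus recurring cost times three. The sketch below plugs in the ranges quoted above, assuming recurring costs run for all 36 months and that the optional productized support fee is always taken. A naive sum like this lands near, not exactly on, the all-in ranges quoted above, which may reflect rounding or costs not itemized here.

```python
def three_year_total(upfront: float, annual: float, years: int = 3) -> float:
    """Upfront cost plus recurring annual cost over the horizon."""
    return upfront + annual * years


# (upfront, annual) pairs for the low and high end of each quoted range.
options = {
    # Initial development, then the team that operates the system.
    "build": ((100_000, 200_000), (500_000, 600_000)),
    # Integration work, then yearly licensing.
    "buy": ((0, 5_000), (50_000, 50_000)),
    # Fixed-price engagement, then optional $2K-$5K/month support.
    "productized": ((15_000, 2_000 * 12), (60_000, 5_000 * 12)),
}

totals = {
    name: (three_year_total(*low), three_year_total(*high))
    for name, (low, high) in options.items()
}

for name, (low, high) in totals.items():
    print(f"{name}: ${low:,.0f} - ${high:,.0f}")
# build: $700,000 - $2,300,000
# buy: $15,000 - $200,000
# productized: $87,000 - $240,000
```

Swapping in your own upfront and recurring estimates shows the same pattern quickly: the recurring team cost, not the initial build, dominates the build total.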
### Vendor Lock-In: Real Risk vs. Imagined Risk
Lock-in is the most-cited reason to build instead of buy, and it's usually overstated.
Real lock-in: the vendor holds your data in a proprietary format you can't export. The vendor's pricing has step-functioned 5-10x because they have you cornered. The vendor has gone out of business and there's no migration path. The vendor has been acquired and the new owner is changing terms.
Imagined lock-in: "We'll be dependent on their API." Most AI APIs are interchangeable now. Foundation model providers (Anthropic, OpenAI, Google) have functionally similar capabilities and you can switch in days, not months. Your data is yours. Your prompts are portable.
The right way to handle lock-in risk: pick vendors with clean APIs, exportable data, and minimal proprietary moat. Pay a small premium for vendors who don't lock you in. Avoid vendors with closed data formats or aggressive contractual lock-in. The premium is much cheaper than the alternative cost of building to avoid lock-in.
### The Final Decision Rule
Compress everything above into one sentence: buy unless you must build, and consider productized AI ops as a third option whenever the buy/build dichotomy doesn't quite fit.
If you want help running the framework against a specific decision you're facing, book a strategy call. I'll walk through the four questions with your specific situation and tell you honestly which path fits — including telling you when SuperDupr isn't the right answer. The point of the conversation is the right decision, not the sale.
For the related question of how to choose between an AI consultant, an AI agency, and a productized AI ops service — once you've decided which way the build/buy decision lands — see AI Consultant vs Agency vs Productized AI Ops. For the methodology behind how productized AI ops engagements get scoped, see AI Operations: How I Scope AI Projects That Actually Ship. For mid-market operators ($1M–$50M) who want a 2-week written roadmap instead of an open-ended engagement, see AI for Business Owners.
### Frequently Asked Questions
**Q: Should I build my own AI tool or buy a vendor's?**
A: Default to buy unless you have a specific reason to build. Building costs more upfront, takes longer, and creates ongoing maintenance burden. Buying gets you to value faster and lets you switch later if the vendor disappoints. The case for building is narrower than most operators think: building makes sense when the AI capability is core to your competitive advantage, when you have unique data that vendors can't access, or when buying costs more long-term than building (rare in AI today).
**Q: What's the difference between build, buy, and use a productized service?**
A: Build means your team writes the code and operates the system. Buy means you license a vendor's product (a SaaS tool, API, or platform) and your team uses it. A productized service is a third option: someone else builds and operates a custom-fit AI workflow for you on a fixed-price engagement. The productized option exists because most businesses don't actually want to build (too expensive) or buy generic SaaS (doesn't fit their workflow) — they want a custom-fit implementation without committing to a build team.
**Q: When is building AI in-house the right call?**
A: Three conditions, all of which usually need to be true. First: the AI capability is genuinely core to your competitive advantage — not table stakes. Second: you have unique data that gives your AI a meaningful quality advantage over generic vendor solutions. Third: you have or are willing to build the team to operate it long-term (engineers, ML practitioners, product managers). Without all three, building is almost always more expensive than the alternatives.
**Q: What's the real cost of building AI in-house?**
A: More than the spreadsheet says. Initial build: $100K-$500K depending on scope. Ongoing engineering and ops: $200K-$600K/year for a small team. Foundation model costs: $5K-$50K/month at any meaningful scale. Hidden costs: managing model drift, evaluation infrastructure, monitoring, security review, integration maintenance. The realistic 3-year all-in cost for a custom AI build is $1M-$3M for a focused single-use-case system. That number stops most build conversations cold once it's on the table.
**Q: When does the build case become real with AI?**
A: When the workflow you're AI-enabling produces direct revenue at scale, when off-the-shelf vendors can't match the accuracy your business needs, and when the cost of a vendor going down or changing terms is existential. Examples: AI inside a software product you sell to customers, AI doing core risk underwriting at a financial firm, AI generating output you sell directly. Examples that usually don't pass: AI doing internal customer support triage, AI helping with drafting emails, AI analyzing internal data for ops dashboards. The latter examples are buy-or-productize cases.
**Q: How does this decision change if a vendor goes out of business?**
A: Build the dependency risk into the buy decision. Vendor lock-in is real but usually overstated. Most AI capabilities can be replaced within 30-90 days if the vendor disappears, and the data is yours. The exceptions are vendors with proprietary models you can't replicate (rare today as foundation models converge in capability) and vendors deeply integrated into your workflow with no clean export path. Choose vendors with clean APIs, exportable data, and minimal proprietary moat. The lock-in cost is usually lower than the build cost.
---
## AI Consultant vs AI Agency vs Productized AI Ops: How to Choose (2026)
- **URL:** https://justinmckelvey.com/blog/ai-consultant-vs-agency-vs-productized-ai-ops
- **Published:** April 27, 2026
- **Updated:** April 27, 2026
- **Category:** AI for Business
- **Reading time:** 8 min
- **Description:** AI consultant, agency, or productized AI ops — which one fits your business? Honest comparison of costs, timelines, and risks from an AI ops founder who runs the third option.
### TL;DR: Three Different Products, Same Buyer
AI consultant, AI agency, and productized AI ops are three different products being sold to the same overwhelmed buyer. They cost different amounts, ship in different timeframes, and fail in different ways. The buyer — usually a profitable business owner or operator who's been told they need to "do AI" — typically picks based on whoever showed up first, which is the wrong filter. As of 2026, the honest decision tree is: AI consultant for strategy work when you don't yet know what to build; AI agency for execution-heavy projects with defined scope; productized AI ops for the common middle case where you need both scoping and implementation in one fixed-price engagement. I run one of those three (the third) and advise on the other two as a fractional CTO. Below is the actual comparison.
What an AI Consultant Does (And What They Don't)If you're searching for AI consulting services or trying to figure out how to hire an AI consultant, the first question to settle is what you're actually buying. AI consulting services and AI implementation services are different things sold by different kinds of firms — and operators often confuse the two when shopping.
An AI consultant's job is to help you decide what to build, why, and in what order. The deliverable is judgment, not code. A good consulting engagement produces a roadmap, a prioritized list of use cases, an honest cost-benefit assessment, and often a written recommendation about whether to proceed at all.
What a good AI consultant looks like in practice: 20-40 hours over 4-8 weeks, mostly in interviews with your team and analysis of your current workflows. Output is a written deliverable — usually 15-30 pages — with specific recommendations. Implementation is not included; that's a separate engagement, often with a different vendor.
What an AI consultant typically does NOT do: build production systems, integrate with your tools, train your team on operating the AI, or stick around when something breaks 60 days after the strategy doc is delivered. Consulting and implementation are different jobs and the people who do them well are usually different people.
Cost range: Independent AI consultants run $200-$500/hour or $5,000-$25,000 per engagement. Boutique firms run $300-$1,000/hour or $25,000-$100,000+. Big-firm AI consulting starts at $250,000 and climbs from there.
The right time to hire an AI consultant: when you genuinely don't know what to build. You've heard you need AI, you have budget, and you want a defensible decision about which workflow to attack first. A consultant earns their fee by saving you from picking the wrong project — which is the most expensive form of mistake at this stage.
The wrong time: when you already know what you want to build and you just need someone to build it. Hiring a consultant for execution is paying strategy rates for tactical work.
### What an AI Agency Does (And the Trap of Paying for Retainers)
An AI agency builds and ships AI projects on a per-project or retainer basis. They have engineers, designers, and project managers; they take a defined scope and ship working software. They sit between consultants (strategy only) and productized services (fixed scope): flexible enough to take on novel projects, expensive enough that you feel the bill.
What a good AI agency engagement looks like: a discovery phase (1-2 weeks), a build phase (4-12 weeks depending on scope), and a handoff. The team is typically 3-6 people; you'll deal with a project manager day-to-day and engineers/designers as needed. You get production-grade software when you're done.
What an AI agency does NOT do well: open-ended exploration. If your scope is "we need AI in our business," an agency will either refuse the engagement (good agencies) or accept it and turn it into an indefinite retainer (bad agencies). Agencies need defined targets to ship; without them, you're paying for time without buying outcomes.
Cost range: AI agencies run $5,000-$50,000+/month on retainer or $25,000-$150,000 per defined project.
The retainer trap: the most common AI agency failure mode. The retainer starts because there's "lots to do." Six months later, the team is still busy but nothing has shipped that meaningfully changed your business. The retainer continues because canceling feels like quitting — but the retainer is the problem. If your agency engagement isn't shipping discrete outcomes every 4-6 weeks, the structure is wrong.
The right time to hire an AI agency: when you have a defined project, the budget to ship it, and the project is novel or specialized enough that productized services don't fit. Custom internal tools, complex multi-system integrations, and AI products you're building for your own customers are all good fits.
The wrong time: when your project could be productized, when your scope is unclear, or when you don't have an internal owner for the work the agency produces.
### What Productized AI Ops Looks Like (And Why I Built SuperDupr Around This Model)
Productized AI ops is fixed-scope, fixed-price AI implementation. You buy a defined deliverable. You know what you're getting, when, and for how much, before you sign anything. The trade-off is scope rigidity: you can't add "one more thing" mid-project without re-scoping. The upside is predictability and lower cost than consulting plus agency separately.
I built SuperDupr around this model because the consulting + agency stack didn't fit the businesses I was working with. Profitable small and mid-sized businesses ($2M-$50M revenue) usually have a clear pain point, can describe the workflow they want changed, and want a real implementation — not a strategy deck and not an open-ended retainer. They want someone to scope it, build it, integrate it, and hand it back operating.
What a productized AI ops engagement looks like: a 1-2 week discovery phase included in the fixed price, a 3-8 week build and integration phase, and a 30-day stabilization phase where the system is operating in production. Total engagement: 4-12 weeks, $15,000-$60,000, depending on scope. You walk in with a problem; you walk out with a working AI workflow your team is using.
Cost benchmarks across the three options for a typical "add AI to one workflow" project:
- **Consultant + agency stack:** $20,000-$50,000 (consultant strategy) + $50,000-$120,000 (agency build) = $70,000-$170,000 total, 3-6 months.
- **Agency alone (skipping the strategy phase):** $50,000-$120,000, 2-4 months. Risky because the strategy phase isn't formally done.
- **Productized AI ops:** $15,000-$60,000, 4-12 weeks, both scoping and implementation included.
The math favors productized for most cases. The exceptions are when you genuinely need open-ended strategy work (consultant) or when your project is novel enough that no productized service exists for it (agency). Both are real cases — just less common than the "we want AI in our customer service workflow" or "we want AI handling lead qualification" engagements that productized services were built for.
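To make the arithmetic behind that comparison explicit, here is a minimal Python sketch. The `CostRange` helper is a hypothetical name introduced only for this illustration; the dollar figures are copied from the benchmarks above, and the ratios are simple divisions of the range endpoints, not a forecast for any specific project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostRange:
    """A low/high dollar range for one option (illustrative helper)."""
    low: int
    high: int

    def __add__(self, other: "CostRange") -> "CostRange":
        # Stacking two engagements adds the low ends and the high ends.
        return CostRange(self.low + other.low, self.high + other.high)


# Figures copied from the benchmarks in this section.
consultant_strategy = CostRange(20_000, 50_000)
agency_build = CostRange(50_000, 120_000)
productized = CostRange(15_000, 60_000)

stack = consultant_strategy + agency_build
print(f"Consultant + agency stack: ${stack.low:,}-${stack.high:,}")
print(f"Productized AI ops:        ${productized.low:,}-${productized.high:,}")
print(f"Stack vs productized, low ends:  {stack.low / productized.low:.1f}x")
print(f"Stack vs productized, high ends: {stack.high / productized.high:.1f}x")
```

On these numbers the stacked route costs roughly 2.8x to 4.7x the productized route, which is the gap the paragraph above is pointing at.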
### How Much Each Option Actually Costs in 2026 (AI Consultant Cost Ranges)
If you came here searching for "AI consultant cost" or "AI consultant near me," here are honest numbers across the market. AI consultant cost varies by 10-50x depending on firm size. The cheapest option is rarely the worst, and the most expensive option is rarely the best.
Honest cost ranges, current as of mid-2026:
- **AI consultant, independent:** $200-$500/hour. $5,000-$25,000 per scoped engagement. Best fit: solo projects, focused strategy questions, founder-led businesses.
- **AI consultant, boutique firm:** $300-$1,000/hour. $25,000-$100,000+ per engagement. Best fit: companies needing senior judgment plus a small team. Often pairs well with internal execution.
- **AI consultant, big firm (McKinsey, BCG, Deloitte, Accenture):** $250,000 minimum, often $1M+ for substantive engagements. Best fit: enterprises with board-level AI strategy questions and the budget to absorb consulting overhead. Almost always wrong for businesses under $100M revenue.
- **AI agency, small (5-15 people):** $5,000-$15,000/month retainer or $25,000-$75,000 per project. Best fit: businesses with one defined AI project that's too custom for productized services.
- **AI agency, mid-size (15-50 people):** $15,000-$50,000/month retainer or $75,000-$200,000 per project. Best fit: companies with multiple AI projects in flight or specialized requirements (regulated industries, specific compliance needs).
- **Productized AI ops, focused project:** $15,000-$30,000 fixed price for 4-6 weeks. Best fit: a single defined workflow to AI-enable.
- **Productized AI ops, multi-stakeholder rollout:** $30,000-$75,000 fixed price for 8-12 weeks. Best fit: a workflow that touches multiple teams or systems.
The "$5K AI project" trap: if a vendor quotes you under $5,000 for an AI implementation, they're not scoping it — they're selling you an API call wrapper. Real implementation work doesn't fit in that price range. The model API calls themselves are nearly free; the work that produces business value is the scoping, integration, and operation around the model.
### How to Hire an AI Consultant: Filters That Actually Matter
If you're going to hire an AI consultant (or an agency, or a productized AI ops shop), three filters distinguish operators from sellers, regardless of which option you go with.
1. Ask them to describe the last AI project they shipped — in operational detail. Not the case study. The actual project: who the customer was, what was scoped, what went wrong, how they fixed it, what the team is doing with it now. People who ship can describe operational reality. People who sell can only describe outcomes.
2. Ask what they would NOT recommend AI for in your business. Anyone who says "AI can help everywhere" is selling, not advising. Good operators have a list of cases where AI is the wrong tool — usually because the data isn't ready, the workflow isn't tight enough, or the cost of error is too high. Listen for the no.
3. Ask for references from the past 12 months. Good vendors have happy recent clients. Bad vendors have testimonials from 2022 and a glossy deck. Call the references; ask what surprised them about the engagement (good or bad). Surprises are the most informative signal.
### The Honest Decision Tree
If I had to compress this into a single decision, here's how I'd frame it.
- **You don't yet know what to build:** hire a consultant. The expensive thing isn't the consultant; it's spending $100K building the wrong project.
- **You know what to build, the project is well-defined, and your scope is novel or specialized:** hire an agency. Pay for the engineering capacity and project management.
- **You know what to build, the project is well-defined, and the scope is similar to projects others have run successfully:** use a productized AI ops service. Lower cost, faster timeline, less risk than the alternatives.
- **You're not sure which category you're in:** book a free 30-minute strategy call with someone who runs one of these and ask them to honestly slot you. I'll tell you which option fits your situation, including telling you if SuperDupr isn't the right fit. The point of the call is to make the right choice, not to sell you my service.
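The same decision tree can be written as a short function. This is a sketch under stated assumptions: the function name and parameter names are hypothetical, `None` stands for "not sure," and routing an undefined scope to the strategy call is my reading of the tree, since the branches above only cover well-defined projects.

```python
from typing import Optional


def recommend_option(
    knows_what_to_build: Optional[bool],
    scope_well_defined: Optional[bool] = None,
    scope_is_novel: Optional[bool] = None,
) -> str:
    """Map the three questions in the decision tree to a recommendation."""
    # Not knowing what to build is a strategy problem, so start with a consultant.
    if knows_what_to_build is False:
        return "consultant"

    # Any unanswered question ("not sure") goes to a call. So does an
    # undefined scope, which is an assumption not spelled out in the tree.
    answers = (knows_what_to_build, scope_well_defined, scope_is_novel)
    if None in answers or not scope_well_defined:
        return "strategy call"

    # Defined project: novel or specialized work fits an agency,
    # familiar work fits a productized service.
    return "agency" if scope_is_novel else "productized AI ops"


print(recommend_option(False))
print(recommend_option(True, True, True))
print(recommend_option(True, True, False))
print(recommend_option(None))
```

The ordering matters: the "do you know what to build" check runs first, so a buyer with no defined project is never routed to execution.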
For the related question of whether to build AI yourself or buy it from a vendor, see Build vs Buy AI: A Decision Framework. For the methodology behind how productized AI ops engagements actually get scoped, see AI Operations: How I Scope AI Projects That Actually Ship. If you're a $1M–$50M operator and want the AI consultant version (advisory + roadmap, not execution), see AI for Business Owners and the productized AI Readiness Assessment.
### Frequently Asked Questions
**Q: Should I hire an AI consultant or an AI agency?**
A: Neither, in many cases. AI consultants are best for strategy work — what to build, why, and in what order. AI agencies are best for execution-heavy projects with defined scope. But for most profitable small and mid-sized businesses, neither is the right shape — productized AI ops services give you both scoping and implementation in one engagement at a fraction of the cost. The honest answer depends on whether you have a defined project (agency), don't yet know what to build (consultant), or want both packaged (productized AI ops).
**Q: How much does an AI consultant cost?**
A: Independent AI consultants typically charge $200-$500 per hour or $5,000-$25,000 per scoped engagement. Boutique AI consulting firms charge $300-$1,000 per hour or $25,000-$100,000+ per project. Big-firm AI consulting (McKinsey, BCG, Deloitte) starts around $250,000 and rapidly climbs from there. As of 2026, AI consultant rates have gone up roughly 40% over the past year as demand has outstripped supply. Cost per hour is a misleading metric — what matters is cost per shipped outcome.
**Q: How much does an AI agency cost?**
A: AI agencies typically work on monthly retainers ranging from $5,000 to $50,000+ depending on scope and team size. Per-project pricing runs $25,000 to $150,000 for defined builds. The hidden cost with agencies is the retainer trap — work continues at retainer pace whether or not it's shipping value. The right time to hire an agency is when you have a defined project with clear deliverables, not when you have an undefined 'we need AI in our business' problem.
**Q: What is productized AI ops?**
A: Productized AI ops is a fixed-scope, fixed-price AI implementation service — you buy a defined deliverable rather than time and materials. A typical engagement runs $15,000-$60,000 for a 4-12 week implementation including discovery, build, integration, and 30-day stabilization. The advantage is predictability — you know what you're getting, when, and for how much, before you sign anything. The trade-off is scope rigidity — you can't add 'one more thing' mid-project without re-scoping. SuperDupr is the productized AI ops business I run; this is the model we operate.
**Q: How do I hire an AI consultant?**
A: Three filters that matter. First: ask them to describe the last AI project they shipped, in operational detail (not the case-study version). If they can't, they don't ship. Second: ask what they would NOT recommend AI for in your business — anyone who says 'AI can help everywhere' is selling, not consulting. Third: ask for references you can call from the past 12 months. Good AI consultants have happy recent clients; bad ones have testimonials from 2022 and a glossy deck.
**Q: When should I use a productized service instead of a consultant?**
A: When you've already done the strategic thinking and know what you want to build. Productized services skip the strategy phase and go straight to implementation. If you can describe the workflow you want changed, the integration points, and the success metric, you're ready for a productized engagement. If you can't articulate any of those, you need a consultant first to figure them out — then come back to productized for the implementation.
**Q: Can I get an AI consultant near me?**
A: Yes, but local doesn't matter the way it used to. AI work is largely remote-friendly — most of the conversations are video calls, most of the work is in your data and tools, not in your office. The 'near me' search is usually a proxy for 'someone I can trust' rather than 'someone in my zip code.' Good AI consultants and productized AI ops shops work across geographies. Filter for credibility and operating experience, not location.
---
## AI Operations: How I Scope AI Projects That Actually Ship
- **URL:** https://justinmckelvey.com/blog/ai-operations-scoping-ai-projects
- **Published:** April 27, 2026
- **Updated:** April 27, 2026
- **Category:** AI for Business
- **Reading time:** 10 min
- **Description:** AI operations scoping framework that ships. 3 phases, real cost ranges, and how to spot the AI pilot trap. Written by an AI ops founder who scopes these for a living.
### TL;DR: AI Operations in Three Phases
Most AI projects fail before they start because they were scoped to impress, not to ship. AI operations is the discipline of selecting, scoping, integrating, and running AI inside a business, and it's where almost all the value (or wasted budget) lives. The model isn't the hard part anymore. The hard part is getting the AI output to actually be used by your team in a workflow that produces a measurable result. The 3-phase AI operations scoping framework: Discovery → Build & Integrate → Operate. Most successful AI projects ship in 4-12 weeks across these three phases. Most failed AI projects skipped Phase 1 and tried to start at Phase 2.
### What "AI Operations" Actually Means (And Why It's Different from AIOps)
The term gets confused regularly, so let's separate the two cleanly.
AIOps (one word) is the established enterprise IT category. Splunk, Dynatrace, Datadog, and similar players use "AIOps" to mean using AI to monitor and manage IT infrastructure — alerting, anomaly detection, log analysis. If you search for AIOps and you're not in IT operations, you're probably in the wrong category.
AI operations (two words) is the broader business discipline. It's how any company — accounting firms, agencies, ecommerce shops, professional services — selects AI use cases, scopes projects, integrates the output into real workflows, and runs the resulting systems over time. It's the management layer around AI, not the AI itself.
This post is about the second one. If you're an owner or operator at a profitable business trying to figure out how to actually use AI in your operations — not how to monitor your servers — keep reading.
### Why Most AI Projects Don't Ship
I run an AI ops shop called SuperDupr where this is the work. Across the engagements I've scoped, the same failure patterns repeat. They're not technical failures. They're scoping failures.
**The demo trap.** An owner sees an AI demo. The demo summarizes a contract in 30 seconds with one clean input and one perfect output. They back into a project that recreates the demo. Six months later, the project hasn't shipped because the real workflow involves 12 contract types, 4 systems of record, 3 stakeholders who need to review, and edge cases the demo never showed. The scope didn't account for any of that. It accounted for the demo.
**The pilot-as-stall.** Stakeholders aren't convinced. Someone proposes "let's run a pilot." The pilot is scoped to prove AI works. It does prove that. Then everyone discusses the pilot for four months while no real implementation lands. The pilot becomes the project. AI operations gets confused with AI testing.
**Scoping from the tool, not from the problem.** Someone reads about a new AI tool and decides "we should use this." A project is scoped around the tool. Halfway through, the team realizes the tool doesn't fit the problem. The project either fails or pivots to a different tool. Scoping from the tool is the most common mistake at companies under 100 people.
The fix to all three is the same: a real scoping phase before anything gets built. That's what Phase 1 is for.
### The 3-Phase AI Project Plan I Use (Scoping AI Project Engagements That Ship)
Scoping an AI project well is the single highest-leverage activity in the whole engagement. Get the scoping right and the rest of the engagement runs cleanly. Get it wrong and no amount of model quality or engineering rescues it. The 3-phase AI project plan below is the structure I use on every engagement.
Phase 1: Discovery (Weeks 1-2). Phase 2: Build & Integrate (Weeks 3-8). Phase 3: Operate (ongoing, with a 30-day stabilization). The phases aren't optional and they aren't in a different order. Discovery before building. Building before operating. The shape doesn't change; what changes is how big each phase is for the specific project.
### Phase 1: Discovery (Weeks 1-2)
The job in Phase 1 is to define the workflow we're trying to change, the people whose work changes, the data that already exists, and what "success" looks like in measurable terms. No AI gets built. No tools get selected. The output is a one-page scoping doc that anyone in the business can read.
Specifically, Phase 1 produces:
- **The workflow being changed.** Drawn out as a sequence: what happens today, who does it, where the data comes from, what the output looks like. If we can't draw the current state in under a page, the project isn't scoped tight enough yet. Keep narrowing.
- **The success metric.** One number. Time per task, error rate, throughput, customer turnaround, revenue per hour, whichever applies. Without one specific number, the project has no way to declare success and no way to declare failure. Vague projects don't ship.
- **The data we have.** Where it lives, what shape it's in, how clean it is, who owns access to it. Most AI projects break in Phase 2 because Phase 1 didn't honestly assess data. "We have customer data" isn't an assessment; "we have 12,000 customer records spread across HubSpot and a Google Sheet, with 30% missing email and inconsistent name formatting" is.
- **The integration surface.** Which tools the AI output has to flow into and out of. CRM, accounting system, ticketing tool, email, whatever's relevant. Each integration is a project unto itself; counting them honestly in Phase 1 prevents Phase 2 surprises.
- **The risk of being wrong.** If the AI output is wrong 10% of the time, what happens? For some workflows (drafting first-pass email replies for human review), 10% wrong is fine. For others (auto-categorizing transactions for tax purposes), 10% wrong is catastrophic. The tolerance for error is the most-skipped scoping question.
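One way to keep those five outputs honest is to treat the scoping doc as a small structured record and check it for gaps before Phase 2 starts. The sketch below is illustrative only: `ScopingDoc`, its field names, and the sample values are assumptions made up for this example, not a template from an actual engagement.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScopingDoc:
    """One-page Phase 1 output. Field names are illustrative, not a standard."""
    workflow_steps: List[str]        # current state, drawn as a sequence
    success_metric: str              # the one number the project is judged on
    data_sources: List[str]          # where the data lives
    integrations: List[str]          # tools the AI output flows into or out of
    max_error_rate: Optional[float]  # None means nobody has decided yet

    def gaps(self) -> List[str]:
        """List the scoping questions that are still unanswered."""
        problems = []
        if not self.workflow_steps:
            problems.append("workflow not drawn out")
        if not self.success_metric.strip():
            problems.append("no single success metric")
        if not self.data_sources:
            problems.append("data not assessed")
        if not self.integrations:
            problems.append("integration surface not counted")
        if self.max_error_rate is None:
            problems.append("error tolerance not decided")
        return problems


# Hypothetical example: everything is filled in except the error tolerance.
doc = ScopingDoc(
    workflow_steps=["Email arrives", "Rep drafts reply", "Rep sends reply"],
    success_metric="minutes per reply",
    data_sources=["HubSpot", "Google Sheet"],
    integrations=["email"],
    max_error_rate=None,
)
print(doc.gaps())  # ['error tolerance not decided']
```

An empty list from `gaps()` does not prove the scope is good; it only confirms that none of the five questions was skipped.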
### Phase 2: Build & Integrate (Weeks 3-8)
This phase varies most across projects. A focused use case with clean data and one integration is 3-4 weeks. A multi-stakeholder rollout with messy data and 3-5 integrations is 6-8 weeks. The shape, though, is consistent.
Pick the model and the orchestration layer. Most projects don't need a custom model — a foundation model (Claude, GPT, Gemini) plus a thin orchestration layer (a small Rails or Node app, or a workflow tool) is sufficient. Reserve fine-tuning and custom models for cases where you've already proven the workflow works without them.
Build the integration plumbing. Webhooks in, webhooks out. Auth. Error handling. Logging. The unglamorous engineering work that makes the AI output flow from where it's generated to where it gets used. This is where 60-70% of the actual build time goes — not on the AI.
Run real data through it. Not synthetic test data. Real records from the business, with all their messiness. The first end-to-end run is almost always ugly. Phase 2 is partially about making the ugly parts less ugly through iteration on prompts, integration code, and edge-case handling.
Build the human review/feedback loop. AI outputs need human eyes for the first 30-90 days at minimum, and often forever. Build the review interface as part of the project, not as an afterthought. The team using the AI needs to be able to flag bad outputs and have those flags actually train the next iteration.
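As a rough illustration of the "integration plumbing" described above, here is a minimal Python sketch of one inbound event being validated, passed to a model, and delivered downstream with bounded retries and logging. Everything named here is hypothetical: `process_event`, the `ticket_id` and `body` payload fields, and the `draft_reply` and `deliver` callables stand in for whatever model call and destination tool a real project uses.

```python
import logging
from typing import Callable, Dict

logger = logging.getLogger("ai_workflow")


def process_event(
    payload: Dict[str, str],
    draft_reply: Callable[[str], str],
    deliver: Callable[[str, str], None],
    max_attempts: int = 3,
) -> bool:
    """Handle one inbound event; return True only if the draft was delivered."""
    # Validate the inbound payload before spending a model call on it.
    if "ticket_id" not in payload or "body" not in payload:
        logger.warning("Rejected malformed payload: %r", payload)
        return False

    # draft_reply is a stand-in for the model call (assumed, not a real API).
    draft = draft_reply(payload["body"])

    # Retry delivery a bounded number of times, logging every failure.
    for attempt in range(1, max_attempts + 1):
        try:
            deliver(payload["ticket_id"], draft)
            logger.info("Delivered draft for ticket %s", payload["ticket_id"])
            return True
        except ConnectionError as exc:
            logger.error("Delivery attempt %d failed: %s", attempt, exc)
    return False
```

Note how little of this is about the model: one line calls it, and the rest is validation, error handling, and logging, which matches where the build time goes.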
### Phase 3: Operate (Ongoing, with 30-Day Stabilization)
This is the phase most projects skip, and it's where AI operations actually lives. The first 30 days after launch are stabilization. Things will break. Edge cases will surface. The team will use the system in ways nobody anticipated. The job in Phase 3 is to absorb that signal and improve the system.
What Phase 3 looks like in practice:
- **Daily monitoring for the first 30 days.** Output quality, error rates, usage rates. If the team isn't using the AI output, it doesn't matter how good the model is. Adoption is a metric.
- **Weekly review of flagged outputs.** Take whatever the team flagged as wrong, find the patterns, and update prompts or rules accordingly. This is the loop that turns a 70%-accurate AI workflow into a 90%-accurate one over 8-12 weeks of operation.
- **Quarterly scope reviews.** What the AI does well, what it doesn't, what new use cases the team has surfaced. Most successful AI deployments find adjacent use cases within 3-6 months that weren't visible at scoping. Phase 3 is when those get evaluated honestly.
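The weekly review step can be sketched in a few lines of Python. This assumes each flag is stored as a record with a free-text `reason` field, and that any output nobody flagged counts as acceptable; both are simplifying assumptions, and the function names and sample data are invented for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def top_flag_reasons(flags: List[Dict[str, str]], n: int = 3) -> List[Tuple[str, int]]:
    """Group flagged outputs by reason and return the most frequent ones."""
    return Counter(flag["reason"] for flag in flags).most_common(n)


def accuracy(total_outputs: int, flagged_outputs: int) -> float:
    """Share of outputs that were not flagged (assumes unflagged means fine)."""
    if total_outputs <= 0:
        raise ValueError("total_outputs must be positive")
    return (total_outputs - flagged_outputs) / total_outputs


# Hypothetical week of flags from the review interface.
week_flags = [
    {"output_id": "a1", "reason": "missing order number"},
    {"output_id": "a2", "reason": "wrong tone"},
    {"output_id": "a3", "reason": "missing order number"},
    {"output_id": "a4", "reason": "missing order number"},
    {"output_id": "a5", "reason": "wrong tone"},
    {"output_id": "a6", "reason": "outdated pricing"},
]

print(top_flag_reasons(week_flags))
print(f"Accuracy this week: {accuracy(20, len(week_flags)):.0%}")
```

The most frequent reason is the one to address first in the next prompt or rule update; tracking the accuracy figure week over week shows whether the loop is actually working.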
### What AI Implementation Actually Costs (AI Implementation Cost Ranges 2026)
If you searched for "ai implementation cost," here are honest numbers, not vendor brochures. AI implementation cost ranges widely depending on scope, integration depth, and how clean your data is going in. The model itself is nearly free; the implementation around it is where every dollar goes.
Here are the real cost ranges I see across engagement sizes.
Focused single-use-case AI project (one workflow, one integration, one team): $15,000-$30,000 fixed price for a 4-6 week engagement. This is the right shape for most first-time AI projects at small/mid businesses. Get one use case shipping; learn from it; expand from there.
Multi-stakeholder AI rollout (one workflow, 3-5 integrations, multiple teams): $30,000-$75,000 for an 8-12 week engagement. The cost driver isn't the AI; it's the integration count and the change management. Each stakeholder group adds friction.
Enterprise AI platform build (custom orchestration, multiple use cases, ongoing operation): $75,000-$250,000+ over 6-12 months. Reserve this for businesses that have already shipped 2-3 focused use cases successfully and have a real platform need.
If a vendor quotes you under $5K for an AI project: they're selling you an API call wrapper, not an implementation. The "AI part" of an AI project is genuinely cheap now. The work that produces business value is the scoping, integration, and operation around it — and that's never $5K.
### The Pilot-Project Trap (And How to Avoid It)
Pilots get a bad rap because most of them are stalling tactics. But there's a real version of the AI pilot project that's worth running.
A real pilot has a kill criterion. "If after 30 days the model's accuracy is below 80% on real data, we kill the project" is a real criterion. "Let's see how it goes" is not. Without a written kill criterion, the pilot will outlive its usefulness because nobody wants to be the person who "killed" it.
A real pilot has a time box. 30 days, 60 days, 90 days. The clock starts when the pilot launches and stops on a specific calendar date. After that date, you decide: ship it, kill it, or extend with explicit rationale. No silent extensions.
A real pilot has a single use case. Not "AI for our business." A specific workflow with a specific success metric. If the pilot scope drifts during the pilot, you're not running a pilot anymore — you're running a project with extra steps.
If your proposed pilot doesn't have all three (kill criterion, time box, single use case), it's a stalling tactic. Either tighten it into a real pilot or skip the pilot phase and ship a small focused implementation directly.
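The three-part test above is simple enough to encode as a checklist. The sketch below is a hypothetical illustration: `PilotPlan`, `missing_elements`, and `is_real_pilot` are names invented here, and the sample plan reuses the 80%-accuracy kill criterion from the example above with a made-up end date.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class PilotPlan:
    """A proposed pilot, reduced to the three things that make it real."""
    use_cases: List[str]
    kill_criterion: Optional[str]  # written down in advance, or None
    end_date: Optional[date]       # specific calendar date, or None


def missing_elements(plan: PilotPlan) -> List[str]:
    """Return which of the three required elements the plan lacks."""
    missing = []
    if not plan.kill_criterion:
        missing.append("kill criterion")
    if plan.end_date is None:
        missing.append("time box")
    if len(plan.use_cases) != 1:
        missing.append("single use case")
    return missing


def is_real_pilot(plan: PilotPlan) -> bool:
    return not missing_elements(plan)


vague = PilotPlan(use_cases=["AI for our business"], kill_criterion=None, end_date=None)
tight = PilotPlan(
    use_cases=["first-pass email reply drafts"],
    kill_criterion="accuracy below 80% on real data after 30 days",
    end_date=date(2026, 6, 30),  # illustrative date
)
print(missing_elements(vague))  # ['kill criterion', 'time box']
print(is_real_pilot(tight))     # True
```

The check cannot tell whether a single listed use case is specific enough ("AI for our business" still counts as one entry), so the judgment about scope stays with a human.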
### Who Should Own AI Operations Inside a Company
Three rough thresholds based on the engagements I see.
Under 50 people: the owner or COO owns AI operations as part of their broader role. They scope projects, prioritize use cases, and oversee outside support. A dedicated AI hire at this scale is premature; the work isn't 40 hours a week and the right shape is fractional or productized AI ops support.
50 to 250 people: a designated lead — usually a head of operations or chief of staff — owns AI operations part-time, often 5-10 hours a week. Outside or fractional support handles the implementation work; the internal lead handles prioritization, stakeholder coordination, and adoption.
250+ people: a dedicated role makes sense. Titles vary — Chief AI Officer, Head of AI Operations, AI Program Manager. The role manages a portfolio of AI projects across the company and increasingly owns the relationships with outside vendors and consultants.
The wrong move at any scale is hiring a "Head of AI" too early, before the work is defined. Generic Head of AI hires often turn into expensive position-builders who write strategy decks for 6 months before any AI ships.
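For reference, the headcount thresholds above reduce to a small lookup. This is only a sketch of rough rules of thumb: the function name is hypothetical, and since the text lists both "50 to 250" and "250+," treating exactly 250 people as the dedicated-role tier is an assumption made here.

```python
def ai_ops_owner(headcount: int) -> str:
    """Suggest who owns AI operations, using the rough thresholds above."""
    if headcount < 1:
        raise ValueError("headcount must be at least 1")
    if headcount < 50:
        return "owner or COO, as part of their broader role"
    if headcount < 250:
        # Assumption: the 50-to-250 band stops just short of 250.
        return "designated part-time lead (head of operations or chief of staff)"
    return "dedicated role (for example, Head of AI Operations)"


for size in (12, 120, 400):
    print(size, "->", ai_ops_owner(size))
```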
### How This Connects to My Work
The 3-phase scoping framework is what we run at SuperDupr, the productized AI ops business I founded. SuperDupr handles the implementation: Phase 2 build + integration, Phase 3 operate + stabilize. The Phase 1 discovery work is included in every engagement because it's the part that makes the rest work.
If you're trying to figure out whether AI operations is the right fit for your business right now, that's a different conversation: strategic advisory rather than productized implementation. Book a strategy call and I'll help you decide whether you're ready to scope a project, whether a pilot is appropriate, or whether you should hold off until your data and workflows are in better shape. If you've already scoped a project and want a productized implementation, that's what SuperDupr is built for.
For the broader question of how to choose between an AI consultant, an AI agency, and a productized AI ops service, see AI Consultant vs Agency vs Productized AI Ops. For the related decision of whether to build AI yourself or buy it, see Build vs Buy AI: A Decision Framework. If your business is in the $1M–$50M range and you want a strategic engagement before scoping execution, see AI for Business Owners.
### Frequently Asked Questions
**Q: What is AI operations?**
A: AI operations means the discipline of running, scoping, and maintaining AI systems inside a business — selecting the right use cases, integrating them with existing tools, monitoring their output, and improving them over time. It's the management layer around AI implementation, not the AI itself. It's distinct from AIOps (one word), which is an enterprise IT category for using AI to monitor infrastructure. AI operations applies to any business deploying AI; AIOps is a narrower IT discipline.
**Q: How is AI operations different from AIOps?**
A: AIOps (one word) is the established enterprise IT category — companies like Splunk, Dynatrace, and Datadog use 'AIOps' to mean using AI to monitor and manage IT infrastructure. AI operations (two words) is the broader business discipline: how a non-IT business runs AI projects from scoping through deployment to ongoing improvement. If you're hiring someone for AIOps, you want infrastructure monitoring expertise. If you're hiring for AI operations, you want someone who scopes business problems and ships AI solutions to them.
**Q: How long does an AI project take to ship?**
A: A well-scoped AI project ships in 4 to 12 weeks. The variable is scope and integration depth. A focused use case with clean data and one integration ships in 4-6 weeks. A multi-stakeholder rollout with messy data and 3-5 integrations takes 8-12 weeks. Anything scoped to take more than 12 weeks at this stage of AI tooling maturity is almost always over-scoped — break it into phases that ship value every 4-6 weeks.
**Q: How much does an AI implementation cost?**
A: AI implementation costs range from about $15,000 for a focused single-use-case engagement to $250,000+ for enterprise platform builds. The honest mid-range for a profitable small or mid-sized business is $15,000 to $60,000 per scoped project. The number is less about model costs (which keep dropping) and more about discovery, integration, change management, and the human work of getting AI output to actually be used by the team. If a vendor is quoting you under $5K for an AI project, they're not scoping it; they're selling you a model API call.
**Q: Should I run an AI pilot first?**
A: Sometimes. The right reason to run a pilot: you have a specific question about the use case (will the model output be accurate enough? will users actually adopt it?) that can only be answered by testing. The wrong reason: stakeholders aren't convinced and need to be convinced before approving a real project. Pilots aimed at convincing skeptics usually become permanent — they ship 'proof' that gets discussed for months while no real implementation ever lands. If you can't define what would make the pilot succeed (or fail) in advance, you're not running a pilot, you're stalling.
**Q: Who should own AI operations inside a company?**
A: It depends on company size. Under 50 people: the owner or COO owns AI operations as part of their role; outside support handles scoping and implementation. 50 to 250 people: a designated lead — often a head of operations or a chief of staff — owns it part-time, with outside or fractional AI operations support. 250+ people: a dedicated role makes sense, often titled Chief AI Officer or Head of AI Operations. Below that scale, hiring a full-time AI ops lead is premature; the work isn't yet 40 hours a week.
**Q: What's the most common AI scoping mistake?**
A: Scoping for the demo, not for the workflow. Founders and ops leaders see an AI demo and back into a project that recreates the demo. The demo runs in 30 seconds with clean inputs and one happy-path test case. Real workflows have messy data, multiple stakeholders, edge cases, and integration with tools that don't have clean APIs. A project scoped from a demo systematically underestimates integration and change-management work — usually by 3-5x. Scope from the workflow you want to change, not from the demo that excited you.
---
## The Delegation Decision: When to Hire Your First Employee as a Founder
- **URL:** https://justinmckelvey.com/blog/when-to-hire-your-first-employee-founder-delegation
- **Published:** April 27, 2026
- **Updated:** April 27, 2026
- **Category:** Business Growth
- **Reading time:** 8 min
- **Description:** When to hire your first employee as a founder. A 3-question delegation test plus the fractional vs. full-time decision framework. Most founders hire too late and too generic.
### TL;DR: For Founders Drowning in Their Own Business
If you're working 60+ hour weeks and the business is still bottlenecked on you doing everything, you don't need more grit; you need to delegate something specific. Most founders hire too late, then hire too generic. The Delegation Decision is a 3-question test for figuring out the one thing to take off your plate first, plus the framework for whether the right shape of help is fractional, contractor, or a full-time first hire. As of 2026 the answer is almost always fractional senior support or a contractor before it's a full-time hire, not because of cost, but because the cost of a wrong full-time hire at this stage is brutal in time and money.
### The Four Costs of Doing Everything Yourself
Founders default to doing everything because it's cheap and they're accountable. Both true. Both also incomplete. Doing everything yourself has four costs that are easy to ignore until they compound.
Cost 1: The work that doesn't get done. Founders measure their day by what they did, not what they didn't. The deals you didn't follow up on, the content you didn't ship, the customer onboarding you skipped — those don't show up on a to-do list because they never made it on. They show up later as flat revenue.
Cost 2: The work that gets done badly. When you're doing everything, you're doing most of it at 60%. The sales emails are okay. The blog posts are fine. The product decisions are decent. Aggregate 60% across 10 functions and you have a mediocre business. One specialist running one function at 90% beats a founder running it at 60% almost every time.
Cost 3: The leverage you can't see. The biggest cost is the upside you're missing. If your sales motion would close 2x more deals with consistent follow-up and you're doing it inconsistently, you're paying for that gap in lost revenue every month. The cost of fixing it (a part-time SDR, an automation, a fractional sales lead) is usually a fraction of the lost revenue. You're paying not to delegate.
Cost 4: The trap of being needed. The longer you do everything, the more your business depends on you doing everything. Customers learn that you handle their issues. Vendors learn that you sign their invoices. Investors learn that you write the updates. Eventually you can't take a week off without things breaking. That's not a successful founder; that's an expensive job.
### When to Hire Your First Employee (And the Delegation Framework I Use Before You Get There)
Most founders search "when to hire your first employee" months after they should have started delegating. The hire is the wrong frame. The right frame is delegation: which lever do you take off your plate first, and what's the cheapest shape of help that takes it. The delegation framework below (three questions) runs ahead of any hiring decision. Founder delegation is the discipline; the first hire is the eventual artifact.
### The 3-Question Delegation Test
Run every recurring task in your week through this test. The ones that fail all three are immediate delegations.
Question 1: Could a reasonably skilled person do this with 80% of my quality? Not perfect. 80%. If yes, the task is delegable. If no — strategic vision, customer empathy, founder-led sales conversations early on — keep it. Most tasks pass this question. Most founders pretend they don't.
Question 2: Does my doing this directly create founder leverage? Founder leverage means the work compounds because you specifically did it. A founder writing the first sales script: yes, leverage. A founder formatting the email template the script gets pasted into: no leverage. Half the work in your week probably fails this question once you ask honestly.
Question 3: Is the cost of getting this wrong catastrophic? Some tasks have asymmetric downside — wire transfers, contracts, public communication during a crisis. Those should stay on you, or at least pass through you, even if Questions 1 and 2 say delegate. The point of this question is to identify the rare case where the cost of error outweighs the cost of doing it yourself.
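Here "pass" means the question gives you a reason to keep the task, and "fail" means it doesn't. A task that fails all three (someone else can do it well, it doesn't generate founder leverage, and it isn't catastrophic if wrong) is a delegation. A task that fails Q1 and Q2 but passes Q3 is a delegation with founder oversight. A task that passes Q1, meaning nobody else can yet do it at 80% of your quality, should stay with you until you absolutely have to hand it off.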
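The three questions reduce to a small decision function. This is a minimal sketch, not a tool from the post: the `Task` type, its field names, and the outcome labels are illustrative. It also assumes that a task someone else could do but that still creates founder leverage stays with the founder, a combination the test above does not spell out.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    others_can_hit_80_percent: bool  # Question 1: could someone else do it at 80% quality?
    creates_founder_leverage: bool   # Question 2: does it compound because the founder did it?
    error_is_catastrophic: bool      # Question 3: is the downside of a mistake asymmetric?


def delegation_decision(task: Task) -> str:
    """Return 'keep', 'delegate with founder oversight', or 'delegate'."""
    if not task.others_can_hit_80_percent:
        # Only the founder can do it well enough right now.
        return "keep"
    if task.creates_founder_leverage:
        # Assumption: leverage outweighs delegability (not stated in the post).
        return "keep"
    if task.error_is_catastrophic:
        # Delegable, but mistakes are too costly to go unreviewed.
        return "delegate with founder oversight"
    return "delegate"


week = [
    Task("Format the sales email template", True, False, False),
    Task("Send wire transfers", True, False, True),
    Task("Early founder-led sales conversations", False, True, False),
]
for task in week:
    print(f"{task.name}: {delegation_decision(task)}")
```

Running a week's recurring tasks through a function like this makes the ambiguous cases visible: any task where you hesitate over one of the three booleans is a task you have not defined well enough to hand off.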
### What to Offload First (Ranked)
Once you've identified the delegable work, the order matters. Most founders get the order wrong and hire for the wrong thing.
1. Sales follow-up and pipeline hygiene. If you have any kind of pipeline — leads, demos, proposals — and you're not following up consistently, this is the highest-leverage thing to fix first. The math is brutal: most founder pipelines lose 40-60% of their potential conversions to inconsistent follow-up. A part-time SDR or a fractional sales lead pays for itself in 2-3 months.
2. Customer onboarding and success. Second-highest leverage. If your churn is high or your activation is low, the fix isn't more product — it's someone making sure new customers actually use it. This is also the most common bad delegation: founders hire a generic "operations" person and assume they'll do this. They won't. Hire someone whose specific job is "customers are activated and not churning."
3. Technical execution if it's blocking the business. If you've got a vibe-coded MVP that's struggling under real load, or your engineering velocity has dropped because the codebase needs senior judgment, a fractional CTO is the right move. The cost ($5,000-15,000/month) is much lower than a full-time senior engineer ($200,000+ all-in) and you get strategic technical judgment, not just code. For the deeper breakdown on whether this is the right call, see fractional CTO vs full-time CTO.
4. Content and marketing production. If content is part of your acquisition motion and you're not shipping consistently, delegate the production. Contractors or fractional content leads. Don't delegate the strategy — keep that — but the actual writing, editing, and posting is delegable as long as you've defined the voice clearly.
5. Bookkeeping, invoicing, and compliance ops. Lowest in the rank because it's the easiest to justify keeping (small money, predictable cadence). But it eats founder hours that have higher-leverage uses elsewhere. Outsource bookkeeping early — $200-500/month buys back 5-10 hours/month.
### Fractional vs. Contractor vs. Full-Time
The decision isn't just what to delegate. It's what shape the help should take.
Fractional senior support. A senior person working 10-20 hours/week who manages 2-4 clients. You get experienced judgment without a full-time price tag. Best for: functions where the value is judgment, not raw output. Fractional CTO, fractional CFO, fractional CMO, fractional product. Cost: $5,000-15,000/month. Wrong choice if you mostly need execution capacity.
Contractors. Specific deliverables on a per-project or per-hour basis. Best for: discrete output you can scope cleanly. A specific feature build, a specific campaign, a specific number of blog posts a month. Cost: $50-150/hr typical, sometimes higher for niche skills. Wrong choice if you need ongoing judgment about ambiguous work.
Full-time first hire. A salaried employee with benefits, dedicated to your business 40 hours a week. Best for: when you have 40 hours of the same work week after week and the rate of change is high enough that you can't afford context-switching delays. Cost: $80,000-150,000 all-in for a generalist; more for a senior specialist. Wrong choice if the work isn't yet 40 hours/week or if you can't articulate clearly what their first 90 days look like.
The default at the post-MVP stage is fractional or contractor before full-time. The reason isn't cost; it's reversibility. A wrong fractional engagement ends in 30 days. A wrong full-time hire takes 3-6 months to unwind and does emotional damage on both sides. Reversibility is leverage when you're still figuring out what shape the business takes.
### Three Worked Examples
Example 1: A founder closing $5K/mo deals with a 3-month sales cycle. They've got 30 conversations in flight at any given time and they're losing track. The right first move is fractional sales support or a part-time SDR: someone whose specific job is keeping the pipeline warm and surfacing the deals that need founder attention. Fractional sales lead: $4,000-8,000/mo. SDR contractor: $1,500-4,000/mo. Net effect: founder spends time only on closing calls, not chasing emails.
Example 2: A founder running a vibe-coded SaaS that just hit $30K MRR but the product breaks every time they ship. The right first move isn't another developer — it's a fractional CTO who can decide what the architecture should look like and whether to rebuild, refactor, or hire engineering capacity. The cost of guessing wrong here is months of wasted developer time. A fractional CTO at this stage prevents the most expensive form of mistake: rebuilding the wrong thing.
Example 3: A solo founder generating 60-80 leads a month from content but converting 2%. The right first move is customer success — someone (fractional or contractor) whose job is making sure new signups activate and don't churn. Hiring a marketer to generate more leads at this stage is a mistake; you have a leaky bucket, not a top-of-funnel problem. Fix the conversion before you hire to scale the volume.
### Pricing Power Comes From Delegation
One non-obvious benefit of delegating earlier than feels comfortable: it forces you to charge what your time is actually worth. When you do everything yourself, you can quietly charge low because your only cost is your time. The moment you have $5,000/mo of fractional support running, you have to charge enough to cover it. That pressure pushes pricing up, which is healthy. Raising prices is one of the high-leverage moves at this stage. The companion read is how to raise prices without losing customers.
### When to Hire Your First Employee (the Actual Answer)
If you ran the 3-question delegation test, ranked what to offload, and tried fractional or contractor help first, the question of when to hire your first employee mostly answers itself. You hire your first employee when you have 40 hours a week of the same work, a clear definition of what their first 90 days look like, and the cash flow to absorb both the salary and the ramp time without strangling the business. Most founders ask "when to hire your first employee" before any of those three are true. The honest answer in those cases isn't "now"; it's "not yet, here's what to delegate first."
### If You Want Help Making the Call
The hardest part of the Delegation Decision isn't the framework; it's overcoming the founder's instinct to keep grinding. If you want help running the 3-question test against your specific week, book a strategy call. The Delegation Decision is what Month 3 of Founder's Cut works through, and it usually surfaces 2-3 immediate delegations the founder had been avoiding.
If the leverage point you're considering is technical, read what a fractional CTO actually does first to figure out whether that's the right fit. And if your roadmap is the actual bottleneck — too many features, no clear priority — start with The Clarity Filter. Sometimes the highest-leverage delegation is to delegate the cuts to someone who isn't emotionally attached to the features.
### Frequently Asked Questions
**Q: How do I know it's time to delegate?**
A: Three signals. First: you're declining work that would otherwise close because you can't physically do it (sales calls you skip, customer requests you ignore). Second: you're spending 40%+ of your week on things any reasonably skilled person could do, while the things only you can do are slipping. Third: a specific lever — sales follow-up, customer onboarding, content production — has measurable upside that you're consistently failing to capture. Any one of these means it's time. All three at once means it's overdue.
**Q: What's the first role I should hire?**
A: Whatever's blocking the most revenue. For most post-MVP founders, that's either sales (you can build but can't sell at the volume the product needs) or customer success (deals close but customers churn before they're really won). The wrong first hire is a generalist 'operations' person — those are useful at year 2 but premature at year 1. The right first hire owns one specific lever and has the skills to actually move it.
**Q: Should I hire fractional, contractor, or full-time first?**
A: Default to fractional or contractor for the first piece of leverage. The cost of a wrong full-time hire at the post-MVP stage is brutal — you're paying for capacity you're not using, dealing with the wrong-fit conversation, and burning founder time on people management when you should be on product. Fractional support (a fractional CTO, fractional CMO, fractional ops) gives you senior judgment at 10-20 hours a week. Contractors give you specific output without the management overhead. Move to full-time only when you can fill 40 hours a week of the same work.
**Q: How much should I spend on the first piece of leverage?**
A: Fractional senior support runs $5,000-15,000/month depending on the function. A solid contractor runs $50-150/hr. Full-time generalist hires at the post-MVP stage are $80,000-150,000 all-in. The question isn't 'what does it cost' but 'what's the highest-leverage cost'. A $10,000/mo fractional CTO who saves you 40 hours of your time is profitable. A $90,000/yr ops person who saves you 10 hours of your time is not.
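The comparison in that answer is easier to see as cost per founder hour bought back. A rough sketch, assuming the hours saved in both examples are per month (the answer does not state the period); the function name is illustrative:

```python
def cost_per_hour_saved(monthly_cost: float, hours_saved_per_month: float) -> float:
    """Dollars spent for each founder hour bought back in a month."""
    if hours_saved_per_month <= 0:
        raise ValueError("hours_saved_per_month must be positive")
    return monthly_cost / hours_saved_per_month


# Assumption: both "hours saved" figures are monthly.
fractional_cto = cost_per_hour_saved(10_000, 40)    # $10,000/mo, 40 hours saved
ops_hire = cost_per_hour_saved(90_000 / 12, 10)     # $90,000/yr, 10 hours saved

print(f"Fractional CTO: ${fractional_cto:,.0f} per hour saved")
print(f"Ops hire:       ${ops_hire:,.0f} per hour saved")
```

Under that assumption the fractional CTO costs $250 per hour saved and the ops hire $750, which is the gap the answer is pointing at. Whether either number is "profitable" still depends on what a founder hour is worth in your business.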
**Q: What should I never delegate?**
A: Three things, at the post-MVP stage. First: customer conversations. You learn most of what matters about your business by talking to customers; if you offload this, you lose your decision-making fuel. Second: the strategic vision. Coaches help you sharpen it, advisors pressure-test it, but you don't delegate it. Third: the first version of any new product surface. Once a feature is shipped and validated, hand off the iteration. Before then, the founder's hands have to be on it.
**Q: What's the most common delegation mistake?**
A: Hiring too generic. Founders feel overwhelmed and think 'I need help.' They hire someone to 'help out' without defining what they're handing off. The result is an expensive person doing scattered work, the founder still doing all the important things, and a slow drift toward the founder feeling busier instead of less busy. Fix: never hire 'help.' Hire one specific lever — sales follow-up, content production, customer onboarding — with a clear definition of done.
**Q: How do I interview for a first hire?**
A: Skip the resume questions. Give the candidate a real task from your business — a sales call to role-play, a customer message to draft, a feature request to triage. Watch how they think in real time. The single best signal at this stage is whether they ask the right questions before doing the work. Generalists jump to action. Senior people stop and frame the problem first. You want frame-first people, even if their domain experience is light.
---
## The Prioritization Formula: A Product Prioritization Framework for Solo Founders
- **URL:** https://justinmckelvey.com/blog/product-prioritization-framework
- **Published:** April 27, 2026
- **Updated:** April 27, 2026
- **Category:** Product Leadership
- **Reading time:** 9 min
- **Description:** A product prioritization framework for solo founders. Combines ICE's speed, RICE's reach, Kano's user-satisfaction lens, and Lean's validation focus into one scoring formula.
### TL;DR: For Founders Re-Litigating Priorities Every Monday
If your team relitigates priorities every Monday, your prioritization framework is broken, or you don't have one at all. Here's the formula: (Customer Fit + Metric Impact + Validation Confidence + Speed) − Cost = Priority Score. Five inputs, scored 0-20 each: four added, one subtracted, for a raw total between -20 and 80 that you can shift onto a 0-100 scale by adding 20. You can run it on your whole roadmap in 10 minutes. It exists because every prioritization framework you've been handed (RICE, ICE, Kano, MoSCoW, Lean Prioritization) was built for a product team with hundreds of thousands of users, not a solo founder with 50. The Prioritization Formula takes the best of all five and recalibrates them for the stage you're actually at.
### The 5 Prioritization Frameworks Every Founder Gets Recommended (And Why Each One Fails Solo Founders)
If you've asked any product person, podcast, or AI assistant "how do I prioritize features?", you've heard the same five answers. Here they are, briefly, with the specific way each one breaks down at the solo-founder stage.
ICE Score (Impact, Confidence, Ease). Score each feature 1-10 on three dimensions — how much value it delivers (Impact), how sure you are (Confidence), how fast it ships (Ease). Multiply for a single number. Fast and easy to apply. The breakdown: ICE doesn't account for whether the feature serves your specific customer or works against your strategic focus. A feature with high impact for the wrong customer scores high but should be a cut.
RICE Score (Reach, Impact, Confidence, Effort). The RICE framework adds Reach — how many users will be affected — to ICE. Reach × Impact × Confidence ÷ Effort. The breakdown: solo founders don't have reliable reach data. With 50 users, every reach estimate is noise. RICE works at scale but makes solo-founder scoring feel scientific when it's actually guesswork.
Kano Model. Categorizes features as Basic Needs (must-haves), Performance Features (more is better), Delighters (unexpected wins), or Indifferent (don't bother). Categorical, not numerical. The breakdown: Kano is great for thinking about features but bad for ordering them. You end up with three Performance Features and four Delighters and no rule for which to ship first.
MoSCoW (Must-have, Should-have, Could-have, Won't-have). Sort features into four buckets. Simplest of the bunch. The breakdown: MoSCoW makes you feel decisive without forcing real evidence. Half a roadmap ends up labeled "Must-have" because nobody wants their feature in "Could-have." It works for project scoping; it fails as a prioritization framework.
Lean Prioritization (Validation First). Build whatever tests your riskiest assumption first. Don't score features — score hypotheses. The breakdown: Lean is the right mindset but doesn't help when you have 12 features that all test reasonable hypotheses. You still need a way to choose between them.
All five are correct in some context. None of them are calibrated for the solo founder with 50 users, a product they shipped 3 months ago, and a roadmap of 14 reasonable-sounding features.
Feature prioritization at the solo-founder stage isn't the same activity as feature prioritization at a product team with PMs and analytics. The inputs are different, the data is thinner, and the cost of a wrong call is more concentrated. A scoring framework built for the second stage produces noise at the first.
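For reference, the ICE and RICE arithmetic described above fits in a few lines. This is a sketch of the multiplicative versions as this post describes them; the sample numbers are hypothetical and exist only to show how sensitive RICE is to the reach estimate.

```python
def ice_score(impact: float, confidence: float, ease: float) -> float:
    """ICE as described above: three 1-10 scores multiplied together."""
    return impact * confidence * ease


def rice_score(reach: float, impact: float, confidence: float, effort: float) -> float:
    """RICE as described above: Reach x Impact x Confidence / Effort."""
    if effort <= 0:
        raise ValueError("effort must be positive")
    return reach * impact * confidence / effort


# Hypothetical feature scored twice with two equally plausible reach guesses.
# With ~50 users, "10 users" versus "20 users" is a coin flip, yet it doubles the score.
low_guess = rice_score(reach=10, impact=2, confidence=0.5, effort=2)
high_guess = rice_score(reach=20, impact=2, confidence=0.5, effort=2)
print(low_guess, high_guess)
```

Because reach is a straight multiplier, any error in it passes directly into the final score, which is why the estimate matters so much at small user counts.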
### What We Actually Need: A 6th Option Built for Solo Founders
The Prioritization Formula is the synthesis. It takes ICE's speed, replaces RICE's broken Reach with Customer Fit (which is the same idea calibrated for small numbers), borrows Kano's user-satisfaction logic to define Metric Impact, takes Lean's validation discipline as a standalone input, and uses MoSCoW's rigor by enforcing a minimum threshold under which features are simply cuts.
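The formula is intentionally simple. Five inputs, each scored 0-20: four added and one (Cost) subtracted, so the raw total runs from -20 to 80. Adding 20 puts it on a 0-100 scale, and the bands only fit that shifted scale, since a raw total can never reach 85. Anything below 50 is a cut. Anything 50-69 is "park, revisit." 70-84 is "ship in this quarter." 85+ is "ship next."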
### The Five Inputs, Defined
Input 1 — Customer Fit (0-20). Does this feature serve your one core customer? Score 20 if it directly serves them. 10 if it serves them indirectly (helps their workflow but isn't core to the value). 0 if it serves a different customer segment you're not currently pursuing. This input replaces RICE's Reach. With 50 users, you don't measure how many; you measure whether the right ones get value.
Input 2 — Metric Impact (0-20). Pick your one metric — activation, retention, revenue, whatever the stage demands. Score how much this feature plausibly moves it. 20 if you have direct evidence (similar feature in adjacent product, prototype data, customer interviews). 15 if the logic is strong but unproven. 10 if it's a defensible bet. 5 if it's a guess. 0 if it doesn't move the metric at all (these usually mean a different feature, not a different score).
Input 3 — Validation Confidence (0-20). How sure are you the approach works the way you think it works? 20 if you've tested it, even crudely. 15 if you've talked to users who validated the design. 10 if it's standard practice in your space. 5 if it's a strong hypothesis but unvalidated. 0 if you're guessing. Most founders' first instinct is to score everything 15+ here. Don't. Score honestly. This is the input that catches the "I'm sure this will work" features that won't.
Input 4 — Speed (0-20). How fast can you ship it? 20 if it's a 1-3 day build. 15 if it's a week. 10 if it's 2-3 weeks. 5 if it's a month or more. 0 if you can't estimate (which means you don't understand the work yet — go define it before scoring). Speed matters more than founders admit because slow features delay the next experiment.
Input 5 — Cost (0-20, subtracted). Real cost beyond build time. Score 20 if the feature locks you in (data shape changes, deep integrations, training users to expect behavior). 15 if it adds substantial complexity. 10 if it's moderate. 5 if it's mostly contained. 0 if it's trivially cheap to remove or ignore. Subtract this from the sum of the other four. Cost is the input that kills the most "looks like an easy win" features once you score it honestly.
Total: (Customer Fit + Metric Impact + Validation Confidence + Speed) − Cost. Range: -20 to 80. Add a constant of 20 if you want all positive scores; the relative order is what matters.
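The formula and bands can be sketched in code. The three sample rows are the ones from the scoring table in this post; the class and function names are illustrative. One assumption to flag: the cut/park/ship bands are applied to the shifted 0-100 score (raw total plus 20), because the raw total tops out at 80 and could never reach the 85+ band.

```python
from dataclasses import dataclass

OFFSET = 20  # optional constant from the text: shifts the raw -20..80 range to 0..100
INPUT_NAMES = ("customer_fit", "metric_impact", "validation_confidence", "speed", "cost")


@dataclass(frozen=True)
class FeatureScore:
    name: str
    customer_fit: int
    metric_impact: int
    validation_confidence: int
    speed: int
    cost: int

    def __post_init__(self) -> None:
        # Every input is scored on the same 0-20 scale.
        for input_name in INPUT_NAMES:
            value = getattr(self, input_name)
            if not 0 <= value <= 20:
                raise ValueError(f"{input_name} must be between 0 and 20, got {value}")

    @property
    def raw_total(self) -> int:
        # (Customer Fit + Metric Impact + Validation Confidence + Speed) - Cost
        return (self.customer_fit + self.metric_impact
                + self.validation_confidence + self.speed) - self.cost

    @property
    def shifted_total(self) -> int:
        return self.raw_total + OFFSET


def priority_band(shifted_total: int) -> str:
    """Assumption: bands are read against the shifted 0-100 score."""
    if shifted_total >= 85:
        return "ship next"
    if shifted_total >= 70:
        return "ship in this quarter"
    if shifted_total >= 50:
        return "park, revisit"
    return "cut"


features = [
    FeatureScore("Feature 1", 20, 15, 10, 15, 5),
    FeatureScore("Feature 2", 10, 20, 5, 20, 10),
    FeatureScore("Feature 3", 20, 20, 15, 10, 5),
]
ranked = sorted(features, key=lambda f: f.raw_total, reverse=True)
for f in ranked:
    print(f.name, f.raw_total, f.shifted_total, priority_band(f.shifted_total))
```

Sorting works the same on the raw or the shifted total, since the offset is a constant. Only the band labels depend on which scale you pick, so decide once and keep it consistent across rescoring rounds.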
### The Copy-Paste Scoring Table
Here's the table to use. Copy this into a spreadsheet, paste your features in column 1, score honestly. Sort by total descending. Ship the top 1-3 in the next sprint.
| Feature | Customer Fit (0-20) | Metric Impact (0-20) | Validation Confidence (0-20) | Speed (0-20) | Cost (0-20, subtracted) | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Feature 1 | 20 | 15 | 10 | 15 | 5 | 55 |
| Feature 2 | 10 | 20 | 5 | 20 | 10 | 45 |
| Feature 3 | 20 | 20 | 15 | 10 | 5 | 60 |
The pattern you'll see almost immediately: features the team is excited about score worse than features that are obvious-but-boring. That's not a bug. The formula is doing the job. Boring features that serve the right customer with high validation confidence beat exciting features that don't, every time.
### Three Worked Examples
Example 1: A founder building a directory tool for tennis players. Roadmap item: "Add player skill-level filtering."
Customer Fit: 20 (tennis players directly want this). Metric Impact: 15 (filtering plausibly improves successful match rate; users in interviews keep mentioning skill mismatch as the reason past matches didn't repeat). Validation Confidence: 15 (users have explicitly asked for it in 6 interviews). Speed: 15 (1-week build). Cost: 5 (low — straightforward filter, easy to remove). Total: 60. Ship it.
Example 2: A founder building a lead-qualification tool. Roadmap item: "Build admin dashboard for managing API keys."
Customer Fit: 10 (real estate teams need it eventually but it's not what they're paying for). Metric Impact: 5 (doesn't move qualification accuracy or volume). Validation Confidence: 10 (standard pattern but no direct evidence it matters at this stage). Speed: 10 (2-3 weeks). Cost: 10 (adds an entire admin surface to maintain). Total: 25. Cut. The founder will think about this feature 8 more times before realizing the formula was right; the cost of building it would have been 3 weeks of foregone customer-facing work.
Example 3: A founder building a SaaS product. Roadmap item: "Add a referral program."
Customer Fit: 15 (existing customers benefit). Metric Impact: 20 if you have evidence referrals work in your space, 10 if you're guessing. Let's say 10 (no evidence yet — first SaaS product). Validation Confidence: 5 (pure hypothesis). Speed: 10 (2-3 weeks). Cost: 10 (referral logic adds complexity to user model). Total: 30. Cut, or rerun as a Lean experiment first — fake-door test a referral program with a landing page before building.
### How This Compares to Other Feature Prioritization Frameworks
Side-by-side against the named frameworks: ICE is faster but doesn't account for customer fit. RICE is more rigorous but requires reach data solo founders don't have. Kano is great for thinking about features categorically but doesn't order them. MoSCoW is decisive but doesn't force evidence. Lean Prioritization is the right mindset but skips the scoring step. The Prioritization Formula is the synthesis: it preserves Lean's validation discipline as a standalone input, replaces RICE's broken Reach with Customer Fit at small-team scale, and produces a single number you can sort by.
### Common Mistakes
Scoring optimistically when you're attached to a feature. Everyone does this. The fix is the blind-scoring trick from the FAQ: score every input before computing the total. Better fix: have someone else score the same features independently and look at the disagreement.
Confusing Customer Fit with "could be useful." Customer Fit is binary at the high end. Either it serves the core customer directly or it doesn't. "Some users might want this" scores 5-10, not 15-20. The line catches a lot of well-meaning feature requests that don't belong on this product.
Underweighting Cost. Cost is the input founders most often wave their hand at. The actual cost of a feature isn't build time — it's the years of carrying it, the surface area in onboarding, the bugs in adjacent features when this one changes, the cognitive load on the team. Score Cost honestly. It's almost always higher than your gut estimate.
Treating the score as the answer. The score is an input to the decision, not the decision itself. If a 60-point feature feels wrong to ship, the formula is telling you one of the inputs is miscalibrated. Find the input you don't believe and re-score it. The formula is a forcing function for honesty, not a replacement for judgment.
### If You Want Help Calibrating the Formula
Most teams need 1-2 cycles of using the formula before they trust the inputs. The first round usually surfaces 2-3 inputs that need recalibration for your specific business. If you want a faster path, book a strategy call and I'll run your current roadmap through the formula with you, score by score. It's the same exercise that drives Month 2 of Founder's Cut.
If your roadmap has more features than feels reasonable, the cuts come first. Read The Clarity Filter and run that before scoring. The formula is for choosing among the survivors. And if you're earlier than this — still working out which MVP to ship — start with the 6-week MVP framework. Prioritization is a luxury you earn after you've shipped.
### Frequently Asked Questions
**Q: How is the Prioritization Formula different from RICE?**
A: RICE is built for product teams with reach data: they have hundreds of thousands of users and can estimate how many would touch a given feature. Solo founders almost always have wrong reach numbers because their user base is too small. The Prioritization Formula replaces Reach with Customer Fit (does this serve our one core customer?) and adds Validation Confidence as a separate input. The output is still a single sortable score per feature, but where RICE's score is an unbounded product, the formula's total is bounded (-20 to 80 raw, or 0-100 after adding 20), and the inputs are calibrated for stages where you have 50 users instead of 50,000.
**Q: How do I score features honestly when I'm biased toward what I've already built?**
A: The honest-scoring trick is to score every feature blind first — write the inputs without seeing the running total. Then add up. Most founders' first instinct is to back into the score they want; doing the inputs blind makes that harder. A second trick: have a peer score the same feature independently. If their score is more than 15 points off yours, the gap is usually you flattering a feature you're attached to.
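The peer-scoring check can be sketched as a comparison of two score sheets. The 15-point threshold comes from the answer above; the dictionary keys, function names, and sample scores are illustrative.

```python
INPUTS = ("customer_fit", "metric_impact", "validation_confidence", "speed", "cost")
GAP_THRESHOLD = 15  # "more than 15 points off" from the answer above


def total(scores: dict[str, int]) -> int:
    """(Customer Fit + Metric Impact + Validation Confidence + Speed) - Cost."""
    return sum(scores[name] for name in INPUTS if name != "cost") - scores["cost"]


def compare_scorers(mine: dict[str, int], peer: dict[str, int]) -> dict:
    """Report the total gap and the single input the two scorers disagree on most."""
    gap = total(mine) - total(peer)
    per_input = {name: mine[name] - peer[name] for name in INPUTS}
    widest = max(INPUTS, key=lambda name: abs(per_input[name]))
    return {
        "gap": gap,
        "flag": abs(gap) > GAP_THRESHOLD,
        "widest_input": widest,
    }


# Hypothetical scores for one feature, entered blind by the founder and a peer.
founder = {"customer_fit": 20, "metric_impact": 15, "validation_confidence": 15,
           "speed": 15, "cost": 5}
peer = {"customer_fit": 20, "metric_impact": 10, "validation_confidence": 5,
        "speed": 15, "cost": 10}

result = compare_scorers(founder, peer)
print(result)
```

The `widest_input` field is the useful part: it names the input to argue about first, which is usually quicker than debating the total.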
**Q: Who should be doing the scoring — founder or team?**
A: Solo founders should do it themselves but pressure-test the inputs with one or two trusted advisors. Small teams (2-5 people) should have whoever owns the feature score it, then have a second person score it independently. The score isn't the point — the disagreement is. Where two scorers diverge, you've found the part of the feature that's underspecified or unevidenced. Resolve that gap and the score gets cheap to settle.
**Q: How often should I rescore features?**
A: Rescore the top 5-10 features on the roadmap once a month. The whole list once a quarter. Inputs change — what you learned from users, what your metrics did, what the market did. A score from three months ago is stale. The cadence isn't optional; an old prioritization list quietly becomes wrong before you notice.
**Q: When does the formula break?**
A: Three cases. First: when a single feature is genuinely existential (security, compliance, something your top customer demanded). Score it anyway, but trust your judgment over the score. Second: when you're at a strategic inflection point — entering a new market, repositioning the product. The score assumes your strategy is fixed; if it's not, the score is meaningless. Third: when the team has lost trust in the inputs. If people are gaming the scores, fix the trust issue, not the formula.
**Q: How do I weight the inputs?**
A: Default to equal weights for the first quarter — 5 inputs, 20 points each. After 90 days of using the formula, look at which features you scored high and shipped: did the high-scorers actually move the metric? If yes, your weights are calibrated. If not, the input that's least predictive is the one to weight down. Most solo founders end up over-weighting Customer Fit and under-weighting Cost, because cost is harder to estimate accurately.
**Q: Can I use this with a team that's already on RICE or another framework?**
A: Yes — but don't switch frameworks just to switch. The Prioritization Formula is most useful when the existing framework is producing bad decisions. Symptoms: the team always agrees on the score but disagrees on what to ship next; high-scoring features keep failing; the scoring takes 3+ hours and people stop doing it. If your current framework isn't broken, leave it alone. If it is, swap to the formula and watch how the new inputs change which features rise to the top.
---
## The Clarity Filter: How to Beat Feature Creep and Know What to Stop Building
- **URL:** https://justinmckelvey.com/blog/feature-creep-what-to-stop-building
- **Published:** April 27, 2026
- **Updated:** April 27, 2026
- **Category:** Product Leadership
- **Reading time:** 7 min
- **Description:** Feature creep kills more products than competitors do. The Clarity Filter is a 4-question test for deciding what to stop building before scope creep buries you.
### TL;DR: For Founders Whose Roadmap Has Gotten Out of Hand
If your roadmap has 14 items, your team is busy, but the metric isn't moving, your problem isn't execution speed. It's that half of what you're building shouldn't be built. Feature creep isn't a discipline problem. It's a clarity problem, and you can't fix it by working harder. The Clarity Filter is a 4-question test you run on every feature on your roadmap. Anything that fails any one of the four is a cut. When founders run this honestly the first time, 30-50% of their roadmap disappears, and the work that's left actually moves the business.
Why Feature Creep Is a Clarity Problem, Not a Discipline ProblemMost advice on feature creep treats it like a willpower issue. "Just say no." "Be more disciplined about scope." That's wrong, and if you've ever tried to apply that advice while staring at your own roadmap, you already know it doesn't help. Founders aren't shipping bloated products because they lack willpower. They're shipping bloated products because every individual feature looks reasonable when you evaluate it alone.
The reasonable-feature trap: Someone suggests a feature. You think about it. It would help some users. It's not too hard to build. You add it. Repeat 50 times. Now you have a bloated product. Each individual decision was defensible. The aggregate is a mess.
The fix isn't more willpower. The fix is a filter that's harder to pass than "this seems reasonable." A good filter forces a feature to clear several specific bars at once, and the cumulative bar is high enough that most reasonable-looking features fail it.
### How to Know What to Stop Building: The 4-Question Clarity Filter
If you Googled "what to stop building" or "how to cut features," you're already past the hardest part — you know the roadmap is too big. The next move is having a rule for the cuts. Here's the rule.
Every feature on your roadmap has to pass all four questions. Failing any one is a cut. Not "we'll discuss it." A cut.
Question 1: Does this serve your one core customer? Not "could it be useful to some users." Does it directly serve the specific customer your product is built around. If you serve solo founders and the feature is "team collaboration," it fails. If you serve solo founders and the feature is "team collaboration we'll need eventually when we expand," it fails harder. Eventually isn't a customer.
Question 2: Does it move the one metric that matters right now? Pick your one metric — activation, retention, revenue, whatever the stage demands. Each feature has to plausibly move that metric. "It improves the experience" doesn't count. "It moves activation from 32% to ~38% based on the assumption that X" counts. If you can't write a sentence about which metric moves and roughly how much, you don't know why you're building the feature.
Question 3: Are you sure it works the way you think it works? Most features founders are sure about turn out to be wrong when users touch them. The filter version: do you have direct evidence (interviews, prototypes, similar features in adjacent products) that this approach is right? If the only evidence is "it makes sense," you're guessing. Guessing isn't disqualifying — but a roadmap full of guesses is. Limit yourself to one or two unproven bets at a time.
Question 4: Is it cheap to remove later? Some features lock you in. They change data shapes, integrate deeply with other features, or train users to expect behavior that's hard to walk back. Those features need an extra-high bar because the cost of being wrong is permanent. Cheap-to-remove features (a setting, a copy change, a new view) can be more speculative because you can rip them out in an afternoon.
A feature that passes all four — serves the core customer, moves the metric, has real evidence, cheap to remove — is worth building. A feature that fails any one is a cut. Most founders find that 30-50% of their roadmap fails on Question 1 or 2 alone.
### Applying the Filter to a Real Roadmap
Here's how it actually plays out. Imagine a founder building a booking tool for fitness trainers. The current roadmap has 14 items.
Item: "Trainer profile pages with photos and bios." Does it serve the core customer (trainers)? Yes. Does it move the metric (bookings)? Probably — clients book trainers they trust. Is there evidence? Yes, every booking platform has profiles. Cheap to remove? Yes. Pass.
Item: "Group class scheduling for studios." Does it serve the core customer (solo trainers)? No — studios are a different customer. Cut. The temptation to keep this kind of feature is enormous because "it's not that hard to build" and "studios are an obvious expansion." Both true. Both irrelevant. The product isn't ready to serve two customers; it's barely serving one.
Item: "AI workout plan suggestions." Does it move the metric (bookings)? Unclear — workout plans are a separate product surface. Is there evidence it improves bookings? No. Cut, or move to a separate research bucket.
Item: "Settings page with notification preferences." Cheap to remove? Yes. Moves the metric? Probably not directly. Cut for now — defer until users actually complain about notifications.
This isn't a hypothetical pattern. Every roadmap I've audited has the same shape: 4-6 items that clearly serve the customer and move the metric, 8-10 that fail one of the four questions, and the founder has been treating all 14 as roughly equal.
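The pass-all-four rule and the four example items above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a tool from the article: the `Feature` class, its field names, and the helper functions are hypothetical, and any answer the walkthrough doesn't state is set to `True` so that only the stated failures trigger a cut.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """One roadmap item with a yes/no answer per Clarity Filter question.

    Hypothetical structure; field names are illustrative only.
    """
    name: str
    serves_core_customer: bool  # Question 1
    moves_key_metric: bool      # Question 2
    has_evidence: bool          # Question 3
    cheap_to_remove: bool       # Question 4


# (attribute name, human-readable label) for each of the four questions.
QUESTIONS = [
    ("serves_core_customer", "Q1: serves the core customer"),
    ("moves_key_metric", "Q2: moves the one metric"),
    ("has_evidence", "Q3: backed by evidence"),
    ("cheap_to_remove", "Q4: cheap to remove"),
]


def failed_questions(feature):
    """Return the labels of every question the feature fails."""
    return [label for attr, label in QUESTIONS if not getattr(feature, attr)]


def clarity_filter(features):
    """Split features into (keep, cut): failing any one question is a cut."""
    keep, cut = [], []
    for feature in features:
        (cut if failed_questions(feature) else keep).append(feature.name)
    return keep, cut


# Answers taken from the walkthrough; unstated answers are assumed True.
roadmap = [
    Feature("Trainer profile pages", True, True, True, True),
    Feature("Group class scheduling", False, True, True, True),
    Feature("AI workout plan suggestions", True, False, False, True),
    Feature("Notification settings page", True, False, True, True),
]

keep, cut = clarity_filter(roadmap)
print("Keep:", keep)
print("Cut:", cut)
```

Recording which question failed, rather than just a pass/fail flag, keeps the reason for each cut visible when the list is reviewed later.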
### How Feature Creep Becomes Scope Creep
Feature creep and scope creep aren't the same thing, but one becomes the other if you don't catch it early.
Feature creep is adding features beyond what the product needs to serve its core customer. You ship the booking tool, then you add the workout plans, then you add the studio scheduling. Each addition is a feature. The product still has a clear shape, just a more cluttered version of it.
Scope creep is when feature creep changes what the product is. You started building a booking tool for solo trainers. Now you're building a fitness platform. The customer is different. The pricing is different. The competitor set is different. You didn't decide to build a fitness platform — you arrived there one feature at a time.
The Clarity Filter blocks both, but it does so at different points. Question 1 (does it serve your one core customer?) catches scope creep — anything that serves a different customer fails. Questions 2-4 catch feature creep — anything that doesn't move the metric, lacks evidence, or is hard to remove fails.
If you're already in scope creep — the product has drifted away from its original customer — the filter still works, but the harder question is which customer you're keeping. That's a different exercise: Founder's Cut's Month 1 (Clarity) is built for exactly this case.
### What to Do With the Killed Ideas
Cutting a feature doesn't mean the idea was bad. It means the idea isn't right for now. You have two options for what to do with the killed list.
Park it visibly. Maintain a "parked" list — features you're not building right now and the reason. Review the list quarterly. Some parked features come back as obvious priorities once the product evolves. Most don't. Either way, parking is better than killing because the team and stakeholders can see you didn't forget about it; you just deferred.
Bury it permanently. Some features should never come back. They served a customer you're not pursuing, or solved a problem you've decided isn't worth solving. Bury those. Don't park them; killing them definitively is healthier than letting them haunt every prioritization conversation.
The point of cutting isn't to be permanent about the cut — it's to free up the team's attention right now. A clear "not now, here's why" is almost as valuable as a yes.
### The Filter on Existing (Already-Shipped) Features
Most teams apply this kind of filter to new features but not to features they've already shipped. That's backwards. Shipped features have higher costs than unshipped ones — they take up UI real estate, they have to be maintained, they show up in onboarding, they shape user expectations. A bad shipped feature is more expensive than a bad idea.
Run the Clarity Filter against shipped features once a quarter. Anything that fails — especially Questions 1 and 2 — should be a candidate for removal, not just "deprecation" or "hiding." Removing features feels scary; almost no one notices. The team feels relief; the product gets sharper.
Feature bloat happens when this never gets done. After 18 months of shipping, the product has 60 features and usage is concentrated in 8. The other 52 are tax — they slow down the codebase, confuse new users, and consume your team's attention every time something breaks. The fix is to cut.
### If You Want Help Running the Filter
The hard part of cutting isn't knowing which features to cut. It's making yourself do it. Founders are often too close to their own roadmap to apply the filter honestly. If that's where you are, book a strategy call and I'll run the Clarity Filter against your current roadmap with you. It's the same exercise that's the first month of Founder's Cut, condensed into a single conversation.
For the next decision after the cut — which of the surviving features to actually build first — read The Prioritization Formula. And if you're cutting features because the underlying product is broken (a vibe-coded app that's struggling under real load), the deeper fix may be a vibe code rescue rather than a roadmap pruning exercise.
### Frequently Asked Questions
**Q: How do I know when a feature should be cut?**
A: Run it through the Clarity Filter: does it serve your one core customer, does it move the metric that matters, are you sure it works, and is it cheap to remove later? If a feature fails any of the four, cut it. Most founders won't cut features because they've already built them — that's sunk cost, not signal. The cost of keeping a wrong feature is higher than the cost of cutting one you might have wanted later.
**Q: How do I get past the sunk cost trap?**
A: Reframe the question. Instead of 'should I keep this feature I already built?' ask 'if I were starting today, would I build this?' If the answer is no, the feature is costing you complexity, attention, and trust with users — even if you don't see the cost on the P&L. Sunk cost is a real psychological tax, but you pay it once when you cut. You pay it forever if you keep.
**Q: How do I tell my team I'm cutting features they built?**
A: Be direct and don't apologize for the cut. Frame it as 'we shipped this, we learned X, and based on what we learned the right move is to remove it.' Engineers and designers respect clear thinking more than they respect protecting their work. The only thing that damages morale is when cuts are random or when leadership pretends the feature was never important. If a person on your team only built one thing and you're cutting it, you owe them a conversation about what comes next — but the cut itself isn't the problem.
**Q: Aren't some features just obviously necessary even if no one uses them?**
A: Sometimes. Auth, billing, and basic security are real examples — they don't drive engagement but you can't ship without them. The Clarity Filter handles this with question 1: does it serve your one core customer? If your customer can't use the product without it, it stays. The trap is calling something 'foundational' when it's actually optional. Most 'we have to have a settings page' features fail this test.
**Q: Should I cut features after launch if no one's using them?**
A: Yes — and faster than you think. Post-launch is the highest-signal time to cut because you have actual usage data, not assumptions. If a feature has been live for 60 days and fewer than 5% of active users have touched it, it's failing. Either fix the discoverability and re-test, or cut. Carrying dead features forever is how products end up bloated, slow, and hard to explain.
**Q: What are the signs my product has feature bloat?**
A: Three signs. First: you can't describe what the product does in one sentence anymore. Second: new users get confused on first use because there are too many surfaces. Third: your team makes incorrect assumptions about how features interact because the system has gotten too complex to hold in one head. If any of those is true, you have feature bloat — which is feature creep that compounded. The fix is the same: run the Clarity Filter and start cutting.
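The post-launch rule of thumb from the FAQ above (live for 60 days, touched by fewer than 5% of active users) is easy to turn into a check against usage data. The following is a rough Python sketch; the function names, parameters, and example numbers are hypothetical, and how "active user" and "touched" are measured is left to whatever analytics you already have.

```python
def adoption_rate(users_who_touched, active_users):
    """Share of active users who have used the feature at least once."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return users_who_touched / active_users


def is_failing(days_live, users_who_touched, active_users,
               min_days=60, min_rate=0.05):
    """Apply the 60-day / 5% rule of thumb.

    A feature that has been live for less than min_days is not judged yet.
    After that, adoption below min_rate marks it as failing.
    """
    if days_live < min_days:
        return False  # too early to call
    return adoption_rate(users_who_touched, active_users) < min_rate


# Example with made-up numbers: 12 of 400 active users is 3% adoption.
print(is_failing(days_live=75, users_who_touched=12, active_users=400))
```

A failing result is a prompt to either fix discoverability and re-test or cut, as the answer above describes; the check itself does not say which.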
---
## Founder's Cut: A Founder Coaching Framework for Builders Past the Build Stage
- **URL:** https://justinmckelvey.com/blog/founders-cut-founder-coaching-framework
- **Published:** April 27, 2026
- **Updated:** April 28, 2026
- **Category:** Product Leadership
- **Reading time:** 8 min
- **Description:** Founder's Cut: a founder coaching framework for the post-MVP stage. 90 days, 3 phases — Clarity, Systems, Velocity. Built for founders and builders past the build stage who need to figure out what to do next.
### TL;DR: For Founders and Builders Stuck Between MVP and Real Business
You shipped something. People are using it. Maybe paying for it. And now you're staring at a list of 50 next moves with no idea which one actually matters — and every founder you ask gives you a different answer. That's the stage this is built for. Founder's Cut is a 90-day founder coaching framework — Clarity, Systems, Velocity — for founders and builders past the build stage who need to stop guessing and start operating. The name is the punchline: most founders fail not because they didn't build enough, but because they didn't cut enough. Most founder coaching, CEO coach, and executive coaching engagements miss this stage entirely — they focus on mindset when the actual problem is product decisions and where you spend your time. As of 2026: $1,500/month with a 3-month minimum, or $3,999 paid in full.
### Why Builders Past the Build Stage Need Coaching, Not Advice
Once you ship something that works, your problem changes. The first problem was technical: can I build this? The second problem is strategic: should I? Most founder advice — books, podcasts, accelerator programs — is built for the first problem. There's almost nothing built for the second.
You're probably feeling this right now. Your roadmap is a guess. Your week is reactive. Customers want different things. You don't know whose feedback to trust or which deal to chase. None of this means you're doing it wrong; it means you've graduated from one problem to a harder one.
The advice trap: You ask five founders what to do next. You get five answers. They're all reasonable, but they contradict each other. You pick the one that feels right and build for two months. Then you ask five more founders. Five different answers. You pivot. You're now four months in and your roadmap is a graveyard of half-finished features.
The fix isn't more advice. It's a framework that forces you to make your own decisions in a structured way and lets a coach push on the assumptions inside those decisions. That's what coaching actually is — not handing you answers but making sure you can defend the ones you give yourself.
### Founder Coaching vs. CEO Coach vs. Executive Coaching for Founders
These terms get used interchangeably and they shouldn't. The differences matter when you're deciding what kind of help you need.
Executive coaching for founders is the broadest category. It usually focuses on self-management — how you communicate, how you handle stress, how you lead a team. It's the right fit for founders running real organizations with direct reports and board pressure. Less useful at the post-MVP solo stage.
CEO coach work overlaps heavily with executive coaching but adds business-strategy framing. CEO coach engagements typically assume you have a leadership team and your job is to run them. The conversations are about delegation, hiring, and operating cadence.
Founder coaching — at the post-MVP stage specifically — is about the decisions only the founder can make: which customer to focus on, which features to cut, how to spend the next 90 days. Founder's Cut sits in this third category. It's narrower than CEO coach work because the company isn't big enough yet to need executive-level coaching. It's more product-focused than generic founder coaching because that's where the bottleneck usually is.
### Month 1: Clarity — Cut Before You Build
Most founders have too many ideas, not too few. Month 1 is about cutting until what's left is obviously right. No new features get built this month. The work is decisions, not code. This is where the framework gets its name — the founder's cut is the version of your product, your customer list, and your roadmap that survives an honest cutting pass.
Three questions drive Month 1:
Who is your customer, actually? Not the persona deck. The specific person. By name, if you can. If you can't name three real people who would buy this product right now, you don't have a customer — you have a hypothesis about a customer. The work this month is to either find three real people or change the product so that three real people show up.
What problem are you really solving? Founders are notorious for confusing the feature with the problem. A consumer marketplace I worked on didn't solve "tennis players need an app." It solved "tennis players can't find local hitting partners because the existing platforms are dead." The shape of the problem is what tells you which features matter and which don't. Most roadmaps look bloated because the problem statement is vague.
What should you stop building? This is the hard one. Every founder has features they've half-built or fully built that nobody uses. The instinct is to keep them — sunk cost. The right move is usually to cut. Month 1 of Founder's Cut walks through the specific filter for deciding what to cut. The framework for that lives in The Clarity Filter.
The output of Month 1 is a one-page document: who the customer is, what problem you solve, and what's no longer on the roadmap. Most founders find this exercise emotionally expensive. That's normal. It's also the most leveraged work you'll do all quarter.
### Month 2: Systems — Install the Repeatable Stuff
Month 2 turns one-off wins into systems that don't depend on you grinding harder. You install three: a repeatable sales motion, a prioritization system, and an execution rhythm.
Repeatable sales motion. Most post-MVP founders have closed a few deals through hustle — DMs, intro calls, friends-of-friends. That's not a sales motion. A motion is a sequence of steps that consistently produces qualified conversations. Month 2 maps your hustle into a process you can repeat without thinking. The shape varies by business — outbound, content, partnerships, ads — but the principle is the same: turn the thing that worked once into the thing you do on Tuesdays.
Prioritization system. Founders default to whatever feels urgent. That's how roadmaps explode. Founder's Cut uses a specific prioritization formula that scores features against five inputs and outputs a ranked list. The full breakdown is in The Prioritization Formula. The point isn't the formula itself — there are dozens of decent ones (RICE, ICE, Kano, MoSCoW). The point is using one consistently so you stop relitigating priorities every Monday.
Weekly execution rhythm. What does your week look like? Most founders can't answer this without lying. The rhythm is: when you do customer calls, when you build, when you sell, when you think. Without a rhythm, every week is reactive — whoever pings you loudest wins. With one, you protect the deep work that actually moves the business.
By the end of Month 2, your business should run a noticeable amount on autopilot. Not literally — you're still doing most of the work — but the work is structured instead of frantic.
### Month 3: Velocity — Measure, Decide, Delegate
Month 3 is about reading the data from the systems you just installed and making the next round of decisions, including the first one about not doing everything yourself.
The pipeline review is the first thing. After 30+ days of running a real sales motion, you have data. Where are leads stalling? Which sources convert? Which customer profiles close fast vs. slow? The answers reshape the next quarter's priorities. They also usually surface that you're spending time on the wrong customer segment — that's the most common Month 3 finding.
The first delegation decision is the second thing. Founders default to doing everything because they're cheap and accountable. That breaks somewhere around month 3-6 of operating with real customers. The question isn't "do I need help?" — it's "what's the highest-leverage thing to take off my plate first?" The full breakdown lives in The Delegation Decision. For most post-MVP founders, the answer is fractional support before full-time hires — a fractional CTO, a fractional ops person, or a part-time SDR — because the cost of a wrong full-time hire at this stage is brutal.
The 90-day plan is the third thing. By Month 3, you have enough signal to plan the next quarter with conviction. Not vibes. Specific bets, specific bets-against, and specific check-in points. The output is a one-page plan you actually believe.
### What Founder Coaching Isn't
A few things I've watched the term "founder coaching" get used for that Founder's Cut explicitly isn't.
It isn't therapy. Founders carry real psychological weight, and that work matters. It's not what coaching is for. If you're navigating burnout, anxiety, or relationship stress, see a therapist. The coaching work assumes you have the bandwidth to make decisions; if you don't, the order of operations is wrong.
It isn't an accelerator. Accelerators are 1-to-many programs that teach a generic curriculum to a cohort. Coaching is 1-to-1 work on your specific situation. Both have value; they're different products.
It isn't a course. Courses are content. Coaching is decisions. You can read 50 books on prioritization and still freeze when you have to choose between two real features. The framework is the structure; the conversation is what gets you to commit.
It isn't a CTO. If your code is on fire, you don't need coaching — you need engineering help. A fractional CTO is a different role and you can run them in parallel with a coaching engagement if both are needed.
### How to Run Founder's Cut (With or Without a Coach)
If you want to run the framework yourself, here's the abbreviated version.
Block out 90 days. Commit. The temptation to abandon the structure 3 weeks in is very real — resist it.
Month 1: write the one-page document — customer, problem, kill list. Show it to three people who'll push back honestly. Edit. The kill list should hurt a little. That's the founder's cut.
Month 2: install the three systems. Sales motion, prioritization, weekly rhythm. Each one is a 1-2 day project, not a 30-day project. The 30 days are for running them, not designing them.
Month 3: review what the systems told you. Decide on the first piece of leverage you're adding (delegation, hire, fractional, automation). Write the next 90-day plan.
If you want help running it — or want to make sure the cuts in Month 1 are honest — that's what the coaching engagement is for. Book a strategy call and I'll walk you through whether the framework is the right fit for where you are. If you're earlier than post-MVP, start with the 6-week MVP framework first — coaching at the wrong stage is wasted money.
### Frequently Asked Questions
**Q: What is founder coaching?**
A: Founder coaching is structured 1:1 work with someone who has shipped products and run companies, focused on the specific decisions a founder is making right now. It's narrower than executive coaching (which is mostly about leadership and self-management) and broader than a fractional CTO engagement (which is mostly about technical leadership). A good founder coach pushes on product decisions, sales motions, and how the founder spends their time. They don't run plays for you — they make sure you're running the right ones.
**Q: How is Founder's Cut different from generic founder coaching?**
A: Most founder coaching and executive coaching for founders is mindset work — confidence, communication, fear, focus. Useful, but not what's broken at the post-MVP stage. Founder's Cut assumes you have the founder fundamentals already and need help deciding what to do, not how to feel about doing it. The 90-day arc — Clarity, Systems, Velocity — moves through specific decisions you need to make about your product, customers, and time. Less psychology, more decision-making. The name comes from Month 1: the kill list. Most founders fail not because they didn't build enough — because they didn't cut enough.
**Q: Who is this for?**
A: Founders and builders who shipped something real — usually with vibe coding tools, no-code, or a small team — and are now staring at a list of 50 next moves with no obvious priority. You probably have early traction, some customers or strong intent signals, and a roadmap that's mostly guesses. You don't need help building. You need help deciding. If you're pre-product or pre-traction, this isn't the right framework yet — go ship something first.
**Q: Why 90 days?**
A: Three months is long enough to surface real patterns in your business and short enough to force decisions. Most founder coaching engagements are open-ended retainers, which lets the work drift. A 90-day arc forces every conversation to lead somewhere. Month 1 is for cutting and clarifying. Month 2 is for installing repeatable systems. Month 3 is for measuring and deciding what comes next. After 90 days, most founders either don't need ongoing coaching or have a clear reason to extend.
**Q: How much does founder coaching cost?**
A: Founder's Cut runs $1,500/month with a 3-month minimum, or $3,999 paid in full. That includes 2 monthly 1:1 calls, async Slack access, and a small group cohort. CEO coach engagements typically run $500-3,000/hr or $5,000-15,000/month. Executive coaching for founders at the partner level can hit $30,000+ per quarter. The pricing here is intentionally lower than CEO coach rates because the focus is narrower — product and operating decisions, not full executive leadership.
**Q: Can I run this framework on my own?**
A: Yes, and a lot of the value is in the framework itself, not the coaching. The three phases — Clarity, Systems, Velocity — work whether you're running them solo or with a coach. The hard part of doing it solo is honesty. Most founders skip the cutting phase because it's emotionally expensive to kill features and ideas. A coach makes that part faster. If you have a co-founder or peer who'll push back honestly, you can run the framework yourself. If you don't, that's the role a coach plays.
**Q: What happens after the 90 days?**
A: Three outcomes are common. About a third of founders are clearly ready to operate independently — they have a system, a pipeline, and a plan. About a third want a lighter touch — quarterly check-ins or async-only support to keep the systems running. The last third have grown into a different problem — they need a fractional CTO, a fractional product manager, or a real first hire — and the engagement transitions into help finding that. The framework's job is to make the next move obvious, not to make you dependent.
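For readers comparing the two payment options quoted in the pricing answer above, the arithmetic is short enough to check directly. The snippet below is only a worked calculation using the two prices stated in this article; the variable names are illustrative.

```python
# Prices as stated in the article (as of 2026).
MONTHLY_RATE = 1_500   # dollars per month
MIN_MONTHS = 3         # minimum commitment
PAID_IN_FULL = 3_999   # dollars, one payment

# Cost of the minimum engagement when paying monthly.
monthly_total = MONTHLY_RATE * MIN_MONTHS

# Difference between paying monthly and paying up front.
savings = monthly_total - PAID_IN_FULL
savings_pct = savings / monthly_total * 100

print(f"Monthly plan over {MIN_MONTHS} months: ${monthly_total:,}")
print(f"Paid in full: ${PAID_IN_FULL:,}")
print(f"Difference: ${savings:,} ({savings_pct:.1f}%)")
```

Over the 3-month minimum, the monthly option totals $4,500, so paying in full comes out $501 lower, roughly 11%.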
---
## Vibe Code Rescue Case Study: From Broken AI MVP to Production in 6 Weeks
- **URL:** https://justinmckelvey.com/blog/vibe-code-rescue-case-study
- **Published:** April 26, 2026
- **Updated:** April 26, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 10 min
- **Description:** How we rebuilt a YC-backed B2B SaaS MVP shipped on AI-generated code with no auth, no payments, no onboarding. Fixed-price rescue, 6 weeks, $40K vs $150K agency quote.
### TL;DR
A YC-backed B2B SaaS shipped its MVP using AI tools that generated a Vercel + Firebase codebase. The UI looked done. The product wasn't. There was no real authentication. No payment processing. No onboarding flow. No data integrity. The "Stripe integration" was a button. The "user accounts" were unguarded Firebase reads. Customers couldn't actually pay them, and the early ones who tried hit dead ends fast.
We rebuilt the product in 6 weeks for a fixed price of $40K, replacing an agency quote of $150K and 4 to 6 months. Real Stripe billing shipped in week 2. Auth, onboarding, and data integrity shipped by week 4. The founder closed their first paying customers in week 5. Production has been stable since launch — zero customer-facing incidents in the 90+ days since.
This is a case study about one engagement, but the pattern is true of every vibe-coded MVP I've seen at month four. If you've shipped on AI-generated code and you're hitting walls, the diagnosis below probably matches what's on your screen right now.
### The State We Inherited
The founder is technical-adjacent — knows enough to ship, not enough to know what's missing. They built the MVP using AI tools that generated a Vercel + Firebase project. The dashboard rendered. The marketing site looked sharp. The login button worked, in the sense that clicking it changed what was on the screen.
Underneath, here's what we found on day one:
• No real authentication. The "logged in" state was a client-side flag. Firebase rules were either wide-open or had been copy-pasted from a tutorial without anyone understanding what they did. Any user could read or write any other user's data by changing a parameter in a request. This wasn't an obscure security flaw — it was the architecture.
• No payment processing. The pricing page had three tiers, three "Start Free Trial" buttons, and a fully designed checkout UI. None of it talked to Stripe. The buttons logged events and routed users to a "Thanks!" page. Customers who tried to pay couldn't.
• No onboarding flow. Successful new accounts had no path from signup to first value. The empty-state screens existed in Figma. They didn't exist in the app.
• No data integrity. The Firebase schema had no validation, no required fields, and no relationships. Records existed in inconsistent states because the AI-generated frontend was the only thing enforcing structure, and it didn't do so consistently.
• No deploy story. "Production" was a Vercel preview that the team had agreed to call production. There was no rollback path, no staging environment, and no way to test changes before they hit users.
• No observability. When something broke, the team found out from customer emails. There were no error logs, no monitoring, and no analytics that actually fired correctly.
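The first finding in the list above, a client-side "logged in" flag with no per-record access control, is the kind of gap a server-side ownership check closes. The sketch below is framework-agnostic Python, not the project's actual code and not Firebase rules; `Record`, `Forbidden`, and `read_record` are hypothetical names used only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A stored item that belongs to exactly one user (hypothetical shape)."""
    record_id: str
    owner_id: str
    body: str


class Forbidden(Exception):
    """Raised when the authenticated user does not own the requested record."""


def read_record(store, authenticated_user_id, record_id):
    """Return a record only if the authenticated user owns it.

    authenticated_user_id must come from a session verified on the server,
    never from a request parameter or a client-side flag. A missing
    record_id raises KeyError from the dictionary lookup.
    """
    record = store[record_id]
    if record.owner_id != authenticated_user_id:
        raise Forbidden(f"user {authenticated_user_id} cannot read {record_id}")
    return record


# Tiny in-memory example: two users, one record each.
store = {
    "r1": Record("r1", owner_id="alice", body="Alice's data"),
    "r2": Record("r2", owner_id="bob", body="Bob's data"),
}

print(read_record(store, "alice", "r1").body)
try:
    # Changing the record id in the request no longer exposes another user's data.
    read_record(store, "alice", "r2")
except Forbidden as error:
    print("Denied:", error)
```

The point of the design is that the ownership decision lives in one server-side function rather than in whatever the frontend happens to display.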
On the kickoff call I asked the founder to walk me through their Stripe integration. They opened the pricing page and clicked "Start Free Trial." A modal popped up — three input fields, a submit button, the words "Stripe Powered" in light gray at the bottom of the modal. They submitted the form. A green check animated. The page redirected to a "Welcome!" screen. I asked, "And then Stripe charges them?" The founder paused. "Well, no. We follow up by email. Eventually." That was the integration.
None of this is unusual. This is what every vibe-coded MVP looks like at month four. AI tools ship the parts of a product that are easy to demo. They skip or stub the parts that actually keep a business running. Founders don't realize what's missing until the first real customer tries to give them money.
### Why the AI Codebase Couldn't Be Patched
The first thing every founder asks is some version of: "Can we just fix what's there?" Sometimes the answer is yes. In this case it was no. Three reasons:
1. The architecture wasn't designed — it was assembled. The AI generated each feature in isolation, optimizing for whatever was on the screen at that moment. The result was a codebase with no shared abstractions, no consistent error handling, and no cohesive data model. Every feature was its own little island, often with its own little Firebase calls duplicating logic that should have lived in one place. Patching means rewriting the patches, then rewriting them again next month when the next thing breaks.
2. The missing pieces aren't optional. Authentication isn't a feature you add later — it's the substrate everything else sits on. Same for payments, onboarding, and data integrity. Bolting these onto a codebase that wasn't designed with them in mind is more work than rebuilding around them from the start. The seams show forever.
3. The stack was wrong for the team. Vercel + Firebase scales in some ways and absolutely doesn't in others. For a small team without dedicated DevOps, it generates ongoing operational complexity that the team can't actually pay down. Every change risks breaking something. The team starts shipping less, not more.
I lay out this same pattern in my breakdown of vibe coding — AI-generated code optimizes for the appearance of progress. Real progress requires the boring parts.
How We Rebuilt It
The rebuild's job was to right-size the architecture to the team that actually exists. The team is one technical-adjacent founder and one part-time contractor. They don't have a DevOps person. They never will. Whatever we shipped had to be something they could operate, debug, and extend on their own after we handed it back. That single constraint shapes every other decision in a rescue.
What that meant in practice: a real relational data model with enforced consistency, real authentication with sane defaults, a payments integration that actually charges money, and a deployment story with rollback that doesn't require a runbook. Stack-wise we chose Ruby on Rails 8 with Hotwire on the frontend and SQLite in production — total monthly hosting cost under $25 — because that combination is genuinely production-ready for a B2B SaaS at this scale and it's a stack a small team can run for years without inheriting infrastructure work. The stack matters less than the principle: pick something a team your size can actually maintain, not something that requires an ops team you don't have.
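To make "enforced consistency" concrete, here is a minimal sketch using SQLite through Python's standard library. Python is used purely for illustration (the rebuild itself was on Rails), and the table and column names are hypothetical. The point is that the database, not the frontend, rejects records that break the rules.

```python
import sqlite3

# Hypothetical two-table schema; names are illustrative, not the client's real model.
SCHEMA = """
CREATE TABLE customers (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE subscriptions (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    status      TEXT NOT NULL CHECK (status IN ('trialing', 'active', 'canceled'))
);
"""


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn: sqlite3.Connection, sql: str, params: tuple) -> bool:
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = open_db()
new_customer = "INSERT INTO customers (id, email) VALUES (?, ?)"
new_subscription = "INSERT INTO subscriptions (customer_id, status) VALUES (?, ?)"

ok_customer = try_insert(conn, new_customer, (1, "a@example.com"))
ok_subscription = try_insert(conn, new_subscription, (1, "active"))
orphan_subscription = try_insert(conn, new_subscription, (999, "active"))  # no such customer
bad_status = try_insert(conn, new_subscription, (1, "paid-ish"))           # not an allowed status
duplicate_email = try_insert(conn, new_customer, (2, "a@example.com"))     # email already taken
```

The first two inserts succeed. The last three are refused by the schema itself, which is the kind of guarantee a schemaless document store leaves to application code.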
If you're earlier in the process and trying to figure out what to ship in the first place, my MVP development guide covers how to scope this from scratch — same principles, applied earlier.
The Rebuild Sequence (What Got Done in What Order)
Six weeks. Fixed scope. Here's how it sequenced:
• Week 1: Migrated the data model from Firebase to a real relational schema. Stood up Rails 8 with authentication using the built-in generator. Got the team's existing customer list (around 80 records) into the new database with all the cleanup that required.
• Week 2: Real Stripe integration with subscription management, webhooks, and a working customer billing portal. Onboarding flow from signup to first value (the screens that previously only existed in Figma). At this point a customer could sign up and pay, which is something the previous codebase had never been able to do.
• Week 3: Core product features rebuilt as Hotwire-driven flows. UX kept faithful to the original design system, just running on a stack that actually works. Email transactional flows on Amazon SES.
• Week 4: Admin tooling for the founder to manage customers, see usage, handle support. Observability: error tracking, logs, dashboards. Production deploy on Railway with Litestream backups to S3. The team got a real staging environment for the first time.
• Week 5: Polish, edge cases, and migration of the remaining customers from the old system. Founder ran their first paid customer onboarding through the new system end-to-end.
• Week 6: Knowledge transfer, documentation, and handoff. The founder and their part-time contractor took over day-to-day. We stayed available for a 30-day support window for anything that came up.
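The webhook part of week 2 is where UI-only integrations usually fall short, so here is a minimal sketch of the check an endpoint makes before trusting an incoming event. It is written in Python for illustration (the rebuild was on Rails) and assumes Stripe's documented `Stripe-Signature` header format: `t=<timestamp>,v1=<hex HMAC-SHA256>` computed over `<timestamp>.<raw body>`. The secret and event body are placeholders. In a real app, the verification helper in Stripe's official library is a safer choice than hand-rolling this.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(payload: bytes, sig_header: str, secret: str,
                            tolerance_s: int = 300,
                            now: Optional[float] = None) -> bool:
    """Return True only if the header carries a fresh, valid v1 signature for payload."""
    timestamp = None
    signatures = []
    for part in sig_header.split(","):
        key, _, value = part.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            signatures.append(value)
    try:
        sent_at = int(timestamp)
    except (TypeError, ValueError):
        return False
    if not signatures:
        return False
    # Reject stale events to limit replay of a captured request.
    current = time.time() if now is None else now
    if abs(current - sent_at) > tolerance_s:
        return False
    signed_payload = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed_payload, hashlib.sha256).hexdigest()
    # Constant-time comparison against every v1 signature in the header.
    return any(hmac.compare_digest(expected.encode(), s.encode()) for s in signatures)


# Placeholder secret and event, used only to exercise the function.
secret = "whsec_placeholder"
body = b'{"id": "evt_123", "type": "invoice.paid"}'
sent_at = 1_700_000_000
signature = hmac.new(secret.encode(), f"{sent_at}.".encode() + body, hashlib.sha256).hexdigest()
header = f"t={sent_at},v1={signature}"

genuine = verify_stripe_signature(body, header, secret, now=sent_at + 10)
tampered = verify_stripe_signature(body + b" ", header, secret, now=sent_at + 10)
replayed = verify_stripe_signature(body, header, secret, now=sent_at + 3600)
```

Only the genuine event passes. A modified body or a request replayed an hour later is rejected before any billing state changes.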
Outcomes
What the rebuild meant for the founder in time, dollars, stability, and revenue:
Time saved: 6 weeks of focused rebuild vs. the 4 to 6 months the agency had quoted. The faster timeline mattered because the runway clock doesn't pause for a rebuild.
Cost avoided: Fixed price of $40K vs. the $150K agency quote — and vs. the indeterminate cost of trying to keep patching the existing codebase, which would have eaten contractor hours indefinitely. Net savings around $110K for the founder, plus the ongoing operational cost reduction from a simpler stack.
Production stability: Zero customer-facing incidents in the 90+ days since launch. The previous codebase had been generating customer support requests at a rate the founder was personally trying to keep up with. That stopped.
Revenue unblocked: First paying customers closed in week 5. The previous setup had quite literally been incapable of taking money — every "purchase" had been a manual workaround. Once Stripe shipped in week 2, the team's pipeline started converting, and the founder went from operating a demo to running an actual business.
The Pattern (Why This Story Repeats)
Here's the thing every founder I talk to needs to hear: the codebase you're describing is not unusual. The exact same pattern shows up across Lovable, Cursor, Replit Agent, v0, Bolt, and the AI assistants that target Vercel + Firebase or Next.js + Supabase. The tools differ; the failure modes are the same.
What every vibe-coded MVP shares at month four:
1. UI that's 80% complete and 20% theater. The screens look ready. Some of them aren't actually wired to anything.
2. Auth that isn't. Either fully missing, fully open, or implemented in a way that doesn't survive a real attacker (which is to say, any attacker).
3. Payments that don't. Stripe integration that's UI-only, or wired up in a way that breaks on the second customer who tries to use it.
4. Data with no contract. No schema validation, inconsistent records, no foreign key relationships, garbage piling up faster than the team can clean it.
5. Operational debt. No observability, no rollback path, no staging, no backup story. Every deploy is a coin flip.
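The "Auth that isn't" item often comes down to one missing condition in a query. A minimal sketch, in Python with an in-memory SQLite table, assuming a hypothetical `invoices` table with an `owner_id` column. Neither the table nor the function names come from a real codebase.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoices (
    id          INTEGER PRIMARY KEY,
    owner_id    INTEGER NOT NULL,
    total_cents INTEGER NOT NULL
);
INSERT INTO invoices (id, owner_id, total_cents) VALUES (1, 10, 4900), (2, 20, 9900);
""")


def get_invoice_unsafe(invoice_id: int):
    # Trusts whatever id the client sends: any signed-in user can read any invoice.
    return conn.execute(
        "SELECT id, owner_id, total_cents FROM invoices WHERE id = ?",
        (invoice_id,),
    ).fetchone()


def get_invoice_scoped(current_user_id: int, invoice_id: int):
    # The ownership check lives in the query, so a guessed id returns nothing.
    return conn.execute(
        "SELECT id, owner_id, total_cents FROM invoices WHERE id = ? AND owner_id = ?",
        (invoice_id, current_user_id),
    ).fetchone()


# User 10 guesses the id of an invoice that belongs to user 20.
leaked = get_invoice_unsafe(2)
blocked = get_invoice_scoped(current_user_id=10, invoice_id=2)
allowed = get_invoice_scoped(current_user_id=20, invoice_id=2)
```

Here `leaked` contains user 20's invoice, `blocked` is `None`, and `allowed` returns the row to its actual owner. The same idea applies whatever the stack: every read and write is scoped to the authenticated user on the server side.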
This isn't a critique of AI tools — they're remarkable for what they do. The mistake is assuming the tool produced a product when what it produced was a working-looking shell. The shell is real progress. It's just not the whole job.
If you want a deeper read on which AI tools handle which parts of this gracefully, my Claude Code vs Cursor breakdown walks through the differences in how they generate (or don't generate) the foundational systems.
How to Tell If Your Codebase Needs a Rescue
Some quick diagnostic questions. If three or more of these are "no" or "I don't know," your codebase probably needs a rescue:
• Are you certain a real attacker can't steal another user's data? (You should know the answer cold.)
• Can a customer pay you without a manual intervention from someone on your team?
• Does your data model enforce its own consistency, or are you relying on the frontend to keep it clean?
• If today's deploy breaks something, can you roll back in under five minutes?
• When customers report bugs, do you find out from your own monitoring before they tell you?
• Can a new engineer onboard to your codebase in a week, or does the system rely on tribal knowledge?
• Is your stack one that one or two people can operate indefinitely, or does it implicitly assume a DevOps team you don't have?
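As a rough way to apply the three-or-more rule, the checklist can be scored mechanically. This is a minimal sketch in Python. The check names are my own shorthand for the questions above, phrased so that `True` always means "healthy", and an unanswered or unknown check counts as failing.

```python
from typing import Dict, List, Optional

# Hypothetical shorthand for the seven diagnostic questions; True means healthy.
CHECKS = [
    "users_cannot_read_each_others_data",
    "customers_can_pay_without_manual_steps",
    "database_enforces_consistency",
    "rollback_under_five_minutes",
    "monitoring_catches_bugs_first",
    "new_engineer_productive_in_a_week",
    "stack_operable_by_one_or_two_people",
]


def failing_checks(answers: Dict[str, Optional[bool]]) -> List[str]:
    """List every check that is False, None (don't know), or missing entirely."""
    return [name for name in CHECKS if answers.get(name) is not True]


def needs_rescue(answers: Dict[str, Optional[bool]], threshold: int = 3) -> bool:
    return len(failing_checks(answers)) >= threshold


example = {
    "users_cannot_read_each_others_data": None,       # don't know
    "customers_can_pay_without_manual_steps": False,  # someone sends invoices by hand
    "database_enforces_consistency": True,
    "rollback_under_five_minutes": None,              # never tried
    "monitoring_catches_bugs_first": True,
    "new_engineer_productive_in_a_week": True,
    "stack_operable_by_one_or_two_people": True,
}

print(failing_checks(example))
print(needs_rescue(example))
```

With one "no" and two "don't know" answers, the example lands exactly on the threshold and is flagged.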
"Don't know" counts as no. The questions you can't answer cold are the ones already costing you something.
The Vibe Code Rescue Offer
If this case study sounds like your codebase: I do this work as a fixed-price engagement. $25K to $50K, 4 to 8 weeks, audit-first, fully scoped before you sign anything.
The audit is free. I take 20 minutes with your repo, send back a Loom walkthrough flagging the three biggest production risks, and we decide together whether the rescue is the right move. If it isn't, you keep the audit and the next steps anyway.
Compared to the alternatives: a typical agency rebuild quote runs $120K to $200K and 4 to 6 months. Hiring a full-time senior engineer runs $200K+ all-in for the first year and assumes you can find one in this market. A rescue is faster and cheaper than either, and it's scoped against a fixed deliverable instead of an open-ended retainer.
Book a free 30-minute call and we'll decide on the call whether your codebase is a fit for the rescue, the rebuild, or something else entirely. No deck, no pitch, just an honest read on what you've shipped and what it'll take to get to a real product.
Further Reading
• What Is Vibe Coding? — the broader category and why this pattern keeps repeating
• Is Vibe Coding Bad? — the honest case for and against
• Claude Code vs Cursor — how the tools generating these MVPs actually differ
• MVP Development Guide — how to scope this correctly from scratch
• Fractional CTO Cost — pricing comparison for outside engineering help
### Frequently Asked Questions
**Q: What is a vibe code rescue?**
A: A vibe code rescue is the rebuild of an AI-generated codebase that broke when it hit real users. Tools like Lovable, Cursor, Replit Agent, v0, and AI assistants targeting Vercel + Firebase ship working-looking UIs fast, but they typically skip or stub the load-bearing systems: real authentication, payment processing, onboarding flows, data integrity, and production deploys. A rescue takes the working parts of the original codebase, identifies what's missing or fundamentally broken, and rebuilds the foundation — usually on a more durable stack — so the product can actually serve customers.
**Q: Can you rescue a Lovable, Cursor, Replit, or v0 codebase, or only Rails projects?**
A: Yes, any of them. The original tool doesn't matter — what matters is whether the codebase has the structural pieces a real product needs. Most rescues start in JavaScript-heavy stacks (Next.js on Vercel, Firebase, Supabase, or generic React + Node setups). If the existing codebase is salvageable, we'll keep it there. When it isn't, we typically rebuild on a stack designed for one-person and small-team products to ship and maintain end-to-end — for us that's Rails 8 with Hotwire, but the stack choice follows the team, not the other way around. Most of the time, the rebuild is faster than the patch.
**Q: How long does a typical vibe code rescue take?**
A: Most rescues run 4 to 8 weeks from kickoff to production launch. The variable is scope: how much of the original UI we're keeping, how many integrations need real implementations (payments, auth, email, file storage), and how much migration work we're doing if customer data already exists. We scope every engagement against a fixed timeline and fixed price, so founders know what they're getting before signing.
**Q: What's included in the free repo audit?**
A: A 20-minute Loom walkthrough of the codebase identifying the three biggest production risks, plus a written summary with concrete next steps. We look at: authentication implementation, payment and subscription flows, data model integrity, deploy and rollback safety, error handling and observability, and the scale ceiling of the current architecture. Founders get the audit whether or not they hire us afterward — it's a no-strings diagnostic so you understand what you actually shipped.
**Q: Will I lose all my AI-generated work in a rescue?**
A: No. The UI work, design system, and any custom business logic that's actually working get carried forward — usually as Hotwire components if we're rebuilding on Rails. What gets thrown out is the duct tape: half-implemented auth, fake-but-clickable Stripe buttons, Firebase hacks that won't scale past a hundred users, and any code that tried to be too clever for its own good. Most rescues keep 30 to 50 percent of the original codebase. The other 50 to 70 percent gets replaced with code that's actually production-grade.
**Q: How much does a vibe code rescue cost compared to hiring an agency?**
A: Agency quotes for full MVP rebuilds typically run $120K to $200K and 4 to 6 months. A vibe code rescue with us is fixed-price between $25K and $50K and runs 4 to 8 weeks. The price difference comes from two things: we're not rewriting the entire UI from scratch (we keep what works), and we right-size the stack to a team of one or two — meaning the founder can actually maintain it after we hand it back, instead of inheriting a microservice setup that requires an ongoing team to operate.
**Q: What if my codebase is too far gone to rescue?**
A: It happens, but rarely. Even codebases that feel hopeless usually have salvageable design work, customer feedback baked into the UX, and at least some validated business logic. If the rescue assessment shows the rebuild would cost more than starting fresh, we'll tell you that on the audit call and price a clean MVP rebuild instead — same fixed-price model, same timeline range. You get an honest answer either way.
---
## Lovable vs Cursor (2026): Vibe Coding App Builder vs IDE
- **URL:** https://justinmckelvey.com/blog/lovable-vs-cursor
- **Published:** April 20, 2026
- **Updated:** April 20, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 8 min
- **Description:** Lovable vs Cursor: Lovable wins for non-devs shipping in days. Cursor wins for developers building real software. Pricing, limits, when each breaks.
Quick Answer (Verdict After Building With Both)
After building real apps with both, the verdict: Lovable wins for non-developers shipping in days; Cursor wins for developers building real software. Lovable Pro ($25/mo) generates deployable apps from prompts — perfect for founders without code skills. Cursor Pro ($20/mo) makes existing developers 3–5x faster in their IDE — useless if you can't already code. Where each one breaks: Lovable hits an iteration ceiling around v3–v4 of complex apps; Cursor depends entirely on you knowing what to ask.
Updated May 2026 · Built real apps with both · Author: Justin McKelvey, fractional CTO, 50+ products shipped
The Verdict After Building With Both
After building real apps with both, here's the verdict: pick Lovable if you don't code, Cursor if you do. Price isn't the decider. Lovable Pro ($25/mo) takes a prompt and generates a deployable web app — frontend, backend, database, auth, the works. Non-developers can ship something usable in a single afternoon. Cursor Pro ($20/mo) is a VS Code fork that makes existing developers 3–5x faster in their day-to-day editing — it doesn't help if you can't already read code. They're not really competitors.
The trap is picking based on the $25-vs-$20 price tag (they're functionally the same cost) instead of based on whether you can code. Below: when each one breaks, what happens when you outgrow Lovable, and the path from "I can't code" to "I'm shipping real software" if that's the direction you want to go.
What Each Tool Is
Lovable is a browser-based AI app builder. You go to lovable.dev, describe what you want ("a meal planning app where users save favorite recipes"), and Lovable generates a complete web app — frontend, backend, database, auth, the works. You iterate through chat: "add a login screen," "store user preferences," "make the sidebar collapsible." You can deploy with one click. The output is a real React + Tailwind + Supabase app you can host anywhere.
Cursor is a desktop code editor — specifically a fork of VS Code with AI features integrated. You install it, open a project from your local file system, and use tab completion, inline chat, and multi-file edits (Composer) to write code faster. You still have to know how to code. Cursor makes you faster at coding; it doesn't do the coding for you.
The split is categorical: Lovable generates apps. Cursor helps you write apps. Everything else flows from that.
Feature Comparison
| Feature | Lovable | Cursor |
| --- | --- | --- |
| Target user | Non-developers, founders, product people | Developers (beginner to senior) |
| How you interact | Chat — describe what you want | Editor — write and edit code with AI help |
| Code visibility | You can view, but the AI drives | You write it; AI suggests |
| Generates full apps | Yes — that's the main feature | No — you build incrementally |
| Tech stack | React + Tailwind + Supabase (fixed) | Any language, any framework |
| Built-in hosting | Yes, click to deploy | No — use Vercel, Railway, etc. |
| Git / version control | Export to GitHub supported | Native Git workflows |
| Pricing | $25/mo Pro (usage-based) | $20/mo Pro (flat) |
| Ceiling / limitations | Low — breaks down on complex custom features | None — you control everything |
| Learning curve | Very low — describe what you want | Medium — requires coding knowledge |
| Runs on your machine | No — cloud only | Yes — desktop app |
The Core Difference: Who's Driving
In Lovable, the AI drives. You describe outcomes, the AI writes the code and makes architectural decisions. You can view and edit code, but that's not the intended workflow — if you find yourself wanting to edit code manually in Lovable, you've probably outgrown the tool.
In Cursor, you drive. The AI suggests, completes, and assists, but you're the one clicking into files, writing functions, and deciding how code is structured. Cursor is a force multiplier on your existing coding ability.
This distinction determines who each tool is for. If you can't code (or don't want to), Lovable's AI-driven workflow is liberating. If you can code, Cursor's editor-driven workflow is much more powerful — you can do anything Lovable does, plus everything Lovable can't.
When Lovable Wins
• You're a non-technical founder with an idea and no dev team. Lovable can get you to a working app faster than any other tool in 2026, full stop. This is the use case it was built for.
• You're validating a concept before committing to build. Spend a weekend with Lovable, get a working prototype, show it to 10 potential customers. If the concept works, you'll have real conversations to shape the actual build. If it doesn't, you've spent $25 to find out.
• You want to ship an internal tool or one-off app. Event registration page, internal dashboard, simple CRUD app — Lovable handles these cases in hours.
• You're replacing a no-code tool (Airtable + forms + Zapier). A Lovable app with a real database often beats a no-code stack for specific use cases, and you own the code.
• You're teaching someone about software. Watching an AI build your app in real time is one of the best introductions to how modern web apps work — components, state, database, auth — the scaffolding is visible without requiring you to write it.
When Cursor Wins
• You already know how to code. If you can write the code yourself, Cursor makes you 3-5x faster at it. Lovable would slow you down because you'd have to argue with the AI about implementation decisions you already know the answer to.
• You're working in an existing codebase. Lovable generates new apps. If you have a 50K-line production codebase you need to maintain and extend, Cursor is the only relevant option here.
• You need a specific tech stack. Lovable is locked to React + Tailwind + Supabase. If you're building in Rails, Django, Go, Swift, or anything else, you need Cursor.
• You need full control over architecture. Lovable makes decisions about how to structure your app. Those decisions are fine for MVPs but may not match what you'd do if you were architecting carefully.
• You're scaling past MVP. Performance optimization, security hardening, complex business logic, integrations with specific APIs — these work better in a proper editor where you have full control.
Pricing Breakdown (2026)
Lovable pricing:
• Free tier: 5 messages per day, limited to basic app generation. Useful for evaluation.
• Pro ($25/month): 100 daily messages, export to GitHub, custom domains, and private apps. What most serious users pay.
• Teams ($50/month/user): Team management, SSO, private projects, centralized billing.
• Usage overages: Heavy iterators can hit limits and pay per-message. Some founders report $50-150/month when pushing a complex app.
Cursor pricing:
• Free tier: 2,000 completions/month, 50 premium requests. Evaluation only.
• Pro ($20/month): 500 fast premium requests, unlimited slow premium. Standard for individual developers.
• Business ($40/month): Team features, privacy mode.
• No compute cost: Your editor runs on your machine; your hosting is a separate decision.
Real-world cost: A solo founder building a Lovable app and hosting it there pays $25-50/month. A developer using Cursor + Vercel for hosting pays $20 + $0-20 = $20-40/month. Similar. Where Lovable gets expensive is heavy iteration — pushing a complex app through 500+ AI messages can double your bill.
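The arithmetic behind that comparison, written out as a small Python sketch. The dollar figures are the ones quoted in this section. The annual totals are simply those figures multiplied by twelve, and heavy-iteration overages are left out.

```python
from typing import Tuple

Range = Tuple[int, int]  # (low, high) in US dollars per month


def add_ranges(*parts: Range) -> Range:
    """Sum the low ends and the high ends of several monthly cost ranges."""
    return (sum(p[0] for p in parts), sum(p[1] for p in parts))


def annualize(monthly: Range) -> Range:
    return (monthly[0] * 12, monthly[1] * 12)


# Lovable: one bill covers building and hosting.
lovable_monthly = add_ranges((25, 50))
# Cursor: flat Pro subscription plus a separate hosting bill of $0-20.
cursor_monthly = add_ranges((20, 20), (0, 20))

print("Lovable:", lovable_monthly, "per month,", annualize(lovable_monthly), "per year")
print("Cursor + hosting:", cursor_monthly, "per month,", annualize(cursor_monthly), "per year")
```

That works out to $25-50 versus $20-40 a month, or roughly $300-600 versus $240-480 a year, which is why price alone rarely settles the choice.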
The Lovable-to-Cursor Graduation
A common path I see with founders: start with Lovable, graduate to Cursor. Here's how it typically plays out:
1. Weeks 1-4: Founder builds MVP in Lovable. Gets 5-10 users. Validates concept.
2. Weeks 4-8: Feature requests come in. Lovable handles most of them. Founder is still not writing code.
3. Weeks 8-12: A specific feature breaks Lovable's assumptions. Maybe a custom integration, a weird UI requirement, or a performance issue. The founder fights the AI and loses.
4. Month 3+: Founder either (a) hires a developer who opens the Lovable codebase in Cursor and refactors, (b) learns to code and moves to Cursor themselves, or (c) rebuilds the app from scratch in a more maintainable way.
This is not a failure of Lovable — it's Lovable working as designed. Lovable is optimized for speed-to-first-version. After you have product-market fit signal, you need the flexibility of a real development environment, which is where Cursor (or Cursor + Claude Code) takes over.
Which Should You Pick?
If you cannot code and have no intention of learning: Lovable, with your eyes open to the ceiling. It'll get you to a working product faster than anything else. Just know that if your product succeeds, you'll eventually hire a developer who uses Cursor to rebuild what Lovable generated.
If you're willing to learn to code: spend 1-2 weeks in Lovable to build a first version and understand how apps are structured. Then start learning real development with Cursor — you'll have a much better mental model for what the code does.
If you already code: Cursor. Lovable will feel like you're arguing with a less-capable version of yourself.
If you're building enterprise software, anything with sensitive data, or any codebase you expect to maintain for more than a year: Cursor from the start. The "speed" advantage of Lovable disappears by week 3 when you're fighting the AI over architecture decisions.
Alternatives to Both
The Lovable / Cursor decision is only the tip of the vibe coding tool iceberg. Other options:
• Bolt.new — Lovable's closest competitor. Very similar product, different UX.
• v0 by Vercel — AI-powered UI builder, less full-stack than Lovable but strong at frontend.
• Replit Agent — closer to Lovable but with more control and developer options. See Replit vs Cursor for that comparison.
• Claude Code — Cursor's most common companion. Agent-based, works in the terminal.
• Windsurf — VS Code fork with AI. Direct Cursor alternative.
For the full landscape, see Best Vibe Coding Tools in 2026.
Further Reading
• Claude Code vs Cursor — if you're a developer choosing between these
• Replit vs Cursor — the other cloud-vs-local comparison
• What Is Vibe Coding? — the broader context
• Is Vibe Coding Bad? — honest failure modes
• Vibe Coding with Cursor — deeper Cursor workflow guide
If you're a founder picking between these tools for a specific product, book a strategy call. I'll give you a specific recommendation based on what you're building, your technical comfort level, and where you're trying to get to.
### Frequently Asked Questions
**Q: Is Lovable better than Cursor?**
A: They serve different audiences. Lovable is better for non-developers and founders who want to generate working apps from a prompt without learning to code. Cursor is better for developers who already know how to code and want AI to make them faster. Asking which is 'better' is like asking whether a microwave is better than a chef's knife — depends on what you're trying to make and who's making it.
**Q: Can Lovable replace Cursor for a developer?**
A: For simple apps, maybe for the first draft. For any serious work, no. Lovable generates code in the cloud and gives you limited control over the architecture, dependencies, and tooling. A professional developer will outgrow Lovable quickly — usually within the first custom feature that doesn't match Lovable's default templates. Cursor doesn't have this ceiling because you control everything.
**Q: What is Lovable exactly?**
A: Lovable is a browser-based AI app builder. You describe an app in plain English, and Lovable generates a React + Tailwind + Supabase codebase, hosts it, and gives you a working URL. You can iterate by chatting with the AI ('add a login page', 'make the button blue', 'store data about users'). It's designed for non-developers and solo founders who want to ship an MVP without writing code.
**Q: What is Cursor exactly?**
A: Cursor is a desktop code editor — a fork of Microsoft VS Code with AI features integrated. You install it on your laptop, open a project, and use tab completion, in-editor chat, and multi-file edit (Composer). It's designed for developers who already know how to code and want AI assistance inside their normal workflow.
**Q: How much does Lovable cost vs Cursor?**
A: Lovable Pro is $25/month with usage credits, Lovable Teams starts around $50/month. Cursor Pro is $20/month flat. At the entry tier, they're nearly identical in price, but Lovable's usage-based model can get expensive quickly if you're iterating heavily — some founders report $100-200/month as they push an app to production. Cursor's cost is predictable because it runs on your hardware.
**Q: Can Lovable and Cursor work together?**
A: Yes. Lovable apps can be exported to Git, which means you can clone them and open the code in Cursor for further development. This is actually a common pattern: scaffold in Lovable, iterate on the high-level structure with the AI, then export and continue in Cursor once you need more control. The catch is that Lovable-generated code can be hard to maintain — it's AI-structured, not human-structured.
**Q: Is Lovable for developers or non-developers?**
A: Primarily non-developers and solo founders. Lovable's UX and chat-first workflow assume you're describing outcomes, not writing code. Developers will find it limiting within a few hours. The sweet spot for Lovable is someone who has a product idea, no dev team, and wants to test the concept before hiring anyone — or before committing to learning to code themselves.
**Q: What's the best Lovable alternative?**
A: The closest alternatives are Bolt.new (similar prompt-based app builder), v0 by Vercel (UI-focused), and Replit Agent (broader but similar vibe). For non-developers who outgrow Lovable, the path is usually to hire a developer who uses Cursor, or to learn to code and graduate to Cursor yourself. There isn't a middle-ground tool that bridges the two categories.
---
## Replit vs Cursor (2026): Which AI Coding Tool Wins?
- **URL:** https://justinmckelvey.com/blog/replit-vs-cursor
- **Published:** April 20, 2026
- **Updated:** May 12, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 10 min
- **Description:** Replit vs Cursor compared. Replit wins for collaborative prototyping. Cursor wins for shipping production code. Pricing, free tiers, when to switch.
Quick Answer (Verdict by Project Type)
Replit vs Cursor verdict by project type: Replit wins for collaborative prototyping, education, and hackathon demos. Cursor wins for shipping production code, solo development, and codebases over ~50 files. Paid pricing is similar ($20-25/mo). Replit is browser-based and built around multiplayer/sharing; Cursor is a desktop IDE built around speed and depth. If you're a non-developer or just learning, start with Replit. If you're shipping software for a job, use Cursor.
Tested May 2026 · Both tools used on real client projects · Author: Justin McKelvey, fractional CTO, 50+ products shipped
The Verdict by Project Type
Replit vs Cursor — which one wins depends entirely on what you're shipping. Replit wins for collaborative prototyping (multiplayer/share-a-link), classroom and education use, hackathon demos, and getting started in seconds without installing anything. Cursor wins for production codebases over ~50 files, solo developer speed, and any work where the AI needs deep context across the project. Both run around $20-25/month for paid tiers — the browser-vs-desktop split, not pricing, is what should drive the decision.
If you're a non-developer learning, or you need to share live code with someone who'll edit alongside you, pick Replit. If you're shipping software that pays your bills, pick Cursor. Below is the full alternatives comparison for the long tail of situations in between.
Best Replit Alternatives in 2026 (Quick Reference)
| Alternative | Best for | Free tier? | Paid pricing |
| --- | --- | --- | --- |
| Cursor | Professional developers, large codebases | Yes (2,000 completions/mo) | $20/mo Pro |
| Bolt.new | Vibe-coding full-stack apps from a prompt | Yes | $20/mo Pro |
| Lovable | Non-coders building apps in the browser | Yes | $25/mo Pro |
| GitHub Codespaces | Cloud VS Code with full Linux containers | 60 hrs/mo free | $0.18/hr after |
| CodeSandbox | Free browser IDE for web dev | Yes | $9/mo Pro |
| StackBlitz | Instant browser environments (Node, React, Angular) | Yes | $8/mo Pro |
| Windsurf | AI desktop IDE with cascading agents | Yes | $15/mo Pro |
Cursor is the alternative most founders and developers end up choosing, so the rest of this guide compares Replit vs Cursor in depth. If you want the broader landscape, see Best Vibe Coding Tools 2026.
TL;DR: Replit vs Cursor in 2026
Replit is a browser-based coding environment with real-time collaboration and an autonomous AI agent. Cursor is a desktop IDE (forked from VS Code) with Copilot-style AI integration. They solve different problems. Replit wins at collaboration, quick prototypes, solo founders building their first app, and teaching. Cursor wins at professional development, large codebases, and deep editing work. As of 2026, Replit costs $20/month for the agent tier, Cursor is $20/month for Pro — similar sticker price, very different workflows.
I'm a fractional CTO who's used both tools across client projects. This is the honest comparison — not marketing fluff, not Twitter takes — just what each tool is actually good at and when to reach for which. If you're also considering Cursor vs Claude Code, read the Claude Code vs Cursor comparison first; that's a more common decision point for professional developers.
What Each Tool Is
Replit is a browser-based coding platform. You go to replit.com, open a "Repl" (their term for a project), and you're coding immediately — no install, no configuration. Every Repl runs in a cloud container, so your code executes in the cloud, not on your machine. Replit includes a full Linux environment, package management, real-time collaboration (multiple people editing the same file at once), and an autonomous AI agent (Replit Agent) that can build entire apps from a prompt, including deployment.
Cursor is a desktop code editor — a fork of Microsoft VS Code with AI features layered in. You install it on your laptop, open projects from your local file system, and use tab completion, in-editor chat, and a "Composer" mode that can edit multiple files at once. Cursor is for professional developers who already know how to code and want the AI to make them faster.
The fundamental split: Replit is cloud-native, collaborative, and beginner-friendly. Cursor is local, solo-developer-focused, and IDE-powerful. Everything else flows from that.
Feature-by-Feature Comparison
| Feature | Replit | Cursor |
| --- | --- | --- |
| Environment | Browser (cloud) | Desktop (local) |
| Setup time | 0 minutes — just go to replit.com | 5-10 minutes (install, sign in, configure) |
| AI agent | Replit Agent (autonomous, cloud) | Composer mode (multi-file edit in editor) |
| Tab completion | Via Replit AI | Native, best-in-class |
| Real-time collaboration | Native — Google Docs-style multiplayer | No — use Git / Live Share extensions |
| Runs your code | In Replit's cloud (always on option) | On your machine |
| Deployment | Built in — deploy from the editor | None — connect to Vercel/Railway/etc. yourself |
| File system access | Cloud only (container) | Full local filesystem |
| Git | Built-in, push to GitHub | Native via standard Git workflows |
| Pricing | Free, Core $20/mo, Teams $40/mo | Pro $20/mo, Business $40/mo |
| Compute cost | Variable (deployments cost extra) | None (runs on your hardware) |
| Learning curve | Very low — beginners can start immediately | Medium — assumes VS Code familiarity |
| Works offline | No — requires internet | Yes (AI features need internet) |
The Core Difference: Where Your Code Lives
Everything else flows from where your code actually lives. In Replit, your code lives in Replit's cloud containers. Every time you run it, execute a command, or deploy, it happens in the cloud. Your machine is just a window into that cloud environment.
In Cursor, your code lives on your laptop's hard drive. When you run it, it runs locally. When you commit, you push to whatever Git remote you've configured. Cursor is just an editor — your project exists independently of whether Cursor is open or even installed.
This distinction determines everything. If you want your code to persist without setup, be accessible from any device, and be shareable via URL, you want Replit. If you want performance, privacy, offline access, and full control of your environment, you want Cursor.
Replit Agent vs Cursor Composer
Both tools have "agent mode" features, but they work very differently.
Replit Agent is an autonomous builder. You describe an app ("build me a to-do list with user accounts and Stripe billing"), and the agent generates a complete project, writes all the files, runs the code, fixes errors, and offers to deploy it — all from a chat interface. You can guide it along the way, but the default behavior is "AI does the entire thing." This is genuine vibe coding: the user doesn't write code, they describe outcomes.
Cursor Composer is a multi-file edit mode inside the editor. You describe a change across your codebase ("add authentication to all protected routes"), and Composer shows you a diff across multiple files. You review the diff, accept or reject individual changes, and commit. The editor is still in charge — Composer augments your editing rather than replacing it.
Practical implication: Replit Agent is better when you don't know exactly what you want and are discovering it through iteration. Cursor Composer is better when you know what you want and need the AI to execute a defined change across many files.
Pricing Comparison (2026)
Replit pricing:
• Free tier: Limited compute, no always-on repls, basic AI access. Fine for learning.
• Replit Core ($20/mo): Includes Replit Agent, unlimited public repls, more compute, 10 always-on repls.
• Teams ($40/mo/user): Private repls, team features, centralized billing.
• Deployments: Extra — free tier for hobby projects, $1-25/month per deployment for production.
Cursor pricing:
• Free tier: 2,000 completions/month, 50 slow premium requests. Evaluation only.
• Pro ($20/mo): 500 fast premium requests, unlimited slow. What most individual developers use.
• Business ($40/mo/user): More fast requests, team privacy controls, centralized billing.
• No compute cost: Runs on your hardware. Your AWS/Railway bill is separate from your Cursor bill.
Real-world cost comparison: For a solo developer building a small app, Replit + 1 deployment ≈ $30-45/month. Cursor Pro + whatever you deploy to ≈ $20 + $5-20 hosting = $25-40/month. Close to parity. Where they diverge is scale: Replit's compute costs grow with usage; Cursor's don't.
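The arithmetic behind that comparison can be sketched in a few lines. This is a rough illustration that uses only the list prices and hosting ranges quoted in this article; the function name `monthly_range` and the constants are made up for the example. The Replit estimate of $30-45 in the text sits in the upper part of the computed range, presumably because it assumes a non-trivial production deployment.

```python
# Monthly list prices quoted in this article (USD).
REPLIT_CORE = 20
REPLIT_DEPLOYMENT = (1, 25)   # low/high cost per production deployment
CURSOR_PRO = 20
EXTERNAL_HOSTING = (5, 20)    # low/high cost of separate hosting (Railway, Vercel, ...)


def monthly_range(base, extra, units=1):
    """Return (low, high) monthly cost: a flat base plus `units` variable add-ons."""
    low, high = extra
    return base + units * low, base + units * high


replit = monthly_range(REPLIT_CORE, REPLIT_DEPLOYMENT)
cursor = monthly_range(CURSOR_PRO, EXTERNAL_HOSTING)

print(f"Replit Core + 1 deployment: ${replit[0]}-{replit[1]}/mo")
print(f"Cursor Pro + hosting:       ${cursor[0]}-{cursor[1]}/mo")
```

Raising `units` for the Replit side (for example, three deployments) shows the scaling point: the Replit bill grows with each deployment, while the Cursor subscription stays flat and only the separate hosting bill changes.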
When to Use Replit
• You're learning to code. Zero setup is a superpower when you're just starting. Replit's tutorials, community repls, and in-browser execution eliminate the first-week friction of "why won't my environment work."
• You're building an MVP from scratch with no dev team. Replit Agent can scaffold an entire app in minutes. For founders who want to see their idea working before hiring a developer, this is the fastest path from concept to clickable demo.
• You're collaborating in real-time. Pair programming, teaching, hackathons, remote whiteboarding on code — Replit's multiplayer editing is in a class of its own.
• You want built-in deployment. Replit Deployments handles hosting, scaling, and HTTPS for you. If "I don't want to set up Railway/Vercel" is the blocker, Replit removes it.
• You need to code from multiple devices. Chromebooks, iPads, work laptops with locked permissions — Replit runs anywhere with a browser.
• You're experimenting with a new language or framework. Spinning up a Go project in Replit takes 30 seconds. Same project locally requires installing Go, setting up VS Code, configuring the debugger, etc.
When to Use Cursor
• You're working in a large existing codebase. Cursor's tab completion, codebase-aware autocomplete, and symbol navigation are indispensable when you're navigating 100K+ lines. Replit struggles with large codebases.
• You need local file system access. Tools that need to read/write local files (desktop apps, CLI tools, system scripts) are painful in a cloud environment.
• You care about editor performance. Cursor edits are instant — no network latency. For serious coding where milliseconds matter, local beats cloud.
• You want to use your existing toolchain. Your VS Code extensions, your Vim keybindings, your custom themes — all transfer directly. Replit has extensions but the ecosystem is narrower.
• You're working offline or on flaky internet. Airplane, coffee shop, traveling — Cursor keeps working; Replit doesn't.
• You're doing professional software engineering with Git-based workflows. PR reviews, branch strategies, rebase workflows, CI/CD integration — this is the standard path for shipping production software.
Can You Use Both?
Yes, and it's a legitimate workflow. A pattern I see with founders working with developers:
1. Prototype in Replit with Replit Agent. Founder uses natural language to describe the app. Agent generates a working prototype in hours.
2. Export to Git. Once the prototype proves the concept, push the Repl to a GitHub repo.
3. Continue in Cursor. The dev team clones the repo, opens it in Cursor, and begins serious engineering: refactoring the generated code, adding tests, scaling the architecture, preparing for production deployment.
4. Ship from Cursor. Final deploys go to production infrastructure (Railway, Vercel, AWS), not Replit's cloud.
This "Replit for speed of first draft, Cursor for quality of final build" pattern is how many solo-founder-plus-fractional-dev teams actually work in 2026.
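For the local half of that handoff, here is a minimal sketch of what the dev team might run once the Repl has been pushed to GitHub. The repository URL, destination folder, and branch name `hardening` are all hypothetical, and the script assumes `git` (2.23 or newer, for `git switch`) is installed. Opening the folder in Cursor afterwards is a manual step.

```python
import subprocess


def handoff_commands(repo_url, dest, branch="hardening"):
    """Build the git commands for the handoff. Nothing is executed here."""
    return [
        ["git", "clone", repo_url, dest],               # pull the prototype down locally
        ["git", "-C", dest, "switch", "-c", branch],    # do the cleanup work on its own branch
    ]


def run_handoff(repo_url, dest, branch="hardening"):
    """Run each command in order, raising on the first failure."""
    for cmd in handoff_commands(repo_url, dest, branch):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical URL: replace with the repo the Repl was pushed to.
    for cmd in handoff_commands("https://github.com/example/prototype.git", "prototype"):
        print(" ".join(cmd))
```

Keeping the refactoring on a separate branch means the original generated prototype stays intact on the default branch for reference.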
Which Is Better for Vibe Coding?
If vibe coding means "describe what you want, let AI write the code," then Replit is more purely vibe-coded. The agent can build a full app without you touching the editor. Cursor still expects you to be in the editor: it makes you faster, but you're still driving.
That said, Cursor's Composer mode gets close to full vibe coding for specific tasks. And tools like Claude Code (which you can run alongside Cursor) are even more agent-like. For the full comparison of vibe coding tools, see the best vibe coding tools in 2026.
The Verdict
For beginners and solo founders: start with Replit. The zero-setup experience is unmatched, and Replit Agent will build you something real faster than you can learn Cursor's keybindings.
For professional developers: use Cursor. The editor-centric workflow, local execution, and deep AI integration give you the fastest path from idea to shipped production code.
For teams: Cursor is the industry standard in 2026. Replit is excellent for specific team workflows (hackathons, teaching, pair programming), but PR-based engineering with Git is the default.
For mixed teams: use both. Replit for scaffolding and prototyping, Cursor for the serious build. The handoff via Git is clean.
Common Alternatives to Consider
Replit and Cursor aren't the only options. Other tools you should evaluate:
• Claude Code — Anthropic's terminal-based AI agent. Complements Cursor for agent workflows.
• Lovable — browser-based app generator, similar to Replit Agent but more consumer-focused. Competes with Replit for the "founder with an idea" use case.
• Bolt.new — browser-based full-stack app builder. Another Replit competitor.
• GitHub Codespaces — cloud dev environments that run VS Code. Competes with Replit for the "cloud IDE" use case but assumes developer-level knowledge.
• Windsurf — VS Code fork with AI, similar to Cursor. Direct Cursor alternative with a different UX.
For the comprehensive list, see Best Vibe Coding Tools in 2026.
Further Reading
• Claude Code vs Cursor — the other big Cursor comparison
• Vibe Coding with Cursor — deeper guide to Cursor workflows
• What Is Vibe Coding? — the broader context
• Is Vibe Coding Bad? — the honest take on hype and failure modes
• Best Vibe Coding Tools 2026 — the complete landscape
If you're a founder deciding which tool to invest in for your startup's development workflow, book a strategy call and I'll give you a specific recommendation based on your team, stage, and tech stack.
### Frequently Asked Questions
**Q: What are the best alternatives to Replit in 2026?**
A: The top Replit alternatives are Cursor (desktop IDE for pros, $20/mo), Bolt.new (free browser-based full-stack builder), Lovable (browser app generator for non-coders), GitHub Codespaces (cloud VS Code, $0.18/hr), CodeSandbox (free browser-based dev, paid tiers from $9/mo), StackBlitz (free instant browser dev environments), and Windsurf (desktop IDE with cascading agents, $15/mo). Cursor is the most-cited alternative for professional developers; Bolt and Lovable are the closest matches to Replit's vibe-coding workflow.
**Q: Are there free alternatives to Replit?**
A: Yes. The best free Replit alternatives in 2026 are CodeSandbox (free tier with full browser IDE), StackBlitz (free browser-based dev for Node, Angular, React), Bolt.new (free tier — generates full-stack apps from prompts), and Cursor (free tier with 2,000 completions/month). GitHub Codespaces gives 60 hours/month free for individual accounts. If you need always-on hosted code execution like Replit's free tier, StackBlitz and CodeSandbox are the closest direct replacements.
**Q: What is the best free Replit alternative?**
A: For browser-based coding with zero setup, CodeSandbox is the closest direct Replit replacement on the free tier. For AI-generated apps from a prompt (Replit Agent's strength), Bolt.new has the most generous free tier. For cloud VS Code with full Linux containers, GitHub Codespaces gives 60 free hours/month — the closest match to Replit's full-Linux Repls. Pick CodeSandbox for learning/prototyping, Bolt for vibe-coding full apps, Codespaces for serious dev.
**Q: Is Replit or Cursor better?**
A: Neither is universally better — they target different use cases. Cursor is better for professional developers working in large codebases locally, where IDE features and tab completion matter. Replit is better for collaborative coding, quick prototypes, teaching, and projects where running code in the cloud is the point. As of 2026, about 60% of professional developers prefer Cursor, while Replit dominates education and solo-founder vibe-coding use cases.
**Q: Can Replit Agent replace Cursor?**
A: For simple apps, yes — Replit Agent can generate, run, and deploy a full app from a prompt, all in the browser. For complex professional work (large codebases, serious refactoring, local system access), no. Cursor's editor-based approach gives you finer control and faster iteration once you're past the initial scaffold. The clean answer: use Replit Agent to bootstrap quickly, then export or continue in Cursor for serious development.
**Q: Which is cheaper, Replit or Cursor?**
A: Cursor Pro is $20/month flat. Replit starts with a free tier, then $20/month for Replit Core (which includes the agent, more compute, and always-on repls). At base tier they're nearly identical. Replit's pricing gets more expensive as you use more compute or need production deployments — a real Replit deployment might cost $25-100/month in resources. Cursor has no compute cost because it runs locally.
**Q: Does Cursor work in the browser?**
A: No. Cursor is a desktop application — a fork of VS Code that you install on macOS, Windows, or Linux. This is intentional: running locally means faster editing, file system access, and no cloud latency. If you need browser-based coding specifically, Replit is the right choice. Trying to use Cursor via a remote desktop or cloud IDE defeats its core advantage.
**Q: Which is better for teams, Replit or Cursor?**
A: Replit is better for real-time collaboration — multiple people can edit the same file simultaneously, like Google Docs for code. Cursor requires Git and normal software engineering collaboration (PRs, code reviews). For teams shipping production software, Cursor + GitHub is the standard. For teaching, pair programming sessions, or hackathons, Replit's live collaboration is hard to beat.
**Q: Can Replit and Cursor connect to each other?**
A: Not directly, but you can git-push from Replit and pull in Cursor, and vice versa. Replit exports to Git repos, and Cursor reads any local Git repo. This is how some developers use both: bootstrap in Replit with the agent, push to GitHub, clone locally, continue in Cursor. The friction is minor once you set up the repo once.
**Q: Is Replit Agent the same as Claude Code or Cursor Composer?**
A: Similar concept, different implementations. Replit Agent runs in Replit's cloud and autonomously builds full apps including deployment. Claude Code runs in your terminal locally and executes any command you can run. Cursor Composer edits multiple files inside the Cursor editor. All three are AI agents, but the environment differs: cloud (Replit), terminal (Claude Code), editor (Cursor). Different agents for different workflows.
**Q: Which tool is best for learning to code?**
A: Replit. Zero setup, runs in the browser, has built-in tutorials, and the agent can explain what code does in plain language. Cursor assumes you already know how to code — it's a productivity tool for developers, not a learning platform. If you're new to programming in 2026, start with Replit or Lovable, then graduate to Cursor when you want more control.
---
## Claude Code vs Cursor: Which AI Coding Tool Wins in 2026?
- **URL:** https://justinmckelvey.com/blog/claude-code-vs-cursor
- **Published:** April 20, 2026
- **Updated:** April 20, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 10 min
- **Description:** Claude Code vs Cursor — 90 days, 12 production apps using both. The verdict, pricing breakdown, and the one project type where each tool wins in 2026.
Quick Answer (90-Day Verdict)
After 90 days using both on 12 production apps, the verdict: run both, but lead with the right one for the job. Claude Code ($5–$50/mo Anthropic API usage) wins for backend work, multi-file refactors, autonomous test runs, and anything requiring command execution. Cursor ($20/mo flat) wins for frontend work, visual UI iteration, and day-to-day editing. The project type where each one loses: Cursor on autonomous refactors that need to touch 20+ files; Claude Code on tight visual UI loops where you want to see the change immediately.
90-day test ending May 2026 · 12 production apps · Author: Justin McKelvey, fractional CTO, 50+ products shipped
TL;DR: The 90-Day Verdict
After 90 days and 12 production apps using both: Claude Code is the better backend tool, Cursor is the better frontend tool, and most professional devs in 2026 run both. Claude Code ($5–$50/mo usage-based) is a terminal-based AI agent that excels at multi-file refactors, autonomous test runs, and any task requiring command execution. Cursor ($20/mo flat) is a VS Code fork that excels at in-editor completions, visual UI iteration, and tight feedback loops. Neither replaces the other; picking the right one as the lead for each project saves you days per feature.
I'm a fractional CTO who shipped 12 production apps using both tools across the last 90 days. This post is the honest verdict — what each one is actually good at, when to reach for which, and the project type where each one loses. No affiliate links, no hype.
What Each Tool Is (In One Sentence)
Cursor is a desktop code editor (specifically a fork of Microsoft VS Code) with AI features layered in: tab completion, in-editor chat, codebase-aware autocomplete, and a "Composer" mode for multi-file edits. You install it, open a project, and it feels exactly like VS Code except the AI is smarter and more integrated.
Claude Code is a command-line tool from Anthropic that runs an AI agent inside your terminal. You type natural language instructions, and Claude Code reads files, writes code, executes shell commands, runs your tests, and iterates until the task is done. There's no IDE — you can use any editor you want alongside it, or no editor at all.
This fundamental difference in form factor drives every other difference between them. Cursor is about making your existing editor smarter. Claude Code is about offloading entire workflows to an autonomous agent.
Feature-by-Feature Comparison
Here's how the two tools stack up across the dimensions that matter most:

| Feature | Cursor | Claude Code |
| --- | --- | --- |
| Form factor | Desktop IDE (VS Code fork) | Terminal CLI agent |
| Tab completion | Yes, best-in-class | No (not an editor) |
| Autonomous file edits | Via Composer mode | Native (the default) |
| Runs shell commands | Limited (via agent mode) | Yes, executes directly |
| Runs tests & fixes errors | With prompting | Native loop: sees errors, fixes, re-runs |
| Multi-file refactors | Good (Composer) | Excellent (no copy-paste) |
| Frontend / visual work | Excellent: see changes live | Blind: no UI preview |
| Backend / infrastructure | Good | Excellent |
| Model choice | Claude, GPT, o1, open-source | Claude Sonnet & Opus only |
| Pricing | $20/mo Pro, $40/mo Business | Usage-based via API ($5-50/mo typical) |
| Learning curve | Low (VS Code users feel at home) | Medium (requires terminal comfort) |
| Works with existing editor | No (it replaces your editor) | Yes, runs alongside anything |
The Core Architectural Difference
If you understand nothing else about Claude Code vs Cursor, understand this: Cursor is a smarter editor, Claude Code is an AI teammate.
When you use Cursor, you're still the one driving. You click into files, you type, you evaluate suggestions, you accept or reject. The AI is a faster version of Stack Overflow built into your cursor (hence the name). It's a force multiplier for your keystrokes.
When you use Claude Code, you hand off a task. "Refactor the auth system to use JWT instead of session cookies." "Add pagination to the posts index with infinite scroll." "Fix the failing CI tests." The agent reads the relevant files, makes the changes across multiple files, runs tests, sees what broke, and fixes it. You show up at the end to review.
This distinction matters because it changes what you're optimizing for. Cursor optimizes for velocity in your current workflow. Claude Code optimizes for eliminating the workflow entirely.
Speed and Iteration Loop
People ask "which is faster" but the question is wrong. They're fast at different things.
Cursor is faster for: typing in general, small edits, exploratory coding, reading code, navigating a codebase, quick bug fixes you already understand. Tab completion is instant. Chat responses are 1-3 seconds. You stay in flow.
Claude Code is faster for: multi-file refactors, test-driven development, debugging cycles that require running code, infrastructure tasks, anything where "try → fail → learn → retry" is the pattern. The agent loop eliminates the human-in-the-middle during iteration.
Concrete example: last week I needed to migrate a Rails app from session cookies to JWT across ~20 files, update all controller tests, and make sure the frontend still authenticated correctly. In Cursor, this would have been a 45-minute manual refactor — file by file, reading each change, running tests manually, debugging failures one at a time. In Claude Code, I described the change in two paragraphs, let it work for 8 minutes, reviewed the diff, and shipped. The speed differential gets bigger as tasks get more systematic.
Conversely: when I'm iterating on a landing page layout, watching the browser refresh with each change, Cursor is dramatically faster. Claude Code's agent loop is pointless when the feedback signal is "does it look right to my eyes."
Pricing Comparison (2026)
Cursor pricing: Simple. $20/month for Pro (includes 500 fast requests on premium models, unlimited slow requests), $40/month for Business (team management, privacy mode, more premium requests). There's a free tier with 2,000 completions and 50 slow premium requests monthly, enough to evaluate but not to work with daily.
Claude Code pricing: Usage-based through the Anthropic API. You pay for tokens consumed. For typical developer usage:
• Light use (a few hours per week): $5-15/month
• Moderate use (daily coding assistance): $20-50/month
• Heavy professional use (all day, every day): $50-150/month
Anthropic also offers a Claude Max subscription at $100/month (5x included usage) or $200/month (20x included usage) that wraps Claude Code costs into a predictable bill. For heavy users this is cheaper than pay-as-you-go API usage.
Combined cost for the typical professional setup: $40-100/month total ($20 Cursor + $20-80 Claude Code API or Max). For a developer billing $100+/hour, the tools pay for themselves in under an hour of saved time per month.
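The break-even claim is easy to check. The sketch below assumes the only inputs are the combined monthly tool cost and the developer's hourly rate, both taken from the figures above; `break_even_hours` is an illustrative name, not part of any tool.

```python
def break_even_hours(monthly_tool_cost, hourly_rate):
    """Hours of saved work per month needed to cover the tool bill."""
    return monthly_tool_cost / hourly_rate


# Low and high ends of the combined $40-100/month range, at $100/hour.
for cost in (40, 100):
    hours = break_even_hours(cost, hourly_rate=100)
    print(f"${cost}/mo at $100/hr: {hours:.1f} h saved per month to break even")
```

At $100/hour the top of the range breaks even at exactly one saved hour, and any higher billing rate pushes the threshold below an hour.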
When to Use Cursor
Reach for Cursor when you're doing the traditional "developer sitting in an editor" work:
• Frontend development — HTML, CSS, React, Vue, Svelte — anything where you need to see the browser update
• Exploratory coding — figuring out how a new API works, poking at unfamiliar code, learning a library
• Small, targeted edits — renaming a variable, fixing a typo, adding a field to a form
• Reading code — jumping to definitions, inline AI explanations of what a function does
• Pair-programming with AI — you write a line, it suggests the next line, you accept or reject
• UI design iteration — tweaking Tailwind classes while watching the preview
• Working in unfamiliar codebases — Cursor's codebase-aware autocomplete shines here
Cursor wins whenever the task involves reading, navigating, or doing small edits inside an editor you're actively driving.
When to Use Claude Code
Reach for Claude Code when you want the AI to take over a whole task:
• Backend refactors — database migrations, auth system changes, API redesigns
• Multi-file changes — anything that touches 5+ files with a consistent pattern
• Test generation — "add tests for this controller" and the agent writes them, runs them, fixes failures
• CI / deployment tasks — writing GitHub Actions configs, Dockerfiles, Railway setup
• Bug hunting — describe the bug, let the agent reproduce it, locate the cause, and propose a fix
• Documentation — "update README to cover these new endpoints" and it does the full pass
• Greenfield feature work — "add a booking system with these requirements" and let it scaffold
Claude Code wins whenever the task can be described once and completed autonomously. The more self-contained the task, the bigger the speed gain over Cursor.
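The two lists above can be condensed into a rough rule of thumb. This is only a sketch of the heuristic as described in this article, not a formal decision procedure. The function name and parameters are hypothetical, and the 5-file threshold comes from the "5+ files" bullet above.

```python
def pick_lead_tool(files_touched, needs_visual_feedback, needs_command_execution):
    """Suggest which tool should lead a task, following the article's heuristics."""
    if needs_visual_feedback:
        # Tight visual UI loops: you want to see each change immediately.
        return "Cursor"
    if needs_command_execution or files_touched >= 5:
        # Systematic, self-contained work that an agent can run end to end.
        return "Claude Code"
    # Small, targeted edits inside an editor you are actively driving.
    return "Cursor"


print(pick_lead_tool(files_touched=20, needs_visual_feedback=False, needs_command_execution=True))
print(pick_lead_tool(files_touched=2, needs_visual_feedback=True, needs_command_execution=False))
```

Visual feedback is checked first because, per the comparison above, an agent loop adds little when the success signal is whether the result looks right on screen.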
Using Both Together: The Real-World Workflow
The biggest misconception about Claude Code vs Cursor is that you have to choose. You don't. The two tools don't conflict: they share no state, no configuration, no IDE chrome. You run Cursor as your editor and Claude Code in a terminal tab, and switch between them based on the task.
My typical workflow on a client project:
1. Morning planning: open Cursor, read through any PRs or tickets, understand what I'm shipping today
2. Backend work: switch to terminal, describe the feature or fix to Claude Code, let it work, review the diff in Cursor's diff viewer
3. Frontend work: back to Cursor with the dev server running in a split terminal, iterate on UI with tab completion and Composer
4. Testing and polish: Claude Code for the broad "add tests for everything I built today" pass, Cursor for the specific edge cases I need to hand-craft
5. Deployment: Claude Code handles the "push, watch the deploy, check logs" loop while I do something else
This split mirrors the traditional frontend/backend split in most development work, which is why it works naturally. The tools align with the kinds of tasks in each domain.
Model Access: The Subtle Difference
Both tools can use Claude models, but the access is different.
In Cursor, you choose the model per-request from a dropdown. Claude Sonnet, Claude Opus, GPT-4o, o1, o1-mini, and several open-source models are all available. Most Cursor users default to Claude Sonnet for the best balance of speed and intelligence. The benefit of Cursor's approach is flexibility — if Anthropic has an outage, you can switch to GPT without changing tools.
In Claude Code, you get Claude models only — Sonnet and Opus. There's no OpenAI option, no open-source option. The benefit is tighter integration: Anthropic builds Claude Code specifically for Claude's strengths (long context, tool use, code understanding), so you get capabilities that a model-agnostic wrapper can't match.
Practical implication: if you want multi-model access or prefer non-Claude models for specific tasks, Cursor is the only choice. If you're willing to commit to Claude's model family, Claude Code gives you the best possible integration.
Verdict: Which Should You Use?
Here's my honest take after 18 months of daily use of both tools:
If you're a beginner: start with Cursor. The VS Code-like interface is familiar, the learning curve is gentle, and tab completion gives immediate value. Add Claude Code to your workflow after 2-3 months once you're comfortable with AI-assisted coding patterns.
If you're a professional developer: use both. Cursor for your editor, Claude Code for autonomous work. The combined cost is $40-100/month, which pays for itself in a single hour of time saved per month.
If you're a fractional CTO, consultant, or founder shipping your own product: same answer, use both. You can't afford not to. The speed advantage of agent-based workflows on infrastructure, testing, and refactoring is too big to ignore.
If you have to pick one: Cursor for frontend-heavy work, Claude Code for backend-heavy work. But try hard not to pick — the combined value is greater than either alone.
Common Mistakes When Choosing Between Them
Mistake 1: Picking based on hype. Twitter will tell you Claude Code has replaced everything. Twitter is wrong. Both tools have specific use cases where they excel.
Mistake 2: Assuming Cursor is "enough." Many developers never try Claude Code because they assume an AI in an editor is enough. They miss out on the agent workflow that compresses 45-minute tasks into 8-minute ones.
Mistake 3: Assuming Claude Code is "enough." The reverse: terminal enthusiasts who think they don't need an IDE. They miss out on the tight feedback loop that tab completion provides for detailed editing work.
Mistake 4: Comparing pricing at face value. Cursor looks cheaper at $20/month vs Claude Code's variable pricing. But for heavy users, Cursor's "fast request" quota caps force you to upgrade or slow down. Claude Code's usage-based pricing scales linearly — heavier users pay more but aren't artificially limited.
What About Other AI Coding Tools?
The Claude Code vs Cursor decision doesn't happen in a vacuum. Other AI coding tools worth knowing in 2026:
• GitHub Copilot — the original AI coding tool. Tab completion in any editor. Cheaper ($10/month) but less capable than Cursor's built-in AI.
• Windsurf — VS Code fork similar to Cursor. Different UX, competitive features. A Cursor alternative worth evaluating.
• Lovable, Bolt, v0 — browser-based tools for generating apps from prompts. Different product category entirely — they replace the IDE with a chat UI. See vibe coding for more.
• Replit Agent — closest competitor to Claude Code. Agent-based, runs in Replit's cloud environment. Worth trying if you want agent workflows but don't want to run things locally.
Further Reading
If you're evaluating AI coding tools more broadly, I've written detailed guides on:
• What Is Vibe Coding? — the broader category these tools fit into
• Vibe Coding with Claude — deeper dive on Claude Code workflows
• Vibe Coding with Cursor — deeper dive on Cursor workflows
• Best Vibe Coding Tools 2026 — the full landscape
• Is Vibe Coding Bad? — the honest case against hype
Or if you're a founder trying to figure out how AI coding tools fit into your broader engineering strategy — book a strategy call and we'll map it to your specific stage and team.
### Frequently Asked Questions
**Q: Is Claude Code better than Cursor?**
A: Neither is universally better — they solve different problems. Claude Code is a terminal-based AI agent that excels at backend work, multi-file refactors, test generation, and executing commands autonomously. Cursor is a VS Code fork that excels at frontend work, live in-editor suggestions, and visual UI iteration. As of 2026, most professional developers use both: Claude Code for architecture and backend, Cursor for frontend and day-to-day editing.
**Q: What's the difference between Claude Code and Cursor?**
A: Cursor is an integrated development environment (IDE) — a fork of VS Code with AI features built in. You open files, use Copilot-style tab completion, and chat with an AI that sees your editor context. Claude Code is an AI agent you run in your terminal. It reads your files, edits them directly, executes commands, runs tests, and iterates on errors autonomously. The core difference: Cursor augments your editing, Claude Code replaces chunks of your workflow entirely.
**Q: Which is faster, Claude Code or Cursor?**
A: Cursor is faster for small edits and autocomplete — tab completion responds in under 200ms. Claude Code is faster for complex multi-file changes because it eliminates the copy-paste cycle. A refactor that takes 20 minutes of manual edits in Cursor might take 3 minutes in Claude Code because the agent runs tests, sees errors, and fixes them without human intervention. Speed depends entirely on task type.
**Q: Can you use Claude Code and Cursor together?**
A: Yes, and most professional developers do. A common workflow: use Claude Code in the terminal for backend changes, database migrations, running tests, and complex refactors. Then switch to Cursor for frontend work where you want to see UI changes live. They share no state — you can run both on the same project simultaneously without conflicts as long as they're not editing the same file.
**Q: How much does Claude Code vs Cursor cost?**
A: Cursor is $20/month for the Pro plan or $40/month for Business (more requests, team features). Claude Code uses Anthropic API usage-based pricing: $5-15/month for light use, $20-50/month for moderate daily use, $50-150/month for heavy all-day use, or $100-200/month on the Max subscription tier with included usage. Combined, most professional developers spend $40-100/month across both tools, still far less than the value of the time saved.
**Q: Is Cursor better than Claude Code for beginners?**
A: Yes. Cursor is easier to start with because it's a familiar IDE — if you've used VS Code, you already know the interface. Claude Code requires comfort with the terminal and a mental model of what the agent is doing. Beginners should start with Cursor, then add Claude Code to their workflow after 2-3 months of AI-assisted coding experience.
**Q: Does Cursor use Claude or GPT?**
A: Cursor lets you choose — it supports Claude Sonnet, Claude Opus, GPT-4, GPT-4o, o1, and several open-source models. Most Cursor users default to Claude Sonnet for the best balance of speed and quality. Claude Code is Anthropic's own product and uses Claude models exclusively (Sonnet and Opus). If you want GPT-based coding, Cursor is your only option between the two.
**Q: Can Claude Code replace Cursor entirely?**
A: Not for most developers. Claude Code is powerful for agent-style workflows, but you still need an editor for reading code, navigating files, and making quick edits. Developers who use Claude Code exclusively typically pair it with plain VS Code or a Vim setup — they still need an editor, just not one with AI features. The sweet spot is Claude Code plus Cursor, with each handling what it's best at.
---
## Fractional CTO vs. Full-Time CTO: Which One Does Your Startup Actually Need?
- **URL:** https://justinmckelvey.com/blog/fractional-cto-vs-full-time-cto
- **Published:** April 19, 2026
- **Updated:** April 19, 2026
- **Category:** Fractional CTO
- **Reading time:** 7 min
- **Description:** Fractional CTO vs full-time CTO — which does your startup need? Decision framework by stage, team size, budget, and when to transition.
### TL;DR: The Decision Framework
Most startups before Series A should hire a fractional CTO. Most startups after Series B should hire full-time. The messy middle, Series A to Series B, depends on your team size, technical complexity, and burn rate. This isn't just about cost (though a fractional CTO saves 60-75%). It's about what type of leadership your company actually needs at each stage. After 15 years of operating in both roles across 50+ products, I've seen founders waste $300K hiring a full-time CTO too early and I've seen founders lose $500K in preventable mistakes by hiring one too late.
### What a Fractional CTO Gives You
Strategic leadership without the full-time commitment. A fractional CTO works 10-20 hours/week with your company, typically managing 2-4 clients simultaneously. They bring the same caliber of experience as a full-time CTO (architecture decisions, team management, vendor evaluation, technical strategy) at $5,000-15,000/month instead of $300K+/year.
What you get:
Speed to start. A fractional CTO can begin in 1-2 weeks. A full-time CTO search takes 3-6 months. If you need technical leadership now, fractional is the only option that doesn't involve months of waiting.
Breadth of experience. A fractional CTO who works with 3-4 companies sees patterns across industries, tech stacks, and growth stages. They've solved your problem before — probably multiple times. A full-time CTO brings depth in one company's context but less cross-company pattern recognition.
Lower risk. If the relationship doesn't work, you part ways with 30 days' notice. No severance, no awkward board conversations, no 6-month search to replace them. The cost of a bad fractional CTO hire is 1-2 months of retainer ($10K-30K). The cost of a bad full-time CTO hire is $200K-400K when you factor in salary, equity, severance, and replacement recruiting.
Flexibility. Start at 10 hours/week. Scale to 20 during a critical period. Drop back to 5 during a quiet stretch. You can't do this with a full-time hire.
### What a Full-Time CTO Gives You
Dedicated, always-on technical leadership. A full-time CTO eats, sleeps, and breathes your company. They're in every standup, every architecture discussion, every production incident. They build culture, mentor the entire engineering team, and represent the technical vision to investors and customers.
What you get that a fractional CTO can't fully provide:
24/7 availability. When production goes down at 2am on a Saturday, the full-time CTO is in the war room. A fractional CTO has an on-call agreement, but their response is measured in hours, not minutes. For companies with critical uptime requirements, this matters.
Cultural leadership. Engineering culture is built through daily interactions — code review tone, decision-making speed, risk tolerance, quality standards. A full-time CTO shapes culture by being present every day. A fractional CTO can influence culture, but they can't define it from 15 hours/week.
Team management at scale. Managing 3-5 engineers part-time is doable. Managing 15-20 engineers part-time is not. Above 10-12 engineers, the management overhead (1:1s, performance reviews, career conversations, cross-team coordination) requires full-time attention.
Investor and board representation. At Series A and beyond, investors expect a named CTO who's fully committed. A fractional CTO can present technical strategy in board meetings, but the optics of a part-time technical leader can raise questions during due diligence.
### The Decision by Stage
#### Pre-Seed / Bootstrapping → Fractional CTO
Budget: $0-500K total. Engineering team: 0-3. Biggest need: make the right architecture decisions before writing code.
At this stage, you can't afford a $300K CTO and you don't need 40 hours/week of technical leadership. You need someone to help you choose the right tech stack, scope the MVP correctly, evaluate early engineering hires, and avoid the architectural mistakes that cost $50K-200K to fix later.
A fractional CTO at $5,000-8,000/month is the highest-ROI investment a pre-seed startup can make. It's less than one month of a full-time CTO's salary and prevents the mistakes that kill companies at this stage.
#### Seed Stage → Fractional CTO (Usually)
Budget: $500K-3M raised. Engineering team: 3-8. Biggest need: technical leadership for a growing team while the founder focuses on sales and fundraising.
This is the sweet spot for fractional CTO engagement. The team is big enough to need leadership but small enough that 10-15 hours/week of senior attention covers it. The day-to-day work — code reviews, sprint planning, 1:1s, architecture decisions — fits comfortably in a part-time cadence.
The exception: if your product IS the technology (AI model, deep infrastructure, novel algorithms), you may need a full-time CTO even at seed stage because the technical decisions are too frequent and complex for part-time involvement.
#### Series A → Evaluate the Transition
Budget: $3M-15M raised. Engineering team: 8-15. Biggest need: scale the team, build engineering processes, represent tech to the board.
This is the transition zone. Some Series A companies thrive with a fractional CTO at $12,000-15,000/month (full engagement level). Others need to hire full-time because the team management overhead exceeds what part-time allows.
Stay fractional if: Your team is under 10 engineers. You have a strong engineering manager who handles daily team operations. Your product is stable and iterating rather than building net-new architecture. The fractional CTO has been with you since pre-seed and has deep context.
Go full-time if: Your team exceeds 12 engineers. You're building complex new systems (not just iterating). Investors are asking for a named CTO. You need someone in the office 5 days/week for culture and management.
#### Series B+ → Full-Time CTO
Budget: $15M+. Engineering team: 15+. Biggest need: organizational leadership, technical vision at scale, board-level representation.
At this stage, the engineering organization is too large and complex for part-time leadership. You need a full-time CTO who owns hiring plans, manages engineering managers, sets multi-year technical strategy, and represents the engineering function to the board and to customers.
The best outcome here: your fractional CTO either transitions to full-time (they already know everything) or helps you hire their replacement and mentors the new CTO through the transition over 2-3 months.
### The Transition Playbook: Fractional to Full-Time
Every fractional CTO engagement should plan for one of three exits:
#### Path 1: The Fractional CTO Goes Full-Time
This is the ideal outcome when the fit is right. After 6-12 months of fractional work, both sides know if it works. The CTO already understands the codebase, team dynamics, business goals, and company culture. There's no ramp-up period, no recruiting risk, and no onboarding cost.
Negotiate the transition: the fractional rate converts to a salary, benefits start, and equity is granted (typically 1-3% for a Series A-stage full-time CTO). The key is having this conversation early — both sides should know by month 3-4 whether full-time is a possibility.
#### Path 2: The Fractional CTO Hires Their Replacement
If the fractional CTO doesn't want to go full-time (many prefer the fractional model), the next best outcome is having them run the search for a full-time CTO. They know what the role requires, what technical skills matter, and what personality fits the team. This produces significantly better hires than a recruiter-led search because the evaluator has deep context about the actual job.
Timeline: 2-3 months to find and hire, plus 1-2 months of overlap where the fractional CTO onboards the new full-time CTO and transfers context.
#### Path 3: Promote from Within
The fractional CTO identifies and mentors a senior engineer into the CTO role over 6-12 months. This works when you have a strong internal candidate who has the technical skills but needs coaching on leadership, strategy, and stakeholder management.
This is the most cost-effective path and produces the highest retention — the new CTO already has deep context and team trust. The fractional CTO transitions to an advisory role (5 hours/month) to support the new CTO through their first 6 months of leadership.
### The Side-by-Side Comparison
| Factor | Fractional CTO | Full-Time CTO |
|---|---|---|
| Cost | $60K-180K/year | $300K-500K+/year |
| Time to start | 1-2 weeks | 3-6 months |
| Hours/week | 10-20 | 40-60 |
| Availability | Business hours + on-call | Always on |
| Equity | 0-0.5% | 1-5% |
| Exit risk | Low (30-day notice) | High ($100K+ to replace) |
| Team size limit | Up to 10-12 | Unlimited |
| Best for | Pre-seed to Series A | Series B+ |
| Cross-company insight | High (sees 3-4 companies) | Low (one company focus) |
| Culture building | Moderate | Strong |
### Making Your Decision
Ask yourself these five questions:
1. Is your engineering team over 12 people? If yes → full-time. If no → fractional works.
2. Do you have $300K+/year budgeted for a CTO? If yes → full-time is an option. If no → fractional is the only option.
3. Do you need someone on-call 24/7? If yes → full-time. If no → fractional works.
4. Are investors requiring a named full-time CTO? If yes → full-time. If no → fractional works.
5. Is your technology the product? (AI models, infrastructure, novel algorithms) If yes → lean toward full-time even at earlier stages. If no → fractional works longer.
If four or more of your answers point to fractional, start fractional. You can always transition to full-time later, and you'll make a better full-time hire because you'll know exactly what the role requires from having a fractional CTO do it first.
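The five questions can be sketched as a small scoring function. This is only an illustration: the `StartupProfile` fields and function names are hypothetical, and the "evaluate full-time" label for fewer than four fractional answers is an assumption, since the text only says what to do at four or more.

```python
from dataclasses import dataclass


@dataclass
class StartupProfile:
    """Answers to the five questions (field names are illustrative)."""
    engineers: int                     # Q1: engineering headcount
    cto_budget_usd: int                # Q2: annual budget available for a CTO
    needs_24_7_on_call: bool           # Q3
    investors_require_full_time: bool  # Q4
    technology_is_the_product: bool    # Q5


def fractional_votes(p: StartupProfile) -> int:
    """Count how many of the five answers point toward a fractional CTO."""
    answers = [
        p.engineers <= 12,                  # "over 12 people" means full-time
        p.cto_budget_usd < 300_000,         # under $300K: fractional is the only option
        not p.needs_24_7_on_call,
        not p.investors_require_full_time,
        not p.technology_is_the_product,
    ]
    return sum(answers)


def recommend(p: StartupProfile) -> str:
    """Apply the 4-of-5 rule of thumb; the fallback label is an assumption."""
    return "start fractional" if fractional_votes(p) >= 4 else "evaluate full-time"


seed_stage = StartupProfile(5, 120_000, False, False, False)
print(fractional_votes(seed_stage), recommend(seed_stage))  # 5 start fractional
```

A profile with 20 engineers, a $400K budget, a 24/7 on-call requirement, and investor pressure scores only one fractional answer, which matches the Series B+ guidance earlier in the article.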
For the full breakdown, see the guides on what a fractional CTO costs and what a fractional CTO actually does day-to-day. If you're ready to explore whether fractional CTO support makes sense for your stage, book a strategy call.
### Frequently Asked Questions
**Q: What is the difference between a fractional CTO and a full-time CTO?**
A: A fractional CTO works part-time (10-20 hours/week) across 2-4 companies, providing senior technical leadership at $5,000-15,000/month. A full-time CTO is dedicated to one company 40-60 hours/week at $250,000-400,000/year plus equity. Both make the same types of decisions — architecture, hiring, strategy — but a fractional CTO's time is divided.
**Q: When should a startup hire a full-time CTO instead of fractional?**
A: Hire full-time when: your engineering team exceeds 10-15 people, you need someone on-call 24/7 for production systems, you're post-Series A with the budget to support a $300K+ salary, and the technical complexity requires daily hands-on involvement. Before these thresholds, fractional is almost always the better choice.
**Q: Can a fractional CTO transition to full-time?**
A: Yes, and this is one of the best outcomes of a fractional engagement. After 6-12 months working fractionally, both sides know if it's a fit. The CTO already understands the codebase, team, and business. Converting a proven fractional CTO to full-time eliminates the risk and cost of an external search.
**Q: What are the disadvantages of a fractional CTO?**
A: The main limitations are: not available 24/7 for emergencies, divided attention across multiple clients, may not be deeply embedded in company culture, and harder to manage a large engineering team (10+) part-time. These limitations matter more as the company scales past Series A.
**Q: Is a fractional CTO worth it for a pre-seed startup?**
A: Yes — this is actually the best stage for a fractional CTO. Pre-seed startups make the most consequential technical decisions (architecture, stack, MVP scope) with the least experience. A fractional CTO at $5,000-8,000/month prevents $50K-200K in mistakes during the period when every dollar of runway matters most.
**Q: How do you transition from a fractional CTO to a full-time CTO?**
A: Three paths: (1) The fractional CTO goes full-time with your company (best outcome — they already know everything). (2) The fractional CTO helps you hire their full-time replacement, then transitions out over 2-3 months. (3) You promote an internal engineering leader with the fractional CTO mentoring them into the role over 6-12 months.
---
## Fractional CTO Rates, Cost, and Hourly Pricing in 2026
- **URL:** https://justinmckelvey.com/blog/fractional-cto-cost-hourly-rate
- **Published:** April 19, 2026
- **Updated:** May 04, 2026
- **Category:** Fractional CTO
- **Reading time:** 6 min
- **Description:** Fractional CTO rates are $150-350/hour. Monthly: $5K-15K. 2026 pricing by engagement type, market tier, ROI formula, and red flags to watch.
### TL;DR: Fractional CTO Rates and Cost in 2026
Fractional CTO rates run $150-350/hour. Monthly cost is $5,000-15,000 depending on engagement depth, market, and experience level. Compare that to a full-time CTO at $250,000-400,000/year plus benefits and equity: a fractional engagement saves 60-75% while delivering the same strategic leadership. As of April 2026, hourly rates have been stable for the past 12 months, with slight upward pressure in AI and fintech specializations where demand exceeds supply.
Looking to hire a fractional CTO rather than compare rates? See the complete guide to hiring a fractional CTO for the step-by-step process, or fractional CTO services for engagement models and how to get started.
This guide breaks down the real numbers: what you pay, what you get, how to structure the engagement, and how to calculate whether the ROI makes sense for your stage. No vague "it depends" answers — actual pricing data from the market.
### Fractional CTO Hourly Rates by Market
Hourly rates vary by geography, specialization, and experience. Here's what the market looks like in 2026:
Tier 1 markets ($275-350/hour): San Francisco, New York, Seattle, Boston. These are the most expensive markets, driven by competition from full-time CTO salaries that exceed $400K in the Bay Area. Fractional CTOs in these markets typically have 15-20+ years of experience and multiple successful exits.
Tier 2 markets ($200-275/hour): Austin, Denver, Chicago, LA, Portland, Atlanta, Miami. This is where most of the fractional CTO market operates. Strong talent pool, lower cost of living than Tier 1, and rates that are accessible to seed and Series A startups.
Tier 3 markets ($150-200/hour): Smaller metros, remote-first CTOs, and international talent working with US companies. Quality varies more at this tier — due diligence on experience and references is especially important.
Specialization premiums: CTOs with deep expertise in AI/ML, healthcare/HIPAA, fintech/PCI, or security command a 20-30% premium regardless of market. An AI-specialized fractional CTO in Austin might charge $300/hour — more than a generalist in San Francisco.
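As a rough sketch, the tier bands and the 20-30% specialization premium can be combined into a specialist rate range. The multiplicative treatment (20% on the low end of a band, 30% on the high end) is an assumption, and the names below are illustrative.

```python
# Hourly rate bands from the text, in USD per hour.
TIER_RATES = {
    "tier_1": (275, 350),
    "tier_2": (200, 275),
    "tier_3": (150, 200),
}

# Premium quoted in the text for AI/ML, healthcare, fintech, or security, in percent.
PREMIUM_PERCENT = (20, 30)


def specialist_rate_range(tier: str) -> tuple[float, float]:
    """Apply the low premium to the band's low end and the high premium to its high end."""
    low, high = TIER_RATES[tier]
    return (
        low * (100 + PREMIUM_PERCENT[0]) / 100,
        high * (100 + PREMIUM_PERCENT[1]) / 100,
    )


print(specialist_rate_range("tier_2"))  # (240.0, 357.5)
```

The $300/hour Austin example in the text falls inside this Tier 2 specialist range.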
### Monthly Retainer Pricing
Most fractional CTO engagements use monthly retainers rather than hourly billing. Here's the breakdown by engagement depth:
#### Advisory Level: $3,000-5,000/month (5-8 hours/week)
What you get: weekly 1:1 with the founder, monthly architecture reviews, on-call for critical decisions, technology roadmap input. What you don't get: hands-on code review, sprint participation, team mentoring.
Best for: Pre-seed startups with a small dev team that needs occasional strategic guidance. Founders who are technical but want a sounding board for major decisions.
#### Embedded Level: $5,000-10,000/month (10-15 hours/week)
What you get: everything in advisory plus weekly code reviews, sprint planning participation, 1:1 mentoring with senior engineers, build-vs-buy evaluations, vendor selection, interview support for engineering hires.
Best for: Seed to Series A startups with 3-10 engineers who need day-to-day technical leadership. This is the most common engagement type — about 60% of fractional CTO relationships operate at this level.
#### Full Engagement: $10,000-15,000/month (20+ hours/week)
What you get: everything in embedded plus daily team interaction, hands-on architecture work, direct management of engineering team, full ownership of technical roadmap, representing the company in investor and customer technical conversations.
Best for: Startups that need a full-time CTO's involvement but can't afford (or don't want) the full-time salary and equity commitment. Often used during critical periods: fundraising, major pivots, or rapid scaling.
### The Real Cost Comparison: Fractional vs. Full-Time
Here's the math that makes the fractional model compelling:

Full-time CTO total cost:
- Base salary: $250,000-400,000/year
- Benefits (health, 401K, etc.): $30,000-60,000/year
- Equity: 1-5% (at a $10M valuation, that's $100K-500K in potential dilution)
- Recruiting cost: $50,000-100,000 (recruiter fees, interview time)
- Onboarding time: 2-3 months before full productivity
- Total year-one cost: $380,000-600,000+

Fractional CTO total cost (embedded level):
- Monthly retainer: $8,000/month x 12 = $96,000/year
- No benefits obligation
- No equity (or minimal: 0.25-0.5%)
- No recruiting cost (you can start in 1-2 weeks)
- Productive from week 1 (experienced fractional CTOs have done this before)
- Total year-one cost: $96,000-120,000

Savings: $260,000-480,000 in year one. That's 2-4 additional engineers, 12-18 months of runway, or a meaningful marketing budget.
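The savings figure can be reproduced with a few lines of arithmetic. The sketch below uses the year-one totals exactly as stated and measures savings against the high end of the fractional cost. It also sums the cash line items on their own; they come to less than the stated full-time total, and the assumption here is that the remainder reflects equity and ramp-up time, which are not priced as cash in the list.

```python
# Cash line items from the full-time list above, as (low, high) in USD.
FULL_TIME_CASH_ITEMS = {
    "base_salary": (250_000, 400_000),
    "benefits": (30_000, 60_000),
    "recruiting": (50_000, 100_000),
}

# Year-one totals exactly as stated in the text.
STATED_FULL_TIME_TOTAL = (380_000, 600_000)
STATED_FRACTIONAL_TOTAL = (96_000, 120_000)


def sum_ranges(items):
    """Add up the low ends and the high ends of a dict of (low, high) ranges."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return low, high


def year_one_savings(full_time_total, fractional_total):
    """Savings at each bound, measured against the high-end fractional cost."""
    fractional_high = fractional_total[1]
    return tuple(total - fractional_high for total in full_time_total)


cash_only = sum_ranges(FULL_TIME_CASH_ITEMS)
savings = year_one_savings(STATED_FULL_TIME_TOTAL, STATED_FRACTIONAL_TOTAL)
print(cash_only)  # (330000, 560000)
print(savings)    # (260000, 480000)
```

Swapping in your own salary, benefits, and retainer figures gives a quick comparison for your market.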
### How to Calculate If the ROI Makes Sense
The cost is only half the equation. Here's how to calculate whether a fractional CTO pays for itself:
#### Value Driver 1: Mistakes Prevented
The most valuable thing a fractional CTO does is say "don't build that" or "don't build it that way." Architecture decisions that cost $50,000-200,000 to reverse. Framework choices that add 6 months of migration work. Feature builds that nobody will use. If a fractional CTO prevents even one $50K mistake in a year, the engagement has paid for itself.
In my practice, the average client avoids $80,000-150,000 in preventable costs during the first 6 months. These aren't hypothetical savings — they're specific decisions where I redirected engineering effort away from dead ends and toward high-impact work.
#### Value Driver 2: Engineering Velocity
A fractional CTO typically increases engineering output by 30-60% through better processes, CI/CD improvements, and removing bottlenecks. If your 4-person engineering team costs $600K/year in salaries and a fractional CTO at $120K/year makes them 40% more productive, you're getting $240K/year in additional output, a 2x return on the CTO investment.
#### Value Driver 3: Hiring Quality
Engineering hiring is expensive. The average cost of a bad engineering hire (recruiting, onboarding, severance, replacement) is $100,000-150,000. A fractional CTO who improves your hiring hit rate from 60% to 85% saves you $100K+ per avoided mis-hire.
#### The ROI Formula
(Mistakes prevented + velocity gains + hiring savings) / Fractional CTO cost = ROI
For a typical seed-stage startup: ($80K mistakes + $100K velocity + $50K hiring) / $120K cost = 1.9x ROI in year one. And that's conservative — most engagements exceed 3x.
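A minimal sketch of the formula as a function, using the seed-stage figures from the text. The function and parameter names are illustrative. Note that this ratio is total benefit divided by cost, so 1.0x means break-even.

```python
def fractional_cto_roi(mistakes_prevented, velocity_gains, hiring_savings, annual_cost):
    """Benefit-to-cost ratio: (mistakes + velocity + hiring) / cost."""
    if annual_cost <= 0:
        raise ValueError("annual_cost must be positive")
    return (mistakes_prevented + velocity_gains + hiring_savings) / annual_cost


# Seed-stage example from the text: $80K + $100K + $50K against a $120K engagement.
seed_roi = fractional_cto_roi(80_000, 100_000, 50_000, 120_000)
print(f"{seed_roi:.1f}x")  # 1.9x

# Velocity example from the text: a $600K team made 40% more productive.
velocity_gain = 600_000 * 40 / 100
print(velocity_gain / 120_000)  # 2.0
```

Plugging in conservative estimates for your own team is a quick way to check whether the engagement clears break-even before you sign a retainer.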
### Hourly vs. Retainer: Which Structure to Choose
Choose hourly when: You need a specific, time-bound deliverable. Architecture review ($3,000-5,000 for a 2-day audit). Security assessment. Technical due diligence for an acquisition. Code review before a major launch. These are project-based needs with a clear endpoint.
Choose retainer when: You need ongoing technical leadership. The CTO needs to know your codebase, your team, and your business context to be effective. Retainers work better because the CTO can be proactive — spotting issues before they become fires — rather than reactive.
The hybrid model: Start with a 2-day paid architecture audit ($3,000-5,000 at hourly rates). If the audit reveals significant value and you work well together, transition to a monthly retainer. This is the lowest-risk way to evaluate a fractional CTO relationship.
### Red Flags in Fractional CTO Pricing
Below $100/hour: Either very junior, very desperate, or planning to overcommit and underdeliver. At this rate, you're not getting senior CTO-level thinking; you're getting a senior developer with a leadership title.
Equity-only compensation: A fractional CTO who won't charge cash isn't confident in their ability to deliver immediate value. Equity should be a bonus on top of cash compensation, not a replacement for it.
No clear engagement model: If they can't tell you exactly what you get for your money — hours per week, deliverables, communication cadence — they're making it up as they go.
Locked into long contracts: Good fractional CTOs don't need 12-month commitments. Month-to-month or quarterly with 30-day notice is standard. If they need a long lock-in, ask why.
More than 5 clients simultaneously: A fractional CTO with 8 clients isn't fractional — they're absent. The sweet spot is 2-4 concurrent clients. Ask how many other companies they're working with.
### Getting Started
The best way to evaluate whether a fractional CTO is worth the cost for your startup is a 30-minute conversation about your specific situation. Not a sales pitch, but a diagnostic.
For the step-by-step hiring process, read How to Hire a Fractional CTO: Complete Guide for Founders. For the decision framework on fractional vs full-time CTO, read Fractional CTO vs. Full-Time CTO: Which One Does Your Startup Actually Need? To understand whether you need a CTO or a product manager, read the Fractional Product Manager guide.
I work with 3-4 companies at a time at $8,000-12,000/month for embedded engagements. Book a strategy call and I'll give you an honest assessment of whether a fractional CTO makes sense for your stage and budget.
### Frequently Asked Questions
**Q: How much does a fractional CTO cost per hour?**
A: Fractional CTOs charge $150-350/hour as of 2026. Rates vary by market: Austin and mid-tier cities average $200-275/hour. San Francisco and New York command $275-350/hour. CTOs with deep specializations (AI, healthcare, fintech) charge at the higher end regardless of location.
**Q: How much does a fractional CTO cost per month?**
A: Monthly retainers typically range from $5,000-15,000/month, with lighter advisory retainers (5-8 hours/week) at $3,000-5,000/month. An embedded engagement at 10-15 hours/week costs $5,000-10,000/month. A full engagement at 20+ hours/week costs $10,000-15,000/month. Most startups start at 10 hours/week and scale up based on needs.
**Q: Is a fractional CTO cheaper than a full-time CTO?**
A: Yes, 60-75% cheaper. A full-time CTO costs $250,000-400,000/year in salary plus $50,000-100,000 in benefits and equity. A fractional CTO at $10,000/month costs $120,000/year with no benefits or equity obligation. You get 80% of the strategic value at 25-40% of the cost.
**Q: What is the ROI of a fractional CTO?**
A: Typical ROI is 3-10x the engagement cost. A fractional CTO saves money by: preventing $50K-200K architecture mistakes, reducing engineering hiring costs by 30-50% through better vetting, cutting deployment time by 60-80% through CI/CD improvements, and identifying features to kill that would waste 2-4 months of development time.
**Q: Should I pay a fractional CTO hourly or on retainer?**
A: Retainer is better for ongoing engagements (3+ months). You get predictable costs and the CTO prioritizes your work. Hourly is better for short-term projects (architecture review, security audit, technical due diligence). Most fractional CTOs prefer retainer because it allows them to be embedded in your team rather than context-switching.
**Q: Do fractional CTOs take equity?**
A: Some do, some don't. Equity-only arrangements are rare and usually a red flag — it means the CTO isn't confident enough in their value to charge cash. The healthiest structure is cash retainer plus a small equity component (0.25-1%) for long-term engagements over 6 months. Never give more than 1% equity to a fractional hire.
**Q: How do fractional CTO rates compare to dev agency rates?**
A: Fractional CTOs cost $150-350/hour. Dev agencies charge $150-300/hour for senior developers. The costs are similar, but the value is different. A dev agency builds what you tell them. A fractional CTO tells you what to build — and what NOT to build. The strategic value of avoiding a $100K mistake far exceeds the hourly rate difference.
---
## The Sales Follow-Up Sequence That Actually Converts (Templates Included)
- **URL:** https://justinmckelvey.com/blog/sales-follow-up-sequence-that-converts
- **Published:** April 16, 2026
- **Updated:** April 16, 2026
- **Category:** Founder Sales
- **Reading time:** 6 min
- **Description:** The 5-touch follow-up sequence that converts prospects into customers. Includes email templates for every stage. 80% of sales close between follow-up 3 and 5.
### TL;DR: The Follow-Up Math
80% of sales require at least 5 follow-up contacts. 92% of people give up before reaching 5. This gap is the single biggest revenue leak in founder-led sales. The founders who follow up consistently, with value in every touchpoint, close 3-5x more deals than those who send one "just checking in" email and move on. As of 2026, the follow-up problem hasn't changed despite better tools. What has changed is that the founders who get this right have an even bigger advantage because everyone else's attention span is getting shorter.
This guide is the exact follow-up sequence I teach to every founder I work with, including templates you can adapt for your business. It works for $50/month SaaS and $50,000 consulting engagements — the principle is the same: be useful, be consistent, be human.
### Why "Just Checking In" Doesn't Work
Every founder sends this email. Nobody responds to it.
"Hi [Name], just checking in on our conversation from last week. Have you had a chance to think about [product]? Let me know if you have any questions!"
This email fails because it adds zero value. It asks the prospect to do work (think about your product) without giving them a reason to. It signals that you don't have anything new to offer. And the cheerful tone feels generic — like you're sending the same email to 50 people, which you probably are.
The principle behind every follow-up should be: if I removed my product from this email, would the recipient still find it useful? If yes, send it. If no, you're asking for their attention without earning it.
### The 5-Touch Follow-Up Sequence
#### Touch 1: The Same-Day Recap (Day 0)
Purpose: Prove you listened. Create a written record. Confirm next steps.
When: Within 2 hours of your conversation.
Template:
"Hi [Name],
Thanks for the conversation today. A few things that stood out:
• You mentioned [specific pain point in their words, not yours]
• The current approach is costing you approximately [time/money they mentioned]
• You're looking for [specific outcome they described]
Based on that, I think [one specific recommendation]. Here's what I'd suggest as a next step: [specific action with timeline].
Does [day/time] work for a follow-up?"
Why this works: You're reflecting their words back to them, which builds trust. You're demonstrating that you paid attention. And you're proposing a specific next step — not an open-ended "let me know."
#### Touch 2: The Value Add (Day 3)
Purpose: Build the relationship without asking for anything.
When: 3 days after the call.
Template:
"Hi [Name],
After our conversation, I came across [article/data point/resource] that's directly relevant to [their specific problem].
[1-2 sentence summary of why it's relevant to their situation specifically]
Thought you'd find it useful. No need to respond — just wanted to pass it along."
Why this works: You're being genuinely helpful with no ask. This builds trust and keeps you top of mind. The "no need to respond" line reduces friction — ironically, people respond more when they don't feel pressured to.
What to share: An industry report. A relevant blog post (yours or someone else's). A data point from your experience. An introduction to someone in your network who could help them. The more specific to their situation, the better.
#### Touch 3: The Specific Question (Day 7)
Purpose: Re-engage with something that requires a response.
When: 7 days after the call.
Template:
"Hi [Name],
You mentioned that [specific problem] costs your team about [metric they shared]. I was curious — have you had a chance to calculate the total impact across [broader scope]?
I ask because I ran some numbers based on similar companies and found [specific insight]. Happy to share the analysis if it would be useful."
Why this works: You're referencing their specific situation (showing you remember), asking a question they can answer quickly, and offering something valuable in return. This converts silent prospects back into active conversations 30-40% of the time.
#### Touch 4: The New Angle (Day 14)
Purpose: Give them a new reason to re-engage.
When: 14 days after the call.
Template (case study version):
"Hi [Name],
Quick update — we just worked with a company similar to yours ([industry/size]) who had the same challenge with [specific problem].
The result: [specific metric improvement] in [timeframe].
Their situation was similar to what you described. Would it be helpful to walk through what they did differently? I can share the specifics in a 15-minute call."
Template (offer version):
"Hi [Name],
I know [problem] is still a priority for you. We're running a limited pilot program for [number] companies this month — [specific benefit: extended trial, reduced onboarding cost, hands-on setup help].
If the timing works, I'd love to include you. Want me to send the details?"
Why this works: New information creates a new reason to engage. A case study provides social proof. A limited offer creates urgency. Both give the prospect something to react to — not just a reminder that you exist.
#### Touch 5: The Honest Check-In (Day 30)
Purpose: Get a clear yes or no so you can manage your pipeline.
When: 30 days after the call.
Template:
"Hi [Name],
I want to be respectful of your time, so I'll be direct: is [solving specific problem] still a priority for you this quarter?
If yes, I'd love to pick up where we left off. If the timing isn't right, no worries at all — I'll check back in a few months.
Either way, I appreciate you taking the time to talk with me."
Why this works: It gives them permission to say no, which is valuable for both of you. Prospects who aren't buying are clogging your pipeline and taking mental energy. A "no" frees you to focus on better-fit opportunities. And the graceful exit makes it easy for them to come back later — which they often do.
### After the 5 Touches: The Nurture Cadence
If you've completed all 5 touches with no conversion, move the prospect to a monthly nurture list. Once per month, send one useful piece of content: your latest blog post, an industry insight, or a relevant case study. No pitch, no ask. Just value.
The goal is simple: stay in their peripheral vision so that when the timing is right, you're the first person they think of. I've had prospects convert 6-12 months after the initial conversation because a monthly nurture email hit them at exactly the right moment.
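If you track follow-ups with calendar reminders or a spreadsheet, the cadence above can be turned into due dates automatically. This is a minimal sketch: the function name and labels are illustrative, and the monthly nurture step is approximated as every 30 days after Touch 5, which is an assumption rather than something the sequence specifies.

```python
from datetime import date, timedelta

# Day offsets for the five touches, counted from the initial conversation.
TOUCH_OFFSETS_DAYS = {
    "Touch 1: same-day recap": 0,
    "Touch 2: value add": 3,
    "Touch 3: specific question": 7,
    "Touch 4: new angle": 14,
    "Touch 5: honest check-in": 30,
}


def follow_up_schedule(call_date: date, nurture_months: int = 0):
    """Return (label, due_date) pairs for the five touches plus optional nurture emails."""
    schedule = [
        (label, call_date + timedelta(days=offset))
        for label, offset in TOUCH_OFFSETS_DAYS.items()
    ]
    last_touch = schedule[-1][1]
    for month in range(1, nurture_months + 1):
        # Assumption: "monthly" is approximated as every 30 days after Touch 5.
        schedule.append((f"Nurture {month}", last_touch + timedelta(days=30 * month)))
    return schedule


for label, due in follow_up_schedule(date(2026, 4, 16), nurture_months=2):
    print(f"{due.isoformat()}  {label}")
```

The output can be pasted into a CRM or calendar, so every touch has a date attached on the day of the first conversation.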
### The Numbers That Tell You It's Working
Response rate after Touch 2: 25-35% of prospects who went silent after Touch 1 should re-engage after a value-add email. If it's below 15%, your content isn't relevant enough to their situation.
Conversation recovery after Touch 3: 30-40% of stalled conversations should restart after a specific question. If it's below 20%, your questions aren't specific enough.
Close rate from sequence: 20-30% of prospects who complete the full 5-touch sequence should convert. This is significantly higher than the 2-5% you'd get from a single follow-up — because each touch builds trust and demonstrates value.
Nurture-to-close rate: 5-10% of monthly nurture contacts should convert over 6-12 months. This is "free" revenue from prospects you already invested time in.
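These benchmarks are easy to check against your own pipeline counts. The sketch below encodes the target ranges and the two warning floors quoted above. The metric keys and the three status labels are hypothetical names chosen for illustration.

```python
# Benchmarks from the text, expressed as fractions.
# "floor" is the warning threshold; "target" is the healthy range.
BENCHMARKS = {
    "touch_2_response": {"floor": 0.15, "target": (0.25, 0.35)},
    "touch_3_recovery": {"floor": 0.20, "target": (0.30, 0.40)},
    "sequence_close": {"floor": None, "target": (0.20, 0.30)},
    "nurture_close": {"floor": None, "target": (0.05, 0.10)},
}


def assess(metric: str, successes: int, attempts: int) -> str:
    """Compare an observed rate with the benchmark for the given metric."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    rate = successes / attempts
    floor = BENCHMARKS[metric]["floor"]
    target_low, _ = BENCHMARKS[metric]["target"]
    if floor is not None and rate < floor:
        return "below floor"
    if rate < target_low:
        return "below target"
    return "on target"


print(assess("touch_2_response", 4, 40))  # below floor
print(assess("touch_3_recovery", 7, 20))  # on target
```

A "below floor" result on Touch 2 points at content relevance, and on Touch 3 at question specificity, as described above.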
### Common Follow-Up Mistakes
Following up too fast. Emailing daily or every other day feels desperate and burns the relationship. Respect the cadence: same day, day 3, day 7, day 14, day 30. Patience signals confidence.
Generic templates without personalization. The templates above are frameworks, not copy-paste scripts. Replace every bracket with specific details from your actual conversation. A personalized follow-up converts 3-5x better than a generic one.
Only following up by email. Mix channels. If you've sent 3 emails with no response, try a LinkedIn message or a brief voicemail. Sometimes the prospect isn't ignoring you — they just don't check that inbox.
No CRM tracking. If you're tracking follow-ups in your head, you're dropping 30-50% of them. Use a CRM, a spreadsheet, or even calendar reminders — but track every touch.
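If you track touches in a spreadsheet or a script rather than a full CRM, the cadence above is easy to compute. The following Python sketch assumes the day offsets 0, 3, 7, 14, and 30 from the initial call. The function names are hypothetical and this is one possible layout, not a prescribed tool.

```python
from datetime import date, timedelta
from typing import List, Optional

# Day offsets from the initial call: same day, day 3, day 7, day 14, day 30.
TOUCH_OFFSETS = [0, 3, 7, 14, 30]


def follow_up_schedule(call_date: date) -> List[date]:
    """Return the five touch dates for a call held on call_date."""
    return [call_date + timedelta(days=offset) for offset in TOUCH_OFFSETS]


def next_touch(call_date: date, today: date) -> Optional[date]:
    """Return the next scheduled touch on or after today.

    Returns None once all five touches are in the past, which is the
    point where the prospect moves to the monthly nurture list.
    """
    for touch in follow_up_schedule(call_date):
        if touch >= today:
            return touch
    return None


if __name__ == "__main__":
    for touch_date in follow_up_schedule(date(2026, 4, 1)):
        print(touch_date.isoformat())
```

Storing only the call date and deriving the touch dates keeps the tracking sheet small: one row per prospect is enough to know what is due today.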
This follow-up sequence is one part of the complete founder-led sales framework. The discovery call questions guide covers how to set up a strong follow-up, and the full GTM playbook frames the sales process around it. Your follow-up effectiveness also depends on how well you've positioned your product: clear positioning makes every touchpoint more compelling.
If you want help building your sales process end-to-end, book a strategy call.
### Frequently Asked Questions
**Q: How many follow-ups does it take to close a sale?**
A: A widely cited sales statistic holds that 80% of sales require at least 5 follow-up contacts after the initial meeting. In my experience, most deals close between follow-up 3 and 5. Yet the same statistic says 44% of salespeople give up after just one follow-up, and 92% give up after four. Persistence, with value in each touchpoint, is the single biggest differentiator.
**Q: How do you follow up without being annoying?**
A: Every follow-up must add value. Share a relevant article, a case study, a data point, or an introduction. If your follow-up is just 'checking in' with no new information, it's noise. The rule: if you removed your product from the email, would the message still be useful? If yes, it's a good follow-up. If no, don't send it.
**Q: How long should you wait between follow-ups?**
A: After the initial call: same day recap. Then: day 3 (value-add), day 7 (specific question), day 14 (new angle or offer), day 30 (honest check-in). After day 30, move to a monthly nurture cadence. The exact timing matters less than consistency — never go more than 2 weeks without a touchpoint during active pursuit.
**Q: What should you say in a sales follow-up email?**
A: Each follow-up has a different purpose. Day 1: recap what they told you + next steps. Day 3: share something useful (article, data, introduction). Day 7: ask a specific question related to their problem. Day 14: offer a new angle (case study, limited offer, revised proposal). Day 30: honestly ask if it's still a priority. Never say 'just checking in.'
**Q: When should you stop following up?**
A: Stop active follow-up after 5 touches if you've received no response. Move them to a monthly nurture list (share one useful piece of content per month). If they explicitly say 'not interested,' stop immediately and thank them for their time. If they say 'not right now,' ask when to check back and set a reminder.
**Q: Should you follow up by email or phone?**
A: Match the channel your prospect used. If they emailed you, follow up by email. If they called, call back. For LinkedIn connections, use LinkedIn messages. The best approach uses 2-3 channels: email as the primary, with one LinkedIn or phone touchpoint mixed in to break through inbox fatigue.
---
## How to Raise Prices Without Losing Customers: The SaaS Founder's Playbook
- **URL:** https://justinmckelvey.com/blog/how-to-raise-prices-without-losing-customers
- **Published:** April 16, 2026
- **Updated:** April 16, 2026
- **Category:** Pricing Strategy
- **Reading time:** 6 min
- **Description:** Raise SaaS prices 20-30% with less than 5% churn. The playbook: when to raise, how to communicate, grandfathering strategy, and real results.
### TL;DR: The Numbers Behind Price Increases
A 20-30% price increase typically causes less than 5% churn, and the revenue gain from the 95% who stay far exceeds the loss. Most SaaS founders know they're underpriced. They just don't know how to raise prices without triggering a mass exodus. After managing price increases across dozens of products, here's the reality: the fear is always worse than the outcome. The founders who raise prices strategically grow faster than those who don't, because they have more revenue to reinvest in the product their customers love.
### How to Know It's Time to Raise Prices
If any of these are true, you're leaving money on the table:
Nobody has said "that's too expensive" in the last 3 months. Zero price pushback means you're priced below what the market will bear. The ideal pushback rate is 10-20% of prospects. If it's 0%, you're significantly underpriced.
Your product is meaningfully better than when you set the price. You've added features, improved performance, expanded integrations, grown your customer base. The value increased. The price didn't. Every month this gap widens, you're subsidizing customers who would happily pay more.
Your competitors charge more for less. If a competitor with fewer features charges 2x your price and still has customers, your pricing is the problem — not theirs. Customers associate low prices with low quality. Sometimes raising prices actually increases conversion because the product "feels" more premium.
Your unit economics don't work. If your CAC payback period is longer than 18 months or your LTV:CAC ratio is below 3:1, you either need to reduce acquisition costs or increase revenue per customer. Raising prices is often the faster fix.
### Step 1: Set Your New Price
Don't guess. Use data.
Value-anchor method: Calculate the total value your product delivers (time saved + money saved + revenue generated). Price at 20-30% of that value. If your tool saves customers $500/month and you're charging $49, you have room to move to $99-149.
Competitor-benchmark method: Map every competitor's pricing. If you're in the bottom 25%, you can move to the median without resistance. If you're already at the median, you need a differentiation story to justify going higher.
The "10% test": Raise prices 10% for new signups immediately. Track conversion rate for 30 days. If conversion doesn't drop, raise another 10%. Repeat until you see a measurable decline. This is the lowest-risk method because it only affects new customers.
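The value-anchor method and the "10% test" can be sketched in a few lines of Python. The 20-30% band and the 10% step come from the text above. The function names and the `max_relative_drop` threshold (how much conversion may fall before it counts as a "measurable decline") are assumptions you would tune for your own data.

```python
from typing import Optional, Tuple


def value_anchor_range(monthly_value: float) -> Tuple[float, float]:
    """Price band at 20-30% of the monthly value the product delivers."""
    return (round(0.20 * monthly_value, 2), round(0.30 * monthly_value, 2))


def next_test_price(
    current_price: float,
    baseline_conversion: float,
    observed_conversion: float,
    max_relative_drop: float = 0.10,  # assumption: a 10% relative drop is "measurable"
) -> Optional[float]:
    """Return the next price to test for new signups, or None to stop.

    After a 30-day window, raise another 10% only if conversion held up
    against the baseline measured before the increase.
    """
    if observed_conversion < baseline_conversion * (1 - max_relative_drop):
        return None
    return round(current_price * 1.10, 2)


if __name__ == "__main__":
    print(value_anchor_range(500))             # band for $500/month of value
    print(next_test_price(49, 0.040, 0.041))   # conversion held, keep going
    print(next_test_price(49, 0.040, 0.030))   # conversion dropped, stop
```

For $500/month of delivered value the band works out to $100-150, which lines up with the $99-149 price points mentioned above.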
### Step 2: Grandfather Existing Customers
This is the move that eliminates 90% of the churn risk:
Announce the new pricing publicly and make it clear that existing customers keep their current rate for 6-12 months. This creates urgency for new prospects ("sign up at the current rate before it increases") while protecting your relationship with existing customers.
The grandfather timeline:
Month 0: New pricing applies to all new customers immediately.
Months 1-6: Existing customers stay at old pricing. Use this time to ship improvements that justify the increase.
Month 6: Send a 60-day notice to existing customers about the transition to new pricing. Reference the specific improvements made since the change.
Month 8: Existing customers transition to new pricing at their next billing cycle.
In my experience, less than 3% of grandfathered customers churn at the transition point — significantly lower than the 5% you'd see with an immediate increase — because you've had 6 months to demonstrate additional value.
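To put the grandfather timeline on a calendar, here is a small Python sketch that derives the key dates from the announcement date. It assumes the month 0 / month 6 / month 8 schedule above. The helper names are hypothetical, and the transition date is the earliest possible date, since each customer actually moves at their next billing cycle after it.

```python
import calendar
from datetime import date
from typing import Dict


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of short months."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def grandfather_timeline(announcement: date) -> Dict[str, date]:
    """Key dates for the grandfathering schedule described above."""
    return {
        "new_pricing_for_new_customers": announcement,               # month 0
        "notice_to_existing_customers": add_months(announcement, 6),  # month 6
        "existing_customers_transition": add_months(announcement, 8), # month 8
    }


if __name__ == "__main__":
    for milestone, when in grandfather_timeline(date(2026, 4, 16)).items():
        print(milestone, when.isoformat())
```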
### Step 3: Communicate the Increase
The announcement email is the most important piece of copy you'll write all year. Here's the structure that works:
Subject line: "What's changing (and what you're getting)" — not "Price increase notice." Frame it as an update, not a penalty.
Opening paragraph: Lead with value, not price. "In the last 6 months, we've shipped [specific features], improved [specific metric], and added [specific integration]. Our customers are now [specific outcome] — and we're investing more to make the product even better."
The change: Be direct. "Starting [date], our pricing will update to [new price]. Your current rate of [old price] is locked in until [grandfather date]."
The reasoning: One sentence. "This reflects the increased value the product delivers and funds the improvements our customers have been asking for." Don't apologize. Don't over-explain. Confidence signals that the price is justified.
Send from the founder. Not from "The Team." Not from "Support." A price increase email from the founder feels personal and respectful. From a generic team address, it feels corporate and impersonal.
### Step 4: Handle Pushback
Some customers will respond to the announcement. This is good: it means they care enough to engage. Here's how to handle the three types of pushback:
"I can't afford the increase." Ask: "What would the product need to do for the new price to feel like a no-brainer?" Their answer is a product roadmap item. If the gap is small, offer a 3-month extension at the current rate. If the gap is fundamental, they may not be your target customer anymore — and that's okay.
"Your competitor is cheaper." Don't match their price. Instead: "What specifically does [competitor] offer that we don't? And what do we offer that they don't?" This reframes the conversation from price to value. If they genuinely prefer the competitor, let them go gracefully — they'll remember how you handled it and may come back.
"I've been a loyal customer." Acknowledge it sincerely. "You have, and we appreciate it. That's why you're grandfathered at your current rate until [date]. We want to make sure the additional value we're building justifies the change." Loyalty deserves recognition, not a permanent discount.
### Step 5: Measure the Impact
Track these metrics for 90 days after the increase:
Churn rate: Compare to your baseline. An increase of 2-5 percentage points is normal and acceptable. Above 10 percentage points means you raised too much or communicated poorly.
New customer conversion rate: If new signups decline more than 10%, your pricing page or positioning needs adjustment — not necessarily the price itself. Often, better value communication fixes conversion without lowering the price.
Revenue per customer: This should increase immediately for new customers and within 6-8 months for existing customers (after the grandfather period). If total revenue is higher within 90 days despite some churn, the increase was correct.
Expansion revenue: Do customers on new pricing upgrade to higher tiers at the same rate? If tier-upgrade rates decline, the gap between tiers may need adjustment.
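The core trade-off behind these metrics (more revenue per customer against some lost customers) can be estimated with simple arithmetic. The Python sketch below is a simplification: it assumes every remaining customer pays the new price and ignores grandfathering, tier mix, and expansion revenue. The function names are hypothetical.

```python
def net_revenue_change(price_increase: float, churn: float) -> float:
    """Fractional change in recurring revenue after a price increase.

    price_increase and churn are fractions, e.g. 0.25 for 25%.
    Assumes all remaining customers pay the new price.
    """
    return (1 + price_increase) * (1 - churn) - 1


def break_even_churn(price_increase: float) -> float:
    """Churn fraction at which the increase leaves revenue unchanged."""
    return price_increase / (1 + price_increase)


if __name__ == "__main__":
    change = net_revenue_change(0.25, 0.05)
    print(f"25% increase with 5% churn: {change:+.2%} revenue")
    print(f"Break-even churn for a 25% increase: {break_even_churn(0.25):.0%}")
```

Under these assumptions a 25% increase with 5% churn nets out to +18.75% revenue, and churn would have to reach 20% before the increase stopped paying for itself.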
### Real Results from Price Increases I've Managed
B2B SaaS tool ($49 → $79/month): 61% price increase. 3.2% churn in the first 90 days. Net revenue increase of 47% after accounting for lost customers. The founder's only regret: not doing it 6 months earlier.
Consulting productized service ($2,500 → $3,500/month): 40% increase. Zero client churn. Every existing client accepted the new rate because the value delivered had increased dramatically over the previous year. Two clients said "honestly, we expected this sooner."
Developer tool ($19 → $29/month): 53% increase. 7% churn — slightly higher than ideal. But the churned users were overwhelmingly on the lowest-usage tier and generating the most support tickets. Revenue per support interaction improved by 80%. Net win.
### The Compounding Cost of Not Raising Prices
Every month you don't raise prices, the gap between your value and your price grows. Here's what that costs over time:
A product charging $49/month with 500 customers generates $294K/year. If the product is worth $79/month (based on value delivered), that's $180K/year in uncaptured revenue. Over 3 years, that's $540K — enough to fund an additional engineer, a marketing hire, or a year of runway.
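The arithmetic behind those figures is worth running with your own numbers. A minimal Python sketch, assuming a flat customer count and a single price point (both simplifications), with hypothetical function names:

```python
def annual_revenue(monthly_price: int, customers: int) -> int:
    """Yearly revenue at a flat monthly price and customer count."""
    return monthly_price * customers * 12


def uncaptured_revenue(
    current_price: int, value_price: int, customers: int, years: int = 1
) -> int:
    """Revenue left on the table by charging current_price instead of value_price."""
    return (value_price - current_price) * customers * 12 * years


if __name__ == "__main__":
    print(annual_revenue(49, 500))                   # current yearly revenue
    print(uncaptured_revenue(49, 79, 500))           # gap per year
    print(uncaptured_revenue(49, 79, 500, years=3))  # gap over three years
```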
The compounding effect is even more powerful: the revenue you capture today funds product improvements that justify future price increases. The revenue you don't capture starves the product of investment, making it harder to justify any price at all.
Price increases aren't greed. They're the mechanism that funds the product your customers rely on.
A price increase connects to your GTM strategy, your positioning, your sales conversations, and how you structure your technical leadership. For the full pricing framework, including how to choose your model and structure tiers, read the SaaS pricing strategy guide.
If you need help restructuring your pricing or planning a price increase, book a strategy call.
### Frequently Asked Questions
**Q: How much can you raise SaaS prices without losing customers?**
A: Most SaaS companies can raise prices 20-30% with less than 5% customer churn. The revenue gain from the 95% who stay far exceeds the revenue loss from the 5% who leave. Companies that haven't raised prices in 12+ months can often raise by 30-50% because their product value has increased significantly while pricing stayed flat.
**Q: How do you announce a price increase to customers?**
A: Announce 30-60 days before the increase takes effect. Lead with value — what they're getting, not what they're paying. Pair the announcement with a new feature or improvement. Offer existing customers a grandfathered rate for 6-12 months. Send the announcement from the founder, not a generic 'team' email.
**Q: Should you grandfather existing customers when raising prices?**
A: Yes, for 6-12 months. Grandfathering eliminates the immediate churn risk and gives you time to demonstrate additional value. After the grandfather period, transition existing customers to the new pricing with 30-60 days of notice. Most will stay because you've had 6-12 months to prove the higher price is justified.
**Q: How often should SaaS companies raise prices?**
A: Review pricing annually. If your product has improved significantly (new features, better performance, more integrations), raise prices for new customers immediately and for existing customers at renewal. Most SaaS products are underpriced because founders set the price at launch and never revisit it.
**Q: What if customers complain about a price increase?**
A: Some pushback is healthy — it means you're priced correctly. If zero customers complain, you didn't raise enough. If more than 15-20% complain, you raised too much or communicated poorly. For the customers who push back, ask what additional value would justify the increase. Their answers will guide your product roadmap.
---
## The 5 GTM Mistakes That Kill Startups Before They Launch
- **URL:** https://justinmckelvey.com/blog/gtm-mistakes-that-kill-startups
- **Published:** April 16, 2026
- **Updated:** April 16, 2026
- **Category:** Go-to-Market Strategy
- **Reading time:** 5 min
- **Description:** The 5 go-to-market mistakes that kill startups before launch. Real examples from a fractional CTO who's launched 50+ products and seen every failure mode.
### TL;DR: These 5 Mistakes Kill More Startups Than Bad Products
The product is rarely the problem. The go-to-market is. After launching 50+ products and advising dozens of founders on their GTM strategies, I've watched the same five mistakes kill startups over and over. Most of these mistakes happen before launch day, so by the time the product is live, the damage is already done. As of 2026, the startup failure rate hasn't changed, but the reasons have become more predictable. Fix these five and you've eliminated 80% of the risk.
### Mistake 1: Launching to Everyone Instead of Someone
The symptom: "Our product is for any business that needs better [productivity/communication/data/etc.]."
Why it kills you: When you target everyone, your messaging resonates with nobody. "Better productivity for businesses" doesn't make anyone think "that's exactly my problem." It makes them scroll past. Generic positioning creates generic results.
A founder I worked with pivoted from "AI-powered analytics for businesses" (zero traction in 3 months) to "automated lead scoring for real estate teams spending 4+ hours/day qualifying leads" (first 10 customers in 3 weeks). Same product. Different positioning. Completely different outcome.
The fix: Pick one specific customer type. Serve them so well that they tell other people like them. You can expand later — but only after you've won a beachhead. Read the positioning guide for the exact formula.
### Mistake 2: Building Before Selling
The symptom: "We're going to build the product first, then figure out how to sell it."
Why it kills you: You spend 6-12 months building something based on assumptions about what customers want. Then you launch and discover that nobody wants it — or they want a different version of it — or the problem isn't painful enough to pay for a solution. You've burned runway on code instead of learning.
The most successful GTM strategies I've seen all share the same pattern: the founder sold the product before it was fully built. Not vaporware — but a minimum version, a pilot program, a waitlist with a deposit. If people won't pay before the product is perfect, they probably won't pay after either.
The fix: Sell first, build second. Get 5 paying customers (or signed LOIs, or deposits) before you write a line of code beyond your MVP. If you can't sell it, building it won't help.
### Mistake 3: Starting with Channels Instead of Conversations
The symptom: "Should we do LinkedIn ads, Google Ads, or content marketing?"
Why it kills you: You're optimizing distribution before you've validated the message. Even the best channel in the world can't sell a message that doesn't resonate. Founders who start with channels spend $5,000-50,000 testing ads before they realize the problem isn't the ad — it's what the ad says.
I worked with a SaaS founder who spent $12,000 on Google Ads in their first 2 months. Result: 3,000 clicks, 40 trials, 2 paying customers. CAC: $6,000 per customer on a $99/month product. The math was impossible. When we paused ads and had the founder do direct outreach to 30 people, they closed 8 customers in 3 weeks at a CAC of effectively $0 (just the founder's time).
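To see why the math was impossible, it helps to compute the funnel rates and the time needed to earn back the acquisition cost. The Python sketch below uses the figures from this example. The function names are hypothetical, and the recovery calculation ignores churn and gross margin, so the real picture would be worse.

```python
def conversion_rate(converted: int, total: int) -> float:
    """Share of `total` that moved to the next funnel stage."""
    return converted / total


def cac(spend: float, new_customers: int) -> float:
    """Customer acquisition cost: spend divided by customers won."""
    return spend / new_customers


def months_to_recover(cac_value: float, monthly_price: float) -> float:
    """Months of subscription revenue needed to earn back the CAC."""
    return cac_value / monthly_price


if __name__ == "__main__":
    ad_cac = cac(12_000, 2)
    print(f"Click-to-trial: {conversion_rate(40, 3_000):.1%}")
    print(f"Trial-to-paid:  {conversion_rate(2, 40):.1%}")
    print(f"CAC:            ${ad_cac:,.0f}")
    print(f"Payback:        {months_to_recover(ad_cac, 99):.1f} months")
```

At $99/month, a $6,000 CAC takes roughly 60 months of revenue to recover.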
The fix: Start with 20 direct conversations. No ads, no content, no automation. Just talk to potential customers. The insights from those conversations will tell you what message to use and which channel to invest in. The 90-day GTM playbook walks through this step by step.
### Mistake 4: Confusing Launch Day with GTM Strategy
The symptom: "We're going to launch on Product Hunt, send a press release, and post on Hacker News. That's our GTM."
Why it kills you: A launch event generates a spike of attention that lasts 24-72 hours. Then it's over. If your entire GTM strategy is "launch day," you have 3 days of momentum and nothing after. Product Hunt, Hacker News, and press coverage are accelerants — they amplify an existing engine. They don't replace one.
The founders who succeed treat launch day as one tactic within a 90-day GTM plan, not the plan itself. Before launch, they have 20-30 active conversations in their pipeline. After launch, they have a content engine, a sales process, and a follow-up system. The launch spike adds fuel to an engine that's already running.
The fix: Build your sales pipeline before launch day. Have 30+ warm prospects who are expecting the product. When you launch, email them personally. The launch becomes the trigger for conversion, not the trigger for awareness.
### Mistake 5: Scaling Before Finding Product-Market Fit
The symptom: "We just raised $1M. Time to hire a marketing team and start running paid campaigns."
Why it kills you: Scaling amplifies whatever you already have. If you have a product that 3 out of 10 prospects love, scaling finds you more of those 3-out-of-10 people. If you have a product that 1 out of 50 prospects somewhat likes, scaling just burns cash faster.
Product-market fit has a specific feeling: customers come to you instead of you going to them. Support tickets ask for more features, not for basic functionality. Usage grows even when you stop marketing. If you don't have this feeling, you don't have PMF, and scaling will hurt more than help.
I've seen founders burn through $200K-500K in marketing spend trying to scale a product that hadn't found PMF. In every case, the money would have been better spent on 50 more customer conversations and 3 more product iterations.
The fix: Before you scale, answer these questions with data: Is your monthly churn below 5%? Do customers refer others without being asked? Can you predict your close rate? If any answer is no, keep iterating. Scale when the engine works, not before.
### The GTM Audit: Are You Making These Mistakes?
Quick self-assessment. Score yourself 0-2 on each:
Customer specificity (0-2): 0 = "any business that needs X." 1 = "SMBs in Y industry." 2 = "VP of Sales at 20-50 person B2B SaaS companies who spend 5+ hours/week on manual lead qualification."
Validation method (0-2): 0 = "we built it and they'll come." 1 = "we did a survey." 2 = "we have 5+ paying customers or signed LOIs."
Channel selection (0-2): 0 = "we're trying everything." 1 = "we chose based on what competitors do." 2 = "we chose based on where our first 10 customers came from."
Launch plan (0-2): 0 = "big launch day, then figure it out." 1 = "launch plus ongoing content." 2 = "30+ warm prospects before launch, with a 90-day pipeline plan."
Scale timing (0-2): 0 = "we're scaling now with <10 customers." 1 = "we have 20+ customers but inconsistent process." 2 = "we have 30+ customers, predictable conversion, and sub-5% churn."
Score 8-10: Your GTM foundation is solid. Focus on optimization.
Score 5-7: You have gaps. Prioritize the lowest-scoring area.
Score 0-4: Stop scaling. Go back to customer conversations.
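If you want to run the audit across a team or several products, the scoring can be scripted. The Python sketch below follows the five areas and three score bands above. The key names and the `gtm_audit` function are hypothetical, and ties for the weakest area resolve to the first one in the list.

```python
from typing import Dict, Tuple

AREAS = (
    "customer_specificity",
    "validation_method",
    "channel_selection",
    "launch_plan",
    "scale_timing",
)


def gtm_audit(scores: Dict[str, int]) -> Tuple[int, str, str]:
    """Return (total, verdict, weakest area) for a 0-2 score per area."""
    for area in AREAS:
        if scores[area] not in (0, 1, 2):
            raise ValueError(f"{area} must be scored 0, 1, or 2")
    total = sum(scores[area] for area in AREAS)
    if total >= 8:
        verdict = "solid foundation, focus on optimization"
    elif total >= 5:
        verdict = "gaps, prioritize the lowest-scoring area"
    else:
        verdict = "stop scaling, go back to customer conversations"
    weakest = min(AREAS, key=lambda area: scores[area])
    return total, verdict, weakest


if __name__ == "__main__":
    example = {
        "customer_specificity": 2,
        "validation_method": 2,
        "channel_selection": 1,
        "launch_plan": 1,
        "scale_timing": 0,
    }
    print(gtm_audit(example))
```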
If you scored below 7 and want help fixing the gaps, book a strategy call. I'll audit your GTM approach and tell you exactly where to focus. For the full GTM framework, read the 90-day GTM playbook.
### Frequently Asked Questions
**Q: What is the most common go-to-market mistake?**
A: Starting with channels instead of customers. Founders decide they need 'LinkedIn ads' or 'a content strategy' before they've had 10 real conversations with potential buyers. The channel decision should follow from knowing exactly who your customer is and where they look for solutions — not precede it.
**Q: Why do most startup launches fail?**
A: Most startup launches fail because the founder launches to everyone instead of someone. A launch that targets 'all small businesses' reaches nobody. A launch that targets 'solo marketing consultants earning $100-300K who spend 10+ hours/week on admin tools' reaches exactly the right people with a message that resonates.
**Q: How do you know if your GTM strategy is working?**
A: Track three numbers weekly: qualified conversations booked (leading indicator), conversion rate from conversation to customer (process quality), and customer acquisition cost (efficiency). If qualified conversations are increasing and conversion rate is steady or improving, your GTM is working. If either number is declining, something needs to change.
**Q: Should you launch on Product Hunt?**
A: Only if your target customer uses Product Hunt — which means tech-savvy early adopters and builders. For B2B SaaS targeting non-technical buyers, Product Hunt traffic rarely converts to paying customers. It's great for developer tools, productivity apps, and AI products. Less useful for industry-specific or enterprise software.
**Q: How much should a startup spend on go-to-market?**
A: In Phase 1 (first 30 days), spend almost nothing: your time is the investment. LinkedIn Sales Navigator ($80/month) and a domain/hosting ($25/month) are all you need. In Phase 2 (days 31-60), budget $500-2,000/month for content and basic tooling. Only in Phase 3 (days 61-90+) should you consider paid channels, and only after you've proven what messaging works through organic effort.
---
## Go-to-Market Strategy: The Founder's Playbook for Getting to Revenue in 90 Days
- **URL:** https://justinmckelvey.com/blog/go-to-market-strategy-founders-playbook
- **Published:** April 16, 2026
- **Updated:** April 16, 2026
- **Category:** Go-to-Market Strategy
- **Reading time:** 10 min
- **Description:** The go-to-market playbook for founders. Launch to revenue in 90 days using the GTM framework from 50+ product launches. Real examples, timelines.
### TL;DR: The GTM Framework That Gets to Revenue
Most go-to-market strategies fail because they start with channels ("should we do LinkedIn ads or Google ads?") instead of customers ("who exactly has this problem and where do they look for solutions?"). After launching 50+ products over 15 years, including products that grew to 500K+ users and generated $53M+ in revenue, I've refined a 90-day GTM playbook that consistently gets founders from launch to repeatable revenue. As of 2026, "go to market strategy" gets 5,400 searches per month and "GTM strategy" gets 2,900 more. Most of the results are theoretical frameworks from consultants who've never launched a product. This guide is the practical version from someone who does it for a living.
The playbook has three phases: Validate (days 1-30), Systematize (days 31-60), and Scale (days 61-90). Each phase has specific goals, actions, and metrics. Skip a phase and the whole thing falls apart.
### Why Most GTM Strategies Fail
I've reviewed hundreds of GTM plans from startup founders. The same three mistakes show up in 90% of them.
Mistake 1: Starting with channels instead of customers. "We're going to do LinkedIn outreach, run Google Ads, publish blog content, attend conferences, and build a referral program." That's not a strategy. That's a list of things you saw other companies do. A strategy starts with: who is the customer, what's their problem, and where do they currently look for solutions? The channel follows from that answer — not the other way around.
Mistake 2: Trying to scale before validating. Founders raise money, hire a marketing team, and start running paid campaigns before they've had 10 real conversations with potential customers. They're scaling a message they haven't tested to an audience they haven't validated through channels they haven't proven. This is how startups burn $50K-200K in 6 months with nothing to show for it.
Mistake 3: Treating GTM as a one-time plan instead of a learning loop. The best GTM strategies evolve weekly based on real data from real conversations. Your first positioning will be wrong. Your first channel bet will underperform. Your first pricing will be too low. The founders who win aren't the ones who guess right on day one — they're the ones who learn and adjust fastest.
### Phase 1: Validate (Days 1-30)
Goal: Find 10 customers through direct, manual effort. Not 100. Not 1,000. Ten people who pay you money for your product. If you can't find 10 with direct effort, no amount of marketing spend will save you.
#### Week 1: Nail Your Positioning
Before you contact a single prospect, you need a one-sentence positioning statement that makes the right person say "I need that." Use the formula from my product positioning guide:
"For [specific customer] who [specific problem], [product name] is the [category] that [key differentiator]."
Write 5 versions. Test each by saying it out loud to someone who doesn't know your product. The version that gets "tell me more" instead of "what do you mean?" is your winner.
This week's deliverable: one sentence that you'll use as your opening line in every outreach message, your homepage headline, and your elevator pitch. Everything else in the GTM plan flows from this sentence.
#### Week 2: Build Your Target List
Create a list of 50 specific people who match your ideal customer. Not companies, but people, with names, titles, emails, and the specific reason you think they have the problem you solve.
Where to find them:
Your network (start here). You already know 5-10 people who have this problem or know someone who does. Warm introductions convert at 30-50%. Cold outreach converts at 2-5%. Start with the highest-probability path.
LinkedIn Sales Navigator ($80/month). Filter by title, company size, industry, and geography. Save profiles. Write personalized connection requests that reference something specific about them — not a templated pitch.
Communities. Reddit, Slack groups, Discord servers, industry forums. Don't spam. Join conversations, be helpful, and identify people who express the problem you solve. Then reach out directly.
Competitors' customers. Who reviews your competitors on G2, Capterra, or Trustpilot? Who complains about them on Twitter? These people already have the problem and are actively looking for better solutions.
#### Weeks 3-4: Have 15-20 Conversations
Reach out to your list of 50. Goal: book 15-20 discovery calls. Use the founder-led sales framework: ask about their problem, listen more than you talk, and only pitch if there's genuine fit.
From 15-20 conversations, you should close 3-5 paying customers or trials. If you close zero, one of three things is wrong: your targeting (wrong people), your positioning (wrong message), or your product (wrong solution). The conversations will tell you which one.
What to track in Phase 1:
Response rate on outreach (target: 15-25% for warm, 3-8% for cold). Conversation-to-qualified rate (target: 40-60%). Qualified-to-closed rate (target: 20-40% for a new product). The exact numbers matter less than the trend — each week should be better than the last as you refine your messaging.
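Since the trend matters more than the exact numbers, a weekly log of outreach and replies is enough to check it. The Python sketch below is one way to do that, assuming you record messages sent and replies received per week. The function names are hypothetical, and the 3-8% band is the cold-outreach target from above.

```python
from typing import List, Tuple


def weekly_rates(weeks: List[Tuple[int, int]]) -> List[float]:
    """Turn (messages_sent, replies) per week into response rates."""
    return [replies / sent for sent, replies in weeks if sent > 0]


def is_improving(rates: List[float]) -> bool:
    """True when every week beats the week before it."""
    return all(later > earlier for earlier, later in zip(rates, rates[1:]))


def in_target(rate: float, low: float, high: float) -> bool:
    """True when a rate falls inside a target band, e.g. 3-8% for cold outreach."""
    return low <= rate <= high


if __name__ == "__main__":
    cold_outreach = [(40, 1), (40, 2), (40, 3)]  # three weeks of cold outreach
    rates = weekly_rates(cold_outreach)
    print([f"{rate:.1%}" for rate in rates])
    print("improving:", is_improving(rates))
    print("latest week in 3-8% band:", in_target(rates[-1], 0.03, 0.08))
```

The same three functions work for the conversation-to-qualified and qualified-to-closed rates. Only the input counts and the target band change.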
### Phase 2: Systematize (Days 31-60)
Goal: Turn your Phase 1 wins into a repeatable process. You have 5-10 customers. You know what messaging works. You know which type of person buys. Now build the system so it doesn't depend on the founder doing everything manually.
#### Week 5: Document What Worked
Write down everything you learned in Phase 1:
Who actually bought? Not who you thought would buy — who did. What's their title? Company size? Industry? What trigger made them start looking for a solution? This is your real ideal customer profile, validated by actual purchases.
What messaging resonated? Which version of your pitch made people lean in? What objections came up repeatedly? What phrase made prospects say "yes, exactly"? This is your sales playbook — the actual words that work.
What channel produced the best results? Was it LinkedIn outreach? Warm introductions? Community engagement? Conference conversations? Double down on the channel that produced the most qualified conversations, not the one that felt easiest.
#### Week 6: Build Your Content Engine
The conversations you had in Phase 1 are content gold. Every question a prospect asked is a blog post. Every objection is a FAQ. Every "I didn't know that" moment is a social media post.
Publish 2-4 pieces of content per week that address the exact questions and concerns your prospects raised. This does three things: it attracts inbound leads who have the same questions, it gives you material to share in sales follow-ups, and it builds SEO authority for the keywords your customers search.
Use AI tools to accelerate content production. One hour of writing with Claude or ChatGPT produces what used to take a full day. The insights come from your real conversations; the AI handles the writing labor.
#### Weeks 7-8: Build Your Sales Pipeline
Move from ad hoc outreach to a structured pipeline. Set up a simple CRM (HubSpot free tier, Pipedrive, or even a spreadsheet with columns for name, company, stage, next action, and notes).
Create email templates for each stage of your sales process:
Initial outreach: Personalized, references something specific about the prospect, asks for a conversation (not a sale).
Post-call follow-up: Recaps the conversation, summarizes their problem in their words, proposes a specific next step.
Nurture sequence: 3-5 emails over 30 days that share useful content related to their problem. No hard sell — just consistent value.
Closing email: Specific proposal with pricing, timeline, and what they get. Make it easy to say yes.
By the end of week 8, you should have a pipeline of 20-30 active prospects at various stages, with 10-15 paying customers total.
### Phase 3: Scale (Days 61-90)
Goal: Pour fuel on what's working. Kill what isn't. You have a validated customer, a tested message, and a repeatable process. Now scale it.
#### Week 9: Double Down on Your Best Channel
By now, one channel is clearly outperforming the others. Maybe it's LinkedIn outreach that converts at 8%. Maybe it's blog content that generates 10 inbound leads per week. Maybe it's referrals from happy customers. Whatever it is, allocate 80% of your time and budget to that channel.
The temptation is to diversify. Resist it. At this stage, depth beats breadth. One channel that produces 20 qualified leads per month is worth more than five channels that produce 5 each, even though the five add up to 25, because you can optimize one channel much faster than five.
#### Week 10: Add One Scalable Channel
Now, and only now, add a second channel. Choose based on what you learned:
- **If direct outreach is working:** Add paid LinkedIn campaigns or Google Ads targeting the same keywords and personas. You know the message works. Now amplify it with paid reach.
- **If content is working:** Add SEO-optimized pillar content targeting higher-volume keywords. Use a blog structure optimized for GEO, or generative engine optimization (answer-first paragraphs, FAQ sections, question-based headers), to maximize both Google ranking and LLM citation.
- **If referrals are working:** Build a formal referral program. Offer existing customers an incentive (discount, credit, feature access) for every qualified referral that converts. Referral customers have 2-3x higher LTV than cold-acquired customers.
#### Weeks 11-12: Measure and Forecast
By day 90, you should be able to answer these questions with data:
- **How much does it cost to acquire a customer?** Total sales and marketing spend divided by new customers. This is your CAC. For healthy SaaS, CAC should be recoverable within 12 months of the customer's revenue.
- **What's the conversion rate at each pipeline stage?** Outreach → conversation → qualified → proposal → closed. Knowing these rates lets you forecast revenue by controlling inputs. Need 10 new customers next month? Work backwards through your own rates: for example, that might be 30 proposals, 50 qualified conversations, and 100 outreach messages.
- **What's the payback period?** How many months until a new customer generates enough revenue to cover their acquisition cost? Under 6 months is excellent. Under 12 is healthy. Over 18 means your pricing or acquisition cost needs work.

These numbers become the foundation of your pitch to investors, your hiring plan, and your monthly operating cadence.
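The work-backwards forecast can be sketched in a few lines of Python. This is a minimal illustration, not a forecasting tool: the function name `required_inputs` and the stage rates (1 in 2, 3 in 5, 1 in 3) are assumptions, chosen only so the result matches the 100 / 50 / 30 / 10 example. Substitute the rates you actually measure.

```python
from fractions import Fraction
from math import ceil


def required_inputs(target_customers, stage_rates):
    """Work backwards from a customer target to the volume needed at each stage.

    stage_rates is ordered top of funnel to bottom; each entry is
    (stage name, share of that stage that advances to the next one).
    """
    needed = {}
    volume = Fraction(target_customers)
    # Walk the funnel bottom-up: dividing by a stage's rate gives the
    # volume that stage must hold to feed the stage below it.
    for stage, rate in reversed(stage_rates):
        volume = volume / Fraction(rate)
        needed[stage] = ceil(volume)  # round up: partial proposals don't exist
    return needed


# Illustrative rates only (assumed, not benchmarks).
funnel = [
    ("outreach messages", Fraction(1, 2)),        # outreach -> qualified conversation
    ("qualified conversations", Fraction(3, 5)),  # qualified -> proposal
    ("proposals", Fraction(1, 3)),                # proposal -> closed
]

plan = required_inputs(10, funnel)
print(plan)
```

With these assumed rates, a target of 10 customers needs 30 proposals, 50 qualified conversations, and 100 outreach messages. `Fraction` is used instead of floats so that a rate like one in three does not round 30 up to 31.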
### GTM Strategy by Business Type
The 90-day framework applies to every business, but the specific tactics differ based on what you're selling.
#### B2B SaaS Under $100/month
Best GTM motion: Product-led growth + content marketing. At this price point, the sales cycle needs to be near-zero. Focus on self-serve signup, free trial or freemium model, and SEO-driven content that attracts your ideal user. Direct sales doesn't scale at this price. The math doesn't work when CAC needs to be under $100.
#### B2B SaaS $100-500/month
Best GTM motion: Content marketing + founder-led sales. This is the sweet spot where content attracts inbound leads and the founder (or a small sales team) converts them. Each customer is worth $1,200-6,000/year, which justifies 1-2 hours of sales effort per deal.
#### B2B SaaS $500+/month
Best GTM motion: Outbound sales + strategic partnerships + content authority. At this price point, every customer is worth $6,000+/year. Invest in targeted outreach, build relationships with complementary service providers who can refer clients, and publish authority-building content (case studies, benchmarks, industry analysis) that establishes credibility.
#### Consulting / Services
Best GTM motion: Personal brand + referral network + content marketing. Your GTM strategy IS your personal brand. Publish insights from your work (anonymized). Speak at events your target clients attend. Build a referral network with complementary service providers. At $5,000-20,000/month engagements, you need 3-5 active clients, not 3,000.
### The GTM Metrics That Actually Matter
Ignore vanity metrics (website traffic, social media followers, email list size). Track these five numbers weekly:
1. Qualified conversations per week. This is the leading indicator of revenue. If this number is growing, revenue will follow. If it's flat or declining, nothing else matters.
2. Conversion rate: conversation to customer. This tells you whether your sales process works. Below 10%: your targeting or pitch needs work. 20-30%: you're in a healthy range. Above 40%: you might be qualifying too aggressively and missing opportunities.
3. Time to close. How many days from first conversation to payment? Shorter is better, but consistency matters more. If your average is 21 days, you can forecast revenue 3 weeks out.
4. Customer acquisition cost (CAC). All sales and marketing spend divided by new customers acquired. Track monthly. If CAC is rising, your channels are saturating or your messaging is degrading.
5. Revenue per customer per month. This tells you whether you're attracting the right customers at the right price point. If it's rising, your positioning is improving. If it's falling, you're attracting less-qualified buyers.
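As a rough sketch, the five numbers can be computed from a single weekly snapshot of your CRM or spreadsheet. The `WeekSnapshot` record and its field names are hypothetical, and the sample figures are made up for illustration. The rate bands in `conversion_signal` follow the thresholds in item 2; the gaps between them (10-20% and 30-40%) are not labeled there, so the sketch reports them as unlabeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WeekSnapshot:
    """One week of pipeline data (field names are hypothetical)."""
    qualified_conversations: int
    new_customers: int
    days_to_close: List[int]        # first conversation to payment, per closed deal
    sales_marketing_spend: float
    monthly_revenue: float          # from all active customers
    active_customers: int


def _ratio(numerator: float, denominator: float) -> Optional[float]:
    """Divide, returning None when there is nothing to divide by yet."""
    return numerator / denominator if denominator else None


def gtm_metrics(week: WeekSnapshot) -> dict:
    """Compute the five weekly numbers from one snapshot."""
    return {
        "qualified_conversations": week.qualified_conversations,
        "conversion_rate": _ratio(week.new_customers, week.qualified_conversations),
        "avg_days_to_close": _ratio(sum(week.days_to_close), len(week.days_to_close)),
        "cac": _ratio(week.sales_marketing_spend, week.new_customers),
        "revenue_per_customer": _ratio(week.monthly_revenue, week.active_customers),
    }


def conversion_signal(rate: float) -> str:
    """Map a conversation-to-customer rate onto the bands described above."""
    if rate < 0.10:
        return "targeting or pitch needs work"
    if 0.20 <= rate <= 0.30:
        return "healthy range"
    if rate > 0.40:
        return "possibly qualifying too aggressively"
    return "unlabeled band: watch the trend"


# Made-up sample week.
week = WeekSnapshot(12, 3, [18, 21, 24], 1500.0, 4200.0, 14)
metrics = gtm_metrics(week)
print(metrics, conversion_signal(metrics["conversion_rate"]))
```

Returning `None` rather than raising on an empty week keeps the sketch usable in the first weeks, when there may be no closed deals to average.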
### The GTM Plan Template
Here's the one-page GTM plan I fill out with every client before we start execution. Fill in each line:
- **Target customer:** [One sentence: title, company type, specific problem]
- **Positioning:** [One sentence using the formula]
- **Primary channel:** [Where your target customer already looks for solutions]
- **Pricing:** [Price point and model]
- **Sales motion:** [Self-serve / founder-led / outbound]
- **Day 30 goal:** [Number of paying customers]
- **Day 60 goal:** [Pipeline size and conversion rate]
- **Day 90 goal:** [Monthly revenue target and CAC]

If you can't fill in every line, you're not ready to execute. Go back to customer conversations and positioning work until you can.
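If you keep the plan in code or a script rather than a document, the "every line filled in" rule is easy to check. The sketch below is one possible shape, assuming a hypothetical `GTMPlan` record whose fields mirror the eight template lines; the sample values are illustrative only.

```python
from dataclasses import dataclass, fields


@dataclass
class GTMPlan:
    """The one-page plan as a record. An empty string means 'not filled in yet'."""
    target_customer: str = ""
    positioning: str = ""
    primary_channel: str = ""
    pricing: str = ""
    sales_motion: str = ""
    day_30_goal: str = ""
    day_60_goal: str = ""
    day_90_goal: str = ""

    def missing_lines(self):
        """Return the names of any template lines that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def ready_to_execute(self):
        """True only when every line of the template has content."""
        return not self.missing_lines()


# Illustrative draft with only two lines filled in.
draft = GTMPlan(
    target_customer="Engineering managers at 20-50 person startups "
                    "who spend 3 months and $40K per engineering hire",
    sales_motion="founder-led",
)
print(draft.ready_to_execute(), draft.missing_lines())
```

A blank line in the plan shows up by name, which points directly at the customer conversation or positioning work still to be done.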
### Getting Started This Week
If you're pre-launch or early-stage, do three things this week:
1. Write your one-sentence positioning using the positioning formula.
2. List 10 people you know who match your target customer.
3. Send each of them a message asking for a 15-minute conversation about their experience with the problem you solve.
Those 10 conversations will teach you more about your go-to-market than any strategy document. They'll refine your positioning, reveal your best channel, surface objections you haven't considered, and potentially produce your first 2-3 customers.
Your GTM strategy connects to your positioning (what you say), your pricing (what you charge), and your sales process (how you close). Get all four right and you have a repeatable revenue engine.
If you need help building your GTM strategy, book a strategy call. I'll review your current approach and tell you where the biggest opportunity is — and what to stop doing.
### Frequently Asked Questions
**Q: What is a go-to-market strategy?**
A: A go-to-market (GTM) strategy is the plan for how you'll reach customers and generate revenue with a new product or in a new market. It covers who you're selling to, what problem you're solving for them, how you'll reach them, what you'll charge, and how you'll close deals. A good GTM strategy gets you to paying customers in 60-90 days.
**Q: What are the 5 elements of a go-to-market strategy?**
A: The five elements are: (1) Target customer — exactly who you're selling to, (2) Value proposition — why they should care, (3) Channel strategy — how you'll reach them, (4) Pricing and revenue model — what you'll charge and how, (5) Sales motion — how you'll convert interest to revenue. Most founders skip #1 and jump to #3, which is why most GTM strategies fail.
**Q: How long does it take to execute a go-to-market strategy?**
A: A focused GTM strategy takes 90 days from launch to repeatable revenue. Days 1-30: validate your positioning and find your first 10 customers through direct outreach. Days 31-60: refine your pitch and build a repeatable sales process. Days 61-90: scale what's working and kill what isn't. Most founders try to do all three simultaneously, which is why it takes them 12 months instead of 3.
**Q: What is the difference between a go-to-market strategy and a marketing strategy?**
A: A marketing strategy is ongoing — it covers brand awareness, content, advertising, and lead generation over the life of the product. A GTM strategy is time-bound — it covers the specific plan for entering a market and getting to initial revenue. Your GTM strategy is the first 90 days. Your marketing strategy is everything after. The GTM strategy determines what your marketing strategy should focus on.
**Q: What is the best go-to-market strategy for B2B SaaS?**
A: It depends on price point. For B2B SaaS under $100/month: product-led growth combined with content marketing, because the sales cycle needs to be near-zero and direct sales doesn't pay for itself. For $100-500/month: content marketing combined with founder-led sales, where the founder handles the first 30-50 sales personally while an SEO-driven content engine generates inbound leads. For $500+/month: targeted outbound combined with strategic partnerships and authority-building content. At higher price points, each customer relationship justifies more sales investment.
**Q: How do you choose the right channel for your GTM strategy?**
A: Don't choose a channel — find your customers and go where they already are. If they're on LinkedIn, do LinkedIn outreach. If they attend specific conferences, attend those conferences. If they search Google for solutions, invest in SEO. The best channel is wherever your ideal customer already spends time looking for answers to the problem you solve.
**Q: What is the most common GTM mistake?**
A: Starting with channels instead of customers. Founders ask 'should we do LinkedIn ads or Google ads?' before they ask 'who exactly is our customer and where do they look for solutions?' The channel question is irrelevant until you know who you're trying to reach. Start with 10 manual conversations with your target customer, then scale the channel that generated those conversations.
**Q: Do I need a GTM strategy for a side project or indie product?**
A: Yes, but a simpler version. Even a side project needs a clear answer to: who is this for, where will they find it, and why will they pay? The indie GTM playbook: launch on Product Hunt or Hacker News, share in 3-5 relevant communities, do direct outreach to 20 people who match your target customer. If none of those produce paying customers, the product or positioning needs work.
---
## SaaS Pricing Strategy: How to Stop Leaving Revenue on the Table
- **URL:** https://justinmckelvey.com/blog/saas-pricing-strategy
- **Published:** April 16, 2026
- **Updated:** April 16, 2026
- **Category:** Pricing Strategy
- **Reading time:** 8 min
- **Description:** SaaS pricing framework for founders. Pricing models, 3-tier strategy, unit economics, and how to raise prices without losing customers.
### TL;DR: Most Founders Are Underpriced
If nobody has ever told you your product is too expensive, you're leaving 30-50% of revenue on the table. After working with 50+ products across every pricing model, I've seen the same pattern: founders default to "affordable" pricing because they're afraid of rejection, then slowly realize their $29/month product is solving a $500/month problem. As of 2026, SaaS pricing is the second-highest-leverage decision a founder makes (after product-market fit), yet most founders spend less time on pricing than they spend choosing a domain name.
This guide covers the pricing framework I use with every client: which model to choose, how to structure tiers, what to charge, and how to raise prices without losing customers. Real numbers, real examples, no theory.
### The 4 SaaS Pricing Models (And When to Use Each)
#### 1. Flat-Rate Pricing
How it works: One product, one price, one plan. Everyone pays the same amount regardless of usage or team size. Example: a $49/month tool with no plan tiers.
When it works: Early-stage products with a single clear use case. When your customer base is homogeneous — everyone uses the product the same way and gets the same value. Basecamp used this model famously.
When it doesn't: When different customers get dramatically different value. A solo freelancer and a 50-person team shouldn't pay the same price — you're either overcharging the freelancer or undercharging the team. Most products outgrow flat-rate pricing within 12 months.
#### 2. Tiered Pricing (Recommended for Most SaaS)
How it works: 2-4 plans at different price points with different feature sets or usage limits. The classic Starter / Growth / Enterprise structure.
When it works: Almost always. Tiered pricing captures different willingness-to-pay segments, creates clear upgrade paths, and lets you serve multiple customer types without building multiple products. 80%+ of successful SaaS companies use tiered pricing.
The optimal structure: Three tiers. The lowest tier is your entry point (attract customers). The middle tier is your money-maker (60-70% of customers should land here). The highest tier is your enterprise play (capture maximum value from large customers).
#### 3. Usage-Based Pricing
How it works: Customers pay for what they consume: API calls, messages sent, storage used, transactions processed. Examples: Twilio, AWS, Stripe.
When it works: When usage directly correlates with value received. If sending more emails generates more revenue for the customer, charging per email aligns your interests. Also works when usage varies dramatically between customers.
When it doesn't: When customers can't predict their bill. Usage-based pricing creates anxiety — "will this month cost $50 or $500?" — that slows adoption. Hybrid models (base fee + usage) solve this by providing a predictable floor.
#### 4. Per-Seat Pricing
How it works: Price per user per month. $10/user/month means a 5-person team pays $50 and a 50-person team pays $500.
When it works: When each additional user genuinely receives independent value (collaboration tools, sales CRMs, project management). Per-seat pricing scales naturally with the customer's organization.
When it doesn't: When one person configures the tool and everyone else just views output (analytics dashboards, reporting tools). Charging per seat for viewers feels like a tax. Use per-seat for editors and give free viewer access — it accelerates adoption.
### The Value-Based Pricing Framework
Regardless of which model you choose, the price itself should be based on value, not cost. Here's how to calculate it.
Step 1: Quantify the status quo cost. What does your customer currently pay to solve this problem? Include money (existing tools, services), time (hours spent on manual processes), and opportunity cost (revenue lost by not having a better solution). Be specific: "$200/month on tools + 10 hours/month at $75/hour = $950/month total cost."
Step 2: Price at 20-30% of the value. If you save the customer $950/month, that works out to $190-285, so pricing at roughly $200-300/month is a clear win for them and a healthy margin for you. They get a 3-5x return on their investment, and you capture enough value to build a sustainable business.
Step 3: Validate with real conversations. Ask 10 prospects: "If this tool saved you [specific outcome], what would you expect to pay?" Then add 20% to whatever they say. People consistently understate their willingness to pay in hypothetical conversations.
The cardinal rule: If zero out of 10 prospects say "that's too expensive," you're too cheap. The ideal is 10-20% pushback on price. That means you're priced at the maximum the market will bear while still converting the majority.
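The arithmetic in these steps is simple enough to sketch in Python. The function names are hypothetical, the inputs reuse the $200 tools plus 10 hours at $75/hour example from Step 1, and the pushback bands follow the cardinal rule (0% means too cheap, 10-20% is the ideal). Treat it as a worksheet under those assumptions, not a pricing model.

```python
def status_quo_cost(tool_spend, manual_hours, hourly_rate, opportunity_cost=0):
    """Step 1: monthly cost of solving the problem the current way."""
    return tool_spend + manual_hours * hourly_rate + opportunity_cost


def price_range(monthly_value, low_pct=20, high_pct=30):
    """Step 2: price band at 20-30% of the value delivered."""
    return monthly_value * low_pct / 100, monthly_value * high_pct / 100


def customer_return(monthly_value, price):
    """How many dollars of value the customer gets per dollar paid."""
    return monthly_value / price


def pushback_signal(objections, prospects):
    """Cardinal rule: share of prospects who say the price is too high."""
    share = objections / prospects
    if share == 0:
        return "too cheap"
    if 0.10 <= share <= 0.20:
        return "ideal range"
    return "above ideal range" if share > 0.20 else "below ideal range"


value = status_quo_cost(tool_spend=200, manual_hours=10, hourly_rate=75)
low, high = price_range(value)
print(value, low, high)                    # monthly value and the 20-30% band
print(customer_return(value, high), customer_return(value, low))
print(pushback_signal(0, 10), pushback_signal(2, 10))
```

For the $950/month example, the band is $190 to $285, which gives the customer between about 3.3x and 5x back on what they pay.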
### How to Structure Your 3-Tier Pricing
This is the pricing architecture that works for most SaaS products in 2026. Adapt the numbers to your market, but keep the structure.
#### Tier 1: Starter ($29-79/month)
Purpose: get customers in the door. This tier should include the core product with enough value to be genuinely useful, but with limits that naturally push growing users to upgrade.
Common limits: 1 user, limited storage/usage, basic features, community support only. Don't cripple this tier — unhappy Starter customers don't upgrade, they churn.
#### Tier 2: Growth ($99-249/month), Labeled "Most Popular"
Purpose: capture the majority of your revenue. This tier removes the Starter limitations and adds the features that teams need: multiple users, integrations, automation, priority support.
Price the Growth tier at 2-4x the Starter tier. This ratio makes Starter feel like a great deal (entry point) while making Growth the obvious choice for anyone who's serious. The "Most Popular" badge acts as social proof and anchoring.
#### Tier 3: Scale/Enterprise ($299-999/month or custom)
Purpose: capture maximum value from high-use customers. This tier includes everything in Growth plus: unlimited usage, SSO, custom integrations, SLA, dedicated support, and advanced analytics.
Don't list a price above $499/month for the Enterprise tier on your website — use "Contact us" instead. This signals that you're worth more than what a pricing page can convey and lets you custom-price based on the customer's size and needs.
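A quick way to sanity-check a tier ladder against the 2-4x guideline is to compute the ratio between each pair of adjacent tiers. The sketch below assumes hypothetical function names and applies the same 2-4x band to every step, which is an extension of the Starter-to-Growth rule rather than something the tier descriptions state for the Enterprise step.

```python
def tier_ratios(prices):
    """Ratio of each tier's price to the tier directly below it."""
    return [higher / lower for lower, higher in zip(prices, prices[1:])]


def check_ladder(prices, min_ratio=2.0, max_ratio=4.0):
    """Return a list of problems; an empty list means every step is in band."""
    problems = []
    for step, ratio in enumerate(tier_ratios(prices), start=1):
        if ratio < min_ratio:
            problems.append(
                f"tier {step} -> {step + 1}: {ratio:.1f}x step gives little reason to upgrade"
            )
        elif ratio > max_ratio:
            problems.append(
                f"tier {step} -> {step + 1}: {ratio:.1f}x step leaves customers stuck in between"
            )
    return problems


print(check_ladder([49, 149, 399]))  # every step inside 2-4x
print(check_ladder([29, 299]))       # one step, far above 4x
print(check_ladder([29, 39, 49]))    # both steps below 2x
```

The second and third calls flag the gaps that are too wide and too narrow respectively, while the first ladder passes with no problems.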
### The 5 Pricing Mistakes That Kill SaaS Revenue
#### 1. Pricing Based on Cost, Not Value
"It costs us $5/month to serve each customer, so we'll charge $15/month for a 3x margin." This ignores that your product might save the customer $500/month. A $15/month price for a $500/month value is a gift, not a business. Price on value delivered, not cost incurred.
#### 2. Too Many Tiers
Five pricing tiers create decision paralysis. The customer can't figure out which one to buy, so they buy nothing. Three tiers is optimal: one for individuals, one for teams, one for enterprises. If you need more than three, you're serving too many customer segments with one product.
#### 3. No Upgrade Path
If the gap between your tiers is too large ($29 to $299 with nothing in between), customers stuck between them churn. If the gap is too small ($29 to $39 to $49), there's no incentive to upgrade. The 2-4x ratio between tiers creates natural step-ups.
#### 4. Waiting Too Long to Raise Prices
Your product improves every month. Your value increases every quarter. But your price stays the same for years. This is the most common revenue leak in SaaS. If you haven't raised prices in 12 months and your product is significantly better than it was, you're undercharging every new customer.
#### 5. Offering Discounts Instead of Solving Objections
"It's too expensive" almost never means the price is too high. It means the prospect doesn't understand the value. Instead of discounting, ask: "Compared to what?" Then reframe the value. If they're comparing you to doing it manually, show them the time cost. If they're comparing you to a cheaper competitor, show them what the competitor doesn't do.
### How to Raise Prices Without Losing Customers
The playbook I use with every client who needs a price increase:
- **Grandfather existing customers at their current rate for 6-12 months.** This eliminates the immediate churn risk and gives you time to demonstrate additional value before their rate changes.
- **Announce 30+ days in advance.** Nobody likes surprises on their credit card statement. A 30-day notice email that explains the change, why it's happening, and what new value they're getting is the minimum.
- **Pair the increase with genuine value.** Launch a new feature, improve performance, or add a new integration on the same day the price increase takes effect. This reframes the conversation from "you're charging me more" to "I'm getting more."
- **Apply new pricing to new customers first.** Run the new pricing for 2-3 months with new signups before rolling it out to existing customers. This proves the market will bear the new price before you risk existing relationships.

The data from companies I've worked with: a 20-30% price increase typically causes less than 5% churn. The 95% of customers who stay generate enough additional revenue to more than compensate for the 5% who leave. Most of the 5% were your least profitable customers anyway.
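The trade-off between a price increase and the churn it causes reduces to one multiplication. The sketch below assumes every customer pays the same price and that churned customers paid the average amount (a conservative assumption if the leavers were in fact the least profitable). The function names are hypothetical.

```python
def revenue_multiplier(price_increase, churn):
    """Revenue after the change, relative to revenue before it.

    Both arguments are fractions: 0.20 means a 20% increase, 0.05 means 5% churn.
    """
    return (1 + price_increase) * (1 - churn)


def breakeven_churn(price_increase):
    """Share of customers you could lose before the increase stops paying off.

    Solves (1 + p) * (1 - c) = 1 for c.
    """
    return price_increase / (1 + price_increase)


for increase in (0.20, 0.30):
    print(
        f"{increase:.0%} increase, 5% churn -> "
        f"{revenue_multiplier(increase, 0.05):.3f}x revenue; "
        f"break-even churn {breakeven_churn(increase):.1%}"
    )
```

Under these assumptions, a 20% increase with 5% churn leaves revenue at 1.14x, and you would need to lose roughly one customer in six before the increase became a net loss.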
### Unit Economics Every Founder Must Know
Pricing doesn't exist in isolation. It connects to three numbers that determine whether your business is viable:
CAC (Customer Acquisition Cost): How much does it cost to acquire one paying customer? Include ad spend, sales team costs, and marketing time. For healthy SaaS: CAC should be less than 12 months of revenue from that customer.
LTV (Lifetime Value): How much total revenue does an average customer generate before they churn? For healthy SaaS: LTV should be at least 3x CAC. 5x or higher is excellent.
Gross Margin: Revenue minus cost to serve, divided by revenue. SaaS benchmark is 70-85%. If your margin is below 60%, your infrastructure costs are too high or your price is too low.
These three numbers tell you whether your pricing works. High CAC + low LTV = you're underpriced or targeting the wrong customer. Low CAC + high LTV = you probably have room to raise prices.
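Here is a minimal Python sketch of the three numbers and the thresholds attached to them. Two simplifications are assumptions of the sketch, not part of the definitions above: LTV is estimated as monthly revenue times average customer lifetime in months, and payback is CAC divided by monthly revenue per customer. The sample figures are invented.

```python
def lifetime_value(monthly_revenue, lifetime_months):
    """Total revenue an average customer generates before churning (simplified)."""
    return monthly_revenue * lifetime_months


def gross_margin(revenue, cost_to_serve):
    """Revenue minus cost to serve, divided by revenue."""
    return (revenue - cost_to_serve) / revenue


def payback_months(cac, monthly_revenue):
    """Months of revenue needed to recover the acquisition cost."""
    return cac / monthly_revenue


def ltv_to_cac_band(ratio):
    """LTV should be at least 3x CAC; 5x or higher is excellent."""
    if ratio >= 5:
        return "excellent"
    if ratio >= 3:
        return "healthy"
    return "below the 3x floor"


def margin_band(margin):
    """SaaS benchmark is 70-85%; below 60% signals a cost or price problem."""
    if margin >= 0.70:
        return "at or above benchmark"
    if margin < 0.60:
        return "infrastructure cost too high or price too low"
    return "below benchmark"


# Invented example: $150/month customer, 25-month lifetime, $1,200 CAC, $30/month to serve.
ltv = lifetime_value(150, 25)
ratio = ltv / 1200
print(ltv, round(ratio, 2), ltv_to_cac_band(ratio))
print(payback_months(1200, 150), margin_band(gross_margin(150, 30)))
```

In this example LTV is $3,750 against a $1,200 CAC, a ratio just over 3x, with acquisition cost recovered in 8 months, inside the 12-month guideline.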
### Getting Started
The fastest way to improve your pricing: talk to 5 customers this week. Ask them: "What would you do if we doubled our price?" Most will say they'd stay. Some will tell you exactly what additional value would justify the increase. That conversation is worth more than any pricing analysis.
For the complete guide on raising prices without losing customers, read the price increase playbook. Your GTM strategy also depends heavily on pricing decisions. Your pricing strategy connects directly to your product positioning (how customers perceive value) and your sales process (how you communicate value in conversations). Get all three right and revenue takes care of itself.
If you need help restructuring your pricing, book a strategy call. I'll review your current pricing and unit economics, and tell you where the revenue opportunity is.
### Frequently Asked Questions
**Q: What are the main SaaS pricing models?**
A: The four main SaaS pricing models are: flat-rate (one price, one plan), tiered (2-4 plans at different price points), usage-based (pay for what you use), and per-seat (price per user). Most successful SaaS products use tiered pricing with 3 plans because it offers clear upgrade paths and captures different willingness-to-pay segments.
**Q: How do you price a SaaS product?**
A: Start with your customer's cost of the status quo (what they pay now in time, money, or missed opportunities), then price at 20-30% of that value. If your tool saves a company $1,000/month, price it at $200-300/month. Validate by talking to 10 prospects: if nobody pushes back on price, you're too cheap.
**Q: What is the best pricing structure for SaaS?**
A: Three-tier pricing works best for most SaaS products. A Starter plan ($29-79/month) for individual users, a Growth plan ($99-249/month) for teams, and an Enterprise/Scale plan ($299-999/month or custom) for larger organizations. The middle tier should be highlighted as 'most popular'; it's where 60-70% of customers should land.
**Q: How do you know if your SaaS is priced correctly?**
A: Three signals your pricing is right: 10-20% of prospects say 'that's too expensive' (lower means you're too cheap), customers upgrade at a steady rate (shows the tier structure works), and your gross margin is above 70% (SaaS benchmark). If nobody ever complains about price, you're definitely underpriced.
**Q: Should SaaS founders offer a free plan?**
A: Offer a free plan only if your product has a viral loop (users invite other users) or if the free tier serves as a lead magnet with clear upgrade triggers. Otherwise, use a free trial (14 days is standard) instead. Free plans attract users who never convert and cost you support resources. Free trials attract buyers who are evaluating.
**Q: How do you raise SaaS prices without losing customers?**
A: Grandfather existing customers at their current rate for 6-12 months. Announce the increase 30+ days in advance. Pair the price increase with a genuine feature or value addition. Apply new pricing to new customers first. Most SaaS companies that raise prices by 20-30% lose less than 5% of customers — the revenue gain far outweighs the churn.
---
## Product Positioning for Founders: How to Explain What You Do So People Actually Buy
- **URL:** https://justinmckelvey.com/blog/product-positioning-for-founders
- **Published:** April 16, 2026
- **Updated:** April 16, 2026
- **Category:** Go-to-Market Strategy
- **Reading time:** 7 min
- **Description:** Product positioning framework for founders. Turn confused prospects into customers by explaining what you do in one sentence. 50+ examples.
### TL;DR: The Positioning Formula
If you can't explain what your product does in one sentence that makes someone say "I need that," your positioning is broken. Product positioning is the most underleveraged growth tool in startups. It's not a marketing exercise. It's the foundation of everything: your pitch, your homepage, your sales conversations, your pricing, even which features you build next. Over 1,300 people search for "product positioning" every month, and 720 more search for "positioning strategy." Most of the advice they find is abstract theory. This guide is the practical framework I use with clients to turn confused prospects into paying customers, based on 50+ products shipped over 15 years.
### Why Positioning Is the First Thing Founders Get Wrong
Every struggling startup I've worked with has the same root problem: they can't clearly articulate what they do and why it matters. The symptoms look like other problems (slow sales, high churn, low conversion rates), but the cause is almost always positioning.
The test is simple. Ask three people on your team to describe your product to a stranger. If you get three different answers, your positioning is broken. If the stranger responds with "oh, so it's like [wrong competitor]," your positioning is broken. If the best description takes more than 15 seconds, your positioning is broken.
Good positioning does three things simultaneously: it tells the customer what category you're in (so they know how to evaluate you), what makes you different (so they know why to choose you), and who you're for (so the right people self-select in and the wrong people self-select out).
### The One-Sentence Positioning Formula
Every product should be describable in one sentence. Here's the formula:
"For [specific customer] who [specific problem], [product name] is the [category] that [key differentiator]."
Each word matters. Let me break it down.
"For [specific customer]" — Not "businesses." Not "teams." A specific person with a specific role at a specific type of company. "For engineering managers at 20-50 person startups" is good. "For businesses" is useless.
"who [specific problem]" — The problem must be felt, not theoretical. "Who struggle with technical hiring timelines" is felt. "Who need better HR solutions" is theoretical. The best problems include a measurable cost: "who spend 3 months and $40K per engineering hire."
"[product name] is the [category]" — The category tells people how to evaluate you. If you're a CRM, they compare you to CRMs. If you're a "revenue operating system," they have no idea what you are. Pick a category that exists. You can redefine it later — but only after people understand what you do in the first place.
"that [key differentiator]" — What makes you different from the obvious alternative? Not a feature list. One thing. "That reduces hiring time from 3 months to 3 weeks." "That costs 90% less than traditional agencies." "That works without changing your existing workflow."
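Because the formula is a fill-in-the-blanks template, it maps directly onto a small function. The sketch below is illustrative: the function names are hypothetical, the product name "HireFast" and its category are invented, and the vague-customer check only covers the two words called out above.

```python
VAGUE_CUSTOMERS = {"businesses", "teams"}  # the two examples named above


def is_vague_customer(customer):
    """Flag customer descriptions that are too generic to be useful."""
    return customer.strip().lower() in VAGUE_CUSTOMERS


def positioning_statement(customer, problem, product, category, differentiator):
    """Assemble the one-sentence positioning formula from its five parts."""
    if is_vague_customer(customer):
        raise ValueError(f"'{customer}' is too generic: name a role and a company type")
    return (
        f"For {customer} who {problem}, "
        f"{product} is the {category} that {differentiator}."
    )


statement = positioning_statement(
    customer="engineering managers at 20-50 person startups",
    problem="spend 3 months and $40K per engineering hire",
    product="HireFast",                       # invented product name
    category="technical recruiting platform",  # invented category
    differentiator="reduces hiring time from 3 months to 3 weeks",
)
print(statement)
```

Writing five versions, as suggested later in the one-week process, then becomes a matter of calling the function with different customer, problem, and differentiator combinations.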
### Real Positioning Examples (Good and Bad)
Let me show you what this looks like in practice with products I've worked on. No client names, just the before and after.
#### Example 1: AI Lead Qualification Tool
Before (bad): "An AI-powered platform that leverages machine learning to optimize lead engagement and conversion across multiple touchpoints."
After (good): "For real estate teams spending 4-6 hours/day qualifying leads manually, this tool automatically scores and qualifies inbound leads for $50/month instead of $500/month alternatives."
What changed: Specific customer (real estate teams), specific problem (4-6 hours/day), specific category (lead qualification), specific differentiator ($50 vs $500). The "before" could describe 500 different products. The "after" makes one specific person think "that's exactly my problem."
#### Example 2: Sports Marketplace
Before (bad): "A social platform connecting athletes, coaches, and sports enthusiasts in a dynamic community ecosystem."
After (good): "For recreational tennis players who can't find hitting partners nearby, this app matches you with players at your level within 10 miles."
What changed: Specific customer (recreational tennis players, not "sports enthusiasts"), specific problem (can't find partners), specific differentiator (nearby, skill-matched). This product grew to 500K users. The good positioning is what made the first 1,000 possible.
#### Example 3: Consulting Business
Before (bad): "Technology consulting and digital transformation services for forward-thinking organizations."
After (good): "For startup founders who need senior technical leadership but can't afford a $300K CTO, I'm a fractional CTO who embeds with your team for $8-12K/month."
That's my own business. The repositioning from "technology consulting" to "fractional CTO for founders" tripled my inbound leads because the right people immediately understood what I do and self-selected in.
### The 5 Signals Your Positioning Is Broken
You don't need a consultant to diagnose bad positioning. These five signals are unmistakable:
#### 1. Prospects Consistently Misunderstand What You Do
If you spend the first 10 minutes of every sales call explaining your product instead of discussing the prospect's problem, your positioning isn't doing its job. Good positioning pre-qualifies conversations so that by the time someone talks to you, they already understand what you do.
#### 2. Your Sales Cycle Is Longer Than Competitors
Confused prospects take longer to buy. If your competitors close in 2 weeks and you close in 2 months, the product might be fine; the positioning is the bottleneck. People buy fast when they understand the value clearly.
#### 3. Customers Churn Saying "This Isn't What I Expected"
This is the most expensive positioning failure. You attracted customers with messaging that promised X, but the product delivers Y. The product might be great, but if the positioning sets the wrong expectations, even happy customers feel deceived.
#### 4. You Compete on Price Instead of Value
When prospects don't understand your differentiation, price becomes the only comparison. "Why should I pay you $100/month when [competitor] is $30/month?" is a positioning problem, not a pricing problem. If they understood the value difference, price wouldn't be the question.
#### 5. Your Team Describes the Product Differently
If marketing says "we're a project management tool," sales says "we're a collaboration platform," and the founder says "we're reimagining how teams work," nobody knows what you are. Internal misalignment always shows up as external confusion.
### How to Fix Your Positioning in One Week
This is the process I use with clients. It takes 5 days, not 5 months.
Day 1: Interview 5 of your best customers. Not your biggest or oldest — your best. Ask: "How would you describe what we do to a friend? What problem were you trying to solve when you found us? What other options did you consider? Why did you choose us?"
Day 2: Find the patterns. Your customers' words are better than your words. The language they use to describe your value is the language your positioning should use. If 4 out of 5 customers say "it saves me time on X," that's your differentiator — even if you thought your differentiator was something else.
Day 3: Write 5 versions of your one-sentence positioning. Use the formula. Make each version different — different customer emphasis, different problem framing, different differentiator. Don't judge yet. Just generate options.
Day 4: Test each version. Show all five versions to three people who match your ideal customer but haven't used your product. Ask: "Which one makes you want to learn more? Which one do you not understand? Which one describes something you'd actually pay for?" Their answers will converge on 1-2 clear winners.
Day 5: Deploy the winner everywhere. Homepage headline. LinkedIn bio. Sales email opening line. Investor pitch opening slide. The one sentence should appear, with minor variations, in every piece of external communication. Consistency compounds. Confusion compounds too — in the wrong direction.
### Positioning vs. Messaging vs. Branding
These three concepts get conflated constantly. Here's how they relate:
Positioning is strategic. It answers: what are we, who are we for, and why do we win? Positioning is the decision layer. It rarely changes quarter to quarter.
Messaging is tactical. It's how you express your positioning in specific contexts — your homepage copy, your sales scripts, your email sequences, your ad copy. Messaging changes frequently as you test and optimize.
Branding is experiential. It's the visual identity, voice, and feeling that makes your product recognizable. Branding without positioning is decoration. Positioning without branding is invisible.
Do them in order. Position first, then message, then brand. Most startups do it backwards — they design a logo, write copy, and then wonder why nobody buys.
### Getting Started
If you take one action from this post: write your one-sentence positioning statement using the formula above. Then say it out loud to someone who doesn't know your product. If they respond with a question about your product, the positioning is working. If they respond with "what do you mean?" it needs work.
Positioning feeds everything downstream. Your GTM strategy depends on it. Your pricing reflects it. Your MVP is easier to scope when you know exactly who it's for. Your sales conversations are easier when prospects already understand the value. Your marketing is cheaper when the message resonates immediately.
If you're struggling to articulate what your product does and why it matters, book a strategy call. I'll help you find the positioning that turns confused prospects into paying customers.
### Frequently Asked Questions
**Q: What is product positioning?**
A: Product positioning is how you define what your product is, who it's for, and why it's better than alternatives — in a way that makes your ideal customer immediately understand the value. Good positioning makes selling easy. Bad positioning makes every conversation an uphill battle.
**Q: How do you write a positioning statement?**
A: Use this formula: 'For [specific customer] who [specific problem], [product name] is the [category] that [key differentiator]. Unlike [alternative], we [unique advantage].' Fill in each blank with the most specific language possible. Vague positioning like 'we help businesses grow' doesn't work.
**Q: What is the difference between positioning and branding?**
A: Positioning is strategic — it defines what you are, who you serve, and why you win. Branding is the expression of positioning through visuals, voice, and experience. Positioning decides what to say. Branding decides how to say it. You need positioning first; branding without positioning is lipstick on a confused product.
**Q: How do you know if your positioning is wrong?**
A: Five signals: prospects consistently misunderstand what you do, your sales cycle is longer than your competitors', customers churn within 90 days saying 'this isn't what I expected,' you compete on price rather than value, and your team describes the product differently depending on who you ask.
**Q: How often should you update your positioning?**
A: Revisit positioning every time you: change your target customer, add a major feature, notice a shift in how customers describe your value, enter a new market segment, or get consistent feedback that prospects don't understand what you do. For most startups, that means reviewing positioning quarterly.
**Q: What are examples of good product positioning?**
A: Good positioning examples: Slack — 'where work happens' (repositioned from enterprise messaging to workplace hub). Basecamp — 'the all-in-one toolkit for working remotely' (positioned against complex project management). Superhuman — 'the fastest email experience' (positioned on speed, not features).
---
## Founder-Led Sales: How to Close Your First 50 Customers Without a Sales Team
- **URL:** https://justinmckelvey.com/blog/founder-led-sales
- **Published:** April 16, 2026
- **Updated:** April 16, 2026
- **Category:** Founder Sales
- **Reading time:** 8 min
- **Description:** The founder-led sales framework for closing your first 50 customers without a sales team. Real scripts, follow-ups, and what generated $53M+.
### TL;DR: The Founder-Led Sales System
Founder-led sales isn't about being a salesperson. It's about being useful. After 15 years of building products and generating $53M+ in revenue, I've learned that the founders who close the most deals aren't the ones with the best pitch; they're the ones who understand their customer's problem deeply enough to be genuinely helpful. As of 2026, "founder-led sales" gets 210 searches per month with a growing trend, because more founders are realizing that hiring a sales team before they understand their own sales process is one of the most expensive mistakes a startup can make.
This guide is the system I use with every founder I advise. It works whether you're selling $50/month SaaS or $50,000 consulting engagements. The principles are the same: find people with the problem you solve, be useful to them, and make it easy to buy.
### Why the Founder Must Sell First
Every founder I've met who hired a salesperson before closing their first 30 customers regrets it. Here's why:
You don't know your sales process yet. A salesperson executes a process. If you don't have a process, you're paying someone $80K-120K/year to figure one out — and they're less qualified to do it than you are because they don't know the product, the market, or the problem as deeply as you do.
You need the feedback loop. Every sales conversation is a product conversation. When a prospect says "I love this but I need X feature," that's product intel you can't get from analytics or surveys. When they say "I already use Y for this," that's competitive intelligence. When they say "my real problem is actually Z," that's a potential pivot. Founders who delegate sales too early cut off the most valuable feedback channel in the business.
Customers buy from founders differently. When the person selling is also the person who built the product, prospects trust the conversation more. They ask harder questions and get better answers. They're more willing to take a chance on an early-stage product because they know they're talking to someone who can actually fix problems. This advantage disappears the moment you hand sales to someone who has to say "let me check with the team."
### The 4-Step Founder Sales Framework
This framework works for B2B SaaS, consulting, services, and even consumer products with a direct sales component. Each step builds on the last.
#### Step 1: Define Your Ideal Customer in One Sentence
Not a persona document. Not a segment analysis. One sentence: "[Title] at [company type] who struggles with [specific problem] and currently solves it with [current workaround]."
Examples from products I've built:
"Engineering managers at 20-50 person startups who struggle with technical hiring and currently rely on recruiters charging 20-25% of salary."
"Solo consultants earning $150K-500K who waste 10+ hours/week on admin tools and currently juggle 4-5 SaaS subscriptions that don't talk to each other."
If you can't write this sentence, you're not ready to sell. Go talk to 10 potential customers first and come back.
#### Step 2: Build a Pipeline of 50 Prospects
You need 50 names. Not 500. Not 5,000. Fifty real people who match your one-sentence definition, with their name, company, email, and the reason you think they have the problem you solve.
Where to find them:
- **LinkedIn Sales Navigator ($80/month):** Filter by title, company size, industry, and geography. Save 50 profiles. This is the fastest path for B2B. Cancel after 2 months; you'll have your initial pipeline by then.
- **Communities and forums:** Where do your ideal customers hang out? Slack groups, Discord servers, Reddit communities, industry forums. Don't spam. Join the conversation, be helpful, and identify people who express the problem you solve.
- **Your existing network:** You already know 5-10 potential customers or people who can introduce you to them. Start here. Warm introductions convert at 30-50%. Cold outreach converts at 2-5%.
- **Conferences and events:** Not for the stage. For the hallway conversations. One good conversation at an industry event is worth 50 cold emails.
#### Step 3: The Discovery Call Framework
The goal of the first call is NOT to sell. It's to understand whether this person has the problem you solve and whether your solution is the right fit. You should talk less than 30% of the time.
The 5 questions that close deals:
1. **"What's the biggest challenge you're facing with [problem area] right now?"** Open-ended. Let them talk. The answer tells you whether they actually have the problem and how painful it is.
2. **"How are you solving it today?"** This reveals the status quo, which is what you're really competing against. Usually it's not a competitor. It's spreadsheets, manual processes, or nothing.
3. **"What does that cost you in time, money, or missed opportunities?"** Help them quantify the pain. If they can't, the problem isn't painful enough to pay for a solution.
4. **"If you could wave a magic wand, what would the ideal solution look like?"** Their answer tells you whether your product matches their expectation. If it does, you're aligned. If it doesn't, you've saved yourself a mismatched customer.
5. **"What would need to be true for you to try something new?"** This surfaces objections before you pitch. Price? Integration? Team buy-in? Timeline? Now you know exactly what to address.
After these five questions, you know whether to pitch or politely disqualify. Both are good outcomes. Wasting time on bad-fit prospects is the most common founder sales mistake.
#### Step 4: The Follow-Up System That Actually Closes
Most deals don't close on the first call. They close between follow-up 3 and follow-up 5. The founders who close the most are the ones who follow up consistently without being annoying.
The sequence:
- **Same day (after call):** Send a recap email. Summarize what they told you, the specific pain points, and the next step you agreed on. This proves you listened and creates a written record they can share with their team.
- **Day 3:** Share something useful: an article relevant to their problem, a data point from your industry research, or an introduction to someone in your network who could help them (even if it's unrelated to your product). This builds trust without asking for anything.
- **Day 7:** Check in with a specific question related to your conversation. "You mentioned your team spends 10 hours/week on manual reporting. Did you get a chance to calculate the exact cost? I ran some numbers that might be useful."
- **Day 14:** Offer a new angle. A case study from a similar company. A limited pilot offer. A revised proposal based on what you learned. Something that gives them a reason to re-engage.
- **Day 30:** The honest check-in. "Is this still a priority for you? No pressure either way. I'd rather know so I can focus my time appropriately." This gives them permission to say no (which frees up your pipeline) or re-engage with urgency.
The key principle: every follow-up should add value. If your follow-up is "just checking in" with no new information, it's noise. If it shares something useful, it's relationship-building.
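The sequence above can be turned into concrete due dates for each prospect. The following is a minimal sketch, not a prescribed tool: the day offsets come from the sequence, while the function name `follow_up_schedule` and the short action labels are illustrative assumptions.

```python
from datetime import date, timedelta

# Day offsets from the follow-up sequence above; labels are shortened summaries.
FOLLOW_UP_SEQUENCE = [
    (0, "Recap email: pain points and agreed next step"),
    (3, "Share something useful (article, data point, introduction)"),
    (7, "Check in with a specific question from the call"),
    (14, "Offer a new angle (case study, pilot, revised proposal)"),
    (30, "Honest check-in: is this still a priority?"),
]


def follow_up_schedule(call_date: date) -> list[tuple[date, str]]:
    """Return (due date, action) pairs for one prospect, starting at the call date."""
    return [
        (call_date + timedelta(days=offset), action)
        for offset, action in FOLLOW_UP_SEQUENCE
    ]


schedule = follow_up_schedule(date(2026, 4, 16))
for due, action in schedule:
    print(due.isoformat(), "|", action)
```

Each prospect gets the same five touchpoints, so a spreadsheet or lightweight CRM only needs the call date to generate the reminders.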
### The Numbers You Should Track
Founder-led sales without metrics is guessing. Track these five numbers weekly:
- **Outreach sent:** How many new prospects did you contact? Target: 20-40 per week.
- **Conversations booked:** How many discovery calls did you schedule? Target: 5-10 per week. If this number is low relative to outreach, your messaging needs work.
- **Qualified opportunities:** How many conversations resulted in a genuine fit? Target: 40-60% of conversations. If lower, you're targeting the wrong people.
- **Proposals/trials sent:** How many qualified prospects moved to the next step? Target: 50-70% of qualified. If lower, your pitch or pricing needs work.
- **Closed deals:** How many became paying customers? Target: 30-50% of proposals. If lower, your follow-up or closing process needs work.
These numbers create a funnel that tells you exactly where to focus. Low outreach-to-conversation rate? Fix your messaging. Low conversation-to-qualified rate? Fix your targeting. Low close rate? Fix your follow-up.
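As a rough illustration of how these five numbers become a diagnosis, here is a small sketch that computes stage-to-stage conversion rates and flags any stage below the low end of its target range. The minimum rates and suggested fixes are taken from the targets above; the function name `weekly_funnel`, the stage labels, and the sample counts are hypothetical.

```python
def rate(numerator: int, denominator: int) -> float:
    """Conversion rate between two funnel stages; 0.0 when the upstream stage is empty."""
    return numerator / denominator if denominator else 0.0


# (stage label, low end of the target range, what to fix if the rate falls below it)
STAGES = [
    ("conversation -> qualified", 0.40, "targeting"),
    ("qualified -> proposal", 0.50, "pitch or pricing"),
    ("proposal -> closed", 0.30, "follow-up or closing process"),
]


def weekly_funnel(outreach, conversations, qualified, proposals, closed):
    """Return (rates per stage, list of areas that need work)."""
    counts = [conversations, qualified, proposals, closed]
    # No explicit percentage target is given for this first stage, so it is only reported.
    report = {"outreach -> conversation": rate(conversations, outreach)}
    needs_work = []
    for (label, minimum, fix), upstream, downstream in zip(STAGES, counts, counts[1:]):
        stage_rate = rate(downstream, upstream)
        report[label] = stage_rate
        if stage_rate < minimum:
            needs_work.append(fix)
    return report, needs_work


# Hypothetical week: 30 outreach, 6 calls, 2 qualified, 1 proposal, 0 closed.
report, needs_work = weekly_funnel(30, 6, 2, 1, 0)
for label, value in report.items():
    print(f"{label}: {value:.0%}")
print("Needs work:", needs_work)
```

With weekly volumes this small, a single deal moves a rate by tens of percentage points, so it is probably more useful to read the rates over several weeks than to react to one.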
### Common Founder Sales Mistakes
Pitching before understanding. If you start talking about your product before asking about their problem, you're selling what you built instead of what they need. Those are often different things.
Discounting too quickly. When a prospect says "it's too expensive," most founders immediately offer a discount. The right response is: "Help me understand — compared to what?" The objection is rarely about the absolute number. It's about perceived value relative to alternatives.
Giving up after one follow-up. 80% of sales require 5+ touchpoints. Most founders quit after 1-2. The fortune is in the follow-up — but only if each follow-up adds value.
Selling to everyone. Not every prospect is a good customer. Bad-fit customers churn fast, demand custom features, and leave negative reviews. Disqualifying prospects who aren't the right fit is one of the most valuable skills in founder-led sales.
Not asking for the sale. After a great conversation, many founders end with "let me know if you're interested" instead of "based on what you've told me, I think [specific plan] solves your problem. Shall I send over a proposal?" You have to ask.
### When to Hire Your First Salesperson
You're ready to hire when you can answer YES to all five of these questions:
1. Have you personally closed 30-50 customers?
2. Can you describe your ideal customer in one sentence?
3. Can you predict your close rate within 10%?
4. Can you document the sales process in a playbook someone else could follow?
5. Is the bottleneck your time, not the process?
If any answer is no, you're not ready. Keep selling. The clarity you build by doing it yourself is worth more than the time you'd save by delegating.
When you are ready, hire someone who's sold at your stage before. An enterprise sales rep from a Fortune 500 won't know how to sell a $50/month product to a skeptical startup founder. Look for someone who's been employee #1-5 at a company your size.
### Getting Started This Week
If you do nothing else:
1. Write your one-sentence ideal customer definition.
2. List 10 people you already know who match it.
3. Send each of them a message: "I'm building [product] for [problem]. You came to mind because [specific reason]. Would you be open to a 15-minute call to get your perspective? I'm not pitching — I genuinely want to learn from your experience."
That message works because it's honest, specific, and asks for advice rather than a sale. 40-60% of warm contacts will say yes. Those conversations will teach you more about your market in one week than a month of competitor research.
For the technical side of building your sales pipeline — CRM setup, email automation, and tracking — read about what a fractional CTO does or when you need a product manager to help define what you're selling. If you're building your product with AI tools, check our vibe coding tools guide to ship faster.
If you want help building your founder-led sales system, book a strategy call. I'll review your current process and tell you where the biggest opportunities are.
### Frequently Asked Questions
**Q: What is founder-led sales?**
A: Founder-led sales means the founder personally handles sales conversations, closes deals, and builds the initial customer base before hiring a dedicated sales team. It's the default go-to-market motion for most startups from pre-seed through Series A, and it's how the majority of successful companies close their first 50-100 customers.
**Q: When should a founder stop doing sales themselves?**
A: Most founders should handle sales personally until they've closed 30-50 customers and can clearly articulate a repeatable sales process. You're ready to hire when you can document: who the buyer is, what triggers them to buy, what objections they raise, and how long the sales cycle takes. Hiring a salesperson before you have this clarity wastes money.
**Q: How do founders sell without being pushy?**
A: The best founder-led sales isn't selling at all — it's being genuinely useful. Ask questions about the prospect's problems. Share relevant experience. Offer a specific recommendation whether or not it involves your product. Founders who approach sales as 'how can I help this person?' outsell founders who approach it as 'how do I close this deal?' every time.
**Q: How many sales calls should a founder make per week?**
A: During active sales mode, aim for 5-10 discovery calls per week. This requires 20-40 outreach messages per week to maintain pipeline. Dedicate 2-3 hours daily to sales activities — outreach in the morning, calls in the afternoon. If you're spending less than 30% of your time on sales pre-product-market-fit, you're probably spending too much time building.
**Q: What's the best CRM for founder-led sales?**
A: For your first 50 customers, you don't need Salesforce. A spreadsheet works until 20 prospects. After that, use a lightweight CRM like HubSpot free tier, Pipedrive ($15/month), or if you're technical, build a simple Kanban pipeline (which is what I did). The system matters more than the tool.
**Q: How do you follow up without being annoying?**
A: The follow-up sequence that works: same day (after the call), a recap email with specific next steps; Day 3, share something relevant (article, data point, introduction); Day 7, check in with a specific question; Day 14, offer a new angle or time-limited incentive; Day 30, ask 'Is this still a priority?' Most deals close between follow-up 3 and 5.
**Q: How long does founder-led sales take to show results?**
A: Expect 4-8 weeks from first outreach to first paying customer. The first 2 weeks are the hardest because you're refining your messaging based on real conversations. By week 4, you'll have a repeatable pitch. By week 8, you should have a predictable pipeline with clear conversion rates at each stage.
---
## Vibe Coding with Claude: How I Build Real Apps with Claude Code
- **URL:** https://justinmckelvey.com/blog/vibe-coding-with-claude
- **Published:** April 14, 2026
- **Updated:** April 14, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 7 min
- **Description:** How a fractional CTO uses Claude Code daily for vibe coding. Real workflows for Rails, Python, and full-stack development. When to use Claude vs. Cursor.
### TL;DR: Why Claude Code Is My Daily Driver
Claude Code is the best vibe coding tool for complex backend work, multi-file refactors, and any project where you need the AI to understand your entire codebase. I use it every day as a fractional CTO: building client projects in Rails, architecting APIs, writing test suites, and managing deployments. Unlike browser-based tools that generate apps from prompts, Claude Code works in your terminal alongside your actual codebase. It reads your files, runs your commands, and iterates on errors. As of April 2026, "claude vibe coding" gets 590 searches per month, and the tool has become essential infrastructure for professional developers.
This site you're reading right now was built with Claude Code. The blog posts, the content strategy, the SEO infrastructure, the admin interface — all developed using Claude Code as my primary tool, with Cursor for frontend work. Here's how I actually use it.
### What Makes Claude Code Different from Every Other Tool
Most vibe coding tools work through a chat interface or browser IDE. You describe what you want, the tool generates code, and you copy it into your project. Claude Code eliminates the copy-paste entirely. It runs in your terminal, has direct access to your file system, and can execute commands, including running your tests, starting your dev server, and reading error output.
This means the feedback loop is fundamentally faster. Instead of "generate code → copy to file → run → see error → paste error back → get fix → copy to file," the loop is "describe what you want → Claude Code writes it, runs it, sees the error, and fixes it." The AI handles the iteration cycle that developers normally do manually.
The other critical difference is codebase awareness. Claude Code reads your existing files, understands your project structure, and generates code that matches your patterns. When I tell it "add a new model like the Contact model but for bookings," it reads my Contact model, sees my conventions (how I name scopes, how I structure validations, which test patterns I use), and produces code that looks like I wrote it.
### My Daily Claude Code Workflows
#### Workflow 1: Feature Development in Rails
This is my most common workflow. I describe a feature in plain language, and Claude Code scaffolds the entire thing: model, migration, controller, views, routes, and tests. A typical prompt:
"Create a BookingType model with name, slug, duration_minutes, description, and active fields. Add a controller with CRUD actions under the admin namespace. Generate the admin views with the same Tailwind styling as the contacts views. Add model validations and tests."
Claude Code reads my existing admin controllers to match the pattern, generates everything, runs the migration, and verifies the tests pass. What would take 60-90 minutes manually takes 10-15 minutes. I review every file, accept 85-90% as-is, and tweak the rest.
#### Workflow 2: Complex Refactors
Refactoring across 20+ files is where Claude Code saves the most time relative to any other approach. "Extract the email sending logic from the Contact model, BookingConfirmation job, and FormSubmission handler into a unified Mailer service. Update all callers and their tests." Claude Code reads all the files involved, understands the dependencies, and makes coordinated changes across the entire codebase.
The key is being specific about what you want. Vague prompts like "clean up the mailer code" produce vague results. Specific prompts with clear before/after expectations produce exactly what you need.
#### Workflow 3: Test Writing
I have a rule: no feature ships without tests. Claude Code makes this rule practically free to follow. After building a feature, I say "write tests for the BookingsController. Test all CRUD actions, authentication requirements, validation errors, and edge cases like booking in the past or double-booking the same slot."
Claude Code reads the controller, understands the business logic, and generates comprehensive tests. The tests usually catch 2-3 edge cases I hadn't considered — which is the whole point of testing. I review the tests to make sure they're testing behavior, not implementation, then run them.
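To make "testing behavior, not implementation" concrete, here is a small sketch of the two edge cases named in the prompt: booking in the past and double-booking the same slot. It is written in plain Python rather than as Rails controller tests, and `booking_errors` is a hypothetical stand-in for whatever validation the real app performs. The assertions only check what a caller observes (which errors come back), not how the check is implemented.

```python
from datetime import datetime, timedelta


def booking_errors(start, duration_minutes, existing, now):
    """Return validation errors for a requested slot (hypothetical rule set).

    `existing` is a list of (start, end) tuples for slots already booked.
    """
    errors = []
    end = start + timedelta(minutes=duration_minutes)
    if start < now:
        errors.append("start is in the past")
    # Two half-open intervals overlap when each starts before the other ends.
    if any(start < other_end and other_start < end for other_start, other_end in existing):
        errors.append("slot overlaps an existing booking")
    return errors


now = datetime(2026, 4, 14, 9, 0)
existing = [(datetime(2026, 4, 15, 10, 0), datetime(2026, 4, 15, 10, 30))]

# Booking in the past is rejected.
assert booking_errors(datetime(2026, 4, 13, 10, 0), 30, existing, now) == ["start is in the past"]
# Double-booking (partial overlap) is rejected.
assert booking_errors(datetime(2026, 4, 15, 10, 15), 30, existing, now) == ["slot overlaps an existing booking"]
# A back-to-back slot that starts exactly when the other ends is allowed.
assert booking_errors(datetime(2026, 4, 15, 10, 30), 30, existing, now) == []
```

The back-to-back case is the kind of boundary condition that is easy to overlook when writing tests by hand.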
#### Workflow 4: Debugging from Error Messages
When a production error occurs, I paste the full stack trace into Claude Code. "This error is happening in production when users try to book a time slot. Here's the stack trace." Claude Code reads the relevant files, identifies the issue, and proposes a fix. For routine bugs (nil errors, missing validations, incorrect query logic), this works 80-90% of the time on the first attempt.
For complex bugs, Claude Code serves as a pair debugger. "I think the issue is a race condition in the booking availability check. Can you trace the flow from the controller action through the availability service and identify where concurrent requests could produce incorrect results?" Having an AI that can read your entire codebase and trace execution paths is like having a senior developer available 24/7 for rubber ducking.
#### Workflow 5: Database Architecture
Claude Code is exceptional at database design. "Design the schema for a lead nurture sequence system. Sequences have steps, steps can be email sends, waits, or conditions. Contacts can be enrolled in sequences and tracked through completion. Include the models, migrations, and associations."
The output is consistently well-normalized, properly indexed, and includes the associations and scopes that make the models usable. I've found that Claude's database design is better than what most mid-level developers produce — it naturally includes indexes on foreign keys, proper constraint definitions, and sensible default values.
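For a sense of what such a schema might look like, here is a minimal sketch of the sequence system described in the prompt. It uses SQLite through Python's standard `sqlite3` module instead of Rails migrations, and every table and column name is an assumption made for illustration, not the output Claude Code actually produced.

```python
import sqlite3

SCHEMA = """
CREATE TABLE sequences (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL UNIQUE,
    active INTEGER NOT NULL DEFAULT 1            -- sensible default
);
CREATE TABLE sequence_steps (
    id          INTEGER PRIMARY KEY,
    sequence_id INTEGER NOT NULL REFERENCES sequences(id) ON DELETE CASCADE,
    position    INTEGER NOT NULL,
    kind        TEXT NOT NULL CHECK (kind IN ('email', 'wait', 'condition')),
    UNIQUE (sequence_id, position)               -- also indexes sequence_id
);
CREATE TABLE contacts (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE enrollments (
    id              INTEGER PRIMARY KEY,
    contact_id      INTEGER NOT NULL REFERENCES contacts(id) ON DELETE CASCADE,
    sequence_id     INTEGER NOT NULL REFERENCES sequences(id) ON DELETE CASCADE,
    current_step_id INTEGER REFERENCES sequence_steps(id),
    status          TEXT NOT NULL DEFAULT 'active'
                    CHECK (status IN ('active', 'completed', 'cancelled')),
    UNIQUE (contact_id, sequence_id)             -- one enrollment per contact per sequence
);
-- Foreign keys not already covered by a leading UNIQUE column get their own index.
CREATE INDEX idx_enrollments_sequence ON enrollments(sequence_id);
CREATE INDEX idx_enrollments_step ON enrollments(current_step_id);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript(SCHEMA)

conn.execute("INSERT INTO sequences (name) VALUES ('welcome')")
conn.execute("INSERT INTO sequence_steps (sequence_id, position, kind) VALUES (1, 1, 'email')")

# The CHECK constraint rejects step kinds outside email / wait / condition.
try:
    conn.execute("INSERT INTO sequence_steps (sequence_id, position, kind) VALUES (1, 2, 'sms')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("invalid step kind rejected:", rejected)
```

In a Rails project the same ideas would typically appear as `null: false`, `default:`, foreign key, and index options in migrations, with the step kinds expressed as an enum or check constraint.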
### When to Use Claude Code vs. Cursor
I use both tools daily. Here's how I divide the work:
Use Claude Code for: Multi-file refactors. Backend architecture. Database design. Test writing. Deployment configuration. Debugging from stack traces. Any task that requires reading 5+ files to understand context. Complex business logic. Infrastructure and DevOps.
Use Cursor for: Frontend development (seeing visual output matters). Single-file edits. Tab completions while typing. Quick inline changes. CSS and styling work. Any task where you want to see a visual diff before accepting.
Use both for: Full-stack feature development. Claude Code builds the backend (model, controller, service, tests); Cursor builds the frontend (views, Stimulus controllers, styling). This is my default workflow for client projects and it's the most productive combination I've found.
### Claude Code Tips That Took Me Months to Learn
Use CLAUDE.md files. A CLAUDE.md file in your project root tells Claude Code about your project's conventions, tech stack, and preferences. This is the equivalent of Cursor's .cursorrules file. I include my tech stack, coding conventions, testing patterns, and common gotchas. The difference in output quality between a project with CLAUDE.md and without is dramatic.
Be specific about what you don't want. "Build a contact form" gives you a generic contact form. "Build a contact form using Turbo Frames, Stimulus for validation, Tailwind for styling, and NO JavaScript frameworks" gives you exactly what fits your project.
Use it for code review. "Review this pull request for security vulnerabilities, performance issues, and deviations from our coding conventions." Claude Code reads the diff, understands the context of the changes, and provides feedback that catches real issues — not just style nits.
Chain commands. "Run the tests, fix any failures, run the tests again, and repeat until they all pass." Claude Code will iterate autonomously through the test-fix cycle. I use this after large refactors where I expect some tests to break — it handles the mechanical fix-and-verify loop while I focus on reviewing the changes.
Don't use it for security-critical code. Same rule as every other AI tool: authentication, authorization, payment processing, and encryption should be written with full human attention. Use Claude Code to draft these sections, then review every line yourself. The cost of a security vulnerability in these areas is orders of magnitude higher than the time saved.
### What Claude Code Costs in Practice
Claude Code uses your Anthropic API account with usage-based pricing. Here's what I actually spend across different usage patterns:
Light day (1-2 hours of coding): $1-3 in API usage. Quick edits, small features, test writing.
Heavy day (4-6 hours of intensive coding): $5-15 in API usage. Large refactors, new features, debugging sessions.
Monthly total for professional daily use: $30-50/month. This varies significantly based on how much context Claude Code needs to read (larger codebases cost more per interaction) and how many iterations a task requires.
Anthropic also offers Max subscription plans at $100/month and $200/month that include API credits and additional features. For heavy professional use, the Max plan can be more cost-effective than pure API usage.
At $30-50/month, Claude Code saves me 15-25 hours of development time per month. At my billing rate, that's a 50-100x ROI. Even for a developer billing at $50/hour, the time savings make it one of the most cost-effective professional tools available.
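The ROI arithmetic is easy to reproduce for your own rate. The sketch below plugs in the figures from this section (15-25 hours saved, $30-50/month in cost) with the $50/hour rate mentioned above; the function name `roi_multiple` is illustrative, and the billing rate behind the 50-100x figure is not stated in the post, so it is not reproduced here.

```python
def roi_multiple(hours_saved: float, hourly_rate: float, monthly_cost: float) -> float:
    """Value of the time saved divided by what the tool cost that month."""
    return (hours_saved * hourly_rate) / monthly_cost


# Developer billing $50/hour, using the ranges quoted in this section.
worst_case = roi_multiple(hours_saved=15, hourly_rate=50, monthly_cost=50)
best_case = roi_multiple(hours_saved=25, hourly_rate=50, monthly_cost=30)

print(f"worst case: {worst_case:.1f}x")  # 15.0x
print(f"best case: {best_case:.1f}x")    # 41.7x
```

So even at $50/hour the return lands at roughly 15x to 42x, and higher billing rates scale that multiple proportionally.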
### The Bottom Line
Claude Code isn't the flashiest vibe coding tool. It doesn't generate apps from a single prompt like Bolt. It doesn't have a visual editor like Cursor. It runs in a terminal and it requires you to know what you're doing. But for professional developers who want to ship faster without sacrificing code quality, it's the most powerful tool available in 2026.
The combination of Claude Code (backend) + Cursor (frontend) is my stack for every client project. For the full head-to-head, read Claude Code vs Cursor — feature table, pricing, and when to pick which. If you're curious about the broader vibe coding landscape, start with our Best Vibe Coding Tools comparison or What Is Vibe Coding? guide.
If you're building with AI tools and need help getting to production, book a strategy call. I'll review your codebase and tell you what's ready to ship and what needs work.
### Frequently Asked Questions
**Q: Can you vibe code with Claude?**
A: Yes. Claude Code is Anthropic's terminal-based coding agent that can create files, run commands, execute tests, and iterate on errors autonomously. It's one of the best vibe coding tools for developers, especially for backend work, complex refactors, and multi-file changes. It costs $5-50/month based on API usage.
**Q: Is Claude Code better than Cursor for vibe coding?**
A: They're complementary, not competing. Claude Code excels at backend architecture, complex refactors, and terminal-based workflows. Cursor excels at frontend development, visual editing, and IDE-integrated coding. Many developers use both — Claude Code for backend, Cursor for frontend.
**Q: How much does Claude Code cost?**
A: Claude Code uses your Anthropic API account with usage-based pricing. Light use costs $5-10/month. Moderate daily use costs $15-30/month. Heavy professional use can reach $30-50/month. There's also a Max subscription plan at $100/month or $200/month for heavier usage with included credits.
**Q: What programming languages does Claude Code support?**
A: Claude Code works with any programming language since it operates through the terminal and file system. It's particularly strong with Python, JavaScript/TypeScript, Ruby, Go, and Rust. It understands frameworks like Rails, Django, Next.js, FastAPI, and Express deeply enough to generate idiomatic, production-quality code.
**Q: Can non-developers use Claude Code?**
A: Claude Code requires comfort with the terminal and basic understanding of file systems and code structure. It's not designed for non-developers — use Bolt or Lovable instead. Claude Code is for developers who want to work faster, not for non-technical users who want to avoid code entirely.
**Q: What can you build with Claude Code?**
A: Claude Code can build APIs, backend services, CLI tools, database schemas, test suites, deployment configurations, and full-stack web applications. It excels at complex, multi-file changes that require understanding how different parts of a codebase interact. I use it daily for client projects ranging from Rails apps to Python data pipelines.
---
## Vibe Coding Examples: 10 Real Projects — What Worked and What Didn't
- **URL:** https://justinmckelvey.com/blog/vibe-coding-examples
- **Published:** April 14, 2026
- **Updated:** April 14, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 9 min
- **Description:** 10 real vibe-coded apps reviewed by a fractional CTO. See what tools were used, what worked, what broke in production, and lessons learned from each project.
### Quick Answer
The 10 most common production-grade vibe coding projects in 2026 are: SaaS landing pages, internal admin dashboards, booking systems, e-commerce stores, portfolio sites, content sites, simple CRMs, MVP prototypes, mobile-web apps, and AI-wrapper tools. About 6 in 10 ship without major issues; the rest break on authentication, payments, scaling, or security. The pattern is consistent — vibe-coded prototypes are great, vibe-coded production apps almost always need a developer to review the code before launch.
Reviewed April–May 2026 · 10 projects · Author: Justin McKelvey, fractional CTO, 50+ products shipped
### TL;DR: Real Vibe Coding Projects, Honest Reviews
I've reviewed dozens of vibe-coded applications as a fractional CTO, some brilliant, some disasters. Here are 10 real projects built with AI coding tools in 2025-2026, covering what tools were used, what went right, what broke, and what it cost to fix. These aren't hypothetical examples. They're real apps built by real founders, with real outcomes. As of April 2026, the pattern is clear: vibe coding produces great prototypes and dangerous production apps. The difference is always whether a professional reviews the code before launch.
### The Success Stories
#### 1. Marketing Site That Replaced Squarespace ($36/month savings)
Tool: Bolt | Build time: 3 hours | Outcome: Launched, still running
A consultant replaced their $36/month Squarespace site with a static marketing site generated in Bolt. Five pages, responsive design, contact form, and a blog section. The entire build took one afternoon. Deployed to Vercel's free tier. Annual savings: $432 with a better-performing site.
Why it worked: Marketing sites are vibe coding's sweet spot. No authentication, no user data, no complex business logic. The code quality doesn't need to be perfect because the stakes are low — the worst that happens is a layout bug, not a security breach.
Lesson: If your vibe coding project has no login, no payments, and no user data, you can ship it directly. This is the easiest win.
#### 2. Internal Sales Dashboard (10 hours/week saved)
Tool: Lovable | Build time: 2 days | Outcome: In daily use by 8-person sales team
A B2B SaaS company's sales team was spending 10+ hours per week manually compiling data from Salesforce, spreadsheets, and email into a weekly report. A non-technical ops manager used Lovable to build a dashboard that pulls from their existing APIs and displays the metrics the team actually looks at.
Why it worked: Internal tools have a forgiving audience. Your sales team won't exploit a SQL injection vulnerability — they'll just tell you the chart is wrong. The tolerance for rough edges is much higher than customer-facing products, and the ROI is immediate and measurable.
Lesson: Internal tools are the second-best use case for vibe coding after marketing sites. The users are known, the stakes are manageable, and the time savings compound every week.
#### 3. MVP That Secured $500K Seed Round
Tool: Cursor + Claude Code | Build time: 3 weeks | Outcome: Raised seed funding, hired engineering team
A technical founder with product experience but no time to code a full app used Cursor and Claude Code to build a functional SaaS MVP. The product was a workflow automation tool for real estate teams. Three weeks of part-time work (evenings and weekends) produced a working application with user accounts, a visual workflow builder, and email integrations.
Why it worked: The founder knew how to code and used AI to move faster, not to replace understanding. Every generated function was reviewed. Security-critical code (auth, API keys) was written manually. The AI handled the 80% that was boilerplate; the founder handled the 20% that required judgment.
Lesson: The best vibe coding outcomes come from developers using AI as an accelerator. The combination of human judgment and AI speed is unbeatable for MVP development.
#### 4. Event Registration System for a Non-Profit
Tool: Bolt | Build time: 1 week | Outcome: Processed 2,000+ registrations
A non-profit needed an event registration system but had zero tech budget. A volunteer used Bolt to build a registration form with date selection, attendee information capture, and confirmation emails. The system handled 2,000+ registrations across 12 events over 6 months.
Why it worked: The scope was narrowly defined. One form, one flow, one purpose. No payments (events were free), no user accounts (one-time form submission), and no complex business logic. When the scope matches vibe coding's strengths, the results are impressive.
Lesson: Scope discipline is everything. A narrowly defined tool with one purpose is where vibe coding shines. The moment you add "and also it should do..." is when things break.
#### 5. Personal Finance Tracker (Replaced $120/year app)
Tool: Cursor | Build time: 2 weekends | Outcome: Daily personal use for 8+ months
A developer frustrated with existing finance apps built a personal expense tracker using Cursor. CSV import from their bank, category tagging, monthly charts, and budget alerts. No multi-user features, no cloud sync, no mobile app — just a simple web app on their laptop.
Why it worked: Personal projects have zero external stakes. If it breaks, you fix it. If the data is wrong, you know immediately. The developer iterated on the tool over 2 months of daily use, gradually adding features based on their own frustrations — the ideal product development loop.
Lesson: Build something for yourself first. You're the best possible user tester because you'll use it daily and notice every flaw.
### The Rescue Stories
#### 6. E-Commerce Store with Exposed Stripe Keys
Tool: Bolt | Build time: 1 week | Rescue cost: $3,500
A founder built an e-commerce store and launched it to 200+ customers. The site looked professional, the checkout flow worked, and orders were being processed. The problem: the Stripe secret key was in the frontend JavaScript, visible to anyone who opened browser developer tools. Additionally, the payment webhook handler didn't verify Stripe signatures, meaning anyone could send fake payment confirmations.
What we fixed: Moved all Stripe operations to the backend, added webhook signature verification, implemented proper environment variable handling, and added rate limiting on the checkout endpoint. The fix took 2 days of developer time. No breach occurred — but it was a matter of time.
Lesson: If your app processes payments, get a developer review before launch. A 4-hour security audit ($400-$1,000) would have caught this instantly. The $3,500 rescue was cheap compared to a potential breach.
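To make the webhook fix concrete, here is a minimal sketch of signature verification using only the Python standard library. It follows Stripe's documented scheme (a `Stripe-Signature` header of the form `t=<timestamp>,v1=<signature>`, signed with HMAC-SHA256 over `<timestamp>.<raw body>`). The function name, the 300-second tolerance, and the `now` parameter are my own illustrative choices, not the client's code. In a real app you would normally let Stripe's official library do this check rather than hand-rolling it.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(payload: bytes, sig_header: str, secret: str,
                            tolerance_s: int = 300,
                            now: Optional[float] = None) -> bool:
    """Return True only if sig_header carries a valid v1 signature for payload.

    Hypothetical helper for illustration; the tolerance default is an assumption.
    """
    # Header format: "t=<unix timestamp>,v1=<hex signature>[,v1=...]"
    parts = [item.strip().split("=", 1) for item in sig_header.split(",") if "=" in item]
    timestamp = next((v for k, v in parts if k == "t"), None)
    candidates = [v for k, v in parts if k == "v1"]
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False

    # Reject stale events to limit replay of captured requests.
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance_s:
        return False

    # The signed payload is "<timestamp>.<raw request body>".
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()

    # compare_digest avoids leaking timing information during comparison.
    return any(hmac.compare_digest(expected.encode(), c.encode()) for c in candidates)
```

The important detail is that the check runs against the raw request body, before any JSON parsing. A handler that skips this step will accept a fake "payment succeeded" event from anyone who knows the endpoint URL.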
#### 7. Booking App with Double-Booking Bug
Tool: Lovable | Build time: 2 weeks | Rescue cost: $5,000
A consulting firm built a client booking system. It worked perfectly in testing because only one person was testing it. In production, with 15 clients booking simultaneously, the race condition appeared: two clients could book the same time slot because the availability check and the booking creation weren't atomic. The first week of launch produced 8 double-bookings, each requiring manual rescheduling and apologetic emails.
What we fixed: Added database-level unique constraints on the time slot, implemented optimistic locking, and added a booking confirmation step that rechecks availability. We also added proper error messages instead of the silent failures that were confusing both staff and clients.
Lesson: Multi-user features need architectural thinking that AI tools don't provide. Anything involving concurrent access — booking, inventory, collaborative editing — needs a developer to handle the race conditions.
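The core of the double-booking fix can be sketched with a database-level unique constraint. This example uses SQLite from the Python standard library purely for illustration; the table name, column names, and `book_slot` helper are hypothetical, and the real system also used optimistic locking and a confirmation step that this sketch leaves out.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # The UNIQUE constraint is what makes booking safe: the database,
    # not application code, decides which of two competing inserts wins.
    conn.execute(
        "CREATE TABLE bookings ("
        " id INTEGER PRIMARY KEY,"
        " slot TEXT NOT NULL UNIQUE,"
        " client TEXT NOT NULL)"
    )
    return conn


def book_slot(conn: sqlite3.Connection, slot: str, client: str) -> bool:
    """Try to book a slot. Returns False if someone else already holds it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO bookings (slot, client) VALUES (?, ?)",
                (slot, client),
            )
        return True
    except sqlite3.IntegrityError:
        # Surface a clear "slot taken" result instead of failing silently.
        return False
```

Note that there is no separate "is this slot free?" query before the insert. A check-then-insert sequence is exactly where the race condition lives, so the insert itself acts as the check.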
#### 8. SaaS App with Bypassable Authentication
Tool: Replit | Build time: 3 weeks | Rescue cost: $8,000
A founder built a project management SaaS and started onboarding paying customers. The authentication looked solid — login screen, password reset, session management. But the authorization was entirely client-side: the API endpoints didn't verify that the requesting user had permission to access the requested data. Any logged-in user could access any other user's projects by changing the project ID in the URL.
What we fixed: Complete authorization overhaul. Added server-side permission checks to every API endpoint. Implemented proper row-level security. Added audit logging. Reviewed and fixed 47 endpoints that were vulnerable. This was the most expensive rescue because the authorization architecture needed to be rebuilt, not patched.
Lesson: Authentication (who you are) and authorization (what you can access) are different problems. AI tools handle authentication reasonably well. Authorization — especially multi-tenant SaaS authorization — requires careful human design.
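Here is a rough sketch of what a server-side permission check looks like, stripped down to plain Python. The `Project` type, the in-memory `PROJECTS` store, and the `Forbidden` exception are all stand-ins I made up for illustration; a real app would query its database and scope the lookup to the current user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    id: int
    owner_id: int
    name: str


class Forbidden(Exception):
    """Raised when the current user may not access the requested resource."""


# Hypothetical in-memory store standing in for the projects table.
PROJECTS = {
    1: Project(1, owner_id=10, name="Alpha"),
    2: Project(2, owner_id=20, name="Beta"),
}


def get_project_for_user(current_user_id: int, project_id: int) -> Project:
    """Fetch a project only if it belongs to the authenticated user."""
    project = PROJECTS.get(project_id)
    # Use the same error for "missing" and "not yours" so that IDs
    # cannot be probed by watching which error comes back.
    if project is None or project.owner_id != current_user_id:
        raise Forbidden(f"user {current_user_id} cannot access project {project_id}")
    return project
```

The point is where the check lives: inside the function every endpoint must call, keyed on the user ID from the server-side session, never on anything the browser sends. Changing the project ID in the URL then yields an error instead of someone else's data.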
#### 9. Content Platform That Crashed at 500 Users
Tool: Bolt + manual extensions | Build time: 1 month | Rescue cost: $6,000
A creator economy startup built a content platform and launched to their email list. The first 100 users were fine. At 300, pages started loading slowly. At 500, the site became unusable — 30+ second page loads and frequent timeouts. The founder assumed they needed to "scale their server" and spent $200/month upgrading infrastructure before calling for help.
The actual problem: N+1 database queries. The homepage loaded every piece of content, then for each piece ran a separate query to fetch the author, the comments count, and the like count. With 2,000 pieces of content, that was about 6,000 database queries per page load (three per piece, plus the initial fetch). The fix wasn't more server capacity; it was 3 lines of code adding eager loading (`.includes(:author, :comments)`).
Lesson: AI-generated code optimizes for readability and correctness, not performance. It works at small scale because every approach works at small scale. Performance problems are architecture problems that require understanding how databases and queries interact.
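The N+1 pattern is easy to see once you count queries. The sketch below reproduces it with SQLite and a small counting wrapper, then shows a batched version that does what eager loading does under the hood: one query for the posts, one for all their authors. The schema, the `CountingDB` class, and the function names are hypothetical and only cover the author lookup, not the comment and like counts from the real project.

```python
import sqlite3


class CountingDB:
    """Thin wrapper that counts every query issued through query()."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.queries = 0

    def query(self, sql: str, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def seed(db: CountingDB, n_posts: int) -> None:
    db.conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
    db.conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
    )
    db.conn.executemany(
        "INSERT INTO authors VALUES (?, ?)",
        [(i, f"author{i}") for i in range(1, 6)],
    )
    db.conn.executemany(
        "INSERT INTO posts VALUES (?, ?, ?)",
        [(i, (i % 5) + 1, f"post{i}") for i in range(1, n_posts + 1)],
    )


def homepage_n_plus_one(db: CountingDB):
    # One query for the posts, then one more query per post.
    posts = db.query("SELECT id, author_id, title FROM posts ORDER BY id")
    return [
        (title, db.query("SELECT name FROM authors WHERE id = ?", (author_id,))[0][0])
        for _id, author_id, title in posts
    ]


def homepage_eager(db: CountingDB):
    # Two queries total, no matter how many posts there are.
    posts = db.query("SELECT id, author_id, title FROM posts ORDER BY id")
    author_ids = sorted({author_id for _id, author_id, _title in posts})
    placeholders = ",".join("?" * len(author_ids))
    rows = db.query(
        f"SELECT id, name FROM authors WHERE id IN ({placeholders})", author_ids
    )
    names = dict(rows)
    return [(title, names[author_id]) for _id, author_id, title in posts]
```

With 100 seeded posts, the first version issues 101 queries and the second issues 2, while both return identical results. That gap grows linearly with content volume, which is why the site felt fine at launch and unusable a few weeks later.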
#### 10. The $15,000 Payment Reconciliation Disaster
Tool: Cursor (by a junior developer) | Build time: 6 weeks | Rescue cost: $15,000
The most expensive rescue I've done. A marketplace app processed $200,000+ in transactions over 4 months with broken webhook handling. The Stripe webhook endpoint existed but didn't verify signatures, occasionally timed out (causing Stripe to retry and create duplicate records), and had no error handling for edge cases like partial refunds and disputed charges.
The result: approximately 15% of successful payments were never recorded in the database. Customers were charged but never received access. The refund process was manual and took the founder 20+ hours to reconcile. Some customers had been double-charged.
What we fixed: Rebuilt the entire payment pipeline with proper webhook verification, idempotency keys, a reconciliation system that cross-references Stripe records with the database, and automated handling for refunds and disputes. Added monitoring alerts for payment discrepancies.
Lesson: Payment processing is the one area where vibe coding is actively dangerous without professional oversight. The code "works" in testing because test mode payments always succeed. Production payments fail, get disputed, and require handling for dozens of edge cases that AI tools don't generate.
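Two of the ideas in that rebuild, idempotent event handling and reconciliation, can be sketched in a few lines. This is an illustrative outline using SQLite, not the production pipeline: the table layout, `record_payment`, and `find_unrecorded` are hypothetical names, and the list of charge IDs would come from the payment processor's API in a real system.

```python
import sqlite3
from typing import Iterable, List


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # event_id as PRIMARY KEY means a retried webhook cannot create a second row.
    conn.execute(
        "CREATE TABLE payments ("
        " event_id TEXT PRIMARY KEY,"
        " charge_id TEXT NOT NULL,"
        " amount_cents INTEGER NOT NULL)"
    )
    return conn


def record_payment(conn: sqlite3.Connection, event_id: str,
                   charge_id: str, amount_cents: int) -> bool:
    """Record a payment event once. Returns False for a duplicate delivery."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO payments VALUES (?, ?, ?)",
                (event_id, charge_id, amount_cents),
            )
        return True
    except sqlite3.IntegrityError:
        # Same event delivered again (for example after a timeout and retry).
        return False


def find_unrecorded(conn: sqlite3.Connection,
                    processor_charge_ids: Iterable[str]) -> List[str]:
    """Return charges the processor knows about that the database does not."""
    recorded = {row[0] for row in conn.execute("SELECT charge_id FROM payments")}
    return sorted(set(processor_charge_ids) - recorded)
```

Running `find_unrecorded` on a schedule and alerting when it returns anything is the cheap version of the monitoring described above. It would have flagged the missing payments in the first day instead of the fourth month.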
### The Pattern: What Separates Success from Disaster
After reviewing these 10 projects (and dozens more), the pattern is clear:
Successful vibe coding projects have: Narrow scope. Low stakes (no payments, no sensitive data). A single user type. Standard UI patterns. Either no backend or a simple CRUD backend.
Failed vibe coding projects have: Multiple user types with different permissions. Payment processing. Real-time or concurrent features. Complex business logic. Sensitive data handling. And critically — no professional review before launch.
The tool doesn't determine the outcome. The scope does. A marketing site built in Bolt ships perfectly. A multi-tenant SaaS with payments built in the same tool ships with vulnerabilities. The difference isn't the AI — it's the complexity of the problem.
For guides on the top tools, read Cursor for developers and Claude Code for backend work. Head-to-head comparisons: Claude Code vs Cursor, Replit vs Cursor, Lovable vs Cursor. For a full comparison of which tool is best for which type of project, read our Best Vibe Coding Tools guide. If you want to understand the broader landscape, start with What Is Vibe Coding?
If you've built something with vibe coding and want to know if it's safe to launch, book a strategy call. I'll review your code and give you a clear assessment of what's production-ready and what needs work.
### Frequently Asked Questions
**Q: What can you build with vibe coding?**
A: Vibe coding tools can build landing pages, SaaS dashboards, booking systems, e-commerce stores, internal tools, portfolio sites, and simple mobile apps. The best results come from applications with standard UI patterns and well-understood business logic. Complex backends, real-time systems, and payment processing typically need developer review.
**Q: What are good vibe coding projects for beginners?**
A: Start with a personal portfolio site, a simple landing page, or an internal tool for your business (like a contact directory or task tracker). These projects have low stakes, standard patterns, and clear success criteria. Avoid starting with payment processing, authentication-heavy apps, or multi-user real-time features.
**Q: Can you build a SaaS product with vibe coding?**
A: You can build a SaaS MVP with vibe coding tools, but production SaaS products need developer involvement for authentication, billing, multi-tenancy, and security. Use vibe coding to validate the idea and build the first version, then invest in professional development for the production build.
**Q: What are examples of successful vibe coded apps?**
A: Successful vibe-coded projects include marketing sites that replaced $36/month Squarespace plans, internal dashboards that saved teams 10+ hours/week, MVP prototypes that secured seed funding, and content management systems for small businesses. The common thread: clear scope, standard patterns, and professional review before launch.
**Q: How long does it take to vibe code an app?**
A: A simple landing page: 30-60 minutes. A multi-page marketing site: 2-4 hours. A functional CRUD app (like a task manager or directory): 1-2 days. A full MVP with authentication and integrations: 1-3 weeks. These timelines assume experience with the tools — first-time users should add 50-100% more time.
**Q: What vibe coded apps have failed?**
A: Common vibe coding failures include: e-commerce sites with payment security vulnerabilities, booking apps with double-booking race conditions, SaaS apps with authentication bypasses, and data-heavy apps that crash under load. The failures aren't about the tools — they're about shipping to production without professional review.
---
## How to Build an MVP in 2026: The 6-Week Framework
- **URL:** https://justinmckelvey.com/blog/how-to-build-mvp-2026
- **Published:** April 14, 2026
- **Updated:** April 14, 2026
- **Category:** Product Leadership
- **Reading time:** 8 min
- **Description:** Ship an MVP in 6 weeks with this 4-phase framework: scoping, build-or-buy, core build, launch. Real costs + examples from 50+ shipped products.
### TL;DR: The MVP Formula
A good MVP tests one hypothesis with the smallest possible product in 6 weeks or less. After shipping 50+ products over 15 years — including PlayYourCourt (500K+ users), Qualifyed.ai (95% cost reduction), and Achievrs (athlete marketplace) — I've refined a 6-week framework that consistently gets founders from idea to paying customers. The framework works whether you're a technical founder, a non-technical founder using vibe coding tools, or working with a developer. As of 2026, MVP development costs range from $25/month (AI tools) to $50,000 (agency build).
### Why 90% of MVPs Fail Before Launch
The number one reason MVPs fail isn't bad ideas or bad execution. It's scope. Founders build too much because they're afraid of launching something "incomplete." But an MVP isn't supposed to be complete — it's supposed to be useful enough to test whether anyone cares.
The over-building trap: A founder wants to build a booking platform. They spec out user accounts, admin dashboards, payment processing, email notifications, calendar sync, team management, and analytics. Six months and $40,000 later, they launch to crickets. They could have validated demand in 2 weeks with a Google Form and a Calendly link.
The perfection trap: "We can't launch until the design is polished." Yes you can. Ugly products with real utility beat beautiful products with no users every time. Your first 50 customers don't care about animations and gradients. They care about whether you solve their problem.
The feature trap: "We need just one more feature before we can launch." No you don't. Every feature you add before launch is a feature you built without user feedback. The odds of guessing right on 10 features without user input are essentially zero.
### Week 1: Define the One Problem You're Solving
The goal this week is a single sentence that describes who you're helping, what problem you're solving, and how. Not a business plan. Not a feature list. One sentence.
Here's the format: "[Specific person] struggles with [specific problem] because [specific reason]. We solve this by [specific solution]."
Real examples from MVPs I've shipped:
"Tennis players can't find local hitting partners because existing platforms are dead or national-only. We solve this by matching players by location and skill level within 10 miles." (PlayYourCourt — grew to 500K+ users)
"Real estate teams spend 4-6 hours per day qualifying leads manually because AI tools are too expensive at $500+/month. We solve this by qualifying leads automatically for $50/month." (Qualifyed.ai — reduced client costs by 95%)
If you can't write this sentence clearly, you're not ready to build. Spend the week talking to 5-10 potential users and refining until the problem statement resonates. Every person you talk to should nod and say "yes, that's exactly my problem."
### Week 2: Map the Smallest Possible Solution
Take your problem statement and ask: what is the absolute minimum product that tests whether this solution works? This is where discipline matters most. Every feature you add increases build time, delays user feedback, and increases the cost of being wrong.
The "haiku MVP" exercise: describe your MVP's complete user experience in 3 steps or fewer. If you need more than 3 steps, you're building too much.
PlayYourCourt haiku: "Enter zip code → See nearby players → Request a match." Three steps. No accounts. No payment. No scheduling algorithm. Just: are there other tennis players near you who want to play?
Decide your build approach based on your situation:
Non-technical founder, near-zero budget: Use Bolt ($25/month) or Lovable ($25/month) to build a functional prototype. These vibe coding tools can generate a working web app from a description in under an hour.
Non-technical founder, $5K-15K budget: Hire a freelance developer for a 4-week sprint. Define the scope with your haiku MVP — no scope creep.
Technical founder: Use Cursor ($20/month) and build it yourself. With AI-assisted development, a solo technical founder can ship an MVP in 2-3 weeks. I build client MVPs in Rails 8 — one framework, one language, full stack.
### Week 3: Build the Core Loop Only
Build only the feature that makes your problem statement true. Nothing else. No onboarding flow. No settings page. No "nice to have" features. Just the core loop that lets a user experience your value proposition.
This is the week where most founders fail. They start building and immediately think "oh, we also need email verification, and a forgot password flow, and an admin dashboard to manage users, and..." Stop. Every one of those things can wait until after you have users who care enough to complain about their absence.
Practical build tips from shipping 50+ MVPs:
Use authentication only if your core feature requires it. If users need to save data between sessions, add login. If not, skip it. An anonymous experience with zero friction gets more testers than a signup wall.
Use a third-party service for anything that isn't your core value. Stripe for payments. SendGrid for email. Cloudinary for images. Don't build commodity features.
Ship to a real URL. Even if it's ugly, put it on the internet. A local demo on your laptop isn't an MVP — it's a science project. Deploy to Railway ($5/month), Vercel (free), or Render (free tier). Real URLs force real decisions about what actually needs to work.
### Week 4: Ship to 10 People You Know
Send your MVP to exactly 10 people who match your target user description. Not friends who'll be polite. Not fellow founders who'll give you theoretical feedback. People who actually have the problem you're solving.
Don't explain how to use it. Send them the URL and say: "I built something that [your problem statement]. Can you try it and tell me what happens?" Then watch. If they need more than 30 seconds to understand what to do, your product isn't clear enough — and no amount of onboarding tooltips will fix a confusing core experience.
What to measure this week:
Completion rate: What percentage of people complete the core action? If it's below 50%, your UX has fundamental issues. Fix them before expanding your test group.
Time to value: How long from opening the URL to experiencing the benefit? Under 2 minutes is good. Under 30 seconds is great. Over 5 minutes means you're asking too much.
Unprompted return: Do any of your 10 testers come back a second time without you asking? This is the strongest signal of product-market fit you can get at this stage.
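If you log your 10 tester sessions in a spreadsheet or a short script, the three metrics above reduce to simple arithmetic. The sketch below is one way to compute them. The `TesterSession` fields and the choice of median for time to value are my assumptions, not a prescribed method.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class TesterSession:
    completed_core_action: bool
    seconds_to_value: Optional[float]  # None if the tester never got there
    returned_unprompted: bool


def summarize(sessions: List[TesterSession]) -> dict:
    """Compute completion rate, median time to value, and unprompted returns."""
    if not sessions:
        raise ValueError("need at least one session")
    times = [s.seconds_to_value for s in sessions if s.seconds_to_value is not None]
    return {
        "completion_rate": sum(s.completed_core_action for s in sessions) / len(sessions),
        "median_seconds_to_value": median(times) if times else None,
        "unprompted_returns": sum(s.returned_unprompted for s in sessions),
    }
```

A completion rate under 0.5 means fixing the core flow before adding testers, and a median time to value above 300 seconds means the product asks too much up front.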
### Week 5: Watch Them Use It
Schedule 15-minute calls with your testers and watch them use the product over screen share. Don't guide them. Don't explain. Just watch where they get confused, frustrated, or delighted.
This is the most valuable week in the entire framework. Everything you built was based on assumptions. This week, you replace assumptions with observations. The gap between what you thought users would do and what they actually do is where the real product insights live.
After each session, write down:
What surprised you? Users always do something unexpected. They click things you didn't expect. They ignore the button you thought was obvious. They try to use the product for a use case you never considered. These surprises are more valuable than any feature request.
Where did they struggle? Every hesitation, every confused expression, every "wait, how do I..." is a usability issue. Fix the top 3 most common struggles. Ignore everything else.
Would they pay? Ask directly: "If this cost $X/month, would you use it?" The number doesn't matter yet — the reaction does. Enthusiastic yes, reluctant yes, and polite no are three very different signals.
### Week 6: Decide — Iterate, Pivot, or Kill
Based on your user observations, make one of three decisions. Don't delay this decision.
Iterate if: 3+ of your 10 testers would pay, they completed the core action successfully, and the main feedback is about polish or secondary features. You've validated the core idea. Now improve it. Add the second feature, fix the UX issues, and expand to 50 users.
Pivot if: Users liked something about your product but not the thing you intended. Maybe your booking tool got ignored but everyone loved the availability display. The pivot isn't failure — it's following the data to where the real value is.
Kill if: 0 of 10 testers would pay, nobody came back without prompting, and the feedback is "I don't really need this." Killing a product after 6 weeks and $500 in costs is smart. Killing it after 12 months and $100,000 is devastating. The framework's biggest value is making bad news cheap.
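The three rules above can be written down as a small decision function, which is a useful way to force yourself to commit to the thresholds before you see the data. This is a sketch under my own simplifying assumptions: the parameter names are hypothetical, the qualitative feedback is reduced to two booleans, and the "unclear" result for in-between cases (say, 1-2 testers who would pay) is my addition, not part of the framework.

```python
def week6_decision(would_pay: int, unprompted_returns: int,
                   core_action_completed: bool,
                   valued_unintended_feature: bool) -> str:
    """Map week 4-5 observations from 10 testers to iterate / pivot / kill."""
    # Iterate: 3+ of 10 would pay and the core action works.
    if would_pay >= 3 and core_action_completed:
        return "iterate"
    # Pivot: users valued something other than what you intended.
    if valued_unintended_feature:
        return "pivot"
    # Kill: nobody would pay and nobody came back on their own.
    if would_pay == 0 and unprompted_returns == 0:
        return "kill"
    # Assumption: anything in between needs more testers before deciding.
    return "unclear"
```

For example, `week6_decision(4, 2, True, False)` returns `"iterate"`, while `week6_decision(0, 0, False, False)` returns `"kill"`.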
### What This Framework Actually Costs
Here's the real cost breakdown for each approach in 2026:
Solo founder with AI tools: $25-100/month in subscriptions (Cursor, Bolt, or Lovable) + $5-10/month for hosting (Railway or Vercel). Total: $100-200 for a 6-week MVP. Your time is the main investment.
Founder + freelance developer: $5,000-15,000 for a 4-6 week engagement. Look for developers who've built MVPs before — enterprise developers tend to over-engineer everything.
Founder + fractional CTO: $5,000-10,000/month. You get technical leadership, not just code. The fractional CTO helps you scope the MVP, choose the right tech stack, and avoid expensive architectural mistakes. Best for non-technical founders building something complex.
Development agency: $15,000-50,000. The most expensive option and often the worst fit for MVPs. Agencies are optimized for building what you tell them, not for helping you figure out what to build. Use an agency for V2, not V1.
### Real MVPs I've Shipped (And What They Taught Me)
PlayYourCourt: MVP was a directory page with a zip code search. No accounts, no matching algorithm, no payments. Just: "are there tennis players near you?" The answer was yes, and the waitlist grew to 1,000 users before we wrote a single line of matching code. Lesson: validate demand before building supply.
Qualifyed.ai: MVP was a single API endpoint that took a real estate lead's contact info and returned a qualification score. No UI, no dashboard, no integrations. We gave 5 real estate teams API access and a spreadsheet. They loved it. The dashboard, CRM integrations, and automated workflows came after we had paying customers. Lesson: sell the output, not the interface.
Achievrs: MVP was a landing page with athlete profiles and a contact form. No matching, no payments, no messaging. Just: "here are athletes, here's how to reach them." We learned that the audience cared more about video content than profiles. The product pivoted based on that observation. Lesson: the MVP exists to be wrong — cheaply.
If you're ready to build your MVP and want help scoping it correctly, book a strategy call. I'll help you define your haiku MVP and choose the right build approach for your budget and timeline. For choosing the right AI tools for your build, start with our vibe coding tools guide.
### Frequently Asked Questions
**Q: How long does it take to build an MVP?**
A: A well-scoped MVP takes 4-8 weeks to build. The most common mistake is building too much — a true MVP tests one core hypothesis with the smallest possible product. Using vibe coding tools like Cursor or Bolt, a simple MVP can be functional in 1-2 weeks. The extra time goes to user testing and iteration.
**Q: How much does it cost to build an MVP?**
A: An MVP costs $0-50,000 depending on approach. DIY with AI tools: $25-100/month in tool costs. Freelance developer: $5,000-15,000. Development agency: $15,000-50,000. The cheapest path is building a prototype with vibe coding tools ($25/month), validating with users, then investing in professional development.
**Q: What should an MVP include?**
A: An MVP should include exactly one core feature that solves exactly one problem for a specific user. It should NOT include: user accounts (unless authentication IS the feature), admin dashboards, analytics, multiple user roles, social features, or any 'nice to have.' If you can't describe your MVP in one sentence, it's too big.
**Q: What is the difference between an MVP and a prototype?**
A: A prototype demonstrates that something is possible. An MVP demonstrates that someone will pay for it. Prototypes test feasibility; MVPs test demand. You might build a prototype in a day to prove the tech works, then spend 4-6 weeks building an MVP that's polished enough for real users to evaluate.
**Q: Do I need a technical co-founder to build an MVP?**
A: Not anymore. In 2026, vibe coding tools like Bolt and Lovable let non-technical founders build functional MVPs. For more complex products, a fractional CTO ($5,000-10,000/month) provides technical leadership without giving up equity. A technical co-founder is valuable but no longer a prerequisite.
**Q: What tech stack should I use for my MVP?**
A: For speed: use whatever your developer knows best. For a solo founder with no developer: Bolt or Lovable for the prototype, then Rails or Next.js for production. The tech stack matters far less than shipping quickly. I build MVPs in Rails 8 because it ships fastest for full-stack applications, but the best stack is the one that gets you to users fastest.
**Q: How do I validate my MVP?**
A: Put it in front of 10-20 target users and watch them use it without explaining anything. If they can complete the core task without asking for help, your UX works. If 3+ out of 10 would pay for it (or sign up, or come back next week), you have signal. If none would, you've learned something valuable for $0-5,000 instead of $50,000.
---
## Fractional Product Manager: What They Do, What They Cost, and When You Need One
- **URL:** https://justinmckelvey.com/blog/fractional-product-manager
- **Published:** April 14, 2026
- **Updated:** April 14, 2026
- **Category:** Fractional Product
- **Reading time:** 7 min
- **Description:** A fractional product manager costs $5K-12K/month — senior product leadership without the $200K salary. Costs, responsibilities, and hiring signals.
### TL;DR: What a Fractional Product Manager Does
A fractional product manager gives your startup senior product leadership — roadmap strategy, user research, feature prioritization, stakeholder alignment — without the $200K+ salary. You typically pay $5,000-$12,000/month for 10-20 hours/week of the same caliber of leader that growth-stage companies hire full-time. As of April 2026, "fractional product manager" and "product management consulting" generate over 700 combined monthly searches, and the role is becoming as mainstream as fractional CTO was two years ago.
I've spent 15 years building products — 50+ shipped, $53M+ in revenue generated. When founders tell me they need a fractional CTO, I often discover what they actually need is product leadership. The distinction matters: a CTO answers "how do we build this?" A product manager answers "what should we build, and why?"
### What Does a Fractional Product Manager Actually Do?
Product management is one of the most misunderstood roles in startups. It's not project management (tracking tasks and deadlines). It's not engineering management (leading developers). It's the discipline of figuring out what to build next and making sure it solves real problems for real users.
A typical week for a fractional product manager includes:
User research and customer interviews. Talking to customers to understand their actual problems — not what they say they want, but what they actually do and where they struggle. Most startups build features based on the loudest customer request rather than the most impactful problem. A product manager fixes that by bringing data and user evidence to every decision.
Roadmap prioritization. Every startup has 50 features they could build and resources for 5. A product manager owns the framework for deciding which 5 matter most. This involves balancing customer impact, business value, engineering effort, and strategic alignment. The roadmap isn't a wishlist — it's a strategy document.
Feature specification and scoping. Translating business requirements into clear specs that engineers can build. This includes user stories, acceptance criteria, edge cases, and success metrics. Good specs reduce back-and-forth by 70-80% and prevent the "that's not what I meant" conversation after two weeks of development.
Stakeholder alignment. Keeping founders, investors, sales, and engineering aligned on priorities. In early-stage startups, the founder often has a different vision than the sales team, which has a different priority list than engineering. A product manager is the single point of truth for "what are we building and why."
Product analytics and metrics. Setting up tracking, defining KPIs, and using data to evaluate whether features are working. "We shipped it" isn't success. "Usage increased 40% and churn dropped 15%" is success. A product manager ensures every feature has measurable goals before it's built.
### Do You Need a Product Manager or a CTO?
This is the question I answer most often in initial consultations. Here's the decision framework:
You need a fractional product manager if: Your engineers build features that users ignore. You don't know why customers churn. The founder makes all product decisions based on gut feeling. Your roadmap changes every week based on whoever talked to the founder last. You have a dev team but no product strategy.
You need a fractional CTO if: Your code is falling apart. Deployments are scary. You can't evaluate your engineering team's work. You need to make a major technology decision. Your tech debt is slowing everything down.
You need both if: You're a non-technical founder with a dev team and no product process. This is more common than people think. The CTO handles engineering quality and technical strategy; the product manager handles what to build and user research. They work as a pair — the CTO says "here's what's possible and how long it takes," the product manager says "here's what users need and what moves the metrics."
A combined fractional CTO + product manager engagement typically runs $12,000-$20,000/month — still less than half the cost of two full-time senior hires.
### How Much Does a Fractional Product Manager Cost in 2026?
Here are the real market rates as of April 2026:
Hourly rate: $100-$250/hour depending on experience and market. Senior product leaders with 10+ years of experience command $175-$250/hour. Mid-career product managers range $100-$150/hour.
Monthly retainer (10-15 hrs/week): $5,000-$9,000/month. This covers user research, roadmap management, sprint planning input, and stakeholder alignment. Enough for strategic oversight but not hands-on daily product work.
Monthly retainer (20 hrs/week): $9,000-$12,000/month. This is the "embedded" engagement where the product manager is running sprint planning, writing specs, conducting user interviews, and managing the full product development cycle.
Fractional CPO: $10,000-$20,000/month. For companies with multiple product lines or product managers who need executive-level product strategy.
Full-time comparison: A VP of Product or CPO costs $180,000-$280,000/year in salary plus equity. A senior product manager costs $140,000-$200,000/year. The fractional model saves 50-70% while providing the same strategic capability.
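To make the comparison concrete, here's a minimal sketch of the arithmetic. The pairing of a $6,000/month retainer with a $180,000 salary is an assumption chosen from the ranges above, and the calculation ignores equity and benefits, which would widen the gap.

```python
def fractional_savings(monthly_retainer: float, full_time_salary: float) -> float:
    """Return the share of a full-time salary saved by a fractional retainer (0.0 to 1.0)."""
    annual_retainer = monthly_retainer * 12
    return 1 - annual_retainer / full_time_salary


# Illustrative inputs only: both figures are picked from the ranges quoted above.
retainer = 6_000
salary = 180_000

print(f"Annual retainer: ${retainer * 12:,}")                               # $72,000
print(f"Savings vs. salary: {fractional_savings(retainer, salary):.0%}")   # 60%
```

Plug in your own quotes and salary benchmarks; the result is sensitive to which end of each range you pick.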
### 6 Signs Your Startup Needs a Fractional Product Manager
#### 1. Your Engineers Are Building Features Nobody Uses
If your product analytics show that 40-60% of shipped features have low or no adoption, you have a prioritization problem, not an engineering problem. A product manager ensures every feature is validated against user need before a single line of code is written.
#### 2. The Founder Is the Product Bottleneck
Every feature decision requires the founder's input. Engineers wait 2-3 days for specifications. Edge cases aren't documented until someone hits them in production. This is the most common signal and the most costly, because it means the founder isn't spending time on fundraising, sales, and strategy.
#### 3. You're Losing Customers and Don't Know Why
Churn is high but nobody is doing exit interviews or analyzing usage patterns. A product manager's first 30-day deliverable is usually a churn analysis that identifies the top 3 reasons customers leave, along with a plan to fix them. This analysis alone often justifies the engagement cost.
#### 4. Your Roadmap Changes Every Week
A new customer request comes in and suddenly it's the top priority. Then an investor mentions a competitor feature and that becomes urgent. Without a product manager enforcing a prioritization framework, your engineering team is whiplashed between conflicting priorities.
#### 5. You're Building a V2 or Major Redesign
Version 1 was built on founder intuition and it worked. Version 2 needs to be built on user data, competitive analysis, and strategic positioning. This is where many startups stumble: they try to rebuild with the same process that built the MVP, and the result is a bloated product that nobody asked for.
#### 6. You're Preparing to Raise Funding
Investors want to see a clear product roadmap tied to business metrics. "We're going to build a bunch of features" isn't compelling. "Our roadmap reduces churn by 25% through these 3 improvements, then expands into this adjacent market" is a fundable story. A product manager helps you build and articulate that narrative.
### How to Evaluate a Fractional Product Manager
The product management market has the same problem as the fractional CTO market: lots of people calling themselves product leaders who've never actually shipped a product that customers pay for. Here's what to look for:
Ask for shipped products with outcomes. Not just "I was product manager at Company X." What did they ship, what metrics did it move, and what did they learn when it didn't work? A good product manager has as many stories about failed experiments as successful launches.
Ask how they prioritize. If they can't describe a prioritization framework in 2 minutes (RICE, ICE, weighted scoring, opportunity scoring), they're probably prioritizing based on gut feeling — which is what you're already doing without them.
Look for user research skills. The best product managers are obsessive about talking to users. Ask how many customer interviews they've conducted in the last month. If the answer is zero, they're a project manager with a better title.
Check for technical fluency. They don't need to write code, but they need to understand engineering trade-offs. "This feature is easy" from someone who can't estimate engineering effort is worse than no input at all.
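A prioritization framework like RICE, mentioned above, boils down to a small formula: reach times impact times confidence, divided by effort. Here's a minimal sketch of it. The backlog items and their numbers are hypothetical, and the units (users per quarter, person-months) are one common convention rather than a requirement.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    reach: float       # users affected per quarter (assumed unit)
    impact: float      # e.g. 0.25 (minimal) up to 3 (massive)
    confidence: float  # 0.0 to 1.0
    effort: float      # person-months (assumed unit)

    @property
    def rice(self) -> float:
        """RICE score: (reach * impact * confidence) / effort."""
        return self.reach * self.impact * self.confidence / self.effort


def prioritize(features: list[Feature]) -> list[Feature]:
    """Return features sorted from highest to lowest RICE score."""
    return sorted(features, key=lambda f: f.rice, reverse=True)


# Hypothetical backlog for illustration.
backlog = [
    Feature("Bulk export", reach=400, impact=1, confidence=0.8, effort=2),
    Feature("Onboarding checklist", reach=1500, impact=2, confidence=0.5, effort=3),
    Feature("Dark mode", reach=2000, impact=0.5, confidence=0.9, effort=4),
]

for feature in prioritize(backlog):
    print(f"{feature.name}: {feature.rice:.0f}")
# Onboarding checklist: 500
# Dark mode: 225
# Bulk export: 160
```

A candidate who uses a framework like this should be able to explain where each input came from, especially the confidence number.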
### Fractional Product Manager vs. Product Consultant vs. Product Coach
Fractional product manager: Embedded in your team 10-20 hrs/week. Owns the roadmap, writes specs, talks to users, runs sprint planning. They're a member of your team, not an outsider.
Product consultant: Engaged for a specific project or assessment. "Audit our product and tell us what's wrong." Typically 2-6 week engagements. Good for getting an outside perspective, not for ongoing product leadership.
Product coach: Mentors your existing product team or founder on product management skills. Doesn't do the work — teaches your team how to do it. Good when you have product people who need leveling up, not when you have no product function at all.
Most startups need a fractional product manager first, then transition to coaching once the founder or an internal hire can own the product function.
### Getting Started
If you're unsure whether you need product leadership, technical leadership, or both, book a strategy call. I'll assess your situation and tell you which type of fractional help (CTO, product manager, or a combination) makes the most sense for your stage. The call is 30 minutes, no pitch, and you'll leave with a clear recommendation.
For more on the technical leadership side, read What a Fractional CTO Actually Does. If you're building with AI tools and need help getting to production, check out our vibe coding tools guide.
### Frequently Asked Questions
**Q: What is a fractional product manager?**
A: A fractional product manager is a senior product leader who works with your company part-time, typically 10-20 hours per week. They own product strategy, roadmap prioritization, user research, and feature scoping — the same work a full-time VP of Product or CPO does, at 20-40% of the cost.
**Q: How much does a fractional product manager cost?**
A: As of 2026, fractional product managers charge $100-250/hour or $5,000-12,000/month on retainer. A typical 10-15 hours/week engagement costs $5,000-9,000/month. Compare that to a full-time product leader at $180,000-280,000/year plus equity.
**Q: What is the difference between a fractional CTO and a fractional product manager?**
A: A fractional CTO owns technical architecture, engineering team leadership, and technology decisions. A fractional product manager owns what gets built, why, and in what order — roadmap strategy, user research, feature prioritization, and stakeholder alignment. Some startups need both; many need one or the other.
**Q: When should a startup hire a fractional product manager?**
A: Hire a fractional product manager when your engineers are building features nobody uses, when the founder is the bottleneck for every product decision, when user churn is high but nobody is doing user research, or when you're preparing for a fundraise and need a clear product roadmap.
**Q: Can a fractional product manager help with UI/UX?**
A: Many fractional product managers have UI/UX experience and can handle product design decisions. For deep design work (full redesigns, design systems, complex interaction patterns), you may need a dedicated product designer. But for most startups, a product manager with design sensibility covers 80% of UX needs.
**Q: What is a fractional CPO?**
A: A fractional CPO (Chief Product Officer) is a more senior version of a fractional product manager, focused on product strategy across the entire company rather than individual features. Fractional CPOs typically work with companies that have 2+ product managers and need executive-level product leadership. They cost $10,000-20,000/month.
---
## Vibe Coding with Cursor: Agent Mode, Rules & Workflows (2026)
- **URL:** https://justinmckelvey.com/blog/vibe-coding-with-cursor
- **Published:** April 14, 2026
- **Updated:** April 14, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 9 min
- **Description:** The Cursor workflow that makes you 3-5x faster: agent mode, rules files, model selection, and the shortcuts working developers use daily. Updated April 2026.
### TL;DR: Why Cursor Is the Developer's Vibe Coding Tool
Cursor is a VS Code fork with AI deeply integrated into every part of the editing experience. At $20/month for Pro, it gives you agent mode (autonomous multi-file edits), intelligent tab completions, inline code generation, and access to Claude, GPT-4, and Gemini models. I use it daily to ship client projects as a fractional CTO, and it makes me 3-5x faster on routine development tasks. As of April 2026, Cursor is the most popular AI-powered IDE with over 1,300 people searching for "cursor vibe coding" every month.
This isn't a review — I covered that in my best vibe coding tools comparison. This is the power user's guide: the workflows, settings, and habits that separate casual Cursor users from developers who've fundamentally changed how they work.
### How to Set Up Cursor for Maximum Productivity
Before you write a single prompt, configure these settings. The defaults are fine for trying Cursor; these changes are what make it a production tool.
#### Choose Your Default Model
Cursor Pro gives you access to multiple AI models, and choosing the right one for the right task matters more than most people realize. Here's my model selection after 12 months of daily use:
Claude Sonnet 4 — my default for 80% of tasks. Best balance of speed, accuracy, and code quality. Excellent at understanding existing codebases and generating code that matches your project's patterns. Uses the least credits per request.
Claude Opus 4 — for complex architecture decisions, large refactors, and when Sonnet gets stuck. Significantly more expensive in credit usage but noticeably better at multi-step reasoning. I switch to Opus when a task requires understanding 5+ files and their relationships.
GPT-4.1 — occasionally useful for TypeScript-heavy projects and when Claude models produce repetitive patterns. I find GPT-4.1 generates slightly more creative solutions for frontend components but is less reliable for backend logic.
Don't overthink model selection when starting out. Use Claude Sonnet for everything, switch to Opus when you hit a wall.
#### Create a .cursorrules File
This is the single most impactful configuration you can make. A .cursorrules file in your project root tells Cursor's AI about your project's conventions, tech stack, and preferences. Without it, Cursor generates generic code. With it, Cursor generates code that fits your project.
Here's the structure I use for every project:
Tech stack declaration. "This is a Rails 8 application using Hotwire/Turbo, Stimulus, Tailwind CSS, and SQLite. Do not suggest React, Vue, or any JavaScript framework."
Code conventions. "Use snake_case for Ruby methods. Use Tailwind utility classes, never custom CSS. Use Turbo Frames for partial page updates. Prefer server-side rendering over client-side JavaScript."
What to avoid. "Never add console.log statements. Never use inline styles. Never suggest adding a new gem without explaining why the existing tools can't solve the problem."
Testing expectations. "All new methods should have corresponding tests. Use Minitest, not RSpec. Test the behavior, not the implementation."
A good .cursorrules file is 30-50 lines. It saves you from correcting the same AI mistakes repeatedly — which, over a month, adds up to hours of wasted time.
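Since I reuse the same four-part structure on every project, it can help to keep the rules as data and render the file from them. Here's a minimal sketch of that idea. The rule text comes from the examples above; the `render_rules` helper and the heading-plus-bullets layout are assumptions for illustration, since a .cursorrules file is free-form text.

```python
# Rules grouped by the four sections described above.
RULES = {
    "Tech stack": [
        "This is a Rails 8 application using Hotwire/Turbo, Stimulus, Tailwind CSS, and SQLite.",
        "Do not suggest React, Vue, or any JavaScript framework.",
    ],
    "Code conventions": [
        "Use snake_case for Ruby methods.",
        "Use Tailwind utility classes, never custom CSS.",
        "Use Turbo Frames for partial page updates.",
        "Prefer server-side rendering over client-side JavaScript.",
    ],
    "What to avoid": [
        "Never add console.log statements.",
        "Never use inline styles.",
        "Never suggest adding a new gem without explaining why the existing tools can't solve the problem.",
    ],
    "Testing expectations": [
        "All new methods should have corresponding tests.",
        "Use Minitest, not RSpec.",
        "Test the behavior, not the implementation.",
    ],
}


def render_rules(sections: dict[str, list[str]]) -> str:
    """Render sections as plain text: one heading per section, one bullet per rule."""
    lines: list[str] = []
    for title, rules in sections.items():
        lines.append(f"# {title}")
        lines.extend(f"- {rule}" for rule in rules)
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Redirect the output into the project root, e.g. `python make_rules.py > .cursorrules`
    # (the script name is hypothetical).
    print(render_rules(RULES), end="")
```

Writing the file by hand works just as well; the point is that each section stays short and specific.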
### Agent Mode: The Feature That Changes Everything
Agent mode is why Cursor dominates the developer vibe coding space. Instead of generating code snippets you copy-paste, agent mode can create files, modify existing files, run terminal commands, read error output, and iterate, all autonomously. You describe what you want; it figures out the steps.
#### When to Use Agent Mode
Scaffolding new features. "Create a booking system with a BookingType model, controller, views, and routes. Include validations for duration and active status." Agent mode will create 5-10 files, run migrations, and give you a working feature.
Multi-file refactors. "Rename the User model's 'name' field to 'full_name' across the entire codebase — models, views, controllers, tests, and seeds." Agent mode finds every reference and updates them consistently.
Bug fixing from error messages. Paste an error traceback and say "fix this." Agent mode reads the stack trace, identifies the file and line, understands the context, and proposes a fix. For routine bugs, this works 70-80% of the time on the first attempt.
Writing tests. "Write tests for the BookingsController. Test the happy path for create, update, and destroy. Test that unauthenticated users get redirected." Agent mode reads your controller code and generates tests that match your project's testing patterns.
#### When NOT to Use Agent Mode
Critical security code. Authentication, authorization, payment processing, and encryption should be written with full human attention. Agent mode might generate something that looks correct but has subtle vulnerabilities. Use it to draft; review every line yourself.
Complex database migrations. Migrations that modify production data (not just schema) need careful human planning. Agent mode doesn't understand that your production database has 50,000 rows that all need to be transformed correctly.
Performance-critical paths. Agent mode optimizes for readability and correctness, not performance. If you're writing code that handles 10,000 requests per second, you need to think about query optimization, caching, and memory allocation yourself.
### The Workflows That Make You 3-5x Faster
Speed in Cursor isn't about using agent mode for everything. It's about knowing which tool to use for which task. Here are the workflows I use daily.
#### Workflow 1: Tab Completion for Routine Code
Cursor's tab completion predicts your next edit based on context: not just the current line, but the surrounding code, recent changes, and your .cursorrules file. For routine code (form fields, model validations, CSS classes), tab completion is faster than typing and faster than prompting. Just start typing and hit Tab when the prediction is right.
The power user move: make a change in one place, then navigate to the next similar location. Cursor often predicts the parallel change. Tab. Done. This turns a 15-minute find-and-replace refactor into 2 minutes of Tab key presses.
#### Workflow 2: Inline Edit for Focused Changes
Select a block of code, hit Cmd+K (Mac) or Ctrl+K (Windows), and type what you want changed. "Add error handling for network failures." "Convert this to use async/await." "Add Tailwind responsive classes for mobile." The AI modifies just that block, showing you a diff before you accept.
This is the sweet spot between tab completion (too small) and agent mode (too big). Most of my daily Cursor usage is inline edits — targeted, predictable, and fast.
#### Workflow 3: Agent Mode for Feature Scaffolding
When I start a new feature, I describe the complete spec in agent mode. "Build a contact form that captures name, email, company, and message. Create a Form model, a public submission endpoint, and an admin view to see submissions. Send a confirmation email to the submitter and a notification to admin." Agent mode creates all the files, and I review each one.
The key is the review step. I accept about 85% of what agent mode generates and modify the other 15%. That 15% is where my experience adds the most value — catching missing validations, adding rate limiting, fixing edge cases the AI didn't consider.
#### Workflow 4: Error-Driven Development
This is the workflow most distinctive to vibe coding. Run your code, get an error, paste the error into Cursor, and let agent mode fix it. Repeat until the feature works. This sounds sloppy, but it's surprisingly effective for UI development and integration work where the fastest path to correctness is iterative.
I use this heavily for frontend work — generate a component, see that the spacing is wrong, tell Cursor "the card grid has too much gap on mobile, reduce to gap-4 on small screens." The iteration speed is faster than manually tweaking CSS values.
### Common Mistakes (and How to Avoid Them)
After a year of daily Cursor use and advising other developers on their AI workflows, these are the patterns that separate productive users from frustrated ones.
Mistake 1: Vague prompts. "Make this better" gives you unpredictable results. "Add input validation to the email field that checks for @ and a TLD, and show an inline error message using Tailwind" gives you exactly what you want. Be specific about what, where, and how.
Mistake 2: Never reading the generated code. This is how security vulnerabilities, performance problems, and subtle bugs accumulate. Agent mode is a pair programmer, not an autonomous developer. Read every diff. The 30 seconds you spend reviewing saves hours of debugging later.
Mistake 3: Fighting the tool's patterns. If Cursor keeps generating React components and you want Vue, update your .cursorrules file instead of correcting it every time. If it keeps suggesting Tailwind classes you don't use, add your preferred patterns to the rules. Teach the tool once; benefit forever.
Mistake 4: Using agent mode for tiny changes. Renaming a variable? Use find-and-replace. Fixing a typo? Just type it. Agent mode has startup time (reading context, generating a plan) that makes it slower than direct editing for small tasks. Match the tool to the task size.
Mistake 5: Not using multiple models. If Claude Sonnet struggles with a task, try Opus before rewriting your prompt 5 times. Different models have different strengths. Switching models takes 2 seconds; rewriting prompts takes 2 minutes.
### Cursor vs. Other Vibe Coding Tools for Developers
If you're choosing between developer-focused AI coding tools, here's how Cursor compares as of April 2026:
Cursor vs. Windsurf: Both cost $20/month. Cursor has a larger community, more third-party integrations (MCPs), and a more mature agent mode. Windsurf has improved significantly since its acquisition by Cognition and offers competitive code quality. Try both on your actual project, but most developers end up choosing Cursor.
Cursor vs. Claude Code: Different tools for different tasks. Cursor is an IDE (visual, file tree, integrated terminal). Claude Code is a terminal agent (text-only, runs commands directly). I use Cursor for frontend work and feature building, Claude Code for complex backend architecture and system-level changes. They complement each other — I often have both open.
Cursor vs. GitHub Copilot: Copilot is an autocomplete plugin inside VS Code. Cursor is an entire IDE rebuilt around AI. The difference is agent mode — Cursor can autonomously create, modify, and test multi-file changes. Copilot suggests lines; Cursor implements features. For serious AI-assisted development, Cursor is the clear choice.
### Is Cursor Worth $20/Month?
If you write code professionally, yes. Without qualification. The free tier is too limited to evaluate properly: you'll hit usage caps within an hour of real work. The Pro tier at $20/month is less than an hour of developer time at any market rate. If Cursor saves you one hour per month (and it will save you far more than that), it's paid for itself.
The question isn't whether Cursor is worth $20. It's whether the Pro+ tier at $60/month is worth the 3x usage increase over Pro. For most developers, Pro is sufficient. If you find yourself hitting usage limits regularly, upgrade — the productivity loss from waiting for quota resets costs more than $40/month.
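The upgrade decision is simple break-even arithmetic. Here's a minimal sketch; the $100/hour rate is an assumption for illustration, so substitute your own.

```python
def breakeven_hours(price_difference: float, hourly_rate: float) -> float:
    """Hours of blocked work per month at which the pricier plan pays for itself."""
    return price_difference / hourly_rate


PRO = 20       # $/month, from the pricing above
PRO_PLUS = 60  # $/month, from the pricing above

hours = breakeven_hours(PRO_PLUS - PRO, hourly_rate=100)  # hourly rate is assumed
print(f"Pro+ pays off after {hours * 60:.0f} minutes of blocked time per month")
# Pro+ pays off after 24 minutes of blocked time per month
```

If quota resets cost you more than that each month, the higher tier is the cheaper option.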
Head-to-head comparisons with the other leading tools:
- Claude Code vs Cursor: terminal agent vs desktop IDE, when to use each
- Replit vs Cursor: cloud collaboration vs local performance
- Lovable vs Cursor: prompt-based app builder vs developer IDE
For a full comparison against every major vibe coding tool, read my Best Vibe Coding Tools in 2026 guide. My guides on vibe coding limitations and real project case studies complement this one. And if you want to understand the broader context of AI-assisted development, start with What Is Vibe Coding?
If you've been vibe coding with Cursor and need help getting your project production-ready — security review, architecture audit, or scaling guidance — book a strategy call.
### Frequently Asked Questions
**Q: Is Cursor good for vibe coding?**
A: Cursor is the best vibe coding tool for developers as of 2026. Its agent mode can scaffold entire features from descriptions, and the tab completion predicts your next edit with high accuracy. At $20/month for Pro, it's the highest-leverage AI coding tool available. However, it requires coding knowledge — non-developers should use Bolt or Lovable instead.
**Q: How much does Cursor cost?**
A: Cursor Hobby is free with limited usage. Cursor Pro costs $20/month with access to Claude, GPT-4, and Gemini models. Cursor Pro+ is $60/month for 3x usage limits. Cursor Ultra is $200/month for 20x usage. Most developers find Pro sufficient; heavy daily users may need Pro+.
**Q: Is Cursor better than VS Code?**
A: Cursor is a fork of VS Code with deep AI integration. It supports all VS Code extensions and settings, so you lose nothing by switching. The AI features (agent mode, tab completions, inline edits) are significantly more integrated than VS Code's Copilot extension. Most developers who try Cursor don't go back.
**Q: What's the difference between Cursor and GitHub Copilot?**
A: GitHub Copilot is an autocomplete plugin. Cursor is an entire IDE rebuilt around AI — it can create files, run terminal commands, edit multiple files, and iterate on errors autonomously in agent mode. Copilot suggests the next line; Cursor can implement an entire feature from a description.
**Q: How do I use Cursor agent mode effectively?**
A: Be specific in your prompts, provide context about your tech stack, and use .cursorrules files to set project conventions. Start with small, contained tasks and expand scope as you build trust. Always review the generated code — agent mode is a fast pair programmer, not an autonomous developer.
**Q: What AI models does Cursor support?**
A: Cursor Pro gives access to Claude (Sonnet and Opus), GPT-4o, GPT-4.1, and Gemini 2.5 Pro. You can switch models per conversation. Claude Sonnet is best for most coding tasks; Opus for complex architecture; GPT-4.1 for certain language-specific patterns.
---
## Is Vibe Coding Bad? 7 Failures from Rescuing Vibe-Coded Apps
- **URL:** https://justinmckelvey.com/blog/is-vibe-coding-bad
- **Published:** April 14, 2026
- **Updated:** April 14, 2026
- **Category:** Vibe Code Rescue
- **Reading time:** 8 min
- **Description:** Vibe coding isn't bad — but it fails in 7 specific ways. A fractional CTO on honest trade-offs and when AI code actually works in 2026.
### The Short Answer
Vibe coding isn't bad. But it's not what most people think it is. After 15 years of shipping software, building 50+ products, and growing a practice that rescues vibe-coded applications that broke in production, here's what I know: vibe coding is the best prototyping tool ever created and a mediocre production tool. That's not a criticism; it's a calibration. Use it for what it's good at and you'll move faster than ever. Pretend it replaces professional engineering and you'll learn expensive lessons.
As of April 2026, "is vibe coding bad" gets nearly 600 searches per month. The question itself reveals the problem — people are looking for a binary answer to a nuanced question. Let me give you the nuanced one instead.
### Where Vibe Coding Is Genuinely Great
I want to start here because the anti-vibe-coding crowd is wrong too. These tools represent a genuine leap in how software gets built, and dismissing them is as foolish as overhyping them.
Speed to first version is unprecedented. A working web application in 30 minutes instead of 30 days. That's not incremental improvement — it's a category change. I've watched founders go from idea to user-testable prototype in an afternoon. Before vibe coding, that same process took weeks of developer time or months of learning to code.
The barrier to building software dropped to zero. A marketing director who sees a workflow problem can now build the solution herself instead of writing a requirements document, getting it prioritized, waiting three sprints, and receiving something that doesn't match what she imagined. That feedback loop compression is genuinely valuable.
Developers are significantly more productive. I use Cursor and Claude Code daily. They make me 3-5x faster on routine tasks — generating boilerplate, implementing well-known patterns, writing tests, and refactoring code. The time I save on mechanical work gets redirected to architecture, security, and the thinking that actually matters. AI-assisted development isn't vibe coding — it's the professional version of the same technology.
Prototyping costs dropped from $10,000-50,000 to $25/month. A startup that previously needed $20K of developer time to validate an idea can now test it for the cost of a Bolt subscription and a weekend. More ideas get tested, which means more good ideas get discovered. That's net positive for everyone.
### The 7 Ways Vibe-Coded Apps Break in Production
Now the uncomfortable part. These are real failures I've seen in real client projects: not hypothetical risks, but actual breakages that cost real money to fix. I see one or more of these in every vibe-coded app that reaches my desk.
#### 1. Exposed Credentials and API Keys
The most dangerous and most common failure. AI tools frequently generate code with API keys, database credentials, and secret tokens hardcoded directly in frontend JavaScript. This means anyone who opens their browser's developer tools can see, and steal, your Stripe secret key, your database password, or your admin credentials.
I reviewed a client's Bolt-generated e-commerce app where the Stripe secret key was in a JavaScript file served to every visitor. Anyone could have charged arbitrary amounts to the business's Stripe account. The fix took 30 minutes. Finding it after a breach would have cost thousands.
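The fix is usually the same pattern: secrets live in server-side environment variables, and only values meant to be public are sent to the browser. Here's a minimal sketch of that separation. The environment variable names and both helper functions are assumptions for illustration, not part of any framework.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent, so the app fails at startup."""


def load_secret(name: str) -> str:
    """Read a secret from the server's environment; never embed it in source."""
    value = os.environ.get(name)
    if not value:
        raise MissingSecretError(f"{name} is not set; refusing to start")
    return value


def public_config() -> dict:
    """Return only values that are safe to serve to every visitor.

    A Stripe publishable key is designed to be public; the secret key is not.
    """
    return {"stripe_publishable_key": os.environ.get("STRIPE_PUBLISHABLE_KEY", "")}


# Server startup would look something like this (variable name assumed):
#   stripe_secret = load_secret("STRIPE_SECRET_KEY")
```

Failing loudly at startup when a secret is missing is a deliberate choice: it's better than silently falling back to a key committed in the codebase.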
#### 2. Authentication That Only Looks Like Security
AI-generated authentication typically creates a login screen that checks credentials against a database. It looks like security. But the underlying implementation often relies on client-side checks only, meaning the authentication can be bypassed by anyone who knows how to modify a cookie or local storage value in their browser.
Production authentication requires server-side session management, CSRF protection, rate limiting on login attempts, secure password hashing, and proper session expiry. Most vibe-coded auth implementations have one or two of these. I've never seen one that had all five.
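To show what one of those five looks like, here's a minimal sketch of salted password hashing with a constant-time comparison, using only Python's standard library. The iteration count is an assumed work factor, and this covers password storage only, not sessions, CSRF, or rate limiting. In a Rails app, `has_secure_password` handles this step with bcrypt.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor; tune it for your hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for storage; the plaintext is never stored."""
    salt = secrets.token_bytes(16)  # unique random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest server-side and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))  # True
print(verify_password("guess", salt, digest))                         # False
```

The important property is that the check runs on the server against stored values, so nothing a user edits in their browser can change the outcome.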
#### 3. Silent Failures and White Screens
AI tools generate code for the happy path: the scenario where everything works perfectly. When the API returns a 500 error, when the database connection drops, when a user submits a form with unexpected characters, the app crashes. Not with a helpful error message. With a blank white screen or an inscrutable "TypeError: Cannot read property of undefined."
In my experience, the average vibe-coded app has zero error handling for network failures, zero error boundaries for rendering failures, and zero user-facing error messages. Every unexpected condition produces a silent crash.
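The remedy is the same in any language: catch the failure where it happens and turn it into something the user can act on. Here's a minimal sketch in Python for a network call. The `fetch_json` helper, its return shape, and the message wording are assumptions for illustration; a frontend would apply the same idea with its own error boundaries.

```python
import json
import urllib.error
import urllib.request


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """Return {'ok': True, 'data': ...} on success or {'ok': False, 'message': ...} on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return {"ok": True, "data": json.load(response)}
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500.
        return {"ok": False, "message": f"The server returned an error ({err.code}). Please try again."}
    except OSError:
        # Connection refused, DNS failure, timeout, and similar network problems.
        return {"ok": False, "message": "We couldn't reach the server. Check your connection and retry."}
    except json.JSONDecodeError:
        # The response arrived but was not valid JSON.
        return {"ok": False, "message": "The server sent an unexpected response."}
```

Callers check `result["ok"]` and render `result["message"]` instead of crashing, which is the difference between a white screen and a recoverable error.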
#### 4. Race Conditions in Multi-User Features
This one is subtle and often doesn't surface until you have real concurrent users. Two people book the same time slot. Two users edit the same document and one loses their changes. A payment processes twice because the user clicked the button during a slow response. These are concurrency problems that require deliberate architectural decisions: database locks, optimistic concurrency control, idempotency keys.
AI tools don't think about concurrent access because their training examples rarely demonstrate it. The code works perfectly for one user. Add ten, and chaos begins.
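For the double-booking case, one simple defense is to let the database enforce uniqueness rather than checking availability in application code first. Here's a minimal sketch using SQLite. The table and column names are hypothetical, and a unique constraint is only one of several valid approaches.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # UNIQUE on slot means the database itself rejects a second booking.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bookings ("
        " slot TEXT NOT NULL UNIQUE,"
        " user_id INTEGER NOT NULL)"
    )


def book_slot(conn: sqlite3.Connection, slot: str, user_id: int) -> bool:
    """Try to claim a slot; return False if someone else already holds it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO bookings (slot, user_id) VALUES (?, ?)",
                (slot, user_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
create_schema(conn)
print(book_slot(conn, "2026-05-01T10:00", user_id=1))  # True
print(book_slot(conn, "2026-05-01T10:00", user_id=2))  # False
```

A "check if free, then insert" sequence leaves a gap between the two steps where a second request can slip in; a single insert against a constraint has no such gap.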
#### 5. Broken Payment Processing
Stripe integration is the canary in the coal mine for vibe-coded apps. A working checkout flow requires webhook verification (confirming payments actually completed), idempotency (preventing duplicate charges), proper error handling (what happens when a card is declined?), and reconciliation (matching Stripe records to your database).
The most expensive rescue I've done involved a vibe-coded app that processed over $200,000 in payments with broken webhook handling. Approximately 15% of successful payments were never recorded in the application's database. The customers were charged but never received access to what they paid for. Fixing this cost $15,000 — more than the entire application cost to build.
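Two of those requirements, webhook verification and idempotent recording, fit in a short sketch. The code below uses a generic HMAC-SHA256 signature over the raw payload, which is a simplification and not Stripe's exact header format; in a real integration, use the verification helper in Stripe's official library (`stripe.Webhook.construct_event` in Python). The table schema and function names are hypothetical.

```python
import hashlib
import hmac
import sqlite3


def signature_for(secret: bytes, payload: bytes) -> str:
    """Compute the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def is_authentic(secret: bytes, payload: bytes, received_signature: str) -> bool:
    """Reject any webhook whose signature does not match, using a constant-time compare."""
    return hmac.compare_digest(signature_for(secret, payload), received_signature)


def record_payment(conn: sqlite3.Connection, event_id: str, amount_cents: int) -> bool:
    """Insert once per event ID; a redelivered event becomes a no-op.

    Returns True if this call recorded the payment, False if it was already recorded.
    """
    with conn:
        cursor = conn.execute(
            "INSERT OR IGNORE INTO payments (event_id, amount_cents) VALUES (?, ?)",
            (event_id, amount_cents),
        )
    return cursor.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (event_id TEXT PRIMARY KEY, amount_cents INTEGER NOT NULL)"
)
```

Keying the table on the event ID means a retried webhook can't create a duplicate record, and a failed write surfaces as an exception instead of a silently missing payment.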
#### 6. Input Validation Vulnerabilities
SQL injection, cross-site scripting (XSS), and other input validation attacks are web security basics that professional developers handle as second nature. AI-generated code frequently skips input validation entirely. A name field that accepts