The Vertical Video Ad Factory ($25K MRR)

Captions just raised $75M and revealed where the real margin in AI video is heading: not generation, but assembly. The vertical production layer is wide open.

On March 24, 2026, Mirage, the company behind the AI video-editing app Captions, announced $75 million in growth financing from General Catalyst. Total funding now exceeds $175 million. The company claims more than 20 million users and over 250 million videos created. TechCrunch separately reports 3.2 million downloads in the past year and $28.4 million in in-app revenue. These are infrastructure-scale numbers for what started as a creator tool.

Forget the funding round. What matters is the strategic pivot underneath it. CEO Gaurav Misra is repositioning Mirage around proprietary models, marketing automation, and what he describes as "agentic video creation." The company is investing in "assembly intelligence," the logic layer that decides what to combine, in what order, and for whom, plus accent-preserving audio that prevents international voices from being flattened into generic American speech. The winning category isn't "make one nice video." It's "manufacture endless useful video variants from structured inputs."

That shift points to an opening most founders are missing.

🎯
The play: Build a vertical AI video ad factory that turns structured business feeds into localized, channel-ready short-form ads at scale.

The money: 10 multi-location clients at $2,500/month is $25K MRR. Ken Garff's pilot cut lead costs by up to 22.9% across eight dealerships.

Inside:
• Three vertical wedges with ROI math
• Feed-to-campaign MVP blueprint
• Service-to-SaaS pricing tiers
• Four compounding moats to build early

The platforms are ready. The production layer isn't.

The distribution rails for bulk AI video creative already exist. TikTok's Smart+ Catalog Ads convert static product catalogs into optimized shoppable video; advertisers in closed beta saw a 36% drop in cost per acquisition compared to manual campaigns. As of October 2025, Meta's Advantage+ Dynamic Media is the default for all new catalog ads, automatically choosing whether to show images or video based on individual engagement signals. Google reported nearly 70 million Gemini-generated ad assets created by advertisers in Q4 2025 alone.

[Figure: AI video market surge statistics]

The demand side matches. HubSpot's 2026 State of Marketing report shows 91% of businesses use video as a marketing tool. Animoto's 2026 State of Video report found 84% of marketers already use AI in video creation. U.S. programmatic video ad spending is projected to exceed $110 billion, with video capturing nearly 75% of new programmatic dollars allocated between 2024 and 2026. An estimated 86% of video ad buyers say they use or plan to use generative AI for creative, and roughly 40% of all video ad creative is projected to incorporate gen-AI by end of 2026.

[Figure: Platform ad automation capabilities]

Assembly intelligence over generation quality

The opportunity isn't another AI video editor. The operator-grade insight is that a new layer is opening between raw business data and paid distribution. The valuable product is an assembly engine that ingests feeds, offers, reviews, inventory, and scripts, then outputs dozens or hundreds of localized, channel-ready ads.
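To make the assembly idea concrete, here is a minimal Python sketch of the feed-to-variants step. It is an illustration under assumptions, not a description of how Mirage or any ad platform works: the `FeedItem` and `AdVariant` schemas, the channel names, and the locale codes are all hypothetical, and the output is only a render spec that a separate video renderer would consume.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class FeedItem:
    """One row from a structured business feed (hypothetical schema)."""
    item_id: str
    title: str
    offers: tuple[str, ...]
    photos: tuple[str, ...]


@dataclass(frozen=True)
class AdVariant:
    """A render-ready spec; a separate video renderer would consume it."""
    item_id: str
    channel: str
    locale: str
    hook: str
    opening_photo: str


def assemble_variants(item, channels, locales):
    """Cross every offer and opening photo with each channel and locale."""
    return [
        AdVariant(item.item_id, channel, locale, hook, photo)
        for channel, locale, hook, photo in product(
            channels, locales, item.offers, item.photos
        )
    ]


# Example feed row and targets (all values are made up for illustration).
listing = FeedItem(
    item_id="apt-101",
    title="2BR near downtown",
    offers=("No broker fee", "Pet-friendly"),
    photos=("kitchen.jpg", "balcony.jpg"),
)
variants = assemble_variants(
    listing,
    channels=("tiktok", "meta_reels"),
    locales=("en-US", "es-US"),
)
print(len(variants))  # 2 channels x 2 locales x 2 hooks x 2 photos = 16
```

One listing with two offers and two photos already yields 16 specs across two channels and two locales. A real engine would prune and rank these combinations rather than render all of them, which is where the decision logic comes in.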

Think apartment listings, car dealerships, staffing firms, med spas, home services, and multi-location franchises. These businesses don't need one masterpiece. They need 50 fresh ads this week, 50 more next week, and a way to connect creative output to leads, booked appointments, or inventory movement.

Mirage's bet on assembly intelligence matters more than the generation models underneath it. The next category winner may not be the company with the prettiest output. It may be the company that gets best at deciding what to combine, in what order, for whom, and under which context. For a small team, that's excellent news. You don't need to win the global model race. You need to win one narrow decision stack: Which property photo opens the reel? Should the hook lead with "No broker fee" or "Pet-friendly"? Which voice and pacing pattern converts best in which ZIP code?
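One slice of that decision stack, choosing a hook per ZIP code, can be sketched in a few lines of Python. This is a deliberately naive version under stated assumptions: the `best_hook_by_zip` helper, the log format, the `min_impressions` threshold, and every number in the sample data are hypothetical, and a production system would likely use a proper experiment design rather than a raw lead-rate comparison.

```python
from collections import defaultdict


def best_hook_by_zip(results, min_impressions=1000):
    """Pick the hook with the highest lead rate in each ZIP code.

    `results` is an iterable of (zip_code, hook, impressions, leads) tuples.
    Hooks with fewer than `min_impressions` in a ZIP are ignored.
    """
    totals = defaultdict(lambda: [0, 0])  # (zip, hook) -> [impressions, leads]
    for zip_code, hook, impressions, leads in results:
        totals[(zip_code, hook)][0] += impressions
        totals[(zip_code, hook)][1] += leads

    best = {}  # zip -> (hook, lead_rate)
    for (zip_code, hook), (impressions, leads) in totals.items():
        if impressions == 0 or impressions < min_impressions:
            continue  # too little data to compare
        rate = leads / impressions
        if zip_code not in best or rate > best[zip_code][1]:
            best[zip_code] = (hook, rate)
    return {zip_code: hook for zip_code, (hook, _) in best.items()}


# Made-up performance log, for illustration only.
log = [
    ("11201", "No broker fee", 5000, 60),
    ("11201", "Pet-friendly", 5000, 45),
    ("78701", "No broker fee", 4000, 20),
    ("78701", "Pet-friendly", 4000, 36),
    ("78701", "Pet-friendly", 200, 9),
]
print(best_hook_by_zip(log))
```

With this sample log, "No broker fee" wins in 11201 and "Pet-friendly" wins in 78701. The same pattern extends to opening photos, voices, and pacing: log outcomes per segment, then let the data pick the next batch of variants.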

Generality is a losing game. Vertical specificity is where the margin lives.

[Figure: Assembly intelligence layer diagram]

The underpriced angle: accents, language, and local trust

The accent-preservation detail in the Mirage story isn't a cute product feature. It's a distribution clue.

Large swaths of local commerce still market as if every buyer wants generic U.S.-default content. A clinic in Miami doesn't need the same voice profile as a clinic in Phoenix. A staffing firm in Texas may want bilingual English-Spanish variants by default. A brokerage targeting Brazilian buyers in South Florida may care more about Portuguese nuance than cinematic editing quality.

Animoto's January 2026 research found 83% of consumers can identify AI-generated video, and 36% say it lowers brand trust. AI-generated sameness is getting cheaper. Trust cues are getting more valuable. Localized, accent-aware, vertical-specific creative at scale isn't just more content. It's content that feels closer, more native, more believable. That makes localization one of the most underpriced angles in AI video advertising.

Pick one vertical and own the feed

If you want the most practical version of this opportunity, start with one of these three verticals. Each has structured inputs, existing acquisition budgets, and a clear ROI story.
