In early 2024, our copywriters were spending 60% of their time writing variations of the same ad — different hooks, different CTAs, different lengths for different placements. Good copywriters should spend their time on strategy and big ideas, not grinding out the 47th variation of a Google RSA headline. That frustration is what led us to build CopyLoop, and 18 months later, it has generated over 10,000 ads with a 23% higher CTR than our human-written baseline. Here is the full story of how we built it.
The Problem Statement
A typical client campaign at Garage Collective needs 120-200 ad variations per month across Meta, Google, and LinkedIn. Each variation needs to respect brand guidelines, match the campaign objective, fit platform-specific character limits, and — most importantly — actually perform. Our copywriters were burning out, and quality was inconsistent. Monday morning copy was noticeably better than Friday evening copy.
We tried generic AI writing tools: ChatGPT, Jasper, Copy.ai. They produced grammatically correct copy that performed terribly. The output sounded like a Wikipedia article trying to sell you something. The problem was obvious: these tools were trained on general internet text, not on high-performing ad copy. They did not know what a hook is, how urgency works differently on Meta versus Google, or why certain emotional triggers convert better for Indian audiences.
Tech Stack Decisions
We evaluated three approaches: fine-tuning an open-source model (LLaMA), building a RAG pipeline over GPT-4, or creating a custom prompt-chaining system with Claude. We chose the third option. Fine-tuning required more training data than we had at the time, and RAG over GPT-4 gave us retrieval accuracy issues — it would pull structurally similar ads that were not actually high performers.
Our stack: the Claude API for generation, a custom embeddings pipeline (using text-embedding-3-large) over our ad performance database, Supabase for storage and vector search, and a Next.js frontend for the team. The total build took four months with two engineers and cost under ₹8L in development, a fraction of what we would have spent licensing an enterprise AI writing tool.
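For the curious, here is a stripped-down sketch of what the embedding step looks like in TypeScript. The `ads` table and its `embedding` column are illustrative names rather than our exact schema, and the production pipeline adds batching, retries, and the tagging described in the next section.

```typescript
// Sketch of the embeddings pipeline: embed an ad's copy with
// text-embedding-3-large and store the vector in Supabase for pgvector search.
// Table and column names ("ads", "embedding") are illustrative.
import OpenAI from "openai";
import { createClient } from "@supabase/supabase-js";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

export async function indexAd(adId: string, copy: string) {
  // text-embedding-3-large returns a 3,072-dimension vector
  const res = await openai.embeddings.create({
    model: "text-embedding-3-large",
    input: copy,
  });

  // Attach the vector to the ad row so it can be queried with pgvector
  const { error } = await supabase
    .from("ads")
    .update({ embedding: res.data[0].embedding })
    .eq("id", adId);
  if (error) throw error;
}
```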
Training on Our Own Campaign Data
The secret sauce is the performance database. We tagged and scored every ad we had run over three years — 8,000+ ads across 200+ campaigns. Each ad was tagged with format (RSA headline, Meta primary text, LinkedIn sponsored), objective (awareness, consideration, conversion), industry, emotional trigger (urgency, social proof, curiosity, fear of missing out), and its actual performance metrics (CTR, conversion rate, CPA).
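Concretely, each row in that database looks roughly like this (field names are illustrative, but the tags mirror the taxonomy above):

```typescript
// Illustrative shape of one scored ad in the performance database.
type ScoredAd = {
  id: string;
  copy: string;
  format: "rsa_headline" | "meta_primary_text" | "linkedin_sponsored";
  objective: "awareness" | "consideration" | "conversion";
  industry: string;
  trigger: "urgency" | "social_proof" | "curiosity" | "fomo";
  metrics: {
    ctr: number;            // click-through rate, e.g. 0.021
    conversionRate: number; // conversions / clicks
    cpa: number;            // cost per acquisition, in ₹
  };
  embedding?: number[];     // text-embedding-3-large vector
};
```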
When CopyLoop generates copy, it does not just prompt an LLM — it first retrieves the top 15 performing ads from similar campaigns (same industry, same objective, same platform), extracts the structural patterns (hook type, CTA style, length), and then prompts Claude with those patterns as a framework. The result is copy that sounds like our best work, not like generic AI output.
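Here is a simplified sketch of that generation step, reusing the `ScoredAd` shape above. The `match_top_ads` RPC, the model id, and the prompt wording are stand-ins for illustration, not our production code.

```typescript
// Sketch of the retrieve-then-generate flow: pull top performers from
// similar campaigns, then prompt Claude with their patterns as a framework.
import Anthropic from "@anthropic-ai/sdk";
import { createClient } from "@supabase/supabase-js";

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

export async function generateCopy(brief: {
  industry: string;
  objective: string;
  platform: string;
  product: string;
}) {
  // 1. Retrieve the top 15 performers from similar campaigns.
  //    match_top_ads is a hypothetical Postgres function that combines the
  //    metadata filters with a pgvector similarity search, ordered by CTR.
  const { data: exemplars, error } = await supabase.rpc("match_top_ads", {
    industry: brief.industry,
    objective: brief.objective,
    platform: brief.platform,
    match_count: 15,
  });
  if (error) throw error;

  // 2. Turn the exemplars into a framework for the model.
  //    (The real pipeline extracts hook/CTA/length patterns first;
  //    here we pass the exemplars directly for brevity.)
  const framework = (exemplars ?? [])
    .map((ad: ScoredAd, i: number) => `${i + 1}. [CTR ${ad.metrics.ctr}] ${ad.copy}`)
    .join("\n");

  // 3. Prompt Claude with the patterns as scaffolding, not text to copy.
  const msg = await anthropic.messages.create({
    model: "claude-3-5-sonnet-latest", // model id illustrative
    max_tokens: 1024,
    messages: [{
      role: "user",
      content:
        `Here are 15 of our best-performing ads for similar campaigns:\n${framework}\n\n` +
        `Match their hook types, CTA styles, and lengths (do not reuse their wording) ` +
        `and write 10 new ${brief.platform} ad variations for: ${brief.product}.`,
    }],
  });
  return msg.content;
}
```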
A/B Test Results: Human vs CopyLoop
We ran a controlled test across 6 client accounts over 3 months. Each campaign received an equal mix of human-written and CopyLoop-generated ads. The writers did not know which ads were theirs and which were AI-generated. Results: CopyLoop ads averaged a 23% higher CTR and 11% lower CPA. Human-written ads performed better on brand-building campaigns where emotional nuance mattered, but CopyLoop dominated on direct-response and lead-gen.
The most surprising finding: CopyLoop was better at writing short-form copy (Google RSA headlines, Meta hooks) and humans were better at long-form (LinkedIn articles, email sequences). This makes sense — short-form ad copy is fundamentally a pattern-matching problem, and AI is better at patterns. Long-form requires sustained narrative logic that still favors human writers.
What We Learned Building In-House AI Tools
First, your proprietary data is your moat. Anyone can call the Claude API — what makes CopyLoop valuable is the 8,000-ad performance database that took three years to build. Second, AI tools for internal use do not need to be perfect; they need to be faster than the alternative. CopyLoop still produces duds — about 15% of its output gets rejected by our writers. But it produces those duds in 4 seconds instead of 20 minutes.
Third, adoption is harder than building. Our copywriters initially resisted CopyLoop because they saw it as a threat. It only succeeded when we repositioned it as an assistant that handles the grunt work so they can focus on big-idea creative. Now our writers use it as a starting point for 80% of performance ad copy and spend their freed-up time on brand campaigns, scripts, and strategy.
Key Takeaways
- Generic AI writing tools underperform because they are not trained on high-performing ad copy — you need domain-specific data.
- CopyLoop uses a retrieval-augmented approach over 8,000+ scored ads to generate copy that matches proven structural patterns.
- In controlled A/B tests, CopyLoop-generated ads had 23% higher CTR and 11% lower CPA vs human-written copy.
- AI excels at short-form, pattern-driven copy (RSA headlines, hooks) while humans still win on long-form narrative.
- The hardest part of building AI tools is internal adoption — reposition them as assistants, not replacements.