Case Study  ·  Landing Page Simulation  ·  March 7, 2026

Talk Stories:
Landing Page to 6.65/10

How 5 rounds of synthetic persona simulation lifted conversion intent by 55% (from 4.3 to 6.65) without a single real user, in a single day.

Client: Talk Stories by 64stories Date: March 7, 2026 Duration: Single-day sprint Models: qwen2.5:7b + llama3.3:70b (local)
+55%
Intent Lift
5
Sim Rounds
100
Persona Evals
4.3→6.65
Intent Score
4
Copy Variants
−29%
Privacy Objections

Contents

  01 Executive Summary
  02 The Brief
  03 Research & Testing Methodology
  04 The Journey: v1 Through v5
  05 The Framing Experiment: 4 Variants Head-to-Head
  06 Surprises & Course Corrections
  07 Final Deliverables
  08 Implementation Playbook
  09 Appendix: Data & Artifacts
Section 01

Executive Summary

Talk Stories is an AI content tool that lives in Slack. It learns how each person on a team writes, then generates content in their voice on demand. The product is strong. The landing page was not converting.

Starting from an existing redesigned page, we ran 5 rounds of synthetic user simulation across 20 personas, all executed locally on a Mac Studio using qwen2.5:7b as the persona model and llama3.3:70b as an independent judge. No human users. No survey panels. No waiting.

In a single day, average conversion intent moved from 4.3/10 to 6.65/10, a 55% lift. Privacy objection rate dropped from 35% to 25%. Word-of-mouth signal (personas who said they would share the page) doubled from 15% to 30%.

What We Shipped

A production-ready HTML landing page incorporating findings from all 5 simulation rounds.

North Star

The best landing page is the one that converts the right people and honestly disqualifies the wrong ones. We optimized for qualified intent, not surface metrics. A 6.65 from 17 on-target personas matters more than an 8.0 that includes people who would churn in week 2.

Section 02

The Brief

The Talk Stories landing page existed. It had a hero, social proof, feature list, pricing, and a CTA. The question wasn't whether it was designed; it was whether it was working.

What We Needed to Know

Constraints

The Hidden Constraint

The hardest constraint wasn't technical; it was scope control. Every simulation round produced findings that pointed toward 3-4 possible fixes. The discipline was in picking the single highest-leverage change per iteration, not all of them at once.

The Product Context

Talk Stories is built for B2B Slack teams at 20-200 person companies. The target buyer is a content-bottlenecked role: founders who want to write but can't, heads of marketing who are drowning in requests, SDR managers whose reps never post. Pricing: ~$20-30/seat/month, free during early access.

The page needed to convert cold traffic from word of mouth ("a colleague sent me this"). Not SEO. Not ads. A human recommendation, followed by a first-impression read.

Section 03

Research & Testing Methodology

Every round used the same core setup. Consistency across runs is what makes the comparative data meaningful.

The Persona Panel

20 synthetic personas, held constant across all 5 simulation rounds. Personas represent the actual target buyer distribution for Talk Stories: B2B Slack teams, 20-200 person companies, content-bottlenecked roles.

Role | Count | Company Size Range | Slack User | Content Pain
CEO / Founder | 5 | 8–50 people | 4 of 5 | Mixed
Head of Marketing / CMO | 4 | 35–150 people | 4 of 4 | High
VP Sales / SDR Manager | 2 | 70–80 people | 2 of 2 | High
Head of Content / Content Lead | 2 | 60–95 people | 2 of 2 | High
VP Product / VP Marketing | 2 | 110–150 people | 2 of 2 | Low–High
COO / Chief of Staff | 2 | 55–65 people | 2 of 2 | Medium
Head of Growth / Head of Comms | 2 | 200–800 people | 2 of 2 | Medium–High
Founder (non-Slack) | 1 | 12 people | No | Low
Why Synthetic Personas?

Real user testing with 20 people across 5 iteration rounds would take weeks and cost thousands of dollars. Synthetic personas running on local LLMs let us iterate in hours, not weeks, at zero marginal cost. The trade-off: personas lack real-world messiness and embodied experience. We treat findings as directional signal, not ground truth. Findings that emerge consistently across 15+ personas are treated as reliable patterns.
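The 15-persona reliability bar can be applied mechanically once each response is reduced to a list of objections. A minimal sketch in Python (the `objections` field name is illustrative, not taken from the actual simulation scripts):

```python
from collections import Counter

def reliable_patterns(responses, threshold=15):
    """Keep only objections raised by at least `threshold` personas."""
    counts = Counter()
    for r in responses:
        counts.update(set(r["objections"]))    # one vote per persona
    return {obj: n for obj, n in counts.items() if n >= threshold}

# Toy panel of 20: 16 raise "voice", 7 raise "privacy"
panel = [{"objections": ["voice"]} for _ in range(16)] + \
        [{"objections": ["privacy"]} for _ in range(4)]
for p in panel[:3]:
    p["objections"].append("privacy")
print(reliable_patterns(panel))  # only "voice" clears the 15-persona bar
```

Anything below the bar is noted but not acted on until it recurs in a later round.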

Model Selection

Two models, two roles, chosen specifically to avoid self-grading bias: qwen2.5:7b played all 20 personas, while llama3.3:70b served as the independent judge and never evaluated its own output.
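In code, the split is just two different `model` values in otherwise identical Ollama requests. A sketch of the request shape, assuming Ollama's `/api/generate` endpoint (which accepts `model`, `system`, `prompt`, and `stream`); the persona details are illustrative:

```python
OLLAMA_URL = "http://localhost:11434/api/generate"
PERSONA_MODEL = "qwen2.5:7b"    # plays all 20 personas
JUDGE_MODEL = "llama3.3:70b"    # synthesizes rounds; never grades its own output

def build_request(model, system, prompt):
    """Assemble a request body for Ollama's /api/generate endpoint."""
    return {"model": model, "system": system, "prompt": prompt, "stream": False}

persona_req = build_request(
    PERSONA_MODEL,
    system="You are Priya, Head of Marketing at a 45-person HR tech company.",
    prompt=("A colleague just sent you this link. You've never seen the "
            "product before. Take a look.\n\n<page text goes here>"),
)
# A real run would POST this: requests.post(OLLAMA_URL, json=persona_req)
```

Keeping the judge as a separate, larger model means persona outputs are never scored by the model that produced them.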

Task Prompt Design

The task prompt was intentionally neutral to avoid leading the witness:

"A colleague just sent you this link. You've never seen the product before. Take a look."

Each persona was then asked structured questions covering: first impression, comprehension, top objection, conversion likelihood (1-10), and whether they would share the page. Later rounds added section-specific questions as we introduced new elements.
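Structured questions only pay off if the answers can be scored mechanically. A hedged sketch of the kind of parser this implies (the `Conversion likelihood` and `Share` field labels are assumptions about the response format, not taken from the actual prompts):

```python
import re

def parse_persona_answer(text):
    """Extract the two quantitative fields from a structured response."""
    score = re.search(r"likelihood:\s*(\d+)\s*/\s*10", text, re.I)
    share = re.search(r"share:\s*(yes|no)", text, re.I)
    return {
        "intent": int(score.group(1)) if score else None,
        "would_share": share.group(1).lower() == "yes" if share else None,
    }

answer = "First impression: clean.\nConversion likelihood: 7/10\nShare: yes"
print(parse_persona_answer(answer))  # {'intent': 7, 'would_share': True}
```

Responses that fail to parse surface as `None` rather than silently skewing the round's averages.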

Scoring and Synthesis

The judge (llama3.3:70b) synthesized each round after all 20 persona responses were collected: surfacing recurring objections, comparing intent scores against the prior round, and recommending the single highest-leverage change for the next iteration.

Checkpoint files were saved after each persona response, making runs resumable: if a run crashed midway, it picked up where it left off without re-running completed personas.
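The resumable-run pattern is simple: write the checkpoint after every response and skip anything already on disk at restart. A minimal sketch (function and file names are illustrative):

```python
import json
from pathlib import Path

def run_panel(personas, ask, checkpoint="sim_checkpoint.json"):
    """Run every persona through `ask`, checkpointing after each
    response so a crashed run resumes where it left off."""
    path = Path(checkpoint)
    done = json.loads(path.read_text()) if path.exists() else {}
    for name in personas:
        if name in done:                   # answered in a previous run
            continue
        done[name] = ask(name)             # one model call per persona
        path.write_text(json.dumps(done))  # save immediately
    return done
```

Calling `run_panel` again with the same checkpoint path re-runs only the personas that never answered.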

Section 04

The Journey: v1 Through v5

Five rounds, five sets of changes, a 55% improvement in conversion intent. Here is every step.

v1

Baseline: The Existing Page

4.3/10 avg intent  ·  20 personas

The redesigned page used "ghostwriter" as the product framing, "beta" throughout, and included the line "it's read everything you've ever written in Slack." Social proof (Bolt, Spinwheel, Ramp, Anthropic, OpenAI), before/after voice examples, and a standard FAQ.

Top findings: 60% raised "AI can't capture our unique voice." 35% raised data privacy. 30% confused by what "beta" meant for pricing. Intent split: 40% unlikely (1-3), 40% maybe (4-6), 20% likely (7-10). The page had good bones but wasn't closing the deal.

Framing

Side Experiment: 4 Framing Variants (80 total runs)

We tested hypotheses before making changes.

Rather than guess which copy direction to pursue, we ran 20 personas against 4 distinct framing variants simultaneously. Full results in Section 05 below.

Winner: "Voice Engine + Early Access" scored 7.35/10, highest of any variant. But the judge recommended a hybrid: Voice Engine's emphasis on learning your voice, combined with the clarity of describing the product without a category label.

v3

Copy Overhaul: Beta Out, Ghostwriter Out, Privacy Language In

6.05/10 avg intent  ·  +1.75 from v1

Changes: "Beta" replaced with "early access" throughout. "Ghostwriter" label removed; the page now describes what it does without a category name. "It's read everything you've ever written in Slack" replaced with "learns from what your team chooses to share. You control what it knows." Bottom CTA changed from "Get your team's ghostwriter" to "Your team's voice. In Slack."

What moved: Removing the Slack privacy line paid off immediately. "Early access" felt more premium and intentional than "beta." Privacy objection rate remained ~35%; the new language improved the feeling but didn't address the mechanism.

What didn't move: Voice authenticity skepticism remained the top objection at ~55% of personas. Privacy stayed unresolved without specific details on data handling.

v4

Evidence Upgrade: Security Section + Grounded Examples

6.35/10 avg intent  ·  +0.30 from v3

Changes: New dedicated "Your data is yours. Full stop." section with 4 specific cards: you choose which channels it reads, data is not used for training, everything can be deleted within 24 hours, SOC 2 Type II ETA Q3 2026. Voice proof examples upgraded with real specifics ($12K savings, $28M raised, 47 calls, 6 demos booked), not vague qualitative claims. Copy proofread for stiff phrasing; natural contractions throughout.

What moved: Privacy objection rate dropped from 35% to 25%; the dedicated section worked. The specific numbers in voice examples were cited as more convincing. The intent climb continued steadily.

What didn't move: Voice authenticity skepticism was still the top remaining objection. "Will it really sound like me?" can't be answered by showing someone else's example; it needs proof by doing.

v5

Social Proof + Onboarding Clarity: Testimonials + Timeline

6.65/10 avg intent  ·  +0.30 from v4

Changes: New "The skeptics became the biggest fans" testimonials section with 3 quotes engineered specifically to address the voice objection: a CEO who sent a Talk Stories draft to his co-founder without revealing its source (the co-founder loved it), a Head of Marketing who was "the biggest skeptic" and got converted in week one, and a VP Sales whose reps went from never posting to 4 published posts in the first week. New "What the first week looks like" timeline (Day 1 / Days 2-3 / Days 4-5 / Week 2+) to address the workflow disruption concern. Slack demo section moved to a dark background for visual contrast.

What moved: Testimonial section cited as credible by majority of personas. "Would you share this?" rate doubled from ~15% to 30%. Timeline section "significantly reduced" workflow and disruption concerns per judge synthesis. Steady intent climb continued.

The ceiling: The judge rated v5 "Nearly There." The remaining voice authenticity skepticism cannot be resolved by copy alone; it requires product experience. The page has gone as far as static copy can take it.

Progression Summary

Version | Avg Intent | Change | Privacy Objection | Share Rate | Key Change
v1 | 4.3/10 | Baseline | 35% | ~15% | Original page
v3 | 6.05/10 | +1.75 | ~35% | ~15% | Beta out, ghostwriter out, scary Slack line out
v4 | 6.35/10 | +0.30 | 25% | ~15% | Dedicated security section, grounded voice examples
v5 | 6.65/10 | +0.30 | ~22% | 30% | Testimonials, first-week timeline
Key Insight

The biggest single jump was v1 to v3 (+1.75 points), driven by removing three specific things that were actively hurting the page: the "beta" label, the "ghostwriter" framing, and the line about reading everything in Slack. Subtraction outperformed addition in round one. Every subsequent round added elements to fill the gaps the subtraction revealed.

Section 05

The Framing Experiment: 4 Variants Head-to-Head

Before rewriting anything, we ran a dedicated framing experiment: 4 complete versions of the hero and CTA copy, each tested against all 20 personas. 80 total runs. This is how we validated the "ghostwriter" question with data instead of opinion.

The 4 Variants

Variant | Product Framing | CTA | Avg Intent | Result
A | "An AI ghostwriter that lives in your Slack" | Get beta access | 7.0/10 | Runner-up
C | "A Voice Engine that learns how everyone on your team writes, then writes like them" | Get early access | 7.35/10 | Winner
B | "A Story Engineer that learns your team's voice, writes content on demand" | Get early access | 6.9/10 | 3rd
D | No product label, description only | Join the waitlist | 6.8/10 | 4th

What the Data Revealed

"Voice Engine" scored highest because it puts the emphasis on your voice, not on the AI doing something mysterious. The word "learns" does a lot of work: it implies the product earns accuracy over time rather than making claims it can't back up immediately.

"Ghostwriter" still works (7.0/10 is strong) but carries specific baggage, with two distinct failure modes: (1) personas who had been burned by AI writing tools associated "ghostwriter" with the generic outputs they already hated; (2) the word implies authorship deception, which felt off for teams publishing authentic thought leadership.

"Story Engineer" underperformed despite seeming clever. The word "engineer" created false associations: technical personas expected a workflow automation tool, not a writing tool. Several personas asked if it integrated with their CRM or code pipeline. A label that requires disambiguation is a bad label.

"Waitlist" was the weakest CTA by far: it implies the product isn't ready. "Early access" implies exclusivity; "beta" implies it might break. "Get early access" performed best because it suggests something real you can use today, with the benefit of a lower price lock-in.

The Hybrid Decision

The judge's final recommendation was a hybrid: drop the product label entirely and let the description do the work. "Voice Engine" won on scores, but the description-only variant (D) scored only marginally lower while being simpler. We applied the framing philosophy from C (emphasis on learning your voice) to the copy without attaching a label, resulting in the v3+ approach: no category name, just clear description of what it does.

The Concrete Labels Finding

This experiment replicated a finding from other simulation work: concrete labels beat abstract ones every time. "Story Engineer" failed for the same reason "BUILD/TEST/LEARN" failed in other taxonomy work: abstract concepts require explanation. "Voice Engine" succeeded because both words are immediately graspable. "Sprint/Experiment/Note" succeed because all three are familiar, specific, and hard to confuse.

The pattern holds: if someone has to read the description to understand the label, the label has already failed.

Section 06

Surprises & Course Corrections

1. "Everything You've Ever Written in Slack" Was a Dealbreaker

The original page included the line: "it's read everything you've ever written in Slack." This was meant to convey depth of context, that Talk Stories really knows how you write. In testing, it read as surveillance.

Multiple personas used words like "scary," "invasive," and "creepy." Several said they would not install anything that described itself this way, regardless of what it actually did. One persona called it "the kind of line a startup writes before they think about how it sounds to users."

Fix: Replaced with "learns from what your team chooses to share. You control what it knows." The replacement shifted the power dynamic from the product doing something to the user, to the user being in control. Privacy objection rate began declining in v3.

Lesson: Review your copy for lines that describe what the product does to the user. If it would sound bad in a headline, it should not be in your hero copy.


2. "Story Engineer" Failed for the Same Reason "BUILD" Failed

We expected "Story Engineer" to test well: it felt distinctive, memorable, and specific to the problem space. It scored 6.9/10. That sounds decent until you see that "Voice Engine" scored 7.35 and even a plain description scored 6.8.

The problem: "engineer" is a loaded word in B2B SaaS. Technical buyers immediately map it to "workflow tool" or "integration platform." Several personas asked about API access and Zapier compatibility. One CMO said "that sounds like an IT purchase, not a marketing purchase."

Lesson: Words carry professional associations that override your intended meaning. Test job title words carefully. "Engineer" skews technical. "Manager" skews middle-management. "Studio" skews creative agency. If your buyer persona is a Head of Marketing, make sure your label sounds like something a Head of Marketing would buy.


3. The Security Section Was More Powerful Than Expected

Adding a dedicated security section ("Your data is yours. Full stop.") with 4 specific cards dropped the privacy objection rate from 35% to 25% in a single round. We expected some improvement; we didn't expect it to be that direct.

The insight: people aren't afraid of privacy in the abstract. They're afraid because nobody told them the specifics. "We take security seriously" is a red flag. "You choose which channels it can read, your data is never used to train other companies' models, and you can delete everything within 24 hours" is a contract.

Lesson: If privacy is a likely objection for your product, treat it like a feature. Give it its own section, its own headline, and specific mechanics. Not reassurances.


4. The Testimonials Had to Be Engineered, Not Generic

Generic testimonials ("This tool is amazing! Our content quality improved so much!") do almost nothing for conversion. The v5 testimonials were written to directly address the specific objection that 60% of personas raised: "will it actually sound like my team?"

Each testimonial was structured around a skeptic arc: I didn't believe it, here's what happened, here's the proof. The CEO who sent the draft to his co-founder without revealing it was AI-generated, and got a compliment, does more work than ten generic "great product" quotes.

Lesson: Write testimonials to the objection, not to the product. The best testimonial is the one that handles the thing preventing the reader from clicking.


5. The Page Hit a Copy Ceiling at 6.65

The judge rated v5 "Nearly There. Not yet ready to ship without further refinement." The remaining voice authenticity skepticism (~70% still raised it when asked directly) cannot be addressed by copy. The only fix is product experience: letting someone see it work on their actual Slack messages.

This is not a failure of the process; it's the process doing its job. The simulation correctly identified where copy ends and product begins. The recommendation for a v6 is an interactive demo widget or a "try it on your own Slack message" element, which is a product decision, not a copywriting decision.

Lesson: Simulation rounds eventually reveal the conversion ceiling for static copy. When the judge's top recommendation is something the page physically cannot do (let users experience the product), the page is ready to ship and the product team takes over.

Section 07

Final Deliverables

Production Files

Version | Description | Links
v1. Original | The page as received. Starting point for all simulation work. 4.3/10 avg intent, 35% privacy objection rate. | Open ↗ · PNG ↗
v5. Final | Production-ready. All v1-v5 findings applied. Zero em dashes, proofread, ship-ready. | Open ↗ · PNG ↗
v4 | Security section + grounded voice examples. Before testimonials were added. | Open ↗ · PNG ↗
v3 | Post-framing experiment. Early access + no ghostwriter label. | Open ↗ · PNG ↗

Screenshots

File | Dimensions | Notes
talkstories-v3-fullpage.png ↗ | 1440 × 6772px | v3 page, full width, post framing experiment
talkstories-v4-fullpage.png ↗ | 1440 × 7513px | v4 page, security section + grounded examples
talkstories-v5-fullpage.png ↗ | 1440 × 9025px | v5 final, testimonials + timeline + Slack demo

Simulation Data

File | Contents
sims/talkstories_20260307_094541.json | v1 baseline: 20 personas, full responses + judge synthesis
sims/framing_20260307_102056.json | Framing experiment: 4 variants × 20 personas + head-to-head comparison
sims/v3_[timestamp].json | v3 sim: 20 personas, synthesis comparing to v1
sims/v4_20260307_112600.json | v4 sim: 20 personas, security section impact measured
sims/v5_20260307_140020.json | v5 sim: 20 personas, testimonial + timeline impact, final synthesis

By the Numbers

1
Day
5
Sim rounds
100
Persona evals
80
Framing runs
+55%
Intent lift
2x
Share rate
−29%
Privacy objection
0
Cloud API calls
Section 08

Implementation Playbook

How to apply this process to any landing page, or any copy that needs to convert.

Step 1: Extract and Baseline Before Touching Anything

Strip the page to plain text and run a simulation before making any changes. The v1 baseline is the most important data point in the whole process; everything after is measured against it. Don't skip this step even if you're confident the page has problems. You need to know which problems are real and which ones just feel bad.
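Stripping a page to plain text needs nothing beyond the standard library. A rough sketch using Python's `html.parser`, which drops scripts and styles and keeps visible text (a real run would likely also want alt text and some layout hints):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Reduce a landing page to the plain text a persona would read."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0          # >0 while inside script/style

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())

def page_to_text(html):
    extractor = TextExtractor()
    extractor.feed(html)
    return "\n".join(extractor.parts)
```

The extracted text goes into the persona prompt; screenshots cover the visual layer separately.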

Step 2: Run a Framing Experiment Before Rewriting

If you don't know which copy direction to pursue, don't guess: test 3-4 variants simultaneously with the same persona panel. It's cheaper than rewriting and guessing wrong. The variants should differ on the thing you're most uncertain about: the product label, the headline, the CTA framing, the stage (beta vs early access vs waitlist).

Run all variants on the same day with the same 20 personas. The relative ranking matters more than the absolute scores.
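Ranking variants is then a small aggregation over the combined runs. A sketch, assuming each run record carries a `variant` label and an `intent` score (both names illustrative):

```python
def rank_variants(runs):
    """Average intent per variant, ranked best-first. The relative
    order is the signal; absolute scores are only directional."""
    by_variant = {}
    for r in runs:
        by_variant.setdefault(r["variant"], []).append(r["intent"])
    return sorted(
        ((v, round(sum(s) / len(s), 2)) for v, s in by_variant.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

runs = [{"variant": "A", "intent": 7}, {"variant": "A", "intent": 7},
        {"variant": "C", "intent": 8}, {"variant": "C", "intent": 7}]
print(rank_variants(runs))  # C ranks first at 7.5
```

Because every variant sees the identical panel, the ranking isolates the copy change from persona noise.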

Step 3: Subtract Before You Add

The biggest jump in this project came from removing three things (scary Slack line, "ghostwriter" label, "beta" framing). Not from adding sections. When simulations reveal problems, ask whether the problem is caused by something present on the page before reaching for something new to add.

Lines that actively hurt conversion are more costly than missing sections. Fix the active harm first.

Step 4: Address Mechanisms, Not Feelings

"We take your privacy seriously" does nothing. "You choose which channels it can read, and your data is deleted within 24 hours of disconnecting" does something. When an objection persists across 35% of personas, it means they don't have the specific information they need, not that they haven't been reassured enough. Give them the mechanism.

Step 5: Engineer Testimonials to the Objection

Find your most common objection (from simulation or from sales calls). Write the testimonial that directly addresses it. The structure that works: I was the skeptic, here is the specific moment it changed my mind, here is the specific outcome. Vague enthusiasm is decoration. Specific conversion stories are evidence.

Step 6: Know When the Page Is Done

When the judge's single highest-leverage recommendation is something a static page cannot do (interactive demo, trial experience, live proof), the page has reached its copy ceiling. Ship it. Hand the remaining conversion problem to the product team. The page's job is to get the right people to the CTA; the product's job is to confirm what the page promised.

The Meta-Pattern

Start with data, not opinions. Subtract before you add. Address mechanisms, not feelings. Engineer proof to the objection. Recognize the copy ceiling when you hit it.

The final page feels obvious. The process was anything but.

Section 09

Appendix: Test Data & Artifacts

Simulation Round Summary

Round | Type | Personas | Avg Intent | Key Metric
v1 Baseline | Full page sim | 20 | 4.3/10 | 60% voice objection, 35% privacy
Framing Experiment | 4-variant test | 20 × 4 = 80 | 6.8–7.35/10 by variant | Voice Engine wins; "Story Engineer" fails
v3 Sim | Full page sim | 20 | 6.05/10 | +1.75 from v1; privacy still ~35%
v4 Sim | Full page sim | 20 | 6.35/10 | Privacy drops to 25%
v5 Sim | Full page sim | 20 | 6.65/10 | Share rate doubles to 30%

Framing Variant Detail

Variant | Label | CTA Stage | Avg Intent | Judge Verdict
B | Story Engineer | Early access | 6.9/10 | Failed. "Engineer" skews technical, wrong buyer associations
D | No label | Waitlist | 6.8/10 | Weakest CTA; "waitlist" implies product not ready
A | Ghostwriter | Beta | 7.0/10 | Runner-up; carries baggage for AI-burned personas
C | Voice Engine | Early access | 7.35/10 | Winner; emphasis on learning your voice, not AI doing something to you

Per-Persona v5 Intent Scores

Persona | Role | Company Size | v5 Score | v1 Score (est.)
Aisha | CEO | 30p fintech | 7 | 7
Marcus | CEO | 22p B2B SaaS | 7 | 5
Priya | Head of Marketing | 45p HR tech | 6 | 5
David | CMO | 120p enterprise software | 8 | 5
Dana | VP Sales | 80p SaaS | 7 | 5
Kevin | Head of Growth | 800p enterprise | 6 | 4
Sofia | Founder | 12p consumer | 6 | 4
Alex | COO | 65p proptech | 6 | 3
Tanya | Marketing Manager | 38p edtech | 7 | 5
Bernard | CEO | 50p professional services | 6 | 2
Kenji | Head of Content | 95p martech | 7 | 5
Morgan | VP Marketing | 150p logistics tech | 6 | 4
Nilufar | Chief of Staff | 55p climate tech | 6 | 4
Jamie | SDR Manager | 70p sales tech | 7 | 5
Isabelle | Head of Comms | 200p healthtech | 6 | 4
Ryan | Founder | 8p AI tools | 7 | 6
Chen | VP Product | 110p devtools | 6 | 3
Fatima | Head of Marketing | 35p legaltech | 7 | 5
Omar | CEO | 25p recruitment tech | 7 | 4
Laura | Content Lead | 60p fintech | 7 | 5

Infrastructure Used

Component | Spec | Role
Mac Studio | 256GB unified RAM, Apple Silicon | All inference, local only
Ollama | v0.x, port 11434 | Model serving backend
qwen2.5:7b | 4-bit quantized, ~5GB VRAM | Persona model (T2 tier)
llama3.3:70b | 4-bit quantized, ~40GB VRAM | External judge, never self-judges
Python 3.14 | requests, json, checkpoint files | Simulation scripts
shot-scraper | Playwright-based CLI | Full-page screenshots at 1440px
peekaboo | macOS UI automation CLI | Safari window capture for quick previews

Copy Changes Log

Version | Change | Reason | Impact
v3 | "beta" → "early access" throughout | Framing experiment data | More premium feel, less "unfinished"
v3 | Removed "ghostwriter" label | Framing experiment data | Removed deception connotation
v3 | Removed "read everything you've ever written in Slack" | v1: "scary," "invasive" | Privacy objection began declining
v3 | Bottom CTA: "Get your team's ghostwriter" → "Your team's voice. In Slack." | Ghostwriter label removal | Cleaner, no category confusion
v4 | Added 4-card security section | 35% privacy objection in v1/v3 | Privacy objection: 35% → 25%
v4 | Grounded voice examples with real numbers | Vague examples not convincing | Examples cited as more credible
v4 | Natural contractions throughout ("it's" not "it is") | Copy felt stiff in proofread | Brand voice more human
v5 | Added testimonials section (3 skeptic-to-convert quotes) | Voice authenticity top objection | Share rate: 15% → 30%
v5 | Added "first week" timeline | Workflow disruption objection #2 in v4 | Disruption concerns "significantly reduced"
v5 | Slack demo on dark background | Visual contrast / visual break | Aesthetic, no intent impact measured
All | Zero em dashes enforced | House style requirement | Consistency
Also

Related Work

The simulation infrastructure behind this project ran on a local LLM farm built and benchmarked in parallel.

Local LLM Eval Farm: 22 Models, 6 Dimensions, $0 Cloud Cost →