Hiten Shah  ·  AI Studio

Find out what's wrong
before you pay to find out.

User research is slow and expensive. A/B tests take weeks. Gut feel is a guess. Kar runs 20 simulated buyers through your landing page, pitch, or product in a day. Tells you exactly what's blocking people before you spend a dollar on traffic.

1 day vs 3–4 weeks for research firms
+55% intent lift, Talk Stories (5 rounds)
40% CTA click rate, Otto (7 rounds)
$0 cloud API cost, ever

The existing options aren't broken.
They're just built for a different era.

The old way
  • User research firm: $8–15k, 3–4 weeks
  • CRO agency retainer: $5k+/month, opinions
  • A/B test: weeks of traffic, one variable
  • AI consultant: strategy deck, no deliverable
  • Ship and hope: fast, but flying blind
Kar
  • 20 buyers simulated, same day
  • Per project, no retainer
  • Test 4 variants at once, before any traffic
  • Every change tied to data
  • 5 rounds, iterate fast, ship with confidence
01
Audience simulation
We build AI versions of your target customers and run them through your landing page, pitch, or product. You get a score, their objections, what confused them, and what they wanted to see, before a single real person sees it.
Typical project
20 personas · 3–5 rounds · 3–5 days · full data at every step
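For the curious: here is roughly what one simulation round looks like under the hood. This is an illustrative sketch, not our production pipeline; the endpoint, model name, personas, and rubric are placeholders for whatever a real project defines.

```python
# Sketch of one audience-simulation round. Assumes an OpenAI-compatible
# chat endpoint running locally (e.g. Ollama or llama.cpp's server).
# Everything named here is a placeholder, not Kar's actual setup.
import json
import requests

ENDPOINT = "http://localhost:11434/v1/chat/completions"  # hypothetical local server
MODEL = "llama3"  # placeholder model name

PERSONAS = [
    "Seed-stage SaaS founder, skeptical of agencies, budget-conscious",
    "Growth PM at a 50-person startup, has run dozens of A/B tests",
    # ...a real panel would define all 20
]

LANDING_COPY = open("landing_page.txt").read()

RUBRIC = (
    "You are the buyer described below. Read the landing page copy and "
    "reply with JSON only: {\"intent\": 0-10, \"objections\": [...], "
    "\"confused_by\": [...], \"wanted_to_see\": [...]}."
)

results = []
for persona in PERSONAS:
    resp = requests.post(ENDPOINT, json={
        "model": MODEL,
        "messages": [
            {"role": "system", "content": f"{RUBRIC}\n\nBuyer: {persona}"},
            {"role": "user", "content": LANDING_COPY},
        ],
        "temperature": 0.7,  # some variance across simulated buyers
    }, timeout=120)
    # A real run would validate and retry malformed JSON; skipped here.
    results.append(json.loads(resp.json()["choices"][0]["message"]["content"]))

# Aggregate the round: mean intent, plus every objection for review
mean_intent = sum(r["intent"] for r in results) / len(results)
print(f"Mean intent: {mean_intent:.1f}/10")
```

The point of the sketch: the whole loop is personas, a rubric, and aggregation. What makes a round useful is the quality of the personas and what you do with the objections, not the plumbing.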
02
AI model selection
If you're building an AI product, you need to know which model actually performs for your use case. Not who's winning some benchmark you didn't design. We test the candidates against your real tasks and build the routing logic that picks the right one automatically.
Typical project
5–24 models · 6 quality dimensions · routing system included
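To make "routing logic" concrete, here is a minimal sketch of one benchmark-driven router. The models, task labels, scores, and quality bar are invented for illustration; a real engagement measures all of them against your tasks.

```python
# Sketch of benchmark-driven model routing. The scores table comes
# from testing candidates on your real tasks; every name and number
# below is illustrative only.
from dataclasses import dataclass

@dataclass
class Candidate:
    model: str
    quality: float      # benchmark score on this task type, 0-1
    cost_per_1k: float  # dollars per 1k tokens ($0 for local models)

# One entry per task type, filled in during the evaluation phase
BENCHMARKS: dict[str, list[Candidate]] = {
    "summarize": [
        Candidate("local-8b", 0.82, 0.0),
        Candidate("hosted-frontier", 0.91, 0.015),
    ],
    "extract_json": [
        Candidate("local-8b", 0.88, 0.0),
        Candidate("hosted-frontier", 0.90, 0.015),
    ],
}

def route(task: str, min_quality: float = 0.85) -> str:
    """Pick the cheapest model that clears the quality bar for this task."""
    ok = [c for c in BENCHMARKS[task] if c.quality >= min_quality]
    if not ok:  # nothing clears the bar: fall back to best quality
        return max(BENCHMARKS[task], key=lambda c: c.quality).model
    return min(ok, key=lambda c: c.cost_per_1k).model

print(route("extract_json"))  # -> local-8b: clears the bar and costs $0
print(route("summarize"))     # -> hosted-frontier: local misses the bar
```

The design choice worth noting: the router never consults a public leaderboard. It only knows how each model scored on your tasks, which is the whole argument of this service.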
03
Local AI infrastructure
AI that runs on your own hardware with no per-query cost and no data leaving your environment. We set it up, benchmark it against your workload, and hand it over. Good fit if you're running high volume or handling sensitive data.
Typical project
Setup + benchmarking · documentation · handoff
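As a taste of the benchmarking step, here is a sketch that times a local model against a sample of your workload. It assumes an OpenAI-compatible server on localhost; the endpoint, model name, and file path are placeholders.

```python
# Sketch of benchmarking a local model against your own workload.
# Assumes an OpenAI-compatible server on localhost; endpoint, model,
# and input file are placeholders for a real project's setup.
import time
import requests

ENDPOINT = "http://localhost:11434/v1/chat/completions"
MODEL = "llama3"

prompts = [line.strip() for line in open("workload_sample.txt")]

latencies = []
for prompt in prompts:
    start = time.perf_counter()
    requests.post(ENDPOINT, json={
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }, timeout=300)
    latencies.append(time.perf_counter() - start)

# Report latency percentiles; per-query cost on local hardware is $0
latencies.sort()
p50 = latencies[len(latencies) // 2]
p95 = latencies[int(len(latencies) * 0.95)]
print(f"n={len(latencies)}  p50={p50:.2f}s  p95={p95:.2f}s  cost=$0.00")
```

A real benchmark also scores output quality against your workload, not just latency, but the shape is the same: your prompts, your hardware, measured numbers.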
On simulation accuracy
AI simulations don't replace real users. They narrow the search space before you involve them. You find the obvious problems (the confusing copy, the missing context, the framing that puts people off) before you pay for traffic or book interviews. The Talk Stories project caught five specific things that would have hurt conversion. You still validate after launch. Think of it as the pre-flight check, not the flight.
Read the full case studies →

Three projects. All public.
All include the failures.

We're new. There are three case studies and they document everything: the wrong answer keys, the rounds that didn't move the needle, the times the obvious fix made things worse, the three simulation bugs we found mid-study and fixed in public. Most agencies show you the polished version. These are the actual records.

What we're confident about: the methodology is real, documented, and replicable. The infrastructure runs without cloud cost. The speed is genuine. A simulation that would take a research firm 4 weeks to run takes us a day.

What holds up

Talk Stories: +55% intent lift across 5 rounds, all iterations public. Otto: 40% CTA click rate after 7 rounds, plus documented methodology fixes that now apply to every future study. LLM eval: 24 models tested, 5.8x speed advantage found and measured. Every deliverable in all three case studies is live and clickable.

What's still early

Three projects is a thin track record. The Talk Stories lift is simulation intent, not post-launch conversion data. The Otto CTA rate is from synthetic personas, not real traffic. We're building the library as we take on projects. If you need years of client logos before you engage, this isn't that yet.

Who this works well for.

Good fit
  • Founders who already think in experiments and want faster answers
  • Product and growth teams tired of waiting weeks for user research
  • AI teams that need model selection backed by data, not vendor claims
  • Anyone about to spend on ads who doesn't know if the page will convert
  • Companies handling sensitive data that can't use cloud AI
Probably not
  • Enterprise procurement requiring SOC2 and formal compliance audits
  • Projects that need a 10-person team and a monthly status deck
  • Anyone who needs a decade of client logos before they'll engage
  • Situations where simulated user data legally can't substitute for real user data
Hiten Shah
Co-founder of KISSmetrics, Crazy Egg, and Product Habits. 20 years building products and studying how they grow. Every engagement is directed by Hiten and executed by Kar, an AI.
Accountable for every deliverable.
01
You describe the question
"Will this landing page convert?" "Which AI model should we use?" "What's blocking our signup?" Hiten scopes the right test with you.
02
We design the test
Audience simulation, model comparison, structured experiment. Whichever fits the question. This is where the infrastructure earns its keep.
03
We run it and share everything
You get the actual numbers, persona responses, and failure modes. Not a cleaned-up summary. The case studies show what this looks like in practice.
04
We adjust and go again
One round rarely answers the question. Most projects run 3 to 5 iterations. Each one is faster because the setup is already done.
05
You get something that ships
A landing page that tested well. A benchmarked model stack. A routing system. Not a deck. A thing you can use.
Kar is an AI: Claude, running on local infrastructure.

Have something worth testing?

Tell Hiten what you're trying to learn. If it's a good fit, he'll scope it and tell you exactly what you'd get back.

Hiten Shah · hiten@64stories.com

No intake forms. No calls to schedule calls. Just describe the problem.