Why We Built WhisperScript - Offline, Private AI Transcription for Journalists, Filmmakers & Lawyers
I’m Jonathan—co-founder and CEO at Wavery. If you work with interviews or confidential recordings, you’ve probably asked yourself: Can I trust AI tools with this audio? For most cloud apps, the honest answer is “it depends,” and that’s not good enough when you handle personally identifiable information or embargoed material.
That’s why we built WhisperScript: offline, on-device AI transcription designed so your recordings never leave your computer. No uploads, no surprise training on your data—just fast, accurate transcripts, summaries, and subtitles.
TL;DR
- Privacy-first: All processing happens offline on your machine.
- No data used for training: We fine-tune only with public datasets.
- Built for pros: Transcribes audio and video; exports quotes, summaries, and subtitles.
- Who it’s for: Journalists, filmmakers, podcasters, lawyers, researchers, and media teams handling sensitive content.
- Under the hood: Based on whisper.cpp, with our own workflow tuning and no cloud dependency.
- Business model: A buy-once option with unlimited transcription, plus subscription and perpetual licenses.
The problem with cloud AI (and why on-device matters)
Most generative AI tools are impressive, but they only run online. Your content is processed on someone else’s servers, and even if the policies look good today, teams and products change, so the risk never fully goes away. For investigative reporting, legal work, health-adjacent stories, or film production, we needed a different default: keep everything local.
With WhisperScript, processing stays on your computer. That single choice removes an entire class of risk—no accidental uploads, no unclear retention, no “opt-out” clauses.
What WhisperScript does (and doesn’t)
Does:
- Transcribes audio & video reliably, fast, and locally.
- Generates summaries, helps with quoting interviews, and exports subtitles for accessible media.
- Fits into post-production and editorial workflows without command-line gymnastics.
Doesn’t:
- Upload your audio anywhere.
- Train on your files.
- Require an internet connection to work.
Under the hood, WhisperScript uses whisper.cpp, the open-source C/C++ port of OpenAI’s Whisper model. Our fine-tuning and evaluation rely on publicly available datasets, never on customer content.
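To make the subtitle export step a little more concrete, here is a minimal Python sketch of how timed transcript segments can be written out in the SubRip (.srt) format. It is an illustration only, not WhisperScript’s actual code: the `Segment` class and the `srt_timestamp` and `to_srt` helpers are hypothetical names made up for this example, and the sketch assumes the transcription engine has already produced segments with start and end times in seconds.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timed chunk of transcript (hypothetical structure, for illustration)."""
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as SubRip text, numbering cues from 1."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [
        Segment(0.0, 2.5, "Thanks for joining me today."),
        Segment(2.5, 4.0, "Happy to be here."),
    ]
    print(to_srt(demo))
```

Everything here runs on plain text and numbers in memory, so an export step like this needs no network access at all.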
Who we built this for
- Journalists & researchers: Sensitive interviews, roundtables, field recordings.
- Filmmakers, podcasters & media teams: Editorial pipelines, subtitle creation, post-production.
- Lawyers & academics: Client conversations, depositions, lectures, and studies that must remain confidential.
Our story (the short version)
I met Kai Shimada (now our CTO) in 2020 in Vienna. Kai was enrolled in a Tonmeister program; I was working in film sound. We both loved the craft, but we kept running into the same bottlenecks: clunky tools, fragile integrations, and AI that didn’t respect privacy.
So we rolled up our sleeves and learned to program—pandemic nights and weekends—while I completed a Master’s in Creative Technologies at the Filmuniversität Babelsberg. A few months later we had a first prototype. In 2023, we ran a beta with ~4,000 users to understand real-world needs. In 2024, we received the EXIST-Gründungsstipendium and joined the MediaTech Hub Accelerator (cohort 2) in October, which sharpened our product and fundraising thinking through direct time with VCs and domain experts.
In December 2024, we launched WhisperScript publicly—and we’ve seen strong month-over-month growth since.
Why our approach is different
- Privacy by design: We start from offline and only add connectivity where it helps you, not us.
- Workflow integrity: We come from film audio post—we build around actual editorial timelines and DAW/NLE realities.
- Transparent model policy: Your data doesn’t train our models. Period.
- Fair economics: A buy-once, unlimited transcription option exists because on-device inference keeps our costs predictable.
Pricing you can plan around
- Buy once → transcribe unlimited (great for long-form projects and teams).
- Subscription or perpetual licenses if you prefer ongoing updates as we ship new features.
Founder FAQ
Is any of my audio uploaded to the cloud? No—everything runs locally. Your files never leave your device.
Do you use customer recordings to train models? No. We fine-tune with public datasets only.
Does WhisperScript handle video files? Yes—audio and audiovisual sources are supported.
Who uses this today? Journalists, filmmakers, podcasters, legal teams, researchers, and media companies—anyone dealing with sensitive recordings.
How is this different from other “offline” wrappers? We’ve tuned the full workflow (UI + export + accuracy trade-offs) for editorial speed and post-production reliability, not just a model demo.
If you care about accuracy and privacy, WhisperScript lets you keep both. I’d love to hear how you’re using it and what would save you even more time in your day-to-day.
— Jonathan