A blogging workflow based on transcribing audio notes with Whisper

The problem

If you care about your privacy, cloud-based speech-to-text services are probably not a good idea. But how can you still get the convenience of quickly recording a (blog post) idea on your (Android) smartphone and having it transcribed into a (Markdown) file?

The solution

  1. Android’s Sound Recording app (in high quality mode to create .wav files).
  2. Syncthing, to get the recordings from the smartphone directly into the ~/blog/content/posts/ folder.
  3. Georgi Gerganov’s whisper.cpp repo.
  4. A bit of Bash-scripting, see below.
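
To show how the four pieces above could fit together, here is a minimal sketch of such a script. It is not the author's actual script: the location of the whisper.cpp checkout, the binary name (`main` in older builds, `whisper-cli` in newer ones), the model file, and the helper names `md_path_for` and `transcribe_one` are all assumptions made for illustration. Only the posts folder comes from the list above.

```bash
#!/usr/bin/env bash
# Sketch only: binary location, model choice, and helper names are assumptions.
set -e

# Folder that Syncthing fills with recordings (taken from the list above).
POSTS_DIR="${POSTS_DIR:-$HOME/blog/content/posts}"
# Assumed whisper.cpp build location; newer builds call the binary whisper-cli.
WHISPER_BIN="${WHISPER_BIN:-$HOME/whisper.cpp/main}"
# Assumed model; any ggml model downloaded for whisper.cpp would do.
WHISPER_MODEL="${WHISPER_MODEL:-$HOME/whisper.cpp/models/ggml-base.en.bin}"

# Map a recording path to the Markdown file that should hold its transcript.
md_path_for() {
  printf '%s.md\n' "${1%.wav}"
}

# Transcribe one recording, unless its Markdown file already exists.
transcribe_one() {
  local wav="$1" md
  md="$(md_path_for "$wav")"
  [ -e "$md" ] && return 0
  # -nt drops timestamps, -otxt writes plain text,
  # -of sets the output path (whisper.cpp appends .txt itself).
  "$WHISPER_BIN" -m "$WHISPER_MODEL" -f "$wav" -nt -otxt -of "${wav%.wav}" >/dev/null
  mv "${wav%.wav}.txt" "$md"
}

for wav in "$POSTS_DIR"/*.wav; do
  [ -e "$wav" ] || continue   # the glob matched nothing
  transcribe_one "$wav"
done
```

Run by hand or from a cron job, this would leave a `.md` file next to each `.wav` and skip recordings that were already transcribed. One caveat: whisper.cpp expects 16 kHz, 16-bit WAV input, so recordings made at a different sample rate would first need converting, for example with `ffmpeg -i in.wav -ar 16000 -ac 1 -c:a pcm_s16le out.wav`.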

Without any previous experience using AI/LLM tools, but having read Google’s “We Have No Moat” memo, I was pleasantly surprised by how easy it was to implement my workflow idea.
