Byblos User Manual

Version 0.1.0

Getting Started

First Launch

When you first open Byblos, an onboarding wizard guides you through:

Microphone permission — Byblos needs mic access to hear you.
Download a model — Pick a speech model and optionally an AI model.
Ready — Byblos appears as a waveform icon in your menu bar.

Permissions

Microphone (required) — for voice recording.
Accessibility (recommended) — for typing text into other apps and the hold-to-record hotkey. Grant in System Settings → Privacy & Security → Accessibility.

Without Accessibility permission, Byblos copies transcriptions to your clipboard instead of typing directly.

Recording & Transcription

Starting a Recording

Three ways to record:

Left-click the menu bar icon — click once to start, click again to stop.
Hold-to-record hotkey — hold Option (configurable), speak, release.
Transcript workspace button — click the red record button at the bottom.

While Recording

The menu bar icon turns red.
A floating overlay at the bottom of the screen shows a preview of the transcription, word count, and elapsed time.
For a full view, open the Transcript workspace.

Stopping

Left-click the menu bar icon again.
Release the hotkey.
Click the overlay.
Auto-stop — detects when you stop speaking (~3 seconds of silence).
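The auto-stop behaviour can be pictured with a small sketch. The manual does not describe how Byblos detects silence, so the RMS threshold, the 100 ms frame size, and the function names below are all illustrative assumptions.

```python
import math

def rms(frame):
    """Root-mean-square level of one frame of float samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def auto_stop_index(frames, frame_ms=100, silence_ms=3000, threshold=0.01):
    """Return the index of the frame at which recording would stop, or None.

    Hypothetical logic: once speech has been heard, stop after
    `silence_ms` of consecutive frames whose level is below `threshold`.
    """
    needed = -(-silence_ms // frame_ms)  # ceiling division, integer only
    quiet_run = 0
    heard_speech = False
    for i, frame in enumerate(frames):
        if rms(frame) >= threshold:
            heard_speech = True
            quiet_run = 0
        elif heard_speech:
            quiet_run += 1
            if quiet_run >= needed:
                return i
    return None
```

Silence before any speech is ignored in this sketch, so a recording that starts quietly is not cut off before you begin talking.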

Where Does the Text Go?

Your active app — text is pasted into whatever app you were using.
Transcript history — every transcription is saved automatically.

Tip: If text isn't appearing, check that Accessibility permission is granted. Without it, Byblos copies the text to your clipboard instead; press ⌘V to paste.

Undo

Say "scratch that" or "delete that" to undo the last transcription.
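As a rough illustration of how a spoken undo command might be handled, the sketch below treats the transcript history as a plain list. The function names and the punctuation handling are assumptions, not Byblos internals.

```python
import string

# The two undo phrases named in this manual.
UNDO_PHRASES = {"scratch that", "delete that"}

def is_undo_command(utterance):
    """True if the utterance is only an undo phrase, ignoring case and edge punctuation."""
    cleaned = utterance.lower().strip().strip(string.punctuation).strip()
    return cleaned in UNDO_PHRASES

def apply_utterance(history, utterance):
    """Return a new history list: drop the last entry on undo, otherwise append."""
    if is_undo_command(utterance):
        return history[:-1]
    return history + [utterance]
```

Matching the whole utterance rather than a substring keeps a sentence such as "please scratch that idea" from being treated as a command.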

Dictation Modes

Right-click the menu bar icon → Mode to switch.

Clean (default)

Removes filler words (um, uh, like, you know) and fixes punctuation and capitalization.
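A naive version of this kind of cleanup can be sketched with regular expressions. This is not how Byblos implements Clean mode; the filler list is shortened on purpose ("like" is left out because a regex cannot tell the filler from the verb), and the `clean` function is hypothetical.

```python
import re

# Fillers that are safe to strip without context; "like" is deliberately omitted.
FILLERS = ["you know", "um", "uh"]
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?\s*",
    re.IGNORECASE,
)

def clean(text):
    """Strip fillers, tidy spacing, then capitalize and terminate the sentence."""
    stripped = _FILLER_RE.sub("", text)
    stripped = re.sub(r"\s+", " ", stripped).strip()
    stripped = re.sub(r"\s+([,.!?])", r"\1", stripped)  # no space before punctuation
    if stripped and stripped[-1] not in ".!?":
        stripped += "."
    return stripped[:1].upper() + stripped[1:]
```

For example, `clean("you know, it works")` returns `"It works."`.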

Email

Professional tone with paragraph breaks. Great for Mail, Gmail, and Outlook.

Notes

Converts speech into bullet points. Pairs well with Notes.app, Obsidian, and Notion.

Translate

Speak in any of the 99+ languages Whisper supports and get English text.

Raw

Exact transcription with no processing.

Code Comment

Concise output prefixed with //. For developers dictating comments.
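The output shape can be illustrated as follows. The line width and wrapping behaviour are assumptions; the manual only states that output is prefixed with //.

```python
import textwrap

def to_code_comment(text, width=80, prefix="// "):
    """Wrap dictated text so each line, including the prefix, fits within `width`."""
    wrapped = textwrap.wrap(text.strip(), width=width - len(prefix))
    return "\n".join(prefix + line for line in wrapped)
```

For example, `to_code_comment("retry the request up to three times")` returns `"// retry the request up to three times"`.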

Agent (Experimental)

AI assistant mode. See Agent Mode.

Automatic Mode Switching

When enabled, Byblos picks the mode based on the app you're using: Mail → Email, VS Code → Code Comment, Notes → Notes, and so on.
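Conceptually, automatic switching is a lookup from the frontmost app to a mode. The sketch below uses only the three pairings named in this manual; the fallback to the currently selected mode for unlisted apps is an assumption, as are the names `APP_MODES` and `mode_for_app`.

```python
# App-to-mode pairings mentioned in this manual; the real list is longer.
APP_MODES = {
    "Mail": "Email",
    "VS Code": "Code Comment",
    "Notes": "Notes",
}

def mode_for_app(app_name, current_mode, auto_switch=True):
    """Pick a dictation mode for the frontmost app.

    Assumption: apps without a pairing keep whatever mode is already selected.
    """
    if not auto_switch:
        return current_mode
    return APP_MODES.get(app_name, current_mode)
```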

Transcript Workspace

Right-click the menu bar icon → Show Transcripts.

Sidebar — all transcriptions grouped by date.
Detail panel — full text, metadata, editable.
Record button — record directly into the workspace.
Search & filter — by content or mode.
Export — Markdown, plain text, or JSON.

File Transcription

Drag audio/video files onto the Transcript workspace, or click Import Audio.

Supported: WAV, MP3, M4A, FLAC, OGG, MP4, MOV, MKV.
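If you are batch-preparing files for import, a quick extension check against the list above can save a failed drop. This helper is a convenience sketch, not part of Byblos, and it only inspects the file extension, not the actual contents.

```python
from pathlib import Path

# Extensions taken from the supported-formats list in this manual.
SUPPORTED = {".wav", ".mp3", ".m4a", ".flac", ".ogg", ".mp4", ".mov", ".mkv"}

def is_supported(path):
    """True if the file extension (case-insensitive) is in the supported list."""
    return Path(path).suffix.lower() in SUPPORTED
```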

Managing Models

Settings → Models.

Speech Models (Whisper)

Model	Size	Best For
Tiny	74 MB	Quick notes, low memory
Base	141 MB	General use
Small	465 MB	Better accuracy
Large v3 Turbo	1.5 GB	Fast + excellent quality
Distil-Large v3	1.4 GB	Near-best quality, great speed
Medium	1.5 GB	High accuracy
Large v3	2.9 GB	Maximum accuracy

On Apple Silicon, Core ML encoders are downloaded automatically and make transcription about 3x faster.

Recommendation: Start with Distil-Large v3, which offers near-best quality at great speed.

Custom Vocabulary

Settings → Vocabulary. Add replacements for names and jargon that Whisper misspells.

Examples: "byblos" → "Byblos", "kubernetes" → "Kubernetes", "react" → "React"
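A vocabulary of this kind amounts to whole-word, case-insensitive substitution. The sketch below shows one way it could work; `apply_vocabulary` is a hypothetical helper, and the manual does not say whether Byblos matches whole words or ignores case.

```python
import re

def apply_vocabulary(text, replacements):
    """Replace each whole-word match (case-insensitive) with its preferred spelling."""
    for heard, written in replacements.items():
        pattern = re.compile(r"\b" + re.escape(heard) + r"\b", re.IGNORECASE)
        # A function replacement avoids backslash escapes being interpreted.
        text = pattern.sub(lambda _match, w=written: w, text)
    return text
```

With the example entries above, "deploy byblos on kubernetes with react" becomes "deploy Byblos on Kubernetes with React", while "reactive" is left alone because of the word-boundary match. Note that a rule like "react" → "React" also capitalizes the ordinary verb, so narrow entries work best.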

Settings Reference

General

Hold key to record (Option, Control, Fn)
Output mode (type into app / clipboard)
Voice commands, auto-capitalize
Auto-stop on silence + delay
Auto-select mode based on app
Launch at login
Language (18 supported)

Models

Download, activate, and remove speech and AI models.

Audio

Input device, noise suppression, VAD.

Agent Mode (Experimental)

Requires a local LLM (see Local LLM Setup). Example requests:

"What's on my screen?" — reads current window
"Find files called readme" — searches via Spotlight
"Open Safari" — launches apps
"What's on my clipboard?" — reads clipboard

Local LLM Setup (Optional)

Download an AI model in Settings → Models. Recommended: Qwen 3 8B (4.7 GB download, needs 16 GB+ RAM) or Qwen 3.5 4B (2.7 GB download, needs 8 GB+ RAM).
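The RAM guidance above can be read as a simple rule: pick the largest recommended model your machine has memory for. The sketch below encodes just the two models and thresholds listed here; `recommend_llm` is a hypothetical helper, not a Byblos feature.

```python
# (name, download size in GB, minimum RAM in GB), largest first.
LLM_OPTIONS = [
    ("Qwen 3 8B", 4.7, 16),
    ("Qwen 3.5 4B", 2.7, 8),
]

def recommend_llm(ram_gb):
    """Return the largest recommended model that fits in `ram_gb`, or None."""
    for name, _download_gb, min_ram_gb in LLM_OPTIONS:
        if ram_gb >= min_ram_gb:
            return name
    return None
```

A machine with less than 8 GB of RAM gets `None`, meaning neither recommended model is a good fit.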

The LLM runs in a separate helper process alongside Whisper. Both use the GPU through Metal.

License & Support

Free for personal use. No limits, no nag screens, all features available.

Commercial use (work, business, revenue) requires a license: $49/user/year. Honor system — we trust you.

To activate a commercial license: Settings → About → paste key → Activate.

support@byblos.im · GitHub Issues

Troubleshooting

No transcription

Speak closer to the mic
Check input device in Settings → Audio
Check that a model is downloaded in Settings → Models

Text not appearing in app

Grant Accessibility: System Settings → Privacy & Security → Accessibility
Or use clipboard mode: Settings → General → Output mode → Clipboard

Hotkey not working

Requires Accessibility permission
After updates, you may need to re-grant the permission

Logs

~/Library/Logs/Byblos.log
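When reporting a problem, the last few lines of the log are usually the most useful. A minimal sketch for pulling them out follows; the log's line format is not documented here, so the helper just returns raw lines.

```python
from collections import deque
from pathlib import Path

# Log location as given in this manual.
LOG_PATH = Path.home() / "Library" / "Logs" / "Byblos.log"

def tail(path, count=50):
    """Return the last `count` lines of a text file without loading it all into a list."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

Call `tail(LOG_PATH)` and paste the result into your support email or GitHub issue.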

Quick Reference

Action	How
Start/stop recording	Left-click menu bar icon
Hold-to-record	Hold `Option`
Open menu	Right-click menu bar icon
Stop from overlay	Click the overlay
Undo last	Say "scratch that"
Auto-stop	Pause ~3 seconds
Show Transcripts	Right-click → Show Transcripts