FAQ & User Guide

Everything you need to know about using SoundMindAI, from recording your first meeting to setting up AI-powered transcription and summarization.

🎯 Getting Started

What is SoundMindAI?

SoundMindAI is a native macOS application that helps you capture, transcribe, and understand audio content. Whether you're recording meetings, lectures, interviews, or podcasts, SoundMindAI transforms spoken words into organized, searchable text with AI-powered insights.

Key capabilities:

  • Record system audio and microphone simultaneously
  • Import existing audio files
  • Transcribe with Apple Speech (free) or premium AI services
  • Generate summaries, key points, and action items using AI
  • Organize recordings with tags, folders, and search
  • Export in multiple formats

What are the system requirements?

  • macOS: macOS 13 (Ventura) or later
  • Processor: Apple Silicon or Intel Mac
  • Storage: 100 MB for the app, plus space for recordings
  • Permissions: Screen Recording (for system audio) and Microphone access

Tip: Screen Recording permission is required even for audio-only capture because macOS uses ScreenCaptureKit for system audio access.

How do I grant the required permissions?

On first launch, SoundMindAI will request the necessary permissions. If you need to enable them manually:

  1. Open System Settings: Click the Apple menu > System Settings > Privacy & Security
  2. Enable Screen Recording: Select "Screen Recording" from the list and toggle on SoundMindAI
  3. Enable Microphone Access: Select "Microphone" from the list and toggle on SoundMindAI
  4. Restart the app: Quit and relaunch SoundMindAI for the permissions to take effect

How does the 7-day free trial work?

The free trial gives you full access to all features for 7 days. No credit card required.

  • Trial starts when you click "Start Free Trial"
  • All features are unlocked including BYOK AI services
  • Trial is tied to your Mac (hardware-based), not your email
  • After 7 days, purchase a license or continue with limited features

After trial expires: You can still record and use Apple Speech transcription. BYOK AI features require a license.

How private and secure is SoundMindAI?

SoundMindAI is designed with privacy as a core principle. Your data stays on your Mac.

What stays local:

  • All recordings and audio files
  • All transcripts and AI summaries
  • Your API keys (stored in macOS Keychain)
  • App settings and preferences

What we do NOT do:

  • Collect analytics or usage data
  • Track your behavior or features used
  • Upload your recordings or transcripts
  • Sell or share your data

Only network connections:

  • License validation: Periodic check with our licensing server (no personal data sent)
  • BYOK services: When you click transcribe or summarize, your audio/text goes directly to your chosen provider (OpenAI, Anthropic, etc.) - not through us

Note: When using BYOK AI services, your data is transmitted directly to those providers. Review their privacy policies to understand how they handle your data.

What's the difference between the license fee and AI costs?

There are two separate costs to understand:

1. SoundMindAI License (one-time, paid to us):

  • $49.95 one-time payment for lifetime access
  • Covers the app software, updates, and all built-in features
  • Includes Apple Speech transcription (free, on-device)
  • No recurring fees or subscriptions

2. AI Provider Costs (ongoing, paid directly to providers):

  • Paid directly to OpenAI, Anthropic, Google, AssemblyAI, etc.
  • Based on your actual usage (pay-as-you-go)
  • Prices set by each provider - we have no control over their rates
  • Billed separately by each provider to your account with them

Important: SoundMindAI does not collect, process, or have any involvement with your AI provider payments. We never see your payment information or usage charges. Any billing questions or disputes regarding AI costs must be handled directly with the respective provider (OpenAI, Anthropic, Google, etc.).

How do I deactivate my license?

To deactivate your license from a Mac:

  1. Click on the license badge in the top-right corner of the app
  2. In the popover, click the "Deactivate" button
  3. Confirm the deactivation when prompted

Why deactivate?

  • Moving to a new Mac: Your license allows up to 3 devices. Deactivate from your old Mac to free up a slot for your new one.
  • Selling or giving away your Mac: Remove your license before transferring your computer to someone else.
  • Troubleshooting: If you're having license validation issues, deactivating and re-entering your key can help resolve them.

Note: Your license key remains valid after deactivation. You can reactivate anytime by entering the same key on any Mac.

How do automatic updates work?

Automatic Updates — SoundMindAI can check for new versions automatically and notify you when updates are available.

How it works:

  • By default, SoundMindAI checks for updates when the app launches
  • If a new version is available, you'll see a notification in the sidebar
  • Click the notification to view update details and install
  • Updates are downloaded and installed seamlessly

Managing update settings:

  1. Open Settings (⌘,) or click the gear icon
  2. Go to the Tools tab
  3. Toggle "Check for updates on startup" on or off
  4. You can always manually check for updates from the menu: SoundMindAI > Check for Updates

Tip: Keeping SoundMindAI up to date ensures you have the latest features, performance improvements, and bug fixes.

What if auto-update fails?

If you see an error like "An error occurred while launching the installer" when trying to update, don't worry — you can easily update manually:

  1. Download the latest version from soundmindai.net
  2. Quit SoundMindAI if it's running
  3. Open the downloaded DMG and drag SoundMindAI to your Applications folder
  4. Click "Replace" when prompted to overwrite the old version
  5. Launch the new version — all your recordings and settings are preserved

Why does this happen?

This can occur on older versions (before v1.1.2) due to macOS security requirements. Once you manually update to the latest version, future auto-updates will work normally.

Your data is safe: Updating the app never affects your recordings, transcripts, or settings — they're stored separately.

Features

How does audio capture work?

Capture Everything — Record system audio and microphone simultaneously. Zoom, Teams, YouTube, podcasts — anything you can hear.

Audio sources:

  • System Audio: Capture sound from any application (meetings, videos, music)
  • Microphone: Record your voice or room audio
  • Both: Record system and mic together for complete meeting capture

How to start recording:

  1. Click the large red record button on the main screen
  2. Select your audio sources (system, mic, or both)
  3. The timer starts and you'll see a visual indicator
  4. Click stop when done; your recording is saved automatically

Tip: You can pause and resume recordings without creating multiple files.

What transcription options are available?

AI Transcription converts speech to text with accurate, timestamped transcripts ready in minutes.

Service        | Cost            | Speed | Accuracy
Apple Speech   | Free (built-in) | Fast  | Good
OpenAI Whisper | ~$0.006/min     | Fast  | Excellent
AssemblyAI     | ~$0.003/min     | Fast  | Excellent + Speaker ID

Apple Speech requires no setup and works offline. BYOK services offer higher accuracy but require API keys.

What AI summaries does SoundMindAI generate?

Smart Summaries — Get AI-generated summaries, bullet points, and action items. Choose from Claude, GPT-4, Gemini, and more.

SoundMindAI extracts meaningful insights from your transcripts:

  • Summary: Concise overview of the content
  • Key Points: Important highlights and takeaways
  • Action Items: Tasks and follow-ups identified from the conversation

Each item includes timestamps so you can jump to the relevant part of the recording.

Tip: Different AI providers have different strengths. Experiment to find what works best for your content.

How do I organize my recordings?

Stay Organized — Tag, search, and filter your recordings. Find any meeting instantly with full-text search.

  • Tags: Add custom tags with colors to categorize recordings
  • Search: Full-text search across titles, transcripts, and tags
  • Sort: Sort by date, duration, or name
  • Filter: Filter by tags, date range, or transcription status

Adding tags:

  1. Select a recording from your library
  2. Click the Tags section in the detail view
  3. Type a tag name and press Enter, or select from existing tags
  4. Click the color dot to change tag color

How does click-to-seek work?

Click-to-Seek — Jump to any moment in your recording. Click on any transcript line to hear exactly what was said.

How it works:

  • Every transcript segment has a timestamp
  • Click any segment to jump the audio player to that exact moment
  • The current segment highlights as the audio plays
  • Works with summaries and action items too — click to hear the context
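
As a rough illustration of how a playback position can be mapped back to the transcript segment to highlight, here is a minimal Python sketch. The `Segment` type and the `segment_at` function are hypothetical names chosen for this example, not SoundMindAI's actual implementation, and the sketch assumes segments are sorted by start time.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the beginning of the recording
    text: str


def segment_at(segments: list[Segment], position: float) -> int:
    """Return the index of the segment playing at `position` seconds.

    Assumes `segments` is sorted by start time (hypothetical data model).
    """
    starts = [s.start for s in segments]
    # bisect_right finds the first start time greater than the position;
    # the segment just before it is the one currently playing.
    index = bisect_right(starts, position) - 1
    return max(index, 0)


segments = [
    Segment(0.0, "Welcome everyone."),
    Segment(4.2, "Let's review the budget."),
    Segment(9.8, "Any questions?"),
]
current = segment_at(segments, 5.0)
print(segments[current].text)
```

Clicking a segment is the reverse lookup: the player would simply seek to that segment's `start` value.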

Navigation shortcuts:

  • Use arrow keys to move between segments
  • Press Space to play/pause
  • Click timestamps in summaries to jump to that point

How do I import existing audio and video files?

Import Any Media — Drop in MP3, MP4, M4A, WAV, and more to transcribe and summarize.

Supported formats:

  • Audio: MP3, M4A, WAV, CAF, AIFF
  • Video: MP4, MOV (audio is extracted automatically)
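
If you want to check a folder of files before importing, a small script along these lines can sort them by type. This is only a sketch: the extension sets are copied from the list above, and the `classify_media` helper is a hypothetical name, not part of SoundMindAI.

```python
from pathlib import Path

# Extensions taken from the supported-formats list above.
AUDIO_EXTENSIONS = {".mp3", ".m4a", ".wav", ".caf", ".aiff"}
VIDEO_EXTENSIONS = {".mp4", ".mov"}


def classify_media(filename: str) -> str:
    """Return "audio", "video", or "unsupported" based on the file extension."""
    extension = Path(filename).suffix.lower()
    if extension in AUDIO_EXTENSIONS:
        return "audio"
    if extension in VIDEO_EXTENSIONS:
        return "video"
    return "unsupported"


for name in ["standup.MP3", "lecture.mov", "notes.docx"]:
    print(name, "->", classify_media(name))
```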

How to import:

  1. Drag and drop files directly onto the SoundMindAI window
  2. Or use File > Import from the menu bar
  3. The file is added to your library and ready for transcription

Use cases:

  • Transcribe podcast episodes you want to reference
  • Process lecture recordings from school
  • Import voice memos recorded on your phone
  • Extract insights from interview recordings

Tip: Imported files stay in their original location; SoundMindAI references them without duplicating them.

What export options are available?

Export Anywhere — Export transcripts and summaries in multiple formats.

Export formats:

  • Plain Text (.txt): Simple text format
  • Markdown (.md): Formatted for documentation
  • HTML (.html): Web-ready with styling
  • SRT Subtitles (.srt): For video editing and YouTube
  • VTT Subtitles (.vtt): Web video subtitle format

Export options:

  • Include or exclude timestamps
  • Include or exclude speaker labels
  • Export transcript, summary, or both
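
For readers who post-process exports, the sketch below shows how timestamped segments map onto the SRT subtitle layout (cue number, time range, text, blank line). It is an illustration only: `format_timestamp` and `to_srt` are hypothetical helpers, and the (start, end, text) tuple shape is an assumption about the data, not SoundMindAI's internal format. VTT uses the same time layout with a period instead of a comma before the milliseconds.

```python
def format_timestamp(seconds: float, separator: str = ",") -> str:
    """Format seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT)."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{millis:03d}"


def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Build SRT text from (start, end, text) tuples."""
    cues = []
    for number, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{number}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}\n"
        )
    # Joining with a newline leaves one blank line between cues.
    return "\n".join(cues)


srt_text = to_srt([
    (0.0, 4.2, "Welcome everyone."),
    (4.2, 9.8, "Let's review the budget."),
])
print(srt_text)
```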

What is AI Chat and how do I use it?

AI Chat — Ask questions about your recordings and get instant AI-powered answers.

How to use AI Chat:

  1. Open a recording that has been transcribed
  2. Click the "Chat" tab in the detail view
  3. Type your question (e.g., "What were the main decisions made?")
  4. The AI will analyze the transcript and provide an answer

Example questions:

  • "What action items were discussed?"
  • "Summarize what John said about the budget"
  • "What were the concerns raised about the timeline?"
  • "List all the names mentioned in this meeting"

Tip: AI Chat uses your configured AI provider (OpenAI, Anthropic, etc.). Make sure you have an AI provider set up in Settings.

What languages are supported? Does SoundMindAI translate?

Multi-Language Support + Auto-Translation — Transcribe in 99 languages with automatic detection, and get English translations automatically.

Supported Languages (with OpenAI Whisper):

English, Spanish, Chinese, French, German, Japanese, Korean, Portuguese, Russian, Arabic, Hindi, Italian, Dutch, Polish, Vietnamese, Thai, Turkish, Indonesian, and dozens more, for 99 languages in total. Whisper automatically detects the spoken language, so no configuration is needed.

AssemblyAI: Also supports automatic language detection when enabled.

Apple Speech (Native): Uses your Mac's system language setting. Does not auto-detect.

Automatic English Translation:

  • After transcription, if the detected language is not English, translation is triggered automatically
  • Your configured AI provider (OpenAI, Anthropic, Gemini, etc.) translates each segment
  • The English translation appears below each original segment in italics
  • Both original and translated text are searchable

Requirements for translation:

  • Use Whisper or AssemblyAI for transcription (for language detection)
  • Have an AI provider configured (OpenAI, Anthropic, Gemini, OpenRouter, or HuggingFace)

Note: Translation uses your AI provider and may incur API costs based on transcript length.

How do I control automatic translation in Settings?

SoundMindAI has a Languages section in Settings where you can control auto-translation behavior.

Auto-Translate to English setting:

  • Location: Settings > Languages section
  • Default: On (enabled)
  • When ON: Non-English transcripts are automatically translated to English after transcription
  • When OFF: Transcription only — no automatic translation is performed

Requirements:

  • An AI provider must be configured (OpenAI, Anthropic, Gemini, OpenRouter, or HuggingFace)
  • The toggle is disabled if no AI provider is set up
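
The rules above can be summarized as a small decision function. This is a sketch of the documented behavior, not the app's code: `should_auto_translate` is a hypothetical name, and it assumes the detected language arrives as an ISO 639-1 style code such as "en" or "es", with `None` when no language was detected (as with Apple Speech).

```python
from typing import Optional


def should_auto_translate(
    detected_language: Optional[str],
    auto_translate_enabled: bool,
    ai_provider_configured: bool,
) -> bool:
    """Decide whether a finished transcript should be translated to English."""
    if not auto_translate_enabled or not ai_provider_configured:
        return False
    if detected_language is None:
        # No language detection happened, so there is nothing to act on.
        return False
    # Assumed convention: English is reported as "en" or a variant like "en-US".
    return not detected_language.lower().startswith("en")


print(should_auto_translate("es", True, True))
```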

Manual translation:

  • Even with auto-translate OFF, you can still manually translate individual recordings
  • Open the recording, click Reprocess, and select "Translate to English"
  • This option appears for recordings where non-English language was detected

Tip: Keep auto-translate ON if you primarily work in English. Turn it OFF if you prefer to keep transcripts in their original language and only translate specific recordings when needed.

Can I edit transcripts?

Transcript Editing — Edit text, split or merge segments, and manage speaker names. Full control over your transcripts.

  • Edit text: Click on any transcript segment to edit the text directly
  • Split segments: Break a long segment into two separate parts
  • Merge segments: Combine adjacent segments into one
  • Manage speakers: Assign or change the speaker for any segment
  • Auto-save: All changes are saved automatically

Original timestamps are preserved when editing text, and adjusted appropriately when splitting segments.
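
To make the timestamp behavior concrete, here is one possible way splitting and merging could work. It is a sketch under a stated assumption: the split time is estimated in proportion to where the split falls in the text. The source does not say how SoundMindAI actually adjusts timestamps, and `Segment`, `split_segment`, and `merge_segments` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    text: str


def split_segment(segment: Segment, char_index: int) -> tuple[Segment, Segment]:
    """Split a segment at a character offset.

    Assumption: the split time is proportional to the split position in
    the text; a real app may use word-level timing instead.
    """
    if not 0 < char_index < len(segment.text):
        raise ValueError("split point must fall inside the text")
    fraction = char_index / len(segment.text)
    split_time = segment.start + fraction * (segment.end - segment.start)
    first = Segment(segment.start, split_time, segment.text[:char_index].strip())
    second = Segment(split_time, segment.end, segment.text[char_index:].strip())
    return first, second


def merge_segments(first: Segment, second: Segment) -> Segment:
    """Merge two adjacent segments, keeping the outer timestamps."""
    return Segment(first.start, second.end, f"{first.text} {second.text}")


original = Segment(10.0, 20.0, "aaaaaaaaaa bbbbbbbbb")
left, right = split_segment(original, 10)
restored = merge_segments(left, right)
```

Splitting and then merging returns the original start and end times, which is the property the text above describes.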

What is speaker diarization?

Speaker Diarization — Know who said what with automatic speaker identification.

Speaker diarization automatically identifies and labels different speakers in a recording (e.g., "Speaker 1", "Speaker 2").

How to enable it:

  1. Set up an AssemblyAI API key in Settings
  2. Select AssemblyAI as your transcription provider
  3. Transcribe your recording — speakers are automatically detected
  4. Rename generic labels (Speaker 1, Speaker 2) to actual names

Speaker management:

  • Click the person icon next to any segment to change the speaker
  • Click the pencil icon to rename a speaker (updates everywhere)
  • Choose custom colors to easily distinguish speakers
  • Filter transcript by speaker to see only their segments

Note: Speaker diarization is only available with AssemblyAI. Apple Speech and OpenAI Whisper do not support automatic speaker identification.

How do I bookmark important parts of a transcript?

Bookmarks let you mark and quickly find important moments in your transcripts.

Adding bookmarks:

  • Click the bookmark icon next to any transcript segment
  • The icon fills in to indicate it's bookmarked
  • Click again to remove the bookmark

Filtering by bookmarks:

  • Click the bookmark filter button in the transcript toolbar
  • Only bookmarked segments will be shown
  • Click again to show all segments

Tip: Bookmarks are great for marking action items, important quotes, or moments you want to revisit.

What is AI Split and how does it work?

AI Split intelligently breaks large transcript blocks into smaller, readable chunks using AI analysis.

Why use AI Split?

  • Some transcription services return one giant wall of text
  • Long paragraphs are hard to read and navigate
  • AI Split adds logical breaks for sentences, paragraphs, or speaker changes

How to use AI Split:

  1. Open a recording with a transcript
  2. Click the "AI Split" button in the transcript toolbar
  3. The AI will analyze the content and add appropriate breaks
  4. Your transcript becomes easier to read and skim

Tip: AI Split uses your configured AI provider. Results may vary depending on the model used.

What is AI Help Chat?

AI Help Chat — Get instant answers about SoundMindAI features right in the app. Ask questions and the AI assistant explains how to use any feature.

How to access AI Help Chat:

  1. Click "AI Help" in the left sidebar
  2. Type your question about any SoundMindAI feature
  3. The AI assistant will provide helpful guidance

Example questions:

  • "How do I set up transcription?"
  • "What AI providers are supported?"
  • "How do I export my recordings?"
  • "What keyboard shortcuts are available?"

Features:

  • Instant answers: No need to search through documentation
  • Contextual help: Answers are specific to SoundMindAI features
  • Works offline: Uses bundled FAQ data when internet is unavailable
  • Conversation history: Ask follow-up questions in the same chat

Tip: AI Help Chat is perfect for quickly learning how to use features without leaving the app.

🎤 Recording

How do I start a recording?

Starting a recording is simple:

  1. Click the Record button: Click the large red record button on the main screen, or use the keyboard shortcut
  2. Select audio sources: Choose to record system audio, microphone, or both
  3. Start recording: The timer starts and a visual indicator shows that recording is active
  4. Stop when done: Click the stop button or use the keyboard shortcut to end the recording

Tip: You can pause and resume recordings without creating multiple files. Use the Mini Recorder view for a compact, always-on-top recording interface.

What is the Mini Recorder view?

Mini Recorder — A compact, floating window that stays on top while you work, perfect for long recording sessions.

How to use Mini Recorder:

  1. Start a recording from the main window
  2. Click the "Mini" button in the recording controls
  3. The main window hides and a small floating panel appears
  4. The mini panel shows recording time, controls, and audio levels
  5. Click "Restore" to return to the full window

Mini Recorder features:

  • Always on top: Stays visible above other windows (toggle on/off)
  • Compact design: Minimal footprint on your screen
  • Full controls: Pause, resume, mute mic, and stop recording
  • Audio level indicator: Visual feedback of recording levels
  • Draggable: Position anywhere on your screen

Tip: Mini Recorder is perfect for recording meetings while taking notes in other apps, or for long recording sessions where you don't need the full interface.

How does auto-stop after silence work?

Auto-stop After Silence — Automatically stops recording when no audio is detected for a configurable duration. Perfect for hands-free recording sessions.

How it works:

  • During recording, SoundMindAI monitors audio levels
  • When audio drops below a threshold (very quiet/silence), a timer starts
  • If silence continues for the configured duration, recording stops automatically
  • Any sound resets the silence timer
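
The timer logic described above can be sketched in a few lines. This is an illustrative model only: the `SilenceMonitor` class, the -50 dB threshold, and the one-reading-per-second loop are assumptions made for the example, since the source does not state the actual threshold or sampling rate.

```python
class SilenceMonitor:
    """Tracks how long the audio level has stayed below a threshold."""

    def __init__(self, threshold_db: float = -50.0, max_silence_seconds: float = 300.0):
        self.threshold_db = threshold_db  # assumed silence threshold
        self.max_silence_seconds = max_silence_seconds  # 300 s matches the documented 5-minute default
        self.silent_for = 0.0

    def update(self, level_db: float, elapsed_seconds: float) -> bool:
        """Feed one level reading; return True when recording should stop."""
        if level_db < self.threshold_db:
            self.silent_for += elapsed_seconds
        else:
            self.silent_for = 0.0  # any sound resets the silence timer
        return self.silent_for >= self.max_silence_seconds


# Short 3-second limit so the example finishes quickly.
monitor = SilenceMonitor(max_silence_seconds=3.0)
readings_db = [-60.0, -60.0, -20.0, -60.0, -60.0, -60.0]  # one reading per second
stop_flags = [monitor.update(level, elapsed_seconds=1.0) for level in readings_db]
print(stop_flags)
```

Note how the single loud reading in the middle resets the count, so the stop only triggers after three further quiet seconds.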

How to configure:

  1. Open Settings (⌘,) or click the gear icon
  2. Scroll to the Audio section
  3. Toggle "Auto-stop recordings after silence" on or off
  4. Set your desired duration and unit (seconds, minutes, or hours)

Use cases:

  • Recording a meeting that might end early — the recording stops itself
  • Capturing voice memos without needing to manually stop
  • Recording background audio where you want to stop after extended silence

Settings:

  • Duration: Set from 15 seconds up to several hours
  • Units: Choose seconds, minutes, or hours
  • Default: Enabled, with a 5-minute duration

Tip: 5 minutes is a good default to avoid premature stops during natural conversation pauses.

Can I record system audio from specific applications?

SoundMindAI captures all system audio output. To record specific applications:

  • Mute applications you don't want to record
  • Use the application's own audio settings to control volume
  • Consider using a virtual audio device for more control

Future versions may include per-application audio selection.

How do I import existing audio files?

You can import audio files for transcription:

  • Drag and drop audio files onto the SoundMindAI window
  • Use File > Import Audio or the keyboard shortcut
  • Select files from the file browser

Supported formats: M4A, MP3, WAV, CAF, AIFF, MP4, MOV

Where are my recordings stored?

By default, recordings are stored in:

~/Library/Application Support/SoundMindAI/Recordings/

You can change the storage location in Settings > Storage. All recordings remain on your Mac - nothing is uploaded to the cloud.
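
If you want to see how much disk space your recordings use, a short script like the following can total the folder. It is a sketch that assumes the default location shown above; `recordings_size_mb` is a hypothetical helper, and you would pass a different path if you changed the storage location in Settings.

```python
from pathlib import Path

# Default location given above; pass another folder if you moved storage in Settings.
DEFAULT_RECORDINGS_DIR = Path.home() / "Library/Application Support/SoundMindAI/Recordings"


def recordings_size_mb(folder: Path = DEFAULT_RECORDINGS_DIR) -> float:
    """Return the total size of all files under `folder`, in megabytes."""
    if not folder.exists():
        return 0.0
    total_bytes = sum(f.stat().st_size for f in folder.rglob("*") if f.is_file())
    return total_bytes / 1_000_000


print(f"Recordings use {recordings_size_mb():.1f} MB")
```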

📝 Transcription

What transcription options are available?

SoundMindAI offers multiple transcription options:

Service        | Cost            | Speed | Accuracy
Apple Speech   | Free (built-in) | Fast  | Good
OpenAI Whisper | ~$0.006/min     | Fast  | Excellent
AssemblyAI     | ~$0.003/min     | Fast  | Excellent

Apple Speech requires no setup and works offline. BYOK services offer higher accuracy but require API keys.

What are the limitations of Apple Speech (native) transcription?

Apple's built-in Speech Recognition has the following limitations:

Constraint             | Limit                  | Notes
Server-based duration  | 1 minute               | Hard limit; recognition stops after 60 seconds
On-device duration     | ~5 minutes recommended | No hard limit, but becomes unreliable for longer recordings
Request rate           | 1000/hour per device   | Shared across all apps on the device

SoundMindAI enforces a 5-minute limit for native transcription to ensure reliability. For longer recordings, use OpenAI Whisper or AssemblyAI.

Important: If you have recordings longer than 5 minutes, configure a cloud transcription provider (Whisper or AssemblyAI) in Settings. These services have no practical duration limits and offer higher accuracy.

How do I view a recording's transcript and summary?

There are three ways to open the detail view for any recording:

  1. Double-click on the recording in your library
  2. Select a recording and click the "Transcript & Summary" button
  3. Click the document icon next to the recording

The detail view shows tabs for Summary, Transcript, Notes, Source, Tags, and AI Chat. You can navigate between tabs to access different aspects of your recording.

Tip: Double-clicking is the fastest way to open a recording. The detail view includes playback controls so you can listen while reading the transcript.

How do I transcribe a recording?

After recording or importing audio:

  1. Select the recording from your library
  2. Click "Transcribe" or use the keyboard shortcut
  3. Choose your transcription service (Apple Speech or BYOK)
  4. Wait for transcription to complete

Transcripts appear in the detail view with timestamps for easy navigation.

Can I edit transcripts?

Yes! SoundMindAI provides comprehensive transcript editing:

  • Edit text: Click on any transcript segment to edit the text directly
  • Split segments: Break a long segment into two separate parts at a specific point
  • Merge segments: Combine adjacent segments into one
  • Manage speakers: Assign or change the speaker for any segment
  • Auto-save: All changes are saved automatically

Original timestamps are preserved when editing text, and adjusted appropriately when splitting segments.

How do I manage speakers in transcripts?

SoundMindAI makes it easy to identify and organize speakers:

Changing a speaker:

  • Click the person icon next to any transcript segment
  • Select from existing speakers or create a new one
  • The change applies to that segment only

Editing speakers (rename & color):

  • Click the pencil icon next to a speaker name
  • Change the name (e.g., "Speaker 1" → "John") - updates everywhere that speaker appears
  • Choose a custom color from 8 options to easily distinguish speakers

Filtering by speaker:

  • Click the speaker filter dropdown (person icon) in the transcript toolbar
  • Select a specific speaker to show only their segments
  • Select "All Speakers" to show the full transcript again

Tip: Use AssemblyAI for automatic speaker diarization - it will automatically identify and label different speakers in your recording.

What is speaker diarization?

Speaker diarization automatically identifies and labels different speakers in a recording (e.g., "Speaker 1", "Speaker 2").

How to enable it:

  1. Set up an AssemblyAI API key in Settings
  2. Select AssemblyAI as your transcription provider
  3. Transcribe your recording - speakers are automatically detected
  4. Rename generic labels (Speaker 1, Speaker 2) to actual names

Note: Speaker diarization is only available with AssemblyAI. Apple Speech and OpenAI Whisper do not support automatic speaker identification.

How do I bookmark important parts of a transcript?

Bookmarks let you mark and quickly find important moments in your transcripts:

Adding bookmarks:

  • Click the bookmark icon next to any transcript segment
  • The icon fills in to indicate it's bookmarked
  • Click again to remove the bookmark

Filtering by bookmarks:

  • Click the bookmark filter button in the transcript toolbar
  • Only bookmarked segments will be shown
  • Click again to show all segments

Bookmarks are saved with your recording and persist across sessions.

🤖 AI Summarization

What AI summaries does SoundMindAI generate?

SoundMindAI uses AI to extract meaningful insights from your transcripts:

  • Summary: Concise overview of the content
  • Key Points: Important highlights and takeaways
  • Action Items: Tasks and follow-ups identified from the conversation

Each item includes timestamps so you can jump to the relevant part of the recording.

Which AI providers are supported?

SoundMindAI supports multiple AI providers for summarization:

  • OpenAI - GPT-4o, GPT-4, GPT-3.5-turbo
  • Anthropic - Claude 3.5 Sonnet, Claude 3 Opus
  • Google - Gemini 1.5 Pro, Gemini 1.5 Flash
  • HuggingFace - Various open-source models
  • OpenRouter - Access multiple providers through one API

Each provider has different strengths - experiment to find what works best for your content.

How much does AI summarization cost?

Costs vary by provider and model. Typical costs for summarizing a 1-hour transcript:

  • GPT-4o: ~$0.05-0.15
  • GPT-3.5-turbo: ~$0.01-0.03
  • Claude 3.5 Sonnet: ~$0.04-0.10
  • Gemini 1.5 Flash: ~$0.01-0.02

Costs depend on transcript length. You pay directly to your chosen provider - SoundMindAI doesn't add any markup.

Why are there mistakes in my transcript or summary?

SoundMindAI uses third-party AI services for transcription and summarization. We pass your audio/text directly to your chosen provider and display their results.

Common causes of errors:

  • Transcription errors: Background noise, accents, technical jargon, or multiple speakers can affect accuracy
  • Summary inaccuracies: AI models may misinterpret context, omit details, or occasionally "hallucinate" information
  • Model limitations: Different AI providers and models have varying accuracy levels

Important: SoundMindAI is not responsible for the accuracy of AI-generated transcripts or summaries. We simply pass your content to your chosen AI provider and display their output. Always review AI-generated content for accuracy, especially for important meetings or sensitive information.

Tip: You can edit transcripts directly in the app to correct any errors. Try different AI providers to find the best results for your content type.

What is AI Chat and how do I use it?

AI Chat lets you ask questions about your recordings and get instant answers based on the transcript content.

How to use AI Chat:

  1. Open a recording that has been transcribed
  2. Click the "Chat" tab in the detail view
  3. Type your question (e.g., "What were the main decisions made?")
  4. The AI will analyze the transcript and provide an answer

Example questions:

  • "What action items were discussed?"
  • "Summarize what John said about the budget"
  • "What were the concerns raised about the timeline?"
  • "List all the names mentioned in this meeting"

Tip: AI Chat uses your configured AI provider (OpenAI, Anthropic, etc.). Make sure you have an AI provider set up in Settings.

🔑 BYOK (Bring Your Own Keys) Explained

What does BYOK mean?

BYOK stands for "Bring Your Own Keys." Instead of us charging you for AI services, you get API keys directly from the service providers (like OpenAI or Anthropic) and enter them in SoundMindAI.

Here's how it works:

  1. You create an account with an AI provider (e.g., OpenAI)
  2. You add payment and get an API key from them
  3. You enter that API key in SoundMindAI's settings
  4. SoundMindAI uses your key to access the AI service
  5. You pay the provider directly based on your usage

Why does SoundMindAI use BYOK instead of including AI?

BYOK offers significant advantages for you:

  • Lower cost: Pay wholesale rates directly to providers instead of marked-up prices
  • Choice: Pick the AI provider and model that works best for you
  • Privacy: Your data goes directly to the provider - we never see it
  • Control: Set your own usage limits and budgets
  • No subscription: Pay only for what you use, when you use it

Note: You are responsible for understanding and managing costs with each AI provider. We recommend setting usage limits in your provider accounts.

Are my API keys secure?

Yes, your API keys are stored securely:

  • Keys are stored in macOS Keychain, the same place your passwords are stored
  • Keys are encrypted using macOS system-level encryption
  • Keys never leave your Mac (except when making API calls to providers)
  • We cannot access, view, or retrieve your keys

How much do AI services cost? (Estimated Pricing)

Below are estimated costs for supported AI providers. Actual costs depend on your usage and current provider pricing.

Transcription Services

Provider       | Cost per Minute | 1-Hour Recording
Apple Speech   | Free            | $0.00
OpenAI Whisper | ~$0.006         | ~$0.36
AssemblyAI     | ~$0.003         | ~$0.18

AI Summarization Services

Estimated cost to summarize a 1-hour transcript (~12,000 words)

Provider  | Model             | Est. Cost
OpenAI    | GPT-4o            | ~$0.05 - $0.15
OpenAI    | GPT-4o-mini       | ~$0.01 - $0.03
Anthropic | Claude 3.5 Sonnet | ~$0.04 - $0.10
Google    | Gemini 1.5 Flash  | ~$0.01 - $0.02
Google    | Gemini 1.5 Pro    | ~$0.02 - $0.06

Example: Total Cost for 1-Hour Meeting

Setup    | Transcription       | Summary               | Total
Budget   | Apple Speech (free) | Gemini Flash (~$0.01) | ~$0.01
Balanced | Whisper (~$0.36)    | GPT-4o-mini (~$0.02)  | ~$0.38
Premium  | AssemblyAI (~$0.18) | GPT-4o (~$0.10)       | ~$0.28

Note: The Budget setup applies only to recordings under 5 minutes, the limit SoundMindAI enforces for Apple Speech. Longer meetings need Whisper or AssemblyAI.

Important Disclaimer: These are rough estimates only and may not reflect current pricing. AI provider pricing changes frequently. Always check each provider's official pricing page for the most accurate and up-to-date information.
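
To estimate costs for your own recording lengths, the arithmetic behind the tables above can be reproduced with a short script. This is a sketch only: the per-minute rates are the approximate figures quoted in this guide, the summary cost is a value you supply from the summarization table, and `estimate_cost` is a hypothetical helper rather than anything built into SoundMindAI.

```python
# Approximate per-minute transcription rates quoted in this guide (USD).
TRANSCRIPTION_RATES = {
    "Apple Speech": 0.0,
    "OpenAI Whisper": 0.006,
    "AssemblyAI": 0.003,
}


def estimate_cost(minutes: float, service: str, summary_cost: float = 0.0) -> float:
    """Estimate the total cost of one recording in USD."""
    transcription = minutes * TRANSCRIPTION_RATES[service]
    return round(transcription + summary_cost, 2)


# Whisper transcription plus a mid-range GPT-4o-mini summary estimate.
balanced = estimate_cost(60, "OpenAI Whisper", summary_cost=0.02)
# AssemblyAI transcription plus a mid-range GPT-4o summary estimate.
premium = estimate_cost(60, "AssemblyAI", summary_cost=0.10)
print(f"Balanced: ${balanced:.2f}, Premium: ${premium:.2f}")
```

Changing the `minutes` argument scales the transcription portion linearly; summary costs grow with transcript length too, so treat the supplied value as a rough figure.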

Quick Start Setup Guide

How do I set up SoundMindAI for best results?

SoundMindAI uses a BYOK (Bring Your Own Keys) model, which means you connect your own AI service accounts. This gives you lower costs, more privacy, and full control. Here's how to get the best experience:

What You'll Set Up

  1. Transcription: Converts your audio to text
     Options: Native macOS (free), OpenAI Whisper (more accurate), or AssemblyAI (adds speaker identification)
  2. AI Summarization: Generates summaries and action items
     Options: OpenAI GPT, Anthropic Claude, Google Gemini, and more

Recommended setup for best results (5 minutes)

For the best transcription accuracy and AI summaries, we recommend OpenAI - one account gives you both Whisper (transcription) and GPT (summarization).

Need speaker identification? Use AssemblyAI instead of Whisper for transcription. AssemblyAI supports speaker diarization — it identifies who said what in multi-person recordings. Perfect for meetings with multiple participants. You can still use OpenAI for AI summarization.

Step 1: Create OpenAI Account

Go to platform.openai.com/signup and create a free account.

Step 2: Add Payment Method

In your OpenAI account, go to Settings > Billing and add a credit card. You only pay for what you use (typically $0.01-0.05 per recording).

Step 3: Generate API Key

In your OpenAI account, go to the API Keys section, click "Create new secret key", name it "SoundMindAI", and copy it immediately (you won't see it again).

Step 4: Configure SoundMindAI

In SoundMindAI, go to Tools > Settings:

  • Under Transcription, select "Whisper API"
  • Under AI Summarization, select "OpenAI"
  • Paste your API key when prompted

Step 5: Test Your Setup

Make a short test recording (30 seconds), then check:

  • The status indicators in the sidebar should show green
  • After recording, transcription should appear
  • AI summary and action items should generate

Cost Tip: Set a monthly spending limit in OpenAI's Billing settings (e.g., $10/month) to avoid surprises. Most users spend $1-5/month.

Alternative: Free setup with native transcription

If you prefer not to set up API keys, SoundMindAI works out of the box with native macOS transcription:

  • Transcription: Uses Apple's built-in Speech Recognition (free, on-device)
  • AI Summarization: Not available without an API key

Limitations of native transcription:

  • Limited to recordings under 5 minutes
  • Less accurate than Whisper for technical content
  • No AI summaries or action items

This is a good option for quick tests, but for regular use we recommend setting up OpenAI (or AssemblyAI for speaker diarization) for the full experience.
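
The trade-offs above can be summed up as a simple decision rule. The Python sketch below is only an illustration of that reasoning. The function name and return strings are hypothetical, and treating a recording of exactly 5 minutes as within the native limit is an assumption.

```python
NATIVE_LIMIT_SECONDS = 5 * 60  # native transcription limit described above


def suggest_transcriber(duration_seconds: float, need_speakers: bool, has_api_key: bool) -> str:
    """Suggest a transcription option from the trade-offs described in this guide."""
    if not has_api_key:
        # Without an API key, only native macOS transcription is available.
        if duration_seconds > NATIVE_LIMIT_SECONDS:
            return "Set up an API key (recording exceeds the native 5-minute limit)"
        return "Native macOS"
    if need_speakers:
        # Speaker diarization is the reason to pick AssemblyAI over Whisper.
        return "AssemblyAI"
    return "Whisper API"


print(suggest_transcriber(90, need_speakers=False, has_api_key=False))
print(suggest_transcriber(3600, need_speakers=True, has_api_key=True))
```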

Understanding the status indicators

The sidebar shows the status of your transcription and AI providers:

Green - Configured and ready to use
Red - Needs setup (missing API key or provider not selected)
Gray - License required (trial expired)

Click on either indicator to quickly change providers without going to Settings.
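
As a rough mental model, the three colors can be read as the result of a few checks. The Python sketch below illustrates that logic only. The function, its parameters, the idea that native transcription needs no key, and the choice to show gray before red are assumptions, not SoundMindAI's actual code.

```python
from typing import Optional

# Assumption: native transcription is the only option that needs no API key.
KEYLESS_PROVIDERS = {"Native macOS"}


def indicator_color(license_active: bool, provider: Optional[str], api_key: Optional[str]) -> str:
    """Map a provider's setup state to the sidebar colors described above."""
    if not license_active:
        return "gray"  # license required (trial expired)
    if provider is None:
        return "red"  # no provider selected
    if provider in KEYLESS_PROVIDERS or api_key:
        return "green"  # configured and ready to use
    return "red"  # provider selected but API key missing


print(indicator_color(True, "Whisper API", None))
```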

Troubleshooting setup issues

"Invalid API key" error:

  • Make sure you copied the entire key (no extra spaces)
  • Check that you have billing set up with the provider
  • Try generating a new key if the old one isn't working
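
Stray whitespace is the most common copy-paste problem. If you keep keys in your own scripts or notes, a small Python helper like the one below can strip it before you paste. The function name is hypothetical and the key shown is a placeholder, not a real credential.

```python
import re


def clean_api_key(pasted: str) -> str:
    """Remove spaces, tabs, and line breaks that sneak in when copying a key."""
    key = re.sub(r"\s+", "", pasted)
    if not key:
        raise ValueError("API key is empty after removing whitespace")
    return key


# Placeholder key for illustration only.
print(clean_api_key("  sk-example-not-a-real-key \n"))
```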

"Insufficient quota" or "Rate limit" error:

  • Insufficient quota: Add credits or a payment method to your provider account
  • Rate limit: Wait a moment and try again
  • Check your usage limits in the provider's dashboard

Transcription not working:

  • Verify the API key is entered correctly in Settings
  • Check that the correct transcription provider is selected
  • Run API Diagnostics to test connectivity (see "How do I use API Diagnostics?" in the Troubleshooting section)

Still having issues? See the detailed setup guides below for your specific provider, or check the Troubleshooting section.

🛠 API Setup Guides

How to set up OpenAI (Whisper + GPT)

OpenAI provides both Whisper (transcription) and GPT (summarization):

Create an OpenAI Account

Go to platform.openai.com/signup and create an account

Add Payment Method

Go to Settings > Billing and add a credit card. New accounts may receive free credits.

Generate API Key

Go to API Keys section, click "Create new secret key", and copy it immediately (you won't see it again)

Enter Key in SoundMindAI

Open SoundMindAI Settings > API Keys > OpenAI and paste your key

Tip: Set a monthly usage limit in OpenAI's settings to avoid unexpected charges.

Having issues? Visit OpenAI Help Center or OpenAI Documentation.
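
If you want to confirm a key outside the app, one option is to ask OpenAI's API to list the available models: a rejected key returns HTTP 401. The Python sketch below uses only the standard library. The function names are illustrative, and it is independent of whatever check SoundMindAI performs internally.

```python
import urllib.error
import urllib.request

OPENAI_MODELS_URL = "https://api.openai.com/v1/models"


def build_key_check_request(api_key: str) -> urllib.request.Request:
    """Build a GET request that lists models using bearer-token authentication."""
    return urllib.request.Request(
        OPENAI_MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def key_is_accepted(api_key: str, timeout: float = 10.0) -> bool:
    """Return True if the key is accepted, False if it is rejected with HTTP 401."""
    try:
        with urllib.request.urlopen(build_key_check_request(api_key), timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 401:
            return False
        raise  # other errors (network, rate limit) need a closer look
```

Call `key_is_accepted("your-key-here")` from a Python prompt. Avoid saving real keys in script files.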

How to set up Anthropic (Claude)

Anthropic provides Claude for AI summarization:

Create an Anthropic Account

Go to console.anthropic.com and sign up

Add Payment Method

Navigate to Billing and add a payment method

Generate API Key

Go to API Keys, create a new key, and copy it

Enter Key in SoundMindAI

Open SoundMindAI Settings > API Keys > Anthropic and paste your key

Having issues? Visit Anthropic Documentation or Anthropic Support.

How to set up Google (Gemini)

Google provides Gemini for AI summarization:

Go to Google AI Studio

Visit aistudio.google.com and sign in with your Google account

Get API Key

Click "Get API Key" in the left sidebar, then "Create API key"

Copy Your Key

Copy the generated API key

Enter Key in SoundMindAI

Open SoundMindAI Settings > API Keys > Google and paste your key

Tip: Google offers a generous free tier for Gemini. Check current limits at Google AI Studio.

Having issues? Visit Google AI Documentation or Google AI Studio.

How to set up AssemblyAI (Transcription)

AssemblyAI provides high-quality transcription:

Create AssemblyAI Account

Go to assemblyai.com and sign up

Get Your API Key

Your API key is shown on your dashboard immediately after signing up

Add Credits

Go to Billing to add credits or set up a payment method

Enter Key in SoundMindAI

Open SoundMindAI Settings > API Keys > AssemblyAI and paste your key

Having issues? Visit AssemblyAI Documentation or AssemblyAI Support.

How to set up OpenRouter (Multiple Providers)

OpenRouter lets you access multiple AI providers with one API key:

Create OpenRouter Account

Go to openrouter.ai and sign up

Add Credits

Add credits to your account in the Billing section

Get API Key

Go to Keys section and create a new API key

Enter Key in SoundMindAI

Open SoundMindAI Settings > API Keys > OpenRouter and paste your key

Tip: OpenRouter is great if you want to try different models without setting up multiple accounts.

Having issues? Visit OpenRouter Documentation or OpenRouter Discord.

📁 Organization & Management

How do I organize my recordings?

SoundMindAI provides several ways to organize your recordings:

  • Tags: Add custom tags with colors to categorize recordings
  • Folders: Create folders to group related recordings
  • Search: Search by title, transcript content, or tags
  • Sort: Sort by date, duration, or name
  • Filter: Filter by tags, date range, or transcription status

How do I add tags to recordings?

To add tags:

  1. Select a recording from your library
  2. Click the Tags section in the detail view
  3. Type a tag name and press Enter, or select from existing tags
  4. Click the color dot to change tag color

You can also batch-tag multiple recordings by selecting them first.

Can I select multiple recordings at once?

Yes! SoundMindAI supports multi-select for batch operations:

How to multi-select:

  • Command + Click: Add individual recordings to your selection
  • Shift + Click: Select a range of recordings
  • Command + A: Select all recordings

Batch operations available:

  • Delete multiple recordings at once
  • Add tags to multiple recordings
  • Export multiple recordings

Can I export my recordings and transcripts?

Yes! SoundMindAI offers flexible export options:

Export formats:

  • Plain Text (.txt): Simple text format for any application
  • Markdown (.md): Formatted text with headers and styling
  • HTML (.html): Web-ready format with styling
  • JSON (.json): Structured data format for developers
  • SRT (.srt): Subtitle format for video editing

What you can export:

  • Full transcript with timestamps and speakers
  • AI-generated summary, key points, and action items
  • Original audio file (M4A)

Access export via File > Export, the toolbar button, or right-click on a recording.
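
For readers curious about the SRT option: an SRT file is a sequence of numbered blocks, each with a time range written as HH:MM:SS,mmm and a line of text. The Python sketch below shows how timestamped transcript segments map onto that layout. The segment tuples and function names are illustrative, and the app's own exporter may differ in detail.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Convert (start_seconds, end_seconds, text) segments into SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)  # blocks are separated by a blank line


print(to_srt([(0.0, 2.5, "Welcome, everyone."), (2.5, 6.0, "Let's get started.")]))
```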

How do I delete recordings?

To delete recordings:

  1. Select the recording(s) you want to delete
  2. Press Delete or right-click and select "Delete"
  3. Confirm deletion in the dialog

Warning: Deleted recordings are permanently removed. Consider exporting important recordings before deleting.

▶️ Playback

How do I play back recordings?

Select a recording and use the built-in player:

  • Play/Pause: Space bar or play button
  • Seek: Click anywhere on the timeline
  • Skip: Arrow keys for 5-second jumps
  • Speed: Adjust playback speed (0.5x to 2x)
  • Volume: Use the volume slider

Can I jump to specific parts of a recording?

Yes! Click on any timestamped item to jump to that position:

  • Click transcript segments to jump to that part
  • Click key points to hear the relevant section
  • Click action items to hear the context

Timestamps are clickable throughout the interface.

🔧 Troubleshooting

Recording isn't capturing system audio

If system audio isn't being captured:

  1. Check Screen Recording permission: System Settings > Privacy & Security > Screen Recording (labeled "Screen & System Audio Recording" on macOS 15 and later). Ensure SoundMindAI is enabled.
  2. Restart the app: Quit and relaunch SoundMindAI after granting permission.
  3. Check audio source: Ensure "System Audio" is selected in recording settings.
  4. Test with another app: Play audio from any app while recording to verify.

Why does my recording sound degraded or echo-y when the microphone is on?

This is not a bug in SoundMindAI — it's a fundamental physics issue with audio recording that affects all recording software.

What's happening

When you record with your microphone enabled and speakers playing audio:

  1. Your speakers play the system audio (e.g., a YouTube video, Zoom call)
  2. Your microphone picks up that same audio from the room
  3. Now you have two copies of the same audio: the clean digital original + a degraded room recording
  4. During playback, these two copies interfere with each other, causing phasing, echo, and a "hollow" or "underwater" sound
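
The "hollow" sound comes from the delayed room copy cancelling some frequencies and reinforcing others (often called comb filtering). The Python sketch below illustrates the effect for a single tone. The 5 ms delay and equal loudness of the two copies are assumptions chosen to make the effect obvious, not measurements.

```python
import cmath
import math


def combined_gain(freq_hz: float, delay_s: float, room_level: float = 1.0) -> float:
    """Relative level of a tone when a delayed copy is mixed with the original.

    1.0 means unchanged, 0.0 means fully cancelled, 2.0 means doubled.
    """
    phase = 2 * math.pi * freq_hz * delay_s
    return abs(1 + room_level * cmath.exp(-1j * phase))


# Assumed 5 ms delay between the digital original and the microphone's copy.
for freq in (100, 200, 300, 400):
    print(f"{freq} Hz -> gain {combined_gain(freq, 0.005):.2f}")
```

With these assumed values, every other tone in the list is cancelled while the ones in between are doubled. That uneven response is what makes the mixed recording sound hollow.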

The solution: Use headphones

Wearing headphones while recording completely solves this issue:

  • Headphones isolate audio — the sound goes directly to your ears, not into the room
  • Your mic only captures your voice — clean, isolated input with no speaker bleed
  • Result: Crystal-clear recordings with no phasing or echo

If you can't use headphones

  • Mute your mic during recording if you don't need to capture your voice. SoundMindAI sets mic playback to 0% by default for clean system audio.
  • Lower your speaker volume to reduce how much the mic picks up
  • Position your mic away from speakers if using an external microphone

Pro tip: For the best recording quality, always use headphones when recording meetings, calls, or any audio where you need both system audio and your microphone. This is standard practice in professional audio production.

"Required permissions not granted" error (even when permissions are enabled)

After updating SoundMindAI, you may see a "Required permissions not granted" error even though the permission toggles appear enabled in System Settings. This happens because macOS caches permissions based on the app's code signature, which changes with each update.

Quick Fix

  1. Open System Settings > Privacy & Security > Screen & System Audio Recording (labeled "Screen Recording" on macOS 14 and earlier)
  2. Click the minus (-) button to remove SoundMindAI from the list entirely
  3. Quit SoundMindAI completely (Cmd+Q)
  4. Relaunch SoundMindAI - it will prompt for permission again
  5. Click Allow when prompted

Nuclear Reset (if Quick Fix doesn't work)

Open Terminal and run these commands to completely reset permissions:

tccutil reset ScreenCapture com.robertgrow.SoundMindAI
tccutil reset Microphone com.robertgrow.SoundMindAI

Then:

  1. Quit SoundMindAI completely
  2. Relaunch the app
  3. Grant permissions when prompted

Why does this happen? macOS ties permissions to an app's code signature. When SoundMindAI is updated and re-signed, macOS may not recognize it as the same app that was previously granted permission, even though the toggle still appears "on" in System Settings.

Transcription is failing or stuck

If transcription isn't working:

  • Recording too long (Apple Speech): Native transcription is limited to 5 minutes. For longer recordings, configure Whisper or AssemblyAI in Settings.
  • Apple Speech: Ensure your Mac has an internet connection (required for some languages)
  • BYOK services: Verify your API key is correct in Settings
  • Check credits: Ensure you have available credits/balance with your AI provider
  • File format: Ensure the audio file is in a supported format

Tip: If transcription gets stuck, you can cancel it using the X button next to the progress bar, then restart with a different provider.

API key errors

If you're seeing API key errors:

  • Invalid key: Double-check you copied the entire key with no extra spaces
  • Expired key: Some providers expire unused keys. Generate a new one.
  • No credits: Check your provider account for available balance
  • Rate limited: Wait a moment and try again

You can test your API keys in Settings by clicking the "Test" button next to each key.
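
If you call a provider from your own scripts, "wait a moment and try again" is usually implemented as a retry with increasing delays. The Python sketch below shows one way to do that. The `RateLimited` exception and the function name are hypothetical stand-ins for whatever error your client raises on HTTP 429, and this is not a description of SoundMindAI's internal behavior.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised when a provider reports a rate limit (HTTP 429)."""


def retry_on_rate_limit(
    call: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), waiting 1x, 2x, 4x, ... base_delay between rate-limited tries."""
    for attempt in range(attempts):
        try:
            return call()
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.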

How do I use API Diagnostics?

The Diagnose feature helps you test and troubleshoot your API configurations:

How to access:

  1. Click "Diagnose" in the sidebar
  2. Select the provider you want to test (Transcription or AI)
  3. Click "Run Test" to verify the connection

What diagnostics check:

  • API key validity
  • Network connectivity to the provider
  • Account status and available credits
  • Model availability

When to use diagnostics:

  • After entering a new API key
  • When transcription or summarization fails
  • To verify your setup before an important recording
  • After changing providers or models

Tip: Run diagnostics after setting up each API key to ensure everything is working before you need it.

How do I contact support?

For support, please email us at:

support@soundmindai.net

Please include:

  • Description of the issue
  • macOS version
  • SoundMindAI version (from About menu)
  • Steps to reproduce the problem
  • Any error messages you see
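
If you prefer to assemble these details with a script, the Python sketch below builds a plain-text report you can paste into your email. It reads the macOS version from Python's standard `platform` module. The function name is illustrative and the values in the example call are placeholders, so copy the real app version from the About menu.

```python
import platform


def support_report(app_version: str, issue: str, steps: list[str], errors: list[str]) -> str:
    """Build a plain-text summary of the details support asks for."""
    # mac_ver() returns an empty version string when not running on macOS.
    macos_version = platform.mac_ver()[0] or "unknown (not running on macOS)"
    lines = [
        f"Issue: {issue}",
        f"macOS version: {macos_version}",
        f"SoundMindAI version: {app_version}",
        "Steps to reproduce:",
    ]
    lines += [f"  {number}. {step}" for number, step in enumerate(steps, start=1)]
    lines.append("Error messages:")
    lines += [f"  - {message}" for message in errors] or ["  (none)"]
    return "\n".join(lines)


# Placeholder values for illustration only.
print(support_report(
    app_version="(from About menu)",
    issue="Transcription stays at 0%",
    steps=["Import an audio file", "Start transcription"],
    errors=[],
))
```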