FAQ & User Guide
Everything you need to know about using SoundMindAI, from recording your first meeting to setting up AI-powered transcription and summarization.
🎯 Getting Started
What is SoundMindAI?
SoundMindAI is a native macOS application that helps you capture, transcribe, and understand audio content. Whether you're recording meetings, lectures, interviews, or podcasts, SoundMindAI transforms spoken words into organized, searchable text with AI-powered insights.
Key capabilities:
- Record system audio and microphone simultaneously
- Import existing audio files
- Transcribe with Apple Speech (free) or premium AI services
- Generate summaries, key points, and action items using AI
- Organize recordings with tags, folders, and search
- Export in multiple formats
What are the system requirements?
- macOS: macOS 13 (Ventura) or later
- Processor: Apple Silicon or Intel Mac
- Storage: 100 MB for the app, plus space for recordings
- Permissions: Screen Recording (for system audio) and Microphone access
How do I grant the required permissions?
On first launch, SoundMindAI will request the necessary permissions. If you need to enable them manually:
Open System Settings
Click the Apple menu > System Settings > Privacy & Security
Enable Screen Recording
Select "Screen Recording" from the list and toggle on SoundMindAI
Enable Microphone Access
Select "Microphone" from the list and toggle on SoundMindAI
Restart the App
Quit and relaunch SoundMindAI for permissions to take effect
How does the 7-day free trial work?
The free trial gives you full access to all features for 7 days. No credit card required.
- Trial starts when you click "Start Free Trial"
- All features are unlocked including BYOK AI services
- Trial is tied to your Mac (hardware-based), not your email
- After 7 days, purchase a license or continue with limited features
After trial expires: You can still record and use Apple Speech transcription. BYOK AI features require a license.
How private and secure is SoundMindAI?
SoundMindAI is designed with privacy as a core principle. Your data stays on your Mac.
What stays local:
- All recordings and audio files
- All transcripts and AI summaries
- Your API keys (stored in macOS Keychain)
- App settings and preferences
What we do NOT do:
- Collect analytics or usage data
- Track your behavior or features used
- Upload your recordings or transcripts
- Sell or share your data
Only network connections:
- License validation: Periodic check with our licensing server (no personal data sent)
- BYOK services: When you click transcribe or summarize, your audio/text goes directly to your chosen provider (OpenAI, Anthropic, etc.) - not through us
What's the difference between the license fee and AI costs?
There are two separate costs to understand:
1. SoundMindAI License (one-time, paid to us):
- $49.95 one-time payment for lifetime access
- Covers the app software, updates, and all built-in features
- Includes Apple Speech transcription (free, on-device)
- No recurring fees or subscriptions
2. AI Provider Costs (ongoing, paid directly to providers):
- Paid directly to OpenAI, Anthropic, Google, AssemblyAI, etc.
- Based on your actual usage (pay-as-you-go)
- Prices set by each provider - we have no control over their rates
- Billed separately by each provider to your account with them
How do I deactivate my license?
To deactivate your license from a Mac:
- Click on the license badge in the top-right corner of the app
- In the popover, click the "Deactivate" button
- Confirm the deactivation when prompted
Why deactivate?
- Moving to a new Mac: Your license allows up to 3 devices. Deactivate from your old Mac to free up a slot for your new one.
- Selling or giving away your Mac: Remove your license before transferring your computer to someone else.
- Troubleshooting: If you're having license validation issues, deactivating and re-entering your key can help resolve them.
How do automatic updates work?
Automatic Updates — SoundMindAI can check for new versions automatically and notify you when updates are available.
How it works:
- By default, SoundMindAI checks for updates when the app launches
- If a new version is available, you'll see a notification in the sidebar
- Click the notification to view update details and install
- Updates are downloaded and installed seamlessly
Managing update settings:
- Open Settings (⌘,) or click the gear icon
- Go to the Tools tab
- Toggle "Check for updates on startup" on or off
- You can always manually check for updates from the menu: SoundMindAI > Check for Updates
What if auto-update fails?
If you see an error like "An error occurred while launching the installer" when trying to update, don't worry — you can easily update manually:
- Download the latest version from soundmindai.net
- Quit SoundMindAI if it's running
- Open the downloaded DMG and drag SoundMindAI to your Applications folder
- Click "Replace" when prompted to overwrite the old version
- Launch the new version — all your recordings and settings are preserved
Why does this happen?
This can occur on older versions (before v1.1.2) due to macOS security requirements. Once you manually update to the latest version, future auto-updates will work normally.
✨ Features
How does audio capture work?
Capture Everything — Record system audio and microphone simultaneously. Zoom, Teams, YouTube, podcasts — anything you can hear.
Audio sources:
- System Audio: Capture sound from any application (meetings, videos, music)
- Microphone: Record your voice or room audio
- Both: Record system and mic together for complete meeting capture
How to start recording:
- Click the large red record button on the main screen
- Select your audio sources (system, mic, or both)
- The timer starts and you'll see a visual indicator
- Click stop when done — your recording is saved automatically
What transcription options are available?
AI Transcription converts speech to text with accurate, timestamped transcripts ready in minutes.
| Service | Cost | Speed | Accuracy |
|---|---|---|---|
| Apple Speech | Free (built-in) | Fast | Good |
| OpenAI Whisper | ~$0.006/min | Fast | Excellent |
| AssemblyAI | ~$0.003/min | Fast | Excellent + Speaker ID |
Apple Speech requires no setup and works offline. BYOK services offer higher accuracy but require API keys.
What AI summaries does SoundMindAI generate?
Smart Summaries — Get AI-generated summaries, bullet points, and action items. Choose from Claude, GPT-4, Gemini, and more.
SoundMindAI extracts meaningful insights from your transcripts:
- Summary: Concise overview of the content
- Key Points: Important highlights and takeaways
- Action Items: Tasks and follow-ups identified from the conversation
Each item includes timestamps so you can jump to the relevant part of the recording.
How do I organize my recordings?
Stay Organized — Tag, search, and filter your recordings. Find any meeting instantly with full-text search.
- Tags: Add custom tags with colors to categorize recordings
- Search: Full-text search across titles, transcripts, and tags
- Sort: Sort by date, duration, or name
- Filter: Filter by tags, date range, or transcription status
Adding tags:
- Select a recording from your library
- Click the Tags section in the detail view
- Type a tag name and press Enter, or select from existing tags
- Click the color dot to change tag color
How does click-to-seek work?
Click-to-Seek — Jump to any moment in your recording. Click on any transcript line to hear exactly what was said.
How it works:
- Every transcript segment has a timestamp
- Click any segment to jump the audio player to that exact moment
- The current segment highlights as the audio plays
- Works with summaries and action items too — click to hear the context
Navigation shortcuts:
- Use arrow keys to move between segments
- Press Space to play/pause
- Click timestamps in summaries to jump to that point
How do I import existing audio and video files?
Import Any Media — Drop in MP3, MP4, M4A, WAV, and more to transcribe and summarize.
Supported formats:
- Audio: MP3, M4A, WAV, CAF, AIFF
- Video: MP4, MOV (audio is extracted automatically)
How to import:
- Drag and drop files directly onto the SoundMindAI window
- Or use File > Import from the menu bar
- The file is added to your library and ready for transcription
Use cases:
- Transcribe podcast episodes you want to reference
- Process lecture recordings from school
- Import voice memos recorded on your phone
- Extract insights from interview recordings
What export options are available?
Export Anywhere — Export transcripts and summaries in multiple formats.
Export formats:
- Plain Text (.txt): Simple text format
- Markdown (.md): Formatted for documentation
- HTML (.html): Web-ready with styling
- SRT Subtitles (.srt): For video editing and YouTube
- VTT Subtitles (.vtt): Web video subtitle format
Export options:
- Include or exclude timestamps
- Include or exclude speaker labels
- Export transcript, summary, or both
What is AI Chat and how do I use it?
AI Chat — Ask questions about your recordings and get instant AI-powered answers.
How to use AI Chat:
- Open a recording that has been transcribed
- Click the "Chat" tab in the detail view
- Type your question (e.g., "What were the main decisions made?")
- The AI will analyze the transcript and provide an answer
Example questions:
- "What action items were discussed?"
- "Summarize what John said about the budget"
- "What were the concerns raised about the timeline?"
- "List all the names mentioned in this meeting"
What languages are supported? Does SoundMindAI translate?
Multi-Language Support + Auto-Translation — Transcribe in 99 languages with automatic detection, and get English translations automatically.
Supported Languages (with OpenAI Whisper):
English, Spanish, Chinese, French, German, Japanese, Korean, Portuguese, Russian, Arabic, Hindi, Italian, Dutch, Polish, Vietnamese, Thai, Turkish, Indonesian, and 40+ more languages. Whisper automatically detects the spoken language — no configuration needed.
AssemblyAI: Also supports automatic language detection when enabled.
Apple Speech (Native): Uses your Mac's system language setting. Does not auto-detect.
Automatic English Translation:
- After transcription, if the detected language is not English, translation is triggered automatically
- Your configured AI provider (OpenAI, Anthropic, Gemini, etc.) translates each segment
- The English translation appears below each original segment in italics
- Both original and translated text are searchable
Requirements for translation:
- Use Whisper or AssemblyAI for transcription (for language detection)
- Have an AI provider configured (OpenAI, Anthropic, Gemini, OpenRouter, or HuggingFace)
How do I control automatic translation in Settings?
SoundMindAI has a Languages section in Settings where you can control auto-translation behavior.
Auto-Translate to English setting:
- Location: Settings > Languages section
- Default: On (enabled)
- When ON: Non-English transcripts are automatically translated to English after transcription
- When OFF: Transcription only — no automatic translation is performed
Requirements:
- An AI provider must be configured (OpenAI, Anthropic, Gemini, OpenRouter, or HuggingFace)
- The toggle is disabled if no AI provider is set up
Manual translation:
- Even with auto-translate OFF, you can still manually translate individual recordings
- Open the recording, click Reprocess, and select "Translate to English"
- This option appears for recordings where non-English language was detected
Can I edit transcripts?
Transcript Editing — Edit text, split or merge segments, and manage speaker names. Full control over your transcripts.
- Edit text: Click on any transcript segment to edit the text directly
- Split segments: Break a long segment into two separate parts
- Merge segments: Combine adjacent segments into one
- Manage speakers: Assign or change the speaker for any segment
- Auto-save: All changes are saved automatically
Original timestamps are preserved when editing text, and adjusted appropriately when splitting segments.
What is speaker diarization?
Speaker Diarization — Know who said what with automatic speaker identification.
Speaker diarization automatically identifies and labels different speakers in a recording (e.g., "Speaker 1", "Speaker 2").
How to enable it:
- Set up an AssemblyAI API key in Settings
- Select AssemblyAI as your transcription provider
- Transcribe your recording — speakers are automatically detected
- Rename generic labels (Speaker 1, Speaker 2) to actual names
Speaker management:
- Click the person icon next to any segment to change the speaker
- Click the pencil icon to rename a speaker (updates everywhere)
- Choose custom colors to easily distinguish speakers
- Filter transcript by speaker to see only their segments
Note: Speaker diarization is only available with AssemblyAI. Apple Speech and OpenAI Whisper do not support automatic speaker identification.
How do I bookmark important parts of a transcript?
Bookmarks let you mark and quickly find important moments in your transcripts.
Adding bookmarks:
- Click the bookmark icon next to any transcript segment
- The icon fills in to indicate it's bookmarked
- Click again to remove the bookmark
Filtering by bookmarks:
- Click the bookmark filter button in the transcript toolbar
- Only bookmarked segments will be shown
- Click again to show all segments
What is AI Split and how does it work?
AI Split intelligently breaks large transcript blocks into smaller, readable chunks using AI analysis.
Why use AI Split?
- Some transcription services return one giant wall of text
- Long paragraphs are hard to read and navigate
- AI Split adds logical breaks for sentences, paragraphs, or speaker changes
How to use AI Split:
- Open a recording with a transcript
- Click the "AI Split" button in the transcript toolbar
- The AI will analyze the content and add appropriate breaks
- Your transcript becomes easier to read and skim
What is AI Help Chat?
AI Help Chat — Get instant answers about SoundMindAI features right in the app. Ask questions and the AI assistant explains how to use any feature.
How to access AI Help Chat:
- Click "AI Help" in the left sidebar
- Type your question about any SoundMindAI feature
- The AI assistant will provide helpful guidance
Example questions:
- "How do I set up transcription?"
- "What AI providers are supported?"
- "How do I export my recordings?"
- "What keyboard shortcuts are available?"
Features:
- Instant answers: No need to search through documentation
- Contextual help: Answers are specific to SoundMindAI features
- Works offline: Uses bundled FAQ data when internet is unavailable
- Conversation history: Ask follow-up questions in the same chat
🎤 Recording
How do I start a recording?
Starting a recording is simple:
Click the Record Button
Click the large red record button on the main screen, or use the keyboard shortcut
Select Audio Sources
Choose to record system audio, microphone, or both
Start Recording
The timer will start and you'll see a visual indicator that recording is active
Stop When Done
Click the stop button or use the keyboard shortcut to end the recording
What is the Mini Recorder view?
Mini Recorder — A compact, floating window that stays on top while you work, perfect for long recording sessions.
How to use Mini Recorder:
- Start a recording from the main window
- Click the "Mini" button in the recording controls
- The main window hides and a small floating panel appears
- The mini panel shows recording time, controls, and audio levels
- Click "Restore" to return to the full window
Mini Recorder features:
- Always on top: Stays visible above other windows (toggle on/off)
- Compact design: Minimal footprint on your screen
- Full controls: Pause, resume, mute mic, and stop recording
- Audio level indicator: Visual feedback of recording levels
- Draggable: Position anywhere on your screen
How does auto-stop after silence work?
Auto-stop After Silence — Automatically stops recording when no audio is detected for a configurable duration. Perfect for hands-free recording sessions.
How it works:
- During recording, SoundMindAI monitors audio levels
- When audio drops below a threshold (very quiet/silence), a timer starts
- If silence continues for the configured duration, recording stops automatically
- Any sound resets the silence timer
How to configure:
- Open Settings (⌘,) or click the gear icon
- Scroll to the Audio section
- Toggle "Auto-stop recordings after silence" on or off
- Set your desired duration and unit (seconds, minutes, or hours)
Use cases:
- Recording a meeting that might end early — the recording stops itself
- Capturing voice memos without needing to manually stop
- Recording background audio where you want to stop after extended silence
Settings:
- Duration: Set from 15 seconds up to hours
- Units: Choose seconds, minutes, or hours
- Default: Enabled with 5 minutes
Can I record system audio from specific applications?
SoundMindAI captures all system audio output. To record specific applications:
- Mute applications you don't want to record
- Use the application's own audio settings to control volume
- Consider using a virtual audio device for more control
Future versions may include per-application audio selection.
How do I import existing audio files?
You can import audio files for transcription:
- Drag and drop audio files onto the SoundMindAI window
- Use File > Import Audio or the keyboard shortcut
- Select files from the file browser
Supported formats: M4A, MP3, WAV, CAF, AIFF, MP4, MOV
Where are my recordings stored?
By default, recordings are stored in:
~/Library/Application Support/SoundMindAI/Recordings/
You can change the storage location in Settings > Storage. All recordings remain on your Mac - nothing is uploaded to the cloud.
📝 Transcription
What transcription options are available?
SoundMindAI offers multiple transcription options:
| Service | Cost | Speed | Accuracy |
|---|---|---|---|
| Apple Speech | Free (built-in) | Fast | Good |
| OpenAI Whisper | ~$0.006/min | Fast | Excellent |
| AssemblyAI | ~$0.003/min | Fast | Excellent |
Apple Speech requires no setup and works offline. BYOK services offer higher accuracy but require API keys.
What are the limitations of Apple Speech (native) transcription?
Apple's built-in Speech Recognition has the following limitations:
| Mode | Duration Limit | Notes |
|---|---|---|
| Server-based | 1 minute | Hard limit, recognition stops after 60 seconds |
| On-device | ~5 minutes recommended | No hard limit, but becomes unreliable for longer recordings |
| Request limit | 1000/hour per device | Shared across all apps on the device |
SoundMindAI enforces a 5-minute limit for native transcription to ensure reliability. For longer recordings, use OpenAI Whisper or AssemblyAI.
How do I view a recording's transcript and summary?
There are three ways to open the detail view for any recording:
- Double-click on the recording in your library
- Select a recording and click the "Transcript & Summary" button
- Click the document icon next to the recording
The detail view shows tabs for Summary, Transcript, Notes, Source, Tags, and AI Chat. You can navigate between tabs to access different aspects of your recording.
How do I transcribe a recording?
After recording or importing audio:
- Select the recording from your library
- Click "Transcribe" or use the keyboard shortcut
- Choose your transcription service (Apple Speech or BYOK)
- Wait for transcription to complete
Transcripts appear in the detail view with timestamps for easy navigation.
Can I edit transcripts?
Yes! SoundMindAI provides comprehensive transcript editing:
- Edit text: Click on any transcript segment to edit the text directly
- Split segments: Break a long segment into two separate parts at a specific point
- Merge segments: Combine adjacent segments into one
- Manage speakers: Assign or change the speaker for any segment
- Auto-save: All changes are saved automatically
Original timestamps are preserved when editing text, and adjusted appropriately when splitting segments.
How do I manage speakers in transcripts?
SoundMindAI makes it easy to identify and organize speakers:
Changing a speaker:
- Click the person icon next to any transcript segment
- Select from existing speakers or create a new one
- The change applies to that segment only
Editing speakers (rename & color):
- Click the pencil icon next to a speaker name
- Change the name (e.g., "Speaker 1" → "John") - updates everywhere that speaker appears
- Choose a custom color from 8 options to easily distinguish speakers
Filtering by speaker:
- Click the speaker filter dropdown (person icon) in the transcript toolbar
- Select a specific speaker to show only their segments
- Select "All Speakers" to show the full transcript again
What is speaker diarization?
Speaker diarization automatically identifies and labels different speakers in a recording (e.g., "Speaker 1", "Speaker 2").
How to enable it:
- Set up an AssemblyAI API key in Settings
- Select AssemblyAI as your transcription provider
- Transcribe your recording - speakers are automatically detected
- Rename generic labels (Speaker 1, Speaker 2) to actual names
Note: Speaker diarization is only available with AssemblyAI. Apple Speech and OpenAI Whisper do not support automatic speaker identification.
How do I bookmark important parts of a transcript?
Bookmarks let you mark and quickly find important moments in your transcripts:
Adding bookmarks:
- Click the bookmark icon next to any transcript segment
- The icon fills in to indicate it's bookmarked
- Click again to remove the bookmark
Filtering by bookmarks:
- Click the bookmark filter button in the transcript toolbar
- Only bookmarked segments will be shown
- Click again to show all segments
Bookmarks are saved with your recording and persist across sessions.
🤖 AI Summarization
What AI summaries does SoundMindAI generate?
SoundMindAI uses AI to extract meaningful insights from your transcripts:
- Summary: Concise overview of the content
- Key Points: Important highlights and takeaways
- Action Items: Tasks and follow-ups identified from the conversation
Each item includes timestamps so you can jump to the relevant part of the recording.
Which AI providers are supported?
SoundMindAI supports multiple AI providers for summarization:
- OpenAI - GPT-4o, GPT-4, GPT-3.5-turbo
- Anthropic - Claude 3.5 Sonnet, Claude 3 Opus
- Google - Gemini 1.5 Pro, Gemini 1.5 Flash
- HuggingFace - Various open-source models
- OpenRouter - Access multiple providers through one API
Each provider has different strengths - experiment to find what works best for your content.
How much does AI summarization cost?
Costs vary by provider and model. Typical costs for summarizing a 1-hour transcript:
- GPT-4o: ~$0.05-0.15
- GPT-3.5-turbo: ~$0.01-0.03
- Claude 3.5 Sonnet: ~$0.05-0.10
- Gemini 1.5 Flash: ~$0.01-0.02
Costs depend on transcript length. You pay directly to your chosen provider - SoundMindAI doesn't add any markup.
Why are there mistakes in my transcript or summary?
SoundMindAI uses third-party AI services for transcription and summarization. We pass your audio/text directly to your chosen provider and display their results.
Common causes of errors:
- Transcription errors: Background noise, accents, technical jargon, or multiple speakers can affect accuracy
- Summary inaccuracies: AI models may misinterpret context, omit details, or occasionally "hallucinate" information
- Model limitations: Different AI providers and models have varying accuracy levels
Important: SoundMindAI is not responsible for the accuracy of AI-generated transcripts or summaries. We simply pass your content to your chosen AI provider and display their output. Always review AI-generated content for accuracy, especially for important meetings or sensitive information.
What is AI Chat and how do I use it?
AI Chat lets you ask questions about your recordings and get instant answers based on the transcript content.
How to use AI Chat:
- Open a recording that has been transcribed
- Click the "Chat" tab in the detail view
- Type your question (e.g., "What were the main decisions made?")
- The AI will analyze the transcript and provide an answer
Example questions:
- "What action items were discussed?"
- "Summarize what John said about the budget"
- "What were the concerns raised about the timeline?"
- "List all the names mentioned in this meeting"
🔑 BYOK (Bring Your Own Keys) Explained
What does BYOK mean?
BYOK stands for "Bring Your Own Keys." Instead of us charging you for AI services, you get API keys directly from the service providers (like OpenAI or Anthropic) and enter them in SoundMindAI.
Here's how it works:
- You create an account with an AI provider (e.g., OpenAI)
- You add payment and get an API key from them
- You enter that API key in SoundMindAI's settings
- SoundMindAI uses your key to access the AI service
- You pay the provider directly based on your usage
Why does SoundMindAI use BYOK instead of including AI?
BYOK offers significant advantages for you:
- Lower cost: Pay wholesale rates directly to providers instead of marked-up prices
- Choice: Pick the AI provider and model that works best for you
- Privacy: Your data goes directly to the provider - we never see it
- Control: Set your own usage limits and budgets
- No subscription: Pay only for what you use, when you use it
Are my API keys secure?
Yes, your API keys are stored securely:
- Keys are stored in macOS Keychain, the same place your passwords are stored
- Keys are encrypted using macOS system-level encryption
- Keys never leave your Mac (except when making API calls to providers)
- We cannot access, view, or retrieve your keys
How much do AI services cost? (Estimated Pricing)
Below are estimated costs for supported AI providers. Actual costs depend on your usage and current provider pricing.
Transcription Services
| Provider | Cost per Minute | 1-Hour Recording |
|---|---|---|
| Apple Speech | Free | $0.00 |
| OpenAI Whisper | ~$0.006 | ~$0.36 |
| AssemblyAI | ~$0.003 | ~$0.18 |
AI Summarization Services
Estimated cost to summarize a 1-hour transcript (~12,000 words)
| Provider | Model | Est. Cost |
|---|---|---|
| OpenAI | GPT-4o | ~$0.05 - $0.15 |
| OpenAI | GPT-4o-mini | ~$0.01 - $0.03 |
| Anthropic | Claude 3.5 Sonnet | ~$0.04 - $0.10 |
| Gemini 1.5 Flash | ~$0.01 - $0.02 | |
| Gemini 1.5 Pro | ~$0.02 - $0.06 |
Example: Total Cost for 1-Hour Meeting
| Setup | Transcription | Summary | Total |
|---|---|---|---|
| Budget | Apple Speech (free) | Gemini Flash (~$0.01) | ~$0.01 |
| Balanced | Whisper (~$0.36) | GPT-4o-mini (~$0.02) | ~$0.38 |
| Premium | AssemblyAI (~$0.18) | GPT-4o (~$0.10) | ~$0.28 |
⚡ Quick Start Setup Guide
How do I set up SoundMindAI for best results?
SoundMindAI uses a BYOK (Bring Your Own Keys) model, which means you connect your own AI service accounts. This gives you lower costs, more privacy, and full control. Here's how to get the best experience:
What You'll Set Up
Options: Native macOS (free) or OpenAI Whisper (more accurate) or AssemblyAI
Options: OpenAI GPT, Anthropic Claude, Google Gemini, and more
Recommended setup for best results (5 minutes)
For the best transcription accuracy and AI summaries, we recommend OpenAI - one account gives you both Whisper (transcription) and GPT (summarization).
Step 1: Create OpenAI Account
Go to platform.openai.com/signup and create a free account.
Step 2: Add Payment Method
Go to Settings > Billing and add a credit card. You only pay for what you use (typically $0.01-0.05 per recording).
Step 3: Generate API Key
Go to API Keys section, click "Create new secret key", name it "SoundMindAI", and copy it immediately (you won't see it again).
Step 4: Configure SoundMindAI
In SoundMindAI, go to Tools > Settings:
- Under Transcription, select "Whisper API"
- Under AI Summarization, select "OpenAI"
- Paste your API key when prompted
Step 5: Test Your Setup
Make a short test recording (30 seconds), then check:
- The status indicators in the sidebar should show green
- After recording, transcription should appear
- AI summary and action items should generate
Alternative: Free setup with native transcription
If you prefer not to set up API keys, SoundMindAI works out of the box with native macOS transcription:
- Transcription: Uses Apple's built-in Speech Recognition (free, on-device)
- AI Summarization: Not available without an API key
Limitations of native transcription:
- Limited to recordings under 5 minutes
- Less accurate than Whisper for technical content
- No AI summaries or action items
This is a good option for quick tests, but for regular use we recommend setting up OpenAI (or AssemblyAI for speaker diarization) for the full experience.
Understanding the status indicators
The sidebar shows the status of your transcription and AI providers:
Click on either indicator to quickly change providers without going to Settings.
Troubleshooting setup issues
"Invalid API key" error:
- Make sure you copied the entire key (no extra spaces)
- Check that you have billing set up with the provider
- Try generating a new key if the old one isn't working
"Insufficient quota" or "Rate limit" error:
- Add credits or a payment method to your provider account
- Check your usage limits in the provider's dashboard
Transcription not working:
- Verify the API key is entered correctly in Settings
- Check that the correct transcription provider is selected
- Try the API Diagnostics tool in Settings to test connectivity
Still having issues? See the detailed setup guides below for your specific provider, or check the Troubleshooting section.
🛠 API Setup Guides
How to set up OpenAI (Whisper + GPT)
OpenAI provides both Whisper (transcription) and GPT (summarization):
Create an OpenAI Account
Go to platform.openai.com/signup and create an account
Add Payment Method
Go to Settings > Billing and add a credit card. New accounts may receive free credits.
Generate API Key
Go to API Keys section, click "Create new secret key", and copy it immediately (you won't see it again)
Enter Key in SoundMindAI
Open SoundMindAI Settings > API Keys > OpenAI and paste your key
Having issues? Visit OpenAI Help Center or OpenAI Documentation.
How to set up Anthropic (Claude)
Anthropic provides Claude for AI summarization:
Create an Anthropic Account
Go to console.anthropic.com and sign up
Add Payment Method
Navigate to Billing and add a payment method
Generate API Key
Go to API Keys, create a new key, and copy it
Enter Key in SoundMindAI
Open SoundMindAI Settings > API Keys > Anthropic and paste your key
Having issues? Visit Anthropic Documentation or Anthropic Support.
How to set up Google (Gemini)
Google provides Gemini for AI summarization:
Go to Google AI Studio
Visit aistudio.google.com and sign in with your Google account
Get API Key
Click "Get API Key" in the left sidebar, then "Create API key"
Copy Your Key
Copy the generated API key
Enter Key in SoundMindAI
Open SoundMindAI Settings > API Keys > Google and paste your key
Having issues? Visit Google AI Documentation or Google AI Studio.
How to set up AssemblyAI (Transcription)
AssemblyAI provides high-quality transcription:
Create AssemblyAI Account
Go to assemblyai.com and sign up
Get Your API Key
Your API key is shown on your dashboard immediately after signing up
Add Credits
Go to Billing to add credits or set up a payment method
Enter Key in SoundMindAI
Open SoundMindAI Settings > API Keys > AssemblyAI and paste your key
Having issues? Visit AssemblyAI Documentation or AssemblyAI Support.
How to set up OpenRouter (Multiple Providers)
OpenRouter lets you access multiple AI providers with one API key:
Create OpenRouter Account
Go to openrouter.ai and sign up
Add Credits
Add credits to your account in the Billing section
Get API Key
Go to Keys section and create a new API key
Enter Key in SoundMindAI
Open SoundMindAI Settings > API Keys > OpenRouter and paste your key
Having issues? Visit OpenRouter Documentation or OpenRouter Discord.
📁 Organization & Management
How do I organize my recordings?
SoundMindAI provides several ways to organize your recordings:
- Tags: Add custom tags with colors to categorize recordings
- Folders: Create folders to group related recordings
- Search: Search by title, transcript content, or tags
- Sort: Sort by date, duration, or name
- Filter: Filter by tags, date range, or transcription status
How do I add tags to recordings?
To add tags:
- Select a recording from your library
- Click the Tags section in the detail view
- Type a tag name and press Enter, or select from existing tags
- Click the color dot to change tag color
You can also batch-tag multiple recordings by selecting them first.
Can I select multiple recordings at once?
Yes! SoundMindAI supports multi-select for batch operations:
How to multi-select:
- Command + Click: Add individual recordings to your selection
- Shift + Click: Select a range of recordings
- Command + A: Select all recordings
Batch operations available:
- Delete multiple recordings at once
- Add tags to multiple recordings
- Export multiple recordings
Can I export my recordings and transcripts?
Yes! SoundMindAI offers flexible export options:
Export formats:
- Plain Text (.txt): Simple text format for any application
- Markdown (.md): Formatted text with headers and styling
- HTML (.html): Web-ready format with styling
- JSON (.json): Structured data format for developers
- SRT (.srt): Subtitle format for video editing
What you can export:
- Full transcript with timestamps and speakers
- AI-generated summary, key points, and action items
- Original audio file (M4A)
Access export via File > Export, the toolbar button, or right-click on a recording.
How do I delete recordings?
To delete recordings:
- Select the recording(s) you want to delete
- Press Delete or right-click and select "Delete"
- Confirm deletion in the dialog
▶️ Playback
How do I play back recordings?
Select a recording and use the built-in player:
- Play/Pause: Space bar or play button
- Seek: Click anywhere on the timeline
- Skip: Arrow keys for 5-second jumps
- Speed: Adjust playback speed (0.5x to 2x)
- Volume: Use the volume slider
Can I jump to specific parts of a recording?
Yes! Click on any timestamped item to jump to that position:
- Click transcript segments to jump to that part
- Click key points to hear the relevant section
- Click action items to hear the context
Timestamps are clickable throughout the interface.
⚖️ Legal & Compliance
What are my legal obligations when using SoundMindAI?
By downloading, installing, or using SoundMindAI, you agree to the following:
- You will research and understand all recording consent laws applicable to your jurisdiction before making any recordings
- You will obtain all necessary permissions, consents, and disclosures required by law before recording any conversation
- You will comply with all applicable laws including but not limited to wiretapping laws, eavesdropping statutes, privacy regulations, workplace policies, and any other relevant regulations
- You assume full responsibility and liability for all recordings you make and how you use them
- You will not use the software for any illegal, unauthorized, or harmful purpose
What happens if I violate recording laws?
You bear all consequences. GrowTech Development, LLC and SoundMindAI have no liability whatsoever for your use of the software.
By using SoundMindAI, you agree to:
- Indemnify and hold harmless GrowTech Development, LLC, its officers, directors, employees, and affiliates from any and all claims, damages, losses, costs, and legal fees arising from your use of the software
- Defend us against any third-party claims resulting from your recordings or use of the software
- Accept that we have no obligation to provide legal assistance, advice, or support if you face legal action
This indemnification obligation survives termination of your license and continues indefinitely.
See our Terms of Service for complete indemnification terms.
Do you verify that my recordings are legal?
No. We have no ability or obligation to verify the legality of your recordings.
SoundMindAI is a local application that runs entirely on your Mac. We do not:
- Monitor what you record
- Review your recordings for legal compliance
- Verify that you have obtained required consents
- Have any knowledge of how you use the software
The entire burden of legal compliance rests with you. You must independently research applicable laws, obtain required consents, and ensure your recordings are lawful. We provide a tool; you are responsible for using it legally.
What are my obligations regarding AI-generated content?
You are solely responsible for verifying and using AI-generated content appropriately.
By using AI transcription and summarization features, you acknowledge and agree:
- AI-generated transcripts, summaries, and other content may contain errors, omissions, inaccuracies, or fabricated information
- You will review and verify all AI-generated content before relying on it
- You will not hold us liable for any errors in AI output or decisions made based on AI-generated content
- You assume all risk associated with using AI-generated content for any purpose
We make no representations or warranties regarding the accuracy, completeness, reliability, or fitness for any purpose of AI-generated content.
What are my obligations regarding third-party AI services?
You are solely responsible for your relationship with third-party AI providers.
When you use BYOK (Bring Your Own Keys) features, you agree:
- You will review and accept each provider's terms of service and privacy policy before using their services
- You are responsible for all charges, fees, and costs incurred with third-party providers
- You will resolve all disputes with third-party providers directly with them
- You acknowledge we have no control over third-party services' availability, pricing, data handling, or performance
- You release us from any liability related to third-party services
Your data is transmitted directly to the providers you choose. Review their privacy policies to understand how they handle your data.
Where are your binding legal terms?
By using SoundMindAI, you are bound by the following legal documents:
- Terms of Service — Includes license terms, acceptable use, disclaimers of warranties, limitation of liability, indemnification, binding arbitration, and class action waiver
- Privacy Policy — Describes data collection, storage, and your privacy rights
- Refund Policy — All sales are final; 7-day free trial provided for evaluation
These terms are legally binding. By downloading or using the software, you acknowledge that you have read, understood, and agree to be bound by these terms. If you do not agree, do not use the software.
Do you provide legal advice?
No. Nothing in this FAQ, on our website, or in our software constitutes legal advice.
All information is provided for general informational purposes only and should not be relied upon as legal guidance. Laws vary by jurisdiction and change over time.
If you are unclear about any legal requirements or have any doubts about the legality of recording in your situation, you must seek advice from a qualified attorney licensed in your jurisdiction before using the recording features.
GrowTech Development, LLC is a software company. We are not a law firm and do not provide legal services.
🔧 Troubleshooting
Recording isn't capturing system audio
If system audio isn't being captured:
- Check Screen Recording permission: System Settings > Privacy & Security > Screen Recording. Ensure SoundMindAI is enabled.
- Restart the app: Quit and relaunch SoundMindAI after granting permission.
- Check audio source: Ensure "System Audio" is selected in recording settings.
- Test with another app: Play audio from any app while recording to verify.
Why does my recording sound degraded or echo-y when the microphone is on?
This is not a bug in SoundMindAI — it's a fundamental physics issue with audio recording that affects all recording software.
What's happening
When you record with your microphone enabled and speakers playing audio:
- Your speakers play the system audio (e.g., a YouTube video, Zoom call)
- Your microphone picks up that same audio from the room
- Now you have two copies of the same audio: the clean digital original + a degraded room recording
- During playback, these two copies interfere with each other, causing phasing, echo, and a "hollow" or "underwater" sound
The solution: Use headphones
Wearing headphones while recording completely solves this issue:
- Headphones isolate audio — the sound goes directly to your ears, not into the room
- Your mic only captures your voice — clean, isolated input with no speaker bleed
- Result: Crystal-clear recordings with no phasing or echo
If you can't use headphones
- Mute your mic during recording if you don't need to capture your voice. SoundMindAI sets mic playback to 0% by default for clean system audio.
- Lower your speaker volume to reduce how much the mic picks up
- Position your mic away from speakers if using an external microphone
"Required permissions not granted" error (even when permissions are enabled)
After updating SoundMindAI, you may see a "Required permissions not granted" error even though the permission toggles appear enabled in System Settings. This happens because macOS caches permissions based on the app's code signature, which changes with each update.
Quick Fix
- Open System Settings > Privacy & Security > Screen & System Audio Recording
- Click the minus (-) button to remove SoundMindAI from the list entirely
- Quit SoundMindAI completely (Cmd+Q)
- Relaunch SoundMindAI - it will prompt for permission again
- Click Allow when prompted
Nuclear Reset (if Quick Fix doesn't work)
Open Terminal and run these commands to completely reset permissions:
tccutil reset ScreenCapture com.robertgrow.SoundMindAI
tccutil reset Microphone com.robertgrow.SoundMindAI
Then:
- Quit SoundMindAI completely
- Relaunch the app
- Grant permissions when prompted
Transcription is failing or stuck
If transcription isn't working:
- Recording too long (Apple Speech): Native transcription is limited to 5 minutes. For longer recordings, configure Whisper or AssemblyAI in Settings.
- Apple Speech: Ensure your Mac has an internet connection (required for some languages)
- BYOK services: Verify your API key is correct in Settings
- Check credits: Ensure you have available credits/balance with your AI provider
- File format: Ensure the audio file is in a supported format
API key errors
If you're seeing API key errors:
- Invalid key: Double-check you copied the entire key with no extra spaces
- Expired key: Some providers expire unused keys. Generate a new one.
- No credits: Check your provider account for available balance
- Rate limited: Wait a moment and try again
You can test your API keys in Settings by clicking the "Test" button next to each key.
How do I use API Diagnostics?
The Diagnose feature helps you test and troubleshoot your API configurations:
How to access:
- Click "Diagnose" in the sidebar
- Select the provider you want to test (Transcription or AI)
- Click "Run Test" to verify the connection
What diagnostics check:
- API key validity
- Network connectivity to the provider
- Account status and available credits
- Model availability
When to use diagnostics:
- After entering a new API key
- When transcription or summarization fails
- To verify your setup before an important recording
- After changing providers or models
How do I contact support?
For support, please email us at:
Please include:
- Description of the issue
- macOS version
- SoundMindAI version (from About menu)
- Steps to reproduce the problem
- Any error messages you see