Incept5/note-taker

NoteTaker

Privacy-first meeting transcription and summarization for macOS. Capture meeting audio, transcribe it on-device (live via Apple's SFSpeechRecognizer, with WhisperKit as a fallback), and generate structured summaries with a local LLM. No data ever leaves your machine.

Think Granola, but fully local. The privacy guarantee is architectural, not contractual.

How It Works

NoteTaker lives in your menu bar. Click to start recording — it captures system audio (Zoom, Teams, Meet, etc.) via ScreenCaptureKit and mixes in your microphone input, producing a single combined audio stream with all voices. Audio is transcribed in real-time during recording, with a live transcript displayed as you go. When you stop, it generates a structured summary with key points, decisions, action items, and open questions. Everything stays on your machine.

Record (system audio + mic mixed together)
    -> Live transcription (SFSpeech, on-device)
    -> On stop: finalize transcript
        -> Summarize locally (MLX or Ollama)
            -> Browse & copy results

Requirements

  • macOS 14.2+ (Sonoma) — required for ScreenCaptureKit audio capture
  • Apple Silicon (M1 minimum, M2 Pro+ recommended)
  • 16 GB RAM minimum (32 GB recommended for larger LLM models)

Installation

  1. Download the NoteTaker DMG from the latest release
  2. Open the DMG and drag NoteTaker to your Applications folder
  3. Launch NoteTaker from Applications — it appears in the menu bar (no Dock icon)

First Launch Permissions

macOS will ask for two required permissions; a third is optional:

Permission            Why
Microphone            Captures your voice during meetings
Screen Recording      Required by macOS for ScreenCaptureKit to capture system audio (no video is recorded)
Calendars (optional)  Identifies meeting participants from your calendar for richer summaries

Grant the required permissions, then restart NoteTaker if prompted. Calendar access is optional and only requested when needed.

Summarization Setup

NoteTaker supports two backends for local summarization:

MLX (Default — No Setup Required)

MLX runs models directly on Apple Silicon with no external dependencies. On first use, open Settings and download an MLX model — everything is managed within the app. No terminal commands, no servers to run.

Built-in Models
Model            Size     RAM Needed  Best For
Qwen3 30B MoE    ~17 GB   ~20 GB      Best summary quality. 30B parameters but only 3B active (mixture-of-experts), so it runs efficiently. Recommended if you have 32 GB RAM.
Qwen3 4B         ~2.3 GB  ~4 GB       Fast and capable. Recommended for 16 GB machines.
Llama 3.1 8B     ~4 GB    ~8 GB       Strong general-purpose summarization. Good middle ground.
Llama 3.2 3B     ~2 GB    ~4 GB       Lightweight and quick. Decent quality for short meetings.
Qwen 2.5 7B      ~4 GB    ~8 GB       High quality, similar tier to Llama 3.1 8B.
Mistral 7B v0.3  ~4 GB    ~8 GB       Solid general-purpose option.
Qwen 2.5 3B      ~2 GB    ~4 GB       Fast with low RAM usage.

Which model should I choose?

  • 32 GB RAM: Use Qwen3 30B MoE — it produces the best summaries by a clear margin, and the mixture-of-experts architecture keeps it fast despite the large parameter count.
  • 16 GB RAM: Use Qwen3 4B — excellent quality-to-size ratio, runs comfortably alongside a browser and video call.
  • Tight on RAM or want fastest results: Use Llama 3.2 3B or Qwen 2.5 3B — smallest footprint, still produces useful summaries.

You can also add any HuggingFace MLX model by pasting its ID in Settings — the built-in list is just a curated starting point.

Ollama (Alternative)

If you prefer Ollama, switch the backend to Ollama in Settings, then install and start the server:

brew install ollama
ollama serve

# Pull a model (in a separate terminal)
ollama pull qwen3-vl           # Recommended — excellent summarization quality

Ollama must be running (ollama serve) whenever you want to generate summaries. Transcription works without it.

Using a Remote Ollama Server

If you have a more powerful machine running Ollama on your network (e.g. a Mac Mini or Studio with more RAM for larger models), you can point NoteTaker at it:

  1. Open Settings (gear icon in the menu bar popover)
  2. Switch the summarization backend to Ollama
  3. Change the Server URL to your remote machine's address (e.g. http://192.168.1.50:11434)
  4. Click Connect — NoteTaker will check availability and list the models on that server

This lets you run larger models (70B+) on a dedicated machine while keeping NoteTaker lightweight on your laptop. Audio capture and transcription still run locally — only the summarization request is sent to the remote Ollama server.
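The availability check in step 4 maps onto Ollama's HTTP API: GET /api/tags on a server returns its installed models. A minimal Python sketch of parsing that response (the model names below are illustrative, not a real server's inventory):

```python
import json

# Example body in the shape returned by Ollama's GET /api/tags endpoint,
# which lists the models installed on a server. Model names are illustrative.
sample_tags_response = json.dumps({
    "models": [
        {"name": "qwen3-vl:latest", "size": 6_000_000_000},
        {"name": "llama3.1:70b", "size": 40_000_000_000},
    ]
})

def list_remote_models(raw_json: str) -> list:
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in json.loads(raw_json).get("models", [])]

print(list_remote_models(sample_tags_response))
# -> ['qwen3-vl:latest', 'llama3.1:70b']
```

NoteTaker performs the equivalent check when you click Connect, then offers the returned names in its model picker.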

Usage

Auto-Record for Zoom & Teams

NoteTaker can automatically start recording when Zoom or Microsoft Teams launches, and stop when the meeting ends — no manual intervention needed.

  1. Open Settings (gear icon) → Audio Capture
  2. Enable "Auto-record when meeting starts"

When a monitored app launches, recording starts automatically. When the meeting ends (detected by 30 seconds of sustained audio silence), recording stops and the transcription/summarization pipeline kicks in. If the meeting app quits entirely, recording stops immediately.

Auto-stop only applies to auto-started recordings — manually started recordings are never stopped automatically.
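The rules above reduce to a small decision: stop immediately if the meeting app quit, stop after 30 seconds of sustained silence, and never auto-stop a manually started recording. A sketch of that decision in Python (the function and its inputs are illustrative, not NoteTaker's actual code):

```python
SILENCE_TIMEOUT = 30.0  # seconds of sustained silence before auto-stop

def should_auto_stop(auto_started: bool, silence_duration: float,
                     app_running: bool) -> bool:
    """Decide whether an in-progress recording should stop automatically."""
    if not auto_started:
        return False  # manual recordings are never stopped automatically
    if not app_running:
        return True   # the meeting app quit entirely -> stop immediately
    return silence_duration >= SILENCE_TIMEOUT  # sustained silence -> stop

print(should_auto_stop(True, 31.0, True))    # -> True
print(should_auto_stop(False, 120.0, True))  # -> False
```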

Recording Retention

Audio recordings are compressed using AAC (M4A format), keeping file sizes small (~50-80 MB for a 58-minute meeting). Old recordings are automatically deleted on launch based on a configurable retention period.

  1. Open Settings (gear icon) → Audio Capture
  2. Choose a retention period: 7, 14, 28, 60, or 90 days

The default is 28 days. Changes take effect on the next app launch.
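The launch-time cleanup amounts to deleting recordings whose modification date falls outside the retention window. A rough Python sketch, demonstrated on a temporary directory (file names and ages are illustrative):

```python
import os
import pathlib
import tempfile
import time

def purge_old_recordings(folder: str, retention_days: int = 28) -> list:
    """Delete .m4a files older than the retention period; return deleted names."""
    cutoff = time.time() - retention_days * 86_400
    deleted = []
    for path in pathlib.Path(folder).glob("*.m4a"):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path.name)
    return sorted(deleted)

# Demo: one fresh file, one backdated 40 days (past the 28-day default).
with tempfile.TemporaryDirectory() as d:
    fresh = pathlib.Path(d, "fresh.m4a")
    stale = pathlib.Path(d, "stale.m4a")
    fresh.touch()
    stale.touch()
    old = time.time() - 40 * 86_400
    os.utime(stale, (old, old))
    deleted_names = purge_old_recordings(d)
    survivors = sorted(p.name for p in pathlib.Path(d).glob("*.m4a"))

print(deleted_names)  # -> ['stale.m4a']
print(survivors)      # -> ['fresh.m4a']
```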

Microphone Settings

By default, microphone capture is enabled and uses the system default input device. Your mic audio is mixed into the system audio stream so all voices appear in the transcript.

Open Settings (gear icon) to:

  • Toggle microphone capture on or off
  • Choose a specific microphone — if you have an external USB mic or audio interface, select it from the device list

Your selection is remembered across restarts, and the device list updates automatically when you plug in or disconnect hardware.

Google Calendar Integration

NoteTaker can connect to your Google Calendar to automatically identify meeting participants when you start recording. Participant names are included in the summary, helping the LLM attribute action items and contributions to specific people.

  1. Open Settings (gear icon) → Google Calendar
  2. Click "Sign in with Google"
  3. Authorize in your browser — you'll be redirected back to NoteTaker

NoteTaker requests read-only access to calendar events. It checks for events around the time you start recording, matches the current meeting, and pulls in the participant list. Tokens are stored in your macOS Keychain.

NoteTaker tries Apple Calendar (EventKit) first — if your Google Calendar is already synced via macOS System Settings, it works without signing in. The Google Calendar API is a fallback for when EventKit has no events.
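Matching "events around the time you start recording" can be sketched as a time-window check over the day's events (the tolerance and event fields below are illustrative, not NoteTaker's actual values):

```python
from datetime import datetime, timedelta

def find_current_meeting(events, now, tolerance_min=10):
    """Return (title, attendees) for the first event whose span,
    padded by the tolerance, contains `now`; None if nothing matches."""
    pad = timedelta(minutes=tolerance_min)
    for title, start, end, attendees in events:
        if start - pad <= now <= end + pad:
            return title, attendees
    return None

events = [
    ("Weekly sync", datetime(2024, 5, 1, 10, 0),
     datetime(2024, 5, 1, 10, 30), ["Alice", "Bob"]),
    ("1:1", datetime(2024, 5, 1, 14, 0),
     datetime(2024, 5, 1, 14, 30), ["Carol"]),
]

match = find_current_meeting(events, datetime(2024, 5, 1, 10, 5))
print(match)  # -> ('Weekly sync', ['Alice', 'Bob'])
```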

Recording a Meeting

  1. Click the NoteTaker icon in your menu bar
  2. Click Start Recording
  3. An audio level meter shows the combined audio stream
  4. A live transcript appears as audio is transcribed in real-time
  5. Click Stop Recording when done

Transcription

Audio is transcribed in real-time during recording using Apple's SFSpeechRecognizer — you see text appear as people speak. When you stop, the live transcript is used directly, making post-recording processing instant. If SFSpeech was unavailable (e.g. permission denied), NoteTaker falls back to batch-transcribing from the audio file using WhisperKit. The first WhisperKit run downloads the Whisper model (~1.5 GB).
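The fallback decision reduces to: use the live transcript if SFSpeech produced one, otherwise batch-transcribe the recorded file. A sketch (function names are illustrative; a stub stands in for WhisperKit):

```python
def final_transcript(live_text, audio_path, batch_transcribe):
    """Prefer the live SFSpeech transcript; fall back to batch transcription."""
    if live_text:
        return live_text  # live transcription succeeded -> instant result
    return batch_transcribe(audio_path)  # e.g. WhisperKit over the audio file

def fake_whisper(path):
    """Stub standing in for a WhisperKit batch transcription pass."""
    return f"(batch transcript of {path})"

print(final_transcript("hello world", "meeting.m4a", fake_whisper))
# -> hello world
print(final_transcript(None, "meeting.m4a", fake_whisper))
```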

Summarization

If a summarization model is selected and available, summarization starts automatically after transcription. The summary includes:

  • Key Points — important topics discussed
  • Decisions — what was agreed on
  • Action Items — tasks with owners where identifiable
  • Open Questions — unresolved topics
  • Full Summary — detailed narrative overview with paragraph breaks
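These sections arrive from the model as distinct JSON fields rather than free text (see Key Technical Decisions). The field names and contents below are illustrative, not NoteTaker's actual schema:

```python
import json

# Illustrative structured-summary payload; field names are invented
# for this example and may differ from the app's real schema.
raw = json.dumps({
    "key_points": ["Q3 launch slips to October"],
    "decisions": ["Ship the beta to internal users first"],
    "action_items": [{"owner": "Alice", "task": "Draft the rollout plan"}],
    "open_questions": ["Who owns localization?"],
    "summary": "The team reviewed the Q3 timeline and agreed on a phased rollout.",
})

summary = json.loads(raw)
print(summary["action_items"][0]["owner"])  # -> Alice
```

Parsing distinct fields lets the UI render each section separately instead of trying to split unstructured prose.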

Re-summarize with Different Models

Want to compare how different models summarize your meeting? Open any meeting from History and click the "Re-summarize" button in the summary panel header. It uses whichever model is currently selected in Settings, so you can switch models and re-summarize the same transcript to compare quality.

Custom System Prompt

You can customize the instructions sent to the LLM when summarizing meetings:

  1. Open Settings (gear icon) → Summarization
  2. Click "Edit Prompt"
  3. Modify the prompt in the editor and click Save

Use placeholders to inject dynamic values:

  • {{duration}} — meeting duration in minutes
  • {{context}} — the detected meeting app (e.g. "from Zoom")
  • {{participants}} — participant names from your calendar

Click "Reset to Default" to restore the built-in prompt at any time. Your custom prompt persists across sessions.
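Placeholder expansion amounts to straightforward string substitution before the prompt is sent to the model. A minimal sketch (the template text here is invented, not the built-in prompt):

```python
def render_prompt(template: str, values: dict) -> str:
    """Replace {{name}} placeholders with their values."""
    for key, val in values.items():
        template = template.replace("{{" + key + "}}", val)
    return template

template = ("Summarize this {{duration}}-minute meeting {{context}} "
            "with {{participants}}.")
rendered = render_prompt(template, {
    "duration": "45",
    "context": "from Zoom",
    "participants": "Alice, Bob",
})
print(rendered)
# -> Summarize this 45-minute meeting from Zoom with Alice, Bob.
```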

Copying Results

Both the summary (as markdown) and raw transcript have copy buttons. Paste into your notes app, email, or document of choice.

Meeting History

All sessions are saved automatically to a local SQLite database. Click the clock icon in the popover to open the history window — a dedicated resizable window where you can browse past meetings and click to drill into a detail view with side-by-side summary and transcript, each with copy buttons.

Architecture

Menu Bar UI (SwiftUI)
    |
AppState (Phase-driven state machine)
    |
    +-- AudioCaptureService
    |     +-- SystemAudioCapture (ScreenCaptureKit + mic mixing)
    |
    +-- AudioDeviceManager (input device enumeration)
    |
    +-- TranscriptionService (WhisperKit batch fallback)
    |     +-- SpeechStreamingTranscriber (SFSpeech live during recording)
    |
    +-- SummarizationService (MLX or Ollama)
    |
    +-- MeetingStore (SQLite via GRDB)
    |
    +-- CalendarService (EventKit + Google Calendar API fallback)
    |     +-- GoogleCalendarAuthService (OAuth 2.0 + PKCE)
    |     +-- GoogleCalendarClient (Calendar Events API)
    |
    +-- MeetingAppMonitor (NSWorkspace launch/terminate detection)

Audio capture uses ScreenCaptureKit for system audio (all apps) with microphone input mixed in via AVAudioEngine. Mic samples are captured into a thread-safe ring buffer and added to the system audio stream in the SCStream callback, producing a single AAC-compressed M4A file in ~/Library/Application Support/NoteTaker/recordings/. AAC compression reduces file sizes by ~15-20x compared to uncompressed WAV (~50-80 MB vs ~1 GB for a 58-minute meeting), and old recordings are automatically cleaned up based on a configurable retention period (default 28 days).

During recording, audio buffers are forwarded to SpeechStreamingTranscriber, which wraps Apple's SFSpeechRecognizer for near-instant live transcription. Text is accumulated across SFSpeech session resets, producing a continuous transcript.
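The mic-mixing step can be pictured as a ring buffer that the microphone tap fills and the SCStream callback drains, summing mic samples into each system-audio buffer. A simplified, single-threaded Python illustration (the real implementation is thread-safe Swift operating on PCM buffers; clipping and resampling are ignored here):

```python
from collections import deque

class MicRingBuffer:
    """Simplified ring buffer: the mic tap appends, the capture callback drains."""
    def __init__(self, capacity: int):
        self.samples = deque(maxlen=capacity)  # oldest samples drop when full

    def write(self, chunk):
        self.samples.extend(chunk)

    def read(self, n):
        # Pad with silence (0.0) if the mic is behind the system stream.
        return [self.samples.popleft() if self.samples else 0.0
                for _ in range(n)]

def mix_into(system_chunk, mic_buffer):
    """Sum mic samples into a system-audio chunk of the same length."""
    mic = mic_buffer.read(len(system_chunk))
    return [s + m for s, m in zip(system_chunk, mic)]

buf = MicRingBuffer(capacity=8)
buf.write([0.25, 0.25, 0.25])             # three mic samples available
mixed = mix_into([0.5, 0.5, 0.5, 0.5], buf)
print(mixed)  # -> [0.75, 0.75, 0.75, 0.5]
```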

State management is driven by a single AppState class with a Phase enum: idle -> recording (with live transcript segments) -> stopped -> transcribing -> transcribed -> summarizing -> summarized. Each phase transition drives the UI.
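That progression can be sketched as an ordered enum with a forward-only transition check (the phase names mirror the list above; the single-step rule is a simplification of the real state machine):

```python
from enum import Enum

class Phase(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    STOPPED = "stopped"
    TRANSCRIBING = "transcribing"
    TRANSCRIBED = "transcribed"
    SUMMARIZING = "summarizing"
    SUMMARIZED = "summarized"

ORDER = list(Phase)  # declaration order matches the pipeline order

def can_advance(current: Phase, nxt: Phase) -> bool:
    """Allow only the single forward step shown in the pipeline."""
    i = ORDER.index(current)
    return i + 1 < len(ORDER) and ORDER[i + 1] is nxt

print(can_advance(Phase.RECORDING, Phase.STOPPED))     # -> True
print(can_advance(Phase.RECORDING, Phase.SUMMARIZED))  # -> False
```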

Storage uses SQLite (via GRDB.swift) for meeting metadata, transcripts, and summaries. Audio files are stored on the filesystem.

Key Technical Decisions

  • ScreenCaptureKit for driver-free system audio capture — no kernel extensions needed, works reliably across all output devices including Bluetooth
  • Mixed audio stream — mic input is mixed into the system audio stream in real-time via a ring buffer, producing a single combined recording with all voices
  • Live transcription — audio is transcribed in real-time during recording using Apple's SFSpeechRecognizer (on-device), with text appearing as people speak
  • WhisperKit as fallback transcription — MLX-optimized for Apple Silicon, runs entirely on-device when SFSpeech is unavailable
  • MLX for summarization (default) — runs local LLMs directly on Apple Silicon with no external dependencies. Ollama also supported as an alternative, configurable to use a remote server for access to larger models
  • SQLite over Core Data — lighter weight, simpler, no ORM overhead
  • Structured summary output — the summarizer (MLX or Ollama) is prompted to return JSON with distinct fields, not unstructured text
  • No sandbox — required for system audio capture to function

For Developers

Prerequisites

  • Xcode 15+
  • XcodeGen (brew install xcodegen)
  • Ollama running locally (optional — only needed to test the Ollama backend)

Building from Source

git clone <repo-url>
cd note-taker

# Generate the Xcode project
xcodegen generate

# Build
xcodebuild -project NoteTaker.xcodeproj -scheme NoteTaker build

Or open NoteTaker.xcodeproj in Xcode and build from there.

Project Structure

Sources/
  App/            AppState, AppDelegate (@main entry point),
                  MeetingAppMonitor, CalendarService
  Audio/          SystemAudioCapture (ScreenCaptureKit + mic mixing),
                  AudioCaptureService, AudioDeviceManager, AudioLevelMonitor,
                  AudioProcessDiscovery, CoreAudioUtils
  Transcription/  TranscriptionService, SpeechStreamingTranscriber, ModelManager,
                  MeetingTranscription
  Summarization/  SummarizationService, MLXClient, MLXModelManager,
                  OllamaClient, MeetingSummary
  Storage/        DatabaseManager (GRDB), MeetingStore, MeetingRecord
  GoogleCalendar/ GoogleCalendarConfig, GoogleCalendarAuthService,
                  GoogleCalendarClient
  Models/         AudioProcess, CapturedAudio
  Views/          All SwiftUI views (popover, recording, transcription,
                  summary, history, settings)
Resources/        Info.plist, entitlements
Assets.xcassets/  App icon

Dependencies

Package       Version  Purpose
WhisperKit    0.15.0+  Local speech-to-text (MLX-optimized)
mlx-swift-lm  2.0.0+   Local LLM inference on Apple Silicon
GRDB.swift    7.0.0+   SQLite database wrapper

Ollama is an optional external runtime dependency (not a Swift package) for users who prefer it over MLX. It communicates via HTTP (default localhost:11434, configurable to a remote server).

Building a Release (Signed DMG)

The release script archives, signs with Developer ID, notarizes with Apple, and packages a DMG.

Prerequisites:

  • Apple Developer Program membership ($99/year)
  • Developer ID Application certificate installed (Xcode -> Settings -> Accounts -> Manage Certificates)
  • App-specific password from appleid.apple.com (Sign-In and Security -> App-Specific Passwords)

TEAM_ID=YOUR_TEAM_ID \
APPLE_ID=you@example.com \
APP_SPECIFIC_PASSWORD=xxxx-xxxx-xxxx-xxxx \
./scripts/release.sh

Optional environment variables:

  • VERSION — set the version string (e.g. 1.0.0)
  • BUILD_NUMBER — explicit build number (auto-increments if omitted)
  • OUTPUT_DIR — output directory (default: build/release)

The signed and notarized DMG is written to build/release/NoteTaker-{version}.dmg.

Notes for Contributors

  • No fatalError in production paths — use guard/throw with descriptive errors
  • @MainActor for all UI state
  • Weak self in all audio callbacks to prevent retain cycles
  • App sandbox is disabled (required for system audio capture)
  • Screen Recording permission is required — without it, ScreenCaptureKit cannot capture system audio

Troubleshooting

See TROUBLESHOOTING.md for solutions to common issues, including permission problems when upgrading from a previous version.

Privacy

NoteTaker makes zero network calls for audio capture, transcription, and summarization (when using MLX) — everything runs entirely on-device. If you use Ollama on a remote server, the transcript text is sent to that server for summarization — but this is a machine you control on your own network, not a third-party cloud service. If you sign in with Google Calendar, NoteTaker makes read-only API calls to fetch event details around the time you start recording — participant names are stored locally and never sent elsewhere. Audio files, transcripts, and summaries are stored locally in ~/Library/Application Support/NoteTaker/. No telemetry, no analytics, no cloud sync. See our Privacy Policy for full details.

License

Apache License 2.0. See LICENSE for details.
