Audio Library Tagger
AI-powered local audio cataloguing. Scans your entire sound library, classifies audio into 527 sound event categories using GPU-accelerated neural networks, extracts music features, and gives you a fast browser-based search interface, all without uploading a single file to the cloud.
What it does
Audio Library Tagger scans your audio folders recursively and runs two AI systems on every file:
- PANNs (Pretrained Audio Neural Networks): an open-source model trained on Google's AudioSet dataset, which covers 527 sound categories. Detects explosions, rain, footsteps, piano, gunshots, crowd noise, engines, and hundreds more with confidence scores.
- librosa feature extraction: estimates BPM, musical key, loudness, spectral brightness, and zero-crossing rate, then derives semantic tags (tempo, energy, mood, type) from those values.
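As a rough illustration of how per-class confidence scores can become tags, the sketch below picks the highest-scoring labels from a score vector. The function name `top_labels`, the `min_confidence` cut-off, and the toy label list are assumptions made for this example and are not taken from tagger.py; a real PANNs output would have 527 entries.

```python
from typing import List, Sequence, Tuple


def top_labels(
    scores: Sequence[float],
    labels: Sequence[str],
    k: int = 5,
    min_confidence: float = 0.1,  # hypothetical cut-off, not from tagger.py
) -> List[Tuple[str, float]]:
    """Return up to k (label, score) pairs, highest score first."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    # Keep the k best, then drop anything below the confidence cut-off.
    return [(label, score) for label, score in ranked[:k] if score >= min_confidence]


# Toy example with four classes instead of 527.
toy_labels = ["Explosion", "Rain", "Piano", "Dog"]
toy_scores = [0.82, 0.05, 0.31, 0.02]
print(top_labels(toy_scores, toy_labels, k=3))
```

Applying the cut-off after ranking means a file with one dominant sound gets one tag rather than being padded out with low-confidence guesses.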
Everything is stored in a local SQLite database. A lightweight Flask web server lets you search and filter the entire library in your browser — with live audio preview, copy path, and open-in-Explorer functionality.
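To make the storage idea concrete, here is a minimal sketch of a files-plus-tags layout in SQLite. The table names, column names, and helper functions are hypothetical; the actual schema used by the tool is not documented here and may differ.

```python
import sqlite3

# In-memory database for the example; the real tool uses a file such as audio_library.db.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE files (
        id        INTEGER PRIMARY KEY,
        path      TEXT NOT NULL,
        file_hash TEXT NOT NULL UNIQUE
    );
    CREATE TABLE tags (
        file_id    INTEGER NOT NULL REFERENCES files(id),
        category   TEXT NOT NULL,
        value      TEXT NOT NULL,
        confidence REAL
    );
    """
)


def add_file(conn, path, file_hash, tags):
    """Insert one file and its (category, value, confidence) tags."""
    cur = conn.execute(
        "INSERT INTO files (path, file_hash) VALUES (?, ?)", (path, file_hash)
    )
    file_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO tags (file_id, category, value, confidence) VALUES (?, ?, ?, ?)",
        [(file_id, category, value, conf) for category, value, conf in tags],
    )
    conn.commit()
    return file_id


def find_by_tag(conn, value):
    """Return the paths of all files carrying the given tag value."""
    rows = conn.execute(
        "SELECT DISTINCT f.path FROM files f "
        "JOIN tags t ON t.file_id = f.id WHERE t.value = ? ORDER BY f.path",
        (value,),
    )
    return [row[0] for row in rows]
```

For example, after `add_file(conn, "sfx/boom.wav", "abc123", [("sound_event", "Explosion", 0.82)])`, calling `find_by_tag(conn, "Explosion")` returns `["sfx/boom.wav"]`.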
The web app and the tagger can run simultaneously. Start searching tagged files while the rest of your library is still being processed.
Requirements
A GPU is strongly recommended for large libraries. CPU mode works but is significantly slower — roughly 30–60 seconds per file versus 1–2 seconds on GPU. Use --workers 4 to parallelise on CPU.
Installation
- Install Python 3.10+
  Download from python.org. On Windows, check "Add Python to PATH" during installation.
- Install PyTorch with CUDA
  Check your CUDA version first with nvidia-smi, then install the matching PyTorch build (PowerShell / Terminal):

      pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu121

  Replace cu121 with your CUDA version (e.g. cu118, cu124).
- Install dependencies (PowerShell / Terminal):

      cd audio_tagger
      pip install -r requirements.txt

- Download the PANNs model
  On Windows, run these in PowerShell to pre-download the model files (avoids a wget issue on first run):

      New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\panns_data"
      Invoke-WebRequest -Uri "https://raw.githubusercontent.com/qiuqiangkong/audioset_tagging_cnn/master/metadata/class_labels_indices.csv" -OutFile "$env:USERPROFILE\panns_data\class_labels_indices.csv"
      Invoke-WebRequest -Uri "https://zenodo.org/record/3987831/files/Cnn14_mAP%3D0.431.pth" -OutFile "$env:USERPROFILE\panns_data\Cnn14_mAP=0.431.pth"

  The model is ~400 MB and only needs to be downloaded once.
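Once PyTorch is installed, a quick check like the one below can confirm whether a CUDA device is visible before starting a long scan. The helper name `pick_device` is an assumption for this example; `torch.cuda.is_available()` is the standard PyTorch call.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed yet, so only CPU mode is possible.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

If this prints cpu on a machine with an NVIDIA GPU, the installed PyTorch build probably does not match the CUDA version reported by nvidia-smi.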
Quick Start
Step 1 — Scan your library
Double-click Scan Library.bat in the audio_tagger folder. It will prompt you to enter your audio folder path:
Audio folder path: V:\Music & Sound for Games
The tagger walks all subfolders and processes every supported audio file. Progress is shown in the terminal and saved to tagger.log.
If you stop the tagger and restart it, already-processed files are automatically skipped. You can safely stop and resume at any time.
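The resume behaviour can be pictured as a hash lookup before each file is processed. The sketch below assumes a SHA-256 content hash and a set of already-seen hashes; the hash algorithm and the helper names are assumptions, since the tool only states that files are identified by a hash.

```python
import hashlib
from pathlib import Path
from typing import Iterable, Iterator, Set


def file_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file's contents in 1 MiB chunks so large files are not loaded whole."""
    digest = hashlib.sha256()  # assumed algorithm; the tool's choice is not documented
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def pending_files(paths: Iterable[Path], done_hashes: Set[str]) -> Iterator[Path]:
    """Yield only the files whose content hash has not been seen yet."""
    for path in paths:
        if file_hash(path) not in done_hashes:
            yield path
```

Keying on content rather than path is what lets a moved or re-scanned file be recognised as already tagged.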
Step 2 — Start the web app
Double-click Start Audio Library Local Server.bat — this starts the local server automatically. Then either open http://localhost:5000 in your browser, or use the Open Audio Library shortcut if you've created one. The app can run while the tagger is still processing — results appear as files are tagged.
Batch File Launchers
Three files are included to make launching the tool as simple as a double-click — no terminal required.
| File | What it does |
|---|---|
| Scan Library.bat | Prompts you for your audio folder path, then starts the AI scanner. Resumable — safe to stop and restart any time. |
| Start Audio Library Local Server.bat | Starts the local web server. Keep this running whenever you want to search your library. |
| Open Audio Library shortcut | A browser shortcut pointing to http://localhost:5000. Open this after launching the web server to go straight to search. |
You can run Scan Library.bat and Start Audio Library Local Server.bat simultaneously in separate windows — search while your library is still being tagged.
Scanning Your Library
Command line options
python tagger.py --path "V:\Audio" --workers 1 --db audio_library.db
For most users, double-clicking Scan Library.bat is simpler — it prompts for the path and handles everything automatically.
| Flag | Default | Description |
|---|---|---|
| --path | required | Root folder to scan. All subfolders are included. |
| --workers | 1 | Parallel workers. Keep at 1 for GPU. Increase to 4+ for CPU-only. |
| --db | audio_library.db | Path to the SQLite database file. |
Run the tagger again with a different --path to add more files to the same database. Already-tagged files are skipped regardless of path.
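For readers who want to script around the tagger, the flags in the table map naturally onto an argparse definition. This is a sketch reconstructed from the table above, not the parser that ships in tagger.py; the help strings and the `build_parser` name are illustrative.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented flags: --path (required), --workers, --db."""
    parser = argparse.ArgumentParser(prog="tagger.py")
    parser.add_argument("--path", required=True,
                        help="Root folder to scan; subfolders are included.")
    parser.add_argument("--workers", type=int, default=1,
                        help="Parallel workers (1 for GPU, 4+ for CPU-only).")
    parser.add_argument("--db", default="audio_library.db",
                        help="Path to the SQLite database file.")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
args = build_parser().parse_args(["--path", r"V:\Audio"])
print(args.path, args.workers, args.db)
```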
Searching & Filtering
Text search
The search bar performs full-text search across filenames and all AI-generated tags. Try terms like explosion, piano, dark ambient, one-shot, or fast.
Negative search
Prefix any keyword with a - minus symbol to exclude results containing that tag. This is useful when a broad search returns unwanted categories alongside your target sounds.
    squeal -animal → metal squeals only, no animal sounds
    impact -music → impact SFX with no musical results
    explosion -small → large explosions only
    rain -music -ambient → rain SFX excluding music and atmosphere tracks
Negative terms work across filenames and all AI tags — not just categories. Use -animal, -piano, -loop or any tag value you want to remove from results.
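One way to think about negative search is as a split of the query into terms that must appear and terms that must not. The sketch below uses plain substring matching over the filename and tags, which is a simplification of whatever full-text search the app actually performs; `parse_query` and `matches` are hypothetical names.

```python
from typing import Iterable, List, Tuple


def parse_query(query: str) -> Tuple[List[str], List[str]]:
    """Split a query into (include, exclude) term lists, lower-cased."""
    include: List[str] = []
    exclude: List[str] = []
    for term in query.lower().split():
        if term.startswith("-") and len(term) > 1:
            exclude.append(term[1:])  # strip the leading minus
        elif term != "-":
            include.append(term)      # "one-shot" stays whole: the hyphen is not leading
    return include, exclude


def matches(query: str, filename: str, tags: Iterable[str]) -> bool:
    """True if every include term and no exclude term occurs in the filename or tags."""
    include, exclude = parse_query(query)
    haystack = " ".join([filename, *tags]).lower()
    return (all(term in haystack for term in include)
            and not any(term in haystack for term in exclude))


print(parse_query("rain -music -ambient"))
```

Only a leading minus marks exclusion, so hyphenated tag values such as one-shot remain searchable as ordinary terms.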
Sidebar filters
Click any filter in the sidebar to narrow results by category, type, tempo, energy, mood, or file format. Filters are additive — combine them freely.
Tag pill search
Click any tag pill on a file card to instantly filter by that tag value.
Duration filter
Enter min/max duration in seconds at the top of the results pane to find files of a specific length — useful for isolating one-shots, loops, or full tracks.
File actions
- Play button — previews the file directly in the browser
- Open folder — opens the file location in Explorer / Finder
- Copy path — copies the full file path to clipboard
- ⟳ Loop — toggles looping playback on/off. Turns green when active. Useful for testing whether a sound loops cleanly.
- Speed selector — adjusts playback rate (0.5×, 0.75×, 1×, 1.5×, 2×). Useful for checking fast one-shots in slow motion or auditioning loops at different tempos.
Tag Reference
Every file receives tags across multiple categories. Here's what each category means and how it's generated.
| Category | Example values | Source |
|---|---|---|
| sound_event | Explosion, Rain, Piano, Gunshot, Dog, Thunder… | PANNs neural network — 527 AudioSet classes with confidence scores |
| category | sound effect, music, ambience / atmosphere, musical sfx | Derived from top PANNs results |
| type | one-shot, short clip, loop / stem, full track | Derived from file duration |
| tempo | very slow, slow, medium, fast, very fast | Derived from librosa BPM estimate |
| energy | quiet, medium energy, loud | Derived from RMS loudness in dB |
| mood | bright, dark | Derived from detected musical key mode (major/minor) |
| texture | tonal, noisy, warm, bright | Derived from spectral centroid and zero-crossing rate |
Music files also display BPM, musical key (e.g. A minor), loudness in dB, sample rate, and channel count in the card header.
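The derived categories in the table are essentially threshold rules over numeric features. The sketch below shows the shape of such rules; every numeric threshold here is a placeholder chosen for illustration, and the function names are hypothetical, so the values the tagger actually uses may differ.

```python
def tempo_tag(bpm: float) -> str:
    """Map a BPM estimate to a tempo tag (placeholder thresholds)."""
    if bpm < 60:
        return "very slow"
    if bpm < 90:
        return "slow"
    if bpm < 120:
        return "medium"
    if bpm < 160:
        return "fast"
    return "very fast"


def energy_tag(rms_db: float) -> str:
    """Map RMS loudness in dB to an energy tag (placeholder thresholds)."""
    if rms_db < -30:
        return "quiet"
    if rms_db < -15:
        return "medium energy"
    return "loud"


def mood_tag(key: str) -> str:
    """Minor keys map to "dark", everything else to "bright"."""
    return "dark" if key.strip().lower().endswith("minor") else "bright"


def type_tag(duration_s: float) -> str:
    """Map file duration in seconds to a type tag (placeholder thresholds)."""
    if duration_s < 2:
        return "one-shot"
    if duration_s < 10:
        return "short clip"
    if duration_s < 60:
        return "loop / stem"
    return "full track"


print(tempo_tag(128), energy_tag(-12.0), mood_tag("A minor"), type_tag(1.2))
```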
Supported Formats
If your library came from a Mac or was distributed as a Mac zip, you may see ._filename files. These are Apple metadata stubs, not real audio. From v1.0 onwards the tagger filters them out before scanning; older versions log them as errors and skip them, which is expected behaviour.
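A filter for these stubs can be as small as a filename prefix check. The helper names below are hypothetical and the example uses forward-slash paths for portability; this is a sketch of the idea rather than the code in tagger.py.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def is_apple_metadata_stub(path: str) -> bool:
    """True for AppleDouble stubs, whose filename starts with "._"."""
    return PurePosixPath(path).name.startswith("._")


def drop_stubs(paths: Iterable[str]) -> List[str]:
    """Remove Apple metadata stubs from a list of candidate paths."""
    return [path for path in paths if not is_apple_metadata_stub(path)]


print(drop_stubs(["sfx/kick.wav", "sfx/._kick.wav", "music/theme.flac"]))
```

The check looks only at the final path component, so a folder whose name happens to start with "._" does not cause its contents to be dropped.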
Performance
| Mode | Speed per file | 10,000 files |
|---|---|---|
| NVIDIA GPU (CUDA) | 1–2 seconds (tested on RTX 2070) | ~3–6 hours |
| CPU (single worker) | 30–60 seconds | ~80–170 hours |
| CPU (4 workers) | ~15–20 seconds | ~40–55 hours |
For large libraries, start the tagger before you sleep. It's fully resumable — if anything interrupts it, just re-run the same command and it picks up where it left off.
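The totals in the table follow from multiplying the per-file time by the file count. A small helper like the one below (the name `estimated_hours` is made up for this example) can be used to estimate a run for a library of a different size.

```python
def estimated_hours(file_count: int, seconds_per_file: float) -> float:
    """Total processing time in hours for a given per-file cost."""
    return file_count * seconds_per_file / 3600


# Reproduce the GPU row: 10,000 files at 1 to 2 seconds each.
low = estimated_hours(10_000, 1)
high = estimated_hours(10_000, 2)
print(f"GPU: {low:.1f} to {high:.1f} hours")
```

The same call with 30 and 60 seconds per file gives roughly 83 and 167 hours, in line with the single-worker CPU row.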
FAQ
Does any data get uploaded to the internet?
No. All analysis runs locally on your machine. The only internet access is the one-time download of the PANNs model (~400 MB) from Zenodo.org and its class label file from GitHub.
Can I search while the tagger is still running?
Yes. Double-click Start Audio Library Local Server.bat in a second window and open localhost:5000. Files appear in search results as soon as they're tagged.
How do I add more folders to the library?
Double-click Scan Library.bat and enter the new folder path when prompted. Files are identified by a hash so there's no duplication — already-tagged files are skipped.
I see lots of errors for ._filename files
These are Apple macOS metadata stubs created when zipping on a Mac. From v1.0 onwards these are filtered out before scanning and will not appear in logs. If you see them, ensure you are running the latest version of tagger.py.
The PANNs model failed to load
On Windows, the panns_inference library sometimes tries to use wget which isn't available. Follow the manual model download steps in the Installation section to pre-download the files using PowerShell's built-in Invoke-WebRequest.
Can I run this on macOS or Linux?
Yes. The Open Folder button uses open -R on macOS and xdg-open on Linux automatically. GPU acceleration relies on CUDA, which requires an NVIDIA GPU; most Macs use Apple Silicon, so CPU mode is recommended on macOS.
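The platform switch behind the Open Folder button might look something like the sketch below. The macOS and Linux commands follow the description above; the Windows `explorer /select,` form and the `reveal_command` helper are assumptions, since the tool's actual implementation is not shown.

```python
import os
import platform
from typing import List, Optional


def reveal_command(path: str, system: Optional[str] = None) -> List[str]:
    """Build the command that shows a file's location in the OS file manager."""
    system = system or platform.system()
    if system == "Darwin":
        return ["open", "-R", path]                  # reveal the file in Finder
    if system == "Windows":
        return ["explorer", "/select,", path]        # assumed form, not confirmed by the tool
    return ["xdg-open", os.path.dirname(path)]       # Linux: open the containing folder


print(reveal_command("/music/kick.wav", system="Darwin"))
```

The returned list can be passed to `subprocess.run`. Building the command separately from running it keeps the platform logic easy to check without opening any windows.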