Documentation

Audio Library Tagger

AI-powered local audio cataloguing. It scans your entire sound library, classifies each file against 527 sound event categories using GPU-accelerated neural networks, extracts music features, and gives you a fast browser-based search interface, all without uploading a single file to the cloud.

100% Local · GPU Accelerated · 527 Sound Classes · Windows / Mac / Linux

[Screenshot of http://localhost:5000: the main search interface, with sidebar filters, file cards with tag pills, and in-browser audio preview.]

What it does

Audio Library Tagger scans your audio folders recursively and runs two AI systems on every file:

  • PANNs (Pretrained Audio Neural Networks): an open-source model trained on Google's AudioSet dataset, covering 527 sound categories. It detects explosions, rain, footsteps, piano, gunshots, crowd noise, engines, and hundreds more, each with a confidence score.
  • librosa feature extraction: estimates BPM, musical key, loudness, spectral brightness, and zero-crossing rate, then derives semantic tags (tempo, energy, mood, type) from those values.
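
As a rough illustration of how per-class confidence scores can become tags, the sketch below keeps the highest-scoring classes above a cutoff. The function name `top_tags`, the 0.2 threshold and the limit of five tags are assumptions made for this example, not the tagger's actual settings.

```python
def top_tags(scores, labels, k=5, threshold=0.2):
    """Return up to k (label, score) pairs with score >= threshold, best first.

    `scores` holds one confidence value per class, in the same order as `labels`.
    The threshold and k are hypothetical defaults, not the tool's real values.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return [(label, round(score, 3)) for label, score in ranked if score >= threshold][:k]


# Toy example with four classes instead of the full 527.
demo_labels = ["Explosion", "Rain", "Piano", "Speech"]
demo_scores = [0.91, 0.05, 0.33, 0.12]
print(top_tags(demo_scores, demo_labels))
```

With the toy scores above, only "Explosion" and "Piano" clear the cutoff, so only those two would be shown as tag pills.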

Everything is stored in a local SQLite database. A lightweight Flask web server lets you search and filter the entire library in your browser — with live audio preview, copy path, and open-in-Explorer functionality.

Tip

The web app and the tagger can run simultaneously. Start searching tagged files while the rest of your library is still being processed.

01

Requirements

Python: 3.10 or higher
GPU (recommended): NVIDIA with CUDA support
Disk space: ~2 GB (the PANNs model file itself is ~400 MB)
OS: Windows / macOS / Linux

CPU Mode

A GPU is strongly recommended for large libraries. CPU mode works but is significantly slower — roughly 30–60 seconds per file versus 1–2 seconds on GPU. Use --workers 4 to parallelise on CPU.
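
To check which mode applies to your machine before starting a long scan, a small script along these lines can help. It is a sketch: `pick_device` and `suggested_workers` are hypothetical helper names, and the worker counts simply mirror the guidance in this document.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also runs before PyTorch is installed
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def suggested_workers(device):
    """Mirror the guidance above: 1 worker on GPU, 4 on CPU."""
    return 1 if device == "cuda" else 4


device = pick_device()
print(f"device={device}, suggested --workers {suggested_workers(device)}")
```

If this reports `cpu` on a machine with an NVIDIA card, the installed PyTorch build most likely does not match your CUDA version.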

02

Installation

  1. Install Python 3.10+

    Download from python.org. On Windows, check "Add Python to PATH" during installation.

  2. Install PyTorch with CUDA

    Check your CUDA version first with nvidia-smi, then install the matching PyTorch build:

    PowerShell / Terminal
    pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu121

    Replace cu121 with your CUDA version (e.g. cu118, cu124).

  3. Install dependencies

    PowerShell / Terminal
    cd audio_tagger
    pip install -r requirements.txt

  4. Download the PANNs model

    On Windows, run these in PowerShell to pre-download the model files (avoids a wget issue on first run):

    PowerShell
    New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\panns_data"
    
    Invoke-WebRequest -Uri "https://raw.githubusercontent.com/qiuqiangkong/audioset_tagging_cnn/master/metadata/class_labels_indices.csv" -OutFile "$env:USERPROFILE\panns_data\class_labels_indices.csv"
    
    Invoke-WebRequest -Uri "https://zenodo.org/record/3987831/files/Cnn14_mAP%3D0.431.pth" -OutFile "$env:USERPROFILE\panns_data\Cnn14_mAP=0.431.pth"

    The model is ~400 MB and only needs to be downloaded once.
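
The same pre-download can be sketched in Python for macOS and Linux, or for Windows users who prefer not to use PowerShell. The URLs are the ones from the commands above; the helper names are hypothetical, and the script assumes the model is looked up in a `panns_data` folder inside your home directory on every platform.

```python
from pathlib import Path
from urllib.request import urlretrieve

# URLs copied from the PowerShell commands above.
FILES = {
    "class_labels_indices.csv": "https://raw.githubusercontent.com/qiuqiangkong/audioset_tagging_cnn/master/metadata/class_labels_indices.csv",
    "Cnn14_mAP=0.431.pth": "https://zenodo.org/record/3987831/files/Cnn14_mAP%3D0.431.pth",
}


def missing_files(target_dir):
    """Return {destination path: URL} for files not yet present in target_dir."""
    target_dir = Path(target_dir)
    return {target_dir / name: url for name, url in FILES.items()
            if not (target_dir / name).exists()}


def download_missing(target_dir=None):
    """Download whatever is missing; safe to re-run after an interruption."""
    target_dir = Path(target_dir) if target_dir else Path.home() / "panns_data"
    target_dir.mkdir(parents=True, exist_ok=True)
    for dest, url in missing_files(target_dir).items():
        print(f"Downloading {dest.name} ...")
        partial = dest.with_name(dest.name + ".part")
        urlretrieve(url, partial)  # write to a temporary name first
        partial.replace(dest)      # rename only once the download has finished
```

Call `download_missing()` once from a Python prompt. Files that already exist are left untouched.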

03

Quick Start

Step 1 — Scan your library

Double-click Scan Library.bat in the audio_tagger folder. It will prompt you to enter your audio folder path:

Prompt
Audio folder path: V:\Music & Sound for Games

The tagger walks all subfolders and processes every supported audio file. Progress is shown in the terminal and saved to tagger.log.

Resumable

If you stop the tagger and restart it, already-processed files are automatically skipped. You can safely stop and resume at any time.
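
One way resumable scanning can work is to record a content hash for every finished file and skip any file whose hash is already stored. The sketch below shows that idea with SQLite. The `files` table, the use of SHA-256 and the function names are assumptions for illustration and may differ from what tagger.py does internally.

```python
import hashlib
import sqlite3


def file_hash(path, chunk_size=1 << 20):
    """SHA-256 of the file contents, read in 1 MiB chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(paths, conn, process):
    """Process each file once; return how many files were newly processed."""
    conn.execute("CREATE TABLE IF NOT EXISTS files (hash TEXT PRIMARY KEY, path TEXT)")
    done = 0
    for path in paths:
        key = file_hash(path)
        if conn.execute("SELECT 1 FROM files WHERE hash = ?", (key,)).fetchone():
            continue  # already tagged in an earlier run, or a duplicate of another file
        process(path)
        conn.execute("INSERT INTO files (hash, path) VALUES (?, ?)", (key, str(path)))
        conn.commit()  # commit per file so an interrupted run loses at most one file
        done += 1
    return done
```

Because the key is derived from the file contents rather than its location, moving or renaming a file would not cause it to be processed again in this scheme.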

Step 2 — Start the web app

Double-click Start Audio Library Local Server.bat — this starts the local server automatically. Then either open http://localhost:5000 in your browser, or use the Open Audio Library shortcut if you've created one. The app can run while the tagger is still processing — results appear as files are tagged.

[Screenshot of http://localhost:5000: searching for "explosion" returns all matching SFX, with PANNs confidence scores shown as tag pills.]

Batch File Launchers

Three files are included to make launching the tool as simple as a double-click — no terminal required.

File | What it does
Scan Library.bat | Prompts you for your audio folder path, then starts the AI scanner. Resumable: safe to stop and restart at any time.
Start Audio Library Local Server.bat | Starts the local web server. Keep this running whenever you want to search your library.
Open Audio Library shortcut | A browser shortcut pointing to http://localhost:5000. Open this after launching the web server to go straight to search.

Tip

You can run Scan Library.bat and Start Audio Library Local Server.bat simultaneously in separate windows — search while your library is still being tagged.

04

Scanning Your Library

Command line options

PowerShell / Terminal
python tagger.py --path "V:\Audio" --workers 1 --db audio_library.db

Easier way

For most users, double-clicking Scan Library.bat is simpler — it prompts for the path and handles everything automatically.

Flag | Default | Description
--path | required | Root folder to scan. All subfolders are included.
--workers | 1 | Parallel workers. Keep at 1 for GPU. Increase to 4+ for CPU-only.
--db | audio_library.db | Path to the SQLite database file.

Adding more folders

Run the tagger again with a different --path to add more files to the same database. Already-tagged files are skipped regardless of path.
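
For readers who want to wrap or script the tagger, the three flags can be modelled with `argparse` as shown below. This is a reconstruction from the flag table in this section, not the actual source of tagger.py, and the help strings are paraphrased.

```python
import argparse


def build_parser():
    """Parser with the three documented flags (a reconstruction, not tagger.py itself)."""
    parser = argparse.ArgumentParser(prog="tagger.py")
    parser.add_argument("--path", required=True, help="Root folder to scan (subfolders included)")
    parser.add_argument("--workers", type=int, default=1, help="Parallel workers; 1 for GPU, 4+ for CPU-only")
    parser.add_argument("--db", default="audio_library.db", help="Path to the SQLite database file")
    return parser


# Only --path is mandatory; the other flags fall back to their defaults.
args = build_parser().parse_args(["--path", r"V:\Audio"])
print(args.path, args.workers, args.db)
```

Adding a second folder to the same library then amounts to running the command again with a new `--path` and the same `--db` value.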

05

Searching & Filtering

[Screenshot of http://localhost:5000 with sidebar filters active: sidebar facets let you filter by category, type, tempo, energy, mood, and file format.]

Text search

The search bar performs full-text search across filenames and all AI-generated tags. Try terms like explosion, piano, dark ambient, one-shot, or fast.

Prefix any keyword with a - minus symbol to exclude results containing that tag. This is useful when a broad search returns unwanted categories alongside your target sounds.

Examples
squeal -animal          → metal squeals only, no animal sounds
impact -music           → impact SFX with no musical results
explosion -small        → large explosions only
rain -music -ambient    → rain SFX excluding music and atmosphere tracks

Tip

Negative terms work across filenames and all AI tags — not just categories. Use -animal, -piano, -loop or any tag value you want to remove from results.
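
The include/exclude behaviour can be sketched as follows. This is a simplified model that uses plain substring matching on a single text string per file; the real search is full-text and may tokenise differently, and `parse_query` and `matches` are hypothetical names.

```python
def parse_query(query):
    """Split a search string into (include_terms, exclude_terms), lowercased."""
    include, exclude = [], []
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            exclude.append(token[1:])   # "-music" excludes "music"
        elif token != "-":
            include.append(token)       # "one-shot" stays a normal term
    return include, exclude


def matches(text, query):
    """True if text contains every include term and none of the exclude terms."""
    include, exclude = parse_query(query)
    haystack = text.lower()
    return all(t in haystack for t in include) and not any(t in haystack for t in exclude)


print(parse_query("rain -music -ambient"))
print(matches("metal_squeal_01.wav squeal sound effect", "squeal -animal"))
print(matches("pig_squeal.wav squeal animal", "squeal -animal"))
```

Only a leading minus marks a term as negative, so hyphenated tags such as one-shot can still be searched for normally in this model.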

Sidebar filters

Click any filter in the sidebar to narrow results by category, type, tempo, energy, mood, or file format. Filters are additive — combine them freely.

Tag pill search

Click any tag pill on a file card to instantly filter by that tag value.

Duration filter

Enter min/max duration in seconds at the top of the results pane to find files of a specific length — useful for isolating one-shots, loops, or full tracks.

File actions

  • Play button — previews the file directly in the browser
  • Open folder — opens the file location in Explorer / Finder
  • Copy path — copies the full file path to clipboard
  • ⟳ Loop — toggles looping playback on/off. Turns green when active. Useful for testing whether a sound loops cleanly.
  • Speed selector — adjusts playback rate (0.5×, 0.75×, 1×, 1.5×, 2×). Useful for checking fast one-shots in slow motion or auditioning loops at different tempos.

[Screenshot of http://localhost:5000: clicking a file card expands it to reveal the audio player and action buttons.]
[Screenshot of http://localhost:5000: the Loop button turns green when active, and the speed selector lets you audition sounds at 0.5×–2× speed.]

06

Tag Reference

Every file receives tags across multiple categories. Here's what each category means and how it's generated.

Category | Example values | Source
sound_event | Explosion, Rain, Piano, Gunshot, Dog, Thunder… | PANNs neural network: 527 AudioSet classes with confidence scores
category | sound effect, music, ambience / atmosphere, musical sfx | Derived from top PANNs results
type | one-shot, short clip, loop / stem, full track | Derived from file duration
tempo | very slow, slow, medium, fast, very fast | Derived from librosa BPM estimate
energy | quiet, medium energy, loud | Derived from RMS loudness in dB
mood | bright, dark | Derived from detected musical key mode (major/minor)
texture | tonal, noisy, warm, bright | Derived from spectral centroid and zero-crossing rate

Music files also display BPM, musical key (e.g. A minor), loudness in dB, sample rate, and channel count in the card header.
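
The derived categories boil down to bucketing a numeric feature into a named range. The sketch below shows the pattern for tempo, type, energy and mood. Every cut-off value in it is an illustrative guess, since the tagger's real thresholds are not documented here, and `bucket` and `derive_tags` are hypothetical names.

```python
def bucket(value, edges, names):
    """Return names[i] for the first edge that value falls below, else the last name."""
    for edge, name in zip(edges, names):
        if value < edge:
            return name
    return names[-1]


def derive_tags(bpm, duration_s, rms_db, key):
    """Map raw features to the tag vocabulary used in this section.

    All cut-off values are illustrative guesses, not the tagger's real thresholds.
    """
    return {
        "tempo": bucket(bpm, [60, 90, 120, 150], ["very slow", "slow", "medium", "fast", "very fast"]),
        "type": bucket(duration_s, [2, 10, 60], ["one-shot", "short clip", "loop / stem", "full track"]),
        "energy": bucket(rms_db, [-30, -15], ["quiet", "medium energy", "loud"]),
        "mood": "dark" if key.lower().endswith("minor") else "bright",
    }


print(derive_tags(bpm=128, duration_s=185.0, rms_db=-12.5, key="A minor"))
```

Under these assumed thresholds, a three-minute track at 128 BPM in A minor would be tagged fast, full track, loud and dark.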

07

Supported Formats

WAV, MP3, FLAC, OGG, AIF, AIFF

Mac metadata files

If your library came from a Mac or was distributed as a Mac zip, you may see ._filename files. These are Apple metadata stubs, not real audio. Older versions of the tagger log them as errors and skip them, which is expected behaviour; from v1.0 onwards they are filtered out before scanning.
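
A file walker that accepts only the supported extensions and ignores the "._" stubs might look like the sketch below. `find_audio_files` is a hypothetical name, and the real scanner may apply further checks.

```python
from pathlib import Path

# Extensions from the Supported Formats list above.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".aif", ".aiff"}


def find_audio_files(root):
    """Recursively yield supported audio files, skipping "._" metadata stubs."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.name.startswith("._"):
            continue  # AppleDouble stub, not real audio
        if path.suffix.lower() in AUDIO_EXTENSIONS:
            yield path
```

The extension check is case-insensitive, so files named KICK.WAV or pad.Flac are picked up as well.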

08

Performance

Mode | Speed per file | 10,000 files
NVIDIA GPU (CUDA) | 1–2 seconds (tested on RTX 2070) | ~3–6 hours
CPU (single worker) | 30–60 seconds | ~80–170 hours
CPU (4 workers) | ~15–20 seconds | ~40–55 hours

Run overnight

For large libraries, start the tagger before you sleep. It's fully resumable — if anything interrupts it, just re-run the same command and it picks up where it left off.
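
The totals in the performance table follow from multiplying the file count by the per-file time. The short calculation below reproduces them and can be adapted to your own library size; `estimate_hours` is a hypothetical helper name.

```python
def estimate_hours(file_count, seconds_per_file):
    """Total wall-clock hours if every file takes seconds_per_file on average."""
    return file_count * seconds_per_file / 3600


for label, low, high in [("GPU", 1, 2), ("CPU, 1 worker", 30, 60), ("CPU, 4 workers", 15, 20)]:
    print(f"{label}: {estimate_hours(10_000, low):.1f}-{estimate_hours(10_000, high):.1f} hours")
```

For 10,000 files this gives roughly 2.8 to 5.6 hours on GPU, 83 to 167 hours on a single CPU worker, and 42 to 56 hours with four CPU workers, which matches the rounded figures in the table.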

09

FAQ

Does any data get uploaded to the internet?

No. All analysis runs locally on your machine. The only internet access the tool needs is the one-time download of the PANNs model (~400 MB) from Zenodo and its class label file from GitHub.

Can I search while the tagger is still running?

Yes. Double-click Start Audio Library Local Server.bat in a second window and open http://localhost:5000. Files appear in search results as soon as they're tagged.

How do I add more folders to the library?

Double-click Scan Library.bat and enter the new folder path when prompted. Files are identified by a hash so there's no duplication — already-tagged files are skipped.

I see lots of errors for ._filename files

These are Apple macOS metadata stubs created when zipping on a Mac. From v1.0 onwards these are filtered out before scanning and will not appear in logs. If you see them, ensure you are running the latest version of tagger.py.

The PANNs model failed to load

On Windows, the panns_inference library sometimes tries to call wget, which isn't available. Follow the manual model download steps in the Installation section to pre-download the files using PowerShell's built-in Invoke-WebRequest.

Can I run this on macOS or Linux?

Yes. The Open Folder button automatically uses open -R on macOS and xdg-open on Linux. GPU acceleration requires an NVIDIA GPU with CUDA, which Apple Silicon Macs do not have, so CPU mode is recommended on macOS.
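
The per-platform dispatch behind an Open Folder action can be sketched like this. The macOS and Linux commands are the ones named above; the Windows `explorer /select,` form and the choice to pass the containing folder to xdg-open are assumptions, as are the function names.

```python
import os
import subprocess
import sys


def reveal_command(path, platform=sys.platform):
    """Build the command that shows `path` in the system file manager."""
    if platform == "darwin":
        return ["open", "-R", path]             # reveal the file in Finder
    if platform.startswith("win"):
        return ["explorer", f"/select,{path}"]  # assumption: Explorer with the file selected
    # xdg-open cannot select a file, so open the containing folder instead
    return ["xdg-open", os.path.dirname(path) or "."]


def reveal(path):
    """Launch the file manager without waiting for it to exit."""
    subprocess.Popen(reveal_command(path))
```

Keeping the command construction separate from the launch makes the platform logic easy to check without opening any windows.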