Sonogram: Transforming Sound into Visual Fingerprints
A deep dive into Sonogram, a full-stack web app that creates interactive 3D visualizations from the mathematical properties of audio.
- FastAPI
- Three.js
- DSP
- Generative Art
- Sonogram
Sonogram is a full-stack web app that transforms audio files into unique generative visual artworks — “sonic fingerprints.” Upload a song, and it analyzes the audio’s mathematical properties to produce an interactive 3D visualization.
What it does
- Analyzes uploaded audio (with librosa) and renders a real-time 3D visualization using Three.js.
- Exposes an /embed/ route for external use.
Tech Stack
| Layer | Technologies |
|---|---|
| Backend | FastAPI, SQLAlchemy (SQLite), Uvicorn |
| Deployment | Fly.io, Docker |
| Audio Analysis | librosa, scipy, numpy (Single-pass STFT) |
| Frontend | Vanilla JS (~9K lines), Three.js (WebGL), PWA |
| Imaging | Pillow (for PNG generation) |
Interesting Facts
UUIDs are invisibly embedded in artwork PNGs using LSB (Least Significant Bit) encoding in the blue channel.
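As a rough illustration of the LSB idea, the sketch below hides the 128 bits of a UUID in the lowest bit of the blue channel of the first 128 pixels. It is a minimal sketch, not Sonogram's actual code: the function names, the bit order, and the choice of the first 128 pixels are all assumptions, and pixels are represented as plain `(r, g, b)` tuples rather than a Pillow image so the example stays self-contained.

```python
import uuid

UUID_BITS = 128  # a UUID is a 128-bit value


def embed_uuid(pixels, artwork_id):
    """Return a copy of `pixels` with the UUID stored in the blue-channel LSBs.

    `pixels` is a flat list of (r, g, b) tuples. Layout is an assumption:
    one bit per pixel, most significant bit first, starting at pixel 0.
    """
    if len(pixels) < UUID_BITS:
        raise ValueError("image needs at least 128 pixels")
    out = list(pixels)
    for i in range(UUID_BITS):
        bit = (artwork_id.int >> (UUID_BITS - 1 - i)) & 1
        r, g, b = out[i]
        # Clear the lowest blue bit, then set it to the payload bit.
        out[i] = (r, g, (b & ~1) | bit)
    return out


def extract_uuid(pixels):
    """Read the 128 blue-channel LSBs back into a UUID."""
    value = 0
    for _, _, b in pixels[:UUID_BITS]:
        value = (value << 1) | (b & 1)
    return uuid.UUID(int=value)


# Usage: a flat 16x16 "image" of identical pixels.
image = [(10, 20, 30)] * 256
artwork_id = uuid.uuid4()
stamped = embed_uuid(image, artwork_id)
recovered = extract_uuid(stamped)
```

Each blue value changes by at most 1, which is why the mark is invisible. It also only survives lossless formats such as PNG; lossy recompression would destroy the low bits.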
The frontend auto-detects GPU, CPU cores, RAM, and display size. It then classifies the device into LOW / MEDIUM / HIGH tiers and adjusts its rendering settings to match.
A SHA-256 hash of the raw audio bytes serves as the seed. This ensures that the same file always produces the exact same visualization, maintaining a 1:1 relationship between sound and art.
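The deterministic seeding described above could look roughly like this. It is a sketch under assumptions: truncating the digest to 64 bits, the use of `random.Random`, and the parameter names (`hue`, `rotation`, `particles`) are hypothetical and only show how one hash can drive every later random choice.

```python
import hashlib
import random


def seed_from_audio(raw_bytes: bytes) -> int:
    """Derive an integer seed from the SHA-256 digest of the raw file bytes."""
    digest = hashlib.sha256(raw_bytes).digest()
    # Assumption: keep the first 8 bytes (64 bits) of the digest as the seed.
    return int.from_bytes(digest[:8], "big")


def visual_params(raw_bytes: bytes) -> dict:
    """Draw hypothetical visualization parameters from a seeded generator."""
    rng = random.Random(seed_from_audio(raw_bytes))
    return {
        "hue": rng.random(),               # 0.0 .. 1.0
        "rotation": rng.uniform(0, 360),   # degrees
        "particles": rng.randint(500, 5000),
    }


# Usage: identical bytes give identical parameters on every run.
first = visual_params(b"example audio bytes")
second = visual_params(b"example audio bytes")
```

Because the hash covers the raw bytes, a re-encoded copy of the same song (different bitrate or container) would get a different seed and therefore a different artwork.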
Users can switch between three distinct modes in fullscreen.
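The app estimates the musical key with the Krumhansl-Schmuckler algorithm. It correlates the track's chroma vector against the major and minor pitch-class profiles, rotated to each of the 12 possible tonics (24 candidate keys in total), and picks the key with the highest correlation.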
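A minimal, dependency-free sketch of Krumhansl-Schmuckler key estimation follows. The profile values are the published Krumhansl-Kessler ratings; everything else (function names, and the assumption that the input is a single 12-bin chroma vector averaged over the whole track, for example the per-bin mean of a librosa chromagram) is illustrative rather than taken from Sonogram's code.

```python
import math

# Krumhansl-Kessler key profiles, indexed by semitones above the tonic.
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def estimate_key(chroma):
    """Return (tonic, mode, correlation) for a 12-bin chroma vector (C first)."""
    best = None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            # Rotate the profile so its tonic lines up with pitch class `tonic`.
            rotated = [profile[(i - tonic) % 12] for i in range(12)]
            score = pearson(chroma, rotated)
            if best is None or score > best[0]:
                best = (score, NOTES[tonic], mode)
    return best[1], best[2], best[0]


# Usage: a synthetic chroma vector shaped exactly like the G major profile.
g_major_chroma = [MAJOR[(i - 7) % 12] for i in range(12)]
tonic, mode, score = estimate_key(g_major_chroma)
```

The correlation score doubles as a rough confidence value: relative keys such as C major and A minor tend to score close to each other, so a narrow margin between the top two candidates signals an ambiguous result.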
The codebase also includes theme definitions (Bauhaus, Neon Cyberpunk, Aerial Objects) with ControlNet scale parameters, which hints at a future generative AI triptych feature.
The Gallery and Studio operate as a SPA-style vanilla JS application for speed, while individual sonogram pages (/s/{id}) are server-rendered so that social sharing previews and search engines receive complete metadata.
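To show why server rendering matters for the /s/{id} pages, here is a hedged sketch of building the share metadata on the server. Link-preview crawlers generally read the initial HTML without running JavaScript, so the tags have to be in the response itself. The function name, its parameters, and the `https://example.com` base URL are hypothetical; in a FastAPI app the resulting string would typically be placed in the `<head>` of an HTML response.

```python
from html import escape


def render_share_head(sonogram_id, title, image_url, base_url="https://example.com"):
    """Build Open Graph and Twitter Card tags for one sonogram page.

    All values are HTML-escaped because titles come from user uploads.
    """
    page_url = f"{base_url}/s/{sonogram_id}"
    tags = [
        ("property", "og:type", "website"),
        ("property", "og:title", title),
        ("property", "og:url", page_url),
        ("property", "og:image", image_url),
        ("name", "twitter:card", "summary_large_image"),
    ]
    lines = [f"<title>{escape(title)}</title>"]
    for attr, key, value in tags:
        lines.append(f'<meta {attr}="{key}" content="{escape(value, quote=True)}">')
    return "\n".join(lines)


# Usage with made-up values.
head = render_share_head(
    "abc123",
    'Night "Drive" <remix>',
    "https://example.com/img/abc123.png",
)
```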