Real-Time Breath Detection in the Browser: Spectral Centroid and Dual-Path State Machines
Felix Zeller developed @shiihaa/breath-detection to identify respiratory phases using only standard browser APIs. The library runs a 4096-point FFT and analyzes the 150–2500 Hz band to classify inhales and exhales in real time. It tells the two phases apart by their spectra, turbulent nasal airflow on the inhale versus laminar airflow on the exhale, and exposes the result as biofeedback data.
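As a rough illustration of the band analysis described above, the sketch below computes a spectral centroid over the 150–2500 Hz band and labels the frame by comparing it to a split frequency. It is not the library's internal code: the function names, the linear-magnitude input, the 48 kHz sample rate used in the note below, and the 800 Hz split (taken from the inhale/exhale ranges listed under Key Insights) are all assumptions made for this example.

```typescript
// Sketch only: illustrative names, not the @shiihaa/breath-detection internals.

/** Frequency in Hz at the center of FFT bin `bin`. */
function binToHz(bin: number, sampleRate: number, fftSize: number): number {
  return (bin * sampleRate) / fftSize;
}

/**
 * Magnitude-weighted mean frequency (spectral centroid) restricted to
 * [loHz, hiHz]. `magnitudes` is assumed to hold fftSize / 2 linear magnitudes.
 * Returns null when the band carries no energy.
 */
function bandCentroidHz(
  magnitudes: ArrayLike<number>,
  sampleRate: number,
  fftSize: number,
  loHz = 150,
  hiHz = 2500,
): number | null {
  const hzPerBin = sampleRate / fftSize;
  const first = Math.ceil(loHz / hzPerBin);
  const last = Math.min(Math.floor(hiHz / hzPerBin), magnitudes.length - 1);
  let weighted = 0;
  let total = 0;
  for (let bin = first; bin <= last; bin++) {
    const m = magnitudes[bin] ?? 0;
    weighted += m * binToHz(bin, sampleRate, fftSize);
    total += m;
  }
  return total > 0 ? weighted / total : null;
}

type Phase = "inhale" | "exhale" | "unknown";

/** Assumed rule: centroid at or above splitHz suggests an inhale. */
function classifyByCentroid(centroidHz: number | null, splitHz = 800): Phase {
  if (centroidHz === null) return "unknown";
  return centroidHz >= splitHz ? "inhale" : "exhale";
}
```

Assuming a 48 kHz sample rate, a 4096-point FFT gives 48000 / 4096 ≈ 11.72 Hz per bin, so the 150–2500 Hz band covers bins 13 through 213.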
Why This Matters
Traditional energy-based microphone detection often fails because users breathe continuously, leaving no silence gaps, or because the noise floor drifts over time. Browser audio processing also has to cope with non-standardized hardware and platform bugs such as the iOS WKWebView AnalyserNode failure. The library addresses these problems with dual-path detection (thresholds and peaks) and by re-sampling the noise floor automatically every 10 seconds, so accuracy holds up without constant manual recalibration.
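One way to picture the periodic noise-floor re-sampling is the small tracker below. It is a minimal sketch under stated assumptions, not the library's implementation: the class name, the per-frame energy input, and the choice of the 10th percentile as the floor estimate are all hypothetical. Only the 10-second interval comes from the description above.

```typescript
// Sketch only: hypothetical helper, not part of @shiihaa/breath-detection.
class NoiseFloorTracker {
  private samples: number[] = [];
  private windowStartMs: number | null = null;
  private floor: number;
  private readonly intervalMs: number;

  constructor(initialFloor: number, intervalMs = 10_000) {
    this.floor = initialFloor;
    this.intervalMs = intervalMs;
  }

  /** Feed one energy reading; returns the current noise-floor estimate. */
  update(energy: number, nowMs: number): number {
    if (this.windowStartMs === null) this.windowStartMs = nowMs;
    this.samples.push(energy);
    if (nowMs - this.windowStartMs >= this.intervalMs) {
      // Assumption: a low percentile keeps breaths inside the window
      // from inflating the estimate.
      const sorted = [...this.samples].sort((a, b) => a - b);
      const idx = Math.floor(sorted.length * 0.1);
      this.floor = sorted[idx] ?? this.floor;
      // Start a fresh window.
      this.samples = [];
      this.windowStartMs = nowMs;
    }
    return this.floor;
  }
}
```

How the library converts its floor estimate into a detection threshold is not described in this summary, so the sketch stops at the estimate itself.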
Key Insights
- Inhalation generates higher-frequency energy (800–2500 Hz) due to nasal turbulence, while exhalation is lower-frequency (200–800 Hz).
- The library employs a secondary detection path using energy peaks to handle continuous breathing where energy never hits the silence threshold.
- Auto-recalibration cycles occur every 10 seconds to adapt to environmental changes like HVAC noise or microphone hardware swaps.
- A specific iOS bug causes AnalyserNode to return zeroed data in WKWebView, requiring a native AVAudioEngine bridge for Capacitor apps.
- The BreathCycle object provides a labelSwapped flag to indicate when spectral centroid evidence has corrected an initial threshold-based guess.
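The threshold path and the peak path mentioned in the list above can be sketched as a single pass over an energy envelope. This is a simplified offline function rather than the library's state machine. The function name, the frame-based minimum gap, and the prominence rule for accepting a peak are assumptions made for illustration.

```typescript
// Sketch only: a simplified stand-in for dual-path detection.
type Method = "threshold" | "peak";

interface BreathEvent {
  frame: number;
  method: Method;
}

function detectBreaths(
  env: readonly number[],
  threshold: number,
  minGapFrames: number,
  minProminence: number,
): BreathEvent[] {
  const events: BreathEvent[] = [];
  let lastFrame = -Infinity;
  let valley = Infinity; // lowest energy seen since the last event
  for (let i = 1; i < env.length - 1; i++) {
    const prev = env[i - 1] ?? 0;
    const cur = env[i] ?? 0;
    const next = env[i + 1] ?? 0;
    valley = Math.min(valley, prev);
    // Enforce a minimum spacing so one breath is not counted twice.
    if (i - lastFrame < minGapFrames) continue;
    // Path 1: energy rises through the threshold after a quiet gap.
    const crossedUp = prev < threshold && cur >= threshold;
    // Path 2: a local maximum that stands out from the preceding valley,
    // for signals that never drop below the threshold.
    const isPeak =
      cur > prev &&
      cur >= next &&
      valley >= threshold &&
      cur - valley >= minProminence;
    if (crossedUp || isPeak) {
      events.push({ frame: i, method: crossedUp ? "threshold" : "peak" });
      lastFrame = i;
      valley = cur;
    }
  }
  return events;
}
```

With gapped breathing, every event comes from the threshold path. With continuous breathing that stays above the threshold, the same function falls back to peaks. A setting such as minCycleGapSeconds would map onto minGapFrames once the frame rate is known.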
Working Examples
Initializing the BreathDetector with centroid-based inhale/exhale classification.
import { BreathDetector } from '@shiihaa/breath-detection';

const detector = new BreathDetector({
  thresholdFactor: 0.35,
  enableCentroid: true,
  centroidThreshold: 40,
  minCycleGapSeconds: 2.5,
});

// Log each completed breath cycle and the detection path that produced it.
detector.onCycle((cycle) => {
  console.log(`${cycle.inhaleMs}ms in / ${cycle.exhaleMs}ms out`);
  console.log(`Method: ${cycle.method}`); // 'threshold' or 'peak'
});

// Calibrate and begin detection only if start() reports success.
const ok = await detector.start();
if (ok) {
  await detector.calibrate();
  detector.startDetection();
}
Bypassing the iOS WKWebView AnalyserNode bug using a native Capacitor plugin.
import { AudioAnalysis } from '@shiihaa/capacitor-audio-analysis';

await AudioAnalysis.start({ gain: 8.0 });
await AudioAnalysis.addListener('audioData', (data) => {
  console.log('RMS:', data.rms);
  console.log('Band energy:', data.bandEnergy);
});
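An app that targets both browsers and WKWebView may want to notice the zeroed-data symptom at runtime before switching to the native bridge. The helper below is a hypothetical heuristic, not part of either package. It assumes the frames were captured with the standard AnalyserNode.getByteFrequencyData call and that the bug shows up as frames that are entirely zero. A very quiet room could in principle produce the same pattern, so treat the result as a hint.

```typescript
// Sketch only: hypothetical runtime check for all-zero analyser output.

/**
 * Returns true when at least `minFrames` frames were collected and every
 * byte in every frame is zero.
 */
function looksZeroed(frames: readonly Uint8Array[], minFrames = 30): boolean {
  if (frames.length < minFrames) return false; // not enough evidence yet
  return frames.every((frame) => frame.every((value) => value === 0));
}
```

A caller could collect about half a second of frames after the microphone starts, run this check once, and start the native plugin only when it returns true.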
Practical Applications
- Guided Breathwork Apps: Utilizing the onCycle event for box breathing or coherence training; pitfall: failing to monitor the confidence score can lead to inaccurate feedback in loud environments.
- Cross-Platform Capacitor Biofeedback: Using the native audio analysis plugin to ensure functionality on iOS; pitfall: relying on standard Web Audio API results in silent failure on WKWebView.
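For the guided-breathwork case, the confidence pitfall can be handled by discarding low-confidence cycles before computing any feedback. The sketch below is illustrative only. The CycleLike interface mirrors the fields shown or mentioned earlier (inhaleMs, exhaleMs, method, labelSwapped), while the confidence field name, its 0 to 1 range, and the 0.6 cutoff are assumptions, since this summary does not spell them out.

```typescript
// Sketch only: CycleLike is a hypothetical shape, not the library's type.
interface CycleLike {
  inhaleMs: number;
  exhaleMs: number;
  method: "threshold" | "peak";
  labelSwapped: boolean;
  confidence: number; // assumed field name and 0..1 range
}

/**
 * Breathing rate in breaths per minute for one cycle, or null when the
 * cycle is too uncertain to use as feedback.
 */
function breathsPerMinute(cycle: CycleLike, minConfidence = 0.6): number | null {
  if (cycle.confidence < minConfidence) return null;
  const totalMs = cycle.inhaleMs + cycle.exhaleMs;
  return totalMs > 0 ? 60_000 / totalMs : null;
}
```

A 4-second inhale followed by a 6-second exhale yields 6 breaths per minute, a pace often used as a coherence-training target.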