2025-12-09 University of Washington (UW)

The team combined off-the-shelf noise-canceling headphones with binaural microphones to create the prototype, pictured here. (Hu et al./EMNLP)
<Related links>
- https://www.washington.edu/news/2025/12/09/ai-headphones-smart-noise-cancellation-proactive-listening/
- https://aclanthology.org/2025.emnlp-main.1289/
- https://arxiv.org/abs/2503.18698
Proactive Hearing Assistants that Isolate Egocentric Conversations
Guilin Hu, Malek Itani, Tuochao Chen, Shyamnath Gollakota
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
DOI: https://doi.org/10.18653/v1/2025.emnlp-main.1289
Abstract
We introduce proactive hearing assistants that automatically identify and separate the wearer’s conversation partners, without requiring explicit prompts. Our system operates on egocentric binaural audio and uses the wearer’s self-speech as an anchor, leveraging turn-taking behavior and dialogue dynamics to infer conversational partners and suppress others. To enable real-time, on-device operation, we propose a dual-model architecture: a lightweight streaming model runs every 12.5 ms for low-latency extraction of the conversation partners, while a slower model runs less frequently to capture longer-range conversational dynamics. Results on real-world 2- and 3-speaker conversation test sets, collected with binaural egocentric hardware from 11 participants totaling 6.8 hours, show generalization in identifying and isolating conversational partners in multi-conversation settings. Our work marks a step toward hearing assistants that adapt proactively to conversational dynamics and engagement.
Wireless Hearables With Programmable Speech AI Accelerators
Malek Itani, Tuochao Chen, Arun Raghavan, Gavriel Kohlberg, Shyamnath Gollakota
arXiv last revised 22 Oct 2025 (this version, v2)
DOI: https://doi.org/10.48550/arXiv.2503.18698
Abstract
The conventional wisdom has been that designing ultra-compact, battery-constrained wireless hearables with on-device speech AI models is challenging due to the high computational demands of streaming deep learning models. Speech AI models require continuous, real-time audio processing, imposing strict computational and I/O constraints. We present NeuralAids, a fully on-device speech AI system for wireless hearables, enabling real-time speech enhancement and denoising on compact, battery-constrained devices. Our system bridges the gap between state-of-the-art deep learning for speech enhancement and low-power AI hardware by making three key technical contributions: 1) a wireless hearable platform integrating a speech AI accelerator for efficient on-device streaming inference, 2) an optimized dual-path neural network designed for low-latency, high-quality speech enhancement, and 3) a hardware-software co-design that uses mixed-precision quantization and quantization-aware training to achieve real-time performance under strict power constraints. Our system processes 6 ms audio chunks in real-time, achieving an inference time of 5.54 ms while consuming 71.6 mW. In real-world evaluations, including a user study with 28 participants, our system outperforms prior on-device models in speech quality and noise suppression, paving the way for next-generation intelligent wireless hearables that can enhance hearing entirely on-device.
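The real-time claim can be checked directly from the numbers quoted in the abstract (6 ms chunks, 5.54 ms inference, 71.6 mW). A back-of-envelope calculation, using only those figures:

```python
# Back-of-envelope check of the quoted figures: a system keeps up with the
# stream when the real-time factor (inference time / chunk duration) is < 1.

chunk_ms = 6.0     # audio chunk length, from the abstract
infer_ms = 5.54    # measured inference time, from the abstract
power_mw = 71.6    # measured power draw, from the abstract

rtf = infer_ms / chunk_ms            # ~0.92: processing is faster than real time
slack_ms = chunk_ms - infer_ms       # ~0.46 ms of headroom per chunk
energy_uj = power_mw * chunk_ms      # mW * ms = microjoules per 6 ms of audio,
                                     # assuming continuous draw at 71.6 mW
```

At these numbers the headroom is thin (about 8% of the chunk budget), which is consistent with the abstract's emphasis on quantization and hardware-software co-design to meet the deadline under power constraints.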


