New AI-assisted wearable device enables speech even without vocal cords (Speaking without vocal cords, thanks to a new AI-assisted wearable device)

2024-03-14 United States, University of California, Los Angeles (UCLA)

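・ UCLA has developed a thin, flexible, patch-like wearable device that detects movement of the muscles of the human larynx and uses machine learning (ML) to convert those signals into audible speech with roughly 95% accuracy.
・ The soft device, built on a bioelectric system, is attached to the throat to help restore impaired vocal cord function. The team previously developed a wearable glove that translates American Sign Language (ASL) into English speech in real time.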
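・ The device measures about 1.2 inches (3 cm) on each side, weighs about 7 g, and is only 0.06 inch (1.5 mm) thick. It attaches easily to the throat near the vocal cords with biocompatible double-sided tape and can be reused by replacing the tape as needed.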
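・ The new device consists of a self-powered sensor and an actuator. The sensor detects signals generated by muscle movement and converts them into high-fidelity, analyzable electrical signals. An ML algorithm translates those electrical signals into speech signals, and the actuator converts the speech signals into the desired voice output.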
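・ Each of the two components contains two layers: a layer of polydimethylsiloxane (PDMS), a biocompatible silicone compound that provides elasticity, and a magnetic induction layer made of copper induction coils.
・ The device uses a magnetoelastic sensing mechanism, developed by the team in 2021, that detects changes in a magnetic field caused by mechanical force (here, movement of the laryngeal muscles). Induction coils embedded in the magnetic induction layer generate high-fidelity electrical signals that are easy to detect.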
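・ According to research, voice disorders are prevalent across all ages and demographic groups, and nearly 30% of people experience one at least once in their lifetime. Treatments such as surgical intervention and voice therapy can take three months to a year to restore the voice, and some invasive techniques require a long period of postoperative voice rest.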
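・ The researchers collected laryngeal muscle movement data from eight healthy adults, used an ML algorithm to associate the resulting signals with specific words, and then selected the corresponding voice output signal through the actuator. Participants pronounced five sentences both aloud and voicelessly, and the system demonstrated an accuracy of 94.68%.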
・ The team plans to expand the device's vocabulary through ML and to test it with people who have voice disorders.
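・ The research was funded by the National Institutes of Health (NIH), the U.S. Office of Naval Research (ONR), the American Heart Association (AHA), the Brain & Behavior Research Foundation, the UCLA Clinical and Translational Science Institute, and the UCLA Samueli School of Engineering.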
URL: https://newsroom.ucla.edu/releases/speaking-without-vocal-cords-ucla-engineering-wearable-tech

(From NEDO Overseas Technology Information)

Related information

Paper published in Nature Communications (full text)
Speaking without vocal folds using a machine-learning-assisted wearable sensing-actuation system
URL: https://www.nature.com/articles/s41467-024-45915-7

Abstract

Voice disorders resulting from various pathological vocal fold conditions or postoperative recovery of laryngeal cancer surgeries, are common causes of dysphonia. Here, we present a self-powered wearable sensing-actuation system based on soft magnetoelasticity that enables assisted speaking without relying on the vocal folds. It holds a lightweighted mass of approximately 7.2 g, skin-alike modulus of 7.83 × 10⁵Pa, stability against skin perspiration, and a maximum stretchability of 164%. The wearable sensing component can effectively capture extrinsic laryngeal muscle movement and convert them into high-fidelity and analyzable electrical signals, which can be translated into speech signals with the assistance of machine learning algorithms with an accuracy of 94.68%. Then, with the wearable actuation component, the speech could be expressed as voice signals while circumventing vocal fold vibration. We expect this approach could facilitate the restoration of normal voice function and significantly enhance the quality of life for patients with dysfunctional vocal folds.
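
The signal-to-speech step described above can be illustrated with a small sketch. It is not the authors' method: the text here does not say which ML algorithm the study used, so a nearest-centroid classifier stands in for it. The sensor traces are synthetic sine waves, the five sentence labels are hypothetical placeholders, and the actuation step is reduced to looking up which sentence to play back.

```python
import math
import random

# Hypothetical labels: the study used five sentences, but their wording is not given here.
SENTENCES = ["sentence_1", "sentence_2", "sentence_3", "sentence_4", "sentence_5"]


def synthetic_signal(label_index, rng, length=64, noise=0.3):
    """Return a fake sensor trace: a noisy sine whose frequency depends on the label."""
    freq = label_index + 1
    return [
        math.sin(2 * math.pi * freq * t / length) + rng.gauss(0.0, noise)
        for t in range(length)
    ]


def centroid(signals):
    """Average a list of equal-length traces element by element."""
    n = len(signals)
    return [sum(column) / n for column in zip(*signals)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(examples):
    """examples: list of (signal, label_index). Returns one centroid per label."""
    grouped = {}
    for signal, label in examples:
        grouped.setdefault(label, []).append(signal)
    return {label: centroid(signals) for label, signals in grouped.items()}


def classify(model, signal):
    """Pick the label whose centroid is closest to the trace."""
    return min(model, key=lambda label: squared_distance(model[label], signal))


def accuracy(model, examples):
    correct = sum(1 for signal, label in examples if classify(model, signal) == label)
    return correct / len(examples)


def make_dataset(rng, per_label):
    return [
        (synthetic_signal(i, rng), i)
        for i in range(len(SENTENCES))
        for _ in range(per_label)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
model = train(make_dataset(rng, per_label=20))
test_set = make_dataset(rng, per_label=10)
print(f"held-out accuracy: {accuracy(model, test_set):.2%}")
# Stand-in for actuation: choose which stored sentence would be played back.
print("selected output:", SENTENCES[classify(model, test_set[0][0])])
```

Accuracy is measured on traces the classifier did not see during training, which mirrors how a figure such as the reported 94.68% would normally be read. The number printed by this sketch reflects only the synthetic data.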
