2024-11-27 University of Texas at Austin (UT Austin)
<Related information>
- https://news.utexas.edu/2024/11/27/researchers-use-ai-to-turn-sound-recordings-into-accurate-street-images/
- https://www.sciencedirect.com/science/article/abs/pii/S0198971524000516
- https://www.nature.com/articles/s41599-024-03645-7
From hearing to seeing: Linking auditory and visual place perceptions with soundscape-to-image generative artificial intelligence
Yonggai Zhuang, Yuhao Kang, Teng Fei, Meng Bian, Yunyan Du
Computers, Environment and Urban Systems Available online: 1 May 2024
DOI: https://doi.org/10.1016/j.compenvurbsys.2024.102122
Highlights
- A Soundscape-to-Image Diffusion Model is proposed to visualize street soundscapes.
- Human auditory and visual perceptions are linked to understanding the sense of place.
- Soundscapes provide sufficient visual information of places.
Abstract
People experience the world through multiple senses simultaneously, contributing to our sense of place. Prior quantitative geography studies have mostly emphasized human visual perceptions, neglecting human auditory perceptions at place due to the challenges in characterizing the acoustic environment vividly. Also, few studies have synthesized the two-dimensional (auditory and visual) perceptions in understanding human sense of place. To bridge these gaps, we propose a Soundscape-to-Image Diffusion model, a generative Artificial Intelligence (AI) model supported by Large Language Models (LLMs), aiming to visualize soundscapes through the generation of street view images. By creating audio-image pairs, acoustic environments are first represented as high-dimensional semantic audio vectors. Our proposed Soundscape-to-Image Diffusion model, which contains a Low-Resolution Diffusion Model and a Super-Resolution Diffusion Model, can then translate those semantic audio vectors into visual representations of place effectively. We evaluated our proposed model by using both machine-based and human-centered approaches. We proved that the generated street view images align with our common perceptions, and accurately create several key street elements of the original soundscapes. It also demonstrates that soundscapes provide sufficient visual information about places. This study stands at the forefront of the intersection between generative AI and human geography, demonstrating how human multi-sensory experiences can be linked. We aim to enrich geospatial data science and AI studies with human experiences. It has the potential to inform multiple domains such as human geography, environmental psychology, and urban design and planning, as well as advancing our knowledge of human-environment relationships.
Place identity: a generative AI’s perspective
Kee Moon Jang, Junda Chen, Yuhao Kang, Junghwan Kim, Jinhyung Lee, Fabio Duarte & Carlo Ratti
Humanities and Social Sciences Communications Published: 07 September 2024
DOI: https://doi.org/10.1057/s41599-024-03645-7
Abstract
Do cities have a collective identity? The latest advancements in generative artificial intelligence (AI) models have enabled the creation of realistic representations learned from vast amounts of data. In this study, we test the potential of generative AI as the source of textual and visual information in capturing the place identity of cities assessed by filtered descriptions and images. We asked questions on the place identity of 64 global cities to two generative AI models, ChatGPT and DALL·E2. Furthermore, given the ethical concerns surrounding the trustworthiness of generative AI, we examined whether the results were consistent with real urban settings. In particular, we measured similarity between text and image outputs with Wikipedia data and images searched from Google, respectively, and compared across cases to identify how unique the generated outputs were for each city. Our results indicate that generative models have the potential to capture the salient characteristics of cities that make them distinguishable. This study is among the first attempts to explore the capabilities of generative AI in simulating the built environment in regard to place-specific meanings. It contributes to urban design and geography literature by fostering research opportunities with generative AI and discussing potential limitations for future studies.