MIT Develops Machine-learning Tech that Simulates How Listeners Hear Sound from Any Point

Yusuf Balogun
Yusuf is a law graduate and freelance journalist with a keen interest in tech reporting.

Yilun Du, a grad student in the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology, has developed a machine learning technique that accurately captures and models the underlying acoustics of a scene from only a limited number of sound recordings. This machine-learning system can simulate how a listener would hear sound from any point in a room.

Joining Du on the work are lead author Andrew Luo, a graduate student at Carnegie Mellon University (CMU); Michael J. Tarr, the Kavčić-Moura Professor of Cognitive and Brain Science at CMU; senior author Joshua B. Tenenbaum, the Paul E. Newton Career Development Professor of Cognitive Science and Computation in MIT’s Department of Brain and Cognitive Sciences and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); and Antonio Torralba, the Delta Electronics Professor of Electrical Engineering and Computer Science at MIT. The research will be presented at the Conference on Neural Information Processing Systems.

The researchers also investigated how spatial acoustic information could help robots better understand their environments. They built a machine-learning model that captures how any sound in a room travels through space, which lets it simulate what a listener would hear at different positions.
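In acoustics, the way a room shapes sound between one source position and one listener position is commonly summarised by an impulse response, and what the listener hears is the original sound convolved with that response. The sketch below illustrates only that final step. The impulse response values and the names `convolve`, `impulse_response`, and `dry_sound` are made up for illustration and are not taken from the researchers' system.

```python
def convolve(signal, impulse_response):
    """Discrete linear convolution: every input sample triggers a scaled copy of the response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

# Hypothetical impulse response for one (source, listener) pair:
# the direct sound arrives first, followed by two weaker, delayed reflections.
impulse_response = [1.0, 0.0, 0.5, 0.0, 0.25]

# A short "dry" sound as it would be emitted at the source.
dry_sound = [1.0, -1.0, 0.5]

heard = convolve(dry_sound, impulse_response)
print(heard)  # [1.0, -1.0, 1.0, -0.5, 0.5, -0.25, 0.125]
```

A model that can predict the impulse response for any pair of positions can therefore render any sound as it would be heard at any point in the room.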

In computer vision research, a type of machine-learning model called an implicit neural representation has been used to produce smooth, continuous reconstructions of 3D scenes from images. These models rely on neural networks, which consist of layers of interconnected nodes, or neurons, that process data to complete a task.

The MIT researchers used the same kind of model to represent how sound travels continuously through a scene.
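To make the idea of a continuous, coordinate-based model concrete, here is a toy sketch of a network that maps a position to a single number and can be queried at any point rather than only on a fixed grid. It is untrained (the weights are random) and much smaller than anything used in research. The class name `TinyField`, the sinusoidal encoding, and the layer sizes are assumptions chosen for illustration, not details reported about the researchers' model.

```python
import math
import random

def positional_encoding(x, num_freqs=4):
    """Map one coordinate to sin/cos features at increasing frequencies."""
    feats = []
    for k in range(num_freqs):
        feats.append(math.sin((2 ** k) * math.pi * x))
        feats.append(math.cos((2 ** k) * math.pi * x))
    return feats

class TinyField:
    """Toy coordinate network: a list of coordinates in, one scalar out."""

    def __init__(self, num_coords, num_freqs=4, hidden=16, seed=0):
        rng = random.Random(seed)  # fixed seed so the sketch is repeatable
        in_dim = num_coords * 2 * num_freqs
        self.num_freqs = num_freqs
        self.w1 = [[rng.gauss(0, 1 / math.sqrt(in_dim)) for _ in range(in_dim)]
                   for _ in range(hidden)]
        self.w2 = [rng.gauss(0, 1 / math.sqrt(hidden)) for _ in range(hidden)]

    def __call__(self, coords):
        # Encode every coordinate, then apply one hidden layer and a linear output.
        feats = [f for c in coords for f in positional_encoding(c, self.num_freqs)]
        hidden = [math.tanh(sum(w * f for w, f in zip(row, feats))) for row in self.w1]
        return sum(w * h for w, h in zip(self.w2, hidden))

field = TinyField(num_coords=2)   # e.g. a listener's (x, y) position
value_a = field([0.25, 0.40])     # query at one point
value_b = field([0.26, 0.40])     # query at a nearby point, no grid required
print(value_a, value_b)
```

Because the network is an ordinary smooth function of its inputs, nearby positions give nearby outputs, which is what makes the reconstruction continuous.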

However, they found that vision models benefit from a property known as photometric consistency, which does not carry over to sound. The same object looks roughly the same when viewed from two different locations. With sound, by contrast, moving to a different location can produce a completely different result because of obstacles, distance, and other factors. This makes audio prediction very difficult.

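To overcome this problem, the researchers built two properties of acoustics into their model: the reciprocal nature of sound and the influence of local geometric features. Reciprocity means that if the sound source and the listener swap positions, what the listener hears does not change.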
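One simple way to guarantee reciprocity in a model is to evaluate it with the source and listener in both orders and average the two results. The sketch below shows that idea only. The article does not say how the researchers enforce reciprocity, so this should not be read as their method, and `raw_model` is a hypothetical stand-in for a learned network.

```python
import math

def raw_model(source, listener):
    """Stand-in for a learned network; deliberately NOT symmetric in its inputs."""
    sx, sy = source
    lx, ly = listener
    distance = math.hypot(sx - lx, sy - ly)
    return math.exp(-distance) + 0.1 * sx  # the 0.1 * sx term breaks the symmetry

def reciprocal_model(source, listener):
    """Average both orderings so that swapping the two positions cannot change the output."""
    return 0.5 * (raw_model(source, listener) + raw_model(listener, source))

a, b = (1.0, 2.0), (4.0, 0.5)
print(raw_model(a, b) == raw_model(b, a))                # False
print(reciprocal_model(a, b) == reciprocal_model(b, a))  # True
```

An alternative with the same goal is to train on randomly swapped source and listener positions, which encourages symmetry without hard-wiring it.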

“If you imagine standing near a doorway, what most strongly affects what you hear is the presence of that doorway, not necessarily geometric features far away from you on the other side of the room. We found this information enables better generalization than a simple fully connected network,” says lead author Andrew Luo.
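A common way to give a model access to local information is to store features on a grid laid over the room and interpolate them at the query position, so that a point is influenced mainly by what is stored nearby. The sketch below uses one number per grid cell and bilinear interpolation. The grid values, the "doorway" label, and the function name `local_feature` are illustrative assumptions; a real system would typically learn a vector of features per cell.

```python
def local_feature(grid, x, y):
    """Bilinearly interpolate a 2D grid of scalar features at a continuous (x, y).

    grid[row][col] holds the feature at integer position (x=col, y=row);
    the grid must be at least 2 x 2.
    """
    rows, cols = len(grid), len(grid[0])
    # Clamp the query so its 2 x 2 neighbourhood stays inside the grid.
    x = min(max(x, 0.0), cols - 1.0)
    y = min(max(y, 0.0), rows - 1.0)
    x0 = min(int(x), cols - 2)
    y0 = min(int(y), rows - 2)
    tx, ty = x - x0, y - y0
    top = (1 - tx) * grid[y0][x0] + tx * grid[y0][x0 + 1]
    bottom = (1 - tx) * grid[y0 + 1][x0] + tx * grid[y0 + 1][x0 + 1]
    return (1 - ty) * top + ty * bottom

# Hypothetical feature map: a single strong feature (say, a doorway) at x=1, y=1.
grid = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]

print(local_feature(grid, 1.0, 1.0))  # 1.0  (standing at the doorway)
print(local_feature(grid, 0.5, 1.0))  # 0.5  (halfway towards it)
print(local_feature(grid, 0.0, 0.0))  # 0.0  (far corner)
```

The interpolated feature can then be passed to the network alongside the position, so its prediction depends most on nearby geometry.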

In addition, they found that feeding the acoustic information their model learns into a computer vision model can improve the visual reconstruction of the scene.

“When you only have a sparse set of views, using these acoustic features enables you to capture boundaries more sharply, for instance. And maybe this is because to accurately render the acoustics of a scene, you have to capture the underlying 3D geometry of that scene,” Du says.

The researchers plan to keep improving the model so that it can generalize to new scenes. They also want to apply the technique to more complex impulse responses and to larger scenes, such as entire buildings or even a whole town or city.

“This new technique might open up new opportunities to create a multimodal immersive experience in the metaverse application,” adds Chuang Gan, a co-author of the work and a principal research staff member at the MIT-IBM Watson AI Lab.

