
Researchers find a way to make photos and muted videos ‘speak’ – here’s what it could mean for your privacy

Web Hosting & Remote IT Support

Capturing audio from a still image may feel like something out of a sci-fi novel, but one scientist has actually devised a way to do it, with the helping hand of AI.

By creating a machine learning tool called Side Eye, a team led by Kevin Fu, professor of electrical and computer engineering and computer science at Northeastern University, can extract an extraordinary amount of information from images.

By applying Side Eye to a still image, the team can determine the gender of a speaker in the room where the photo was taken, and the words they spoke, according to TechXplore. They can also apply the tool to muted videos.

An AI-powered privacy nightmare?

"Imagine someone is doing a TikTok video and they mute it and dub music," Fu told the publication. "Have you ever been curious about what they're really saying? Was it 'Watermelon watermelon' or 'Here's my password?' Was somebody speaking behind them? You can actually pick up what is being spoken off camera."

The machine learning-powered Side Eye exploits the image stabilization technology built into almost all smartphone cameras.

Smartphone cameras use springs to suspend the lens in liquid, so photos don't come out blurry or out of focus because of somebody's shaky grip. Sensors detect the shake and an electromagnet pushes the lens in the opposite direction, stabilizing the image.
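The counter-motion idea can be sketched in a few lines. This is a toy illustration, not a real OIS controller; the function name, gain value, and gyro readings are all illustrative assumptions:

```python
# Toy sketch of the optical image stabilization (OIS) principle:
# sensors measure the shake and an electromagnet pushes the
# liquid-suspended lens by the equal and opposite amount.

def counter_motion(shake, gain=1.0):
    """Correction applied to the lens: opposite to the measured shake.
    gain=1.0 is an idealized assumption; real hardware is imperfect."""
    return -gain * shake

# Hypothetical gyro readings (lens shake, arbitrary units).
shake_samples = [0.40, -0.75, 0.20, -0.05]

# Residual lens motion after correction.
residual = [s + counter_motion(s) for s in shake_samples]
print(residual)  # → [0.0, 0.0, 0.0, 0.0]
```

In practice the cancellation is never perfect, and it is precisely the residual micro-motion of the springs, modulated by nearby sound, that an attack like Side Eye reads.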

When somebody speaks near the camera lens while a photo is being taken, their voice creates tiny vibrations in those springs, subtly bending the light. Extracting a sonic frequency from these vibrations alone would be near-impossible, but the rolling shutter method of photography most cameras use makes it feasible.

"The way cameras work today to reduce cost basically is they don't scan all pixels of an image simultaneously – they do it one row at a time," Fu added. "[That happens] hundreds of thousands of times in a single photo. What this basically means is you're able to amplify by over a thousand times how much frequency information you can get, basically the granularity of the audio."
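The sampling-rate argument Fu describes can be demonstrated with a short simulation. This is a sketch, not the Side Eye implementation; the frame rate, row count, and test tone are illustrative assumptions:

```python
import numpy as np

# Rolling shutter reads the sensor one row at a time, so each row
# samples the vibrating lens at a slightly different instant.
FPS = 30            # frames per second (assumed)
ROWS = 3_000        # sensor rows read sequentially per frame (assumed)
rate = FPS * ROWS   # effective sample rate: 90,000 samples/s,
                    # a 3,000x gain over one sample per frame

# Simulate one second of lens displacement caused by a 440 Hz tone
# vibrating the stabilization springs.
t = np.arange(rate) / rate
displacement = 0.01 * np.sin(2 * np.pi * 440 * t)

# Recover the dominant frequency from the per-row displacements.
spectrum = np.abs(np.fft.rfft(displacement))
freqs = np.fft.rfftfreq(len(displacement), d=1 / rate)
print(f"Recovered tone: {freqs[np.argmax(spectrum)]:.0f} Hz")  # → 440 Hz
```

Sampling once per frame (30 Hz) could not capture speech at all, since the Nyquist limit would sit at 15 Hz; sampling once per row pushes that limit well into the audible range.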

While Side Eye itself is still in a basic form and requires far more training data to refine, a more advanced version of the system falling into the wrong hands could pose a cybersecurity nightmare for many.

But the technology has positive implications too, particularly if a far more advanced form of Side Eye were used to gather digital evidence for criminal investigations.
