
Researchers find a way to make photos and muted videos ‘speak’ – here’s what it could mean for your privacy


Capturing audio from a still image may sound like something out of a sci-fi novel, but researchers have devised a way to do it with the help of AI.

A team led by Kevin Fu, professor of electrical and computer engineering and computer science at Northeastern University, has built a machine learning tool called Side Eye that can read into images to an extraordinary degree.

By applying Side Eye to a still image, the team can determine the gender of a speaker in the room where the photo was taken, as well as the words they spoke, according to TechXplore. The tool can also be applied to muted videos.

An AI-powered privacy nightmare?

"Imagine someone is doing a TikTok video and they mute it and dub music," Fu told the publication. "Have you ever been curious about what they're really saying? Was it 'Watermelon watermelon' or 'Here's my password?' Was somebody speaking behind them? You can actually pick up what is being spoken off camera."

The machine learning-powered Side Eye exploits the image stabilization technology found in almost all smartphone cameras.

Cameras built into smartphones use springs to suspend the lens in liquid, so that photos don't come out blurry or out of focus because of a shaky grip. Sensors and an electromagnet work together to push the lens in the opposite direction to any shake, stabilizing the image.

When somebody speaks near the camera lens while a photo is being taken, the sound creates tiny vibrations in the springs and subtly bends the light. Extracting sound frequencies from these vibrations would ordinarily be near-impossible, but the rolling shutter method that most cameras use makes it far easier.

"The way cameras work today to reduce cost basically is they don't scan all pixels of an image simultaneously – they do it one row at a time," Fu added. "[That happens] hundreds of thousands of times in a single photo. What this basically means is you're able to amplify by over a thousand times how much frequency information you can get, basically the granularity of the audio."

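To put rough numbers on Fu's point, here is a minimal sketch of the arithmetic, assuming a 30 frames-per-second video with 1,080 rows per frame and an idealized readout in which rows are captured evenly across the frame interval. These values and the function names are illustrative assumptions, not figures from the Side Eye research. Sampling once per frame could only resolve frequencies up to half the frame rate, while sampling once per row raises that ceiling by a factor equal to the number of rows.

```python
# Illustrative, assumed values; not taken from the article or the Side Eye work.
FRAME_RATE_HZ = 30       # frames captured per second
ROWS_PER_FRAME = 1080    # sensor rows read out one after another


def nyquist_limit(sample_rate_hz: float) -> float:
    """Highest frequency recoverable at a given sampling rate."""
    return sample_rate_hz / 2


def effective_row_rate(frame_rate_hz: float, rows_per_frame: int) -> float:
    """Samples per second if every row counts as its own sample.

    Idealized: assumes rows are read evenly across the whole frame interval.
    """
    return frame_rate_hz * rows_per_frame


frame_limit = nyquist_limit(FRAME_RATE_HZ)
row_rate = effective_row_rate(FRAME_RATE_HZ, ROWS_PER_FRAME)
row_limit = nyquist_limit(row_rate)
gain = row_rate / FRAME_RATE_HZ

print(f"Per-frame sampling resolves up to {frame_limit} Hz")
print(f"Per-row sampling resolves up to {row_limit} Hz")
print(f"Increase in sampling rate: {gain}x")
```

Under these assumptions the per-frame ceiling is 15 Hz, far below the range of human speech, while the per-row ceiling is 16,200 Hz, which is in line with the "over a thousand times" improvement Fu describes.
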
Side Eye is still in a very basic form and needs far more training data to refine it. But should a more advanced version of the system fall into the wrong hands, it could pose a serious privacy and cybersecurity risk.

There are positive implications for the technology too, especially if a far more advanced form of Side Eye could be used as a source of digital evidence by those investigating crime.
