Hey Siri, what does the "S" stand for in "IoT"?
January 16, 2020 4:27 AM

It turns out that the MEMS microphones used in most always-listening voice-activated home assistants are sensitive not only to sound but to modulated light as well. Smarter Every Day explores some of the consequences.
posted by flabdablet (48 comments total) 19 users marked this as a favorite
 
This is pretty bad. But, to be honest, it's also kind of hilarious.
posted by ardgedee at 4:32 AM on January 16 [2 favorites]


There’s much to IoT that is sensible, securable and useful. And then there’s so much of it that is developers not caring about your security (generally Lucky Dragon brand), and so so much of it is developers banking on your lack of security (generally US MegaCorp brand).
posted by pompomtom at 4:49 AM on January 16 [5 favorites]


tbh, I can't honestly say it would occur to me, if I were in charge of testing these things, to wonder if the microphone could pick up light. Maybe it should occur to someone who is a qualified engineer for this kind of gadget, though.
posted by thelonius at 4:49 AM on January 16 [1 favorite]


“Should” there is presumably implying that the rights of the user were any part of the design process. You can fix all of that shit (for your company) with a click-wrapper that no-one will read.
posted by pompomtom at 4:53 AM on January 16 [6 favorites]


While I agree that in general there's a lack of care or concern for user protection in IOT tools, I'm not sure this is the best specific example to use as illustration; a hijackable home assistant is not something the service provider wants either since from their point of view it inserts noise into the data they're collecting on you.

This should also be pretty easy to fix with a piece of black tape.
posted by ardgedee at 5:07 AM on January 16 [3 favorites]


I think I get your point, but the design of “let’s have a device that takes unauthenticated input and then does things on the user’s behalf, filtering the lot through MegaCorp’s servers” seems a bit shit to me.

It’s probably just the security:convenience trade-off again, but with amendment, and MegaCorp is selling convenience, not the other thing.
posted by pompomtom at 5:27 AM on January 16 [5 favorites]


...and yeah, I’ve worked in a whole team of data-monkeys who give no shit whatsoever about their own privacy, as we attempt to aggregate loads of data ourselves (subject to the privacy legislation under which we operate), because “whatever... google already knows”, so it may be that I’m an old crank.
posted by pompomtom at 5:34 AM on January 16


You can do nasty things to certain computers with light: the Raspberry Pi 2 Xenon Death Flash, for instance. MEMS mics are in so many devices and can be always-on and you'd never know.
posted by scruss at 5:53 AM on January 16 [5 favorites]


because “whatever... google already knows”

I think this attitude has been the biggest cause of the erosion of privacy - passive acquiescence on account of perceived powerlessness.

If only people had said "Yes, I understand that the CIA, FBI, NSA, GCHQ, Google, Facebook, etc. are going to covertly gather my personal data, and that there is little or nothing I can do about it - but I will be damned to hell if I ever give them my actual consent to do so."
posted by Cardinal Fang at 6:14 AM on January 16 [31 favorites]


(To answer the question posed in the post's title, the customary answer from infosec people is "Shit.")
posted by wenestvedt at 6:18 AM on January 16 [1 favorite]


Maybe it should occur to someone who is a qualified engineer for this kind of gadget, though.

Not really, though now that people are mentioning it, yeah I can see how that could work.

I think from an IoT company's point of view, this mechanism isn't amenable to widespread exploitation. So you're not going to wake up one day and find news reports that millions of your devices were hacked overnight. And if it only affects a few hundred people who are specifically targeted, does the company even care? (They should, but you know the real answer.)

OTOH, if someone started selling a cheap TV-B-Gone-like device to fuck with home assistants, then the companies would start to worry.
posted by ryanrs at 6:20 AM on January 16 [3 favorites]


Once someone starts selling a device that makes your friend's Alexa automatically order Rick Astley's Greatest Hits, companies will fix this issue.
posted by ryanrs at 6:31 AM on January 16 [4 favorites]


Light has pressure; intense light a significant amount. Digitally pulse a laser above the Nyquist frequency for speech and whatever that laser hits (especially if it's designed to respond to speech frequencies) will act like a low pass filter and "hear" what the laser is saying.

(I hear this every day when laser engraving; the beam is modulated at 8 kHz and when I'm carving something at a modulated fraction of full power the object squeals, if it's thin enough to resonate.)
posted by seanmpuckett at 6:36 AM on January 16 [21 favorites]
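A minimal sketch of the low-pass claim above, in Python. It assumes a pulse-width-modulated on/off train and a single-pole filter standing in for whatever the beam hits; the 200 Hz tone, the tick count, and the filter constant `alpha` are all made-up illustration values, not measurements from an engraver or a microphone.

```python
import math

def pwm_train(levels, ticks_per_period):
    """Encode each level in [0, 1] as one on/off period whose 'on' fraction equals the level."""
    train = []
    for level in levels:
        on = round(level * ticks_per_period)
        train.extend([1.0] * on + [0.0] * (ticks_per_period - on))
    return train

def low_pass(samples, alpha):
    """Single-pole low-pass filter (exponential moving average)."""
    y, out = 0.0, []
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out

def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Assumed numbers: a 200 Hz tone carried on an 8 kHz pulse train, 50 ticks per pulse period.
CARRIER_HZ, TONE_HZ, TICKS = 8000, 200, 50
tone = [0.5 + 0.4 * math.sin(2 * math.pi * TONE_HZ * k / CARRIER_HZ) for k in range(400)]
filtered = low_pass(pwm_train(tone, TICKS), alpha=0.01)
heard = filtered[TICKS - 1::TICKS]  # one reading at the end of each carrier period

# Skip the first few periods while the filter settles from zero.
print(f"correlation with the original tone: {correlation(tone[20:], heard[20:]):.2f}")
```

The filter never sees the tone directly, only on/off pulses, yet its output tracks the tone closely because the 8 kHz switching is well above the filter's cutoff and gets averaged away.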


To be sure the photoelectric effect is also a real thing, but with light pressure you don't need to specifically knock the electrons off something to have it respond to light, you are actually pummelling it with photons and it mechanically reacts.
posted by seanmpuckett at 6:39 AM on January 16 [2 favorites]


Once someone starts selling a device that makes your friend's Alexa automatically order Rick Astley's Greatest Hits, companies will fix this issue.

Unless the companies also sell dollhouses.
posted by Cardinal Fang at 6:47 AM on January 16


Once someone starts selling a device that makes your friend's Alexa automatically order Rick Astley's Greatest Hits, companies will fix this issue.

Hello yes I believe you meant to include a link to your Kickstarter.
posted by Mayor West at 6:54 AM on January 16 [15 favorites]


Light has pressure; intense light a significant amount. Digitally pulse a laser above the Nyquist frequency for speech and whatever that laser hits (especially if it's designed to respond to speech frequencies) will act like a low pass filter and "hear" what the laser is saying.

(I hear this every day when laser engraving; the beam is modulated at 8 kHz and when I'm carving something at a modulated fraction of full power the object squeals, if it's thin enough to resonate.)


The energy of a photon E = pc where p is momentum & c is the speed of light.

Therefore p = E / c. The momentum of a 100J pulse of light is 10 / 3x10^8 or 3 x 10^-7 kg m / s. Ergo a 100 watt laser exerts a force of 3x10^-7 N.

I don’t think that’s enough force to accelerate much of anything enough to generate sound. You can shape the pulses of course, but then you apply the acceleration for a shorter time & the net momentum transfer is the same. It’s more likely that the sound comes from thermal cycling of the material - the photoacoustic effect.
posted by pharm at 7:40 AM on January 16 [7 favorites]
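For anyone who wants to rerun the momentum arithmetic above, here is a small Python sketch. It assumes a beam that is either fully absorbed or partly reflected; the `reflectivity` parameter is an addition for illustration and is not part of the comment's calculation.

```python
C = 299_792_458.0  # speed of light in m/s

def photon_momentum(energy_joules):
    """Momentum carried by light of the given energy, p = E / c."""
    return energy_joules / C

def radiation_force(power_watts, reflectivity=0.0):
    """Force of a beam on a surface, F = (1 + R) * P / c.

    R = 0 is a perfect absorber, R = 1 a perfect mirror (assumed idealisations).
    """
    return (1.0 + reflectivity) * power_watts / C

print(f"100 J pulse:          {photon_momentum(100):.2e} kg m/s")
print(f"100 W beam, absorbed: {radiation_force(100):.2e} N")
print(f"100 W beam, mirrored: {radiation_force(100, 1.0):.2e} N")
```

A perfect mirror only doubles the figure, so the order of magnitude stays the same either way.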


Pharm, I'm not going to extensively correct your math for you, but I will tell you that 80 dB converts to 0.2 N/m^2. One doesn't need a whole lot of newtons to wiggle a diaphragm whose weight is measured in picograms, especially when wiggling it with a beam with energy measured in watts over an area of maybe 0.1 mm^2. I figure you can take it from there.
posted by seanmpuckett at 8:18 AM on January 16 [3 favorites]
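One way to "take it from there", sketched in Python under stated assumptions: dB SPL is referenced to 20 micropascals, the 0.1 mm^2 patch is the figure from the comment above, and the beam is treated as fully absorbed so that force equals power divided by c. The function names are made up for this sketch.

```python
C = 299_792_458.0   # speed of light in m/s
P_REF = 20e-6       # reference pressure for dB SPL, in pascals

def spl_to_pascals(db_spl):
    """Convert a sound pressure level in dB SPL to pascals (N/m^2)."""
    return P_REF * 10 ** (db_spl / 20)

def force_on_area(pressure_pa, area_mm2):
    """Force in newtons from a pressure acting on an area given in mm^2."""
    return pressure_pa * area_mm2 * 1e-6

def equivalent_beam_power(force_n):
    """Absorbed beam power whose radiation force equals force_n (F = P / c)."""
    return force_n * C

pressure = spl_to_pascals(80)           # about 0.2 Pa
force = force_on_area(pressure, 0.1)    # about 2e-8 N on a 0.1 mm^2 patch
print(f"80 dB SPL = {pressure:.3f} Pa, force on 0.1 mm^2 = {force:.1e} N")
print(f"absorbed beam power for the same force: {equivalent_beam_power(force):.1f} W")
```

Under these assumptions, matching an 80 dB sound by light pressure alone takes a beam of several watts landing entirely on that patch.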


God I love physics arguments.
posted by biogeo at 8:22 AM on January 16 [9 favorites]


I don’t think that’s enough force to accelerate much of anything enough to generate sound.
Pharm - that's an interesting calculation and it looks basically correct to me (except for one spot where I think you meant to write '100' instead of '10', but the final answer is right). However, for comparison, do you know how much force would be required to produce audible sound? The threshold of hearing is about 1 x 10^-12 W/m^2 which is pretty tiny. How to compare that number to the 3x10^-7 N that you calculated?
posted by crazy_yeti at 8:22 AM on January 16 [1 favorite]
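A hedged sketch of one way to put the two numbers on the same footing: convert the threshold intensity to a pressure using the plane-wave relation I = p^2 / (rho * c), then divide the force by an area. The value 413 Pa·s/m for the characteristic impedance of air is an approximate room-temperature figure, and the function names are made up for this sketch.

```python
import math

RHO_C_AIR = 413.0  # approximate characteristic impedance of air near 20 C, in Pa*s/m

def intensity_to_pressure(intensity_w_m2):
    """RMS sound pressure of a plane wave with the given intensity."""
    return math.sqrt(intensity_w_m2 * RHO_C_AIR)

def force_to_pressure(force_n, area_m2):
    """Average pressure when a force is spread over an area."""
    return force_n / area_m2

threshold = intensity_to_pressure(1e-12)   # roughly 20 micropascals
beam = force_to_pressure(3e-7, 1e-6)       # the 3e-7 N figure spread over 1 mm^2
print(f"threshold of hearing: {threshold:.2e} Pa")
print(f"3e-7 N over 1 mm^2:   {beam:.2e} Pa")
```

A force only becomes comparable to a sound level once an area is chosen, which is why the spot size matters so much in this argument.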


qualified engineer

"Qualified" in what exactly? How exactly does "qualifications" help thinking about security?

/noncreeping-credentialist

Seriously, QA and systems solutions help with the problems you know about already, and help manage things like error rates. What they don't help with is outside-context problems, which is what this is.

But the biggest thing in a functioning, effective quality system is active engagement. Pro forma qualifications are not necessary and definitely not sufficient. Indeed, in my experience, they are often over-relied upon, masking real fixable problems.
posted by bonehead at 8:25 AM on January 16


the design of “let’s have a device that takes unauthenticated input and then does things on the user’s behalf, filtering the lot through MegaCorp’s servers” seems a bit shit to me.

Yeah, I agree that this is the real issue here. The attack vector is really interesting, but the actual vulnerability is in just blindly accepting and acting on input from voice-activated devices. The assumption seems to be that these devices are not on the attack surface by virtue of maybe being inside someone's house, but it should be obvious that that's false.
posted by biogeo at 8:27 AM on January 16 [4 favorites]


Maybe it should occur to someone who is a qualified engineer for this kind of gadget, though.

I'm a hardware engineer who works in the medical space, recently on medical IoT (those two words should scare the piss out of all of you). The most important design criteria for people in my position are "what's the bare minimum we need to do to get through FDA" and "what's our shortest time to market". The second criterion is one that I share with the rest of the IoT space and is the reason that giant wads of people's personal information keep getting left in publicly accessible AWS buckets. It's why people use gobs of poorly audited libraries from God knows where. It's why privacy is always an afterthought and it's why under the hood most IoT hardware looks pretty much the same (broadly speaking). Speed and desire for low technical risk drive nearly all design decisions.

The one, single way to make the companies that employ people like me give a shit about your privacy is to give regulatory bodies teeth. Make lax security really expensive and stuff will change.

I've also worked on hardware projects for the casino gaming industry, and boy howdy do they pay attention to security. Broken stuff still shows up when somebody screws up, but because money is on the line, security is a first priority, not an afterthought.
posted by Dr. Twist at 8:32 AM on January 16 [24 favorites]


wiggling it with a beam with energy measured in watts
seanmpuckett - the beam energy might be measured in watts, but very little of that energy results in momentum transfer, since the photons are massless, their (relativistic) momentum is tiny. So very little of the beam energy is available for "wiggling" the membrane.

After reading up on the photoacoustic effect (thanks Wikipedia) it now seems clear that the effect is mostly due to thermal effects due to absorbed energy heating the sample. There are other contributing factors, but they are due to photochemical reactions, not simple momentum-transfer from light pressure.
posted by crazy_yeti at 8:35 AM on January 16 [2 favorites]


And before somebody jumps on me let me point out that I am aware that watts are a unit of power, not energy! Replace "energy" by "power" in the above for dimensional correctness.
posted by crazy_yeti at 8:38 AM on January 16 [1 favorite]


More generally, a recent talk by Ross Anderson of Cambridge University at the CCC last month: The sustainability of safety, security and privacy
posted by swr at 9:01 AM on January 16 [1 favorite]


"Qualified" in what exactly? How exactly does "qualifications" help thinking about security?

Well, I don't know, but one would like to think that the companies building these devices do.
posted by thelonius at 9:03 AM on January 16 [1 favorite]


Sean: hey, if there’s an error in my maths feel free to correct it (crazy_yeti is correct that I lost a 0 in the middle there).
posted by pharm at 9:03 AM on January 16


Stuff like this is why we still don't have any smart devices in our home, except for our phones, which of course may be covertly listening and watching at all times.
posted by grumpybear69 at 9:04 AM on January 16 [3 favorites]


Also, to piggyback on bonehead's comment, most of the best security people I've worked with over the years had no formal credentialing and some of them didn't have any academic qualifications at all.
posted by Dr. Twist at 9:06 AM on January 16 [1 favorite]


nb. What are you engraving that’s free floating & weighs picograms? Or were you talking about the MEMS microphones here?
posted by pharm at 9:07 AM on January 16


How exactly does "qualifications" help thinking about security?

I simply meant understanding the technical details and physics and electronics of the microphone. I don't, and I'd imagine that someone else does. That person, not me, should be working on security for these IoT abominations.

If you want to quibble about calling that "qualifications", enjoy.
posted by thelonius at 9:19 AM on January 16 [2 favorites]


I think I get your point, but the design of “let’s have a device that takes unauthenticated input and then does things on the user’s behalf, filtering the lot through MegaCorp’s servers” seems a bit shit to me.

Except, not really. Even the most insecure device in the video (Android tablet) required a recording of the device owner's voice. I had an embarrassing number of Siri-enabled devices in earshot when I played this video (ok, it was 7), and only my HomePod acknowledged the "Hey Siris". Even then the response was that it didn't recognize the voice of whoever was talking.

And that's all this attack really is. Instead of just replaying a recording of the actual device owner with sound, they are doing the replay with light. Although, I do admit that infrared lasers would attract much less attention than an extremely loud recording being blasted outside someone's window.

However, even with this replay attack, all devices are not created equal. You'll notice they couldn't get the iPhone to work bar a controlled lab experiment using an older gen device in a vice that was already unlocked. Also, you'll notice that for anything dealing with doors, Siri required an unlocked device. In fact, for some things (like a security system), Siri will be able to arm it, but disarming cannot be done via voice, no matter whether the device is unlocked or not.

LOL at the "I assume other devices do it as well" when it came to Samsung et al requiring the device to be unlocked. You just did it with an Android tablet my guy.
posted by sideshow at 9:31 AM on January 16
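The tiered behaviour described above (harmless commands for anyone, door commands only on an unlocked device, disarming never by voice) can be sketched as a small policy table in Python. Everything here is hypothetical: the command names, the `Tier` enum, and `is_allowed` are invented for illustration and are not how Siri or any real assistant is implemented.

```python
from dataclasses import dataclass
from enum import Enum

class Tier(Enum):
    ANYONE = "anyone"            # harmless queries, arming an alarm
    UNLOCKED_ONLY = "unlocked"   # e.g. door commands
    NEVER_BY_VOICE = "never"     # e.g. disarming an alarm

# Hypothetical command table; a real assistant would keep its own.
POLICY = {
    "weather": Tier.ANYONE,
    "arm_alarm": Tier.ANYONE,
    "unlock_door": Tier.UNLOCKED_ONLY,
    "disarm_alarm": Tier.NEVER_BY_VOICE,
}

@dataclass(frozen=True)
class VoiceRequest:
    command: str
    device_unlocked: bool

def is_allowed(request):
    """Decide whether a voice request may run under the assumed policy."""
    # Unknown commands fall into the strictest tier rather than the loosest.
    tier = POLICY.get(request.command, Tier.NEVER_BY_VOICE)
    if tier is Tier.ANYONE:
        return True
    if tier is Tier.UNLOCKED_ONLY:
        return request.device_unlocked
    return False

print(is_allowed(VoiceRequest("unlock_door", device_unlocked=False)))
print(is_allowed(VoiceRequest("disarm_alarm", device_unlocked=True)))
```

The point of the sketch is that the gate does not care whether the "voice" arrived as sound or as a laser: a replayed command only gets as far as the tier it belongs to.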


Interesting video, thanks for posting it.

If I were to guess at a better attack vector, it would be infrasound. Solid surface penetration, not noticeable to humans, no alignment problems, and systems without a low frequency filter will, at least potentially, treat it like any other voice input. You could even mask any audible side-band generations by putting it in a thumper car driving past.
posted by cowcowgrasstree at 9:54 AM on January 16


We can go about the physics somewhat more simply. They say on the project page they were able to successfully attack at wattage W = 0.5 mW. The pressure P of that beam on a square patch 1 mm x 1 mm is

P = (W/c)/(1 mm * 1 mm) = 1.67 micropascals, (1)

where c is the speed of light. The threshold of human hearing is 20 micropascals. It’s possible that the devices are more sensitive than we are, but I find it unlikely given my own experience with smart devices. Moreover, their images of the focused spots seem to be >> 1 mm square (look at the 110 m example with 5 mW), so they were able to get results with even less beam momentum. Maybe there’s some resonant effect they can exploit to juice the amplitude but seems unlikely in a commercial device. Photoacoustic seems more likely.

(Also if you read the paper they explicitly say it’s photoacoustic)

Bonus fact: you can exploit photoacoustics to make very interesting images of biological samples!
posted by Maecenas at 10:32 AM on January 16 [3 favorites]
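Equation (1) is easy to check numerically. The Python sketch below assumes a fully absorbed beam spread evenly over the 1 mm x 1 mm patch and expresses the result in dB relative to the 20 micropascal threshold; the function names are made up for this sketch.

```python
import math

C = 299_792_458.0   # speed of light in m/s
P_REF = 20e-6       # threshold of hearing used above, in pascals

def radiation_pressure(power_w, area_m2):
    """Pressure of a fully absorbed beam: (W / c) / area."""
    return power_w / C / area_m2

def pascals_to_spl(pressure_pa):
    """Express a pressure in dB relative to 20 micropascals."""
    return 20 * math.log10(pressure_pa / P_REF)

p = radiation_pressure(0.5e-3, 1e-3 * 1e-3)  # 0.5 mW on 1 mm x 1 mm
print(f"radiation pressure: {p * 1e6:.2f} micropascals")
print(f"relative to threshold: {pascals_to_spl(p):.1f} dB")
```

A negative dB figure means the light pressure sits below the threshold of hearing, which is the comparison the comment is making.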


god I love physics arguments.
posted by biogeo at 10:22 AM on January 16 [2 favorites]


It's all fun and games until someone invents the nuclear bomb.
posted by symbioid at 10:49 AM on January 16 [2 favorites]


Because I love confusing issues further, I have to point out that when you laser etch a material, you may be releasing internal stresses within the material, which would amplify the amount of energy released significantly.

The other possibility is that you're laser etching a material that can scream because it feels pain, in which case you should probably stop. 🎶some things aren't legal to laaassseeeerrr🎶
posted by phooky at 11:00 AM on January 16 [5 favorites]


OTOH, if someone started selling a cheap TV-B-Gone-like device to fuck with home assistants, then the companies would start to worry.

You wouldn't happen to know a renegade electrical engineer that could make such a device, would you?

Asking for a friend.

Siri? Divide by zero!
posted by loquacious at 11:04 AM on January 16 [4 favorites]


I think Sean is right in that it might be possible to focus enough photons onto a mems transducer that the momentum transfer matches that of an incident sound wave & a back of the envelope calculation working out the momentum transferred to the mems transducer by a 20dB sound wave & using that to work out the energy of the incident photons (using very hand-wavy physics) does seem to be within at least the right order of magnitude ball-park for a mW scale laser - you need about a mW of light incident to the transducer if my numbers are right.

But the thermal effects of dumping a mW into a microgram of silicon are going to dominate I would have thought.
posted by pharm at 11:24 AM on January 16 [1 favorite]
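A rough Python sketch of the thermal side of that estimate. It assumes all of the light is absorbed, no heat conducts away, and a specific heat of about 700 J/(kg·K) for silicon, which is an approximate room-temperature value; the 1 kHz modulation used for the half-cycle figure is an arbitrary illustration.

```python
SI_SPECIFIC_HEAT = 700.0  # J/(kg K), approximate room-temperature value for silicon

def heating_rate(power_w, mass_kg, specific_heat=SI_SPECIFIC_HEAT):
    """Temperature rise per second with full absorption and no heat loss."""
    return power_w / (mass_kg * specific_heat)

def swing_per_half_cycle(power_w, mass_kg, modulation_hz):
    """Temperature change while the beam is on for half of one modulation cycle."""
    return heating_rate(power_w, mass_kg) / (2 * modulation_hz)

rate = heating_rate(1e-3, 1e-9)                 # 1 mW into 1 microgram
swing = swing_per_half_cycle(1e-3, 1e-9, 1000)  # assumed 1 kHz modulation
print(f"heating rate: {rate:.0f} K/s")
print(f"swing per half cycle at 1 kHz: {swing:.2f} K")
```

Real parts shed heat into the substrate, so these are upper bounds, but they show why a milliwatt is thermally significant at this scale.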


This could be defeated by putting sunglasses over the mic, right?
posted by grumpybear69 at 11:35 AM on January 16 [1 favorite]


"Qualified" in what exactly?

Qualified to work on home assistants = not good enough to work on the mobile devices team, ha ha.

Seriously though, nobody's looking at the certifications and training for the engineer designing Alexa's microphone circuit. The way this industry works, it's the vendor's job to mention weird behavior like this in the datasheet or in some app note, if they are even aware of it.

Besides, doesn't everyone already know voice identification is for convenience, not security?
posted by ryanrs at 11:38 AM on January 16


Secure Home Internet Things
posted by gurple at 12:03 PM on January 16 [1 favorite]


Some fields use ‘qualified’ as a synonym for ‘certified’ and some as a synonym for ‘capable’.
posted by clew at 12:54 PM on January 16 [1 favorite]


(To answer the question posed in the post's title, the customary answer from infosec people is "Shit.")

I thought the answer was "Security", that being there is no S in "IoT".
posted by solarion at 2:05 PM on January 16 [3 favorites]


Seems entirely likely to me that the effect is going to be straight-up photoelectric, and that no analysis of the beam's effect on the microphone diaphragm is required.

MEMS microphones are ridiculously small, so the analog signals that physical movement of their diaphragms with respect to their silicon substrates generate are just crazy small. The initial signal processing for that analog signal has to be on-chip because there's just no way to achieve an acceptable signal to noise ratio otherwise, and I can't see how any on-chip analog-to-digital converter sensitive enough to respond to signals as tiny as that could possibly be capable of separating charge movement due to diaphragm vibration from charge movement generated directly inside its own input transistors by incident photons.

This attack works the same way on the micro scale as it does on the macro scale, in that it breaks an assumption about the local nature of signal sources. The laser beam is just a way to swamp the microphone diaphragm signal with an unexpected noise source; noise that just happens to be modulated with something indistinguishable from legitimate input.

And yes, the "S" in "IoT" does stand for "Security".
posted by flabdablet at 2:30 PM on January 16 [1 favorite]


Seems entirely likely to me that the effect is going to be straight-up photoelectric, and that no analysis of the beam's effect on the microphone diaphragm is required.

That is what I thought as well. All silicon devices are sensitive to the photoelectric effect. All you need is a P-N junction which is present in every transistor on a chip. That is why they package silicon chips in black plastic, to keep out light which could otherwise generate errors.

But if you read the paper, the authors tested whether the effect was electrical or mechanical. They applied a little dot of transparent glue to the back side of the diaphragm as a weight. This mechanical modification shouldn't have changed the response if it was photoelectric, but in fact it reduced the response to light to 10% of previously.

The most likely photoacoustic effect would be thermal, not photon momentum. The light energy causes thermal expansion and contraction of the diaphragm. The fact that the measured frequency response drops off very steeply beyond 100 Hz would suggest that the thermal inertia of the diaphragm could be a factor.

I'm not entirely convinced, but the authors did try to distinguish between electrical and mechanical effects.
posted by JackFlash at 3:28 PM on January 16


Yeah, I wasn't super convinced by their effect-cause methodology. Seems to me that both removal of the can and addition of a dot of transparent though unavoidably refractive glue could easily affect the amount of incident laser light that actually makes its way from the listening hole to the input transistors.

A less ambiguous test would involve selective exposure of assorted bits of a MEMS microphone's innards to very carefully focused and masked beams to see which parts were most strongly responsive.
posted by flabdablet at 3:59 PM on January 16



I've also worked on hardware projects for the casino gaming industry, and boy howdy do they pay attention to security.


This is a good story, that hospitals are gambling with their data, more than casinos do.
posted by eustatic at 10:10 PM on January 16 [2 favorites]




This thread has been archived and is closed to new comments