Doll Hacking: The Good, The Bad(words) and the Ugly (features)

By Tim Medin | November 18, 2015

The age of internet-connected toys is upon us. Increasingly, we are seeing children’s toys connected to the internet, commonly through an app. I recently purchased a My Friend Cayla doll for, uh…testing. I wanted to test the security of the device to see how safe it is for children.

In short, the toy does a good job of protecting children from inappropriate content, but any device (phone, tablet, laptop) can connect to the toy and play or record audio. That last bit scares me. The only protection against recording and arbitrary sound output is that only one device can be connected at a time. An opportunistic bad guy would only need to wait for the tablet or phone to go out of range or run out of battery.

Initial Testing

I first needed to get a basic understanding of what Cayla can do and how she works. So I turned her on, connected her to my iPad via Bluetooth, and played with the app.

She began speaking and asking questions about me and my day. She is quite friendly. I then clicked on “Go to Story Time” and was prompted to pick a story. I selected the “Botanical Garden” story and she coldly told me a story about her trip to the garden with her mom. I closed the story and went back to the main menu.

She then asked if I had any questions, so I asked, “What is a chicken?” She gave a quite detailed answer:
“The chicken (Gallus gallus domesticus) is a domesticated fowl, a subspecies of the red junglefowl. As one of the most common and widespread …”

I then asked “What is SQL Injection?” and to my surprise I got a quite thorough response. How is she doing this?

Q&A Time

I assumed that Cayla did not know everything about chickens and SQL Injection and must be querying some online service. To test, I connected my iPad to my WiFi and set the proxy settings to forward the traffic to an interception proxy (Burp Suite) on my laptop so I could look at the communication. I found this:

She asks Wikipedia for information regarding my questions. She does so over a secure connection so that a bad guy cannot manipulate the request or response (GOOD!). However, I feared that the toy could access some very inappropriate articles. I asked her, “What is poop?” She told me that was an inappropriate question and didn’t answer it. This is good! I asked lots of these types of questions and none of them made it through the filters. (Swearing at a doll is so weird.)

I wondered what would happen if someone updated the text in a legitimate posting and added some inappropriate content. Instead of messing with a legitimate posting, I used Clutch 2, class-dump, and Cycript to disable some of the filtering. I found a few methods related to the bad words with “grep badword *” and replaced the bad word checks so that bad words would not be filtered when asking Wikipedia.

DatabaseHelper->isa.messages['getBadword:'] = function() { return null;}
DatabaseHelper->isa.messages['findBadwordResult:'] = function() { return null;}

I then asked, “What is poop?” The response was not read back to me. Good!

I then asked, “What is shit?” and Wikipedia’s quite inappropriate response was read back to me. Remember, the initial filtering has been disabled because I want to check whether the response is filtered, and it appears that it is not. (The query to Wikipedia was for “sh*t”; the toy somewhat censored the query.) To be fair, the likelihood of an inappropriate change happening in a safe posting and a child asking a related question before someone fixes the article is low.
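
The gap described here is that the question is checked but the response is not. As a rough illustration only, here is a minimal Python sketch of checking both sides. The word list, the function names, and the lookup callback are all hypothetical; none of this is taken from the app, whose real filter logic is not shown in this post.

```python
import re
from typing import Callable, Optional

# Hypothetical block list for illustration; the app's real list is not known here.
BADWORDS = {"poop", "fart"}

def contains_badword(text: str) -> bool:
    """Return True if any whole word in the text is on the block list."""
    words = re.findall(r"[a-z']+", text.lower())
    return any(word in BADWORDS for word in words)

def answer(question: str, lookup: Callable[[str], str]) -> Optional[str]:
    """Filter the question, look it up, then filter the response as well."""
    if contains_badword(question):
        return None  # refuse before any request is made
    response = lookup(question)
    if contains_badword(response):
        return None  # second check: catches bad content inside a "safe" article
    return response
```

With a response-side check like this, a vandalized article on an innocent topic would be dropped instead of read aloud, at the cost of occasionally refusing a legitimate answer.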

The Good

The toy does a good job of censoring bad words. The manufacturer did a good job with this feature. Also, the toy does not collect information from the user (obviously, it doesn’t share this info with anyone either).

The Bad(words): More Cycript & Arbitrary Speech

Clearly she has some method for text to speech. For fun I wanted to figure out how to make her say arbitrary text. I decided to look at the log to see if anything showed up that might help me accomplish this task.

Nov 18 11:47:47 Red Cayla EN-US[2635] <Warning>: TTSIvona:tts:<speak><p><s><prosody volume="x-loud" pitch="+0%">this is a story about mermaids living under the sea<break time="0.5s" /> mermaids are half person and half fish<break time="0.5s" /> they can talk to fish and breathe underwater<break time="0.5s" /> i love mermaids!</prosody></s></speak>
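
The markup in that log entry uses SSML-style elements (speak, prosody, break). When reading through a lot of these entries, it can help to strip the markup and keep only the words being spoken. The following is a small Python sketch of that, assuming every line of interest contains the "TTSIvona:tts:" marker seen above. The function name is made up for this example, and it uses a regular expression rather than an XML parser because the logged markup is not guaranteed to be well formed.

```python
import re

MARKER = "TTSIvona:tts:"

def spoken_text(log_line: str) -> str:
    """Return the plain spoken text from a TTSIvona log line, or '' if absent."""
    _, found, markup = log_line.partition(MARKER)
    if not found:
        return ""
    # Replace each tag with a space so words around <break/> stay separated.
    text = re.sub(r"<[^>]+>", " ", markup)
    # Collapse the runs of whitespace left behind by the removed tags.
    return re.sub(r"\s+", " ", text).strip()
```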

The method being called is TTSIvona:tts. It took me a second to realize that “tts” likely stands for “text to speech”. I searched through my headers (generated with class-dump) and found a few references to “tts”.

EventHandler.h:88:- (void)handleTTS:(id)arg1;
RootViewController.h:74:- (void)tts:(id)arg1;
TTSIvona.h:37:- (void)tts:(id)arg1 pitch:(float)arg2 rate:(float)arg3 tempo:(float)arg4;
TTSIvona.h:38:- (void)tts:(id)arg1;

I accessed an instance of the RootViewController via UIApp.keyWindow.rootViewController and passed it a string to try to get Cayla to speak.

cy# [UIApp.keyWindow.rootViewController tts: @'this is a test']
-[__NSCFString stringValueForKey:]: unrecognized selector sent to instance 0x12af9b350

No luck.

I needed to figure out what to send to the method to have her say what I want. I opened up the executable (extracted with Clutch 2) in Hopper and IDA Pro and searched for the tts method in the RootViewController.

At a high level we see stringValueForKey and text. The stringValueForKey method is used to find a value in a dictionary (hash table). So I then sent the following command via cycript.

cy# [UIApp.keyWindow.rootViewController tts: @{text:'this is a test'}]

And she speaks!

Now we can control her and have her say anything we like. Since I have a jailbroken device, I could have rewritten one of her stories (/private/var/mobile/Containers/Bundle/Application/2BBF3208-87E7-4A1C-8E80-AAAFBA1420A0/Cayla) to have her speak arbitrary text, but I would have to restart the app to get her to reread the changes.

I noticed in the log output (shown earlier) that there is markup for her speech and there is another method ttsEx. After looking through the function in Hopper I noticed a few parameters: pitch, rate, tempo. We can change the way she speaks to make her sound like Jigsaw.

[UIApp.keyWindow.rootViewController ttsEx:@{text:'Greetings. And welcome. I want to play a game.',pitch:-1.0, rate:-0.1}]

There is also a giggle that shows up sometimes in the log:

Nov 18 13:38:21 Red Cayla EN-US[2635] <Warning>: TTSIvona:tts:<speak><p><s><prosody volume="x-loud" pitch="+0%">i love to laugh out loud <break time="0.2s" /><mark name='P1'/><audio src="file:///var/mobile/Containers/Bundle/Application/2BBF3208-87E7-4A1C-8E80-AAAFBA1420A0/Cayla"/><mark name='P2'/> my sister likes to make funny faces at me and it always makes me laugh out loud<break time="0.5s" />i can't help it<break time="0.5s" />she's so silly sometimes.</prosody></s></speak>

I tried to add a reference directly to the file, but no luck. I then searched for “giggle” in Hopper and found “(giggle)”…and this:

Laser! Machine gun! Fart!

But sadly, none of these produced any sound. My guess is this software is the same as another toy or the developers had a lot of fun during testing.

The Ugly

The above fun can only be accomplished under very rare circumstances. There is a simpler, and possibly much more nefarious, attack.

The scary thing is that this toy can play ANY sound. Anyone can connect to Cayla and play sounds. The only “protection” is that only one device can be connected at a time. First, open up the Bluetooth settings and connect to “My friend Cayla”.

Then go to my sound preferences and change the output to “My friend Cayla”.

Then load

The Uglier

The really scary thing is that you can record from the device. When recording is on, her necklace lights up, but who is going to look at that?

Any, and I mean ANY, system with Bluetooth (tablet, phone…or laptop) can connect to this device and use it as a speaker or as a remote mic. The toy is essentially a cute Bluetooth headset. Anyone within range can use this toy to listen to and communicate with a kiddo. Again, the only protection here is that only one device can be connected at a time. This is not a safe mechanism to protect someone from communicating with my child. Fortunately, I live in Texas where we have a decent amount of space between houses, but in an apartment complex many people could be in range of this device and use it for nefarious purposes.

This toy can be used to listen to, and communicate with a child with no authentication required. One way to fix the remote listening and arbitrary audio output would be to require some kind of interaction with the toy to enable pairing, but sadly the toy does not do this.
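
To make the suggested fix concrete, here is a minimal Python sketch of "require interaction to enable pairing": unknown devices are accepted only for a short window after someone physically presses a button on the toy. The class name, the 30-second window, and the device identifiers are all assumptions made for illustration. This is not how the toy works, and it ignores how a real Bluetooth stack would authenticate a returning device.

```python
import time

class PairingGate:
    """Accept unknown devices only shortly after a physical button press."""

    def __init__(self, window_seconds: float = 30.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so the logic can be tested
        self._opened_at = None      # None means pairing was never enabled

    def button_pressed(self) -> None:
        """Called when someone presses the (hypothetical) pairing button."""
        self._opened_at = self.clock()

    def allow_connection(self, device_id: str, known_devices: set) -> bool:
        """Return True if this device may connect right now."""
        if device_id in known_devices:
            return True             # previously paired device
        if self._opened_at is None:
            return False            # pairing was never enabled
        if self.clock() - self._opened_at <= self.window_seconds:
            known_devices.add(device_id)  # remember the newly paired device
            return True
        return False                # window has expired
```

The point of the design is that a stranger in range can no longer simply wait for the parent's tablet to disconnect; they would also need physical access to the doll.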
