Factors in securing app and cloud data

Published March 10th, 2017 (updated over a week ago)

Aidan Fitzpatrick

With increasing amounts of valuable data stored in the cloud, it is all the more important that this data is robustly protected. Strong security measures around cloud data services such as iCloud benefit everyone: end-users, cloud service providers like Apple, and ecosystem platforms such as Reincubate.

Mechanisms to protect this data take a number of forms, and Apple’s track record of implementing them has been good. This article examines a number of the techniques that they — and the team at Reincubate — have been using.

Store less — if possible

Firstly, storing less rather than more pays off. Some data is never stored in iCloud at all: facial recognition data from Photos, for example, is not stored in the iCloud Photo Library but cached locally. Data artefacts from Apple’s HealthKit, HomeKit and Touch ID are treated similarly, as is Android's lock-screen facial recognition data. Of course, this is often a trade-off against convenience and the value of synchronising that data between devices. Some systems use a partial approach. For instance, iMessage stores a limited amount of content in the cloud, but relies on other data being directly synchronised between devices.

Secure messaging systems such as Telegram or WhatsApp take a completely different approach. Because no data (or at least, not much) is stored on their servers, a rich set of data must be included in any backup of a device for a user to be able to back up and restore meaningfully. In a way, to maintain security, those app vendors are washing their hands of managing that data and leaving it to the end-user, who may choose to back up their devices locally or to the cloud.

Thus there are two extremes: one can contrast WhatsApp, with little in the cloud and everything in the backup, with Facebook Messenger, which stores nothing in a device backup because its data is readily available in the cloud, and must be in order for Facebook to provide its web interface.

There’s no right solution here, but it bears thinking about: use of an app which hosts this data requires that the app vendor manages it securely, often with users kept in the dark about what is stored and how. Use of an app that doesn’t store data centrally means the end-user must take responsibility for the security of their own backup data, unless they leave it to be stored on their platform’s cloud, as is the case with iCloud Backups and Google’s Android 7.0+ device backups.

Interestingly, as in the chart below, some app makers occupy both extremes of this position. The Snapchat app stores almost all of its data in the cloud, whereas the app’s physical extension, Spectacles, store all of their data locally on the device.

Different approaches to app data security

Whilst not broadly rolled out, Snapchat Spectacles are quite insecure: by holding the “snap” button for seven seconds, a hacker can pair them to a new phone and obtain any of the unencrypted snaps which were stored on them by the rightful owner. This stands in marked contrast to the security principles adopted for the Apple Watch.

Strong encryption

That takes us to the second critical technique: strong encryption, ideally with incomplete storage of the keys. If data is deemed too sensitive to be backed up in a way that can be accessed in the event of a device’s destruction or replacement, there are other techniques that can be used. By requiring use of a specific device, Facebook’s “Secret Conversations” feature makes use of hardware identifiers to encrypt content. If the device is lost, the content becomes practically irretrievable.

Some systems employ a similar technique, in which a device-specific token is created to ensure only a limited number of devices can access content at a time. Snapchat and WhatsApp each use a variation of this. When a user logs in to Snapchat on a device, a token is created and shared between Snapchat’s servers and the device. If the same user logs in on a different device, a new token is created, replacing the one previously held in the cloud and forcing the first device to log out, as it would be unable to decrypt any inbound messages. There’s an obvious vulnerability to this technique when used on insecure systems, which is that the token can be copied and used elsewhere. If an argument were needed as to why using a rooted Android or jailbroken iOS device is a terrible idea, this is it, and it also explains why there’s unlikely to be a web client for Snapchat any time soon.
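The token replacement described above can be sketched in a few lines of Python. This is an illustrative model only, not Snapchat's or WhatsApp's actual protocol: the `TokenServer` and `Device` classes are hypothetical, and an HMAC tag stands in for message decryption because Python's standard library has no cipher.

```python
import hashlib
import hmac
import secrets


class TokenServer:
    """Hypothetical server that keeps exactly one active token per user."""

    def __init__(self):
        self._tokens = {}  # username -> current shared token (bytes)

    def login(self, username):
        # A fresh token overwrites whatever an earlier device was given.
        token = secrets.token_bytes(32)
        self._tokens[username] = token
        return token

    def seal(self, username, message):
        # Tag an outbound message with the user's *current* token.
        tag = hmac.new(self._tokens[username], message, hashlib.sha256).digest()
        return message, tag


class Device:
    """Hypothetical client holding the token it received at login."""

    def __init__(self, token):
        self.token = token

    def accept(self, message, tag):
        # Only a device holding the current token can validate the message.
        expected = hmac.new(self.token, message, hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)
```

In this model, a phone that logged in first stops accepting messages as soon as a tablet logs in and the token is replaced. It also shows the weakness: any process able to copy the tablet's token (for instance on a rooted device) can build its own `Device` and accept the same messages.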

Whilst it’s one thing to potentially lose secret Facebook Messages, some data is both highly sensitive and highly valuable to its owner. It seems counter-intuitive, but if the personal value of the data is great enough, it is often stored in such a way that there are many potential ways to decrypt or recover it. This trades away some security in exchange for practicality. For example, there’s a mass of data associated with iCloud and Google accounts, including (in both cases) some form of distributed cloud keychain, storing all of a user’s passwords. Not only that, but Google’s promotion of their accounts as an authentication mechanism on other sites with OAuth means that irrecoverably losing access to accounts like these would be a bitter pill to swallow.

Given these accounts are deemed too valuable to readily risk their permanent loss, a mid-way solution is used where data is stored on the cloud, but encrypted with a token or password that only the user keeps, and which isn’t stored. This is why, for instance, when logging into a fresh install of Google's Chrome browser, the cloud-stored passwords aren’t available until a secondary synchronisation password has been entered into the browser.
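One common way to build this kind of scheme is to derive the encryption key on the device from a passphrase that never leaves it, so the cloud only ever holds a random salt and the ciphertext. The sketch below shows just the key-derivation step using PBKDF2 from Python's standard library. It is an assumption-laden illustration, not a description of how Chrome or any particular provider does it: the `derive_key` helper, the iteration count and the example passphrase are all hypothetical.

```python
import hashlib
import os

ITERATIONS = 200_000  # assumed work factor, chosen for illustration only


def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from a user-held passphrase (hypothetical helper)."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, ITERATIONS
    )


# The salt is not secret and could be stored in the cloud next to the
# ciphertext; the passphrase and the derived key stay on the device.
salt = os.urandom(16)
key = derive_key("correct horse battery staple", salt)
```

A second device that knows the passphrase and fetches the same salt derives the same key. Without the passphrase, the stored salt and ciphertext are not enough to recover it.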

Sharing the encryption burden

Apple’s Tim Cook — who is both serious and fundamentally right on security — attested to their belief in this technique only last year:

For many years, we have used encryption to protect our customers’ personal data because we believe it’s the only way to keep their information safe. We have even put that data out of our own reach, because we believe the contents of your iPhone are none of our business.

— Tim Cook in "A Message to Our Customers"

Aside from anything else, this can help cloud providers avoid disclosing full user data to government actors. If they are compelled to provide it by a warrant, they can do so without directly exposing a user’s data, as the data requires further decryption which can only be done with a key that only the end-user has.

Sometimes, as is the case with multi-factor authentication (MFA), two-factor authentication (2FA) or two-step verification (2SV), this means the additional key can only be sent to or generated by a particular device. Banks and VPNs often use physical card readers or secure seed generators, which are forms of time-based one-time password generators (OTP or TOTP). Both Google and Apple offer simple SMS-based MFA as well as full 2FA systems, with Google’s using HOTP/OATH and Apple’s being proprietary.
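For readers unfamiliar with how one-time passwords are generated, here is a minimal sketch of the published HOTP (RFC 4226) and TOTP (RFC 6238) algorithms using only Python's standard library. It follows the public specifications and makes no claim about any vendor's implementation; the function names and default parameters are choices made for this example.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Counter-based one-time password, following RFC 4226."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of the last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now=None, step: int = 30, digits: int = 6) -> str:
    """Time-based variant (RFC 6238): the counter is the current time step."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step), digits)
```

Both sides share the secret once, at enrolment. After that, the device and the server compute the same short code independently, so nothing secret needs to be transmitted at login time.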

When security or law enforcement agencies (LEA) have this encrypted data but no end-user key to decrypt it, they can attempt to brute-force it, using powerful machines to test the data against automatically guessed passwords. Apple’s iOS has a massive advantage over Google’s Android here: because Apple tightly control the hardware in their devices, they are able to design encryption techniques that only their hardware can perform rapidly, and that other devices like desktop computers perform slowly. When Reincubate became the first company in the world to support decryption of data from iOS 10.2, the significance wasn’t just that the company was first to understand how iOS 10.2 worked: it was that we were first to scale rapid decryption of Apple’s clever technique. By making each decryption attempt take a number of seconds on conventional hardware, Apple made it impractical for hackers or LEA to brute-force.
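The effect of a slow decryption attempt is easy to quantify with some back-of-the-envelope arithmetic. The sketch below is purely illustrative: the function names, the alphabet sizes and the one-second cost per guess are assumptions for the example, not measurements of Apple's scheme.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def seconds_to_exhaust(alphabet_size: int, length: int,
                       seconds_per_guess: float) -> float:
    """Worst-case time to try every password of a given length."""
    return (alphabet_size ** length) * seconds_per_guess


def years_to_exhaust(alphabet_size: int, length: int,
                     seconds_per_guess: float) -> float:
    """The same figure expressed in years, for readability."""
    return seconds_to_exhaust(alphabet_size, length, seconds_per_guess) / SECONDS_PER_YEAR


# Hypothetical comparison: a 4-digit PIN versus an 8-character password
# drawn from lowercase letters and digits, at an assumed 1 second per guess.
pin_seconds = seconds_to_exhaust(10, 4, 1.0)
password_years = years_to_exhaust(36, 8, 1.0)
```

Under these assumptions the 4-digit PIN falls in under three hours on a single machine, whereas the 8-character password would take tens of thousands of years. Increasing the per-guess cost scales the attacker's total time linearly, which is why a slow derivation matters most when combined with a reasonably large password space.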

Security through obscurity

It wouldn’t do to review these techniques without covering one that Apple are particularly good at: security through obscurity. By not releasing public APIs, detailed documentation, or even communicating much about these systems, the company are able to avoid alerting hackers to potential attack vectors. For a long time, the argument against obscurity was that disclosure and Open Source led to better security outcomes, but that was before 2014’s discovery that OpenSSL — a critical Open Source security library that virtually everything uses — was and has almost always been fundamentally insecure.

Please get in touch if you'd like to explore ways of working with app or cloud data, or if you're interested in better understanding the dynamics of the different approaches.