In the investigations of Paul Manafort and Michael Cohen, the FBI has retrieved messages from Signal, Telegram and WhatsApp. While there are weaknesses inherent in all of these apps, the question remains: What does a good data protection scheme look like?
A few days ago, the FBI revealed that Michael Cohen’s messages sent with Signal and WhatsApp are now available as evidence in the ongoing investigation into his various dealings. While thousands of emails and documents have already been recovered from Cohen’s devices, home, hotel room, and office, the recovery of data from messaging apps that promise end-to-end encryption is surprising. One would presume that end-to-end message encryption should ensure that those messages are unrecoverable without assistance from Mr. Cohen. Clearly, that is not the case.
While the specific method of accessing these messages is not public information, techniques for gathering encrypted data have been revealed in other cases. For example, Paul Manafort’s recent issues with witness tampering reveal a much more obvious source of information leakage – some of the recipients of his messages shared them with law enforcement. Perhaps he should have sent them with Snapchat, although the security guarantees of Snapchat are more suspect than those of his platforms of choice (Telegram and WhatsApp). Manafort’s information leakage reveals the value of data expiration – forever is a long time, and our data is increasingly persistent as the cost of storage plummets and its availability grows.
In other cases, the flow of information through the complex web of technologies supporting our devices has taken on more significance. For example, it appears that some of Mr. Manafort’s WhatsApp messages were recovered from an iCloud backup. WhatsApp has an option to allow backups, so Mr. Manafort could have disabled backups, but did not. Backups are a form of information leak in that they open a new frontier for compromising encrypted data. It appears that investigators simply issued Apple a subpoena for his iCloud backup. Users are not always aware of how information flows, particularly as device providers like Apple and Google increasingly surround devices with an ecosystem of cloud-based technology that operates seamlessly from a single account.
It is surprising, though, that an iCloud backup on its own was enough to unlock Mr. Manafort’s WhatsApp messages. This highlights the known fact that WhatsApp’s iCloud backup encryption does not require a unique user credential to unlock it. I would consider this encryption weakness to be a substantial leak in an “end-to-end encrypted” messaging system. If WhatsApp applied encryption derived from a user credential to its backups, an additional, personal credential from Mr. Manafort would have to be either guessed or discovered (people often write down their passwords …) to unlock his messages.
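To make the idea of credential-derived encryption concrete, here is a minimal sketch in Python using only the standard library. It is not a description of how WhatsApp or any other app works. The function name `derive_backup_key`, the sample passphrases, and the iteration count are all my own assumptions for illustration. The sketch covers only the key derivation step; the resulting key would then be handed to an authenticated cipher from a cryptography library, which is omitted here.

```python
import hashlib
import hmac
import os


def derive_backup_key(passphrase: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte key from a user passphrase with PBKDF2-HMAC-SHA256.

    The function name and the iteration count are illustrative assumptions.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


# The salt is random and may be stored next to the backup; it is not a secret.
salt = os.urandom(16)
key = derive_backup_key("correct horse battery staple", salt)

# The same passphrase and salt reproduce the key; a wrong guess does not.
same = derive_backup_key("correct horse battery staple", salt)
wrong = derive_backup_key("password123", salt)

print(hmac.compare_digest(key, same))   # True
print(hmac.compare_digest(key, wrong))  # False
```

Because the key exists only when the user supplies the passphrase, a copy of the backup obtained from a cloud provider would not be enough on its own to read it.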
A quick look at the website for Signal (used by Mr. Cohen) reveals that, on iOS for example, messages are encrypted using the built-in capabilities of iOS. This includes the passcode-based encryption on an iOS device as a first measure of file encryption, and a second layer of encryption on top of that for the message database using an encryption key stored in the iOS keychain. Jailbreaking a device is a sure way to steal both the key and the underlying encrypted information. A simple Google search for “jailbreak iOS 11.3.1” reveals that even very recent versions of iOS are candidates for a jailbreak, although mileage with jailbreaking tools varies with the device in question. However, the underlying point remains salient – iOS is a frequent target for hackers, and jailbreaking tools are well developed and sophisticated. The last thing you need is a large, motivated community providing encryption-busting hacks for completely unrelated reasons (i.e., most people are not jailbreaking iOS to compromise Signal data).
The most important lesson to glean from Mr. Cohen’s and Mr. Manafort’s fate, however, is social, not technical. Humans are accustomed to protecting sensitive property with care. We generally know not to leave expensive items sitting in the back seat of our car; we know to put our money in the bank, not under the mattress; and we have the sense to store sensitive documents in a locked file cabinet. However, we also appreciate the convenience of keeping our dishes, towels, clothes, and other everyday items outside of our locked cabinetry. We know well the difference between objects that merit special protection, and those that don’t.
However, back in the realm of data, we put our trust in technologies that promise data protection without any clear understanding of how those technologies work, where data flows, and the circumstances under which that data can be unlocked. Photos of our children, mundane chat conversations with friends and family, and the like sit in the same encrypted data blob as business documents, confidential messages, and financial data (credit card numbers used by Apple Pay, for example). However, these different types of data have drastically different requirements for data security, data lifetime and data retention. We want our photos to stick around forever in iCloud, but perhaps we should prefer that our encrypted messages do not. From a logical perspective, treating all of this data the same way makes no sense – it is the equivalent of leaving valuables in your underwear drawer, and most of us have long since learned that is a horrible idea even if we have locks on our house doors and our windows.
Protecting sensitive data must be a methodical process designed by a professional, and then imposed seamlessly on users. Asking users to participate as partners in keeping data safe is a losing endeavor, as in my experience users will almost always choose convenience over security (see Mr. Manafort and his iCloud backups).
Designing a data protection scheme requires that:
- Sources of sensitive data are clearly identified, and there is an easy (or, even better, singular) path from the data source to a safe data repository.
- Once data is in a safe place, it has to stay there. There cannot be any unintended leaks of that data (e.g., to iCloud).
- Data must move safely. No matter where that data is (on the network, on a USB stick, etc.), it must always be protected, and both ends of any communication must be authenticated.
- The only way to unlock data is a strong method of authentication (e.g., a complex password along with an authenticator app or a certificate that the user must have).
- Data should persist only as long as it needs to, either for legal reasons or to fulfill the intended usage of that data.
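The last requirement, limited data lifetime, is the easiest one to illustrate. The following is a minimal Python sketch under my own assumptions: the `Record` type, the `purge` function, and the retention periods are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Record:
    """A hypothetical stored item with its own retention period."""
    name: str
    created: datetime
    retention: timedelta  # how long this kind of data is allowed to live

    def expired(self, now: datetime) -> bool:
        return now >= self.created + self.retention


def purge(records: list[Record], now: datetime) -> list[Record]:
    """Return only the records that are still inside their retention window."""
    return [r for r in records if not r.expired(now)]


now = datetime(2018, 6, 19, tzinfo=timezone.utc)
records = [
    # Photos are meant to stick around; the 100-year figure is illustrative.
    Record("family photo", now - timedelta(days=900), timedelta(days=36500)),
    # Confidential messages get a short, assumed 30-day lifetime.
    Record("confidential message", now - timedelta(days=45), timedelta(days=30)),
]

kept = purge(records, now)
print([r.name for r in kept])  # ['family photo']
```

The point of the sketch is that retention is a property of each type of data, decided up front, rather than something left for the user to remember.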
Unfortunately, implementing the five requirements above with a traditional “data locker” (i.e., the safe place you are supposed to keep all of your data) is impossible in today’s world. It is simply too easy to move data to unintended places, and there is absolutely no way to close all of the leaks. Hence, the definition of a “safe data repository” has to change – this is no longer a physical, or even a virtual, location that is designated as safe. Instead, it is an encryption barrier – a separation of data defined only by the fact that sensitive data is encrypted with a different set of keys and a different mechanism to lock/unlock that data than non-sensitive data. Attempting to control the location of that data is futile, so instead focus on strong measures to control access to that data.
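One way to picture an encryption barrier is as a key ring that hands out a different key, through a different unlock mechanism, depending on how the data is classified. The Python sketch below is only an illustration under my own assumptions: `Sensitivity`, `KeyRing`, `key_for`, and the iteration count are hypothetical names and values, and the actual encryption step (which would use these keys with an authenticated cipher) is left out.

```python
import hashlib
import os
from enum import Enum
from typing import Optional


class Sensitivity(Enum):
    ORDINARY = "ordinary"
    SENSITIVE = "sensitive"


class KeyRing:
    """Hypothetical key ring: ordinary data uses a device key, while
    sensitive data uses a key that exists only after the user authenticates."""

    def __init__(self, device_key: bytes, salt: bytes) -> None:
        self._device_key = device_key
        self._salt = salt

    def key_for(self, level: Sensitivity, passphrase: Optional[str] = None) -> bytes:
        if level is Sensitivity.ORDINARY:
            # Convenience path: available whenever the device is unlocked.
            return self._device_key
        if not passphrase:
            raise PermissionError("sensitive data requires the user's passphrase")
        # Separate key, separate unlock mechanism (iteration count is assumed).
        return hashlib.pbkdf2_hmac(
            "sha256", passphrase.encode("utf-8"), self._salt, 200_000, dklen=32
        )


ring = KeyRing(device_key=os.urandom(32), salt=os.urandom(16))
photo_key = ring.key_for(Sensitivity.ORDINARY)
message_key = ring.key_for(Sensitivity.SENSITIVE, passphrase="a long, complex passphrase")
print(photo_key != message_key)  # True
```

In this arrangement it does not matter where the encrypted bytes end up. A copy of the sensitive data that leaks to a backup is still locked behind a key that the device key alone cannot produce.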
Had Mr. Manafort or Mr. Cohen joined an organization with a sophisticated approach to data security, their secure emails and messages would have been neatly separated from everything else on their mobile devices. A complex password and a second factor of authentication would have been required to unlock that data. And, since a sophisticated organization would understand the dangers of data centralization and data retention (e.g., a central key to unlock all data; or a repository of network passwords that IT could unlock …), any sensitive data would be retained only as long as required by law, and thereafter it would be inaccessible to anyone but the data owner. While there are cases where a sophisticated approach to data security may empower criminal activity, in my opinion that negative effect pales in comparison to the cost of IP theft, which runs up to $600 billion annually in lost economic activity. While strengthening trade relationships and legal IP protections is always a good step, being careful with sensitive data and protecting it well is common sense that most organizations continue to ignore.
Seth Hallem is the CEO & Co-founder of Mobile Helix, Inc.
This post was originally published in CSO Online, June 19, 2018