Google is releasing a new set of open-source differential privacy libraries that not only provide the equations and models needed to set boundaries and constraints on identifying data but also include an interface that makes it easier for more developers to implement the protections. The idea is to make it possible for companies to mine and analyze their database information without building invasive identity profiles or tracking individuals. The measures can also help mitigate the fallout of a data breach, because the data that is stored and released is mixed with confounding noise rather than kept as exact per-user records.
“It’s all about data protection and about limiting the consequences of releasing data,” says Bryant Gipson, an engineering manager at Google. “This way, companies can still get insights about data that are valuable and useful to everybody without doing something to harm those users.”
Google already uses differential privacy libraries to protect many different types of information, such as the location data generated by its Google Fi mobile customers. The techniques also crop up in features like the Google Maps indicators that tell users how busy different businesses are throughout the day. Google intentionally built its differential privacy libraries to be flexible and applicable to as many database features and products as possible.
Differential privacy is similar to cryptography in the sense that it is extremely complicated and difficult to do right. As with encryption, experts strongly discourage developers from attempting to “roll their own” differential privacy scheme or design one from scratch. Google hopes that its open-source tool will be easy enough to use that it can be a one-stop shop for developers who might otherwise get themselves into trouble.
“The underlying differential privacy noisemaking code is very, very general,” says Lea Kissner, chief privacy officer of the workplace behavior start-up Humu and Google’s former global lead of privacy technology. Kissner oversaw the differential privacy project until her departure in January. “The interface that’s put on the front of it is also quite general, but it’s specific to the use case of somebody making queries to a database. And that interface matters. If you want people to use it right you need to put an interface on it that is usable by actual human beings who don’t have a Ph.D. in the area.”
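To make the “noisemaking code” and the database-query interface Kissner describes concrete, here is an added, self-contained example of the standard Laplace mechanism applied to a count and a bounded sum over a small constructed table. It is illustration written for this page, not code from Google’s library, and it does not show how that library is built internally; the row data, the ε value, and the per-user bound of 100 are all assumptions chosen for the demo. It also shows the classic roll-your-own mistake the article warns about: calibrating noise to a contribution bound that the query never actually enforces.

```python
import math
import random

# Constructed sample rows (user_id, purchase_amount). Not real data; "u5" is a
# deliberate outlier so the unbounded-contribution failure mode is visible.
ROWS = [("u1", 12.0), ("u2", 40.5), ("u3", 87.0), ("u4", 5.0), ("u5", 9500.0)]


def laplace_noise(scale, rng=random):
    """One sample from Laplace(0, scale) via inverse-CDF sampling.
    `rng` only needs a .random() method, so a fixed stub can make it deterministic."""
    u = rng.random() - 0.5  # uniform on (-0.5, 0.5)
    if u == 0.0:
        return 0.0
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))


def laplace_scale(sensitivity, epsilon):
    """Noise scale b = sensitivity / epsilon that makes a query with this
    L1 sensitivity epsilon-differentially private under Laplace noise."""
    if epsilon <= 0 or sensitivity < 0:
        raise ValueError("epsilon must be > 0 and sensitivity must be >= 0")
    return sensitivity / epsilon


def dp_count(rows, predicate, epsilon, rng=random):
    """Noisy count. Adding or removing one user changes a count by at most 1,
    so the sensitivity is 1 and the scale is 1/epsilon."""
    true_count = sum(1 for r in rows if predicate(r))
    return true_count + laplace_noise(laplace_scale(1.0, epsilon), rng)


def clamp(x, lower, upper):
    return min(max(x, lower), upper)


def clamped_total(rows, lower, upper):
    """Sum of per-row amounts after clamping each to [lower, upper]."""
    return sum(clamp(amount, lower, upper) for _, amount in rows)


def dp_sum(rows, lower, upper, epsilon, rng=random):
    """Bounded noisy sum. Clamping caps any single user's influence, so the
    sensitivity is max(|lower|, |upper|) and the noise can be calibrated to it."""
    sensitivity = max(abs(lower), abs(upper))
    return clamped_total(rows, lower, upper) + laplace_noise(
        laplace_scale(sensitivity, epsilon), rng
    )


def privacy_loss(rows_a, rows_b, lower, upper, epsilon):
    """Worst-case log-likelihood ratio between dp_sum outputs on two datasets.
    For Laplace noise this is |f(a) - f(b)| / scale; epsilon-DP guarantees it
    is <= epsilon for every pair of datasets that differ in one user."""
    scale = laplace_scale(max(abs(lower), abs(upper)), epsilon)
    diff = clamped_total(rows_a, lower, upper) - clamped_total(rows_b, lower, upper)
    return abs(diff) / scale


def naive_privacy_loss(rows_a, rows_b, assumed_bound, epsilon):
    """The roll-your-own mistake: noise is calibrated to `assumed_bound`, but
    contributions are never clamped, so one outlier blows through the budget."""
    scale = laplace_scale(assumed_bound, epsilon)
    raw_a = sum(amount for _, amount in rows_a)
    raw_b = sum(amount for _, amount in rows_b)
    return abs(raw_a - raw_b) / scale


if __name__ == "__main__":
    eps = 0.5                      # assumed privacy budget for the demo
    rng = random.Random(7)         # seeded only so the demo is repeatable
    without_u5 = [r for r in ROWS if r[0] != "u5"]  # neighbouring dataset
    print("noisy count of purchases over 30:", dp_count(ROWS, lambda r: r[1] > 30, eps, rng))
    print("noisy bounded sum (0..100):", dp_sum(ROWS, 0.0, 100.0, eps, rng))
    print("privacy loss with clamping:", privacy_loss(ROWS, without_u5, 0.0, 100.0, eps))
    print("privacy loss without clamping:", naive_privacy_loss(ROWS, without_u5, 100.0, eps))
```

With clamping, removing the outlier user changes the bounded sum by exactly 100, the scale is 100/ε = 200, and the privacy loss is 0.5 = ε as promised. Without clamping the same removal shifts the raw sum by 9500, so the loss is 47.5, roughly e⁴⁷ in likelihood-ratio terms, which is no privacy at all. Bounding per-user contribution before calibrating noise is the kind of detail a general-purpose library enforces through its query interface so that callers cannot skip it.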