May 8, 2013
Cell phones are so many things now–computer, map, clock, calculator, camera, shopping device, concierge, and occasionally, a phone. But more than anything, that little device that never leaves your person is one amazingly prolific data engine.
Which is why last October, Verizon Wireless, the largest U.S. carrier with almost 100 million customers, launched a new division called Precision Market Insights. And why, at about the same time, Madrid-based Telefonica, one of the world’s largest mobile network providers, opened its own new business unit, Telefonica Dynamic Insights.
The point of these ventures is to mine, reconstitute and sell the enormous amount of data that phone companies gather about our behavior. Every time we make a mobile call or send a text message, the phone pings a cell tower, and that info is recorded. So, with enough computing power, a company can draw pretty accurate conclusions about how and when people move through a city or a region. Or it can tell where people have come from to attend an event. As part of a recent case study, for example, Verizon was able to say that people with Baltimore area codes outnumbered those with San Francisco area codes by three to one inside the New Orleans Superdome for the Super Bowl in February.
In a world enamored of geolocation, this is digital gold. It’s one thing to know the demographic blend of a community, but being able to find out how many people pass by a business and where they’re coming from adds a whole new level of precision to target marketing.
Follow the crowd
But this data has value beyond helping companies zero in on potential customers. It’s being used for social science, even medical research. Recently IBM crunched numbers from 5 million phone users in the Ivory Coast in Africa and, by tracking people’s movements through the cell towers their phones connected to, it was able to recommend 65 improvements to bus service in the city of Abidjan.
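To make the idea concrete, here is a minimal sketch of how tower pings could be turned into a picture of movement. It is not IBM’s method, which the article does not describe. It simply assumes each record is a hypothetical (phone ID, timestamp, tower ID) triple with the phone ID already anonymized, and it counts how often phones hop from one tower to another. The function name, field layout and sample data are all made up for illustration.

```python
from collections import Counter


def tower_flows(pings):
    """Count moves between consecutive, distinct towers for each phone.

    pings: iterable of (phone_id, timestamp, tower_id) tuples.
    The record layout is an assumption made for this illustration.
    """
    # Group each phone's pings so they can be put in time order.
    by_phone = {}
    for phone, timestamp, tower in pings:
        by_phone.setdefault(phone, []).append((timestamp, tower))

    flows = Counter()
    for records in by_phone.values():
        records.sort()  # order by timestamp
        # Walk through consecutive pairs of pings for this phone.
        for (_, origin), (_, destination) in zip(records, records[1:]):
            if origin != destination:  # ignore pings that stay on one tower
                flows[(origin, destination)] += 1
    return flows


# Invented sample data: three phones, three towers.
sample_pings = [
    ("p1", 1, "north"), ("p1", 2, "center"), ("p1", 3, "center"),
    ("p2", 1, "north"), ("p2", 5, "center"),
    ("p3", 2, "center"), ("p3", 4, "south"),
]

# The single most traveled tower-to-tower hop in the sample.
busiest = tower_flows(sample_pings).most_common(1)
print(busiest)
```

A planner could read the heaviest flows as a rough hint about where demand for transit is highest, though a real analysis would also have to account for tower coverage, time of day and how often each phone is actually used.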
And computer scientists at the University of Birmingham in England have used cell phone data to fine-tune their analysis of how epidemics spread. Again, it’s about analyzing how people move around. Until now, much of what scientists knew about the spread of contagious diseases was based largely on guesswork. But now, thanks to so many pings from so many phones, there’s far less need to guess.
It’s important to point out that no actual identities are connected to cell phone data. It all gets anonymized, meaning there shouldn’t be a way to track the data back to real people.
There shouldn’t be.
Leaving a trail
But a study published in Scientific Reports in March found that even anonymized data may not be so anonymous after all. A team of researchers from Louvain University in Belgium, Harvard and M.I.T. found that by using data from 15 months of phone use by 1.5 million people, together with a similar dataset from Foursquare, they could identify about 95 percent of the cell phone users with just four data points and 50 percent of them with just two data points. A data point is an individual’s approximate whereabouts at the approximate time they’re using their cell phone.
The reason that only four locations were necessary to identify most people is that we tend to move in consistent patterns. Just as everyone has unique fingerprints, everyone has unique daily travels. While someone wouldn’t necessarily be able to match the path of a mobile phone–known as a mobility trace–to a specific person, we make it much easier through geolocated tweets or location “check-ins,” such as when we use Foursquare.
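For readers who want to see the logic, here is a rough sketch of how that kind of uniqueness could be estimated. It is an illustration under simple assumptions, not the researchers’ actual code: each trace is modeled as a set of hypothetical (place, hour) points, a handful of points is drawn at random from one person’s trace, and the function checks whether anyone else’s trace contains all of them. The function name, parameters and sample data are invented for this example.

```python
import random


def unicity(traces, p, trials=100, seed=0):
    """Estimate the share of users singled out by p random points.

    traces: dict mapping a user id to a set of (place, hour) points.
    p: how many points an observer is assumed to know about one user.
    """
    rng = random.Random(seed)
    # Only users with at least p recorded points can be tested.
    users = [u for u, points in traces.items() if len(points) >= p]
    if not users:
        return 0.0

    unique = 0
    total = 0
    for _ in range(trials):
        for user in users:
            # Draw p known points from this user's own trace.
            known = set(rng.sample(sorted(traces[user]), p))
            # Find every trace that contains all of the known points.
            matches = [u for u, points in traces.items() if known <= points]
            if len(matches) == 1:  # only the user's own trace fits
                unique += 1
            total += 1
    return unique / total


# Invented sample data: three people who share a morning location.
traces = {
    "a": {("T1", 8), ("T2", 12), ("T3", 18)},
    "b": {("T1", 8), ("T2", 12), ("T4", 18)},
    "c": {("T1", 8), ("T5", 12), ("T6", 18)},
}

for points_known in (1, 2, 3):
    print(points_known, unicity(traces, points_known))
```

In this toy data, everyone shares the same morning point, so a single point often fails to single anyone out, while knowing all three points always does. The study’s finding is that, at real-world scale, four points were enough for about 95 percent of people.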
“In the 1930s, it was shown that you need 12 points to uniquely identify and characterize a fingerprint,” the study’s lead author, Yves-Alexandre de Montjoye, told the BBC in a recent interview. “What we did here is the exact same thing, but with mobility traces. The way we move and the behavior is so unique that four points are enough to identify 95 percent of the people.”
“We think this data is more available than people think. When you share information, you look around and you feel like there are lots of people around–in a shopping center or a tourist place–so you feel this isn’t sensitive information.”
In other words, you feel anonymous. But are you really? De Montjoye said the point of his team’s research wasn’t to conjure up visions of Big Brother. He thinks there’s much good that can come from mining cell phone data, for businesses, for city planners, for scientists, for doctors. But he thinks it’s important to recognize that today’s technology makes true privacy very hard to keep.
The title of the study? “Unique in the Crowd.”
Here are other recent developments related to mobile phones and their data:
- Every picture tells your story: Scientists at Carnegie Mellon University’s Human Computer Interaction Center say their study of 100 smartphone apps found that about half of them raised privacy concerns. For instance, a photo-sharing app like Instagram provided information that allowed them to easily discover the location of the person who took the photo.
- Cabbies with cameras: In the Mexican city of Tuxtla Gutiérrez, taxi drivers have been provided with GPS-enabled cell phones and encouraged to send messages and photographs about accidents or potholes or broken streetlights.
- Follow that cell: Congress has started looking into the matter of how police use cell phone data to track down suspects. The key issue is whether they should be required to get a warrant first.
- Follow that cell II: Police in Italy have started using a data analysis tool called LogAnalysis that makes it especially easy to visualize the relationships among conspiring suspects based on their phone calls. In one particular case involving a series of robberies, the tool showed a flurry of phone activity among the suspects before and after the heists, but dead silence when the crimes were being committed.
Video bonus: If you’re at all paranoid about how much data can be gleaned from how you use your mobile phone, you may not want to watch this TED talk by Malte Spitz.