Each day, many of us exchange content through Facebook, Twitter, blogs, discussion boards, and other online forums. What does this digital information disclose about us, and how are companies and other organizations using our data?
Rohini Srihari, an associate professor at the University at Buffalo's School of Engineering and Applied Sciences, discusses that topic in the following interview.
Why is Google's data trove so valuable?
Srihari: The sheer volume of data that they have is powerful: there's so much of it, and it's so diverse. It reflects the voices of consumers, the voices of citizens, the voices of people across countries. One way they can exploit this information is through usage mining, which is tracking how people are using the Internet. They know what people are querying. Google has access to all sorts of information that marketers would love to get their hands on. When people query a brand, for instance, what are they querying for? Google was able to spot outbreaks of flu-like illnesses before government agencies could, because government agencies rely on traditional reporting, waiting for hospitals to send in statistics, whereas Google relies on queries.
What other companies or organizations are investing in data mining on the Web, and why?
Srihari: Practically everyone. The telecoms, credit card agencies, retailers, airlines, e-commerce providers like Amazon. One emerging technology is socially targeted advertising. Companies analyze the browsing patterns of brand loyalists, identify Internet users with similar browsing patterns, and use that information to target advertising.
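To make the idea of matching browsing patterns concrete, here is a minimal sketch in Python. It is not a description of how any particular company does this. It assumes each user's browsing pattern can be reduced to a set of visited sites, and it uses Jaccard similarity as a stand-in for whatever measure a real system would use. The function names, the example users and sites, and the 0.5 threshold are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of overlap divided by size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_lookalikes(loyalists, candidates, threshold=0.5):
    """Return candidates whose browsing resembles at least one known loyalist.

    Both arguments map a user name to the set of sites that user visited.
    The threshold is an illustrative assumption, not a recommended value.
    """
    matches = {}
    for user, sites in candidates.items():
        # Score each candidate by the closest match among the loyalists.
        best = max((jaccard(sites, seen) for seen in loyalists.values()), default=0.0)
        if best >= threshold:
            matches[user] = best
    return matches


# Hypothetical data: two brand loyalists and two other Internet users.
loyalists = {
    "ann": {"brand.example", "runnersblog.example", "trailmaps.example"},
    "bob": {"brand.example", "runnersblog.example", "gearreviews.example"},
}
candidates = {
    "cara": {"brand.example", "runnersblog.example", "trailmaps.example", "news.example"},
    "dan": {"news.example", "recipes.example"},
}

print(find_lookalikes(loyalists, candidates))  # {'cara': 0.75}
```

In this toy example, cara shares three of four sites with ann and would be a candidate for the brand's advertising, while dan shares none and is left out.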
What are some interesting challenges that researchers and companies face when mining data on the Web?
Srihari: The No. 1 challenge is balancing privacy with data mining. We've come to a stage where we do less than we can for fear of spooking the public. How do you gain enough information to help a retailer without creating a backlash? You don't want people to feel like you're invading their privacy.
What are some potential public benefits that could come from data mining?
Srihari: Trends emerge quickly on the Web, and that can be used in an advantageous way. We've heard that gang members often post on their Facebook pages what they did, so law-enforcement agents frequently go and look at Facebook. In communities, if the volume of chatter about some topic increases to a certain level (maybe roads need fixing or there's a dangerous traffic light), public officials might take notice.
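The notion of chatter rising "to a certain level" can be sketched as a simple volume check. The Python below is only an illustration under stated assumptions: posts have already been counted per topic per day, the last count in each list is today's, and a topic is flagged when today's count clears a minimum and exceeds a multiple of its earlier average. The function name, the topics, and the `factor` and `min_count` values are hypothetical.

```python
from statistics import mean


def flag_spikes(daily_counts, factor=3.0, min_count=10):
    """Return topics whose latest daily count spikes above their own history.

    daily_counts maps a topic to a list of daily post counts, oldest first.
    factor and min_count are illustrative assumptions, not tuned values.
    """
    flagged = []
    for topic, counts in daily_counts.items():
        if not counts:
            continue  # no data for this topic
        *history, latest = counts
        baseline = mean(history) if history else 0.0
        # Require both an absolute floor and a jump relative to the baseline.
        if latest >= min_count and latest > factor * baseline:
            flagged.append(topic)
    return sorted(flagged)


# Hypothetical counts of community posts per topic over five days.
daily_counts = {
    "pothole on Main St": [2, 3, 1, 2, 14],
    "farmers market": [8, 9, 10, 9, 11],
    "broken traffic light": [0, 1, 0, 0, 4],
}

print(flag_spikes(daily_counts))  # ['pothole on Main St']
```

The absolute floor keeps a topic that goes from zero posts to a handful from being flagged, while the relative test keeps steadily popular topics, such as the farmers market here, from being flagged every day.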