ORIE Colloquium

Augustin Chaintreau
Columbia University
How to fix big data disparate impact?

Tuesday, November 22, 2016 - 4:15pm
Rhodes 253

Big Data is flawed: opacity, fragmentation, and hegemony, it seems, are embedded in its fabric. The threats are not just theoretical. We show, using reproducible experiments, that personalization algorithms in services used by millions today pose moral hazards, that the metrics of social endorsement currently in use are vastly misleading, and that the network dynamics facilitated by online interactions stand in the way of reducing various inequalities. The collection and use of personal information, however, ultimately bring benefits that we cannot forgo, including in areas such as health, energy efficiency, and public policy. How can one address those concerns at their root?

This talk presents an overview of our projects to reverse those disturbing trends. First, we show how to bring transparency to personalization at web scale, using scalable inference algorithms that learn from a small set of examples to keep advertisers accountable. Second, we present tools that reconcile openly accessible data to inform mobile consumers and social media participants of the risks their information poses across domains. Third, we analyze the network effect of personal information sharing. In our experiment, its evolution reveals a public-good dynamic, of the kind that we show reinforces disparate outcomes, sometimes to the extreme. However, we show that the benefit from an expanding social network is guaranteed for all participants under a general spectral condition that controls for segregation.