You’ve likely heard the term “Big Data” being thrown around recently.
Dr. Nello Cristianini, professor of artificial intelligence at the University of Bristol, discusses data’s collection, analysis, and usage in the digital age.
Nello Cristianini has been a professor of Artificial Intelligence at the University of Bristol since March 2006, and is a holder of the Royal Society Wolfson Merit Award. He has wide research interests in the area of computational pattern analysis and its application to problems ranging from genomics to computational linguistics and artificial intelligence systems. He has contributed extensively to the field of kernel methods. Before his appointment at Bristol he held faculty positions at the University of California, Davis, and visiting positions at the University of California, Berkeley, and at many other institutions. Before that he was a research assistant at Royal Holloway, University of London. He has also held positions in industry. He has a PhD from the University of Bristol, an MSc from Royal Holloway, University of London, and a Degree in Physics from the University of Trieste. Since 2001 he has been Action Editor of the Journal of Machine Learning Research (JMLR), and since 2005 also Associate Editor of the Journal of Artificial Intelligence Research (JAIR). He is co-author of the books ‘An Introduction to Support Vector Machines’ and ‘Kernel Methods for Pattern Analysis’ with John Shawe-Taylor, and ‘Introduction to Computational Genomics’ with Matt Hahn (all published by Cambridge University Press).
The Big-Data Revolution
Science used to be about making sense of observational data, and theories were the result of scientific activity. Think again: now the collection of data has taken priority, and its analysis by massive computers is being used to make predictions and decisions, which are very effective but which we do not fully understand.
Take the example of amazon.com: it can recommend highly relevant books to readers without having an explicit model of the reader and her preferences. The same happens with Google ads, spam filtering, machine translation, face recognition, and so on. Today’s Artificial Intelligence has found a way to bypass the need to understand a phenomenon before we can replicate it in a computer.
The enabling technology is called machine learning: a method to program computers by showing them examples of their desired behavior.
And the fuel that powers it all is data.
Data has been called the new oil: a new natural resource that businesses and scientists alike can leverage by feeding it to massive learning computers, creating behaviours that we cannot describe in enough detail for a traditional model or program to be written.
It is knowing what – not how: predicting, not explaining. It is about knowing what a new drug will do to a patient, not why.
But was not science meant to help us make sense of the world, comprehend it, and explain it to us?
Or was it just meant to deliver good predictions and enable good decisions?
The big data revolution, which started in artificial intelligence, has now reached every corner of science, from biology to the social sciences, and it is spreading further. Schooling, policing, even design are now driven by big data.
But let us remember that the fuel that powers this revolution is very often our own personal – even intimate – data, and that we still do not have a clear legal, moral and cultural framework to think about this: should we have control over our own personal data?
These ideas about the nature of science, the ethical implications of this new method, as well as the opportunities unleashed by this revolution, are my research focus at the moment, as part of the THINK-BIG research project.