A picture paints ten thousand digits
In the age of big data, visualisation has become an essential tool, helping us to discover the hidden relationships that lie beyond the reach of algorithms. By Daniel Saraga
Big data is getting bigger. Government statistics are escaping red tape thanks to ‘open data’, commercial and financial information keeps accumulating, and beyond that there is the abundance of traces we leave as we surf the Internet.
But still the question remains: what do we do with this mass of information and how can we transform it into useful knowledge? One option uses powerful statistical algorithms to uncover correlations, a process known as ‘data mining’. But humans can also get the data to talk. “If the data is well represented visually, the eye is capable of quickly detecting relationships which are beyond algorithms”, says Denis Lalanne, a researcher in the computer science department at the University of Fribourg. “This is the case with trends, outliers and even similarities between datasets”.
From the UN to New York cabs
Along with his PhD student, Ilya Boyandin, and his colleague, Enrico Bertini, Lalanne has been developing new tools, including some for visualising ‘flows’ (the popularity of routes from an origin to a destination) and for analysing changes over time. These tools are published as an open-source library and have found a multitude of uses, such as studying the allocation of funds for international aid or the distribution chains of logistics companies.
“I was surprised by the response our work has generated”, says Lalanne. One notable project, known as “Flowstrates”, was created in collaboration with the United Nations to study the movement of refugees between different countries. It has since been taken up by other users and adapted to investigate not only the mobility of workers in Chile and students in Australia, but also the international trade of resources and the movement of New York taxicabs.
Choose and elucidate
“Existing tools are no match for the range of questions that even a single user can have”, says Lalanne. After all, good data visualisation is not about representing everything; that just leads to illegible graphics and charts. “The key is to understand the needs of a user and to define practical usage scenarios in order to select relevant information”, he says. As the tool is interactive, it must allow easy exploration of data and give rise to new hypotheses, which can then be put to statistical tests.
“We’ve compared the conclusions drawn from different diagrams. This proved that the way the information is presented clearly influences what we can take from it. Our goal is to stay as close as possible to the data without distorting it. But it’s clear that visualisation can be used in communications to target a message easily”.
The success of data visualisation has started to attract students. Computer scientists are going even further and developing ‘visual analytics’ algorithms to analyse the charts and graphics generated by other computer programs. But for now, the top tool for sifting through torrents of information, and not drowning under them, is still the faithful old eye of Homo sapiens.
Daniel Saraga is a freelance science journalist who also works on behalf of the LargeNetwork agency.
(From "Horizons" No. 101, June 2014)