Managing the flood of information

10/Mar/2015


Increasing digitisation in the fields of communication and science means academic libraries are cultivating closer, more active contact with the scholarly community. This is also necessary in order to make the best use of the vast quantity of publications and raw data we have today. By Stéphane Praz

(From "Horizons" no. 104, March 2015)

“The library is a growing organism” is the last of Shiyali Ranganathan’s five laws of library science – a field co-founded by this mathematician, who died in 1972. The law remains valid. Yet with regard to what libraries offer today, “growth” seems if anything an understatement; “explosion” would be the more accurate term.

The Internet enables academic libraries to acquire huge holdings overnight – if they have the financial means to get access to the big publishers. They don’t even need any new physical space for this, because the products in question are stored on server farms elsewhere in the world. And the scholarly outputs available online are growing faster than ever before: at present, they double in volume every nine years, according to a study by ETH Zurich.

The digital revolution has changed academic libraries. International networks and new possibilities for interacting with clients are just two aspects of this – think of the new model for libraries known as “Library 2.0”. But the changes reach far wider. “Digitisation links libraries more closely than ever before with the whole scholarly community”, says Wolfram Neubauer, Director of the ETH Library in Zurich. “Digitisation presents academia with new challenges, and we are the people predestined to meet them”.

Scholars realise this perhaps most clearly when it comes to publishing their findings. Thanks to the open-access movement, they can access many publications free of charge online without visiting a library in person; at the same time, it is libraries that offer them the means to disseminate their own work throughout the world. This is because universities and other tertiary institutions have tasked libraries with publishing on their data servers a multitude of open-access texts, ranging from doctoral and habilitation theses to conference proceedings and journal articles. Libraries also advise scholars in their dealings with the big open-access publishers such as PLOS and BioMed Central.

Publishing in open-access form is attractive, and for findings from publicly financed projects it is increasingly becoming compulsory. But open-access journals also raise certain questions. “Many scientists are concerned about the reputation of some journals and publication avenues”, says Nicolas Sartori, open-access specialist at the Basel University Library. “And when open access is chosen for the initial publication of research, there’s also the matter of finance”. Open-access journals often charge the authors. In the long term, however, authors profit from the process, as Sartori confirms: legal issues that might otherwise arise when their research results are accessed are settled from the outset; publications are disseminated more quickly; and they are cited more often.

Different data formats

But work that is ready for publication is just the tip of the iceberg. Beneath it lies a vast amount of raw data. Today, every laboratory and every computer simulation produces more data in a single day than entire universities produced annually until not so long ago. This data has to be archived – a major challenge for good scientific practice, because experiments and all the observations they involve must remain verifiable and comprehensible. This task, too, is increasingly being delegated to libraries, since individual institutes often can no longer cope with the volume of data. Until now, few researchers have considered archiving their work for longer than ten years, and there are few coherent guidelines on how to structure data storage – as a survey that the ETH Library carried out among 450 professors and research groups has shown.

“If we want to leave future generations more than a mountain of data as vast as it is unusable, then we have to organise it according to unified standards”, says Neubauer of the ETH Library. Libraries have the necessary expertise in data management, but without close contact with the scholarly community nothing will work, as the ETH pilot project “Data curation” demonstrated. Data formats differ from one discipline to another, and the structures they require vary just as much. “In some cases, we have to work out good solutions for individual projects in collaboration with the scientists themselves – ideally even before any data has been generated at all”, says Neubauer. The Anglo-American world has already established the concept of the “embedded librarian”: specialists who work within research teams and whose tasks include organising the structure and archiving of the data.

More subtle search methods

But even before researchers get around to publishing their work – indeed, before they even start generating data – they have to acquire the knowledge that is already out there. Libraries place this knowledge at their disposal, whether online and freely accessible or on the spot in the library itself. Beyond this, libraries increasingly see themselves as mediators of information literacy – the skills that enable scholars to carry out information searches in the first place. “It’s information literacy, not user training” – that’s an important distinction for Thomas Henkel, specialist for search techniques at the Cantonal and University Library of Fribourg. “We are no longer geared only to our own holdings. Instead we empower scientists to search for, evaluate and use information in the worldwide data jungle”. The information age has clearly done little to promote our ability to search for information. According to Henkel, many students arrive at his introductory courses with little more than a basic knowledge of how to do a Google search. Even academics with doctorates often have only a passing acquaintance with Google Scholar, Web of Science or Scopus. “More subtle search methods, such as using Boolean operators, are unknown to many”, says Henkel. Not to mention specific search tools such as the chemical-structure image search offered by Scopus, subject-specific databases and reference management programs. These aids are almost indispensable today for anyone who wants to search and manage scholarly literature efficiently. Neubauer, too, sees specialised information literacy as a basic prerequisite for scholarly work: “It has to form an integral component of our teaching”, he says, “just as it does at the many American universities that already practise it today”. But this requires close collaboration between faculties and libraries.
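By way of illustration – the exact syntax varies from one database to another, and the query below is a hypothetical example rather than one drawn from this article – a Boolean search in a literature database such as Scopus or Web of Science might look like this:

    ("climate change" OR "global warming") AND adaptation AND NOT mitigation

Here OR broadens the search to cover synonyms, AND narrows it to records that contain both concepts, and AND NOT excludes an unwanted topic – precisely the kind of refinement that a simple Google-style keyword search does not offer.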

Stéphane Praz is a freelance science journalist.


 
