Journal Article

The foundations of big data sharing: A CGIAR international research organization perspective

The potential of big data capabilities to transform and understand global agricultural and biological systems often relies on data from different sources that must be considered together or aggregated to provide insights. The value of data, however, lies not only in its collection and storage but largely in its re-use. Big data storage repositories alone are not enough in a world of escalating data volumes; we also need innovative systems and tools that address data harmonization and standardization and, importantly, that bridge the gap between science and end users. In this paper, we demonstrate how CGIAR (including the Alliance of Bioversity International and CIAT) develops a culture of co-operation and collaboration among custodians of agrobiodiversity data, and we outline new directions for big data. CGIAR first launched the Platform for Big Data in Agriculture to enhance the development and maintenance of its data. This helped to establish workflows for cross-platform synthesis and to annotate and apply the lessons learnt. The Platform then built GARDIAN (Global Agricultural Research Data Innovation and Acceleration Network), a digital tool that harvests from ∼40 separate open data and publication repositories that 15 CGIAR centres have used for data synthesis. While there have been significant advances in big data management and storage, we also identify the gaps that must be addressed to improve the use and re-use of data and so reveal its added value in decision making.