Journal Article

A data-mining approach for developing site-specific fertilizer response functions across the wheat-growing environments in Ethiopia

The use of chemical fertilizers is among the main innovations brought by the 1960s Green Revolution. In Ethiopia, fertilizer application during the last four decades has led to significant yield gains, yet yield remains below its potential across much of the country. One of the main causes of the low yield response to fertilizer application has been the use of ‘blanket’ recommendations, under which fertilizer amount and frequency are not tailored to soil requirements. As a result, the amount of fertilizer applied ranges widely, and can be either sub- or supra-optimal. There is thus an increasing need for site-specific fertilizer recommendations that take into account site characteristics such as climate variables (temperature, rainfall, and solar radiation); soil factors (soil organic carbon, moisture, pH, texture, cation exchange capacity, and levels of macro- and micronutrients); and topographic position indices. This article reports on a data-mining approach that we developed using a large dataset of 6585 wheat (Triticum aestivum) field trials, together with detailed, site-specific biophysical variables, to create nutrient response functions that can guide optimal site-specific fertilizer application. The approach used a machine-learning model (random forest) to capture the relationship between nutrients – nitrogen (N), phosphorus (P), potassium (K), and sulfur (S) – and wheat yield. The model explained about 83%, 82%, 47%, and 69% of the variance in yield for N, P, K, and S omission, respectively, with consistent performance across training and testing datasets. As expected, for the N and P omission data, nutrient rate was the most important explanatory variable, followed by soil organic carbon and soil pH. For K and S, however, climatic variables played an important role alongside nutrient rates.
The site-specific yield–fertilizer response curves derived from our model vary strongly from location to location, as they are affected by the climatic, soil, and topographic conditions of each site. Importantly, using principal component analysis, we showed that the shape of a fertilizer response curve results from the multiple environmental factors (including soil, topography, and climate) at play at a given site, rather than from a single dominant one. The research output is expected to respond to national policy demands for a sound method of identifying the optimal fertilizer rate, so as to increase the economic returns on fertilizer investments, and to take fertilizer utilization research one step further.
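
The general idea of fitting a random forest to trial data and then reading off a site-specific response curve can be illustrated with a minimal sketch. This is not the authors' pipeline or data: the dataset below is synthetic, the yield surface is an assumed saturating function, and the variable names (`n_rate`, `soc`, `ph`, `rain`) and the helper `response_curve` are hypothetical. The sketch assumes that a response curve is obtained by sweeping the nutrient rate while holding the site covariates fixed, which is one plausible reading of the approach described above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-ins for trial records (NOT the study's dataset).
n_rate = rng.uniform(0, 200, n)    # nitrogen rate, kg/ha
soc = rng.uniform(0.5, 4.0, n)     # soil organic carbon, %
ph = rng.uniform(4.5, 8.0, n)      # soil pH
rain = rng.uniform(400, 1200, n)   # seasonal rainfall, mm

# Assumed yield surface: a saturating N response whose plateau depends on the site.
plateau = 2.0 + 0.8 * soc - 0.3 * (ph - 6.5) ** 2 + 0.001 * rain
yield_t = 1.0 + plateau * (1 - np.exp(-0.02 * n_rate)) + rng.normal(0, 0.3, n)

X = np.column_stack([n_rate, soc, ph, rain])
X_train, X_test, y_train, y_test = train_test_split(
    X, yield_t, test_size=0.25, random_state=0
)

model = RandomForestRegressor(n_estimators=200, min_samples_leaf=5, random_state=0)
model.fit(X_train, y_train)

# Share of yield variance explained on training and held-out data.
r2_train = r2_score(y_train, model.predict(X_train))
r2_test = r2_score(y_test, model.predict(X_test))

# Impurity-based importance of each explanatory variable.
importances = dict(zip(["n_rate", "soc", "ph", "rain"], model.feature_importances_))


def response_curve(fitted_model, site, rates):
    """Predict yield over a grid of N rates with site covariates held fixed."""
    grid = np.column_stack([rates, np.tile(site, (len(rates), 1))])
    return fitted_model.predict(grid)


rates = np.linspace(0, 200, 21)
site_a = np.array([3.5, 6.5, 1000.0])  # SOC, pH, rainfall: a favourable site
site_b = np.array([1.0, 5.0, 500.0])   # SOC, pH, rainfall: a constrained site
curve_a = response_curve(model, site_a, rates)
curve_b = response_curve(model, site_b, rates)
```

Comparing `curve_a` with `curve_b` shows how the same nutrient rate can map to different predicted yields at different sites, and comparing `r2_train` with `r2_test` is a simple check on how consistently the model performs across training and testing data.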