Analyzing iShamba data for improving livestock advisories

This project examines historical messaging patterns in livestock farmers' interactions with agricultural service provider platforms such as iShamba, and how those patterns correlate with climate hazard data. The goal is to develop a high-resolution, targeted, and timely advisory package for livestock farmers, delivered through these platforms. The dataset comprises 319,569 text messages sent by farmers via the iShamba platform between 2015 and 2022. After pre-processing, which included filtering for livestock-specific messages and removing duplicate SMS messages, the final sample contains 72,062 messages with no missing data.
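The pre-processing described above could be sketched as follows. This is a minimal illustration in Python, not the project's actual pipeline: the keyword list, the function names, and the rule that duplicates are messages with identical text after lowercasing and whitespace normalization are all assumptions, since the source does not specify how either step was done.

```python
import re

# Hypothetical keyword list; the source does not say how livestock
# messages were identified.
LIVESTOCK_KEYWORDS = {"cow", "cows", "goat", "goats", "chicken", "poultry", "dairy", "calf"}


def normalize(text):
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", text.strip().lower())


def is_livestock(text):
    """Return True if any alphabetic token of the message is a livestock keyword."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(tok in LIVESTOCK_KEYWORDS for tok in tokens)


def preprocess(messages):
    """Keep livestock-related messages and drop duplicates (first occurrence wins)."""
    seen = set()
    kept = []
    for msg in messages:
        key = normalize(msg)
        if not key or key in seen or not is_livestock(key):
            continue
        seen.add(key)
        kept.append(msg)
    return kept


# Toy messages, invented for illustration only.
sample = [
    "My cow has stopped eating",
    "my cow  has stopped eating",
    "When should I plant maize?",
    "How do I vaccinate chicken?",
]
cleaned = preprocess(sample)
print(cleaned)
```

A keyword filter of this kind is easy to audit but will miss messages that describe livestock without using a listed term, so the list would need to be reviewed against the real data.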

Recognizing the impact of the number of dry days (NDD) and the livestock heat stress index (THI) on livestock productivity, the analysis employs the Structural Topic Model (STM) package in R. The package is used to explore the optimal number of topics, visualize the results, and estimate relationships between metadata and topics, and a Shiny application is created for exploring the STM results. Separately, the Stupid Back-Off (SBO) package in R is used to build a predictive model for the next word(s).
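To illustrate the idea behind Stupid Back-Off next-word prediction, here is a minimal, self-contained sketch in Python. It does not use or reproduce the API of the R package; the toy corpus, the function names, and the back-off penalty of 0.4 (the value commonly used with this scheme) are assumptions for illustration. A candidate word is scored by its relative frequency after the longest available context, and whenever that n-gram was never observed, the context is shortened by one word and the score is multiplied by the penalty.

```python
from collections import Counter

ALPHA = 0.4  # back-off penalty (assumed; the commonly used default)


def count_ngrams(sentences, max_n):
    """Count all n-grams of order 1..max_n over whitespace-tokenized sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


def score(word, context, counts, total_tokens):
    """Stupid Back-Off score of `word` following `context` (a tuple of tokens)."""
    penalty = 1.0
    while context:
        ngram = context + (word,)
        if counts[ngram] > 0:
            # Seen n-gram: relative frequency, discounted by accumulated penalty.
            return penalty * counts[ngram] / counts[context]
        context = context[1:]  # drop the oldest context word
        penalty *= ALPHA
    # No context left: fall back to the unigram relative frequency.
    return penalty * counts[(word,)] / total_tokens


def predict_next(text, counts, max_n, top=3):
    """Return the `top` highest-scoring next words for the given text."""
    tokens = text.lower().split()
    context = tuple(tokens[-(max_n - 1):]) if max_n > 1 else ()
    vocab = [gram[0] for gram in counts if len(gram) == 1]
    total = sum(counts[(w,)] for w in vocab)
    ranked = sorted(vocab, key=lambda w: (-score(w, context, counts, total), w))
    return ranked[:top]


# Toy corpus, invented for illustration only.
corpus = ["my cow is sick", "my cow is not eating", "my goat is sick"]
counts = count_ngrams(corpus, max_n=3)
print(predict_next("my goat is", counts, max_n=3, top=1))
```

The scores are not normalized probabilities, which is the trade-off this scheme makes: it skips smoothing and normalization in exchange for simple counting that scales to large message collections.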

The results show that simplifying the THI and NDD stress-level classes did not yield the expected outcomes, especially when compared with the more detailed NDD model, which successfully identified stress-related words and SMS messages. The findings indicate that extreme weather conditions, such as more than 25 days without rain in a month, trigger farmer concerns about crop health and the impact of pesticides. Although the simplified approach could have been advantageous, its results may have been influenced by the specific time period covered by the data collection.