NDIZI

NDIZI - Alliance Bioversity International - CIAT

NDIZI is an AI-driven research initiative that captures rich farmer feedback from voice, text, and images. It uses machine learning to turn this feedback into actionable insights for sectors such as research and development (e.g., crop breeding) and market research (e.g., consumer preferences). It bridges the gap between farmer preferences and breeding decisions, enabling scalable, data-driven development of better crop varieties.

Project Name (full): NDIZI - NLP to Develop and Innovate Zero-Shot Intelligence

Region and Countries: Rather than a geographical scope, this project is defined by language. It focuses on low-resource African languages (Swahili, Chichewa, Amharic, Hausa, Wolof, Yoruba, and Shona) and also supports the high-resource languages English, French, and Spanish.

Funder: The Gates Foundation

Partner: Tanzania Agricultural Research Institute (TARI)

What NDIZI does

NDIZI addresses a persistent gap in agricultural research: while significant investments have been made in crop breeding, methods for capturing farmer preferences—critical for variety adoption—remain limited in depth, scale, and usability. Traditional surveys often fail to capture the nuance of farmer observations, especially in diverse, low-resource environments.

The initiative aims to test and operationalise a new paradigm: using AI to collect and analyse rich, multimodal farmer feedback. At its core is Sikia, an integrated mobile app that combines Automatic Speech Recognition (ASR), Small and Large Language Models (SLMs/LLMs), and computer vision to capture voice, text, and image data directly from on-farm trials. Through conversational, dynamic prompting, NDIZI enables farmers to express preferences in their own words, while AI systems structure this input into trait-level insights aligned with breeding workflows.
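To make the idea of trait-level structuring concrete, here is a minimal sketch of turning a transcribed farmer comment into trait mentions. It is an illustration only, not Sikia's implementation: the trait vocabulary, the `TraitMention` record, and the `extract_trait_mentions` function are all hypothetical, and simple keyword matching stands in for the language models described above.

```python
import re
from dataclasses import dataclass

# Hypothetical trait vocabulary (an assumption for illustration);
# a breeding program would supply its own trait list.
TRAIT_KEYWORDS = {
    "yield": {"yield", "harvest"},
    "drought_tolerance": {"drought", "dry"},
    "taste": {"taste", "flavour", "sweet"},
}


@dataclass
class TraitMention:
    trait: str     # trait label from the vocabulary above
    evidence: str  # the sentence in which the trait was mentioned


def extract_trait_mentions(transcript: str) -> list[TraitMention]:
    """Split a transcript into sentences and tag each with matching traits.

    Keyword matching is a stand-in for a language model here.
    """
    mentions = []
    for sentence in transcript.split("."):
        sentence = sentence.strip()
        if not sentence:
            continue
        # Match on whole words to avoid substring false positives.
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        for trait, keywords in TRAIT_KEYWORDS.items():
            if words & keywords:
                mentions.append(TraitMention(trait, sentence))
    return mentions


if __name__ == "__main__":
    comment = "This variety gave a good harvest. It survived the dry spell."
    for mention in extract_trait_mentions(comment):
        print(mention.trait, "<-", mention.evidence)
```

Keeping the original sentence alongside each trait label preserves the farmer's own words, so a breeder can check how a structured record was derived.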

What makes NDIZI unique is its offline-capable data collection in low-connectivity settings, and its focus on multilingual, farmer-centered interaction. NDIZI's primary focus is on low-resource languages (Swahili, Chichewa, Amharic, Hausa, Wolof, Yoruba, Shona). Among the high-resource languages, the project will target English, French and Spanish.


Key activities

NDIZI is implemented through a set of coordinated research and development activities that combine AI model development, field validation, and close collaboration with breeding programs. The work begins with improving core components such as Automatic Speech Recognition (ASR) and language models to better process farmer speech and extract insights across low-resource languages and noisy field conditions.

A key focus is expanding multilingual capability (including Swahili, Chichewa, Amharic, Hausa, Wolof, Yoruba, Shona, as well as English, French, and Spanish), which is critical to ensuring that farmer insights can inform breeding pipelines beyond a single linguistic context and respond to demonstrated downstream demand from partners such as CIMMYT.

These components are integrated into Sikia, a mobile, offline-capable data collection tool powered by agentic AI that combines on-device and cloud-based models. Field trials, conducted in collaboration with breeding partners, are used to test and compare conversational AI approaches and to generate datasets for evaluating data quality, usability, and insight generation.

A core component of the work is the development and implementation of an ethical AI framework. This includes ensuring informed consent, protecting farmer data, and systematically monitoring and mitigating bias across demographic groups and languages. Together, these activities aim to assess whether AI-based, multilingual, and ethically grounded approaches can produce reliable and scalable inputs into breeding decision-making.
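One common way to monitor speech recognition bias across languages or demographic groups is to compare word error rate (WER) per group. The sketch below assumes a list of (group, reference transcript, ASR output) triples. The function names and data layout are hypothetical, and nothing here is confirmed to be the metric or method NDIZI uses.

```python
from collections import defaultdict


def word_edit_distance(ref: list[str], hyp: list[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer_by_group(samples):
    """Return {group: WER} for an iterable of (group, reference, hypothesis).

    WER is total word errors divided by total reference words in the group.
    """
    errors = defaultdict(int)
    ref_words = defaultdict(int)
    for group, reference, hypothesis in samples:
        ref, hyp = reference.split(), hypothesis.split()
        errors[group] += word_edit_distance(ref, hyp)
        ref_words[group] += len(ref)
    return {g: errors[g] / ref_words[g] for g in ref_words if ref_words[g] > 0}


if __name__ == "__main__":
    # Toy data with placeholder group labels, not project results.
    samples = [
        ("group_a", "a b c d", "a b x d"),
        ("group_b", "a b", "a b"),
    ]
    print(wer_by_group(samples))
```

A persistent gap in WER between groups would flag where a model underperforms and where additional data or adaptation may be needed.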
