Dataset

RGB Image Dataset of Urochloa Hybrids for High-Throughput Phenotyping and Artificial Intelligence Applications

This dataset extends a previous release, available at https://doi.org/10.7910/DVN/U0KL6Y. It adds 139 images and 24,983 new annotations; combined with the original dataset, a total of 394 images with 47,323 annotations are now available. The new dataset differs from the previous one mainly in the conditions and types of images captured and in the expanded annotations.

In the initial release, lighting conditions were carefully controlled to standardize the histogram distribution across all images. The images were captured at a fixed distance, exclusively in a nadir (top-down) view, using a single sensor at a single geographic location.

For this updated dataset, variability was prioritized in every respect. Images were taken in multiple geographic locations, including Palmira, Colombia, and Ocozocoautla de Espinosa, Mexico. Several sensors were used: a professional Nikon D5600 camera, smartphones (such as the Realme C53 and Oppo Reno 11), and a Phantom 4 Pro V2 drone. The capture distance varied from 1 to 3 meters, producing images with differing spatial resolutions. Capture angles now include oblique and frontal views in addition to nadir.

Raceme density per plant is also higher. In the original dataset, the plant with the highest raceme count had 851 racemes; in the updated dataset, raceme counts reach 1,586 in a similar area (~1 m²), nearly double. The higher density produces a much greater degree of raceme overlap.

The added variability in location, sensor, capture distance, capture angle, and raceme density is expected to support the development of more robust deep learning models that are better suited to real-world diversity and complexity.
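The counts quoted above imply the size of the original release and the scale of the raceme increase, although the description does not state these derived values directly. The short Python sketch below works through that arithmetic. The variable names and the `mean_per_image` helper are illustrative assumptions, and the per-image averages are only rough indicators, since per-image annotation counts are not given here.

```python
# Figures quoted in the dataset description.
NEW_IMAGES = 139
NEW_ANNOTATIONS = 24_983
TOTAL_IMAGES = 394
TOTAL_ANNOTATIONS = 47_323
MAX_RACEMES_ORIGINAL = 851
MAX_RACEMES_UPDATED = 1_586

# Derived values (not stated in the description): size of the original release.
original_images = TOTAL_IMAGES - NEW_IMAGES
original_annotations = TOTAL_ANNOTATIONS - NEW_ANNOTATIONS


def mean_per_image(annotations: int, images: int) -> float:
    """Average number of annotations per image (hypothetical helper)."""
    return annotations / images


density_original = mean_per_image(original_annotations, original_images)
density_new = mean_per_image(NEW_ANNOTATIONS, NEW_IMAGES)

# Ratio of the highest raceme counts, updated vs. original dataset.
raceme_ratio = MAX_RACEMES_UPDATED / MAX_RACEMES_ORIGINAL

print(f"Original release: {original_images} images, {original_annotations} annotations")
print(f"Mean annotations per image: original {density_original:.1f}, new {density_new:.1f}")
print(f"Maximum raceme count ratio: {raceme_ratio:.2f}x")
```

Under these figures, the original release works out to 255 images and 22,340 annotations, and the maximum raceme count grows by a factor of roughly 1.86, which matches the "nearly double" statement.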