2025-12-09 Ehime University

<Related information>
- https://www.ehime-u.ac.jp/data_relese/pr_20251209_agr/
- https://www.ehime-u.ac.jp/wp-content/uploads/2025/12/pr_20251209_agr.pdf
- https://www.sciencedirect.com/science/article/pii/S016816992501302X
Hort-YOLO: A multi-crop deep learning model with an integrated semi-automated annotation framework
M.P. Islam, K. Hatou, K. Shinagawa, S. Kondo, Y. Kadoya, M. Aono, T. Kawara, K. Matsuoka
Computers and Electronics in Agriculture, available online 15 November 2025
DOI: https://doi.org/10.1016/j.compag.2025.111196
Highlights
- Modules such as MSRM, PUSM, and SPPM retain contextual information and improve accuracy.
- Evaluated on multi-crop datasets; effective across a diverse range of horticultural scenarios.
- Hort-YOLO excels on datasets with low to moderate class imbalance.
- Hort-YOLO balances annotation speed with scaling up supervised learning.
- The network-centric unit analyses data and displays results on the user's device or smart glasses.
Abstract
This study addresses the significant challenge of accurate object detection under highly variable lighting conditions (ambient and artificial). We introduce a novel architecture, Hort-YOLO, which features a custom backbone, DeepD424v1, and a redesigned YOLOv4 head. The DeepD424v1 backbone is built on a modular, asymmetric structure that effectively extracts discriminative multi-scale global–local spatial features. This design fuses features at different depths to prevent the loss of feature perception while simultaneously enhancing recognition speed and accuracy. The network's asymmetric branches, with multi-scale and parallel downsampling layers, gradually reduce the spatial size of feature maps. This process extracts fine-to-coarse details with richer feature information and generates diverse contextual information in both the spatial and channel dimensions. The design effectively reduces computational complexity and enhances the representation learning capabilities of the convolutional neural network (CNN). The model size is approximately 2.6 × 10² MB. The improved Spatial Pyramid Pooling Module (SPPM) of the Hort-YOLO detector can accurately locate the target object even when its pixel size is less than 5 % of the input image. A comparative performance evaluation was conducted on a class-imbalanced, dynamic, and noisy horticultural dataset. Despite the presence of low and moderate levels of class imbalance, Models 1, 2, and 4 achieved a higher F1 score of 0.68 on the validation dataset. In comparison with other object detectors, including YOLOv10 (n, s, m, l, x, b), YOLOv11 (n, s, m, l, x), YOLOv12 (n, s, m, l, x), YOLOx (medium coco), and both standard and modified YOLOv4, Hort-YOLO achieved mAP@0.5 and recall scores of 0.77 and 0.80, respectively. This study also demonstrates the efficiency of a semi-automatic annotation process, which reduces annotation time by a factor of 5 to 6.
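To make the SPPM idea concrete, the sketch below shows plain spatial pyramid pooling, the general technique behind such modules: a feature map is max-pooled over grids of several sizes and the results are concatenated into a fixed-length vector, so context is captured at multiple scales regardless of input size. The function name and the grid sizes (1, 2, 4) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of spatial pyramid pooling (SPP) in pure Python.
# The grid levels and names are assumptions for illustration only.

def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map over pyramid grids and concatenate
    the pooled values into one fixed-length vector."""
    h = len(feature_map)
    w = len(feature_map[0])
    pooled = []
    for n in levels:  # an n x n grid at each pyramid level
        for i in range(n):
            for j in range(n):
                # Cell boundaries; the max() guard keeps cells non-empty
                r0, r1 = i * h // n, max((i + 1) * h // n, i * h // n + 1)
                c0, c1 = j * w // n, max((j + 1) * w // n, j * w // n + 1)
                pooled.append(max(
                    feature_map[r][c]
                    for r in range(r0, r1) for c in range(c0, c1)
                ))
    return pooled  # length = sum(n*n for n in levels), independent of h, w

# Toy usage: an 8x8 map with values 0..63 row-major.
fm = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
vec = spp_max_pool(fm)
print(len(vec), vec[0])  # 21 63.0  (1 + 4 + 16 cells; global max first)
```

Because the output length depends only on the pyramid levels, downstream layers see a fixed-size representation even when input resolution varies.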
This annotation framework will help scale up the supervised learning process by efficiently processing large datasets. Hort-YOLO also demonstrates robustness under different lighting, occlusion, and background-complexity conditions, detecting objects at 15 to 30 frames per second (FPS) in a real-world scenario.
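The general shape of a semi-automated annotation loop of the kind the abstract credits with a 5- to 6-fold speedup can be sketched as follows: a detector pre-labels each image, confident predictions are accepted automatically, and only uncertain boxes go to a human reviewer. The `detect` and `review` callables and the threshold are hypothetical stand-ins, not the authors' tools.

```python
# Hedged sketch of a semi-automated annotation loop: auto-accept
# high-confidence detections, route the rest to a human reviewer.

def semi_auto_annotate(images, detect, review, conf_thresh=0.5):
    """Pre-label images with `detect`; send only uncertain boxes to `review`.

    `detect(img)` returns [(label, confidence, bbox), ...];
    `review(img, boxes)` returns the human-corrected subset.
    """
    dataset = []
    for img in images:
        boxes = detect(img)
        kept = [b for b in boxes if b[1] >= conf_thresh]   # auto-accepted
        uncertain = [b for b in boxes if b[1] < conf_thresh]
        kept += review(img, uncertain)                     # human pass
        dataset.append((img, kept))
    return dataset

# Toy usage with a fake detector and a reviewer that keeps everything.
imgs = ["img1.jpg", "img2.jpg"]
fake_detect = lambda img: [("fruit", 0.9, (0, 0, 10, 10)),
                           ("fruit", 0.3, (5, 5, 8, 8))]
fake_review = lambda img, boxes: boxes
data = semi_auto_annotate(imgs, fake_detect, fake_review)
print(len(data), len(data[0][1]))  # 2 2  (two images, two boxes each)
```

The speedup comes from the reviewer touching only the uncertain fraction of boxes rather than drawing every box from scratch.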
