Hort-YOLO: A multi-crop deep learning model with an integrated semi-automated annotation framework

2025-12-09 Ehime University

A research group at Ehime University has developed Hort-YOLO, a multi-crop deep learning model that enables real-time monitoring of horticultural crops. The model is an object recognition system that detects crop morphology and disease states with high accuracy; by incorporating modules such as MSRM, PUSM, and SPPM, it preserves contextual information across images while improving detection accuracy. A further distinguishing feature is its integrated semi-automated annotation framework, which greatly reduces the time required to prepare training data while maintaining the accuracy and stability of supervised learning. Evaluation on datasets covering multiple horticultural crops showed particularly strong performance on medium-scale datasets with imbalanced symptom and quality-grade classes, indicating high potential for deployment in crop inspection and cultivation management. Analysis results can also be displayed on user devices such as smart glasses, supporting real-time decision-making in the field. The findings were published in Computers and Electronics in Agriculture.


<Related information>

Hort-YOLO: A multi-crop deep learning model with an integrated semi-automated annotation framework

M.P. Islam, K. Hatou, K. Shinagawa, S. Kondo, Y. Kadoya, M. Aono, T. Kawara, K. Matsuoka
Computers and Electronics in Agriculture, Available online: 15 November 2025
DOI: https://doi.org/10.1016/j.compag.2025.111196

Highlights

  • Modules such as MSRM, PUSM, and SPPM retain contextual information and improve accuracy.
  • Evaluated on multi-crop datasets; effective in a diverse range of horticultural scenarios.
  • Hort-YOLO excels on datasets with low to moderate class imbalance.
  • Hort-YOLO balances annotation speed with scaling up supervised learning.
  • The network-centric unit analyses data and displays results on the user’s device or smart glasses.
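The semi-automated annotation idea from the highlights can be sketched as a propose-then-review loop: a pretrained detector pre-labels each image, confident proposals are accepted automatically, and only low-confidence boxes go to a human reviewer. Everything below (`propose_boxes`, the 0.6 threshold, the toy data) is an illustrative assumption, not the paper's actual pipeline.

```python
# Minimal sketch of a semi-automatic annotation loop (hypothetical,
# not the paper's pipeline): a detector proposes boxes, and only
# proposals below a confidence threshold are queued for human review.

def propose_boxes(image_id):
    """Stand-in for a pretrained detector; yields (box, label, confidence)."""
    # Toy proposals keyed by image id; a real system would run inference here.
    toy = {
        "img_001": [((10, 10, 50, 50), "tomato", 0.95),
                    ((60, 20, 90, 70), "leaf_blight", 0.42)],
        "img_002": [((5, 5, 40, 40), "cucumber", 0.88)],
    }
    return toy.get(image_id, [])

def semi_auto_annotate(image_ids, conf_threshold=0.6):
    """Auto-accept confident boxes; queue the rest for human correction."""
    accepted, review_queue = [], []
    for image_id in image_ids:
        for box, label, conf in propose_boxes(image_id):
            if conf >= conf_threshold:
                accepted.append((image_id, box, label))   # no human time spent
            else:
                review_queue.append((image_id, box, label, conf))
    return accepted, review_queue

accepted, review_queue = semi_auto_annotate(["img_001", "img_002"])
print(len(accepted), len(review_queue))  # → 2 1
```

Because the human only touches the review queue, annotation time scales with the number of uncertain proposals rather than the total number of boxes, which is where the reported 5- to 6-fold speedup would come from.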

Abstract

This study addresses the significant challenge of accurate object detection in highly variable lighting conditions (ambient and artificial). We introduce a novel architecture, Hort-YOLO, which features a custom backbone, DeepD424v1, and a redesigned YOLOv4 head. The DeepD424v1 backbone is built on a modular, asymmetric structure that effectively extracts discriminative multi-scale global–local spatial features. This design fuses features at different depths to prevent the loss of feature perception while simultaneously enhancing recognition speed and accuracy. The network’s asymmetric branches, with multi-scale and parallel downsampling layers, gradually reduce the spatial size of feature maps. This process extracts fine-to-coarse details with richer feature information and generates diverse contextual information in both spatial and channel dimensions. This design effectively reduces computational complexity and enhances the representation learning capabilities of the convolutional neural network (CNN). The model size is approximately 2.6 × 10² MB. The improved Spatial Pyramid Pooling Module (SPPM) of the Hort-YOLO detector can accurately locate the target object even when its pixel size is less than 5 % of the input image. A comparative performance evaluation was conducted on a class-imbalanced, dynamic, and noisy horticultural dataset. Despite the presence of low to moderate levels of class imbalance, Models 1, 2, and 4 achieved a higher F1 score of 0.68 on the validation dataset. In comparison with other object detectors, including YOLOv10 (n, s, m, l, x, b), YOLOv11 (n, s, m, l, x), YOLOv12 (n, s, m, l, x), YOLOx (medium coco), and both standard and modified YOLOv4, Hort-YOLO achieved mAP@0.5 and recall scores of 0.77 and 0.80, respectively. This study also demonstrates the efficiency of a semi-automatic annotation process, which reduces annotation time by 5 to 6 times.
This annotation framework will help scale up the supervised learning process by efficiently processing large datasets. Hort-YOLO also demonstrates robustness under different lighting, occlusion, and background-complexity conditions, detecting objects at 15 to 30 frames per second (FPS) in a real-world scenario.
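The spatial pyramid pooling idea behind the SPPM can be illustrated in plain Python: max-pool a feature map over several grid sizes and concatenate the results, so the output vector has a fixed length regardless of the input's spatial size. This is a generic SPP sketch under assumed 1×1, 2×2, and 4×4 pyramid levels, not the paper's actual SPPM design.

```python
# Generic spatial pyramid pooling sketch (illustrative, not the
# paper's SPPM): max-pool a 2-D feature map over 1x1, 2x2, and 4x4
# grids and concatenate, giving a fixed-length output for any input.

def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:                       # n x n grid at this pyramid level
        for i in range(n):
            for j in range(n):
                # Cell bounds; every cell covers at least one pixel.
                r0, r1 = i * h // n, max((i + 1) * h // n, i * h // n + 1)
                c0, c1 = j * w // n, max((j + 1) * w // n, j * w // n + 1)
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, min(r1, h))
                                  for c in range(c0, min(c1, w))))
    return pooled                          # length = 1 + 4 + 16 = 21

small = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
large = [[float(r * 9 + c) for c in range(9)] for r in range(8)]
print(len(spatial_pyramid_pool(small)), len(spatial_pyramid_pool(large)))  # → 21 21
```

Pooling the same map at several grid resolutions is what lets a detector keep both coarse context and fine local maxima, which is relevant to locating targets smaller than 5 % of the input image.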
