DRFW-TQC: Reinforcement Learning for Robotic Strawberry Picking with Dynamic Regularization and Feature Weighting

Bibliographic Details
Main Authors: Anping Zheng, Zirui Fang, Zixuan Li, Hao Dong, Ke Li
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: AgriEngineering
Subjects:
Online Access: https://www.mdpi.com/2624-7402/7/7/208
Description
Summary: Strawberry harvesting is a labor-intensive agricultural operation in which existing end-effector pose control algorithms frequently lack the precision required for fruit grasping, often resulting in unintended damage to target fruits. Concurrently, deep learning-based pose control algorithms suffer from training instability, slow convergence, and inefficient learning in complex environments characterized by high-density fruit clusters and occluded picking scenarios. To address these challenges, this paper proposes DRFW-TQC, an enhanced reinforcement learning framework that integrates Dynamic L2 Regularization for adaptive model stabilization and a Group-Wise Feature Weighting Network for discriminative feature representation. The methodology further incorporates a picking-posture traction mechanism to optimize end-effector orientation control. Experimental results demonstrate the superior performance of DRFW-TQC over the baseline: the proposed approach achieves a 16.0% higher picking success rate and a 20.3% reduction in angular error with four target strawberries. Most notably, the framework's transfer strategy effectively addresses the efficiency challenge in complex environments, maintaining an 89.1% success rate in eight-strawberry scenarios while reducing the timeout count by 60.2% compared to non-adaptive methods. These results confirm that DRFW-TQC resolves the tripartite challenge of operational precision, training stability, and environmental adaptability in robotic fruit harvesting systems.
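The abstract names two components without detailing their internals, so the following is only a minimal NumPy sketch of the general ideas, not the paper's actual architecture: a group-wise feature weighting step in the style of squeeze-and-excitation gating (features are split into groups, a small gating network produces one weight per group, and each group is rescaled), and a dynamic L2 coefficient that grows with the dispersion of recent losses. All function names, shapes, and the specific instability signal are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def group_wise_feature_weighting(x, num_groups, w1, b1, w2, b2):
    """Rescale each feature group by a learned gate in (0, 1).

    x: (batch, dim) feature matrix; dim must divide evenly by num_groups.
    w1/b1, w2/b2: parameters of a small two-layer gating network that maps
    per-group mean statistics (batch, num_groups) to per-group gates.
    """
    batch, dim = x.shape
    assert dim % num_groups == 0
    group_size = dim // num_groups
    # Per-group summary statistic: mean over the features in each group.
    stats = x.reshape(batch, num_groups, group_size).mean(axis=2)
    # Two-layer gating network producing one weight per group.
    hidden = np.tanh(stats @ w1 + b1)
    gates = sigmoid(hidden @ w2 + b2)  # (batch, num_groups), each in (0, 1)
    # Broadcast each gate over its group's features and rescale.
    scaled = x.reshape(batch, num_groups, group_size) * gates[:, :, None]
    return scaled.reshape(batch, dim)

def dynamic_l2_penalty(params_norm_sq, recent_losses, base_coef=1e-4):
    """Hypothetical dynamic L2 term: the coefficient is scaled up when
    recent training losses are dispersed (a proxy for instability)."""
    instability = float(np.std(recent_losses))
    return base_coef * (1.0 + instability) * params_norm_sq

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 12))                       # 4 states, 12 features
G, H = 3, 8                                        # 3 groups, hidden width 8
w1 = rng.normal(scale=0.1, size=(G, H)); b1 = np.zeros(H)
w2 = rng.normal(scale=0.1, size=(H, G)); b2 = np.zeros(G)
y = group_wise_feature_weighting(x, G, w1, b1, w2, b2)
print(y.shape)  # (4, 12)
```

Because each gate lies strictly in (0, 1), the module attenuates rather than amplifies features, and the penalty coefficient reduces to `base_coef` when the loss history is flat.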
ISSN: 2624-7402