Reinforcement Learning Control for Bionic Robots Driven by Redundant Artificial Muscles
Abstract: Artificial muscles are core actuators for bionic robots, yet current applications fall far short of real organisms and lack the redundant, coordinated multi-muscle actuation found in biology. To address the actuation and coordination of complex artificial muscle systems, this work proposes a soft robot driven by multiple artificial muscle strands acting in parallel and develops a reinforcement learning based motion control strategy for it. A prototype was built around a flexible cross-shaped circuit board that integrates six liquid crystal elastomer artificial muscles with their driving circuits, and its strain characteristics and response performance were measured. To capture the prototype's deformation-motion behavior, a simplified tendon-driven model was constructed in simulation. With a suitably designed state space, action space, and reward function, the Soft Actor-Critic algorithm was trained in parallel to obtain muscle coordination policies for translation and rotation. The stable periodic segments of these policies were then replayed offline on the physical prototype, which achieved effective multi-directional translation and rotation and thereby validated the feasibility of reinforcement learning for controlling complex artificial muscle systems.
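The offline replay step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the stable periodic segment is identified, so everything below is an assumption made for illustration: the function names (`find_period`, `extract_cycle`, `replay`), the idea of comparing the tail of a simulated rollout with the steps immediately before it, and the six-channel action format are all hypothetical and not taken from the paper.

```python
from typing import List, Sequence

Action = Sequence[float]  # hypothetical: one activation value per muscle channel


def find_period(actions: List[Action], min_period: int, max_period: int) -> int:
    """Return the smallest lag at which the end of the rollout best repeats itself.

    Assumption: the policy has settled into a periodic gait by the end of the
    rollout, so the last `lag` steps should match the `lag` steps before them.
    """
    best_lag, best_err = min_period, float("inf")
    for lag in range(min_period, max_period + 1):
        if 2 * lag > len(actions):
            break  # not enough history to compare two consecutive windows
        recent = actions[-lag:]
        earlier = actions[-2 * lag:-lag]
        # Mean squared mismatch per step between the two windows.
        err = sum(
            (a - b) ** 2
            for step_a, step_b in zip(recent, earlier)
            for a, b in zip(step_a, step_b)
        ) / lag
        # Strict improvement only, so multiples of the true period do not win.
        if err < best_err - 1e-12:
            best_lag, best_err = lag, err
    return best_lag


def extract_cycle(actions: List[Action], min_period: int = 2,
                  max_period: int = 200) -> List[List[float]]:
    """Cut one gait cycle from the end of a recorded policy rollout."""
    lag = find_period(actions, min_period, max_period)
    return [list(step) for step in actions[-lag:]]


def replay(cycle: List[List[float]], n_cycles: int) -> List[List[float]]:
    """Tile one cycle into an open-loop command sequence for offline playback."""
    return [list(step) for _ in range(n_cycles) for step in cycle]


if __name__ == "__main__":
    # Synthetic rollout: a short transient followed by a repeating 4-step pattern
    # over six channels (made-up data, not measurements from the prototype).
    pattern = [[1.0 if i == k else 0.0 for i in range(6)] for k in range(4)]
    rollout = [[0.5] * 6 for _ in range(5)] + pattern * 5
    commands = replay(extract_cycle(rollout), n_cycles=3)
    print(len(commands), "open-loop steps")
```

Replaying a fixed cycle open-loop means no state feedback is needed on the hardware, which matches the offline driving described in the abstract; the cost is that the replayed sequence cannot react to disturbances the way the closed-loop policy does in simulation.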
Key words: bionic robot / artificial muscles / reinforcement learning / soft robot / locomotion control
Fig. 4 Tension-compression mechanical testing: (a) force output experiment for the artificial muscle; (b) force output curve of the 2.9 cm artificial muscle; (c) force output curve of the 4.2 cm artificial muscle; (d) FPCB compression deformation experiment, before compression; (e) FPCB compression deformation experiment, after compression; (f) FPCB compression