Abstract: The tendon-sheath mechanism (TSM) has significantly advanced both robotic systems and minimally invasive surgery (MIS) by enabling flexible and precise movement through narrow and tortuous paths. However, the inherent flexibility of the TSM introduces nonlinear behaviors that depend on its geometrical shape and the applied forces, making accurate control challenging. Furthermore, the shape dependency becomes critical in endoscopic robots, where the geometrical shape varies and is not directly visible, limiting the applicability of existing distal sensorless compensation methods. To address the geometry identification problem of the TSM, this paper proposes an approach that utilizes real-time visual input from an endoscopic camera for on-line calibration of the TSM's physical model. By introducing the concept of the 'Equivalent Circle,' complex shapes of TSMs are simplified, enabling the estimation of their equivalent geometry without direct observation or measurement. Simulation results validate the equivalent circle model, demonstrating minimal deadband percentage errors despite larger discrepancies in equivalent radii across varied configurations. On-line calibration experiments achieved a percent error of 1.38% (±2.92%) for accumulated curve angles and 2.32% (±3.08%) for equivalent radii, demonstrating the method's reliability in shape estimation across varying conditions. In prediction and feedforward experiments, leveraging the equivalent circle to compensate for deadband in arbitrarily shaped TSMs resulted in a maximum trajectory error of 0.25 mm and an RMSE of 0.09 mm. This approach advances distal sensorless control, improving the operational accuracy and feasibility of endoscopic surgical robots under varying geometrical and force conditions.
Abstract: In numerous real-world applications, the ability to accurately perceive and respond to dynamic changes in the environment, while also maintaining the flexibility to transfer learned skills across different tasks, is crucial for the effective operation of robotic arms. Behavior cloning is particularly promising in this context due to its data efficiency and strong task transferability, enabling robots to quickly adapt to new tasks by learning from demonstrations. However, traditional behavior cloning methods, which rely primarily on the observation and state information of the current frame to predict subsequent actions, fall short in dynamic contexts due to their static nature. To address this limitation, we propose Dynamic Behavior Cloning with Temporal Feature Prediction (DBC-TFP), which extends behavior cloning by leveraging historical frames to capture the dynamic features needed to predict future scene images. This method uses a loss function based on the mean squared error (MSE) between the predicted future scene image and its ground-truth counterpart, improving the model's accuracy in action prediction for dynamic scenarios. To evaluate our approach, we design a benchmark comprising eight task scenarios: six foundational tasks and two advanced tasks. Experimental results on this benchmark demonstrate that DBC-TFP significantly improves the success rate of behavior cloning in dynamic scenarios compared to traditional behavior cloning methods.
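As a rough illustration of the loss described in the abstract above, the following minimal sketch computes an MSE between a predicted future frame and its ground truth and adds it to an action-imitation term. The function names, the flattened pixel lists, the use of MSE for the action term, and the `frame_weight` parameter are all assumptions made for illustration; the abstract does not state how the two terms are combined.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(pred_action, expert_action, pred_frame, true_frame,
                  frame_weight=1.0):
    """Hypothetical combined objective: action imitation + future-frame MSE.

    pred_frame / true_frame are flattened pixel intensities (assumption).
    frame_weight is an illustrative weighting, not taken from the paper.
    """
    action_loss = mse(pred_action, expert_action)
    frame_loss = mse(pred_frame, true_frame)
    return action_loss + frame_weight * frame_loss


# Toy usage: a 2-D action and a 4-pixel "image".
loss = combined_loss(
    pred_action=[0.0, 1.0],
    expert_action=[0.0, 0.0],
    pred_frame=[0.5, 0.5, 0.5, 0.5],
    true_frame=[0.5, 0.5, 0.5, 1.5],
    frame_weight=2.0,
)
```

In a real training loop the frame term would typically be computed on image tensors by a deep learning framework; the plain-Python version here only makes the arithmetic explicit.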
Abstract: Reinforcement learning (RL) has demonstrated success across multiple robotic grasping and manipulation tasks. However, for RL to be widely applicable, policies must be able to transfer across the sim-to-real gap and to hand geometries that they were not trained on. Methods such as domain randomization and domain adaptation only partially help with bridging these gaps. In this letter, we explore the impact of state and action space selection on transferability across both the sim-to-real gap and different hand geometries. Using two exemplar manipulation tasks, we demonstrate that state and action space selection significantly affects the overall performance of a policy and its robustness to both types of transfer. We also show that, for both types of transfer, a reduced state space that avoids hand-specific information is preferable, even when it provides less information than a full state space.
Abstract: Neuromorphic sensors are a promising technology in artificial touch due to their low latency and low computational and power requirements, particularly when paired with spiking neural networks (SNNs). Here, we explore the ability of these systems to adapt to and generalize across varying sources of uncertainty in tactile tasks. We choose Braille reading as an application task and collect event-based data for 27 Braille characters with a neuromorphic tactile sensor (NeuroTac) under varying conditions of tapping speed, center position, and indentation depth using a 6-DOF robot arm. We initially analyze the effect of spatial location and speed on classification performance with spiking convolutional neural networks (SCNNs). We then show that SCNNs can generalize across each dimension. The final general SCNN model reaches 95.33% accuracy with uncertainty in all 4 dimensions. This research demonstrates the noise degradation performance of SCNNs in a tactile task, and outlines the potential of a single SCNN to generalize across several dimensions of uncertainty.
Thomas Pritchard, Saifullah Ijaz, Ronald Clark, Basaran Bahadir Kocer...
pp. 5233-5240
Abstract: Recent advancements in visual odometry systems have improved autonomous navigation, yet challenges persist in complex environments like forests, where dense foliage, variable lighting, and repetitive textures compromise the accuracy of feature correspondences. To address these challenges, we introduce ForestGlue. ForestGlue enhances the SuperPoint feature detector through four configurations (grayscale, RGB, RGB-D, and stereo-vision inputs) optimised for various sensing modalities. For feature matching, we employ LightGlue or SuperGlue, both of which have been retrained using synthetic forest data. ForestGlue achieves pose estimation accuracy comparable to baseline LightGlue and SuperGlue models, yet requires only 512 keypoints, just 25% of the 2048 keypoints used by baseline models, to achieve an LO-RANSAC AUC score of 0.745 at a 10° threshold. With only a quarter of the keypoints required, ForestGlue has the potential to reduce computational overhead whilst being effective in dynamic forest environments, making it a promising candidate for real-time deployment on resource-constrained platforms such as drones or mobile robotic platforms. By combining ForestGlue with a novel transformer-based pose estimation model, we propose ForestVO, which estimates relative camera poses using the 2D pixel coordinates of matched features between frames. On challenging TartanAir forest sequences, ForestVO achieves an average relative pose error (RPE) of 1.09 m and a kitti_score of 2.33%, outperforming direct methods such as DSO in dynamic scenes by 40%, while maintaining competitive performance with TartanVO despite being a significantly lighter model trained on only 10% of the dataset. This work establishes an end-to-end deep learning pipeline tailored for visual odometry in forested environments, leveraging forest-specific training data to optimise feature correspondence and pose estimation for improved accuracy and robustness in autonomous navigation systems.
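For readers unfamiliar with the RPE metric quoted in the abstract above, the sketch below shows one common way to compute a mean translational relative pose error between a ground-truth and an estimated trajectory. It assumes the widely used frame-to-frame definition with a step of one frame; the paper's exact evaluation protocol may differ, and the pose representation (a rotation matrix plus a translation vector) and all function names are hypothetical choices for this example.

```python
import math


def mat_vec(R, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]


def mat_mul(A, B):
    """Multiply two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def inverse(pose):
    """Invert a rigid transform (R, t): the inverse is (R^T, -R^T t)."""
    R, t = pose
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    return Rt, [-x for x in mat_vec(Rt, t)]


def compose(a, b):
    """Return the rigid transform a * b (apply b first, then a)."""
    Ra, ta = a
    Rb, tb = b
    rotated = mat_vec(Ra, tb)
    return mat_mul(Ra, Rb), [rotated[i] + ta[i] for i in range(3)]


def mean_translational_rpe(gt, est):
    """Mean norm of the translation part of the per-step relative error."""
    errors = []
    for i in range(len(gt) - 1):
        gt_step = compose(inverse(gt[i]), gt[i + 1])     # true motion
        est_step = compose(inverse(est[i]), est[i + 1])  # estimated motion
        _, t_err = compose(inverse(gt_step), est_step)   # residual motion
        errors.append(math.sqrt(sum(x * x for x in t_err)))
    return sum(errors) / len(errors)


# Toy trajectories: ground truth moves 1.0 m per frame along x,
# the estimate moves 1.1 m per frame (no rotation in either).
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
gt = [(IDENTITY, [float(i), 0.0, 0.0]) for i in range(4)]
est = [(IDENTITY, [1.1 * i, 0.0, 0.0]) for i in range(4)]
rpe = mean_translational_rpe(gt, est)
```

Because the metric compares motion between consecutive frames rather than absolute positions, it measures local drift and is unaffected by a constant offset between the two trajectories.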
Abstract: As a form of minimally invasive surgery (MIS), transoral robotic surgery (TORS) has garnered sustained interest for the treatment of pathological tissue, such as oropharyngeal tumors. Flexible manipulators employed in transoral surgeries require variable-stiffness capabilities to perform diverse tasks. Specifically, the manipulator must flexibly navigate through natural orifices to reach the target site and subsequently increase its stiffness to provide a stable platform for surgical instrument manipulation. However, most existing flexible surgical manipulators have a relatively small stiffness variation ratio and a long transition time between flexible and rigid states. In this letter, we propose a novel cable-driven variable-stiffness flexible manipulator with a teeth-engagement structure to address these challenges. The proposed manipulator has an 18-mm diameter and provides four channels for forceps, an electric knife, water/gas, and a CMOS camera. Experimental results on the manipulator's bending range and bending characteristics indicate that the manipulator can meet the requirements of transoral surgery. The variable-stiffness experiments show that the manipulator can achieve a stiffness variation ratio of up to 84.07-fold. Laryngeal phantom experiments and ex vivo tissue experiments were performed to further demonstrate the feasibility of the proposed manipulator. We believe this study could provide new ideas for the development of flexible manipulators requiring high load capability.