PartGrasp

Universal grasping capability is vital for robotic applications in open environments. Traditional end-to-end methods generalize poorly because of insufficient training data. Recent advances in foundation models have enabled grasping capability to be transferred from a few demonstrations to unseen objects. However, foundation models are weak at 3D structural analysis, which restricts them to transferring only coarse information such as grasping points and impedes precise grasp transfer based on 3D structural correspondence between unseen and demonstrated objects. To overcome this limitation, we propose PartGrasp, a novel one-shot grasping method that complements the deficient 3D cognition of foundation models with fine-grained correspondence between the inherent geometric structures of object parts. Specifically, we first construct object-level descriptor fields by grounding features from foundation models in a 3D scene representation. We then establish part-level correspondence for grasp transfer through a coarse-to-fine, semantic-to-geometric pipeline. Extensive experiments demonstrate the superior generalization and precision of our method.

PartGrasp: Generalizable Part-level Grasping via Semantic-Geometric Alignment

Demonstrations of intra-category experiments (left: reference object, right: target object).

Demonstrations of inter-category experiments (left: reference object, right: target object).

Abstract

The Pipeline of PartGrasp
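
To make the coarse-to-fine, semantic-to-geometric idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes each object is given as a point cloud with one descriptor per point (standing in for a descriptor field), uses cosine similarity for the coarse semantic step, and uses a Kabsch rigid fit for the fine geometric step. All function names, the `sim_threshold` parameter, and the choice of cosine similarity and Kabsch are illustrative assumptions; the actual method may use different matching and alignment procedures.

```python
import numpy as np


def normalize(x, eps=1e-8):
    """Scale vectors along the last axis to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def coarse_semantic_match(ref_feats, tgt_feats, ref_part_idx, sim_threshold=0.8):
    """Coarse step (assumed): pick target points whose descriptor is close,
    by cosine similarity, to the mean descriptor of the reference part."""
    part_desc = normalize(normalize(ref_feats[ref_part_idx]).mean(axis=0))
    sim = normalize(tgt_feats) @ part_desc
    return np.flatnonzero(sim >= sim_threshold)


def kabsch(src, dst):
    """Least-squares rigid transform (R, t) such that R @ src_i + t ~= dst_i."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def transfer_grasp(ref_pts, ref_feats, ref_part_idx, ref_grasp,
                   tgt_pts, tgt_feats, sim_threshold=0.8):
    """Transfer a 4x4 grasp pose from a reference part to the matching target part."""
    # Coarse, semantic: find the corresponding part on the target object.
    tgt_part_idx = coarse_semantic_match(ref_feats, tgt_feats, ref_part_idx, sim_threshold)
    if tgt_part_idx.size < 3:
        raise ValueError("too few target points matched the reference part")

    # Fine, geometric: pair each reference part point with its most similar
    # target part point, then fit a rigid transform to the paired coordinates.
    sim = normalize(ref_feats[ref_part_idx]) @ normalize(tgt_feats[tgt_part_idx]).T
    matched = tgt_part_idx[sim.argmax(axis=1)]
    R, t = kabsch(ref_pts[ref_part_idx], tgt_pts[matched])

    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T @ ref_grasp, tgt_part_idx
```

In this sketch the semantic step only narrows the search to a part, while the grasp pose itself is moved by the geometric fit, which mirrors the point that coarse cues such as grasping points alone do not determine a precise grasp. A rigid fit is the simplest possible choice here; objects whose parts differ in shape or scale would need a more flexible alignment.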