PoseIt: A Visual-Tactile Dataset of Holding Poses for Grasp Stability Analysis
Shubham Kanitkar*, Helen Jiang*, Wenzhen Yuan
Carnegie Mellon University
When humans grasp objects in the real world, we often move our arm to hold the object in a different pose where we can use it. In contrast, typical lab settings study grasp stability only immediately after lifting, without any subsequent re-positioning of the arm. However, grasp stability can vary widely with the object’s holding pose, because the gravitational torque and the gripper contact forces can change completely. To facilitate the study of how holding poses affect grasp stability, we present PoseIt, a novel multi-modal dataset that contains visual and tactile data collected from a full cycle of grasping an object, re-positioning the arm to one of the sampled poses, and shaking the object. Using data from PoseIt, we formulate and tackle the task of predicting whether a grasped object is stable in a particular held pose. We train an LSTM classifier that achieves 85% accuracy on the proposed task. Our experimental results show that multi-modal models trained on PoseIt achieve higher accuracy than models using vision or tactile data alone, and that our classifiers also generalize to unseen objects and poses.
Holding Pose Sample Space: A set of 16 distinct poses that mimic typical human holding poses.
Group 1 poses take advantage of gravity to stabilize potential rotational movements of objects with varying mass distribution.
Group 2 poses gradually rotate the gripper to counterbalance the gravitational force to avoid slippage for objects with non-uniformly distributed mass.
Group 3 poses aid in analyzing the stability of overhead (above the shoulder joint) arm movements by incrementally increasing the object’s distance from the ground.
Robot Setup and Data Modalities – We use this setup to collect PoseIt, which consists of multi-modal data on 26 objects recorded from RGB-D cameras, a GelSight tactile sensor, a force/torque sensor, and the robot trajectory.
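To make the modalities concrete, here is a minimal sketch of how one data point could be represented in code. It is not the dataset's actual file layout: every class and field name below is hypothetical, and the 6-value force/torque reading and the boolean stability label are assumptions based on the description above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Phase(Enum):
    """Stages of one collection cycle: grasp, re-position, shake."""
    GRASP = "grasp"
    REPOSITION = "reposition"  # arm moves to the sampled holding pose
    SHAKE = "shake"


@dataclass
class Frame:
    """One time step of readings (all field names are hypothetical)."""
    phase: Phase
    rgbd_path: str                # path to an RGB-D camera frame
    gelsight_path: str            # path to a GelSight tactile image
    force_torque: List[float]     # assumed 6 values: Fx, Fy, Fz, Tx, Ty, Tz
    joint_positions: List[float]  # one sample of the robot trajectory


@dataclass
class DataPoint:
    """One grasp-reposition-shake cycle for a single object and pose."""
    object_name: str
    pose_id: int   # index into the 16-pose sample space, 0..15
    stable: bool   # assumed label: did the object stay stable when held?
    frames: List[Frame] = field(default_factory=list)

    def frames_in(self, phase: Phase) -> List[Frame]:
        """Return only the frames recorded during the given phase."""
        return [f for f in self.frames if f.phase is phase]
```

Grouping frames by phase keeps the three stages of the cycle separable, so a model can be fed, for example, only the readings recorded while the object is being shaken.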
In total, PoseIt consists of 3680 data points from 26 diverse objects across 16 different poses. We collect 1840 data points with the GelSight sensor and 1840 with the WSG-DSA pressure array sensor.
Note – WSG-DSA pressure array sensor data is not included in the published work.
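Testing generalization to unseen objects requires that no object in the test set also appears in training. The sketch below shows one way to build such a split. It is an illustration only, not the split protocol used in the paper: the function name, the `"object"` key, and the 20% held-out fraction are all assumptions.

```python
import random


def split_by_object(samples, held_out_fraction=0.2, seed=0):
    """Split samples so that test objects never appear in training.

    `samples` is a list of dicts with an "object" key (hypothetical schema).
    Returns (train, test, test_objects).
    """
    # Sort first so the shuffle is reproducible for a given seed.
    objects = sorted({s["object"] for s in samples})
    rng = random.Random(seed)
    rng.shuffle(objects)

    # Hold out at least one object.
    n_test = max(1, round(len(objects) * held_out_fraction))
    test_objects = set(objects[:n_test])

    train = [s for s in samples if s["object"] not in test_objects]
    test = [s for s in samples if s["object"] in test_objects]
    return train, test, test_objects
```

Splitting by object rather than by individual data point prevents the same object from leaking into both sets. The same idea applies to unseen poses by grouping on a pose identifier instead.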