07/12/2016 | Posted in:News

Team Delft won the Amazon Picking Challenge (APC) almost two weeks ago. Now that everybody has had a bit of time to recover, we interviewed Dr. Carlos Hernandez Corbato, TU Delft Robotics Institute, one of the team leaders of Team Delft, about the technical details.

Congratulations on the great performance. What was the most difficult challenge in the APC from a technical point of view?

Joey Durham/Amazon Robotics (right) hands over the award of the Amazon Picking Challenge to Dr. Hernandez Corbato (middle) and Mr. van Deurzen from Team Delft. ©Amazon

Dr. Hernandez Corbato: Dealing with objects with reflective surfaces, e.g. transparent plastic wrapping, was probably the hardest challenge for our vision system. Our main solution for grasp planning required estimating the object's position and orientation, and for that we relied on point-cloud (depth) data from our 3D camera, which we matched against a 3D model of the object. But due to reflections there was simply not enough point-cloud data to determine the pose of some objects in certain situations.

How did you solve it?

We included application-specific heuristics on top of our pose-estimation algorithm (Super4PCS) to correct it: for example, we snapped the estimated height of the object so that it was lying in its bin, since objects could not be piled or ‘floating’. This did not solve the problem completely, but our approach was to have a fast and robust system: upon difficulties it puts the difficult object on a ‘to do’ list that is addressed once the rest of the task is finished. After a few more attempts from different viewpoints, to get a better 3D image, the robot managed to pick these difficult objects as well.
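The two ideas described here, snapping an implausible pose estimate to the bin floor and deferring difficult objects to a retry list, can be sketched roughly as follows. This is a minimal illustration, not Team Delft's actual code; all names, the bin-frame convention and the floor height are assumptions.

```python
# Sketch of the height-snapping heuristic and the 'to do' retry list
# (illustrative only; names and frames are assumptions).

BIN_FLOOR_Z = 0.0  # assumed bin-floor height in the bin frame, in metres


def snap_pose_to_bin_floor(pose, object_height):
    """Clamp an estimated pose so the object rests on the bin floor.

    Objects cannot be piled or float, so the z coordinate of the object
    centre is snapped to half the object's height above the floor.
    """
    snapped = dict(pose)
    snapped["z"] = BIN_FLOOR_Z + object_height / 2.0
    return snapped


def pick_all(targets, try_pick):
    """Try every target once; retry failures after the rest are done."""
    todo = []
    for item in targets:
        if not try_pick(item):
            todo.append(item)   # park the difficult object for later
    for item in todo:
        try_pick(item)          # retry, e.g. from a new viewpoint
```

The retry loop keeps the overall cycle fast: the robot never stalls on a hard object while easy picks are still available.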

Which part that you thought to be easy in the beginning proved to be more difficult?

The robot of Team Delft.

It happened the other way around. We thought there was no easy part in the challenge, and were especially concerned about deformable objects, e.g. t-shirts and gloves. They could be hard to recognise, and moreover you do not have a static 3D model of these objects for pose estimation. However, our robust approach to object recognition using deep learning performed excellently even on those objects, and the powerful suction in our gripper paved the way for grasping. We discovered that the robot could easily pick deformable products without a complete 3D pose estimate, by simply aiming the suction cup at the centre of the region where the object is detected.
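Aiming at the centre of a detected region without a pose estimate amounts to taking the centroid of the 3D points that fall inside the detection. A minimal sketch of that idea, assuming the detected region has already been cut out of the point cloud (function name and interface are illustrative):

```python
# Minimal sketch of "aim the suction cup at the centre of the detected
# region" for deformable objects (illustrative, not Team Delft's code).
import numpy as np


def suction_target(points):
    """Return a suction target for a deformable object.

    `points` is an (N, 3) array of 3D points belonging to the detected
    region. Without a rigid 3D model there is no pose to estimate, so we
    simply aim at the centroid of the observed surface.
    """
    pts = np.asarray(points, dtype=float)
    return pts.mean(axis=0)
```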

Please give us a short description of the outstanding features of your robot.

Following the Factory-in-a-day approach, we analysed the picking application and chose the best technologies and components to build a robust and fast solution, as industry would require. We aimed for a really fast and robust system, capable of detecting failed picks and retrying them. Instead of attempting slow, complex manoeuvres to reach occluded products, our robot simply moves blocking objects to other bins, keeping track of the location of all products. We used ROS-Industrial to quickly test and integrate the software components we needed, and to develop those not available (e.g. a generic grasp planner for a suction-based gripper).

  • The robot itself: After a preliminary analysis, we concluded that maximum reachability in the workspace was a critical requirement for the competition. We chose a configuration with a SIA20F Motoman robot arm by Yaskawa mounted on a rail. The total of 8 degrees of freedom allowed the system to reach all the bins with enough manoeuvrability to pick the target objects. Thanks to the ROS-Industrial driver supported by Motoman, we could integrate our complex robot configuration into our system, and we have contributed improvements to the driver's features.
  • Perception: We decided to avoid noise and calibration problems by using industrial stereo cameras with RGB overlay cameras: one for detection of objects in the tote and one on the robot gripper to scan the bins. Detection of the target object went in two steps. For object recognition we used a deep-learning neural network based on Faster R-CNN. Next we used an implementation of Super4PCS to estimate the pose of the object, refining it with ICP (Iterative Closest Point).
  • Grasping: Following the Factory-in-a-day approach of quick 3D prototyping, we developed a hybrid suction+pinch gripper customised to handle all 39 products in the competition. We designed an algorithm that automatically generates candidate grasp locations on the objects based on their estimated pose from the vision system, their geometry and application constraints. Object-specific heuristics were also included after intensive testing.
  • Motion: For motion planning and control we developed customised ROS services on top of MoveIt!. To optimise motions we created a ‘motion database’ with static collision-free trajectories between relevant locations, e.g. image-capture and approach locations in front of the shelf’s bins and over the tote. For the approach and retreat motions to pick objects we used dynamic Cartesian planning, using the depth information from the 3D camera for collision avoidance.
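The motion-database idea above, precomputed collision-free trajectories between known locations with an online planner as a fallback, can be sketched as a simple cache. This is a hypothetical illustration; the class, method names and trajectory representation are assumptions, not the team's API.

```python
# Hypothetical sketch of a 'motion database': static trajectories
# between named locations, falling back to online planning when no
# cached trajectory exists (illustrative only).

class MotionDatabase:
    def __init__(self, plan_online):
        self._db = {}                  # (start, goal) -> cached trajectory
        self._plan_online = plan_online  # e.g. a call into a MoveIt! planner

    def add(self, start, goal, trajectory):
        """Store a precomputed collision-free trajectory."""
        self._db[(start, goal)] = trajectory

    def trajectory(self, start, goal):
        """Return a cached static trajectory, or plan one on the fly."""
        key = (start, goal)
        if key in self._db:
            return self._db[key]
        return self._plan_online(start, goal)
```

Cached trajectories between fixed locations (bin approach poses, the tote, the camera pose) skip planning entirely at run time, which is where the speed gain comes from.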

Factory-in-a-day aims at quick installation times: did you achieve this, and could you repeat the Picking Challenge with another robot as successfully?

The development of the APC project took four months, from March to June, but we got the robot only five weeks before the competition. Thanks to ROS we could prepare the required components and test them in advance using RViz and the ROS-Industrial simulator. But time was so short that we could only focus on testing the Picking Challenge with the real robot, not the new Stowing Challenge. I think this is a good demonstration of the Factory-in-a-day approach: we arrived at the competition, installed the system and calibrated it in less than a day. Then we integrated and tested the stowing over a day and a half. The next day we won the Stowing Challenge with an almost perfect score.

Our solution for perception, grasp planning and task planning and coordination could be easily re-used. Only our robot motion and control subsystem would need to be adapted.

How long did it take to teach the robot to grasp the 40 objects? And is there some kind of “learning algorithm” that would make this faster in the future?

We used deep-learning techniques for recognition of the products. Given the time constraints and our custom hybrid gripper with suction and pinch, we chose a customised solution that automatically generates the grasp strategy based on geometry information, item-specific heuristics and application constraints. This took ten person-weeks to develop. We are now considering using deep learning also to teach the system how to grasp objects.
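A strategy generator of this kind can be pictured as a small decision rule over geometric features, with item-specific overrides taking precedence. The sketch below is purely illustrative: the thresholds, feature names and override table are assumptions, not Team Delft's actual heuristics.

```python
# Illustrative sketch of geometry-driven grasp-strategy selection for a
# hybrid suction+pinch gripper (all rules and thresholds are assumptions).

def choose_grasp(item):
    """Pick 'suction' or 'pinch' from simple geometric features.

    `item` carries a flat graspable surface area (m^2) and a minimum
    dimension (m); item-specific overrides are checked first.
    """
    overrides = {"dumbbell": "pinch"}   # hypothetical heuristic table
    if item["name"] in overrides:
        return overrides[item["name"]]
    if item["flat_area"] >= 0.0005:     # enough flat surface for the cup
        return "suction"
    if item["min_dim"] <= 0.05:         # thin enough for the pincher
        return "pinch"
    return "suction"                    # default to the stronger mode
```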

I imagine the last months were quite stressful, what was the first thing you did after returning to Delft?

I went to have a nice dinner at a new restaurant in Delft’s cosy city centre! Now we are working on consolidating the knowledge acquired during the project, and on preparing the technology we developed for release to the community.

Thank you!


Here are links to the real-time videos of Team Delft’s performance:

Picking: https://youtu.be/3KlzVWxomqs

Stowing: https://youtu.be/AHUuDVdiMfg

There are also a number of articles in our media section.