School of Computer Science and Software Engineering

Postgraduate profiles


Umar Asif

Phone: (+61 8) 6488 3455
Fax: (+61 8) 6488 1089

Start date

Aug 2012

Submission date

Feb 2016

Curriculum vitae

Umar Asif CV
[text/rtf, 538.33 kb]
Updated 07 Jan 2014



RGB-D Vision for Robot Manipulation in Highly Complex Human-Living Environments


Robots that work in cooperation with humans in their homes and offices could extend the time an elderly person can live at home, provide physical assistance to a worker, or assist with household chores. Human environments present special challenges for robot manipulation because they are complex, dynamic, and difficult to perceive reliably. The exchange of everyday objects between humans and robots is a fundamental cooperative manipulation task. A fully autonomous perception system can assist elderly people in daily activities such as clearing a table, handing objects to or taking them from a person, and even preparing food. Furthermore, cooperative manipulation tasks, such as a robot holding a box into which the user can place objects, can enable human-robot interactions where the human and robot complement one another’s abilities and work together to achieve results. In this work, we present advancements in robot manipulation that address the unique challenges of human living environments. Specifically, we describe the design of a robotic perception framework that allows a mobile robot to assist a person in everyday tasks, and discuss general techniques for building robotic perception systems that enable robots to work alongside humans in their homes and offices.

The prime focus of our investigation throughout this work is what makes manipulation in human living environments the greatest challenge for service robots. The variety of objects and their changing locations (independent of any robot action) make a human living environment variable, dynamic, and difficult to model. Moreover, every home and workplace is different and perceptually noisy: the appearance of objects changes with lighting variations, clutter and occlusion occur in unpredictable ways, and objects that are worn or dirty make prior descriptions of their appearance inaccurate. Above all, the presence of humans raises a further challenge: robots must be responsive, operate in real time, and perform their physical interactions safely. Thus, for manipulation in human environments, a fully autonomous robotic perception framework should be able to transition smoothly between different tasks, be robust to real-world noise, and recover from task failures.

The availability of low-cost range sensors has equipped robots with the data necessary for multi-category recognition tasks, yet few systems offer reliable, robust, real-time tracking of both textured and non-textured objects in cluttered real-world environments. The second big challenge is to achieve low system latency without compromising the accuracy and robustness of the perception system. Many current approaches are impractical because of costly computation and inaccurate pose tracking in complex real-world environments. Our goal is to enable robots to accomplish grasping tasks on everyday objects. Because of the characteristics of human environments, we avoid methods that rely on precise knowledge of the state of the world or have high computational cost. Instead, we are guided by the idea of a reasonable trade-off between accuracy and system latency, and by the use of low-cost imaging sensors.
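To illustrate the kind of low-cost RGB-D processing this style of perception builds on, the sketch below back-projects a depth image into a 3-D point cloud using a standard pinhole camera model. This is a minimal, illustrative example, not the framework's actual pipeline; the intrinsic parameters (fx, fy, cx, cy) are placeholder values of the sort reported for consumer RGB-D sensors.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (in metres) into an organised XYZ point
    cloud via the pinhole model; zero-depth pixels become NaN points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64)       # copy, so the input is untouched
    z[z == 0] = np.nan                 # mark invalid sensor returns
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack((x, y, z), axis=-1)  # shape (h, w, 3)

# Example: a synthetic 4x4 depth image, every pixel at 1 m, with the
# principal point placed at pixel (2, 2) for this toy image size.
depth = np.ones((4, 4))
cloud = depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=2.0, cy=2.0)
```

The pixel at the principal point maps to a point directly on the optical axis, one metre out; real pipelines would follow this with filtering and downsampling to keep latency low.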

This work also advances our understanding of how to build robots that can perform cooperative manipulation tasks. Our work is implemented and tested on real hardware, and our experimentation encompasses performance evaluation on both offline datasets and online real-world test scenarios. Specifically, we demonstrate the capabilities of our system in scenarios such as: 1) picking and placing objects on a table, 2) taking objects from a person, and 3) passing objects to a person. We address the challenges of object manipulation (low system latency with a high detection rate, recognition of transparent objects, manipulation of unseen objects, and safe human-robot interaction in cooperative manipulation tasks) through a single integrated framework that can scale to a variety of household objects in different real-world human environments.

Why my research is important

With recent advancements in navigation and mobile manipulation for indoor service robots, robot perception remains one of the most challenging problems, and a reliable and robust solution is still required to cope with the difficulties of autonomous object manipulation in home or office environments. In view of these challenges, we aim to establish a practical robot perceptual system for rigid-object manipulation, covering detection, recognition, and grasping in highly complex environments. Our robot perceptual system would benefit home robotics in the following ways:

- Assistive robots aim to provide physical or instructional assistance to people in need, such as the elderly, hospital patients, and people with physical disabilities. Manipulation is generally thought of as an isolated activity of the robot alone. However, a robotic perceptual system capable of cooperative manipulation can leverage the advanced perceptual and planning skills of a collaborative partner to produce better results.

- Conventional robot manipulators are stiff and dangerous. Our design configuration of lightweight arms provides intrinsic safety.

- Our object tracking approach emphasizes the importance of a robot constantly sensing its environment rather than referring to an internal model. Consequently, our robot perceptual system is more robust and generalizes across a variety of everyday household objects.

- Using full-scale imagery provides a larger number of visual (RGB-D) cues, which results in higher detection rates for general object recognition tasks.

- Our clustering method is unique in that it fuses visual cues with surface semantic information to segment scenes containing transparent objects. This allows great flexibility for object manipulation in human environments, since a large class of household items are either non-textured or transparent.

- Our performance analysis is conducted in unmodified real-world human environments characterized by large variability in lighting, cluttered backgrounds, and interactions with people. Consequently, our visual attention system is suited to real-world situations.

- Ultimately, our objective (to do work that is useful to people) is met through the real-time implementation of our methods on a real-world robot performing domestic chores such as table cleanup.

- Our methods would also benefit industrial manipulators, which lack haptics. Since it is not possible to physically guide an industrial robot in a free-hanging gravity mode, our proposed framework would help improve the precision and efficiency of reach-to-grasp operations in industrial manipulators.
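A much-simplified picture of the point-cloud segmentation underlying approaches like the one described above is Euclidean clustering: grouping 3-D points by spatial proximity so that distinct objects on a table fall into separate clusters. The actual framework additionally fuses visual cues and surface semantics; the brute-force region growing below, with illustrative `radius` and `min_size` values, is only a conceptual sketch.

```python
import numpy as np
from collections import deque

def euclidean_cluster(points, radius=0.05, min_size=3):
    """Group 3-D points by region growing: two points join the same cluster
    when they lie within `radius` metres of each other (brute force)."""
    n = len(points)
    labels = -np.ones(n, dtype=int)    # -1 means unassigned / noise
    cluster_id = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        queue = deque([seed])
        labels[seed] = cluster_id
        members = [seed]
        while queue:
            i = queue.popleft()
            dists = np.linalg.norm(points - points[i], axis=1)
            for j in np.where((dists < radius) & (labels == -1))[0]:
                labels[j] = cluster_id
                members.append(j)
                queue.append(j)
        if len(members) < min_size:
            labels[members] = -1       # discard tiny clusters as noise
        else:
            cluster_id += 1
    return labels

# Two well-separated blobs of synthetic table-top points (metres)
pts = np.array([[0.0, 0.0, 1.0], [0.01, 0.0, 1.0], [0.0, 0.01, 1.0],
                [1.0, 0.0, 1.0], [1.01, 0.0, 1.0], [1.0, 0.01, 1.0]])
labels = euclidean_cluster(pts, radius=0.05)
```

On real scenes, a spatial index (k-d tree) replaces the brute-force distance scan to keep latency manageable, and per-cluster cues (colour, texture, surface normals) feed the recognition stage.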


  • This work is supported by Australian Research Council grants DP110102166 and DE120102960.




Last updated:
Wednesday, 13 February, 2013 8:19 AM