Humanoid robotics is widely recognized as a current challenge for robotics research. Humanoid research is an approach to understanding and realizing the complex real-world interactions between a robot, an environment, and a human. Humanoid robotics motivates social interactions, such as gesture communication or co-operative tasks, in the same context as the physical dynamics. This is essential for three-term interaction, which aims at fusing physical and social interaction at fundamental levels.

People naturally express themselves through facial gestures and expressions. Our goal is to build a facial gesture human-computer interface for use in robot applications. This system does not require special illumination or facial make-up. By using multiple Kalman filters we accurately predict and robustly track facial features. Since we reliably track the face in real time, we are also able to recognize motion gestures of the face. Our system can recognize a large set of gestures (13), ranging from “yes”, “no” and “maybe” to detecting winks, blinks and sleeping.
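
The predict-and-track idea can be illustrated with a minimal sketch of one Kalman filter per facial feature. The source does not give the filter design, so everything here is an assumption for illustration: the constant-velocity state [x, y, vx, vy], the per-frame time step, the noise values, and the hypothetical class name ConstantVelocityKalman. The predicted position would be used to place the search window for the next frame, and the measured feature position would then correct the estimate.

```python
import numpy as np


class ConstantVelocityKalman:
    """Hypothetical tracker for one feature; state is [x, y, vx, vy].

    Units are pixels and pixels per frame (dt = 1 frame). The model and all
    noise values are illustrative assumptions, not taken from the source.
    """

    def __init__(self, x, y, dt=1.0, process_var=1.0, meas_var=4.0):
        self.state = np.array([x, y, 0.0, 0.0], dtype=float)
        self.P = np.eye(4) * 100.0           # large initial uncertainty
        self.F = np.eye(4)                   # constant-velocity transition
        self.F[0, 2] = dt
        self.F[1, 3] = dt
        self.H = np.zeros((2, 4))            # only position is measured
        self.H[0, 0] = 1.0
        self.H[1, 1] = 1.0
        self.Q = np.eye(4) * process_var     # simplified process noise
        self.R = np.eye(2) * meas_var        # measurement noise

    def predict(self):
        """Advance the state one frame and return the predicted (x, y)."""
        self.state = self.F @ self.state
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.state[:2].copy()

    def update(self, measured_xy):
        """Correct the prediction with a measured (x, y) position."""
        z = np.asarray(measured_xy, dtype=float)
        innovation = z - self.H @ self.state
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.state = self.state + K @ innovation
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.state[:2].copy()


# Usage: a synthetic feature moving 2 px/frame right and 1 px/frame up.
kf = ConstantVelocityKalman(x=100.0, y=50.0)
for frame in range(1, 61):
    predicted = kf.predict()                 # where to centre the search
    measured = (100.0 + 2.0 * frame, 50.0 - 1.0 * frame)
    kf.update(measured)
```

In a multi-feature tracker, one such filter could be kept per feature (for example, each eye corner or mouth corner), which is one plausible reading of "multiple Kalman filters".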

Integration of vision and touch in edge tracking

To validate the anthropomorphic model of sensory-motor co-ordination in grasping, a module was implemented to perform visual and tactile edge tracking, which is considered the first step of sensory-motor co-ordination in grasping actions.
The proposed methodology applies the reinforcement-learning paradigm to backpropagation neural networks in order to replicate the human capability of creating associations between sensory data and motor schemes, based on the results of attempts to perform movements. The resulting robot behavior consists of co-ordinating the movement of the fingertip along an object edge by integrating visual information on the edge, proprioceptive information on the arm configuration, and tactile information on the contact, and by processing this information in a neural framework based on the reinforcement-learning paradigm. The goal of edge tracking is pursued by a strategy that starts from a totally random policy and evolves via rewards and punishments.

The Vision System:

We use the MEP tracking system to implement the facial gesture interface. This vision system is manufactured by Fujitsu and is designed to track multiple templates in real time in the frames of an NTSC video stream. It consists of two VME-bus cards, a video module and a tracking module, which can track up to 100 templates simultaneously at video frame rate (30 Hz for NTSC).
Objects are tracked by comparing templates (8x8 or 16x16 pixels) within a specified search area. The video module digitizes the video input stream and stores the digital images in dedicated video RAM, which the tracking module also accesses. The tracking module compares the digitized frame with the tracking templates within the bounds of the search windows.
This comparison uses a correlation measure that sums the absolute differences between corresponding pixels of the template and the frame. The result of this calculation is called the distortion, and it indicates how closely the two compared images match. Low distortions indicate a good match, while high distortions result when the two images are quite different.
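
The distortion measure described above can be sketched in software as a sum of absolute differences over a search window. This is only an illustration of the calculation, not the hardware's implementation: the function names sad_distortion and best_match, the grayscale uint8 image format, and the convention that the window bounds the top-left corner of candidate patches are all assumptions.

```python
import numpy as np


def sad_distortion(template, patch):
    """Sum of absolute pixel differences between two equally sized images."""
    # Cast to a signed type so that uint8 subtraction cannot wrap around.
    diff = template.astype(np.int32) - patch.astype(np.int32)
    return int(np.abs(diff).sum())


def best_match(frame, template, window):
    """Return (distortion, y, x) of the lowest-distortion patch position.

    window = (top, left, bottom, right) bounds the top-left corner of the
    candidate patches (top/left inclusive, bottom/right exclusive). This
    convention is an assumption made for the sketch.
    """
    th, tw = template.shape
    top, left, bottom, right = window
    y_end = min(bottom, frame.shape[0] - th + 1)
    x_end = min(right, frame.shape[1] - tw + 1)
    best = None
    for y in range(top, y_end):
        for x in range(left, x_end):
            d = sad_distortion(template, frame[y:y + th, x:x + tw])
            if best is None or d < best[0]:
                best = (d, y, x)
    return best


# Usage: cut an 8x8 template out of a synthetic frame and find it again.
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
template = frame[20:28, 30:38].copy()
distortion, y, x = best_match(frame, template, window=(10, 20, 40, 50))
print(distortion, y, x)
```

A distortion of zero means the template and the patch are identical; in real video the minimum is rarely zero, so a tracker would typically compare the lowest distortion against a threshold before accepting the match.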