
Model-based Integration Of Visual Cues For Hand Tracking

    Shan Lu, CS Department, Rutgers University, Piscataway, NJ 08854
    Gang Huang, CIS Department, University of Pennsylvania, Philadelphia, PA 19104
    Dimitris Samaras, CS Department, SUNY Stony Brook, Stony Brook, NY 11794
    Dimitris Metaxas, CS Department, Rutgers University, Piscataway, NJ 08854

    Abstract
    We present a model-based approach to the integration of multiple cues for tracking high-degree-of-freedom articulated motions. We then apply it to the problem of hand tracking using a single camera sequence. Hand tracking is particularly challenging because of occlusions, shading variations, and the high dimensionality of the motion. The novelty of our approach is in the combination of multiple sources of information, which come from edges, optical flow, and shading information. In particular, we introduce in deformable model theory a generalized version of the gradient-based optical flow constraint that includes shading flow, i.e., the variation of the shading of the object as it rotates with respect to the light source. This constraint unifies the shading and the optical flow constraints; it simplifies to each one of them when the other is not present. Our use of cue information from the entirety of the hand enables us to track its complex articulated motion in the presence of shading changes. Given the model-based formulation, we use shading when the optical flow constraint is violated due to significant shading changes in a region. We use a forward recursive dynamic model to track the motion in response to 3D data-derived forces applied to the model. The hand is modeled as a base link (palm) with five linked chains (fingers), while the allowable motion of the fingers is controlled by recursive dynamics constraints. Model driving forces are generated from edges, optical flow, and shading. The effectiveness of our approach is demonstrated with experiments on a number of different hand motions with shading changes, rotations, and occlusions of significant parts of the hand.

    1 Introduction
    In this paper we present a model-based approach to high-degree-of-freedom articulated motion tracking based on the integration of visual cues, and apply it to the problem of hand tracking using a single camera sequence. Hand tracking has received significant attention in the last few years because of its crucial role in the design of new human-computer interaction methods, gesture analysis, and sign language understanding. Glove-based devices capture human hand motion directly but are expensive and hard to use. Vision-based hand tracking is a cost-effective, non-invasive alternative. Serious challenges lie in the high number of degrees of freedom and the problem of occlusions.

    Two general approaches have been suggested for this problem. Model-based approaches try to estimate the position of a hand by projecting a 3-D hand model to image space and comparing it with image features: fingertips [26, 25, 27] and line segments [26]. A spline-based hand shape model was used in [24] to minimize differences between the silhouettes. Others [30, 26] have used stereo to avoid occlusions. Appearance-based approaches estimate hand postures directly from the images after learning the mapping from image feature space to hand configuration space [29, 28]. Such systems are more useful for recognizing discrete hand states than for general-purpose hand tracking. The study of motion and shading together has recently been formalized [21, 23] and extended to multiple views [22]. Our approach is model-based and hence can work with a single view.

    Our first contribution is in the combination of cue forces from edges, optical flow, and shading. In particular, we introduce in deformable model theory a generalized version of the gradient-based optical flow constraint that includes shading flow, i.e., the variation of the shading of the object as it rotates with respect to the light source. This constraint unifies the shading and the optical flow constraints and degenerates to each one of them when the other is not present. Although optical flow and edges in deformable models have been used in the past [20], as well as shading [19], these two methods were applied to different problem domains (moving and static objects, respectively). In this paper we combine them to correct for the errors due to the brightness constancy assumption. We use cue information from the entirety of the hand, and we are able to track its complex articulated motion in the presence of shading changes. Given the model-based formulation, we augment the optical flow constraint with shading information.

    The hand can have as many as 26 degrees of freedom when we model it as a multiple open-chain structure. The dynamic and kinematic problem of such a large system, which contains not only open chains but also closed chains, can be modeled as a sub-problem of robotic mechanisms. There are many forward and inverse dynamics simulation techniques for human and robotic motion [14, 16, 18, 10, 17, 13, 15]. The second contribution of our approach is the use of a forward recursive dynamic model to track motion in response to 3D data-derived forces applied to the model. The hand is modeled as a base link (palm) with five linked chains (fingers). Using such a formulation, we limit the allowable motion of the fingers with the use of recursive dynamics constraints.

    The model's driving forces are computed from image cues such as edges, optical flow, and shading. In our formulation we compute 2D data-based forces from the edge, optical flow, and shading cue constraints. The perspective camera model is used to convert these 2D forces into 3D forces that drive the hand model. These 3D forces are then used to calculate the acceleration of our dynamic hand, its new velocity, and its new position. Since this is a second-order dynamic hand model, we use it to predict finger motion from one frame to the next, so that we are closer to the data in the next frame. To avoid unnecessary calculations of the shading constraint, we monitor the intensity changes in several hand areas during tracking and use the constraint only if these changes are significant.

    The dynamic hand model is described in Sec. 2. Sec. 3 presents model initialization and the generation of image forces. Sec. 4 introduces illumination information into the optical flow constraint. Recursive dynamics of the hand model and constraints on the allowable motion are presented in Sec. 5. Tracking experiments are shown in Sec. 6, including complex palm and finger tracking with significant rotation.
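The intensity monitoring mentioned above (using the shading constraint only where intensity changes are significant) can be illustrated with a small sketch. It assumes grayscale frames stored as lists of rows and compares the mean intensity of a few rectangular hand patches between consecutive frames. The function names, the patch layout, and the threshold of 10 gray levels are illustrative assumptions, not values taken from the paper.

```python
from statistics import mean


def patch_mean(image, top, left, height, width):
    """Average intensity of a rectangular patch; image is a list of rows."""
    return mean(
        image[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    )


def shading_needed(prev_frame, next_frame, patches, threshold=10.0):
    """Flag each patch whose mean intensity changed by more than threshold.

    patches is a list of (top, left, height, width) tuples. The threshold
    is a hypothetical tuning value, not one reported in the paper.
    """
    flags = []
    for top, left, height, width in patches:
        before = patch_mean(prev_frame, top, left, height, width)
        after = patch_mean(next_frame, top, left, height, width)
        flags.append(abs(after - before) > threshold)
    return flags


# Toy 4x4 frames: the right half brightens by 30 gray levels.
prev_frame = [[100, 100, 100, 100] for _ in range(4)]
next_frame = [[100, 100, 130, 130] for _ in range(4)]
patches = [(0, 0, 4, 2), (0, 2, 4, 2)]
flags = shading_needed(prev_frame, next_frame, patches)
```

In this toy case only the second patch is flagged, so only that region would pay the cost of evaluating the shading term.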

    2 Hand Model
    In our forward dynamics formulation, the hand model (Fig. 1(a)) consists of a base link (palm) and five link chains (fingers) connected to the base link through five two-degree-of-freedom revolute joints. Each finger is three links connected by two one-degree-of-freedom revolute joints. The finger parts are modeled as cylinders and the palm is modeled as a six-rectangle-side solid. A two-degree-of-freedom revolute joint can be simplified as two one-degree-of-freedom revolute joints connected by a zero-length and zero-mass link (dummy link) [4]. In the hand model there are 21 links, including 5 dummy links, and 20 one-degree-of-freedom revolute joints. We number the palm (base link) as link 0. For each finger there are 4 links, including one dummy link, and 4 joints. The joint connecting the finger to the palm is joint 1, and link 1 connects joint 1 and joint 2. Link 1 is the dummy link. Joint i connects link i-1 and link i; link i links joint i and joint i+1. Each link has a local coordinate frame fixed to its starting end.

    The above geometric model is based on the measurements of an average male. The user specifies the joint locations in the image to initialize the model finger lengths. When the hand is illuminated by a directional light, we recover surface normal fields of parts of the hand by fitting this basic model to images of the hand, based on previous deformable model methodology that uses shape from shading and edges [19]. These normals will be used to calculate the generalized shading flow constraint.

    3 Image-Based Cues
    3.1 Fitting the 3-D Model to 2-D Images
    This approach needs a geometric 3-D model to transform 2-D forces into 3-D ones, which will be applied on the dynamic model. Initially the model is fitted to a known pose of the hand, as can be seen in Figure 1(b). At this stage of the work we assume knowledge of the camera parameters. At each frame, visibility checking is performed in order to match image and model points correctly. The computation of the motion of occluded fingers relative to the palm is based on the rigid motion of the hand. When the relative motion is not too large, we pick up the finger edges when they reappear. This method will fail when the fingers undergo significant relative motions while occluded. To track them successfully in that case, other methods, such as appearance-based methods, should be integrated in the framework; this is outside the scope of this paper.

    3.2 Force Calculation for Dynamic Model
    The 3-D finger motion is recovered by fitting the model to image-derived data. The external forces are applied on the dynamic model, then the rotation and translation of the finger joints are calculated. Figure 1(c) shows two kinds of typical finger motion. We obtain the forces by calculating displacements using the following procedure.

    Extract the finger edges using the Canny edge operator. A curvature-finding operator [7] is used to find the base points of each finger, as shown in Figure 1(d). The edges between the two base points of a finger correspond to that finger segment. The edge points of sub-segments can be derived from the corresponding 3-D points in the 3-D model during tracking.

    Because hand motion changes the base point positions between the current frame and the following frame, a normalization process is necessary to match the base points in the two frames, according to the distance between the two base points and the length of the finger segment. Given corresponding edge points in the current frame and the following frame, the 2-D force from edge displacement is calculated by Equation (1).

    [Equation (1)]

    Figure 1: (a) Dynamic model of the hand. (b) Initial posture of the hand model. (c) Finger motion and force from edge displacement. (d) Finger segmentation and base points. (e) Representing the projection of the model's articulated segments by their medial axis (thick white line).
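As a rough illustration of a force derived from edge displacement, the sketch below assumes (hypothetically) that the edge of a finger segment is available in each frame as an ordered list of 2-D points running from the base point toward the tip. It pairs points by their relative position along the edge, as a simple stand-in for the normalization step, and takes the force to be proportional to the displacement. The proportional form, the `stiffness` factor, and the function names are assumptions made for this example; they are not taken from Equation (1).

```python
def resample(points, n):
    """Resample an ordered polyline to n points by fractional index."""
    last = len(points) - 1
    out = []
    for i in range(n):
        s = i * last / (n - 1) if n > 1 else 0.0
        j = min(int(s), max(last - 1, 0))   # index of the segment start
        a = s - j                           # interpolation weight in [0, 1]
        x0, y0 = points[j]
        x1, y1 = points[min(j + 1, last)]
        out.append((x0 + a * (x1 - x0), y0 + a * (y1 - y0)))
    return out


def edge_forces(curr_edge, next_edge, n=5, stiffness=1.0):
    """2-D forces pulling current edge points toward the next frame's edge.

    Points are matched by relative position along the edge, so edges of
    different sampled lengths can still be compared (an assumption that
    stands in for the normalization described in the text).
    """
    p = resample(curr_edge, n)
    q = resample(next_edge, n)
    return [
        (stiffness * (qx - px), stiffness * (qy - py))
        for (px, py), (qx, qy) in zip(p, q)
    ]


# A vertical edge that shifts 2 pixels to the right between frames.
forces = edge_forces([(0, 0), (0, 10)], [(2, 0), (2, 10)], n=3)
```

Each returned pair is a 2-D image-plane force at one matched edge point; in the pipeline described here such forces would still have to be converted to 3-D before being applied to the model.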


    Another force can be derived directly from the optical flow of the image. In the optical flow equation

    $$ I_x u + I_y v + I_t = 0 \qquad (2) $$

    where $(u, v)$ is the flow and $I_x$, $I_y$, $I_t$ are the partial derivatives of the image intensity, the temporal differential $I_t$ at position $(x, y)$ is taken as the external force. The optical flow of hand motion is computed by the Lucas-Kanade method [9]. Optical flow near finger edges is not as reliable, due to possible mismatches of edge points, so we only consider the optical flow of the inside area of the finger segment, obtained from the projection of the 3-D model in the image plane. For optical flow computation we select only points with significant gradient magnitude. In Fig. 2 we see the edge forces and the optical flow forces applied to different regions of the image.

    3.3 Force Transformation from 2-D to 3-D
    We assume a perspective projection model. Therefore the point in the world coordinate system and the point in the camera coordinate system satisfy Equation (3), in which the two matrices are the translation and rotation matrices.

    [Equation (3)]

    Following previous work [8], by taking the time derivatives of the perspective projection equation for an image point, we get

    [Equation (4)]

    The focal length is obtained by pre-calibration of the camera. According to deformable model theory, these 3D forces are converted to generalized forces on the model parameters through the Jacobian of the model points. Consequently, the generalized forces calculated from 2-D images use the Jacobian of the model points under perspective projection. To apply the external forces on the dynamic model, we transform the individual forces obtained from edges and the optical flow within every hand segment into one total force and torque to be used in the recursive dynamic framework. The total force and torque for each hand segment are, respectively, $f = \sum_i f_i$ and $\tau = \sum_i r_i \times f_i$, where $f_i$ and $r_i$ are the individual force vectors and force position vectors.

    4 A New Constraint
    In previous work [19], a methodology was developed for the incorporation of illumination constraints (of any type that is differentiable w.r.t. the model parameters) in a deformable model formulation. In that work, the fitting of the model was done based on a static image, i.e., the data did not change during the fitting process. Hence any partial derivatives with respect to time in the illumination constraint were zero. Here we generalize our constraint formulation to include image motion. Instead of one image, the fitting process is guided by a sequence of moving images. We start from the reflectance equation. Let us assume that we have a reflectance function of the general
    form [Equation], where $I$ is the observed image intensity, the function depends on the lighting model parameters and can be differentiated with respect to the normal of the surface, and $q$ are the hand model parameters (the normals change based only on the model parameters). This means that the reflectance of the surface is locally computable and that there are no global illumination effects. We also assume that the illumination parameters do not change with time. We take this as the constraint equation, differentiate it w.r.t. time, and apply Baumgarte stabilization [3] in order to obtain

    [Equation (5)]

    In this case we cannot ignore the partial derivatives w.r.t. time. Therefore, using the above formulas, we expand Equation (5) to

    [Equation (6)]

    We notice that, given the Jacobian of the model points and the Jacobian of the model points under perspective projection as described in Sec. 3, we obtain

    [Equation (7)]

    which gives the left-hand side of the model-based optical flow constraint equation [20]. In model-based optical flow, motion field vectors are vectors of velocities of model points, and hence the same relation applies. Typically in the literature [11] this optical flow term is set to 0. This is correct in the case of ambient-only illumination. For the case of light sources at infinity, it is also correct for pure translational motion. For the simplest case of a Lambertian surface with a light source at infinity, it can be shown [12] that the magnitude of the error between the true motion field and the apparent and computable optical flow depends on the angular velocity of the rotational motion and on the light source direction:

    [Equation (8)]

    This error is small when the change of gradient is big, but in the case of smooth surfaces this effect becomes much more pronounced. Similarly, when there is no motion, the constraint equation simplifies to the shading constraint. Therefore

    [Equation (9)]

    encompasses both constraints. In the case of a smooth moving object, Eq. (9) allows us to deal with errors due to directed illumination and offers the possibility of recovering the motion of relatively smoothly shaded surfaces. Fig. 2(c, d) shows the change in average intensity in a small smooth area of the hand when the illumination comes from the top and from the side, respectively. In the second case, changes in the intensity of the points are dramatic.

    Figure 2: Forces applied to the hand model and the effects of shading. (a) Edge forces. (b) Optical flow forces in the interior of the model. (d) The change in average intensity in a small smooth area of the hand, depicted in (c), when the illumination comes from the top (blue line) and from the side (green dashed line), respectively. Plot axes in (d): average intensity over 30x50 pixels (80 to 200) versus frame number (1 to 7); legend: top light, side light.

    5 Dynamic Tracking of Hand Motion
    In our methodology we estimate the hand motion in response to the applied 3D forces on the hand as a forward dynamics problem where, given the external forces, we want to compute the velocity and position of the palm. Since we use a recursive dynamic formulation, we use Featherstone's [2, 5] spatial notation to model our kinematic and dynamic variables. We integrate the constraint of Eq. (9) in the above formulation to determine the vector of the model's degrees of freedom, which includes the joint variables, global rotation, and translation. Furthermore, human fingers are not ideal dynamic links: their joints have upper and lower bounds. Therefore we need to solve the above dynamic equations under joint limit constraints. These joint limits, which constrain the relative motion of the fingers, together with our dynamic formulation, which does not allow the inter-penetration of fingers, make hand tracking significantly more robust. Our method has the following steps:














































    1. At time t, mark the joints that have reached their joint limits.
    2. Solve the dynamic equations of the hand at time t recursively.
    3. For each finger, starting at joint 1 (the joint that connects the palm and the finger), mark the first joint that stays at its joint limit during the time period from t to t + Δt. If there is no such joint, go to step 6.
    4. Fix the joints marked at step 3 and merge each pair of links connected by a fixed joint into one link. Update the dynamic hand model.
    5. Go back to step 2.
    6. Output the status of the dynamic model of the hand at the end of the time step. Increase time and go to step 1.
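The control flow of the steps above can be sketched for a single finger chain as follows. This is a deliberately simplified illustration: the recursive dynamics solve of step 2 is replaced by a stand-in in which every joint is an independent one-degree-of-freedom joint with constant inertia, integrated with a semi-implicit Euler step, so fixing one joint does not change the dynamics of the others as merging links would in the full model. The function names, the `inertia` parameter, and the final clamping are assumptions added for this example.

```python
def stays_at_limit(angle, new_angle, lo, hi):
    """True if a joint sitting on a bound is still on or beyond that bound."""
    if angle <= lo:
        return new_angle <= lo
    if angle >= hi:
        return new_angle >= hi
    return False


def step_finger(angles, velocities, torques, limits, dt, inertia=1.0):
    """Advance one finger by dt under joint limits (joint 0 is nearest the palm)."""
    n = len(angles)
    # Step 1: mark joints that are at a limit at time t.
    at_limit = [a <= lo or a >= hi for a, (lo, hi) in zip(angles, limits)]
    fixed = set()
    while True:
        # Step 2: stand-in for the recursive dynamics solve (assumption).
        new_a, new_v = list(angles), list(velocities)
        for i in range(n):
            if i in fixed:
                new_v[i] = 0.0          # a fixed joint does not move
                continue
            new_v[i] = velocities[i] + dt * torques[i] / inertia
            new_a[i] = angles[i] + dt * new_v[i]
        # Step 3: first joint, from the palm outward, that stays at its limit.
        marked = None
        for i in range(n):
            if i not in fixed and at_limit[i] and stays_at_limit(
                angles[i], new_a[i], *limits[i]
            ):
                marked = i
                break
        if marked is None:
            break                       # no such joint: go to step 6
        fixed.add(marked)               # Step 4: fix the joint; Step 5: re-solve
    # Extra safeguard (assumption, not one of the listed steps): clamp overshoot
    # so a joint that crosses a bound is seen at its limit on the next call.
    new_a = [min(max(a, lo), hi) for a, (lo, hi) in zip(new_a, limits)]
    # Step 6: output the state at the end of the time step.
    return new_a, new_v


# Joint 0 sits at its lower limit and is pushed further into it; joint 1 is free.
out_angles, out_vel = step_finger(
    [0.0, 0.5], [0.0, 0.0], [-1.0, 1.0], [(0.0, 1.5), (0.0, 1.5)], dt=0.1
)
```

In this example the first joint is fixed at its bound while the second advances normally. A joint at its limit that is pushed back into the allowed range is not fixed and moves freely.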

    6 Experiments
    We performed a series of experiments to test our method with a variety of hand motions. All our experiments run on a PIII 500MHz processor at approximately 4 frames per second. Two similar datasets were taken under two different illumination conditions. The first dataset (Fig. 3) was taken with the light coming from the top of the hand, thus minimizing the variations in intensity w.r.t. the illumination. The second dataset (Fig. 4) was taken with the light on the side (approx. 50 degrees), so illumination effects are pronounced. Each sequence was approximately 100 frames. Due to space limitations we include only a few frames in this paper. The full sequences and the tracking results are available as movie files at http://www.cs.sunysb.edu/~samaras/hand (files trk_top.mpg and trk_side.mpg, respectively). To show the accuracy of the tracking, we project the segments of the hand model back onto the image. We represent the segments by their medial axes (Fig. 1(e)). At the same web site, an additional data sequence, where fingers flex to a closed position and un-flex back to open without losing track, is in movie file trk_flex.mpg, and the full model while tracking, rendered from a different viewpoint, is in movie file mdl_flex.mpg. From such a viewpoint it can be seen that our dynamic model allows for accurate tracking of segments that are almost occluded from the camera.

    In Figure 3 we present a number of complex rotational motions for the fingers and for the whole hand. First the fingers bend away from the camera, then the whole hand rotates with significant occlusions. Neither edges nor optical flow alone would have succeeded in tracking this sequence. Finally, in Figure 4 we demonstrate the increased power of the shading flow constraint, since classic optical flow based on the brightness constancy assumption fails due to the significant appearance changes from frame to frame caused by illumination. We notice that tracking is quite successful in these examples. There are some slight inaccuracies in tracking the thumb, here modeled as a 3-segment finger with one segment of zero length, whereas a better model would have 2 segments only.

    7 Conclusions
    In this paper we have augmented traditional optical flow and replaced it with a more general equation that includes shading information. We have used this formulation within a deformable model framework, and we were able to track difficult hand motions under a variety of illumination conditions. To improve the efficiency of the approach, we use the augmented equations only in areas where the optical flow constraint is significantly violated. Our dynamic hand model formulation allows the integration of multiple cues, and for robustness we also use edges in our tracking. We have shown tracking results for simple and complex palm and finger motions. Future work includes better occlusion recovery handling using Kalman filtering and the incorporation of other sources of visual information, such as color, in order to work on cluttered backgrounds.

    References
    [1] G. Engeln-Müllges and F. Uhlig. Numerical Algorithms with C. Springer, 1996.
    [2] R. Featherstone. Robot Dynamics Algorithms. Kluwer Academic, Boston, 1987.
    [3] J. Baumgarte. Stabilization of constraints and integrals of motion in dynamical systems. Computer Methods in Applied Mechanics and Engineering, 1:1-16, 1972.
    [4] G. Huang, D. Metaxas, and J. Lo. Human Motion Planning Based on Recursive Dynamics and Optimal Control Techniques. Computer Graphics International 2000, pp. 19-28.
    [5] K. W. Lilly. Efficient Dynamic Simulation of Robotic Mechanisms. Kluwer Academic, Boston, 1993.
    [6] J. Lo and D. Metaxas. Recursive Dynamics and Optimal Control Techniques for Human Motion Planning. CA '99, Geneva, Switzerland, May 26-29, 1999.
    [7] S. B. Kang and K. Ikeuchi. Toward Automatic Instruction from Perception: Recognizing a Grasp from Observation. IEEE Trans. on Robotics and Automation, pp. 432-443, Aug. 1993.
    [8] D. N. Metaxas. Physics-Based Deformable Models: Applications to Computer Vision, Graphics and Medical Imaging. Kluwer Academic Publishers, 1998.
    [9] B. Lucas and T. Kanade. An Iterative Technique of Image Registration and Its Application to Stereo. Proc. 7th IJCAI, pp. 674-679, August 1981.
    [10] J. Angeles and O. Ma. Dynamic Simulation of n-Axis Serial Robotic Manipulators Using a Natural Orthogonal Complement. The International Journal of Robotics Research, 7(5):32-47, October 1988.
    [11] B. K. P. Horn. Robot Vision. 1986.
    [12] A. Verri and T. A. Poggio. Motion field and optical flow: Qualitative properties. PAMI, 11(5):490-498, May 1989.
    [13] H. Brandl, R. Johanni, and M. Otter. An Algorithm for the Simulation of Multibody Systems with Kinematic Loops. IFToMM Seventh World Congress on the Theory of Machines and Mechanisms, Sep. 1987.
    [14] J. K. Hodgins, W. L. Wooten, D. C. Brogan, and J. F. O'Brien. Animation of Human Athletics. SIGGRAPH 95.
    [15] R. H. Lathrop. Constrained (Closed-Loop) Robot Simulation by Local Constraint Propagation. IEEE ICRA 86.
    [16] A. J. Stewart and J. F. Cremer. Beyond keyframing: An algorithmic approach to animation. Graphics Interface, 1992.
    [17] M. W. Walker and D. E. Orin. Efficient Dynamic Computer Simulation of Robotic Mechanisms. Journal of Dynamic Systems, Measurement, and Control, 104:205-211, Sep. 1982.
    [18] J. Wilhelms and B. Barsky. Using Dynamic Analysis to Animate Articulated Bodies such as Humans and Robots. In Graphics Interface, 1985.
    [19] D. Samaras and D. Metaxas. Incorporating Illumination Constraints in Deformable Models. CVPR 1998, pp. 322-329.
    [20] D. DeCarlo and D. Metaxas. Optical Flow Constraints on Deformable Models with Applications to Face Tracking. IJCV, 38(2), pp. 99-127, July 2000.
    [21] S. Negahdaripour. Revised Definition of Optical Flow: Integration of Radiometric and Geometric Cues for Dynamic Scene Analysis. PAMI, 20(9), pp. 961-979, Sep. 1998.
    [22] R. L. Carceroni and K. N. Kutulakos. Multi-View Scene Capture by Surfel Sampling: From Video Streams to Non-Rigid 3D Motion, Shape and Reflectance. ICCV01, II:60-67.
    [23] H. Haussecker and D. J. Fleet. Computing optical flow with physical models of brightness variation. PAMI, 23(6), pp. 661-673, 2001.
    [24] J. J. Kuch and T. S. Huang. Vision based hand modeling and tracking for virtual teleconferencing and telecollaboration. In ICCV95, pp. 666-671, 1995.
    [25] J. Lee and T. Kunii. Model-based analysis of hand posture. IEEE CGA, 15:77-86, Sept. 1995.
    [26] J. Rehg and T. Kanade. Model-based tracking of self-occluding articulated objects. In ICCV 95, pp. 612-617.
    [27] Ying Wu and T. S. Huang. Capturing articulated human hand motion: A divide-and-conquer approach. In ICCV 99, pp. 606-611, Corfu, Greece, Sept. 1999.
    [28] Ying Wu, J. Y. Lin, and T. S. Huang. Capturing natural hand articulation. ICCV 01, II:426-432.
    [29] R. Rosales, V. Athitsos, L. Sigal, and S. Sclaroff. 3D Hand Pose Reconstruction Using Specialized Mappings. ICCV01.
    [30] Q. Delamarre and O. Faugeras. Finding pose of hand in video images: a stereo-based approach. AFGR 98.

    Figure 3: Seven frames from a longer sequence tracking flexing of fingers and hand rotation. First row: original data. Second row: the accuracy of the tracking is demonstrated by projecting the medial axes of each model finger (white lines) on the tracked data. Third row: full model during tracking. The full sequence can be seen in movie clip file http://www.cs.sunysb.edu/~samaras/hand/trk_top.mpg

    Figure 4: Eight frames from a longer sequence tracking flexing of fingers and hand rotation (panel titles: Finger Flexing, Hand Rotation). Sideways illumination causes significant deviations from the classical optical flow constraint during rotation. The generalized optical flow constraint with shading allows for accurate tracking. First and third rows: original data. Second and fourth rows: the accuracy of the tracking is demonstrated by projecting the medial axes of each model finger (white lines) on the tracked data. The full sequence can be seen in movie clip file http://www.cs.sunysb.edu/~samaras/hand/trk_side.mpg
