
Estimation of Multiple Directional Light Sources for Synthesis of Mixed Reality Images



    Estimation of Multiple Directional Light Sources for Synthesis of Mixed Reality
    Images
    Yang Wang, Dimitris Samaras
    Computer Science Department,
    State University of New York at Stony Brook, NY 11794, USA
    {yangwang, samaras}@cs.sunysb.edu
    Abstract
    We present a new method for the detection and estima-
    tion of multiple directional illuminants, using a single im-
    age of any object with known geometry and Lambertian re-
    flectance. We use the resulting highly accurate estimates
    to modify virtually the illumination and geometry of a real
    scene and produce correctly illuminated Mixed Reality im-
    ages. Our method obviates the need to modify the im-
    aged scene by inserting calibration objects of any partic-
    ular geometry, relying instead on partial knowledge of the
    geometry of the scene. Thus, the recovered multiple illumi-
    nants can be used both for image-based rendering and for
    shape reconstruction. Our method combines information
    both from the shading of the object and from shadows cast
    on the scene by the object. Initially we use a method based
    on shadows and a method based on shading independently.
    The shadow based method utilizes brightness variation in-
    side the shadows cast by the object, whereas the shading
    based method utilizes brightness variation on the directly
    illuminated portions of the object. We demonstrate how
    the two sources of information complement each other in
    a number of occasions. We then describe an approach that
    integrates the two methods, with results superior to those
    obtained if the two methods are used separately. The re-
    sulting illumination information can be used (i) to render
    synthetic objects in a real photograph with correct illumi-
    nation effects, and (ii) to virtually re-light the scene.
    1. Introduction
    In order to integrate seamlessly a virtual object in a real
    scene (i.e. synthesize a Mixed Reality image), we need
    to simulate accurately the interactions of the virtual object
    with the illumination of the scene. Furthermore, to manip-
    ulate realistically existing images, knowledge of illuminant
    directions is necessary both in image based computer graph-
    ics, and in computer vision for shape reconstruction. This
    problem is particularly hard for diffuse (Lambertian) sur-
    faces and directional light sources and cannot be solved us-
    ing local information only. Lambertian reflectance is the
    most common type of reflectance. In this work we con-
    centrate on directional light sources because they have the
    most pronounced effects in the appearance of a scene and
    any errors in their estimation will cause noticeable errors
    and inconsistencies in the resulting Mixed Reality images.
    Previous methods that estimate multiple light sources re-
    quire images of a calibration object of given shape (typ-
    ically spheres) which needs to be removed from the scene
    and might cause artifacts. Instead, our method relies on par-
    tial knowledge of the geometry of the scene and can be used
    on objects of arbitrary shape. This allows us to possibly use
    any diffuse object of the scene for illumination calibration.
    We present a new method that integrates information from
    shadows and shading in the presence of strong directional
    sources of illumination. The shadow based method utilizes
    brightness variations inside the shadows cast by the object,
    whereas the shading based method utilizes brightness varia-
    tions on the directly illuminated portions of the object. The
    proposed integrated method is both more accurate and more
    general in its applicability, than the two methods applied
    separately.
    In the last few years, there has been an increased interest
    in estimating the reflectance properties and the illumination
    conditions of a scene based on images of the scene. The
    interest in computing illuminant directions first arose from
    shape from shading applications, and focused on recovering
    a single light source [6, 11, 32, 21]. However, illumination
    in most real scenes is more complex and it is very likely
    to have a number of co-existing light sources in a scene.
    An early attempt to recover a more general illumination de-
    scription [7], modeled multiple light sources as a polyno-
    mial distribution. A discussion of the various types of light
    sources can be found in [10]. With the advent of Image
    Based Modeling and Rendering (IBMR) methods in Computer
    Graphics, it quickly became apparent that the accuracy,
    photorealism and generality of many IBMR applications
    depend on the knowledge of such properties and parameters.
    As a result, a number of methods were proposed that recovered
    illumination parameters or reflectance properties of
    the scene in the form of BRDFs (Bidirectional Reflectance
    Distribution Functions) [3, 30, 29, 24, 13, 20, 14, 9, 26, 4, 15, 22]. Most
    of these methods are geared towards the production of high
    quality images, requiring extensive data collection with the
    use of specialized equipment [3, 4, 30, 29] and off-line pro-
    cessing [9, 15], or have particularly restrictive assumptions,
    e.g. a single light source [14, 22]. Such methods would
    not be of use if only one or a few images of the scene are
    available. More relevant work is the system proposed by
    [13], aimed at interactive relighting of indoor scenes, re-
    quiring knowledge of complete scene geometry and using
    fast radiosity methods. Radiosity computations in a com-
    plex outdoor scene, even if accurate geometry was known,
    would be prohibitively slow.
    In particular, estimation of illumination parameters from
    images is necessary, in order to compensate for illumina-
    tion artifacts, and also to allow super-imposition of syn-
    thetic images of new objects into real scenes. Most such
    methods need to use a calibration object of fixed shape,
    typically a sphere. In [17] a calibration object that
    comprises diffuse and specular parts is proposed. In [3] a
    specular sphere is used as a light probe to measure the in-
    cident illumination at the location where synthetic objects
    will be placed in the scene. Such a sphere though might
    have strong inter-reflections with other objects of the scene,
    especially if they are close to it. Using the Lambertian shad-
    ing model, Yang and Yuille [28] observed that multiple light
    sources can be deduced from boundary conditions, i.e., the
    image intensity along the occluding boundaries and at sin-
    gular points. Based on this idea, Zhang and Yang [31] show
    that the illuminant directions have a close relationship to
    critical points on a Lambertian sphere and that, by identify-
    ing most of those critical points, illuminant directions may
    be recovered if certain conditions are satisfied. Conceptually,
    a critical point is a point on the surface whose neighbors
    are not all illuminated by the same set of light sources.
    However, because the detection of critical points is sensi-
    tive to noise, the direction of extracted real lights is not very
    robust to noisy data. Recently, an illuminant direction de-
    tection method that minimizes global error was proposed by
    Wang and Samaras [27]. In general, each point of a surface
    is illuminated by a subset of all the directional light sources
    in the scene. The method segments the surface robustly into
    regions (“virtual light patches”), with each region illumi-
    nated by a different set of sources. Then, real lights can be
    extracted, based on the segmented “virtual light patches” in-
    stead of critical points that are relatively sensitive to noise.
    Since there are more points in a region than on the bound-
    ary, the method’s accuracy does not depend on the exact
    extraction of the boundary and can tolerate noisy and miss-
    ing data better. When the observed shape is not spherical its
    normals are mapped to a sphere (for an example see Fig.6),
    although a lot of normals will be missing. However the
    method works well even for incomplete spheres, as long
    as there are enough points inside each light patch for the
    least-squares method to work correctly.
    Inserting calibration objects in the scene complicates the
    acquisition process, as they either need to be physically re-
    moved before re-capturing the image, which is not always
    possible, or they need to be electronically removed as a post
    processing step, which might introduce artifacts in the im-
    age. Our proposed method can be applied to objects of
    known arbitrary geometry, as long as that shape contains a
    fairly complete set of normals for a least-squares evaluation
    of the light sources. Thus, it would be possible to estimate
    the illuminants from the image of a scene, using geome-
    try that is part of the scene. The idea of using arbitrary
    known shape, can also be found in the approach of Sato et
    al. [24], which exploits information of a radiance distribu-
    tion inside shadows cast by an object of known shape in the
    scene. Recently, under a signal processing approach [20, 1]
    a comprehensive mathematical framework for evaluation of
    illumination parameters through convolution is described.
    Unfortunately, this framework does not provide a method
    to estimate high-frequency illumination such as directional
    light sources when the BRDF is smooth as in the Lamber-
    tian case. Convolution is a local operation and the problem
    is ill-posed when only local information is considered [2].
    Our method uses global information to overcome this prob-
    lem, and in this sense, it is complementary to the methods
    of [20, 1, 14].
    In this paper, we propose a new method for multiple di-
    rectional source estimation, that integrates illumination es-
    timation from shading [27] and shadows [24]. Both meth-
    ods rely on knowledge of the illuminated geometry but do
    not require a specific calibration object. However they have
    different strengths and weaknesses. Often the shadow of
    a light source that shades a large visible area of an ob-
    ject is occluded and vice-versa. The Hough transform
    can introduce spurious lights in the shading-based method
    and the extended area source approximation of directional
    sources in the shadow-based method can introduce signifi-
    cant errors. We demonstrate how the two sources of infor-
    mation complement each other in a number of occasions.
    Even when both methods are applicable at the same time,
    combining them reduces error and speeds up computation.
    Hence, we arrive at an approach that integrates the two
    methods, with results superior to those obtained if the two
    methods are used separately. The resulting illumination in-
    formation can be used (i) to render synthetic objects in a
    real photograph with correct illumination effects, and (ii) to
    virtually re-light the scene.
    The rest of this paper is structured as follows: Section 2
    describes the notion of critical points and their properties as
    they pertain to our problem. Section 3 describes the basic
    shading-based algorithm and extensions that make it robust
    to noise and missing data. These properties of our algorithm
    allow its application to objects of arbitrary shape in Section
    3.4. The shadow based algorithm is in Section 4. The two
    methods are compared and integrated in Section 5. We ap-
    ply our method to the synthesis of Mixed Reality images in
    Section 6 and conclude with future work in Section 7.
    2. Critical Points
    Definition 1 Given an image, let Li, i = 1, 2, . . ., be the
    light sources of the image. A point in the image is called
    a critical point if the surface normal at the corresponding
    point on the surface of the object is perpendicular to some
    light source Li.
    We assume that images are formed by perspective or
    orthographic projection and the object in the image has
    a Lambertian surface with constant albedo, that is, the BRDF
    $f(\theta_i, \phi_i; \theta_e, \phi_e)$ is known to be a constant and each surface
    point appears equally bright from all viewing directions:
    $$E = \rho L_i \hat{L} \cdot \hat{n} = \rho L_i \cos\theta_i \qquad (1)$$
    where E is the scene radiance of an ideal Lambertian surface,
    $\rho$ is the albedo, $\hat{L}$ represents the direction and $L_i$ the
    amount of incident light, and $\hat{n}$ is the unit normal to the
    surface.
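    As a minimal sketch of Eqn.(1), the function below evaluates the radiance of a single Lambertian point under several directional sources. It assumes that the contributions of individual sources simply add, and that a source below the local horizon contributes nothing (the clamp that appears as max(., 0) in Eqn.(7)). The function name and the argument layout are illustrative choices, not part of the paper.

```python
def lambertian_radiance(normal, lights, albedo=1.0):
    """Radiance of one Lambertian surface point under directional lights.

    normal: unit surface normal (nx, ny, nz)
    lights: list of (direction, intensity) pairs; each direction is a unit
            vector pointing from the surface toward the light
    albedo: the constant rho of Eqn.(1)
    """
    total = 0.0
    for direction, intensity in lights:
        cos_theta = sum(d * n for d, n in zip(direction, normal))
        # A source behind the local horizon does not illuminate the point.
        total += intensity * max(cos_theta, 0.0)
    return albedo * total


# One frontal light, one grazing light and one light behind the surface.
lights = [((0.0, 0.0, 1.0), 2.0), ((1.0, 0.0, 0.0), 5.0), ((0.0, 0.0, -1.0), 3.0)]
radiance = lambertian_radiance((0.0, 0.0, 1.0), lights, albedo=0.5)
```

    Only the first source contributes here, because the second is perpendicular to the normal and the third lies behind the surface.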
    Initially, the algorithm is developed using a sphere model
    and subsequently extended [27] to objects of arbitrary
    shape.
    • We assume the observed object is a sphere with Lam-
    bertian reflectance properties whose physical size is al-
    ready known.
    • For light sources whose direction is co-linear with the
    lens axis of the camera (see footnote 1), the best possible
    result is their equivalent frontal light source $L_{frontal}$.
    It has been proven in [31] that it is not possible to recover
    the exact value of the intensity of any individual light source
    among four (or more) pairs of antipodal light sources (i.e.
    opposite direction light sources). However, this kind of sit-
    uation, i.e. an object illuminated by antipodal light sources,
    happens rarely, so for simplicity in the rest of this paper, we
    will make an additional assumption that there are no antipo-
    dal light sources.
    2.1. Sphere cross section with a plane P
    Footnote 1: We assume that they are co-linear when they form an angle
    less than a resolution-dependent threshold.
    Figure 1. L and its projection $L_P$ onto plane P.
    Let P be an arbitrary plane such that S, the center of
    the sphere, lies on it (Fig.1), let $L_i$, i = 1, 2, . . ., be the light
    sources of the image and $(L_i)_P$ their projections on P. A
    point on the arc can be specified by its corresponding angle
    parameter in $[\alpha, \beta]$ using the following proposition [31]:
    Proposition 1 Consider an angle interval $[\alpha, \beta]$ of a sphere
    cross section (Fig.1). We can always find a partition
    $\theta_0 = \alpha < \theta_1 < \ldots < \theta_n = \beta$ of the interval $[\alpha, \beta]$ such that in
    each $[\theta_{i-1}, \theta_i]$ we have $E(\theta) = b_i \sin\theta + c_i \cos\theta$ for some
    constants $b_i$ and $c_i$, $1 \le i \le n$ (Fig.2), where $E(\theta)$ is the
    intensity function along the arc.
    Figure 2. The xy-space and the $\theta R$-space for the case with
    two directional light sources.
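    The sinusoidal form in Proposition 1 can be checked directly from Eqn.(1). The short derivation below is a sketch under stated assumptions: unit albedo, an orthonormal basis $\mathbf{u}, \mathbf{v}$ of the plane P (names introduced here for illustration), and an index set, written $\Lambda_i$ here, of the sources that light the whole segment $[\theta_{i-1}, \theta_i]$.

```latex
\begin{aligned}
\hat{n}(\theta) &= \cos\theta\,\mathbf{u} + \sin\theta\,\mathbf{v}, \\
E(\theta) &= \sum_{j \in \Lambda_i} L_j \cdot \hat{n}(\theta)
  = \underbrace{\Big(\sum_{j \in \Lambda_i} L_j \cdot \mathbf{v}\Big)}_{b_i} \sin\theta
  + \underbrace{\Big(\sum_{j \in \Lambda_i} L_j \cdot \mathbf{u}\Big)}_{c_i} \cos\theta .
\end{aligned}
```

    Because $\hat{n}(\theta)$ lies in P, only the projections $(L_j)_P$ enter the dot products, so $(b_i, c_i)$ are the in-plane coordinates of the summed projected sources. The pair changes only when a source enters or leaves the set, which is what happens at a critical point.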
    Intuitively, $(b_{i-1}, c_{i-1})$ represents the virtual light
    source of the $[\theta_{i-2}, \theta_{i-1}]$ part, and $(b_i, c_i)$ of the
    neighboring $[\theta_{i-1}, \theta_i]$ part. These two virtual light sources will
    be different, if each of these two parts is lit by a different
    illuminant configuration. More formally, Proposition
    2 (from [31]) describes the difference between $(b_{i-1}, c_{i-1})$
    and $(b_i, c_i)$:
    Proposition 2 In the configuration of Proposition 1, for
    any $1 \le i \le n$, we define $\Lambda_i$ as the index set of real light
    sources contributed to the $[\theta_{i-1}, \theta_i]$ part of the arc. Then
    the Euclidean distance between two $(b_i, c_i)$ pairs is
    $$\sqrt{(b_i - b_{i-1})^2 + (c_i - c_{i-1})^2} = \sum_{j \in \Lambda' \cup \Lambda''} \|(L_j)_P\| \qquad (2)$$
    where $\|(L_j)_P\|$ is the Euclidean norm, $\Lambda' = \Lambda_{i-1} \setminus \Lambda_i$ (the
    index set of elements in $\Lambda_{i-1}$ but not in $\Lambda_i$), $\Lambda'' = \Lambda_i \setminus \Lambda_{i-1}$,
    and $\sum_{j \in \Lambda_i} L_j$ is the virtual light source corresponding to
    $[\theta_{i-1}, \theta_i]$.
    Propositions 1 and 2 show that the difference between
    $(b_{i-1}, c_{i-1})$ and $(b_i, c_i)$ will be maximized at a critical point
    for these two virtual light sources. As we can see from
    Eqn.(2), possible critical points can be detected by thresholding
    $\|(L_j)_P\|$.
    2.2. Properties of critical points
    Let $\Omega$ be the set of all critical points and $\Gamma$ be the space
    of the sphere image. Then intuitively $\Omega$ will cut $\Gamma$ into a
    decomposition, i.e.
    $$\Gamma = \Big(\bigcup_{i \in I} u_i\Big) \cup \Omega \qquad (3)$$
    where each $u_i \subset \Gamma$ is a subset of $R^2$ which does not contain
    any critical points and I is an index set.
    Proposition 3 Given a decomposition of the image as de-
    scribed by (3), for any image region ui, which corresponds
    to a 3D surface region si, there exists a light source L such
    that when si is illuminated by L, the resulting image is ex-
    actly the same as ui.
    Proposition 2 already provides us with a criterion to
    detect critical points on the sphere based on the distance
    between $(b_i, c_i)$ pairs. Unfortunately, this criterion
    greatly depends on the intensities of virtual light sources,
    $\sum_{j \in \Lambda' \cup \Lambda''} \|(L_j)_P\|$, which are projected on the plane with
    respect to each different cross section. To locate the critical
    points more accurately, we provide another way to detect
    critical points on each cross section. Instead of using the
    distance between $(b_i, c_i)$ pairs, we can use the tangent
    angles defined on the intensity curve (Fig.3(a)) [27].
    Proposition 4 Along a sine curve, at a critical point $\theta_c$,
    the inner angle $\gamma$ between two tangent lines of each side
    $(T_1, T_2)$ will be larger than 180 degrees.
    Figure 3. (a) Inner angle $\gamma$. (b) Angles between two tangent
    lines.
    3. Shading-Based Illuminant Detection
    3.1. Critical Point Detection
    From Proposition 1 we know that, for every cross section
    of the sphere with a plane P such that S, the center
    of the sphere, lies on P (illustrated in figures 2, 4), there
    is a partition $\theta_0 = \alpha < \theta_1 < \ldots < \theta_n = \beta$ of the
    angle interval $[\alpha, \beta]$ such that in each $[\theta_{i-1}, \theta_i]$, we have
    $E(\theta) = b_i \sin\theta + c_i \cos\theta$ for some constants $b_i$ and $c_i$,
    $1 \le i \le n$. By applying a standard recursive least-squares
    algorithm [5], we can use the following two consecutive
    windows to detect the local maximum points of inner angles
    $\gamma$ and distance defined by Eqn.(2). Starting from an
    initial point A, any point B on the same arc can be determined
    uniquely by the angle $\theta$ between SA and SB. Then
    we cover this part AB by two consecutive windows AW and
    WB (Fig.4).
    Figure 4. A part of the arc, AB, is covered by two consecutive
    windows AW and WB.
    With points B and W moving from the beginning point
    A of the visible part of the circle to its ending point C along
    the arc, we could estimate $b_i$ and $c_i$ from the data in the
    two consecutive windows AW and WB respectively. Once
    a local maximum point of Eqn.(2) is detected, it signifies
    that we have included at least a ‘critical point’ in the second
    window WB. Because the inner angle $\gamma$ defined in Proposition
    4 is very sensitive to noise, we use two different criteria
    simultaneously to detect critical points. First we examine
    the distance defined in Proposition 2, then if the distance
    is above a threshold $T_{distance}$, we try to locate the critical
    point by searching for the maximum inner angle $\gamma$ along
    the curve. In practice, for the distance criterion threshold
    $T_{distance}$, we use a ratio $T_{ratio}$ instead of the direct Euclidean
    norm to normalize for the varying light intensities.
    $T_{ratio}$ is calculated from Proposition 2:
    $$T_{ratio} = \frac{\sqrt{(b_i - b_{i-1})^2 + (c_i - c_{i-1})^2}}{\max\left\{\sqrt{b_{i-1}^2 + c_{i-1}^2},\ \sqrt{b_i^2 + c_i^2}\right\}} \qquad (4)$$
    where $(b_{i-1}, c_{i-1})$ and $(b_i, c_i)$ are calculated from the two
    consecutive windows AW and WB. Therefore, we can keep
    growing the first window AW to find critical point $p_c$ using
    the recursive least-squares algorithm again. Then we fix the
    initial point A at $p_c$ and keep searching for the next ‘critical
    point’ until we exhaust the whole arc.
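    The two-window test can be sketched as follows. This is a simplified illustration, not the procedure of the paper: it uses two fixed-size windows and a batch least-squares fit in place of the growing windows and the recursive least-squares algorithm, and it only applies the ratio criterion of Eqn.(4), not the inner-angle criterion. The function names, the window size and the threshold are assumptions made for the example.

```python
import math


def fit_sinusoid(thetas, values):
    """Least-squares fit of E(theta) = b*sin(theta) + c*cos(theta)."""
    sss = sum(math.sin(t) ** 2 for t in thetas)
    scc = sum(math.cos(t) ** 2 for t in thetas)
    ssc = sum(math.sin(t) * math.cos(t) for t in thetas)
    sse = sum(math.sin(t) * e for t, e in zip(thetas, values))
    sce = sum(math.cos(t) * e for t, e in zip(thetas, values))
    det = sss * scc - ssc * ssc
    if abs(det) < 1e-12:
        raise ValueError("window too short or degenerate")
    # Solve the 2x2 normal equations by Cramer's rule.
    return (sse * scc - sce * ssc) / det, (sce * sss - sse * ssc) / det


def t_ratio(prev, curr):
    """Normalized distance between two (b, c) pairs, as in Eqn.(4)."""
    (b0, c0), (b1, c1) = prev, curr
    den = max(math.hypot(b0, c0), math.hypot(b1, c1))
    return math.hypot(b1 - b0, c1 - c0) / den if den > 0 else 0.0


def find_critical_point(thetas, values, window, threshold):
    """Return (index, ratio) of the split with the largest ratio, or (None, ratio)."""
    best_k, best_ratio = None, 0.0
    for k in range(window, len(thetas) - window + 1):
        left = fit_sinusoid(thetas[k - window:k], values[k - window:k])
        right = fit_sinusoid(thetas[k:k + window], values[k:k + window])
        ratio = t_ratio(left, right)
        if ratio > best_ratio:
            best_k, best_ratio = k, ratio
    return (best_k, best_ratio) if best_ratio >= threshold else (None, best_ratio)


# Synthetic arc: one light everywhere, a second light switching on at theta = 1.0.
thetas = [0.01 * i for i in range(201)]
values = [math.sin(t) + 0.5 * math.cos(t) + max(0.0, 0.8 * math.sin(t - 1.0))
          for t in thetas]
k, ratio = find_critical_point(thetas, values, window=30, threshold=0.2)
```

    On this noise-free curve the largest ratio occurs where both windows are fitted by a single sinusoid each, that is, close to the point where the second light starts to contribute.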
    3.2. Segmenting the Surface
    Definition 2 All critical points corresponding to one real
    light will be grouped into a cut-off curve which is called a
    critical boundary.
    Intuitively, each critical boundary of the sphere in our
    model is on a cross section plane through the center of the
    sphere. Therefore, critical points can be grouped into critical
    boundaries using the Hough transform in a $(\phi, \psi)$ angle-pair
    parameter Hough space, i.e. we apply the cross-section
    plane equation in the following form:
    $$\begin{cases} x \cdot n_x + y \cdot n_y + z \cdot n_z = 0 \\ n_x = r\cos\phi,\quad n_y = r\sin\phi\cos\psi,\quad n_z = r\sin\phi\sin\psi \end{cases} \qquad (5)$$
    where (x, y, z) is the position of each critical point,
    $(n_x, n_y, n_z)$ is the normal of the cross-section plane, r is
    the radius of the sphere and $\phi, \psi \in [0, 180]$. Typically, we
    use one-third of the highest vote count in the Hough transform
    as the threshold above which we detect a $(\phi, \psi)$ angle
    pair as a possible critical boundary.
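    A minimal sketch of the voting scheme behind Eqn.(5) is given below. It assumes a unit sphere (r = 1), a coarse 10-degree grid over the two plane angles and a fixed distance tolerance; all three values, as well as the function name, are illustrative. Every grid cell counts the critical points lying within the tolerance of its plane, and the cells that reach one-third of the highest count are returned.

```python
import math


def hough_great_circles(points, step_deg=10, tol=0.05):
    """Vote for planes through the sphere center; return cells above peak/3.

    points: critical points (x, y, z) on a unit sphere centered at the origin
    Returns a sorted list of (angle1_deg, angle2_deg) cells.
    """
    votes = {}
    for a in range(0, 180, step_deg):
        for b in range(0, 180, step_deg):
            ra, rb = math.radians(a), math.radians(b)
            # Plane normal parameterized as in Eqn.(5), with r = 1.
            n = (math.cos(ra), math.sin(ra) * math.cos(rb), math.sin(ra) * math.sin(rb))
            votes[(a, b)] = sum(
                1 for p in points
                if abs(p[0] * n[0] + p[1] * n[1] + p[2] * n[2]) < tol
            )
    peak = max(votes.values())
    # Note: at a = 0 every b gives the same normal, so that plane would be
    # reported once per b; a fuller implementation would deduplicate.
    return sorted(cell for cell, v in votes.items() if v > 0 and v >= peak / 3)


# Critical points on the great circle z = 0, whose plane normal is (0, 0, 1).
points = [(math.cos(2 * math.pi * i / 60), math.sin(2 * math.pi * i / 60), 0.0)
          for i in range(60)]
boundaries = hough_great_circles(points)
```

    With noisy data several neighboring cells can pass the threshold, which is one way spurious boundaries arise.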
    Although critical points provide information to deter-
    mine the light source directions [31], they are relatively sen-
    sitive to noisy data. Since most real images are not noise
    free, if we only use the Hough transform to extract criti-
    cal boundaries, we will very likely find more boundaries
    than the real critical boundaries. Noise can either introduce
    many spurious critical points or move the detected critical
    points away from their true positions. However, non-critical
    point areas are less sensitive to noise and provide important
    information to determine the light source directions.
    Definition 3 Critical boundaries will segment the whole
    sphere image into several regions, and intuitively, each
    segmented region corresponds to one virtual light. Each
    region is called a virtual light patch.
    Once we get the patches corresponding to each virtual
    light, the directions of virtual light sources can be calcu-
    lated.
    Let A, B, C and D be four points in a patch corresponding
    to one virtual light source and $n_A$, $n_B$, $n_C$ and $n_D$ be
    their normals respectively. From the Lambertian Eqn.(1),
    augmented by an ambient light term, we have
    $$\begin{pmatrix} n_{Ax} & n_{Ay} & n_{Az} & 1 \\ n_{Bx} & n_{By} & n_{Bz} & 1 \\ n_{Cx} & n_{Cy} & n_{Cz} & 1 \\ n_{Dx} & n_{Dy} & n_{Dz} & 1 \end{pmatrix} \cdot \begin{pmatrix} L_x \\ L_y \\ L_z \\ \alpha \end{pmatrix} = \begin{pmatrix} I_A \\ I_B \\ I_C \\ I_D \end{pmatrix} \qquad (6)$$
    where $I_A$, $I_B$, $I_C$ and $I_D$ are the brightness of each pixel in the
    source image corresponding to the four points A, B, C and D
    respectively.
    If $n_A$, $n_B$, $n_C$ and $n_D$ are non-coplanar, we can obtain
    the direction of the corresponding virtual light source
    L, $[L_x, L_y, L_z]^T$, and the ambient light intensity $\alpha$ by solving
    the system of equations in (6). Ideally, we would solve
    for the directions of virtual light sources by using four
    non-coplanar points from corresponding patches. Due to computation
    and rounding errors, four non-coplanar points are
    not always enough for us to get a numerically robust estimate
    of the direction of a virtual light source. Furthermore,
    we cannot always find several
    non-coplanar points in an interval of an arc in some plane
    P as described above. These problems are avoided by
    scanning the image both horizontally and vertically instead
    of in one direction only and recovering the two dimensional
    patches that are separated by critical boundaries. Then from
    each two dimensional patch, we use the internal non-critical
    points of each virtual light patch to solve for the direction
    of the virtual light source (see footnote 2).
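    The overdetermined version of Eqn.(6) can be sketched as below: every interior point of a patch contributes one row (normal, 1), and the four unknowns (the virtual light vector and the ambient term) are obtained from the normal equations. Solving through the normal equations with a small Gaussian elimination is a choice made here to keep the example dependency-free; the paper only states that standard least-squares methods are used. Function names are illustrative.

```python
def solve_linear(m, rhs):
    """Solve m x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(m, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normals are (nearly) coplanar")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def estimate_virtual_light(normals, intensities):
    """Least-squares solution of Eqn.(6) for any number (>= 4) of points."""
    rows = [[nx, ny, nz, 1.0] for nx, ny, nz in normals]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    atb = [sum(r[i] * e for r, e in zip(rows, intensities)) for i in range(4)]
    lx, ly, lz, ambient = solve_linear(ata, atb)
    return (lx, ly, lz), ambient


# Five lit points of one patch, generated from a known light and ambient term.
normals = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
           (0.6, 0.0, 0.8), (0.0, 0.6, 0.8)]
true_light, true_ambient = (0.2, 0.3, 0.9), 0.1
intensities = [sum(l * n for l, n in zip(true_light, nrm)) + true_ambient
               for nrm in normals]
light, ambient = estimate_virtual_light(normals, intensities)
```

    Using more than four points averages out pixel noise, which is the reason for working with whole patches instead of four hand-picked points.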
    3.3. Recovering the True Lights
    Proposition 5 If a critical boundary separates a region
    into two virtual light patches with one virtual light each,
    e.g. $L_1$, $L_2$, then the difference vector between $L_1$ and $L_2$,
    $L_{pre} = L_1 - L_2$, is called the real light pre-direction with
    respect to this critical boundary. Since we have already
    assumed that there are no antipodal light sources (i.e. opposite
    direction light sources), the real light direction will
    be either the pre-direction $L_1 - L_2$, or its opposite $L_2 - L_1$
    (Fig.5).
    Figure 5. Illustration of real light pre-direction. $L_r$ is the
    real light direction, either $L_r = L_1 - L_2$ or $L_r = L_2 - L_1$.
    Footnote 2: We only use points that are at least 2 pixels away from the
    critical boundary for increased robustness to noise.
    To find out the true directions, we pick a number of
    points on the surface, e.g. $P_1, P_2, \ldots, P_k$ and their normals,
    e.g. $N_1, N_2, \ldots, N_k$, then the true directions will be the
    solution of:
    $$E(P_j) = \sum_{i \in \Lambda} \max(e_i L_i \cdot N_j, 0) + L_v \cdot N_j, \quad 1 \le j \le k. \qquad (7)$$
    where Lv is the virtual light source of a possible frontal
    illuminant whose critical boundaries could not be detected
    and will be checked as a special case.
    Selecting points in the area inside the critical boundaries
    is a robust way to detect real lights. This can be done using
    standard least-squares methods [5, 18].
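    One way to read Eqn.(7) is as a search over the signs of the pre-directions. The sketch below assumes that each $e_i$ is either +1 or -1 (the text does not define $e_i$ explicitly, so this reading is an assumption), that the frontal term $L_v$ is known, and that an exhaustive search over sign combinations is affordable for a handful of lights. It is an illustration of the idea, not the least-squares procedure cited in the paper.

```python
from itertools import product


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def resolve_signs(pre_directions, normals, observed, frontal=(0.0, 0.0, 0.0)):
    """Pick the sign of every pre-direction that best explains the data.

    pre_directions: difference vectors L1 - L2, one per critical boundary
    normals, observed: surface normals N_j and measured radiances E(P_j)
    frontal: assumed-known virtual frontal light L_v of Eqn.(7)
    Returns (signs, squared_error).
    """
    best_signs, best_error = None, float("inf")
    for signs in product((1, -1), repeat=len(pre_directions)):
        error = 0.0
        for n, e_obs in zip(normals, observed):
            predicted = dot(frontal, n)
            for s, light in zip(signs, pre_directions):
                predicted += max(s * dot(light, n), 0.0)
            error += (predicted - e_obs) ** 2
        if error < best_error:
            best_signs, best_error = signs, error
    return best_signs, best_error


# Two true lights; the first pre-direction was recovered with the wrong sign.
light_a, light_b = (1.0, 0.0, 1.0), (0.0, -1.0, 1.0)
normals = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
           (0.0, -1.0, 0.0), (0.6, 0.0, 0.8), (-0.6, 0.0, 0.8)]
observed = [max(dot(light_a, n), 0.0) + max(dot(light_b, n), 0.0) for n in normals]
signs, error = resolve_signs([(-1.0, 0.0, -1.0), light_b], normals, observed)
```

    The cost grows as 2 to the number of lights, so the exhaustive search is only reasonable for the small light counts considered here.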
    After we find all the potential critical boundaries, Proposition
    5 provides a way to extract real lights by calculating
    the light difference vector of the two virtual light patches on
    either side of the critical boundary. However, one real
    light might be calculated many times by different virtual
    light patch pairs, and since our data will not be perfect, the
    results will not necessarily be exactly the same vector. We introduce
    an angle threshold to cluster the resulting light difference
    vectors into real light groups, each of which can be approximated by
    one vector.
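    The angle-threshold grouping can be sketched with a simple greedy pass: each difference vector joins the first group whose current mean lies within the threshold angle, otherwise it starts a new group. The greedy order-dependent strategy, the 10-degree default and the use of the mean as the representative vector are assumptions for illustration; the paper does not specify the clustering procedure.

```python
import math


def angle_between(u, v):
    """Angle in degrees between two non-zero 3D vectors."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_a = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))


def cluster_lights(vectors, max_angle_deg=10.0):
    """Group light difference vectors and return one mean vector per group."""
    clusters = []  # each cluster is the list of its member vectors
    for v in vectors:
        for members in clusters:
            mean = [sum(c) / len(members) for c in zip(*members)]
            if angle_between(v, mean) <= max_angle_deg:
                members.append(v)
                break
        else:
            # No existing group is close enough: start a new one.
            clusters.append([v])
    return [tuple(sum(c) / len(m) for c in zip(*m)) for m in clusters]


# Four noisy estimates that belong to two real lights.
estimates = [(1.0, 0.0, 0.0), (0.99, 0.05, 0.0), (0.0, 1.0, 0.0), (0.02, 0.98, 0.0)]
real_lights = cluster_lights(estimates)
```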
    By minimizing the least-squares errors of virtual light
    patches, we are able to merge the spurious critical bound-
    aries detected by the Hough transform, by the following
    steps (for an example see Fig.10):
    1. Find initial critical boundaries by Hough transform
    based on all detected critical points.
    2. Adjust critical boundaries. We adjust every critical
    boundary by moving it by a small step, and a reduc-
    tion in the least-squares error indicates a better solu-
    tion. We keep updating boundaries using a “greedy”
    algorithm in order to minimize the total error.
    3. Merge spurious critical boundaries. If two critical
    boundaries are closer than a threshold angle
    Tmergeangle (e.g. 5 degrees), they can be replaced by
    their average, resulting in one critical boundary instead
    of two.
    4. Remove spurious critical boundaries. We test every
    critical boundary, by removing it temporarily and if the
    least-squares error does not increase, we can consider
    it a spurious boundary and remove it completely. We
    test boundaries in increasing order of Hough transform
    votes (intuitively we test first boundaries that are not as
    trustworthy).
    5. Calculate the real lights along a boundary by subtract-
    ing neighboring virtual lights as described in Proposi-
    tion 5.
    3.4. Arbitrary Shape
    In this section we extend our method to work with
    any object of known shape. Obviously, there should ex-
    ist enough non-coplanar points on the object illuminated by
    each light to allow for a robust least-squares solution. We
    assume no inter-reflections. We map the image intensity
    of each point Pi of the arbitrary shape to a point Si of a
    sphere, so that the normal at Pi is the same as the normal at
    Si. We detect all potential critical points based on the points
    mapped on the sphere. As expected, not every point on the
    surface of the sphere corresponds to a normal on
    the surface of the arbitrary shape, so there will be many
    holes on the mapped sphere, e.g. the black area in Fig.6.
    Thus, many critical points’ locations will be erroneously
    calculated even for noise-free data. Consequently, the critical
    boundaries calculated by the Hough transform based on
    these critical points might be incorrect, and in some cases
    even far from their correct positions. Since we cannot
    recover these missing data from the original image, it is
    impossible to adjust the critical boundaries detected by the
    Hough transform itself. On the other hand, as long as the
    critical boundaries are not too far from the truth, the majority
    of the points in a virtual patch will still correspond to the
    correct virtual light (especially after the adjustment steps
    described in Sec. 3.3). Thus it is still possible, using sparse
    points on the sphere, to calculate the true light for each vir-
    tual light patch based on Proposition 5. If two points have
    the same normal but different intensities, we use the brighter
    one (assuming that the other is in shadow).
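    The sphere mapping with the "keep the brighter sample" rule can be sketched as follows. Normals are treated as equal when their components agree after rounding to two decimals; this quantization, the dictionary representation of the sparse sphere and the function name are assumptions made for the example.

```python
def map_to_sphere(samples, decimals=2):
    """Map (normal, intensity) samples of an arbitrary shape onto a sphere.

    samples: iterable of ((nx, ny, nz), intensity) with unit normals
    Returns a dict from quantized normal to intensity. Normals that never
    occur on the object are simply absent (the holes of the mapped sphere).
    """
    sphere = {}
    for normal, intensity in samples:
        key = tuple(round(c, decimals) for c in normal)
        # Same normal seen twice: keep the brighter value, assuming the
        # darker one lies in a cast shadow.
        if key not in sphere or intensity > sphere[key]:
            sphere[key] = intensity
    return sphere


samples = [((0.0, 0.0, 1.0), 0.2), ((0.0, 0.0, 1.0), 0.7), ((1.0, 0.0, 0.0), 0.4)]
sphere = map_to_sphere(samples)
```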
    Figure 6. Vase and its sphere mapping. Both image sizes
    are 400 by 400. Black points on the sphere represent nor-
    mals that do not exist on the vase’s surface.
    4. Shadow-Based Illuminant Detection
    Besides the shading information we explored above, a
    picture of a real scene is very likely to contain some shadow
    information. Hence the illumination distribution of the
    scene might also be recovered from a radiance distribution
    inside shadows cast by an object of known shape onto another
    object surface of known shape and reflectance. In
    [24], the illumination distribution of a scene is approximated
    by discrete sampling of an extended light source and
    the whole distribution is represented as a set of point sources
    equally distributed in the scene as shown in Fig.7. The total
    irradiance E at the shadow surface received from the entire
    illumination distribution is computed by
    $$E = \sum_{i=1}^{n} L_i S_i \cos\theta_i \qquad (8)$$
    where $L_i$ (i = 1, 2, . . . , n) is the illumination radiance per
    solid angle $\omega = 2\pi/n$ coming from the direction $(\theta_i, \phi_i)$,
    and $S_i$ are occlusion coefficients: $S_i = 0$ if $L_i$ is occluded
    by objects, and $S_i = 1$ otherwise. With this approximation,
    each image pixel inside shadows provides a linear
    equation in the unknown radiances of those sources, as shown
    in Fig.8 [25].
    Figure 7. Illumination distribution of a scene is approximated
    by discrete sampling over the entire surface of the
    extended light source.
    Figure 8. Each shadow pixel provides a linear equation
    for estimating illumination distribution by shadows.
    Finally, a set of linear equations (Eqn. 9) is derived from
    the brightness changes observed in the shadow image and
    solved for unknown Li’s.
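    $$\begin{pmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\ \vdots & & & & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} \end{pmatrix} \cdot \begin{pmatrix} L_1 \\ L_2 \\ L_3 \\ \vdots \\ L_n \end{pmatrix} = \begin{pmatrix} P_1 \\ P_2 \\ P_3 \\ \vdots \\ P_m \end{pmatrix} \qquad (9)$$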
    Under the assumption of the Lambertian model, the
    BRDF $f(\theta_i, \phi_i; \theta_e, \phi_e)$ for a Lambertian surface is known
    to be a constant. Then, in Eqn.9, the coefficients $a_i$ (i =
    1, 2, . . . , n) of each row represent $K_d \cos\theta_i S_i$, where $K_d$ is a diffuse
    reflection parameter of the surface. Therefore, by selecting a
    sufficiently large number of image pixels, it is possible to
    solve for the unknown $L_i$’s.
    To estimate the illumination distribution of a real scene,
    we need to assume that the number of image pixels in shad-
    ows is far larger than the number of illumination radiance
    values to be calculated.
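The over-determined system of Eqn. 9 (m shadow pixels, n unknown radiances, m larger than n) can be illustrated with an ordinary least-squares solve. The sketch below is only an illustration under stated assumptions: it uses the normal equations with Gaussian elimination, a planar shadow surface so that cos(theta_i) is the same for every pixel, and made-up values for K_d, the occlusion pattern, and the radiances. The paper does not specify which solver is used.

```python
def solve_normal_equations(A, P):
    """Least-squares solution of A L = P via (A^T A) L = A^T P."""
    m, n = len(A), len(A[0])
    # Augmented matrix [A^T A | A^T P].
    M = []
    for i in range(n):
        row = [sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
        row.append(sum(A[k][i] * P[k] for k in range(m)))
        M.append(row)
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("rank deficient: more shadow pixels are needed")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    L = [0.0] * n
    for i in range(n - 1, -1, -1):
        residual = M[i][n] - sum(M[i][j] * L[j] for j in range(i + 1, n))
        L[i] = residual / M[i][i]
    return L


K_D = 0.8                       # assumed diffuse reflection parameter
COS_THETA = [0.5, 1.0]          # cos(theta_i) for n = 2 sampled directions
OCCLUSION = [[1, 1], [1, 0], [0, 1], [0, 0]]  # S_i per shadow pixel (m = 4)
TRUE_L = [2.0, 3.0]             # radiances used to synthesize the pixel values

# a_ji = K_d * cos(theta_i) * S_i at pixel j; P_j is the observed brightness.
A = [[K_D * c * s for c, s in zip(COS_THETA, row)] for row in OCCLUSION]
P = [sum(a * l for a, l in zip(row, TRUE_L)) for row in A]
estimated = solve_normal_equations(A, P)
print(estimated)
```

With noise-free synthetic data the estimate reproduces the radiances used to build P. In practice a library routine such as `numpy.linalg.lstsq` would be the more usual choice than hand-written elimination.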
    5. Integration of Shading and Shadows
    In this section, we propose a framework that
    combines the respective advantages of shading and shadow
    information, yielding better results than
    using either source independently.
    5.1. Advantages of Shadows over Shading
    As can be seen from Fig.6, arbitrary shapes do not al-
    ways provide enough normals on the surface to make a com-
    plete sphere mapping, so many data points on the sphere
    will be missing non-uniformly. Consequently, there is a
    possibility that some critical boundaries will be lost and the
    corresponding real lights will not be estimated. Fig.12 is
    an example of a synthetic vase image whose top part is lit
    by two directional light sources. From the sphere mapping
    (Fig.12(b)), it is clear that very few of the object’s normals
    map to the top part of the sphere and so not enough critical
    points can be detected. However, shadow information can
    be used to estimate the intensity and direction of each light
    source.
    5.2. Advantages of Shading over Shadows
    Recovering the illumination distribution of the
    scene from the radiance distribution inside shadows requires complete
    shadows cast by an object of known shape onto another object
    surface of known shape and reflectance.
    However, this might not be possible in situations where the
    light direction is nearly parallel to the surface. In
    this case shadows cannot provide enough information to estimate
    the real illuminants: the azimuth of the
    light source can still be estimated reliably, but not the elevation.
    An experiment showed that in this situation large errors
    are introduced into the illumination distribution estimated
    from shadow information only (Fig.12(d-e)). Furthermore, in
    the method proposed in [24], a large number of samples is
    needed to capture the rapid change of radiance distribution
    around a direct light source. The radiance distribution inside a
    direct light source has to be sampled densely, and the estimation
    becomes more stable if we observe the difference between
    the radiance of two shadow regions for each light source:
    one illuminated and the other not illuminated. Therefore,
    due to the discrete sampling of the geodesic dome, it is very
    likely that one directional light will be represented by several
    adjacent sampling solid angles, so the precision of the estimation
    will also be limited. In the following sections, a
    region on the geodesic dome in [24] composed of adjacent
    sampling solid angles whose estimated illuminant intensity
    is not close to zero will be referred to as an illumination region.
    In Fig.9, we can see that the illumination distribution
    estimated from shading information is more accurate
    than the one estimated from shadows.
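The notion of an illumination region, a group of adjacent sampling solid angles whose estimated intensity is not close to zero, can be sketched as a connected-component search. The adjacency table `neighbors`, the threshold `eps`, and the six-cell ring used as a stand-in for the geodesic dome are all hypothetical; the paper does not describe how regions are extracted.

```python
def illumination_regions(intensity, neighbors, eps=1e-3):
    """Group adjacent dome cells with non-negligible intensity into regions."""
    active = {i for i, value in enumerate(intensity) if abs(value) > eps}
    regions, seen = [], set()
    for start in sorted(active):
        if start in seen:
            continue
        # Depth-first search over active neighbours of this cell.
        stack, region = [start], []
        seen.add(start)
        while stack:
            cell = stack.pop()
            region.append(cell)
            for nb in neighbors[cell]:
                if nb in active and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        regions.append(sorted(region))
    return regions


# Six cells arranged in a ring stand in for the dome sampling.
neighbors = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
intensity = [0.9, 0.4, 0.0, 0.0, 0.7, 0.0]
regions = illumination_regions(intensity, neighbors)
print(regions)
```

In this toy case cells 0 and 1 form one region, which mirrors how a single directional light can spread over several adjacent sampling solid angles, while cell 4 forms a second region.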
    5.3. Shading and Shadows
    Definition 4. A shadow is called a complete shadow when
    all the parts of the scene the shadow falls on are visible. The
    outermost edge of a complete shadow corresponding to a directional
    light source is generated by the occluding boundary
    of the object surface.
    Fig.11 shows that the occluding boundary of a smooth
    surface will be a critical boundary in the context of shading.
    Consequently, when there is information both from shad-
    ing and from shadows, we can use the shadow information
    to give us an initial estimate of the directions of the light
    sources, and then we can use the shading information to re-
    fine it to compute the directions and intensities of the real
    light sources.
    In order to incorporate shadow information, we augment
    the algorithm of Sec. 3.3 by steps 2, 3 and 4:
    1. Find initial critical boundaries by Hough transform
    based on all detected critical points.
    2. Calculate an initial illumination distribution using the
    estimation from shadows [24]. Mark directions on
    the geodesic dome, for which possible shadows are
    not complete or observable due to occlusions, as ‘ex-
    cluded’.
    3. For each critical boundary, if its pre-direction is in a
    ‘non-excluded’ solid angle whose illuminant intensity
    is close to zero, consider it a spurious critical boundary
    and reject it. Otherwise mark the illumination region
    on the geodesic dome, containing this solid angle, as
    ‘registered’.
    4. For each ‘non-registered’ illumination region, add a
    critical boundary whose pre-direction is close to the
    direction determined by the peak center of this region
    as an initial critical boundary.
    5. Adjust critical boundaries. We adjust every critical
    boundary by moving it by a small step, and a reduc-
    tion in the least-squares error indicates a better solu-
    tion. We keep updating boundaries using a “greedy”
    algorithm in order to minimize the total error.
    6. Merge spurious critical boundaries. If two critical
    boundaries are closer than a threshold angle
    Tmergeangle (e.g. 5 degrees), they can be replaced by
    their average, resulting in one critical boundary instead
    of two.
    7. Remove spurious critical boundaries. We test every
    critical boundary by removing it temporarily; if the
    least-squares error does not increase, we consider
    it a spurious boundary and remove it completely. We
    test boundaries in increasing order of Hough transform
    votes (intuitively, we first test the boundaries that are
    least trustworthy).
    8. Calculate the real lights along a boundary by subtract-
    ing neighboring virtual lights as described in Proposi-
    tion 5.
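The merging rule of step 6 can be sketched as follows. As an assumption made only for this illustration, each critical boundary is represented by a single unit 3-vector, two boundaries are compared by the angle between their vectors, and the average is the normalized sum; the names `merge_close_boundaries` and `t_merge_deg` are hypothetical. The sketch does not treat a vector and its negation as the same boundary.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(a * a for a in v))
    return tuple(a / norm for a in v)


def angle_deg(u, v):
    """Angle in degrees between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


def merge_close_boundaries(directions, t_merge_deg=5.0):
    """Repeatedly replace any pair closer than the threshold by its average."""
    merged = [normalize(d) for d in directions]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if angle_deg(merged[i], merged[j]) < t_merge_deg:
                    summed = tuple(a + b for a, b in zip(merged[i], merged[j]))
                    rest = [d for k, d in enumerate(merged) if k not in (i, j)]
                    merged = rest + [normalize(summed)]
                    changed = True
                    break
            if changed:
                break
    return merged


three_deg = math.radians(3.0)
boundaries = [
    (1.0, 0.0, 0.0),
    (math.cos(three_deg), math.sin(three_deg), 0.0),  # 3 degrees from the first
    (0.0, 1.0, 0.0),                                  # 90 degrees from the first
]
result = merge_close_boundaries(boundaries, t_merge_deg=5.0)
print(len(result))
```

The first two boundaries are 3 degrees apart and collapse into one at 1.5 degrees from each, while the third is left untouched.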
    Figure 9. (a) A synthetic vase illuminated by three directional light sources. (b) Estimated illumination distribution using the
    shadow information only. (c) Error image generated by illumination distribution estimated in (b). (d) Detected critical boundaries
    using the shading information only. (e) Error image generated by illumination distribution estimated in (d).
    Figure 10. Real sphere image: an almost Lambertian rubber ball with five light sources. Image size: 456x456. (a) the original
    image, (b) the generated image of a Lambertian ball with the five light sources extracted from (a), (c) the error image: darker color
    means higher error, (d) the initial eight boundaries and virtual light patches extracted by the Hough transform, and (e) the resulting
    critical boundaries and virtual light patches calculated by our algorithm; three out of the initial eight boundaries were automatically
    merged and the locations of the other five boundaries were automatically adjusted.
    
    Figure 11. Outline of estimating illumination distribution
    by shadows. (Diagram labels: directional light source, shadow, critical boundary.)
    Step 3 significantly reduces the number of spurious critical
    boundaries to be processed in steps 6 and 7, which are the most
    time-consuming steps of the method. In our experiments
    this amounts to a 30% speed-up (or more in the case of
    noisy data with many spurious boundaries).
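The filtering performed in step 3, and the bookkeeping that step 4 relies on, can be sketched as below. The data layout is an assumption made for illustration: each boundary is reduced to the index of the dome cell containing its pre-direction, `excluded` holds the cells marked in step 2, and `regions` lists the illumination regions as groups of cell indices. None of these names come from the paper.

```python
def filter_boundaries(boundary_cells, intensity, excluded, regions, eps=1e-3):
    """Return (indices of kept boundaries, indices of non-registered regions)."""
    region_of = {cell: r for r, cells in enumerate(regions) for cell in cells}
    kept, registered = [], set()
    for b, cell in enumerate(boundary_cells):
        if cell not in excluded and abs(intensity[cell]) <= eps:
            # Shadows are observable here and show no light: spurious boundary.
            continue
        kept.append(b)
        if cell in region_of:
            registered.add(region_of[cell])
    unregistered = [r for r in range(len(regions)) if r not in registered]
    return kept, unregistered


intensity = [0.9, 0.4, 0.0, 0.0, 0.7, 0.0]  # shadow-based estimate per dome cell
regions = [[0, 1], [4]]                     # illumination regions
excluded = {3}                              # shadows not observable for cell 3
boundary_cells = [1, 2, 3]                  # cell of each boundary's pre-direction

kept, unregistered = filter_boundaries(boundary_cells, intensity, excluded, regions)
print(kept, unregistered)
```

Here the boundary in cell 2 is rejected, the boundary in the excluded cell 3 is kept because shadows cannot confirm or refute it, and region 1 remains non-registered, so step 4 would add an initial boundary near its peak.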
    6. Mixed Reality Image Synthesis
    The combination of shading and shadow information
    can provide better estimation of illumination distribution.
    These estimates can be used to synthesize Mixed Reality
    images, i.e. real images with correctly lit superimposed virtual
    objects. Furthermore, under the assumptions
    of Lambertian BRDF and known geometry, we can re-render
    the real images to generate new images by modifying
    the estimated illumination configuration. These abilities
    are demonstrated by the following real image experiments.
    Based on a scene containing two rubber toys illuminated by
    three light sources, we generated a new image in which one
    light has been switched off (Fig.13(b)), which can be compared
    with a real image of the scene with the same light
    truly switched off (Fig.13(c)). In the generated image we superimpose
    a synthetic object with correct shading and cast shadows
    (Fig.13(e)). The original image and 3D geometry were
    captured by the range scanner system described in [8]. In
    Fig.13(a) we can see that there are some inaccuracies and
    noise in the recovered 3D shape. The original image is
    1534 x 1024 pixels with the two toys at the center of the image.
    To demonstrate the ability of our algorithm to use only
    partial scene information for accurate estimation, only the
    duck toy was used to estimate the illuminant directions. The
    second toy is used for visual evaluation of the results. Based
    on the size of the duck, the diameter of the mapping sphere
    is 400 pixels. The following parameter values were chosen for the
    algorithm: sliding window width w = 30 pixels (approximately
    13.5 degrees), distance ratio Tratio = 0.5 and angle
    threshold for boundary merging (described in Sec. 3.3)
    Tmergeangle = 5.0 degrees.
    Notice that the error of the generated image is mostly
    located along the edges of surfaces and shadows. This
    is because (1) the range-scanned 3D shape in Fig.13(a)
    has higher levels of estimation noise near the edges and
    the inter-reflections between the object and the table were
    not modeled, and (2) the simple rendering program used
    does not perfectly simulate the shadowing effects of the
    real lights. The remaining noise in the generated image in
    Fig.13(b) is due to inaccuracies in shape estimation and
    violations of the Lambertian assumption. Nonetheless, illuminant
    estimation is still possible. In Fig.13(h-k) we display
    the results of the various steps of the algorithm in Sec. 5.3.
    7. Conclusions and Future Work
    In this paper we presented a method for the estimation
    of multiple illuminant directions from a single image,
    incorporating shadow and shading information. We demonstrated
    how information from each source enhances the
    information from the other source. We do not require the imaged
    scene to be of any particular geometry (e.g. a sphere).
    This allows our method to be used with the existing scene
    geometry, without the need for special light probes, when
    the illumination of the scene consists of directional light
    sources. Experiments on synthetic and real data show that
    the method is robust to noise, even when the surface is not
    completely Lambertian. We applied the results of our method
    to generate Mixed Reality images by successfully modifying
    scene illumination and seamlessly re-rendering, including
    superimposed synthetic objects. Future work includes
    studying the properties of arbitrary surfaces (so that
    we can avoid the intermediate sphere mapping), speeding
    up the least-squares method, and extending the method to
    non-Lambertian diffuse reflectance for rough surfaces.
    References
    [1] R. Basri and D. Jacobs. Lambertian reflectance and linear
    subspaces. ICCV01, pages 383–390.
    [2] W. Chojnacki, M. J. Brooks, and D. Gibbins. Can the sun’s
    direction be estimated prior to the determination of shapes?
    Australian Jnt. Conf. on A.I. 94, pages 530–535.
    [3] P. Debevec. Rendering synthetic objects into real scenes.
    SIGGRAPH98, pages 189–198.
    [4] P. Debevec, T. Hawkins, C. Tchou, H.P. Duiker, W. Sarokin,
    and M. Sagar. Acquiring the reflectance field of a human
    face. In SIGGRAPH00, pages 145–156.
    [5] S. Haykin. Adaptive Filter Theory. Prentice Hall, Englewood
    Cliffs, NJ, 1986.
    [6] B.K.P. Horn and M.J. Brooks. Shape and source from shad-
    ing. In IJCAI85, pages 932–936.
    [7] D.R. Hougen and N. Ahuja. Estimation of the light source
    distribution and its use in integrated shape recovery from
    stereo and shading. ICCV93, pages 29–34.
    [8] Q.Y. Hu. 3-D Shape Measurement Based on Digital Fringe
    Projection and Phase-Shifting Techniques. PhD thesis, M.E.
    Dept., SUNY at Stony Brook, 2001.
    [9] T. Kim, Y.D. Seo, and K.S. Hong. Improving AR using shadows
    arising from natural illumination distribution in video
    sequences. In ICCV01, pages II: 329–334.
    [10] M.J. Langer and S.W. Zucker. What is a light source? In
    CVPR97, pages 172–178.
    [11] C.H. Lee and A. Rosenfeld. Improved methods of estimating
    shape from shading using the light source coordinate system.
    AI85, 26(2):125–143.
    [12] H.Y. Lin and M. Subbarao. A vision system for fast 3D model
    reconstruction. In CVPR01, pages 663–668.
    [13] C. Loscos, M.C. Frasson, G. Drettakis, B. Walter, X. Granier,
    and P. Poulin. Interactive virtual relighting and remodeling
    of real scenes. In 10th Eurographics Workshop on Rendering,
    1999.
    [14] S.R. Marschner and D.P. Greenberg. Inverse lighting for photography.
    In Fifth Color Imaging Conference 97, pages 262–265.
    [15] S.R. Marschner, S.H. Westin, E.P.F. Lafortune, and K.E. Torrance.
    Image-based BRDF measurement. In Applied Optics00,
    39(16):2592–2600.
    [16] A.P. Pentland. Finding the illuminant direction. JOSA82,
    72:448–455.
    [17] M.W. Powell, S. Sarkar, and D. Goldgof. A simple strat-
    egy for calibrating the geometry of light sources. PAMI01,
    23(9):1022–1027.
    [18] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery.
    Numerical Recipes in C. Cambridge University Press,
    1992.
    [19] R. Ramamoorthi and P. Hanrahan. An efficient representa-
    tion for irradiance environment maps. SIGGRAPH01, pages
    497–500.
    [20] R. Ramamoorthi and P. Hanrahan. A signal-processing
    framework for inverse rendering. SIGGRAPH01, pages 117–
    128.
    [21] D. Samaras and D. Metaxas. Coupled lighting direction and
    shape estimation from single images. In ICCV99, pages 868–
    874.
    [22] D. Samaras, D. Metaxas, P. Fua, and Y. Leclerc. Variable
    albedo surface reconstruction from stereo and shape from
    shading. In CVPR00, pages I:480–487.
    [23] D. Samaras, D. Metaxas, P. Fua, and Y. Leclerc. Variable
    albedo surface reconstruction from stereo and shape from
    shading. In CVPR00, pages I:480–487.
    [24] I. Sato, Y. Sato, and K. Ikeuchi. Illumination distribution
    from brightness in shadows. In ICCV99, pages 875–883.
    [25] I. Sato, Y. Sato, and K. Ikeuchi. Stability issues in recover-
    ing illumination distribution from brightness in shadows. In
    CVPR01, pages 400–407.
    [26] Y. Sato, M.D. Wheeler, and K. Ikeuchi. Object shape and
    reflectance modeling from observation. Computer Graphics
    97, 31:379–388.
    [27] Y. Wang and D. Samaras. Estimation of multiple illumi-
    nants from a single image of arbitrary known geometry. In
    ECCV02, page III: 272 ff., 2002.
    [28] Y. Yang and A. Yuille. Sources from shading. CVPR91,
    pages 534–539.
    [29] Y. Yu, P. Debevec, J. Malik, and T. Hawkins. Inverse global
    illumination: Recovering reflectance models of real scenes
    from photographs. In SIGGRAPH99, pages 215–224.
    [30] Y. Yu and J. Malik. Recovering photometric properties of
    architectural scenes from photographs. In SIGGRAPH98,
    pages 207–217.
    [31] Y. Zhang and Y.H. Yang. Illuminant direction determination
    for multiple light sources. CVPR00, pages 269–276 vol.1.
    [32] Q. Zheng and R. Chellappa. Estimation of illuminant direc-
    tion, albedo, and shape from shading. PAMI91, 13(7):680–
    702.
    Figure 12. (a) A top lit synthetic vase. (b) One critical boundary (in red color) was missing when using shading information only.
    (c) Yellow areas are illuminant estimates using shadow information. Red points represent the true light directions, green points the
    estimates of the integrated method. (d) A synthetic vase with partial shadows. (e) Yellow areas are illuminant estimates from (d)
    using shadow information with 4 degrees average angle error and high error of illuminant intensity. Red points represent the true
    light directions, green points the estimates of the integrated method.
    Figure 13. Real arbitrary shape image experiment: a scene illuminated by three light sources. Image size: 1534x1024 (scene),
    400x400 (mapping sphere). (a) the original image, (b) the generated image of a scene with the two light sources extracted from (a),
    (c) the real image of the scene illuminated by the two real lights, (d) the error image: darker color means higher error. The noise
    in the generated image is mainly due to the inaccuracies in the estimation of shape and the edges of each shadow. Nonetheless
    illuminant estimation is still possible, (e) a synthetic object is superimposed into the generated image (b), (g) the 3D shape of the
    two objects’ frontal surfaces, R,G,B color values represent the x,y,z components of the normal, (h) the distribution of illuminants
    estimated by the shadow information. Notice that for each directional light source there is more than one solid angle with non-zero
    intensity corresponding to it, (i) the initial eight boundaries extracted by the Hough transform, (j) the remaining five boundaries after
    adding the shadow information, and (k) the resulting critical boundaries calculated by our algorithm: two of the five boundaries in
    (j) were automatically removed and the locations of the other three boundaries were automatically adjusted.