2009. 3. 26. 19:56 Computer Vision

Inverse Depth Parametrization for Monocular SLAM
Civera, J.   Davison, A.J.   Montiel, J. 


This paper appears in: Robotics, IEEE Transactions on
Publication Date: Oct. 2008
Volume: 24,  Issue: 5
On page(s): 932-945
ISSN: 1552-3098
INSPEC Accession Number: 10301459
Digital Object Identifier: 10.1109/TRO.2008.2003276
First Published: 2008-10-03
Current Version Published: 2008-10-31

Javier Civera, Departamento de Informática e Ingeniería de Sistemas, Universidad de Zaragoza

Andrew J. Davison, Reader in Robot Vision at the Department of Computing, Imperial College London

Jose Maria Martinez Montiel, Robotics and Real Time Group, Universidad de Zaragoza




monocular simultaneous localization and mapping  (SLAM)

representation of uncertainty

the standard extended Kalman filter (EKF)

direct parametrization of the inverse depth of features

feature initialization

camera motion estimates

6-D state vector --> converted to the Euclidean XYZ form

linearity index => automatic detection and conversion to maintain maximum efficiency
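The 6-D inverse-depth point and its conversion to Euclidean XYZ can be sketched as follows. This is a minimal sketch, assuming an azimuth/elevation convention for the ray direction vector; the numeric values are illustrative only:

```python
import numpy as np

def inverse_depth_to_xyz(y):
    """Convert a 6-D inverse-depth feature (x, y, z, theta, phi, rho)
    to a Euclidean XYZ point.

    (x, y, z) is the camera centre from which the feature was first
    observed, (theta, phi) are azimuth/elevation angles of the viewing
    ray, and rho is the inverse depth along that ray.
    """
    x0, y0, z0, theta, phi, rho = y
    # Unit direction of the ray (angle convention assumed here).
    m = np.array([np.cos(phi) * np.sin(theta),
                  -np.sin(phi),
                  np.cos(phi) * np.cos(theta)])
    return np.array([x0, y0, z0]) + m / rho

# A feature 10 m straight ahead along the optical (z) axis:
p = inverse_depth_to_xyz([0.0, 0.0, 0.0, 0.0, 0.0, 0.1])
```

Note that as rho -> 0 the point recedes toward infinity along the ray, which is exactly why this form can represent very distant features that plain XYZ cannot.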



I. Introduction


monocular camera
: projective sensor measuring the bearing of image features

monocular (adj): of, relating to, or using only one eye

A stereo camera is a type of camera with two or more lenses. This allows the camera to simulate human binocular vision.

structure from motion = SFM
1) feature matching
2) global camera location & scene feature position estimates

sliding window processing

Sliding Window Protocol is a bi-directional data transmission protocol used in the data link layer (OSI model) as well as in TCP (transport layer of the OSI model). It is used to keep a record of the frame sequences sent and their respective acknowledgements received by both the users.

In robotics and computer vision, visual odometry is the process of determining the position and orientation of a robot by analyzing the associated camera images.

Odometry is the use of data from the movement of actuators to estimate change in position over time. Odometry is used by some robots, whether they be legged or wheeled, to estimate (not determine) their position relative to a starting location.

visual SLAM

probabilistic filtering approach

initializing uncertain depth estimates for distance features

Gaussian distributions implicit in the EKF

a new feature parametrization that is able to smoothly cope with initialization of features at all depths - even up to "infinity" - within the standard EKF framework: direct parametrization of inverse depth relative to the camera position from which a feature was first observed


A. Delayed and Undelayed Initialization

main map; main probabilistic state; main state vector

test for inclusion

delayed initialization
> treating newly detected features separately from the main map to reduce depth uncertainty before insertion into the full filter (with a standard XYZ representation)
- Features that retain low parallax over many frames (those very far from the camera or close to the motion epipole) are usually rejected completely because they never pass the test for inclusion
> (in 2-D and simulation) Initialization is delayed until the measurement equation is approximately Gaussian and the point can be safely triangulated.
> 3-D monocular vision with inertial sensing + auxiliary particle filter (in high frame rate sequence)

undelayed initialization
> While features with highly uncertain depths provide little information on camera translation, they are extremely useful as bearing references for orientation estimation.
: a multiple hypothesis scheme, initializing features at various depths and pruning those not reobserved in subsequent images
> Gaussian sum filter approximated by a federated information sharing method to keep the computational overhead low
-> to spread the Gaussian depth hypotheses along the ray according to inverse depth
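The effect of spreading hypotheses by inverse depth can be seen in a toy sketch (the bounds and hypothesis count below are made-up values, not from the paper):

```python
import numpy as np

# Hypothesis inverse depths spaced uniformly between assumed bounds
# (rho from 1/100 to 1 per metre, 5 hypotheses). Uniform spacing in
# rho makes the corresponding depths dense near the camera and sparse
# toward infinity, matching how weakly a monocular image constrains
# large depths.
rho = np.linspace(1.0 / 100.0, 1.0, 5)
depths = 1.0 / rho   # from 100 m down to 1 m, increasingly clustered
```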

Davison's particle method --> (Sola et al.) Gaussian sum filter --> (Civera et al.) new inverse depth scheme

 

A Gaussian sum is a more efficient representation than particles (efficient enough that the separate Gaussians can all be put into the main state vector), but not as efficient as the single Gaussian representation that the inverse depth parametrization allows.



B. Points at Infinity

efficient undelayed initialization + features at all depths (in outdoor scenes)


Point at infinity: a feature that exhibits no parallax during camera motion due to its extreme depth
-> not used for estimating camera translation, but for estimating rotation

The homogeneous coordinate systems of visual projective geometry normally used in SFM allow explicit representation of points at infinity, and such points have proven to play an important role during offline structure and motion estimation.

sequential SLAM system

Montiel and Davison: In the special case where all features are known to be infinite -- in very-large-scale outdoor scenes or when the camera rotates on a tripod -- SLAM in pure angular coordinates turns the camera into a real-time visual compass.


Our probabilistic SLAM algorithm must be able to represent the uncertainty in depth of seemingly infinite features. Observing no parallax for a feature after 10 units of camera translation does tell us something about its depth -- it gives a reliable lower bound, which depends on the amount of motion made by the camera (if the feature had been closer than this, we would have observed parallax).
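This lower-bound reasoning can be made concrete with back-of-the-envelope numbers (the focal length, baseline, and disparity threshold below are assumed values, not from the paper):

```python
# Pinhole model: a point at depth d, observed after a sideways camera
# translation of b metres, shifts by roughly f * b / d pixels in the
# image. If no shift above the detection threshold delta was observed,
# the feature must lie beyond f * b / delta.
f = 500.0      # focal length in pixels (assumed)
b = 1.0        # camera translation in metres (assumed)
delta = 1.0    # smallest detectable disparity in pixels (assumed)
d_min = f * b / delta   # lower bound on the feature's depth, in metres
```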

The explicit consideration of uncertainty in the locations of points has not been previously required in offline computer vision algorithms, but is very important in a more difficult online case.



C. Inverse Depth Representation

There is a unified and straightforward parametrization for feature locations that can handle both initialization and standard tracking of both close and very distant features within the standard EKF framework.


standard tracking

An explicit parametrization of the inverse depth of a feature along a semi-infinite ray from the position from which it was first viewed allows a Gaussian distribution to cover uncertainty in depth that spans a depth range from nearby to infinity, and permits seamless crossing over to finite depth estimates of features that have been apparently infinite for long periods of time.

linearity index + inverse depth parametrization

The projective nature of a camera means that the image measurement process is nearly linear in this inverse depth coordinate.
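A quick numerical check of this near-linearity under a plain pinhole model (the focal length and baseline are assumed values):

```python
import numpy as np

# A point on the first camera's optical axis at depth d projects,
# after a sideways translation b, at u = f * b / d = f * b * rho
# pixels from the image centre: for this simple geometry the
# measurement is exactly linear in the inverse depth rho, while it
# is highly nonlinear in the depth d itself.
f, b = 500.0, 0.1                # pixels, metres (assumed)
depths = np.array([2.0, 5.0, 10.0, 100.0])
rho = 1.0 / depths
u = f * b * rho                  # projected offsets in pixels
# u / rho is constant, i.e. u is a linear function of rho.
assert np.allclose(u / rho, f * b)
```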


Inverse depth appears in the relation between image disparity and point depth in stereo vision; it is interpreted as the parallax with respect to the plane at infinity. (Hartley and Zisserman)

Inverse depth is used to relate the motion field induced by scene points with the camera velocity in optical flow analysis. 

modified polar coordinates

target motion analysis = TMA

EKF-based sequential depth estimation from camera-known motion

multibaseline stereo

matching robustness for scene symmetries

sequential EKF process using inverse depth
( ref. Stochastic Approximation and Rate-Distortion Analysis for Robust Structure and Motion Estimation )

undelayed initialization for 2-D monocular SLAM 
( ref. A unified framework for nearby and distant landmarks in bearing-only SLAM )

FastSLAM-based system for monocular SLAM
( ref. Ethan Eade &  Tom Drummond,  Scalable Monocular SLAM )

special epipolar update step

FastSLAM

( ref. Civera, J.   Davison, A.J.   Montiel, J.M.M., Inverse Depth to Depth Conversion for Monocular SLAM 
J. Montiel and A. J. Davison “A visual compass based on SLAM,” )

loop-closing



II. State Vector Definition


handheld camera motion
> constant angular and linear velocity model

quaternion
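A skeletal prediction step for this constant-velocity model might look as follows. Noise terms and Jacobians are omitted, and the (w, x, y, z) quaternion convention is an assumption, not taken from the paper:

```python
import numpy as np

def quat_mul(a, b):
    """Hamilton product of quaternions in (w, x, y, z) order."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([aw*bw - ax*bx - ay*by - az*bz,
                     aw*bx + ax*bw + ay*bz - az*by,
                     aw*by - ax*bz + ay*bw + az*bx,
                     aw*bz + ax*by - ay*bx + az*bw])

def predict(r, q, v, w, dt):
    """One prediction step: position integrates the linear velocity v;
    the orientation quaternion q is composed with the incremental
    rotation accrued from the angular velocity w over dt."""
    r_new = r + v * dt
    angle = np.linalg.norm(w) * dt
    if angle > 1e-12:
        axis = np.asarray(w) / np.linalg.norm(w)
        dq = np.concatenate(([np.cos(angle / 2.0)],
                             np.sin(angle / 2.0) * axis))
    else:
        dq = np.array([1.0, 0.0, 0.0, 0.0])   # no rotation
    q_new = quat_mul(q, dq)
    return r_new, q_new / np.linalg.norm(q_new)

# Pure translation: half a second at 1 m/s along x, no rotation.
r, q = predict(np.zeros(3), np.array([1.0, 0.0, 0.0, 0.0]),
               np.array([1.0, 0.0, 0.0]), np.zeros(3), 0.5)
```

Renormalizing the quaternion after each step keeps it a valid rotation despite accumulated floating-point drift.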








posted by maetel