Replacing the range-and-bearing sensors used in classical SLAM (e.g., laser range scanners) with a camera, which provides rich information about the scene, loses one dimension — the depth to the perceived object — so the problem becomes bearing-only SLAM.
EKF requires Gaussian representations for all the random variables that form the map (the robot pose and all landmarks' positions). Moreover, their variances need to be small so that all the nonlinear functions can be approximated by their linearized forms.
A landmark's position can only be determined once there is enough viewpoint difference (baseline) between two input image frames, so time is needed to accumulate that baseline.
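A toy 2-D sketch of why the baseline matters (all numbers hypothetical): with a fixed bearing error per view, the triangulation error grows as the baseline shrinks, which is exactly why delayed initialization waits for parallax.

```python
import numpy as np

def triangulate_2d(c1, c2, theta1, theta2):
    """Intersect two 2-D bearing rays from camera centers c1 and c2."""
    d1 = np.array([np.cos(theta1), np.sin(theta1)])
    d2 = np.array([np.cos(theta2), np.sin(theta2)])
    # Solve c1 + t1*d1 = c2 + t2*d2 for the ray parameters t1, t2.
    t = np.linalg.solve(np.column_stack([d1, -d2]), c2 - c1)
    return c1 + t[0] * d1

landmark = np.array([2.0, 5.0])
c1 = np.array([0.0, 0.0])
noise = 1e-3                       # assumed 1 mrad bearing error per view
errors = []
for baseline in [0.01, 0.1, 1.0]:
    c2 = np.array([baseline, 0.0])
    th1 = np.arctan2(landmark[1] - c1[1], landmark[0] - c1[0]) + noise
    th2 = np.arctan2(landmark[1] - c2[1], landmark[0] - c2[0]) - noise
    errors.append(np.linalg.norm(triangulate_2d(c1, c2, th1, th2) - landmark))
print(errors)  # error shrinks as the baseline grows
```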
Randall C. Smith and Peter Cheeseman. On the representation and estimation of spatial uncertainty. Int. J. Rob. Res., 5(4):56-68, December 1986.
DOI: 10.1177/027836498600500404, http://dx.doi.org/10.1177/027836498600500404
[1] J. Civera, A. J. Davison, and J. M. M. Montiel. Inverse depth
parametrization for monocular SLAM. IEEE Trans. on Robotics, 24(5),
2008.
[2] J. Civera, A. J. Davison, and J. M. M. Montiel. Inverse depth to depth
conversion for monocular SLAM. In IEEE Int. Conf. on Robotics and
Automation, pages 2778-2783, April 2007.
[3] A. J. Davison. Real-time simultaneous localisation and mapping with
a single camera. In Int. Conf. on Computer Vision, volume 2, pages
1403-1410, Nice, October 2003.
[4] A. J. Davison. Active search for real-time vision. In Int. Conf. on
Computer Vision, volume 1, pages 66-73, 2005.
[5] A. J. Davison, I. D. Reid, N. D. Molton, and O. Stasse. MonoSLAM:
Real-time single camera SLAM. IEEE Trans. on Pattern Analysis and
Machine Intelligence, 29(6):1052-1067, June 2007.
[6] E. Eade and T. Drummond. Scalable monocular SLAM. In IEEE Int.
Conf. on Computer Vision and Pattern Recognition, volume 1, pages
469-476, 2006.
[7] T. Lemaire and S. Lacroix. Monocular-vision based SLAM using
line segments. In IEEE Int. Conf. on Robotics and Automation, pages
2791-2796, Rome, Italy, 2007.
[8] N. Molton, A. J. Davison, and I. Reid. Locally planar
patch features for real-time structure from motion. In British Machine
Vision Conference, 2004.
[9] J. M. M. Montiel, J. Civera, and A. J. Davison. Unified inverse depth
parametrization for monocular SLAM. In Robotics: Science and Systems,
Philadelphia, USA, August 2006.
[10] L. M. Paz, P. Piniés, J. D. Tardós, and J. Neira. Large scale 6DOF
SLAM with stereo-in-hand. IEEE Trans. on Robotics, 24(5), 2008.
[11] J. Solà, A. Monin, M. Devy, and T. Vidal-Calleja. Fusing
monocular information in multi-camera SLAM. IEEE Trans. on Robotics,
24(5):958-968, 2008.
[12] J. Solà. Towards Visual Localization, Mapping and Moving
Objects Tracking by a Mobile Robot: a Geometric and Probabilistic
Approach. PhD thesis, Institut National Polytechnique de Toulouse, 2007.
[13] J. Solà, A. Monin, and M. Devy. BiCamSLAM: Two times
mono is more than stereo. In IEEE Int. Conf. on Robotics and
Automation, pages 4795-4800, Rome, Italy, April 2007.
[14] J. Solà, A. Monin, M. Devy, and T. Lemaire.
Undelayed initialization in bearing only SLAM. In IEEE/RSJ Int. Conf.
on Intelligent Robots and Systems, pages 2499-2504, Edmonton, Canada,
2005.
[15] J. Solà, T. Vidal-Calleja, and M. Devy. Undelayed
initialization of line segments in monocular SLAM. In IEEE Int. Conf.
on Intelligent Robots and Systems, Saint Louis, USA, 2009. To appear.
slamtb.m
% SLAMTB An EKF-SLAM algorithm with simulator and graphics.
%
% This script performs multi-robot, multi-sensor, multi-landmark 6DOF
% EKF-SLAM with simulation and graphics capabilities.
%
% Please read slamToolbox.pdf in the root directory thoroughly before
% using this toolbox.
%
% - Beginners should not modify this file, just edit USERDATA.M and enter
% and/or modify the data you wish to simulate.
%
% - More advanced users should be able to create new landmark models, new
% initialization methods, and possibly extensions to multi-map SLAM. Good
% luck!
%
% - Expert users may want to add code for real-data experiments.
%
% See also USERDATA, USERDATAPNT, USERDATALIN.
%
% Also consult slamToolbox.pdf in the root directory.
% Created and maintained by
% Copyright 2008-2009 Joan Sola @ LAAS-CNRS.
% Programmers (for the whole toolbox):
% Copyright David Marquez and Jean-Marie Codol @ LAAS-CNRS
% Copyright Teresa Vidal-Calleja @ ACFR.
% See COPYING.TXT for full copyright license.
%% OK we start here
% clear workspace and declare globals
clear
global Map %#ok<NUSED>
%% I. Specify user-defined options - EDIT USER DATA FILE userData.m
%% II. Initialize all data structures from user-defined data in userData.m
% SLAM data
[Rob,Sen,Raw,Lmk,Obs,Tim] = createSlamStructures(...
Robot,...
Sensor,... % all user data
Time,...
Opt);
% Simulation data
[SimRob,SimSen,SimLmk,SimOpt] = createSimStructures(...
Robot,...
Sensor,... % all user data
World,...
SimOpt);
% Graphics handles
[MapFig,SenFig] = createGraphicsStructures(...
Rob, Sen, Lmk, Obs,... % SLAM data
SimRob, SimSen, SimLmk,... % Simulator data
FigOpt); % User-defined graphic options
%% III. Init data logging
% TODO: Create source and/or destination files and paths for data input and
% logs.
% TODO: do something here to collect data for post-processing or
% plotting. Think about collecting data in files using fopen, fwrite,
% etc., instead of creating large Matlab variables for data logging.
% Clear user data - not needed anymore
clear Robot Sensor World Time % clear all user data
%% IV. Main loop
for currentFrame = Tim.firstFrame : Tim.lastFrame

    % Robot motion
    % FIXME: see how to include noise in a clever way.
    for rob = [Rob.rob]
        Rob(rob).con.u = SimRob(rob).con.u + Rob(rob).con.uStd.*randn(6,1);
        Rob(rob) = motion(Rob(rob),Tim);

        % Process sensor observations
        for sen = Rob(rob).sensors
            % TODO: see how to pass only used Lmks and Obs.
            % Observe known landmarks
            [Rob(rob),Sen(sen),Lmk,Obs(sen,:)] = correctKnownLmks( ...
                Rob(rob), ...
                Sen(sen), ...
                Raw(sen), ...
                Lmk, ...
                Obs(sen,:), ...
                Opt) ;
        end
    end

    % Figures for all sensors
    for sen = [Sen.sen]
        SenFig(sen) = drawSenFig(SenFig(sen), ...
            Sen(sen), Raw(sen), Obs(sen,:), ...
            FigOpt);
        makeVideoFrame(SenFig(sen), ...
            sprintf('sen%02d-%04d.png', sen, currentFrame),...
            FigOpt, ExpOpt);
    end

    % Draw all objects
    drawnow;
% 4. DATA LOGGING
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% TODO: do something here to collect data for post-processing or
% plotting. Think about collecting data in files using fopen, fwrite,
% etc., instead of creating large Matlab variables for data logging.
end
%% V. Post-processing
% ========== End of function - Start GPL license ==========
% # START GPL LICENSE
%---------------------------------------------------------------------
%
% This file is part of SLAMTB, a SLAM toolbox for Matlab.
%
% SLAMTB is free software: you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation, either version 3 of the License, or
% (at your option) any later version.
%
% SLAMTB is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You should have received a copy of the GNU General Public License
% along with SLAMTB. If not, see <http://www.gnu.org/licenses/>.
%
%---------------------------------------------------------------------
% SLAMTB is Copyright 2007,2008,2009
% by Joan Sola, David Marquez and Jean Marie Codol @ LAAS-CNRS.
% See on top of this file for its particular copyright.
Techniques for Generating Consistent Maps
• Scan matching
• EKF SLAM
• Fast-SLAM
• Probabilistic mapping with a single map and a posterior about poses (Mapping + Localization)
• Graph-SLAM, SEIFs
Approximations for SLAM
• Local submaps
[Leonard et al.99, Bosse et al. 02, Newman et al. 03]
• Sparse links (correlations)
[Lu & Milios 97, Guivant & Nebot 01]
• Sparse extended information filters
[Frese et al. 01, Thrun et al. 02]
• Thin junction tree filters
[Paskin 03]
• Rao-Blackwellisation (FastSLAM)
[Murphy 99, Montemerlo et al. 02, Eliazar et al. 03, Haehnel et al. 03]
EKF-SLAM Summary
• Quadratic in the number of landmarks: O(n²)
• Convergence results for the linear case.
• Can diverge if nonlinearities are large!
• Have been applied successfully in large-scale environments.
• Approximations reduce the computational complexity.
> definition
mapping: converting the environment into recognizable information
localization: estimating one's own position from that information
> issues
- uncertainty <= sensor
- data association: distilling 2- or 3-dimensional information from high-dimensional sensor data and matching it consistently over time
- efficiently managing the accumulated feature observations
> localization
: estimating one's own position from observations of landmarks whose positions are known in advance
: estimating the robot pose at every time step k from the initial state x0, the control inputs up to time k-1, the observation vectors, and the landmarks with known positions
- the uncertainty of the pose estimate stems from sensor error.
> mapping
: modeling the environment the robot occupies by accumulating observations expressed relative to a reference point
: estimating the set of landmarks from the poses, the observations, and the control inputs
- map inaccuracy stems from sensor error.
> Simultaneous Localization and Mapping (SLAM)
: estimating the robot's position within its environment while simultaneously building a map of it
: the joint probability of the landmark positions and the robot state vector xk at time k, given the landmark observation vectors, the initial state, and all applied control inputs
- recursive formulation + Bayes' theorem
- observation model (관측 모델) + motion model (state-space model of the robot's movement)
- the motion model implies that the state transition is a Markov process (the current state depends only on the previous state and the input vector, independent of the landmark set and the observations).
- prediction (time update) + correction (measurement update)
- uncertainty arises from robot odometry and sensor error.
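The prediction/correction recursion above can be sketched in one dimension, where the Bayes filter with Gaussian beliefs reduces to a scalar Kalman filter (all numbers here are hypothetical toy values):

```python
def predict(mean, var, u, q):
    """Time update: push the belief through the motion model x' = x + u."""
    return mean + u, var + q       # motion noise q inflates uncertainty

def correct(mean, var, z, r):
    """Measurement update: fuse a direct observation z with noise r."""
    k = var / (var + r)            # Kalman gain
    return mean + k * (z - mean), (1.0 - k) * var

mean, var = 0.0, 1.0                            # initial belief
mean, var = predict(mean, var, u=1.0, q=0.5)    # odometry says "moved 1.0"
var_pred = var                                  # uncertainty grew to 1.5
mean, var = correct(mean, var, z=1.2, r=0.3)    # sensor observes 1.2
print(mean, var)
```

Prediction always inflates the variance (odometry error accumulates); correction always shrinks it, which mirrors the two uncertainty sources listed above.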
This paper appears in: Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on
Publication Date: 17-22 June 2006
Volume: 1, On page(s): 469-476
ISSN: 1063-6919
ISBN: 0-7695-2597-0
Digital Object Identifier: 10.1109/CVPR.2006.263
Current Version Published: 2006-07-05
monocular SLAM
particle filter + top-down search => real-time, large number of landmarks
the first to apply this FastSLAM-type particle filter to single-camera SLAM
1. Introduction
SLAM = Simultaneous Localization and Mapping
: process of causally estimating both egomotion and structure in an online system
SLAM using visual data in computer vision
SFM (= structure from motion): reconstructing scene geometry
+ causal or recursive estimation techniques
perspective-projection cameras
filtering methods to allow indirect observation models
Kalman filtering framework
Extended Kalman filter = EKF (-> to linearize the observation and dynamics models of the system)
causal estimation with recursive algorithms (cp. estimation depending only on observations up to the current time)
=> online operation (cp. SFM on global nonlinear optimization)
Davison's SLAM with a single camera
> EKF estimation framework
> top-down Bayesian estimation approach searching for landmarks in image regions constrained by estimated uncertainty (instead of performing extensive bottom-up image processing and feature matching)
> Bayesian partial-initialization scheme for incorporating new landmarks
- cannot scale to large environments
EKF = the extended Kalman filter
- N x N covariance matrix for N landmarks
- updated with O(N^2) computation cost per measurement
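A minimal sketch of where that cost comes from (toy dimensions and a diagonal prior, all assumed for illustration): even when a measurement involves a single landmark, the EKF covariance update rewrites the entire N x N matrix.

```python
import numpy as np

def ekf_update(P, H, R):
    """One EKF covariance update.  Even though H touches only one
    landmark, P = (I - K H) P rewrites the whole matrix, which is
    where the O(N^2) per-update cost comes from."""
    S = H @ P @ H.T + R                        # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
    return (np.eye(P.shape[0]) - K @ H) @ P

n_landmarks = 50
dim = 3 + 2 * n_landmarks          # pose (x, y, theta) + 2-D landmarks
P = np.eye(dim)                    # toy prior covariance
H = np.zeros((2, dim))
H[:, 3:5] = np.eye(2)              # Jacobian: observe the first landmark only
P = ekf_update(P, H, 0.1 * np.eye(2))
print(P.shape, P[3, 3])
```

With a correlated (non-diagonal) prior, this same update would also shift every other landmark's estimate — that coupling is the EKF-SLAM map's strength and its computational burden.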
> SLAM system using a single camera as the only sensor
> frame-rate operation with many landmarks
> FastSLAM-style particle filter (the first use of such an approach in a monocular SLAM setting)
> top-down active search
> an efficient algorithm for discovering the depth of new landmarks that avoids linearization errors
> a novel method for using partially initialized landmarks to help constrain camera pose
FastSLAM
: based on the Rao-Blackwellized Particle Filter
2. Background
2.1 Scalable SLAM
> submap
bounded complexity -> bounded computation and space requirements
Montemerlo & Thrun
If the entire camera motion is known then the estimates of the positions of different landmarks become independent of each other.
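This observation is what FastSLAM's Rao-Blackwellization exploits: condition on the trajectory (one hypothesis per particle) and the map factorizes into small independent per-landmark filters. A hypothetical sketch under that assumption, with the trajectory taken as known and each per-landmark filter reduced to a running average of world-frame measurements for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
poses = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])   # known trajectory
landmarks = np.array([[3.0, 4.0], [5.0, 1.0]])           # ground truth

# With the trajectory fixed, the map factorizes: each landmark gets its
# own tiny independent estimator, fed only by its own measurements.
estimates = []
for lm in landmarks:
    world_meas = [pose + ((lm - pose) + rng.normal(0.0, 0.05, 2))
                  for pose in poses]                     # relative obs + noise
    estimates.append(np.mean(world_meas, axis=0))
estimates = np.array(estimates)
print(estimates)
```

In actual FastSLAM each landmark carries a 2x2 (or 3x3) EKF instead of an average, and the per-particle cost becomes linear (or logarithmic, with tree-structured maps) in the number of landmarks rather than quadratic.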
This paper appears in: Robotics, IEEE Transactions on
Publication Date: Oct. 2008
Volume: 24, Issue: 5
On page(s): 932-945
ISSN: 1552-3098
INSPEC Accession Number: 10301459
Digital Object Identifier: 10.1109/TRO.2008.2003276
First Published: 2008-10-03
Current Version Published: 2008-10-31
Javier Civera, Departamento de Informática e Ingeniería de Sistemas, Universidad de Zaragoza
Andrew J. Davison, Reader in Robot Vision at the Department of Computing, Imperial College London
monocular simultaneous localization and mapping (SLAM)
representation of uncertainty
the standard extended Kalman filter (EKF)
direct parametrization of the inverse depth of features
feature initialization
camera motion estimates
6-D state vector --> converted to the Euclidean XYZ form
linearity index => automatic detection and conversion to maintain maximum efficiency
I. Introduction
monocular camera
: projective sensor measuring the bearing of image features
monocular (adj) 단안(單眼)(용)의, 외눈의
A stereo camera is a type of camera with two or more lenses. This allows the camera to simulate human binocular vision.
structure from motion = SFM
1) feature matching
2) global camera location & scene feature position estimates
sliding window processing
Sliding window processing here means estimating over only the most recent frames of the sequence, marginalizing or discarding older ones, rather than optimizing over the entire sequence at once.
In robotics and computer vision, visual odometry is the process of determining the position and orientation of a robot by analyzing the associated camera images.
Odometry is the use of data from the movement of actuators to estimate change in position over time. Odometry is used by some robots, whether they be legged or wheeled, to estimate (not determine) their position relative to a starting location.
visual SLAM
probabilistic filtering approach
initializing uncertain depth estimates for distance features
Gaussian distributions implicit in the EKF
a new feature parametrization that is able to smoothly cope with initialization of features at all depths - even up to "infinity" - within the standard EKF framework: direct parametrization of inverse depth relative to the camera position from which a feature was first observed
A. Delayed and Undelayed Initialization
main map; main probabilistic state; main state vector
test for inclusion
delayed initialization
> treating newly detected features separately from the main map to reduce depth uncertainty before insertion into the full filter (with a standard XYZ representation)
- Features that retain low parallax over many frames (those very far from the camera or close to the motion epipole) are usually rejected completely because they never pass the test for inclusion
> (in 2-D and simulation) Initialization is delayed until the measurement equation is approximately Gaussian and the point can be safely triangulated.
> 3-D monocular vision with inertial sensing + auxiliary particle filter (in high frame rate sequence)
undelayed initialization
> While features with highly uncertain depths provide little information on camera translation, they are extremely useful as bearing references for orientation estimation.
: a multiple hypothesis scheme, initializing features at various depths and pruning those not reobserved in subsequent images
> Gaussian sum filter approximated by a federated information sharing method to keep the computational overhead low
-> to spread the Gaussian depth hypotheses along the ray according to inverse depth
Davison's particle method --> (Solà et al.) Gaussian sum filter --> (Civera et al.) new inverse depth scheme
A Gaussian sum is a more efficient representation than particles (efficient enough that the separate Gaussians can all be put into the main state vector), but not as efficient as the single Gaussian representation that the inverse depth parametrization allows.
B. Points at Infinity
efficient undelayed initialization + features at all depths (in outdoor scenes)
Point at infinity: a feature that exhibits no parallax during camera motion due to its extreme depth
-> not used for estimating camera translation, but for estimating rotation
The homogeneous coordinate systems of visual projective geometry used normally in SFM allow explicit representation of points at infinity, and they have proven to play an important role during offline structure and motion estimation.
sequential SLAM system
Montiel and Davison: In the special case where all features are known to be infinite -- in very-large-scale outdoor scenes or when the camera rotates on a tripod -- SLAM in pure angular coordinates turns the camera into a real-time visual compass.
Our probabilistic SLAM algorithm must be able to represent the uncertainty in depth of seemingly infinite features. Observing no parallax for a feature after 10 units of camera translation does tell us something about its depth -- it gives a reliable lower bound, which depends on the amount of motion made by the camera (if the feature had been closer than this, we would have observed parallax).
The explicit consideration of uncertainty in the locations of points has not been previously required in offline computer vision algorithms, but is very important in a more difficult online case.
C. Inverse Depth Representation
There is a unified and straightforward parametrization for feature locations that can handle both initialization and standard tracking of both close and very distant features within the standard EKF framework.
standard tracking
An explicit parametrization of the inverse depth of a feature along a semi-infinite ray from the position from which it was first viewed allows a Gaussian distribution to cover uncertainty in depth that spans a depth range from nearby to infinity, and permits seamless crossing over to finite depth estimates of features that have been apparently infinite for long periods of time.
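The six-parameter inverse-depth coding can be decoded as follows; the angle convention below (azimuth theta, elevation phi) is an assumption following the usual statement of Civera et al., so check against the paper before reusing it:

```python
import numpy as np

def inverse_depth_to_xyz(x0, y0, z0, theta, phi, rho):
    """Decode an inverse-depth feature: camera center at first
    observation (x0, y0, z0), ray direction from azimuth theta and
    elevation phi, and inverse depth rho along that ray."""
    m = np.array([np.cos(phi) * np.sin(theta),     # unit ray direction
                  -np.sin(phi),
                  np.cos(phi) * np.cos(theta)])
    return np.array([x0, y0, z0]) + m / rho

# Feature first seen from the origin along the optical axis, 10 m away:
p = inverse_depth_to_xyz(0.0, 0.0, 0.0, 0.0, 0.0, 0.1)
print(p)   # rho = 0.1 -> depth 1/0.1 = 10 along +Z
```

Note that rho -> 0 sends the point to infinity while rho = 0 itself remains a perfectly finite parameter value, which is why a Gaussian on rho can cover depths "up to infinity" without any special casing.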
linearity index + inverse depth parametrization
The projective nature of a camera means that the image measurement process is nearly linear in this inverse depth coordinate.
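A quick way to see that near-linearity: for a pinhole camera, a point at lateral offset X and depth Z projects to u = fX/Z, which is hyperbolic in Z but exactly linear in rho = 1/Z. This fixed-offset setting is a simplification (with camera translation the measurement is only approximately linear in rho), and f and X below are hypothetical values.

```python
import numpy as np

f, X = 1.0, 0.5                        # focal length and lateral offset
rhos = np.linspace(0.01, 1.0, 5)       # equally spaced inverse depths
u_rho = f * X * rhos                   # u = f*X/Z = f*X*rho: linear in rho
Zs = np.linspace(1.0, 100.0, 5)        # equally spaced depths
u_z = f * X / Zs                       # hyperbolic in Z
print(np.diff(u_rho, 2), np.diff(u_z, 2))  # second differences
```

The vanishing second differences in rho are what make the EKF's linearized measurement model accurate over a huge depth range when the state carries rho instead of Z.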
Inverse depth appears in the relation between image disparity and point depth in stereo vision; it is interpreted as the parallax with respect to the plane at infinity (Hartley and Zisserman).
Inverse depth is used to relate the motion field induced by scene points with the camera velocity in optical flow analysis.
modified polar coordinates
target motion analysis = TMA
EKF-based sequential depth estimation from camera-known motion