http://en.wikipedia.org/wiki/Chroma_key
Green is currently used as a backdrop more than any other color because
image sensors in digital video cameras are most sensitive to green: the
Bayer pattern allocates two green-filtered photosites per 2x2 block (versus one red and one blue), mimicking the human eye's increased sensitivity to green light.
camera self-calibration
: the process of recovering the position, orientation, focal length, and other parameters of the cameras used to capture a set of images, when one wants to build a 3D VRML model of an object from those images
projective geometric method
1. Augmented reality via 3D model reconstruction
SFM = structure from motion
: computing the camera parameters and the relative pose of the camera for each frame of an image sequence, and using this information to compute the approximate 3D structure of the objects visible in the sequence
trilinearity
: the algebraic relation among three perspective views of an arbitrary 3D structure (see the relation below)
trifocal tensor
: the mathematical model of trilinearity
(used to match feature points and lines across images, to compute the projective cameras, and to recover projective structure)
(known to give more accurate results than methods based on the conventional epipolar geometry)
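For orientation (not from the original notes, but the standard formulation in Hartley and Zisserman): for corresponding image points x, x', x'' in three views, the three 3x3 slices T_1, T_2, T_3 of the trifocal tensor satisfy the point-point-point incidence relation

$[\mathbf{x}']_{\times} \left( \sum_{i} x^{i}\, \mathbf{T}_{i} \right) [\mathbf{x}'']_{\times} = \mathbf{0}_{3\times 3}$

where $[\cdot]_{\times}$ denotes the skew-symmetric cross-product matrix; each choice of rows of this matrix equation gives one trilinear constraint on the three views.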
The core technical problem in an SFM system is computing the cameras accurately from the image sequence.
1) projective reconstruction
Feature points and lines extracted from the image sequence are matched accurately across frames, and the initial relation between the 2D features observed in the sequence and the 3D structure we want to reconstruct is computed (by recovering the parameters of the cameras that actually captured the images and their relative poses).
Computing the trifocal tensor demands very precise values, so mismatched feature points or lines must not enter the computation. LMedS (Least Median of Squares) or RANSAC (Random Sample Consensus) is used to remove such mismatches (outliers); a generic sketch follows.
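As a reminder of how RANSAC removes such outliers, here is a minimal generic sketch (mine, not the paper's): it fits a 2D line rather than a trifocal tensor, and Pt, Line, fitLine2pts, and pointLineDist are hypothetical helpers. LMedS differs only in the scoring step: instead of counting inliers, it keeps the model that minimizes the median of the squared residuals.

#include <cmath>
#include <cstdlib>
#include <vector>

struct Pt   { double x, y; };
struct Line { double a, b, c; }; // ax + by + c = 0 with a*a + b*b = 1

Line fitLine2pts(const Pt& p, const Pt& q) {
    double a = q.y - p.y, b = p.x - q.x;
    double n = std::sqrt(a*a + b*b);
    a /= n; b /= n;
    return Line{ a, b, -(a*p.x + b*p.y) };
}

double pointLineDist(const Line& l, const Pt& p) {
    return std::fabs(l.a*p.x + l.b*p.y + l.c);
}

Line ransacLine(const std::vector<Pt>& pts, int iterations, double threshold) {
    Line best = {0, 0, 0};
    int bestInliers = -1;
    for (int it = 0; it < iterations; it++) {
        // 1. draw a minimal random sample (two points determine a line)
        int i = std::rand() % (int)pts.size();
        int j = std::rand() % (int)pts.size();
        if (i == j) continue;
        Line model = fitLine2pts(pts[i], pts[j]);
        // 2. count the inliers consistent with the hypothesized model
        int inliers = 0;
        for (size_t k = 0; k < pts.size(); k++)
            if (pointLineDist(model, pts[k]) < threshold) inliers++;
        // 3. keep the hypothesis supported by the most inliers
        if (inliers > bestInliers) { bestInliers = inliers; best = model; }
    }
    return best; // points farther than "threshold" from this model are the outliers
}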
The trifocal tensors computed over successive triplets of views, together with the matched feature points and lines, are merged by a multi-view registration step that aligns them to a common reference frame. The merged values then undergo projective bundle adjustment, which minimizes the error incurred under the projective geometry.
2) camera auto-calibration
Needed to convert projective information into Euclidean information: using geometric properties of the 2D image data, the 3D structure is upgraded from projective to Euclidean, while the real-world camera parameters (focal length, principal point, aspect ratio, skew) and the relative poses between the cameras are computed accurately.
Auto-calibration methods do not use a separately designed calibration pattern; as a consequence, the ability to compute in real time is lost.
3) (Euclidean) structure recovery
Obtaining the 3D geometry of the model
: to turn the 3D information recovered through the auto-calibration stage into a graphics model, the 3D data are first converted by triangulation into a polygonal mesh model, and texture mapping is then applied to increase the model's realism.
2. Augmented reality via camera calibration
1) The relations among the coordinate frame fixed to the calibration pattern (W), the camera frame (C), and the graphics frame (G) are established in advance.
2) The relative coordinate transformation between the camera and the calibration pattern is obtained by computation on every frame through image processing.
3) (Since the graphics frame and the pattern frame are fixed in advance, the relative positions of the virtual objects to be composited by computer graphics are known before the calibration step.) The image from the camera is analyzed to recognize the calibration pattern, the 3D coordinates of the pattern and their image locations are found, and the relation between the two yields the camera calibration.
cross ratio (anharmonic ratio)
4) The trifocal tensor is computed to obtain the initial camera information and an initial reconstruction.
5) For each image of the sequence, matches with the previous image are found by image-based matching (using normalized cross correlation, NCC).
6) Based on the RANSAC algorithm, the camera matrix for each new image is computed, and 3D coordinates are computed for newly appearing match points (including a step that removes bad matches).
7) Euclidean reconstruction: the values defined in projective space are converted into values defined in Euclidean space.
Because camera auto-calibration and the computations in projective space are all nonlinear equations, the values obtained by the least-squares error method often fail to satisfy the original equations. Nonlinear optimization is therefore always necessary, and optimization steps need to be placed appropriately: at the final stage of the conversion to Euclidean space, and when computing the reconstructed values in projective space.
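Concretely, the quantity minimized by the bundle adjustment mentioned above is the total reprojection error over all cameras $P_j$ and 3D points $X_i$, with the linear least-squares result used as the starting point of the nonlinear iteration:

$\min_{\{P_j\},\{X_i\}} \sum_{i,j} d\left(x_{ij},\, P_j X_i\right)^{2}$

where $x_{ij}$ is the observed image position of point $i$ in view $j$ and $d$ is image-plane distance.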
Z. Zhang. Flexible Camera Calibration By Viewing a Plane From Unknown Orientations. International Conference on Computer Vision (ICCV'99), Corfu, Greece, pages 666-673, September 1999.
Sawhney, H. S. and Kumar, R. 1999. True Multi-Image Alignment and Its Application to Mosaicing and Lens Distortion Correction. IEEE Trans. Pattern Anal. Mach. Intell. 21, 3 (Mar. 1999), 235-243. DOI= http://dx.doi.org/10.1109/34.754589
Seong-Woo Park, Yongduek Seo, Ki-Sang Hong: Real-Time Camera Calibration for Virtual Studio. Real-Time Imaging 6(6): 433-448 (2000) doi:10.1006/rtim.1999.0199
Dept. of E.E. POSTECH, San 31, Hyojadong, Namku, Pohang, Kyungbuk, 790-784, Korea
Abstract
In this paper, we present an overall algorithm for real-time camera parameter extraction, which is one of the key elements in implementing virtual studio, and we also present a new method for calculating the lens distortion parameter in real time. In a virtual studio, the motion of a virtual camera generating a graphic studio must follow the motion of the real camera in order to generate a realistic video product. This requires the calculation of camera parameters in real-time by analyzing the positions of feature points in the input video. Towards this goal, we first design a special calibration pattern utilizing the concept of cross-ratio, which makes it easy to extract and identify feature points, so that we can calculate the camera parameters from the visible portion of the pattern in real-time. It is important to consider the lens distortion when zoom lenses are used because it causes nonnegligible errors in the computation of the camera parameters. However, the Tsai algorithm, adopted for camera calibration, calculates the lens distortion through nonlinear optimization in triple parameter space, which is inappropriate for our real-time system. Thus, we propose a new linear method by calculating the lens distortion parameter independently, which can be computed fast enough for our real-time application. We implement the whole algorithm using a Pentium PC and Matrox Genesis boards with five processing nodes in order to obtain the processing rate of 30 frames per second, which is the minimum requirement for TV broadcasting. Experimental results show this system can be used practically for realizing a virtual studio.
Journal of the Institute of Electronics Engineers of Korea, Vol. 36-S, No. 7, July 1999
가상스튜디오 구현을 위한 실시간 카메라 추적 ( Real-Time Camera Tracking for Virtual Studio )
Seong-Woo Park, Yongduek Seo, Ki-Sang Hong, pp. 90-103 (14 pages) http://uci.or.kr/G300-j12265837.v36n07p90
To implement a virtual studio, it is essential to recover the camera's motion in real time. To overcome the drawbacks of the mechanical camera-tracking methods used in existing virtual studios, this paper proposes an overall algorithm that applies computer vision techniques to the images obtained from the camera to extract the camera parameters in real time, and discusses how to organize a system for a practical implementation. For real-time camera parameter extraction, we propose a method for automatically extracting and identifying feature points in the image, and a method that resolves the computational burden of estimating the lens distortion during camera calibration.
camera tracking system : electromechanical / optical
pattern recognition
2D-3D pattern matches
planar pattern
feature extraction -> image-model matching & identification -> camera calibration
: to design the pattern by applying the concept of cross-ratio and to identify the pattern automatically
To identify automatically the feature points found in an image, we need a quantity that takes the same value for points in space and for their corresponding points in the image; such a quantity is called a geometric invariant. This work uses one such invariant, the cross-ratio, to design the pattern, and proposes a method that uses the invariance property to find and identify the pattern in the image automatically (a small demonstration follows).
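A small self-contained sketch (mine, not the paper's) of why the cross-ratio qualifies as such an invariant: apply an arbitrary 1D projective map to four collinear points, and the value is unchanged.

#include <cstdio>

// cross-ratio of four collinear points given by their 1D coordinates along the line
double crossRatio(double t1, double t2, double t3, double t4) {
    return ((t1 - t3) * (t2 - t4)) / ((t1 - t4) * (t2 - t3));
}

// an arbitrary 1D projective map t -> (a*t + b) / (c*t + d), with ad - bc != 0
double homography(double t) { return (2.0*t + 1.0) / (0.5*t + 3.0); }

int main() {
    double t[4] = {0.0, 1.0, 2.0, 5.0};
    printf("before: %f\n", crossRatio(t[0], t[1], t[2], t[3]));
    printf("after : %f\n", crossRatio(homography(t[0]), homography(t[1]),
                                      homography(t[2]), homography(t[3])));
    // both lines print 1.6: the cross-ratio is preserved under projection
    return 0;
}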
radial alignment constraint
"If we presume that the lens has only radial distortion, the direction of a distorted point is the same as the direction of an undistorted point."
To recover the camera's motion, there must be a recognizable object in space; that is, from any viewpoint we must be able to find the feature points in the image and to know which point in space each of them corresponds to.
For the pattern to be recognizable, we need a geometric invariant: a quantity that keeps the same value no matter from which position and with which pose the camera views it.
Coelho, C., Heller, A., Mundy, J. L., Forsyth, D. A., and Zisserman, A. 1992. An experimental evaluation of projective invariants. In Geometric Invariance in Computer Vision, J. L. Mundy and A. Zisserman, Eds. MIT Press Series of Artificial Intelligence. MIT Press, Cambridge, MA, 87-104.
> initial identification process
extracting the pattern in an image: chromakeying -> gradient filtering: a first-order derivative of Gaussian (DoG) -> line fitting: deriving a distorted line (that is actually a curve) equation -> feature point tracking (using intersection filter)
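A rough OpenCV sketch of these extraction steps (an illustration under stated assumptions, not the paper's implementation): cvSobel stands in for the first-order derivative-of-Gaussian filtering, and cvHoughLines2 stands in for the paper's own line fitting, which actually fits distorted-line (curve) equations.

#include <opencv/cv.h>
#include <opencv/highgui.h>

void extractPatternLines(IplImage* bgr)
{
    IplImage* grey = cvCreateImage(cvGetSize(bgr), IPL_DEPTH_8U, 1);
    IplImage* grad = cvCreateImage(cvGetSize(bgr), IPL_DEPTH_16S, 1);
    IplImage* edge = cvCreateImage(cvGetSize(bgr), IPL_DEPTH_8U, 1);
    cvCvtColor(bgr, grey, CV_BGR2GRAY);   // (chroma-keying of the backdrop omitted here)
    cvSobel(grey, grad, 1, 0, 3);          // first-order x-derivative (DoG stand-in)
    cvConvertScaleAbs(grad, edge);         // back to 8-bit gradient magnitude
    cvThreshold(edge, edge, 100, 255, CV_THRESH_BINARY);
    CvMemStorage* storage = cvCreateMemStorage(0);
    CvSeq* lines = cvHoughLines2(edge, storage, CV_HOUGH_STANDARD,
                                 1, CV_PI/180, 80); // line-fitting stand-in
    // intersecting neighboring lines would then give candidate feature points
    cvReleaseMemStorage(&storage);
    cvReleaseImage(&grey); cvReleaseImage(&grad); cvReleaseImage(&edge);
}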
Assuming an ideal lens whose optical axis is perpendicular to the image plane and does not move, the image center is computed as a fixed value during the camera's zoom operation. (In practice, because of imperfections of the lens, the image center also shifts during zooming, but the variation is under 2 pixels within the working range; this work therefore ignores it and computes the zoom-dependent image center under the ideal-lens assumption.)
For zoom lenses, the image centers vary as the camera zooms because the zooming operation is executed by a composite combination of several lenses. However, when we examined the location of the image centers, its standard deviation was about 2 pixels; thus we ignored the effect of the image center change.
calculating lens distortion coefficient
Zoom lenses are zoomed by a complicated combination of several lenses so that the effective focal length and distortion coefficient vary during zooming operations.
When using the coplanar pattern with small depth variation, it turns out that focal length and z-translation cannot be separated exactly and reliably even with small noise.
In camera parameter extraction, when the feature points in space all lie on a single plane, the focal length and the z-translation become coupled, and the computed values easily lose stability.
collinearity
Collinearity is the property that a line in the world coordinate system also appears as a line in the image. This property is not preserved when the lens has distortion.
Once the lens distortion is calculated, we can execute camera calibration using linear methods.
filtering
In implementing a virtual studio it is essential that the time delay always has the same value, so in the actual application, filtering methods involving prediction (for example, a Kalman filter) could not be used.
The issue discusses methods to extract three-dimensional (3D) models from plain images. In particular, the 3D information is obtained from images for which the camera parameters are unknown. The principles underlying such uncalibrated structure-from-motion methods are outlined. First, a short review of 3D acquisition technologies puts such methods in a wider context and highlights their important advantages. Then, the actual theory behind this line of research is given. The authors have tried to keep the text maximally self-contained, therefore also avoiding reliance on an extensive knowledge of the projective concepts that usually appear in texts about self-calibration 3D methods. Rather, mathematical explanations that are more amenable to intuition are given. The explanation of the theory includes the stratification of reconstructions obtained from image pairs as well as metric reconstruction on the basis of more than two images combined with some additional knowledge about the cameras used. Readers who want to obtain more practical information about how to implement such uncalibrated structure-from-motion pipelines may be interested in two more Foundations and Trends issues written by the same authors. Together with this issue they can be read as a single tutorial on the subject.
If, instead of the range-and-bearing sensors such as laser range scanners used in conventional SLAM, we use a camera, which provides rich information about the scene, we lose one dimension (the depth, i.e., the distance to a recognized object) and obtain bearing-only SLAM.
EKF requires Gaussian representations for all the random variables that form the map (the robot pose and all landmark positions). Moreover, their variances need to be small so that all the nonlinear functions can be approximated by their linearized forms.
A landmark's position can only be determined once there is enough viewpoint separation (baseline) between two input image frames, so some time is needed to secure it.
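For reference, and matching the G, R, H, Q names used in the code below (the comments' "EKF algorithm, 59p" appears to refer to the EKF table in Thrun et al., Probabilistic Robotics), the recursion being implemented is:

$\bar{\mu}_t = g(u_t, \mu_{t-1})$  (state prediction: L2)
$\bar{\Sigma}_t = G_t \Sigma_{t-1} G_t^{\top} + R_t$  (covariance prediction: L3)
$K_t = \bar{\Sigma}_t H_t^{\top} (H_t \bar{\Sigma}_t H_t^{\top} + Q_t)^{-1}$  (Kalman gain: L4)
$\mu_t = \bar{\mu}_t + K_t (z_t - h(\bar{\mu}_t))$  (state update: L5)
$\Sigma_t = (I - K_t H_t) \bar{\Sigma}_t$  (covariance update: L6)

where $G_t$ and $H_t$ are the Jacobians of the motion model $g$ and the observation model $h$.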
#define num_landmarks 4
#define dof_rob 3 // DOF of a robot, (Rx, Rz, Ra)
#define dim_map 2 // 2 dimensional map
#define dim_sys ( dim_map * (num_landmarks + dof_rob) ) // defining the size of a state vector
#define dim_dom ( dim_map * (1 + dof_rob) ) // dimension of the domain of the observation model function
// number of variables to define the observation of one landmark
#define dim_obs 1 // 1 dimensional measurement from input images
#define mapWidth 1200 // width of the map on SLAM
#define mapHeight 600
#define imgWidth 400 // width of input image
#define imgHeight 200
#define focal 200 // focal length of the robot's camera
#define Step 100
// calculate the Jacobian of the observation model
CvMat* compute_Jacobian (CvMat* state)
{ /* Since the observation model is a non-linear function of elements of the state vector,
the H matrix is defined by its Jacobian matrix. */
CvMat* H = cvCreateMat(num_landmarks, dim_sys, CV_64FC1);
cvZero(H); // initialize H matrix
// calculate the Jacobian of the observation model function, h
// let h = -focal*h1/h2, when px = h + mapWidth/2
// (assumed state layout, matching the observation model below: Rx, Rz, Ra, Vx, Vz, Va, landmark pairs)
double Rx = cvmGet(state,0,0); // robot's x-position
double Rz = cvmGet(state,1,0); // robot's z-position
double Ra = cvmGet(state,2,0); // robot's angle around the y-axis
double cosRa = cos(Ra);
double sinRa = sin(Ra);
for (int n = 0; n < num_landmarks; n++)
{
double Lz = cvmGet(state,6+2*n,0); // n-th landmark, state row 6+2n (z, as in the observation model below)
double Lx = mapHeight - cvmGet(state,7+2*n,0); // n-th landmark, state row 7+2n (flipped x, as in the observation model)
// h1, h2 recomputed as in the observation model below
double h1 = (Lx - Rx)*cosRa - (Lz - Rz)*sinRa;
double h2 = (Lz - Rz)*cosRa + (Lx - Rx)*sinRa;
// partial derivatives of h
double dh[dim_dom]; // "dim_dom = 8"
/* earlier version of the derivatives, kept for reference:
dh[0] = focal/h2 * ( cosRa - sinRa*h1/h2 ); // der(h) to Rx
dh[1] = focal/h2 * ( cosRa*h1/h2 - sinRa ); // der(h) to Rz
dh[2] = focal * ( 1 + (h1*h1)/(h2*h2) ); // der(h) to Ra
dh[3] = dh[4] = dh[5] = 0.0; // der(h) to Vx, Vz, and Va are zero; h is independent of them
dh[6] = focal/h2 * ( sinRa*h1/h2 - cosRa ); // der(h) to Lx
dh[7] = focal/h2 * ( sinRa + cosRa*h1/h2 ); // der(h) to Lz
*/
dh[0] = -focal / h2 * (sinRa - h1 * cosRa / h2); // der(h) to Rz
dh[1] = -focal / h2 * (cosRa - h1 * sinRa / h2); // der(h) to Rx
dh[2] = (CV_PI/180.0)*(focal/h2) * ( sinRa * (Rx-Lx) + cosRa * (Lz-Rz) + (h1/h2)*(cosRa * (Rx-Lx) - sinRa * (Lz-Rz)) ); // der(h) to Ra
dh[3] = dh[4] = dh[5] = 0.0; // der(h) to Vx, Vz, and Va are zero; h is independent of them
dh[6] = focal / h2 * (sinRa + cosRa * h1 / h2); // der(h) to state row 6+2n
dh[7] = focal / h2 * (cosRa - sinRa * h1 / h2); // der(h) to state row 7+2n
// set H matrix (dh[3..5] are zero, so the velocity columns stay zero)
for (int col = 0; col < 6; col++)
{
cvmSet(H, n, col, dh[col]); // robot-state block of row n
}
cvmSet(H, n, 6+2*n, dh[6]); // der(h) to the n-th landmark's first state slot
cvmSet(H, n, 7+2*n, dh[7]); // der(h) to the n-th landmark's second state slot
}
displayMatrix(H, "H matrix");
return H;
}
// set the non-linear observation model function, h
// let h = -focal*h1/h2, when px = h + mapWidth/2
// (excerpt: the enclosing function takes the state "input" and the "measurement_noise" vector,
//  writes the measurements into "output"; Rx, Rz, Ra, cosRa, sinRa are read from "input"
//  in the same way as in compute_Jacobian above)
for( int n = 0; n < num_landmarks; n++ )
{
double Lz = cvmGet(input,6+2*n,0);
double Lx = mapHeight - cvmGet(input,7+2*n,0);
double h1 = (Lx - Rx)*cosRa - (Lz - Rz)*sinRa;
double h2 = (Lz - Rz)*cosRa + (Lx - Rx)*sinRa;
double h = -focal * h1 / h2;
cvmSet(output, n, 0, h + cvmGet(measurement_noise, n, 0));
}
displayMatrix(output, "z vector");
// return z;
}
// set the transition matrix to define the motion model
CvMat* define_motion_model( int state_dimension )
{
CvMat* G = cvCreateMat(state_dimension, state_dimension, CV_64FC1);
// (reconstructed body, assuming the constant-velocity model used throughout these notes:
//  identity, plus position <- velocity couplings for Rx, Rz, Ra)
cvSetIdentity(G);
cvmSet(G, 0, 3, 1.0); cvmSet(G, 1, 4, 1.0); cvmSet(G, 2, 5, 1.0);
return G;
}
// set the covariance matrix of the process noise to define the uncertainty of control
// (covariances taken as double, not int, so values such as 0.001 are not truncated to zero)
CvMat* define_process_noise( int state_dimension, double Rx_cov, double Rz_cov, double Ra_cov, double Vx_cov, double Vz_cov, double Va_cov )
{
// R matrix ; covariance of the process noise added to G; uncertainty of control
CvMat* R = cvCreateMat(dim_sys, dim_sys, CV_64FC1);
cvZero(R); // initialize R matrix
// (reconstructed body: diagonal entries for the six robot-state terms)
cvmSet(R, 0, 0, Rx_cov); cvmSet(R, 1, 1, Rz_cov); cvmSet(R, 2, 2, Ra_cov);
cvmSet(R, 3, 3, Vx_cov); cvmSet(R, 4, 4, Vz_cov); cvmSet(R, 5, 5, Va_cov);
return R;
}
// set the covariance matrix of measurement noise to define the uncertainty of observation
CvMat* define_measurement_noise( int measurement_dimension, double obs_cov )
{
CvMat* Q = cvCreateMat(measurement_dimension, measurement_dimension, CV_64FC1);
cvZero(Q); // initialize Q matrix
// set Q matrix
for (int row = 0; row < measurement_dimension; row++)
{
cvmSet(Q, row, row, obs_cov);
}
displayMatrix(Q, "Q matrix");
return Q;
}
// predict the covariance matrix of the estimated state
CvMat* predict_covariance_of_state( CvMat* G, CvMat* sigma, CvMat* R )
{
CvMat* sigma_p = cvCreateMat(dim_sys, dim_sys, CV_64FC1);
// create the temporary matrices to be needed in calculation below
CvMat* Gt = cvCreateMat(G->cols, G->rows, CV_64FC1);
CvMat* sigmaGt = cvCreateMat(sigma->rows, G->rows, CV_64FC1);
CvMat* GsigmaGt = cvCreateMat(G->rows, G->rows, CV_64FC1);
// predict the covariance of the state (ref. L3, EKF algorithm, 59p)
cvTranspose(G, Gt); // transpose(G) -> Gt
cvMatMul(sigma, Gt, sigmaGt); // sigma * Gt -> sigmaGt
cvMatMul(G, sigmaGt, GsigmaGt); // G * sigmaGt -> GsigmaGt
cvAdd(GsigmaGt, R, sigma_p); // GsigmaGt + R -> sigma_p
// release the temporary matrices in calculation above
cvReleaseMat(&Gt);
cvReleaseMat(&sigmaGt);
cvReleaseMat(&GsigmaGt);
return sigma_p;
}
// compute Kalman Gain in EKF algorithm (ref. L4, EKF algorithm, 59p)
CvMat* compute_KalmanGain( CvMat* H, CvMat* sigma_p, CvMat* Q )
{
CvMat* K = cvCreateMat(dim_sys, num_landmarks, CV_64FC1); // K matrix // Kalman gain
// create temporary matrices (reconstructed body: K = sigma_p * Ht * inv(H * sigma_p * Ht + Q))
CvMat* Ht = cvCreateMat(H->cols, H->rows, CV_64FC1);
CvMat* sigmapHt = cvCreateMat(sigma_p->rows, H->rows, CV_64FC1);
CvMat* HsigmaHt = cvCreateMat(H->rows, H->rows, CV_64FC1);
CvMat* HsigmaHtplusQ = cvCreateMat(H->rows, H->rows, CV_64FC1);
CvMat* invGain = cvCreateMat(H->rows, H->rows, CV_64FC1);
cvTranspose(H, Ht); // transpose(H) -> Ht
cvMatMul(sigma_p, Ht, sigmapHt); // sigma_p * Ht -> sigmapHt
cvMatMul(H, sigmapHt, HsigmaHt); // H * sigmapHt -> HsigmaHt
cvAdd(HsigmaHt, Q, HsigmaHtplusQ); // HsigmaHt + Q -> HsigmaHtplusQ
cvInvert(HsigmaHtplusQ, invGain); // inv(HsigmaHtplusQ) -> invGain
cvMatMul(sigmapHt, invGain, K); // sigmapHt * invGain -> K
// release temporary matrices used in the calculation above
cvReleaseMat(&Ht);
cvReleaseMat(&sigmapHt);
cvReleaseMat(&HsigmaHt);
cvReleaseMat(&HsigmaHtplusQ);
cvReleaseMat(&invGain);
displayMatrix(K, "K matrix");
return K;
}
// update the state with Kalman gain (ref. L5, EKF algorithm, 59p)
CvMat* update_state_mean( CvMat* mu, CvMat* mu_p, CvMat* K, CvMat* z, CvMat* H )
{
// create temporary matrices to be used in calculation below
CvMat* Hmu = cvCreateMat(H->rows, mu->cols, CV_64FC1);
CvMat* miss = cvCreateMat(Hmu->rows, 1, CV_64FC1);
CvMat* adjust = cvCreateMat(mu->rows, 1, CV_64FC1);
// update the state with Kalman gain (ref. L5, EKF algorithm, 59p)
cvMatMul(H, mu_p, Hmu); // H * mu_p -> Hmu (the innovation uses the predicted state, not the previous estimate)
cvSub(z, Hmu, miss); // z - Hmu -> miss
cvMatMul(K, miss, adjust); // K * miss -> adjust
cvAdd(mu_p, adjust, mu); // mu_p + adjust -> mu
// release temporary matrices to be used in calculation above
cvReleaseMat(&Hmu);
cvReleaseMat(&miss);
cvReleaseMat(&adjust);
displayMatrix(mu, "mu vector");
return mu;
}
// update the covariance of the state with Kalman gain (ref. L6, EKF algorithm, 59p)
CvMat* update_state_covariance( CvMat* sigma, CvMat* K, CvMat* H, CvMat* sigma_p )
{
// create temporary matrices
CvMat* KH = cvCreateMat(K->rows, H->cols, CV_64FC1);
CvMat* I = cvCreateMat(KH->rows, KH->cols, CV_64FC1);
CvMat* change = cvCreateMat(I->rows, I->cols, CV_64FC1);
cvMatMul(K, H, KH); // K * H -> KH
cvSetIdentity(I);
cvSub(I, KH, change); // I - KH -> change
cvMatMul(change, sigma_p, sigma); // change * sigma_p -> sigma
// release temporary matrices and return (the listing omitted the function tail)
cvReleaseMat(&KH);
cvReleaseMat(&I);
cvReleaseMat(&change);
return sigma;
}
// set the ground truth of the robot's state
CvMat* set_groundtruth_of_robot( double x, double z, double a, double vx, double vz, double va )
{
// CvMat* rob_ground = cvCreateMat(dim_sys, 1, CV_64FC1);
CvMat* rob_ground = cvCreateMat(6, 1, CV_64FC1);
if( a < -20.0 || a > 20.0 )
{
va = (-1) * va ;
}
x += vx;
// z += vzGroundTruth; // steady
a += va;
// temporary variable to display the robot's position in "window1"
// robot_ground[t] = cvPoint(cvRound(xGroundTruth),cvRound(zGroundTruth));
// set the initial state
double Rxi = mapWidth * 0.01; // robot's initial x-position
double Rzi = mapHeight * 0.5; // robot's initial z-position to be steady
double Rai = 0.0; // robot's initial angle rotated around y-axis
double Vxi = 2.0; // = mapWidth * 0.6 / (Step+2); // robot's initial x-velocity
double Vzi = 0.0; // robot's initial z-velocity that is zero
double Vai = 0.2; // = 20.0 * uniform_random() - 10.0; // robot's initial angular velocity around y-axis
// set the initial covariance of the state uncertainty
double Rx_cov = 0.001; // covariance of noise to robot's x-position
double Rz_cov = 0.001; // covariance of noise to robot's z-position
double Ra_cov = 0.001; // covariance of noise to robot's angle around y-axis
double Vx_cov = 0.001; // covariance of noise to robot's x-velocity
double Vz_cov = 0.001; // covariance of noise to robot's z-velocity
double Va_cov = 0.001; // covariance of noise to robot's angular velocity
// set the initial covariance of the measurement noise
double obs_x_cov = 900.0; // covariance of noise to x-measurement of landmarks
double obs_z_cov = 900.0; // covariance of noise to z-measurement of landmarks
// ground truths of the state
double xGroundTruth = Rxi;
double zGroundTruth = Rzi;
double aGroundTruth = Rai;
double vxGroundTruth = Vxi;
double vzGroundTruth = Vzi;
double vaGroundTruth = Vai;
// ground truths of "num_landmarks" landmarks in the world coordinate
double landmark[num_landmarks*dim_map]; // "dim_map" = 2; dimension of the map
// ground truths of "num_landmarks" landmarks in the world coordinate
set_groundtruths_of_landmarks( landmark ); // "groundtruth.h"
// G matrix; transition matrix to define the motion model
CvMat* G = define_motion_model(dim_sys);
// R matrix; covariance matrix of the process noise added to G, due to the uncertainty of control
CvMat* R = define_process_noise(dim_sys, Rx_cov, Rz_cov, Ra_cov, Vx_cov, Vz_cov, Va_cov);
// H matrix; measurement matrix to define the observation model
// CvMat* H = cvCreateMat(num_landmarks, dim_sys, CV_64FC1);
// Q matrix; covariance matrix of the measurement noise to H, due to the uncertainty of observation
CvMat* Q = define_measurement_noise(num_landmarks, obs_x_cov);
// define the system
CvMat* mu = cvCreateMat(dim_sys, 1, CV_64FC1); // state vector to be estimated
CvMat* mu_p = cvCreateMat(dim_sys, 1, CV_64FC1); // state vector to be predicted
CvMat* sigma = cvCreateMat(dim_sys, dim_sys, CV_64FC1); // covariance to be updated
for (int n = 0; n < num_landmarks*dim_map; n++)
{
cvmSet(mu, n+6, 0, landmark[n]); // set mu(6,0) to "landmark[0]", ...
}
displayMatrix(mu, "mu vector");
// initialize "sigma" matrix <-- This is the most critical factor in tuning
cvZero(sigma);
cvmSet(sigma, 0, 0, Rx_cov);
// cvmSet(sigma, 1, 1, Rz_cov);
cvmSet(sigma, 2, 2, Ra_cov);
cvmSet(sigma, 3, 3, Vx_cov);
// cvmSet(sigma, 4, 4, Vz_cov);
cvmSet(sigma, 5, 5, Va_cov);
for( int r = 6; r < sigma->rows; r += 2 )
{
cvmSet(sigma, r, r, obs_x_cov * 10);
cvmSet(sigma, r+1, r+1, obs_z_cov * 10);
}
// define IPL image windows
char window1[256], window2[256], captured[256];
sprintf(window1, "2D vSLAM with EKF");
cvNamedWindow(window1);
cvMoveWindow(window1, 10 , imgHeight + 200);
IplImage *iplImg = cvCreateImage(cvSize(mapWidth, mapHeight) , 8, 3); // process of vSLAM algorithm
cvZero(iplImg);
sprintf(window2, "observation of 1D input image");
cvNamedWindow(window2);
IplImage *inputImg[Step]; // measured input images
for( int t = 0; t < Step; t++ )
{
inputImg[t] = cvCreateImage(cvSize(imgWidth, imgHeight) , 8, 3);
cvZero(inputImg[t]);
}
// set to display the groundtruth of each landmark's position
char txt[200];
CvPoint pLM;
CvScalar colorLandmark[num_landmarks];
for(int iL = 0; iL < num_landmarks; iL++)
{
colorLandmark[iL] = generateRandomColor(50, 50, 50);
}
sprintf(captured, "vSLAM2D_test.bmp"); // captured one from frames in "iplImg"
int frame = int(0.9*Step); // frame number of the image to be saved as "captured"
cout << endl << "process starting" << endl;
for (int t = 0; t < Step; t++)
{
cout << endl << "step " << t << endl;
cvZero(iplImg);
//* prediction step
// predict the state (ref. L2, KF algorithm, 59p)
cvMatMul(G, mu, mu_p); // G * mu -> mu_p
// predict the covariance of the state (ref. L3, EKF algorithm, 59p)
CvMat* sigma_p = predict_covariance_of_state( G, sigma, R);
//* measurement step
// ground truth of state
// CvMat* rob_ground = set_groundtruth_of_robot( xGroundTruth, zGroundTruth, aGroundTruth, vxGroundTruth, vzGroundTruth, vaGroundTruth );
// set ground truths
for (int dim = 0; dim < num_landmarks*dim_map; dim++)
{
cvmSet(rob_ground, dim+6, 0, landmark[dim]);
}
displayMatrix(rob_ground, "rob_ground");
for(int n = 0; n < num_landmarks; n++)
{
double rn = sqrt(obs_x_cov) * gaussian_random();
cvmSet(delta, n, 0, rn);
}
update_measurement(z, rob_ground, delta);
// remember px = z + mapWidth/2
//* correction step
// update H matrix H; measurement matrix to define the observation model
// returned by CvMat* compute_Jacobian (CvMat* state)
CvMat* H = compute_Jacobian( mu );
// update Kalman gain (ref. L4, EKF algorithm, 59p)
CvMat* K = compute_KalmanGain( H, sigma_p, Q );
// update the state with Kalman gain (ref. L5, EKF algorithm, 59p)
mu = update_state_mean( mu, mu_p, K, z, H );
// result in image
// ground truth of robot position, red
cvCircle(iplImg, cvPoint(cvRound(cvmGet(rob_ground,0,0)), cvRound(cvmGet(rob_ground,1,0))), 2, cvScalar(10, 0, 255), 1);
// draw the camera's position and pose in the global map
drawCamState(iplImg, cvPoint2D32f(cvmGet(mu,0,0), cvmGet(mu,1,0)), cvmGet(mu,2,0), focal, imgWidth / 2.0, cvScalar(255, 0, 100), cvScalar(60, 255, 255), cvPoint(-100, 30));
Danica Kragic (Centre for Autonomous Systems, Computational Vision and Active Perception Lab, School of Computer Science and Communication, KTH, Stockholm, 10044, Sweden, dani@kth.se) and Markus Vincze (Vision for Robotics Lab, Automation and Control Institute, Technische Universität Wien, Vienna, Austria, vincze@acin.tuwien.ac.at)
SUGGESTED CITATION: Danica Kragic and Markus Vincze (2010) "Vision for Robotics", Foundations and Trends® in Robotics: Vol. 1: No. 1, pp 1-78. http://dx.doi.org/10.1561/2300000001
Abstract
Robot vision refers to the capability of a robot to visually perceive the environment and use this information for the execution of various tasks. Visual feedback has been used extensively for robot navigation and obstacle avoidance. In recent years, there are also examples that include interaction with people and manipulation of objects. In this paper, we review some of the work that goes beyond the use of artificial landmarks and fiducial markers for the purpose of implementing vision-based control in robots. We discuss different application areas, both from the systems perspective and individual problems such as object tracking and recognition.
1 Introduction 2
1.1 Scope and Outline 4
2 Historical Perspective 7
2.1 Early Start and Industrial Applications 7
2.2 Biological Influences and Affordances 9
2.3 Vision Systems 12
3 What Works 17
3.1 Object Tracking and Pose Estimation 18
3.2 Visual Servoing–Arms and Platforms 27
3.3 Reconstruction, Localization, Navigation, and Visual SLAM 32
3.4 Object Recognition 35
3.5 Action Recognition, Detecting, and Tracking Humans 42
3.6 Search and Attention 44
4 Open Challenges 48
4.1 Shape and Structure for Object Detection 49
4.2 Object Categorization 52
4.3 Semantics and Symbol Grounding: From Robot Task to Grasping and HRI 54
4.4 Competitions and Benchmarking 56
D-SLAM: A Decoupled Solution to Simultaneous Localization and Mapping
Z. Wang, S. Huang and G. Dissanayake
ARC Centre of Excellence for Autonomous Systems (CAS), Faculty of Engineering, University of Technology, Sydney, Australia International Journal of Robotics Research Volume 26 Issue 2 - Publication Date: 1 February 2007 (Special Issue on the Fifth International Conference on Field and Service Robotics, 2005) http://dx.doi.org/10.1177/0278364906075173
On the Structure and Solution of the Simultaneous Localisation and Map Building Problem. Paul Michael Newman.
1999. Ph. D. thesis, Australian Centre for Field Robotics - The University of Sydney
Randall C. Smith and Peter Cheeseman. 1986. On the representation and estimation of spatial uncertainty. Int. J. Rob. Res. 5, 4 (December 1986), 56-68. DOI=10.1177/027836498600500404 http://dx.doi.org/10.1177/027836498600500404
// ground truth of num_landmarks landmarks in the world coordinate
double landmark[num_landmarks];// = { 200, 600, 400 }; // x-position
for( int n = 0; n < num_landmarks; n++ )
{
landmark[n] = width * uniform_random();
}
// set the initial state
double rob_pos = 25.0; // initial robot position
double rob_vel = 10.0; // initial robot velocity
// set the initial covariance of the state
double rob_pos_cov = 0.01; // covariance of noise to robot position
double rob_vel_cov = 0.01; // covariance of noise to robot velocity
double obs_cov = 900; // covariance of noise to measurement of landmarks
double xGroundTruth, vGroundTruth;
xGroundTruth = rob_pos;
vGroundTruth = rob_vel;
// H matrix
int Hrow = num_landmarks;
int Hcol = num_landmarks + 2;
CvMat* H = cvCreateMat(Hrow, Hcol, CV_64FC1);
cvZero(H); // initialize H matrix
// set H matrix
for (int row = 0; row < Hrow; row++)
{
cvmSet(H, row, 0, -1.0);
cvmSet(H, row, row+2, 1.0);
}
displayMatrix(H, "H matrix");
// Q matrix ; covariance of measurement noise to H ; uncertainty of observation
CvMat* Q = cvCreateMat(Hrow, Hrow, CV_64FC1);
cvZero(Q); // initialize Q matrix
// set Q matrix
for (int row = 0; row < Q->rows; row++)
{
cvmSet(Q, row, row, obs_cov);
}
displayMatrix(Q, "Q matrix");
// G matrix // transition
int Grow = num_landmarks + 2;
int Gcol = Grow;
CvMat* G = cvCreateMat(Grow, Gcol, CV_64FC1);
cvZero(G); // initialize G matrix
// set G matrix
cvmSet(G, 0, 0, 1.0); // previous position
cvmSet(G, 0, 1, 1.0); // velocity
for (int row = 1; row < Grow; row++)
{
cvmSet(G, row, row, 1.0); // constant velocity
}
displayMatrix(G, "G matrix");
// R matrix ; covariance of process noise to G ; uncertainty of control
CvMat* R = cvCreateMat(Grow, Grow, CV_64FC1); // 5x5
cvZero(R); // initialize R matrix
// set R matrix
cvmSet(R, 0, 0, rob_pos_cov);
cvmSet(R, 1, 1, rob_vel_cov);
displayMatrix(R, "R matrix");
CvMat* mu = cvCreateMat(num_dim, 1, CV_64FC1); // state vector to be estimated
CvMat* rob_ground = cvCreateMat(num_dim, 1, CV_64FC1); // ground truth of the state
CvMat* mu_p = cvCreateMat(num_dim, 1, CV_64FC1); // state to be predicted
CvMat* u = cvCreateMat(1, 1, CV_64FC1); // control vector
cvmSet(u, 0, 0, 1.0); // set u(0,0) to 1.0, the constant velocity here
CvMat* sigma = cvCreateMat(num_dim, num_dim, CV_64FC1); // covariance to be updated
CvMat* sigma_p = cvCreateMat(num_dim, num_dim, CV_64FC1); // covariance to be updated
CvMat* z = cvCreateMat(num_landmarks, 1, CV_64FC1); // measurement vector
CvMat* K = cvCreateMat(num_dim, num_landmarks, CV_64FC1); // K matrix // Kalman gain
// initialize "mu" vector
cvmSet(mu, 0, 0, rob_pos + sqrt(rob_pos_cov)*gaussian_random()); // set mu(0,0) to "rob_pos"
cvmSet(mu, 1, 0, rob_vel + sqrt(rob_vel_cov)*gaussian_random()); // set mu(1,0) to "rob_vel"
for(int n = 0; n < num_landmarks; n++)
{
// cvmSet(mu, n+2, 0, landmark[n] + sqrt(obs_cov)*gaussian_random());
cvmSet(mu, n+2, 0, landmark[n]);
}
displayMatrix(mu, "mu vector");
// initialize "sigma" matrix <-- This is the most critical point in tuning
cvSetIdentity(sigma, cvRealScalar(obs_cov));
displayMatrix(sigma, "sigma matrix");
// set the initial covariance of the state uncertainty
double rob_px_cov = 0.01; // covariance of noise to robot's x-position
double rob_py_cov = 0.01; // covariance of noise to robot's y-position
double rob_vx_cov = 0.01; // covariance of noise to robot's x-velocity
double rob_vy_cov = 0.01; // covariance of noise to robot's y-velocity
// set the initial covariance of the measurement noise
double obs_x_cov = 900; // covariance of noise to x-measurement of landmarks
double obs_y_cov = 900; // covariance of noise to y-measurement of landmarks
// ground truth of state
double xGroundTruth = rob_x;
double yGroundTruth = rob_y;
double vxGroundTruth = rob_vx;
double vyGroundTruth = rob_vy;
// ground truth of num_landmarks landmarks in the world coordinate
double landmark[2*num_landmarks];
for( int n = 0; n < num_landmarks; n++ )
{
landmark[2*n] = width * uniform_random();
landmark[2*n+1] = height * uniform_random();
}
// H matrix, measurement matrix
CvMat* H = cvCreateMat(2*num_landmarks, num_dim, CV_64FC1);
cvZero(H); // initialize H matrix
// set H matrix
for (int r = 0; r < num_landmarks; r++)
{
cvmSet(H, 2*r, 0, -1.0); // robot's x-position
cvmSet(H, 2*r, 2*r+4, 1.0); // landmark's x-position
cvmSet(H, 2*r+1, 1, -1.0); // robot's y-position
cvmSet(H, 2*r+1, 2*r+5, 1.0); // landmark's y-position
}
displayMatrix(H, "H matrix");
// Q matrix ; covariance of measurement noise to H ; uncertainty of observation
CvMat* Q = cvCreateMat(2*num_landmarks, 2*num_landmarks, CV_64FC1);
cvZero(Q); // initialize Q matrix
// set Q matrix
for (int row = 0; row < Q->rows; row++)
{
cvmSet(Q, row, row, obs_x_cov);
}
displayMatrix(Q, "Q matrix");
// G matrix // transition
CvMat* G = cvCreateMat(num_dim, num_dim, CV_64FC1);
cvZero(G); // initialize G matrix
// set G matrix
cvmSet(G, 0, 0, 1.0); // previous x-position
cvmSet(G, 0, 2, 1.0); // x-velocity
cvmSet(G, 1, 1, 1.0); // previous y-position
cvmSet(G, 1, 3, 1.0); // y-velocity
for (int row = 2; row < G->rows; row++)
{
cvmSet(G, row, row, 1.0); // constant velocity
}
displayMatrix(G, "G matrix");
// R matrix ; covariance of process noise to G ; uncertainty of control
CvMat* R = cvCreateMat(num_dim, num_dim, CV_64FC1);
cvZero(R); // initialize R matrix
// set R matrix
cvmSet(R, 0, 0, rob_px_cov);
cvmSet(R, 1, 1, rob_py_cov);
cvmSet(R, 2, 2, rob_vx_cov);
cvmSet(R, 3, 3, rob_vy_cov);
displayMatrix(R, "R matrix");
CvMat* rob_ground = cvCreateMat(num_dim, 1, CV_64FC1); // ground truth of state
CvMat* mu = cvCreateMat(num_dim, 1, CV_64FC1); // state vector to be estimated
CvMat* mu_p = cvCreateMat(num_dim, 1, CV_64FC1); // state vector to be predicted
CvMat* sigma = cvCreateMat(num_dim, num_dim, CV_64FC1); // covariance to be updated
CvMat* sigma_p = cvCreateMat(num_dim, num_dim, CV_64FC1); // covariance to be updated
CvMat* z = cvCreateMat(2*num_landmarks, 1, CV_64FC1); // measurement vector
CvMat* K = cvCreateMat(num_dim, 2*num_landmarks, CV_64FC1); // K matrix // Kalman gain
// initialize "mu" vector
cvmSet(mu, 0, 0, rob_x); // set mu(0,0) to "rob_x"
cvmSet(mu, 1, 0, rob_y); // set mu(1,0) to "rob_y"
cvmSet(mu, 2, 0, rob_vx); // set mu(2,0) to "rob_vx"
cvmSet(mu, 3, 0, rob_vy); // set mu(3,0) to "rob_vy"
for (int n = 0; n < 2*num_landmarks; n++)
{
cvmSet(mu, n+4, 0, landmark[n]); // set mu(4,0) to "landmark[0]", ...
}
displayMatrix(mu, "mu vector");
/* // initialize "sigma" matrix
cvmSet(sigma, 0, 0, rob_px_cov);
cvmSet(sigma, 1, 1, rob_py_cov);
cvmSet(sigma, 2, 2, rob_vx_cov);
cvmSet(sigma, 3, 3, rob_vy_cov);
for (int r = 4; r < sigma->rows; r += 2) // step by 2: one (x, y) pair per landmark
{
cvmSet(sigma, r, r, obs_x_cov);
cvmSet(sigma, r+1, r+1, obs_y_cov);
}
*/ // initialize "sigma" matrix <-- This is the most critical point in tuning
cvSetIdentity(sigma, cvRealScalar(obs_x_cov));
displayMatrix(sigma, "sigma matrix");
// matrices to be used in calculation
CvMat* Hx = cvCreateMat(H->rows, mu->cols, CV_64FC1);
CvMat* Gt = cvCreateMat(G->cols, G->rows, CV_64FC1);
cvTranspose(G, Gt); // transpose(G) -> Gt
CvMat* sigmaGt = cvCreateMat(sigma->rows, G->rows, CV_64FC1);
CvMat* GsigmaGt = cvCreateMat(G->rows, G->rows, CV_64FC1); // 10x10
// observation relative to the robot's position
// (even rows are x-measurements, odd rows are y-measurements, so add the matching robot coordinate)
for( int n = 0; n < 2*num_landmarks; n++ )
{
cvmSet(obs, n, 0, cvmGet(mu, n % 2, 0) + cvmGet(z,n,0));
}
// update the state with Kalman gain (ref. L5, EKF algorithm, 42p)
cvMatMul(H, mu, Hmu); // H * mu -> Hmu
cvSub(z, Hmu, miss); // z - Hmu -> miss
cvMatMul(K, miss, adjust); // K * miss -> adjust
cvAdd(mu_p, adjust, mu); // mu_p + adjust -> mu
displayMatrix(mu, "mu vector");
[1] J. Civera, A. J. Davison, and J. M. M. Montiel. Inverse depth parametrization for monocular SLAM. IEEE Trans. on Robotics, 24(5), 2008.
[2] J. Civera, A. J. Davison, and J. M. M. Montiel. Inverse depth to depth conversion for monocular SLAM. In IEEE Int. Conf. on Robotics and Automation, pages 2778-2783, April 2007.
[3] A. J. Davison. Real-time simultaneous localisation and mapping with a single camera. In Int. Conf. on Computer Vision, volume 2, pages 1403-1410, Nice, October 2003.
[4] Andrew J. Davison. Active search for real-time vision. Int. Conf. on Computer Vision, 1:66-73, 2005.
[5] Andrew J. Davison, Ian D. Reid, Nicholas D. Molton, and Olivier Stasse. MonoSLAM: Real-time single camera SLAM. Trans. on Pattern Analysis and Machine Intelligence, 29(6):1052-1067, June 2007.
[6] Ethan Eade and Tom Drummond. Scalable monocular SLAM. IEEE Int. Conf. on Computer Vision and Pattern Recognition, 1:469-476, 2006.
[7] Thomas Lemaire and Simon Lacroix. Monocular-vision based SLAM using line segments. In IEEE Int. Conf. on Robotics and Automation, pages 2791-2796, Rome, Italy, 2007.
[8] Nicholas Molton, Andrew J. Davison, and Ian Reid. Locally planar patch features for real-time structure from motion. In British Machine Vision Conference, 2004.
[9] J. Montiel, J. Civera, and A. J. Davison. Unified inverse depth parametrization for monocular SLAM. In Robotics: Science and Systems, Philadelphia, USA, August 2006.
[10] L. M. Paz, P. Piniés, J. Tardós, and J. Neira. Large scale 6DOF SLAM with stereo-in-hand. IEEE Trans. on Robotics, 24(5), 2008.
[11] Joan Solà, André Monin, Michel Devy, and T. Vidal-Calleja. Fusing monocular information in multi-camera SLAM. IEEE Trans. on Robotics, 24(5):958-968, 2008.
[12] Joan Solà. Towards Visual Localization, Mapping and Moving Objects Tracking by a Mobile Robot: a Geometric and Probabilistic Approach. PhD thesis, Institut National Polytechnique de Toulouse, 2007.
[13] Joan Solà, André Monin, and Michel Devy. BiCamSLAM: Two times mono is more than stereo. In IEEE Int. Conf. on Robotics and Automation, pages 4795-4800, Rome, Italy, April 2007.
[14] Joan Solà, André Monin, Michel Devy, and Thomas Lemaire. Undelayed initialization in bearing only SLAM. In IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, pages 2499-2504, Edmonton, Canada, 2005.
[15] Joan Solà, Teresa Vidal-Calleja, and Michel Devy. Undelayed initialization of line segments in monocular SLAM. In IEEE Int. Conf. on Intelligent Robots and Systems, Saint Louis, USA, 2009. To appear.
slamtb.m
% SLAMTB An EKF-SLAM algorithm with simulator and graphics.
%
% This script performs multi-robot, multi-sensor, multi-landmark 6DOF
% EKF-SLAM with simulation and graphics capabilities.
%
% Please read slamToolbox.pdf in the root directory thoroughly before
% using this toolbox.
%
% - Beginners should not modify this file, just edit USERDATA.M and enter
% and/or modify the data you wish to simulate.
%
% - More advanced users should be able to create new landmark models, new
% initialization methods, and possibly extensions to multi-map SLAM. Good
% luck!
%
% - Expert users may want to add code for real-data experiments.
%
% See also USERDATA, USERDATAPNT, USERDATALIN.
%
% Also consult slamToolbox.pdf in the root directory.
% Created and maintained by
% Copyright 2008-2009 Joan Sola @ LAAS-CNRS.
% Programmers (for the whole toolbox):
% Copyright David Marquez and Jean-Marie Codol @ LAAS-CNRS
% Copyright Teresa Vidal-Calleja @ ACFR.
% See COPYING.TXT for full copyright license.
%% OK we start here
% clear workspace and declare globals
clear
global Map %#ok<NUSED>
%% I. Specify user-defined options - EDIT USER DATA FILE userData.m
%% II. Initialize all data structures from user-defined data in userData.m
% SLAM data
[Rob,Sen,Raw,Lmk,Obs,Tim] = createSlamStructures(...
Robot,...
Sensor,... % all user data
Time,...
Opt);
% Simulation data
[SimRob,SimSen,SimLmk,SimOpt] = createSimStructures(...
Robot,...
Sensor,... % all user data
World,...
SimOpt);
% Graphics handles
[MapFig,SenFig] = createGraphicsStructures(...
Rob, Sen, Lmk, Obs,... % SLAM data
SimRob, SimSen, SimLmk,... % Simulator data
FigOpt); % User-defined graphic options
%% III. Init data logging
% TODO: Create source and/or destination files and paths for data input and
% logs.
% TODO: do something here to collect data for post-processing or
% plotting. Think about collecting data in files using fopen, fwrite,
% etc., instead of creating large Matlab variables for data logging.
% Clear user data - not needed anymore
clear Robot Sensor World Time % clear all user data
%% IV. Main loop
for currentFrame = Tim.firstFrame : Tim.lastFrame
% Robot motion
% (robot loop header and matching 'end's restored from the full slamtb.m)
for rob = [Rob.rob]
% FIXME: see how to include noise in a clever way.
Rob(rob).con.u = SimRob(rob).con.u + Rob(rob).con.uStd.*randn(6,1);
Rob(rob) = motion(Rob(rob),Tim);
% Process sensor observations
for sen = Rob(rob).sensors
%TODO: see how to pass only used Lmks and Obs.
% Observe known landmarks
[Rob(rob),Sen(sen),Lmk,Obs(sen,:)] = correctKnownLmks( ...
Rob(rob), ...
Sen(sen), ...
Raw(sen), ...
Lmk, ...
Obs(sen,:), ...
Opt) ;
end % for sen
end % for rob
% Figures for all sensors
for sen = [Sen.sen]
SenFig(sen) = drawSenFig(SenFig(sen), ...
Sen(sen), Raw(sen), Obs(sen,:), ...
FigOpt);
makeVideoFrame(SenFig(sen), ...
sprintf('sen%02d-%04d.png', sen, currentFrame),...
FigOpt, ExpOpt);
end
% Do draw all objects
drawnow;
% 4. DATA LOGGING
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% TODO: do something here to collect data for post-processing or
% plotting. Think about collecting data in files using fopen, fwrite,
% etc., instead of creating large Matlab variables for data logging.
end
%% V. Post-processing
% ========== End of function - Start GPL license ==========
% # START GPL LICENSE
%---------------------------------------------------------------------
%
% This file is part of SLAMTB, a SLAM toolbox for Matlab.
%
% SLAMTB is free software: you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation, either version 3 of the License, or
% (at your option) any later version.
%
% SLAMTB is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You should have received a copy of the GNU General Public License
% along with SLAMTB. If not, see <http://www.gnu.org/licenses/>.
%
%---------------------------------------------------------------------
% SLAMTB is Copyright 2007,2008,2009
% by Joan Sola, David Marquez and Jean Marie Codol @ LAAS-CNRS.
% See on top of this file for its particular copyright.
The function cvRetrieveFrame() returns the pointer to the image grabbed with the GrabFrame function. The returned image should not be released or modified by the user. In the event of an error, the return value may be NULL.
Canny edge detection
Canny edge detection in OpenCV image processing functions
Implements the Canny algorithm for edge detection.
Parameters:
image – Single-channel input image
edges – Single-channel image to store the edges found by the function
threshold1 – The first threshold
threshold2 – The second threshold
aperture_size – Aperture parameter for the Sobel operator (see cvSobel())
The function cvCanny() finds the edges on the input image image and marks them in the output image edges using the Canny algorithm. The smallest value between threshold1 and threshold2 is used for edge linking, the largest value is used to find the initial segments of strong edges.
IplImage *iplOriginalColor; // image to be captured
IplImage *iplOriginalGrey; // grey-scale image of "iplOriginalColor"
IplImage *iplEdge; // image detected by Canny edge algorithm
IplImage *iplImg; // resulting image to show tracking process
IplImage *iplEdgeClone;
// set the process noise
// covariance of Gaussian noise to control
CvMat* transition_noise = cvCreateMat(D, D, CV_64FC1);
// assume the transition noise
for (int row = 0; row < D; row++)
{
for (int col = 0; col < D; col++)
{
cvmSet(transition_noise, row, col, 0.0);
}
}
cvmSet(transition_noise, 0, 0, 3.0);
cvmSet(transition_noise, 1, 1, 3.0);
cvmSet(transition_noise, 2, 2, 0.3);
// set the measurement noise
/*
// covariance of Gaussian noise to measurement
CvMat* measurement_noise = cvCreateMat(D, D, CV_64FC1);
// initialize the measurement noise
for (int row = 0; row < D; row++)
{
for (int col = 0; col < D; col++)
{
cvmSet(measurement_noise, row, col, 0.0);
}
}
cvmSet(measurement_noise, 0, 0, 5.0);
cvmSet(measurement_noise, 1, 1, 5.0);
cvmSet(measurement_noise, 2, 2, 5.0);
*/
double measurement_noise = 2.0; // standard deviation of Gaussian noise to measurement
CvMat* state = cvCreateMat(D, 1, CV_64FC1); // state of the system to be estimated
// CvMat* measurement = cvCreateMat(2, 1, CV_64FC1); // measurement of states
// declare particles
CvMat* pb [N]; // estimated particles
CvMat* pp [N]; // predicted particles
CvMat* pu [N]; // temporary variables to update a particle
CvMat* v[N]; // estimated velocity of each particle
CvMat* vu[N]; // temporary variable to update the velocity
double w[N]; // weight of each particle
for (int n = 0; n < N; n++)
{
pb[n] = cvCreateMat(D, 1, CV_64FC1);
pp[n] = cvCreateMat(D, 1, CV_64FC1);
pu[n] = cvCreateMat(D, 1, CV_64FC1);
v[n] = cvCreateMat(D, 1, CV_64FC1);
vu[n] = cvCreateMat(D, 1, CV_64FC1);
}
// initialize the state and particles (reconstructed: set the state once,
// then start every particle at the initial state with zero velocity)
cvmSet(state, 0, 0, 258.0); // center-x
cvmSet(state, 1, 0, 406.0); // center-y
cvmSet(state, 2, 0, 38.0); // radius
for (int n = 0; n < N; n++)
{
w[n] = 1.0 / (double) N; // equally weighted
for (int row = 0; row < D; row++)
{ cvmSet(pb[n], row, 0, cvmGet(state, row, 0)); cvmSet(v[n], row, 0, 0.0); }
}
// initialize the image window
cvZero(iplImg);
cvNamedWindow(titleResult);
cout << "start filtering... " << endl << endl;
float aperture = 3, thresLow = 50, thresHigh = 110;
// float aperture = 3, thresLow = 80, thresHigh = 110;
// for each frame
int frameNo = 0;
while(frameNo < frame_count && cvGrabFrame(capture)) {
// retrieve color frame from the movie "capture"
iplOriginalColor = cvRetrieveFrame(capture);
// convert color pixel values of "iplOriginalColor" to grey scales of "iplOriginalGrey"
cvCvtColor(iplOriginalColor, iplOriginalGrey, CV_BGR2GRAY); // captured frames are BGR in OpenCV
// extract edges with Canny detector from "iplOriginalGrey" to save the results in the image "iplEdge"
cvCanny(iplOriginalGrey, iplEdge, thresLow, thresHigh, aperture);
int count = 0; // number of measurements
double dist_sum = 0;
for (int i = 0; i < 8; i++) // for 8 directions
{
double dist = scope[i] * 1.5;
for ( int range = 0; range < scope[i]; range++ )
{
int flag = 0;
for (int turn = -1; turn <= 1; turn += 2) // reverse the searching direction
{
search[i].x = direction[i].x + turn * range * d[i].x;
search[i].y = direction[i].y + turn * range * d[i].y;
// detect measurements
// CvScalar s = cvGet2D(iplEdge, cvRound(search[i].y), cvRound(search[i].x));
unsigned char s = CV_IMAGE_ELEM(iplEdge, unsigned char, cvRound(search[i].y), cvRound(search[i].x));
// if ( s.val[0] > 200 && s.val[1] > 200 && s.val[2] > 200 ) // color version via cvGet2D
if (s > 250) // white pixel in the single-channel edge image
{ // when the pixel value is white, a measurement (an edge point) is detected
flag = 1;
count++;
// cvCircle(iplEdgeClone, cvPoint(cvRound(search[i].x), cvRound(search[i].y)), 3, CV_RGB(200, 0, 255));
// cvShowImage("3D Particle filter", iplEdgeClone);
// cvWaitKey(1);
/* // get measurement
cvmSet(measurement, 0, 0, search[i].x);
cvmSet(measurement, 1, 0, search[i].y);
double dist = distance(measurement, pp[n]);
*/ // evaluate the difference between predictions of the particle and measurements
dist = distanceEuclidean(search[i], direction[i]);
break; // break for "turn"
} // end if
} // for turn
if ( flag == 1 )
{ break; } // break for "range"
} // for range
dist_sum += dist; // for all searching directions of one particle
} // for i direction
double dist_avg; // average distance of measurements from the one particle "n"
// cout << "count = " << count << endl;
dist_avg = dist_sum / 8;
// cout << "dist_avg = " << dist_avg << endl;
// estimate likelihood with "dist_avg"
like[n] = likelihood(dist_avg, measurement_noise);
// cout << "likelihood of particle-#" << n << " = " << like[n] << endl;
like_sum += like[n];
} // for n particle
// cout << "sum of likelihoods of N particles = " << like_sum << endl;
// estimate states
double state_x = 0.0;
double state_y = 0.0;
double state_r = 0.0;
// estimate the state with predicted particles
for (int n = 0; n < N; n++) // for "N" particles
{
w[n] = like[n] / like_sum; // update normalized weights of particles
// cout << "w" << n << "= " << w[n] << " ";
state_x += w[n] * cvmGet(pp[n], 0, 0); // center-x of the object
state_y += w[n] * cvmGet(pp[n], 1, 0); // center-y of the object
state_r += w[n] * cvmGet(pp[n], 2, 0); // radius of the object
}
if (state_r < 0) { state_r = 0; }
cvmSet(state, 0, 0, state_x);
cvmSet(state, 1, 0, state_y);
cvmSet(state, 2, 0, state_r);
// define cumulative weights of the particles; 0 < a[0] < a[1] < ... < a[N-1] = 1
a[0] = w[0];
for (int n = 1; n < N; n++)
{
a[n] = a[n - 1] + w[n];
// cout << "a" << n << "= " << a[n] << " ";
}
// cout << "a" << N << "= " << a[N] << " " << endl;
for (int n = 0; n < N; n++)
{
// select a particle from the distribution
int pselected;
for (int k = 0; k < N; k++)
{
if ( uniform_random() < a[k] )
{
pselected = k;
break;
}
}
// cout << "p " << n << " => " << pselected << " ";
// retain the selection
for (int row = 0; row < D; row++)
{
cvmSet(pu[n], row, 0, cvmGet(pp[pselected],row,0));
cvSub(pp[pselected], pb[pselected], vu[n]); // pp - pb -> vu
}
}
// update each particle and its velocity
for (int n = 0; n < N; n++)
{
for (int row = 0; row < D; row++)
{
cvmSet(pb[n], row, 0, cvmGet(pu[n],row,0));
cvmSet(v[n], row , 0, cvmGet(vu[n],row,0));
}
}
cout << endl << endl ;
Kalman filter
: estimates system states that can only be observed indirectly or inaccurately by the system itself.
: estimates the variables of a wide range of processes.
: estimates the states of a linear system.
: minimizes the variance of the estimation error
Linear system
x: state of the system
u: known input to the system
y: measured output
w: process noise
z: measurement noise
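With these symbols, the discrete-time linear system being assumed can be written (in one standard form) as:

$x_{k+1} = A x_k + B u_k + w_k$
$y_k = C x_k + z_k$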
http://wiki.answers.com/Q/What_is_a_feedback_system
A feedback system, in general engineering terms, is a system whose output is fed back to the input, and depending on the output, the input is adjusted so as to reach a steady state. In colloquial language, you adjust your input based on the output of your system so as to achieve a certain end, like minimizing disturbance, cancelling echo (in a speech system), and so on.
Criteria of an Estimator
1) The expected value of the estimate should be equal to the expected value of the state.
2) The estimator should be with the smallest possible error variance.
Requirement of Kalman filter
1) The average value of w is zero and average value of z is zero.
2) No correlation exists between w and z. w_k and z_k are independent random variables.
Kalman filter equations
K matrix: Kalman gain
P matrix: estimation error covariance
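One standard discrete-time form of the resulting recursion (notation varies across texts; $S_w$ and $S_z$ denote the covariances of $w$ and $z$):

$\hat{x}^{-} = A\hat{x} + Bu, \qquad \bar{P} = A P A^{\top} + S_w$
$K = \bar{P} C^{\top} (C \bar{P} C^{\top} + S_z)^{-1}$
$\hat{x} = \hat{x}^{-} + K\,(y - C\hat{x}^{-}), \qquad P = (I - K C)\,\bar{P}$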
http://en.wikipedia.org/wiki/Three_sigma_rule
In statistics, the 68-95-99.7 rule, or three-sigma rule, or empirical rule, states that for a normal distribution, nearly all values lie within 3 standard deviations of the mean.
"steady state Kalman filter"
- K matrix & P matrix are constant
"extended Kalman filter"
: an extension of linear Kalman filter theory to nonlinear systems
"Kalman smoother"
: to estimate the state as a function of time so to reconstruct the trajectory after the fact
H infinity filter
=> correlated noise problem
=> unknown noise covariances problem
spacecraft navigation for the Apollo space program
> applications
all forms of navigation (aerospace, land, and marine)
nuclear power plant instrumentation
demographic modeling
manufacturing
the detection of underground radioactivity
fuzzy logic and neural network training
Gelb, A. Applied Optimal Estimation. Cambridge, MA: MIT Press, 1974.
Anderson, B. and J. Moore. Optimal Filtering. Englewood Cliffs, NJ: Prentice-Hall, 1979.
Grewal, M. and A. Andrews. Kalman Filtering Theory and Practice. Englewood Cliffs, NJ: Prentice-Hall, 1993.
Sorenson, H. Kalman Filtering: Theory and Application. Los Alamitos, CA: IEEE Press, 1985.
Peter Joseph’s Web site @http://ourworld.compuserve.com/homepages/PDJoseph/
// distance between measurement and prediction
double distance(CvMat* measurement, CvMat* prediction)
{
double distance2 = 0;
double difference = 0;
for (int u = 0; u < 2; u++)
{
difference = cvmGet(measurement,u,0) - cvmGet(prediction,u,0);
distance2 += difference * difference;
}
return sqrt(distance2);
}
// likelihood based on a multivariate normal distribution (ref. 15p eqn(2.4))
double likelihood(CvMat *mean, CvMat *covariance, CvMat *sample) {
// (body reconstructed; the listing omitted it. Assumes a diagonal 2x2 "covariance" whose
//  diagonal entries are used as standard deviations, as in the noise matrices set below.)
double like = 1.0;
for (int u = 0; u < 2; u++)
{
double d = cvmGet(sample,u,0) - cvmGet(mean,u,0);
double sd = cvmGet(covariance,u,u);
like *= exp(-0.5 * d*d / (sd*sd)) / (sqrt(2.0 * CV_PI) * sd);
}
return like;
}
/*
struct particle
{
double weight; // weight of a particle
CvMat* loc_p = cvCreateMat(2, 1, CV_64FC1); // previously estimated position of a particle
CvMat* loc = cvCreateMat(2, 1, CV_64FC1); // currently estimated position of a particle
CvMat* velocity = cvCreateMat(2, 1, CV_64FC1); // estimated velocity of a particle
cvSub(loc, loc_p, velocity); // loc - loc_p -> velocity
};
*/
int main (int argc, char * const argv[]) {
srand(time(NULL));
int width = 400; // width of image window
int height = 400; // height of image window
CvMat* state = cvCreateMat(2, 1, CV_64FC1);
// declare particles
/* particle pb[N]; // N estimated particles
particle pp[N]; // N predicted particles
particle pu[N]; // temporary variables for updating particles
*/
CvMat* pb [N]; // estimated particles
CvMat* pp [N]; // predicted particles
CvMat* pu [N]; // temporary variables to update a particle
CvMat* v[N]; // estimated velocity of each particle
CvMat* vu[N]; // temporary variable to update the velocity
double w[N]; // weight of each particle
for (int n = 0; n < N; n++)
{
pb[n] = cvCreateMat(2, 1, CV_64FC1);
pp[n] = cvCreateMat(2, 1, CV_64FC1);
pu[n] = cvCreateMat(2, 1, CV_64FC1);
v[n] = cvCreateMat(2, 1, CV_64FC1);
vu[n] = cvCreateMat(2, 1, CV_64FC1);
}
// initialize particles and the state
for (int n = 0; n < N; n++)
{
w[n] = (double) 1 / (double) N; // equally weighted
for (int row=0; row < 2; row++)
{
cvmSet(pb[n], row, 0, 0.0);
cvmSet(v[n], row, 0, 15 * uniform_random());
cvmSet(state, row, 0, 0.0);
}
}
// set the system
CvMat* groundtruth = cvCreateMat(2, 1, CV_64FC1); // groundtruth of states
CvMat* measurement [3]; // measurement of states
for (int k = 0; k < 3; k++) // 3 types of measurement
{
measurement[k] = cvCreateMat(2, 1, CV_64FC1);
}
Lessons:
1. I learned that the key is to set the transition noise and measurement noise (and their covariances), as well as each particle's initial position and state, appropriately; and that this is called tuning.
1-1. In the system assumed in this code, success or failure of the tracking depends especially on how the particles' initial velocities are set.
2. The true state undergoes constant-acceleration motion and is nonlinear, but here the motion between frames is assumed to be constant-velocity, so an approximate linear-system model is applied.
2-1. One could therefore apply a Kalman filter here and compare the results; in that case a 3-dimensional Kalman gain has to be computed.
2-2. If the highlighted (pink) part were changed into a nonlinear model and particle filtering were then applied, wouldn't the result be better? And wouldn't tuning become easier?
3. Let's clean up this code; it is way too messy.
4. But wait, what is it with the filter wandering around for a while and then suddenly catching up??? (results below)
// count the number of detections in measurement process
int count_detections (void)
{
// set cases of measurement results
double mtype[4];
mtype [0] = 0.0;
mtype [1] = 0.5;
mtype [2] = mtype[1] + 0.3;
mtype [3] = mtype[2] + 0.2;
// cout << "check measurement type3 = " << mtype[3] << endl; // just to check
// select a measurement case
double mrn = uniform_random();
int type = 3; // default: all three points measured (when mrn >= mtype[2])
for ( int k = 0; k < 3; k++ )
{
if ( mrn < mtype[k] )
{
type = k;
break;
}
}
return type;
}
// get measurements
int get_measurement (CvMat* measurement[], CvMat* measurement_noise, double x, double y)
{
// set measurement types
double c1 = 1.0, c2 = 4.0;
// measured point 1
cvmSet(measurement[0], 0, 0, x + (c1 * cvmGet(measurement_noise,0,0) * gaussian_random())); // x-value
cvmSet(measurement[0], 1, 0, y + (c1 * cvmGet(measurement_noise,1,1) * gaussian_random())); // y-value
// measured point 2
cvmSet(measurement[1], 0, 0, x + (c2 * cvmGet(measurement_noise,0,0) * gaussian_random())); // x-value
cvmSet(measurement[1], 1, 0, y + (c2 * cvmGet(measurement_noise,1,1) * gaussian_random())); // y-value
// measured point 3 // clutter noise
cvmSet(measurement[2], 0, 0, width*uniform_random()); // x-value
cvmSet(measurement[2], 1, 0, height*uniform_random()); // y-value
// count the number of measured points
int number = count_detections(); // number of detections
cout << "# of measured points = " << number << endl;
// get measurement
for (int index = 0; index < number; index++)
{
cout << "measurement #" << index << " : "
<< cvmGet(measurement[index],0,0) << " " << cvmGet(measurement[index],1,0) << endl;
// set the system
CvMat* state = cvCreateMat(2, 1, CV_64FC1); // state of the system to be estimated
CvMat* groundtruth = cvCreateMat(2, 1, CV_64FC1); // groundtruth of states
CvMat* measurement [3]; // measurement of states
for (int k = 0; k < 3; k++) // 3 types of measurement
{
measurement[k] = cvCreateMat(2, 1, CV_64FC1);
}
// declare particles
CvMat* pb [N]; // estimated particles
CvMat* pp [N]; // predicted particles
CvMat* pu [N]; // temporary variables to update a particle
CvMat* v[N]; // estimated velocity of each particle
CvMat* vu[N]; // temporary variable to update the velocity
double w[N]; // weight of each particle
for (int n = 0; n < N; n++)
{
pb[n] = cvCreateMat(2, 1, CV_64FC1);
pp[n] = cvCreateMat(2, 1, CV_64FC1);
pu[n] = cvCreateMat(2, 1, CV_64FC1);
v[n] = cvCreateMat(2, 1, CV_64FC1);
vu[n] = cvCreateMat(2, 1, CV_64FC1);
}
// initialize the state and particles
for (int n = 0; n < N; n++)
{
w[n] = (double) 1 / (double) N; // equally weighted
for (int row=0; row < 2; row++)
{
cvmSet(state, row, 0, 0.0);
cvmSet(pb[n], row, 0, 0.0);
cvmSet(v[n], row, 0, 15 * uniform_random());
}
}
// set the process noise
// covariance of Gaussian noise to control
// (note: the diagonal entries below are used directly as per-axis
// standard deviations in the prediction step, not as variances)
CvMat* transition_noise = cvCreateMat(2, 2, CV_64FC1);
cvmSet(transition_noise, 0, 0, 3.0); //set transition_noise(0,0) to 3.0
cvmSet(transition_noise, 0, 1, 0.0);
cvmSet(transition_noise, 1, 0, 0.0);
cvmSet(transition_noise, 1, 1, 6.0); //set transition_noise(1,1) to 6.0
// set the measurement noise
// covariance of Gaussian noise to measurement
CvMat* measurement_noise = cvCreateMat(2, 2, CV_64FC1);
cvmSet(measurement_noise, 0, 0, 2); //set measurement_noise(0,0) to 2.0
cvmSet(measurement_noise, 0, 1, 0.0);
cvmSet(measurement_noise, 1, 0, 0.0);
cvmSet(measurement_noise, 1, 1, 2);
// initialize the image window
cvZero(iplImg);
cvNamedWindow("ParticleFilter-3d", 0);
for (int t = 0; t < step; t++) // for "step" steps
{
// cvZero(iplImg);
cout << "step " << t << endl;
// get the groundtruth
double gx = 10 * t;
double gy = (-1.0 / width ) * (gx - width) * (gx - width) + height;
get_groundtruth(groundtruth, gx, gy);
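// (the ground truth traces the parabola
//  gy = -(gx - width)^2 / width + height, vertex at (width, height),
//  with gx advancing 10 pixels per step)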
// get measurements
int count = get_measurement(measurement, measurement_noise, gx, gy);
double like[N]; // likelihood between measurement and prediction
double like_sum = 0; // sum of likelihoods
for (int n = 0; n < N; n++) // for "N" particles
{
// predict
double prediction;
for (int row = 0; row < 2; row++)
{
prediction = cvmGet(pb[n],row,0) + cvmGet(v[n],row,0) + cvmGet(transition_noise,row,row) * gaussian_random();
cvmSet(pp[n], row, 0, prediction);
}
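// (constant-velocity transition from lesson 2: p_t = p_{t-1} + v + w,
//  where w is drawn per axis using the diagonal of transition_noise
//  as standard deviations)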
// cvLine(iplImg, cvPoint(cvRound(cvmGet(pp[n],0,0)), cvRound(cvmGet(pp[n],1,0))), cvPoint(cvRound(cvmGet(pb[n],0,0)), cvRound(cvmGet(pb[n],1,0))), CV_RGB(100,100,0), 1);
// cvCircle(iplImg, cvPoint(cvRound(cvmGet(pp[n],0,0)), cvRound(cvmGet(pp[n],1,0))), 1, CV_RGB(255, 255, 0));
// evaluate measurements
double range = (double) width; // search radius for measurements around each particle
// cout << "range of distances = " << range << endl;
int mselected = 0; // index of the nearest measurement (defaults to the first)
for (int index = 0; index < count; index++)
{
double d = distance(measurement[index], pp[n]);
if ( d < range )
{
range = d;
mselected = index; // selected measurement
}
}
// cout << "selected measurement # = " << mselected << endl;
like[n] = likelihood(measurement[mselected], measurement_noise, pp[n]);
// cout << "likelihood of #" << n << " = " << like[n] << endl;
like_sum += like[n];
}
// cout << "sum of likelihoods = " << like_sum << endl;
// estimate states
double state_x = 0.0;
double state_y = 0.0;
// estimate the state with predicted particles
for (int n = 0; n < N; n++) // for "N" particles
{
w[n] = like[n] / like_sum; // update normalized weights of particles
// cout << "w" << n << "= " << w[n] << " ";
state_x += w[n] * cvmGet(pp[n], 0, 0); // x-value
state_y += w[n] * cvmGet(pp[n], 1, 0); // y-value
}
cvmSet(state, 0, 0, state_x);
cvmSet(state, 1, 0, state_y);
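// (the estimate is the weighted mean of the predicted particles,
//  sum_n w[n] * pp[n], i.e. the expectation under the discrete posterior)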
// cumulative weights for resampling: 0 < a[0] <= a[1] <= ... <= a[N-1] = 1
double a[N]; // (declared here; no declaration is visible in this fragment)
a[0] = w[0];
for (int n = 1; n < N; n++)
{
a[n] = a[n - 1] + w[n];
// cout << "a" << n << "= " << a[n] << " ";
}
// cout << "a" << N-1 << "= " << a[N-1] << " " << endl;
for (int n = 0; n < N; n++)
{
// select a particle from the distribution
// (draw the random number once per selection; calling uniform_random()
// afresh at every comparison does not sample the cumulative distribution)
double r = uniform_random();
int pselected = N - 1; // fallback for r >= a[N-2]
for (int k = 0; k < N; k++)
{
if ( r < a[k] )
{
pselected = k;
break;
}
}
// cout << "p " << n << " => " << pselected << " ";
Branislav Kisačanin & Vladimir Pavlović & Thomas S. Huang
Real-Time Vision for Human-Computer Interaction
(RTV4HCI) Springer, 2005
(Google Books overview)
Computer vision and pattern recognition continue to play a dominant
role in the HCI realm. However, computer vision methods often fail to
become pervasive in the field due to the lack of real-time, robust
algorithms, and novel and convincing applications.
Keywords:
head and face modeling
map building
pervasive computing
real-time detection
Contents:
- RTV4HCI: A Historical Overview.
- Real-Time Algorithms: From Signal Processing to Computer Vision.
- Recognition of Isolated Fingerspelling Gestures Using Depth Edges.
- Appearance-Based Real-Time Understanding of Gestures Using Projected Euler Angles.
- Flocks of Features for Tracking Articulated Objects.
- Static Hand Posture Recognition Based on Okapi-Chamfer Matching.
- Visual Modeling of Dynamic Gestures Using 3D Appearance and Motion Features.
- Head and Facial Animation Tracking Using Appearance-Adaptive Models and Particle Filters.
- A Real-Time Vision Interface Based on Gaze Detection -- EyeKeys.
- Map Building from Human-Computer Interactions.
- Real-Time Inference of Complex Mental States from Facial Expressions and Head Gestures.
- Epipolar Constrained User Pushbutton Selection in Projected Interfaces.
- Vision-Based HCI Applications.
- The Office of the Past.
- MPEG-4 Face and Body Animation Coding Applied to HCI.
- Multimodal Human-Computer Interaction.
- Smart Camera Systems Technology Roadmap.
- Index.
The goal of research in real-time vision for human-computer interaction
is to develop algorithms and systems that sense and perceive humans and
human activity, in order to enable more natural, powerful, and
effective computer interfaces.
- Presence and location (Face and body detection, head and body tracking)
- Identity (Face recognition, gait recognition)
- Expression (Facial feature tracking, expression modeling and analysis)
- Focus of attention (Head/face tracking, eye gaze tracking)
- Body posture and movement (Body modeling and tracking)
- Gesture (Gesture recognition, hand tracking)
- Activity (Analysis of body movement)
e.g.
VIDEOPLACE (M W Krueger, Artificial Reality II, Addison-Wesley, 1991)
Magic Morphin Mirror / Mass Hallucinations (T Darrell et al., SIGGRAPH Visual Proc, 1997)
Principal Component Analysis (PCA)
Linear Discriminant Analysis (LDA)
Gabor Wavelet Networks (GWNs)
Active Appearance Models (AAMs)
Hidden Markov Models (HMMs)
Identix Inc.
Viisage Technology Inc.
Cognitec Systems
- MIT Media Lab
ALIVE system (P Maes et al., The ALIVE system: wireless, full-body
interaction with autonomous agents, ACM Multimedia Systems, 1996)
PFinder system (C R Wren et al., Pfinder: Real-time tracking of the human body, IEEE Trans PAMI, pp 780-785, 1997)
KidsRoom project (A Bobick et al., The KidsRoom: A perceptually-based interactive and immersive story environment, PRESENCE: Teleoperators and Virtual Environments, pp 367-391, 1999)
Flocks of Features for Tracking Articulated Objects
Mathias Kolsch (kolsch@nps.edu)
Computer Science Department, Naval Postgraduate School, Monterey
Matthew Turk (mturk@cs.ucsb.edu)
Computer Science Department, University of California, Santa Barbara
The Office of the Past
Jiwon Kim (jwkim@cs.washington.edu), Steven M. Seitz (seitz@cs.washington.edu)
University of Washington
Maneesh Agrawala (maneesh@microsoft.com)
Microsoft Research
Proceedings of the 2004 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04), Volume 10, p. 157, 2004.
http://desktop.google.com
http://grail.cs.washington.edu/projects/office/
http://www.realvnc.com/
Smart Camera Systems Technology Roadmap
Bruce Flinchbaugh (b-flinchbaugh@ti.com)
Texas Instruments
CvMat* state = cvCreateMat(2, 1, CV_64FC1); //states to be estimated
CvMat* state_p = cvCreateMat(2, 1, CV_64FC1); //states to be predicted
CvMat* velocity = cvCreateMat(2, 1, CV_64FC1); //motion controls to change states
CvMat* measurement = cvCreateMat(2, 1, CV_64FC1); //measurement of states
CvMat* cov = cvCreateMat(2, 2, CV_64FC1); //covariance to be updated
CvMat* cov_p = cvCreateMat(2, 2, CV_64FC1); //covariance to be predicted
CvMat* gain = cvCreateMat(2, 2, CV_64FC1); //Kalman gain to be updated
// temporary matrices to be used for estimation
CvMat* Kalman = cvCreateMat(2, 2, CV_64FC1); //
CvMat* invKalman = cvCreateMat(2, 2, CV_64FC1); //
CvMat* I = cvCreateMat(2,2,CV_64FC1);
cvSetIdentity(I); // does not seem to be working properly
// cvSetIdentity (I, cvRealScalar (1));
// check matrix
for(int i=0; i<2; i++)
{
for(int j=0; j<2; j++)
{
cout << cvmGet(I, i, j) << "\t";
}
cout << endl;
}
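The loop below also reads transition_noise and measurement_noise, which this Kalman fragment never declares; presumably they mirror the particle-filter covariances above. A minimal sketch to make the snippet self-contained (the values are assumptions):
CvMat* transition_noise = cvCreateMat(2, 2, CV_64FC1); // assumed declaration
cvZero(transition_noise);
cvmSet(transition_noise, 0, 0, 3.0); // assumed value
cvmSet(transition_noise, 1, 1, 6.0); // assumed value
CvMat* measurement_noise = cvCreateMat(2, 2, CV_64FC1); // assumed declaration
cvZero(measurement_noise);
cvmSet(measurement_noise, 0, 0, 2.0); // assumed value
cvmSet(measurement_noise, 1, 1, 2.0); // assumed value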
// set the initial state
cvmSet(state, 0, 0, 0.0); //x-value //set state(0,0) to 0.0
cvmSet(state, 1, 0, 0.0); //y-value //set state(1,0) to 0.0
// set the initial covariance of state
cvmSet(cov, 0, 0, 0.5); //set cov(0,0) to 0.5
cvmSet(cov, 0, 1, 0.0); //set cov(0,1) to 0.0
cvmSet(cov, 1, 0, 0.0); //set cov(1,0) to 0.0
cvmSet(cov, 1, 1, 0.4); //set cov(1,1) to 0.4 (the original wrote this to cov(1,0) by mistake)
// set the initial control
cvmSet(velocity, 0, 0, 10.0); //x-direction //set velocity(0,0) to 10.0
cvmSet(velocity, 1, 0, 10.0); //y-direction //set velocity(1,0) to 10.0
for (int t=0; t<step; t++)
{
// retain the current state
CvMat* state_out = cvCreateMat(2, 1, CV_64FC1); // temporary vector
cvmSet(state_out, 0, 0, cvmGet(state,0,0));
cvmSet(state_out, 1, 0, cvmGet(state,1,0));
// predict
cvAdd(state, velocity, state_p); // state + velocity -> state_p
cvAdd(cov, transition_noise, cov_p); // cov + transition_noise -> cov_p
// measure
cvmSet(measurement, 0, 0, measurement_set[t]); //x-value
cvmSet(measurement, 1, 0, measurement_set[step+t]); //y-value
// estimate Kalman gain
cvAdd(cov_p, measurement_noise, Kalman); // cov_p + measure_noise -> Kalman
cvInvert(Kalman, invKalman); // inv(Kalman) -> invKalman
cvMatMul(cov_p, invKalman, gain); // cov_p * invKalman -> gain
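// (with observation matrix H = I, the textbook gain
//  K = P_p * H^T * inv(H * P_p * H^T + R) reduces to
//  K = P_p * inv(P_p + R), which is what the three lines above compute)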
// update the state
CvMat* err = cvCreateMat(2, 1, CV_64FC1); // temporary vector
cvSub(measurement, state_p, err); // measurement - state_p -> err
CvMat* adjust = cvCreateMat(2, 1, CV_64FC1); // temporary vector
cvMatMul(gain, err, adjust); // gain*err -> adjust
cvAdd(state_p, adjust, state); // state_p + adjust -> state
// update the covariance of states
CvMat* cov_up = cvCreateMat(2, 2, CV_64FC1); // temporary matrix
cvSub(I, gain, cov_up); // I - gain -> cov_up
cvMatMul(cov_up, cov_p, cov); // cov_up *cov_p -> cov
// update the control with the estimated displacement over this step
cvSub(state, state_out, velocity); // state - state_out -> velocity
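// (not in the original) release the per-iteration temporaries; as written,
// state_out, err, adjust and cov_up leak four matrices every frame
cvReleaseMat(&state_out);
cvReleaseMat(&err);
cvReleaseMat(&adjust);
cvReleaseMat(&cov_up);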