207 posts in the 'Computer Vision' category

  1. 2009.04.03 Rosten & Drummond <Machine learning for high-speed corner detection>
  2. 2009.03.31 Montemerlo & Thrun <Simultaneous localization and mapping with unknown data association using FastSLAM>
  3. 2009.03.31 A. J. Davison <Real-time simultaneous localisation and mapping with a single camera>
  4. 2009.03.31 <A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking>
  5. 2009.03.31 Frank C. Park and Bryan J. Martin <Robot Sensor Calibration: Solving AX = XB on the Euclidean Group>
  6. 2009.03.31 Rao-Blackwellized Particle Filter
  7. 2009.03.27 Ethan Eade & Tom Drummond <Scalable Monocular SLAM>
  8. 2009.03.27 people in SLAM
  9. 2009.03.26 Civera, Davison & Montiel <Inverse Depth Parametrization for Monocular SLAM>
  10. 2009.02.16 camera calibration 09-02-16
  11. 2009.02.14 Special Issue on Visual SLAM (IEEE Transactions on Robotics, Vol. 24, No. 5)
  12. 2009.02.06 IMU sensor calibration
  13. 2009.02.06 SLAM, AR
  14. 2009.02.02 [유찬기] OpenCV & CV #4: Camera Projection Model
  15. 2009.01.29 [유찬기] OpenCV & CV #3: Image Filtering
  16. 2009.01.22 [Gonzalez & Woods] Digital Image Processing: 3. Image Enhancement in the Spatial Domain
  17. 2009.01.22 [유찬기] OpenCV & CV #2: Getting Started with OpenCV
  18. 2009.01.21 OpenCVX
  19. 2009.01.20 [유찬기] OpenCV & CV #1: Introduction to Computer Vision
  20. 2009.01.12 [유찬기] OpenCV & CV #0: Installing OpenCV
  21. 2009.01.09 [유찬기] OpenCV & CV study guide
  22. 2009.01.05 .ai file - 7cm*7cm grid
  23. 2008.12.29 [G. Strang] Linear Algebra
  24. 2008.12.07 Aleš Leonardis lecture <Learning Hierarchical Representations of Object Categories>
  25. 2008.11.28 convex optimization
2009. 4. 3. 17:52 Computer Vision
arim read:

Machine learning for high-speed corner detection
Edward Rosten and Tom Drummond
Department of Engineering, Cambridge University,


Edward Rosten
http://mi.eng.cam.ac.uk/~er258/



FAST = Features from Accelerated Segment Test
http://en.wikipedia.org/wiki/Corner_detection


ID3
http://en.wikipedia.org/wiki/ID3_algorithm
ID3 (Iterative Dichotomiser 3) is an algorithm used to generate a decision tree invented by Ross Quinlan.
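For reference, here is a minimal Python/numpy sketch of the plain segment test underlying FAST: a pixel p is a corner if at least n contiguous pixels on the 16-pixel Bresenham circle of radius 3 around it are all brighter than I_p + t or all darker than I_p - t. The defaults t = 20 and n = 12 are illustrative; the paper's contribution is using ID3 to learn a decision tree that orders these comparisons so that most candidates are rejected after only a few pixel tests.

import numpy as np

# Offsets of the 16-pixel Bresenham circle of radius 3 used by FAST.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_fast_corner(img, x, y, t=20, n=12):
    """Plain segment test: (x, y) is a corner if at least n contiguous
    circle pixels are all brighter than I_p + t or all darker than I_p - t."""
    p = int(img[y, x])
    ring = [int(img[y + dy, x + dx]) for dx, dy in CIRCLE]
    for states in ([v > p + t for v in ring], [v < p - t for v in ring]):
        run = best = 0
        for s in states + states:  # doubled so contiguous runs can wrap around
            run = run + 1 if s else 0
            best = max(best, run)
        if best >= n:
            return True
    return False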

posted by maetel
2009. 3. 31. 22:25 Computer Vision

Simultaneous localization and mapping with unknown data association using FastSLAM
Montemerlo, M.   Thrun, S.  
Robotics Inst., Carnegie Mellon Univ., Pittsburgh, PA, USA


This paper appears in: Robotics and Automation, 2003. Proceedings. ICRA '03. IEEE International Conference on
Publication Date: 14-19 Sept. 2003
Volume: 2
On page(s): 1985 - 1991 vol.2
Number of Pages: 3 vol.lii+4450
ISSN: 1050-4729
ISBN: 0-7803-7736-2
INSPEC Accession Number:7877180
Digital Object Identifier: 10.1109/ROBOT.2003.1241885
Current Version Published: 2003-11-10


Michael Montemerlo @ Field Robotics Center, Carnegie Mellon University 
http://en.wikipedia.org/wiki/Michael_Montemerlo

Sebastian Thrun @ Stanford Artificial Intelligence Laboratory, Stanford University
http://en.wikipedia.org/wiki/Sebastian_Thrun

http://www.probabilistic-robotics.org/


FastSLAM
http://robots.stanford.edu/probabilistic-robotics/ppt/fastslam.ppt

Rao-Blackwellized Particle Filter
http://en.wikipedia.org/wiki/Particle_filter


I. Introduction



SLAM

mobile robotics

the problem of building a map of an unknown environment from a sequence of noisy landmark measurements obtained from a moving robot + a robot localization problem => SLAM

autonomous robots operating in environments where precise maps and positioning are not available


 

Extended Kalman Filter (EKF)
: used for incrementally estimating the joint posterior distribution over robot pose and landmark positions

limitations of EKF
1) Quadratic complexity: sensor updates require time quadratic in the total number of landmarks in the map
=> limits maps to only a few hundred landmarks (whereas natural environment models frequently contain millions of features)
2) Data association / correspondence: the mapping of observations to landmarks
=> associating even a small number of observations with incorrect landmarks can cause the filter to diverge



 

FastSLAM decomposes the SLAM problem into a robot localization problem, and a collection of landmark estimation problems that are conditioned on the robot pose estimate.

ref. Montemerlo & Thrun & Koller & Wegbreit <FastSLAM: A factored solution to the simultaneous localization and mapping problem>   In Proceedings of the AAAI National Conference on Artificial Intelligence, Edmonton, Canada, 2002. AAAI.

 

FastSLAM factors the SLAM posterior into a localization problem and K independent landmark estimation problems conditioned on the robot pose estimate.

> a modified particle filter to estimate the posterior over robot paths
> each particle possessing K independent Kalman filters that estimate the landmark locations conditioned on the particle's path
=> an instance of the Rao-Blackwellized particle filter

Representing particles as binary trees of Kalman filters
-> incorporating observations into FastSLAM in O(M log K) time (M = number of particles, K = number of landmarks)

Each particle represents a different robot pose hypothesis.
=> Data association can be considered separately for every particle.
=> 1) The noise of robot motion does not affect the accuracy of data association.
2) Incorrectly associated particles will receive lower probabilities and will be removed in future resampling steps. (A sketch of this factored filter follows.)
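A minimal sketch of the factored filter for one observation of a known landmark (the helpers motion_model, h, H and the measurement noise Q are assumed given; resampling uses shallow copies for brevity, whereas a real implementation would copy particle state and use the paper's binary-tree maps to reach O(M log K)):

import numpy as np

class Particle:
    """One robot-path hypothesis carrying K independent landmark EKFs."""
    def __init__(self, pose):
        self.pose = pose          # e.g. np.array([x, y, theta])
        self.landmarks = {}       # landmark id -> (mean, covariance)
        self.weight = 1.0

def fastslam_step(particles, u, z, j, motion_model, h, H, Q):
    """One FastSLAM update for a single observation z of landmark j."""
    for p in particles:
        p.pose = motion_model(p.pose, u)    # sample from the motion model
        mu, P = p.landmarks[j]
        innov = z - h(p.pose, mu)           # measurement innovation
        Hj = H(p.pose, mu)                  # Jacobian w.r.t. the landmark
        S = Hj @ P @ Hj.T + Q               # innovation covariance
        K = P @ Hj.T @ np.linalg.inv(S)     # per-landmark Kalman gain
        p.landmarks[j] = (mu + K @ innov, (np.eye(len(mu)) - K @ Hj) @ P)
        # Importance weight: likelihood of the innovation under N(0, S).
        p.weight *= np.exp(-0.5 * innov @ np.linalg.solve(S, innov)) \
                    / np.sqrt(np.linalg.det(2.0 * np.pi * S))
    w = np.array([p.weight for p in particles])
    w /= w.sum()
    idx = np.random.choice(len(particles), size=len(particles), p=w)
    return [particles[i] for i in idx]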


 

Experiments on real-world data with ambiguous data association:
- adding extra odometric noise
( Odometry is the use of data from the movement of actuators to estimate change in position over time. )
- estimating an accurate map without any odometry, in an environment where the Kalman filter inevitably diverges
- how to incorporate negative information, resulting in a measurable increase in the accuracy of the resulting map



 

II. SLAM Problem Definition


probabilistic Markov chain
http://en.wikipedia.org/wiki/Markov_chain

robot's position & heading orientation, s

K landmarks' locations, θ

i) The robot's current pose is a probabilistic function of the robot control and the previous pose at time t.

ii) The sensor measurement, range and bearing to landmarks, is a probabilistic function of the robot's current pose and the landmark being observed at time t.

=> SLAM is the problem of determining the locations of all landmarks and robot poses from measurements and controls.


III. Data Association


uncertainty in the SLAM posterior, mapping between observations and landmarks
=> ambiguity in data association

i) measurement noise: uncertain landmark positions
<= 1) measurement ambiguity
2) confusion between nearby landmarks

ii) motion noise: robot pose uncertainty after incorporating a control
=> 1) adding a large amount of error to the robot's pose
2) causing a filter to diverge


IV. FastSLAM with Known Data Association


dynamic Bayes network
http://en.wikipedia.org/wiki/Dynamic_Bayesian_network

conditional independence
: The problem of determining the landmark locations could be decoupled into K independent estimation problems, one for each landmark.


FastSLAM estimates the factored SLAM posterior using a modified particle filter, with K independent Kalman Filters for each particle to estimate the landmark positions conditioned on the hypothesized robot paths. The resulting algorithm is an instance of the Rao-Blackwellized particle filter.



A. Particle Filter Path Estimation

Monte Carlo Localization (MCL) algorithm
http://en.wikipedia.org/wiki/Monte_Carlo_localization
 
particle set, representing the posterior ("guess") of a robot path

proposal distribution of particle filtering
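In the simplest (bootstrap) variant, the motion model itself serves as the proposal distribution, so the importance weights reduce to the measurement likelihood. A generic sketch, with motion_sample and likelihood as assumed-given callables:

import numpy as np

def bootstrap_step(S, u, z, motion_sample, likelihood):
    """One predict-weight-resample step of a bootstrap (SIR) particle filter;
    sampling from the motion model makes weights pure measurement likelihoods."""
    S = np.array([motion_sample(s, u) for s in S])    # predict (proposal)
    w = np.array([likelihood(z, s) for s in S])       # weight
    w /= w.sum()
    idx = np.random.choice(len(S), size=len(S), p=w)  # resample
    return S[idx]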








posted by maetel
2009. 3. 31. 21:10 Computer Vision

Real-time simultaneous localisation and mapping with a single camera

Davison, A.J.  
Dept. of Eng. Sci., Oxford Univ., UK;

This paper appears in: Computer Vision, 2003. Proceedings. Ninth IEEE International Conference on
Publication Date: 13-16 Oct. 2003
On page(s): 1403-1410 vol.2
ISBN: 0-7695-1950-4
INSPEC Accession Number: 7971070
Digital Object Identifier: 10.1109/ICCV.2003.1238654
Current Version Published: 2008-04-03


 

posted by maetel
2009. 3. 31. 20:22 Computer Vision

A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking

Arulampalam, M.S.   Maskell, S.   Gordon, N.   Clapp, T.  
Defence Sci. & Technol. Organ., Adelaide, SA , Australia;


This paper appears in: Signal Processing, IEEE Transactions on [see also Acoustics, Speech, and Signal Processing, IEEE Transactions on]
Publication Date: Feb. 2002
Volume: 50 , Issue: 2
On page(s): 174 - 188
ISSN: 1053-587X
CODEN: ITPRED
INSPEC Accession Number:7173038
Digital Object Identifier: 10.1109/78.978374
Current Version Published: 2002-08-07

posted by maetel
2009. 3. 31. 15:22 Computer Vision

VIP paper-reading
2009-03-31 @GA606


by adrift  



Robot Sensor Calibration: Solving AX = XB on the Euclidean Group
Frank C. Park and Bryan J. Martin

This paper appears in: Robotics and Automation, IEEE Transactions on
Publication Date: Oct 1994
Volume: 10,  Issue: 5
On page(s): 717-721
ISSN: 1042-296X
References Cited: 8
CODEN: IRAUEZ
INSPEC Accession Number: 4803588
Digital Object Identifier: 10.1109/70.326576
Current Version Published: 2002-08-06

Abstract
The equation AX=XB on the Euclidean group arises in the problem of calibrating wrist-mounted robotic sensors. In this article the authors derive, using methods of Lie theory, a closed-form exact solution that can be visualized geometrically, and a closed-form least squares solution when A and B are measured in the presence of noise.




http://en.wikipedia.org/wiki/Lie_group

http://en.wikipedia.org/wiki/Lie_algebra

http://en.wikipedia.org/wiki/SO(3)

http://en.wikipedia.org/wiki/SO(4)
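A numpy sketch of the paper's least-squares solution, under the convention A_i X = X B_i with 4x4 homogeneous transforms: with alpha_i = log R_Ai and beta_i = log R_Bi, the rotation satisfies alpha_i = R_X beta_i, solved by R_X = (M^T M)^(-1/2) M^T where M = sum_i beta_i alpha_i^T; the translation then follows from (R_Ai - I) t_X = R_X t_Bi - t_Ai. At least two motions with non-parallel rotation axes are needed for M to be invertible.

import numpy as np

def log_SO3(R):
    """Rotation matrix -> rotation vector (axis * angle)."""
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if theta < 1e-10:
        return np.zeros(3)
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return theta / (2.0 * np.sin(theta)) * w

def solve_AX_XB(As, Bs):
    """Least-squares X with A_i X = X B_i, following Park & Martin."""
    M = np.zeros((3, 3))
    for A, B in zip(As, Bs):
        M += np.outer(log_SO3(B[:3, :3]), log_SO3(A[:3, :3]))
    # (M^T M)^(-1/2) via eigendecomposition of the symmetric matrix M^T M.
    vals, vecs = np.linalg.eigh(M.T @ M)
    Rx = vecs @ np.diag(vals ** -0.5) @ vecs.T @ M.T
    # Stack (R_Ai - I) t_X = R_X t_Bi - t_Ai and solve by linear least squares.
    C = np.vstack([A[:3, :3] - np.eye(3) for A in As])
    d = np.concatenate([Rx @ B[:3, 3] - A[:3, 3] for A, B in zip(As, Bs)])
    tx = np.linalg.lstsq(C, d, rcond=None)[0]
    X = np.eye(4)
    X[:3, :3], X[:3, 3] = Rx, tx
    return X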



posted by maetel
2009. 3. 31. 14:00 Computer Vision



> papers

Vision-based SLAM using the Rao-Blackwellised Particle Filter
Robert Sim, Pantelis Elinas, Matt Griffin, and James J. Little


Doucet, A., Freitas, N. d., Murphy, K. P., and Russell, S. J. 2000. Rao-Blackwellised Particle Filtering for Dynamic Bayesian Networks. In Proceedings of the 16th Conference on Uncertainty in Artificial intelligence (June 30 - July 03, 2000). C. Boutilier and M. Goldszmidt, Eds. Morgan Kaufmann Publishers, San Francisco, CA, 176-183.




> tutorials

http://en.wikipedia.org/wiki/Particle_filter

Rudolph van der Merwe, Nando de Freitas, Arnaud Doucet, Eric Wan
The Unscented Particle Filter
In Advances in Neural Information Processing Systems 13 (Nov 2001)
->
Rudolph van der Merwe (OGI)
Nando de Freitas (UC Berkeley)
Arnaud Doucet (Cambridge University)
Eric Wan (OGI)



Kyuhyoung Choi (Sogang Univ., Korea) Particle Filter for Tracking



> related courses

Intelligent Embedded Systems (Fall 2002) @AI, MIT

posted by maetel
2009. 3. 27. 21:33 Computer Vision

Scalable Monocular SLAM
Eade, E.   Drummond, T.  
Cambridge University;

This paper appears in: Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on
Publication Date: 17-22 June 2006
Volume: 1,  On page(s): 469- 476
ISSN: 1063-6919
ISBN: 0-7695-2597-0
Digital Object Identifier: 10.1109/CVPR.2006.263
Current Version Published: 2006-07-05

 
Ethan Eade & Tom Drummond
Machine Intelligence Laboratory
the Division of Information Engineering at Cambridge University Engineering Department




monocular SLAM
particle filter + top-down search => real-time, large number  of landmarks

the first to apply this FastSLAM-type particle filter to single-camera SLAM


1. Introduction


SLAM = Simultaneous Localization and Mapping
: process of causally estimating both egomotion and structure in an online system

 SLAM using visual data in computer vision

SFM (= structure from motion): reconstructing scene geometry
+ causal or recursive estimation techniques

perspective-projection cameras

filtering methods to allow indirect observation models

Kalman filtering framework

Extended Kalman filter = EKF (-> to linearize the observation and dynamics models of the system)

causal estimation with recursive algorithms (cp. estimation depending only on observations up to the current time)
=> online operation (cp. SFM on global nonlinear optimization)


Davison's SLAM with a single camera
> EKF estimation framework
> top-down Bayesian estimation approach, searching for landmarks in image regions constrained by estimate uncertainty (instead of performing extensive bottom-up image processing and feature matching)
> Bayesian partial-initialization scheme for incorporating new landmarks
- cannot scale to large environments


EKF = the Extended Kalman filter
- N*N covariance matrix for N landmarks
- updated with N*N computation cost

> SLAM system using a single camera as the only sensor
> frame-rate operation with many landmarks
> FastSLAM-style particle filter (the first use of such an approach in a monocular SLAM setting)
> top-down active search
> an efficient algorithm for discovering the depth of new landmarks that avoids linearization errors
> a novel method for using partially initialized landmarks to help constrain camera pose


FastSLAM
: based on the Rao-Blackwellized Particle Filter

2. Background

2.1 Scalable SLAM

> submap
bounded complexity -> bounded computation and space requirements

Montemerlo & Thrun
If the entire camera motion is known then the estimates of the positions of different landmarks become independent of each other.







Rao-Blackwellized Particle Filter



ZNCC = the zero-mean normalized cross-correlation function
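For reference, a minimal ZNCC implementation; the score lies in [-1, 1] and is invariant to affine changes in patch intensity, which is why it is used to compare candidate patches along the epipolar line:

import numpy as np

def zncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two equal-size patches."""
    a = patch_a.astype(float) - patch_a.mean()
    b = patch_b.astype(float) - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0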


epipolar constraint

http://en.wikipedia.org/wiki/Epipolar_geometry


posted by maetel
2009. 3. 27. 21:05 Computer Vision

Hugh F. Durrant-Whyte, Australian Centre for Field Robotics
http://en.wikipedia.org/wiki/Hugh_F._Durrant-Whyte

John J. Leonard, Center for Ocean Engineering, MIT

Sebastian Thrun, Stanford Artificial Intelligence Laboratory, Stanford University
http://en.wikipedia.org/wiki/Sebastian_Thrun

David Nistér, Center for Visualization and Virtual Environments, University of Kentucky

Ethan Eade, Machine Intelligence lab, Engineering Department, Cambridge University

Tom Drummond, Machine Intelligence Laboratory, Engineering Department, Cambridge University

Javier Civera, Departamento de Informática e Ingeniería de Sistemas, Universidad de Zaragoza

Andrew J. Davison, Reader in Robot Vision at the Department of Computing, Imperial College London

Jose Maria Martinez Montiel, Robotics and Real Time Group, Universidad de Zaragoza

Robert Castle, Active Vision Laboratory, Robotics Research Group, Oxford University

임현, Embedded Control System Lab, School of Electrical Engineering, Inha University

김정호, Robotics and Computer Vision Lab (권인소), KAIST

labs
 
Active Vision Group, Robotics Research Group, Engineering Department, Oxford University

Computer Vision & Robotics Group, Machine Intelligence Laboratory, Department of Engineering, University of Cambridge

Image Information Processing Lab (홍기상), POSTECH

Intelligent Control and Systems Lab (김상우), POSTECH


posted by maetel
2009. 3. 26. 19:56 Computer Vision

Inverse Depth Parametrization for Monocular SLAM
Civera, J.   Davison, A.J.   Montiel, J. 


This paper appears in: Robotics, IEEE Transactions on
Publication Date: Oct. 2008
Volume: 24,  Issue: 5
On page(s): 932-945
ISSN: 1552-3098
INSPEC Accession Number: 10301459
Digital Object Identifier: 10.1109/TRO.2008.2003276
First Published: 2008-10-03
Current Version Published: 2008-10-31

Javier Civera, Departamento de Informática e Ingeniería de Sistemas, Universidad de Zaragoza

Andrew J. Davison, Reader in Robot Vision at the Department of Computing, Imperial College London

Jose Maria Martinez Montiel, Robotics and Real Time Group, Universidad de Zaragoza




monocular simultaneous localization and mapping  (SLAM)

representation of uncertainty

the standard extended Kalman filter (EKF)

direct parametrization of the inverse depth of features

feature initialization

camera motion estimates

6-D state vector --> converted to the Euclidean XYZ form

linearity index => automatic detection and conversion to maintain maximum efficiency



I. Introduction


monocular camera
: projective sensor measuring the bearing of image features

monocular (adj.): having a single eye; one-eyed

A stereo camera is a type of camera with two or more lenses. This allows the camera to simulate human binocular vision.

structure from motion = SFM
1) feature matching
2) global camera location & scene feature position estimates

sliding window processing

Sliding Window Protocol is a bi-directional data transmission protocol used in the data link layer (OSI model) as well as in TCP (transport layer of the OSI model). It is used to keep a record of the frame sequences sent and their respective acknowledgements received by both the users.

In robotics and computer vision, visual odometry is the process of determining the position and orientation of a robot by analyzing the associated camera images.

Odometry is the use of data from the movement of actuators to estimate change in position over time. Odometry is used by some robots, whether they be legged or wheeled, to estimate (not determine) their position relative to a starting location.

visual SLAM

probabilistic filtering approach

initializing uncertain depth estimates for distance features

Gaussian distributions implicit in the EKF

a new feature parametrization that is able to smoothly cope with initialization of features at all depths - even up to "infinity" - within the standard EKF framework: direct parametrization of inverse depth relative to the camera position from which a feature was first observed


A. Delayed and Undelayed Initialization

main map; main probabilistic state; main state vector

test for inclusion

delayed initialization
> treating newly detected features separately from the main map to reduce depth uncertainty before insertion into the full filter (with a standard XYZ representation)
- Features that retain low parallax over many frames (those very far from the camera or close to the motion epipole) are usually rejected completely because they never pass the test for inclusion.
> (in 2-D and simulation) Initialization is delayed until the measurement equation is approximately Gaussian and the point can be safely triangulated.
> 3-D monocular vision with inertial sensing + auxiliary particle filter (in high frame rate sequence)

undelayed initialization
> While features with highly uncertain depths provide little information on camera translation, they are extremely useful as bearing references for orientation estimation.
: a multiple hypothesis scheme, initializing features at various depths and pruning those not reobserved in subsequent images
> Gaussian sum filter approximated by a federated information sharing method to keep the computational overhead low
-> to spread the Gaussian depth hypotheses along the ray according to inverse depth

Davison's particle method --> (Sola et al.) Gaussian sum filter --> (Civera et al.) new inverse depth scheme

 

A Gaussian sum is a more efficient representation than particles (efficient enough that the separate Gaussians can all be put into the main state vector), but not as efficient as the single Gaussian representation that the inverse depth parametrization allows.



B. Points at Infinity

efficient undelayed initialization + features at all depths (in outdoor scenes)


Point at infinity: a feature that exhibits no parallax during camera motion due to its extreme depth
-> not used for estimating camera translation, but for estimating rotation

The homogeneous coordinate systems of visual projective geometry used normally in SFM allow explicit representation of points at infinity (and they have proven to play an important role during offline structure and motion estimation).

sequential SLAM system

Montiel and Davison: In the special case where all features are known to be infinite -- in very-large-scale outdoor scenes or when the camera rotates on a tripod -- SLAM in pure angular coordinates turns the camera into a real-time visual compass.


Our probabilistic SLAM algorithm must be able to represent the uncertainty in depth of seemingly infinite features. Observing no parallax for a feature after 10 units of camera translation does tell us something about its depth -- it gives a reliable lower bound, which depends on the amount of motion made by the camera (if the feature had been closer than this, we would have observed parallax).

The explicit consideration of uncertainty in the locations of points has not been previously required in offline computer vision algorithms, but is very important in a more difficult online case.



C. Inverse Depth Representation

There is a unified and straightforward parametrization for feature locations that can handle both initialization and standard tracking of both close and very distant features within the standard EKF framework.


standard tracking

An explicit parametrization of the inverse depth of a feature along a semi-infinite ray from the position from which it was first viewed allows a Gaussian distribution to cover uncertainty in depth that spans a depth range from nearby to infinity, and permits seamless crossing over to finite depth estimates of features that have been apparently infinite for long periods of time.
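A small sketch of the resulting conversion from the 6-D inverse-depth coding to a Euclidean XYZ point, assuming the paper's azimuth/elevation convention for the unit ray m(theta, phi); note how rho -> 0 degrades gracefully to a pure direction, i.e. a point at infinity:

import numpy as np

def inverse_depth_to_xyz(y):
    """y = (x0, y0, z0, theta, phi, rho): the point lies at distance 1/rho
    along the unit ray m(theta, phi) anchored at the camera position
    (x0, y0, z0) of first observation (convention of Civera et al.)."""
    x0, y0, z0, theta, phi, rho = y
    m = np.array([np.cos(phi) * np.sin(theta),
                  -np.sin(phi),
                  np.cos(phi) * np.cos(theta)])
    return np.array([x0, y0, z0]) + m / rho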

linearity index + inverse depth parametrization

The projective nature of a camera means that the image measurement process is nearly linear in this inverse depth coordinate.


Inverse depth appears in the relation between image disparity and point depth in stereo vision; it is interpreted as the parallax with respect to the plane at infinity. (Hartley and Zisserman)

Inverse depth is used to relate the motion field induced by scene points with the camera velocity in optical flow analysis. 

modified polar coordinates

target motion analysis = TMA

EKF-based sequential depth estimation from camera-known motion

multibaseline stereo

matching robustness for scene symmetries

sequential EKF process using inverse depth
( ref. Stochastic Approximation and Rate-Distortion Analysis for Robust Structure and Motion Estimation )

undelayed initialization for 2-D monocular SLAM 
( ref. A unified framework for nearby and distant landmarks in bearing-only SLAM )

FastSLAM-based system for monocular SLAM
( ref. Ethan Eade &  Tom Drummond,  Scalable Monocular SLAM )

special epipolar update step

FastSLAM

( ref. Civera, J., Davison, A.J., Montiel, J.M.M., "Inverse Depth to Depth Conversion for Monocular SLAM";
J. Montiel and A. J. Davison, "A visual compass based on SLAM" )

loop-closing



II. State Vector Definition


handheld camera motion
> constant angular and linear velocity model

quaternion
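A sketch of one prediction step of such a model, with state (r, q, v, omega): the position integrates the linear velocity, the orientation quaternion integrates the angular velocity, and the velocities themselves persist (process noise would enter as random linear and angular accelerations):

import numpy as np

def quat_mult(q, p):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    w0, x0, y0, z0 = q
    w1, x1, y1, z1 = p
    return np.array([w0*w1 - x0*x1 - y0*y1 - z0*z1,
                     w0*x1 + x0*w1 + y0*z1 - z0*y1,
                     w0*y1 - x0*z1 + y0*w1 + z0*x1,
                     w0*z1 + x0*y1 - y0*x1 + z0*w1])

def quat_from_omega(omega, dt):
    """Quaternion for the axis-angle rotation omega * dt."""
    angle = np.linalg.norm(omega) * dt
    if angle < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    axis = omega / np.linalg.norm(omega)
    return np.concatenate([[np.cos(angle / 2.0)], np.sin(angle / 2.0) * axis])

def predict(r, q, v, omega, dt):
    """Constant-velocity prediction for a handheld camera state."""
    return r + v * dt, quat_mult(q, quat_from_omega(omega, dt)), v, omega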








posted by maetel
2009. 2. 16. 18:26 Computer Vision
Camera Calibration Toolbox for Matlab

Read this first.

1. Image names: load the images (individual files in the same folder)
2. Extract grid corners: set the corner-search window
3. Calibration
4. Recomp. corners
5. Calibration (error decreases) => save the data
6. Export calib data
  -> 1 (= select the Zhang format)
  -> 3D (arbitrary name), 2D ("2DPoint", the variable name in bin's code) => save the image
7. Undistort image
  -> "enter" => save the generated images
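For comparison, roughly the same workflow in OpenCV's Python interface (this is not the Matlab toolbox itself; the 9x6 chessboard, the 7 cm square size, and the file names are made-up assumptions):

import glob
import cv2
import numpy as np

pattern = (9, 6)                               # inner corners of the board
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * 7.0

obj_pts, img_pts = [], []
for name in glob.glob('calib_*.jpg'):          # step 1: load the images
    gray = cv2.imread(name, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)  # step 2
    if found:
        obj_pts.append(objp)
        img_pts.append(corners)

# Steps 3-5: estimate intrinsics and distortion; rms is the reprojection error.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_pts, img_pts, gray.shape[::-1], None, None)

# Step 7: undistort an image with the recovered model and save it.
img = cv2.imread('calib_0.jpg')
cv2.imwrite('calib_0_undistorted.jpg', cv2.undistort(img, K, dist))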



posted by maetel
2009. 2. 14. 17:27 Computer Vision
IEEE Transactions on Robotics, Volume 24, Number 5, October 2008
: Visual SLAM Special Issue

Guest Editorial: Special Issue on Visual SLAM


simultaneous localization and mapping (SLAM)
in autonomous mobile robotics
using laser range-finder sensors
to build 2-D maps of planar environments

SLAM with standard cameras:
feature detection
data association
large-scale state estimation

SICK laser scanner


Kalman filter
Particle filter
submapping

http://en.wikipedia.org/wiki/Particle_filter
particle filter = sequential Monte Carlo methods (SMC)


http://en.wikipedia.org/wiki/Image_registration
the process of transforming the different sets of data into one coordinate system


http://en.wikipedia.org/wiki/Simultaneous_localization_and_mapping
the process of creating geometrically accurate maps of the environment
to build up a map within an unknown environment while at the same time keeping track of their current position

R.C. Smith and P. Cheeseman (1986)

Hugh F. Durrant-Whyte (early 1990s)

Sebastian Thrun

mobile robotics
autonomous vehicle

Korea Robot Industry Association  http://www.korearobot.or.kr/




posted by maetel
2009. 2. 6. 21:20 Computer Vision
posted by maetel
2009. 2. 6. 17:15 Computer Vision
SLAM


SLAM Special Issue: IEEE Transactions on Robotics, 2008, V. 24 Issue 5
http://ieeexplore.ieee.org/xpl/tocresult.jsp?isYear=2008&isnumber=4663225


J. Sola, Towards visual localization, mapping and moving objects tracking by a mobile robot: A geometric and probabilistic approach, PhD thesis, Toulouse, France, 2007.
http://ethesis.inp-toulouse.fr/archive/00000528/01/sola.pdf




AR

ISMAR2008:
http://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=4637297&isYear=2008

posted by maetel
2009. 2. 2. 22:14

This is a protected post. Enter the password to view its content.

2009. 1. 29. 17:16

This is a protected post. Enter the password to view its content.

2009. 1. 22. 20:55 Computer Vision
Digital Image Processing (2nd ed.)
3. Image Enhancement in the Spatial Domain



> Image enhancement
1) spatial domain methods
2) frequency domain methods - Fourier transform

Visual evaluation of image quality is a highly subjective process.


3.1 Background

> point processing
gray-level (/ intensity / mapping) transformation function
contrast stretching
thresholding function

> mask processing / filtering
mask / filter / kernel / template / window
image sharpening


3.2 Some Basic Gray Level Transformations

one-dimensional array
table lookup



3.2.1 Image Negatives
- to enhance white or gray detail embedded in dark regions of an image

3.2.2 Log Transformations
- to expand the values of dark pixels in an image while compressing the higher-level values
- to compress the dynamic range of images with large variations in pixel values
eg. Fourier spectrum
cf. without such compression, a significant degree of detail is lost in the display of a typical Fourier spectrum
 
3.2.3 Power-Law Transformations
- gamma correction (for CRT) - with the device-dependent value of gamma
- contrast enhancement - expansion / compression of gray levels
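A small numpy sketch of these three point operations on 8-bit images (the scaling constant in the log transform and the choice of gamma are illustrative):

import numpy as np

def negative(img, L=256):
    """Image negative: s = (L - 1) - r."""
    return (L - 1) - img

def log_transform(img):
    """s = c * log(1 + r), with c chosen to map the maximum input to 255."""
    c = 255.0 / np.log1p(img.max())
    return (c * np.log1p(img.astype(float))).astype(np.uint8)

def gamma_transform(img, gamma, L=256):
    """Power-law: s = (L - 1) * (r / (L - 1))^gamma, i.e. gamma correction."""
    r = img.astype(float) / (L - 1)
    return ((L - 1) * r ** gamma).astype(np.uint8)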

3.2.4 Piecewise-Linear Transformation Functions

Contrast stretching
- to increase the dynamic range of the gray levels in the image being processed

Gray-level slicing
- to highlight a specific range of gray levels in an image

Bit-plane slicing
- to analyze the relative importance played by each bit of the image -> image compression


3.3 Histogram Processing

histogram => image statistics -> real-time image processing

eg.
narrow histogram -> low contrast
uniform distribution -> high contrast -> detail, more -> high dynamic range

3.3.1 Histogram Equalization

http://en.wikipedia.org/wiki/Monotonic_function

cf. http://en.wikipedia.org/wiki/Multi-valued_function

http://en.wikipedia.org/wiki/Probability_density_function
http://mathworld.wolfram.com/ProbabilityDensityFunction.html

http://en.wikipedia.org/wiki/Cumulative_distribution_function


The pdf of the transformed gray levels is always uniform, independent of the pdf of the original histogram.


histogram = plot of pdf of gray levels versus gray levels

histogram equalization (/ histogram linearization)
= mapping each pixel in the input image into a corresponding pixel in the output image
=> to spread the histogram of the input image
-> the levels of the histogram-equalized image will span a fuller range of the gray scale
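A compact numpy version of the discrete transform s_k = (L - 1) * CDF(r_k), which is exactly the kind of monotonic table lookup described in Section 3.1:

import numpy as np

def equalize_histogram(img, L=256):
    """Histogram equalization of an 8-bit image via a lookup table."""
    hist = np.bincount(img.ravel(), minlength=L)  # histogram ~ pdf estimate
    cdf = hist.cumsum() / img.size                # cumulative distribution
    lut = np.round((L - 1) * cdf).astype(np.uint8)
    return lut[img]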


3.3.2 Histogram Matching (Specification)



posted by maetel
2009. 1. 22. 19:11 Computer Vision
2009-01-22 Thu 1 pm @G705

ref. Prof. 박상일, Dept. of Digital Contents, Sejong University - Multimedia Programming, Lecture 2: lecture notes


image file - header information + data

IplImage
check the format and process accordingly

char * imageData
data [width*height]
points to the memory that stores the actual pixel intensities

int nChannels
RGB (color) -> 3, Gray (grayscale) -> 1

int alphaChannel
not supported by OpenCV

int depth
8 bit -> 2^8 = 256
1 bit -> 2^1 = 2

image coordinate origin - top-left on Windows, bottom-left on Linux and OpenGL

CvScalar
returning double values

callback function
the do{ }while( ) part

GUI - event-driven (as opposed to procedural programming)


posted by maetel
2009. 1. 21. 20:39 Computer Vision
OpenCVX is a project to provide tweaked functions or extensional functions for OpenCV which is a well known open source Computer Vision C library.

posted by maetel
2009. 1. 20. 17:15 Computer Vision

2009-01-20 Tue @G703

http://en.wikipedia.org/wiki/Computer_vision

AI + vision =>
recognition
graphics
HCI
etc


image based lighting


EECS 432 Advanced Computer Vision  Winter 2008
by Ying Wu,  Electrical Engineering & Computer Science, Northwestern University


http://en.wikipedia.org/wiki/Machine_learning

http://en.wikipedia.org/wiki/Pattern_recognition


http://en.wikipedia.org/wiki/Intelligent_environment

http://en.wikipedia.org/wiki/Smart_environment


image processing - content analysis -> image & video understanding, motion understanding
machine learning -> visual recognition -> vision perception
 
Computer vision infers, from visual inputs, the various factors that affect images and videos, such as camera model, lighting, color, texture, shape, and motion.


radiometry

SfM = Structure from Motion

IBR = Image-based modeling and rendering

http://en.wikipedia.org/wiki/Imagery_analysis


dynamic range

CCD = charge-coupled device (전하 결합 소자)

CMOS = complementary metal-oxide-semiconductor (상보성 금속 산화막 반도체)


> Low-level Image Processing - edge detection, corner detection, filtering, morphology, etc.
> Low-level Vision - image matching, optical flow computation, motion analysis
> Middle-level Vision - (1) multi-view geometry, stereo, Structure from motion -> geometric modeling -> 3D reconstruction, image-based rendering (2) image segmentation -> visual tracking, visual motion capturing
> High-level Vision - (model-based / learning-based) object recognition -> image understanding, video understanding + intelligent human-computer interaction, intelligent robots, smart environment, content-based multimedia

posted by maetel
2009. 1. 12. 18:45 Computer Vision

http://dasan.sejong.ac.kr/~sipark/class2008/mm/MP02.ppt

http://en.wikipedia.org/wiki/OpenCV

www.intel.com/technology/computing/opencv/

http://opencvlibrary.sourceforge.net/


C:\Program Files\OpenCV\docs\index.htm
OpenCV = Open Source Computer Vision Library
: a collection of C functions and a few C++ classes that implement many popular Image Processing and Computer Vision algorithms.

Authors: Gary Bradski, Adrian Kaehler
Edition: revised
Publisher: O'Reilly, 2008
ISBN 0596516134, 9780596516130
525 pages


OpenCV Korea tutorial - Part 1: Installation


The .dll files must be placed in the same folder as the .exe executable.





이칠우:
Image processing is the technology of obtaining needed information from a given image, or of transforming it into an image that people need.
This technology is not only closely related to automating the human visual function, but has recently also become the core technology for transmitting and representing image data.

IPL (Image Processing Library)
a platform independent image manipulating C/C++ library 

digital image data (storage formats) - vector image data / raster image data

bitmap
The term bitmap comes from the computer programming terminology, meaning just a map of bits, a spatially mapped array of bits. (Now, along with pixmap, it commonly refers to the similar concept of a spatially mapped array of pixels.)

Digital image processing pipeline>
image input (A/D conversion) -> image processing (enhancement, restoration, analysis, transformation, compression) -> image storage/output

Image processing - image recognition>
image acquisition (sampling, quantization) -> preprocessing (enhancement, restoration) -> segmentation -> representation & description -> recognition & interpretation

Image processing algorithms>
point-processing algorithms - histogram, look-up table => enhancement, compositing
area-processing algorithms - convolution algorithms: sharpening, blurring, median filter, mask => noise removal, segmentation
global-processing algorithms => geometric transformations, (Fourier, wavelet) transforms, compression


image = the appearance of an object as rendered by the refraction or reflection of light

tint = hue + saturation (the degree to which the color is not mixed with white)
brightness = the perceived degree of light reflected from an object <- intensity, absorption
contrast = the difference between the brightest and darkest parts of a single image

image resolution = the number of pixels composing the image => precision
pixel resolution = the number of bits per pixel => A/D quantization level
display resolution = the maximum number of pixels the screen can display <- graphics card



MMX (Multimedia Extension)

IPL = Intel Image Processing Library (image and signal processing library) -->

IPP = Integrated Performance Primitives Library
http://intel.com/software/products/ipp
 
    * Video Decode/Encode
    * Audio Decode/Encode
    * JPEG/JPEG2000
    * Data Compression
    * Cryptography – CAVP Validated!
    * Speech Coding
    * Speech Recognition
    * Image Processing
    * Image Color Conversion
    * Computer Vision
    * Signal Processing
    * Vector/Matrix Mathematics
    * String Processing
    * Data Integrity(new!)
    * Ray Tracing/Rendering

http://sourceforge.net/projects/opencvlibrary/
500 algorithms, documentation and sample code for real time computer vision

http://tech.groups.yahoo.com/group/OpenCV/


color model >
3 channel - RGB, HSV, CMY, YCC
4 channel - CMYK, RGBA

pixel depth
Color depth, or bit depth, is a computer graphics term describing the number of bits used to represent the color of a single pixel in a bitmapped image or video frame buffer. This concept is also known as bits per pixel (bpp), particularly when specified along with the number of bits used.

A framebuffer is a video output device that drives a video display from a memory buffer containing a complete frame of data. The information in the buffer typically consists of color values for every pixel (point that can be displayed) on the screen.

channel sequence - pixel oriented / plane oriented (data ordering)

ROI = Region of Interest

COI = Channel of Interest

Scan Line

http://en.wikipedia.org/wiki/Integer_(computer_science)


posted by maetel
2009. 1. 9. 20:19

This is a protected post. Enter the password to view its content.

2009. 1. 5. 18:36 Computer Vision
width: 300 cm
height: 150 cm

posted by maetel
2008. 12. 29. 16:39 Computer Vision
MIT 18.06 Linear Algebra
lectured by Gilbert Strang

text: Introduction to Linear Algebra, 4th ed., Wellesley-Cambridge Press
official: http://math.mit.edu/linearalgebra/
google: http://books.google.co.kr/books?id=Gv4pCVyoUVYC

lecture http://web.mit.edu/18.06/www/
MIT 18.06 Linear Algebra, Spring 2005 @YouTube  videos @MIT
MIT 18.06 Linear Algebra, Spring 2010



Lecture 01: The Geometry of Linear Equations

n linear equations, n unknowns
1) Row picture -> 2 lines in x-y plane, 3 planes in x-y-z space, etc.
2) Column picture -> Linear combination of columns
3) Matrix form

Q1. Can I solve Ax=b for every b?
Q2. Do the linear combinations of the columns fill 3-D space?

"Ax is a combination of the columns of A."

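A quick numpy check of the column picture on a made-up 2x2 system:

import numpy as np

A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
b = np.array([0.0, 3.0])

x = np.linalg.solve(A, b)                 # x = [1, 2]
combo = x[0] * A[:, 0] + x[1] * A[:, 1]   # combination of the columns of A
assert np.allclose(combo, b)              # Ax is a combination of the columns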


Lecture 02: Elimination with Matrices

Elimination - success/failure
- augmented matrix
- pivot
Back-substitution
Elimination matrices
Matrix multiplication
- permutation matrix
- inverses




Lecture 03: Multiplication and Inverse Matrices

Matrix multiplication
Inverse of A, AB, A^T
Gauss-Jordan / Find A^-1



Lecture 04: Factorization into A=LU

Inverses of AB, A^T
Product of elimination matrices
A=LU (no row exchanges)



Lecture 05: Transpose, Permutations, Spaces R^n

section 2.7: PA = LU
section 3.1: Vector spaces and subspaces

"All the linear combinations of the columns of a matrix form a subspace, called column space C(A)."



Lecture 06: Column Space and Nullspace

Vector  spaces and subspaces
Column space of A: Solving Ax=b
Nullspace of A



Lecture 07: Solving Ax=0 : Pivot Variables, Special Solutions

Computing the nullspace (Ax=0)
Pivot Variables - free variables
Special Solutions - rref(A)=R


Lecture 08: Solving Ax = b: Row Reduced Form R

Complete solution of Ax = b
: x = x_p + x_n
Rank r
r = m : Solution exists
r = n : Solution is unique


Lecture 09: Independence, Basis, and Dimension

Linear independence
Spanning a space
BASIS and Dimension


Lecture 10: The Four Fundamental Subspaces

(Correct an error in Lecture 9!)
Four Fundamental Subspaces (for matrix A)



Lecture 11: Matrix Spaces; Rank 1; Small World Graphs

Basis of new vector spaces
Rank one matrices
Small world graphs

http://en.wikipedia.org/wiki/Euclidean_subspace

http://en.wikipedia.org/wiki/Dimension_(vector_space)

http://en.wikipedia.org/wiki/Basis_(linear_algebra)

http://en.wikipedia.org/wiki/Rank_(linear_algebra)


Lecture 12: Graphs, Networks, Incidence Matrices

Graphs & Networks
Incidence matrices
Kirchhoff's Laws

http://en.wikipedia.org/wiki/Incidence_matrix
a matrix that shows the relationship between two classes of objects

입사 행렬 (incidence matrix): a two-dimensional array holding information about the edges of a graph. For a graph with n vertices it is an n x n array whose entries record, as 0 or 1, whether the vertices of the corresponding row and column are joined by an edge. (source: [Internet] 돌도끼 computer glossary)


"Real matrices from genuine problems have structures."

"The null space tells us what combination of the columns to get zero."

"All the dependencies come from the loops."

"TREE is the name for a graph with no loop."


http://en.wikipedia.org/wiki/Planar_graph
Euler's formula states that if a finite, connected, planar graph is drawn in the plane without any edge intersections, and v is the number of vertices, e is the number of edges and f is the number of faces (regions bounded by edges, including the outer, infinitely-large region), then      v − e + f = 2.


Lecture 13: Quiz 1 Review

Emphasizes Chapter 3



null space ⊥ row space


"Null space contains all vectors ⊥ row space."







posted by maetel
2008. 12. 7. 17:08 Computer Vision

2008-12-04 Thu, late 2 pm @가브리엘관 706

Aleš Leonardis & Sanja Fidler
University of Ljubljana
Faculty of computer and information science
Visual Cognitive Systems Laboratory

"Learning Hierarchical Representations of Object Categories"


SIFT
= Scale-invariant feature transform
http://en.wikipedia.org/wiki/Scale-invariant_feature_transform

posted by maetel
2008. 11. 28. 23:56 Computer Vision

http://en.wikipedia.org/wiki/Convex_optimization


Haitham Hindi, A Tutorial on Convex Optimization

This has a slight engineering focus but is written informally and at a very accessible level.


Ph.D. School in Optimization in Computer Vision
Copenhagen, May 19th to 23rd 2008

posted by maetel