블로그 이미지
Leeway is... the freedom that someone has to take the action they want to or to change their plans.
maetel

Notice

Recent Post

Recent Comment

Recent Trackback

Archive

calendar

1 2 3 4
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31
  • total
  • today
  • yesterday

Category

2009. 11. 8. 16:31 Computer Vision
Branislav Kisačanin & Vladimir Pavlović & Thomas S. Huang
Real-Time Vision for Human-Computer Interaction
(RTV4HCI)
Springer, 2005
(google book's overview)

2004 IEEE CVPR Workshop on RTV4HCI - Papers
http://rtv4hci.rutgers.edu/04/


Computer vision and pattern recognition continue to play a dominant role in the HCI realm. However, computer vision methods often fail to become pervasive in the field due to the lack of real-time, robust algorithms, and novel and convincing applications.

Keywords:
head and face modeling
map building
pervasive computing
real-time detection

Contents:
RTV4HCI: A Historical Overview.
- Real-Time Algorithms: From Signal Processing to Computer Vision.
- Recognition of Isolated Fingerspelling Gestures Using Depth Edges.
- Appearance-Based Real-Time Understanding of Gestures Using Projected Euler Angles.
- Flocks of Features for Tracking Articulated Objects.
- Static Hand Posture Recognition Based on Okapi-Chamfer Matching.
- Visual Modeling of Dynamic Gestures Using 3D Appearance and Motion Features.
- Head and Facial Animation Tracking Using Appearance-Adaptive Models and Particle Filters.
- A Real-Time Vision Interface Based on Gaze Detection -- EyeKeys.
- Map Building from Human-Computer Interactions.
- Real-Time Inference of Complex Mental States from Facial Expressions and Head Gestures.
- Epipolar Constrained User Pushbutton Selection in Projected Interfaces.
- Vision-Based HCI Applications.
- The Office of the Past.
- MPEG-4 Face and Body Animation Coding Applied to HCI.
- Multimodal Human-Computer Interaction.
- Smart Camera Systems Technology Roadmap.
- Index.




RTV4HCI: A Historical Overview
Matthew Turk (mturk@cs.ucsb.edu)
University of California, Santa Barbara
http://www.stanford.edu/~mturk/
http://www.cs.ucsb.edu/~mturk/

The goal of research in real-time vision for human-computer interaction is to develop algorithms and systems that sense and perceive humans and human activity, in order to enable more natural, powerful, and effective computer interfaces.

Computers in the Human Interaction Loop (CHIL)

perceptual interfaces
multimodal interfaces
post-WIMP(windows, icons, menus, pointer) interfaces

implicit user awareness or explicit user control

The user interface
- the software and devices that implement a particular model (or set of models) of HCI

Computer vision technologies must ultimately deliver a better "user experience".

B Shneiderman, Designing the User Interface: Strategies for Effective Human-Computer Interaction, Third Edition, Addison-Wesley, 1998.
: 1) time to learn 2) speed of performance 3) user error rates 4) retention over time 5) subjective satisfaction

- Presence and location (Face and body detection, head and body tracking)
- Identity (Face recognition, gait recognition)
- Expression (Facial feature tracking, expression modeling and analysis)
- Focus of attention (Head/face tracking, eye gaze tracking)
- Body posture and movement (Body modeling and tracking)
- Gesture (Gesture recognition, hand tracking)
- Activity (Analysis of body movement)

eg.
VIDEOPLACE (M W Krueger, Artificial Reality II, Addison-Wesley, 1991)
Magic Morphin Mirror / Mass Hallucinations (T Darrell et al., SIGGRAPH Visual Proc, 1997)

Principal Component Analysis (PCA)
Linear Discriminant Analysis (LDA)
Gabor Wavelet Networks (GWNs)
Active Appearance Models (AAMs)
Hidden Markov Models (HMMs)

Identix Inc.
Viisage Technology Inc.
Cognitec Systems


- MIT Medial Lab
ALIVE system (P Maes et al., The ALIVE system: wireless, full-body interaction with autonomous agents, ACM Multimedia Systems, 1996)
PFinder system (C R Wren et al., Pfinder: Real-time tracking of the human body, IEEE Trans PAMI, pp 780-785, 1997)
KidsRoom project (A Bobick et al., The KidsRoom: A perceptually-based interactive and immersive story environment, PRESENCE: Teleoperators and Virtual Environments, pp 367-391, 1999)




Flocks of Features for Tracking Articulated Objects
Mathias Kolsch (kolsch@nps.edu
Computer Science Department, Naval Postgraduate School, Monterey
Matthew Turk (mturk@cs.ucsb.edu)
Computer Science Department, University of California, Santa Barbara




Visual Modeling of Dynamic Gestures Using 3D Appearance and Motion Features
Guangqi Ye (grant@cs.jhu.edu), Jason J. Corso, Gregory D. Hager
Computational Interaction and Robotics Laboratory
The Johns Hopkins University



Map Building from Human-Computer Interactions
http://groups.csail.mit.edu/lbr/mars/pubs/pubs.html#publications
Artur M. Arsenio (arsenio@csail.mit.edu)
Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology



Vision-Based HCI Applications
Eric Petajan (eric@f2f-inc.com)
face2face animation, inc.
eric@f2f-inc.com



The Office of the Past
Jiwon Kim (jwkim@cs.washington.edu), Steven M. Seitz (seitz@cs.washington.edu)
University of Washington
Maneesh Agrawala (maneesh@microsoft.com)
Microsoft Research
Proceedings of the 2004 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04) Volume 10 - Volume 10  Page: 157   Year of Publication: 2004
http://desktop.google.com
http://grail.cs.washington.edu/projects/office/
http://www.realvnc.com/



Smart Camera Systems Technology Roadmap
Bruce Flinchbaugh (b-flinchbaugh@ti.com)
Texas Instruments

posted by maetel
2009. 3. 27. 21:33 Computer Vision

Scalable Monocular SLAM
Eade, E.   Drummond, T.  
Cambridge University;

This paper appears in: Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on
Publication Date: 17-22 June 2006
Volume: 1,  On page(s): 469- 476
ISSN: 1063-6919
ISBN: 0-7695-2597-0
Digital Object Identifier: 10.1109/CVPR.2006.263
Current Version Published: 2006-07-05

 
Ethan Eade & Tom Drummond
Machine Intelligence Laboratory
the Division of Information Engineering at Cambridge University Engineering Department




monocular SLAM
particle filter + top-down search => real-time, large number  of landmarks

the first to apply this FastSLAM-type particle filter to single-camera SLAM


1. Introduction


SLAM = Simultaneous Localization and Mapping
: process of causally estimating both egomotion and structure in an online system

 SLAM using visual data in computer vision

SFM (= structure from motion): reconstructing scene geometry
+ causal or recursive estimation techniques

perspective-projection cameras

filtering methods to allow indirect observation models

Kalman filtering framework

Extended Kalman filter = EKF (-> to linearize the observation and dynamics models of the system)

causal estimation with recursive algorithms (cp. estimation depending only on observations up to the current time)
=> online operation (cp. SFM on global nonlinear optimization)


Davision's SLAM with a single camera
> EKF estimation framework
> top-down Bayesian estimation approach searching for landmarks in image regions constrained by estimate > uncertainty (instead of performing extensive bottom-up image processing and feature matching)
> Bayesian partial-initialization scheme for incorporating new landmarks
- cannot scale to large environment


EKF = the Extended Kalman filter
-  N*N covariace matrix for N landmarks
- updated with N*N computation cost

> SLAM system using a single camera as the only sensor
> frame-rate operation with many landmarks
> FastSLAM-style particle filter (the first use of such an approach in a monocular SLAM setting)
> top-down active search
> an efficient algorithm for discovering the depth of new landmarks that avoids linearization errors
> a novel method for using partially initialized landmarks to help constrain camera pose


FastSLAM
: based on the Rao-Blackwellized Particle Filter

2. Background

2.1 Scalable SLAM

> submap
bounded complexity -> bounded computation and space requirements

Montemerlo & Thrun
If the entire camera motion is known then the estimates of the positions of different landmarks become independent of each other.







Rao-Blackwellized Particle Filter



ZNCC = the Zero mean Normalized Cross-Correlation function epipolar constraint


epipolar constraint

http://en.wikipedia.org/wiki/Epipolar_geometry


posted by maetel