sed-pres
Surveillance event detection presentation
On Github
imaus10 / sed-pres
Surveillance Event Detection
Austin Blanton
Boston marathon bombings
Goals
Prevention
Expedite apprehension process ("human in the loop")
Outline
1. Background
2. Basics of pattern recognition
3. A simple solution
4. Future
1. Background
Automatic detection of observable events of interest in surveillance video
TRECVid
Large dataset (Gatwick airport)
Event annotations
Event detection is hard
Gatwick dataset: 240 GB
Dimensionality reduction
Most pixels not useful
Feature selection
Events are rare
Good features and/or learning algorithm
Subtle semantics of human interaction
???
KTH Human Motion Dataset
Slightly over 1 GB
One event per video
All frames part of event
Human activity recognition is less hard
...but still hard
2. Basics of pattern recognition
A tiny bit of history
Chess is easy for computers to understand
Deep Blue beat Garry Kasparov, world champion, in 1997
Evaluated 200 million positions per second
Spam classification, music beat detection, and face recognition are less easy
Instead, learn from data
What is pattern recognition?
Use statistics and probability to detect patterns
Methods often invariant to domain
...but not really (informative features are a prereq)
Approximating the hidden mathematical model of complex domains
Example: text classification
Amazon wants to guess star ratings (1-5) based on review text
Simplest method: Bag of Words (BoW)
Create dictionary of words in all documents
Count number of occurrences of each word in each document
Loses grammatical structure
document        cats  are  cool
War and Peace      5  400    10
Moby Dick          0  621     3
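A minimal sketch of the Bag of Words steps above, using only the Python standard library. The whitespace tokenizer, the function names, and the two toy reviews are assumptions made for illustration; they are not taken from the presentation.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(documents):
    """Return (vocabulary, count_vectors) for a dict of name -> text."""
    # Dictionary: every distinct word seen in any document, sorted.
    vocabulary = sorted(
        {word for text in documents.values() for word in tokenize(text)}
    )
    vectors = {}
    for name, text in documents.items():
        counts = Counter(tokenize(text))
        # One count per vocabulary word; word order in the text is discarded.
        vectors[name] = [counts[word] for word in vocabulary]
    return vocabulary, vectors


# Toy data, invented for this example.
docs = {
    "review_a": "cats are cool cats are great",
    "review_b": "dogs are cool",
}
vocab, vecs = bag_of_words(docs)
print(vocab)
print(vecs)
```

Because only counts are kept, "cats are cool" and "cool are cats" map to the same vector, which is the loss of grammatical structure noted above.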
Ready to learn!
Bunch of labelled and unlabelled data
Use labelled data to train model
Naive Bayes popular for text classification
Predict labels
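The train-then-predict loop can be sketched with a small multinomial Naive Bayes classifier. This is one common formulation (word counts per label with add-one smoothing), written from scratch for illustration; the function names and the toy reviews are assumptions, not the presentation's code.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labelled_docs):
    """labelled_docs: list of (text, label) pairs. Returns a model dict."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    doc_counts = Counter()              # label -> number of documents
    vocabulary = set()
    for text, label in labelled_docs:
        words = text.lower().split()
        word_counts[label].update(words)
        doc_counts[label] += 1
        vocabulary.update(words)
    return {
        "word_counts": word_counts,
        "doc_counts": doc_counts,
        "vocabulary": vocabulary,
    }


def predict(model, text):
    """Return the label with the highest log posterior for the text."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocabulary"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        # log P(label) + sum of log P(word | label), add-one smoothed.
        score = math.log(n_docs / total_docs)
        for word in text.lower().split():
            if word in model["vocabulary"]:  # skip words unseen in training
                score += math.log(
                    (counts[word] + 1) / (total_words + vocab_size)
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy labelled reviews (star ratings), invented for this example.
training = [
    ("great book loved it", 5),
    ("loved the story great characters", 5),
    ("boring book hated it", 1),
    ("terrible plot boring characters", 1),
]
model = train_naive_bayes(training)
print(predict(model, "loved it great"))
print(predict(model, "boring plot"))
```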
Evaluation and Overfitting
Evaluate
Split labelled data into training and test sets
Overfitting
Model too specific to training data
Does not handle unseen data - generalization error
Cross validation can help
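A rough sketch of k-fold cross validation: the labelled data is split into k folds, and each fold takes one turn as the test set while the rest is used for training. The helper names and the majority-class baseline are hypothetical and exist only to make the example runnable; the sketch assumes there are at least k samples.

```python
import random
from collections import Counter


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(samples, labels, k, train_fn, predict_fn):
    """Average test accuracy over k folds."""
    accuracies = []
    for train_idx, test_idx in k_fold_splits(len(samples), k):
        model = train_fn(
            [samples[i] for i in train_idx], [labels[i] for i in train_idx]
        )
        correct = sum(predict_fn(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


# Hypothetical baseline: always predict the most common training label.
def train_majority(xs, ys):
    return Counter(ys).most_common(1)[0][0]


def predict_majority(model, x):
    return model


samples = list(range(10))
labels = ["spam"] * 10
print(cross_validate(samples, labels, 5, train_majority, predict_majority))
```

Every sample is tested exactly once, and never by a model that saw it during training, which is why the averaged score is a better estimate of performance on unseen data than training accuracy.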
Example 2: text clustering
Amazon wants to automatically choose novel genre
Unsupervised (no labels)
Divide into a number of clusters based on BoW input
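The slides do not name a clustering algorithm; k-means is used here as one common choice. The sketch below makes simplifying assumptions: centroids are seeded with the first k vectors, and the four-word vocabulary and counts are invented.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(vectors, k, iterations=20):
    """Cluster numeric vectors; returns one cluster index per vector."""
    # Simplification: seed the centroids with the first k vectors.
    centroids = [list(v) for v in vectors[:k]]
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


# Invented BoW counts over the vocabulary
# ["dragon", "wizard", "detective", "murder"].
novels = [
    [5, 3, 0, 0],  # fantasy-like
    [4, 6, 0, 1],  # fantasy-like
    [0, 0, 7, 4],  # mystery-like
    [1, 0, 3, 6],  # mystery-like
]
clusters = k_means(novels, k=2)
print(clusters)
```

No labels are involved: the cluster indices are arbitrary, and deciding that one cluster means "fantasy" is left to a human.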
3. A simple solution
SIFT (Scale Invariant Feature Transform)
Interest points
Keypoint descriptors
MoSIFT (Motion SIFT)
SIFT points must have optical flow
Append motion descriptor to SIFT descriptor
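The two MoSIFT rules above can be sketched as a filter followed by a concatenation. Everything concrete here is an assumption for illustration: the keypoint layout (a dict with "sift", "flow", and "motion" entries), the `min_flow` threshold, and the tiny descriptor sizes. Real SIFT extraction and optical flow estimation are not shown.

```python
import math


def mosift_descriptors(keypoints, min_flow=1.0):
    """Keep moving keypoints and append motion to appearance descriptors.

    keypoints: list of dicts with
      'sift'   -> appearance descriptor (list of numbers)
      'flow'   -> (dx, dy) optical flow at the keypoint
      'motion' -> motion descriptor (list of numbers)
    min_flow is a hypothetical threshold on flow magnitude, in pixels.
    """
    kept = []
    for kp in keypoints:
        dx, dy = kp["flow"]
        # Discard interest points that are not moving enough.
        if math.hypot(dx, dy) >= min_flow:
            # Appearance descriptor followed by motion descriptor.
            kept.append(kp["sift"] + kp["motion"])
    return kept


# Toy keypoints with 2-element descriptors, invented for this example.
points = [
    {"sift": [0.1, 0.9], "flow": (3.0, 4.0), "motion": [0.7, 0.3]},  # moving
    {"sift": [0.5, 0.5], "flow": (0.0, 0.1), "motion": [0.0, 0.0]},  # static
]
print(mosift_descriptors(points))
```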
Optical flow
MoSIFT points
MoSIFT accuracy
Method          Accuracy
MoSIFT + SVM    95.83%
Foreground/background segmentation
Gaussian Mixture Model
Bag of Visual Words (BoVW)
Cluster interest points
Frequency count of each cluster in video
Loses spatial and temporal information
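A minimal sketch of the counting step, assuming the cluster centres (the "visual words") have already been learned from interest point descriptors. The function names, the 2-D descriptors, and the three centroids are invented for illustration.

```python
def nearest_cluster(descriptor, centroids):
    """Index of the centroid closest to the descriptor (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((x - y) ** 2 for x, y in zip(descriptor, centroids[c])),
    )


def bag_of_visual_words(descriptors, centroids):
    """Histogram of how many descriptors fall into each cluster."""
    histogram = [0] * len(centroids)
    for descriptor in descriptors:
        histogram[nearest_cluster(descriptor, centroids)] += 1
    return histogram


# Invented visual vocabulary and descriptors from one video.
centroids = [[0, 0], [10, 10], [0, 10]]
video_descriptors = [[1, 0], [9, 9], [0, 1], [1, 9], [10, 11]]
print(bag_of_visual_words(video_descriptors, centroids))
```

The histogram has one entry per visual word regardless of video length. Where and when each interest point occurred is not recorded, which is the spatial and temporal loss noted above.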
4. Future
How do humans do it?
Johansson (1973)
LEDs on key points of human body
Humans recognize human actions from motion of LEDs
Biologically inspired approach
Detect human
Pose estimation
Track joints
Compute action descriptor (over lifetime of action)
Or maybe not
Recent approaches
Good accuracy
Hard datasets
Not human specific
Speaking of hard datasets...
Turns out KTH isn't very hard
Other new challenging datasets
More action classes
Challenging backgrounds
Varying viewpoints
BIGGER
Thanks!