Computer Vision Introduction

I recently started following Stanford's CS231n course. The first week's lecture is an introduction to computer vision, and these are my notes.

About 543 million years ago, the Earth was mostly water, with a few animal species floating around in it and none of them able to see. Then, around 540 million years ago, within a very short period of time (about 10 million years), the number of animal species exploded from a few to hundreds of thousands.

Evolutionary biologists call this Evolution's Big Bang, and one theory is that it was triggered by the first animals developing eyes. Vision has since developed into our biggest sensory system: about 50% of the neurons in our cortex are involved in visual processing.

Humans & Mechanical Vision

  • First camera - the "camera obscura", illustrated by Gemma Frisius in 1545
  • Encyclopedie, 18th century
  • Leonardo da Vinci, 16th century

Hubel & Wiesel, 1959

Cat Brain Analysis

  • Simple cells - respond to the orientation of light (oriented edges)
  • Complex cells - respond to light orientation and movement
  • Hypercomplex cells - respond to movement with an end point

Basically, visual processing in the brain starts with simple structures such as edges and builds up to more complex information.
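
To make the "starts with edges" idea concrete, here is a minimal sketch of an orientation-tuned filter. It is only an analogy, not a model of what Hubel & Wiesel measured: the Sobel-style kernels, the synthetic 8x8 image, and the helper name `filter2d` are all my own assumptions for illustration.

```python
import numpy as np

def filter2d(image, kernel):
    """Valid-mode 2D cross-correlation written with plain NumPy loops."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Synthetic image (assumption): dark left half, bright right half,
# which gives a single vertical edge down the middle.
image = np.zeros((8, 8))
image[:, 4:] = 1.0

# Sobel-style kernels standing in for cells tuned to one orientation.
vertical_edge = np.array([[-1, 0, 1],
                          [-2, 0, 2],
                          [-1, 0, 1]], dtype=float)
horizontal_edge = vertical_edge.T

# Strongest response of each "cell" anywhere in the image.
vertical_response = np.abs(filter2d(image, vertical_edge)).max()
horizontal_response = np.abs(filter2d(image, horizontal_edge)).max()

print("vertical-edge filter:", vertical_response)
print("horizontal-edge filter:", horizontal_response)
```

The filter tuned to vertical edges responds strongly to this image, while the one tuned to horizontal edges stays silent, which is roughly the kind of orientation selectivity described for simple cells.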

MIT Summer Vision Project, 1966

Aimed to build a significant part of a visual system over a single summer.

David Marr - Vision, 1970s

Input Image -> Primal Sketch -> 2-1/2D Sketch -> 3D Model

Primal Sketch - Zero Crossings, blobs, edges, bars, ends, virtual lines, groups, curves, boundaries.

2-1/2D Sketch - Local surface orientation and discontinuities in depth and in surface orientation.

3D Model - Hierarchically organized in terms of surface and volumetric primitives.

Generalized Cylinder - Brooks & Binford, 1979

Aimed to move object recognition from the block world to the real world.

Pictorial Structure - Fischler & Elschlager, 1973

Tried to reduce complex objects to simple parts and the spatial relationships between them.

All of these efforts were small in scope compared to the task at hand. If object recognition is too hard, object segmentation can be tackled first.

Normalized Cut - Shi & Malik, 1997

A graph-theoretic algorithm for image segmentation.
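
As a rough sketch of the idea, the two-way normalized cut can be approximated by solving the generalized eigenproblem (D - W) y = λ D y and splitting the graph on the sign of the second-smallest eigenvector. The toy data below (six 1D "pixel intensities"), the Gaussian affinity with sigma = 1, the zero threshold, and the function name `normalized_cut_bipartition` are my own assumptions, not details from the lecture.

```python
import numpy as np

def normalized_cut_bipartition(W):
    """Split a graph in two using the second-smallest eigenvector of the
    symmetric normalized Laplacian (spectral relaxation of normalized cut)."""
    d = W.sum(axis=1)                      # node degrees
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # L_sym = I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigenvalues, eigenvectors = np.linalg.eigh(L_sym)   # ascending order
    # Map back to the generalized eigenvector y of (D - W) y = lambda D y.
    y = d_inv_sqrt * eigenvectors[:, 1]
    return y > 0                           # threshold at zero (an assumption)

# Toy "image" (assumption): six intensities forming two obvious groups.
values = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])

# Gaussian affinity between every pair of nodes; sigma = 1 is an assumption.
diff = values[:, None] - values[None, :]
W = np.exp(-diff ** 2 / 2.0)
np.fill_diagonal(W, 0.0)                   # no self-loops

labels = normalized_cut_bipartition(W)
print(labels)
```

The first three values end up in one segment and the last three in the other. Which group gets `True` depends on the arbitrary sign of the eigenvector, so only the grouping is meaningful.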

Face Detection - Viola & Jones, 2001

1999 - 2000 Machine Learning Techniques

  • Statistical machine learning
  • Support vector machines
  • Boosting
  • Graphical models

Five years later, in 2006, Fujifilm released the first digital camera with real-time face detection.

"SIFT" & Object detection - David Lowe, 1999

Feature-based detection

Spatial Pyramid Matching - Lazebnik, Schmid & Ponce, 2006

Recognizes a whole scene, such as a landscape, holistically

Histogram of Gradients - Dalal & Triggs, 2005

Deformable Part Model - Felzenszwalb, McAllester, Ramanan, 2009

Pascal Visual Object Challenge (2006 - 2012)

Mean average precision improved from around 25% to roughly 50% between 2006 and 2012.
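
For reference, here is a simplified sketch of how a mean average precision score can be computed: average precision (AP) per class, then the mean over classes. This is not the exact PASCAL VOC evaluation protocol. The class names, the ranked hit lists, and the function name `average_precision` are made up for illustration, and AP here is averaged only over the positives that appear in the ranking.

```python
def average_precision(ranked_hits):
    """AP for one class: mean of the precision values measured at each
    rank where a correct detection (hit) occurs."""
    hits_so_far = 0
    precisions = []
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            hits_so_far += 1
            precisions.append(hits_so_far / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

# Hypothetical detections, sorted by confidence: 1 = correct, 0 = wrong.
per_class_hits = {
    "cat": [1, 0, 1, 1, 0],
    "dog": [0, 1],
}

per_class_ap = {name: average_precision(hits)
                for name, hits in per_class_hits.items()}
mean_ap = sum(per_class_ap.values()) / len(per_class_ap)

print(per_class_ap)
print("mAP:", mean_ap)
```

For "cat" the hits occur at ranks 1, 3 and 4, so AP is the mean of 1/1, 2/3 and 3/4. For "dog" the only hit is at rank 2, so AP is 1/2.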

ImageNet Large Scale Visual Recognition Challenge

Machine learning models tend to overfit when they are not given enough training data. This motivated ImageNet, a large image dataset organized according to the WordNet hierarchy and labeled through crowdsourcing.

In 2012, a convolutional neural network (deep learning) model won the challenge.