Image and video analysis. Part 1

Lecture 1. Introduction to computer vision

  • Goals of computer vision – metric and semantic vision.
  • Complexity of computer vision and visual cues.
  • Examples of modern computer vision systems and products.
  • Human optical system and digital camera.
  • Color models.

Lecture 2. Image processing basics

  • Main image processing tasks.
  • Global color correction. Histograms. Linear and non-linear brightness correction.
  • Image noise. Image convolution. Gaussian filter, image sharpening, median filter.
  • Edge detection and Canny edge detector.
  • Special effects via image filtering.

Lecture 3. Image processing (cont.)

  • Frequency-based image processing, DCT, JPEG image correction.
  • Basic image segmentation methods – image binarization, connected components extraction, mathematical morphology.
  • Texture.
  • Heuristic methods of image recognition based on image segmentation.

Lecture 4. Local image features

  • Image matching problem.
  • Local image features – Harris detector, LoG, DoG, Harris-Laplace.
  • Image matching via point features – SIFT and affine adaptation.

Lecture 5. Model fitting

  • Parametric models – lines, curves, geometric transformations.
  • Direct Linear Transform (DLT) for lines and geometric transformations.
  • Robust methods – M-estimators, randomized methods (RANSAC), voting schemes (Hough Transform).
  • Applications to object detection and panoramic image stitching.

Lecture 6. Category-level image classification

  • Notion of object category.
  • Image recognition in the human brain.
  • Basic pipeline of image classification. Features, coding and aggregation methods.
  • Visual words and “bag-of-words” model.

Lecture 7. Category-level object detection

  • Object detection via sliding window classification.
  • “Bag-of-words” model for object detection.
  • Histogram of oriented gradients (HOG) plus SVM. Jittering and bootstrapping.
  • Object detection with weak classifiers, boosting and cascades of classifiers. Viola-Jones (VJ) detector. Haar features. Integral images.
  • Hierarchical architectures for object detection.

Lecture 8. Content-based image retrieval (CBIR)

  • History of content-based image retrieval methods, QBIC.
  • Near-duplicate search – GIST descriptor, approximate nearest-neighbor methods, inverted index, Hamming embedding.
  • “Bag-of-words” model for image retrieval.
  • Search result re-ranking via geometric information and query expansion methods.

Lecture 9. Internet vision

  • Creation of large ground truth image datasets, Amazon Mechanical Turk.
  • Image completion, image recognition and photo collage via large datasets.
  • Object filters for image recognition and retrieval.

Lecture 10. Semantic image segmentation

Lecture 11. Object tracking and action recognition

  • Object tracking problem, evaluation of tracking performance.
  • Object tracking methods – pattern matching, edge-based, MeanShift, flock of features, combination of different methods.
  • Action recognition, test datasets, automatic video annotation.
  • Nearest-neighbor based action recognition.
  • “Bag-of-words” model for action recognition.

Lecture 12. Real-time computer vision

  • Augmented reality as an example of real-time computer vision.
  • Random forests and their application to real-time computer vision.
  • Image matching and alignment in real-time.
  • Microsoft Kinect and human pose estimation.

All lectures are accompanied by practical tasks for seminars and homework. Students are also expected to participate in one course project, an image recognition competition.


R. Szeliski, “Computer Vision: Algorithms and Applications”.
D. A. Forsyth, J. Ponce, “Computer Vision: A Modern Approach (2nd Edition)”.
R. Gonzalez, R. Woods, “Digital Image Processing”.