Today, images and video are everywhere. Online photo-sharing sites and social networks have them in the billions. Search engines will produce images for just about any conceivable query. Practically all phones and computers come with built-in cameras. It is not uncommon for people to have many gigabytes of photos and videos on their devices.

Programming a computer and designing algorithms for understanding what is in
these images is the field of computer vision. Computer vision powers applications like image search, robot navigation, medical image analysis, photo management and many more.

The idea behind this book is to give an easily accessible entry point to hands-on
computer vision with enough understanding of the underlying theory and algorithms
to be a foundation for students, researchers and enthusiasts. The Python programming language, the language of choice for this book, comes with many freely available, powerful modules for handling images, mathematical computing and data mining.

When writing this book I have had the following principles as a guideline. The book should:

  • be written in an exploratory style and encourage readers to follow the examples on
    their computers as they are reading the text.
  • promote and use free and open software with a low learning threshold. Python
    was the obvious choice.
  • be complete and self-contained. Not complete as in covering all of computer vision (this book is far from that!) but rather complete in that all code is presented and explained. The reader should be able to reproduce the examples and build upon them directly.
  • be broad rather than detailed, inspiring and motivational rather than theoretical.

In short, the book should act as a source of inspiration for those interested in programming computer
vision applications.

What you need to know

  • Basic programming experience. You need to know how to use an editor and run
    scripts, how to structure code, and how to work with basic data types. Familiarity with Python or other scripting-style languages like Ruby or Matlab will help.
  • Basic mathematics. To make full use of the examples it helps if you know about
    matrices, vectors, matrix multiplication, the standard mathematical functions
    and concepts like derivatives and gradients. Some of the more advanced mathematical examples can be easily skipped.

Chapter overview

Chapter 1 Introduces the basic tools for working with images and the central Python modules used in the book. This chapter also covers many fundamental examples needed for the remaining chapters.

Chapter 2 Explains methods for detecting interest points in images and how to use them to find corresponding points and regions between images.

Chapter 3 Describes basic transformations between images and methods for computing them. Examples range from image warping to creating panoramas.

Chapter 4 Introduces how to model cameras, generate image projections from 3D space to image features and estimate the camera viewpoint.

Chapter 5 Explains how to work with several images of the same scene, the fundamentals of multiple-view geometry and how to compute 3D reconstructions from images.

Chapter 6 Introduces a number of clustering methods and shows how to use them for grouping and organizing images based on similarity or content.

Chapter 7 Shows how to build efficient image retrieval techniques that can store
image representations and search for images based on their visual content.

Chapter 8 Describes algorithms for classifying image content and how to use them for recognizing objects in images.

Chapter 9 Introduces different techniques for dividing an image into meaningful
regions using clustering, user interactions or image models.

Chapter 10 Shows how to use the Python interface for the commonly used OpenCV computer vision library and how to work with video and camera input.

