Convolutional neural networks (CNN) are becoming mainstream in computer vision. In particular, CNNs are widely used for high-level vision tasks, like image classification (AlexNet*, for example). This article (and associated tutorial) describes an example of a CNN for image super-resolution (SR), which is a low-level vision task, and its implementation using the Intel® Distribution for Caffe* framework and Intel® Distribution for Python*. This CNN is based on the work described by Chao Dong, Chen Change Loy, Kaiming He, and Xiaoou Tang1,2, proposing a new approach to performing single-image SR using CNNs.
Some modern camera sensors, present in everyday electronic devices like digital cameras, phones, and tablets, are able to produce reasonably high-resolution (HR) images and videos. The resolution in the images and videos produced by these devices is in many cases acceptable for general use.
However, there are situations where the image or video is considered low resolution (LR). Examples include the following situations:
Super-resolution is a technique to obtain an HR image from one or several LR images. SR can be based on a single image or on several frames in a video sequence.
Single-image (or single-frame) SR uses pairs of LR and HR images to learn the mapping between them. For this purpose, image databases containing LR and HR pairs are created3 and used as a training set. The learned mapping can be used to predict HR details in a new image.
On the other hand, multiple-frame SR is based on several images taken from the same scene, but from slightly different conditions (such as angle, illumination, and position). This technique uses the non-redundant information…