Dr. Tiziana Ligorio x CSCI 493.77 - Deep Learning, Hunter College of the City University of New York

A CNN’s main components are convolutional layers, pooling layers, and dense layers.

Screenshot 2024-03-20 at 5.00.15 PM.png

Image credits: Géron, Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 3rd edition

Filters / Kernels

The terms filter and kernel are used interchangeably.

You can think of the filter as a small 2D matrix of weights that slides across the input of the convolution. At each position, the filter covers a patch of the input of the same size as the filter. The value of the corresponding neuron in the next layer is the sum of the element-wise products of that patch and the filter.

2D_kernel.png
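To make the sliding operation concrete, here is a minimal NumPy sketch. It assumes a single-channel input, no padding, and a stride of 1; the function name `conv2d_single` and the example filter values are made up for illustration and do not come from any library.

```python
import numpy as np

def conv2d_single(image, kernel):
    """Slide `kernel` over a 2D `image` (no padding, stride 1).

    Each output value is the sum of the element-wise products of
    one input patch and the kernel.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]      # same size as the kernel
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

# Hypothetical 4x4 input and 2x2 filter, chosen only for illustration.
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])

feature_map = conv2d_single(image, kernel)
print(feature_map.shape)  # (3, 3)
```

Without padding, a 2 x 2 filter on a 4 x 4 input fits in only 3 x 3 positions, so the output is smaller than the input. Deep learning frameworks implement the same computation far more efficiently and typically add a bias term per filter.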

Color image & Multiple Filters

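The output of applying a single filter is called a feature map. For a color image, each filter spans all of the input channels (for example, a 3 × 3 filter on an RGB image has 3 × 3 × 3 weights) and still produces a single feature map. The output of applying multiple filters is a stack of feature maps. Each feature map has the same spatial dimensions as the input only when the input is zero-padded ("same" padding) and the stride is 1; otherwise it is smaller. The number of feature maps produced (the depth of the output) is the same as the number of filters applied.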
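The shapes involved can be checked with a short NumPy sketch. It assumes zero "same" padding, a stride of 1, and odd filter sizes, so that the output keeps the spatial dimensions of the input; the function name `conv2d_layer`, the 8 x 8 RGB input, and the choice of 5 filters are illustrative assumptions.

```python
import numpy as np

def conv2d_layer(image, filters):
    """Apply several filters to a multi-channel image.

    image:   (H, W, C)
    filters: (kh, kw, C, n_filters), with kh and kw odd
    Uses zero "same" padding and stride 1, so the result is (H, W, n_filters).
    """
    h, w, c = image.shape
    kh, kw, kc, n_filters = filters.shape
    assert kc == c, "each filter must span all input channels"
    ph, pw = kh // 2, kw // 2
    # Zero-pad height and width only, not the channel axis.
    padded = np.pad(image, ((ph, ph), (pw, pw), (0, 0)))
    out = np.zeros((h, w, n_filters))
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + kh, j:j + kw, :]   # (kh, kw, C)
            # Sum over height, width, and channels, once per filter.
            out[i, j, :] = np.tensordot(patch, filters, axes=([0, 1, 2], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))                  # hypothetical 8x8 RGB image
filters = rng.standard_normal((3, 3, 3, 5))    # five 3x3 filters over 3 channels

feature_maps = conv2d_layer(image, filters)
print(feature_maps.shape)  # (8, 8, 5)
```

The last axis of the result has one entry per filter, which is the depth of the output described above.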

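A convolution is a linear operation, so a stack of convolutional layers on its own can only represent linear functions. If the network is to model nonlinear functions, we need to add a nonlinearity here too, just as we would in a network of dense layers: an activation function such as ReLU is applied to each feature map.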
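As a small illustration, the sketch below applies ReLU, one common choice of activation, element-wise to a feature map. The feature map values are made up for the example.

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: negative values become 0, others pass through."""
    return np.maximum(x, 0.0)

# Hypothetical feature map values, chosen only for illustration.
feature_map = np.array([[-5.0, 2.0],
                        [0.5, -1.5]])

activated = relu(feature_map)
print(activated)

# ReLU is not linear: relu(a + b) differs from relu(a) + relu(b) in general.
a, b = -1.0, 1.0
print(relu(a + b), relu(a) + relu(b))
```

The activation keeps the shape of the feature map unchanged; it only changes the values, and that is what lets stacked layers represent more than a single linear map.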