Image recognition

with CNNs, based on deeplearning4j.
Marek Będkowski


marek@e-kursy.it
You probably don't care much, but I'll tell you anyway ;)

Plan

  1. Intro
  2. Origin of CNN - biology-inspired
  3. What is CNN Training?
  4. Live demo
  5. Summary

Artificial Intelligence

Origin of CNN

Brain

source

Mammal brain...

source

Cat vision

A point of light shown to a cat whose gaze is fixed activates a specific part of its visual (striate) cortex

calcarine sulcus (Polish: bruzda ostrogowa) source

Shape detection

Shape detection

+

Shape detection (found)

$$a \cdot b = \sum_{i=1}^{n} a_{i}b_{i} = (50 \cdot 30) + (50 \cdot 30) + (50 \cdot 30) + (20 \cdot 30) + (50 \cdot 30) + (50 \cdot 30) = \underline{8100}$$ In this case the match score is computed by simply multiplying the corresponding values of the two input signals and then summing the products.

Shape detection (not found)

$$a \cdot b = \sum_{i=1}^{n} a_{i}b_{i} = \underline{0} $$
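
A minimal sketch of this matching idea in plain Java (the array values mirror the two cases above; everything else is illustrative): the filter "fires" when its dot product with an image patch is high, and yields zero when there is no overlap.

public class DotProductMatch {

    // dot product of an image patch and a filter of the same length
    static int dot( int[] patch, int[] filter ) {
        int sum = 0;
        for ( int i = 0; i < filter.length; i++ ) {
            sum += patch[i] * filter[i];
        }
        return sum;
    }

    public static void main( String[] args ) {
        int[] filter  = { 30, 30, 30, 30, 30, 30 }; // the shape we look for
        int[] present = { 50, 50, 50, 20, 50, 50 }; // shape present
        int[] absent  = {  0,  0,  0,  0,  0,  0 }; // shape absent

        System.out.println( dot( present, filter ) ); // 8100
        System.out.println( dot( absent, filter ) );  // 0
    }
}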

Multiple shape detection

+

Machine Learning

Possible issues


Machine Learning

Convolution
(Polish: splot)

Digital Signal Processing

source
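
For reference, the standard discrete 1-D convolution from signal processing, which the following example illustrates:

$$(f * g)[n] = \sum_{m=-\infty}^{\infty} f[m]\, g[n-m]$$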

Convolution example

source

Convolution example

How does it relate to image processing?

Convolution = Filter

Can you think of an example of a filter?

Sobel edge detection


Can you spot the pattern?
source

Sobel edge detection

$$\lvert G \rvert = \sqrt{G_x^2 + G_y^2}$$
Sobel edge detection source
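
A minimal sketch of the Sobel operator in plain Java, assuming a grayscale image stored as a 2-D int array (the two 3x3 kernels are the standard Sobel kernels; the rest is illustrative):

public class Sobel {

    static final int[][] GX = { { -1, 0, 1 }, { -2, 0, 2 }, { -1, 0, 1 } };
    static final int[][] GY = { { -1, -2, -1 }, { 0, 0, 0 }, { 1, 2, 1 } };

    // gradient magnitude |G| = sqrt(Gx^2 + Gy^2) at inner pixel (x, y)
    static double magnitude( int[][] img, int x, int y ) {
        int gx = 0, gy = 0;
        for ( int j = -1; j <= 1; j++ ) {
            for ( int i = -1; i <= 1; i++ ) {
                int p = img[y + j][x + i];
                gx += GX[j + 1][i + 1] * p;
                gy += GY[j + 1][i + 1] * p;
            }
        }
        return Math.sqrt( gx * gx + gy * gy );
    }
}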

Sobel derivative


source (OpenCV)

Possible issues

Convolution layer!

Convolution layer

import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;

new ConvolutionLayer.Builder( /*kernelSize*/ 5, 5 ) // 5x5 px kernel
  .nIn( 1 )        // single channel: black and white image as input
  .stride( 1, 1 )  // move the kernel by 1 pixel
  .nOut( 20 );     // output depth = number of filters

Definition of multiple filters

Multiple filters

Convolution layer output?

Shape

Filters visualisation

source

Convolution output size

$$\frac{W - K + 2P}{S} + 1$$

where \(W\) is the input width, \(K\) the kernel size, \(P\) the padding and \(S\) the stride.

Convolution layer

new ConvolutionLayer.Builder( /*kernelSize*/ 5, 5 ) // 5x5 px kernel
  .nIn( 1 )        // single channel: black and white image as input
  .stride( 1, 1 )  // move the kernel by 1 pixel
  .nOut( 20 );     // output depth = number of filters

$$\frac{W - K + 2P}{S} + 1 = \frac{28 - 5 + 0}{1} + 1 = 24$$

OUTPUT: 24x24x20
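
As a sanity check, deeplearning4j can derive these shapes for you: declaring the input type on the configuration's list builder lets the library infer each layer's nIn and output size. A minimal sketch, assuming the 28x28 single-channel MNIST input used above:

import org.deeplearning4j.nn.conf.inputs.InputType;

// (...) on the NeuralNetConfiguration list builder:
  .setInputType( InputType.convolutionalFlat( 28, 28, 1 ) ) // height, width, depth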

Convolution

source

Convolution

CNN forward pass

What's missing?

Loss Function

A measure of the inconsistency between the predicted value \(\hat{y}\) and the actual label \(y\)

Loss Function - examples

They should be differentiable: functions whose derivative exists at each point in their domain.

source
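
One such example, used in the demo code below, is the negative log-likelihood: with softmax outputs \(p_{k}\), the loss for a sample whose true class is \(y_{i}\) is

$$L_{i} = -\log p_{y_{i}}$$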

Code example

import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions.LossFunction;

// (...)
// last layer in our network
.layer( 5, new OutputLayer.Builder( LossFunction.NEGATIVELOGLIKELIHOOD )
  .nOut( 10 )                       // 10 classes = 10 digits
  .activation( Activation.SOFTMAX ) // class probabilities
  .build() )

What is CNN training?

Optimization problem

Training CNNs is normally done using a gradient-based optimization method

source

Gradient descent

Gradient descent

Optimization algorithms

sources: 1 2

CNN backward pass

So we start doing magic!

First derivative

The first stage is the propagation of errors backwards through the network (i.e. backpropagation) in order to evaluate the derivatives.

$$\frac{\partial L_{i}}{\partial f_{k}} = p_{k} - \mathbb{1}(y_{i} = k)$$

This allows the output derivatives to be computed in a memory-efficient manner. source

Parameter update

These derivatives are then used to adjust the weights; the simplest such technique is gradient descent.

source
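
In its plain form the update is the standard rule (with \(\eta\) denoting the learning rate):

$$w \leftarrow w - \eta \frac{\partial L}{\partial w}$$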

Code example

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.nd4j.linalg.learning.config.Nesterovs;

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
  .updater( new Nesterovs( 0.01, 0.9 ) ) // Nesterov momentum: learning rate 0.01, momentum 0.9
  // (...)

What's being updated?

What are the weights in a convolution layer?

Filters!

Possible issues

Deeplearning performance

Demos

Summary

The End!