A Simplified Explanation of CNN Component Layers in Deep Learning
Introduction
In this article, we will explain and interpret the layers of a convolutional neural network (CNN): what each layer does, the role it plays, and how it affects the network.
Defining the main CNN layers and the weight and bias initializer arguments:
In this reading, we will define most of the common layers and investigate different ways to initialize the weights and biases in the layers of a neural network.
Importing the required libraries:
import tensorflow as tf
import pandas as pd
What is TensorFlow?
TensorFlow is one of the most important and well-known open-source platforms for deep learning and artificial intelligence applications, and it is widely used to build and train neural networks.
Here we will learn about default weights and biases
In the models we’ve worked with so far, we haven’t specified initial values for the weights and biases in each layer of our neural networks. The default values for weights and biases in TensorFlow depend on the type of layers we are using. For example, in a “Dense” layer, the biases are set to zero (“zeros”) by default, while the weights are set according to “glorot_uniform”, Glorot’s uniform initializer.
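As a quick check (a minimal sketch, assuming TensorFlow 2.x with eager execution), we can build a Dense layer and inspect the initializers it receives by default:

# Build a Dense layer without specifying any initializers
layer = tf.keras.layers.Dense(units=3)
layer.build(input_shape=(None, 4))   # create the weights

print(layer.kernel_initializer)      # a GlorotUniform instance
print(layer.bias_initializer)        # a Zeros instance
print(layer.bias.numpy())            # [0. 0. 0.]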
Here we will look at defining our own weights and biases for illustration and experimentation:
This is often where we want to configure our own weights and biases, and TensorFlow makes this process quite straightforward. When we build a model in TensorFlow, each layer has the optional arguments ‘kernel_initializer’ and ‘bias_initializer’, which are used to set the weights and biases respectively. If the layer has no weights or biases (e.g. a max pooling layer), trying to set ‘kernel_initializer’ or ‘bias_initializer’ will throw an error, as the short check below shows.
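For instance (a minimal sketch, not from the original article; in TF 2.x Keras this raises a TypeError):

from tensorflow.keras.layers import MaxPooling1D

try:
    # MaxPooling1D has no weights, so this keyword argument is not understood
    MaxPooling1D(pool_size=2, kernel_initializer="zeros")
except TypeError as e:
    print(e)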
Let’s illustrate an example that uses some of the different configurations available in Keras.
What is the Sequential model?
Sequential is a well-known way to build a model in Keras. It allows you to build a model layer by layer, where each layer’s output feeds into the layer that follows it. We use the “add()” function to add layers to our model; later on we will add two hidden layers and an output layer.
What is the difference between the Sequential and functional APIs?
Sequential and functional are two ways to build Keras models. The Sequential model is the simplest type of model: a linear stack of layers. If we need to build arbitrary graphs of layers, the Keras functional API can do that for us. Keras also provides some datasets, which can be loaded using Keras directly.
What is the use of the Sequential API?
The Sequential API allows you to create models layer by layer by stacking them. It is limited in that it does not allow you to create models that share layers or have multiple inputs or outputs.
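To make the contrast concrete, here is a minimal sketch of a small model written with the functional API (the layer sizes are illustrative assumptions):

from tensorflow.keras import Model
from tensorflow.keras.layers import Input, Dense

# In the functional API, layers are called on tensors, which allows arbitrary graphs
inputs = Input(shape=(16,))
x = Dense(32, activation='relu')(inputs)
outputs = Dense(1, activation='sigmoid')(x)
functional_model = Model(inputs=inputs, outputs=outputs)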
from tensorflow.keras.models import Sequential
model = Sequential()
What is the Flatten layer?
Flatten is a simple layer that reshapes each input sample into a one-dimensional vector, without affecting the batch size. It is typically used to connect convolutional or pooling layers to dense layers (see the short example after the arguments list below).
from tensorflow.keras.layers import Flatten
model.add(Flatten(data_format=None))
Arguments
- data_format: A string, one of "channels_last" (default) or "channels_first". The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch, ..., channels) while "channels_first" corresponds to inputs with shape (batch, channels, ...). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".
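For example (a small sketch with made-up shapes), Flatten collapses everything except the batch dimension:

import tensorflow as tf

x = tf.random.normal((2, 4, 3))      # a batch of 2 inputs of shape (4, 3)
y = tf.keras.layers.Flatten()(x)
print(y.shape)                       # (2, 12): 4 * 3 features per sample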
What is the Dense layer?
The dense layer is also called the fully connected layer, which is widely used in deep learning models.
When we look at the structure of a neural network, we find that a dense layer is tightly coupled to the layer before it: each of its neurons is connected to every neuron in the previous layer. It is the most widely used layer in artificial neural networks and works well for producing reliable results.
from tensorflow.keras.layers import Dense
# These are all the parameters of Dense
model.add(Dense(units,
                activation=None,
                use_bias=True,
                kernel_initializer="glorot_uniform",
                bias_initializer="zeros",
                kernel_regularizer=None,
                bias_regularizer=None,
                activity_regularizer=None,
                kernel_constraint=None,
                bias_constraint=None))
Arguments
- units: Positive integer, the dimensionality of the output space.
- activation: Activation function to use. If you don’t specify anything, no activation is applied (i.e. "linear" activation: a(x) = x).
- use_bias: Boolean, whether the layer uses a bias vector.
- kernel_initializer: Initializer for the kernel weights matrix.
- bias_initializer: Initializer for the bias vector.
- kernel_regularizer: Regularizer function applied to the kernel weights matrix.
- bias_regularizer: Regularizer function applied to the bias vector.
- activity_regularizer: Regularizer function applied to the output of the layer (its "activation").
- kernel_constraint: Constraint function applied to the kernel weights matrix.
- bias_constraint: Constraint function applied to the bias vector.
Input shape
N-D tensor with shape (batch_size, ..., input_dim). The most common situation would be a 2D input with shape (batch_size, input_dim).
Output shape
N-D tensor with shape (batch_size, ..., units). For instance, for a 2D input with shape (batch_size, input_dim), the output would have shape (batch_size, units).
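A quick shape check (the values here are illustrative assumptions):

import tensorflow as tf

x = tf.random.normal((32, 10))           # (batch_size, input_dim)
y = tf.keras.layers.Dense(units=5)(x)
print(y.shape)                           # (32, 5): (batch_size, units)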
What is the Conv1D layer?
Let’s take a look at the 1D convolution layer (e.g. temporal convolution). This layer creates a convolution kernel that is convolved with the input of the layer over a single spatial (or temporal) dimension to produce an output tensor. If use_bias is True, a bias vector is created and added to the output.
from tensorflow.keras.layers import Conv1D
# These are all the parameters of Conv1D
model.add(Conv1D(filters,
                 kernel_size,
                 strides=1,
                 padding="valid",
                 data_format="channels_last",
                 dilation_rate=1,
                 groups=1,
                 activation=None,
                 use_bias=True,
                 kernel_initializer="glorot_uniform",
                 bias_initializer="zeros",
                 kernel_regularizer=None,
                 bias_regularizer=None,
                 activity_regularizer=None,
                 kernel_constraint=None,
                 bias_constraint=None))
Arguments
- filters: Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).
- kernel_size: An integer or tuple/list of a single integer, specifying the length of the 1D convolution window.
- strides: An integer or tuple/list of a single integer, specifying the stride length of the convolution. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.
- padding: One of "valid", "same" or "causal" (case-insensitive). "valid" means no padding. "same" results in padding with zeros evenly to the left/right or up/down of the input such that the output has the same height/width dimension as the input. "causal" results in causal (dilated) convolutions, e.g. output[t] does not depend on input[t+1:]. Useful when modeling temporal data where the model should not violate the temporal order. See WaveNet: A Generative Model for Raw Audio, section 2.1.
- data_format: A string, one of "channels_last" (default) or "channels_first". The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch_size, width, channels) while "channels_first" corresponds to inputs with shape (batch_size, channels, width). Note that the "channels_first" format is currently not supported by TensorFlow on the CPU.
- dilation_rate: An integer or tuple/list of a single integer, specifying the dilation rate to use for dilated convolution. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1.
- groups: A positive integer specifying the number of groups in which the input is split along the channel axis. Each group is convolved separately with filters / groups filters. The output is the concatenation of all the groups' results along the channel axis. Input channels and filters must both be divisible by groups.
- activation: Activation function to use. If you don’t specify anything, no activation is applied (see keras.activations).
- use_bias: Boolean, whether the layer uses a bias vector.
- kernel_initializer: Initializer for the kernel weights matrix (see keras.initializers). Defaults to "glorot_uniform".
- bias_initializer: Initializer for the bias vector (see keras.initializers). Defaults to "zeros".
- kernel_regularizer: Regularizer function applied to the kernel weights matrix (see keras.regularizers).
- bias_regularizer: Regularizer function applied to the bias vector (see keras.regularizers).
- activity_regularizer: Regularizer function applied to the output of the layer (its "activation") (see keras.regularizers).
- kernel_constraint: Constraint function applied to the kernel matrix (see keras.constraints).
- bias_constraint: Constraint function applied to the bias vector (see keras.constraints).
Input shape
3+D tensor with shape: batch_shape + (steps, input_dim)
Output shape
3+D tensor with shape: batch_shape + (new_steps, filters). The steps value might have changed due to padding or strides.
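For instance (a hypothetical configuration), with "valid" padding and a stride of 2 the number of steps shrinks from 128 to floor((128 - 3) / 2) + 1 = 63:

import tensorflow as tf

x = tf.random.normal((8, 128, 4))    # (batch, steps, channels)
conv = tf.keras.layers.Conv1D(filters=16, kernel_size=3, strides=2, padding="valid")
print(conv(x).shape)                 # (8, 63, 16)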
What is the MaxPooling1D layer?
Max pooling operation for 1D temporal data. Downsamples the input representation by taking the maximum value over a spatial window of size pool_size. The window is shifted by strides.
from tensorflow.keras.layers import MaxPooling1D
# These are all the parameters of MaxPooling1D
model.add(MaxPooling1D(pool_size=2, strides=None,
                       padding="valid", data_format="channels_last"))
Arguments
- pool_size: Integer, size of the max pooling window.
- strides: Integer, or None. Specifies how much the pooling window moves for each pooling step. If None, it will default to pool_size.
- padding: One of "valid" or "same" (case-insensitive). "valid" means no padding. "same" results in padding evenly to the left/right or up/down of the input such that the output has the same height/width dimension as the input.
- data_format: A string, one of "channels_last" (default) or "channels_first". The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (batch, steps, features) while "channels_first" corresponds to inputs with shape (batch, features, steps).
Input shape
- If data_format='channels_last': 3D tensor with shape (batch_size, steps, features).
- If data_format='channels_first': 3D tensor with shape (batch_size, features, steps).
Output shape
- If data_format='channels_last': 3D tensor with shape (batch_size, downsampled_steps, features).
- If data_format='channels_first': 3D tensor with shape (batch_size, features, downsampled_steps).
Here we can add all these layers together and build a small CNN:
# Construct a model
model = Sequential([
    # Convolution layer
    Conv1D(filters=16, kernel_size=3, input_shape=(128, 64),
           kernel_initializer='random_uniform', bias_initializer="zeros", activation='relu'),
    # Max pooling layer
    MaxPooling1D(pool_size=4),
    # Flatten layer
    Flatten(),
    # Dense layer
    Dense(64, kernel_initializer='he_uniform', bias_initializer='ones', activation='relu'),
])
As the following example illustrates, we can also instantiate initializers in a slightly different manner, allowing us to set optional arguments of the initialization method.
Here we can add some layers to our model:
model.add(Dense(64,
                kernel_initializer=tf.keras.initializers.RandomNormal(mean=0.0, stddev=0.05),
                bias_initializer=tf.keras.initializers.Constant(value=0.4),
                activation='relu'))
model.add(Dense(8,
                kernel_initializer=tf.keras.initializers.Orthogonal(gain=1.0, seed=None),
                bias_initializer=tf.keras.initializers.Constant(value=0.4),
                activation='relu'))
Here we can also define our own custom weight and bias initializers. Custom initializers must take two arguments: the ‘shape’ of the tensor to initialize and its ‘dtype’. The following example illustrates this.
import tensorflow.keras.backend as K
# Define a custom initializer
def my_init(shape, dtype=None):
    return K.random_normal(shape, dtype=dtype)
model.add(Dense(64, kernel_initializer=my_init))
Here we can see the model summary.
# Print the model summary
model.summary()
Visualizing the initialized weights and biases.
Here, we can see the effect of our initializers on the weights and biases by plotting histograms of the resulting values, and compare these plots with the configuration specified for each layer above.
import matplotlib.pyplot as plt

# Plot histograms of the weight and bias values, one row of subplots per layer
fig, axes = plt.subplots(5, 2, figsize=(12, 16))
fig.subplots_adjust(hspace=0.5, wspace=0.5)

# Filter out the pooling and flatten layers, which don't have any weights
weight_layers = [layer for layer in model.layers if len(layer.weights) > 0]

for i, layer in enumerate(weight_layers):
    for j in [0, 1]:  # j=0: kernel weights, j=1: biases
        axes[i, j].hist(layer.weights[j].numpy().flatten(), align='left')
        axes[i, j].set_title(layer.weights[j].name)

plt.show()