Implementing Convolutional Neural Networks using Tensorflow¶

Convolutional Neural Networks (CNNs) are a class of deep learning models that excel at working with image data. Instead of processing each pixel independently (like in a fully connected neural network), CNNs use filters (or kernels) to scan across the image, capturing spatial hierarchies and local patterns like edges, textures, and shapes.

AI Summary

This document walks through the implementation of a basic Convolutional Neural Network (CNN) using TensorFlow, designed to classify images from the CIFAR-10 dataset. It introduces core CNN components such as convolutional, pooling, and dense layers, and explains their roles in feature extraction and classification. The model is built with TensorFlow's Keras API, trained for 10 epochs, and achieves around 70% test accuracy. Readers will also learn how to visualize training progress and evaluate model performance. This is a great starting point for beginners to understand how CNNs work and how to implement them from scratch.

CNNs consist of layers such as:

Convolutional layers, which apply learnable filters to detect features.
ReLU activations, which introduce non-linearity.
Pooling layers, which reduce spatial size and help generalize.
Fully connected layers, which perform final classification based on extracted features.

This architecture allows CNNs to automatically learn useful features from raw pixel data with minimal pre-processing. They're widely used in image recognition, medical imaging, self-driving cars, and more.

Implementation of CNN using Tensorflow¶

1. Import Libraries¶

import tensorflow as tf
from tensorflow.keras import layers, models
import matplotlib.pyplot as plt
import numpy as np

2. Load and Preprocess the Dataset¶

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 3s 0us/step

The CIFAR-10 dataset contains 60,000 32x32 color images in 10 categories. We normalize the image data and one-hot encode the labels for training.

3. Build the CNN Model¶

model = models.Sequential([
    layers.Input((32, 32, 3)),
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),

    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),

    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')  # 10 classes
], name="simple-cnn")

model.summary()

Model: "simple-cnn"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                      │ (None, 30, 30, 32)          │             896 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d (MaxPooling2D)         │ (None, 15, 15, 32)          │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_1 (Conv2D)                    │ (None, 13, 13, 64)          │          18,496 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ max_pooling2d_1 (MaxPooling2D)       │ (None, 6, 6, 64)            │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ flatten (Flatten)                    │ (None, 2304)                │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense (Dense)                        │ (None, 64)                  │         147,520 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense)                      │ (None, 10)                  │             650 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘

 Total params: 167,562 (654.54 KB)

 Trainable params: 167,562 (654.54 KB)

 Non-trainable params: 0 (0.00 B)

This simple CNN has two convolutional layers followed by max-pooling layers. After flattening the features, it passes through two dense layers to produce a probability distribution over the 10 classes.

This is a basic convolutional neural network (CNN) that you’ll often see in introductory deep learning tutorials. This model is inspired by LeNet.

4. Compile and Training the Model¶

model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy'],
)

history = model.fit(x_train, y_train, epochs=10, batch_size=64, validation_data=(x_test, y_test), )

Epoch 1/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 10s 7ms/step - accuracy: 0.3820 - loss: 1.7106 - val_accuracy: 0.5632 - val_loss: 1.2533
Epoch 2/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 4s 5ms/step - accuracy: 0.5817 - loss: 1.1892 - val_accuracy: 0.6254 - val_loss: 1.0929
Epoch 3/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - accuracy: 0.6421 - loss: 1.0342 - val_accuracy: 0.6310 - val_loss: 1.0580
Epoch 4/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 5s 4ms/step - accuracy: 0.6757 - loss: 0.9403 - val_accuracy: 0.6653 - val_loss: 0.9575
Epoch 5/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 5s 4ms/step - accuracy: 0.7016 - loss: 0.8715 - val_accuracy: 0.6787 - val_loss: 0.9492
Epoch 6/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - accuracy: 0.7199 - loss: 0.8115 - val_accuracy: 0.6936 - val_loss: 0.8958
Epoch 7/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - accuracy: 0.7385 - loss: 0.7540 - val_accuracy: 0.6931 - val_loss: 0.9070
Epoch 8/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 5s 4ms/step - accuracy: 0.7550 - loss: 0.7088 - val_accuracy: 0.6888 - val_loss: 0.9137
Epoch 9/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - accuracy: 0.7671 - loss: 0.6708 - val_accuracy: 0.7036 - val_loss: 0.8936
Epoch 10/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 6s 4ms/step - accuracy: 0.7799 - loss: 0.6332 - val_accuracy: 0.7029 - val_loss: 0.9034

plt.plot(history.history['accuracy'], label='train acc')
plt.plot(history.history['val_accuracy'], label='val acc')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.title('Training and Validation Accuracy')
plt.show()

CNN Training History

5. Evaluate the Model¶

test_loss, test_acc = model.evaluate(x_test, y_test)
print(f'Test accuracy: {test_acc:.4f}')

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7010 - loss: 0.9037
Test accuracy: 0.7029

6. Sample Prediction¶

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

index = np.random.randint(len(x_test))
sample_image = x_test[index]
true_label = np.argmax(y_test[index])

prediction = model.predict(np.expand_dims(sample_image, axis=0))
predicted_label = np.argmax(prediction)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 29ms/step

plt.imshow(sample_image)
plt.title(f"Actual: {class_names[true_label]} | Predicted: {class_names[predicted_label]}")
plt.axis('off')
plt.show()

CNN Sample Prediction

Conclusion¶

We explored what Convolutional Neural Networks (CNNs) are and how to implement a simple one using TensorFlow and Keras. While this model is simple, it's a strong foundation for understanding how more advanced architectures work. CNNs are powerful tools for working with image data because they can automatically learn meaningful patterns and hierarchies from raw pixels.