How to Load a Dataset in TensorFlow

This example loads the MNIST dataset with TensorFlow, normalizes it, and batches it using the tf.data API. It shows a common way to prepare training and test data for a deep learning model.

Key Features:

  • MNIST Dataset: Loads the MNIST dataset, which contains 70,000 grayscale 28×28 images of handwritten digits (60,000 for training, 10,000 for testing).
  • Data Normalization: Scales the pixel values from the [0, 255] range to the [0, 1] range, which generally helps training converge.
  • Shuffling and Batching: The training data is shuffled and split into batches of 32; the test data is batched but not shuffled.
  • Data Verification: Prints the shapes of the batched images and labels to confirm that the data was prepared as expected.

Code:

# Step 1: Import necessary libraries
import tensorflow as tf

# Step 2: Load a built-in dataset (e.g., MNIST)
mnist = tf.keras.datasets.mnist

# Step 3: Load the predefined training and test splits (downloaded on first use)
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Step 4: Normalize pixel values to [0, 1] (cast to float32 to avoid float64 arrays)
x_train, x_test = x_train.astype("float32") / 255.0, x_test.astype("float32") / 255.0

# Step 5: Create a TensorFlow Dataset object from the NumPy arrays
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
test_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test))

# Step 6: Shuffle and batch the training dataset; batch the test dataset without shuffling
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(32)
test_dataset = test_dataset.batch(32)

# Step 7: Print a sample batch for verification
for images, labels in train_dataset.take(1):
    print("Sample batch shape:", images.shape, labels.shape)

# Step 8: Iterate through the dataset to confirm loading
for images, labels in train_dataset.take(5):
    print("Batch images shape:", images.shape)
    print("Batch labels shape:", labels.shape)

Output:

Sample batch shape: (32, 28, 28) (32,)
Batch images shape: (32, 28, 28)
Batch labels shape: (32,)
Batch images shape: (32, 28, 28)
Batch labels shape: (32,)
Batch images shape: (32, 28, 28)
Batch labels shape: (32,)
Batch images shape: (32, 28, 28)
Batch labels shape: (32,)
Batch images shape: (32, 28, 28)
Batch labels shape: (32,)
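
The leading 32 in each shape above is the batch size passed to batch(32). As a rough check on how many batches each split produces, here is a plain-Python sketch of the arithmetic. It assumes MNIST's standard split sizes (60,000 training and 10,000 test examples); batch_sizes is a hypothetical helper written for this illustration, not a TensorFlow function.

```python
def batch_sizes(num_examples, batch_size, drop_remainder=False):
    """Return the size of every batch when num_examples are split into batches.

    Hypothetical helper for illustration only; it mirrors the arithmetic
    of batching, not TensorFlow's implementation.
    """
    full_batches, remainder = divmod(num_examples, batch_size)
    sizes = [batch_size] * full_batches
    # A final, smaller batch is kept unless the remainder is dropped.
    if remainder and not drop_remainder:
        sizes.append(remainder)
    return sizes


train = batch_sizes(60_000, 32)  # assumed MNIST training split size
test = batch_sizes(10_000, 32)   # assumed MNIST test split size

print(len(train), train[-1])  # 1875 32
print(len(test), test[-1])    # 313 16
```

The training split divides evenly, so every training batch has 32 examples. The test split does not, so its last batch holds only 16. If a model needs a fixed batch size, Dataset.batch accepts drop_remainder=True to discard that short final batch.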

Explanation:

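This program loads the MNIST dataset with tf.keras.datasets.mnist, scales the pixel values to [0, 1], and wraps the NumPy arrays in tf.data.Dataset objects for training and testing. The training set is shuffled with a 1,024-element buffer, and both sets are split into batches of 32. Because the buffer is smaller than the 60,000 training examples, the shuffle is approximate rather than a uniform shuffle of the whole set. The code then prints the shapes of the first few training batches: each holds 32 images of 28×28 pixels and 32 labels. This is a common preprocessing step before training a model in TensorFlow.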
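
To make the effect of buffer_size more concrete, here is a minimal pure-Python sketch of the idea behind a shuffle buffer. It is an illustration under simplifying assumptions, not TensorFlow's actual implementation, and buffered_shuffle is a hypothetical name introduced only for this example.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in an order produced by a fixed-size shuffle buffer.

    Illustrative sketch only: fill a buffer, then repeatedly emit a random
    element from it and put the next incoming item in the freed slot.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling the buffer; nothing is emitted yet.
            buffer.append(item)
            continue
        # Buffer is full: emit a random element and replace it with the new item.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input is exhausted: emit whatever remains in random order.
    rng.shuffle(buffer)
    yield from buffer


order = list(buffered_shuffle(range(20), buffer_size=4, seed=0))
print(order)
```

With a buffer of 4, an element can move toward the front by at most 3 positions, so items from the end of the input never appear at the start. The same reasoning applies to shuffle(buffer_size=1024) on 60,000 examples. A buffer at least as large as the dataset is needed for a full uniform shuffle, at the cost of holding that many elements in memory.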
