How to encode an image dataset to reduce its dimensionality and visualize it in the 2D space.
When manipulating semantic segmentation datasets, I found myself having to downsize segmentation masks without adding extra colors. If the image is cleanly encoded as a PNG, it contains only the colors that represent the classes in the label map, with no intermediate colors introduced by antialiasing.
When resizing, though, antialiasing may soften the edges by blending neighboring pixels, which adds new colors that don't belong to any class in the label map. We can overcome this problem by loading (or decoding) the input images with TensorFlow as PNG and resizing them with TensorFlow's NEAREST_NEIGHBOR resizing method.
(The TensorFlow documentation for tf.image.ResizeMethod lists all the available resize methods and explains what each of them does.)
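To see why the choice of method matters, here is a toy pure-Python sketch, separate from the TensorFlow code. The 4x4 mask, the class ids, and the helper names are all made up for illustration, and the sampling positions are simplified rather than TensorFlow's exact ones. It compares a nearest-neighbor style downsample with a block-averaging one:

```python
# Toy label mask: 0 = background, 2 = road (hypothetical class ids).
mask = [
    [0, 0, 2, 2],
    [0, 0, 2, 2],
    [0, 2, 2, 2],
    [0, 2, 2, 2],
]


def resize_nearest(mask, factor):
    """Downsample by keeping one existing pixel out of every `factor`."""
    return [row[::factor] for row in mask[::factor]]


def resize_average(mask, factor):
    """Downsample by averaging each factor x factor block (a smoothing resize)."""
    out = []
    for r in range(0, len(mask), factor):
        out_row = []
        for c in range(0, len(mask[0]), factor):
            block = [mask[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def unique_values(mask):
    """Collect the distinct values present in a 2D mask."""
    return {value for row in mask for value in row}


nearest = resize_nearest(mask, 2)
averaged = resize_average(mask, 2)

print(unique_values(nearest))   # only values already present in the mask
print(unique_values(averaged))  # includes a blended value on the class boundary
```

Nearest neighbor only ever copies existing pixels, so the set of values can shrink but never grow. Averaging blends the block that straddles the class boundary and produces a value that matches no class.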
import tensorflow as tf
# Read image file
img = tf.io.read_file('/path/to/input/image.png')
# Decode as PNG
img = tf.io.decode_png(
    img,
    channels=3,
    dtype=tf.uint8
)
# Resize using nearest neighbor to avoid adding new colors
# (antialias has no effect with this resize method)
img = tf.image.resize(
    img,
    (128, 128),  # (height, width)
    antialias=False,  # No effect when using NEAREST_NEIGHBOR
    method=tf.image.ResizeMethod.NEAREST_NEIGHBOR
)
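# Save the resized image back to PNG
# scale=False keeps the pixel values as they are; the default (scale=True)
# rescales them to the 0-255 range, which can change the class colors
tf.keras.preprocessing.image.save_img(
    '/path/to/output/image.png',
    img,
    scale=False
)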
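One way to sanity-check a resized mask is to look for colors that no class in the label map uses. The sketch below is pure Python and only an illustration: the function name, the label map, and the sample pixels are assumptions, not part of the original workflow.

```python
def colors_outside_label_map(pixels, label_map):
    """Return the RGB colors found in `pixels` that no class in `label_map` uses.

    `pixels` is a nested list of rows of (R, G, B) values;
    `label_map` maps a class name to its (R, G, B) color.
    """
    allowed = set(label_map.values())
    found = {tuple(pixel) for row in pixels for pixel in row}
    return found - allowed


# Hypothetical label map and two tiny 2x2 masks.
label_map = {"background": (0, 0, 0), "road": (128, 64, 128)}
clean = [[(0, 0, 0), (128, 64, 128)],
         [(0, 0, 0), (128, 64, 128)]]
blurred = [[(0, 0, 0), (64, 32, 64)],   # (64, 32, 64) is a blended color
           [(0, 0, 0), (128, 64, 128)]]

print(colors_outside_label_map(clean, label_map))    # empty set
print(colors_outside_label_map(blurred, label_map))  # the blended color
```

With the resized tensor from the snippet above, something like img.numpy().tolist() should give the nested list this helper expects, assuming eager execution.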
try:
    %tensorflow_version 2.x
except Exception:
    pass
import tensorflow as tf
Note: %tensorflow_version is only available in Colab, not in regular Python.
Note: This was useful when TensorFlow 1 was the default in Colab. TensorFlow 2 is now available there by default.