Commit 65339fa

No public description
PiperOrigin-RevId: 714957630
1 parent 41cc801 commit 65339fa

2 files changed: +28 -8 lines changed

official/vision/dataloaders/segmentation_input.py

Lines changed: 8 additions & 3 deletions
@@ -157,9 +157,14 @@ def _prepare_image_and_label(self, data):
         dtype=tf.uint8,
     )
     image = tf.reshape(image, (height, width, self._image_feature.num_channels))
-    # Normalizes the image feature with mean and std values, which are divided
-    # by 255 because an uint8 image are re-scaled automatically. Images other
-    # than uint8 type will be wrongly normalized.
+    # Normalizes the image feature.
+    # The mean and stddev values are divided by 255 to ensure correct
+    # normalization, as the input `uint8` image is automatically converted to
+    # `float32` and rescaled to values in the range [0, 1] before the
+    # normalization happens (as a pre-processing step). So, we re-scale the
+    # mean and stddev values to the range [0, 1] beforehand.
+    # See `preprocess_ops.normalize_image` for details on the expected ranges
+    # for the image mean (`offset`) and stddev (`scale`).
     image = preprocess_ops.normalize_image(
         image,
         [mean / 255.0 for mean in self._image_feature.mean],
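
The reasoning in the new comment can be checked with a little arithmetic. The following is a minimal pure-Python sketch, not the library code: `normalize_uint8_pixel` is a hypothetical stand-in that only mimics the described behavior (rescale a `uint8` value to [0, 1], then subtract the offset and divide by the scale), and the mean and stddev values are illustrative assumptions rather than values taken from this commit.

```python
# Illustrative per-channel statistics on the 0-255 scale (assumed values).
example_mean_255 = (123.675, 116.28, 103.53)
example_std_255 = (58.395, 57.12, 57.375)


def normalize_uint8_pixel(pixel, offset, scale):
    """Hypothetical stand-in: rescale a uint8 pixel to [0, 1], then normalize."""
    rescaled = [channel / 255.0 for channel in pixel]
    return [(c - o) / s for c, o, s in zip(rescaled, offset, scale)]


pixel = (200, 100, 50)  # one RGB pixel with uint8-range values

# What the call site does: divide the statistics by 255 before passing them in.
offset = [m / 255.0 for m in example_mean_255]
scale = [s / 255.0 for s in example_std_255]
normalized = normalize_uint8_pixel(pixel, offset, scale)

# Reference: normalize directly on the 0-255 scale.
reference = [
    (p - m) / s for p, m, s in zip(pixel, example_mean_255, example_std_255)
]

# Passing 0-255 statistics for a uint8 image mixes the two scales.
wrong = normalize_uint8_pixel(pixel, example_mean_255, example_std_255)

print(normalized)
print(reference)
print(wrong)
```

Under these assumptions, `normalized` and `reference` agree up to floating-point rounding, because ((p / 255) - (m / 255)) / (s / 255) simplifies to (p - m) / s. The `wrong` values are pushed far below zero, since an image in [0, 1] has a 0-255 mean subtracted from it.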

official/vision/ops/preprocess_ops.py

Lines changed: 20 additions & 5 deletions
@@ -82,13 +82,28 @@ def normalize_image(
 ) -> tf.Tensor:
   """Normalizes the image to zero mean and unit variance.
 
-  If the input image dtype is float, it is expected to either have values in
-  [0, 1) and offset is MEAN_NORM, or have values in [0, 255] and offset is
-  MEAN_RGB.
+  This function normalizes the input image by subtracting the `offset`
+  and dividing by the `scale`.
+
+  **Important Note about Input Types and Normalization:**
+
+  * **Integer Images:** If the input `image` is an integer type (e.g., `uint8`),
+    the provided `offset` and `scale` values should be already **normalized**
+    to the range [0, 1]. This is because the function converts integer images to
+    float32 with values in the range [0, 1] before the normalization happens.
+
+  * **Float Images:** If the input `image` is a float type (e.g., `float32`),
+    the `offset` and `scale` values should be in the **same range** as the
+    image data.
+      - If the image has values in [0, 1], the `offset` and `scale` should
+        also be in [0, 1].
+      - If the image has values in [0, 255], the `offset` and `scale` should
+        also be in [0, 255].
 
   Args:
-    image: A tf.Tensor in either (1) float dtype with values in range [0, 1) or
-      [0, 255], or (2) int type with values in range [0, 255].
+    image: A `tf.Tensor` in either:
+      (1) float dtype with values in range [0, 1) or [0, 255], or
+      (2) int type with values in range [0, 255].
     offset: A tuple of mean values to be subtracted from the image.
     scale: A tuple of normalization factors.
 
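
The three input combinations listed in the revised docstring can be compared side by side. This is a rough pure-Python sketch of the documented contract only: `normalize_pixel` is a hypothetical toy function, not `preprocess_ops.normalize_image`, it detects "integer input" with `isinstance` rather than a tensor dtype, and the statistics are assumed illustrative values.

```python
def normalize_pixel(pixel, offset, scale):
    """Toy stand-in for the documented contract; not the library implementation."""
    # Integer input: rescale to [0, 1] first, as the docstring describes.
    if all(isinstance(channel, int) for channel in pixel):
        pixel = [channel / 255.0 for channel in pixel]
    # Float input is used as-is, so offset and scale must match its range.
    return [(c - o) / s for c, o, s in zip(pixel, offset, scale)]


# Illustrative statistics (assumed values), expressed on both scales.
mean_255 = (123.675, 116.28, 103.53)
std_255 = (58.395, 57.12, 57.375)
mean_01 = [m / 255.0 for m in mean_255]
std_01 = [s / 255.0 for s in std_255]

int_pixel = (200, 100, 50)                       # integer values in [0, 255]
float01_pixel = [c / 255.0 for c in int_pixel]   # float values in [0, 1]
float255_pixel = [float(c) for c in int_pixel]   # float values in [0, 255]

from_int = normalize_pixel(int_pixel, mean_01, std_01)
from_float01 = normalize_pixel(float01_pixel, mean_01, std_01)
from_float255 = normalize_pixel(float255_pixel, mean_255, std_255)

print(from_int)
print(from_float01)
print(from_float255)
```

When the statistics match the range the function actually operates on, all three calls should yield the same normalized values up to rounding, which is the pairing rule the docstring spells out.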
