AI技术研究院 2018-07-20
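In computer vision, data augmentation is a technique for improving the performance of a vision system. It makes a classifier trained on the dataset more robust to variations such as sunlight and illumination. The TensorFlow API provides a wide range of data augmentation methods for improving the classification performance of a DNN. Applying these augmentation steps to a dataset can improve the network's ability to generalize, because more training data is generated and that data differs from the original samples. The technique is especially handy when only a small dataset is available for training a DNN. Data augmentation is also a way to reduce overfitting: the amount of training data grows using only the information already present in the training set.
Parameters are supplied as key-value pairs. An augmentation can be initialized as follows, with whatever values you want to specify; for anything you omit, preprocessor.proto supplies the default: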
data_augmentation_options {
  random_image_scale {
    min_scale_ratio: 0.3
    max_scale_ratio: 1.5
  }
}
Here the user-specified values override the defaults of the function below:
# Excerpt from object_detection/core/preprocessor.py. It relies on that
# module's imports (functools, tensorflow as tf, preprocessor_cache) and on
# its private helper _get_or_create_preprocess_rand_vars.
def random_image_scale(image,
                       masks=None,
                       min_scale_ratio=0.5,
                       max_scale_ratio=2.0,
                       seed=None,
                       preprocess_vars_cache=None):
  """Scales the image size.

  Args:
    image: rank 3 float32 tensor contains 1 image -> [height, width, channels].
    masks: (optional) rank 3 float32 tensor containing masks with
      size [height, width, num_masks]. The value is set to None if there are no
      masks.
    min_scale_ratio: minimum scaling ratio.
    max_scale_ratio: maximum scaling ratio.
    seed: random seed.
    preprocess_vars_cache: PreprocessorCache object that records previously
      performed augmentations. Updated in-place. If this function is called
      multiple times with the same non-null cache, it will perform
      deterministically.

  Returns:
    image: image which is the same rank as input image.
    masks: If masks is not none, resized masks which are the same rank as input
      masks will be returned.
  """
  with tf.name_scope('RandomImageScale', values=[image]):
    result = []
    image_shape = tf.shape(image)
    image_height = image_shape[0]
    image_width = image_shape[1]
    # Draw one scalar scale factor (or reuse the cached one).
    generator_func = functools.partial(
        tf.random_uniform, [],
        minval=min_scale_ratio, maxval=max_scale_ratio,
        dtype=tf.float32, seed=seed)
    size_coef = _get_or_create_preprocess_rand_vars(
        generator_func, preprocessor_cache.PreprocessorCache.IMAGE_SCALE,
        preprocess_vars_cache)
    # The same factor scales both sides.
    image_newysize = tf.to_int32(
        tf.multiply(tf.to_float(image_height), size_coef))
    image_newxsize = tf.to_int32(
        tf.multiply(tf.to_float(image_width), size_coef))
    image = tf.image.resize_images(
        image, [image_newysize, image_newxsize], align_corners=True)
    result.append(image)
    # Compare against None explicitly; a graph-mode tensor cannot be used as
    # a Python bool.
    if masks is not None:
      masks = tf.image.resize_nearest_neighbor(
          masks, [image_newysize, image_newxsize], align_corners=True)
      result.append(masks)
    return tuple(result)
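The size computation above can be sketched without TensorFlow. The following is a minimal illustration rather than the library's code: `sample_scaled_size` is a hypothetical helper, and Python's `random` module stands in for `tf.random_uniform`. It only shows that one random coefficient is drawn and applied to both the height and the width.

```python
import random


def sample_scaled_size(height, width,
                       min_scale_ratio=0.5, max_scale_ratio=2.0, seed=None):
    """Return (new_height, new_width) after scaling by one random ratio.

    Hypothetical helper for illustration only; not part of TensorFlow.
    """
    rng = random.Random(seed)
    # A single coefficient is shared by both axes, so the aspect ratio is
    # preserved up to integer truncation.
    size_coef = rng.uniform(min_scale_ratio, max_scale_ratio)
    # int() drops the fractional part of the scaled size.
    return int(height * size_coef), int(width * size_coef)


# Same ratios as the config shown earlier (0.3 to 1.5).
new_h, new_w = sample_scaled_size(480, 640,
                                  min_scale_ratio=0.3, max_scale_ratio=1.5,
                                  seed=0)
print(new_h, new_w)
```

With a fixed seed the result is reproducible, which loosely mirrors the role of the `seed` argument in the function above.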
To randomly adjust saturation:
data_augmentation_options {
  random_adjust_saturation {
  }
}
Here the defaults are used unless values are specified:
def random_adjust_saturation(image,
                             min_delta=0.8,
                             max_delta=1.25,
                             seed=None,
                             preprocess_vars_cache=None):
  """Randomly adjusts saturation.

  Makes sure the output image is still between 0 and 255.

  Args:
    image: rank 3 float32 tensor contains 1 image -> [height, width, channels]
      with pixel values varying between [0, 255].
    min_delta: see max_delta.
    max_delta: how much to change the saturation. Saturation will change with a
      value between min_delta and max_delta. This value will be multiplied to
      the current saturation of the image.
    seed: random seed.
    preprocess_vars_cache: PreprocessorCache object that records previously
      performed augmentations. Updated in-place. If this function is called
      multiple times with the same non-null cache, it will perform
      deterministically.

  Returns:
    image: image which is the same shape as input image.
  """
  with tf.name_scope('RandomAdjustSaturation', values=[image]):
    generator_func = functools.partial(tf.random_uniform, [],
                                       min_delta, max_delta, seed=seed)
    saturation_factor = _get_or_create_preprocess_rand_vars(
        generator_func,
        preprocessor_cache.PreprocessorCache.ADJUST_SATURATION,
        preprocess_vars_cache)
    image = tf.image.adjust_saturation(image / 255, saturation_factor) * 255
    image = tf.clip_by_value(image, clip_value_min=0.0, clip_value_max=255.0)
    return image
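To make the effect of the saturation factor concrete, here is a minimal per-pixel sketch that uses only the standard library's `colorsys` module. It is not how TensorFlow implements `tf.image.adjust_saturation`; the function names are hypothetical, and clamping the saturation to [0, 1] is an assumption of this sketch. It follows the same outline as the function above: rescale to [0, 1], multiply the saturation by a random factor, rescale to [0, 255], and clip.

```python
import colorsys
import random


def adjust_saturation_pixel(rgb, saturation_factor):
    """Scale the saturation of one (r, g, b) pixel with values in [0, 255].

    Hypothetical helper for illustration only.
    """
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Assumption of this sketch: keep the saturation inside [0, 1].
    s = min(max(s * saturation_factor, 0.0), 1.0)
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    # Rescale to [0, 255] and clip, as the function above does.
    return tuple(min(max(c * 255.0, 0.0), 255.0) for c in (r, g, b))


def random_adjust_saturation_pixel(rgb, min_delta=0.8, max_delta=1.25,
                                   seed=None):
    """Draw a factor uniformly from [min_delta, max_delta] and apply it."""
    factor = random.Random(seed).uniform(min_delta, max_delta)
    return adjust_saturation_pixel(rgb, factor)


print(random_adjust_saturation_pixel((200, 100, 50), seed=0))
```

A factor below 1 pulls colors toward gray and a factor above 1 makes them more vivid; a gray pixel has zero saturation, so it is left unchanged by any factor.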
Below is the list of options available in preprocessor.proto. Choose the augmentation techniques that suit your use case.
NormalizeImage normalize_image = 1;
RandomHorizontalFlip random_horizontal_flip = 2;
RandomVerticalFlip random_vertical_flip = 3;
RandomPixelValueScale random_pixel_value_scale = 4;
RandomImageScale random_image_scale = 5;
RandomRGBtoGray random_rgb_to_gray = 6;
RandomAdjustBrightness random_adjust_brightness = 7;
RandomAdjustContrast random_adjust_contrast = 8;
RandomAdjustHue random_adjust_hue = 9;
RandomAdjustSaturation random_adjust_saturation = 10;
RandomDistortColor random_distort_color = 11;
RandomJitterBoxes random_jitter_boxes = 12;
RandomCropImage random_crop_image = 13;
RandomPadImage random_pad_image = 14;
RandomCropPadImage random_crop_pad_image = 15;
RandomCropToAspectRatio random_crop_to_aspect_ratio = 16;
RandomBlackPatches random_black_patches = 17;
RandomResizeMethod random_resize_method = 18;
ScaleBoxesToPixelCoordinates scale_boxes_to_pixel_coordinates = 19;
ResizeImage resize_image = 20;
SubtractChannelMean subtract_channel_mean = 21;
SSDRandomCrop ssd_random_crop = 22;
SSDRandomCropPad ssd_random_crop_pad = 23;
SSDRandomCropFixedAspectRatio ssd_random_crop_fixed_aspect_ratio = 24;
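When several of these options are combined, each one goes in its own data_augmentation_options block, in the same form as the two examples earlier. As a rough sketch, the blocks could be generated from a Python dictionary. `render_augmentation_options` is a hypothetical helper, not part of TensorFlow, and it handles only numeric parameter values.

```python
def render_augmentation_options(options):
    """Render {step_name: {param: value}} as data_augmentation_options blocks.

    Hypothetical helper for illustration only; numeric values only.
    """
    lines = []
    for step_name, params in options.items():
        lines.append("data_augmentation_options {")
        lines.append(f"  {step_name} {{")
        for key, value in params.items():
            lines.append(f"    {key}: {value}")
        lines.append("  }")
        lines.append("}")
    return "\n".join(lines)


# An empty dict leaves every parameter of that step at its default.
config_text = render_augmentation_options({
    "random_horizontal_flip": {},
    "random_image_scale": {"min_scale_ratio": 0.3, "max_scale_ratio": 1.5},
})
print(config_text)
```

The step names used as keys are taken from the list above; the resulting text would still need to be placed in the training section of the pipeline configuration by hand.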