Data Augmentation Techniques for Image Classification in TensorFlow

AI技术研究院 (AI Technology Research Institute), 2018-07-20

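In computer vision, data augmentation is a technique for improving system performance. It makes a classifier more robust to variations that appear in the training data, such as changes in sunlight and illumination. The TensorFlow API provides a wide range of augmentation methods for improving the classification performance of a DNN. Applying these augmentation steps to a dataset can improve the network's ability to generalize, because they generate additional training samples that differ from the originals. This is especially useful when only a small dataset is available for training a DNN. Data augmentation is also a way to reduce overfitting: it increases the amount of training data using only information already present in the training set.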

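Parameters are supplied as key-value pairs. An augmentation can be initialized as follows; specify the values you want, and any value you omit falls back to the default defined in preprocessor.proto: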

data_augmentation_options {
  random_image_scale {
    min_scale_ratio: 0.3
    max_scale_ratio: 1.5
  }
}

Here, the user-specified values override the defaults of the underlying function:

import functools

import tensorflow as tf  # TensorFlow 1.x API

from object_detection.core import preprocessor_cache

# Excerpt: _get_or_create_preprocess_rand_vars is a private helper defined in
# the same preprocessor module of the Object Detection API.


def random_image_scale(image,
                       masks=None,
                       min_scale_ratio=0.5,
                       max_scale_ratio=2.0,
                       seed=None,
                       preprocess_vars_cache=None):
    """Scales the image size.

    Args:
      image: rank 3 float32 tensor contains 1 image -> [height, width, channels].
      masks: (optional) rank 3 float32 tensor containing masks with
        size [height, width, num_masks]. The value is set to None if there are
        no masks.
      min_scale_ratio: minimum scaling ratio.
      max_scale_ratio: maximum scaling ratio.
      seed: random seed.
      preprocess_vars_cache: PreprocessorCache object that records previously
        performed augmentations. Updated in-place. If this function is called
        multiple times with the same non-null cache, it will perform
        deterministically.

    Returns:
      image: image which is the same rank as input image.
      masks: If masks is not none, resized masks which are the same rank as
        input masks will be returned.
    """
    with tf.name_scope('RandomImageScale', values=[image]):
        result = []
        image_shape = tf.shape(image)
        image_height = image_shape[0]
        image_width = image_shape[1]
        # Draw one scalar scale coefficient, or reuse the cached one.
        generator_func = functools.partial(
            tf.random_uniform, [],
            minval=min_scale_ratio, maxval=max_scale_ratio,
            dtype=tf.float32, seed=seed)
        size_coef = _get_or_create_preprocess_rand_vars(
            generator_func, preprocessor_cache.PreprocessorCache.IMAGE_SCALE,
            preprocess_vars_cache)
        # Apply the same coefficient to height and width.
        image_newysize = tf.to_int32(
            tf.multiply(tf.to_float(image_height), size_coef))
        image_newxsize = tf.to_int32(
            tf.multiply(tf.to_float(image_width), size_coef))
        image = tf.image.resize_images(
            image, [image_newysize, image_newxsize], align_corners=True)
        result.append(image)
        # `if masks:` is not valid for a tensor; compare against None instead.
        if masks is not None:
            # resize_images accepts rank 3 input, whereas
            # tf.image.resize_nearest_neighbor expects a rank 4 batch.
            masks = tf.image.resize_images(
                masks, [image_newysize, image_newxsize],
                method=tf.image.ResizeMethod.NEAREST_NEIGHBOR,
                align_corners=True)
            result.append(masks)
        return tuple(result)
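To make the sampling logic concrete, here is a minimal pure-Python sketch (no TensorFlow required) of what the function computes: a single scale coefficient is drawn uniformly from [min_scale_ratio, max_scale_ratio] and applied to both height and width. The name `scaled_size` and the plain-dict `cache` are assumptions made for illustration only; the dict imitates the behaviour described for `preprocess_vars_cache` but is not the API's `PreprocessorCache`.

```python
import random


def scaled_size(height, width, min_scale_ratio=0.5, max_scale_ratio=2.0,
                seed=None, cache=None):
    """Return the (new_height, new_width) a random image scale would produce.

    Hypothetical helper: a plain dict stands in for PreprocessorCache.
    """
    if cache is not None and "image_scale" in cache:
        # Reuse the coefficient recorded by an earlier call.
        size_coef = cache["image_scale"]
    else:
        size_coef = random.Random(seed).uniform(min_scale_ratio, max_scale_ratio)
        if cache is not None:
            cache["image_scale"] = size_coef
    # Truncate to an integer size, as a float-to-int cast does.
    return int(height * size_coef), int(width * size_coef)


# Values taken from the config example override the function defaults.
cache = {}
first = scaled_size(480, 640, min_scale_ratio=0.3, max_scale_ratio=1.5, cache=cache)
second = scaled_size(480, 640, min_scale_ratio=0.3, max_scale_ratio=1.5, cache=cache)
print(first, second)  # both calls share the cache, so the sizes match
```

Because both calls share the same cache, the second call reuses the stored coefficient rather than drawing a new one, which mirrors the deterministic behaviour the docstring describes.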

For random saturation adjustment:

data_augmentation_options {
  random_adjust_saturation {
  }
}

Here the default values are used unless you specify otherwise:

def random_adjust_saturation(image,
                             min_delta=0.8,
                             max_delta=1.25,
                             seed=None,
                             preprocess_vars_cache=None):
    """Randomly adjusts saturation.

    Makes sure the output image is still between 0 and 255.

    Args:
      image: rank 3 float32 tensor contains 1 image ->
        [height, width, channels] with pixel values varying between [0, 255].
      min_delta: see max_delta.
      max_delta: how much to change the saturation. Saturation will change with
        a value between min_delta and max_delta. This value will be multiplied
        to the current saturation of the image.
      seed: random seed.
      preprocess_vars_cache: PreprocessorCache object that records previously
        performed augmentations. Updated in-place. If this function is called
        multiple times with the same non-null cache, it will perform
        deterministically.

    Returns:
      image: image which is the same shape as input image.
    """
    with tf.name_scope('RandomAdjustSaturation', values=[image]):
        generator_func = functools.partial(tf.random_uniform, [],
                                           min_delta, max_delta, seed=seed)
        saturation_factor = _get_or_create_preprocess_rand_vars(
            generator_func,
            preprocessor_cache.PreprocessorCache.ADJUST_SATURATION,
            preprocess_vars_cache)
        image = tf.image.adjust_saturation(image / 255, saturation_factor) * 255
        image = tf.clip_by_value(image, clip_value_min=0.0, clip_value_max=255.0)
        return image
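The effect of the saturation factor on a single pixel can be sketched with the standard library alone. The example below is an illustration, not TensorFlow's implementation: it scales the pixel to [0, 1], multiplies the S channel in HSV space by the factor, converts back, and clips to [0, 255], following the same order of steps as the function above. The helper name `adjust_saturation_pixel` is hypothetical.

```python
import colorsys


def adjust_saturation_pixel(rgb, saturation_factor):
    """Scale the saturation of one RGB pixel with values in [0, 255]."""
    # Normalize to [0, 1], as the function above does with image / 255.
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Multiply the saturation and keep it inside the valid range.
    s = min(max(s * saturation_factor, 0.0), 1.0)
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    # Rescale to [0, 255] and clip, mirroring tf.clip_by_value.
    return tuple(min(max(c * 255.0, 0.0), 255.0) for c in (r, g, b))


pixel = (200.0, 100.0, 50.0)
print(adjust_saturation_pixel(pixel, 1.0))   # factor 1.0 leaves the pixel essentially unchanged
print(adjust_saturation_pixel(pixel, 0.0))   # factor 0.0 removes all colour (gray pixel)
print(adjust_saturation_pixel(pixel, 1.25))  # upper default bound: more vivid colour
```

With the default range of 0.8 to 1.25, the sampled factor stays close to 1, so the augmented images remain plausible variants of the originals.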

Below is the list of options provided in preprocessor.proto. Choose the augmentation techniques that suit your use case:

NormalizeImage normalize_image = 1;
RandomHorizontalFlip random_horizontal_flip = 2;
RandomVerticalFlip random_vertical_flip = 3;
RandomPixelValueScale random_pixel_value_scale = 4;
RandomImageScale random_image_scale = 5;
RandomRGBtoGray random_rgb_to_gray = 6;
RandomAdjustBrightness random_adjust_brightness = 7;
RandomAdjustContrast random_adjust_contrast = 8;
RandomAdjustHue random_adjust_hue = 9;
RandomAdjustSaturation random_adjust_saturation = 10;
RandomDistortColor random_distort_color = 11;
RandomJitterBoxes random_jitter_boxes = 12;
RandomCropImage random_crop_image = 13;
RandomPadImage random_pad_image = 14;
RandomCropPadImage random_crop_pad_image = 15;
RandomCropToAspectRatio random_crop_to_aspect_ratio = 16;
RandomBlackPatches random_black_patches = 17;
RandomResizeMethod random_resize_method = 18;
ScaleBoxesToPixelCoordinates scale_boxes_to_pixel_coordinates = 19;
ResizeImage resize_image = 20;
SubtractChannelMean subtract_channel_mean = 21;
SSDRandomCrop ssd_random_crop = 22;
SSDRandomCropPad ssd_random_crop_pad = 23;
SSDRandomCropFixedAspectRatio ssd_random_crop_fixed_aspect_ratio = 24;
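Several options from this list can be combined by repeating the data_augmentation_options block, one block per step. As a rough sketch, the hypothetical helper below (`render_augmentation_options` is not part of TensorFlow) renders a list of option names and parameters into that text format; an empty parameter dict leaves the option at its defaults.

```python
def render_augmentation_options(steps):
    """Render (option_name, params) pairs as data_augmentation_options blocks.

    Hypothetical helper for illustration; it is not part of TensorFlow.
    """
    lines = []
    for name, params in steps:
        lines.append("data_augmentation_options {")
        lines.append("  %s {" % name)
        for key, value in params.items():
            lines.append("    %s: %s" % (key, value))
        lines.append("  }")
        lines.append("}")
    return "\n".join(lines)


config_text = render_augmentation_options([
    ("random_horizontal_flip", {}),
    ("random_image_scale", {"min_scale_ratio": 0.3, "max_scale_ratio": 1.5}),
    ("random_adjust_saturation", {}),
])
print(config_text)
```

The printed text can be pasted into the training configuration; the option names must match the field names in the list above.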
