Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition --- Spp_net

A 2015 paper from Microsoft Research Asia. Its main advantage is that the network accepts input images of arbitrary size.

Main ideas:

(1) Spatial Pyramid Pooling Layer. It is this layer that lets Spp_net accept input images of any size and still produce a fixed-length feature vector:

[Figure: the spatial pyramid pooling layer]

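Computing the stride and window size:

For an a × a feature map and a pyramid level with n × n bins, window = ceil(a / n) and stride = floor(a / n).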
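As a quick check of the stride and window calculation, the sketch below computes window = ceil(a / n) and stride = floor(a / n) for each pyramid level. The helper name `spp_params` and the 13 × 13 example are illustrative assumptions, not part of the original code.

```python
import math


def spp_params(a, bins):
    """Return (window, stride) per pyramid level for an a x a feature map.

    Hypothetical helper for illustration only:
    window = ceil(a / n), stride = floor(a / n).
    """
    params = []
    for n in bins:
        window = int(math.ceil(a / float(n)))
        stride = int(math.floor(a / float(n)))
        params.append((window, stride))
    return params


# Example: a 13 x 13 feature map pooled into 3x3, 2x2 and 1x1 bins.
levels = spp_params(13, [3, 2, 1])
print(levels)  # [(5, 4), (7, 6), (13, 13)]
```

Note the `float(n)`: with plain integer division under Python 2, ceil and floor would return the same value.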

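(2) Mapping a Window to Feature Maps. After the original image is fed into Spp_net, the method shown in the figure below maps points in the original image onto the feature map, which lays the groundwork for object detection: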

[Figure: mapping a window in the original image to the feature maps]
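The projection can be sketched as follows. It uses the rule given in the SPP-net paper, where S is the product of all strides before the feature map: the left/top boundary maps to floor(x / S) + 1 and the right/bottom boundary to ceil(x / S) - 1. The function name `project_window` and the value S = 16 are assumptions chosen for illustration.

```python
import math


def project_window(x1, y1, x2, y2, stride):
    """Map an image-space box to feature-map coordinates (hypothetical helper)."""
    s = float(stride)
    left = int(math.floor(x1 / s)) + 1    # left boundary:   floor(x / S) + 1
    top = int(math.floor(y1 / s)) + 1     # top boundary:    floor(y / S) + 1
    right = int(math.ceil(x2 / s)) - 1    # right boundary:  ceil(x / S) - 1
    bottom = int(math.ceil(y2 / s)) - 1   # bottom boundary: ceil(y / S) - 1
    return left, top, right, bottom


# Example: a box in the image, assuming a total stride of 16.
print(project_window(32, 48, 200, 160, stride=16))  # (3, 4, 12, 9)
```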

Main code (based on Theano/Keras):

(1)spp_layer:

import math

import theano.tensor as T
from theano.tensor.signal import downsample
# Assumes the old Keras Layer API that provides get_input()/get_output().
from keras.layers.core import Layer


class SppLayer(Layer):
    def __init__(self, bins, feature_map_size=0):
        super(SppLayer, self).__init__()
        self.strides = []
        self.windows = []
        self.a = feature_map_size  # side length of the (square) feature map
        self.bins = bins           # bins per side at each pyramid level
        self.num_bins = len(bins)

    def get_output(self, train):
        self.input = self.get_input(train)
        # Rebuild the lists on every call so they do not keep growing.
        # float() keeps ceil and floor distinct under Python 2 integer division.
        self.strides = [int(math.floor(self.a / float(n))) for n in self.bins]
        self.windows = [int(math.ceil(self.a / float(n))) for n in self.bins]

        # Max-pool the same feature map once per pyramid level.
        self.pooled_out = []
        for j in range(self.num_bins):
            self.pooled_out.append(downsample.max_pool_2d(input=self.input,
                                                          ds=(self.windows[j], self.windows[j]),
                                                          st=(self.strides[j], self.strides[j]),
                                                          ignore_border=False))

        # Flatten each level to (batch_size, features) and concatenate all
        # levels, however many there are.
        flattened = [pooled.flatten(2) for pooled in self.pooled_out]
        self.output = T.concatenate(flattened, axis=1)

        return self.output
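To see why the concatenated output has a fixed length, here is a framework-free sketch of the same pooling scheme in plain Python. It is an illustration under stated assumptions, not the Theano layer itself: the name `spp_forward` is hypothetical, the feature map is a nested list (channels, rows, columns), the window and stride are recomputed from each input's own height and width, and exactly n × n windows are taken per level.

```python
import math


def spp_forward(feature_map, bins):
    """Max-pool a feature map at several pyramid levels (hypothetical helper).

    feature_map: list of channels, each a list of rows of numbers.
    Assumes height and width are both >= max(bins).
    Returns a flat list of length channels * sum(n * n for n in bins).
    """
    output = []
    for n in bins:
        for channel in feature_map:
            height, width = len(channel), len(channel[0])
            win_h, str_h = int(math.ceil(height / float(n))), height // n
            win_w, str_w = int(math.ceil(width / float(n))), width // n
            for i in range(n):
                for j in range(n):
                    rows = channel[i * str_h:i * str_h + win_h]
                    output.append(max(max(row[j * str_w:j * str_w + win_w])
                                      for row in rows))
    return output


# Two feature maps of different spatial size give vectors of equal length.
small = [[[r * 17 + c for c in range(17)] for r in range(10)] for _ in range(2)]
large = [[[r * 13 + c for c in range(13)] for r in range(13)] for _ in range(2)]
print(len(spp_forward(small, [1, 2, 4])), len(spp_forward(large, [1, 2, 4])))  # 42 42
```

With 2 channels and levels of 1, 2 and 4 bins per side, each input yields 2 × (1 + 4 + 16) = 42 values regardless of its height and width.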

(2)Mapping a Window to Feature Maps:

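import math

def window_t(window_point_x1, window_point_y1, window_point_x2, window_point_y2,
             window_size_x, window_size_y, map_size_x, map_size_y):
    # Ratio of feature-map size to input-window size. float() keeps the
    # ratio fractional under Python 2 integer division.
    scale_x = map_size_x / float(window_size_x)
    scale_y = map_size_y / float(window_size_y)

    # Scale each point onto the feature map, round up, then subtract 1.
    map_point_x1 = int(math.ceil(window_point_x1 * scale_x)) - 1
    map_point_y1 = int(math.ceil(window_point_y1 * scale_y)) - 1
    map_point_x2 = int(math.ceil(window_point_x2 * scale_x)) - 1
    map_point_y2 = int(math.ceil(window_point_y2 * scale_y)) - 1

    return map_point_x1, map_point_y1, map_point_x2, map_point_y2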

This article is the author's original work. Please contact the author before reposting: 18254275587@163.com