Small object recognition techniques based on structured template matching for high-resolution satellite images

We are developing infrastructure tools for wide-area monitoring of conditions such as disaster-damaged areas or traffic, using Earth observation satellite images. In particular, we are focusing on a small object recognition tool for satellite images, which can extract automobile patterns from high-resolution satellite images such as QuickBird panchromatic images. Although the resolution of the optical sensors installed in current Earth observation satellites has advanced greatly, their pixel resolution is still insufficient for identifying small objects such as individual automobiles with currently available pattern matching techniques. Moreover, because the pattern matching calculation load grows with image resolution, searching for all objects included in a slice of a satellite image would take a tremendous amount of time. In order to overcome these problems, we propose a structured template matching technique for recognizing small objects in satellite images, which consists of micro-template matching, clustered micro-template matching and macro-template matching. In this paper, we describe an outline of our proposed method and present its experimental results.


Introduction
We are developing infrastructure tools for wide-area monitoring of disaster-damaged areas or traffic conditions, using Earth observation satellite images. In recent years, the resolution of the optical sensors installed in Earth observation satellites has improved greatly. In the case of the panchromatic images captured by QuickBird (DigitalGlobe, 2008), the ground-level resolution is about 0.6 [m], which makes it possible to recognize individual automobiles on roads or in parking lots. Previous satellite image analysis works have mainly focused on area segmentation and classification problems (Giovanni Poggi et al., 2005), and object recognition targets in high-resolution panchromatic satellite images have been limited to large objects such as roads or buildings (Qi-Ming Qin et al., 2005). While there have been many works on recognizing automobiles in aerial images, including the paper (Tao Zhao et al., 2001), there have been almost no works trying to recognize objects as small as automobiles in satellite images, with the exception of (So Hee Jeon et al., 2005). That previous work applied template matching methods to recognizing small objects, but its recognition rate was very poor because of insufficient pixel information for pattern matching. In a previous paper (Modegi T., 2008), we proposed an interactive high-precision template matching tool for satellite images, but it took a large amount of calculation time, its object searching area was limited, and it was far from practical use. In order to overcome this problem, we apply a structured identification approach similar to the work (Qu Jishuang et al., 2003). In this paper, we propose a three-layered structured template matching method, which enables recognizing small objects such as automobiles in satellite images at a very small calculation load.
The first layer focuses on extracting candidate areas which have metallic-reflection optical characteristics, in which any type of transportation object may be included. The second layer identifies and removes excessively extracted candidate areas such as roads and buildings, which have similar optical characteristics but do not include our targets. The third layer identifies each automobile object within the focused area using our proposed conceptual templates (Modegi T., 2008), which are patterns learned from the user's operations, based on our improved high-speed template-matching algorithm. The following sections describe the specific algorithm of each layer. Figure 1 shows our proposed structured recognition model for automobiles on the ground (Modegi T., 2009), which resembles the 7-layered communication protocol model called OSI (Open Systems Interconnection) designed by the ISO (International Organization for Standardization). The highest recognition level is already operated by the Japanese police, known as the "N-System", which installs many monitoring cameras along highways. The currently available high-resolution Earth observation satellites cover up to the fifth level in Fig.1. Figure 2 shows our proposed structured template matching method for recognizing small objects in satellite images. The first matching is called micro-template matching; it thoroughly extracts candidate areas including the target small objects by non-linear heuristic filtering with micro-block templates. These candidates are pixel blocks showing the metallic reflection characteristics that all types of transport objects, including automobiles, have in common. This process can be performed at a very small calculation load and also decreases the calculation load of the following third matching.
The second matching is called clustered micro-template matching; it removes excessively matched, relatively large candidate areas which do not include target small objects, using multiple 8-neighbor connected micro-block templates called a clustered micro-template. This process can also be performed at a very small calculation load and further decreases the calculation load of the following third matching.

Proposed structured template matching method
Fig. 2. Structured template matching: (1) given image; segmentation of candidate regions including targets by abstract matching with micro-templates (fast processing); (2) segmented areas [micro-templates]; (3) filtered segmented areas.

The third matching is called macro-template matching, using a template whose size is almost the same as that of the target object; it identifies each object in the segmented candidate areas by the pixel-value correlation based pattern matching shown in the paper (Modegi T., 2008). This process needs a lot of calculation time, but its calculation area is shrunk by the first and second matching processes.
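The three-stage flow above can be sketched as a short driver. The stage implementations below are deliberately simplified stand-ins (a plain local-contrast test and a horizontal run-length filter) for the algorithms detailed in the following sections; all function names and threshold values are our own illustrations, not the paper's:

```python
import numpy as np

def micro_template_matching(img):
    """Stage 1 stub: flag pixels whose 4x4 neighbourhood has high local
    contrast (a stand-in for the heuristic metallic-reflection rules)."""
    h, w = img.shape
    mask = np.zeros((h, w), dtype=np.uint8)
    for y in range(h - 3):
        for x in range(w - 3):
            block = img[y:y + 4, x:x + 4].astype(int)
            if block.max() - block.min() > 80:   # assumed contrast slice
                mask[y, x] = 1
    return mask

def clustered_micro_template_matching(img, mask, s_n=13):
    """Stage 2 stub: drop horizontal candidate runs of s_n pixels or more
    (a stand-in for the 8-neighbour cluster tracking of the paper)."""
    out = mask.copy()
    for y in range(out.shape[0]):
        run = 0
        for x in range(out.shape[1]):
            run = run + 1 if out[y, x] else 0
            if run >= s_n:                       # long slender cluster
                out[y, x - run + 1:x + 1] = 0    # reset to non-candidate
                run = 0
    return out

def macro_template_matching(img, mask):
    """Stage 3 stub: report positions of surviving candidate pixels."""
    ys, xs = np.nonzero(mask)
    return list(zip(xs.tolist(), ys.tolist()))

def recognize_small_objects(img):
    # Each stage narrows the area the next, more expensive stage must scan.
    mask = micro_template_matching(img)
    mask = clustered_micro_template_matching(img, mask)
    return macro_template_matching(img, mask)
```

The point of the structure is visible even in this toy version: the costly third stage only ever touches pixels that survived the two cheap filters.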

Micro-template matching
Figure 3 shows the concept of micro-template matching, which defines a binary mask value M(x,y) for a given 256 grey-scale image I(x,y) (0≤x≤N_x−1, 0≤y≤N_y−1). This process determines whether the optical reflection values of each pixel block have metallic characteristics or not by a heuristic filtering process. In other words, we determine whether each tiny area as shown in Fig.3 could be part of the transportation material we are searching for. As this determination logic, we use the heuristic rules described below, defined between pixels in the micro-template. In the case of the 4 × 4 pixel micro-template shown in Fig. 3, we separate the 2 × 2 inside pixels V_ik (k=1,...,4) from the other 12 outside pixels V_ok (k=1,...,12). Using these pixel values, we calculate the following 7 statistical parameters: the minimum value of the outside pixels V_omin, the maximum value of the outside pixels V_omax, the minimum value of the inside pixels V_imin, the maximum value of the inside pixels V_imax, the average value of the outside pixels V_oave, the average value of the inside pixels V_iave, and the standard deviation of the outside pixels V_dir.
We apply the above template to the nearest 4 × 4 pixel block around each pixel of the given panchromatic satellite image I(x,y), and determine whether it should be included in the candidate areas by the following rules. We have to consider two kinds of candidate patterns: one in which the inside part is brighter than the outside, and its negative pattern. In order to determine that some pixel (x,y) is included in a candidate, M(x,y)=1, five conditions should be satisfied using 7 predetermined slice levels: S_aoi, S_doi, S_omin, S_omax, S_imin, S_imax, and S_dir. These 7 slice levels can be defined interactively by indicating areas on the displayed image where target objects are definitely included. For each pixel in the indicated areas, we calculate the 7 statistical parameters based on equation (1), and using either the minimum or maximum of these statistical parameters, we define each slice level as follows.
When the pixel values of the given monochrome image range from 0 to 255, typical slice levels are S_aoi=15, S_doi=80, S_omin=100, S_omax=160, S_imin=35, S_imax=245, and S_dir=10.
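Computing the 7 statistics of a 4 × 4 block is straightforward; the exact five candidate conditions correspond to an equation that did not survive extraction, so the decision rules in `is_candidate` below are only a plausible reading of the text (bright-inside pattern or its negative, uniform outside, bounded contrast), using the typical slice values quoted above:

```python
import numpy as np

# Typical slice levels quoted in the text for 0-255 images.
S_AOI, S_DOI = 15, 80
S_OMIN, S_OMAX = 100, 160
S_IMIN, S_IMAX = 35, 245
S_DIR = 10

def micro_template_stats(block):
    """Seven statistics of a 4x4 block: the 2x2 inside pixels V_ik
    versus the surrounding 12 outside pixels V_ok."""
    b = block.astype(float)
    inside = b[1:3, 1:3].ravel()            # V_ik, k = 1..4
    outer = np.ones((4, 4), dtype=bool)
    outer[1:3, 1:3] = False
    outside = b[outer]                      # V_ok, k = 1..12
    return dict(
        v_omin=outside.min(), v_omax=outside.max(),
        v_imin=inside.min(), v_imax=inside.max(),
        v_oave=outside.mean(), v_iave=inside.mean(),
        v_dir=outside.std(),                # V_dir: outside std deviation
    )

def is_candidate(block):
    """ASSUMED reading of the five conditions (the paper's equation was
    lost): inside clearly brighter than outside, or the negative pattern,
    over a uniform background and without edge-like contrast."""
    s = micro_template_stats(block)
    bright = (s["v_iave"] - s["v_oave"] >= S_AOI and   # inside brighter
              s["v_imax"] <= S_IMAX and s["v_omax"] <= S_OMAX)
    dark = (s["v_oave"] - s["v_iave"] >= S_AOI and     # negative pattern
            s["v_imin"] >= S_IMIN and s["v_omin"] >= S_OMIN)
    flat_background = s["v_dir"] <= S_DIR              # outside uniform
    contrast = abs(s["v_iave"] - s["v_oave"]) <= S_DOI # not a hard edge
    return (bright or dark) and flat_background and contrast
```

Any real use should substitute the paper's original conditions; the structure (seven statistics, thresholded by interactively learned slice levels) is what the sketch is meant to convey.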

Clustered micro-template matching
In the first micro-template matching, all areas with metallic reflection characteristics are selected as candidates, but these characteristics are not limited to transportation materials. In general, edge parts of buildings and roads have the same characteristics and are also selected as candidates. Unfortunately, these non-transport areas are larger than our target transport objects and increase the searching load of the following object identification process. Therefore, in this section we provide a process for identifying and removing excessively selected candidate areas. In general, incorrectly selected candidate areas such as edge parts of buildings and roads have long, slender shapes, which can be detected as multiple 8-neighbor connected micro-template blocks called a clustered micro-template. However, some large areas such as parking lots may include target objects. Therefore, we have to distinguish these patterns from correctly selected large areas, where multiple target objects are located close together, by the pixel value characteristics in the detected clustered area. Figure 4 shows the algorithm for recognizing the incorrectly selected areas to be removed. Fig.4-(1) shows an 8 × 8 pixel part of the selected candidate areas, where the painted pixels were determined as candidates based on equation (2). On this image, we search for long candidate pixel clusters whose height or width is larger than S_N pixels. In order to find these clusters, we track 8-neighbor connected candidate pixels from the top-left pixel shown in Fig.4.

Fig. 5. Example of both micro-template matching and clustered micro-template matching processes: (1) source 256-level image; (2) two-classified binary mask.
If we find a cluster of S_N-pixel length in either the horizontal or vertical direction, we calculate the minimum, maximum and average pixel values of this cluster as C_min, C_max and C_ave. Defining two slice levels S_cave and S_cmax, if the following conditions are satisfied, we extend the cluster by finding further 8-neighbor connected candidate pixels around the previously found clustered pixels. When the pixel values of the given monochrome image range from 0 to 255, typical slice levels are S_N=13, S_cave=50 and S_cmax=50.
C_ave > S_cave and C_max − C_min > S_cmax. (4)

We then reset all of the tracked pixels in the extended cluster to non-candidate pixels, M(x,y)=0, whereas each remaining candidate pixel [x,y] is extended to a 4 × 4 candidate pixel block, M(x+i,y+j)=1 for 0≤i≤3 and 0≤j≤3. Figure 4-(3) shows an 8-pixel length cluster found in the vertical direction (S_N=8); we then reset all of the tracked pixels in the cluster to non-candidate pixels as shown in Fig.4-(4). Furthermore, we extend the removed area around the removed cluster: Fig.4-(5) shows the second extended removed cluster, and Fig.4-(6) shows the third. Figure 5 shows a series of both micro-template matching and clustered micro-template matching processes. In Fig.5-(2), a small 3-pixel area and a large 15-pixel area were found with the micro-template. The larger area was then removed, M(x,y)=0, with a clustered micro-template as shown in Fig.5-(3). Finally, each of the 3 remaining candidate pixels was extended to a 4 × 4 candidate pixel block, so the final candidate area with M(x,y)=1 becomes 27 pixels in size.
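A minimal sketch of this cluster removal, applying the conditions of equation (4) with the typical slice values quoted above (the flood-fill bookkeeping and function name are our own; the paper grows clusters incrementally, while this sketch collects each whole 8-connected cluster first):

```python
import numpy as np
from collections import deque

S_N, S_CAVE, S_CMAX = 13, 50, 50   # typical slice values from the text

def remove_long_clusters(img, mask, s_n=S_N, s_cave=S_CAVE, s_cmax=S_CMAX):
    """Erase 8-connected candidate clusters that are at least s_n pixels
    tall or wide and satisfy equation (4); grow every surviving
    candidate pixel into a 4x4 block."""
    h, w = mask.shape
    out = mask.astype(np.uint8).copy()
    seen = np.zeros((h, w), dtype=bool)
    for sy in range(h):
        for sx in range(w):
            if not out[sy, sx] or seen[sy, sx]:
                continue
            # Flood-fill one 8-connected cluster of candidate pixels.
            cluster, queue = [], deque([(sy, sx)])
            seen[sy, sx] = True
            while queue:
                y, x = queue.popleft()
                cluster.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w and
                                out[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            queue.append((ny, nx))
            ys = [p[0] for p in cluster]
            xs = [p[1] for p in cluster]
            long_shape = (max(ys) - min(ys) + 1 >= s_n or
                          max(xs) - min(xs) + 1 >= s_n)
            vals = np.array([int(img[p]) for p in cluster])
            # Equation (4): C_ave > S_cave and C_max - C_min > S_cmax.
            if long_shape and vals.mean() > s_cave \
                    and vals.max() - vals.min() > s_cmax:
                for p in cluster:              # reset to non-candidate
                    out[p] = 0
    # Grow each surviving candidate pixel into a 4x4 candidate block.
    grown = np.zeros_like(out)
    for y, x in zip(*np.nonzero(out)):
        grown[y:y + 4, x:x + 4] = 1
    return grown
```

A long bright edge (a road or building outline) is wiped out, while a compact car-sized cluster survives and is grown to block size for the macro-template stage.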

Macro-template matching
As the final process, we identify each target object within the selected candidate areas using a macro-template whose size is almost the same as that of the searched objects. In order to execute this process efficiently, we propose using conceptual templates. Figure 6 shows the concept of our proposed macro-template matching. The upper two objects differ from each other in shape, size and color. If these two objects are captured by a camera under various light sources, we can obtain a tremendous variety of images, including the 4 images shown in the middle part of Fig.6. In order to identify these objects with conventional matching, we would have to prepare a large number of templates, at least two kinds in this example. Our proposed macro-template matching makes it possible to identify the objects in these various captured images with a small number of templates, called conceptual templates. The process consists of two kinds of matching, angle-independent and angle-dependent, based on the previous work (Modegi T., 2008). The first, angle-independent process mainly compares the grey-level histogram of each block under examination with that of a template. If this first matching is successful, the second, angle-dependent process is performed. This mainly compares the normalized correlation coefficients of the block under examination against several angle-rotated image blocks of the template. If one of the rotated images fits, the block is identified as that template pattern.

Fig. 6. Concept of our proposed template matching using conceptual templates: conventional matching has to prepare many templates to cover these variations, whereas robust template matching enables identification with few kinds of templates.

Figure 7 shows how templates are defined in our proposed macro-template matching algorithm. A template image I_t(a,x,y) is originally an extracted block of pixels from some sample image, which is not necessarily the working image I(x,y) to be searched. This image is rotated to 8 kinds of angles for angle-dependent matching, defining 8 kinds of template images. In each defined N × N pixel template I_t(a,x,y), two kinds of quadrangle outlines are defined for making the mask image data M_t(a,x,y). The inner outline indicates the actual outline pattern of a target, and the nearest patterns outside this inner outline are considered for identifying other closely located objects. The common area of the inner areas over all angles, shown at the bottom of Fig.7, is used for the angle-independent template matching in the following algorithm steps (b) and (c), whereas the outer outlined area is used for the angle-dependent template matching in steps (d) and (e).
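The two-stage decision described above can be sketched as follows. The slice levels, the 90-degree rotation steps (the paper uses 8 finer angles, which would need an interpolating rotation) and the function names are simplifications of ours, not the paper's:

```python
import numpy as np

def hist_distance(block, template, bins=32):
    """Angle-independent stage: compare grey-level histograms."""
    h1, _ = np.histogram(block, bins=bins, range=(0, 256))
    h2, _ = np.histogram(template, bins=bins, range=(0, 256))
    return int(np.abs(h1 - h2).sum())

def best_rotation_correlation(block, template):
    """Angle-dependent stage: normalized correlation against rotated
    copies of the template (here only 0/90/180/270 degrees)."""
    b = block.astype(float)
    b = (b - b.mean()) / (b.std() + 1e-9)       # zero-mean, unit-std
    best_corr, best_angle = -1.0, None
    for k in range(4):
        t = np.rot90(template, k).astype(float)
        t = (t - t.mean()) / (t.std() + 1e-9)
        corr = float((b * t).mean())            # normalized correlation
        if corr > best_corr:
            best_corr, best_angle = corr, k * 90
    return best_corr, best_angle

def macro_match(block, template, s_his=600, s_cor=0.6):
    """Two-stage decision: cheap histogram test first, the expensive
    rotation-dependent correlation only if it passes (slice levels
    here are illustrative, not the paper's)."""
    if hist_distance(block, template) > s_his:
        return None                             # first stage failed
    corr, angle = best_rotation_correlation(block, template)
    return (corr, angle) if corr >= s_cor else None
```

Because the histogram test is rotation-invariant and cheap, most non-matching blocks are rejected before any rotated correlation is computed, which is the main source of the speed-up claimed for this stage.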
The following describes the specific algorithm of our proposed macro-template matching.

a. Check whether the pixels inside the inner mask are candidates. If D_can ≥ N²/2, proceed to the next step.
c. Calculate the dispersion value difference D_dis between the angle-0 template and its corresponding working image block, where M_t(0,x,y)≥3.
Figure 8 shows the construction of the total macro-template matching process for identifying each small object included in the candidate areas. The first and second processes are based on our proposed template matching, which calculates 4 kinds of matching parameters D_his, D_dis, D_sub(a_max) and D_cor(a_max), with the fitted angle parameter a_max, for each examined position. The second process (2) performs a detailed search of the fitting position: it finds the most fitted position (X_m, Y_m), where D_his, D_dis and D_sub(a_max) become minimum and D_cor(a_max) becomes maximum, by moving the matching position slightly within a (ΔX, ΔY) range around (X_c, Y_c). The third process (3) saves and displays this corrected matching position (X_m, Y_m), the matched angle, the template ID and the 4 matching determination parameters. The fourth process (4) erases the pixels inside the matched area: it resets to zero all pixel values of the working image I(x,y) inside the matched area corresponding with the inner template area, where M_t(x,y)≥2, which prevents picking up an already matched area in duplicate. These operations are repeated for all pixels in the working image.

Fig. 8. Construction of our proposed total macro-template matching processes (matching position (X_c, Y_c); corrected matching position (X_m, Y_m)).

Figure 9 shows the interactive definition process for the macro-template matching conditions: 4 kinds of slice levels S_his, S_dis, S_sub and S_cor, and the conceptually updated templates. The first process (1) lets the user interactively define a position (X_c, Y_c) where the target area should be identified as an object. The second process (2) corrects the defined position to the most fitted position (X_m, Y_m), where D_his, D_dis and D_sub(a_max) become minimum and D_cor(a_max) becomes maximum. The third process calculates the 4 slice levels as S_his=1.1·D_his, S_dis=1.1·D_dis, S_sub=1.1·D_sub and S_cor=0.9·D_cor. The fourth process updates the template image I_t(x,y) by mixing it with the matched pixel block of the working image I(x,y) so that half of its pixels become those of the working image, as shown in Fig.10. This makes the template match more types of patterns and become a more conceptual pattern.
Fig. 9. Interactive definition of matching conditions: (1) define target position, specifying a center position (X_c, Y_c) of a pattern which should be included in the searching targets on the screen; (3) definition or updating of slice levels, calculating the 4 slice levels from the most matched conditions of the specified pattern; (4) updating of template images, mixing the corresponding working image block with the template image to produce the templates used in the next operations. These operations are repeated until appropriate results are obtained.

Fig. 10. Template update: for each pixel position, either the template pixel or the working-image pixel is randomly picked up.
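The random 50% mixing of Fig.10 can be sketched directly; the function name and the seeded generator are our own:

```python
import numpy as np

def update_conceptual_template(template, matched_block, rng=None):
    """Template update (Fig. 10 sketch): each pixel is randomly picked
    from either the template or the matched working-image block, so on
    average half of the pixels come from each source."""
    rng = np.random.default_rng(rng)
    pick = rng.random(template.shape) < 0.5   # True -> keep template pixel
    return np.where(pick, template, matched_block)
```

Repeated over several user-confirmed matches, this mixing gradually blurs instance-specific detail out of the template, which is what makes it "conceptual": it keeps matching the class of patterns rather than one particular automobile.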

Experimental results
Figure 11 shows an example of experimental results using a QuickBird (DigitalGlobe, 2008) panchromatic 11-bit monochrome 13032 x 13028 pixel satellite image. Fig.11-(a) shows a 664 x 655 pixel trimmed area after the first segmentation process with a 4 x 4 pixel micro-template; the extracted candidate areas are indicated in red. Fig.11-(b) shows the modified area after the clustered template matching process; the large candidate areas such as roads, which do not include automobiles, have been removed from the candidates. Fig.11-(c) shows a three-times zoomed view (591 x 468) of the center-left 197 x 156 pixel area (a parking lot) after the third identification process with 52 x 52 pixel macro-templates of 4 angles, whose initial definitions and updated patterns are shown in Fig.12. The slice levels were defined as S_his=603, S_dis=600, S_sub=130 and S_cor=412.

Fig. 11. Example of automobile recognition experiments by our proposed template matching using a 0.6 [m]-resolution QuickBird panchromatic image (13032 x 13028 pixels).

In this final stage, 61 of the 85 automobile objects in this picture were correctly identified, whereas 19 of 558 non-target patterns were excessively identified. This incorrect identification rate is improved compared with the results obtained without micro-template matching or clustered micro-template matching, as shown in Table 1, which gives the identification precision rates of three kinds of experiments with and without micro-template matching and clustered micro-template matching. However, the correct identification rate became slightly worse, which should be improved in the future.
(a) Candidate areas painted in red by micro-template matching (664 x 655 pixels).
(b) Large area patterns such as roads are filtered out from the candidate areas by a clustered micro-template matching.
(c) Identified examples of automobile objects by macro-template matching, outlined in green (197 x 156).

Texture analysis application of our proposed method
Our proposed micro-template matching method can be extended to identify kinds of objects or areas other than transportation objects. For example, we are trying to extract rice field areas in several kinds of satellite images, including low-resolution SAR images, by designing micro-templates that detect particular spatial frequency feature patterns.
In this section, we present an application of our proposed micro-template matching method to texture analysis of satellite images.

Proposed texture analysis method
Our texture analysis method is based on the micro-template matching method proposed in this paper, which makes a binary determination of whether the target pixel I(x,y) is included in a metallic texture area, using micro-templates defined by multiple binary determination rules over the nearest N×N pixel block. Applying the micro-templates around each target pixel I(x,y) (=0-255), we can create a binary image B(x,y) (=0 or 1) which indicates the metallic areas. In this paper, we propose extending this micro-template to filter matrices M(x,y) (= −1 or +1) which provide spatial frequency analysis functions, as shown in Fig.13 and Fig.14.
For the N×N pixel block (N=8) around the pixel I(x,y) in the given source image data, we calculate the following 4 parameters and determine B(x,y) (=0 or 1), i.e. whether the target pixel has the predefined texture characteristics, by whether all of the calculated parameter values fall within predefined ranges. More specifically, we define 8 kinds of slice values: L_dis, H_dis, L_f1, H_f1, L_f2, H_f2, L_f4 and H_f4.

(2) First Spatial Frequency Components F_1: We defined the 4 kinds of matrices from (1-1-1) to (1-1-4) shown in Fig.13 as M_x11(x,y) to M_x14(x,y), and the 4 kinds of matrices from (1-2-1) to (1-2-4) as M_y11(x,y) to M_y14(x,y). Using these matrices, we calculate F_x1i and F_y1i (i=1,...,4), take the maximum values F_x1 and F_y1 among each set of 4 values, and define F_1 as the square-root average of these two.
(4) Fourth Spatial Frequency Components F_4: We defined the 2 matrices (3-1) and (3-2) shown in Fig.14 as M_x4(x,y) and M_y4(x,y). Using these matrices, we calculate F_x4 and F_y4, and define F_4 as the square-root average of these two.
We now describe how to define the 8 kinds of slice values L_dis, H_dis, L_f1, H_f1, L_f2, H_f2, L_f4 and H_f4 interactively. We indicate one of the target texture areas to be extracted, area-O, and also indicate two areas with opposite features that should not be extracted, area-A and area-B. We calculate the average values of the 4 parameters D_dis, F_1, F_2 and F_4 based on the equations described above for each of the three selected areas, and define the average values in area-A accordingly. In the case A_dis > B_dis, we apply the above after exchanging the values of A_dis and B_dis.
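Interpreting these rules as code, the range test for B(x,y) and one plausible way to derive the [L, H] slice ranges from user-indicated sample areas can be sketched as follows. The paper's exact slice formulas were lost in extraction, so `slices_from_samples` is only an illustrative assumption (a range centered on the target-area averages, bounded halfway towards a rejected area), and the parameter and function names are ours:

```python
def texture_match(params, slices):
    """Texture decision: B(x, y) = 1 only if every one of the 4
    parameters (D_dis, F_1, F_2, F_4) falls inside its [L, H] range.
    `params` maps name -> value; `slices` maps name -> (low, high)."""
    return int(all(slices[k][0] <= params[k] <= slices[k][1]
                   for k in ("d_dis", "f1", "f2", "f4")))

def slices_from_samples(target_avgs, reject_avgs):
    """ASSUMED interactive slice definition: given average parameter
    values of the target area (area-O) and of one rejected area, put
    each [L, H] range around the target value, extending halfway
    towards the rejected value."""
    slices = {}
    for k, o in target_avgs.items():
        half = abs(o - reject_avgs[k]) / 2.0
        slices[k] = (o - half, o + half)
    return slices
```

With two rejected areas (area-A and area-B), as the text prescribes, one would take the tighter of the two bounds on each side; the single-area version above keeps the idea visible without guessing the lost formulas.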

Example of rice field area extraction
Applying the texture analysis method described above to several feature areas in satellite images, we obtained specific average values of the 4 parameters D_dis, F_1, F_2 and F_4. As experimental images, we used two kinds of QuickBird (DigitalGlobe, 2008) panchromatic images (an urban area, 13032×13028 pixels, and a rural area, 29195×34498 pixels); we extracted a 252×182 pixel area and a 1676×1563 pixel area from these images respectively, and converted the brightness depth from 11 bits to 8 bits. From the first image we selected three areas: a road without cars, a parking area without cars, and a parking area with multiple cars parked in parallel. From the second image we also selected three areas: a town and housing area, a rice field area and a river area. For each of these 6 selected areas, we calculated the average values of the 4 parameters D_dis, F_1, F_2 and F_4, as plotted in Fig.15. In Fig.15, the vertical direction is divided into four parts, and the 4 average parameters of the 6 selected areas are plotted from top to bottom: pixel dispersion values, first spatial frequency components, second spatial frequency components, and fourth spatial frequency components. The horizontal dimension gives 100-times integer values of the results calculated by equations (1) to (4). As an example, Figure 16 shows the result of an extraction experiment on the first urban area image, setting slice values such as L_dis=520. As shown in Fig.15, the parameter values of the rice field area and the river area are closer to each other than to those of the town and housing area. This indicates that it is more difficult to separate the rice field area from the river area than to separate either from the town and housing area. Therefore, we performed separation experiments for these 3 kinds of texture in the same image, as shown in Fig.17.
Using the source image Fig.17-(1), the result of the first extraction is shown in Fig.17-(2); then, from the processed image Fig.17-(2), the result of the second extraction is shown in Fig.17-(3), each with its own parameter settings. We could almost completely separate the rice field areas from the river areas, except that several edge parts of the river areas were incorrectly extracted.

Conclusions
In this paper, we have proposed a three-layered structured template matching method that decreases calculation load and improves identification precision compared with the conventional template matching algorithm. The first layer, micro-template matching, extracts candidate areas with metallic-reflection optical characteristics, in which any type of transportation object may be included. The second layer, clustered micro-template matching, identifies and removes excessively extracted candidate areas such as roads and buildings, which have similar optical characteristics but do not include our targets. The third layer, macro-template matching, identifies each automobile object within the focused area using our proposed conceptual templates, which are patterns learned from the user's operations, based on our improved high-speed template matching algorithm. In experiments with the proposed method, we could correctly extract about 70% of the automobile objects in a single scene of a QuickBird panchromatic image. Moreover, we have proposed giving a texture analysis function to the first micro-template matching process by adding independent spatial frequency components as parameters. The previously proposed micro-template matching used only a mixed parameter of the pixel dispersion value and the first spatial frequency component of this texture analysis, which made it difficult to distinguish areas with similar texture characteristics such as those shown in Fig.17-(2) and Fig.17-(3). We found that this problem can be overcome by separating the independent spatial frequency component parameters from the pixel dispersion parameters and adding the second spatial frequency component parameter. As future work, we are going to evaluate the contributions of the other first and fourth spatial frequency components in several kinds of satellite images, and redesign the matrix sizes and analysis parameters.
We believe our proposed technology will create new industrial applications and business opportunities for high-cost Earth observation satellite images, such as wide-area traffic monitoring that compensates for the blind areas of conventional terrestrial traffic monitoring.
