An Intelligent System for Container Image Recognition using ART2-based Self-Organizing Supervised Learning Algorithm



Introduction
The quantity of goods transported by sea has increased steadily in recent years because sea transport costs less than other modes of transportation. Various automation methods are used for the fast and accurate processing of transport containers in harbors. Automation systems for container flow processing fall into two types: barcode processing systems and systems that automatically recognize container identifiers by image processing. Of the two, the image-based identifier recognition system is now the more widely used in harbors. Identifiers of shipping containers are assigned in accordance with the ISO standard and consist of four code groups: the shipping company code, the container serial code, the check digit code, and the container type code (ISO-6346, 1995; Kim, 2003). Only the first 11 identifier characters are prescribed in the ISO standard, and shipping containers can be discriminated by automatically recognizing these 11 characters. Other features, such as the foreground and background colors, the font type, and the size of the identifiers, vary from one container to another, since the ISO standard prescribes nothing except the code type (Kim, 2004; Nam et al., 2001). Because identifiers are printed on the surface of containers, their shapes are often impaired by environmental factors during transportation by sea. Damage to a container surface may distort the shapes of identifier characters in a container image. The variation in identifier features, together with noise, therefore makes it quite difficult to extract and recognize identifiers using simple information such as color values (Kim, 2004). Container identifiers do, however, generally share one feature: the color of the characters is black or white.
Based on this feature, all areas of a container image other than those with black or white colors are regarded as noise, and identifier areas and noise areas are discriminated by a fuzzy-based noise detection method. Noise areas are replaced with the mean pixel value of the whole image, and identifier areas are then extracted and binarized by applying, in turn, edge detection by Sobel masking and vertical and horizontal block extraction to the converted image. In the extracted areas, the color of the identifiers is converted to black and that of the background to white, and individual identifiers are extracted by an 8-directional contour tracking algorithm. This chapter proposes an ART2-based self-organizing supervised learning algorithm for identifier recognition, which creates the nodes of the hidden layer by applying ART2 between the input layer and the hidden layer and improves learning performance by applying generalized delta learning and the Delta-bar-Delta algorithm (Vogl et al., 1998). Experiments on many images of shipping containers show that the presented identifier extraction method and the ART2-based supervised learning algorithm outperform the previously proposed methods.

Extraction of container identifier areas
The rugged surface of containers and the vertical noise caused by external light may cause the extraction of container identifier areas from a container image to fail. To address this problem, a novel method is proposed that extracts identifier areas on the basis of fuzzy-based noise detection. In the proposed method, the edges of identifiers are detected by applying Sobel masking to a grayscale version of the original image, and the identifier areas are extracted using the edge information. Sobel masking is sensitive to noise, so it detects the noise caused by external light as edges. To remove the effect of noise on edge detection, noise pixels are first detected by a fuzzy method and replaced by pixels with the mean gray value. Sobel masking is then applied to the noise-removed image, and the areas of container identifiers are separated from the background areas.
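The Sobel masking step can be illustrated with a minimal NumPy sketch. It uses the standard 3x3 Sobel masks; the function name `sobel_edges`, the border handling, and the `threshold` value are assumptions made for illustration and are not taken from the chapter.

```python
import numpy as np

# Standard 3x3 Sobel masks for the horizontal and vertical gradients.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T


def sobel_edges(gray, threshold=100.0):
    """Return a boolean edge map for a 2-D grayscale array.

    `threshold` is a hypothetical cut-off on the gradient magnitude.
    """
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    padded = np.pad(gray, 1, mode="edge")  # replicate the border pixels
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    # Accumulate the mask response one mask cell at a time.
    for dr in range(3):
        for dc in range(3):
            window = padded[dr:dr + h, dc:dc + w]
            gx += SOBEL_X[dr, dc] * window
            gy += SOBEL_Y[dr, dc] * window
    magnitude = np.hypot(gx, gy)
    return magnitude > threshold
```

On an image with a single vertical intensity step, only the two pixel columns adjacent to the step are marked as edges, which is also why bright vertical streaks caused by external light show up as spurious edges.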

Fuzzy-based noise detection
To remove the noise caused by external light, a container image is converted to grayscale and the membership function shown in Fig. 1 is applied to each pixel of the grayscale image to decide whether the pixel is noise. In Fig. 1, C and E are the categories likely to belong to an identifier area, and D is the category likely to be noise. Eq. (1) gives the expression for the membership function of Fig. 1. The criterion that distinguishes noise pixels from non-noise pixels by their degree of membership is given in Table 1. To assess the effectiveness of the fuzzy-based noise detection, the results of edge detection by Sobel masking are compared between the original image and the image from which noise has been removed by the proposed method. Fig. 2 is the original container image, and Fig. 3 is the output image generated by applying only Sobel masking to a grayscale version of Fig. 2. Fig. 4 shows the edges detected by applying the fuzzy-based noise removal and Sobel masking to Fig. 2. First, the fuzzy-based noise detection method is applied to a grayscale version of the original image, and the pixels detected as noise are replaced with the mean gray value. Next, the edges of container identifiers are detected by applying Sobel masking to the noise-removed image. As a comparison of Fig. 3 with Fig. 4 shows, noise removal by the proposed fuzzy method yields better results in the extraction of identifier areas.
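A rough sketch of the noise replacement step follows. The actual membership function (Fig. 1, Eq. (1)) and the decision criterion (Table 1) are not reproduced in this text, so the sketch substitutes a hypothetical triangular membership for the noise category D over mid-gray levels; the breakpoints `low`, `peak`, `high` and the cut-off `cut` are illustrative assumptions only.

```python
import numpy as np


def membership_d(gray, low=64.0, peak=128.0, high=192.0):
    """Hypothetical triangular membership of category D (likely noise).

    The breakpoints are assumptions, not the values of Eq. (1).
    """
    gray = np.asarray(gray, dtype=float)
    rising = (gray - low) / (peak - low)
    falling = (high - gray) / (high - peak)
    return np.clip(np.minimum(rising, falling), 0.0, 1.0)


def remove_noise(gray, cut=0.5):
    """Replace pixels judged to be noise with the mean gray value.

    `cut` is an assumed stand-in for the criterion of Table 1.
    """
    gray = np.asarray(gray, dtype=float)
    noise = membership_d(gray) >= cut
    cleaned = gray.copy()
    cleaned[noise] = gray.mean()  # mean of the whole image, as in the text
    return cleaned, noise
```

Very dark and very bright pixels (candidates for black or white identifier characters) are left untouched, while mid-gray pixels are flattened to the image mean before Sobel masking is applied.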

Binarization of container identifier areas
The iterative binarization algorithm is currently the one mainly used in the preprocessing stage of pattern recognition. The algorithm first roughly determines an initial threshold, divides the input image into two pixel groups using that threshold, calculates the mean value of each pixel group, and sets the arithmetic mean of the two mean values as the new threshold. It repeats this procedure until the threshold no longer changes and uses the final value as the threshold for binarization. In a noise-removed container image, the difference in intensity between the background and the identifiers is large, so the iterative algorithm provides a good threshold value.
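The iterative threshold selection described above can be sketched as follows. The choice of the global mean as the initial threshold, the tolerance `tol`, and the iteration cap `max_iter` are assumptions; the text only says that the initial threshold is chosen roughly and that iteration stops when the threshold no longer varies.

```python
import numpy as np


def iterative_threshold(gray, tol=0.5, max_iter=100):
    """Return a binarization threshold found by iterative mean splitting."""
    gray = np.asarray(gray, dtype=float)
    t = gray.mean()  # rough initial threshold (assumed choice)
    for _ in range(max_iter):
        low = gray[gray <= t]
        high = gray[gray > t]
        if low.size == 0 or high.size == 0:
            break  # one group is empty, nothing left to split
        new_t = 0.5 * (low.mean() + high.mean())
        converged = abs(new_t - t) < tol
        t = new_t
        if converged:
            break
    return t


def binarize(gray):
    """Boolean image: True where the pixel is brighter than the threshold."""
    gray = np.asarray(gray, dtype=float)
    return gray > iterative_threshold(gray)
```

For gray levels that form two well-separated groups, such as dark characters on a bright background, the threshold settles midway between the two group means after very few iterations.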

Extraction of individual identifiers
Individual identifiers are extracted by applying the 8-directional contour tracking method (Chen & Hsu, 1989) to the binarized areas of container identifiers. The extraction succeeds when the background has an ordinary color other than white, as in Fig. 5, but fails when the background is white, as shown in Fig. 6. In the binarization process, bright background pixels are converted to black and dark identifier pixels are converted to white. Since the contour tracking method detects the edges of areas with black color, it cannot detect the edges of identifiers in target areas with a white background. The result of the binarization process is therefore reversed for identifier areas with a white background; that is, background pixels are converted to white and identifier pixels to black. Fig. 7 shows that this pixel reversal leads to successful edge detection in the identifier area with white background presented in Fig. 6. The contour tracking proceeds as follows.
Step 1. Initialize with Eq. (2) in order to apply the 8-neighborhood contour tracking algorithm to the identifier area, and find the pixel by applying the tracking mask shown in Fig. 8.
Step 2. When a black pixel is found after applying the tracking mask at the current pixel, calculate the values of $P_i^r$ and $P_i^c$ as shown in Eq. (3).
Step 3.
Step 4.
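Because Eqs. (2) and (3) and the tracking mask of Fig. 8 are not reproduced here, the following is only a generic sketch of 8-neighborhood contour tracking (Moore-neighbor tracing), not the chapter's exact procedure. It treats `True` cells as identifier (black) pixels, starts from the first such pixel in raster order, and uses the simple "back at the start pixel" stopping rule, which can end early on shapes that are one pixel wide. All names are hypothetical.

```python
import numpy as np

# The 8 neighbours of a pixel in clockwise order as (row, col) offsets,
# starting from the western neighbour.
OFFSETS = [(0, -1), (-1, -1), (-1, 0), (-1, 1),
           (0, 1), (1, 1), (1, 0), (1, -1)]


def trace_contour(mask):
    """Return the outer contour of the first object as (row, col) tuples."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    points = np.argwhere(mask)  # object pixels in raster order
    if points.size == 0:
        return []
    start = (int(points[0][0]), int(points[0][1]))
    contour = [start]
    # `back` indexes the background neighbour we arrived from; the pixel
    # west of the raster-first object pixel is always background.
    cur, back = start, 0
    for _ in range(4 * h * w):  # safety cap on the number of moves
        for k in range(1, 8):  # scan clockwise, starting after `back`
            d = (back + k) % 8
            r, c = cur[0] + OFFSETS[d][0], cur[1] + OFFSETS[d][1]
            if 0 <= r < h and 0 <= c < w and mask[r, c]:
                break
        else:
            return contour  # isolated pixel, no neighbour found
        # The neighbour examined just before the hit is background; express
        # it relative to the new pixel to obtain the new backtrack direction.
        prev = OFFSETS[(d - 1) % 8]
        back = OFFSETS.index((prev[0] - OFFSETS[d][0],
                              prev[1] - OFFSETS[d][1]))
        cur = (r, c)
        if cur == start:
            return contour
        contour.append(cur)
    return contour
```

The bounding box of the returned contour gives the region of one individual identifier; erasing that object and repeating the trace would extract the next one.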

Recognition of container identifiers using ART2-based self-organizing supervised learning algorithm
The error backpropagation algorithm uses gradient descent as its supervised learning rule to minimize a cost function defined in terms of the error between the output value and the target value for a given input. The algorithm has the drawbacks that learning converges slowly and that an insufficient number of hidden layer nodes or unsuitable initial connection weights can cause it to fall into local minima. During learning, the algorithm uses credit assignment to propagate the error of the output layer nodes backward to the nodes of the hidden layer, and as a result paralysis can be induced in the hidden layer. In general, recognition algorithms based on error backpropagation suffer from a drop in recognition rate caused by the empirical determination of the number of hidden layer nodes and by the credit assignment procedure (Kim & Yun, 1999). Fuzzy C-Means-based RBF networks use the fuzzy C-Means algorithm to generate the middle layer, which has the disadvantage of consuming too much time when applied to character recognition. In character recognition, a binary pattern is usually used as the input pattern. When the fuzzy C-Means algorithm is applied to training patterns composed of 0 and 1, it is not only difficult to classify the input patterns precisely, but training also takes much longer than with other clustering algorithms (Kim et al., 2005). The ART2 architecture evolved to perform learning for binary input patterns and also to accommodate continuous-valued components in input patterns (Carpenter et al., 1991). In the ART2 algorithm, connection weights are modified according to the mean values of all input patterns, and the cluster center is then recalculated by adapting it to the new pattern.
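The cluster-center update can be made concrete with a simplified ART2-style clustering sketch. It assumes the winner is the cluster with the smallest mean absolute difference to the input and that a new cluster is created when that difference is not below the vigilance value; the function name and these choices are illustrative assumptions rather than the chapter's exact formulation.

```python
import numpy as np


def art2_cluster(patterns, vigilance=0.4):
    """Cluster patterns one at a time; return (centers, labels)."""
    centers = []  # cluster centres (connection weights)
    counts = []   # number of patterns absorbed by each cluster
    labels = []
    for x in np.asarray(patterns, dtype=float):
        winner = None
        if centers:
            # Mean absolute difference between the input and each centre.
            dists = [float(np.mean(np.abs(x - c))) for c in centers]
            j = int(np.argmin(dists))
            if dists[j] < vigilance:
                winner = j
        if winner is None:
            # No existing cluster is similar enough: create a new node.
            centers.append(x.copy())
            counts.append(1)
            winner = len(centers) - 1
        else:
            # Move the centre to the running mean of its member patterns.
            n = counts[winner]
            centers[winner] = (centers[winner] * n + x) / (n + 1)
            counts[winner] = n + 1
        labels.append(winner)
    return np.array(centers), labels
```

With binary patterns of four components and a vigilance of 0.4, two patterns that differ in a single component (mean difference 0.25) share a cluster, which illustrates how averaging can hide one strongly deviating feature.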
In the ART2 algorithm, however, the mean of the differences between the input vector and the connection weights is compared with the vigilance factor, so an input pattern may be assigned to a similar cluster that has different properties (Kim & Kim, 2004). This can happen particularly when the pattern dimensionality is large and one feature differs drastically from the cluster center but its impact is minimized by averaging all the differences. When the traditional ART2 algorithm was applied to the recognition of container identifiers, the recognition rate was observed to decline because such differing input patterns were classified into the same cluster. Therefore, we propose a novel ART2-based hybrid network architecture in which the middle layer neurons have RBF (Radial Basis Function) properties and the output layer neurons have a sigmoid function property. An ART2-based self-organizing supervised learning algorithm for the recognition of container identifiers is proposed in this chapter. First, a new learning structure is applied between the input and the middle layers: it applies the ART2 algorithm between the two layers, selects the node with the maximum output value as the winner node, and transmits the selected node to the middle layer. Next, the generalized Delta learning algorithm and the Delta-bar-Delta algorithm are applied to the learning between the middle and the output layers, improving the learning performance. The proposed learning algorithm is summarized as follows:
1. The connection structure between the input and the middle layers follows the ART2 algorithm, and the output layer of ART2 becomes the middle layer of the proposed learning algorithm.
2. The nodes of the middle layer represent individual classes. Therefore, while the proposed algorithm has a fully connected structure on the whole, it uses the winner node method, which compares target vectors with output vectors and back-propagates a representative class and the connection weight.
3. The proposed algorithm performs supervised learning by applying generalized Delta learning as the learning structure between the middle and the output layers.
4. The proposed algorithm improves learning performance by applying the Delta-bar-Delta algorithm to generalized Delta learning for the dynamic adjustment of the learning rate. A pattern is counted as accurate when the difference between the target vector and the output vector is less than 0.1 and as inaccurate otherwise, and the Delta-bar-Delta algorithm is applied only when the number of accurate patterns is greater than or equal to the number of inaccurate patterns over all patterns. This prevents learning from stalling or oscillating at an almost constant error level because of the premature saturation caused by competition early in the learning process.
The detailed description of the ART2-based self-organizing supervised learning algorithm is given in Fig. 9.
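The restricted use of Delta-bar-Delta can be sketched as below. The update follows the usual Delta-bar-Delta rule (additive increase when the current gradient agrees in sign with the averaged past gradient, multiplicative decrease when it disagrees), gated by the accuracy count described in the text. The constants `kappa`, `gamma`, and `theta`, the function names, and the use of the largest component-wise difference to decide accuracy are assumptions for illustration.

```python
import numpy as np


def count_accurate(targets, outputs, tol=0.1):
    """Count patterns whose largest |target - output| is below `tol`."""
    diff = np.abs(np.asarray(targets, dtype=float)
                  - np.asarray(outputs, dtype=float))
    accurate = int(np.sum(diff.max(axis=1) < tol))
    return accurate, len(diff) - accurate


def delta_bar_delta_step(lr, delta, delta_bar, n_accurate, n_inaccurate,
                         kappa=0.05, gamma=0.2, theta=0.7):
    """Update per-weight learning rates; return (new_lr, new_delta_bar).

    delta     : current gradient of the error for each weight
    delta_bar : exponential average of the past gradients
    """
    lr = np.asarray(lr, dtype=float).copy()
    delta = np.asarray(delta, dtype=float)
    delta_bar = np.asarray(delta_bar, dtype=float)
    # Adapt the rates only when accurate patterns are at least as many
    # as inaccurate ones, as described in the text.
    if n_accurate >= n_inaccurate:
        agreement = delta_bar * delta
        lr[agreement > 0] += kappa           # same sign: speed up
        lr[agreement < 0] *= (1.0 - gamma)   # sign flip: slow down
    new_delta_bar = (1.0 - theta) * delta + theta * delta_bar
    return lr, new_delta_bar
```

When inaccurate patterns dominate, the learning rates are left unchanged and only the gradient average is updated, so the rate adaptation starts once learning has begun to settle.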

Performance evaluation
For performance evaluation, the proposed algorithm was implemented in Microsoft Visual C++ 6.0 on an IBM-compatible Pentium-IV PC. A total of 79 container images of size 640x480 were used in the experiments on the extraction and recognition of container identifiers. In the extraction of identifier areas, the previously proposed method fails on images that contain vertical noise caused by external light or a rugged container surface. The proposed extraction method, in contrast, detects and removes noise with a fuzzy method, which improves the success rate of extraction over the previously proposed method. The success rate of identifier area extraction was compared between the method proposed in this chapter and the previously proposed method. Recognition experiments were then performed with the FCM-based RBF network and the proposed ART2-based self-organizing supervised learning algorithm on the extracted identifier characters, and the recognition performance is compared in Table 3.

In the identifier recognition experiment, the learning rate and the momentum were set to 0.4 and 0.3, respectively, for both recognition algorithms. For the ART2 algorithm that generates the middle layer nodes in the proposed algorithm, the vigilance variables of the two character types were set to 0.4. A comparison of the number of middle layer nodes shows that the proposed algorithm creates more nodes than the FCM-based RBF network, while a comparison of the number of epochs shows that the proposed algorithm needs fewer learning iterations than the FCM-based RBF network. That is, the proposed algorithm improves learning performance. A comparison of the recognition success rates likewise shows that the proposed algorithm improves recognition performance over the FCM-based RBF network. Recognition failures in the proposed algorithm are caused by damage to the shapes of individual identifiers in the original images and by the loss of identifier information in the binarization process.

Conclusion
This chapter proposes an automatic recognition system for shipping container identifiers that uses a fuzzy-based noise removal method and an ART2-based self-organizing supervised learning algorithm. In the proposed method, noise is first detected in the original image and removed by a fuzzy method, and the identifier areas are then extracted. Specifically, the extraction of identifier areas is improved by removing the noise that causes errors, using a fuzzy method based on the feature that container identifiers are, on the whole, white or black. Individual identifiers are then extracted by applying the 8-directional contour tracking method to the extracted identifier areas. Experiments on 79 container images show that 72 identifier areas and 784 individual identifiers were extracted successfully and that 767 of the extracted identifiers were recognized by the proposed recognition algorithm. Recognition failures in the proposed algorithm are caused by damage to the shapes of individual identifiers in the original images and by the loss of identifier information in the binarization process.