Texture Image Segmentation Method Based on Wavelet Transform and Neural Networks
Shunichiro Oe
Jing Zhang
Information Science & Intelligent Systems, University of Tokushima, Japan
2-1, Minami-Josanjima, Tokushima, 770-8506 Japan
[email protected] [email protected]

ABSTRACT
This paper presents an effective texture image segmentation algorithm that uses wavelet decomposition and band-pass neural networks. The approach is applied to segment a random texture image into several homogeneous areas. The basic idea of the proposed method is to first decompose an original image into several filtered images that contain information in different orientation and frequency ranges; these filtered images are of the same size as the original image. The zero-crossing transformation is then applied to all of these filtered subimages. The texture features are extracted by calculating the energy, mean, variance and co-occurrence matrix of small windows in the filtered subimages. A feature vector pyramid is then built from reduced-resolution versions of these arrays. By using band-pass neural networks in the pyramid linking process, each child can be linked to its most similar parent and, at the same time, the robustness of the system and its resistance to noise are considerably improved. The validity of the method is verified by several numerical examples.

1. INTRODUCTION
In the field of computer vision, texture plays an important role in low-level image analysis and understanding. Its potential has been shown in various application areas such as the analysis of remote sensing images, industrial automation and biomedical image processing. For example, in a vision-guided assembly system, the robot needs to pick up the appropriate components from a bin. This requires image segmentation followed by image recognition. Image segmentation is the process of partitioning the image into non-intersecting regions such that each region is homogeneous and the union of no two adjacent regions is homogeneous. Segmenting a random texture image into different texture regions is a relatively difficult segmentation problem. Generally, texture image segmentation can be carried out in two steps.

The first step is to extract the texture features of the random texture image. To design an effective algorithm for texture segmentation, it is essential to find a set of texture features that discriminate well between different textural pattern classes. Mathematically, descriptions of texture features can be divided into statistical and structural methods. Structural methods describe textures in terms of texture primitives and the spatial relationships among them. Many computational methods have been proposed and applied to extract texture features. Commonly used texture features include Fourier domain energy, co-occurrence matrices, texture energy measures, Markov random field models, second-order gray level statistics, two-dimensional autoregressive models and fractal dimension [1]. In most early work, however, texture features were analyzed at a single resolution. More recently, attention has focused on multiresolution and multichannel texture analysis, especially the application of wavelet decomposition. Wavelet decomposition provides a mathematically precise definition of the concept of multiresolution. In this paper, we propose an approach for extracting texture features that first decomposes an original image into images of three different frequency ranges by wavelet decomposition and then performs zero-crossing detection on all filtered images. The mean, variance and co-occurrence matrices of all filtered images are then calculated and taken as the texture features of the filtered images.

The second step is to segment the image into homogeneous texture areas according to the texture features obtained above. There are a large number of image segmentation techniques. Most of them, however, can be grouped into three broad categories [2]: (1) region extraction methods, (2) edge detection methods, and (3) multi-resolution methods. Many papers have been presented on region extraction methods, for example single (hybrid or centroid) linkage region growing schemes, split-and-merge schemes and spatial clustering schemes. Several attempts have also been made to develop image processing algorithms using neural networks [1]-[4]. These segmentation methods still have some problems. Region growing and split-and-merge methods need a thresholding technique; in particular, when segmenting a texture image that contains several different random textures, multi-thresholding techniques must be considered. Some of these methods, particularly the histogram-based ones, are not at all suitable for noisy images. Neural network based segmentation approaches are attempts to solve these problems and to achieve robustness of
0-7803-4778-1/98 $10.00 © 1998 IEEE
the system with respect to random noise and failure of processors. These methods have improved segmentation results, but their robustness is poor for heavily noisy images and for images whose texture feature values vary widely. The multi-resolution image processing method performs well on criteria such as image data compression, edge detection, and fitting. The basic idea of this method is the use of a multi-resolution ("pyramid") image representation and a pyramid linking algorithm. In this paper, we use the pyramid linking algorithm as the segmentation method. For a texture image, texture properties are not meaningful for single pixels, so we begin with a fixed partition of the image into a number of small, equally sized windows and compute a texture vector for each window. In this paper, the mean, variance and co-occurrence matrices of the small windows are computed as the texture features. The pyramid structure is then constructed from these texture features (Fig. 1(b)). After that, pyramid linking is used to link each node to its most similar parent. When linking children to their parents, the discriminant function that determines the similarity between parents and children must be defined correctly so that each child is linked to a suitable parent. It is sometimes very difficult to define such a discriminant function, especially when many different texture features are used simultaneously in segmentation. In this paper, we propose a new segmentation method that uses pyramid linking and band-pass filtering neural networks. The purpose of the band-pass neural networks is to decide correctly to which parent a son should be linked, according to the similarity between the texture features of the son and those of its parents. Because band-pass neural networks are used, the noise resistance of the pyramid linking process is improved even when the texture features take a wide range of values, so the robustness of the system is considerably improved.

We present implementation details along with several experiments demonstrating the effectiveness of the proposed segmentation scheme. The rest of the paper is organized as follows. In the next section, we introduce the wavelet transform and the method for extracting the texture features. Section 3 defines the segmentation error. In Section 4, we explain the construction of the neural networks. The implementation of the segmentation method is presented in Section 5. In Section 6, examples of results obtained using our technique on a variety of image data are presented. We conclude the paper in Section 7.
Figure 1: The pyramid structure of the feature vectors of the image
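As a rough illustration of the windowing and pyramid construction outlined in the introduction, the following sketch computes the mean and variance of overlapped windows and then reduces the resulting feature array level by level. It is only a sketch under stated assumptions: the 16 x 16 window size and the 4 x 4 neighbourhoods with 50% overlap are taken from Section 5, whereas the 8-pixel window step, the uniform averaging weights, the edge padding and all function names (`window_features`, `reduce_level`, `build_pyramid`) are hypothetical choices made here for illustration. The co-occurrence and energy features are omitted.

```python
import numpy as np

def window_features(image, size=16, step=8):
    """Mean and variance of every size x size window, moved `step` pixels at a time.

    The step of 8 pixels (50% window overlap) is an assumption of this sketch.
    """
    rows = range(0, image.shape[0] - size + 1, step)
    cols = range(0, image.shape[1] - size + 1, step)
    feats = np.empty((len(rows), len(cols), 2))
    for a, r in enumerate(rows):
        for b, c in enumerate(cols):
            win = image[r:r + size, c:c + size]
            feats[a, b] = (win.mean(), win.var())
    return feats

def reduce_level(level):
    """One pyramid step: average 4 x 4 neighbourhoods taken every 2 nodes.

    Uniform weights and edge replication at the border are assumptions.
    """
    padded = np.pad(level, ((1, 1), (1, 1), (0, 0)), mode="edge")
    h, w = level.shape[0] // 2, level.shape[1] // 2
    parent = np.empty((h, w, level.shape[2]))
    for i in range(h):
        for j in range(w):
            block = padded[2 * i:2 * i + 4, 2 * j:2 * j + 4]
            parent[i, j] = block.mean(axis=(0, 1))
    return parent

def build_pyramid(level0):
    """Stack of feature arrays, from the window level up to a single node."""
    levels = [level0]
    while min(levels[-1].shape[:2]) >= 2:
        levels.append(reduce_level(levels[-1]))
    return levels

rng = np.random.default_rng(0)
image = rng.normal(size=(72, 72))          # stand-in for a filtered subimage
pyramid = build_pyramid(window_features(image))
print([lvl.shape for lvl in pyramid])
```

Because the neighbourhoods are taken every two nodes, interior nodes fall inside four parent neighbourhoods, which is the parent-child structure that pyramid linking relies on.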
2. TEXTURE FEATURE
2.1 Wavelet decomposition of texture images
Wavelets are functions that satisfy certain mathematical requirements and are used to cut up data into different frequency components. This idea is not new. One of the earliest methods of signal decomposition is the Fourier transform, which uses sinusoids to represent other functions. However, unlike the sine and cosine functions of the Fourier transform, wavelets not only provide localization in both the spatial and frequency domains but also offer a further advantage: the window size changes with the frequency content of the image. That is, if we look at an image with a large window, we notice its gross features; similarly, if we look at it with a small window, we notice its small features. This makes wavelets interesting and useful.

An original image is decomposed into several filtered images, which contain information in different subbands, by using a family of wavelet functions and its associated scaling functions. The wavelet decomposition of the image can be performed by a pyramid algorithm in which the quadrature mirror filters H (a low-pass filter) and G (a high-pass filter) are used to calculate the wavelet coefficients. The impulse responses of H and G are denoted by h(n) and g(n), respectively, and they are related by

$$ g(k) = (-1)^{k}\, h(1-k) \qquad (1) $$

With the pyramid-structured wavelet transform, the original image is first passed through the low-pass and high-pass filters to generate the low-low (LL), low-high (LH), high-low (HL) and high-high (HH) subimages. The decomposition is then repeated on the LL subimage to obtain four further subimages. The 1-scale and 2-scale wavelet decomposition schemes are shown in Fig. 2(a) and (b) respectively, where the size of the wavelet representation is the same as that of the original image.

Figure 2: (a) 1-scale decomposition (b) 2-scale decomposition, in which the LL subimage is split into LLLL, LLHL, LLLH and LLHH

Feature extraction is a crucial step in accomplishing reliable segmentation. A good texture feature should reflect some global view while keeping some discrimination capability at the pixel level. For two-dimensional images, many mathematical results have shown that a band-pass signal can be reconstructed from its zero-crossings under certain circumstances. Although this theory is not enough to make similar claims about texture features, it is clear that zero-crossings are rich in information and can provide a complete representation of a signal under certain restrictions. An important property of the result of the wavelet transformation is its balance between positive and negative values. In this paper, we therefore investigate envelope-based feature extraction schemes; that is, we apply a zero-crossing transformation to all filtered subimages. The zero-crossing transformation first finds the maximum value between two adjacent zero-crossings and then assigns this maximum value to all points within the interval. We apply the zero-crossing transformation along the vertical direction on the LH subimage, and along the horizontal direction on the HL and HH subimages. The following features of the image are then calculated.

2.2 Feature extraction from wavelet coefficients
Many feature extraction methods have been proposed. Among them, many successful applications have been reported based on the texture energy measure and the co-occurrence matrix. In our scheme, the features used for segmentation are the energy measure, mean, variance and gray-level co-occurrence matrix of the envelope values of the wavelet coefficients (after the zero-crossing transformation) of the high-frequency channels at the first scale. The energy measure is similar to the texture energy measure and is computed by the following equation:

(2) [equation not legible in the source; the sum starts at i = 1]

where n1 is the size of the small window. From the co-occurrence matrix, 14 different texture features can be computed. These features emphasize different entries in the matrix, for instance the clustering of its content around the main diagonal. Here we use the moment features of the co-occurrence matrix, which are computed by the following equation:

(3) [summand not legible in the source; a double sum over i = 1, ..., n and j = 1, ..., n involving P_h(i, j)]

where P_h(i, j) represents the two-dimensional histogram measurements made for four radial separation points and angles of θ = 0, π/4, π/2 and 3π/4 radians under the assumption of angular symmetry.

3. THE CALCULATION OF THE SEGMENTATION ERROR
Suppose the texture image is divided into n different homogeneous areas, X is the feature vector of the image, and a_d is the mean of the feature vectors that belong to cluster d; the number of small windows included in cluster d is also used. The evaluation function is defined by the following formula:

(4) [equation not legible in the source]

That is, σ²(n) is the sum of the variances of the n different homogeneous areas. The value of σ²(n) changes with the number of homogeneous areas n, and the segmentation result for which σ²(n) is smallest is taken as the best one.
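The evaluation function described above, a sum of the variances of the homogeneous areas, can be sketched as follows. The exact form of equation (4) is not available here, so the normalization is an assumption: each area contributes the mean squared distance of its window feature vectors to the area mean. The function name `segmentation_error` and the toy data are hypothetical.

```python
import numpy as np

def segmentation_error(features, labels):
    """Sum over areas of the within-area variance of the window feature vectors.

    features: array of shape (num_windows, num_features)
    labels:   array of shape (num_windows,), the area assigned to each window
    Assumption: the variance of an area is the mean squared Euclidean distance
    of its feature vectors to their mean.
    """
    total = 0.0
    for d in np.unique(labels):
        members = features[labels == d]
        centre = members.mean(axis=0)
        total += np.mean(np.sum((members - centre) ** 2, axis=1))
    return float(total)

# Toy example: four windows, two well-separated areas.
features = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0]])
good = np.array([0, 0, 1, 1])   # windows grouped with their true neighbours
bad = np.array([0, 1, 0, 1])    # windows from different textures mixed together
print(segmentation_error(features, good), segmentation_error(features, bad))
```

Under this assumed definition, a labelling that mixes windows from different textures yields a larger value than one that keeps them apart, which matches the rule that the segmentation with the smallest value is preferred.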
4. NEURAL NETWORKS
Recently, neural networks such as back-propagation neural networks, decision-based neural networks and Kohonen's self-organizing neural networks have been successfully applied to pattern recognition. When the pyramid linking method is used to segment texture images, it is very important to decide to which parent a child should be linked according to the similarity between the parents and the son. In this paper, we use band-pass neural networks [7] (BPN for short) to decide the similarity between the parents and the son during pyramid linking. The band-pass neural network is a feedforward, multi-layer, supervised neural network; it passes patterns from only one category with little or no distortion and has a nonlinear filtering ability. Its output values represent the distances between the target pattern and the candidate pattern categories. It has two advantages. The first is that a pattern can be recognized by comparing the similarity between the input and the output. The second is that each network is independent of the other neural networks, because one such network is trained for only one pattern category.

In this section, we describe the configuration of the BPN and the method of training and recognition with it. In this paper, we use a five-layer BPN. In this five-layer BPN, the upper and lower hidden layers have more units than the input and output layers, and the middle hidden layer has fewer units than the input layer. In general, a supervised learning network has two phases: a learning phase and a retrieving phase. In the learning phase, the weights and offsets of the network are trained so that the output vector of the BPN is the same as its input vector. Thus each BPN is trained as an identity mapping using patterns that belong to one pattern category.
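The identity-mapping training described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the tanh hidden units, linear output layer, squared-error loss, learning rate, number of epochs, the synthetic patterns and the layer sizes 7-12-5-12-7 (chosen to match the wide-narrow-wide shape described above) are all assumptions, as are the function names. The similarity used at the end is the normalized inner product between an input and the network output.

```python
import numpy as np

def init_bpn(sizes, rng):
    """One (weights, offsets) pair per layer transition, with small random weights."""
    return [(rng.normal(0.0, 0.3, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Activations of every layer; hidden layers use tanh, the output is linear."""
    acts = [x]
    for k, (w, b) in enumerate(params):
        z = acts[-1] @ w + b
        acts.append(z if k == len(params) - 1 else np.tanh(z))
    return acts

def train_identity(params, patterns, lr=0.05, epochs=500):
    """Batch back-propagation so that the output reproduces the input."""
    params = [(w.copy(), b.copy()) for w, b in params]
    for _ in range(epochs):
        acts = forward(params, patterns)
        delta = (acts[-1] - patterns) / len(patterns)   # error at the linear output
        for k in range(len(params) - 1, -1, -1):
            w, b = params[k]
            grad_w = acts[k].T @ delta
            grad_b = delta.sum(axis=0)
            if k > 0:
                # Propagate through the old weights and the tanh derivative.
                delta = (delta @ w.T) * (1.0 - acts[k] ** 2)
            params[k] = (w - lr * grad_w, b - lr * grad_b)
    return params

def reconstruction_error(params, patterns):
    out = forward(params, patterns)[-1]
    return float(np.mean(np.sum((out - patterns) ** 2, axis=1)))

def similarity(x, y):
    """Normalized inner product of two vectors."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

rng = np.random.default_rng(0)
prototype = rng.uniform(0.2, 0.8, size=7)                 # one pattern category
patterns = prototype + 0.05 * rng.normal(size=(40, 7))    # noisy members of it
untrained = init_bpn([7, 12, 5, 12, 7], rng)
trained = train_identity(untrained, patterns)
print(reconstruction_error(untrained, patterns),
      reconstruction_error(trained, patterns))

member = patterns[0]
outsider = prototype[::-1] * np.array([1, -1, 1, -1, 1, -1, 1])
for name, x in (("member", member), ("outsider", outsider)):
    print(name, similarity(x, forward(trained, x[None, :])[-1][0]))
```

One network of this kind would be trained per pattern category, so that comparing an input with the output of each network indicates which category reproduces it with the least distortion.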
Because the desired output of each BPN is its own input, the training benefits from the assistance of a teacher. In the retrieving phase, the object is to determine to which class a pattern belongs, based on the winner among the output values of the networks. The output values are a function of the input and the network weights, called the discriminant function. In fact, many kinds of structures of supervised-learning neural networks can be chosen according to the practical application. Here the BPN uses the likelihood between the input and the output as the discriminant function. In the learning phase, the back-propagation algorithm is used to train the BPN, because it offers an efficient computational speed-up for training multi-layer networks. The goal of the training process is to make the output pattern of the BPN as close as possible to the input pattern, that is, to train the weights W_ji so as to minimize the error δ_k between the teacher and the actual response. As the training succeeds, each BPN becomes a nonlinear band-pass filter; that is, each BPN passes patterns that belong to only one category, with little or no distortion.

5. IMAGE SEGMENTATION USING PYRAMID LINKING AND NEURAL NETWORKS
The pyramid structure is a multi-resolution image representation in which each level is a lower-resolution version of the previous one. The pyramid of feature vectors is constructed as follows. First, an original image is decomposed into three high-frequency channels, and the zero-crossing transformation is applied to all of these subimages. We divide these subimages into small windows of the same size (16 x 16) and compute the texture features of each small window. Because a large window size would make the segmentation too coarse at the edges between different texture areas, we use overlapped small windows when computing the texture features of the image; that is, we move the small window so that neighboring windows overlap each other (see Fig. 3). After the feature vectors of all subimages were calculated, we constructed the pyramid of feature vectors by taking the weighted average of 4 x 4 neighborhoods in the preceding level. The neighborhoods overlapped by 50% in the x and y directions. The parent-child relationship was defined between nodes in adjacent levels. Because the neighborhoods overlapped by 50%, two adjacent parents had 50% of their possible children in common. Each node in the pyramid therefore had four parents at the next level and 16 children at the previous level.

Figure 3: The overlapped small window

Next, the neural networks for all parents are constructed and trained so that the outputs of the networks are the same as the inputs. The pyramid linking algorithm is then used to segment the image into several homogeneous areas. While linking parent and child between two adjacent levels, the BPN is used to determine to which parent node the child should be linked, according to the similarities between the child and its four parents. Fig. 4 shows the linking method of the pyramid. We first train the BPN neural networks so that the outputs of the BPN are the same as the inputs. The feature vector of the child is then input into the BPN of each of its four parents, and the similarities between the child and its four parents are computed separately. We compare these similarities and link the child to the parent that is most similar to it. We perform this process for all nodes from level 0 to level 4 and repeat it until the linking no longer changes. Finally, we map the segmentation result onto the original image and calculate the error of the segmentation result. The similarity between a parent and a son is computed by the following equation:

$$ S = \frac{(X, X')}{\|X\| \cdot \|X'\|} \qquad (5) $$

where X is the feature vector of the parent, X' is the output vector of the BPN (whose input is the feature vector of the son), (X, X') is the inner product of X and X', and ||X|| is the norm of X.

In this paper, the five-layer BPN has a unit configuration of 7-12-5-12-7. The number of hidden units was determined by experiments, which show that the segmentation performance is best with this configuration.

6. EXPERIMENTS
In this section, the proposed segmentation method is applied to some synthetic images constructed from real images taken from Brodatz's book [9], as shown in Fig. 5, Fig. 7, Fig. 9 and Fig. 11. Fig. 6, Fig. 8, Fig. 10 and Fig. 12 are the results obtained with the proposed method. From the results, it can be seen that the proposed method gives better segmentation results. The traditional pyramid linking method sometimes fails to link a child to its most similar parent when the values of the texture features have a wide range, so the robustness of the system is poor, and a failed link usually leads to an incorrect segmentation. With the proposed segmentation method, however, the BPN, which has a good band-pass filtering ability, is used in the pyramid linking process, so the child could always be linked to its most similar parent and good segmentation results were obtained. The errors of the segmentation results, as defined by equation (4), are listed in Table 1.

Table 1: The error of segmentation result (window size is 16 x 16)
Figure 4: Pyramid linking according to the similarity between parents and son (diagram labels: feature vector of the son; computation of similarity; the father which is the most similar to the son)

Figure 5: Original image

Figure 6: Segmentation result

7. CONCLUSION
In this research, we have proposed a texture image segmentation method that uses the wavelet transformation and band-pass neural networks. From the results of the experiments we can see that the proposed segmentation method segments the texture images correctly even when the feature values of the texture images have a wide range, and that the system is robust. With the proposed segmentation method, the segmentation error defined by equation (4) is small. A great improvement in segmentation accuracy can therefore be obtained by using the wavelet transformation and neural networks. In the segmentation process, however, we consider only the relationship between parents and children during pyramid linking and ignore the relationship between children. If the relationship between children is considered while linking the children to their most similar parents, a better segmentation result may be obtained with the proposed method. Our next work will therefore address how to find the similarity between children during the linking process and how to extract the texture features of the image more exactly, so that the best segmentation results can be obtained.
Figure 7: Original image

Figure 8: Segmentation result

Figure 9: Original image

Figure 10: Segmentation result

Figure 11: Original image

Figure 12: Segmentation result

8. REFERENCES
[1] J. Zhang and S. Oe, Texture Image Segmentation Method by Using Pyramid Linking and Self-Organizing Neural Networks, Proceedings of the 1997 International Conference on Neural Information Processing and Intelligent Information Systems, Volume 2, pp. 1191-1194, July (1997)
[2] Nikhil R. Pal and Sankar K. Pal, A review on image segmentation techniques, Pattern Recognition, Vol. 26, No. 9, pp. 1277-1294 (1993)
[3] Jianchang Mao and A. K. Jain, Texture classification and segmentation using multiresolution simultaneous autoregressive models, Pattern Recognition, Vol. 25, No. 2, pp. 173-188 (1992)
[4] Matti Pietikainen and Azriel Rosenfeld, Image Segmentation by Texture Using Pyramid Node Linking, IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-11, No. 12, December 1981
[5] M. Bister and J. Cornelis, A critical view of pyramid segmentation algorithms, Pattern Recognition Letters, 11, pp. 605-617 (1990)
[6] Andrew Laine and Jian Fan, An Adaptive Approach for Texture Segmentation by Multichannel Wavelet Frames.
[7] M. Nakamura and S. Tamura, Vowel recognition by phoneme filter neural networks, ICSLP90 (International Conference on Spoken Language Processing), Proceedings, pp. 669-672 (1990.11)
[8] Jean-Michel Jolion and Azriel Rosenfeld, A Pyramid Framework for Early Vision, Kluwer Academic Publishers (1994)
[9] Phil Brodatz, Textures: A photographic album for artists and designers, Dover Publications, Inc., New York (1966)