当前位置:首页 >> >> Combination of Multiple Handwritten Text Line Recognition Systems with a Recursive Approach

Combination of Multiple Handwritten Text Line Recognition Systems with a Recursive Approach


Combination of Multiple Handwritten Text Line Recognition Systems with a Recursive Approach
Roman Bertolami, Beat Halter, Horst Bunke Institute of Computer Science and Applied Mathematics University of Bern, Neubr¨ ckstrasse 10, CH-3012 Bern, Switzerland u {bertolam, bhalter, bunke}@iam.unibe.ch

Abstract
In this paper we propose a novel method to combine the results of multiple text line recognition systems. The method uses a recursive approach and re-examines those parts in a text line which have been rejected based on the initial combination of the base recognisers’ results. By means of the new method, the search space can be reduced, and therefore more accurate recognition results can be expected. Experiments conducted on the IAM database show that the proposed method is able to improve the recognition rate compared to a standard combination scheme.

Keywords: Handwritten Text Line Recognition, Multiple Classi?er Systems, Hidden Markov Models

1.

Introduction

Handwriting recognition has been addressed by many researchers. In character, digit, and isolated word recognition high recognition rates have been achieved which enable successful applications in the ?elds of postal address reading [3] and cheque processing [7]. Most of the systems reported in literature consider constrained recognition problems, involving a small vocabulary or speci?c writing instruments. The recognition of general handwritten text is still a widely unexplored ?eld with many open problems. Everyone has their own writing style, different writing instruments can be used, and the number of word classes is usually huge. Furthermore, the dif?cult problem of segmentation occurs when moving from word to text recognition because the correct number of words in a text line is unknown in advance. Therefore, rather low recognition rates of only 50% to 80% have been reported in literature for general handwritten text recognition [9, 17, 19, 20]. Multiple classi?er systems have successfully been applied to improve the classi?cation accuracy in many different ?elds of pattern recognition [10, 14]. Voting and similar strategies have shown good potential to improve the classi?cation accuracy compared to a single classi?cation system. In the domain of handwriting recognition, classi?er combination has often been applied for isolated character and single word recognition. However, the

combination of handwritten text recognisers has been proposed only recently. This kind of combination requires some additional synchronisation mechanism because the number of words in the recognised word sequences might differ. Usually, a sequence alignment procedure is applied to synchronise the word sequences output by the individual base recognisers. Then, a standard voting method can be applied to extract the ?nal result from the synchronised sequences. The contribution of the present paper is a novel architecture for the combination of multiple recognisers, which involves recursive recognition. In the overall combined system, all parts of a text line are rejected where too many individual recognisers disagree. The rejected parts are subjected to an additional round of recognition. Rejection and re-recognition can be iterated several times. With an increasing number of rejection and re-recognition cycles, those parts of a text line that undergo recognition are becoming increasingly smaller. Hence the overall search space of the recogniser is gradually reduced. Because the search space is reduced, larger parts can be explored by the sub-optimal search strategy during the decoding step. In general, this leads to different recognition results. The probability that the correct word sequence is among the explored parts increases because a more exhaustive search is performed. Therefore, we can expect that the recognition becomes more accurate. The remaining part of this paper is organised as follows. Related work is summarised in the next section. In Sect. 3 the recursive recognition procedure is introduced. Recognition, combination, rejection, cutting, and pasting, which are the essential steps of the proposed method, are explained in detail. Experiments and results are provided in Sect. 4 and conclusions are drawn in the last section of the paper.

2.

Related Work

In of?ine handwriting recognition, improvements by means of multiple classi?er systems have been reported for handwritten character, numeral, word, and word sequence recognition. An automatic self-con?guration scheme to combine multiple character recognition systems has been proposed in [18]. For this scheme genetic algorithms are used.

Input text

(a) Recognition results of R1 , R2 , R3 :
Recognition

Combination

R1 : R2 : R3 :

leave is the autumn leave in that autumn leave is that autumn

(b) Alignment of the word sequences:
Rejection N Pasting Y Cutting the rejected parts

R1 : R2 : R3 :

leave leave leave

is in is

the that that

autumn autumn autumn

(c) Result of the re-recognition step: R1 : R2 : R3 : in in in the the that

Transcription

(d) Final result with the TakeRevoted pasting strategy:
Figure 1. Recursive recognition overview.

In numeral recognition, the application of statistical combination methods has been reported in [6]. Especially the behaviour knowledge space methods were able to successfully combine the classi?ers. In [22] a framework to combine numeral string recognisers was proposed. This framework uses a graph-based approach for combination. An evaluation of several decision combination strategies for handwritten word recognition has been reported in [4]. Borda count methods, fuzzy integrals, and multilayer perceptrons have been compared. In [5] various ensemble methods, including bagging, boosting, and feature subspace methods have been applied to handwritten word recognition. Hidden Markov Model based recognisers have been automatically generated by modi?cation of the training set. In handwritten text line recognition additional effort is required to synchronise the word sequences. In [11] a heuristic approach to align and combine multiple handwritten text line recognisers has been used. Positional information of the recognised words is exploited to reduce the search space of the alignment. A novel method to generate ensembles of text line recognition systems has been introduced in [2]. Based on speci?c integration of a statistical language model multiple recognisers have been built.

leave

in

the

autumn

(e) Final result with the TakeOriginal pasting strategy: leave in that autumn

Figure 2. Example of the recursive recognition step. recognisers. The results of these recognisers are then combined. Based on a con?dence measure we reject certain parts of the combined result. The rejected parts are cut from the original input image and resubmitted to the recognition process. This recursion can be applied multiple times. Finally, the recognised parts are pasted together to build the ?nal transcription.

3.2.

Recognition and Combination

3.

Methodology

This section describes the proposed recursive recognition schema. First, an overview is given. The individual parts of the process are then described in greater detail. Figure 2 provides an illustrating guiding example of the entire recursive recognition process.

3.1.

Overview

A system overview is shown in Fig. 1. First the handwritten input text is recognised by n independent base

In the ?rst processing step, the handwritten input text is recognised by n different base recognisers individually. The output of the recognisers are n word sequences each of which is a textual representations of the handwritten text line. Note that the number of words in these sequences may differ. In the example of Fig. 2 three base recognisers (R1 , R2 , R3 ) output a transcription for the handwritten input text leave in the autumn (Fig. 2a). Notice that none of the base recognisers correctly recognises the input. To combine the output word sequences of the base recognisers we ?rst use an alignment procedure to synchronise the sequences. Then, we apply a voting strategy to the individual segments of the alignment. A heuristic extension of the standard sequence alignment method [21] is used to align the word sequences. The heuristics enable us to reduce the search space by using positional information as proposed in [11]. The result of the alignment procedure is a sequence of segments which contain the recognised words. In our example the alignment results in four segments, as shown in Fig. 2b.

Table 1. Recognition rates of the three base recognisers.

0.4

Error Rate

Recogniser R1 R2 R3

Recognition Rate 63.06% 58.71% 55.33%

0.3

0.2

Table 2. Recognition rates of the different cutting (Plain/Average) and pasting strategies (TakeRevoted/TakeOriginal)

0.1

0

0

0.2

Baseline System Plain Average

TakeRevoted TakeOriginal TakeRevoted TakeOriginal

63.85% 63.78% 64.59% 63.19 % 64.69%

0.4 0.6 Rejection Rate

0.8

1

Figure 3. Error-rejection plot.

3.4.

Recursive Recognition

To each of the aligned segments we then apply a weighted voting strategy to extract the combination result. The weights are proportional to the recognition rates of the individual recognisers.

3.3.

Rejection and Cutting Methods

Depending on how many recognisers agree on their decision we de?ne a con?dence measure. Based on this con?dence measure we can then reject certain parts of the input. For example, if each recogniser outputs a different word, we reject this part. In the example of Fig. 2 we only accept if all recognisers agree on their decision. Therefore, the second and the third word are rejected (Fig. 2b). An additional round of recognition is then applied to the rejected parts. To be able to re-recognise the rejected parts we need a suitable cutting method. Because the starting and ending points of the recognised words may be different for the n recognised word sequences, it is not trivial to ?nd suitable starting and ending points for the re-recognition step. If we choose the starting point of a part too far to the left, we risk to have an additional, already correctly recognised, word in the cut part. On the other hand, if the start point is too much to the right, part of a word can be removed, thus making a correct recognition impossible. The same problems but in opposite order occur for the ending point. For these reasons we propose two different cutting methods: Plain The rejected parts are cut for each of the n recognisers individually. Thus, for each extracted segment we have n starting points (s1 , . . . , sn ) and n ending points (e1 , . . . , en ). Recogniser Ri then performs the re-recognition on the part between si and ei . Average A common part is cut for all the n recognisers. n 1 The average starting point s = n i=1 si and the n 1 average ending point e = n i=1 ei are used for this purpose.

Once we have cut the parts that need further examination, we can apply the re-recognition step. We resubmit the extracted parts to the same base recognisers. The motivation for this procedure is that we can dramatically reduce the search space when we don’t have to recognise a whole text line but only a part of it. The re-recognition step is identical to the initial recognition except for the reduced input sequences. The same recognition procedure is used and the same statistical language model supports the recognition process. Nevertheless, we can expect more accurate recognition, because the decoding step which performs the recognition implements a sub-optimal search strategy. The search space of the re-recognised parts is usually much smaller than the original one. Because the search space is smaller it can then be explored more deeply. Thus, we increase the probability that the correct words are considered during the decoding step and can therefore expect recognition becoming more accurate. The result of the re-recognition step in our example is shown in Fig. 2c. While the recognisers now agree on ?rst word in, they still disagree on the second word. The results of the re-recognition steps are then combined according to the same combination scheme as the original recognition results. If there are still parts to be rejected we can recursively invoke another round of rerecognition. To stop the recursion process we simply de?ne a maximal number of iterations.

3.5.

Pasting Methods

Once the recursive recognition has ?nished we have to include the results of the cut parts in the original result. It may occur that some parts still do not ful?l the acceptance criteria. To get the ?nal result for these parts as well we propose two different strategies: TakeRevoted The results of the last recognition step are combined and provide the results for the cut parts. TakeOriginal The results of the ?rst recognition step are combined and provide the results for the cut parts. In the example of Fig. 2, we decide to use the rerecognition step only once and therefore apply the past-

180 160 160 140 140 120 120 100 100 80 80 60

60

40

40

20

20

0

0

500

1000

1500

2000

2500

0

0

200

400

600

800

1000

1200

Figure 4. Distribution of the length of the input sequences in pixel. On the left, the lengths of the original text lines are shown whereas the lengths of the extracted parts which are submitted to re-recognition are shown at the right.

ing methods immediately after the result in Fig. 2c has been produced. The TakeRevoted method (d) provides the correct transcription whereas the TakeOriginal method (e) produces an error at the third word.

set. The statistical language model we used to support the decoding step is a bigram language model [16] which was extracted from the LOB corpus [8].

4.2.

Testset Results

4.

Experiments and Results

In the experiments we use three different base recognisers (R1 , R2 , R3 ) which are combined according to the proposed scheme.

4.1.

Experimental Setup

Each of the three recognisers (R1 , R2 , R3 ) is based on Hidden Markov Models (HMMs) [15] and is trained on the same dataset. Additionally, the same statistical language model supports the decoding step. However, the recognisers are different in terms of feature extraction and state modeling. Geometric features are used by recogniser R1 [12]. For each character, an individual number of states is determined with the quantile method [23]. Six Gaussians are used to model the output distribution of each state. The input of recogniser R2 is a pixel-based feature stream [19]. The Bakis method is used to determine the number of states per character [1]. The output distribution is again modelled with six Gaussians. The last recogniser (R3 ) we use is also based on the geometric features introduced in [12]. In contrast to R1 the number of states is determined with the Bakis method and only a single Gaussian is used to model the output distribution. All data used to train and test the system originate from the IAM1 database [13]. The HMM models are trained on 1530 text lines written by 35 writers. The test set consists of 572 text lines written by ?fteen writers. The considered task is writer independent which means that no writer who contributed to the test set is used to train the system. The underlying lexicon contains 4207 word instances and is the union of all words occurring in the training and test
1 The IAM database is publicly available for download at http://www.iam.unibe.ch/?fki/iamDB

The recognition rates of the three base recognisers (R1 , R2 , R3 ) are summarised in Tab. 1. The geometric feature based recogniser R1 , which uses a mixture of six Gaussians, clearly outperforms recognisers R2 and R3 . As a baseline system we use the combination of R1 , R2 , and R3 according to the combination methods described in Sect. 3 without rejections and re-recognition. The recognition rate of this baseline system is 63.85%. Thus, the combination of the three recognisers without recursion already outperforms the best single recogniser R1 . The goal is now to show that the recursive recognition has the potential to perform even better. The error-reject characteristic of the new system is shown in Fig. 3. We can either force the system to accept each combination result (error rate: 36.15%), accept if at least two of the recognisers produce the same result (error rate: 25.49%), or accept only if all recognisers agree (error rate: 14.66%). For the sake of simplicity we apply the recursive recognition only once. This means that the rejected parts are resubmitted to the recognition process only one time. We accept words that occur at least in the result of two recognisers. The other words are rejected and rerecognised. The results on the test set are summarised in Tab. 2. We can see that the TakeOriginal pasting method is able to improve the performance whereas the TakeRevoted method leads to some performance decrease. The best performing system uses the Average cutting method and the TakeOriginal pasting method which yields a recognition rate of 64.69%. One distinctive feature of the proposed method is a reduction of the length of the input image. This enables us to perform a more accurate recognition. The reduction of the length of the input sequences is illustrated in Fig. 4. In the recursive recognition step the input images are, on av-

erage, more than ?ve times shorter than the original input.

5.

Conclusions

[12]

We have proposed a novel method to combine multiple text line recognisers. The novelty we introduce is a recursive recognition step which enables us to resubmit the dif?cult parts of a text line to a second round of recognition. Because only parts of the text lines are re-recognised we can reduce the search space and therefore expect to obtain a more accurate recognition. In the proposed system, each of the base recognisers outputs a transcription of the handwritten input text ?rst. These word sequences are then combined. Based on how many recognisers agree in their decision, certain parts of the combination result are rejected. Next, we cut the rejected parts from the original image and resubmit them to the recognition process. All recognised parts are pasted together to build the ?nal word sequence. Experiments conducted on the IAM database show that the proposed method is able to improve the recognition rate compared to a standard combination scheme.

[13]

[14] [15] [16] [17] [18]

[19]

Acknowledgement
This research was supported by the Swiss National Science Foundation (Nr. 200020-19124/1). Additional funding was provided by the Swiss National Science Foundation NCCR program ”Interactive Multimodal Information Management (IM)2” in the Individual Project ”Visual/video processing”.
[20]

[21] [22] [23]

References
[1] R. Bakis. Continuous speech recognition via centisecond acoustic states. In Proc. of the 91. Meeting of the Acoustic Society of America, Washington, USA, 1976. [2] R. Bertolami and H. Bunke. Multiple handwritten text recognition systems derived from speci?c integration of a language model. In 8th International Conference on Document Analysis and Recognition, Seoul, Korea, volume 1, pages 521–524, 2005. [3] A. Brakensiek and G. Rigoll. Handwritten address recognition using hidden Markov models. In A. Dengel, M. Junker, and A. Weisbecker, editors, Reading and Learning, pages 103–122. Springer, 2004. [4] P. Gader, M. Mohamed, and J. Keller. Fusion of handwritten word classi?ers. Pattern Recognition Letters, 17:577– 584, 1996. [5] S. G¨ nter and H. Bunke. Ensembles of classi?ers for handu written word recognition. International Journal on Document Analysis and Recognition, 5(4):224 – 232, 2003. [6] T. Huang and C. Suen. A method of combining multiple experts for the recognition ofunconstrained handwritten numerals. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17:90–94, 1995. [7] S. Impedovo, P. Wang, and H. Bunke, editors. Automatic Bankcheck Processing. World Scienti?c, Singapore, 1997. [8] S. Johansson, E. Atwell, R. Garside, and G. Leech. The Tagged LOB Corpus, User’s Manual. Norwegian Computing Center for the Humanities, Bergen, Norway, 1986. [9] G. Kim, V. Govindaraju, and S. Srihari. Architecture for handwritten text recognition systems. In S.-W. Lee, editor, Advances in Handwriting Recognition, pages 163– 172. World Scienti?c Publ. Co., 1999. [10] L. I. Kuncheva. Combining Pattern Classi?ers: Methods and Algorithms. John Wiley & Sons Inc, 2004. [11] U.-V. Marti and H. Bunke. Use of positional information in sequence alignment for multiple classi?er combination.

In J. Kittler, F. Roli (eds.): Multiple Classi?er Systems, MCS 2001, LNCS 2096, Springer, pages 388 – 398, 2001. U.-V. Marti and H. Bunke. Using a statistical language model to improve the performance of an HMM-based cursive handwriting recognition system. International Journal of Pattern Recognition and Arti?cial Intelligence, 15:65–90, 2001. U.-V. Marti and H. Bunke. The IAM-database: an English sentence database for of?ine handwriting recognition. International Journal on Document Analysis and Recognition, 5:39 – 46, 2002. N. Oza, R. Polikar, J. Kittler, and F. Roli, editors. Multiple Classi?er Systems, 6th International Workshop. Springer LNCS 3541, 2005. L. Rabiner. A tutorial on hidden Markov models and selected application in speech recognition. Proc. of the IEEE, 77(2):257–286, 1989. R. Rosenfeld. Two decades of statistical language modeling: Where do we go from here? Proc. of the IEEE, 88:1270–1278, 2000. A. Senior and A. Robinson. An off-line cursive handwriting recognition system. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(3):309–321, 1998. K. Sirlantzkis, M. Fairhurst, and M. Hoque. Genetic algorithm for multiple classi?er con?guration: A case study in character recognition. In J. Kittler and F. Roli, editors, 2nd International Workshop on Multiple Classi?er Systems (MCS), Cambridge, England, pages 99–108, 2001. A. Vinciarelli, S. Bengio, and H. Bunke. Of?ine recognition of unconstrained handwritten texts using HMMs and statistical language models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(6):709–720, 2004. A. Vinciarelli and J. Luettin. Off-line cursive script recognition based on continuous density HMM. In 7th International Workshop on Frontiers in Handwriting Recognition, Amsterdam, The Netherlands, pages 493–498, 2000. R. Wagner and M. Fischer. The string-to-string correction problem. Journal of the ACM, 21(1):168–173, 1974. X. Ye, M. Cheriet, and C. Y. Suen. Strcombo: combination of string recognizers. Pattern Recognition Letters, 23:381– 394, 2002. M. Zimmermann and H. Bunke. Hidden Markov model length optimization for handwriting recognition systems. In 8th International Workshop on Frontiers in Handwriting Recognition, Niagara-on-the-Lake, Canada, pages 369–374, 2002.


更多相关文档:

Combination of Multiple Handwritten Text Line Recognition ....pdf

Combination of Multiple Handwritten Text Line Recognition Systems with a Recursive Approach_专业资料。In this paper we propose a novel method to combine the...

Offline Handwritten English Character Recognition.pdf

Offline Handwritten English Character Recognition_...combination of HMMs), or printed character ...Burges. Lerec: A nn/hmm hybrid for on-line ...

A Study of Hidden Markov Models for Off-line Recognition of ....pdf

line Recognition of Handwritten Characters_专业资料...be achieved through a combination of multiple ...In the case of handwritten character recogc STW,...

Optimizing Feature Selection for Recognizing Handwritten ....pdf

Several features of the handwritten Arabic ...An off-line recognition system based on the ...or combination of dots (one, two, or three)....

...Incremental On-line Chinese Handwriting Recognition System....pdf

line Chinese Handwriting Recognition System_专业资料...ideographic text would use on-line handwritten ...ers) is implemented as a combination of two ...

...Methods and Support Vector Machines for handwriting recgn....pdf

Machines for handwriting recgnization_IT/计算机_...combination of supporting patterns, (which are the...handwritten word recognition system is as depicted ...

...Self-Organizing Principles for Large Vocabulary On-Line H_....pdf

Principles for Large Vocabulary On-Line H_专业...handwritten text requires robust recognition systems,...combination of bi- and trigraphs, as well as ...

An Analytical Handwritten Word Recognition System with Word-....pdf

An Analytical Handwritten Word Recognition System ...part of a character or a combination of ... Γ Char-level Training Rec(1) pos 96.1 83....

Learning prototypes for on-line handwritten digits.pdf

Learning prototypes for on-line handwritten digits_...For the system described here, recognition is ...ned as a linear combination j of the respective...

A Recognition and Verification Strategy for Handwritten Word ....pdf

The combination of these primitives plus a ...Handwritten numeral recognition based on multiple ...Bunke. Text line segmentation and word recognition...

Chaincode contour processing for handwritten word recognition....pdf

An illegal combination is one where the following of the chaincode slopes ...On-line Handwritten Wo... 13页 免费 Neural Network for Rec... 40...

Handwritten Word Recognition based on Structural ....pdf

in combination with the newly introduced radial ...system for handwritten character recognition'', In... Lexicon-based Word Rec... 5页 免费 A ...

A New Segmentation Approach for Handwritten Digits.pdf

system deal with the combination of segmentation ...Handwritten numeral recognition based on multiple ...Neural Network for Rec... 40页 5下载券 A ...

Training set expansion in handwritten character recognition_....pdf

images is tested in combination with an ...er for several values of k. A training set ...Bunke. O?-line, handwritten numeral recognition ...

READING CHECKS WITH MULTILAYER GRAPH TRANSFORMER NETWORKS_....pdf

reading several million handwritten and machine ...with a combination of successive pieces (called a...Burges. Lerec: A nn/hmm hybrid for on-line ...

...the Past and Future of Character and Document Recognition_....pdf

Freely handwritten character recognition techniques ...character line and, later, a full page image. ...Combination in Multiple Classifier Systems,” IEEE ...

On-line adaptation in recognition of handwritten alphanumeric....pdf

On-line adaptation in recognition of handwritten ...with a multi-writer system and without any ...After choosing the best combination of the ...

READING CHECKS WITH MULTILAYER GRAPH TRANSFORMER NETWORKS_....pdf

with a combination of successive pieces (called a...Burges. Lerec: A nn/hmm hybrid for on-line ... applied to handwritten zipcode recognition. ...

Feature selection for ensembles A hierarchical multi-....pdf

combination, one should deal with problems such ...Rate (%) 99.13 98.17 97.04 Rec.Rate 0.5...genetic algorithms for handwritten digit recognition....

On-line handwritten formula recognition using statistical ....pdf

On-line handwritten formula recognition using ... the system contains several mathematical sym; ; ...combination of the results as proposed in [6],...

更多相关标签:
网站地图

文档资料共享网 nexoncn.com copyright ©right 2010-2020。
文档资料共享网内容来自网络,如有侵犯请联系客服。email:zhit325@126.com