1. Machine Learning Approaches to Text Segmentation
Author(s):
Publication information: Scientia Iranica, Volume 13, No. 4, Winter
Number of pages: 9
Two machine learning approaches to text segmentation are introduced. The first is based on inductive learning in the form of a decision tree; the second uses the Naive Bayes technique. Training data for both the decision tree and the Naive Bayes Classifier (NBC) are generated from a wide range of compound text-image documents, which include machine-printed and handwritten text in different fonts and sizes. Eighteen Discrete Cosine Transform (DCT) coefficients are used as the main features for distinguishing text from images. The trained decision tree and NBC are tested on unseen documents with very promising results, although the latter method is more accurate and computationally faster. Finally, the results of the proposed approaches are compared with those of a wavelet-based approach, and both methods presented in this paper are shown to be more effective.
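
The pipeline the abstract describes (DCT coefficients of a block fed to a Naive Bayes classifier) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the 8x8 block size, the low-frequency-first coefficient ordering, the use of coefficient magnitudes, the Gaussian likelihood, and the synthetic "text" and "image" blocks are all assumptions made here, since the abstract does not specify them. The decision tree variant is not shown.

```python
import math
import random

N = 8  # block size: an assumption, the abstract does not state it

# Orthonormal DCT-II basis: C[k][n] = alpha(k) * cos((2n + 1) * k * pi / (2N))
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos((2 * n + 1) * k * math.pi / (2 * N))
      for n in range(N)] for k in range(N)]

# First 18 coefficient positions, ordered by anti-diagonal (low frequencies first).
# The paper's actual choice of 18 coefficients is not given; this is hypothetical.
ORDER = sorted(((r, c) for r in range(N) for c in range(N)),
               key=lambda rc: (rc[0] + rc[1], rc[0]))[:18]

def dct2(block):
    """2-D orthonormal DCT-II of an N x N block (list of rows)."""
    return [[sum(C[u][x] * C[v][y] * block[x][y]
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def features(block):
    """Magnitudes of the 18 selected DCT coefficients."""
    coeffs = dct2(block)
    return [abs(coeffs[r][c]) for r, c in ORDER]

def fit_gaussian_nb(X, y):
    """Per class: log prior, per-feature mean and variance."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cols = list(zip(*rows))
        means = [sum(col) / len(rows) for col in cols]
        variances = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-3
                     for col, m in zip(cols, means)]  # small floor avoids zero variance
        model[label] = (math.log(len(rows) / len(X)), means, variances)
    return model

def predict(model, x):
    """Return the class with the highest log posterior."""
    def log_post(label):
        prior, means, variances = model[label]
        return prior + sum(-0.5 * math.log(2 * math.pi * var)
                           - (v - m) ** 2 / (2 * var)
                           for v, m, var in zip(x, means, variances))
    return max(model, key=log_post)

def make_text_block(rng):
    # Synthetic stand-in for text: high-contrast black/white pixels.
    return [[rng.choice((0.0, 255.0)) for _ in range(N)] for _ in range(N)]

def make_image_block(rng):
    # Synthetic stand-in for a picture region: smooth gradient plus mild noise.
    base, gx, gy = rng.uniform(80, 180), rng.uniform(-4, 4), rng.uniform(-4, 4)
    return [[base + gx * x + gy * y + rng.uniform(-3, 3) for y in range(N)]
            for x in range(N)]

def demo(seed=0, n_train=40, n_test=20):
    """Train on synthetic blocks and return accuracy on fresh synthetic blocks."""
    rng = random.Random(seed)

    def sample(n):
        X, y = [], []
        for _ in range(n):
            X.append(features(make_text_block(rng)))
            y.append("text")
            X.append(features(make_image_block(rng)))
            y.append("image")
        return X, y

    X_train, y_train = sample(n_train)
    X_test, y_test = sample(n_test)
    model = fit_gaussian_nb(X_train, y_train)
    hits = sum(predict(model, x) == t for x, t in zip(X_test, y_test))
    return hits / len(y_test)

if __name__ == "__main__":
    print("accuracy on synthetic blocks:", demo())
```

The synthetic blocks are far easier to separate than real scanned documents, so the accuracy this toy reports says nothing about the results claimed in the paper; it only shows how DCT features and a Naive Bayes classifier fit together.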