Abstract
In recent years, research on machine learning and deep learning applications has attracted growing interest, and handwritten recognition is among the most significant of these applications because of its wide use, for example in Bangla OCR. Handwritten recognition helps digitize old forms and records and makes the information more reliable. Nevertheless, no single existing model can classify all types of Bangla characters. Handwritten scripts are a big challenge because every person has a unique writing style and characters vary in shape and size. EkushNet is the first research that recognizes Bangla handwritten basic characters, digits, modifiers, and compound characters together: 50 basic characters, 10 digits, 10 modifiers, and the 52 most frequently used compound characters. The proposed model is trained and validated on the Ekush dataset and cross-validated on the CMATERdb dataset. It achieves a recognition accuracy of 97.73% on the Ekush dataset, which is our proposed dataset, and a cross-validation accuracy of 95.01% on the CMATERdb dataset, which is so far the best accuracy for Bangla character recognition.
The proposed Ekush dataset contains Bangla modifiers, vowels, consonants, compound letters, and numerical digits. It consists of 367,018 isolated handwritten characters written by 3,086 unique writers, collected within Bangladesh. The dataset is freely available for any kind of academic research work.
LIST OF FIGURES
FIGURES |
FIGURE 1: A FLOW DIAGRAM OF “EKUSH” DATASET |
FIGURE 2: DATA COLLECTION FORM |
FIGURE 3: FORM WITH LABELS ON THE BOX |
FIGURE 4: SOME SCANNED FORMS ON “SIMPLE SCAN” |
FIGURE 5 (A): SCANNED FORM |
FIGURE 5 (B): CROPPED FORM |
FIGURE 6: EXAMPLE OF ROW-WISE CROPPING |
FIGURE 7: EXAMPLE OF SEPARATED CHARACTERS |
FIGURE 8: SOME EXAMPLES OF SMOOTHING IMAGES |
FIGURE 9: EXAMPLE OF UNNECESSARY INFORMATION IMAGES |
FIGURE 10: EXAMPLE OF INVERTED IMAGE |
FIGURE 11: GUI FOR AN AUTOMATIC PROCESS |
FIGURE 12: DATA FROM DIFFERENT AGES BASED ON GENDER |
FIGURE 13 (A): BANGLA MODIFIER ON EKUSH DATASET |
FIGURE 13 (B): BANGLA DIGIT ON EKUSH DATASET |
FIGURE 13 (C): BANGLA BASIC CHARACTERS ON EKUSH |
FIGURE 13 (D): BANGLA COMPOUND CHARACTERS ON EKUSH |
FIGURE 14: ARCHITECTURE OF EKUSHNET |
FIGURE 15 (A): TRAINING AND VALIDATION LOSS |
FIGURE 15 (B): TRAINING AND VALIDATION ACCURACY |
FIGURE 16 (A): ERROR FOR VALIDATION SET |
FIGURE 16 (B): ERROR FOR TEST SET |
LIST OF PUBLICATIONS
PUBLICATION |
A UNIVERSAL WAY TO COLLECT AND PROCESS HANDWRITTEN DATA FOR ANY LANGUAGE. |
BORNONET: BANGLA HANDWRITTEN CHARACTERS RECOGNITION USING CONVOLUTIONAL NEURAL NETWORK. |
EKUSHNET: USING CONVOLUTIONAL NEURAL NETWORK FOR BANGLA HANDWRITTEN RECOGNITION. |
BANGLA HANDWRITTEN DIGIT RECOGNITION USING CONVOLUTIONAL NEURAL NETWORK. |
EKUSH: A MULTIPURPOSE AND MULTITYPE COMPREHENSIVE DATABASE FOR ONLINE OFF-LINE BANGLA HANDWRITTEN CHARACTERS. |
ONKOGAN: BANGLA HANDWRITTEN DIGIT GENERATION WITH DEEP CONVOLUTIONAL GENERATIVE ADVERSARIAL NETWORKS. |
SHONKHANET: A DYNAMIC ROUTING FOR BANGLA HANDWRITTEN DIGIT RECOGNITION USING CAPSULE NETWORK. |
CHAPTER 1
INTRODUCTION
Automatic handwritten recognition has been one of the most important research fields in recent years because of its applications, such as OCR, which recognizes characters from images. Over the last few years, information has moved from handwritten hardcopy documents to digital file formats, which are more reliable. However, this shift covers only newly created documents, and a large number of older documents remain handwritten. Converting them by the traditional method of manual typing is a challenge because it takes a long time and needs a huge amount of manpower. An OCR process can digitize old information with less time and less manpower, so a robust handwritten character recognition model plays an important role. Despite this, Bangla handwritten character recognition has no strong model on which a robust Bangla OCR can be built, because no model yet classifies all kinds of characters (basic characters, numerals, modifiers, and compound characters). Many works exist, but they concentrate on digits [1], basic characters [2], or compound characters [3]. Dealing with handwritten characters is complicated because of their different shapes and styles. In addition, the arrangement of Bangla characters is complex because of their alignment, and many of them look similar, as do the compound characters that combine basic characters.
The fundamentals of one language differ from those of another. For example, Latin scripts differ from Bangla because Bangla derives from Sanskrit scripts. Written Bangla has 50 basic characters, 10 numerical digits, more than 200 compound characters, and 10 modifiers.
Bangla is the 4th most popular language in the world. It is the first language of Bangladesh and has a rich heritage. UNESCO declared February 21st International Mother Language Day to honor those martyred for the Bangla language in Bangladesh in 1952. Bangla is also the second most popular language in the Indian subcontinent. Overall, about 300 million people write and speak Bangla. Considering all this, Bangla handwritten character recognition can help these people in many ways, such as Bangla traffic number plate recognition, automatic postal code identification, data extraction from hardcopy forms, automatic ID card reading, automatic reading of bank cheques, digitalization of documents, and forensic analysis of handwriting.
There are five chapters in this research paper. They are Introduction, Literature Review, Proposed Research, Results and Discussion, Conclusion and Future.
Chapter one: Introduction; Objective, Motivation, Expected Outcome, Report layout.
Chapter two: Literature Review; Sentiment Analysis, Related works, challenges.
Chapter three: Proposed Research; Research Methodology, Data Collection, Data Processing, Flow Model, Experimental layout.
Chapter four: Results and Discussion; Experimental Result, Discussion.
Chapter five: Conclusion and Future; Conclusion, Future Scope.
CHAPTER 2
LITERATURE REVIEW
Handwritten recognition is currently one of the most interesting topics because of its academic and commercial relevance in different research fields. Much research has been conducted on Bangla handwriting and on other languages such as English, Arabic, Hindi, and Chinese. This chapter reviews related work, summarizes this research, outlines the scope of the problem, and finally presents the challenges of this research.
Past studies on handwritten character recognition in other languages, such as Latin [4], Chinese [5], and Japanese [6], have achieved great success. Only a few works are available for Bangla handwritten basic character, digit, and compound character recognition. Some literature on Bangla character recognition has appeared in past years, such as “A complete printed Bangla OCR system” [7] and “On the development of an optical character recognition (OCR) system for printed Bangla script” [8]. There is also some research on handwritten Bangla numeral recognition that reaches the desired recognition accuracy. Pal et al. conducted exploratory work on recognizing handwritten Bangla characters, including “Automatic recognition of unconstrained offline Bangla hand-written numerals” [9], “A system towards Indian postal automation” [10], and “Touching numeral segmentation using water reservoir concept” [11]. These schemes are mainly based on features extracted with a concept called the water reservoir. Several other Bangla handwritten character recognition works have also achieved good success. Halima Begum et al., in “Recognition of Handwritten Bangla Characters using Gabor Filter and Artificial Neural Network” [12], worked with their own dataset collected from 95 volunteers. Their proposed model achieved recognition rates of around 68.9% without feature extraction and 79.4% with feature extraction. “Recognition of Handwritten Bangla Basic Character and Digit Using Convex Hull Basic Feature” [13] achieved an accuracy of 76.86% for Bangla characters and 99.45% for Bangla numerals. “Bangla Handwritten Character Recognition using Convolutional Neural Network” achieved 85.36% test accuracy on its authors' own dataset.
In “Handwritten Bangla Basic and Compound character recognition using MLP and SVM classifier” [14], the authors recognized handwritten Bangla basic and compound characters with MLP and SVM classifiers and achieved recognition rates of around 79.73% and 80.9%, respectively.
The dataset plays a vital role in recognizing handwritten characters. Three open-access datasets are available for Bangla characters: BanglaLekha-Isolated [15], CMATERdb [16], and ISI [17]. However, every dataset has some drawbacks. The BanglaLekha-Isolated dataset consists of a total of 166,105 square images (preserving the aspect ratio of the characters), each containing a sample of one of 84 different Bangla characters in 3 categories: 10 numeral digits, 50 basic characters, and 24 compound characters. Of the other two datasets, CMATERdb also has 3 categories, for basic characters, numerals, and compound characters, and the ISI dataset has two separate datasets, for basic characters and numerals.
Our goal is to build a model that can recognize Bangla handwritten digits using a Convolutional Neural Network trained on Ekush and other Bangla character datasets. The Convolutional Neural Network (CNN) has opened new opportunities in pattern recognition for classification and has helped numerous researchers implement state-of-the-art systems. The CNN structure was first proposed by Fukushima et al. in 1980 [18], but it was not widely used because the algorithm was complex. In the 1990s, LeCun et al. applied a gradient-based learning algorithm to CNNs and achieved successful results [19]. After that, many researchers improved CNNs and obtained good results in pattern recognition. A few years ago, Cireşan et al. [20] applied multi-column CNNs to recognize digits, alphanumerals, traffic signs, and other object classes.
The Ekush dataset consists of 367,018 images in 122 classes, which makes it the largest dataset of Bangla characters to date. Table 1 compares Ekush with the three popular Bangla handwriting datasets (BanglaLekha-Isolated, CMATERdb, and the ISI handwriting dataset).
Table 1: Number of images in different datasets
Dataset Name | Modifiers | Basic Characters | Compound Characters | Numeral | Total |
ISI | None | 30,966 | None | 23,299 | 34,256 |
CMATERdb | None | 15,103 | 42,248 | 6,000 | 63,351 |
BanglaLekha-Isolated | None | 98,950 | 47,407 | 19,748 | 166,105 |
Ekush | 30,667 | 154,824 | 150,840 | 30,687 | 367,018 |
The proposed method achieves a satisfactory recognition accuracy of 97.73% on the Ekush dataset and a cross-validation accuracy of 95.01% on the CMATERdb dataset, which is so far the best accuracy for Bangla character recognition.
CHAPTER 3
PROPOSED RESEARCH
Ekush is a dataset of Bangla handwritten characters that can be used for multiple purposes. This structured and organized dataset of isolated Bangla handwritten characters was collected from 3,086 people, including university, college, and school students, of whom approximately 50% (1,510) were male and 50% (1,576) were female. The characters were written on a form, which was then scanned to obtain the data in JPEG format. We also addressed several issues and requirements of handwritten data collection, such as form creation, data collection methodology, processing, software, and relevant tools. Figure 1 shows a flowchart of the creation of the Ekush dataset.
Figure 1: A flow diagram of “Ekush” dataset
In recent years, research based on machine learning and deep learning has attracted much interest, and handwritten recognition is one such area. Handwritten recognition is difficult because datasets are scarce and collecting data from people is hard. This research introduces a fast and comprehensive way to collect and process handwritten data for developing Handwritten Recognition (HWR) algorithms for any language. Handwritten characters are written on paper and then scanned to obtain the data in JPEG format. We also addressed other issues and requirements of collecting handwritten data: form creation, data collection methodology, processing, software, and relevant tools. We describe these issues in the context of our own effort to create a handwritten database for the Bangla language. Our Graphical User Interface (GUI) can process 100 scanned images per minute, where each scanned image contains 120 characters. With this method, we developed a system that helps future researchers collect and process handwritten data so that they can use it directly in their handwritten recognition research and in many applications.
Our system is built with popular, convenient tools such as Python [21], OpenCV [22], PIL [23], and PyQT [24], which makes it easy for other researchers to modify the process as they wish. Our proposed form collects at least 120 characters on a single sheet. Once the form is scanned, all 120 characters are pre-processed into different formats, namely JPEG, grayscale, inverted, and CSV, for multipurpose use. An example form is shown in Figure 2.
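The grayscale, inverted, and CSV formats can be illustrated with a minimal sketch. It is written in plain Python so that it runs without OpenCV or PIL. The function names `to_grayscale`, `invert`, and `to_csv_row`, the luma weights, and the label-first CSV layout are assumptions made for illustration and are not taken from the Ekush pipeline.

```python
import csv
import io


def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples into rows of 0-255 gray values."""
    # ITU-R BT.601 luma weights: an assumed, commonly used choice.
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
            for row in rgb_image]


def invert(gray_image):
    """Swap dark and light so ink becomes bright on a dark background."""
    return [[255 - value for value in row] for row in gray_image]


def to_csv_row(gray_image, label):
    """Flatten an image row by row into one CSV line, with the label first."""
    flat = [value for row in gray_image for value in row]
    buffer = io.StringIO()
    csv.writer(buffer).writerow([label] + flat)
    return buffer.getvalue().strip()


# Toy 2x2 "character" image: white paper with two black ink pixels.
sample = [[(255, 255, 255), (0, 0, 0)],
          [(0, 0, 0), (255, 255, 255)]]
gray = to_grayscale(sample)
inverted = invert(gray)
row = to_csv_row(inverted, label=7)  # the label 7 is a hypothetical class id
```

Storing the label in the first column keeps each sample on a single line, which is a common layout for loading handwritten character data into a classifier.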
Figure 2: Data collection Form
First, we create the form, which is tricky. All cells must be created carefully with equal size, which later helps to separate the characters automatically. Every box needs a label under or above it to tell people which character to write in that box. If there are many characters, the most frequently used ones should be placed on the first page of the form. In our experience, placing the label under the box is more efficient. It is also essential to print the form with a good-quality printer or photocopy machine. Otherwise, the printed labels will be unclear and volunteers will fill in the form incorrectly. Figure 3 shows 2 examples of labeling.
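The benefit of equal-size cells can be shown with a short sketch: when every cell has the same width and height, the crop box of each character follows from simple arithmetic. The 12 x 10 layout, the pixel dimensions, the `margin` parameter, and the function name `cell_boxes` are hypothetical. The text only states that a form holds 120 characters.

```python
def cell_boxes(form_width, form_height, rows, cols, margin=0):
    """Return (left, top, right, bottom) boxes for an equal-size grid, row by row."""
    cell_w = form_width // cols
    cell_h = form_height // rows
    boxes = []
    for r in range(rows):
        for c in range(cols):
            # The margin trims the printed cell border from every side.
            left = c * cell_w + margin
            top = r * cell_h + margin
            right = (c + 1) * cell_w - margin
            bottom = (r + 1) * cell_h - margin
            boxes.append((left, top, right, bottom))
    return boxes


# Hypothetical cropped form of 2000 x 2400 pixels with 12 rows of 10 cells.
boxes = cell_boxes(form_width=2000, form_height=2400, rows=12, cols=10, margin=5)
```

Each box could then be passed to an image library's crop function to cut out one character. If the cells were unequal, every box would have to be located individually.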
Figure 3: Form with (a) labeled under the box, (b) labeled above the box.
The next step is scanning the forms, for which users can choose any application that is efficient for them. Typically, a Windows scan application takes at least 1 minute to scan 1 form, and most Windows scan applications do not let the user save many images at the same time. For this reason, we used the “Simple Scan” application on Ubuntu at 300 dpi, which takes 7-15 seconds to scan a single page and also offers options for saving and cropping the images to remove the extra black area produced by the scanner. Figure 4 shows some examples of scanned forms.
Figure 4: Some scanned form on “Simple Scan”
While scanning, it is not possible to place every sheet in exactly the same position. To overcome this issue, we print a thick black boundary on the form, which makes it possible to find the biggest contour using Canny edge detection [25], correct the skew of the page, and crop all images to the same shape, size, and angle, as described in Algorithm 1. Figure 5 shows an example.
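The geometric part of this step can be sketched in plain Python. The sketch is not a transcription of Algorithm 1 and performs no edge detection. It assumes the four corner points of the black boundary have already been obtained, for example from the biggest contour, and only shows how a skew angle and a crop box could be derived from them. All function names and the sample coordinates are hypothetical.

```python
import math


def rotate_point(point, center, angle_degrees):
    """Rotate a point about a center by the given angle."""
    a = math.radians(angle_degrees)
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (center[0] + dx * math.cos(a) - dy * math.sin(a),
            center[1] + dx * math.sin(a) + dy * math.cos(a))


def order_corners(points):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left."""
    by_sum = sorted(points, key=lambda p: p[0] + p[1])
    by_diff = sorted(points, key=lambda p: p[0] - p[1])
    # Image coordinates: y grows downward, so top-left has the smallest x + y.
    return by_sum[0], by_diff[-1], by_sum[-1], by_diff[0]


def skew_angle_degrees(corners):
    """Angle of the top edge relative to the horizontal axis."""
    top_left, top_right, _, _ = order_corners(corners)
    return math.degrees(math.atan2(top_right[1] - top_left[1],
                                   top_right[0] - top_left[0]))


def deskewed_crop_box(corners):
    """Return the skew angle and the crop box after undoing the skew."""
    angle = skew_angle_degrees(corners)
    center = (sum(p[0] for p in corners) / 4.0, sum(p[1] for p in corners) / 4.0)
    upright = [rotate_point(p, center, -angle) for p in corners]
    xs = [p[0] for p in upright]
    ys = [p[1] for p in upright]
    return angle, (round(min(xs)), round(min(ys)), round(max(xs)), round(max(ys)))


# Simulate a page border that was scanned with a 3 degree skew.
upright_corners = [(100, 100), (1100, 100), (1100, 1500), (100, 1500)]
skewed = [rotate_point(p, (600, 800), 3.0) for p in upright_corners]
angle, box = deskewed_crop_box(skewed)
```

The corner ordering relies on the skew being small, which is a reasonable assumption for a sheet placed by hand on a flatbed scanner. In a full pipeline the same angle would be used to rotate the image itself before cropping.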
Algorithm 1: Find the Biggest Contour and Crop