AForge.NET


Yet another EBP-FF network and OCR problem

The forum is to discuss topics from different artificial intelligence areas, like neural networks, genetic algorithms, machine learning, etc.


Postby matt.andrsn » Sat Jul 03, 2010 2:09 am

I've run into a hang-up (maybe) with an OCR project I am working on. I cannot seem to get the network I am using to converge, or at least not within what I had thought was a reasonable amount of time. Undoubtedly the error is between the keyboard and the chair. I'm just unsure of what direction to take at the moment and thought it couldn't hurt to ask the community for advice while I continue tinkering and researching.

The network I am using is an activation network with the bipolar sigmoid activation function. I've tried various values for the function's alpha parameter (2.00, 1.00, 0.50, 0.10, and 0.01). I am using backpropagation learning as well.
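For reference, a minimal Python sketch of the bipolar sigmoid (an illustrative stand-in for the framework's built-in activation function, not AForge.NET code). It maps any real input into (-1, 1); alpha controls the steepness:

```python
import math

def bipolar_sigmoid(x, alpha=2.0):
    """Bipolar sigmoid: f(x) = 2 / (1 + e^(-alpha*x)) - 1.

    Output lies in (-1, 1), matching the -1/1 input convention.
    Larger alpha gives a steeper transition around x = 0.
    """
    return 2.0 / (1.0 + math.exp(-alpha * x)) - 1.0
```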

I am classifying 84 different characters (upper and lower case alphabet, 0 through 9, and other extended characters). I have 20 samples of each character, for a total of 1680 samples. The samples are saved as 15x20 bitmaps, and are mapped to vectors of size 300 to be used as inputs to the network. Background/foreground value representations of -1/1 and -0.5/0.5 have been used.
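A sketch of the bitmap-to-vector mapping described above (hypothetical helper name; assumes the bitmap is already binarized as rows of 0/1 values):

```python
def bitmap_to_vector(pixels, fg=1.0, bg=-1.0):
    """Flatten a binary bitmap (list of rows of 0/1) into an input
    vector, mapping foreground pixels to fg and background pixels
    to bg. For a 15x20 image this yields a length-300 vector."""
    return [fg if p else bg for row in pixels for p in row]
```

The same helper covers both conventions mentioned, e.g. `bitmap_to_vector(img, fg=0.5, bg=-0.5)` for the -0.5/0.5 representation.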

For the hidden layer, I've tried 8, 10, 15, and 20 neurons.

The output layer consists of 84 neurons (one neuron for each class), with -1/1 or -0.5/0.5 convention being used depending on what was used for the input. E.g., if -1/1 was used for background/foreground in the input, then class one would look like: {1, -1, -1, -1, ..., -1}.
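The one-of-N bipolar target encoding described above can be sketched as (hypothetical helper name):

```python
def target_vector(class_index, num_classes=84, on=1.0, off=-1.0):
    """One-of-N bipolar target: 'on' at the class position, 'off'
    everywhere else, e.g. class 0 -> {1, -1, -1, ..., -1}."""
    return [on if i == class_index else off for i in range(num_classes)]
```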

Originally I was relying on the error value returned by network.RunEpoch(). The errors ranged from 40 to 6000, even after many thousands of epochs (100,000+). I eventually built a very rudimentary validation framework to help guard against overfitting: I held out a portion of the samples for validation testing (mostly a 60:40 training-to-validation split). The validation error rate never dropped below 96%, again even after thousands of epochs of training.
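A rudimentary hold-out split along these lines could look like the following (a minimal sketch with a hypothetical helper name, assuming the samples are shuffled before splitting):

```python
import random

def split_samples(samples, train_ratio=0.6, seed=0):
    """Shuffle and split a sample list into training and validation
    sets; train_ratio=0.6 gives the 60:40 split mentioned above."""
    rng = random.Random(seed)  # fixed seed for reproducible splits
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```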

To me, those results don't seem right; they suggest that either the architectures I chose were poor or the training samples weren't conditioned properly (or perhaps there were too few of them). Am I correct, or am I just not being patient enough and should let the network train longer?

Thanks in advance for any help. If I discover anything new, or if I solve my problem, I will be sure to post a reply.
matt.andrsn
 
Posts: 4
Joined: Sat Jul 03, 2010 12:47 am

Re: Yet another EBP-FF network and OCR problem

Postby cesarsouza » Sat Jul 03, 2010 3:58 am

Hi Matt,

Instead of training a single network with many output nodes, have you considered training many networks with a single output node, each one trying to classify a pattern against all others? This may not be the best approach, but perhaps could give you better results.

Anyway, perhaps you should add some form of preprocessing or feature-extraction step instead of feeding the characters to the network directly. Have you already seen the Neural Network OCR article on CodeProject?

Regards,
César
cesarsouza
 
Posts: 63
Joined: Fri Apr 10, 2009 3:41 pm

Re: Yet another EBP-FF network and OCR problem

Postby matt.andrsn » Sat Jul 03, 2010 12:54 pm

Thanks for the reply, César!

No, I had not thought about using multiple networks. That is an intriguing idea that may be worth looking into. I suppose, as far as classification is concerned, you would just feed an input to each network, and whichever network returned the strongest positive response would indicate the most likely label for the input.

With regard to preprocessing, at one point I was doing stroke thinning (skeletonization) using Rosenfeld's algorithm, but found that it was eliminating crucial strokes (such as the dots in 'i' and 'j'). I had thought about cropping images to just the foreground, but was unsure how to feed different-sized images into a neural network (without resizing or padding to ensure a constant size, in which case I might as well leave the images as they are).

I had read the article on CodeProject (the first part of that article is what I based my original work on). However, I thought I would try my hand at the traditional method first before moving on. I'm not opposed to trying something else; I just wanted to make sure I wasn't doing something silly before abandoning what I had done so far.

Another idea I was mulling over, though I don't know how well it will work, would be to create a histogram of the dark pixels in each column, normalize the histogram, and feed the histogram into the network.
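That per-column histogram feature could be sketched as follows (hypothetical helper name; assumes dark/foreground pixels are stored as 1 and the histogram is normalized by its peak count):

```python
def column_histogram(pixels):
    """Count dark (value 1) pixels in each column of a binary bitmap
    and normalize the counts to [0, 1] by the tallest column."""
    cols = len(pixels[0])
    counts = [sum(row[c] for row in pixels) for c in range(cols)]
    peak = max(counts) or 1  # avoid division by zero on blank images
    return [c / peak for c in counts]
```

This reduces a 15x20 bitmap to a 15-element input vector, though it discards all vertical position information.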
matt.andrsn

Re: Yet another EBP-FF network and OCR problem

Postby matt.andrsn » Sat Jul 03, 2010 4:09 pm

Just wanted to quickly post that, on a whim, I decided to further reduce the input image size from 15x20 to 12x16. Validation errors are now steadily dropping (they started at about 90% and have dropped to about 65-70%).

I still intend to look at the other options that have been discussed, but I am guessing that the network was a bit too complex (which is what César was probably getting at in his response).
matt.andrsn

Re: Yet another EBP-FF network and OCR problem

Postby matt.andrsn » Sat Jul 03, 2010 5:19 pm

I found another problem, this time in my vectorization algorithm: I was inadvertently assigning some of the foreground pixels the value used for background pixels. I had forgotten that once I resized the original images they were no longer monochrome (originally I had been checking just one color channel for a value of 0, i.e., black).
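One way to avoid that pitfall is to threshold on brightness instead of testing a channel for exactly 0. A minimal sketch (hypothetical helper; the luminance weights are the standard Rec. 601 coefficients, and the threshold value is an assumption):

```python
def is_foreground(r, g, b, threshold=128):
    """Classify a pixel as foreground by its luminance. Resizing
    introduces anti-aliased gray pixels, so an exact test against 0
    misses real foreground; a threshold catches them."""
    luminance = 0.299 * r + 0.587 * g + 0.114 * b
    return luminance < threshold
```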
matt.andrsn



