I've run into a hang-up (maybe) with an OCR project I am working on. I cannot seem to get the network I am using to converge, or at least not within what I had thought was a reasonable amount of time. Undoubtedly the error is BTKATC. I'm just unsure of what direction to take at the moment, and thought it couldn't hurt to ask the community for advice while I continue tinkering and researching.
The network I am using is an activation network using the bipolar sigmoid function. I've set the function's alpha parameter to various values (2.00, 1.00, 0.50, 0.10, and 0.01). I am using back propagation learning, as well.
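For anyone unfamiliar with the activation function, here is a minimal Python sketch of the usual bipolar sigmoid, f(x) = 2 / (1 + exp(-alpha * x)) - 1. It is an illustration only, not the library's code, and it assumes the library uses this standard definition; the function names are made up for the example.

```python
import math


def bipolar_sigmoid(x: float, alpha: float = 2.0) -> float:
    """Bipolar sigmoid, 2 / (1 + exp(-alpha * x)) - 1, with outputs between -1 and 1.

    Written in the algebraically equivalent form tanh(alpha * x / 2) so that
    large negative inputs do not overflow math.exp.
    """
    return math.tanh(alpha * x / 2.0)


def bipolar_sigmoid_derivative(y: float, alpha: float = 2.0) -> float:
    """Derivative of the bipolar sigmoid, expressed in terms of its output y = f(x)."""
    return alpha * (1.0 - y * y) / 2.0


if __name__ == "__main__":
    # The slope at the origin is alpha / 2, so a small alpha gives a very flat curve.
    for alpha in (2.00, 1.00, 0.50, 0.10, 0.01):
        slope_at_zero = bipolar_sigmoid_derivative(bipolar_sigmoid(0.0, alpha), alpha)
        print(f"alpha={alpha:<5} slope at 0 = {slope_at_zero}")
```

Two properties follow directly from the formula: the slope at the origin is alpha / 2, and the output only approaches -1 and 1 asymptotically, so the derivative shrinks toward zero as an output gets close to either extreme.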
I am classifying 84 different characters (upper and lower case alphabet, 0 through 9, and other extended characters). I have 20 samples of each character, for a total of 1680 samples. The samples are saved as 15x20 bitmaps, and are mapped to vectors of size 300 to be used as inputs to the network. Background/foreground value representations of -1/1 and -0.5/0.5 have been used.
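To make the input encoding concrete, the bitmap-to-vector mapping can be sketched roughly as below. This is illustrative Python, not the project's actual code: it assumes 15 is the width and 20 the height, a row-major pixel order, and a bitmap already reduced to foreground/background booleans. The function name is hypothetical.

```python
from typing import List, Sequence

WIDTH, HEIGHT = 15, 20  # assumed orientation: 15 columns by 20 rows


def bitmap_to_vector(
    bitmap: Sequence[Sequence[bool]],
    fg: float = 1.0,
    bg: float = -1.0,
) -> List[float]:
    """Flatten a HEIGHT x WIDTH boolean bitmap into a 300-element input vector.

    True pixels (foreground) become `fg`, False pixels (background) become `bg`.
    Rows are concatenated top to bottom (row-major order).
    """
    if len(bitmap) != HEIGHT or any(len(row) != WIDTH for row in bitmap):
        raise ValueError(f"expected a {HEIGHT}x{WIDTH} bitmap")
    return [fg if pixel else bg for row in bitmap for pixel in row]


if __name__ == "__main__":
    sample = [[False] * WIDTH for _ in range(HEIGHT)]
    sample[1][2] = True  # one foreground pixel in row 1, column 2
    vector = bitmap_to_vector(sample)              # -1/1 convention
    half = bitmap_to_vector(sample, 0.5, -0.5)     # -0.5/0.5 convention
    print(len(vector), vector[1 * WIDTH + 2], half[1 * WIDTH + 2])
```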
For the hidden layer, I've tried 8, 10, 15, and 20 neurons.
The output layer consists of 84 neurons (one neuron for each class), with -1/1 or -0.5/0.5 convention being used depending on what was used for the input. E.g., if -1/1 was used for background/foreground in the input, then class one would look like: {1, -1, -1, -1, ..., -1}.
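The target encoding described above can be sketched as follows. Again this is illustrative Python with hypothetical names, not the project's code. The decoding step (take the neuron with the largest output as the predicted class) is an assumption about how the outputs would be read; the post does not say how they are interpreted.

```python
from typing import List, Sequence

NUM_CLASSES = 84


def encode_target(
    class_index: int,
    num_classes: int = NUM_CLASSES,
    on: float = 1.0,
    off: float = -1.0,
) -> List[float]:
    """Build the target vector for one class: `on` at class_index, `off` elsewhere."""
    if not 0 <= class_index < num_classes:
        raise ValueError("class_index out of range")
    return [on if i == class_index else off for i in range(num_classes)]


def decode_output(outputs: Sequence[float]) -> int:
    """Assumed decoding rule: the index of the largest output is the predicted class."""
    return max(range(len(outputs)), key=lambda i: outputs[i])


if __name__ == "__main__":
    first_class = encode_target(0)
    print(first_class[:4])                 # first four entries of the class-one target
    print(decode_output(encode_target(5, on=0.5, off=-0.5)))
```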
Originally I was relying on the error value returned by network.RunEpoch(). The errors ranged from 40 to 6000, even after a large number of epochs (100000+). I eventually built a very rudimentary validation framework to help guard against overfitting: each epoch, I held out a portion of the samples for validation testing (mostly a 60:40 training-to-validation ratio). Validation error never got below 96%, again even after many thousands of epochs of training.
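For reference, a hold-out split and a misclassification rate can be sketched as below. This is illustrative Python with hypothetical names, and it differs from the setup described above in one respect: it picks a fixed, per-class split once rather than removing samples each epoch. It also computes the error rate expected from uniform random guessing over balanced classes, which gives a baseline to read the 96% figure against, assuming that figure is a misclassification rate.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_per_class(
    labels: Sequence[int],
    train_fraction: float = 0.6,
    seed: int = 0,
) -> Tuple[List[int], List[int]]:
    """Split sample indices into training and validation sets, class by class.

    Each class contributes the same fraction of its samples to training, so
    every class is represented in both sets.
    """
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train: List[int] = []
    validation: List[int] = []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        validation.extend(indices[cut:])
    return train, validation


def error_rate(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of samples whose predicted class differs from the actual class."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(actual)


def chance_error_rate(num_classes: int) -> float:
    """Expected error rate of uniform random guessing over balanced classes."""
    return 1.0 - 1.0 / num_classes


if __name__ == "__main__":
    labels = [c for c in range(84) for _ in range(20)]  # 84 classes x 20 samples
    train, validation = split_per_class(labels)
    print(len(train), len(validation))
    print(f"chance error for 84 classes: {chance_error_rate(84):.1%}")
```

With 84 balanced classes, random guessing is wrong 83 times out of 84, or about 98.8% of the time.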
To me, those results don't seem right and smell like either the architectures chosen were poor or the training samples weren't conditioned right (or perhaps were too few). Am I correct, or am I just not being patient enough and should allow the network to train longer?
Thanks in advance for any help. If I discover anything new, or if I solve my problem, I will be sure to post a reply.