Here it is. The Handwritten Character Recognizer program.
The mother of all Neural Nets! I do not know how well it works yet.
I am continuing to test it.

I have not made any changes to the harness, so I have not included the
files that came with the package.

Files in the directory:

README.TXT	- This file.
RECOG.C 	- The source code for the character recognizer.
RECOG.OBJ	- The source compiled with Borland C++ 3.0 with optimizations.
HARNESS.EXE	- The executable program.
AUTOEXEC.BAT	- My AUTOEXEC.BAT file.
CONFIG.SYS	- My CONFIG.SYS file.
TURBO_C.MAK	- My non-optimizing MAKE file (I change the directories).
BORLANDC.MAK	- My optimizing MAKE file (I found the switches that do not
		  crash!!!).
M.BAT		- Runs the non-optimizing MAKE.
MO.BAT		- Runs the optimizing MAKE.
TURBO_C.LNK	- Link file used by MAKE (I changed the directories).
RAM.BAT 	- Copies the files to the D:\ (RAM) drive.


This is a Neural Net using the delta rule: a multi-layer perceptron.
At the moment the net uses 148 features; the first hidden layer has
440 nodes, the second hidden layer has 440 nodes, and the third
(output) layer has 65 nodes, one for each character.

The output is one if it is the character, or zero if it is not. Since the
harness takes only integers, I multiply the value by 1000, so the outputs
are between 0 and 1000. An output over 900 is pretty likely the right
choice for the highest score; however, in my testing I have seen correct
answers score as low as 300, though the confidence there is not very good.
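The integer scaling for the harness can be sketched like this (my own
helper, not necessarily how RECOG.C does it):

```c
/* Scale a 0..1 activation to the 0..1000 integer range the harness wants.
 * The rounding and clamping are my assumptions, not RECOG.C's exact code. */
static int scale_output(double activation)
{
    int score = (int)(activation * 1000.0 + 0.5);  /* round to nearest */
    if (score < 0)    score = 0;
    if (score > 1000) score = 1000;
    return score;
}
```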

I discovered that I do not have enough hidden nodes. I assumed
that 200 and 200 would be enough. None of my references give any
indication as to how many hidden nodes to have. When there are not
enough hidden nodes the net will not converge, and the number must be
raised. I will be testing over the next couple of days to find a good
number of hidden nodes. Unfortunately this weekend I have to go out of
town for a wedding.

The GAIN term is probably O.K. In theory, a higher gain converges faster
if the error function is smooth, but the data is such an unknown that we
cannot assume that. The theory says that for the net to converge the
steps must be very, very small; a gain of 0.25 or 0.3 is what most people use.
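As a sketch of the delta-rule weight update that GAIN controls (assuming
a sigmoid activation, whose derivative is out*(1-out); the names are
mine, not RECOG.C's):

```c
#define GAIN 0.25  /* learning rate; 0.25 to 0.3 per the text */

/* One delta-rule step for a single node: w[i] += GAIN * delta * in[i],
 * where delta = (target - out) * out * (1 - out) for a sigmoid node. */
static void delta_update(double *w, double *bias,
                         const double *in, int n_in,
                         double out, double target)
{
    double delta = GAIN * (target - out) * out * (1.0 - out);
    int i;
    for (i = 0; i < n_in; i++)
        w[i] += delta * in[i];
    *bias += delta;
}
```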

The learning is slooooooow. A 486 is a must! There is a lot of floating-
point math, and the built-in coprocessor makes a big difference!
My 33 MHz 386 is a little over 100 times slower than my 33 MHz 486!
The other advantage of the 486 is the built-in cache. There are
many tight summing loops, which is what an internal cache was built for.
A 486 DX2 with an internal clock of 50 MHz will probably run very close to
twice as fast as a 25 MHz 486, because of these tight loops.

The weight arrays are so big that they will not fit in memory, and I did
not have enough time to screw around with Extended Memory. Instead I use
block reads and writes from disk, which slows things down a lot. You
need a big disk cache like SmartDrive or a big RAM disk.
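The block-read/write approach might look something like this sketch (the
file layout, offsets, and names are my invention, not what RECOG.C
actually does):

```c
#include <stdio.h>

/* Pull one layer-sized block of weights in from a file on disk; a disk
 * cache or RAM disk is what makes this bearable. Returns 1 on success. */
static int load_weight_block(const char *path, long offset,
                             double *buf, size_t count)
{
    FILE *fp = fopen(path, "rb");
    size_t got;
    if (fp == NULL)
        return 0;
    if (fseek(fp, offset, SEEK_SET) != 0) { fclose(fp); return 0; }
    got = fread(buf, sizeof(double), count, fp);
    fclose(fp);
    return got == count;
}

/* Write one block back at the same offset, creating the file if needed. */
static int save_weight_block(const char *path, long offset,
                             const double *buf, size_t count)
{
    FILE *fp = fopen(path, "r+b");
    size_t put;
    if (fp == NULL)
        fp = fopen(path, "w+b");
    if (fp == NULL)
        return 0;
    if (fseek(fp, offset, SEEK_SET) != 0) { fclose(fp); return 0; }
    put = fwrite(buf, sizeof(double), count, fp);
    fclose(fp);
    return put == count;
}
```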

I have included the AUTOEXEC.BAT and CONFIG.SYS from the floppy I boot
from, so I do not mess up my hard disk startup when I have to do *real*
work.

I have also included a batch file, RAM.BAT, that copies the relevant
files to the RAM disk for testing. I have been using a 4MB RAM disk and
have not had any problems, and it is much faster. I think using a normal
disk would wear it out.

The *.MAK and *.LNK files are how I compile and link. The optimizer gave
me floating-point domain errors when I used the -O2 optimize-for-speed
switch; I found the combination of switches that stops it from
happening. The optimizer makes a big difference in learning speed, too.

The code is ugly. I have not had time to clean it up! Getting it to
work is more important! The memory problems cost me this whole last week.

The program writes a file called C:\SAVENET.KIP to store the
intermediate results of the net. When it starts up it checks whether it
has a valid copy; if so, it loads it, and if not it starts from scratch.
By training on all of the data I have so far, I will try to create a
SAVENET.KIP that may help the net converge faster for other people. This
may get the net into the neighborhood, so it converges faster on a new
set. We will see. The books say it should converge after anywhere from
50 to 5637 passes through the data! Good luck.
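I am not showing SAVENET.KIP's real layout here, but a "valid copy"
check like the one described usually amounts to a magic header written
in front of the weights; here is a sketch (the magic string and all
names are made up, not SAVENET.KIP's actual format):

```c
#include <stdio.h>
#include <string.h>

#define NET_MAGIC "KIPNET01"   /* invented marker, includes its NUL byte */

/* Write the magic header, then the weights. Returns 1 on success. */
static int save_net(const char *path, const double *w, size_t count)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL) return 0;
    fwrite(NET_MAGIC, 1, sizeof(NET_MAGIC), fp);
    fwrite(w, sizeof(double), count, fp);
    fclose(fp);
    return 1;
}

/* Returns 1 and fills w only if the file exists and the header matches;
 * otherwise returns 0 and the caller starts from scratch. */
static int load_net(const char *path, double *w, size_t count)
{
    char magic[sizeof(NET_MAGIC)];
    FILE *fp = fopen(path, "rb");
    int ok = 0;
    if (fp == NULL) return 0;
    if (fread(magic, 1, sizeof(magic), fp) == sizeof(magic)
        && memcmp(magic, NET_MAGIC, sizeof(magic)) == 0
        && fread(w, sizeof(double), count, fp) == count)
        ok = 1;
    fclose(fp);
    return ok;
}
```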

The system is easy to extend, if you are so inclined. The
Calculate_Features subroutine calculates all of the features.
If you add more features, just increase MAX_FEATURES.
To be safe, a feature should be a value between -1 and 1. Other values
will work, but you have to be careful that there are no overflows.
A simple linear transformation, f(x) = ax + b, often works well.
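For example, here is the f(x) = ax + b transformation, solved so that a
known feature range [lo, hi] lands in [-1, 1] (the function name is
mine):

```c
/* Map x in [lo, hi] linearly onto [-1, 1]:
 *   a = 2/(hi - lo),  b = -(hi + lo)/(hi - lo),  f(x) = a*x + b
 * so f(lo) = -1, f(hi) = +1, and the midpoint maps to 0. */
static double scale_feature(double x, double lo, double hi)
{
    double a = 2.0 / (hi - lo);
    double b = -(hi + lo) / (hi - lo);
    return a * x + b;
}
```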

As I continue to figure out this thing I will keep you posted.

Regards,
Kipton Moravec
Exkerstr 31
W-8050 Freising
Germany
Telephone: 011 49 8161 67097  Home
	   011 49 8161 804753 Work
	   011 49 8161 804788 FAX
