Thinking as a Hobby

I spent most of yesterday working with Philip on his final project for the Artificial Neural Network class that he took for credit and that I audited.

In an earlier post, I described what we were trying to do...but I'll recap a little here, then talk about how we ultimately achieved optimum results.

All semester we learned about different types of neural networks, different ways to train them, different ways they represent data, etc.

We never talked about genetic algorithms in class, but I've been interested in them for a while, and we decided to try evolving neural nets instead of using traditional training methods.

Our problem domain was pattern recognition. Specifically, we wanted to create a neural net that would recognize and correctly identify letters.

Basically, we would show it an input pattern like this:

*0000
*0000
*0000
*****
*000*
*000*
*****

and we wanted it to identify which letter of the alphabet it was (hint: this one's a lower-case "b"). :)
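To feed a grid like this to a net, it has to be flattened into one number per cell. Here's a minimal sketch of that step. The helper name `pattern_to_inputs` and the choice of 1.0 for "*" and 0.0 for "0" are assumptions for illustration, not taken from the project code.

```python
# The lower-case "b" grid shown above, one string per row.
B_PATTERN = [
    "*0000",
    "*0000",
    "*0000",
    "*****",
    "*000*",
    "*000*",
    "*****",
]

def pattern_to_inputs(rows, on=1.0, off=0.0):
    """Flatten rows of '*' / '0' characters into one flat list of numbers.

    The on/off values are an assumed encoding; the post does not say
    which numbers were used for the input units.
    """
    return [on if ch == "*" else off for row in rows for ch in row]

inputs = pattern_to_inputs(B_PATTERN)
print(len(inputs))        # 35 (5 columns x 7 rows)
print(inputs.count(1.0))  # 17 cells are switched on
```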

We initially wanted to create a single neural net that could identify and distinguish between the first five lower-case letters of the alphabet: a, b, c, d, and e.

We had one input unit for each cell of the input pattern, ten hidden units, and seven output units, one for each bit of the letter's 7-bit ASCII code. For example, a lower-case "a" is ASCII 97 (decimal), which in base two is:

1 1 0 0 0 0 1

Philip decided to actually use -1 instead of zero in our output pattern, which was a good idea, since it gave us greater flexibility when calculating error and assessing the performance of the nets.
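The mapping from a letter to its target pattern can be sketched in a few lines. The function name `letter_to_target` is made up for this example; the only things taken from the project are the 7-bit ASCII code and the swap of -1 for 0.

```python
def letter_to_target(letter, bits=7):
    """Return the letter's ASCII code as +1/-1 values, leftmost bit first."""
    code = ord(letter)
    return [1 if (code >> shift) & 1 else -1
            for shift in range(bits - 1, -1, -1)]

for letter in "abcde":
    print(letter, ord(letter), letter_to_target(letter))
# The first line printed is: a 97 [1, 1, -1, -1, -1, -1, 1]
```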

We began by randomizing the weights between the input units and the hidden layer, and between the hidden layer and the output layer. The first set of weights (input to hidden) remained fixed throughout. We then created a population of 100 neural nets, all with the same architecture.
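Here's a rough sketch of that setup. Several details are assumptions, since the post doesn't specify them: 35 inputs (a 5x7 grid), tanh hidden units, a sign threshold on the outputs, uniform random weights in [-1, 1], and one input-to-hidden matrix shared by every net in the population.

```python
import math
import random

N_INPUTS, N_HIDDEN, N_OUTPUTS = 35, 10, 7  # 35 assumes a 5x7 input grid

def random_matrix(rows, cols, rng, scale=1.0):
    """A rows x cols list of lists with weights drawn uniformly from [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def forward(inputs, w_hidden, w_output):
    """Run one pattern through the net and return seven +1/-1 outputs."""
    # Assumed activation: tanh on the hidden layer.
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in w_hidden]
    # Assumed output rule: threshold each weighted sum at zero.
    return [1 if sum(w * h for w, h in zip(row, hidden)) >= 0 else -1
            for row in w_output]

rng = random.Random(0)
w_hidden = random_matrix(N_HIDDEN, N_INPUTS, rng)  # randomized once, then left fixed
# Each individual is just its own hidden-to-output weight matrix.
population = [random_matrix(N_OUTPUTS, N_HIDDEN, rng) for _ in range(100)]

example = [1.0 if i % 2 else 0.0 for i in range(N_INPUTS)]  # arbitrary test pattern
print(forward(example, w_hidden, population[0]))
```

Keeping the first layer fixed means only the 7x10 output weights need to evolve, which keeps the search space small.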

We showed each net each letter, and each net came back with an output. For example, we showed the first net in the population the pattern:

00000
00000
****0
0000*
0****
*000*
0****

and it gave back an output, like:

1 1 1 -1 -1 1 1

Now, the actual output pattern for "a" is:

1 1 -1 -1 -1 -1 1

As you can see, the third and sixth bits are wrong, so we'd calculate the error between its output and what we wanted it to output.
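As a quick check of that example, here's one simple way to score an output: count the bits that disagree with the target. This is a sketch with a made-up function name (`bit_errors`); the exact error measure used in the project isn't stated here, so treat a plain mismatch count as an assumption.

```python
def bit_errors(output, target):
    """Count the positions where the net's output differs from the target."""
    return sum(1 for o, t in zip(output, target) if o != t)

output = [1, 1, 1, -1, -1, 1, 1]     # what the net gave back
target = [1, 1, -1, -1, -1, -1, 1]   # the pattern for "a"

# 1-based positions of the wrong bits, to match the way they are counted in the text.
wrong = [i + 1 for i, (o, t) in enumerate(zip(output, target)) if o != t]
print(bit_errors(output, target), wrong)  # 2 [3, 6]
```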

Then we showed it the next letter, got its output and calculated the error.

We showed each net all five letters and calculated its total error. The top 10%, that is, the ten nets with the fewest errors, got to survive. The bottom 90% were killed off. From the ten survivors, 90 new clones were made, with the weights between the hidden units and the output units mutated slightly. Then we showed each letter to every individual in this new population, and repeated the process.
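One generation of that select-clone-mutate cycle might look like the sketch below. The function names, the Gaussian mutation with a scale of 0.1, and the choice to carry the ten survivors over unmutated are all assumptions; the post only says clones were "mutated slightly". The error function at the bottom is a toy stand-in so the example runs on its own.

```python
import random

def mutate(weights, rng, scale=0.1):
    """Return a copy of a weight matrix with small Gaussian noise on every weight."""
    return [[w + rng.gauss(0.0, scale) for w in row] for row in weights]

def next_generation(population, error_of, rng, survivors=10):
    """Keep the lowest-error nets and refill the population with mutated clones."""
    ranked = sorted(population, key=error_of)
    elite = ranked[:survivors]
    n_clones = len(population) - survivors
    clones = [mutate(elite[i % survivors], rng) for i in range(n_clones)]
    return elite + clones

def toy_error(net):
    """Stand-in fitness: total size of the weights (NOT the letter error)."""
    return sum(abs(w) for row in net for w in row)

rng = random.Random(1)
population = [[[rng.uniform(-1, 1) for _ in range(3)]] for _ in range(100)]
best_before = min(toy_error(net) for net in population)
for _ in range(25):
    population = next_generation(population, toy_error, rng)
best_after = min(toy_error(net) for net in population)
print(best_before, best_after)
```

Because the survivors are copied over unchanged in this sketch, the best error can never get worse from one generation to the next.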

At first, our results sucked. Philip figured out the problem. We were selecting for nets that had the fewest errors summed over all five letters, so we were basically selecting for a network that was good at identifying whether a letter was one of a, b, c, d, or e, but was horrible at distinguishing between them.
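One way to see how this can happen: the ASCII codes for a through e all start with the same four bits (1100), so a net can score well without looking at its input at all. The sketch below scores a hypothetical "lazy" net that always answers with the code for "a". It illustrates the failure mode, and is not a record of what our nets actually output.

```python
LETTERS = "abcde"

def letter_to_target(letter, bits=7):
    """ASCII code as +1/-1 values, leftmost bit first."""
    code = ord(letter)
    return [1 if (code >> shift) & 1 else -1
            for shift in range(bits - 1, -1, -1)]

targets = {letter: letter_to_target(letter) for letter in LETTERS}

# Hypothetical lazy net: ignores the input and always outputs the pattern for "a".
constant_output = targets["a"]

errors = {letter: sum(o != t for o, t in zip(constant_output, targets[letter]))
          for letter in LETTERS}
total_wrong = sum(errors.values())
print(errors)       # {'a': 0, 'b': 2, 'c': 1, 'd': 2, 'e': 1}
print(total_wrong)  # 6 wrong out of 35 bits
```

A net like that gets 29 of 35 bits right and looks fit under a summed error count, even though it can't tell any two letters apart.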

Philip's idea was to evolve seven separate nets, each with only one output unit, one net for each binary bit in the ASCII representation.

So basically, we'd show each individual "a", and ask it, "Okay, so is the first bit 1 or -1?" Then we showed it "b" and asked the same thing.

In the end, we combined the seven nets into one, and voila! It worked damned near perfectly. We usually converged on a solution within 25 generations, but occasionally it took a little more to get no errors.
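The per-bit targets and the final merge can be sketched like this. It assumes all seven nets share the same fixed input-to-hidden weights, so that combining them is just a matter of stacking their output weights; the function names and the placeholder weights are made up for the example.

```python
def bit_target(letter, bit_index, bits=7):
    """Target (+1 or -1) for a single bit of the letter's ASCII code.

    bit_index 0 is the leftmost of the seven bits.
    """
    shift = bits - 1 - bit_index
    return 1 if (ord(letter) >> shift) & 1 else -1

def combine(single_bit_nets):
    """Stack seven 1 x hidden weight matrices into one 7 x hidden matrix."""
    return [net[0] for net in single_bit_nets]

# Placeholder weights standing in for seven evolved nets (ten hidden units each).
single_bit_nets = [[[0.1 * (i + 1)] * 10] for i in range(7)]
combined = combine(single_bit_nets)

print(len(combined), len(combined[0]))         # 7 10
print([bit_target("a", i) for i in range(7)])  # [1, 1, -1, -1, -1, -1, 1]
```

Each single-bit net is evolved against `bit_target` for its own bit only, so its error count measures exactly one yes/no decision per letter.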

We tested the nets with fuzzy input, that is, we blurred up the input, and even with significant noise, the net performed incredibly well.
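A simple way to produce noisy test patterns is to flip each cell at random with some probability. That is a swapped-in noise model for illustration; the post doesn't say exactly how the inputs were blurred, and the function name, the 1.0/0.0 encoding, and the 10% flip rate are all assumptions.

```python
import random

def add_noise(inputs, flip_prob, rng, on=1.0, off=0.0):
    """Flip each cell independently with probability flip_prob."""
    noisy = []
    for x in inputs:
        if rng.random() < flip_prob:
            noisy.append(off if x == on else on)
        else:
            noisy.append(x)
    return noisy

rng = random.Random(42)
clean = [1.0, 0.0, 0.0, 0.0, 0.0] * 7  # a 35-cell example pattern
noisy = add_noise(clean, 0.1, rng)
changed = sum(1 for a, b in zip(clean, noisy) if a != b)
print(changed, "of", len(clean), "cells flipped")
```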

Cool, huh?




Originally posted 2002-12-09 11:42 AM as "How to Build a Brain (okay, a really small one)".