This blog is to help provide an introduction to artificial neural networks (ANN) while I simultaneously perform a personal study on the same.
Disclaimer: This, in no way, is a final product. I intend to revise this as I learn more and more. If there are mistakes, please note them and help me find answers to questions which I missed along the way, including things which I have misinterpreted.
Most sites I've found, when trying to look something up, illustrate artificial neural networks by likening them to the brain; however, I don't presume to know anything about biology. Instead, I'll start from the machine learning standpoint: the perceptron.
In its simplest form, the perceptron is a machine learning tool that acts as a linear classifier. It works by using known input/output pairs to adjust a set of weights, which separate the data into two regions, positive and negative labels; any input from a test set can then be classified by checking which side of that boundary it falls on.
At this point, if you're reading this and getting discouraged because you want to learn a real-valued function rather than solve a classification problem, don't be. I've only recently found a way around the problem of extrapolation when estimating linear functions. (This may not sound like a big deal, but given the application I'm working on, it is.)
As the perceptron adjusts the weights which define the classifier, the true "line of separation" emerges, and the error approaches zero. An error occurs when the perceptron is tested on test data and, say, predicts a positive label for a point that is really negative.
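To make the weight-adjustment idea concrete, here is a minimal sketch of the classic perceptron learning rule in plain Python. The function name, learning rate, and the AND-style toy data are my own choices for illustration, not part of any library: whenever a point is misclassified, the weights are nudged toward the correct side of the line.

```python
def train_perceptron(samples, labels, lr=0.1, epochs=100):
    """samples: list of feature lists; labels: +1 or -1.
    Returns the learned weights w and bias b."""
    n = len(samples[0])
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            # Predict with the current weights: sign(w . x + b).
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != y:
                # Misclassified: nudge the boundary toward the correct label.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                errors += 1
        if errors == 0:
            # Every training point is classified correctly; the line is found.
            break
    return w, b

# A linearly separable toy set (labels follow logical AND).
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [-1, -1, -1, 1]
w, b = train_perceptron(X, y)
```

For separable data like this, the perceptron convergence theorem guarantees the loop eventually reaches zero errors; for non-separable data (like XOR below) it never does, which is exactly the point of the classic test.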
A multi-layered perceptron can be thought of as adding multiple lines of classification, which both increases the accuracy of the prediction and opens up a new domain of problems that can be classified.
The classic test of the perceptron is the XOR function. XOR takes two boolean inputs and outputs 1 when exactly one of them is 1: one or the other, but not both. (A boolean parameter has two values, typically 1 or 0.) Therefore, the XOR set may be seen as:
[0,0]=0
[0,1]=1
[1,0]=1
[1,1]=0
In this case, we see that this is not linearly separable. A set is linearly separable when its labels can be split by a single straight line, and no such line exists here. However, if we had two lines, we would be able to configure the lines to separate the data.
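The two-line idea can be wired up by hand. In this sketch (my own weight choices, nothing learned), one hidden unit draws a line separating (0,0) from the rest (an OR), the second draws a line separating (1,1) from the rest (a NAND), and the output fires only when a point lands on the correct side of both lines, which is exactly XOR:

```python
def step(z):
    # Threshold activation: fire (1) if the input crosses the line, else 0.
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    h1 = step(x1 + x2 - 0.5)    # line 1: OR, cuts off (0,0)
    h2 = step(-x1 - x2 + 1.5)   # line 2: NAND, cuts off (1,1)
    return step(h1 + h2 - 1.5)  # AND of the two half-planes

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((a, b), "->", xor_net(a, b))
```

No single unit in this network can compute XOR on its own; it is the combination of the two boundaries in the hidden layer that makes the problem solvable.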
I find that this describes the goals of a good portion of most experiments. There are many applications of artificial neural networks. However, there is a caveat: in many of these situations, even though you may appear to be getting the results you expect, there is no real way to accurately interpret what the network as a whole has learned.
For this blog, I will focus on coding in Python. Though there are plenty of libraries already available, what I've heard and learned so far is that writing your own is especially helpful, not only for understanding what is happening but also for making the changes needed to build new kinds of networks. For instance, with every library I've found, including some built into well-known applications, I've been unable to redesign a neural network to fit and extrapolate a linear function. Having my own code allowed me to resolve the problem myself, with the help of some professors at my university.
I'll try to make the code available, since I don't really care what happens to it. I'm a big advocate of open source tools, especially when it comes to research.