Before we dive into the core concepts of neural networks, I want to take a moment to appreciate the remarkable, complex thing we humans carry between our ears: the brain.
We can, without much thinking, almost intuitively, understand the content of these two grouped images of cats and dogs.
Think about it — when you imagine cats and dogs, what are the distinctive features that make the classification in your head?
For cats, maybe you think of pointy ears, short snout, sassy attitude (unfortunately pictures don’t portray that, or else it would be easy to classify cats v. dogs), …
For dogs, maybe you think of fur, fluffy ears, long tongue, playfulness (can’t be depicted in images), …
Notice that regardless of how many features we come up with (you'll probably have overlapping features too), we're still able to classify these images almost flawlessly. Different breeds of dogs have different colors, shapes, fur lengths, and snout sizes, and yet your brain, without much effort, concludes that they all represent the same class. Isn't that amazing!?
Neural networks are used for:
- Basically everything (replacing many of the previously mentioned algorithms)
- Image classification
- Object detection
- Voice recognition and synthesis
Popular architectures: Perceptrons, Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Hopfield Networks (HNs), Boltzmann Machines, Deep Belief Networks (DBNs), Generative Adversarial Networks (GANs), and many more.
What are [artificial] neural networks?
The status quo for explaining neural networks, or artificial neural networks (as they are formally referred to in the scientific literature), is to start by showing how they resemble the neurons in the biological brain. Since I like that explanation and find it inspiring, I will stick to it.
To begin with, the biological brain is made up of billions of neurons, with trillions of connections between them. Everything we do in our daily lives (the way we see, the way we interact with the world, the way we feel, the way we move) fundamentally comes down to the way neurons in the brain wire and fire. You can think of a neuron as the simplest biological processing unit. It is made up of dendrites (receivers of information), a nucleus (the processing unit), an axon, and axon terminals (senders of information).
Neurons are connected to each other through synapses, junctions where information (electrical signals) is shared among neurons. The stronger the synaptic connection, the stronger the signal that gets transmitted.
In 1958, Frank Rosenblatt came up with a really clever way of loosely representing these concepts with simple mathematical formulas. The intuition behind it is that a bunch of neurons communicate with each other through weights and biases, trying to map input A → output B (wow, what just happened?). These two are the fundamental components of neural networks.
A neuron, as depicted above, has some inputs [10, 7, 3]. These inputs are connected to the neuron with weights [0.5, 1.0, 0.1] that represent the strength of each input. The neuron multiplies each input by its corresponding weight, sums the results, and checks whether the sum is greater than 10 (the threshold, which can be any number); if it is, the neuron outputs 1, otherwise 0. A neuron can also output more than two values (not just 0 or 1). That is done using an activation function, which projects the output of the neuron onto a certain range.
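To make this concrete, here is a minimal sketch of that single neuron in Python, using the same inputs, weights, and threshold as above (the variable names are my own, just for illustration):

```python
# A single artificial neuron with a step activation:
# take the weighted sum of the inputs and compare it to a threshold.
inputs = [10, 7, 3]
weights = [0.5, 1.0, 0.1]
threshold = 10

# 10*0.5 + 7*1.0 + 3*0.1 = ~12.3
weighted_sum = sum(x * w for x, w in zip(inputs, weights))
output = 1 if weighted_sum > threshold else 0

print(weighted_sum)  # ~12.3
print(output)        # 1, because 12.3 > 10
```

Swapping the final `if`/`else` for a smoother function (a sigmoid, for instance) is exactly what the activation functions mentioned above do.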
Weights tell the neuron which inputs to respond to more strongly. They are what gets updated during training, and fundamentally this is how networks learn.
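To give a feel for what "updating the weights" means, here is a sketch of Rosenblatt's classic perceptron learning rule on a tiny made-up dataset (logical AND); the helper names and the data are my own, for illustration only:

```python
# Perceptron learning rule (a sketch): whenever the neuron's
# prediction is wrong, nudge each weight toward the correct answer.
def predict(inputs, weights, bias):
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if s > 0 else 0

def train_step(inputs, target, weights, bias, lr=1.0):
    error = target - predict(inputs, weights, bias)  # -1, 0, or +1
    new_weights = [w + lr * error * x for w, x in zip(weights, inputs)]
    new_bias = bias + lr * error
    return new_weights, new_bias

# Toy dataset: logical AND of two binary inputs.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = [0.0, 0.0], 0.0
for _ in range(10):  # a few passes over the data
    for x, t in data:
        weights, bias = train_step(x, t, weights, bias)

print([predict(x, weights, bias) for x, _ in data])  # [0, 0, 0, 1]
```

The key line is the weight update: inputs that contributed to a wrong answer get their weights adjusted, and correct predictions leave the weights untouched.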
Neurons are organized in layers: each neuron is connected to the neurons of the previous and next layers, but not to neurons within its own layer. The information in these networks flows strictly forward, from the input (layer 0) to the output (layer L, where L is the index of the last layer). The layers in between the input and output layers are called hidden layers; they're called hidden because, unlike the input and output, they're not directly observable (we'll talk more about visualizing neural networks in another article).
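The layered, forward-only flow can be sketched in a few lines. This is a hypothetical toy network (2 inputs → 3 hidden neurons → 1 output) with made-up weights, using a sigmoid activation to squash each neuron's output into the range (0, 1):

```python
import math

def sigmoid(z):
    # Squashes any real number into (0, 1); one common activation function.
    return 1 / (1 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    # One fully connected layer: each neuron takes a weighted sum
    # of ALL outputs from the previous layer, plus its own bias.
    return [sigmoid(sum(x * w for x, w in zip(inputs, row)) + b)
            for row, b in zip(weights, biases)]

# Toy network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
x = [0.5, -1.0]
hidden = layer_forward(x, weights=[[0.1, 0.4], [-0.3, 0.8], [0.5, 0.5]],
                       biases=[0.0, 0.1, -0.2])
output = layer_forward(hidden, weights=[[0.3, -0.6, 0.9]], biases=[0.05])
print(output)  # a single value between 0 and 1
```

Notice that the information only ever moves from one layer to the next; no neuron talks to another neuron in its own layer.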
If you put enough neurons in the hidden layers, enough hidden layers in the network, and enough data into the network, it often starts learning better than any other algorithm.
If you'd like to dive more into the details of neural networks and how they learn, I highly suggest you check this out: here's a great explanation of neural networks.
Why is it getting all the recognition?
The main reason for all the recognition they are getting is that this method of learning generally works better than classical machine learning algorithms. One distinguishing characteristic is that classical methods perform relatively well up to a decent amount of data, after which their performance plateaus. Neural-network-based learning, on the other hand, has repeatedly been shown to keep improving as more and more data is added.
Side note: This is why big companies like Google, Facebook, Apple, Amazon, Netflix prefer to use algorithms like these — because they have ‘infinite’ amounts of data, and this is why their products continue to improve.
So, let’s build a neural network!
Down below, I have added a link where you can tinker with a neural network and all of the concepts explained above.
Let's try to do a binary (two-class) classification with datasets of different difficulty to get an intuition for how neural nets work, and tinker with different parameters to see how they affect performance!
Important note: Learning rate is the variable that controls how quickly the network learns.
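Before you tinker with it, here's a sketch of why the learning rate matters so much. It scales every update step during training; the toy function and values below are my own, chosen just to show the three regimes (too small, about right, too large):

```python
# Gradient descent on f(w) = (w - 3)**2, whose minimum is at w = 3.
# The learning rate scales each step: too small -> slow progress,
# too large -> the updates overshoot and diverge.
def descend(lr, steps=50, w=0.0):
    for _ in range(steps):
        grad = 2 * (w - 3)   # derivative of (w - 3)**2
        w = w - lr * grad
    return w

print(descend(lr=0.1))    # very close to 3
print(descend(lr=0.001))  # still far from 3 after 50 steps
print(descend(lr=1.5))    # overshoots: |w| blows up
```

You'll see the same behavior in the playground: a tiny learning rate makes the loss crawl down, while a huge one makes it bounce around or explode.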
Things to tinker with:
- Data (Left-hand)
- Features (The things that get inputted to the network)
- The minimum number of neurons and hidden layers for a test loss below 0.05
- Learning rate
Here’s a challenge for you: can you get the test loss (displayed just above the data graph on the right) below 0.05? If you do, what parameters did you use?
For a more systematic exploration of the parameters check this out.
One thing that has helped me immensely in understanding the core concepts of neural networks and exploring their practicality is tinkering with them. Below you can find a list of the most useful links I have found (feel free to suggest more in the comment section down below):
Hope this article helped you get a clear general idea of neural networks and an intuition for how they perform with different parameters.
Neural networks are so widely used in industry that it's very valuable to explore and tinker with them in order to understand their fundamental components and how those components interact with each other, and to showcase all of this with a model performing on our own data. And we're going to do just that in the next article.
I’d love to hear your ideas on what you’d like to read next — let me know down below in the comment section!
You can always connect with me via LinkedIn.