Activation functions come up constantly when we work with artificial neural networks. Let's look at what an activation function is and at the most common varieties.

**ACTIVATION FUNCTIONS**

Activation functions prevent a neural network from being a purely linear transformation. Without activation functions, a neural network acts as a linear model with limited learning power. When we feed complex real-world data such as images, sound, or video to such a network, it struggles to learn. So we need nonlinear functions, that is, functions of higher than first degree. Activation functions regulate the outputs of nodes and add a level of complexity that neural networks without activation functions cannot achieve. Thus, although the network becomes more complex, it also becomes more powerful and learns better.
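
To illustrate why a network without activation functions stays linear, here is a minimal sketch in Python. The one-input "layers" and the weights `w1`, `b1`, `w2`, `b2` are made-up values chosen only for illustration; they do not come from any real model.

```python
# Two "layers" with no activation function in between.
# All weights below are hypothetical values chosen for illustration.
w1, b1 = 2.0, 1.0
w2, b2 = -3.0, 0.5


def two_linear_layers(x):
    """Apply layer 1, then layer 2, with no activation in between."""
    hidden = w1 * x + b1
    return w2 * hidden + b2


def single_linear_layer(x):
    """One linear layer whose weight and bias combine the two layers."""
    return (w2 * w1) * x + (w2 * b1 + b2)


for x in [-2.0, 0.0, 1.5]:
    # Both functions print the same value for every input.
    print(x, two_linear_layers(x), single_linear_layer(x))
```

However many linear layers are stacked this way, they can always be folded into a single one, which is why a nonlinearity between layers is needed.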

**1. Sigmoid Function**

The sigmoid function compresses the values it receives into the range between 0 and 1. Here is the mathematical expression for the sigmoid function.

$$f(x) = \frac{1}{1 + e^{-x}}$$

When a large positive value arrives, the output approaches 1 and produces a stronger signal; when a large negative value arrives, the output approaches 0 and produces a weaker signal.
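
The behavior described above can be sketched in a few lines of Python. This is a minimal version that uses only the standard `math` module; it is not written for numerical robustness (very large negative inputs would overflow `math.exp`).

```python
import math


def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


# Large negative inputs give outputs near 0, large positive inputs near 1.
for x in [-6.0, -1.0, 0.0, 1.0, 6.0]:
    print(f"sigmoid({x:+.1f}) = {sigmoid(x):.4f}")
```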

The sigmoid function is nonlinear, so the network can model more complex relationships and we can use it for more difficult tasks. But if we look carefully at the graph, we can see that toward both ends the y values react very little to changes in x. In these regions the derivative becomes very small and approaches 0. This is called the vanishing gradient problem, and learning proceeds at a minimal rate. When learning is this slow, the optimization algorithm that minimizes the error can get stuck in local minima, and we cannot get the maximum performance that the artificial neural network model could otherwise reach.
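
To make the vanishing gradient concrete, the sketch below evaluates the sigmoid's derivative, which can be written as f(x)(1 - f(x)), at a few points. The sample inputs are arbitrary and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


# The gradient peaks at x = 0 and shrinks quickly as |x| grows.
for x in [0.0, 2.0, 5.0, 10.0]:
    print(f"x = {x:4.1f}  gradient = {sigmoid_grad(x):.6f}")
```

The largest value the derivative can take is 0.25, at x = 0, and it falls toward 0 on either side.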

**2. Tanh Function**

The tanh function compresses the values it receives into the range between -1 and 1. Here is the mathematical expression for the tanh function.

$$f(x) = \frac{2}{1 + e^{-2x}} - 1$$

The tanh function is steeper than the sigmoid function, so its derivative can take larger values (up to 1, compared with at most 0.25 for the sigmoid). Together with its wider output range, this tends to make learning more efficient in classification tasks. However, the vanishing gradient problem at the two ends of the function remains.
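
As a quick check, the sketch below implements tanh from the expression 2/(1 + e^(-2x)) - 1, compares it with Python's built-in `math.tanh`, and evaluates the derivative 1 - tanh(x)^2. The function names are chosen here for illustration.

```python
import math


def tanh_from_formula(x):
    """tanh written as 2 / (1 + e^(-2x)) - 1."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0


def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


for x in [-3.0, 0.0, 3.0]:
    # The formula and the built-in agree; the gradient is largest at x = 0.
    print(x, tanh_from_formula(x), math.tanh(x), tanh_grad(x))
```

The derivative equals 1 at x = 0, four times the sigmoid's maximum of 0.25, yet it still shrinks toward 0 for large |x|.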

**3. ReLU Function**

ReLU is commonly used in deep neural networks for speech recognition and computer vision. The function treats incoming values differently depending on their sign: the output is 0 if the input is negative, and the input is returned unchanged if it is positive, which makes it fast to compute. The problem with the ReLU function is that in the zero-output region, which gives us this processing speed, the derivative is also zero, so learning cannot occur there.
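
A minimal Python sketch of ReLU and its derivative follows. The derivative is undefined at exactly x = 0; returning 0 there is a common convention and is an assumption of this sketch.

```python
def relu(x):
    """ReLU: 0 for negative inputs, the input itself otherwise."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise.

    The value at exactly x = 0 is set to 0 here by convention.
    """
    return 1.0 if x > 0 else 0.0


for x in [-2.0, -0.5, 0.5, 3.0]:
    print(f"x = {x:+.1f}  relu = {relu(x):.1f}  gradient = {relu_grad(x):.1f}")
```

Every negative input produces both a zero output and a zero gradient, which is the region where learning stalls.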

**4. Leaky ReLU Function**

The Leaky ReLU function was developed to address the dead neuron problem of the ReLU function.

$$f(x) = \max(0.01x,\ x)$$

As shown in the figure, the problem is addressed by giving the function a small slope of 0.01 for negative inputs, so that it leaks slightly below the x axis. This slope is close to 0 but not 0, so the gradient in the negative region does not vanish as it does in ReLU, and learning also takes place for values in the negative region.
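
Leaky ReLU can be sketched as below. The 0.01 default comes from the formula above; the parameter name `slope` is a hypothetical name used for this sketch, and the `max` form assumes the slope lies between 0 and 1.

```python
def leaky_relu(x, slope=0.01):
    """Leaky ReLU: max(slope * x, x), assuming 0 <= slope <= 1."""
    return max(slope * x, x)


def leaky_relu_grad(x, slope=0.01):
    """Derivative of Leaky ReLU: 1 for positive inputs, slope otherwise."""
    return 1.0 if x > 0 else slope


for x in [-5.0, -1.0, 2.0]:
    print(f"x = {x:+.1f}  leaky_relu = {leaky_relu(x):+.2f}  "
          f"gradient = {leaky_relu_grad(x):.2f}")
```

Unlike ReLU, the gradient for negative inputs is small but never zero, so those units can still be updated.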

**5. Swish Function**

Like the Leaky ReLU function, the Swish function takes nonzero values in the negative region, but there its values are not linear in the input.

$$f(x) = x \cdot \frac{1}{1 + e^{-x}}$$
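
The following minimal sketch implements Swish as the input multiplied by its sigmoid, using only the standard `math` module. The sample inputs are arbitrary.

```python
import math


def swish(x):
    """Swish: x * sigmoid(x), written as x / (1 + e^(-x))."""
    return x / (1.0 + math.exp(-x))


# Negative inputs give small negative outputs that do not scale linearly
# with x; large positive inputs pass through almost unchanged.
for x in [-5.0, -1.0, 0.0, 1.0, 5.0]:
    print(f"swish({x:+.1f}) = {swish(x):+.4f}")
```

In the negative region the output dips and then returns toward 0 as the input becomes more negative, whereas Leaky ReLU keeps decreasing along a straight line.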

Thus, we have seen that activation functions play a key role in artificial neural networks. Hope to see you in our next article…
