Theory Behind the Single-Layer Perceptron

Jan 23, 2021

Today we discuss the theory behind a single neuron and the mathematical intuition behind it. In the next post we will discuss some Python code related to this problem.
To make the topic clear, we use a data set with one feature, a person's age, and a target that records whether the person has insurance or not (a binary class).

Using this data we want to build a function that tells us whether a person will buy insurance or not, which is called a binary classification problem. Among machine learning algorithms we have already discussed linear regression. So we first apply linear regression to classify the data above, see where it falls short, and then use a neural network to classify the data properly. To start, we draw a scatter plot with age (the independent variable) on the x-axis and has-insurance (the dependent variable) on the y-axis, as shown below.

Scatter plot of the independent feature against the target.

Using linear regression we can draw the best-fit line as below.

Now suppose we use this best-fit line to predict whether an 80-year-old grandmother will buy insurance. We set a threshold of 0.5: if the predicted value is greater than 0.5 it is treated as 1 (buys the insurance), otherwise it is treated as 0 (not interested in buying the insurance). But linear regression misclassifies some points, marked with red circles below. There is an age criterion for insurance, so people above 45 are not eligible, yet our regression model predicts them as interested in buying insurance. This is the drawback of our linear model.
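To make the thresholding idea concrete, here is a minimal sketch in plain Python. The ages and labels are hypothetical toy values, not the data set used in this article, and the helper names `fit_line` and `predict_linear` are made up for illustration.

```python
# Hypothetical toy data (NOT the article's data set): age and has-insurance label.
ages = [22, 25, 28, 47, 52, 46, 56, 55, 60, 62, 18, 27, 29, 49, 61]
has_insurance = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(ages, has_insurance)


def predict_linear(age, threshold=0.5):
    """Return 1 if the best-fit line is above the threshold at this age, else 0."""
    return int(slope * age + intercept > threshold)


# Points where the thresholded line disagrees with the true label.
misclassified = [a for a, y in zip(ages, has_insurance) if predict_linear(a) != y]

print("slope:", slope, "intercept:", intercept)
print("prediction for age 80:", predict_linear(80))
print("misclassified ages:", misclassified)
```

With this toy data the raw value of the line at age 80 is above 1, which already hints at the problem: a straight line is not confined to the range 0 to 1, so it is an awkward fit for a yes/no target.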

Some points misclassified by the model.

What if we had a classification boundary function like the one shown below? With it we can clearly see that every point is correctly classified except for a few points (outliers).

This function is called the sigmoid function, also known as the logistic function (the logit is its inverse). It has a very simple mathematical formula:

sigmoid(z) = 1 / (1 + e^(-z))

When z is a large positive number, sigmoid(z) is very close to 1. Similarly, when z is a large negative number, sigmoid(z) is very close to 0, and at z = 0 it is exactly 0.5, as in the example.

The sigmoid function maps any input to a value in the range 0 to 1.
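A short sketch of the sigmoid in Python, using only the standard library, shows this squashing behaviour. The sample inputs are arbitrary values picked for illustration.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: maps a real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Arbitrary sample inputs: large negative, small, zero, and large positive.
for z in (-10, -1, 0, 1, 10):
    print(z, round(sigmoid(z), 4))
```

Large negative inputs come out near 0, large positive inputs come out near 1, and an input of 0 gives exactly 0.5, which is why 0.5 is a natural threshold.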

So in a single-layer neuron we follow two steps. First, we find the best-fit line using linear regression, as in a supervised machine learning problem. Then we pass that value to the sigmoid, which gives a value between 0 and 1. Values greater than 0.5 are treated as 1 (buying insurance) and values less than 0.5 as 0 (not interested), as below.

Now we have to find the value of the coefficient of x and the value of the intercept b, such as:

The values of the coefficients are not selected at random; they can be found by linear regression. This is Step I. In Step II we pass this value to the sigmoid function, such as:

In this function, when we give age 35 as input, the output is 0.48, which is less than 0.5, so the person is not interested in buying insurance; hence it is shown in red.

When age 43 is given as input, the output of the function is 0.57, which is greater than 0.5. It means the person buys insurance.
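The two steps can be sketched as a tiny Python function. The fitted coefficient and intercept are not given in the text, so the values `M = 0.042` and `B = -1.53` below are assumptions, chosen only because they give outputs close to the 0.48 and 0.57 quoted above. The names `neuron` and `buys_insurance` are hypothetical.

```python
import math

# ASSUMED values: the article's fitted coefficient and intercept are not shown
# in the text. These were picked only because they land close to the quoted
# outputs for ages 35 and 43.
M = 0.042   # hypothetical coefficient of age
B = -1.53   # hypothetical intercept


def sigmoid(z):
    """Logistic sigmoid: maps a real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(age):
    """Step I: linear part z = M * age + B. Step II: pass z through the sigmoid."""
    z = M * age + B
    return sigmoid(z)


def buys_insurance(age, threshold=0.5):
    """Return 1 (buys insurance) if the neuron output exceeds the threshold, else 0."""
    return int(neuron(age) > threshold)


for age in (35, 43):
    print(age, neuron(age), buys_insurance(age))
```

Under these assumed values, age 35 gives roughly 0.485 (class 0) and age 43 gives roughly 0.569 (class 1), in line with the two examples above.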

This means the neuron has two parts: one is the regression part and the second is the activation part.

Now suppose we have multiple factors or features such as age, income, and education. This means that whether a person buys insurance depends not only on the person's age but also on income and education. Hence it can be formulated as:

z = w1*x1 + w2*x2 + w3*x3 + b

where w1, w2 and w3 are called weights, x1, x2 and x3 are called features, and b is a constant called the bias. In a neural network we can represent it like this:
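As a rough sketch, the same neuron with several features is a weighted sum followed by the sigmoid. The weights, bias, and the sample person below are made-up numbers for illustration only; in practice they would be learned from data.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: maps a real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(features, weights, bias):
    """Weighted sum of the features plus the bias, passed through the sigmoid."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


# HYPOTHETICAL numbers, not learned from any data:
# x1 = age in years, x2 = income in thousands, x3 = years of education.
weights = [0.04, 0.01, 0.05]   # w1, w2, w3
bias = -3.0                    # b
person = [43, 50, 16]          # x1, x2, x3

probability = neuron(person, weights, bias)
print("output:", probability, "class:", int(probability > 0.5))
```

Adding another feature only means adding one more weight to the list; the regression part and the activation part stay the same.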


Written by Zulfiqar Ali