CYBER CRIME SECURITY USING CONFUSION MATRIX

VidyaMai S
4 min read · Jun 6, 2021

CYBER CRIME: In today’s computer world, cyber crime is a term everyone knows. Every minute, at least 100 cases are registered under the name of cyber crime.

Most cyber crime reaches us through our own systems: passwords that get auto-saved, passwords that are not changed regularly, visits to inappropriate sites, downloads of unknown files, and so on. Even though we have plenty of software installed to detect malware, 1 out of 10 people still faces an attack.

Today we discuss why we still face these issues even though plenty of malware detection software is available.

MALWARE DETECTION: Examples of such software are a Web Application Firewall (WAF) and an Intrusion Prevention System (IPS). Many of us don’t know how such software detects malware and sends alerts. The core process relies on a machine learning classification model, and the confusion matrix is the tool used to check how well that model detects attacks and raises alerts.

CONFUSION MATRIX:

A confusion matrix is a table that compares a model’s predictions with the actual classes. The scikit-learn library, widely used in machine learning, provides it as the function confusion_matrix in sklearn.metrics.

syntax- confusion_matrix(y_true, y_pred)

It is mainly used with binary classification models, where the target classes are 0 and 1.

Before a confusion matrix can be built, the model is trained on 80 percent of the dataset and tested on the remaining 20 percent. That is, the dataset is split into two parts: the model learns from the training part and then predicts values for the test part.

We can then compare the predicted values with the actual values and check how accurate the model was.
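The 80/20 split described above can be sketched in plain Python. This is only an illustration: the helper name train_test_split_rows and the toy dataset are made up for this example, and in practice scikit-learn’s own train_test_split from sklearn.model_selection would normally be used instead.

```python
import random


def train_test_split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical dataset: 1000 (feature, label) pairs, label 1 = malware, 0 = clean.
dataset = [(i, i % 2) for i in range(1000)]

train_rows, test_rows = train_test_split_rows(dataset)
print(len(train_rows), len(test_rows))
```

The model would be fitted on train_rows only, and its predictions for test_rows are what get compared with the actual labels.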

ex- suppose there are 10 rows of data, each with its feature variables (inputs) and a known target class (output).

If we are given only 7 of those rows to learn from, we have to predict the target class for the other 3, having trained on 70 percent of the data. What matters is accuracy: we check how many of the 3 predictions are correct. Only then do we know how well we performed, and the confusion matrix does the same job for a model.

Now we have to test the accuracy of the model’s predicted values against the actual values.

To calculate accuracy we need to know four counts:

  1. True Negative(TN)
  2. False Negative(FN)
  3. False Positive(FP)
  4. True Positive(TP)
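The four counts listed above can be tallied by walking through the actual and predicted labels side by side. The sketch below assumes labels are 0 (negative) and 1 (positive); the function name count_outcomes and the eight sample labels are invented for illustration.

```python
def count_outcomes(actual, predicted):
    """Count TN, FN, FP and TP for binary labels (0 = negative, 1 = positive)."""
    tn = fn = fp = tp = 0
    for a, p in zip(actual, predicted):
        if p == 0 and a == 0:
            tn += 1  # predicted negative, really negative
        elif p == 0 and a == 1:
            fn += 1  # predicted negative, really positive
        elif p == 1 and a == 0:
            fp += 1  # predicted positive, really negative
        else:
            tp += 1  # predicted positive, really positive
    return {"TN": tn, "FN": fn, "FP": fp, "TP": tp}


# Made-up labels for eight files: 1 = malware, 0 = clean.
actual = [0, 0, 1, 1, 0, 1, 0, 1]
predicted = [0, 1, 1, 0, 0, 1, 0, 1]

counts = count_outcomes(actual, predicted)
print(counts)  # {'TN': 3, 'FN': 1, 'FP': 1, 'TP': 3}
```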

Suppose that out of 1000 data values, 800 are used for training and 200 for testing, and that for those 200 test values the model predicts 112 as negative (’0’) and 88 as positive (’1’).

The actual data shows that of the 112 predicted negatives, only 100 are really negative and 12 are really positive.

The correct negatives are called True Negatives (TN) = 100 and the wrong negatives are called False Negatives (FN) = 12.

Of the 88 predicted positives, the actual data shows that only 80 are really positive and 8 are really negative.

The correct positives are called True Positives (TP) = 80 and the wrong positives are called False Positives (FP) = 8.
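To see the numbers above as an actual matrix, the sketch below rebuilds 200 labels that match the example and arranges the counts with rows for the actual class and columns for the predicted class. That is the same layout scikit-learn’s confusion_matrix uses for binary labels, [[TN, FP], [FN, TP]]. The label lists are reconstructed only for illustration, and the helper name confusion_counts is hypothetical.

```python
# 112 predicted negatives: 100 really negative, 12 really positive.
# 88 predicted positives: 80 really positive, 8 really negative.
actual = [0] * 100 + [1] * 12 + [1] * 80 + [0] * 8
predicted = [0] * 112 + [1] * 88


def confusion_counts(actual, predicted):
    """Return a 2x2 table where matrix[actual_class][predicted_class] is a count."""
    matrix = [[0, 0], [0, 0]]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


matrix = confusion_counts(actual, predicted)
(tn, fp), (fn, tp) = matrix

print(matrix)  # [[100, 8], [12, 80]]
print("TN:", tn, "FP:", fp, "FN:", fn, "TP:", tp)
```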

Now we can measure accuracy:

accuracy = ((TN+TP) / (FN+TN+TP+FP)) * 100

i.e., ((100+80) / (12+100+80+8)) * 100 = (180/200) * 100 = 90%
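The same arithmetic can be checked in a few lines of Python. The function name accuracy_percent is just a label chosen for this sketch; the four inputs are the counts from the worked example.

```python
def accuracy_percent(tn, fn, fp, tp):
    """Accuracy as a percentage: correct predictions over all predictions."""
    total = tn + fn + fp + tp
    if total == 0:
        raise ValueError("at least one prediction is required")
    return 100 * (tn + tp) / total


acc = accuracy_percent(tn=100, fn=12, fp=8, tp=80)
print(acc)  # 90.0
```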

So the model predicts correctly in 90 out of 100 cases, but the other 10 are false positives or false negatives, and those are the ones that cause damage.

The confusion matrix is used to judge the models behind a Web Application Firewall (WAF) or Intrusion Prevention System (IPS) that detect malware attacks on a system. 9 times out of 10 the result is correct, but the 1 error is what causes the damage.

False Positive (FP): Here the model outputs ’malware detected’ (1) even though there is no malware on the system. This error does little harm, since the alert only prompts us to delete the suspicious files and clean our system. It is also called a type 1 error.

False Negative (FN): Here the model outputs ’no malware detected’ (0) even though malware is present. This is a very dangerous error: we believe our files and system are clean and pay no further attention to the files and software we download, while the attack has already happened without our knowledge. It is also called a type 2 error. This is the error that makes us fall into the trap of cyber crime, so we have to stay alert and check what kind of files and web apps we are dealing with.
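Accuracy alone does not show how the errors divide between the two types. One common way to look at them separately, not covered in the text above, is the false positive rate FP / (FP + TN) and the false negative rate FN / (FN + TP). The sketch below applies these standard definitions to the counts from the worked example; the function name error_rates is made up for illustration.

```python
def error_rates(tn, fn, fp, tp):
    """Return (false_positive_rate, false_negative_rate) as fractions.

    false_positive_rate: share of clean items wrongly flagged (type 1 error).
    false_negative_rate: share of real malware that was missed (type 2 error).
    """
    actual_negatives = fp + tn
    actual_positives = fn + tp
    if actual_negatives == 0 or actual_positives == 0:
        raise ValueError("both classes must appear in the actual data")
    return fp / actual_negatives, fn / actual_positives


fpr, fnr = error_rates(tn=100, fn=12, fp=8, tp=80)
print(f"type 1 (false positive) rate: {fpr:.1%}")
print(f"type 2 (false negative) rate: {fnr:.1%}")
```

In this example the model misses a larger share of real malware than it raises false alarms on clean files, which is the more dangerous direction for a security tool.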

But it makes no sense to use a Web Application Firewall (WAF) or Intrusion Prevention System (IPS) if we have to keep checking our systems manually.

So, to reduce type 1 and type 2 errors and push accuracy closer to 100 percent, there are several options, such as training the model on larger datasets and using deep learning models, which help detection work more efficiently and accurately.

Thank you for reading the article this far, and if you skipped ahead, thanks for reaching the end ; )

