
# Bayesian Minimum Error Rate Classifier

## Running the model

Create a new directory `Data` and place the CSV files containing the data of the two classes (one file per class) in it.

Add the relevant column names to the list `features` in `binClassifier.py`. Assign the split values to `split1` and `split2` in `binClassifier.py`.

On running `binClassifier.py`, the dataset is shuffled and sampled 100 times. The mean, minimum, and maximum accuracies are printed (class-wise and overall).

## Mathematical Background

### Likelihood

The probability density function used is the multivariate normal distribution. The likelihood p(x|wi) is given by

p(x|wi) = 1 / ((2π)^(d/2) |Σ|^(1/2)) * exp(−(1/2) (x − μ)ᵗ Σ⁻¹ (x − μ))

where x is the d-dimensional feature vector, μ is the mean vector, Σ is the covariance matrix, |Σ| is the determinant of the covariance matrix, Σ⁻¹ is the inverse of the covariance matrix, and (x − μ)ᵗ is the transpose of the (x − μ) vector. The likelihood for a class wi is calculated using the mean vector and covariance matrix estimated from that class's data.

p(x|wi) is computed for each class from this equation, for i = 1, 2 (being a binary classifier).
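The density above can be sketched in Python with NumPy (a minimal illustration; the function name is ours, not necessarily the one used in `binClassifier.py`):

```python
import numpy as np

def likelihood(x, mu, cov):
    """Multivariate normal density p(x|wi) for a class whose mean
    vector mu and covariance matrix cov are estimated from that
    class's training data."""
    d = len(mu)
    diff = x - mu
    # Normalising constant: sqrt((2*pi)^d * |cov|)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    # Quadratic form: (x - mu)^t * cov^-1 * (x - mu)
    quad = diff @ np.linalg.inv(cov) @ diff
    return np.exp(-0.5 * quad) / norm
```

For a standard bivariate normal evaluated at the origin this returns 1/(2π), matching the closed form.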

### Apriori Probabilities

The apriori probabilities P(w1) and P(w2) are calculated using

P(wi) = (number of data points in wi) / (total number of data points)
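In code this is a one-liner over the two class counts (a sketch; the helper name is hypothetical):

```python
def apriori(n1, n2):
    """P(wi) = (number of data points in wi) / (total number of data points)."""
    total = n1 + n2
    return n1 / total, n2 / total
```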

### Evidence

The evidence for each data point (in the test set) is calculated using the equation

p(x) = P(w1) * p(x|w1) + P(w2) * p(x|w2) (being a two-category case)
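The total-probability sum above can be sketched as (hypothetical helper, taking the two priors and the two class-conditional likelihoods already computed for x):

```python
def evidence(priors, likelihoods):
    """p(x) = P(w1) * p(x|w1) + P(w2) * p(x|w2)."""
    return sum(P * p for P, p in zip(priors, likelihoods))
```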

### Posterior Probability

Using Bayes' rule, the posterior probability (conditional probability) is found for each of the two classes:

P(wi|x) = P(wi) * p(x|wi) / p(x)

Now, with the posterior probabilities computed for each of the two classes, we can make a prediction based on the values of P(w1|x) and P(w2|x).

### Prediction

Being a minimum error rate classifier, we define the discriminant function gi(x) as P(wi|x). If P(w1|x) ≥ P(w2|x) (i.e., g1(x) ≥ g2(x)), we predict the class to be w1, and predict w2 otherwise.
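The whole decision rule can be sketched end to end with NumPy (a minimal illustration under assumed parameter names; since the evidence p(x) is a common denominator of both posteriors, comparing the numerators gives the same decision):

```python
import numpy as np

def mvn_pdf(x, mu, cov):
    """Multivariate normal density p(x|wi)."""
    diff = x - mu
    norm = np.sqrt((2 * np.pi) ** len(mu) * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff) / norm

def predict(x, mu1, cov1, mu2, cov2, P1, P2):
    """Return 1 if P(w1|x) >= P(w2|x), else 2.

    p(x) cancels in the comparison, so we compare the unnormalised
    discriminants gi(x) = P(wi) * p(x|wi) directly.
    """
    g1 = P1 * mvn_pdf(x, mu1, cov1)
    g2 = P2 * mvn_pdf(x, mu2, cov2)
    return 1 if g1 >= g2 else 2
```

A point near class 1's mean is assigned to w1, and a point near class 2's mean to w2.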