Naive Bayes Classifier
11 Oct 2018
Naive Bayes is one of the machine learning techniques used for classification and prediction. As the name suggests, it is built on Bayes' theorem. The classifier assumes that all explanatory variables are independent of each other, each contributing on its own to the response variable.
Advantages
- A Naive Bayes classifier can be trained efficiently in a supervised learning setting, because it does not require a large amount of training data (see the short example after this list).
- Despite its simple design, it has been proven to work well in many complex real-world situations.
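For instance, here is a minimal sketch using scikit-learn's GaussianNB on the small iris dataset; the library, dataset, and train/test split are my choices for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# A small, well-known dataset: 150 samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Train on only half of an already small dataset.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

model = GaussianNB()
model.fit(X_train, y_train)

print("accuracy:", model.score(X_test, y_test))
```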
Model
The Naive Bayes model is a conditional probability model. The \(n\) features (independent explanatory variables) are represented as a vector \(\mathbf x = (x_1, x_2, \ldots, x_n)\), which is the data given for the instance to be classified.
The model assigns to each instance the conditional probability \(p(C_k \vert x_1,\ldots,x_n)\), one for each of the \(K\) possible classes \(C_k\). By Bayes' theorem, this can be decomposed as:
\[p(C_k \vert \mathbf x) = \frac{p(C_k)\,p(\mathbf x \vert C_k)}{p(\mathbf x)}\]
The denominator \(p(\mathbf x)\) does not depend on \(C_k\), so it is the same for every class. When comparing the classes to find the best classification, only the numerator matters.
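To make that point concrete, here is a tiny numerical sketch with two hypothetical classes; all priors and likelihoods below are made-up numbers for illustration:

```python
# Hypothetical priors p(C_k) and likelihoods p(x | C_k) for two classes.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {"spam": 0.05, "ham": 0.01}

# Unnormalized posteriors: just the numerators p(C_k) * p(x | C_k).
numerators = {k: priors[k] * likelihoods[k] for k in priors}

# The denominator p(x) is the same for both classes...
evidence = sum(numerators.values())
posteriors = {k: v / evidence for k, v in numerators.items()}

# ...so normalizing does not change which class ranks highest.
best_by_numerator = max(numerators, key=numerators.get)
best_by_posterior = max(posteriors, key=posteriors.get)
assert best_by_numerator == best_by_posterior  # "spam" in both cases
```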
Using the chain rule of conditional probability together with the assumption that the explanatory variables are independent given the class, the numerator above can be rewritten as:
\[p(C_k)\,p(\mathbf x \vert C_k) = p(C_k)\prod_{i=1}^{n} p(x_i \vert C_k)\]
Thus, the final model can be written as:
\[p(C_k \vert \mathbf x) \varpropto p(C_k)\prod_{i=1}^{n} p(x_i \vert C_k)\]
In short, the Naive Bayes Classifier scores each class by the product of the conditional probabilities of all explanatory variables given that class and the probability of the class itself, and then predicts the class with the highest score.
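As a concrete sketch of that formula, here is a small from-scratch implementation for categorical features. The class name, the add-one smoothing, and the toy weather data are illustrative choices, not part of the original derivation:

```python
from collections import Counter, defaultdict

class CategoricalNaiveBayes:
    """Minimal Naive Bayes for categorical features."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.class_counts = Counter(y)
        # Prior p(C_k): relative frequency of each class in the training set.
        self.priors = {c: self.class_counts[c] / len(y) for c in self.classes}
        # Number of distinct values per feature, used for add-one smoothing.
        self.n_values = [len({xs[i] for xs in X}) for i in range(len(X[0]))]
        # counts[c][i][v] = how often feature i took value v within class c.
        self.counts = {c: defaultdict(Counter) for c in self.classes}
        for xs, c in zip(X, y):
            for i, v in enumerate(xs):
                self.counts[c][i][v] += 1
        return self

    def predict(self, xs):
        scores = {}
        for c in self.classes:
            # p(C_k) * prod_i p(x_i | C_k), with add-one (Laplace) smoothing
            # so that an unseen feature value does not zero out the product.
            score = self.priors[c]
            for i, v in enumerate(xs):
                score *= (self.counts[c][i][v] + 1) / (
                    self.class_counts[c] + self.n_values[i])
            scores[c] = score
        return max(scores, key=scores.get)

# Toy weather data (made up): features are (outlook, windy).
X = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"), ("overcast", "no")]
y = ["yes", "yes", "no", "yes"]

model = CategoricalNaiveBayes().fit(X, y)
print(model.predict(("rainy", "yes")))  # -> "no"
```

The add-one smoothing is a common practical tweak to the bare formula: it keeps a single unseen feature value from forcing the entire product, and therefore the class score, to zero.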