StackOverflow2013

<p>Naive Bayes probabilistic classifiers are commonly used in text categorization. The basic idea is to use the joint probabilities of words and categories to estimate the probability of each category given a document. The "naive" part of the model is the assumption that words occur independently of one another given the category. This assumption makes Naive Bayes far cheaper to compute than non-naive Bayes approaches, whose complexity is exponential, because it does not use word combinations as predictors. If the task is to assign a test document to a single class, the class with the highest posterior probability is selected.</p>
<p>Here is one reference: [1] Tom Mitchell, "Machine Learning", McGraw-Hill, 1997 (Section 6.10).</p>
<p>If you treat each question category as a text type, you can apply text categorization to your problem.</p>
<p>A Naive Bayes classifier is based on Bayes' theorem, under the assumption that all the features (or attributes) are independent of one another given the class.</p>
<p>It is very easy to implement, and many software packages provide an implementation; the e1071 package in R is one of them. Here is sample R code that uses its naive Bayes classifier:</p>
<pre><code>library(e1071)

# Shuffle the rows, then split them 70% / 30% into training and test sets
N <- nrow(data)
Ntrain <- round(N * 0.7)
data <- data[sample(1:N), ]
train <- data[1:Ntrain, ]
test <- data[(Ntrain + 1):N, ]

# Column 13 is used as the class label, columns 3 to 12 as the features
y <- as.factor(train[, 13])
x <- train[, 3:12]
y_test <- as.factor(test[, 13])
x_test <- test[, 3:12]

# Fit the model, then predict classes for the test and training sets
m <- naiveBayes(x, y)
pred_test <- predict(m, x_test, type = "class")
pred <- predict(m, x, type = "class")
</code></pre>
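<p>The rule of picking the class with the highest posterior probability can be written out as a short sketch in LaTeX notation. The symbols are assumptions introduced here for illustration and are not taken from the reference: <code>d</code> is a document made up of the words <code>w_1, ..., w_n</code>, and <code>c</code> ranges over the candidate categories.</p>
<pre><code>\hat{c} = \arg\max_{c} P(c \mid d)
        = \arg\max_{c} \frac{P(c)\, P(d \mid c)}{P(d)}
        = \arg\max_{c} P(c) \prod_{i=1}^{n} P(w_i \mid c)
</code></pre>
<p>The denominator <code>P(d)</code> is the same for every category, so it can be dropped from the maximization. The last step is where the word independence assumption is used: the likelihood of the whole document factors into one term per word. Implementations commonly sum log-probabilities instead of multiplying the raw probabilities, which avoids numerical underflow on long documents.</p>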
