Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>It is called an N-gram; in your case a 4-gram. It can indeed be obtained as the by-product of a Markov-chain, but you could also use a sliding window (size 4) to walk through the (linear) text while updating a 4-dimensionsal "histogram". </p> <p>UPDATE 2011-11-22: A markov chain is a way to model the probability of switching to a new state, given the current state. This is the stochastic equivalent of a "state machine". In the natural language case, the "state" is formed by the "previous N words", which implies that you consider the prior probability (before the previous N words) as equal_to_one. Computer people will most likely use a tree for implementing Markov chains in the NLP case. The "state" is simply the path taken from the root to the current node, and the probabilities of the words_to_follow are the probabilities of the current node's offspring. But every time that we choose a new child node, we actually shift down the tree, and "forget" the root node, out window is only N words wide, which translates to N levels deep into the tree.</p> <p>You can easily see that if you are walking a Markov chain/tree like this, at any time the probability before the first word is 1, the probability after the first word is P(w1), after the second word = P(w2) || w1, etc. So, when processing the corpus you build a Markov tree ( := update the frequencies in the nodes), at the end of the ride you can estimate the probability of a given choice of word by freq(word) / SUM(freq(siblings)). For a word 5-deep into the tree this is the <strong>probability of the word, given the previous 4 words</strong>. If you want the N-gram probabilities, you want the <strong>product of all the probabilities in the path</strong> from the root to the last word.</p>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload