A little help needed in code translation (Python to C#)
<p>Good evening everyone,</p>
<p>This question leaves me a little embarrassed because, of course, I know I should be able to find the answer on my own. However, my knowledge of Python is just a little bit more than nothing, so I need help from someone more experienced with it than me...</p>
<p>The following code comes from <a href="http://norvig.com/ngrams" rel="nofollow">Norvig's "Natural Language Corpus Data"</a> chapter in a recently edited book, and it's about transforming a string such as "likethisone" into "[like, this, one]" (that is, segmenting the words correctly)...</p>
<p>I have ported all of the code to C# (in fact, I rewrote the program myself) except for the function <code>segment</code>, whose syntax I am having a lot of trouble even understanding. Can someone please help me translate it into a more readable form in C#?</p>
<p>Thank you very much in advance.</p>
<pre><code>################ Word Segmentation (p. 223)

@memo
def segment(text):
    "Return a list of words that is the best segmentation of text."
    if not text: return []
    candidates = ([first]+segment(rem) for first,rem in splits(text))
    return max(candidates, key=Pwords)

def splits(text, L=20):
    "Return a list of all possible (first, rem) pairs, len(first)&lt;=L."
    return [(text[:i+1], text[i+1:])
            for i in range(min(len(text), L))]

def Pwords(words):
    "The Naive Bayes probability of a sequence of words."
    return product(Pw(w) for w in words)

#### Support functions (p. 224)

def product(nums):
    "Return the product of a sequence of numbers."
    return reduce(operator.mul, nums, 1)

class Pdist(dict):
    "A probability distribution estimated from counts in datafile."
    def __init__(self, data=[], N=None, missingfn=None):
        for key,count in data:
            self[key] = self.get(key, 0) + int(count)
        self.N = float(N or sum(self.itervalues()))
        self.missingfn = missingfn or (lambda k, N: 1./N)
    def __call__(self, key):
        if key in self: return self[key]/self.N
        else: return self.missingfn(key, self.N)

def datafile(name, sep='\t'):
    "Read key,value pairs from file."
    for line in file(name):
        yield line.split(sep)

def avoid_long_words(key, N):
    "Estimate the probability of an unknown word."
    return 10./(N * 10**len(key))

N = 1024908267229 ## Number of tokens

Pw = Pdist(datafile('count_1w.txt'), N, avoid_long_words)
</code></pre>
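<p>As a reading aid, here is a rough sketch of what <code>segment</code> does once the generator expression and the <code>max(..., key=Pwords)</code> call are unrolled into an explicit loop, which may map more directly onto a C# <code>foreach</code>. It is written for Python 3 and makes several assumptions that are not in the original: <code>functools.lru_cache</code> stands in for the <code>@memo</code> decorator (whose definition is not shown in the excerpt), the function returns a tuple instead of a list so that cached results cannot be mutated, and the <code>COUNTS</code> table is a made-up toy replacement for <code>count_1w.txt</code>.</p>

```python
from functools import lru_cache

# Hypothetical toy counts standing in for count_1w.txt (the values are made up).
COUNTS = {"like": 50, "this": 80, "one": 60, "is": 90, "on": 40, "e": 1}
N = sum(COUNTS.values())


def pw(word):
    """Probability of one word; unknown words are penalised by their length."""
    if word in COUNTS:
        return COUNTS[word] / N
    return 10.0 / (N * 10 ** len(word))


def pwords(words):
    """Naive Bayes probability of a sequence: the product of the word probabilities."""
    result = 1.0
    for w in words:
        result *= pw(w)
    return result


def splits(text, max_len=20):
    """All (first, rem) pairs where first is a non-empty prefix of at most max_len chars."""
    pairs = []
    for i in range(min(len(text), max_len)):
        pairs.append((text[:i + 1], text[i + 1:]))
    return pairs


@lru_cache(maxsize=None)  # assumed stand-in for @memo: cache results per input string
def segment(text):
    """Return the most probable segmentation of text as a tuple of words."""
    if not text:
        return ()
    best, best_score = None, -1.0
    for first, rem in splits(text):
        # Candidate = this prefix followed by the best segmentation of the remainder.
        candidate = (first,) + segment(rem)
        score = pwords(candidate)
        if score > best_score:  # strict '>' keeps the first maximum, like max()
            best, best_score = candidate, score
    return best


print(list(segment("likethisone")))  # ['like', 'this', 'one'] with the toy counts above
```

<p>The memoisation matters for more than speed: without it the recursion re-segments the same suffixes repeatedly, so a C# port would probably want a dictionary keyed by the input string for the same purpose.</p>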