StackOverflow2013

Mapping a list to a Huffman Tree whilst preserving relative order
primarykey
Id
20223488
data
AcceptedAnswerId
20290611
AnswerCount
2
ClosedDate
CommentCount
5
CommunityOwnedDate
CreationDate
2013-11-26T17:07:19.067
FavoriteCount
0
LastActivityDate
2013-12-02T17:30:58.667
LastEditDate
2013-11-29T12:01:56.293
LastEditorUserId
755671
OwnerUserId
755671
ParentId
0
PostTypeId
1
Score
2
ViewCount
680
LastEditorDisplayName
text
Body
I'm having an issue with a search algorithm over a Huffman tree: for a given probability distribution I need the Huffman tree to be identical regardless of permutations of the input data. Here is a picture of what's happening vs what I want:

<img src="https://i.stack.imgur.com/t4G7u.jpg" alt="Expectation vs reality">

Basically I want to know whether it's possible to preserve the relative order of the items from the list in the tree. If not, why not?

For reference, I'm using the Huffman tree to generate subgroups according to a division of probability, so that I can run the search() procedure below. Notice that in the merge() sub-routine the data is combined along with the weight. The codewords themselves aren't as important as the tree (which should preserve the relative order).

For example, if I generate Huffman codes as follows:

<pre><code>probabilities = [0.30, 0.25, 0.20, 0.15, 0.10]
items = ['a','b','c','d','e']
items = zip(items, probabilities)
t = encode(items)
d,l = hi.search(t)
print(d)
</code></pre>

using the following class:

<pre><code>import pdb
from heapq import heapify, heappop, heappush

# Python 2 code: ordering on the heap relies on __cmp__ and cmp().
class Node(object):
    left = None
    right = None
    weight = None
    data = None
    code = None

    def __init__(self, w,d):
        self.weight = w
        self.data = d

    def set_children(self, ln, rn):
        self.left = ln
        self.right = rn

    def __repr__(self):
        return "[%s,%s,(%s),(%s)]" %(self.data,self.code,self.left,self.right)

    def __cmp__(self, a):
        return cmp(self.weight, a.weight)

    def merge(self, other):
        total_freq = self.weight + other.weight
        new_data = self.data + other.data
        return Node(total_freq,new_data)

    def index(self, node):
        return node.weight

def encode(symbfreq):
    pdb.set_trace()
    # symbfreq holds (symbol, probability) pairs, so wt is bound to the
    # symbol and sym to the probability; Node(sym, wt) is Node(weight, data).
    tree = [Node(sym,wt) for wt,sym in symbfreq]
    heapify(tree)
    while len(tree)>1:
        lo, hi = heappop(tree), heappop(tree)
        n = lo.merge(hi)
        n.set_children(lo, hi)
        heappush(tree, n)
    tree = tree[0]

    def assign_code(node, code):
        if node is not None:
            node.code = code
        if isinstance(node, Node):
            assign_code(node.left, code+'0')
            assign_code(node.right, code+'1')

    assign_code(tree, '')
    return tree
</code></pre>

I get:

<pre><code>'a'->11
'b'->01
'c'->00
'd'->101
'e'->100
</code></pre>

However, an assumption I've made in the search algorithm is that more probable items get pushed toward the left: that is, I need 'a' to have the '00' codeword, and this should always be the case regardless of any permutation of the 'abcde' sequence. An example of the output I want is:

<pre><code>codewords = {'a':'00', 'b':'01', 'c':'10', 'd':'110', 'e':'111'}
</code></pre>

(N.B. even though the codeword for 'c' is a suffix of the codeword for 'd', this is OK.)

For completeness, here is the search algorithm:

<pre><code>def search(tree):
    print(tree)
    pdb.set_trace()
    current = tree.left
    other = tree.right
    loops = 0
    while current:
        loops+=1
        print(current)
        if current.data != 0 and current is not None and other is not None:
            previous = current
            current = current.left
            other = previous.right
        else:
            previous = other
            current = other.left
            other = other.right
    return previous, loops
</code></pre>

It works by searching for the 'leftmost' 1 in a group of 0s and 1s, so the Huffman tree has to put more probable items on the left. For example, if I use the probabilities above and the input:

<pre><code>items = [1,0,1,0,0]
</code></pre>

then the index of the item returned by the algorithm is 2, which isn't what should be returned (0 should be, as it's the leftmost).
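To make the desired mapping concrete, here is a minimal sketch of one way to obtain the codewords listed above. It swaps in a different technique from the code in the question: instead of reading codes off the heap-built tree, it keeps only the Huffman code lengths and then assigns codewords canonically, shortest and most probable first. The function names `code_lengths` and `canonical_codes` are hypothetical, the sketch is written for Python 3 (unlike the Python 2 code above), it assumes at least two symbols, and it produces codewords only, not a `Node` tree.

```python
import heapq
from itertools import count


def code_lengths(symbol_probs):
    """Return {symbol: Huffman code length} for (symbol, probability) pairs."""
    tiebreak = count()  # unique counter so the heap never compares symbol lists
    # Sort first so the result does not depend on the order of the input.
    ordered = sorted(symbol_probs, key=lambda sp: (-sp[1], sp[0]))
    heap = [(p, next(tiebreak), [s]) for s, p in ordered]
    heapq.heapify(heap)
    lengths = {s: 0 for s, _ in symbol_probs}
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        # Every symbol under the merged node sits one level deeper.
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, next(tiebreak), syms1 + syms2))
    return lengths


def canonical_codes(symbol_probs):
    """Assign codewords in order of (length, descending probability, symbol)."""
    probs = dict(symbol_probs)
    lengths = code_lengths(symbol_probs)
    ordered = sorted(probs, key=lambda s: (lengths[s], -probs[s], s))
    codes = {}
    code = 0
    prev_len = lengths[ordered[0]]
    for s in ordered:
        # Moving to a longer code length appends zero bits on the right.
        code <<= lengths[s] - prev_len
        codes[s] = format(code, "0{}b".format(lengths[s]))
        prev_len = lengths[s]
        code += 1
    return codes


items = list(zip("abcde", [0.30, 0.25, 0.20, 0.15, 0.10]))
print(canonical_codes(items))
# {'a': '00', 'b': '01', 'c': '10', 'd': '110', 'e': '111'}
print(canonical_codes(items[::-1]) == canonical_codes(items))  # True
```

Because the symbols are sorted before anything else happens, any permutation of the input gives the same codewords. A tree with more probable items on the left could then be rebuilt from these codewords by following '0' as left and '1' as right.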
Tags
<python><data-structures><huffman-code>
Title
Mapping a list to a Huffman Tree whilst preserving relative order
UserLastEditorUserId
1. Tom Kealy
UserOwnerUserId
1. Tom Kealy
VotesPostIdCreationDate
1. BountyStart (user: Tom Kealy)
2. ApproveEditSuggestion
3. UpMod
