There is a slight flaw in [ghoseb's](https://stackoverflow.com/questions/464342/combining-two-sorted-lists-in-python#464454) solution, making it O(n\*\*2) rather than O(n). The problem is that it performs:

```python
item = l1.pop(0)
```

With linked lists or deques this would be an O(1) operation, so it wouldn't affect complexity, but since Python lists are implemented as vectors, this copies the rest of the elements of `l1` one space left, an O(n) operation. Since this is done on each pass through the list, it turns an O(n) algorithm into an O(n\*\*2) one. This can be corrected by using a method that doesn't alter the source lists, but just keeps track of the current position.

I benchmarked a corrected algorithm against a simple `sorted(l1 + l2)` as suggested by [dbr](https://stackoverflow.com/questions/464342/combining-two-sorted-lists-in-python#464538):

```python
def merge(l1, l2):
    if not l1:
        return list(l2)
    if not l2:
        return list(l1)

    # Swap so l2 contains the last (largest) element; this guarantees
    # the iterator over l2 is never exhausted inside the loop over l1.
    if l1[-1] > l2[-1]:
        l1, l2 = l2, l1

    it = iter(l2)
    y = next(it)
    result = []
    for x in l1:
        while y < x:
            result.append(y)
            y = next(it)
        result.append(x)
    result.append(y)
    result.extend(it)
    return result
```

I've tested these with lists generated with:

```python
l1 = sorted([random.random() for i in range(NITEMS)])
l2 = sorted([random.random() for i in range(NITEMS)])
```

For various sizes of list, I get the following timings (repeating 100 times; a reproducible harness is sketched at the end of this answer):

```
# items:  1000   10000   100000   1000000
merge  :  0.079  0.798    9.763   109.044
sort   :  0.020  0.217    5.948   106.882
```

So in fact, it looks like dbr is right: just using `sorted()` is preferable unless you're expecting very large lists, even though it has worse algorithmic complexity. The break-even point is at around a million items in each source list (2 million total).

One advantage of the merge approach, though, is that it is trivial to rewrite as a generator, which will use substantially less memory (no need for an intermediate list); a sketch of that rewrite is also included at the end of this answer.

**[Edit]** I've retried this with a situation closer to the question: using a list of objects containing a field `date`, which is a datetime object. The above algorithm was changed to compare against `.date` instead, and the sort method was changed to:

```python
return sorted(l1 + l2, key=operator.attrgetter('date'))
```

This does change things a bit. The comparison being more expensive means that the number of comparisons we perform becomes more important, relative to the constant-time speed of the implementation. This means merge makes up lost ground, surpassing the `sort()` method at 100,000 items instead. Comparing based on an even more complex object (large strings or lists, for instance) would likely shift this balance even more.

```
# items:  1000   10000   100000   1000000[1]
merge  :  0.161  2.034   23.370   253.68
sort   :  0.111  1.523   25.223   313.20
```

[1]: Note: I actually only did 10 repeats for 1,000,000 items and scaled up accordingly, as it was pretty slow.
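
For anyone wanting to reproduce the first table, a harness along these lines should give comparable numbers. This is my reconstruction of the setup described above, not the exact script used for the timings; it assumes the `merge` function from earlier in this answer is in scope, and `NITEMS` matches the variable in the generation snippet:

```python
import random
import timeit

NITEMS = 10000
l1 = sorted(random.random() for i in range(NITEMS))
l2 = sorted(random.random() for i in range(NITEMS))

# 100 repeats, matching the "repeating 100 times" note above.
print("merge:", timeit.timeit(lambda: merge(l1, l2), number=100))
print("sort :", timeit.timeit(lambda: sorted(l1 + l2), number=100))
```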
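
And as a follow-up to the generator point: here's a minimal sketch of what that rewrite might look like (`merge_gen` is a name I'm introducing for illustration, not part of the benchmarked code):

```python
def merge_gen(l1, l2):
    # Lazily merge two sorted iterables, yielding items in order.
    # No intermediate list is built, so extra memory stays O(1).
    it1, it2 = iter(l1), iter(l2)
    sentinel = object()  # marks an exhausted iterator
    x, y = next(it1, sentinel), next(it2, sentinel)
    while x is not sentinel and y is not sentinel:
        if y < x:
            yield y
            y = next(it2, sentinel)
        else:
            yield x
            x = next(it1, sentinel)
    # One side is exhausted; drain whichever remains.
    while x is not sentinel:
        yield x
        x = next(it1, sentinel)
    while y is not sentinel:
        yield y
        y = next(it2, sentinel)
```

For the record, the standard library's `heapq.merge` already provides exactly this kind of lazy merge of sorted inputs.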