StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

PO
primarykey
Id
8950266
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
2
CommunityOwnedDate
CreationDate
2012-01-21T02:51:33.180
FavoriteCount
0
LastActivityDate
2012-01-21T02:51:33.180
LastEditDate
LastEditorUserId
0
OwnerUserId
16007
ParentId
8949514
PostTypeId
2
Score
3
ViewCount
0
LastEditorDisplayName
text
Body
Pipelining and caches and the cpu itself no longer being the primary bottleneck has done two things to your question. One, the cpu's today generally execute one instruction per clock, second it can take many (dozens to hundreds) of clocks to feed the cpu an instruction. The more modern processors, even if their instruction sets are old, rarely bother to mention clock execution because it is one clock and the "real" execution speed is too hard to describe. The cache and pipeline try to allow the cpu to run at this one instruction per clock rate, but for example a read from memory, has to wait for the response to come back. If this item is not in cache this can be hundreds of clock cycles as it will have to read a number of locations to fill a line in the cache then some more clocks to get it through the caches back to the processor. Now if you go back in time, or present time but in the microcontroller world for example or other system where the memory system can respond in one clock, or at least a very deterministic number (say two clocks for eeprom and one for ram, that kind of thing), then you can very easily count the exact number of clocks. Processors like often do publish a table of cycles per instruction. A two instruction read for example would be two clocks to fetch the instruction, then another clock to perform the read, 3 clocks minimum. some would actually take more than one clock to execute so that would be added in as well. I highly recommend finding a (used) copy of Zen of Assembly Language by Michael Abrash. It was dated when it came out but still an important work. learning to juggle the relatively simple 8088/86 was tough enough, todays x86 and other systems are quite a bit more complicated. If running windows or linux or something like that trying to time your code wont necessarily get you to where you want. add or remove a nop, causing the code to be aligned in memory by as much as a byte can have dramatic affects on the performance of the remainder of the code which other than its location in ram has not changed. As a simple example of understanding the complicated nature of the problem. What processor or system are you interested in? the stm32f4 discovery board, about $20, contains an ARM (cortex-m) processor with instruction and data caches. It has the complications of a bigger system, but at the same time simple enough (relative to a bigger system) to be able to have controlled experiments. If you are familiar with the microchip pic world they often count cycles to perform precision delays between events. A very deterministic environment (so long as you dont use interrupts). 
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POHow long does each machine language instruction take to execute?
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. USold_timer
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId
1. COIt seems to me your answer is out of date with respect to modern out-of-order processors, which don't execute instructions one by one, or even in the order they are laid out in memory. Of course there are still many low-end microprocessors that are based on a pipelined in-order design.
 singulars
 PostPostId
 PO
 UserUserId
 UShan
2. COit is very much in line, that just adds to the complication, but at the same time you still have a list of instructions trying to be fed into an execution unit through a pipe which puts you back into the same problem. You can chose to view it at the single pipe, single execution level which you see today, or back out and see multiple execution units, branch prediction, causing more chaos with the cache, etc.
 singulars
 PostPostId
 PO
 UserUserId
 USold_timer

Querying!

Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Related data can be related in a to-one or a to-many fashion. Captions of data related in a to-many fashion link to a list view showing a filtered view of the table.

Try moving around until you find a non-empty to-many entry and click on the label to get to one. You can move back to the root by clicking on the database name in the header.