StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

PO
primarykey
Id
9960007
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
8
CommunityOwnedDate
CreationDate
2012-03-31T21:50:53.083
FavoriteCount
0
LastActivityDate
2012-04-04T18:39:48.817
LastEditDate
2012-04-04T18:39:48.817
LastEditorUserId
1274815
OwnerUserId
1274815
ParentId
9959902
PostTypeId
2
Score
4
ViewCount
0
LastEditorDisplayName
text
Body
As Alexandre C. explains, <a href="http://en.wikipedia.org/wiki/IEEE_754-2008" rel="nofollow">IEEE doubles</a> have a 53 bit mantissa (52 stored and top bit implied), and float has 24 bits (23 bits stored, and top bit implied). Edit: (Thanks for feedback, I hope this is clearer) When an integer is converted to a double <code>double f = (double)1024;</code>, the number is held with an appropriate exponent (1023+10), and the same bit pattern is effectively stored as the original integer (Actually IEEE binary floating point does not store the top bit. IEEE floating point numbers are 'normalised' to have the top bit = 1, by adjusting the exponent, then the top 1 is trimmed off because it is 'implied', which saves a bit of storage). A 32bit integer will require a double to hold its value perfectly, and an 8bit integer will be held perfectly in a float. There is no loss of information there. It can be converted back to an integer without loss. Loss happens with arithmetic, and fractional values. The integer is not mapped to +/-1 unless code does it. When code divides that 32bit integer, stored as a double, to map it to the range +/-1, then error will very likely be introduced. That mapping to +/-1 will loose some of the 53bit precision, but the error will only be in the lowest bits, well below the 32bits needed for the original integer. Subsequent operations might also lose precision. For example multiplying two numbers with a resulting range of more than 53 bits of precision will lose some bits (i.e. multiply two numbers with mantissa's more than 27 significant bits). An explanation of floating point which might be helpful is <a href="http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html" rel="nofollow">"What Every Computer Scientist Should Know About Floating-Point Arithmetic"</a> It explains some of the counter-intuitive (to me) behaviour of floating point numbers. For example, the number 0.1 can not be held exactly in an IEEE binary floating point double. This program might help you see what is happening: <pre><code>/* Demonstrate IEEE 'double' encoding on x86 * Show bit patterns and 'printf' output for double values * Show error representing 0.1, and accumulated error of adding 0.1 many times * G Bulmer 2012 */ #include <stdio.h> typedef struct { unsigned long long mantissa :52; unsigned exponent :11; unsigned sign :1; } double_bits; const unsigned exponent_offset = 1023; typedef union { double d; unsigned long long l; double_bits b; } Xlate; void print_xlate(Xlate val) { const long long IMPLIED = (1LL<<52); if (val.b.exponent == 0) { /* zero? */ printf("val: d: %19lf bits: %016llX [sign: %u exponent: zero=%u mantissa: %llX]\n", val.d, val.l, val.b.sign, val.b.exponent, val.b.mantissa); } else { printf("val: d: %19lf bits: %016llX [sign: %u exponent: 2^%4-d mantissa: %llX]\n", val.d, val.l, val.b.sign, ((int)val.b.exponent)-exponent_offset, (IMPLIED|val.b.mantissa)); } } double add_many(double d, int many) { double accum = 0.0; while (many-- > 0) { /* only works for +d */ accum += d; } return accum; } int main (int argc, const char * argv[]) { Xlate val; val.b.sign = 0; val.b.exponent = exponent_offset+1; val.b.mantissa = 0; print_xlate(val); val.d = 1.0; print_xlate(val); val.d = 0.0; print_xlate(val); val.d = -1.0; print_xlate(val); val.d = 3.0; print_xlate(val); val.d = 7.0; print_xlate(val); val.d = (double)((1LL<<31)-1LL); print_xlate(val); val.d = 2147483647.0; print_xlate(val); val.d = 10000.0; print_xlate(val); val.d = 100000.0; print_xlate(val); val.d = 1000000.0; print_xlate(val); val.d = 0.1; print_xlate(val); val.d = add_many(0.1, 100000); print_xlate(val); val.d = add_many(0.1, 1000000); print_xlate(val); val.d = add_many(0.1, 10000000); print_xlate(val); val.d = add_many(0.1,10); print_xlate(val); val.d *= 2147483647.0; print_xlate(val); int i = val.d; printf("int i=truncate(d)=%d\n", i); int j = lround(val.d); printf("int i=lround(d)=%d\n", j); val.d = add_many(0.0001,1000)-0.1; print_xlate(val); return 0; } </code></pre> The output is: <pre><code>val: d: 2.000000 bits: 4000000000000000 [sign: 0 exponent: 2^1 mantissa: 10000000000000] val: d: 1.000000 bits: 3FF0000000000000 [sign: 0 exponent: 2^0 mantissa: 10000000000000] val: d: 0.000000 bits: 0000000000000000 [sign: 0 exponent: zero=0 mantissa: 0] val: d: -1.000000 bits: BFF0000000000000 [sign: 1 exponent: 2^0 mantissa: 10000000000000] val: d: 3.000000 bits: 4008000000000000 [sign: 0 exponent: 2^1 mantissa: 18000000000000] val: d: 7.000000 bits: 401C000000000000 [sign: 0 exponent: 2^2 mantissa: 1C000000000000] val: d: 2147483647.000000 bits: 41DFFFFFFFC00000 [sign: 0 exponent: 2^30 mantissa: 1FFFFFFFC00000] val: d: 2147483647.000000 bits: 41DFFFFFFFC00000 [sign: 0 exponent: 2^30 mantissa: 1FFFFFFFC00000] val: d: 10000.000000 bits: 40C3880000000000 [sign: 0 exponent: 2^13 mantissa: 13880000000000] val: d: 100000.000000 bits: 40F86A0000000000 [sign: 0 exponent: 2^16 mantissa: 186A0000000000] val: d: 1000000.000000 bits: 412E848000000000 [sign: 0 exponent: 2^19 mantissa: 1E848000000000] val: d: 0.100000 bits: 3FB999999999999A [sign: 0 exponent: 2^-4 mantissa: 1999999999999A] val: d: 10000.000000 bits: 40C388000000287A [sign: 0 exponent: 2^13 mantissa: 1388000000287A] val: d: 100000.000001 bits: 40F86A00000165CB [sign: 0 exponent: 2^16 mantissa: 186A00000165CB] val: d: 999999.999839 bits: 412E847FFFEAE4E9 [sign: 0 exponent: 2^19 mantissa: 1E847FFFEAE4E9] val: d: 1.000000 bits: 3FEFFFFFFFFFFFFF [sign: 0 exponent: 2^-1 mantissa: 1FFFFFFFFFFFFF] val: d: 2147483647.000000 bits: 41DFFFFFFFBFFFFF [sign: 0 exponent: 2^30 mantissa: 1FFFFFFFBFFFFF] int i=truncate(d)=2147483646 int i=lround(d)=2147483647 val: d: 0.000000 bits: 3CE0800000000000 [sign: 0 exponent: 2^-49 mantissa: 10800000000000] </code></pre> That shows a full 32-bit int is represented exactly, and 0.1 is not. It shows that printf does not print exactly the floating point number but rounds or truncates (a thing to be wary of). It also illustrates that the amount of error in that representation of 0.1 doesn't accumulate to a large enough value in 1,000,000 add operations to cause printf to print it. It shows that the original integer can be recovered by rounding, but not assignment because assignment truncates. It shows that the subtraction operation can 'amplify' error (all that is left after that subtraction is error), and hence arithmetic should be carefully analysed. To put this into the context of music, where the sample rate might be 96KHz. It would take more than 10 seconds of additions before the error had built up enough for it to cause the top 32bits to contain more than 1 bit in error. Further. Christopher “Monty” Montgomery who created Ogg and Vorbis argues that 24 bits should be more than enough for audio in an article on music, sampling rate and sample resolution <a href="http://people.xiph.org/~xiphmont/demo/neil-young.html" rel="nofollow">24/192 Music Downloads ...and why they make no sense</a> Summary double holds 32-bit integers perfectly. There are rational decimal numbers of the form N/M (where M and N can be represented by a 32bit integer) which can not be represented by a finite sequence of binary-fraction bits. So, when an integer is mapped to the range +/-1, and hence is converted to a rational number (N/M) some numbers can not be represented by the finite number of bits in a double's fractional part, so errors will creep in. Those errors are typically very small, in the lowest bits, hence well below the upper 32 bits. So they can be converted back and forth between integer and double using rounding, and the error of double representation will not cause the integer to be wrong. BUT, arithmetic can change error. Incorrectly constructed arithmetic can cause the errors to grow rapidly, and could grow to a magnitude where the original integer value has been corrupted. Other thoughts: If precision is critical, there are other ways you might use doubles. None of them are as convenient as mapping to +/-1. Everything I can think of would require the arithmetic operations to be tracked, which would best be done using C++ wrapper classes. This would dramatically slow calculation, so may be pointless. This is a very sneaky way of doing <a href="http://blog.sigfpe.com/2005/07/automatic-differentiation.html" rel="nofollow">'Automatic Diferentiation'</a> by wrapping arithmetic in classes which keep track of extra information. I think the ideas in there might inspire an approach. It might even help identify where precision is lost. 
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POHow many bits of precision for a double between -1.0 and 1.0?
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. USgbulmer
UserOwnerUserId
1. USgbulmer
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTDownMod
2. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId

Querying!

Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Related data can be related in a to-one or a to-many fashion. Captions of data related in a to-many fashion link to a list view showing a filtered view of the table.

Try moving around until you find a non-empty to-many entry and click on the label to get to one. You can move back to the root by clicking on the database name in the header.