No, I get that. I’m sure the programming language design people know what they are doing. I just can’t grasp how a double (which has to use at least 1 bit to represent whether or not there is a fractional component) can possibly store more exact integer vales than an integer type of the same length (same number of bits).
It just seems to violate some law of information theory to my novice mind.
It doesn’t. A double is a 64 bit value while an integer is 32 bit. A long is a 64 bit signed integer which stores more exact integer numbers than a double.
Technically, a double stores most integers exactly ( up until a certain value ) and then approximations of integers of much larger sizes. A long stores all its integers exactly but cannot handle values nearly as large.
For most real world data ranges, they are both going to store integers exactly.
I don’t think that’s possible. Representing more exact ints means representing larger ints and vice versa. I’m ignoring signed vs. unsigned here as in theory both the double and int/long can be signed or unsigned.
Edit: ok, I take this back. I guess you can represent larger values as long as you are ok that they will be estimates. Ie, double of N (for some very large N) will equal double of N + 1.
You can think of a double as having a fixed precision, but, in contrast to an integer, this precision can be moved over the decimal point depending on the value you want to represent. Therefore, despite representing floating-point numbers, a double still has discrete steps determined by its binary representation of 64 bits. If the value of a double gets larger, it reaches a point where the smallest difference between two subsequent doubles is greater than one.
For float (32 bit), you reach this point at 16777216. The next larger number to be represented as a float is 16777218 (i.e., +2).
I agree with all that. But I’m talking about exact integer values as mentioned in the parent.
I just think this has to be true:
count(exact integers that can be represented by a N bit floating point variable) < count(exact integers that can be represented by an N bit int type variable)
I’m going to guess here (cause I feel this community is for learning)…
Integers have exactness. Doubles have range.
So if MAX_INT + 1 is possible, then ~(MAX_INT + 1) is probably preferable to an overflow or silent MIN_INT.
But Math.ceil probably expects a float, because it is dealing with decimals (or similar). If it was an int, rounding wouldn’t be required.
So if Math.ceil returned and integer, then it could parse a float larger than INT_MAX, which would overflow an int (so error, or overflow). Or just return a float
I would need to look into the exact difference of double vs integer to know, but a partially educated guess is that they are referring to Int32 vs double and not Int64, aka long. I did a small search and saw that double uses 32 bits for the whole numbers and the others for the decimal.
Okay, so I dug in a bit deeper. Doubles are standardized as a 64 bit bundle that is divided into 1 signed bit, 11 exponetioal bits and 52 bits for decimal. It’s quite interesting. As to how it works indepth, I probably will try to analyze a bit conversion if I can try something
No, I get that. I’m sure the programming language design people know what they are doing. I just can’t grasp how a double (which has to use at least 1 bit to represent whether or not there is a fractional component) can possibly store more exact integer vales than an integer type of the same length (same number of bits).
It just seems to violate some law of information theory to my novice mind.
It doesn’t. A double is a 64 bit value while an integer is 32 bit. A long is a 64 bit signed integer which stores more exact integer numbers than a double.
Technically, a double stores most integers exactly ( up until a certain value ) and then approximations of integers of much larger sizes. A long stores all its integers exactly but cannot handle values nearly as large.
For most real world data ranges, they are both going to store integers exactly.
It doesn’t store more values bit for bit, but it can store larger values.
I don’t think that’s possible. Representing more exact ints means representing larger ints and vice versa. I’m ignoring signed vs. unsigned here as in theory both the double and int/long can be signed or unsigned.
Edit: ok, I take this back. I guess you can represent larger values as long as you are ok that they will be estimates. Ie, double of N (for some very large N) will equal double of N + 1.
You can think of a double as having a fixed precision, but, in contrast to an integer, this precision can be moved over the decimal point depending on the value you want to represent. Therefore, despite representing floating-point numbers, a double still has discrete steps determined by its binary representation of 64 bits. If the value of a double gets larger, it reaches a point where the smallest difference between two subsequent doubles is greater than one. For float (32 bit), you reach this point at 16777216. The next larger number to be represented as a float is 16777218 (i.e., +2).
Here is a nice online tool that demonstrates this (and contains much more information on the encoding of floating-point numbers): https://www.h-schmidt.net/FloatConverter/IEEE754.html
I agree with all that. But I’m talking about exact integer values as mentioned in the parent.
I just think this has to be true: count(exact integers that can be represented by a N bit floating point variable) < count(exact integers that can be represented by an N bit int type variable)
I’m going to guess here (cause I feel this community is for learning)…
Integers have exactness. Doubles have range.
So if
MAX_INT + 1
is possible, then~(MAX_INT + 1)
is probably preferable to an overflow or silentMIN_INT
.But
Math.ceil
probably expects a float, because it is dealing with decimals (or similar). If it was an int, rounding wouldn’t be required.So if
Math.ceil
returned and integer, then it could parse a float larger than INT_MAX, which would overflow an int (so error, or overflow). Or just return a floatI would need to look into the exact difference of double vs integer to know, but a partially educated guess is that they are referring to Int32 vs double and not Int64, aka long. I did a small search and saw that double uses 32 bits for the whole numbers and the others for the decimal.
Yeah, that was my guess too. But that just means they could return a long (or whatever the 64 bit int equivalent in java is) instead of an int.
Okay, so I dug in a bit deeper. Doubles are standardized as a 64 bit bundle that is divided into 1 signed bit, 11 exponetioal bits and 52 bits for decimal. It’s quite interesting. As to how it works indepth, I probably will try to analyze a bit conversion if I can try something
Oh now I get what you mean, and like others mentioned, yeah it’s more bits :)