Overflow
Two words very common in Computer Science are
``buffer overflow'' and ``integer overflow''. Buffer overflow happens
when a programmer instructs the computer to write more bytes than the
buffer may hold; For example: programmer declares char buffer[2] and
instructs computer: buffer[0] = 'o', buffer[1] = 'v', buffer[2] =
'e'. Since the array named ``buffer'' is not big enough to store the
letter ``e'', the last one that was attempted, programmer has
overflown the buffer and that causes, many times, security holes in
the program.
Integer overflow happens when programmer instructs computer to store a
number that requires more bits than the integer is capable of; for
example: how could the computer store the number 2 in a 1-bit integer
data type? In binary (base 2), numbers are expressed with zeros and
ones. So, 0d = 0b, 1d = 1b, 2d = 10b. So the number 2 requires 2 bits
and if attempted to be stored in a 1-bit integer, overflow is what
happens.
Solution
Certify availability. If a buffer can only hold x
amount of bytes, make sure computer won't go beyond that. If integer
has only y bits, make sure integers have less or equal number of bits
than y before instructing computer to store it.
Implementation of scan_uint64()
The function below was
written by D. J. Bernstein; the overflow detection was written and
nicely explained by Wayne
Marshall. The function converts a buffer s to an unsigned integer
with 64 bits available. That is, the number can range from 0 to 2^64 -
1.
int scan_uint64(register char *s, register uint64 *u) {
register unsigned int pos;
register uint64 r;
register uint64 c;
pos = 0; r = 0;
for ( ;; ) {
c = (uint64) (unsigned char) (s[pos] - '0');
if (c < 10) {
r = r * 10 + c;
++pos; continue;
}
break;
}
*u = r;
return pos;
}
The function converts chars to integers as usual, but r may
overflow. The next implementation solves the problem.
int scan_uint64(register char *s, register uint64 *u) {
register unsigned int pos;
register uint64 r;
register uint64 c;
pos = 0; r = 0;
for ( ;; ) {
c = (uint64) (unsigned char) (s[pos] - '0');
if (c < 10) {
if( ((ULLONG_MAX - c) / 10) >= r)
r = r * 10 + c;
else return -1; /* overflow */
++pos; continue;
}
break;
}
*u = r;
return pos;
}
Constant ULLONG_MAX is defined as 0xffffffffffffffffULL in
some systems and in a similar fashion in others --- depending on how
many bytes is the size of a long.
The conversion is done by multiplying r by the base to append
a zero at the end. Example: r = 5. Multiply by 10: r = 50. The next
character to be converted is char = '4'. Convert it to integer c; c =
4. Add it to r. r = 50 + 4 = 54.
Verifying
The verification made by ((ULLONG_MAX - c) / 10) >= r) can be
geometrically represented and it may offer a clearer view of why it
works. Let's take a 4-bit integer data type. With 4 bits, the maximum
value we can store in an unsigned integer is 15. We will try to add 10
+ 6 --- which overflows --- and demonstrate how the verification from
scan_uint64() traps the overflow. We use # to
represent a digit and we call it a ``block''.
-----########## 10 blocks
---------###### 6 blocks
By adding 10 + 6, we are putting these blocks together. To check, we
can verify if one block would overlap on another.
############### 15 blocks (maximum)
########## 10 blocks
###### 6 blocks
^ It overlaps by one block. Overflow.
This is what scan_uint64() does. It checks to see if they'd
overlap. Demonstration: Computer will read the string "16" and try to
store it in a 4-bit unsigned integer with scan_uint64. First
character read is '1', conversion is done and r receives the value of
1. To get 16, computer needs to multiply r by 10 --- our base --- and
then add 6, but before multiplying r by 10, computer checks first if 6
can be added without overflow. Verifying: 15 - 6 = 9. Next, 9 / 10 =
0. Is 0 >= 1? No. Overflow.
Created: Fri Dec 31 09:45:02 EST 2004.
Updated: Wed Jan 05 20:56:16 EST 2005.