# Division by constant signed integers

The code accompanying this article can be found in a GitHub repository.

Division is a relatively slow operation. When the divisor is constant, the division can be optimized significantly. In [1] I explored how this can be done for unsigned integers. In this follow-up article, I cover how we can optimize division by constant signed integers. This article should be read as a continuation of [1]. This article essentially provides the same information as [2].

I assume that, as in most programming languages, the result of signed division is rounded toward zero. This presents challenges that differ from those of optimizing unsigned division. We are only dealing with numbers with a magnitude of at most $2^{N-1}$, and we will see that this means we can always use the round-up method as described in [1]. The challenge consists of efficiently rounding up the quotient when the dividend $n$ is negative.

## Mathematical background

### Preliminaries

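I will assume that we are working on an $N$-bit machine which can efficiently compute the full $2N$-bit product of two $N$-bit signed integers. I will use the notation $\mathbb{U}_N$ for the set of unsigned integers that can be represented with $N$ bits:

$$\mathbb{U}_N = \{0, 1, \ldots, 2^N - 1\}$$

Likewise, I will use the notation $\mathbb{S}_N$ for the set of signed integers that can be represented with $N$ bits:

$$\mathbb{S}_N = \{-2^{N-1}, -2^{N-1} + 1, \ldots, 2^{N-1} - 1\}$$

When $A$ and $B$ are sets, the set $A \setminus B$ denotes the set of elements of $A$ that are not in $B$.

For some real number $x$, I will denote the absolute value of $x$ by $|x|$. That is:

$$|x| = \begin{cases} x & \text{if } x \geq 0 \\ -x & \text{if } x < 0 \end{cases}$$

I will use the notation $\lfloor x \rfloor$ for the biggest integer smaller than or equal to $x$, and $\lceil x \rceil$ for the smallest integer bigger than or equal to $x$. I will use $[x]$ to denote the value of $x$ when rounded toward zero. That is

$$[x] = \begin{cases} \lfloor x \rfloor & \text{if } x \geq 0 \\ \lceil x \rceil & \text{if } x < 0 \end{cases}$$

I will use the notation $\text{sgn}(x)$ for the sign function:

$$\text{sgn}(x) = \begin{cases} -1 & \text{if } x < 0 \\ 0 & \text{if } x = 0 \\ 1 & \text{if } x > 0 \end{cases}$$

Finally, I will use the notation $1_P$ where $P$ is a predicate to denote the characteristic function:

$$1_P = \begin{cases} 1 & \text{if } P \text{ holds} \\ 0 & \text{otherwise} \end{cases}$$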
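
The rounding modes above can be made concrete with a small C sketch. The helper names (`floor_div`, `ceil_div`, `trunc_div`, `sgn`) are hypothetical and not part of the article's code. The sketch assumes C99 or later, where `/` rounds toward zero and `%` takes the sign of the dividend, and it restricts itself to positive divisors.

```c
#include <assert.h>
#include <stdint.h>

/* sgn(x): -1, 0 or 1 */
int sgn(int64_t x) { return (x > 0) - (x < 0); }

/* floor(n / d) for d > 0 */
int64_t floor_div(int64_t n, int64_t d) {
    assert(d > 0);
    int64_t q = n / d;               /* C rounds this toward zero */
    return (n % d < 0) ? q - 1 : q;  /* step down if we rounded up */
}

/* ceil(n / d) for d > 0 */
int64_t ceil_div(int64_t n, int64_t d) {
    assert(d > 0);
    int64_t q = n / d;
    return (n % d > 0) ? q + 1 : q;  /* step up if we rounded down */
}

/* [n / d]: floor for n >= 0, ceiling for n < 0 (d > 0) */
int64_t trunc_div(int64_t n, int64_t d) {
    return n >= 0 ? floor_div(n, d) : ceil_div(n, d);
}
```

In C, a comparison such as `(n < 0)` already evaluates to 0 or 1, so it can play the role of the characteristic function directly.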

### Signed division

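Supposing that $d$ is positive, we want to evaluate

$$\left[ \frac{n}{d} \right] = \begin{cases} \left\lfloor \frac{n}{d} \right\rfloor & \text{if } n \geq 0 \\ \left\lceil \frac{n}{d} \right\rceil & \text{if } n < 0 \end{cases}$$

So, it is natural to try to find two expressions, one which equals $\lfloor \frac{n}{d} \rfloor$ for nonnegative $n$ and one which equals $\lceil \frac{n}{d} \rceil$ for negative $n$.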

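In [1], we saw that for every positive $d \in \mathbb{U}_N$ we can always find an $(N+1)$-bit magic number $m$ such that $\lfloor \frac{m \cdot n}{2^{N + \ell}} \rfloor = \lfloor \frac{n}{d} \rfloor$ for every $n \in \mathbb{U}_N$. For unsigned division this was a problem since we want $m$ to fit in $N$ bits. For nonnegative $n \in \mathbb{S}_N$ we have $n \in \mathbb{U}_{N-1}$, so in this case $m$ will always fit in $N$ bits.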

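Corollary 11: Let $d$ be a positive integer and $\ell$ a nonnegative integer. If there exists an integer $m$ with

$$2^{N - 1 + \ell} \leq d \cdot m \leq 2^{N - 1 + \ell} + 2^\ell$$

then $\lfloor \frac{m \cdot n}{2^{N - 1 + \ell}} \rfloor = \lfloor \frac{n}{d} \rfloor$ for all nonnegative $n \in \mathbb{S}_N$.

Proof: Note that the set of nonnegative $n$ in $\mathbb{S}_N$ is exactly $\mathbb{U}_{N-1}$. So this is just theorem 2 from [1] with $N$ replaced by $N - 1$.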

We will now prove a similar result for negative . The following lemma will come in handy.

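Lemma 12: Let $a \in \mathbb{Z}$, $b \in \mathbb{N}_+$ and $x \in \mathbb{R}$. When

$$\frac{a - 1}{b} \leq x < \frac{a}{b}$$

then

$$\lfloor x \rfloor = \left\lceil \frac{a}{b} \right\rceil - 1$$

Proof: Set $a = qb + r$ with $q, r \in \mathbb{Z}$, $1 \leq r \leq b$. Then $\lceil \frac{a}{b} \rceil = q + 1$. Now $\frac{a - 1}{b} = q + \frac{r - 1}{b} \geq q$ and $\frac{a}{b} = q + \frac{r}{b} \leq q + 1$, so that $q \leq x < q + 1$. Since $q$ is an integer, it follows that $\lfloor x \rfloor = q = \lceil \frac{a}{b} \rceil - 1$.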

The following lemma gives us a way to calculate $\lceil \frac{n}{d} \rceil$ from the same expression when $n$ is negative.

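Lemma 13: Let $d$ and $m$ be positive integers and $\ell$ a nonnegative integer. If

$$2^{N - 1 + \ell} < d \cdot m \leq 2^{N - 1 + \ell} + 2^\ell$$

then

$$\left\lfloor \frac{m \cdot n}{2^{N - 1 + \ell}} \right\rfloor + 1 = \left\lceil \frac{n}{d} \right\rceil$$

for all negative $n \in \mathbb{S}_N$.

Proof: Multiply by $\frac{n}{d \cdot 2^{N - 1 + \ell}}$. Remember that $n$ is negative, so the inequality 'flips' and we get $\frac{n}{d} + \frac{n}{d \cdot 2^{N-1}} \leq \frac{m \cdot n}{2^{N - 1 + \ell}} < \frac{n}{d}$. Now, using that $n \geq -2^{N-1}$ we see that $\frac{n}{d \cdot 2^{N-1}} \geq -\frac{1}{d}$, so we have $\frac{n - 1}{d} \leq \frac{m \cdot n}{2^{N - 1 + \ell}} < \frac{n}{d}$. The result now follows from lemma 12.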

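So, if we find $m$ and $\ell$ such that $2^{N - 1 + \ell} < d \cdot m \leq 2^{N - 1 + \ell} + 2^\ell$, we have $[\frac{n}{d}] = \lfloor \frac{m \cdot n}{2^{N - 1 + \ell}} \rfloor + 1_{n < 0}$ for all $n \in \mathbb{S}_N$. The following theorem extends this result to the case where $d$ is negative and shows us how to pick $m$ and $\ell$.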

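Theorem 14: Let $d$ and $N$ be integers with $d \in \mathbb{S}_N \setminus \{0\}$ and define $\ell = \lceil \log_2 |d| \rceil$ and $m = \lfloor \frac{2^{N - 1 + \ell}}{|d|} \rfloor + 1$. Then

$$\left[ \frac{n}{d} \right] = \text{sgn}(d) \cdot \left( \left\lfloor \frac{m \cdot n}{2^{N - 1 + \ell}} \right\rfloor + 1_{n < 0} \right)$$

for all $n \in \mathbb{S}_N$.

Proof: First, observe that $m \cdot |d|$ is simply the first multiple of $|d|$ larger than $2^{N - 1 + \ell}$. Since the range $\{2^{N - 1 + \ell} + 1, \ldots, 2^{N - 1 + \ell} + 2^\ell\}$ contains $2^\ell \geq |d|$ consecutive integers, there must be at least one multiple of $|d|$ in this range. So we have $2^{N - 1 + \ell} < m \cdot |d| \leq 2^{N - 1 + \ell} + 2^\ell$. Using corollary 11 we see that $\lfloor \frac{m \cdot n}{2^{N - 1 + \ell}} \rfloor = \lfloor \frac{n}{|d|} \rfloor$ for nonnegative $n$. Using lemma 13, we see that $\lfloor \frac{m \cdot n}{2^{N - 1 + \ell}} \rfloor + 1 = \lceil \frac{n}{|d|} \rceil$ for negative $n$. So $\lfloor \frac{m \cdot n}{2^{N - 1 + \ell}} \rfloor + 1_{n < 0} = [\frac{n}{|d|}]$ for all $n \in \mathbb{S}_N$. Using $[\frac{n}{d}] = \text{sgn}(d) \cdot [\frac{n}{|d|}]$ the result follows.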
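
As a sanity check, the formula of the theorem can be evaluated directly for $N = 32$. The following is a minimal sketch and not the optimized implementation discussed later: the function names are hypothetical, the product $m \cdot n$ is held in a 64-bit integer, and `floor_shift` computes $\lfloor x / 2^s \rfloor$ without relying on how `>>` treats negative values.

```c
#include <assert.h>
#include <stdint.h>

enum { N = 32 };

/* floor(x / 2^s); ~x is nonnegative when x is negative */
static int64_t floor_shift(int64_t x, unsigned s) {
    return x >= 0 ? x >> s : ~(~x >> s);
}

/* l = ceil(log2(d_abs)) for 1 <= d_abs <= 2^31 */
unsigned magic_shift(uint32_t d_abs) {
    unsigned l = 0;
    while ((UINT64_C(1) << l) < d_abs) l++;
    return l;
}

/* m = floor(2^(N - 1 + l) / d_abs) + 1 */
uint64_t magic_multiplier(uint32_t d_abs, unsigned l) {
    return (UINT64_C(1) << (N - 1 + l)) / d_abs + 1;
}

/* [n / d] = sgn(d) * (floor(m * n / 2^(N - 1 + l)) + 1_{n < 0}) */
int32_t divide_theorem14(int32_t n, int32_t d) {
    assert(d != 0);
    assert(!(n == INT32_MIN && d == -1));  /* quotient not representable */
    uint32_t d_abs = d < 0 ? 0u - (uint32_t)d : (uint32_t)d;
    unsigned l = magic_shift(d_abs);
    int64_t m = (int64_t)magic_multiplier(d_abs, l);  /* m < 2^32 */
    int64_t q = floor_shift(m * n, N - 1 + l) + (n < 0);
    return (int32_t)(d < 0 ? -q : q);
}
```

Since $m < 2^{32}$ and $|n| \leq 2^{31}$, the product fits in a signed 64-bit integer, which is why no wider type is needed in this sketch.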

Lemma 5 tells us that the magic number $m$ is a positive number with at most $N$ bits. In the following section on implementation, we will see that multiplying an $N$-bit signed number by an $N$-bit unsigned number is slightly less efficient than multiplying an $N$-bit signed number by an $(N-1)$-bit unsigned number. For some divisors, we can use the following result to reduce the number of bits that we need to represent $m$.

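Corollary 16: Let $d$ be a positive integer that is not a power of two, and let $m$ and $\ell$ be positive integers that satisfy the condition of corollary 11 and lemma 13:

$$2^{N - 1 + \ell} < d \cdot m \leq 2^{N - 1 + \ell} + 2^\ell$$

If $m$ is even, then this condition also holds for $m' = \frac{m}{2}$, $\ell' = \ell - 1$, so we have $\lfloor \frac{m' \cdot n}{2^{N - 1 + \ell'}} \rfloor = \lfloor \frac{m \cdot n}{2^{N - 1 + \ell}} \rfloor$. If $m$ is odd, then it is the smallest integer for which this condition holds.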

Proof: This follows directly from theorem 9.
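
The reduction in the corollary amounts to halving $m$ while it is even. Below is a minimal sketch for $N = 32$; the `magic_pair` type and both function names are hypothetical, and the starting pair is computed as in theorem 14.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t m;  /* magic number */
    unsigned l;  /* the division shifts right by 31 + l bits in total */
} magic_pair;

/* (m, l) as defined in theorem 14, for N = 32 and 1 <= d_abs <= 2^31 */
magic_pair magic_from_theorem14(uint32_t d_abs) {
    assert(d_abs != 0);
    magic_pair r;
    r.l = 0;
    while ((UINT64_C(1) << r.l) < d_abs) r.l++;
    r.m = (UINT64_C(1) << (31 + r.l)) / d_abs + 1;
    return r;
}

/* Halve m and decrement l for as long as m is even */
magic_pair reduce_magic(magic_pair x) {
    while (x.l > 0 && (x.m & 1) == 0) {
        x.m >>= 1;
        x.l--;
    }
    return x;
}
```

For $d = 9$, theorem 14 gives $m = 3817748708$ with $\ell = 4$, which this sketch reduces to $m = 954437177$ with $\ell = 2$, a value that fits in 31 bits.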

## Implementation

In this section, I use the uint and sint datatypes, which are an $N$-bit unsigned integer and an $N$-bit signed integer, respectively. I try to provide a general strategy that should work well on most instruction set architectures. Variations in the implementation might give a more efficient result. In general, you should always benchmark your implementation if performance is critical.

While theorem 14 in the previous section seems to provide a straightforward method to compute the quotient $[\frac{n}{d}]$ for any $n, d \in \mathbb{S}_N$ with $d \neq 0$, there is one subtlety we glossed over. In theorem 14, we use the $2N$-bit expression $m \cdot n$, where $m$ is an unsigned value and $n$ is a signed value. While most processors have instructions to compute the full $2N$-bit product of two $N$-bit unsigned integers or two $N$-bit signed integers, most processors do not provide an instruction to compute the $2N$-bit product of an $N$-bit unsigned integer and an $N$-bit signed integer.

While it is also possible to compute the product by first extending $m$ and $n$ to $2N$-bit signed values and computing the product of those extended values, this is less efficient.

### Computing the product of an unsigned and a signed value

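In this section, consider $m$ and $n$ to be $N$-bit bit strings. So these variables no longer represent a number, but purely a series of bits, each of which can hold a zero or a one. So we write $m = m_{N-1} m_{N-2} \ldots m_0$, where $m_0, m_1, \ldots, m_{N-1}$ are the individual bits.

Now, we can assign a value to a bit string $x$ by interpreting it either as an unsigned value $(x)_\text{u}$ or as a signed value $(x)_\text{s}$. These interpretations are defined as

$$(x)_\text{u} = \sum_{k=0}^{N-1} 2^k x_k$$

and

$$(x)_\text{s} = -2^{N-1} x_{N-1} + \sum_{k=0}^{N-2} 2^k x_k$$

We see that $(x)_\text{u} = (x)_\text{s}$ when $x_{N-1} = 0$ and $(x)_\text{u} = (x)_\text{s} + 2^N$ when $x_{N-1} = 1$. So when $m_{N-1} = 0$ we have

$$(m)_\text{u} \cdot (n)_\text{s} = (m)_\text{s} \cdot (n)_\text{s}$$

So in this case we can just use signed multiplication. When $m_{N-1} = 1$ we have

$$(m)_\text{u} \cdot (n)_\text{s} = \left( (m)_\text{s} + 2^N \right) \cdot (n)_\text{s} = (m)_\text{s} \cdot (n)_\text{s} + 2^N \cdot (n)_\text{s}$$

So, the upper $N$ bits of the product equal $\lfloor \frac{(m)_\text{s} \cdot (n)_\text{s}}{2^N} \rfloor + (n)_\text{s}$. This expression can be evaluated by multiplying $m$ and $n$ as if they were signed numbers, taking the upper $N$ bits, and adding $n$ to this.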
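
The identity above can be illustrated for $N = 32$ as follows. This is a sketch under assumptions: the helper names are hypothetical, and the signed high multiply `mulhi_ss` is emulated with 64-bit arithmetic, where a real target would use its signed multiply-high instruction.

```c
#include <assert.h>
#include <stdint.h>

/* floor(x / 2^s) without relying on >> of negative values */
static int64_t floor_shift(int64_t x, unsigned s) {
    return x >= 0 ? x >> s : ~(~x >> s);
}

/* Value of a 32-bit string when interpreted as a signed number */
static int64_t as_signed(uint32_t bits) {
    return (int64_t)bits - ((bits >> 31) ? (INT64_C(1) << 32) : 0);
}

/* Upper 32 bits of the product of two signed interpretations */
static int32_t mulhi_ss(uint32_t m_bits, int32_t n) {
    return (int32_t)floor_shift(as_signed(m_bits) * n, 32);
}

/* floor(m * n / 2^32) with m unsigned and n signed */
int32_t mulhi_us(uint32_t m, int32_t n) {
    int32_t hi = mulhi_ss(m, n);
    if (m >> 31) {
        /* Top bit set: the unsigned value is the signed value plus 2^32,
           which contributes exactly n to the upper word. */
        hi = (int32_t)((int64_t)hi + n);
    }
    return hi;
}
```

The runtime implementation later in this article avoids the branch by choosing $m$ so that its most significant bit is always set.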

### Runtime optimization

Consider the following code:

    sint d = read_divisor();

    for (int i = 0; i < size; i++) {
        quotient[i] = dividend[i] / d;
    }


The value of the divisor d is not known at compile time, but once it is read at runtime, it does not change. As such, we consider d to be a runtime constant, and we can optimize this code in the following way:

    sint d = read_divisor();
    divdata_t divisor_data = precompute(d);

    for (int i = 0; i < size; i++) {
        quotient[i] = fast_divide(dividend[i], divisor_data);
    }


Now, the divdata_t datatype needs to hold $m$, the number of bits to shift, and a field to indicate that we should negate the result when $d$ is negative:

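    typedef struct {
        uint mul;       // magic number m (its N low bits)
        uint shift;     // bits to shift after taking the upper N bits of the product
        bool negative;  // true when d is negative, so the result must be negated
    } divdata_t;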


Now, we compute $m$ such that $m$ always has the most significant bit set. This way we can always compute the upper $N$ bits of the product $m \cdot n$ in fast_divide by taking the signed product, taking the upper $N$ bits, and adding $n$ to this.

The precomputation is now a relatively straightforward implementation of theorem 14:

    divdata_t precompute(sint d) {
        divdata_t divdata;
        // Take |d| in unsigned arithmetic so the most negative d does not overflow
        uint d_abs = d < 0 ? -(uint)d : (uint)d;

        // Compute l = ceil(log2(d_abs))
        uint l = floor_log2(d_abs);
        if (((uint)1 << l) < d_abs) l++;

        // Handle case |d| = 1
        if (d_abs == 1) l = 1;

        // Compute m = floor(2^(N - 1 + l) / |d|) + 1
        // The conversion to uint keeps only the N low bits
        uint m = (((big_uint)1) << (N - 1 + l)) / d_abs + 1;

        divdata.mul = m;
        divdata.negative = d < 0;
        divdata.shift = l - 1;
        return divdata;
    }


It should be noted that in the fast_divide function, the right shift by $N - 1 + \ell$ is implemented by taking the upper $N$ bits of the product $m \cdot n$, and shifting this right by $\ell - 1$ bits. If $\ell = 0$ this is not possible, since in this case we need to shift right by $N - 1$ bits, but taking the upper $N$ bits is already equivalent to a right shift of $N$ bits. We can fix this by simply setting $\ell = 1$ when $|d| = 1$. In this case the expression for $m$ becomes $2^N + 1$, which will overflow to simply $1$. Now, while theorem 14 doesn't hold anymore for $m = 1$, the calculation of the product in fast_divide assumes that the most significant bit of $m$ is set and always adds $n$ to the upper bits, which is the same as using $2^N + 1$ as the multiplier, so we will end up with the correct value. In fact, setting $m = 2$ would work as well.

The fast_divide function has a lot of steps, but every step should be understandable.

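    sint fast_divide(sint n, divdata_t dd) {
        // Multiply as if dd.mul were a signed value
        big_sint full_signed_product = ((big_sint)n) * (sint)dd.mul;
        // Upper N bits of the signed product (arithmetic right shift)
        sint high_word_of_signed_product = full_signed_product >> N;
        // Adding n corrects for having interpreted dd.mul as signed
        sint high_word_of_unsigned_product = high_word_of_signed_product + n;
        // The remaining shift rounds toward negative infinity
        sint rounded_down_quotient = high_word_of_unsigned_product >> dd.shift;
        // n >> (N - 1) is -1 for negative n and 0 otherwise, so this adds 1 when n < 0
        sint quotient_rounded_toward_zero = rounded_down_quotient - (n >> (N - 1));
        if (dd.negative) {
            quotient_rounded_toward_zero = -quotient_rounded_toward_zero;
        }
        return quotient_rounded_toward_zero;
    }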
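
To try the precompute and fast_divide steps on a real compiler, the pseudocode can be instantiated for $N = 32$. The following is a sketch under assumptions rather than a tuned implementation: the `divdata32_t`, `precompute32` and `fast_divide32` names are hypothetical, `int64_t` plays the role of the $2N$-bit type, and the intermediate sums are kept in 64 bits so that no step depends on signed wraparound or on the behaviour of `>>` for negative values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t mul;       /* low 32 bits of the magic number m */
    uint32_t shift;     /* l - 1 */
    bool     negative;  /* true when d < 0 */
} divdata32_t;

/* floor(x / 2^s) without relying on >> of negative values */
static int64_t floor_shift(int64_t x, unsigned s) {
    return x >= 0 ? x >> s : ~(~x >> s);
}

divdata32_t precompute32(int32_t d) {
    assert(d != 0);
    uint32_t d_abs = d < 0 ? 0u - (uint32_t)d : (uint32_t)d;

    /* l = ceil(log2(d_abs)), except that |d| = 1 uses l = 1 */
    unsigned l = 0;
    while ((UINT32_C(1) << l) < d_abs) l++;
    if (l == 0) l = 1;

    divdata32_t dd;
    /* m = floor(2^(31 + l) / |d|) + 1, truncated to 32 bits */
    dd.mul = (uint32_t)((UINT64_C(1) << (31 + l)) / d_abs + 1);
    dd.shift = l - 1;
    dd.negative = d < 0;
    return dd;
}

int32_t fast_divide32(int32_t n, divdata32_t dd) {
    /* Signed interpretation of the bits of dd.mul */
    int64_t mul_signed =
        (int64_t)dd.mul - ((dd.mul >> 31) ? (INT64_C(1) << 32) : 0);
    int64_t hi_signed = floor_shift(mul_signed * n, 32);
    int64_t hi = hi_signed + n;  /* correct for the signed interpretation */
    /* Round down, then add 1 when n is negative */
    int64_t q = floor_shift(hi, dd.shift) - floor_shift(n, 31);
    return (int32_t)(dd.negative ? -q : q);
}
```

As with the built-in `/` operator, dividing the most negative value by $-1$ is excluded, since the quotient is not representable.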


### Compile-time optimization

In this section, I will consider how to generate optimized code for division by compile-time constant signed integers.

Most of the tricks that apply to computing a quotient of unsigned integers efficiently also apply to signed integers, although we might have to do some work to handle negative integers. Of course, a division by one can be ignored and a division by minus one is equivalent to a negation. For some instruction-set architectures, it might be beneficial to implement a special case for big divisors with an absolute value of more than $2^{N-2}$. In this case, the value of the quotient is $\text{sgn}(n) \cdot \text{sgn}(d)$ when $|n| \geq |d|$ and zero otherwise.

    expression_t div_by_const_sint(const sint d, expression_t n) {
        if (d == 1) return n;
        if (d == -1) return neg(n);
        uint d_abs = abs(d);
        if (is_power_of_two(d_abs)) return div_by_const_signed_power_of_two(n, d);
        return div_fixpoint(d, n);
    }
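
The special case for big divisors is not part of the dispatch function above. A minimal sketch of it for $N = 32$ could look as follows, assuming the threshold $|d| > 2^{N-2}$, which guarantees $|n| < 2|d|$ for every representable $n$. The function name is hypothetical, and the function computes the quotient directly rather than generating an expression.

```c
#include <assert.h>
#include <stdint.h>

/* Quotient n / d, rounded toward zero, for |d| > 2^30.
   Since |n| <= 2^31 < 2 * |d|, the quotient is -1, 0 or 1. */
int32_t div_by_big_const(int32_t n, int32_t d) {
    /* Absolute values in unsigned arithmetic, so INT32_MIN is handled */
    uint32_t d_abs = d < 0 ? 0u - (uint32_t)d : (uint32_t)d;
    uint32_t n_abs = n < 0 ? 0u - (uint32_t)n : (uint32_t)n;
    assert(d_abs > (UINT32_C(1) << 30));

    if (n_abs < d_abs) return 0;
    /* |n| >= |d|: the magnitude is 1 and the sign is sgn(n) * sgn(d) */
    return ((n < 0) != (d < 0)) ? -1 : 1;
}
```

Whether this beats the multiplication-based path depends on the cost of compares and branches on the target, so it is worth benchmarking.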


Let us first consider the case where $|d|$ is a power of two, say $|d| = 2^\ell$. If we do an arithmetic right shift by $\ell$ bits, the result will be correct when $n$ is nonnegative. However, this will round down the quotient when $n$ is negative. In this case we can add $2^\ell - 1$ to $n$ in order to round up.

So, we would like to have an expression which equals $2^\ell - 1$ when $n$ is negative and $0$ otherwise, so we can simply add this to $n$. The value $2^\ell - 1$ consists of $\ell$ consecutive ones in the binary representation. It can be created by doing an arithmetic right shift by $\ell - 1$ bits on $n$, and shifting the result right by $N - \ell$ bits (with a normal right shift). The first shift produces the $\ell$ ones in the most significant bits when $n$ is negative (these are zero bits otherwise), the second shift puts them in the least significant bit positions.
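
The two-shift trick can be tried out directly for $N = 32$. The sketch below computes the quotient for a positive divisor $2^\ell$ with $1 \leq \ell \leq 31$; for a negative divisor the result would additionally be negated. The names `sar32` and `div_pow2_32` are hypothetical, and `sar32` emulates an arithmetic right shift portably.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic right shift: floor(x / 2^s), for 0 <= s <= 31 */
static int32_t sar32(int32_t x, unsigned s) {
    return x >= 0 ? x >> s : ~(~x >> s);
}

/* n / 2^l, rounded toward zero, for 1 <= l <= 31 */
int32_t div_pow2_32(int32_t n, unsigned l) {
    assert(l >= 1 && l <= 31);
    /* The arithmetic shift by l - 1 leaves l copies of the sign bit
       in the most significant positions. */
    uint32_t top = (uint32_t)sar32(n, l - 1);
    /* The logical shift by 32 - l moves them to the least significant
       positions: 2^l - 1 when n is negative, 0 otherwise. */
    int32_t addme = (int32_t)(top >> (32 - l));
    return sar32(n + addme, l);
}
```

The addition cannot overflow, because `addme` is nonzero only when `n` is negative.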

expression_t div_by_const_signed_power_of_two(expression_t n, sint d) {
uint d_abs = abs(d);
int l = floor_log2(d_abs);

// addme equals 2^l - 1 when n is negative and 0 otherwise
// We need to add this to n to round towards zero.
expression_t addme = shr(sar(n, constant(l - 1)), constant(N - l));