In: Computer Science
Square-Root Implementation & Performance Comparisons Square root operation is considered difficult to implement in hardware, in this project, you must Write and test a MIPS assembly language program to implement three algorithms of an 8-bit integer square root. Background: In mathematics, a square root (√) of a number x is a number r such that r 2 = x, or, in other words, a number r whose square (the result of multiplying the number by itself, or r × r) is x. Some of the algorithms for calculating square roots of a positive Integer number N are shown below:
1. Newton Raphson Method The Newton Raphson method was first used in Gray
2. Iterative method starts with an initial (guess) value and improves accuracy of the result with each iteration. Assuming that X the original number the iterative equations for calculating the reciprocal of it's square root is: Yi+1 = Yi (3 – X (Yi)^2) Once Y has been calculated, one can get the square root by multiplying with X. This algorithm has quadratic convergence. In each iteration multiplications and additions or subtractions are needed. In order to speed up the multiplier, there must be a special a algorithm such as Wallace tree to get the partial production and use a carry propagate adder to get the production, because the multiplier requires a rather large number of gates counts, it is not so practical to place many multipliers on FPGA. Also it is hard to get an exact remainder of the square root.
3-The Radix-2 SRT-Redundant and Non-Redundant Algorithm The Radix-2 SRT-Redundant and Non-Redundant method are similar. Since them both based on recursive relation. In each iteration, they will be one digit shift left and addition. The determination of a function is rather complex, especially for high radix SRT algorithm. The implantations are not capable of accepting a square root on every clock cycle. Also notice that these two methods may generate a wrong resulting value at the last digit position.
4- The Non-Restoring Algorithm The operation at each iteration is simple: addition or subtraction based on the result bit generated in previous iteration. The remainder of the addition or subtraction is fed via registers to the next iteration directly even it is negative. At the last iteration, if the remainder is non-negative, it is a precise remainder. Otherwise, we can obtain a precise remainder by an addition operation.
Digit-recurrence algorithms produce a fixed number of quotient bits at each iteration where every iteration produces one quotient digit, most significant quotient digits first. This occurs because there are many variations on a particular implementation including radix, circuit choices for quotient selection, and given internal precision for a design. Such algorithms are similar to the paper-and-pencil method. It is well known that the choice of radix and quotient digit set influences the overall latency of the algorithm. Increasing the radix decreases the number of iterations required for the same quotient precision. As the radix increases, every iteration becomes more complicated and the overall latency is not reduced as expected. Moreover, it becomes impractical to generate the required divisor multiples for higher radices. The two basic digit-recurrence algorithms are the restoring and the non-restoring algorithms.Those algorithms only rely on simple radix-2 iterations, i.e., one bit of the quotient is produced at each iteration.
The restoring algorithm uses radix r = 2 with quotient digit set
{0, 1}. At each iteration, the algorithm
subtracts the divisor d from the previous partial remainder w[j -
1] multiplied by r. The quotient digit
selection function is derived by comparing the residuals at each
step with the divisor multiples. Restoring Division Algorithm :
w[0] ← x
for j from 1 to n do
w[j] ← 2 × w[j − 1]
w[j] ← w[j] − d
if w[j] ≥ 0 then
qj ← 1
else
qj ← 0
w[j] ← w[j] + d
Non-restoring division is an improved version of the restoring method in the sense that it completely avoids the restoration step by combining restoration additions with the next recurrence, thus, reducing the overall latency. Moreover, it uses the quotient digit set {-1, +1} to perform directly the recurrence with the selection function.
SRT Division :
SRT division of 10002/00112
Redundant digit sets : One important tool for speeding up subtractive division is redundant digit sets. In a non-redundant digit set for radix-r values, there are exactly r digits. The standard digit set for radix-r values is { 0, 1,.., r-1 } . In a redundant digit set, the number of digits is greater than r. For quotient representation, most of these are are of the form
non-redundant.
Code :
isqrt:
# v0 - return / root
# t0 - bit
# t1 - num
# t2,t3 - temps
move $v0, $zero # initalize return
move $t1, $a0 # move a0 to t1
addi $t0, $zero, 1
sll $t0, $t0, 30 # shift to second-to-top bit
isqrt_bit:
slt $t2, $t1, $t0 # num < bit
beq $t2, $zero, isqrt_loop
srl $t0, $t0, 2 # bit >> 2
j isqrt_bit
isqrt_loop:
beq $t0, $zero, isqrt_return
add $t3, $v0, $t0 # t3 = return + bit
slt $t2, $t1, $t3
beq $t2, $zero, isqrt_else
srl $v0, $v0, 1 # return >> 1
j isqrt_loop_end
isqrt_else:
sub $t1, $t1, $t3 # num -= return + bit
srl $v0, $v0, 1 # return >> 1
add $v0, $v0, $t0 # return + bit
isqrt_loop_end:
srl $t0, $t0, 2 # bit >> 2
j isqrt_loop
isqrt_return:
jr $ra
# Calling the Procedure
addi $a0, $zero, 15
jal isqrt # v0 = result