Square-Root Implementation & Performance Comparisons Square root operation is considered difficult to implement in hardware, in...

Square-Root Implementation & Performance Comparisons Square root operation is considered difficult to implement in hardware, in this project, you must Write and test a MIPS assembly language program to implement three algorithms of an 8-bit integer square root. Background: In mathematics, a square root (√) of a number x is a number r such that r 2 = x, or, in other words, a number r whose square (the result of multiplying the number by itself, or r × r) is x. Some of the algorithms for calculating square roots of a positive Integer number N are shown below:

1. Newton Raphson Method The Newton Raphson method was first used in Gray

2. Iterative method starts with an initial (guess) value and improves accuracy of the result with each iteration. Assuming that X the original number the iterative equations for calculating the reciprocal of it's square root is: Yi+1 = Yi (3 – X (Yi)^2) Once Y has been calculated, one can get the square root by multiplying with X. This algorithm has quadratic convergence. In each iteration multiplications and additions or subtractions are needed. In order to speed up the multiplier, there must be a special a algorithm such as Wallace tree to get the partial production and use a carry propagate adder to get the production, because the multiplier requires a rather large number of gates counts, it is not so practical to place many multipliers on FPGA. Also it is hard to get an exact remainder of the square root.

3-The Radix-2 SRT-Redundant and Non-Redundant Algorithm The Radix-2 SRT-Redundant and Non-Redundant method are similar. Since them both based on recursive relation. In each iteration, they will be one digit shift left and addition. The determination of a function is rather complex, especially for high radix SRT algorithm. The implantations are not capable of accepting a square root on every clock cycle. Also notice that these two methods may generate a wrong resulting value at the last digit position.

4- The Non-Restoring Algorithm The operation at each iteration is simple: addition or subtraction based on the result bit generated in previous iteration. The remainder of the addition or subtraction is fed via registers to the next iteration directly even it is negative. At the last iteration, if the remainder is non-negative, it is a precise remainder. Otherwise, we can obtain a precise remainder by an addition operation.

Solutions

Expert Solution

Digit-recurrence algorithms produce a fixed number of quotient bits at each iteration where every iteration produces one quotient digit, most significant quotient digits first. This occurs because there are many variations on a particular implementation including radix, circuit choices for quotient selection, and given internal precision for a design. Such algorithms are similar to the paper-and-pencil method. It is well known that the choice of radix and quotient digit set influences the overall latency of the algorithm. Increasing the radix decreases the number of iterations required for the same quotient precision. As the radix increases, every iteration becomes more complicated and the overall latency is not reduced as expected. Moreover, it becomes impractical to generate the required divisor multiples for higher radices. The two basic digit-recurrence algorithms are the restoring and the non-restoring algorithms.Those algorithms only rely on simple radix-2 iterations, i.e., one bit of the quotient is produced at each iteration.

The restoring algorithm uses radix r = 2 with quotient digit set {0, 1}. At each iteration, the algorithm
subtracts the divisor d from the previous partial remainder w[j - 1] multiplied by r. The quotient digit
selection function is derived by comparing the residuals at each step with the divisor multiples. Restoring Division Algorithm :

w[0] ← x

for j from 1 to n do

w[j] ← 2 × w[j − 1]

w[j] ← w[j] − d

if w[j] ≥ 0 then

q_j ← 1

else

q_j ← 0

w[j] ← w[j] + d

Non-restoring division is an improved version of the restoring method in the sense that it completely avoids the restoration step by combining restoration additions with the next recurrence, thus, reducing the overall latency. Moreover, it uses the quotient digit set {-1, +1} to perform directly the recurrence with the selection function.

SRT Division :

If B has k leading zeros when expressed using n bits, shift all the registers left k bits.
For i=0, n - 1, a) If the top three bits of P are equal, set q_i=0 and shift (P,A) one bit left. b) If the top three bits of P are not all equal and P is negative, set q_i=1, shift (P,A) one bit left, and add B. c) Otherwise set q_i=1, shift (P,A) one bit left, and subtract B.
If the final remainder is negative, correct the remainder by adding B, and correct the quotient by subtracting 1 from q₀. Finally, the remainder must be shifted k bits right, where k is the initial shift.

SRT division of 1000₂/0011₂

Redundant digit sets : One important tool for speeding up subtractive division is redundant digit sets. In a non-redundant digit set for radix-r values, there are exactly r digits. The standard digit set for radix-r values is { 0, 1,.., r-1 } . In a redundant digit set, the number of digits is greater than r. For quotient representation, most of these are are of the form

non-redundant.

Code :

isqrt:
# v0 - return / root
# t0 - bit
# t1 - num
# t2,t3 - temps
move $v0, $zero # initalize return
move $t1, $a0 # move a0 to t1

addi $t0, $zero, 1
sll $t0, $t0, 30 # shift to second-to-top bit

isqrt_bit:
slt $t2, $t1, $t0 # num < bit
beq $t2, $zero, isqrt_loop

srl $t0, $t0, 2 # bit >> 2
j isqrt_bit

isqrt_loop:
beq $t0, $zero, isqrt_return

add $t3, $v0, $t0 # t3 = return + bit
slt $t2, $t1, $t3
beq $t2, $zero, isqrt_else

srl $v0, $v0, 1 # return >> 1
j isqrt_loop_end

isqrt_else:
sub $t1, $t1, $t3 # num -= return + bit
srl $v0, $v0, 1 # return >> 1
add $v0, $v0, $t0 # return + bit

isqrt_loop_end:
srl $t0, $t0, 2 # bit >> 2
j isqrt_loop

isqrt_return:
jr $ra

# Calling the Procedure
addi $a0, $zero, 15
jal isqrt # v0 = result

venereology answered 8 months ago

Next > < Previous

Question

Square-Root Implementation & Performance Comparisons Square root operation is considered difficult to implement in hardware, in...

Solutions

Expert Solution

Related Solutions

In this assignment, we will implement a square root approximater and then an nth root approximater....

In C++ In this assignment, we will implement a square root approximater and then an nth...

implement this crypto algorithm: Compare observed performance differences and differences in the implementation techniques. just implement...

Select five metrics used to assess performance (e.g. error, root mean square error (RMSE), bias, etc.)....