Question

Question about IEEE 754 and ARM Assembly:

As we know, numbers with fractional components such as 45.278 are commonly represented in the single-precision or double-precision IEEE 754 floating-point format (also called binary32 and binary64, respectively). How are numbers stored in these formats? Briefly describe:

The fields in the binary32 format (size, purpose, etc)

How to convert a binary32 number to a decimal value (write a formula)

Solutions

Expert Solution

Usually, a real number in binary will be represented in the following format,

$I_m I_{m-1} \dots I_2 I_1 I_0 . F_1 F_2 \dots F_{n-1} F_n$

where each $I_k$ and $F_k$ is either 0 or 1, and the $I$ and $F$ digits form the integer and fraction parts respectively.

A finite number can also be represented by four components: a sign (s), a base (b), a significand (m), and an exponent (e). The numerical value of the number is then evaluated as

$(-1)^s \times m \times b^e$, where $m < |b|$
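As a rough illustration of this formula, the sketch below evaluates it exactly with rational arithmetic. The function name `value` and its argument order are chosen here for illustration only and are not part of the standard.

```python
from fractions import Fraction

def value(s, m, b, e):
    """Evaluate (-1)^s * m * b^e exactly (hypothetical helper)."""
    return (-1) ** s * Fraction(m) * Fraction(b) ** e

# 4.5 written as (-1)^0 * 1.125 * 2^2, where 1.125 is 1.001 in binary
print(value(0, Fraction(9, 8), 2, 2))  # 9/2
```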

Depending on the base and the number of bits used to encode each component, the IEEE 754 standard defines five basic formats. Of these, binary32 and binary64 are the single-precision and double-precision formats respectively, both with base 2.

Single Precision Format (or) Binary32 Format:

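As mentioned in Table 1, the single-precision format has 1 bit for the sign, 8 bits for the exponent, and 23 stored bits for the significand (a 24th, leading bit is implied rather than stored; details below). From most to least significant, the 32 bits are the sign (bit 31), the biased exponent (bits 30 to 23), and the fraction (bits 22 to 0).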

For example, the rational number 9÷2 can be converted to the single-precision format as follows,

$9_{10} \div 2_{10} = 4.5_{10} = 100.1_2$

The result is said to be normalized when it is written with a single leading 1 bit before the binary point, i.e. $1.001_2 \times 2^2$. (Similarly, when the number $0.000000001101_2 \times 2^3$ is normalized, it becomes $1.101_2 \times 2^{-6}$.) Because the leading 1 of a normalized number is always present, it is not stored; omitting this implied 1 gives the mantissa (fraction) of the float number. The implied bit therefore extends the 23 stored bits to a 24-bit significand (23 + 1 = 24 bits), so a normalized number provides more accuracy than the corresponding de-normalized number. Floating-point numbers are stored in normalized form whenever the exponent is within range.
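The normalization step can be checked with a short sketch. It assumes a positive, finite input and uses Python's `math.frexp`, which returns a fraction in [0.5, 1), so the result is shifted by one bit to get a significand in [1, 2). The helper name `normalize` is made up for this example.

```python
import math

def normalize(x):
    """Return (m, e) with x == m * 2**e and 1 <= m < 2, for finite x > 0."""
    frac, exp = math.frexp(x)  # frexp gives 0.5 <= frac < 1
    return frac * 2, exp - 1

print(normalize(4.5))            # (1.125, 2), and 1.125 is 1.001 in binary
print(normalize(0b1101 / 2**9))  # (1.625, -6), and 1.625 is 1.101 in binary
```

The second call passes 13/512, which is the value of $0.000000001101_2 \times 2^3$.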

Subnormal numbers fall into the category of de-normalized numbers. They are values too small to be normalized, since normalizing them would require an exponent that does not fit in the exponent field. They are stored with the smallest biased exponent value (0), which is reserved for this purpose and so slightly reduces the exponent range available to normalized numbers, and with an implied leading bit of 0 instead of 1. Subnormal numbers are less accurate than normalized numbers, i.e. they have less room for nonzero bits in the fraction field. Indeed, the accuracy drops as the magnitude of the subnormal number decreases. However, the subnormal representation is useful for filling the gap in the floating-point scale near zero.
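To make the boundary between normalized and subnormal values concrete, the sketch below reinterprets a few 32-bit patterns as binary32 values using Python's `struct` module. The helper name `bits_to_float` is an assumption made for this example.

```python
import struct

def bits_to_float(bits):
    """Reinterpret a 32-bit unsigned integer as a binary32 value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

smallest_normal = bits_to_float(0x00800000)     # biased exponent 1, fraction 0
largest_subnormal = bits_to_float(0x007FFFFF)   # biased exponent 0, fraction all ones
smallest_subnormal = bits_to_float(0x00000001)  # biased exponent 0, fraction 1

print(smallest_normal == 2.0 ** -126)       # True
print(smallest_subnormal == 2.0 ** -149)    # True
print(largest_subnormal < smallest_normal)  # True
```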

In other words, the above result can be written as $(-1)^0 \times 1.001_2 \times 2^2$, which yields the components s = 0, b = 2, significand (m) = 1.001, mantissa = 001 and e = 2. The corresponding single-precision number is represented in binary as shown below (sign, exponent, fraction),

0 10000001 00100000000000000000000

Here the true exponent is 2, but the exponent field stores 129 (127 + 2), which is called the biased exponent. The exponent field is a plain unsigned binary number. Negative exponents could have been encoded in other ways (sign-magnitude, 1’s complement, 2’s complement, etc.), but IEEE 754 uses a bias instead. The biased exponent has an advantage over those encodings when comparing two floating-point numbers bitwise: for numbers of the same sign, a larger bit pattern means a larger magnitude.
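The encoding of 4.5 can be verified with a small sketch that packs a value as binary32 and splits the bits into the three fields. The function name `fields` is hypothetical; the shifts and masks follow the 1/8/23 bit layout of the format.

```python
import struct

def fields(x):
    """Split x, rounded to binary32, into (sign, biased exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, biased_exponent, fraction

s, E, f = fields(4.5)
print(s, E, format(f, "023b"))  # 0 129 00100000000000000000000
```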

A bias of $2^{n-1} - 1$, where n is the number of bits used in the exponent field, is added to the exponent (e) to get the biased exponent (E). So, the biased exponent (E) of a single-precision number can be obtained as

E = e + 127
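A quick check of the bias formula, as a sketch; the function name `bias` is chosen here for illustration. binary32 has an 8-bit exponent field and binary64 has an 11-bit one.

```python
def bias(n_bits):
    """Exponent bias 2^(n-1) - 1 for an n-bit exponent field."""
    return 2 ** (n_bits - 1) - 1

print(bias(8))      # 127 (binary32)
print(bias(11))     # 1023 (binary64)
print(2 + bias(8))  # 129, the biased exponent of 4.5
```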

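The exponent of a normalized single-precision number ranges from -126 to +127 (biased values 1 to 254). The remaining biased values, 0 and 255, are reserved for special cases: 0 for zero and subnormal numbers, 255 for infinities and NaN.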
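One way to picture the reserved exponent values is a small classifier over raw 32-bit patterns. This is only a sketch; the function name `classify` and the category labels are chosen for this example.

```python
def classify(bits):
    """Classify a 32-bit pattern by its exponent and fraction fields."""
    E = (bits >> 23) & 0xFF  # biased exponent
    f = bits & 0x7FFFFF      # fraction field
    if E == 0:
        return "zero" if f == 0 else "subnormal"
    if E == 0xFF:
        return "infinity" if f == 0 else "NaN"
    return "normal"          # unbiased exponent E - 127 lies in -126..+127

print(classify(0x40900000))  # normal (this is 4.5)
print(classify(0x00000001))  # subnormal
print(classify(0x7F800000))  # infinity
print(classify(0x7FC00000))  # NaN
```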

Note: When we unpack a floating-point number, the exponent obtained is the biased exponent. Subtracting 127 from the biased exponent gives the unbiased exponent.
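Putting the pieces together, a normalized binary32 number with sign bit s, biased exponent E and 23-bit fraction field f has the value $(-1)^s \times (1 + f / 2^{23}) \times 2^{E - 127}$. For a subnormal number (E = 0) the implied bit is 0 and the exponent is fixed at -126, giving $(-1)^s \times (f / 2^{23}) \times 2^{-126}$. The sketch below applies these two formulas to a raw 32-bit pattern; the function name `decode_binary32` is made up for illustration, and exact rational arithmetic is used so that no rounding hides mistakes.

```python
from fractions import Fraction

def decode_binary32(bits):
    """Return the exact value of a finite binary32 bit pattern."""
    s = bits >> 31
    E = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    if E == 0xFF:
        raise ValueError("infinity or NaN has no finite value")
    if E == 0:
        m, e = Fraction(f, 2 ** 23), -126          # zero or subnormal
    else:
        m, e = 1 + Fraction(f, 2 ** 23), E - 127   # normalized
    return (-1) ** s * m * Fraction(2) ** e

print(decode_binary32(0x40900000))         # 9/2
print(float(decode_binary32(0xC0900000)))  # -4.5
```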

Float Scale:

The following figure represents floating point scale.

