Question

In: Computer Science

Please show all work: Represent the number (+46.5) as a 32 bit floating-point number using the...

Please show all work:

Represent the number (+46.5) as a 32 bit floating-point number using the IEEE standard 754 format. N.B. The attached ‘Appendix’ section may prove useful in the conversion process.

           

  1. 0 1000 0111 0100 0100 0000 0000 0000 000
  2. 0 1011 0100 0111 0100 0000 0000 0000 000
  3. 0 1100 0110 0101 0100 0000 0000 0000 000
  4. 0 1000 0100 0111 0100 0000 0000 0000 000
  5. 0 1010 0100 0111 0100 0000 0000 0000 000

Solutions

Expert Solution

1. First, convert integer part to binary: 46.
Divide the number repeatedly by 2.
Keep track of each of the remainder.
We stop when we get a quotient that is equal to zero.
division = quotient + remainder;
46 ÷ 2 = 23 + 0;
23 ÷ 2 = 11 + 1;
11 ÷ 2 = 5 + 1;
5 ÷ 2 = 2 + 1;
2 ÷ 2 = 1 + 0;
1 ÷ 2 = 0 + 1;

2. Construct the base 2 representation of the integer part of the number.
Take all the remainders starting from the bottom of the list constructed above.
46(10) =
10 1110(2)

3. Convert to the binary (base 2) the fractional part: 0.5.
Multiply it repeatedly by 2.
Keep track of each integer part of the results.
Stop when we get a fractional part that is equal to zero.

#) multiplying = integer + fractional part;
1) 0.5 × 2 = 1 + 0;

4. Construct the base 2 representation of the fractional part of the number.
Take all the integer parts of the multiplying operations, starting from the top of the constructed list above:
0.5(10) =
0.1(2)

5. Positive number before normalization:
46.5(10) =
10 1110.1(2)

6. Normalize the binary representation of the number.
Shift the decimal mark 5 positions to the left so that only one non zero digit remains to the left of it:
46.5(10) =
10 1110.1(2) =
10 1110.1(2) × 20 =
1.0111 01(2) × 25

7. Up to this moment, there are the following elements that would feed into the 32 bit single precision IEEE 754 binary floating point representation:
Sign: 0 (a positive number)
Exponent (unadjusted): 5
Mantissa (not normalized):
1.0111 01

8. Adjust the exponent.
Use the 8 bit excess/bias notation:
Exponent (adjusted) =
Exponent (unadjusted) + 2(8-1) - 1 =
5 + 2(8-1) - 1 =
(5 + 127)(10) =
132(10)

9. Convert the adjusted exponent from the decimal (base 10) to 8 bit binary.
Use the same technique of repeatedly dividing by 2:
division = quotient + remainder;
132 ÷ 2 = 66 + 0;
66 ÷ 2 = 33 + 0;
33 ÷ 2 = 16 + 1;
16 ÷ 2 = 8 + 0;
8 ÷ 2 = 4 + 0;
4 ÷ 2 = 2 + 0;
2 ÷ 2 = 1 + 0;
1 ÷ 2 = 0 + 1;

10. Construct the base 2 representation of the adjusted exponent.
Take all the remainders starting from the bottom of the list constructed above:
Exponent (adjusted) =
132(10) =
1000 0100(2)

11. Normalize the mantissa.
a) Remove the leftmost bit, since it's allways 1, and the decimal point, if the case.
b) Adjust its length to 23 bits, by adding the necessary number of zeros to the right.
Mantissa (normalized) =
1. 01 1101 0 0000 0000 0000 0000 =
011 1010 0000 0000 0000 0000

12. The three elements that make up the number's 32 bit single precision IEEE 754 binary floating point representation:
Sign (1 bit) =
0 (a positive number)
Exponent (8 bits) =
1000 0100
Mantissa (23 bits) =
011 1010 0000 0000 0000 0000

Final answer:
Number 46.5 converted from decimal system (base 10)
to
32 bit single precision IEEE 754 binary floating point:
0 - 1000 0100 - 011 1010 0000 0000 0000 0000


Related Solutions

c) Using the 32-bit binary representation for floating point numbers, represent the number 1011100110011 as a...
c) Using the 32-bit binary representation for floating point numbers, represent the number 1011100110011 as a 32 bit floating point number. i) A digital camera processes the images images in the real-world and stores them in binary form. Using the principles of digital signal processing, practically explain how this phenomenon occurs.
represent the decimal number 101 and 6 as floating point binary numbers please show your work...
represent the decimal number 101 and 6 as floating point binary numbers please show your work and explained, I have a test.
verilog code to implement 32 bit Floating Point Adder in Verilog using IEEE 754 floating point...
verilog code to implement 32 bit Floating Point Adder in Verilog using IEEE 754 floating point representation.
Urgent Please Explain and show the difference between IEEE 16, 32, 64, 128-bit floating-point numbers.
Urgent Please Explain and show the difference between IEEE 16, 32, 64, 128-bit floating-point numbers.
The number –11.375 (decimal) represented as a 32-bit floating-point binary number according to the IEEE 754...
The number –11.375 (decimal) represented as a 32-bit floating-point binary number according to the IEEE 754 standard is
Consider the following 32-bit floating point representation based on the IEEE floating point standard: There is...
Consider the following 32-bit floating point representation based on the IEEE floating point standard: There is a sign bit in the most significant bit. The next eight bits are the exponent, and the exponent bias is 28-1-1 = 127. The last 23 bits are the fraction bits. The representation encodes number of the form V = (-1)S x M x 2E, where S is the sign, M is the significand, and E is the biased exponent. The rules for the...
Q1.Convert C46C000016 into a 32-bit single-precision IEEE floating-point binary number.
Q1.Convert C46C000016 into a 32-bit single-precision IEEE floating-point binary number.
Write a MIPS assembly language program that inputs a floating-point number and shows its internal 32-bit...
Write a MIPS assembly language program that inputs a floating-point number and shows its internal 32-bit representation as a 8-digit hexadecimal representation. For example: 2.5 decimal is 10.1 binary, which normalized is 1.01x21 and would be stored in the IEEE format as 0100 0000 0010 0000 0000 0000 0000 0000 which is 0x40200000
6a) (6 pts. each) Find the decimal represented by the 32-bit single precision floating point number...
6a) (6 pts. each) Find the decimal represented by the 32-bit single precision floating point number for the hexadecimal value C47CD000.
Assume that you have a 12-bit floating point number system, similar to the IEEE floating point...
Assume that you have a 12-bit floating point number system, similar to the IEEE floating point standard, with the format shown below and a bias of 7. The value of a floating point number in this system is represented as    FP = (-1)^S X 1.F X 2^(E-bias) for the floating point numbers A = 8.75 and B = -5.375. The binary representation of A is given as A = 0101 0000 1100 Show the hexidecimal representation of B.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT