Question

Convert the following decimal numbers to 32-bit IEEE 754 (single-precision) floating point: 86.59375 and -1.59729

Convert the following 32-bit IEEE floating point numbers to decimal:

0100 1100 1110 0110 1111 1000 0000 0000
1011 0101 1110 0110 1010 0110 0000 0000

Solutions

Expert Solution

1)
a)
Converting 86.59375 to binary
   Convert the integer part first, then the fractional part
   > First convert 86 to binary
   Divide 86 successively by 2 until the quotient is 0
      > 86/2 = 43, remainder is 0
      > 43/2 = 21, remainder is 1
      > 21/2 = 10, remainder is 1
      > 10/2 = 5, remainder is 0
      > 5/2 = 2, remainder is 1
      > 2/2 = 1, remainder is 0
      > 1/2 = 0, remainder is 1
   Read remainders from the bottom to top as 1010110
   So, 86 of decimal is 1010110 in binary
   > Now convert 0.59375 to binary
   Multiply the fraction by 2 repeatedly; the integer part of each product is the next bit
      > 0.59375 x 2 = 1.1875, which is >= 1, so the bit is 1 (continue with 0.1875)
      > 0.1875 x 2 = 0.375, which is < 1, so the bit is 0
      > 0.375 x 2 = 0.75, which is < 1, so the bit is 0
      > 0.75 x 2 = 1.5, which is >= 1, so the bit is 1 (continue with 0.5)
      > 0.5 x 2 = 1.0, which is >= 1, so the bit is 1
      > The remaining fraction is 0, so stop
   So, 0.59375 in decimal is .10011 in binary
   and 86.59375 in binary is 1010110.10011
Normalized (one digit before the binary point): 1010110.10011 => 1.01011010011 * 2^6
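
The two procedures above (repeated division for the integer part, repeated multiplication for the fraction) can be sketched in Python. The helper names `int_to_binary` and `frac_to_binary` are illustrative only, and `Fraction` is used so that the decimal input is held exactly rather than as a binary float.

```python
from fractions import Fraction

def int_to_binary(n):
    """Repeated division by 2; remainders are read from last to first."""
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        n, remainder = divmod(n, 2)
        bits.append(str(remainder))
    return "".join(reversed(bits))

def frac_to_binary(frac, max_bits=32):
    """Repeated multiplication by 2; each integer part is the next bit."""
    bits = []
    while frac != 0 and len(bits) < max_bits:
        frac *= 2
        bit = int(frac)          # 1 if the product reached 1, else 0
        bits.append(str(bit))
        frac -= bit              # keep only the fractional part
    return "".join(bits)

value = Fraction("86.59375")     # exact decimal, no float rounding
whole = int(value)
print(int_to_binary(whole) + "." + frac_to_binary(value - whole))
# prints 1010110.10011
```

The `max_bits` cap matters for inputs whose binary expansion does not terminate; 0.59375 terminates after five bits.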

single precision:
--------------------
sign bit is 0(+ve)
exp bits are (127+6=133) => 10000101
   Divide 133 successively by 2 until the quotient is 0
      > 133/2 = 66, remainder is 1
      > 66/2 = 33, remainder is 0
      > 33/2 = 16, remainder is 1
      > 16/2 = 8, remainder is 0
      > 8/2 = 4, remainder is 0
      > 4/2 = 2, remainder is 0
      > 2/2 = 1, remainder is 0
      > 1/2 = 0, remainder is 1
   Read remainders from the bottom to top as 10000101
   So, 133 of decimal is 10000101 in binary
frac bits are 01011010011000000000000

so, 86.59375 in single-precision format is 0 10000101 01011010011000000000000
in hexadecimal it is 0x42AD3000
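
As a cross-check of the field assembly above, here is a minimal sketch that builds the 32-bit word from the sign, exponent and fraction digits and compares it with the encoding produced by Python's standard `struct` module. The variable names are illustrative.

```python
import struct

sign = 0
exponent = 6                      # from 1.01011010011 * 2^6
fraction_bits = "01011010011"     # digits after the binary point

exp_field = format(exponent + 127, "08b")    # biased exponent, 8 bits
frac_field = fraction_bits.ljust(23, "0")    # pad on the right to 23 bits
word = int(f"{sign}{exp_field}{frac_field}", 2)
print(exp_field, frac_field, f"0x{word:08X}")
# prints 10000101 01011010011000000000000 0x42AD3000

# Compare with the float32 encoding produced by struct.
packed = struct.unpack(">I", struct.pack(">f", 86.59375))[0]
print(f"0x{packed:08X}")
# prints 0x42AD3000
```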

b)
Converting 1.59729 to binary (the minus sign is recorded in the sign bit later)
   Convert the integer part first, then the fractional part
   > First convert 1 to binary
   Divide 1 successively by 2 until the quotient is 0
      > 1/2 = 0, remainder is 1
   Read remainders from the bottom to top as 1
   So, 1 of decimal is 1 in binary
   > Now convert 0.59729 to binary
   Multiply the fraction by 2 repeatedly; the integer part of each product is the next bit
      > 0.59729 x 2 = 1.19458, which is >= 1, so the bit is 1
      > 0.19458 x 2 = 0.38916, which is < 1, so the bit is 0
      > 0.38916 x 2 = 0.77832, which is < 1, so the bit is 0
      > 0.77832 x 2 = 1.55664, which is >= 1, so the bit is 1
      > 0.55664 x 2 = 1.11328, which is >= 1, so the bit is 1
      > 0.11328 x 2 = 0.22656, which is < 1, so the bit is 0
      > 0.22656 x 2 = 0.45312, which is < 1, so the bit is 0
      > 0.45312 x 2 = 0.90624, which is < 1, so the bit is 0
      > 0.90624 x 2 = 1.81248, which is >= 1, so the bit is 1
      > 0.81248 x 2 = 1.62496, which is >= 1, so the bit is 1
      > 0.62496 x 2 = 1.24992, which is >= 1, so the bit is 1
      > 0.24992 x 2 = 0.49984, which is < 1, so the bit is 0
      > 0.49984 x 2 = 0.99968, which is < 1, so the bit is 0
      > 0.99968 x 2 = 1.99936, which is >= 1, so the bit is 1
      > 0.99936 x 2 = 1.99872, which is >= 1, so the bit is 1
      > 0.99872 x 2 = 1.99744, which is >= 1, so the bit is 1
      > 0.99744 x 2 = 1.99488, which is >= 1, so the bit is 1
      > 0.99488 x 2 = 1.98976, which is >= 1, so the bit is 1
      > 0.98976 x 2 = 1.97952, which is >= 1, so the bit is 1
      > 0.97952 x 2 = 1.95904, which is >= 1, so the bit is 1
      > 0.95904 x 2 = 1.91808, which is >= 1, so the bit is 1
      > 0.91808 x 2 = 1.83616, which is >= 1, so the bit is 1
      > 0.83616 x 2 = 1.67232, which is >= 1, so the bit is 1
      > 0.67232 x 2 = 1.34464, which is >= 1, so the bit is 1
      > 0.34464 x 2 = 0.68928, which is < 1, so the bit is 0
      > 0.68928 x 2 = 1.37856, which is >= 1, so the bit is 1
      > 0.59729 has no finite binary expansion, so the remainder never reaches 0; stop once the 23 bits of the fraction field plus a few extra bits are known
   0.59729 in decimal is .10011000111001111111111101... in binary
   so, 1.59729 in binary is 1.10011000111001111111111101...
-1.59729 is already normalized: 1.10011000111001111111111101... * 2^0
keeping 23 fraction bits gives 1.10011000111001111111111 * 2^0 (the discarded bits begin 101)

single precision:
--------------------
sign bit is 1(-ve)
exp bits are (127+0=127) => 01111111
   Divide 127 successively by 2 until the quotient is 0
      > 127/2 = 63, remainder is 1
      > 63/2 = 31, remainder is 1
      > 31/2 = 15, remainder is 1
      > 15/2 = 7, remainder is 1
      > 7/2 = 3, remainder is 1
      > 3/2 = 1, remainder is 1
      > 1/2 = 0, remainder is 1
   Read remainders from the bottom to top as 1111111
   So, 127 in decimal is 1111111 in binary; pad with a leading 0 to fill the 8-bit field: 01111111
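frac bits are 10011000111001111111111 if the bits beyond the 23rd are simply truncated
the discarded bits begin with 1 and are not all zero after that, so they are worth more than half of the last kept bit
IEEE 754's default round-to-nearest mode therefore rounds up: frac bits are 10011000111010000000000

so, -1.59729 in single-precision format is 1 01111111 10011000111010000000000
in hexadecimal it is 0xBFCC7400 (plain truncation would give 0xBFCC73FF)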
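
Because 0.59729 does not terminate in binary, the result depends on how the bits beyond the 23rd are handled. The sketch below compares truncation with round-to-nearest (ties to even, the IEEE 754 default) using exact rational arithmetic, and then checks against `struct`. The helper `encode` is hypothetical and hard-codes the sign bit 1 and biased exponent 127 of this example.

```python
from fractions import Fraction
import struct

value = Fraction("1.59729")        # magnitude only; the sign is added in encode()
scaled = (value - 1) * 2**23       # exact value of the fraction field

truncated = int(scaled)            # drop everything past bit 23
remainder = scaled - truncated     # exact weight of the discarded bits

# Round to nearest, ties to even.
nearest = truncated
half = Fraction(1, 2)
if remainder > half or (remainder == half and truncated % 2 == 1):
    nearest += 1

def encode(frac_field):
    """Sign bit 1, biased exponent 127, given 23-bit fraction field."""
    return (1 << 31) | (127 << 23) | frac_field

print(f"truncated: 0x{encode(truncated):08X}")   # 0xBFCC73FF
print(f"nearest:   0x{encode(nearest):08X}")     # 0xBFCC7400

packed = struct.unpack(">I", struct.pack(">f", -1.59729))[0]
print(f"struct:    0x{packed:08X}")              # 0xBFCC7400
```

The sketch assumes rounding does not carry out of the 23-bit field, which holds for this value.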

2)
a)
0 10011001 11001101111100000000000
sign bit is 0(+ve)
exp bits are 10011001
   => 10011001
   => 1x2^7+0x2^6+0x2^5+1x2^4+1x2^3+0x2^2+0x2^1+1x2^0
   => 1x128+0x64+0x32+1x16+1x8+0x4+0x2+1x1
   => 128+0+0+16+8+0+0+1
   => 153
in decimal it is 153
so, the unbiased exponent is 153-127 = 26
frac bits are 110011011111 (trailing zeros dropped)

IEEE-754 decimal value is (-1)^sign * 1.frac * 2^exponent
IEEE-754 Decimal value is 1.110011011111 * 2^26
1.110011011111 in decimal is 1.804443359375
   => 1.110011011111
   => 1x2^0+1x2^-1+1x2^-2+0x2^-3+0x2^-4+1x2^-5+1x2^-6+0x2^-7+1x2^-8+1x2^-9+1x2^-10+1x2^-11+1x2^-12
   => 1x1+1x0.5+1x0.25+0x0.125+0x0.0625+1x0.03125+1x0.015625+0x0.0078125+1x0.00390625+1x0.001953125+1x0.0009765625+1x0.00048828125+1x0.000244140625
   => 1+0.5+0.25+0.0+0.0+0.03125+0.015625+0.0+0.00390625+0.001953125+0.0009765625+0.00048828125+0.000244140625
   => 1.804443359375
so, 1.804443359375 * 2^26 in decimal is 121094144.0
so, 01001100111001101111100000000000 in IEEE-754 single precision format is 121094144.0
Answer: 121094144.0
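
The decoding steps above can be sketched as a small function. `decode_float32` is an illustrative name, and the sketch assumes a normalized number: zero, subnormals, infinities and NaN are not handled.

```python
def decode_float32(bits):
    """Decode a normalized IEEE 754 single-precision bit string."""
    bits = bits.replace(" ", "")
    assert len(bits) == 32
    sign = int(bits[0])
    exp_field = int(bits[1:9], 2)     # 8-bit biased exponent
    frac_field = int(bits[9:], 2)     # 23-bit fraction
    # Sketch only: exponent fields 0 and 255 are special cases.
    assert 0 < exp_field < 255
    significand = 1 + frac_field / 2**23          # implicit leading 1
    return (-1) ** sign * significand * 2.0 ** (exp_field - 127)

print(decode_float32("0100 1100 1110 0110 1111 1000 0000 0000"))
# prints 121094144.0
```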

b)
1 01101011 11001101010011000000000
sign bit is 1(-ve)
exp bits are 01101011
   => 01101011
   => 0x2^7+1x2^6+1x2^5+0x2^4+1x2^3+0x2^2+1x2^1+1x2^0
   => 0x128+1x64+1x32+0x16+1x8+0x4+1x2+1x1
   => 0+64+32+0+8+0+2+1
   => 107
in decimal it is 107
so, the unbiased exponent is 107-127 = -20
frac bits are 11001101010011 (trailing zeros dropped)

IEEE-754 decimal value is (-1)^sign * 1.frac * 2^exponent
IEEE-754 decimal value is -1.11001101010011 * 2^-20
1.11001101010011 in decimal is 1.80194091796875
   => 1.11001101010011
   => 1x2^0+1x2^-1+1x2^-2+0x2^-3+0x2^-4+1x2^-5+1x2^-6+0x2^-7+1x2^-8+0x2^-9+1x2^-10+0x2^-11+0x2^-12+1x2^-13+1x2^-14
   => 1x1+1x0.5+1x0.25+0x0.125+0x0.0625+1x0.03125+1x0.015625+0x0.0078125+1x0.00390625+0x0.001953125+1x0.0009765625+0x0.00048828125+0x0.000244140625+1x0.0001220703125+1x6.103515625e-05
   => 1+0.5+0.25+0.0+0.0+0.03125+0.015625+0.0+0.00390625+0.0+0.0009765625+0.0+0.0+0.0001220703125+6.103515625e-05
   => 1.80194091796875
so, 1.80194091796875 * 2^-20 in decimal is 1.7184647731482983e-06
so, 10110101111001101010011000000000 in IEEE-754 single precision format is -1.7184647731482983e-06
Answer: -0.0000017184647731482983
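
Both hand decodings can be double-checked by letting Python's standard `struct` module reinterpret the 32-bit patterns as floats. This is only a verification sketch; the list name `decoded` is illustrative.

```python
import struct

patterns = [
    "0100 1100 1110 0110 1111 1000 0000 0000",
    "1011 0101 1110 0110 1010 0110 0000 0000",
]

decoded = []
for pattern in patterns:
    word = int(pattern.replace(" ", ""), 2)
    # Reinterpret the 4 big-endian bytes as an IEEE 754 single.
    (value,) = struct.unpack(">f", word.to_bytes(4, "big"))
    decoded.append(value)
    print(f"0x{word:08X} -> {value!r}")
```

The second value is exactly -29523 / 2^34, because 1.11001101010011 in binary equals 29523 / 2^14.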
