Question

In: Computer Science

A sequential file contains at most four billion 32-bit integers in random order, we need to...

A sequential file contains at most four billion 32-bit integers in random order, we need to find a 32-bit integer that isn’t in the file. Note that certainly we can simply read them all into memory (if we have enough memory) and sort them. That would cost O(n log n) for sorting and O(n) for searching. Think if we do better than that WITHOUT sorting and answer the following questions with better solutions.

(a) Explain why there must be at least one such integer missing?

(b) If you have some memory that is enough to hold one bit for each such integer, i.e., with such bitmap approach, how would you solve the problem?

(c) If you only have a few hundred bytes of main memory but you could use several external files, how would you solve it? (hint: use binary search)

Solutions

Expert Solution

Answer a

32-bit integer means an integer value can be from 0 till 232-1.

32-bit Integer value range = 0 to 4,294,967,295 means more than 4 billion.

So if we have file having 4 billion integers in random order, obviously there will be many numbers missing in the file. To have all numbers supported by 32-bit integer file must have each and every value from 0 to 4,294,967,295 which is not the case.

So we can say for sure at least one number is missing.

Answer b

If we need to solve the missing numbers problem we can have bitmap where each bit indicating whether particular number is available in file or not.

so we need to have 232 bits = 4,294,967,296 bits which means 512 MB memory.

We can create bit map for 4,294,967,296 entries and initialize it to 0. We will scan through the file. For any number found,we mark corresponding bit as 1. If we find same number again we don't change anything. At the end we scan the bit map and list out the entries with value 0. Those all are missing numbers.

Answer c

If we have few hundreds bytes of memory. We can create bit map accordingly. And in multiple scan of files we can find out which numbers are missing.

For example 512 Bytes means we can have bit map for 4096 integers. In first scan if we find any number from 0 to 4095 we store the bit value. After the scan we scan the bit map to find integers missing based on 0 value of bit. In 2nd scan we can store again any number from 4096 - 8191. Same way we can scan full file in multiple scans.


Related Solutions

- sparc assembly - *Write a program that takes four 32-bit integers (A,B,C,D) in hexadecimal form...
- sparc assembly - *Write a program that takes four 32-bit integers (A,B,C,D) in hexadecimal form and calculates A*B + C*D. (Assumption: User input is 32-bit 0 or positive. The result is expressed in 64 bits.) [result example] bash $ assm Hex value? ffffffff Hex value? 8 Hex value? ffffffff Hex value? 8 Result is 0000000f fffffff0
Assuming integers are represented as 32-bit words and negative numbers are represented using the 2's complimentary...
Assuming integers are represented as 32-bit words and negative numbers are represented using the 2's complimentary method convert the following decimal numbers to hexadecimal numbers (show your work). a. -1314, b. 2020
IN MIPS ASSEMBLY LANGUAGE Write an assembly program to read three 32-bit signed integers from the...
IN MIPS ASSEMBLY LANGUAGE Write an assembly program to read three 32-bit signed integers from the user. Determine the smallest of these three numbers and display this result. Don’t use loops. Prompt the user for each entered integer. Your output should look something like the following example. Enter an integer: 7556 Enter an integer: -22984 Enter an integer: 8875 -22984
For each of the following file processing operations, indicate whether a sequential file, indexed random file, virtual storage access method, hashing,
For each of the following file processing operations, indicate whether a sequential file, indexed random file, virtual storage access method, hashing, or pointer structure would work best. You may choose as many as you wish for each step. Also, indicate which would perform the least optimally.a. Retrieve a record from the file based on its primary key value.b. Update a record in the file. c. Read a complete file of records. d. Find the next record in a file. e....
In writing integers to a file you need to connect 2 streams together to accomplish this...
In writing integers to a file you need to connect 2 streams together to accomplish this task. Below show the code you might use to connect 2 stream classes together. (Use a FileOutputStream and a DataOutputStream)
C programing language A file "data.txt" contains only integers. Write a code to find average of...
C programing language A file "data.txt" contains only integers. Write a code to find average of all values and print the average How would you use execlp function to execute "ps –e –a –l" command char *dt = "The five boxing wizards jump quickly"; write a program to count frequency of each letter, ignore case. Print the letter and frequency of each letter. // 1A: . Ask the user to enter a password string, store it in pass. Password should...
Assume that we are executing the following code on a 32-bit machine using two’s complement arithmetic...
Assume that we are executing the following code on a 32-bit machine using two’s complement arithmetic for signed integers. Which of the following will be printed when the following code is executed (circle those printed, and show work; e.g., how the values are stored): #include <stdio.h> int main() { char x = 0xF;                // x = ________ char y = -1;                 // y = ________ unsigned char z = 0xFF;      // z = 11111111        if (x<z)     printf("performed unsigned compare,...
Suppose we have a sequential ordered file of 200,000 records, where each record is 200 bytes....
Suppose we have a sequential ordered file of 200,000 records, where each record is 200 bytes. Assume blocksize = 2048 bytes (10 records per block), average seek time = 10 ms, average rotational delay = 8.3 ms, and block transfer time = 0.8 ms. Suppose we want to make X independent random record reads from the file. This could be done in two different approaches. read the entire file once and look for the X records of interest use a...
3. The following table contains data on health assessment for a random sample of 32 cases...
3. The following table contains data on health assessment for a random sample of 32 cases from the GSS 2006. Health is measured according to a four-point scale: 1 = excellent, 2 = good, 3 = fair, and 4 = poor. Four social classes are reported here: lower, working, middle, and upper. Using  = 0.05, test the null hypothesis that there is no difference between the groups. State very clear your hypothesis and your conclusions. You can use Excel...
We have talked a bit about how chars are really just ASCII integers. Let's see this...
We have talked a bit about how chars are really just ASCII integers. Let's see this in action. We will show the characters starting at ASCII 33 and end with a user-supplied number. Use a for loop to output the characters with two spaces between each. Prompt: This program will output the ASCII characters starting at 33 Enter the integer for the last ASCII character you wish to see: [user types: 43] Output: ! " # $ % & '...
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT