Question

In: Computer Science

A sequential file contains at most four billion 32-bit integers in random order, we need to...

A sequential file contains at most four billion 32-bit integers in random order, we need to find a 32-bit integer that isn’t in the file. Note that certainly we can simply read them all into memory (if we have enough memory) and sort them. That would cost O(n log n) for sorting and O(n) for searching. Think if we do better than that WITHOUT sorting and answer the following questions with better solutions.

(a) Explain why there must be at least one such integer missing?

(b) If you have some memory that is enough to hold one bit for each such integer, i.e., with such bitmap approach, how would you solve the problem?

(c) If you only have a few hundred bytes of main memory but you could use several external files, how would you solve it? (hint: use binary search)

Solutions

Expert Solution

Answer a

32-bit integer means an integer value can be from 0 till 232-1.

32-bit Integer value range = 0 to 4,294,967,295 means more than 4 billion.

So if we have file having 4 billion integers in random order, obviously there will be many numbers missing in the file. To have all numbers supported by 32-bit integer file must have each and every value from 0 to 4,294,967,295 which is not the case.

So we can say for sure at least one number is missing.

Answer b

If we need to solve the missing numbers problem we can have bitmap where each bit indicating whether particular number is available in file or not.

so we need to have 232 bits = 4,294,967,296 bits which means 512 MB memory.

We can create bit map for 4,294,967,296 entries and initialize it to 0. We will scan through the file. For any number found,we mark corresponding bit as 1. If we find same number again we don't change anything. At the end we scan the bit map and list out the entries with value 0. Those all are missing numbers.

Answer c

If we have few hundreds bytes of memory. We can create bit map accordingly. And in multiple scan of files we can find out which numbers are missing.

For example 512 Bytes means we can have bit map for 4096 integers. In first scan if we find any number from 0 to 4095 we store the bit value. After the scan we scan the bit map to find integers missing based on 0 value of bit. In 2nd scan we can store again any number from 4096 - 8191. Same way we can scan full file in multiple scans.


Related Solutions

- sparc assembly - *Write a program that takes four 32-bit integers (A,B,C,D) in hexadecimal form...
- sparc assembly - *Write a program that takes four 32-bit integers (A,B,C,D) in hexadecimal form and calculates A*B + C*D. (Assumption: User input is 32-bit 0 or positive. The result is expressed in 64 bits.) [result example] bash $ assm Hex value? ffffffff Hex value? 8 Hex value? ffffffff Hex value? 8 Result is 0000000f fffffff0
For each of the following file processing operations, indicate whether a sequential file, indexed random file, virtual storage access method, hashing,
For each of the following file processing operations, indicate whether a sequential file, indexed random file, virtual storage access method, hashing, or pointer structure would work best. You may choose as many as you wish for each step. Also, indicate which would perform the least optimally.a. Retrieve a record from the file based on its primary key value.b. Update a record in the file. c. Read a complete file of records. d. Find the next record in a file. e....
C programing language A file "data.txt" contains only integers. Write a code to find average of...
C programing language A file "data.txt" contains only integers. Write a code to find average of all values and print the average How would you use execlp function to execute "ps –e –a –l" command char *dt = "The five boxing wizards jump quickly"; write a program to count frequency of each letter, ignore case. Print the letter and frequency of each letter. // 1A: . Ask the user to enter a password string, store it in pass. Password should...
Assume that we are executing the following code on a 32-bit machine using two’s complement arithmetic...
Assume that we are executing the following code on a 32-bit machine using two’s complement arithmetic for signed integers. Which of the following will be printed when the following code is executed (circle those printed, and show work; e.g., how the values are stored): #include <stdio.h> int main() { char x = 0xF;                // x = ________ char y = -1;                 // y = ________ unsigned char z = 0xFF;      // z = 11111111        if (x<z)     printf("performed unsigned compare,...
Suppose we have a sequential ordered file of 200,000 records, where each record is 200 bytes....
Suppose we have a sequential ordered file of 200,000 records, where each record is 200 bytes. Assume blocksize = 2048 bytes (10 records per block), average seek time = 10 ms, average rotational delay = 8.3 ms, and block transfer time = 0.8 ms. Suppose we want to make X independent random record reads from the file. This could be done in two different approaches. read the entire file once and look for the X records of interest use a...
We have talked a bit about how chars are really just ASCII integers. Let's see this...
We have talked a bit about how chars are really just ASCII integers. Let's see this in action. We will show the characters starting at ASCII 33 and end with a user-supplied number. Use a for loop to output the characters with two spaces between each. Prompt: This program will output the ASCII characters starting at 33 Enter the integer for the last ASCII character you wish to see: [user types: 43] Output: ! " # $ % & '...
3. The following table contains data on health assessment for a random sample of 32 cases...
3. The following table contains data on health assessment for a random sample of 32 cases from the GSS 2006. Health is measured according to a four-point scale: 1 = excellent, 2 = good, 3 = fair, and 4 = poor. Four social classes are reported here: lower, working, middle, and upper. Using  = 0.05, test the null hypothesis that there is no difference between the groups. State very clear your hypothesis and your conclusions. You can use Excel...
Construct an array of 1000 random integers within range [0, 100] An input file input.txt is...
Construct an array of 1000 random integers within range [0, 100] An input file input.txt is provide. Each line of input.txt is a query integer that you need to check how many of that number is in your random integer array. For each query integer, fork a new child process to do the counting. The output is for each input query, output the count and child process id. For example: $> query: 13    count: 5    pid: 13342 $> query: 22   ...
Suppose that we are using extendable hashing on a file that contains records with the following...
Suppose that we are using extendable hashing on a file that contains records with the following search-key values: (2, 3, 5, 7, 11, 17, 19, 23, 29, 31). Show the final extendable hash structure for this file if the hash function is h(x) = x mod 8 and buckets can hold three records. Recall that: 1. the bucket address table is indexed by the prefix of the binary representation of h(x). 2. Initially i = 0, i.e. the prefix consists...
Consider an array of length n containing positive and negative integers in random order. Write the...
Consider an array of length n containing positive and negative integers in random order. Write the C++ code that rearranges the integers so that the negative integers appear before the positive integers. write a program that includes both functions and a main() function that tests them. Name the two functions rearrangeN() and rearrangeN2().
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT