Question

In: Computer Science

You are interested in analyzing some hard-to-obtain data from two separate databases. Each database contains n...

You are interested in analyzing some hard-to-obtain data from two separate databases. Each database contains n numerical values -so there are 2n values total - and you may assume that no two values are the same.

You’d like to determine the median of this set of 2n values, which we will define here to be the nth smallest value. However, the only way you can access these values is through queries to the databases. In a single query, you can specify a value k to one of the two databases, and the chosen database will return the kth smallest value that it contains. Since queries are expensive (you pay $10 for each), you would like to compute the median using as few queries as possible.

Write a java method named findMedian(Database a, Database b), that returns the median value using at most O(log n) queries.

/*
Notes for your code:

Database a, and Database b here are special objects that you can query for the kth smallest item.

  • If you call a.query(1), Database a will return the smallest item in it (and charge you $10)
  • If you call a.query(5), Database a will return the 5th smallest item in it (and charge you another $10) and so on.
  • Also, you can get the size of Database a by calling a.size(); (this will cost you nothing)
  • And if you call a.query(a.size()) this will give you the largest element in Database a (and you will pay $10 of course)
  • AND Note the important fact that no two values are the same (among both Databases)

Please give me a clear Java code with explanation !

public static int findMedian(Database a, Database b){} The head of the method should look like this !

Solutions

Expert Solution

Here is the solution to the question above. I have inserted comments wherever necessary for better understanding.

CODE:

//We can consider the two database tables as sorted arrays

//Assumption- we have 0-based indexing in database queries just like arrays. so a.query(0) will yield smallest element of database a

//O(n) approach will brute force - that is access all the elements and store it in 3 array and then find nth smallest element

//We are using here O(log n) approach - Divide and Conquer Approach



public class median{

public static float findMedian(Database a,Database b){

int m=a.size();int n=b.size(); // 0 cost

int k=n;//to find nth smallest element

int aa=a.query(1);int bb=b.query(1);

if(n==1) return (a<b)?a:b; // compare if only single element is present

else return func(a,b,m,n,k,0,0);

}

static int func(Database a,Database b, int m,int n, int k,int s1,int s2){

if(s1==m)return b.query(s2+k-1); // If we have reached end of Database(array) a

if(s2==n) return a.query(s1+k-1); // If we have reached end of Database(array) b

if(k==0 || k>(m-s1)+(n-s2)) return -1;  // k should reach 0 or exceed size of databases a,b

int current = k/2;

if(current-1 >= m-s1){

if(a.query(m-1) < b.query(s2+current-1)) return b.query(s2+(k-(m-s1)-1)); 

else return func(a,b,m,n,k-current,s1,s2+current);

}

if(current-1 >= n-s2){

if(b.query(n-1) < a.query(s1+current-1)) return a.query(s1+(k-(n-s2)-1));

else return func(a,b,m,n,k-current,s1+current,s2);

}else{

if(a.query(current+s1-1) < b.query(current+s2-1)) return func(a,b,m,n,k-current,s1+current,s2);

else return func(a,b,m,n,k-current,s1,s2+current);

}

}

}

If you find this helpful please do UPVOTE the answer.

!!!!!!!!!!!!!!!!!!!!!!!!!!!! H A P P Y C H E G G I N G !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!


Related Solutions

Your company maintains a database with information on your customers, and you are interested in analyzing...
Your company maintains a database with information on your customers, and you are interested in analyzing patters observed over the past quarter. 23% of customer in the database placed new orders within this period. However, for those customers who had a salesperson assigned to them, the new order rate was 58%. Overall, 14% of customers within the database had salesperson assigned to them. a) Draw a contingency table for this situation. b) What percentage of customers in the database placed...
You are analyzing GSS data, and you are interested in whether or not respondents with diabetes...
You are analyzing GSS data, and you are interested in whether or not respondents with diabetes have a different number of days of poor mental health than respondents who do not have diabetes. Information about the variables: Name: Days of poor mental health past 30 days Questions associated with this variable: B. Now thinking about your mental health, which includes stress, depression, and problems with emotions, for how many days during the past 30 days was your mental health not...
You are analyzing a project and have prepared the following data:                                &n
You are analyzing a project and have prepared the following data:                                     Year                             Cash flow                                        0                                 -$169,000                                        1                                  $ 46,200                                        2                                  $ 87,300                                        3                                  $ 41,000                                        4                                  $ 39,000            Required payback period        2.5 years            Required AAR                              7.25 percent                   Required return                                   8.50 percent C-1 What is the payback period? Should you accept the project? C-2 What is the profitability index of...
Part 1: You obtain an organic compound that contains C, H, N, and O and determine...
Part 1: You obtain an organic compound that contains C, H, N, and O and determine the percent composition of each element. You determine that the sample is 34.91% carbon, 6.83% hydrogen, 27.26% nitrogen, and 31.00% oxygen, each by mass. We will make the assumption that the percent listed is the same as the mass in grams, if the total mass is 100 grams. a) Using this assumption, determine the moles of each element. After, add up the moles of...
You run a regression analysis on a bivariate set of data (n=99). You obtain the regression...
You run a regression analysis on a bivariate set of data (n=99). You obtain the regression equation y=0.843x+6.762 with a correlation coefficient of r=0.954 which is significant at α=0.01 You want to predict what value (on average) for the explanatory variable will give you a value of 150 on the response variable. What is the predicted explanatory value? x = (Report answer accurate to one decimal place.) Here is a bivariate data set. Find the regression equation for the response...
What could happen if you are analyzing some data to solve an engineering problem, but the...
What could happen if you are analyzing some data to solve an engineering problem, but the data was not taken correctly and does not represent the population size?
The following data were obtained from a study using two separate toothpastes. You used them on...
The following data were obtained from a study using two separate toothpastes. You used them on the last homework in R. Now you will work with them to compute the ANOVA by hand. Subjects in group I received a fluoride-supplemented toothpaste for one year while those in group II used one without added fluoride. At the end of the trial period the number of cavities needing to be filled was used as the dependant variable. A) Compute an analysis of...
With a cube from a database or data warehouse, you are bringing data "forward." Explain what...
With a cube from a database or data warehouse, you are bringing data "forward." Explain what this means and how it impacts the front-end and back-end user from a database perspective. Provide examples to illustrate your ideas. typed please
You found a public database for hedge fund returns and are eager to do some data...
You found a public database for hedge fund returns and are eager to do some data analyses on hedge funds in general using the database. Your colleague found out that the hedge funds in the database were under no pressure to report their performance to anyone outside the funds. Based on her finding, she argues that the returns in the public data are likely to be biased upward. Would you agree? Why or why not? Briefly explain.
You found a public database for hedge fund returns and are eager to do some data...
You found a public database for hedge fund returns and are eager to do some data analyses on hedge funds in general using the database. Your colleague found out that the hedge funds in the database were under no pressure to report their performance to anyone outside the funds. Based on her finding, she argues that the returns in the public data are likely to be biased upward. Would you agree? Why or why not? Briefly explain.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT