Question

1. Read the paper "The Netflix Challenge: Datacenter Edition" to understand the relationship between the Netflix challenge and the cluster resource allocation problem.

2. Quasar classifies resource allocation for scale-up, scale-out, heterogeneity, and interference. Why are classification criteria important, and how are they applied?

3. What are stragglers and how does Quasar deal with them?

Solutions

(1) Netflix problem and cluster allocation

Netflix is one of the largest on-demand internet streaming media services. It holds a huge amount of data from which useful information can be gathered.

Clustering techniques can be used to extract information from this data in support of business decisions.

Suppose, for example, that we want to find which movie is the most liked among a set of recently released movies.

Clustering groups items together based on their attributes. The data are typically unlabeled, and similarity is measured by the distance between two points.

E.g.: classifying the available list of movies under the genre action or romance.

Types of clustering

k-means clustering

Agglomerative clustering

The partitional approach described here works as a memory-based, nearest-neighbor method (strictly speaking, this procedure is k-nearest neighbors rather than k-means). It measures the distance between the query instance (movieID, genre) and every instance in the training set. We find the K training instances with the smallest distance from the query instance and average their ratings. This average is the predicted rating for the query instance. The distance can be computed with the Euclidean, Manhattan, Minkowski, or Mahalanobis distance formula. Which formula to use depends on the type of dataset.
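The nearest-neighbor prediction described above can be sketched as follows. This is only a minimal illustration: the function names, the two-element feature vectors, and the sample ratings are assumptions made for the example, and Euclidean distance is used as one of the possible choices.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_rating(query, training, k=3):
    """Average the ratings of the k training instances closest to query.

    training is a list of (feature_vector, rating) pairs (hypothetical layout).
    """
    if not training:
        raise ValueError("training set is empty")
    nearest = sorted(training, key=lambda item: euclidean(query, item[0]))[:k]
    return sum(rating for _, rating in nearest) / len(nearest)


# Hypothetical data: features are (action score, romance score).
training = [
    ((1.0, 0.0), 5.0),
    ((0.9, 0.1), 4.0),
    ((0.0, 1.0), 1.0),
    ((0.1, 0.9), 2.0),
]
print(predict_rating((1.0, 0.0), training, k=2))
```

Swapping `euclidean` for a Manhattan or Minkowski distance only changes the `key` function, and the rest of the procedure stays the same.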

Agglomerative clustering is a clustering-based nearest-neighbor approach in which the genre of every movie is obtained from external sources.

We create a vector with one cell per genre and count the number of movies that the users have rated in that particular genre. This captures the collective opinion of the users.
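A genre vector of this kind could be built as in the sketch below. The genre list, the movie IDs, and the `movie_genres` mapping are hypothetical stand-ins for the external genre source mentioned above.

```python
# Hypothetical, fixed ordering of genres; one vector cell per genre.
GENRES = ["action", "romance", "comedy"]


def genre_vector(rated_movie_ids, movie_genres):
    """Count how many rated movies fall into each genre.

    movie_genres maps a movie ID to a list of genre names (assumed layout).
    """
    counts = {genre: 0 for genre in GENRES}
    for movie_id in rated_movie_ids:
        for genre in movie_genres.get(movie_id, []):
            if genre in counts:
                counts[genre] += 1
    return [counts[genre] for genre in GENRES]


movie_genres = {1: ["action"], 2: ["romance"], 3: ["action", "comedy"]}
print(genre_vector([1, 3], movie_genres))
```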

(2) Resource allocation classification by Quasar

Quasar uses the results of classification to perform resource allocation and assignment jointly, which eliminates the inherent inefficiency of performing allocation without knowing the assignment challenges.
A greedy algorithm combines the results of the four independent classifications to select the number and the specific set of resources that meet the performance constraints.
Quasar also monitors workload performance. If the constraint is not met at some point, or resources sit idle, then either the workload has changed in load or phase, the classification was incorrect, or the greedy scheme led to suboptimal results. In any case, Quasar adjusts the allocation and assignment, or it reclassifies and reschedules the workload.
Unlike Paragon, Quasar handles both resource allocation and assignment. Its classification step also characterizes scale-up and scale-out behavior for each workload, and Quasar introduces an interface for expressing performance constraints.
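To make the idea of a greedy selection concrete, here is a heavily simplified sketch. It is not Quasar's actual algorithm: it assumes that classification has already been reduced to one estimated performance-per-core number for each server, that performance adds up linearly across cores and servers, and all names and numbers are hypothetical.

```python
import math


def greedy_allocate(servers, target):
    """Pick cores from the best-rated servers first until target is met.

    servers: list of (name, free_cores, est_perf_per_core) tuples, where the
    estimate is assumed to come from the classification step.
    Returns (allocation, achieved), with allocation as (name, cores) pairs.
    """
    allocation = []
    achieved = 0.0
    # Visit servers from highest to lowest estimated per-core performance.
    for name, free_cores, perf in sorted(servers, key=lambda s: s[2], reverse=True):
        if achieved >= target:
            break
        if free_cores <= 0 or perf <= 0:
            continue
        needed = math.ceil((target - achieved) / perf)
        cores = min(free_cores, needed)
        allocation.append((name, cores))
        achieved += cores * perf
    return allocation, achieved


servers = [("a", 4, 10.0), ("b", 8, 20.0), ("c", 2, 5.0)]
allocation, achieved = greedy_allocate(servers, target=200.0)
print(allocation, achieved)
```

If `achieved` ends up below `target`, the constraint cannot be met with the free resources under these assumptions. This corresponds to the case in which the allocation has to be adjusted or the workload rescheduled.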

Scale-up classification: this explores how performance varies with the amount of resources used within a server.

Scale-out classification: this type of classification applies only to workloads that can use multiple servers, such as distributed frameworks and stateless or stateful distributed services. Scale-out classification requires one more run in addition to the single-node runs.

Heterogeneity classification: this requires one more profiling run on a different, randomly chosen server type, using the same workload parameters and the same duration as a scale-up run. Collaborative filtering can then estimate the workload's performance across all other server types.
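The sketch below shows how a few profiling runs can be extended to unprofiled server types by comparing them with previously seen workloads. It uses a simple neighborhood-based form of collaborative filtering chosen for readability, which is not necessarily the method the Quasar authors use. The workload names, server-type indices, and performance numbers are invented.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def estimate_missing(observed, known):
    """Fill in performance on unprofiled server types.

    observed: {server_type_index: measured_perf} for the new workload.
    known: {workload_name: [perf on every server type]} from past workloads.
    The most similar past workload (on the profiled types) is rescaled to
    match the new measurements and used for the remaining types.
    """
    cols = sorted(observed)
    new_vals = [observed[c] for c in cols]
    best = max(known.values(), key=lambda row: cosine(new_vals, [row[c] for c in cols]))
    ref = [best[c] for c in cols]
    # Least-squares scale factor between the new workload and its neighbor.
    scale = sum(n * r for n, r in zip(new_vals, ref)) / sum(r * r for r in ref)
    return {i: observed.get(i, scale * p) for i, p in enumerate(best)}


known = {"A": [1.0, 2.0, 4.0], "B": [1.0, 1.0, 1.0]}
estimate = estimate_missing({0: 2.0, 1: 4.0}, known)
print(estimate)
```

Here the new workload was profiled on server types 0 and 1 only. The value for type 2 is an estimate borrowed from the most similar past workload.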

Interference classification: this quantifies the sensitivity of the workload to interference caused and tolerated in various shared resources, including the CPU, cache hierarchy, memory capacity and bandwidth, and storage and network bandwidth.

(3) Stragglers

Stragglers are underperforming tasks that run at least 50% slower than the median task.

In frameworks like Hadoop or Spark, individual tasks may take longer than usual to complete for reasons ranging from poor work partitioning to network interference and machine instability.

These straggling tasks are typically identified and relaunched by the framework to ensure timely job completion.

Straggler detection in Hadoop works as follows.
Quasar calls the TaskTracker API in Hadoop and checks for underperforming tasks, that is, tasks that are at least 50% slower than the median.
For such tasks, Quasar injects two contentious microbenchmarks into the corresponding servers and reclassifies the tasks with respect to interference caused and tolerated.
If the results of this in-place classification differ from the original by more than 20%, Quasar signals the task as a straggler and notifies the Hadoop JobTracker to relaunch it on a newly assigned server.
This allows Quasar to detect stragglers 19% earlier than Hadoop, and 8% earlier than the LATE scheduler.
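The two thresholds in the steps above can be expressed as a small sketch. It does not call the real Hadoop TaskTracker or JobTracker APIs. Task durations and interference scores are passed in as plain numbers, and "50% slower" is interpreted here as a duration of at least 1.5 times the median, which is an assumption.

```python
from statistics import median


def underperforming(task_times, slowdown=0.5):
    """Return the tasks whose duration is at least (1 + slowdown) x median."""
    med = median(task_times.values())
    return [task for task, t in task_times.items() if t >= (1 + slowdown) * med]


def is_straggler(original_score, inplace_score, threshold=0.2):
    """True if the in-place classification differs from the original
    by more than threshold (relative difference)."""
    if original_score == 0:
        return inplace_score != 0
    return abs(inplace_score - original_score) / abs(original_score) > threshold


# Hypothetical task durations in seconds.
task_times = {"t1": 10, "t2": 11, "t3": 12, "t4": 20}
candidates = underperforming(task_times)
print(candidates)

# Hypothetical interference scores before and after reclassification.
print(is_straggler(0.50, 0.65))
```

Only tasks that pass both checks would be reported for relaunch. A slow task whose reclassification stays within 20% of the original is left alone.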

