1. Read the Netflix Challenge – datacenter edition paper to understand the relationship between the Netflix challenge and the cluster resource allocation problem.
2. Quasar classifies resource allocation for scale-up, scale-out, heterogeneity, and interference. Why are classification criteria important, and how are they applied?
3. What are stragglers and how does Quasar deal with them?
(1) Netflix problem and cluster allocation
Netflix is one of the largest on-demand internet streaming services. It holds a huge amount of data from which information can be gathered.
Clustering techniques can be used to extract information from this data in support of valuable business ideas.
For example, we may want to find which movie is the most liked among a set of recently released movies.
Clustering groups items together based on their attributes. The data are typically unlabeled, and similarity is measured by the distance between two points.
E.g., classifying the available list of movies under the genres action or romance.
Types of clustering
k-means clustering
Agglomerative clustering
k-means clustering (partitional clustering): as described here, this is a memory-based method that measures the distance between the query instance (movieID, Genre) and every instance in the training set. We find the K training instances with the smallest distance to the query instance and average their ratings. This average is the predicted rating for the query instance. (Strictly speaking, this find-the-K-closest-and-average procedure is the k-nearest-neighbors method; k-means itself partitions the data into K clusters around centroids.) The distance can be computed with the Euclidean, Manhattan, Minkowski, or Mahalanobis distance formula. Which formula to use depends on the type of dataset.
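The distance-and-average procedure described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the encoding of a movie as a numeric genre vector, and the sample ratings are all hypothetical and do not come from the Netflix data.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (city-block) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def predict_rating(query, training, k=3, distance=euclidean):
    """Average the ratings of the k training instances closest to `query`.

    `training` is a list of (feature_vector, rating) pairs.
    """
    if k < 1 or not training:
        raise ValueError("need k >= 1 and a non-empty training set")
    # Sort every training instance by its distance to the query, keep the k closest.
    nearest = sorted(training, key=lambda item: distance(query, item[0]))[:k]
    return sum(rating for _, rating in nearest) / len(nearest)


# Hypothetical data: each feature vector is [is_action, is_romance].
training = [
    ([1, 0], 5.0),
    ([1, 0], 4.0),
    ([0, 1], 2.0),
    ([0, 1], 1.0),
    ([1, 1], 3.0),
]

print(predict_rating([1, 0], training, k=2))  # 4.5
```

Passing `distance=manhattan` swaps the metric without changing the rest of the procedure, which is the sense in which the choice of formula depends on the dataset.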
Agglomerative clustering: a clustering-based nearest-neighbor approach, with the genre of every movie obtained from external sources.
We create a vector in which each cell represents one genre and holds the number of movies that users have rated in that genre. This captures the collective opinion of the users.
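The genre vector described above could be built roughly like this. The genre list, the movie-to-genre mapping, and the `(user_id, movie_id)` rating records are hypothetical placeholders, and counting one entry per rating is an assumption about what "number of movies rated" means here.

```python
from collections import Counter

# Hypothetical genre list; one vector cell per genre, in this order.
GENRES = ["action", "comedy", "romance"]


def genre_vector(ratings, movie_genres, genres=GENRES):
    """Count how many rated movies fall under each genre.

    `ratings` is an iterable of (user_id, movie_id) pairs.
    `movie_genres` maps movie_id to a list of genre names, standing in for
    genre labels obtained from an external source.
    """
    counts = Counter()
    for _user_id, movie_id in ratings:
        for genre in movie_genres.get(movie_id, ()):
            counts[genre] += 1
    # Counter returns 0 for genres that never appeared.
    return [counts[g] for g in genres]


movie_genres = {
    101: ["action"],
    102: ["action", "comedy"],
    103: ["romance"],
}
ratings = [("u1", 101), ("u1", 102), ("u2", 103)]

print(genre_vector(ratings, movie_genres))  # [2, 1, 1]
```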
(2) Resource allocation classification by Quasar
Quasar uses the results of classification to jointly perform resource allocation and assignment, which eliminates the inherent inefficiencies of performing allocation without knowing the assignment challenges.
A greedy algorithm combines the results of the four independent classifications to select the number and the specific set of resources that meet the performance constraints.
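To make the idea of a greedy selection concrete, here is a deliberately simplified sketch. It is not Quasar's actual scheduler: it assumes the classification results have already been reduced to a single estimated performance number per candidate server, and that these estimates simply add up. Both are assumptions made for illustration, and `greedy_allocate` is a hypothetical name.

```python
def greedy_allocate(servers, target):
    """Pick servers in decreasing order of estimated performance until the
    summed estimate reaches `target`.

    `servers` maps a server name to its estimated performance contribution
    (assumed additive, which is a simplification).
    Returns the list of chosen names, or None if the target cannot be met.
    """
    chosen, total = [], 0.0
    ranked = sorted(servers.items(), key=lambda kv: kv[1], reverse=True)
    for name, perf in ranked:
        if total >= target:
            break  # constraint already satisfied; stop adding resources
        chosen.append(name)
        total += perf
    return chosen if total >= target else None


# Hypothetical per-server estimates and a performance target.
servers = {"a": 40.0, "b": 25.0, "c": 10.0}
print(greedy_allocate(servers, 60.0))   # ['a', 'b']
print(greedy_allocate(servers, 100.0))  # None
```

Taking the best-ranked resources first keeps the allocation small, but as the text notes, a greedy choice can be suboptimal, which is one reason performance is monitored afterwards.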
Quasar also monitors workload performance. If the constraint is not met at some point, or resources are idle, then either the workload has changed in load or phase, the classification was incorrect, or the greedy scheme led to suboptimal results. In any case, Quasar adjusts the allocation and assignment, or it reclassifies and reschedules the workload.
Unlike Paragon, Quasar handles both resource allocation and assignment. Its classification step also characterizes scale-out and scale-up issues for each workload. Quasar also introduces an interface for performance constraints.
Scale-up classification: explores how performance varies with the amount of resources used within a server.
Scale-out classification: applies only to workloads that can use multiple servers, such as distributed frameworks and stateless or stateful distributed services. It requires one more run in addition to the single-node runs.
Heterogeneity classification: requires one more profiling run on a different, randomly chosen server type, using the same workload parameters and the same duration as a scale-up run. Collaborative filtering can then estimate the workload performance across all other server types.
Interference classification: quantifies the sensitivity of the workload to interference caused and tolerated in various shared resources, including the CPU, cache hierarchy, memory capacity and bandwidth, and storage and network bandwidth.
(3) Stragglers
Stragglers are underperforming tasks, here taken to be tasks that are at least 50% slower than the median.
In frameworks like Hadoop or Spark, individual tasks may take longer than usual to complete for reasons ranging from poor work partitioning to network interference and machine instability.
These straggling tasks are typically identified and relaunched by the framework to ensure timely job completion.
Straggler detection in Hadoop with Quasar works as follows.
Quasar calls the TaskTracker API in Hadoop and checks for underperforming tasks that are at least 50% slower than the median.
For such tasks, Quasar injects two contentious microbenchmarks in the corresponding servers and reclassifies the tasks with respect to interference caused and tolerated.
If the results of this in-place classification differ from the original by more than 20%, Quasar signals the task as a straggler and notifies the Hadoop JobTracker to relaunch it on a newly assigned server.
This allows Quasar to detect stragglers 19% earlier than Hadoop, and 8% earlier than LATE, for the Hadoop applications evaluated.
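The two thresholds in the detection steps above (50% slower than the median, then a difference of more than 20% after reclassification) can be sketched as follows. This does not call any real Hadoop API: task times are passed in as a plain dictionary, interference classification is reduced to a single hypothetical score per task, and `reclassify` is a placeholder callback standing in for the microbenchmark injection. Reading "50% slower" as a running time of at least 1.5 times the median is also an assumption.

```python
from statistics import median

SLOWDOWN_FACTOR = 1.5        # assumed reading of "at least 50% slower than the median"
RECLASSIFY_THRESHOLD = 0.20  # classification must differ by more than 20%


def underperforming(task_times):
    """Return ids of tasks whose time is at least 1.5x the median time."""
    med = median(task_times.values())
    return [t for t, secs in task_times.items() if secs >= SLOWDOWN_FACTOR * med]


def differs(original, in_place):
    """True if the in-place score deviates from the original by more than 20%."""
    if original == 0:
        return in_place != 0
    return abs(in_place - original) / abs(original) > RECLASSIFY_THRESHOLD


def find_stragglers(task_times, original_scores, reclassify):
    """Flag slow tasks whose in-place reclassification differs from the original.

    `reclassify(task_id)` is a hypothetical callback returning a new score.
    """
    return [
        task
        for task in underperforming(task_times)
        if differs(original_scores[task], reclassify(task))
    ]


# Hypothetical task times (seconds) and interference scores.
task_times = {"t1": 100, "t2": 110, "t3": 105, "t4": 180}
original_scores = {"t1": 0.50, "t2": 0.50, "t3": 0.50, "t4": 0.50}

print(find_stragglers(task_times, original_scores, lambda task: 0.65))  # ['t4']
```

In a real deployment the flagged task ids would then be handed to the framework for relaunch on a newly assigned server; that step is omitted here.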