Question

Problem 2: Do a side-by-side comparison of Cascading and the following technologies in regards to writing Hadoop applications. Make sure you include the advantages and disadvantages of each, as well as when to use each technology over the other.

  • Cascading vs. Cascalog

Solutions

Expert Solution

Answer:

Cascalog was created for developers who want to…

  • Build data applications with Clojure or Java
  • Query HDFS, databases, local data from the Clojure REPL
  • Easily run arbitrary Clojure code in your queries
  • Leverage the benefits of the Cascading application framework

Cascalog queries run as a series of MapReduce jobs. You can query from HDFS, various databases, and locally by making use of Cascading’s Tap abstraction.

Cascalog data processing code can be written in Clojure or Java. Cascalog is mainly used for processing “Big Data” on Hadoop, and it can also analyze data residing on a local computer. Like Pig, Hive, and Cascading, it is a data processing tool. The major difference is that Cascalog operates at a significantly higher level of abstraction than those tools.

Cascading provides a set of high-level APIs that internally call the Hadoop MapReduce framework and launch MapReduce jobs. It lets a Java developer solve a MapReduce problem by writing simple Java programs built from constructs such as Grouping, Aggregate, and Function.
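
To make the Function, Grouping, and Aggregate constructs concrete, here is a minimal plain-Java analogy of a word count. It deliberately does not use the Cascading API: the class and method names (PipelineSketch, splitIntoWords, countWords) are hypothetical, and the code runs in memory with java.util.stream rather than on Hadoop. It only illustrates the shape of such a pipeline.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java analogy only: none of these names come from the Cascading API.
public class PipelineSketch {

    // "Function" step: turn one input line into zero or more words.
    static List<String> splitIntoWords(String line) {
        return Arrays.stream(line.toLowerCase().split("\\s+"))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.toList());
    }

    // "Grouping" and "Aggregate" steps: group by word, then count each group.
    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> splitIntoWords(line).stream())
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("big data on hadoop", "data pipelines on hadoop");
        // prints {big=1, data=2, hadoop=2, on=2, pipelines=1}
        System.out.println(countWords(lines));
    }
}
```

The point of the analogy is that the developer describes what happens to the records (split, group, count) and leaves the job mechanics to the framework.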

The Cascading framework provides an abstraction layer on top of Hadoop and allows enterprises to leverage existing skills and resources to build data processing applications on Apache Hadoop, without specialized Hadoop skills.
Because Cascading is Java-based, it fits naturally with JVM-based languages such as Scala, Clojure, JRuby, Jython, and Groovy. In many of these languages, scripting and query languages have been created that simplify ad hoc and production-ready analytics as well as machine learning applications.

On the other hand, core MapReduce programming requires a developer to understand MapReduce constructs such as partitioning, sort, shuffle, and map/reduce. These take time to learn, so getting up to speed can take longer.
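
The constructs named above can be sketched in memory as follows. This is an illustration under stated assumptions, not Hadoop code: the class MapReducePhasesSketch and its methods are hypothetical names, everything runs in a single JVM, and the "shuffle" is just a sorted map per reducer. The partition formula follows the hash-modulo idea used by Hadoop's default hash partitioner.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of the phases; it does not use Hadoop's classes.
public class MapReducePhasesSketch {

    // Map phase: emit a (word, 1) pair for every word in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Partitioning: choose which reducer receives a given key.
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Reduce phase: sum all values collected for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    static Map<String, Integer> run(List<String> lines, int numReducers) {
        // Shuffle and sort: each reducer gets its keys in sorted order,
        // with all values for a key gathered into one list.
        List<TreeMap<String, List<Integer>>> reducerInputs = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            reducerInputs.add(new TreeMap<>());
        }
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                int reducer = partition(pair.getKey(), numReducers);
                reducerInputs.get(reducer)
                        .computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                        .add(pair.getValue());
            }
        }
        // Run the reducers and merge their outputs for display.
        Map<String, Integer> output = new TreeMap<>();
        for (TreeMap<String, List<Integer>> input : reducerInputs) {
            for (Map.Entry<String, List<Integer>> group : input.entrySet()) {
                output.put(group.getKey(), reduce(group.getKey(), group.getValue()));
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("big data on hadoop", "data pipelines on hadoop");
        // prints {big=1, data=2, hadoop=2, on=2, pipelines=1}
        System.out.println(run(lines, 2));
    }
}
```

Even in this toy form, the developer has to write and reason about each phase explicitly, which is the learning cost that higher-level tools aim to hide.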

The advantage of writing a core MapReduce job is that, if you know the framework well, you have finer-grained control over your flow and can probably write more optimal flows.

Cascading is a proven application development platform for building Big Data applications on Apache Hadoop. Whether the data problem is simple or complex, Cascading balances a high level of abstraction with the necessary degrees of freedom through its computation engine, systems integration framework, and data processing and scheduling capabilities.