Hadoop decided to abandon Java serialization and instead implemented its own serialization mechanism using the Writable and WritableComparable interfaces. If you were the lead architect of Hadoop, would you have taken the same approach? Why or why not?
No, Hadoop does not use the Java serialization process, and for good reason.
Explanation:
Java comes with its own serialization mechanism, called Java Object Serialization (Java serialization for short), which is tightly integrated with the language.
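For context, here is a minimal sketch of Java Object Serialization (the Point class and other names here are illustrative): a class opts in simply by implementing java.io.Serializable, and the runtime handles writing and reading the object graph.

```java
import java.io.*;

// A class opts into Java serialization just by implementing Serializable;
// the JVM writes and reads the object graph automatically.
class Point implements Serializable {
    int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

public class JavaSerializationDemo {
    public static void main(String[] args) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(new Point(3, 4));   // serialize
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            Point p = (Point) ois.readObject(); // deserialize
            System.out.println(p.x + "," + p.y); // prints 3,4
        }
    }
}
```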
In big data processing scenarios like Hadoop's, we need precise control over exactly how objects (which may represent any data the application handles) are written and read.
Java serialization gives you some control, but you run into many problems (discussed below) that are central to Hadoop. The same hurdles come up with RMI (Remote Method Invocation), the Java API that allows an object to invoke a method on an object that exists in another address space, whether on the same machine or a remote one. RMI creates a public remote server object that enables client-server communication through simple method calls on that server object.
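As an aside, a minimal RMI sketch looks like this (the Greeter interface and GreeterServer class are hypothetical names); note that RMI relies on Java serialization to move arguments and return values between address spaces:

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// The remote interface: clients invoke this method as if it were local,
// while RMI serializes arguments and return values under the hood.
interface Greeter extends Remote {
    String greet(String name) throws RemoteException;
}

public class GreeterServer implements Greeter {
    public String greet(String name) throws RemoteException {
        return "Hello, " + name;
    }

    public static void main(String[] args) throws Exception {
        // Export the server object and publish its stub in the registry
        // so remote clients can look it up by name.
        Greeter stub = (Greeter) UnicastRemoteObject.exportObject(new GreeterServer(), 0);
        Registry registry = LocateRegistry.createRegistry(1099);
        registry.rebind("Greeter", stub);
    }
}
```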
Problems with Java Serialization:
1. Java serialization does not meet the criteria for a good serialization format: compact, fast, extensible, and interoperable.
2. Java serialization is not compact: it writes the class name of each object being written to the stream. This is true of classes that implement java.io.Serializable or java.io.Externalizable (see the sketch after this list).
3. In other words, Java serialization writes metadata about each object, including the class name, field names and types, and the superclass. ObjectOutputStream and ObjectInputStream optimize this with back-references (handles), but object sequences written with handles cannot be accessed randomly, since they rely on stream state, and this further complicates operations such as sorting.
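To make the compactness problem concrete, this sketch (class name illustrative) compares the size of a single long written with Java serialization against the raw 8 bytes a Writable-style encoding uses; the exact byte count varies by JDK, but the metadata overhead dominates:

```java
import java.io.*;

public class SerializationOverhead {
    public static void main(String[] args) throws IOException {
        // Java serialization: a stream header plus full class descriptors
        // for java.lang.Long and its superclass come along with the value.
        ByteArrayOutputStream javaBytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(javaBytes)) {
            oos.writeObject(Long.valueOf(42L));
        }
        System.out.println("Java-serialized Long: " + javaBytes.size() + " bytes");

        // Writable-style encoding: just the 8 bytes of the value itself,
        // which is exactly what Hadoop's LongWritable.write() emits.
        ByteArrayOutputStream rawBytes = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(rawBytes)) {
            dos.writeLong(42L);
        }
        System.out.println("Raw long: " + rawBytes.size() + " bytes");
    }
}
```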
In Hadoop's serialization mechanism, by contrast, the application already knows the expected class when defining a Writable, so Writables do not store their type in the serialized representation; the reader supplies the type at deserialization time.
For example, if the input key is a LongWritable instance, an empty LongWritable instance is asked to populate itself from the input data stream.
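A sketch of a custom key type under this scheme (PageVisitKey is a hypothetical example, not part of Hadoop) shows the pattern: write() emits only raw field values, readFields() populates an empty instance from the stream, and compareTo() from WritableComparable supports sorting:

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

// Hypothetical key type: only the raw field values are written, with no
// class name or field metadata, because both sides already know the type.
public class PageVisitKey implements WritableComparable<PageVisitKey> {
    private long pageId;
    private long timestamp;

    public PageVisitKey() { }                  // required no-arg constructor

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(pageId);                 // 8 bytes
        out.writeLong(timestamp);              // 8 bytes; 16 bytes total
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        pageId = in.readLong();                // an empty instance populates
        timestamp = in.readLong();             // itself from the stream
    }

    @Override
    public int compareTo(PageVisitKey other) { // enables sorting of keys
        int cmp = Long.compare(pageId, other.pageId);
        return cmp != 0 ? cmp : Long.compare(timestamp, other.timestamp);
    }
}
```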
Since no metadata (class name, fields, types, superclass) needs to be stored, this results in more compact binary files, random access, and high performance.
As Hadoop's core idea is to process large data sets with high performance, it follows its own serialization mechanism rather than Java serialization.
Note that in Hadoop, inter-process communication between nodes in the system is implemented using remote procedure calls (RPCs). The RPC protocol uses serialization to render each message into a binary stream to be sent to the remote node, which then deserializes the binary stream back into the original message.
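The serialize/deserialize round trip an RPC performs can be sketched with Hadoop's real LongWritable (the class name RpcRoundTripSketch is illustrative; actual Hadoop RPC adds framing and connection handling on top of this):

```java
import java.io.*;
import org.apache.hadoop.io.LongWritable;

public class RpcRoundTripSketch {
    public static void main(String[] args) throws IOException {
        // Sender side: render the message into a binary stream.
        LongWritable request = new LongWritable(42L);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        request.write(new DataOutputStream(bos));

        // Receiver side: deserialize the stream back into the message.
        LongWritable received = new LongWritable();
        received.readFields(new DataInputStream(
                new ByteArrayInputStream(bos.toByteArray())));
        System.out.println(received.get()); // prints 42
    }
}
```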