Hadoop has HDFS, which is the default built-in FileSystem, written in Java. Cloudera and Hortonworks both use this built-in default Java implementation. MapR has taken a different approach. What approach has MapR taken in its FileSystem implementation, and what may be the advantages and disadvantages of MapR's approach versus other vendors? If there are disadvantages, how can they be addressed? Look at the advantages and disadvantages from the user, developer, administrator, and risk perspectives.
MapR filesystem implementation:
1) The MapR filesystem allows concurrent read and write operations on disk. HDFS allows reads only from closed files and supports only append-only writes.
2) MapR provides high availability by eliminating the NameNode: file metadata is distributed across the cluster, so there is no single NameNode whose failure takes the filesystem down.
3) MapR's filesystem avoids the performance penalty that HDFS incurs under heavy I/O, where throughput drops because HDFS is layered on top of the Linux filesystem.
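The practical difference between a read-write filesystem and an append-only one can be sketched with ordinary file I/O. The example below is only an illustration under stated assumptions: it uses a local temporary file as a stand-in and does not talk to HDFS or MapR, and the function names (`update_in_place`, `update_append_only`, `demo`) and the fixed 8-byte record layout are hypothetical. It shows that changing one record in place is a single seek and write, while under append-only rules the same change means rewriting the whole file.

```python
import os
import tempfile

RECORD_SIZE = 8  # fixed-width records keep the offsets easy to compute


def update_in_place(path, index, new_record):
    """Overwrite one record at its offset, as a read-write filesystem allows."""
    assert len(new_record) == RECORD_SIZE
    with open(path, "r+b") as f:
        f.seek(index * RECORD_SIZE)
        f.write(new_record)


def update_append_only(path, index, new_record):
    """Emulate an update without random writes: copy every record to a
    new file, swapping in the changed one, then replace the old file."""
    assert len(new_record) == RECORD_SIZE
    tmp_path = path + ".rewrite"
    with open(path, "rb") as src, open(tmp_path, "wb") as dst:
        position = 0
        while True:
            record = src.read(RECORD_SIZE)
            if not record:
                break
            dst.write(new_record if position == index else record)
            position += 1
    os.replace(tmp_path, path)


def demo():
    """Apply the same one-record update with both strategies."""
    records = [b"rec-0000", b"rec-0001", b"rec-0002"]
    results = {}
    strategies = (("in_place", update_in_place),
                  ("append_only", update_append_only))
    for name, updater in strategies:
        fd, path = tempfile.mkstemp()
        os.close(fd)
        try:
            with open(path, "wb") as f:
                f.write(b"".join(records))
            updater(path, 1, b"rec-XXXX")
            with open(path, "rb") as f:
                results[name] = f.read()
        finally:
            os.remove(path)
    return results
```

Both strategies end with the same file contents; the difference is the amount of I/O, which grows with file size for the rewrite approach but stays constant for the in-place one.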
Comparisons based on different perspectives:
1) Risk: MapR has uniform platform-level security and is therefore secure by default. The same is not true of HDFS.
2) Administrator: MapR has high availability and built-in multi-tenancy and disaster recovery. With HDFS, disaster recovery is manual.
3) Developer: MapR supports analytics engines and AI and ML algorithms on the same cluster, and it supports open APIs.
Disadvantages of MapR:
1) It is more expensive. This may be offset by the fact that it was designed for zero administration.
2) HDFS is open source. MapR isn't.