Advantages of using Kafka Streams vs. Elasticsearch:
Kafka:
- Apache Kafka can handle a very high volume of I/O (writes)
using just 3-4 inexpensive servers.
- It scales very well over large workloads and can handle
extreme-scale deployments (e.g. LinkedIn, with 300 billion user
events each day).
- The same Kafka setup can be used as a messaging bus, a storage
system, or a log aggregator, which makes it easy to maintain as one
system feeding multiple applications.
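To illustrate the last point, here is a minimal sketch of the idea behind it: one append-only log that stores records and lets several applications read them independently. This is a toy in-memory model, not Kafka's API; the class name `ToyLog`, its methods, and the group names "analytics" and "audit" are all made up for illustration.

```python
from collections import defaultdict


class ToyLog:
    """In-memory stand-in for one append-only log (hypothetical, not Kafka's API)."""

    def __init__(self):
        self._records = []                # retained records, in append order
        self._offsets = defaultdict(int)  # next offset to read, per consumer group

    def append(self, record):
        """Store a record and return the offset it was written at."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group, max_records=10):
        """Return the next unread records for `group` and advance its offset."""
        start = self._offsets[group]
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


log = ToyLog()
for event in ["login", "click", "logout"]:
    log.append(event)

# Two independent applications read the same stored records,
# each at its own pace, without affecting the other.
analytics_batch = log.poll("analytics")
audit_first = log.poll("audit", max_records=1)
audit_rest = log.poll("audit")
```

Because records stay in the log after being read, the same data acts as storage and as a message feed at once, which is roughly why a single setup can serve several consumers.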
Elasticsearch:
- Super-fast search on millions of documents. We have over 2
billion documents in our index, and retrieval times are still
under 1 second.
- Analytics on top of your search. If you organize your data
appropriately, Elasticsearch can serve as a distributed OLAP
system.
- Elasticsearch is great for geographic data as well, including
searching and filtering with GeoJSON, and a variety of geospatial
algorithms.
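As a rough sketch of the analytics and geographic points above, the snippet below builds a search request body that filters documents by distance from a point and buckets the matches with a terms aggregation. It only constructs the JSON and does not talk to a cluster. The field names `location` and `category`, the helper `build_geo_search`, and the aggregation name `per_bucket` are assumptions for illustration; the `bool`/`filter` form also assumes a reasonably recent Elasticsearch version, since 1.x releases used a different filter syntax.

```python
import json


def build_geo_search(lat, lon, distance="10km",
                     geo_field="location", bucket_field="category"):
    """Build a search body: geo_distance filter plus a terms aggregation.

    `geo_field` is assumed to be mapped as a geo_point and `bucket_field`
    as a keyword-style field; both names are hypothetical.
    """
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        geo_field: {"lat": lat, "lon": lon},
                    }
                }
            }
        },
        # Count matching documents per distinct value of bucket_field.
        "aggs": {
            "per_bucket": {"terms": {"field": bucket_field}}
        },
        # size 0 asks for aggregation results only, without the hits themselves.
        "size": 0,
    }


body = build_geo_search(40.7128, -74.0060)
print(json.dumps(body, indent=2))
```

The resulting body would be sent to an index's `_search` endpoint with whatever client you already use.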
Disadvantages:
Kafka:
- Still a bit immature; some clients have required recoding in
the last few versions
- New features arrive very fast, so several upgrades a year may be
required
- Not many commercial companies provide support
Elasticsearch:
- Setting Java memory thresholds can be a pain for those not
accustomed to concepts like Eden Space and Old Generation, which
can lead to over-allocation or, more likely, under-allocation.
Apache Solr had a similar issue. It would be nice if the program
took an extra step and dogfooded its own advice by analyzing the
system and its processes to return a solid recommendation for that
configuration. The proper configuration is outlined in the
documentation, but it would be nice if it were automated.
- The only health check that Elasticsearch reports back is a
"red" status, without any solid information about what is going
on, though it's usually memory thresholds or disk I/O. I am
currently on Elasticsearch 1.5, so that may have changed in newer
versions. When the status goes "red", I, as the administrator of
the software, feel like I lose control of what's going on, which
should rarely happen. More verbose reporting would eliminate that.
- This is more of a critique of the Elastic Stack in general. The
whole top-to-bottom stack is starting to suffer from feature creep,
with things that are better suited to other software, and that
raises the barrier to entry for people setting up a robust logging
infrastructure. Elasticsearch as a storage and search engine is
pretty streamlined, but I can see the tools that comprise the ELK
Stack requiring a certification with constant study at some point.
During a major Logstash release a while back, it literally took a
month to learn a new language because Elastic completely changed
the syntax. For a medium-sized organization with only a couple of
admins, that is a pretty high bar where time is money. They really
should work on refining and automating the tools and search engine
they have, instead of shoehorning things onto an already rock-solid
foundation.
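Going back to the first point in this list (Java memory thresholds), the kind of automated recommendation wished for there could look roughly like the sketch below. It is not something Elasticsearch ships. The helper names are hypothetical, and the two defaults (use about half of the machine's RAM, and cap the heap at about 26 GB) are assumptions based on commonly cited heap-sizing guidance, so check the documentation for your version before relying on them.

```python
def recommend_heap_mb(total_ram_mb, ram_fraction=0.5, max_heap_mb=26 * 1024):
    """Suggest a JVM heap size in MB (hypothetical helper).

    Assumptions, not official values: give the heap `ram_fraction` of
    physical RAM, and never exceed `max_heap_mb`.
    """
    if total_ram_mb <= 0:
        raise ValueError("total_ram_mb must be positive")
    return min(int(total_ram_mb * ram_fraction), max_heap_mb)


def jvm_heap_flags(heap_mb):
    """Format standard JVM flags with equal minimum and maximum heap size."""
    return f"-Xms{heap_mb}m -Xmx{heap_mb}m"


small_box = recommend_heap_mb(16 * 1024)   # 16 GB machine
large_box = recommend_heap_mb(128 * 1024)  # 128 GB machine
print(jvm_heap_flags(small_box))
print(jvm_heap_flags(large_box))
```

Setting the minimum and maximum to the same value avoids heap resizing at runtime. How the value is actually applied depends on the version; older releases such as 1.x were typically configured through an environment variable rather than JVM flags in a config file.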
Note: If you have any related doubts or queries, feel free
to ask by commenting below.
And if my answer meets your requirements, kindly
upvote.
Happy Learning