Write 400–600 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas.
In today’s modern world, people constantly create data through their constant desire and need to be connected to the digital world. For this activity, you will research the various types of data that are commonly generated. Examples may include browsing products in an online store, using social media to follow or like products or interact with structured data within a transactional system, and many other scenarios.
Consider this type of data and do the following:
Discuss structured, unstructured, and semi-structured data
Compare structured, unstructured, and semi-structured data
Discuss how a modern business can utilize data to be competitive
Use examples as needed to support the discussion.
1. Compare structured, unstructured, and semi-structured data
Structured data
Structured data is the easiest kind of data to search and organize, because it is usually stored in rows and columns and its elements map onto fixed, predefined fields. Example: relational data.
Structured data is data whose elements are addressable for effective analysis. It is organized into a formatted repository, typically a database, and covers any data that can be stored in a SQL database as tables of rows and columns. Such data has relational keys and can easily be mapped into predesigned fields. Today it is the most commonly processed kind of data and the simplest to manage.
Structured data can be created by machines and by humans. Examples include financial data such as accounting transactions, address details, demographic information, star ratings from customers, machine logs, and location data from smartphones and other smart devices.
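To make the row-and-column idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and values are invented for illustration and do not come from any real system.

```python
import sqlite3

# Hypothetical table and values, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, customer TEXT NOT NULL, "
    "amount REAL NOT NULL, rating INTEGER)"
)
conn.executemany(
    "INSERT INTO transactions (customer, amount, rating) VALUES (?, ?, ?)",
    [("alice", 20.0, 5), ("bob", 35.5, 4), ("alice", 12.5, 3)],
)

# Every row has the same predefined fields, so aggregation is a single query.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM transactions "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
conn.close()
```

Because each record fits the same schema, searching and summarizing need no custom parsing, which is the main practical advantage of structured data.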
Semi-Structured data
Beyond structured and unstructured data, there is a third category, which is essentially a mix of the two. Email messages are a good example. While the actual content is unstructured, a message also contains structured data such as the name and email address of the sender and recipient, the time sent, and so on. Another example is a digital photograph. The image itself is unstructured, but if the photo was taken on a smartphone, for example, it would be date- and time-stamped, geotagged, and would carry a device ID. Once stored, the photo could also be given tags that provide structure, such as ‘dog’ or ‘pet.’
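The email example can be sketched with Python's standard email package. The message below is made up for illustration; the point is only that the headers come out as named fields while the body stays free text.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical message, invented for illustration.
raw = (
    "From: Ana <ana@example.com>\n"
    "To: Ben <ben@example.com>\n"
    "Date: Mon, 01 Jan 2024 09:30:00 +0000\n"
    "Subject: Lunch\n"
    "\n"
    "Hi Ben, are you free for lunch on Friday? The new place looks good.\n"
)
msg = message_from_string(raw)

# Structured part: named header fields that could go straight into columns.
sender_name, sender_addr = parseaddr(msg["From"])
recipient_name, recipient_addr = parseaddr(msg["To"])
sent_at = msg["Date"]
subject = msg["Subject"]

# Unstructured part: free text with no fixed fields.
body = msg.get_payload()

print(sender_addr, recipient_addr, sent_at, subject)
print(body)
```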
Semi-structured data is information that does not reside in a relational database but has some organizational properties that make it easier to analyze. With some processing it can be stored in a relational database (although this can be very hard for some kinds of semi-structured data), but the semi-structured form exists to save space. Example: XML data.
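As a rough illustration of turning XML into rows, the sketch below uses Python's standard xml.etree.ElementTree module. The element names and values are hypothetical; note that the second order has no gift element, which is the kind of irregularity a fixed table schema would not allow without a default.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document, invented for illustration.
doc = """
<orders>
  <order id="1"><item>book</item><gift>yes</gift></order>
  <order id="2"><item>pen</item></order>
</orders>
""".strip()

root = ET.fromstring(doc)

# Flatten each <order> into a row, filling in a default for the missing field.
rows = []
for order in root.findall("order"):
    rows.append(
        {
            "id": order.get("id"),
            "item": order.findtext("item"),
            "gift": order.findtext("gift", default="no"),
        }
    )
print(rows)
```

The tags give enough organization to extract fields reliably, but each record is free to omit or add elements.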
Unstructured data
A much larger share of all the data in our world is unstructured. Unstructured data is data that cannot be contained in a row-column database and doesn’t have an associated data model. Think of the text of an email message.
Unstructured data is data that is not organized in a predefined manner or does not have a predefined data model, so it is not a good fit for a mainstream relational database. Alternative platforms exist for storing and managing it. Unstructured data is increasingly prevalent in IT systems and is used by organizations in a variety of business intelligence and analytics applications.
Examples: Word documents, PDFs, text files, and media logs.
Other examples of unstructured data include photos, video and audio files, text files, social media content, satellite imagery, presentations, PDFs, open-ended survey responses, websites, and call center transcripts and recordings.
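Since unstructured text has no fields to query, analysis usually starts by imposing some structure on it. The sketch below counts words across a few made-up survey responses using only the Python standard library. The responses and the small stop-word list are assumptions for illustration, and real text analytics would go well beyond word counts.

```python
import re
from collections import Counter

# Hypothetical open-ended survey responses, invented for illustration.
responses = [
    "Delivery was late and the box was damaged.",
    "Great price, but delivery took too long.",
    "Love the product. Fast delivery!",
]

# A tiny, hand-picked stop-word list (an assumption, not a standard list).
stop_words = {"the", "and", "was", "but", "too"}

word_counts = Counter(
    word
    for text in responses
    for word in re.findall(r"[a-z]+", text.lower())
    if word not in stop_words
)
top_word = word_counts.most_common(1)
print(top_word)
```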
Differences between Structured, Semi-structured and Unstructured data
PROPERTIES | STRUCTURED DATA | SEMI-STRUCTURED DATA | UNSTRUCTURED DATA
Technology | Based on relational database tables | Based on XML/RDF (Resource Description Framework) | Based on character and binary data
Transaction management | Mature transaction management and various concurrency techniques | Transaction management is adapted from DBMS and is not mature | No transaction management and no concurrency
Version management | Versioning over tuples, rows, and tables | Versioning over tuples or graphs is possible | Versioned as a whole
Flexibility | Schema-dependent and less flexible | More flexible than structured data but less flexible than unstructured data | Most flexible; there is no schema
Scalability | Very difficult to scale the database schema | Simpler to scale than structured data | Most scalable
Robustness | Very robust | Newer technology, not yet widespread | -
Level of organizing | Highest: as the name suggests, the data is well organized | Intermediate: the data is organized only to some extent and the rest is unorganized | Lowest: the data is fully unorganized
Query performance | Structured queries allow complex joins | Queries over anonymous nodes are possible | Only textual queries are possible
2. Discuss how a modern business can utilize data to be competitive
Data was and continues to be a gold mine for many businesses around the globe. Countless product improvements and optimizations have been made based on the insights gathered from customer data. Fast forward to the present day, many sales teams and investors continue to assume that customer data can give them a competitive edge forever. They also believe that the more customers they have, the more data they can gather, and that data, when analyzed with machine-learning tools, allows them to create a better product and drive profits. But this is not always true. Most of us overestimate the advantage that data confers.
How do companies use data for competitive advantage?
Companies can build winner-take-all positions by collecting and analyzing customer data. The more customers a firm has, the more data it can gather and mine; the resulting insights allow it to offer a better product that attracts even more customers, from which it can collect still more data.
Data is key to a competitive business
As the world becomes smarter and smarter, data becomes the key to competitive advantage, meaning a company’s ability to compete will increasingly be driven by how well it can leverage data, apply analytics and implement new technologies. In fact, according to the International Institute for Analytics, by 2020, businesses using data will see $430 billion in productivity benefits over competitors who are not using data.
So, it’s clear that data is now a key business asset, and it’s revolutionizing the way companies operate, across most sectors and industries. In effect, every business, regardless of size, now needs to be a data business. And if every business is a data business, every business therefore needs a robust data strategy.
Data Analytics
Instead of starting with the data itself, every business should start with strategy. At this stage, it doesn’t matter what data is out there, what data you’re already collecting, what data your competitors are collecting, or what new forms of data are becoming available. Neither does it matter whether your business has mountains of analysis-ready data at your disposal, or next to none. A good data strategy is not about what data is readily or potentially available – it’s about what your business wants to achieve, and how data can help you get there.
Every business needs a company-wide data plan. It is also important to remember that no one type of data is inherently better than any other kind. Using data strategically is about finding the best data for your company, and that may be very different to what’s best for another company.
The key elements of a good data and analytics strategy
To create a robust data and analytics strategy, business leaders need to consider many factors. Here are the critical points I would expect to see in a strong data strategy:
Big data analytics with real-world examples
Big data analytics involves examining large amounts of data to uncover hidden patterns and correlations and to provide insights that support sound business decisions.
Big data analytics is done using advanced software systems, which reduce analysis time and so speed up decision making. This ability to work faster and stay agile gives businesses a competitive advantage. Big data analytics software can also lower costs.
A real example of a company that uses big data analytics to drive customer retention is Coca-Cola. In 2015, Coca-Cola strengthened its data strategy by building a digital-led loyalty program. Coca-Cola's director of data strategy was interviewed by ADMA's managing editor, and the interview made it clear that big data analytics is a major driver of customer retention at Coca-Cola.
Netflix is a good example of a big brand that uses big data analytics for targeted advertising. With over 100 million subscribers, the company collects huge amounts of data, which is the key to the industry status Netflix boasts. If you are a subscriber, you are familiar with how it suggests the next movie you should watch. These suggestions are based on your past search and watch data, which gives Netflix insight into what interests each subscriber most.
UOB, a bank in Singapore, is an example of a brand that uses big data to drive risk management. For a financial institution, there is huge potential for losses if risk management is not well thought out. UOB recently tested a risk management system that is based on big data. The system enables the bank to reduce the time needed to calculate value at risk. The calculation initially took about 18 hours, but with the big data system it takes only a few minutes. Through this initiative, the bank may be able to carry out real-time risk analysis in the near future.
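For readers unfamiliar with value at risk (VaR), the sketch below shows one common textbook approach, historical simulation, in plain Python. It is not UOB's method, which the source does not describe. The function name, the returns, and the portfolio value are all hypothetical.

```python
import math


def historical_var(returns, portfolio_value, confidence=0.9):
    """Loss level that the given share of historical outcomes did not exceed."""
    # Convert each historical return into a loss (positive means money lost).
    losses = sorted(-r * portfolio_value for r in returns)
    # Index of the smallest loss covering at least `confidence` of outcomes.
    index = math.ceil(confidence * len(losses)) - 1
    return losses[index]


# Hypothetical daily returns and portfolio value, invented for illustration.
daily_returns = [0.01, -0.02, 0.005, -0.035, 0.012, -0.01, 0.02, -0.005, 0.0, 0.015]
var_90 = historical_var(daily_returns, portfolio_value=1_000_000, confidence=0.9)
print(var_90)
```

A real bank would run this kind of calculation over very large numbers of positions and scenarios, which is why the computation time matters.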
You have probably heard of Amazon Fresh and Whole Foods. This is a good example of how big data can help improve innovation and product development. Amazon leverages big data analytics to move into a large market. Its data-driven logistics give Amazon the expertise it needs to create greater value. By focusing on big data analytics, Amazon, through Whole Foods, is able to understand how customers buy groceries and how suppliers interact with the grocer. This data provides insights whenever further changes need to be implemented.
PepsiCo is a consumer packaged goods company that relies on huge volumes of data for efficient supply chain management. The company is committed to replenishing retailers’ shelves with the appropriate volumes and types of products. Its clients send it reports covering their warehouse inventory and point-of-sale (POS) inventory, and this data is used to reconcile and forecast production and shipment needs. This way, the company ensures retailers have the right products, in the right volumes, at the right time.