Question

The term "Big Data" is used so often today, but some still don't have a basic understanding of the term. This discussion aims to provide a simple definition of the term that you can use in everyday discussion. Once you've read this posting, share with us your thoughts about big data. Is it good? Is it evil? What potential benefits do you see? Is there any ethical responsibility that goes along with use of this data? Happy learning and Posting!

BIG DATA IS BIG!

First, watch the following video that briefly describes big data and some of its benefits.

Big data is new and “ginormous” and scary: very, very scary. No, wait. Big data is just another name for the same old data marketers have always used, and it’s not all that big, and it’s something we should be embracing, not fearing. No, hold on. That’s not it, either. What I meant to say is that big data is as powerful as a tsunami, but it’s a deluge that can be controlled . . . in a positive way, to provide business insights and value. Yes, that’s right, isn’t it?

Over the past few years, I have heard big data defined in many, many different ways, and so, I’m not surprised there’s so much confusion surrounding the term. So to get things started, let's converge on a single definition of the term:

Big data is a collection of data from traditional and digital sources inside and outside your company that represents a source for ongoing discovery and analysis.

Some people like to constrain big data to digital inputs like web behavior and social network interactions; however, I believe that we can’t exclude traditional data derived from product transaction information, financial records and interaction channels, such as the call center and point-of-sale. All of that is big data, too, even though it may be dwarfed by the volume of digital data that’s now growing at an exponential rate.

In defining big data, it’s also important to understand the mix of unstructured and multi-structured data that comprises the volume of information.

Unstructured data comes from information that is not organized or easily interpreted by traditional databases or data models, and typically, it’s text-heavy. Metadata, Twitter tweets, and other social media posts are good examples of unstructured data.

Multi-structured data refers to a variety of data formats and types and can be derived from interactions between people and machines, such as web applications or social networks. A great example is web log data, which includes a combination of text and visual images along with structured data like form or transactional information. As digital disruption transforms communication and interaction channels, and as marketers enhance the customer experience across devices, web properties, face-to-face interactions and social platforms, multi-structured data will continue to evolve.

Industry leaders like the global analyst firm Gartner use terms like “volume” (the amount of data), “velocity” (the speed of information generated and flowing into the enterprise) and “variety” (the kind of data available) to begin to frame the big data discussion. Others have focused on additional V’s, such as big data’s “veracity” and “value.”

What can you do with big data? Well, first, since this is a BUSINESS class, let's think of the potential for business. Watch this brief video about the potential of getting to know your customer better through big data.

One question that arises when thinking about the vast amounts of data that companies now have access to is the question of consumer data privacy. Where does the ethical responsibility regarding use or misuse of such data lie?

Solutions

Expert Solution

Thanks to growth in computing power and the spread of the Internet and mobile networks, the rate at which data is generated has increased rapidly over the past two decades. Conventionally, organizations represented their data in a structured format using an RDBMS. Now data is generated by many kinds of devices and services (computers, laptops, mobile phones, digital cameras, CCTV, social media, etc.), and most of it is unstructured, i.e. it does not follow any particular structure or order. This data, generated in huge quantities from many sources and largely unstructured, is called Big Data.

According to Gartner, Big Data is data that is:

1. High in quantity. (Volume)

2. Created at a fast rate. (Velocity)

3. Of varying kinds. (Variety)

Further, researchers and practitioners have added the following characteristics to the definition of Big Data:

4. The correctness and accuracy of the data may be inconsistent. (Veracity)

5. The data is worth collecting only if useful insights can be extracted from it. (Value)

Is Big Data Good?

Yes, Big Data is good if the data available within an organization is studied and analyzed by experts and with Artificial Intelligence techniques such as Machine Learning. By analyzing the data effectively, the company can gain deep insights into customer buying patterns, profit opportunities, wastage and overheads, and can take appropriate remedial or improvement action.
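As a rough illustration of what "analyzing buying patterns" can mean at its simplest, the sketch below totals spend and order counts per product category. It is far simpler than the Machine Learning techniques mentioned above, and the field names (`customer`, `category`, `amount`) and the sample records are made up for the example.

```python
from collections import defaultdict


def summarize_by_category(transactions):
    """Return (category, stats) pairs, highest total spend first."""
    summary = defaultdict(lambda: {"orders": 0, "total": 0.0})
    for tx in transactions:
        entry = summary[tx["category"]]
        entry["orders"] += 1
        entry["total"] += tx["amount"]
    # Sort so the category that brings in the most revenue comes first.
    return sorted(summary.items(), key=lambda kv: kv[1]["total"], reverse=True)


# Made-up sample records, for illustration only.
sample = [
    {"customer": "c1", "category": "books", "amount": 12.0},
    {"customer": "c2", "category": "games", "amount": 40.0},
    {"customer": "c1", "category": "books", "amount": 8.0},
]

for category, stats in summarize_by_category(sample):
    print(category, stats["orders"], stats["total"])
```

A real analysis would run over far more records and usually in a dedicated data platform, but the idea of turning raw transactions into a summary a manager can act on is the same.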

Is Big Data Evil?

Big Data may turn evil if the data is not analyzed and the organization does not learn from it; in that case it merely increases the organization's backup storage without yielding any insight. Further, if the data falls into the wrong hands, it may be misused, leading to privacy issues for customers, who may then lose trust in and respect for the organization.

Ethical Responsibility in Using Big Data

Utmost care should be taken that customer names and other personal details are removed from the data before it reaches data analysts or data scientists. The privacy of customers must be preserved and not compromised at any point. It is advisable to avoid human involvement in collecting and processing personal data, and to have the system automatically remove personal details before the data is analyzed.
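One way a system might strip personal details automatically is sketched below. This is a minimal example under stated assumptions, not a complete anonymization scheme: the field names in `PII_FIELDS`, the `customer_id` key and the `deidentify` helper are all hypothetical. Direct identifiers are dropped, and the customer ID is replaced with a keyed hash so that an analyst can still tell which purchases belong to the same customer without learning who that customer is.

```python
import hashlib
import hmac

# Hypothetical names of fields that identify a person directly;
# a real schema will differ.
PII_FIELDS = {"name", "email", "phone", "card_number"}


def deidentify(record, secret_key):
    """Return a copy of record with direct identifiers removed."""
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in PII_FIELDS and key != "customer_id"
    }
    # Keyed hash (HMAC-SHA256) of the customer ID: stable for a given key,
    # but not reversible by someone who does not hold the key.
    digest = hmac.new(
        secret_key, str(record["customer_id"]).encode("utf-8"), hashlib.sha256
    ).hexdigest()
    cleaned["customer_ref"] = digest[:16]
    return cleaned


# Made-up record, for illustration only.
raw = {
    "customer_id": 42,
    "name": "Jane Doe",
    "email": "jane@example.com",
    "item": "book",
    "amount": 12.5,
}
print(deidentify(raw, b"keep-this-key-away-from-analysts"))
```

Removing obvious identifiers does not guarantee anonymity, since combinations of the remaining fields can sometimes re-identify a person, so a step like this is a starting point rather than a full safeguard.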

Personal details about customers should be stored in encrypted form to prevent mishandling.

Failing to meet these ethical responsibilities can lead to leaks of information such as a patient's medical condition, user passwords, credit or debit card details, and personal communications. Care should be taken to ensure that customer data is safe.

