What is good data? What is meant by bad data? A term that you may have already encountered is "GIGO". This term refers to Garbage In, Garbage Out. In other words, if incorrect/bad data is entered into a database, the same useless data will be extracted. This results in poor decisions, lost revenue, and unhappy customers. Have you ever been the victim of bad data?
Discuss the importance of queries and good/bad data as they relate to database reports. Describe the impact on business of erroneous reports generated by bad data or faulty queries.
Good Data:
Good data is error-free, consistent, and well-cleaned data. When we collect data from various sources to incorporate it into our projects or business models, those sources may be reliable or unreliable, complete or incomplete. Good data, then, is complete and consistent, with null values handled properly. It is reliable and accurate.
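As a rough illustration of "complete and consistent", here is a minimal Python sketch that checks one record against an expected shape. The field names, the REQUIRED_FIELDS schema, and the find_problems helper are all hypothetical, made up only for this example.

```python
from typing import Any

# Hypothetical schema (an assumption for this sketch): field name -> expected type.
REQUIRED_FIELDS = {"customer_id": int, "email": str, "country": str}


def find_problems(record: dict[str, Any]) -> list[str]:
    """Return a list of data quality problems found in one record."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        value = record.get(field)
        if value is None or value == "":
            # Completeness check: nulls and empty strings count as missing.
            problems.append(f"{field}: missing")
        elif not isinstance(value, expected_type):
            # Consistency check: the value has the wrong type.
            problems.append(f"{field}: expected {expected_type.__name__}")
    return problems


good = {"customer_id": 1, "email": "a@example.com", "country": "IN"}
bad = {"customer_id": "1", "email": "", "country": None}

print(find_problems(good))  # []
print(find_problems(bad))
# ['customer_id: expected int', 'email: missing', 'country: missing']
```

A record with an empty problem list is "good" only in the narrow sense of this schema; accuracy against the real world still has to be verified separately.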
Bad Data:
Bad data may contain :
Today, a lot of data is scraped from websites and other sources. We might not be able to scrape what we wanted, or we may find that what we scraped or collected does not meet the standard our business model requires. This kind of data is called bad data. Data is also bought and sold these days, and fraudulent sellers may pose as genuine sources; such sellers are another source of bad data.
Garbage In Garbage Out:
In business intelligence or data analysis, the first step is data preprocessing. The data we receive can be bad, and training models or performing analyses on it can lead to inconsistent results that neither our organisation nor its stakeholders can trust. This kind of data can cause losses running into millions or even billions of dollars. According to regular Gartner surveys, the cost of bad data remains above $10 million per enterprise, even when close to $200,000 is spent annually on data quality tools. This gives an idea of how bad data affects businesses and how harmful it is in the long run. I have personally been a victim of bad data: I once wanted to analyse a dataset while training a machine learning model, and preprocessing the data took far longer than the analysis itself. So the most important step is cleaning the data. Tools built specifically for cleaning and preprocessing data are available these days.
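To make the cleaning step concrete, here is a small Python sketch using only the standard library. The sample values and the clean_amounts helper are invented for illustration, and the rule that negative amounts are invalid is an assumption; a real dataset would need its own rules.

```python
from statistics import mean

# Hypothetical raw order amounts, as they might arrive from a scraped source.
raw_amounts = ["120.50", " 80 ", None, "N/A", "99.5", "-1"]


def clean_amounts(values):
    """Keep only the values that parse as non-negative numbers."""
    cleaned = []
    for value in values:
        if value is None:
            continue  # skip nulls
        try:
            amount = float(value.strip())
        except ValueError:
            continue  # skip unparseable text such as "N/A"
        if amount < 0:
            continue  # assumption: negative amounts are placeholder codes
        cleaned.append(amount)
    return cleaned


amounts = clean_amounts(raw_amounts)
print(amounts)        # [120.5, 80.0, 99.5]
print(mean(amounts))  # 100.0
```

Without the cleaning step, the average could not even be computed on the raw list, and keeping the -1 placeholder would silently pull it down. That is "garbage in, garbage out" on a very small scale.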
Importance of a good database:
Storing good data in the database and retrieving and accessing it correctly are crucial for any organisation. Suppose Google misused its database of users. That would not only damage the company's reputation but also cause huge losses. Writing correct queries and generating trustworthy, accurate reports from the database is vital in any modern organisation. We can say that data is the heart of any organisation. If the data is good, the organisation will go a long way.
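Even with good data, a faulty query can produce a wrong report. The sketch below uses Python's built-in sqlite3 module with two made-up tables (orders and shipments, both hypothetical) to show one common mistake: a join that repeats rows and inflates a revenue total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
    -- Order 1 was shipped in two parcels.
    INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2);
""")

# Faulty: the join repeats order 1 once per shipment, so its amount is counted twice.
faulty_total = conn.execute(
    "SELECT SUM(o.amount) FROM orders o "
    "JOIN shipments s ON s.order_id = o.order_id"
).fetchone()[0]

# Corrected: sum over orders only, keeping those with at least one shipment.
correct_total = conn.execute(
    "SELECT SUM(amount) FROM orders "
    "WHERE order_id IN (SELECT order_id FROM shipments)"
).fetchone()[0]

print(faulty_total)   # 250.0
print(correct_total)  # 150.0
conn.close()
```

Both queries run without any error, which is what makes this kind of mistake dangerous: the report looks fine, but the shipped revenue is overstated by 100.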