Question

Does the data contain errors? If so, write queries in SQL to find these errors and propose a way to address the issues

There is a change-of-weight error (end weight - start weight is calculated wrong) and a logical error.

Solutions

Expert Solution

Yes, the data does contain errors. The likely problems, and ways to check for them, are:

  1. Gauge min and max values: For continuous variables, checking the minimum and maximum values for each column can give you a quick idea of whether your values are falling within the correct range.

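To check this using SQL queries: apply the MIN() and MAX() aggregate functions to each numeric column and compare the results with the range you expect. (The ALL() function, which ignores all slicers and visual/page/report filters, is a DAX function used in Power BI measures, not SQL.)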
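A minimal sketch of the range check, run through Python's built-in sqlite3 module so it can be executed as is. The table weigh_in, its columns, the sample rows, and the plausible range of 30 to 300 are all assumptions made for illustration; they are not taken from the original data.

```python
import sqlite3

# Hypothetical table and values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE weigh_in (id INTEGER PRIMARY KEY, start_weight REAL, end_weight REAL)"
)
conn.executemany(
    "INSERT INTO weigh_in (id, start_weight, end_weight) VALUES (?, ?, ?)",
    [(1, 82.5, 80.1), (2, 70.0, 71.2), (3, 64.3, -5.0), (4, 91.0, 910.0)],
)

# Overall range of one column: a quick look at the extremes.
min_end, max_end = conn.execute(
    "SELECT MIN(end_weight), MAX(end_weight) FROM weigh_in"
).fetchone()

# Rows that fall outside the assumed plausible range of 30 to 300.
out_of_range = conn.execute(
    "SELECT id, end_weight FROM weigh_in "
    "WHERE end_weight NOT BETWEEN 30 AND 300 ORDER BY id"
).fetchall()

print(min_end, max_end)
print(out_of_range)
```

The first query only says that something is off; the second one returns the offending rows so they can be corrected or excluded.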

  2. Look for missing values: The easiest way to find missing values is to count them, if you have that function available. If not, try sorting your columns (both ‘ascending’ and ‘descending’) to see whether any missing values exist, or filter your dataset so that you are only looking at records with a missing value. While missing values are sometimes inevitable and due to chance, it is worth double-checking whether there is an underlying reason for the missingness and addressing it as best you can.

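To find missing values using SQL queries: by default, the SQLite shell does not display NULL values in its output. The .nullvalue command causes SQLite to display the value you specify for NULLs; here we use the value -null- (.nullvalue -null-) to make the NULLs easier to see. The first query below shows every row, and the second shows only the rows where dated is NULL: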

SELECT * FROM Visited;

SELECT * FROM Visited WHERE dated IS NULL;
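Counting the missing values is often quicker than reading through the rows. The sketch below relies on the fact that COUNT(*) counts rows while COUNT(column) skips NULLs. The Visited table and its dated column come from the queries above; the other columns and the sample rows are assumed for illustration.

```python
import sqlite3

# Hypothetical contents for the Visited table; only the table name and
# the dated column appear in the original queries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Visited (id INTEGER PRIMARY KEY, site TEXT, dated TEXT)")
conn.executemany(
    "INSERT INTO Visited (id, site, dated) VALUES (?, ?, ?)",
    [(1, "A", "2024-01-05"), (2, "A", None), (3, "B", "2024-02-10"), (4, "B", None)],
)

# COUNT(*) counts all rows; COUNT(dated) counts only non-NULL values,
# so the difference is the number of missing dates.
total_rows, missing_dated = conn.execute(
    "SELECT COUNT(*), COUNT(*) - COUNT(dated) FROM Visited"
).fetchone()

# The ids of the affected rows, for follow-up.
missing_ids = [
    row[0]
    for row in conn.execute("SELECT id FROM Visited WHERE dated IS NULL ORDER BY id")
]

print(total_rows, missing_dated, missing_ids)
```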

  3. Check the values of categorical variables: Depending on your methodology and the number of people contributing to a database, there can be lots of room for error when entering data. One quick way to find these is to pull up all of the different categories that a categorical variable can take on.

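To fix this issue using SQL queries: blank categories are sometimes filled in from other rows that share the same key, using a correlated subquery. In most database systems such a subquery must return at most one row; the one below causes an error because it can return multiple rows: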

, CASE
    WHEN a.Group IN (' ')
    THEN (select first(b.group) from mydata as b where first(a.Group_no) = first(b.Group_no))
  END AS group_desc2

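Change the subquery to one that always returns a single row, for example by aggregating with MIN(). In standard SQL (missing() is a SAS function, and group is a reserved word that has to be quoted), something like:

, CASE
    WHEN o."group" IS NOT NULL AND o."group" <> ' ' THEN o."group"
    ELSE (SELECT MIN(i."group") FROM have AS i
          WHERE i.group_no = o.group_no AND i."group" <> ' ')
  END AS group_desc2

Here o is assumed to be the alias of the outer query's table (FROM have AS o).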
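To pull up every value a categorical column takes, as described in the item above, a GROUP BY with a count is usually enough. The following is a sketch only: the column name category and the sample values are hypothetical, and LOWER(TRIM(...)) is just one possible normalization.

```python
import sqlite3

# Hypothetical data: the same category typed in three different ways,
# plus one empty string.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydata (id INTEGER PRIMARY KEY, category TEXT)")
conn.executemany(
    "INSERT INTO mydata (id, category) VALUES (?, ?)",
    [(1, "control"), (2, "Control"), (3, "control "),
     (4, "treatment"), (5, "treatment"), (6, "")],
)

# Every distinct raw value with its frequency; rare spellings stand out.
raw_counts = conn.execute(
    "SELECT category, COUNT(*) FROM mydata GROUP BY category ORDER BY category"
).fetchall()

# The same count after trimming whitespace and lower-casing.
normalized_counts = conn.execute(
    "SELECT LOWER(TRIM(category)) AS cat, COUNT(*) FROM mydata "
    "GROUP BY cat ORDER BY cat"
).fetchall()

print(raw_counts)
print(normalized_counts)
```

Comparing the two results shows which raw values collapse into the same category once case and stray spaces are ignored.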



  4. Look at the ‘incidence rate’ of binary variables: If we think of a true binary variable as one made up of 1’s and 0’s, its mean (the incidence rate) tells you the proportion of 1’s in your dataset. It is worth double-checking this to make sure that your binary is set up correctly. One common mistake is to have 1’s and nulls, rather than 1’s and 0’s. This is easy to spot: aggregate functions such as AVG() ignore NULLs, so the “rate” of the binary variable will be exactly 1. The proportion of 1’s you get should make sense for the behavior you are trying to flag within your dataset.
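The 1’s-and-nulls mistake can be reproduced in a few lines. This is a sketch under assumed names: the survey table and the responded flag are hypothetical, and COALESCE(responded, 0) is only a valid repair if NULL really means "no" in your data.

```python
import sqlite3

# Hypothetical binary flag stored as 1 / NULL instead of 1 / 0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (id INTEGER PRIMARY KEY, responded INTEGER)")
conn.executemany(
    "INSERT INTO survey (id, responded) VALUES (?, ?)",
    [(1, 1), (2, None), (3, None), (4, 1)],
)

naive_rate, null_count, corrected_rate = conn.execute(
    "SELECT AVG(responded), "               # NULLs are skipped by AVG
    "       COUNT(*) - COUNT(responded), "  # number of NULL flags
    "       AVG(COALESCE(responded, 0)) "   # treat NULL as 0
    "FROM survey"
).fetchone()

print(naive_rate, null_count, corrected_rate)
```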

Fixing the weight error:

For example, Weight DECIMAL(5,2) means the total number of digits cannot exceed 5, with 2 of them to the right of the decimal point, so the largest value that fits is 999.99. A value such as 1000.45 needs 6 digits in total, exceeds the declared (5,2), and throws an overflow error.

Fix by widening the column (SQL Server syntax; Table_name is a placeholder):

ALTER TABLE Table_name ALTER COLUMN Weight DECIMAL(6,2);
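For the change-of-weight error named in the question (end weight - start weight calculated wrong), one way to find the bad rows is to recompute the difference and compare it with the stored value. The sketch below assumes a table weigh_in with columns start_weight, end_weight, and weight_change; these names and the sample rows are hypothetical, since the original data is not shown.

```python
import sqlite3

# Hypothetical table: weight_change is supposed to equal
# end_weight - start_weight, but rows 2 and 4 were entered wrong.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE weigh_in (id INTEGER PRIMARY KEY, "
    "start_weight REAL, end_weight REAL, weight_change REAL)"
)
conn.executemany(
    "INSERT INTO weigh_in VALUES (?, ?, ?, ?)",
    [(1, 80.0, 78.5, -1.5), (2, 70.0, 72.0, -2.0),
     (3, 65.0, 64.0, -1.0), (4, 90.0, 85.0, 175.0)],
)

# A small tolerance avoids false alarms from floating-point rounding.
mismatch = "ABS(weight_change - (end_weight - start_weight)) > 0.005"

# Find the rows whose stored change disagrees with the recomputed one.
bad_ids = [
    row[0]
    for row in conn.execute(f"SELECT id FROM weigh_in WHERE {mismatch} ORDER BY id")
]

# One possible repair: overwrite the stored value with the recomputed one.
conn.execute(
    f"UPDATE weigh_in SET weight_change = end_weight - start_weight WHERE {mismatch}"
)
remaining = conn.execute(
    f"SELECT COUNT(*) FROM weigh_in WHERE {mismatch}"
).fetchone()[0]

print(bad_ids, remaining)
```

Rows where any of the three columns is NULL are not returned by this comparison and need a separate IS NULL check. A longer-term alternative to the UPDATE is to stop storing the change and compute it in a view, so it cannot drift out of sync.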

About the logical error:

Sometimes a subcondition is inconsistent, but the entire condition is consistent (e.g., because of a disjunction). Of course, the opposite can also happen: subconditions that are tautologies. Both kinds of unnecessary complications indicate logical misconceptions, and it is quite likely that the query will not behave as expected. Furthermore, implied subconditions are unnecessary complications. In certain circumstances, implied subconditions can help the optimizer find a better execution plan, but then they should be clearly marked as an optimizer hint. In exams, it happens quite often that students add a condition, such as “A IS NOT NULL”, that is already enforced as a constraint. There are different possible formalizations of the requirement for “no unnecessary logical complications”. A quite strict version is that whenever a subcondition in the DNF (disjunctive normal form) of the query condition is replaced by “true” or “false”, the result is not equivalent to the original condition.
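The two kinds of logical complication described above can be seen on a tiny example. The sketch uses a hypothetical weigh_in table with made-up values; the point is only the behavior of the WHERE clauses.

```python
import sqlite3

# Hypothetical data, including one NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weigh_in (id INTEGER PRIMARY KEY, end_weight REAL)")
conn.executemany(
    "INSERT INTO weigh_in (id, end_weight) VALUES (?, ?)",
    [(1, 40.0), (2, 75.0), (3, 120.0), (4, None)],
)

# Inconsistent condition: no value is both above 100 and below 50,
# so the query can never return a row.
inconsistent = conn.execute(
    "SELECT id FROM weigh_in WHERE end_weight > 100 AND end_weight < 50"
).fetchall()

# Near-tautology: true for every non-NULL value, so the condition
# filters nothing except NULLs.
tautology = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM weigh_in "
        "WHERE end_weight >= 50 OR end_weight < 50 ORDER BY id"
    )
]

print(inconsistent)
print(tautology)
```

Because of SQL's three-valued logic, the second condition is equivalent to end_weight IS NOT NULL rather than to "true", which is exactly the kind of hidden behavior that makes such conditions worth flagging.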

