Question

There is a lot of repeated data in the file example below, because the key names appear in each and every record. This is a tradeoff that allows each record to have a different mix of (or even entirely different) keys. CSV files generally have one record (the first one in the file) that simply lists the header (or "key") for each column, and every record after it has the same fields and keys.

My tsv2json utility has but one argument: tsv2json [-c | --compact]

This lets you use the CSV approach when you know every record contains the same fields.
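
To make the idea concrete, here is a minimal Python sketch of a header-once layout and its round trip. The `{"keys": ..., "rows": ...}` layout and the helper names `to_compact` and `from_compact` are assumptions for illustration only; the actual output of `tsv2json --compact` is not shown in the question and may differ.

```python
import json


def to_compact(records):
    """Store the key names once, followed by one list of values per record."""
    if not records:
        return {"keys": [], "rows": []}
    keys = list(records[0].keys())
    rows = []
    for rec in records:
        # The compact form only works when every record has the same keys.
        if set(rec) != set(keys):
            raise ValueError("compact form needs identical keys in every record")
        rows.append([rec[k] for k in keys])
    return {"keys": keys, "rows": rows}


def from_compact(compact):
    """Rebuild the list of self-describing records from the compact form."""
    return [dict(zip(compact["keys"], row)) for row in compact["rows"]]


# The two records from the file example in the question.
records = [
    {"type": "act", "line_id": 1, "play_name": "Henry IV",
     "speech_number": "", "line_number": "", "speaker": "",
     "text_entry": "ACT I"},
    {"type": "line", "line_id": 111396, "play_name": "A Winters Tale",
     "speech_number": 38, "line_number": "", "speaker": "LEONTES",
     "text_entry": "Exeunt"},
]

compact = to_compact(records)
print(json.dumps(compact))
```

The saving grows with the number of records, since each extra record adds only its values and not another copy of the key names.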

What are the pluses and minuses of each approach? Also, what are your thoughts on redundancy, size, and processing convenience in file formats?

File Example:

[
  {
    "type": "act",
    "line_id": 1,
    "play_name": "Henry IV",
    "speech_number": "",
    "line_number": "",
    "speaker": "",
    "text_entry": "ACT I"
  },
  …
  {
    "type": "line",
    "line_id": 111396,
    "play_name": "A Winters Tale",
    "speech_number": 38,
    "line_number": "",
    "speaker": "LEONTES",
    "text_entry": "Exeunt"
  }
]

Solutions

Expert Solution

JSON stands for 'JavaScript Object Notation'. It is a lightweight, language-independent text format for storing and exchanging data, and it is easy for people to read. Its filename extension is .json.

CSV stands for 'Comma-Separated Values'. It is a delimited text format that uses commas to separate the fields within each record, so tabular data is saved as plain text. CSV is widely used to represent a sequence of records in which each record has an identical list of fields. Its filename extension is .csv and its internet media type is text/csv.

The pros and cons of each approach, including processing convenience and file size, are listed below.

CSV characteristics:
Pros:-

  • CSV is a flat data format, which suits small applications and simple tabular data.
  • Compared to JSON, CSV demands less technical knowledge and can be opened in most applications, including spreadsheets.
  • CSV is among the most compact plain-text formats, because the key names are stored only once, in the header row.
  • For data like the example above, a CSV file can be roughly half the size of the equivalent JSON file; the exact ratio depends on how long the key names are relative to the values.
  • The smaller size reduces the bandwidth and storage needed.
  • It is supported across platforms and is a common data exchange format, particularly in business and scientific applications.
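
The size difference is easy to measure rather than assume. The following is a minimal Python sketch that serializes the two records from the question both ways and compares byte counts; with only two records the numbers are illustrative, and the variable names are my own.

```python
import csv
import io
import json

# The two records from the file example in the question.
records = [
    {"type": "act", "line_id": 1, "play_name": "Henry IV",
     "speech_number": "", "line_number": "", "speaker": "",
     "text_entry": "ACT I"},
    {"type": "line", "line_id": 111396, "play_name": "A Winters Tale",
     "speech_number": 38, "line_number": "", "speaker": "LEONTES",
     "text_entry": "Exeunt"},
]

# JSON: every record repeats every key name.
json_text = json.dumps(records)

# CSV: key names appear once, in the header row.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()))
writer.writeheader()
writer.writerows(records)
csv_text = buf.getvalue()

json_bytes = len(json_text.encode("utf-8"))
csv_bytes = len(csv_text.encode("utf-8"))
print(f"JSON: {json_bytes} bytes, CSV: {csv_bytes} bytes")
```

Running the same comparison on the full data file gives a more realistic ratio, because the one-time cost of the CSV header is spread over many more records.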

Cons:-

  • The character encoding is not recorded in the file, so it has to be set in the application that handles the file for all characters to display properly.
  • CSV is a poor fit for large-scale and complex data projects, since it has no nesting and no data types; every value is read as text.
  • All records must have the same number of fields, in the same order.
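
The fixed-field requirement means that records with different keys have to be padded before they can be written as CSV. A minimal Python sketch, assuming two made-up records with differing keys (they are shortened versions of the question's records, not real data from the file):

```python
import csv
import io

# Hypothetical records with different key sets.
records = [
    {"type": "act", "text_entry": "ACT I"},
    {"type": "line", "speaker": "LEONTES", "text_entry": "Exeunt"},
]

# Build the union of all keys, preserving first-seen order.
fieldnames = []
for rec in records:
    for key in rec:
        if key not in fieldnames:
            fieldnames.append(key)

buf = io.StringIO()
# restval fills in any field a record does not have.
writer = csv.DictWriter(buf, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(records)

csv_text = buf.getvalue()
lines = csv_text.splitlines()
print(csv_text)
```

Every row now carries an empty field for each key it lacks, so the more the records differ, the more of the CSV file is padding.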

JSON characteristics:

Pros:-

  • JSON is a very flexible data format that supports nested structures, so your data can have multiple levels of subcategories and each record can carry its own set of keys.
  • JSON is lightweight to handle: values keep their basic types (string, number, boolean, null), so little conversion is needed after parsing.
  • JSON is generally the better fit for complex and large-scale applications.

Cons:-

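  • A parser is needed to access the data in a JSON file. Most programming languages provide one, but using it may demand more technical knowledge than opening a CSV file in a spreadsheet.
  • JSON requested through dynamic script insertion (the JSONP technique) has no error handling. If the script insertion works, your callback is called and you get the response; if it does not, nothing happens and the call fails silently. This limitation belongs to that loading technique rather than to the JSON format itself.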
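
Note that when JSON text is parsed directly, rather than loaded through script insertion, a typical parser does report malformed input. A minimal sketch using Python's standard `json` module; `parse_or_none` is a hypothetical helper name chosen for this example.

```python
import json


def parse_or_none(text):
    """Return the parsed value, or None (with a message) if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # The exception carries the position of the problem.
        print(f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}")
        return None


good = parse_or_none('{"speaker": "LEONTES", "speech_number": 38}')
bad = parse_or_none('{"speaker": "LEONTES",}')  # trailing comma is not valid JSON
```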

In short, if file size is the main concern, CSV is the first choice, but performance and flexibility considerations will lead you to choose JSON.

