Please attempt this only if you can complete it fully; incomplete answers will be disliked and will receive negative comments. This is a request.
C++
We will read a CSV [1] file of a data dump from the GoodReads [2] web site that
contains information about user-rated books (e.g., book title, publication year,
ISBN number, average reader rating, and cover image URL). The information will be
stored and some simple statistics will be calculated. Additionally, for extra
credit, the program will create an HTML web page based on the top n highest rated
books. As is typical of many subject matter information sources, the data in the
file contains various errors. As such, we will track the errors and create an
exceptions file to record the lines with data errors.

[1] For more information, refer to: https://en.wikipedia.org/wiki/Comma-separated_values
[2] See: www.goodreads.com

Develop a class, bookDataType, to provide functionality for reading and storing
book information. The UML class specifications are provided below. A main will be
provided that uses the bookDataType class.

Book Data Type Class
The class will implement the functions listed below.

bookDataType
-COL_LIMIT = 23: static constexpr unsigned int
-TOP_LIMIT = 20: static constexpr unsigned int
-booksFileName: string
-webPageFileName: string
-exceptionsFileName: string
-bookCount: unsigned int
-topBooksLimit: unsigned int
-*topBooks: unsigned int
-struct bookErrsStruct
    -bookIDErrors: unsigned int
    -bookYearErrors: unsigned int
    -AveRatingErrors: unsigned int
    -duplicateDataErrors: unsigned int
-bookErrInfo: bookErrsStruct
-struct bookStruct
    -bookTitle: string
    -isbn: string
    -pubYear: short
    -aveRating: float
    -imageURL: string
    -bookID: unsigned int
-*bookInfo: bookStruct
+bookDataType()
+~bookDataType()
+getBookArguments(int, char *[], string &, bool &): bool
+getBookFileName() const: string
+getWebPageFileName() const: string
+getExceptionsFileName() const: string
+getReadBookIDErrors() const: unsigned int
+getReadBookYearErrors() const: unsigned int
+getReadBookAveRatingErrors() const: unsigned int
+getReadBookDuplicateErrors() const: unsigned int
+getTopBooksLimit() const: unsigned int
+showBookData(unsigned int) const: void
+getBookCount() const: unsigned int
+getAverageOverallRating() const: float
+showHighestRatedBooks() const: void
+readBookData(const string): bool
+buildWebPage(const string="CS 202 Top Books") const: bool
+findHighestRatedBooks(): void
+setWebPageFileName(const string): void
+setExceptionsFileName(const string): void
+setTopBooksLimit(unsigned int): void
-parseLine(string, string []) const: void
Note, points will be deducted for insufficient commenting, poor style, or
inefficient coding. The error messages should be prefixed with the function name
(to help better identify the source of the error). Refer to the sample execution
for error message examples.

Function Descriptions
• The bookDataType() constructor should set the books file name to the empty
  string, the bookCount to 0, the topBooksLimit to a default value of 5, the error
  counts to 0, the web page file name to a default value of “index.html”, the
  exceptions file name to a default value of “errors.txt”, and the pointers to
  nullptr.
• The ~bookDataType() destructor should delete the dynamically allocated arrays
  and set the other class variables to their default values (noted above).
• The getBookArguments() function should read the command line qualifiers in the
  required format ( -i <booksFileName> [<-show|-noshow>] ) to obtain the file name
  and set the show extra information flag (true/false). The data file and show
  extra flag may be in either order. The show extra flag is optional, and
  “-noshow” is the default if not specified. The function includes a usage message
  and error messages for both the input file specifier and the input file name.
  The file name must be at least one letter and include a “.csv” extension (thus,
  the minimum length is 5). If the file name is incorrect or the file does not
  exist, an appropriate error message should be displayed, the class variable
  should remain unchanged, and the function should return false. To determine if a
  file exists (without opening it), you can use the access() function, i.e.,
  access(fn.c_str(), F_OK), which returns 0 if the file exists and -1 if the file
  does not exist. Note, the access() function requires the #include <unistd.h>
  statement. If there is an error, the function should output one of the following
  messages, based on the specific error:
      cout << "Usage: ./books -i <bookDataFileName> [<-show|-noshow>]" << endl;
      cout << "Error, invalid input file name specifier." << endl;
      cout << "Error, invalid command line options." << endl;
      cout << "Error, book data file name must be '.csv' extension." << endl;
      cout << "Error, invalid show extra information specifier." << endl;
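The file name rules above (at least one character before the extension, and the file must exist) can be sketched as two small helpers. This is only an illustration under stated assumptions: the names hasExtension(), fileExists(), and isUsableCsvName() are hypothetical and not part of the required class interface, and access() is a POSIX call, so the sketch assumes a Unix-like system.

```cpp
#include <string>
#include <unistd.h>   // access(), F_OK (POSIX)

// Hypothetical helper: true if 'name' ends with 'ext' and has at least
// one character before it (so ".csv" alone is rejected).
bool hasExtension(const std::string &name, const std::string &ext)
{
    if (name.size() <= ext.size())
        return false;
    return name.compare(name.size() - ext.size(), ext.size(), ext) == 0;
}

// Hypothetical helper: true if the file exists, checked without opening it.
bool fileExists(const std::string &name)
{
    return access(name.c_str(), F_OK) == 0;
}

// Hypothetical helper combining both checks for a books file name.
bool isUsableCsvName(const std::string &name)
{
    return hasExtension(name, ".csv") && fileExists(name);
}
```

The same hasExtension() check could be reused for the “.html” and “.txt” file names by passing a different extension.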
• The getBookFileName() function should return the current book file name.
• The getWebPageFileName() function should return the current web page file name.
• The getReadBookIDErrors(), getReadBookYearErrors(),
  getReadBookAveRatingErrors(), and getReadBookDuplicateErrors() functions should
  return the applicable structure field.
• The getExceptionsFileName() function should return the current exceptions file
  name.
• The setBookFileName() function should set the class variable for the books file
  name to the passed file name. The file name must be at least one letter and
  include a “.csv” extension (thus, the minimum length is 5). If the passed file
  name is correct and the file exists, the class variable should be set and true
  returned. If the file name is incorrect or the file does not exist, an
  appropriate error message should be displayed, the class variable should remain
  unchanged, and the function should return false. To determine if a file exists
  (without opening it), you can use the access() function, i.e.,
  access(fn.c_str(), F_OK), which returns 0 if the file exists and -1 if the file
  does not exist.
• The setWebPageFileName() function should set the class variable for the web page
  file name to the passed file name. The file name must be at least one letter and
  include a “.html” extension (thus, the minimum length is 6).
• The setExceptionsFileName() function should set the class variable for the
  exceptions file name to the passed file name. The file name must be at least one
  letter and include a “.txt” extension (thus, the minimum length is 5).
• The getTopBooksLimit() function should return the value for the current number
  of highest rated books to be found.
• The setTopBooksLimit() function should set the class variable for the current
  number of highest rated books to be found. The value must not exceed the
  TOP_LIMIT constant. If the passed value is out of range, nothing should be
  changed.
• The getBookCount() function should return the current number of books in the
  data set.
• The getAverageOverallRating() function should return the overall average of the
  book ratings in the entire current data set.
• The showBookData() function should display the formatted book information to
  the screen in the specified format (see output example).
      cout << "Book Information:" << endl;
      cout << offset << "Title:    " << bookInfo[idx].bookTitle << endl;
      cout << offset << "Book ID   " << bookInfo[idx].bookID << endl;
      cout << offset << "ISBD:     " << bookInfo[idx].isbn << endl;
      cout << offset << "Year:     " << bookInfo[idx].pubYear << endl;
      cout << offset << "Ave Rate: " << bookInfo[idx].aveRating << endl;
      cout << endl;
• The showHighestRatedBooks() function should show the topBooksLimit number of
  highest rated books from the topBooks array using the showBookData() function.
  As such, the findHighestRatedBooks() function must have been previously called.
• The findHighestRatedBooks() function should find the topBooksLimit number of
  highest rated books. This will require dynamic creation and population of the
  topBooks[] array of topBooksLimit size. The array will hold the index of each
  book in the bookInfo[] array. Due to the data size, a sort is not appropriate.
  The topBooksLimit number of highest rated books should be determined without
  performing a sort.
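One way to pick the highest rated books without sorting is to scan the ratings once per slot, each time taking the best book not already chosen. The sketch below is a minimal illustration, not the required implementation: the function name findTopIndices() is hypothetical, and it uses std::vector in place of the dynamically allocated topBooks[] and bookInfo[] arrays to keep the example short.

```cpp
#include <vector>

// Hypothetical sketch: return the indices of the 'limit' highest ratings,
// best first, without sorting. Ties go to the lower index.
std::vector<unsigned int> findTopIndices(const std::vector<float> &ratings,
                                         unsigned int limit)
{
    std::vector<unsigned int> top;
    std::vector<bool> taken(ratings.size(), false);   // already selected?

    for (unsigned int slot = 0; slot < limit && slot < ratings.size(); slot++) {
        unsigned int best = 0;
        bool found = false;

        // One pass over the data to find the best remaining rating.
        for (unsigned int i = 0; i < ratings.size(); i++) {
            if (taken[i])
                continue;
            if (!found || ratings[i] > ratings[best]) {
                best = i;
                found = true;
            }
        }
        taken[best] = true;
        top.push_back(best);
    }
    return top;
}
```

With topBooksLimit capped at TOP_LIMIT (20), this costs at most 20 passes over the data, which stays linear in the number of books.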
• The parseLine() function will accept a string in comma-separated format and
  break the string into its individual comma-separated fields. This includes
  handling quoted fields that may contain commas, which are not field separators
  when inside quotes. The function should populate the passed array with the
  COL_LIMIT fields in string format. If the line contains more than COL_LIMIT
  fields, only the first COL_LIMIT should be returned (thus, do not over-write
  the array).
• The readBookData() function should read the books file (CSV format). This
  function will call the private parseLine() function. From the returned string
  array, the following fields should be placed into the applicable fields of the
  bookInfo[] array.
    • Book title (first title), string
        ◦ note, if the first title is empty, use the second title
    • ISBN (10 digit), string
    • ImageURL (first of two), string
    • Publication Year, short
    • Book ID (good reads book ID, which is first), unsigned integer
    • Average Rating, float
  Note, since some data fields may be invalid, try/catch blocks must be used for
  the conversion. The first line is a header line and must be skipped. Blank lines
  should be skipped. In order to size the bookInfo[] struct array, you will need
  to read the file twice; once to count the data lines and again to read the data.
  To reset the file to the beginning, use inFile.clear(); followed by
  inFile.seekg(0, ios::beg); . To convert string values into floats or integers,
  use the stoi() and stof() functions. In order to check for errors, these should
  be performed within a try/catch block. Errors should be written to the
  exceptions file with a line of 60 ‘-’s, the specific error, the line number
  (from the source data file), and on the next line the title “Row Data:”, and on
  the next line the complete row followed by a blank line. Refer to the examples
  for formatting. Duplicate rows are determined by the same book ID number, and
  the second occurrence is written to the exceptions file. See examples for
  formatting.
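The two-pass read described above (count the data lines, then rewind) can be sketched as follows. This is a hedged illustration only: the function name countDataLines() is hypothetical, it takes a generic std::istream so it can be tried on an in-memory stream as well as an ifstream, and it assumes a line counts as blank when it holds only spaces, tabs, or a carriage return.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>

// Hypothetical sketch: count the non-blank data lines after the header,
// then rewind the stream so the data can be read again.
unsigned int countDataLines(std::istream &in)
{
    std::string line;
    unsigned int count = 0;

    std::getline(in, line);                     // skip the header line

    while (std::getline(in, line)) {
        // Assumption: whitespace-only lines are treated as blank.
        if (line.find_first_not_of(" \t\r") != std::string::npos)
            count++;
    }

    in.clear();                                 // clear the eof/fail flags
    in.seekg(0, std::ios::beg);                 // back to the beginning
    return count;
}
```

The returned count could then size the bookInfo[] array before the second pass; lines that later fail conversion simply leave unused slots at the end.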
• EXTRA CREDIT (up to 25 pts) → The buildWebPage() function should build an HTML
  web page of the top topBooksLimit number of highest rated books, including a
  link to the image and the book information on the GoodReads web site. The Good
  Reads link is generated by appending the book ID to the URL
  "https://www.goodreads.com/book/show/" within an HREF tag along with the title.
  For example,
  <a href=https://www.goodreads.com/book/show/24812>The Complete Calvin and Hobbes</a>
  for book ID 24812. The passed string is the web page title (using
  <title>CS 202 Top Books Page</title> in the header block) and the initial web
  page label (using an <h1>title</h1> tag) with a subtitle of “Top Rated Books”
  (using an <h2>subtitle</h2> tag). The minimum requirements for the web page
  include the title and subtitle headers, followed by the books. The books must be
  numbered (1, 2, ...) and include the good reads book information link, the book
  cover image, the ISBN number (10 digit), and the book average rating. See the
  provided example for a minimal required format. The full 25 points will only be
  awarded if the final web page exceeds the minimal formatting (see example).
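Building the Good Reads link from a book ID and title can be sketched as below. The helper names escapeHtml() and makeBookLink() are hypothetical. Two choices here go beyond the handout and are assumptions: the href value is wrapped in double quotes, and the title is HTML-escaped so that characters such as & or < in a title do not break the page.

```cpp
#include <string>

// Hypothetical helper: escape the characters that are special in HTML text.
std::string escapeHtml(const std::string &text)
{
    std::string out;
    for (char c : text) {
        switch (c) {
            case '&': out += "&amp;";  break;
            case '<': out += "&lt;";   break;
            case '>': out += "&gt;";   break;
            case '"': out += "&quot;"; break;
            default:  out += c;        break;
        }
    }
    return out;
}

// Hypothetical helper: build the anchor tag for one book by appending
// the book ID to the Good Reads URL.
std::string makeBookLink(unsigned int bookID, const std::string &title)
{
    return "<a href=\"https://www.goodreads.com/book/show/"
           + std::to_string(bookID) + "\">" + escapeHtml(title) + "</a>";
}
```

For book ID 24812 this yields the same link as the handout example, apart from the added quotes around the URL.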
Refer to the example executions for output formatting. Make sure your program
includes the appropriate documentation. See Program Evaluation Criteria for CS 202
for additional information.

Make File:
You will need to develop a make file. You should be able to type:
    make
which should create the executable. The makefile will be very similar to the
previous assignment makefiles.

Submission:
● Submit a zip file of the program source files, header files, and makefile via
  the on-line submission. All necessary files must be included in the ZIP file.
The grader will download, extract, and type make (so you must have a valid,
working makefile).

CSV Format
Fields in a CSV file are comma-separated. Typically (but not always), the first
line of the file contains a row showing the field names. This is the case for our
data files. A field may contain a number or may be quoted (that is, enclosed
within double-quote characters), indicating string fields such as book titles.
Such strings (names/titles) may have embedded commas and embedded quote
characters (which must be double-quoted). For example,
    ,"J.K. Rowling, Mary GrandPré, Rufus Beck",
    ,"A Child Called ""It"": One Child's Courage to Survive",
    ,"""M"" is for Malice",
    ,"""Who Could That Be at This Hour?""",
The double-quotes are used to mark a string field and are not actually part of
the string. For example, the first line (above) is actually
J.K. Rowling, Mary GrandPré, Rufus Beck , where the double-quotes mark only the
start and end. Since the double-quote is used to mark the start and end of a
field, a double double-quote is used to signify an actual double-quote. For
example, the second line is A Child Called "It": One Child's Courage to Survive ,
the third line is "M" is for Malice , and the fourth line is
"Who Could That Be at This Hour?" .

These requirements can make the reading of CSV files a challenge. In addition,
many data sources in CSV format have imperfect data with various errors, including
invalid numeric values, too few fields, or too many fields.
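The quoting rules above can be handled with a single pass over the line and an "inside quotes" flag. The following is a minimal sketch under stated assumptions: the function name splitCsvLine() is hypothetical, and it returns a std::vector rather than filling a passed string array as parseLine() is specified to do. Fields beyond the limit are dropped rather than stored.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: split one CSV line into at most 'limit' fields.
// Commas inside quotes are kept, and "" inside quotes becomes one ".
std::vector<std::string> splitCsvLine(const std::string &line, std::size_t limit)
{
    std::vector<std::string> fields;
    std::string current;
    bool inQuotes = false;

    for (std::size_t i = 0; i < line.size(); i++) {
        char c = line[i];
        if (c == '"') {
            if (inQuotes && i + 1 < line.size() && line[i + 1] == '"') {
                current += '"';          // doubled quote: a literal quote
                i++;                     // skip the second quote
            } else {
                inQuotes = !inQuotes;    // opening or closing quote
            }
        } else if (c == ',' && !inQuotes) {
            if (fields.size() < limit)   // never store past the limit
                fields.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    if (fields.size() < limit)           // last field has no trailing comma
        fields.push_back(current);
    return fields;
}
```

A line with too few fields simply produces a shorter result, so the caller can compare the returned size against the expected column count.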

Try/Catch Block Example
Below is an example of how to use the try/catch block for conversion using the
C++ stoi() function.
    unsigned int  someNumber = 0;
    size_t        size = 0;
    string        badNum = "12-34";

    try {
        someNumber = stoi(badNum, &size);
        // stoi() stops at the first invalid character, so also check
        // that the entire string was converted.
        if (size != badNum.size())
            throw invalid_argument("Conversion Error");
    }
    catch (exception &err) {
        errFile << err.what() << endl;
    }
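
The same pattern can be wrapped in a small helper that reports success or failure instead of throwing. This is a sketch only: the name tryToInt() is hypothetical, and it treats any leftover characters after the number as an error, as in the example above. std::stoi() itself throws std::invalid_argument when no digits are found and std::out_of_range when the value does not fit in an int.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: convert 'text' to an int. Returns false, leaving
// 'value' unchanged, if the text is not entirely a valid integer.
bool tryToInt(const std::string &text, int &value)
{
    try {
        std::size_t used = 0;
        int parsed = std::stoi(text, &used);   // may throw
        if (used != text.size())               // e.g. "12-34" stops at '-'
            throw std::invalid_argument("Conversion Error");
        value = parsed;
        return true;
    }
    catch (const std::exception &) {
        return false;
    }
}
```

A similar helper built on stof() would cover the average rating field.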

// CS 202 - Provided Main
// This main uses the complexType and Newton fractal types.

#include <cstdlib>
#include <iostream>
#include <cmath>
#include <string>
#include "newtonType.h"

using namespace std;

void displayImageData(newtonType &image);

// ****************************************************

int main(int argc, char *argv[])
{
    string bars;
    bars.append(50,'-');

    // ---------------------------------------------------
    // Get/check command line argument (-summary).

    bool showImageData = false;
    if (argc > 1 && string(argv[1]) == "-summary")
        showImageData = true;

    // ---------------------------------------------------
    // Some quick tests for the complex type.

    complexType val1(23.9, 34.7);
    complexType val2(3.2, -4.1);
    complexType num1, num2, num3, num4;
    complexType ans1(27.1, 30.6);
    complexType ans2(20.7, 38.8);
    complexType ans3(218.75, 13.05);
    complexType ans4(-2.43216, 7.72754);

    num1 = val1 + val2;
    num2 = val1 - val2;
    num3 = val1 * val2;
    num4 = val1 / val2;

    if ( (num1 != ans1) || !(num2 == ans2) ||
         (num3 != ans3) || !(num4 == ans4) )
    {
        cout << "Error complex calculations are not correct." << endl;
        cout << "num1 = " << num1 << " should be = " << ans1 << endl;
        cout << "num2 = " << num2 << " should be = " << ans2 << endl;
        cout << "num3 = " << num3 << " should be = " << ans3 << endl;
        cout << "num4 = " << num4 << " should be = " << ans4 << endl;
        return 0;
    }

    // ---------------------------------------------------
    // Simple, initial tests
    // Note, second image (newton2.bmp) is the one on the
    // assignment handout and is ~12MB.

    cout << endl << bars << endl;
    cout << "Newton Image Creation." << endl << endl;

    newtonType nImage1(600, 600, "newton1.bmp", 4.5,
                       complexType(0.7, 0.27015));
    nImage1.createNewtonImage();
    nImage1.writeNewtonImageFile();

    newtonType nImage2(2000, 2000, "newton2.bmp", 1.5,
                       complexType(0.83300, 0.23780));
    nImage2.createNewtonImage();
    nImage2.writeNewtonImageFile();

    // ---------------------------------------------------
    // Test set functions

    newtonType nImage3;
    nImage3.setImageSize(900, 900);
    nImage3.setImageFileName("newton3.bmp");
    nImage3.setScale(10.0);
    nImage3.setAvalue(complexType(0.23300, 0.23780));
    nImage3.createNewtonImage();
    nImage3.writeNewtonImageFile();

    // ---------------------------------------------------
    // Test constructor

    cout << "Constructor Errors:" << endl;
    newtonType badImage1(-100, 800, "newton1.bmp", 0.9,
                         complexType(-0.7, 0.27015));
    newtonType badImage2(900, -200, "newton1.bmp", 0.9,
                         complexType(-0.7, 0.27015));
    newtonType badImage3(900, 200000, "newton1.bmp", 0.9,
                         complexType(-0.7, 0.27015));
    newtonType badImage4(600, 600, "newton1.jpg", 0.9,
                         complexType(-0.7, 0.27015));
    newtonType badImage5(600, 600, "newton1", 0.9,
                         complexType(-0.7, 0.27015));
    newtonType badImage6(500, 600, "newton1.bmmp", 0.9,
                         complexType(-0.7, 0.27015));
    newtonType badImage7(500, 600, "newton1.bmp", 0.9,
                         complexType(-1.7, 0.27015));
    newtonType badImage8(500, 600, "newton1.bmp", 0.9,
                         complexType(-0.7, 2.27015));
    newtonType badImage9(500, 600, "newton1.bmp", 0.9,
                         complexType(-0.7, 0.27015));
    newtonType badImage10(500, 600, "newton1.bmp", 109.0,
                          complexType(-0.7, 2.27015));
    newtonType badImage11(500, 600, "newton1.bmp", 0.0,
                          complexType(-0.7, 2.27015));
    cout << endl;

    cout << "Setter Errors:" << endl;
    newtonType badImage;
    badImage.setImageSize(-900, 600);
    badImage.setImageSize(900, -600);
    badImage.setImageSize(9, 6);
    badImage.setImageSize(90000, 600);
    badImage.setImageSize(900, 600000);
    badImage.setImageFileName("newton0");
    badImage.setImageFileName("newton0.bmmp");
    badImage.setImageFileName("newton0.bp");
    badImage.setImageFileName("newton0.gif");
    badImage.setScale(-1.0);
    badImage.setScale(211.0);
    badImage.setScale(0.0);
    badImage.setAvalue(complexType(-1.41453, 0.34364));
    badImage.setAvalue(complexType(0.41453, -1.34364));
    badImage.setAvalue(complexType(1.41453, 0.34364));
    badImage.setAvalue(complexType(0.41453, 1.34364));
    cout << endl;

    cout << "Should Show -> Image Creation Error:" << endl;
    badImage.createNewtonImage();

    // ---------------------------------------------------
    // Test the read functions.
    // Try setting A=(0.99999, -0.77777)

    newtonType nImage4;
    cout << endl;
    cout << "Image Data Entry:" << endl;
    nImage4.readImageFileName();
    nImage4.readImageSize();
    nImage4.readAvalue();
    nImage4.readScale();
    nImage4.createNewtonImage();
    nImage4.writeNewtonImageFile();

    // ---------------------------------------------------
    // call function to display image summary info.

    if (showImageData) {
        cout << endl << endl;
        displayImageData(nImage1);
        displayImageData(nImage2);
        displayImageData(nImage3);
        displayImageData(nImage4);
    }

    // ---------------------------------------------------
    // Done, end program.

    return 0;
}

// ****************************************************
// Simple function to display image data

void displayImageData(newtonType &image)
{
    string fName;
    string spaces = " ";
    int height, width;

    fName = image.getImageFileName();
    image.getImageSize(width, height);

    cout << "-----------------------------------------------------" << endl;
    cout << "Image:" << endl;
    cout << spaces << "File Name: " << fName << endl;
    cout << spaces << "Image Size: (" << height << "," <<
            width << ")" << endl;
    cout << spaces << "A value: " << image.getAvalue() << endl;
    cout << spaces << "Scale: " << image.getScale() << endl;
    cout << endl;
}