Question

In C++

For this assignment, you will write a program to count the number of times the words in an input text file occur.

The WordCount Structure

Define a C++ struct called WordCount that contains the following data members:

  • An array of 31 characters named word
  • An integer named count

Functions

Write the following functions:

  • int main(int argc, char* argv[])

    This function should declare an array of 200 WordCount objects and an integer numWords to track the number of array elements that are in use.

    If no input file name is specified as a program argument when the program is run, then argc will be equal to 1. If so, print an error message similar to the following and exit the program:

       Usage: assign1 [file-name]
    

    Call the function countWords() passing argv[1] as the file name parameter. Store the value returned by the function in numWords.

    Call the function sortWords() to sort the array.

    Print a header line as shown in the sample output, then call printWords() to print the words and their counts.

  • int countWords(const char* fileName, WordCount wordArray[])

    Parameters: 1) A C string that will not be changed by the function and that contains the name of an input file; 2) an array of WordCount objects.

    Returns: The number of distinct words that the function stored in the array (i.e., the number of array elements filled with valid data).

    This function should declare a file stream variable and open it for the file name passed in as the first parameter. If the file fails to open successfully, print an error message and exit the program.

    Declare an integer numWords to keep track of the number of distinct words stored in the array of WordCount objects. This variable should be initialized to 0.

    The function should then read words from the file as C strings using the >> operator until end-of-file is reached. For each word read, the function should do the following:

    1. Call stripPunctuation() to strip any punctuation from the beginning and end of the word string. If the resulting string is empty (length 0), the remaining steps can be skipped.
    2. Call stringToUpper() to convert the word string to uppercase.
    3. Call searchForWord() to search for the word string in the array of WordCount objects.
    4. If the index returned by the search is -1, this is a new word that must be added to the end of the array. Copy the word string into the word data member of the next empty array element (the one at index numWords), and set its count data member to 1. Then increment numWords.
    5. Otherwise, this word has been found in the array. Increment the count data member for the WordCount object at the index returned by the search.

    Once all words have been read from the file, the file should be closed and numWords should be returned.

  • void stripPunctuation(char* s)

    Parameters: 1) A C string that contains a word to be stripped of punctuation.

    Returns: Nothing.

    This function should remove any punctuation characters at the beginning and end of the C string s. For example:

    • The string "textfile" should become textfile
    • The string (Wikipedia, should become Wikipedia
    • The string content. should become content
    • The string generic should remain generic

    It is possible (although rare) for a string to contain nothing but punctuation characters. In that case, the result of executing this function should be an empty string.

    There are a number of valid approaches to solving this problem.

    You will need to be able to distinguish between punctuation characters and non-punctuation (or alphanumeric) characters; the C library functions isalnum() and ispunct() can help you do that.

    Performing the required modifications to the string "in place" may be difficult, so feel free to use a local temporary character array to make your changes and then copy the final result back into s at the end of the function.

  • void stringToUpper(char* s)

    Parameters: 1) A C string that contains a word to be converted to uppercase.

    Returns: Nothing.

    This function should loop through the characters of the C string s and convert them to uppercase using the C library function toupper().

  • int searchForWord(const char* word, const WordCount wordArray[], int numWords)

    Parameters: 1) A C string that will not be changed by this function and that contains a word to search for; 2) an array of WordCount objects to search that will not be changed by this function; 3) the number of elements in the array filled with valid data.

    Returns: If the search was successful, returns the index of the array element that contains the word that was searched for, or -1 if the search fails.

    This function should use the linear search algorithm to search for the C string word in wordArray.

  • void sortWords(WordCount wordArray[], int numWords)

    Parameters: 1) An array of WordCount objects to sort; 2) the number of elements in the array filled with valid data.

    Returns: Nothing.

    This function should sort the array of WordCount objects in ascending order by word using the selection sort algorithm.

    The sort code linked to above sorts an array of integers called numbers of size size. You will need to make a number of changes to that code to make it work in this program:

    1. Change the parameters for the function to those described above.

    2. In the function body, change the data type of temp to WordCount. This temporary storage will be used to swap elements of the array of WordCount objects.

    3. In the function body, change any occurrence of numbers to the name of your array of WordCount objects and size to numWords (or whatever you called the variable that tracks the number of array elements filled with valid data).

    4. The comparison of numbers[j] and numbers[min] will need to use the C string library function strcmp() to perform the comparison. The final version of the if condition should look something like this:

         if (strcmp(wordArray[j].word, wordArray[min].word) < 0)
            ...
      
    5. It is legal to assign one WordCount object to another; you don't need to write code to copy individual data members.

  • void printWords(const WordCount wordArray[], int numWords)

    Parameters: 1) An array of WordCount objects to print that will not be changed by this function; 2) the number of elements in the array filled with valid data.

    Returns: Nothing.

    This function should loop through the array and print each word and its corresponding count, neatly formatted into columns similar to the sample output. It should also print the number of words in the file (which is equal to the sum of the counts) and the number of distinct words (equal to numWords).

Text File:

Text Files - A Brief Description (Wikipedia, 2019)

A text file (sometimes spelled "textfile"; an old alternative name is "flatfile") is a kind of computer file that is structured as a sequence of lines of electronic text. A text file exists stored as data within a computer file system. The end of a text file is often denoted by placing one or more special characters, known as an end-of-file marker, after the last line in a text file. Such markers were required under the CP/M and MS-DOS operating systems. On modern operating systems such as Windows and Unix-like systems, text files do not contain any special EOF character.

"Text file" refers to a type of container, while plain text refers to a type of content. Text files can contain plain text, but they are not limited to such.

At a generic level of description, there are two kinds of computer files: text files and binary files.

Solutions

Expert Solution

Save the following code in a file named assign1.cpp:

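#include <iostream>
#include <fstream>
#include <cstring> //C string functions
#include <cctype>  //character classification and case conversion
#include <cstdlib> //exit

using namespace std;

//structure definition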
struct WordCount
{
   //two data members
   char word[31];
   int count;
};


//Function to convert a C string to upper case
void stringToUpper(char *s)
{
   for(int i=0;s[i]!='\0';i++) //repeat until the terminating null character
       s[i]=toupper((unsigned char)s[i]); //toupper() leaves non-letters unchanged
}

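//Function stripping punctuation characters from both ends of a word
void stripPunctuation(char *s)
{
   int start=0;
   int end=strlen(s); //one past the last character
   while(start<end && ispunct((unsigned char)s[start])) //skip leading punctuation
       start++;
   while(end>start && ispunct((unsigned char)s[end-1])) //skip trailing punctuation
       end--;
   int pos=0;
   for(int i=start;i<end;i++) //shift the kept characters to the front
       s[pos++]=s[i];
   s[pos]='\0'; //end with null
}
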

//Function search word
int searchForWord(const char* word, const WordCount wordArray[],int numWords)
{
   for(int i=0;i<numWords;i++) //repeat loop
       if(strcmp(wordArray[i].word,word)==0) //when matched
           return i;//return its index
   return -1;//return -1, when fails
}

//Function for counting words
int countWords(const char* fileName,WordCount wordArray[])
{
   ifstream infile(fileName);//open file for reading
   if(!infile) //when the file fails to open
   {
       cerr<<"Error: unable to open "<<fileName<<endl;
       exit(1);
   }
   int numWords=0;//number of distinct words stored so far
   char word[31];//buffer for one word (30 characters plus null)
   infile.width(31);//never read more than the buffer can hold
   while(infile>>word) //repeat until no more words can be read
   {
       stripPunctuation(word);//remove punctuation characters
       if(word[0]!='\0') //skip words that were nothing but punctuation
       {
           stringToUpper(word);//convert to upper case
           int found=searchForWord(word,wordArray,numWords);//search word
           if(found==-1) //when not found
           {
               strcpy(wordArray[numWords].word,word);//store as new word
               wordArray[numWords].count=1;//first count is 1
               numWords++;//one more distinct word
           }
           else //when found
               wordArray[found].count+=1;//increment existing count
       }
       infile.width(31);//operator>> resets the width after each read
   }
   infile.close();//close the file
   return numWords;//finally return the number of distinct words
}

//Function to sort words in ascending order using selection sort
void sortWords(WordCount wordArray[],int numWords)
{
   for(int i=0;i<numWords-1;i++) //position to fill next
   {
       int min=i; //index of the smallest word seen so far
       for(int j=i+1;j<numWords;j++) //scan the unsorted part
           if(strcmp(wordArray[j].word,wordArray[min].word)<0) //compare
               min=j;
       //swapping process
       WordCount temp=wordArray[i];
       wordArray[i]=wordArray[min];
       wordArray[min]=temp;
   }
}

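//Function prints all words and their frequency, followed by the totals
void printWords(const WordCount wordArray[],int numWords)
{
   int total=0; //sum of all counts, i.e. the number of words in the file
   cout<<"Word\tFrequency"<<endl;
   for(int i=0;i<numWords;i++) //repeat loop
   {
       cout<<wordArray[i].word<<"\t"<<wordArray[i].count<<endl;
       total+=wordArray[i].count;
   }
   cout<<"\nNumber of words in file: "<<total<<endl;
   cout<<"Number of distinct words: "<<numWords<<endl;
}
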

//main with command line arguments
int main(int argc,char* argv[])
{
   WordCount wordArray[200];//declare maximum 200 words
   int numWords;//number of array elements in use
   if(argc!=2) //when a file name was not supplied
   {
       cout<<"Usage: assign1 [file-name]"<<endl;//print error message
       return 1; //return error code 1
   }
   numWords=countWords(argv[1],wordArray);//read to array
   sortWords(wordArray,numWords);//sort the words alphabetically
   printWords(wordArray,numWords);//call print function
   return 0;
}

Save the sample text in a file named text1.txt:

Text Files - A Brief Description (Wikipedia, 2019)

A text file (sometimes spelled "textfile"; an old alternative name is "flatfile") is a kind of computer file that is structured as a sequence of lines of electronic text. A text file exists stored as data within a computer file system. The end of a text file is often denoted by placing one or more special characters, known as an end-of-file marker, after the last line in a text file. Such markers were required under the CP/M and MS-DOS operating systems. On modern operating systems such as Windows and Unix-like systems, text files do not contain any special EOF character.

"Text file" refers to a type of container, while plain text refers to a type of content. Text files can contain plain text, but they are not limited to such.

At a generic level of description, there are two kinds of computer files: text files and binary files.
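
The output posted with the original answer is shown below. It is unsorted and has interior punctuation removed (ENDOFFILE, CPM, MSDOS, UNIXLIKE), so it does not match what the assignment specifies. A program that calls sortWords() and strips punctuation only from the ends of each word would list the words alphabetically, keep END-OF-FILE, CP/M, MS-DOS and UNIX-LIKE intact, count 2019 as a word, and finish with the total and distinct word counts.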

Word   Frequency
TEXT   12
FILES   6
A   11
BRIEF   1
DESCRIPTION   2
WIKIPEDIA   1
FILE   7
SOMETIMES   1
SPELLED   1
TEXTFILE   1
AN   2
OLD   1
ALTERNATIVE   1
NAME   1
IS   4
FLATFILE   1
KIND   1
OF   8
COMPUTER   3
THAT   1
STRUCTURED   1
AS   4
SEQUENCE   1
LINES   1
ELECTRONIC   1
EXISTS   1
STORED   1
DATA   1
WITHIN   1
SYSTEM   1
THE   3
END   1
OFTEN   1
DENOTED   1
BY   1
PLACING   1
ONE   1
OR   1
MORE   1
SPECIAL   2
CHARACTERS   1
KNOWN   1
ENDOFFILE   1
MARKER   1
AFTER   1
LAST   1
LINE   1
IN   1
SUCH   3
MARKERS   1
WERE   1
REQUIRED   1
UNDER   1
CPM   1
AND   3
MSDOS   1
OPERATING   2
SYSTEMS   3
ON   1
MODERN   1
WINDOWS   1
UNIXLIKE   1
DO   1
NOT   2
CONTAIN   2
ANY   1
EOF   1
CHARACTER   1
REFERS   2
TO   3
TYPE   2
CONTAINER   1
WHILE   1
PLAIN   2
CONTENT   1
CAN   1
BUT   1
THEY   1
ARE   2
LIMITED   1
AT   1
GENERIC   1
LEVEL   1
THERE   1
TWO   1
KINDS   1
BINARY   1

