Question

C++ Question:

We need to read a speech from a .txt file.

Steve Jobs delivered a touching and inspiring speech at Stanford's 2005 commencement. The transcript of this speech is attached at the end of this homework description. In this homework, you are going to write a program to find out all the unique tokens (or words) used in this speech and their corresponding frequencies, where the frequency of a word w is the total number of times that w appears in the speech. You are required to store such frequency information into a vector and then sort these tokens according to frequency. Please feel free to use existing functions such as strtok() or sstream to identify tokens in this implementation.
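As a minimal sketch of the sstream approach mentioned above, the following helper splits a single line into whitespace-separated tokens. The name splitTokens is an assumption made for illustration and is not part of the assignment.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split one line into whitespace-separated tokens.
// Punctuation stays attached to its word, so "no," is a single token.
std::vector<std::string> splitTokens(const std::string& line) {
    std::vector<std::string> tokens;
    std::istringstream in(line);
    std::string tok;
    while (in >> tok) {
        tokens.push_back(tok);
    }
    return tokens;
}
```

Because operator>> skips any run of whitespace, repeated spaces or tabs between words do not produce empty tokens.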

Specifically, you are required to include the following elements in your program:

Declare a struct TokenFreq that consists of two data members: (1) string value; and (2) int freq; Obviously, an object of this struct will be used to store a specific token and its frequency. For example, the following object word stores the token "dream" and its frequency 100:

  TokenFreq word;
  word.value="dream";
  word.freq=100;

Remember to declare this struct at the beginning of your program and outside any function. A good place would be right after the "using namespace std;" line. This way, all the functions in your program will be able to use this struct to declare variables.

Implement the function vector<TokenFreq> getTokenFreq( string inFile_name); This function reads the specified input file line by line, identifies all the unique tokens in the file and the frequency of each token. It stores all the identified (token, freq) pairs in a vector and returns this vector to the calling function. Don't forget to close the file before exiting the function. In this homework, these tokens are case insensitive. For example, "Hello" and "hello" are considered to be the same token.  
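The line-by-line, case-insensitive counting described above could be sketched as follows. This version reads from any std::istream so it can be tried without a file; the names countTokens and lowerCopy are hypothetical, and storing every token in lowercase is one possible reading of "case insensitive", not something the assignment mandates.

```cpp
#include <cctype>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct TokenFreq {
    std::string value;
    int freq;
};

// Return a lowercased copy of s. The unsigned char cast avoids
// undefined behaviour when a char holds a negative value.
std::string lowerCopy(std::string s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        s[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(s[i])));
    }
    return s;
}

// Hypothetical helper: read `in` line by line and tally lowercased tokens.
std::vector<TokenFreq> countTokens(std::istream& in) {
    std::vector<TokenFreq> result;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string tok;
        while (words >> tok) {
            const std::string key = lowerCopy(tok);
            bool found = false;
            // Linear search: fine for a speech-sized input.
            for (std::size_t i = 0; i < result.size(); ++i) {
                if (result[i].value == key) {
                    ++result[i].freq;
                    found = true;
                    break;
                }
            }
            if (!found) {
                TokenFreq entry;
                entry.value = key;
                entry.freq = 1;
                result.push_back(entry);
            }
        }
    }
    return result;
}
```

To use it with a file, open a std::ifstream, pass it to countTokens, and close the stream before returning, as the assignment asks.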

Implement the selection sort algorithm to sort a vector<TokenFreq> in ascending order of token frequency. The pseudo code of the selection algorithm can be found at http://www.algolist.net/Algorithms/Sorting/Selection_sort You can also watch an animation of the sorting process at http://visualgo.net/sorting -->under "select". This function has the following prototype:

void selectionSort( vector<TokenFreq> & tokFreqVector ); This function receives a vector of TokenFreq objects by reference and applies the selection sort algorithm to sort this vector in increasing order of token frequencies.

Implement the insertion sort algorithm to sort a vector<TokenFreq> in descending order of token frequency. The pseudo code of the insertion sort algorithm can be found at http://www.algolist.net/Algorithms/Sorting/Insertion_sort Use the same link above to watch an animation of this algorithm. This function has the following prototype:

void insertionSort( vector<TokenFreq> & tokFreqVector );  
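One common way to write insertion sort is to hold the current element aside and shift entries rather than swap them. The sketch below sorts in descending order of frequency under that approach; the name insertionSortDesc is an assumption used here to keep it distinct from the required prototype.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct TokenFreq {
    std::string value;
    int freq;
};

// Insertion sort, descending by freq. Each element is held in `key`
// while entries to its left with a smaller freq shift one slot right.
void insertionSortDesc(std::vector<TokenFreq>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        TokenFreq key = v[i];
        std::size_t j = i;
        while (j > 0 && v[j - 1].freq < key.freq) {
            v[j] = v[j - 1];
            --j;
        }
        v[j] = key;
    }
}
```

The strict comparison means tokens with equal frequency keep their original relative order. Whole TokenFreq records are moved, so each token stays paired with its own count.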

Implement the void writeToFile( vector<TokenFreq> &tokFreqV, string outFileName); function. This function receives a vector of TokenFreq objects and writes each token and its frequency on a separate line in the specified output file.

Implement the int main() function to contain the following features: (1) ask the end user of your program to specify the name of the input file; (2) call getTokenFreq() to identify each unique token and its frequency; (3) call your selection sort and insertion sort functions to sort the vector of TokenFreq objects assembled in (2); and (4) call the writeToFile() function to print out the sorted vectors in two separate files, one in ascending order and the other in descending order.

Example input and outputs:  

Assume that your input file contains the following paragraph: "And no, I'm not a walking C++ dictionary. I do not keep every technical detail in my head at all times. If I did that, I would be a much poorer programmer. I do keep the main points straight in my head most of the time, and I do know where to find the details when I need them. by Bjarne Stroustrup"

After having called the getTokenFreq() function, you should identify the following list of (token, freq) pairs and store them in a vector (note that the order might be different from yours): {'no,': 1, 'and': 1, 'walking': 1, 'be': 1, 'dictionary.': 1, 'Bjarne': 1, 'all': 1, 'need': 1, 'Stroustrup': 1, 'at': 1, 'times.': 1, 'in': 2, 'programmer.': 1, 'where': 1, 'find': 1, 'that,': 1, 'would': 1, 'when': 1, 'detail': 1, 'time,': 1, 'to': 1, 'much': 1, 'details': 1, 'main': 1, 'do': 3, 'head': 2, 'I': 6, 'C++': 1, 'poorer': 1, 'most': 1, 'every': 1, 'a': 2, 'not': 2, "I'm": 1, 'by': 1, 'And': 1, 'did': 1, 'of': 1, 'straight': 1, 'know': 1, 'keep': 2, 'technical': 1, 'points': 1, 'them.': 1, 'the': 3, 'my': 2, 'If': 1}

After having called the selectionSort() function, the sorted vector of token-freq pairs will contain the following information (again, the tokens of the same frequency might appear in different order from yours) : [('no,', 1), ('and', 1), ('walking', 1), ('be', 1), ('dictionary.', 1), ('Bjarne', 1), ('all', 1), ('need', 1), ('Stroustrup', 1), ('at', 1), ('times.', 1), ('programmer.', 1), ('where', 1), ('find', 1), ('that,', 1), ('would', 1), ('when', 1), ('detail', 1), ('time,', 1), ('to', 1), ('much', 1), ('details', 1), ('main', 1), ('C++', 1), ('poorer', 1), ('most', 1), ('every', 1), ("I'm", 1), ('by', 1), ('And', 1), ('did', 1), ('of', 1), ('straight', 1), ('know', 1), ('technical', 1), ('points', 1), ('them.', 1), ('If', 1), ('in', 2), ('head', 2), ('a', 2), ('not', 2), ('keep', 2), ('my', 2), ('do', 3), ('the', 3), ('I', 6)]
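Since tokens of equal frequency may appear in any order, comparing output against the list above element by element is unreliable. A small check such as the following sketch, with the hypothetical name isAscendingByFreq, only verifies that the frequencies never decrease.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct TokenFreq {
    std::string value;
    int freq;
};

// True when frequencies never decrease from one element to the next.
// An empty or single-element vector counts as sorted.
bool isAscendingByFreq(const std::vector<TokenFreq>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        if (v[i - 1].freq > v[i].freq) {
            return false;
        }
    }
    return true;
}
```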

Solutions

Expert Solution

#include <iostream>
#include <string>
#include <cctype>
#include <vector>
#include <fstream>

using namespace std;

struct TokenFreq {
   string value;
   int freq;
};

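string to_lower (string str){
   // tolower() expects an unsigned char value; passing a negative char is undefined
   for (size_t i=0;i<str.size();i++)
       str[i] = tolower(static_cast<unsigned char>(str[i]));
   return str;
}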
vector<TokenFreq> getTokenFreq ( string inFile_name ) {
   string token, lowered;
   vector<TokenFreq> tok;
   ifstream tfile;
   tfile.open(inFile_name.c_str());
   // operator>> splits on whitespace, so punctuation stays attached to a token
   while (tfile >> token)
   {
       // tokens are case insensitive, so compare and store the lowercase form
       lowered = to_lower(token);
       bool found = false;
       for(size_t i=0; i<tok.size(); i++) {
           if(tok[i].value == lowered) {
               tok[i].freq++;
               found = true;
               break;
           }
       }
       if(!found) {
           // first occurrence: append a new entry (indexing an empty vector is undefined)
           TokenFreq entry;
           entry.value = lowered;
           entry.freq = 1;
           tok.push_back(entry);
       }
   }
   tfile.close();
   return tok;
}

void insertionSort( vector<TokenFreq> & tokFreqVector){
   // Descending order: move each element left while it has a larger
   // frequency than its left neighbour.
   for (size_t i=1; i<tokFreqVector.size(); i++) {
       for (size_t j=i; j>0; j--) {
           if(tokFreqVector[j].freq > tokFreqVector[j-1].freq){
               // swap whole records so each token keeps its own count
               TokenFreq temp = tokFreqVector[j];
               tokFreqVector[j] = tokFreqVector[j-1];
               tokFreqVector[j-1] = temp;
           }
           else
               break;
       }
   }
}

void selectionSort( vector<TokenFreq> &tokFreqVector){
   int i, j, loc, size, min;
   size = tokFreqVector.size();
   // Ascending order: find the smallest remaining frequency and move it to position i.
   for(i=0; i<size-1;i++) {
       min = tokFreqVector[i].freq;
       loc = i;
       for(j=i+1;j<size;j++) {
           if(min > tokFreqVector[j].freq) {
               min = tokFreqVector[j].freq;
               loc = j;
           }
       }
       // swap whole records so each token keeps its own count
       TokenFreq temp = tokFreqVector[i];
       tokFreqVector[i] = tokFreqVector[loc];
       tokFreqVector[loc] = temp;
   }
}

void writeToFile( vector<TokenFreq> &tokFreqVector, string outFileName){
   int size,i;
   ofstream outfile;
   outfile.open(outFileName.c_str());
   size = tokFreqVector.size();
   for(i=0; i<size; i++) {
       outfile << tokFreqVector[i].value << " : " << tokFreqVector[i].freq << endl;  
   }
   outfile.close();
}

int main() {
   string inFile_name, outFileName;
   vector<TokenFreq> tokFreqVector;
   cout << "Enter input file name: ";
   cin >> inFile_name;
   tokFreqVector = getTokenFreq( inFile_name );

   // ascending order of frequency
   selectionSort(tokFreqVector);
   cout << "Enter output file for selection sort: ";
   cin >> outFileName;
   writeToFile(tokFreqVector, outFileName);

   // descending order of frequency
   insertionSort(tokFreqVector);
   cout << "Enter output file for insertion sort: ";
   cin >> outFileName;
   writeToFile(tokFreqVector, outFileName);
   return 0;
}

