Question

Programming in C (not C++)

The high-level goal of this project is to write a program called "wordfreak" that takes "some files" as input, counts how many times each word occurs across all of them (treating all letters as lower case), and writes those words and their counts to an output file in alphabetical order.

We provide you some example book text files to test your program on. For example, if you ran

$ ./wordfreak aladdin.txt

Then the contents of the output file would be:

$ cat output.txt

a : 49

aback : 1

able : 1

...

required : 1

respectfully : 1

retraced : 1

...

that : 11

the : 126

their : 2

...

you : 20

young : 1

your : 7

The words from all the input files will be counted together. If a word appears 3 times in one input file and 4 times in another, it will be counted 7 times between the two.

Input

wordfreak needs to be able to read input from three sources: standard input, files given in argv, and a file named by an environment variable. It should read words from every source that applies (always standard input, sometimes the other two).

A working implementation must be able to accept input entered directly into the terminal, with the end of such input signified by the EOF character (^D (control+d)):

$ ./wordfreak

I can write words here,

and end the file with control plus d

$ cat output.txt

and : 1

can : 1

control : 1

d : 1

end : 1

file : 1

here : 1

i : 1

plus : 1

the : 1

with : 1

words : 1

write : 1

It should also be able to accept a file piped into standard input with bash's pipe operator:

$ cat aladdin.txt | ./wordfreak

Note that your program has no real way to tell which of these two situations is occurring; it just sees data arriving on standard input. By treating standard input like a file, you get both of these behaviours.

A working implementation must also accept files as command line arguments:

$ ./wordfreak aladdin.txt iliad.txt odyssey.txt

Finally, a working implementation must also accept an environment variable called WORD_FREAK, set on the command line to the name of a single file to be analyzed:

$ WORD_FREAK=aladdin.txt ./wordfreak

And of course, it should be able to do all of these at once:

$ cat newton.txt | WORD_FREAK=aladdin.txt ./wordfreak iliad.txt odyssey.txt

Words

Words should consist only of alphabetic characters, and all alphabetic characters should be converted to lower case.

For example "POT4TO???" would give the words "pot" and "to". And the word "isn’t" would be read as "isn" and "t". While this isn't necessarily intuitively correct, this is what your code is expected to do:

$ echo "Isn’t that a POT4TO???" | ./wordfreak

$ cat output.txt

a : 1

isn : 1

pot : 1

t : 1

that : 1

to : 1

You are required to store the words in a specific data structure. You should have a binary search tree for each letter 'a' to 'z' that stores the words starting with that letter (and their counts). This can be thought of as a hash function from strings to binary search trees, where the hashing function is just first_letter - 'a'. Note that these BSTs will not likely be balanced; that is fine.

Output

The words should be written to the file alphabetically (the BSTs make this fairly trivial). Each word will give a line of the form "[word][additional space] : [additional space][number]\n". The caveat is that all the colons need to line up. The words are left-aligned and the longest will have a single space between its end and the colon (note "respectfully" in the example below); the numbers are right-aligned and the longest will have a single space between the colon and its beginning (note 126 in the example below).

$ ./wordfreak aladdin.txt

$ cat output.txt

a            :  49

...

respectfully :   1

...

the          : 126

...

your         :   7

The output file should be named output.txt. Note that when opening the file for writing, you will need to either create the file or remove all of its existing contents, so make use of open()'s O_CREAT and O_TRUNC flags. Moreover, you will want the file's permissions to be set so that it can be read. open()'s third argument determines the permissions of created files; a value such as 0644 will make it readable.

You are restricted to using only the following system calls for performing I/O: open(), close(), read(), write(), and lseek(). You are allowed to use other C library calls (e.g., malloc(), free()). However, all I/O is restricted to the Linux kernel's direct API support for I/O. You are also allowed to use sprintf() to make formatting easier.

Solutions

Expert Solution

I wrote code that counts the words from one input at a time, but I could not get it to work for multiple files because I kept getting an error. Sorry. The program below reads standard input and prints the total number of words.

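#include <stdio.h>
#include <ctype.h>

/* Scanner states: before any input, inside a word, between words. */
enum {INITIAL, WORD, SPACE};

int main(void)
{
  int c;
  int state = INITIAL;
  int wcount = 0;   /* total number of words seen so far */

  c = getchar();
  while (c != EOF)
  {
    switch (state)
    {
      /* First character: it either starts a word or is a separator. */
      case INITIAL: if (isalpha(c))
                    {
                       wcount++;
                       state = WORD;
                    }
                    else
                       state = SPACE;
                    break;

      /* Inside a word: any non-letter (digit, apostrophe, ...) ends it,
         so "isn't" counts as two words, as the assignment requires. */
      case WORD:    if (!isalpha(c))
                       state = SPACE;
                    break;

      /* Between words: the next letter starts a new word. */
      case SPACE:   if (isalpha(c))
                    {
                       wcount++;
                       state = WORD;
                    }
                    break;
    }
    c = getchar();
  }
  printf ("%d words\n", wcount);
  return 0;
}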
