Assignment #1: Sorting with Binary Search Tree (IN C LANGUAGE)
Through this programming assignment, the students will learn to do the following:
1. Know how to process command line arguments.
2. Perform basic file I/O.
3. Use structs, pointers, and strings.
4. Use dynamic memory.
This assignment asks you to sort the lines of an input file (or of standard input) and print the sorted lines to an output file (or standard output). Your program, called bstsort (binary search tree sort), will take the following command line arguments:
% bstsort [-c] [-o output_file_name] [input_file_name]
If -c is present, the program compares the strings case-sensitively; otherwise, the comparison is case-insensitive. If the output_file_name is given with the -o option, the program will output the sorted lines to the given output file; otherwise, the output shall be the standard output. Similarly, if the input_file_name is given, the program will read from the input file; otherwise, the input will be from the standard input. You must use getopt() to parse the command line arguments to determine the cases. All strings will be no more than 100 characters long.
In addition to parsing and processing the command line arguments, your program needs to do the following:
1. You need to construct a binary search tree as you read from input. A binary search tree is a binary tree. Each node can have at most two child nodes (one on the left and one on the right), and either or both can be empty. If a child node exists, it is the root of a binary search tree (which we call a subtree). Each node contains a key (in our case, a string) and a count of how many times that string occurred. If the left subtree of a node exists, it contains only nodes with keys less than the node's key. If the right subtree of a node exists, it contains only nodes with keys greater than the node's key. You can look up binary search tree on the web or in your Data Structure textbook. Note that you do not need to balance the binary search tree (that is, you can ignore all those rotation operations) in this assignment.
2. Initially the tree is empty (that is, the root is null). The program reads from the input file (or stdin) one line at a time. If the line is not an empty line and the line is not already in the tree, it should create a tree node that stores a pointer to the string and a count of 1 indicating this is the first occurrence of that string, and then insert the tree node into the binary search tree. If the line is not an empty line and the line is already in the tree, increase the count for that node, indicating that there are multiple instances of that line. An empty line indicates the end of input for stdin; an empty line or end of file indicates the end of input for an input file.
3. You must develop two string comparison functions, one for case sensitive and the other for case insensitive. You must not use the strcmp() and strcasecmp() functions provided by the C library. You must implement your own version. You will be comparing the ASCII values. Note that using ASCII, all capital letters come before all lower case letters.
4. Once the program has read all the input (when EOF is returned or a blank line encountered), the program then performs an in-order traversal of the binary search tree to print out all the strings one line at a time to the output file or stdout. Next to the line include a count of how many times that line appeared. If the selection was for case insensitive then you should include either the first choice encountered, the last choice encountered or all capital letters.
5. Before the program ends, it must reclaim the tree! You can do this by performing a post-order traversal, i.e., reclaiming the children nodes before reclaiming the node itself. Make sure you also reclaim the memory occupied by the string as well.
6. It is required that you use getopt for processing the command line and use malloc or calloc and free functions for dynamically allocating and deallocating nodes and the buffers for the strings. It is required that you implement your own string comparison functions instead of using the corresponding libc functions.
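The insertion rule from items 1 and 2 can be sketched as follows. This is only an illustration under stated assumptions: the names bst_node, cmp_fn, copy_string, compare_bytes and bst_insert are hypothetical and are not part of the assignment or of the posted solution, and compare_bytes stands in for whichever of the two hand-written comparison functions is selected.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout: one heap-allocated key plus its count. */
struct bst_node {
    char *key;
    int count;
    struct bst_node *left, *right;
};

/* Comparator type: negative, zero or positive result. */
typedef int (*cmp_fn)(const char *, const char *);

/* Placeholder comparator: plain byte-by-byte (case-sensitive) order. */
int compare_bytes(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) { a++; b++; }
    return (unsigned char)*a - (unsigned char)*b;
}

/* Copy a string to the heap, including its terminator. */
static char *copy_string(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') n++;
    char *copy = malloc(n + 1);
    if (copy != NULL) {
        for (size_t i = 0; i <= n; i++) copy[i] = s[i];
    }
    return copy;
}

/* Insert line into the tree at *link. A line that is already present
 * only has its count increased. Returns 0 on success, -1 if out of memory. */
int bst_insert(struct bst_node **link, const char *line, cmp_fn cmp)
{
    assert(link != NULL && line != NULL && cmp != NULL);
    while (*link != NULL) {
        int order = cmp(line, (*link)->key);
        if (order == 0) {
            (*link)->count++;
            return 0;
        }
        /* Smaller keys go left, larger keys go right. */
        link = order < 0 ? &(*link)->left : &(*link)->right;
    }
    struct bst_node *node = malloc(sizeof *node);
    if (node == NULL) return -1;
    node->key = copy_string(line);
    if (node->key == NULL) {
        free(node);
        return -1;
    }
    node->count = 1;
    node->left = node->right = NULL;
    *link = node;
    return 0;
}
```

Passing the address of the root pointer lets the same code handle the empty tree, so no special case is needed for the first line.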
Here's an example:
bash$ cat myfile
bob is working.
david is a new hire.
Bob is working.
alice is bob's boss.
charles doesn't like bob.
bash$ ./bstsort myfile
1 alice is bob's boss.
2 bob is working.
1 charles doesn't like bob.
1 david is a new hire.
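The output format in the example above (count, a space, then the line) can be produced by an in-order traversal. The sketch below assumes a hypothetical bst_node layout and a hypothetical function name print_in_order; neither is prescribed by the assignment.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical node layout: a key string and its occurrence count. */
struct bst_node {
    char *key;
    int count;
    struct bst_node *left, *right;
};

/* In-order traversal: left subtree, the node itself, then right subtree.
 * Writes one "<count> <line>" row per node and returns the number of rows. */
int print_in_order(const struct bst_node *node, FILE *out)
{
    assert(out != NULL);
    if (node == NULL) return 0;
    int written = print_in_order(node->left, out);
    fprintf(out, "%d %s\n", node->count, node->key);
    written++;
    written += print_in_order(node->right, out);
    return written;
}
```

Because the stream is a parameter, the same function serves both stdout and a file opened for the -o option.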
Please submit your work through the inbox as one zip file. Follow the instructions below carefully (to avoid unnecessary loss of grade): You should submit the source code and the Makefile in the zip file called FirstnameLastnameA1. One should be able to create the executable by simply 'make'.
The Makefile should also contain a 'clean' target for cleaning up the directory (removing all temporary files and object files). Make sure you don't include intermediate files: *.o, executables, *~, etc., in your submission. (There'll be a penalty for including unnecessary intermediate files). Only two files should be included unless permission is given for more, those would be bstsort.c, and Makefile. If you feel a need to include a bstsort.h file, please send me a note asking for permission.
The following are descriptions of the files I used. All of them are written in C.
The MakeFile specifies the source files and the header file that go into the build.
The MakeFile and the binarysort file can be used for the assignment.
MakeFile file
CC = gcc
# header.h defines the global "root" in every source file, so GCC 10 and
# later need -fcommon to merge those definitions at link time.
CFLAGS = -Wall -g -fcommon
OBJS = main.o parseCommandLineOptions.o determineRead.o readFile.o \
       readFromInput.o stringComparisons.o binarySort.o determineOutput.o \
       writeToOutputFile.o writeToScreen.o

.PHONY: all build bin clean

all: build bin

# Link all object files into the executable.
build: $(OBJS)
	$(CC) $(CFLAGS) -o bstsort2 $(OBJS)

# Every object file depends on its own source file and the shared header.
%.o: %.c header.h
	$(CC) $(CFLAGS) -c $<

# Move the object files out of the way after linking.
bin:
	mkdir -p bin
	mv *.o bin/

clean:
	rm -rf *.o bin bstsort2
____________________________________________________________________________________________
binarysort file
#include "header.h"
/* Insert stringB into the tree below root. Always returns 0. */
int insertNode(struct Node *root, char *stringB, int caseSensitive){
    if(sameString(root->currLine, stringB, caseSensitive) == 1) {
        root->duplicate++;
        if(caseSensitive == 0) {
            /* Keep every spelling of a case-insensitive duplicate. */
            struct List *tempList = malloc(sizeof(struct List));
            if(tempList == NULL) {
                fprintf(stderr, "Out of memory\n");
                exit(1);
            }
            tempList->currLine = stringB;
            tempList->next = NULL;
            if(root->list == NULL) {
                root->list = tempList;
            }else{
                /* Append at the tail so earlier items are not overwritten. */
                struct List *tail = root->list;
                while(tail->next != NULL) tail = tail->next;
                tail->next = tempList;
            }
        }else{
            free(stringB);   /* exact duplicate: the copy is not stored */
        }
        return 0;
    }
    /* Go left when the node's line is greater than stringB, else right. */
    struct Node **child;
    if(greaterThan(root->currLine, stringB, caseSensitive) == 1) {
        child = &root->left;
    }else{
        child = &root->right;
    }
    if(*child != NULL) {
        return insertNode(*child, stringB, caseSensitive);
    }
    struct Node *tempNode = malloc(sizeof(struct Node));
    if(tempNode == NULL) {
        fprintf(stderr, "Out of memory\n");
        exit(1);
    }
    tempNode->currLine = stringB;
    tempNode->right = NULL;
    tempNode->left = NULL;
    tempNode->list = NULL;
    tempNode->duplicate = 0;
    *child = tempNode;
    return 0;
}
/* In-order traversal: left subtree, the node itself, then the right subtree. */
void printInOrder(struct Node *root,FILE *fp){
    static int index = 0;   /* running position, shared by all recursive calls */
    if(root != NULL) {
        printInOrder(root->left,fp);
        if(fp != NULL) {
            fprintf(fp, "%s\n", root->currLine);
            printLinkedList(fp, root);
        }else{
            printf("Index '%d' value'%s'\n", index, root->currLine);
            printLinkedList(fp, root);
        }
        index++;
        printInOrder(root->right,fp);
    }
}
/* Print every duplicate spelling stored on the node's list. Always returns 0. */
int printLinkedList(FILE *fp, struct Node *root){
    struct List *tempList;
    for(tempList = root->list; tempList != NULL; tempList = tempList->next) {
        if(fp != NULL) {
            /* Format directly: strcat into the fixed-size prefix array would overflow it. */
            fprintf(fp, "Linked List Item: %s\n", tempList->currLine);
        }else{
            printf("\t\tLinkList Item: '%s'\n",tempList->currLine);
        }
    }
    return 0;
}
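/* Post-order reclaim: free both subtrees, then the node's own memory. */
void printPostOrder(struct Node *node){
    if(node == NULL) return;
    printPostOrder(node->left);
    printPostOrder(node->right);
    while(node->list != NULL) {          /* duplicate list and its strings */
        struct List *next = node->list->next;
        free(node->list->currLine);
        free(node->list);
        node->list = next;
    }
    free(node->currLine);
    if(node != &root) free(node);        /* the global root is not heap memory */
}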
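For comparison, here is a minimal sketch of the post-order reclaim described in item 5 of the assignment, for a tree in which every node, including the root, is heap-allocated. The bst_node layout and the name free_tree are hypothetical; returning the number of freed nodes is only a convenience for checking the traversal.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout: a heap-allocated key and its count. */
struct bst_node {
    char *key;
    int count;
    struct bst_node *left, *right;
};

/* Post-order traversal: reclaim both children before the node itself.
 * Returns the number of nodes that were freed. */
int free_tree(struct bst_node *node)
{
    if (node == NULL) return 0;
    /* Every node is created by an insertion, so its count is at least 1. */
    assert(node->count >= 1);
    int freed = free_tree(node->left);
    freed += free_tree(node->right);
    free(node->key);    /* the string buffer first */
    free(node);         /* then the node that pointed to it */
    return freed + 1;
}
```

The children must be visited before free(node) runs, because the left and right pointers live inside the node being released.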
____________________________________________________________________________________________
Header file
#ifndef HEADER_H
#define HEADER_H
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <ctype.h>
#include <stdlib.h>
// Struct Declarations
// Note: "root" is defined in every file that includes this header;
// GCC 10 and later need -fcommon for those definitions to be merged when linking.
struct Node {
    char *currLine;       // the stored line
    int duplicate;        // repeats seen after the first occurrence
    struct Node *left;
    struct Node *right;
    struct List *list;    // other spellings of a case-insensitive duplicate
} root;
struct List {
    struct List *next;
    char *currLine;
};
//String Functions
int sameString(char stringA[],char stringB[],int caseType);
void toUpperCase(char *word,int count);
void toLowerCase(char *word,int count);
int stringLength(char *word);
int greaterThan(char stringA[],char stringB[],int caseType);
//Command Line Function
void parseCommandLineOptions(int argc, char *argv[],int *caseSensitive,char **outputFile,char **inputFile);
//OpenFile Function
int determineRead(char *fileName);
void readFile(char *fileName, int caseSensitive);
void readFromInput(int caseSensitive);
//WriteFile Function
void determineOutput(char *fileName);
void writeToOutputFile(char *fileName);
void writeToScreen(void);
//BinarySort
int insertNode(struct Node *root,char *line, int caseSensitive);
void printInOrder(struct Node *root,FILE *fp);
void printPostOrder(struct Node *root);
int printLinkedList(FILE *fp, struct Node *root);
#endif
____________________________________________________________________________________________
Main file
#include "header.h"
int main(int argc, char *argv[]){
    int caseSensitive = 0;
    char *outputFile = NULL;
    char *inputFile = NULL;
    parseCommandLineOptions(argc,argv,&caseSensitive,&outputFile,&inputFile);
    /* Build the tree from the input file if one was named, else from stdin. */
    int readType = determineRead(inputFile);
    if(readType){
        readFile(inputFile,caseSensitive);
    }else{
        readFromInput(caseSensitive);
    }
    determineOutput(outputFile);   /* in-order print of the sorted lines */
    printPostOrder(&root);         /* post-order pass that reclaims memory */
    return 0;
}
____________________________________________________________________________________________
Parse Command Line Options file
#include "header.h"
/* Parse [-c] [-o output_file_name] [input_file_name] with getopt(). */
void parseCommandLineOptions(int argc, char *argv[],int *caseSensitive,char **outputFile,char **inputFile){
    int c;
    opterr = 0;   /* report errors here instead of letting getopt print them */
    while((c = getopt(argc,argv, "co:")) != -1) {
        switch(c) {
        case 'c':
            *caseSensitive = 1;
            break;
        case 'o':
            *outputFile = optarg;
            break;
        case '?':
            /* -o is the only option that takes an argument. */
            if(optopt == 'o')
                fprintf(stderr, "Option -%c requires an argument.\n", optopt);
            else if(isprint(optopt))
                fprintf(stderr, "Unknown Option '-%c'\n", optopt);
            else
                fprintf(stderr, "Unknown Option character '\\x%x'\n", optopt);
            exit(1);
        default:
            abort();
        }
    }
    /* argv[argc] is NULL, so this stays NULL when no input file was given. */
    *inputFile = argv[optind];
}
____________________________________________________________________________________________
Read File file
#include "header.h"
/* Read fileName line by line; the first non-empty line becomes the root. */
void readFile(char *fileName,int caseType){
    /* fopen resolves relative names itself, and absolute paths keep working. */
    FILE *fp = fopen(fileName,"r");
    if(!fp) {
        fprintf(stderr, "File Not Found '%s'\n", fileName);
        exit(1);
    }
    root.left = NULL;
    root.right = NULL;
    root.list = NULL;
    root.duplicate = 0;
    char lineBuffer[1024];
    int ch;                 /* int, so that EOF differs from every character */
    int count = 0;
    int initalizeRoot = 0;
    do {
        ch = getc(fp);
        if((ch == '\n' || ch == EOF) && count != 0) {
            /* Allocate room for the whole line plus its terminator. */
            char *temp = malloc(count + 1);
            if(temp == NULL) {
                fprintf(stderr, "Out of memory\n");
                exit(1);
            }
            lineBuffer[count] = '\0';
            strcpy(temp,lineBuffer);
            if(initalizeRoot == 0) {
                root.currLine = temp;
                initalizeRoot++;
            }else{
                insertNode(&root,temp,caseType);
            }
            count = 0;
        }else if(ch != '\n' && ch != EOF && count < (int)sizeof(lineBuffer) - 1) {
            lineBuffer[count] = ch;   /* characters past the buffer size are dropped */
            count++;
        }
    } while(ch != EOF);
    fclose(fp);
}
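The assignment treats an empty line as the end of input and limits lines to 100 characters, which suggests a small fgets-based reader. The sketch below is one possible shape, not the posted solution's method; the read_line name and the MAX_LINE constant are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_LINE 100   /* assumed from the assignment's stated line limit */

/* Read the next line from in and return it as a heap-allocated string
 * without the trailing newline. Returns NULL at end of file, on an empty
 * line (which ends the input), or if memory runs out. The caller frees
 * the result. Lines longer than MAX_LINE would be split across calls. */
char *read_line(FILE *in)
{
    assert(in != NULL);
    char buf[MAX_LINE + 2];   /* line + '\n' + '\0' */
    if (fgets(buf, sizeof buf, in) == NULL) return NULL;
    size_t len = 0;
    while (buf[len] != '\0' && buf[len] != '\n') len++;
    if (len == 0) return NULL;
    char *line = malloc(len + 1);   /* +1 for the terminator */
    if (line == NULL) return NULL;
    for (size_t i = 0; i < len; i++) line[i] = buf[i];
    line[len] = '\0';
    return line;
}
```

The same function works for stdin and for a file opened with fopen, so the two input cases can share one reading loop.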
____________________________________________________________________________________________
Read From Input file
#include "header.h"
/* Read lines interactively from stdin; the first line becomes the root. */
void readFromInput(int caseSensitive){
    root.left = NULL;
    root.right = NULL;
    root.list = NULL;
    root.duplicate = 0;
    int end = 0;
    char inputString[1024];
    /* Allocate a full-size buffer: one byte cannot hold a line. */
    char *rootVal = malloc(sizeof(inputString));
    if(rootVal == NULL) {
        fprintf(stderr, "Out of memory\n");
        exit(1);
    }
    rootVal[0] = '\0';
    printf("Enter Values To Sort\n");
    scanf("%1023[^\n]", rootVal);   /* the width keeps scanf inside the buffer */
    root.currLine = rootVal;
    while(end == 0){
        printf("Enter Values To Sort\n");
        if(scanf(" %1023[^\n]", inputString) != 1) break;   /* end of input */
        printf("You Typed '%s'\n", inputString);
        char *temp = malloc(stringLength(inputString) + 1);
        if(temp == NULL) {
            fprintf(stderr, "Out of memory\n");
            exit(1);
        }
        strcpy(temp, inputString);
        insertNode(&root, temp, caseSensitive);
        printf("Enter 1 to stop and 0 to continue\n");
        if(scanf(" %d", &end) != 1) break;
    }
}
____________________________________________________________________________________________
String Comparisons file
#include "header.h"
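/* Length of word, not counting the terminating '\0'. */
int stringLength(char *word){
    int count = 0;
    while(word[count]) count++;
    return count;
}
/* Convert only the ASCII letters a-z; other characters stay unchanged. */
void toUpperCase(char *word, int count){
    int a = 0;
    for(a = 0; a < count; a++) {
        if(word[a] >= 'a' && word[a] <= 'z') {
            word[a] = word[a] - 32;
        }
    }
}
/* Convert only the ASCII letters A-Z; other characters stay unchanged. */
void toLowerCase(char *word, int count){
    int a = 0;
    for(a = 0; a < count; a++) {
        if(word[a] >= 'A' && word[a] <= 'Z') {
            word[a] = word[a] + 32;
        }
    }
}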
/* Return 1 when the two strings are equal (ignoring case if caseType is 0). */
int sameString(char stringA[], char stringB[], int caseType){
    int lengthA = stringLength(stringA);
    int lengthB = stringLength(stringB);
    if(lengthA != lengthB) return 0;
    /* +1 leaves room for the terminating '\0' that strcpy writes. */
    char tempA[lengthA + 1], tempB[lengthB + 1];
    strcpy(tempA,stringA);
    strcpy(tempB,stringB);
    if(caseType == 0) {
        toUpperCase(tempA, lengthA);
        toUpperCase(tempB, lengthB);
    }
    int size;
    for(size = 0; size < lengthA; size++) {
        if(tempA[size] != tempB[size]) {
            return 0;
        }
    }
    return 1;
}
/* Return 1 when stringA sorts after stringB (ignoring case if caseType is 0). */
int greaterThan(char stringA[],char stringB[],int caseType){
    int lengthA = stringLength(stringA);
    int lengthB = stringLength(stringB);
    /* +1 leaves room for the terminating '\0' that strcpy writes. */
    char tempA[lengthA + 1], tempB[lengthB + 1];
    strcpy(tempA,stringA);
    strcpy(tempB,stringB);
    if(caseType == 0) {
        toUpperCase(tempA, lengthA);
        toUpperCase(tempB, lengthB);
    }
    /* Compare over the common prefix; the first differing character decides. */
    int shorter = lengthA < lengthB ? lengthA : lengthB;
    int count;
    for(count = 0; count < shorter; count++) {
        if(tempA[count] != tempB[count]) {
            return (unsigned char)tempA[count] > (unsigned char)tempB[count];
        }
    }
    /* One string is a prefix of the other: the longer one is greater. */
    return lengthA > lengthB;
}
____________________________________________________________________________________________
Determine Read file
#include "header.h"
/* Return 1 when an input file was named, 0 to read from stdin. */
int determineRead(char *fileName){
    if(fileName != NULL)
        return 1;
    return 0;
}
____________________________________________________________________________________________
Determine Output file
#include "header.h"
/* Write to the named file if one was given, otherwise to the screen. */
void determineOutput(char *fileName){
    if(fileName != NULL) {
        writeToOutputFile(fileName);
    }else{
        writeToScreen();
    }
}
____________________________________________________________________________________________
Write to Output file
#include "header.h"
/* Print the sorted tree to fileName, creating or truncating the file. */
void writeToOutputFile(char *fileName){
    /* fopen resolves relative names itself, and absolute paths keep working. */
    FILE *fp = fopen(fileName,"w");
    if(!fp) {
        fprintf(stderr, "Cannot open output file '%s'\n", fileName);
        exit(1);
    }
    printInOrder(&root,fp);
    fprintf(fp, "\n");
    fclose(fp);
}
____________________________________________________________________________________________
Write to Screen file
#include "header.h"
/* A NULL stream tells printInOrder to print to the screen. */
void writeToScreen(void){
    FILE *fp = NULL;
    printInOrder(&root,fp);
}
____________________________________________________________________________________________
Happy Coding:)