In: Computer Science
From question 19: findAndTruncate
"...+2 EC (at the instructor's discretion) will go to the (correct) solution with the least number of statements ."
Find two unneeded statements and one performance issue in the below code:
findAndTruncate(char *trunc, const char placeholder)
{
for(int i = 0; *(trunc + i) != '\0'; i++)
{
if(*(trunc + i) == placeholder)
{
* (trunc + i) = '\0';
}
}
}
We are given a function, which is doing something like this:
1. We got a character pointer (which is essentially pointing to a character array) and a character (on which const qualifier is applied so as to declare the character as a constant)
2. We are traversing the array starting from the first index(the 0th index) using a for loop. We will come out of the loop, once we hit the null character, '/0'.
(
'/0' - the null character
)
3. In each iteration, we are checking the character in our array against the placeholder character, and if they match, we put the null character ('\0') on that location in our array.
Let's try to understand the process with the help of an example.
Say, the contents of our character array and the placeholder are:
Now, our iterator i will start traversing from the 0th index and will check if the character in the array is equal to placeholder using the if condition specified.
First iteration: (No match)
Second iteration: (No match)
Third iteration: (there is a match)
So, we will put the null character ('\0') at index 2.
We will repeat this for all those characters which satisfy the if condition that is, the characters which are equal to the character present in placeholder.
Therefore, our new character array becomes:
WHAT'S THE PERFORMANCE ISSUE?
Now, the problem with such an array is that, as we come out of our loop (and eventually return from our function), we won't be able to access any character present beyond the index 2. Please note that characters in an array are accessed by using the base address of the array, which, in this case is trunc. And when we print or traverse our character array, we can go only up to the point until we hit the null character ('\0') for the first time. All the data beyond that, is present in the memory that we don't own and we return from our function, the caller won't know about anything beyond the null character. Also, accessing that memory may cause various types of memory errors.
Conclusion:
So, the problem is we won't be able to access data beyond index 2. So, why traverse further.
And because of that once we put our first null character, in our case, which is at index 2, all the work done by the loop after that is not needed, or we can say, it's degrading the performance of the code by traversing the rest of the array for nothing. Our for loop will continue to look for 'x' and put the null characters there, when in reality, we won't be able to access anything after index 2, anyway.
How this will degrade the performance?
Consider an array with 1000 characters and the first index which had 'x' was the index 2 and others followed. Now, effectively, we are done with our work as we put the null character in it, but, our for loop will continue to search ahead, which is just extra work. We can increase the size to higher powers of 10 to realize how inefficient the algorithm can become.
Solution:
That is, we need to put the null character only once, which means, the if condition runs only once, and we are done. Therefore, instead of having a separate condition which checks the character put something for all the characters in the array, we can make this condition the part of our looping condition (the condition that is checked for each iteration to decide whether to continue the loop or not). So, when x is found for the first time, we will come out of the loop and then, will put null character there. That's it.
Therefore, we can have better performance by changing code like this:
void findAndTruncate(char *trunc, const char placeholder) {
int i = 0;
for(; *(trunc + i) != '\0' && *(trunc + i) != placeholder; i++) {}
*(trunc + i) = '\0';
}
Or we can do the same with the help of while loop, which seems more natural in this case:
void findAndTruncate(char *trunc, const char placeholder) {
int i = 0;
while (*(trunc + i) != '\0' && *(trunc + i) != placeholder) {
++i;
}
*(trunc + i) = '\0';
}
We have made 3 changes:
1. We have taken out the statement which declares the variable i.
int i = 0;
2. Instead of having a separate if condition which checks that if the character in array is equal to the character in placeholder, we have made that it the part of our looping condition. Therefore, whenever this condition hold [ *(truc + i) == placeholder ], we will come out of the loop
// code in for loop
for (; *(trunc + i) != '\0' && *(trunc + i) != placeholder); ++i) {}
// code in while loop
while (*(trunc + i) != '\0' && *(trunc + i) != placeholder)) {
i++;
}
3. As this is the index which found out to be having the character present in the placeholder variable. We will put the null character in it.
// put the null character in the ith index
*(trunc + i) = '\0';
Now, our code works better.
Now, let's see which statements can be removed.
Let's first have look at our new code.
void findAndTruncate(char *trunc, const char placeholder) {
int i = 0; // FIRST STATEMENT
// SECOND STATEMENT
for(; *(trunc + i) != '\0' && *(trunc + i) != placeholder; i++) {}
// THIRD STATEMENT
*(trunc + i) = '\0';
}
So, in total, we have 3 statements. (The code given in the problem statement also had 3 statements, one for loop statement, one if condition, one inside the if condition )
We have put the FIRST STATEMENT (the declaration of variable i) outside the loop because, we need the variable outside the loop after the loop terminates. So, if we want to put any of the FIRST or THIRD STATEMENT inside the for loop statement, the SECOND STATEMENT, they both need to move into it.
The final code will look like this:
void findAndTruncate(char *trunc, const char placeholder) {
for (int i = 0; *(trunc + i) != '\0' && ((*(trunc + i) != placeholder) ? 1 : (*(trunc + i) = '\0')); i++);
}
As it is looking something complicated, let's put it like this:
for (initialize; condition; increment);
1. initialize:
int i = 0;
2. condition:
*(trunc + i) != '\0' && ((*(trunc + i) != placeholder) ? 1 : (*(trunc + i) = '\0'));
3. increment:
i++;
1. We have put initialization and increment inside the for loop and
2. we made use of the ternary operator to incorporate the THIRD STATEMENT into the SECOND STATEMENT. Ternary operator works the same like the if else statements.
codition ? doThis : OrThis
IS BASICALLY SAME AS:
if (condition) {
doThis
} else {
OrThis
}
As we want to put null character where the character present in placholder is found.
Let's put our code in the if else condition form to understand it better:
for (int i = 0;;i++) {
if (*(trunc + i) != '\0' && ((*(trunc + i) != placeholder))) {
continue;
} else {
*(trunc + i) = '\0';
break;
}
}
Now, we will convert this code into ternary operator form:
for (int i = 0; (*(trunc + i) != '\0' && ((*(trunc + i) != placeholder)) ? 1 : (*(trunc + i) = '\0')); i++);
(
Please note that every assignment in C programming is also an expression which returns the variable on the left hand side(also known as, the lvalue), for example:
if (a = 1) {
printf ("Hello");
}
The statement above first put 1 into a, and is then, computed as 1 as a whole, therefore, the condition in if is fulfilled, and "Hello" will be printed.
The same will be used in our ternary operator, when we are putting 1 after ?
)
What have we done?
1. We have replaced the continue statement, by the 1, so that the condition is fulfilled. (1 here denotes that the condition in the for loop is found to be true).
2. If our condition is false, then, we will put the null character at that index. And as trunc[i] will hold '\0' now, (from the note we have provided earlier) the value of the expression will be null character, which is computed as 0(or false) by the compiler. Therefore, we will come out of the loop.
And the function returns.
Please refer to the screenshot of the code in order to understand it better.
Let's look at some input - output: