In: Computer Science
String search
Describe the brute force solution to counting the number of times the letter “a” occurs in a text.
Outline a divide-and-conquer algorithm for counting the number of times the letter “a” occurs in a text.
Analyze each approach and compare the efficiencies.
Don't forget to upvote if you like my answer
Please comment if you have any queries or suggestions.
Brute force solution:
For searching a particular alphabet using brute force, we will do it in the following method:
Dry run:
Lets take a string "abracadabra". We start iterating from the first alphabet('a'). The first letter is 'a', so we increase count_a by 1. Now count_a =1. The next letter('b') is not 'a'. We keep the value of count_a unchanged. In this way we iterate through the whole string and the final value of count_a is 5.
Efficiency:
This algorithm will take O(n) time where n is the length of the string because it has to iterate through the whole string.
Divide-conquer
For searching the string using divide and conquer method, the string needs to be in an ascending order. If the string is not sorted, we need to sort it using some algorithm like merge sort.
Divide and conquer or binary search needs a condition to split the string. If the string is not sorted, there will no definite condition on the basis of which the string can be split.
Now, when the string is sorted, we know that all the similar letters will be grouped together. While searching for an alphabet in the sorted sequence, what we need to find is the difference between the start index and the end index of that group containing our concerned letter. This difference increased by unity will be our result.
Consider the example 'abracadabra'. When we sort it, it becomes 'aaaaabbcdrr'. The starting index for 'a' in the sorted sequence is 1 and the end index is 5. The difference between the end index and the start index is 4. The difference increased by unity is 5 which is our result.
Now it is evident that the start index of 'a' in any sorted sequence of alphabets is 1 because 'a' is the first alphabet. The problem boils down to finding the end index of the group of 'a's. We find that index using binary search.
Efficiency:
If the string is sorted, the binary search will take O(log n)
time.
But if we have to sort the string, we also have to consider the
complexity of sorting algorithms which is O(n*logn) for algorithms
like merge sort and quick sort.