In: Computer Science
Please describe in detail how you got the answer.
Give a regular expression which describes the set of strings over {a, b, c} in which every a is immediately followed by at least one c.
The following regex matches the required strings:
[bc]*((acc*)*[bc]*)*
Explanation:
Following regex tools are used.
"[ ]" Matches any one of the characters inside the brackets. It is called "capture set".
"*" Matches 0 or more occurrence of the preceding element.
"()" Matches exact occurrence of characters in the same order. It is called "capture group".
Let's split the regex and understand step by step:
1. [bc]* - Matches 0 or more occurrence of either "b" or "c" or both in any order. Example: b, bcbc, bbccc, cc, cbbc.
2. (acc*)* - Matches "a" followed by 1 or more "c". Example: ac, accccc, accccccccc
3. [bc]* - Again matches 0 or more occurrence of either "b" or "c" or both in any order. Example: b, bcbc, bbccc, cc, cbbc.
4. ((acc*)*[bc]*)* - Matches two or more "ac" with any number of "b" or "c" in between them. Example: acbbbcacbb, acacbac
5. [bc]*((acc*)*[bc]*)* - This is the final regex that matches strings over {a,b,c} with every "a" followed by atleast one "c". Example: bbb, accc, acacbbacbcbaa.
The regex DOES NOT match the following:
abcc - Here "a" is followed by "b"
bbcbcabcc - Here again "a" is followed by "b"
xyz - Here none of the characters belong to {a,b,c}