1. regular expressions as a mechanism to specify tokens,
2. defining context-free grammars for language constructs, and
3. derivation of language terms and expressions based on a context-free grammar.
1 Regex
1. Define the regex for the following description of tokens:
(a) Any string that starts with the character t
(b) Any string of at least length 3 that starts with t and ends with u
(c) Any string that denotes a number in the range 11 to 23.
(d) Any string that specifies a date in MM:DD:YYYY format.
2. In C, an identifier is defined as a string of letters (both upper-case and lower-case), digits, and
underscore “_”, starting with either a letter or an underscore. Define the regex for identifiers in C.
3. Give five strings that conform with the regex:
[0-9]+((E|e)(\+|\-)?[0-9]+)?
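a) ^t.* - Regex for a string starting with t (lowercase t)
b) ^t.+u$ - Regex for a string of at least 3 characters starting with t and ending with u (^t.u$ would match strings of exactly 3 characters only)
^ - Matches the beginning of the string, or the beginning of a line if the multiline flag (m) is enabled. This matches a position, not a character.
$ - Matches the end of the string, or the end of a line if the multiline flag (m) is enabled. This also matches a position.
. - Matches any character except line breaks.
* - Matches 0 or more of the preceding token.
+ - Matches 1 or more of the preceding token.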
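The difference between "exactly 3 characters" and "at least 3 characters" in part (b) is easy to check by hand. The following is a minimal sketch using Python's re module; the sample strings and variable names are made up for illustration and are not part of the original question.

```python
import re

# Hypothetical sample strings, chosen only for illustration.
samples = ["t", "tu", "tau", "tabu", "tofu!", "cut"]

starts_with_t = re.compile(r"^t.*")     # (a) t followed by anything
exactly_three = re.compile(r"^t.u$")    # t, exactly one character, u
at_least_three = re.compile(r"^t.+u$")  # (b) t, one or more characters, u

# fullmatch requires the whole string to match the pattern.
hits_a = [s for s in samples if starts_with_t.fullmatch(s)]
hits_exact = [s for s in samples if exactly_three.fullmatch(s)]
hits_b = [s for s in samples if at_least_three.fullmatch(s)]

print(hits_a)      # ['t', 'tu', 'tau', 'tabu', 'tofu!']
print(hits_exact)  # ['tau']
print(hits_b)      # ['tau', 'tabu']
```

Note that "tu" is rejected by both part (b) patterns because it has only two characters, while "tabu" is accepted only when + is used instead of a single dot.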
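c) ^(1[1-9]|2[0-3])$ ----- Either 1 followed by a digit 1-9 (11 to 19), or 2 followed by a digit 0-3 (20 to 23). The simpler [1-2][1-3] is not sufficient: it matches only 11, 12, 13, 21, 22, and 23.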
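A quick way to check a numeric-range regex is to run it against every candidate number. The sketch below, assuming Python's re module and illustrative variable names, compares the per-digit pattern [1-2][1-3] with the alternation 1[1-9]|2[0-3] over the numbers 0 to 99.

```python
import re

per_digit = re.compile(r"[1-2][1-3]")        # each digit restricted separately
by_decade = re.compile(r"1[1-9]|2[0-3]")     # 11-19, or 20-23

candidates = [str(n) for n in range(100)]

# fullmatch plays the role of the ^...$ anchors here.
per_digit_hits = [int(s) for s in candidates if per_digit.fullmatch(s)]
by_decade_hits = [int(s) for s in candidates if by_decade.fullmatch(s)]

print(per_digit_hits)  # [11, 12, 13, 21, 22, 23]
print(by_decade_hits)  # [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
```

The per-digit pattern misses 14 to 20, which is why the range has to be split by its leading digit.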
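d) ^(1[0-2]|0[1-9]):(3[01]|[12][0-9]|0[1-9]):[0-9]{4}$
Explanation: (1[0-2]|0[1-9]) - capturing group 1, which matches the month (01 to 12); | works as an OR condition
(3[01]|[12][0-9]|0[1-9]) - capturing group 2, which matches the day (01 to 31)
[0-9]{4} - exactly four digits for the year
This checks the format only, so an impossible date such as 02:31:2020 is still accepted.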
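To see what the date pattern accepts and rejects, it can be run against a few inputs. This is a minimal sketch assuming Python's re module; the sample dates are hypothetical.

```python
import re

date_re = re.compile(r"(1[0-2]|0[1-9]):(3[01]|[12][0-9]|0[1-9]):[0-9]{4}")

# Hypothetical inputs, chosen to exercise each part of the pattern.
samples = [
    "01:01:2000",  # valid
    "12:31:1999",  # valid
    "13:01:2000",  # month out of range
    "06:00:2021",  # day 00 is not allowed
    "1:1:2000",    # month and day need two digits
    "06/15/2021",  # wrong separator
    "02:31:2020",  # well-formed, although February has no 31st
]

# fullmatch plays the role of the ^...$ anchors.
results = {s: date_re.fullmatch(s) is not None for s in samples}
for s, ok in results.items():
    print(s, ok)

# The two capturing groups expose the month and the day.
m = date_re.fullmatch("12:31:1999")
month, day = m.group(1), m.group(2)
print(month, day)  # 12 31
```

The last sample shows the limit of a pure regex approach: rejecting 02:31:2020 would require per-month day rules, which are usually easier to enforce in code after the match.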
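2) ^[_a-zA-Z][_a-zA-Z0-9]*$
Explanation: ^ anchors the match at the start, so the first character must be a letter or an underscore. [_a-zA-Z0-9]* allows any number of further letters, digits, or underscores, and $ anchors the match at the end. The question states no length limit, so a bound such as {0,30} (at most 30 further characters) is not needed.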
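The effect of bounding the identifier length can be checked directly. The sketch below assumes Python's re module and uses made-up names; it compares an unbounded pattern (using *) with a bounded one (using {0,30}).

```python
import re

unbounded = re.compile(r"[_a-zA-Z][_a-zA-Z0-9]*")
bounded = re.compile(r"[_a-zA-Z][_a-zA-Z0-9]{0,30}")  # at most 31 characters in total

def is_identifier(s: str) -> bool:
    """True if s is a syntactically valid identifier under the question's definition."""
    return unbounded.fullmatch(s) is not None

# Hypothetical samples.
print(is_identifier("x"))        # True
print(is_identifier("_tmp"))     # True
print(is_identifier("count2"))   # True
print(is_identifier("2count"))   # False: starts with a digit
print(is_identifier("my-var"))   # False: '-' is not allowed
print(is_identifier(""))         # False: at least one character is required

long_name = "a" * 40
print(is_identifier(long_name))                  # True
print(bounded.fullmatch(long_name) is not None)  # False: longer than 31 characters
```

Neither pattern excludes reserved words such as int or while; a real lexer typically handles those with a separate keyword lookup.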
3) Strings matching the regex [0-9]+((E|e)(\+|\-)?[0-9]+)?:
1e+9
4E-9
2E-5
9e+0
3e-2
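The five strings can be checked mechanically, together with a few near misses. This is a minimal sketch assuming Python's re module; the extra test strings are hypothetical.

```python
import re

number_re = re.compile(r"[0-9]+((E|e)(\+|\-)?[0-9]+)?")

answers = ["1e+9", "4E-9", "2E-5", "9e+0", "3e-2"]
all_match = all(number_re.fullmatch(s) for s in answers)
print(all_match)  # True

# Hypothetical extra strings to probe the optional parts of the pattern.
extras = ["42", "7e10", "e5", "3e", "1.5e3", "-2e4"]
extra_results = {s: number_re.fullmatch(s) is not None for s in extras}
for s, ok in extra_results.items():
    print(s, ok)
```

Because the whole exponent group is optional, plain digit strings such as 42 also conform, and the sign inside the exponent may be omitted (7e10). Strings with a decimal point, a leading sign, or a missing mantissa or exponent digits do not conform.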