Python Question
I have created the dictionary shown below:
import pandas as pd
records_dict = {
    'FirstName': ['Jim', 'John', 'Helen'],
    'LastName': ['Robertson', 'Adams', 'Cooper'],
    'Zipcode': ['21801', '22321-1143', 'edskd-2134'],
    'Phone': ['555-555-5555', '4444444444', '323232'],
}
I have stored this dictionary in a data frame, as shown below:
records = pd.DataFrame(records_dict)
print(records)
I can print the records just fine. My issue is that I want to remove, or replace with a blank space, the values in the Zipcode and Phone columns that do not match the correct format, using regular expressions.
How would I write the code for this?
import re
# Two zip code formats are allowed: ddddd or ddddd-dddd
def vpin(one):
    # Raw string so that \d reaches the regex engine unchanged
    if re.fullmatch(r"\d{5}|\d{5}-\d{4}", one):
        return True
    return False
# Only the phone format ddd-ddd-dddd is allowed
def vphone(one):
    if re.fullmatch(r"\d{3}-\d{3}-\d{4}", one):
        return True
    return False
# Loop through the zip codes and phone numbers, check validity,
# and replace each invalid value with a blank space
def validateZipcode(rec):
    for i in range(len(rec["Zipcode"])):
        one = rec["Zipcode"][i]
        if not vpin(one):
            rec["Zipcode"][i] = " "
    for i in range(len(rec["Phone"])):
        one = rec["Phone"][i]
        if not vphone(one):
            rec["Phone"][i] = " "
records_dict = {
    'FirstName': ['Jim', 'John', 'Helen'],
    'LastName': ['Robertson', 'Adams', 'Cooper'],
    'Zipcode': ['21801', '22321-1143', 'edskd-2134'],
    'Phone': ['555-555-5555', '4444444444', '323232'],
}
validateZipcode(records_dict)
print(records_dict)
# Output: {'FirstName': ['Jim', 'John', 'Helen'], 'LastName': ['Robertson', 'Adams', 'Cooper'], 'Zipcode': ['21801', '22321-1143', ' '], 'Phone': ['555-555-5555', ' ', ' ']}
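The code above cleans the dictionary in place, while the question asks about the data frame. As a rough sketch, the same rules could be applied directly to the DataFrame columns with pandas string methods. This assumes pandas 1.1 or newer (for Series.str.fullmatch); the names ZIP_PATTERN, PHONE_PATTERN, and blank_invalid are made up for this illustration and are not part of the original answer.

```python
import pandas as pd

records = pd.DataFrame({
    'FirstName': ['Jim', 'John', 'Helen'],
    'LastName': ['Robertson', 'Adams', 'Cooper'],
    'Zipcode': ['21801', '22321-1143', 'edskd-2134'],
    'Phone': ['555-555-5555', '4444444444', '323232'],
})

# Hypothetical constants: same formats as vpin and vphone above
ZIP_PATTERN = r"\d{5}(?:-\d{4})?"      # ddddd or ddddd-dddd
PHONE_PATTERN = r"\d{3}-\d{3}-\d{4}"   # ddd-ddd-dddd


def blank_invalid(column, pattern, blank=" "):
    """Return a copy of column with non-matching values replaced by blank."""
    # fullmatch requires the whole string to match; na=False treats
    # missing values as invalid
    valid = column.str.fullmatch(pattern, na=False)
    # where() keeps values where the condition is True, else uses blank
    return column.where(valid, blank)


records['Zipcode'] = blank_invalid(records['Zipcode'], ZIP_PATTERN)
records['Phone'] = blank_invalid(records['Phone'], PHONE_PATTERN)
print(records)
```

Passing a different value for blank (for example an empty string or None) would change what invalid entries are replaced with, if a single space is not what you want.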