Question

Python Question

I have created a dictionary shown below:

import pandas as pd

records_dict = {
    'FirstName': ['Jim', 'John', 'Helen'],
    'LastName': ['Robertson', 'Adams', 'Cooper'],
    'Zipcode': ['21801', '22321-1143', 'edskd-2134'],
    'Phone': ['555-555-5555', '4444444444', '323232'],
}

I have stored this dictionary in a DataFrame, as shown below:

records = pd.DataFrame(records_dict)

print(records)

I can print the records without any problem. What I want is to use regular expressions to remove, or replace with a blank space, the values under the Zipcode and Phone keys that do not match the correct format.

How would I write this?

Solutions

Expert Solution

import re

# Two zip code formats are allowed: ddddd or ddddd-dddd
# (raw strings keep the \d escapes from triggering invalid-escape warnings)
def vpin(one):
    return re.fullmatch(r"\d{5}|\d{5}-\d{4}", one) is not None

# Only ddd-ddd-dddd is allowed for phone numbers
def vphone(one):
    return re.fullmatch(r"\d{3}-\d{3}-\d{4}", one) is not None

# Loop through the zip codes and phone numbers and replace every invalid
# value with a single blank space (the dictionary is modified in place)
def validateZipcode(rec):
    for i, one in enumerate(rec["Zipcode"]):
        if not vpin(one):
            rec["Zipcode"][i] = " "

    for i, one in enumerate(rec["Phone"]):
        if not vphone(one):
            rec["Phone"][i] = " "


records_dict = {
    'FirstName': ['Jim', 'John', 'Helen'],
    'LastName': ['Robertson', 'Adams', 'Cooper'],
    'Zipcode': ['21801', '22321-1143', 'edskd-2134'],
    'Phone': ['555-555-5555', '4444444444', '323232'],
}

validateZipcode(records_dict)
print(records_dict)

# Output: {'FirstName': ['Jim', 'John', 'Helen'], 'LastName': ['Robertson', 'Adams', 'Cooper'], 'Zipcode': ['21801', '22321-1143', ' '], 'Phone': ['555-555-5555', ' ', ' ']}
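
The answer above cleans the dictionary, while the question stores the data in a DataFrame. A minimal sketch of the same check applied directly to the DataFrame columns could look like the following. It assumes pandas 1.1 or newer (for `Series.str.fullmatch`) and the same two formats as the answer; the helper name `blank_invalid` and the pattern constants are hypothetical, introduced only for illustration.

```python
import pandas as pd

records = pd.DataFrame({
    'FirstName': ['Jim', 'John', 'Helen'],
    'LastName': ['Robertson', 'Adams', 'Cooper'],
    'Zipcode': ['21801', '22321-1143', 'edskd-2134'],
    'Phone': ['555-555-5555', '4444444444', '323232'],
})

# Assumed formats, the same ones the answer above accepts
ZIP_PATTERN = r"\d{5}(?:-\d{4})?"      # ddddd or ddddd-dddd
PHONE_PATTERN = r"\d{3}-\d{3}-\d{4}"   # ddd-ddd-dddd


def blank_invalid(column, pattern, blank=" "):
    """Return a copy of `column` with non-matching values replaced by `blank`.

    Hypothetical helper, not part of pandas.
    """
    # fullmatch requires the whole string to match; missing values count as invalid
    valid = column.str.fullmatch(pattern, na=False)
    # where() keeps entries where `valid` is True and substitutes `blank` elsewhere
    return column.where(valid, blank)


records['Zipcode'] = blank_invalid(records['Zipcode'], ZIP_PATTERN)
records['Phone'] = blank_invalid(records['Phone'], PHONE_PATTERN)

print(records)
```

Passing `blank=""` instead would leave the invalid cells empty rather than holding a single space, which is the other option the question mentions.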
