USE PYTHON
[1121, "Jackie Grainger", 22.22,
1122, "Jignesh Thrakkar", 25.25,
1127, "Dion Green", 28.75, False,
24.32, 1132, "Jacob Gerber",
"Sarah Sanderson", 23.45, 1137, True,
"Brandon Heck", 1138, 25.84, True,
1152, "David Toma", 22.65,
23.75, 1157, "Charles King", False,
"Jackie Grainger", 1121, 22.22, False,
22.65, 1152, "David Toma"]
First of all, let's understand the given data:
PART-1
Each employee record has 3 values: emp_id, emp_name and hrl_wage, although they do not always appear in the same order. Some records also carry a boolean value; since we don't know what it means, we ignore it. So let's first create a list of dictionaries in which each dictionary holds one employee's information. Below is the code I am using for this.
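import pandas as pd
import json

# Given data: a flat list in which each employee's fields appear in no fixed order
data = [1121, "Jackie Grainger", 22.22,
        1122, "Jignesh Thrakkar", 25.25,
        1127, "Dion Green", 28.75, False,
        24.32, 1132, "Jacob Gerber",
        "Sarah Sanderson", 23.45, 1137, True,
        "Brandon Heck", 1138, 25.84, True,
        1152, "David Toma", 22.65,
        23.75, 1157, "Charles King", False,
        "Jackie Grainger", 1121, 22.22, False,
        22.65, 1152, "David Toma"]

new_data = []  # one dictionary per employee
info = {}      # fields collected so far for the current employee
for ele in data:
    # type() is used instead of isinstance() so that booleans
    # (a subclass of int) are not mistaken for employee IDs
    if type(ele) is int:
        info['emp_info'] = ele
    elif type(ele) is str:
        info['emp_name'] = ele
    elif type(ele) is float:
        info['emp_hrl_wage'] = ele
    # anything else (the stray booleans) is ignored
    if len(info) == 3:
        new_data.append(info)
        info = {}

df = pd.DataFrame(new_data)
# keep="first" keeps one copy of each repeated employee;
# keep=False would drop every copy and remove those employees entirely
df.drop_duplicates(subset="emp_info", keep="first", inplace=True)
df['total_hrl_rate'] = df['emp_hrl_wage'] * 1.3
main_list = json.loads(df.to_json(orient='records'))
main_list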
In the above code we assign each value to a field based on its data type. As soon as the dictionary holds 3 entries, we assume we have all the data for one employee, append it to the list, and start a new dictionary for the next employee.
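The type-based grouping described above can be sketched without pandas on a small slice of the data. This is only an illustration: the function name group_records is hypothetical, and the explicit bool check is there because isinstance(True, int) is True in Python.

```python
def group_records(flat):
    """Group a flat, unordered list of values into employee dicts by type."""
    records, current = [], {}
    for value in flat:
        # bool is a subclass of int, so rule it out before the int check
        if isinstance(value, bool):
            continue
        if isinstance(value, int):
            current["emp_info"] = value
        elif isinstance(value, str):
            current["emp_name"] = value
        elif isinstance(value, float):
            current["emp_hrl_wage"] = value
        # three fields collected means one complete employee record
        if len(current) == 3:
            records.append(current)
            current = {}
    return records


# A slice of the given data, including one stray boolean
sample = [1127, "Dion Green", 28.75, False, 24.32, 1132, "Jacob Gerber"]
records = group_records(sample)
print(records)
```

Note that this approach relies on every employee having exactly one int, one str and one float; a missing field would cause two employees to be merged.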
Once all the data is in a dataframe, we apply the drop_duplicates function to remove repeated employees. We also add a new key, total_hrl_rate, which is 1.3 times emp_hrl_wage, and then convert the data to a JSON-like list of dictionaries.
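To make the de-duplication step concrete, here is a minimal plain-Python sketch that keeps the first occurrence of each employee ID and adds total_hrl_rate. It is an illustration under the assumption that one copy of each repeated employee should survive; in pandas, drop_duplicates with keep="first" behaves that way, while keep=False removes every row that has a duplicate.

```python
records = [
    {"emp_info": 1121, "emp_name": "Jackie Grainger", "emp_hrl_wage": 22.22},
    {"emp_info": 1122, "emp_name": "Jignesh Thrakkar", "emp_hrl_wage": 25.25},
    {"emp_info": 1121, "emp_name": "Jackie Grainger", "emp_hrl_wage": 22.22},
]

seen = set()   # employee IDs already kept
unique = []    # de-duplicated records with the extra key
for rec in records:
    if rec["emp_info"] in seen:
        continue  # a later copy of an employee we already have
    seen.add(rec["emp_info"])
    # total_hrl_rate is 1.3 times the hourly wage
    unique.append({**rec, "total_hrl_rate": rec["emp_hrl_wage"] * 1.3})

print(unique)
```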
PART-2 : To get the employees with underpaid salaries (total_hrl_rate between 28.15 and 30.65), the code below can be used:
# Get the employees whose total hourly rate falls in the underpaid range.
underpaid_salaries = df[(df['total_hrl_rate'] > 28.15) & (df['total_hrl_rate'] < 30.65)]
underpaid_salaries = json.loads(underpaid_salaries.to_json(orient='records'))
underpaid_salaries
Here, the filter is applied while the data is still in dataframe format, and the result is then converted to JSON format.
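The same range filter can be checked by hand on a few of the employees. The sketch below uses plain Python and treats both bounds as exclusive, which is an assumption since the question does not say whether 28.15 and 30.65 themselves count.

```python
# Hourly wages for a few employees taken from the given data
wages = {
    "Jackie Grainger": 22.22,
    "Jignesh Thrakkar": 25.25,
    "David Toma": 22.65,
    "Charles King": 23.75,
}

LOW, HIGH = 28.15, 30.65  # underpaid range for total_hrl_rate

underpaid = []
for name, wage in wages.items():
    total = wage * 1.3
    # chained comparison: both bounds exclusive (assumption)
    if LOW < total < HIGH:
        underpaid.append({"emp_name": name, "total_hrl_rate": total})

underpaid_names = [row["emp_name"] for row in underpaid]
print(underpaid_names)
```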
PART-3 : Here we need to calculate the raise for each dictionary in main_list.
To create a raise column based on the given conditions, the code below can be used:
def get_raise(wage):
    if 22.0 <= wage < 24.0:
        return wage * 0.05
    elif 24.0 <= wage < 26.0:
        return wage * 0.04
    elif 26.0 <= wage < 28.0:
        return wage * 0.03
    else:
        return wage * 0.02

#company_raises
df['raise'] = df['emp_hrl_wage'].map(get_raise)
company_raises = json.loads(df[['emp_name', 'raise']].to_json(orient='records'))
company_raises
Here each row's wage is passed through a function that returns the corresponding raise. The emp_name and raise columns are then selected into a separate dataframe and converted to JSON format.
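As a quick sanity check of the raise brackets, the mapping can be tried on a few wages without pandas. This is a sketch that mirrors the brackets used above; rounding to 4 decimals is my own choice for readability and is not part of the original requirement.

```python
def get_raise(wage):
    """Return the raise amount for an hourly wage, using the brackets above."""
    if 22.0 <= wage < 24.0:
        return wage * 0.05
    if 24.0 <= wage < 26.0:
        return wage * 0.04
    if 26.0 <= wage < 28.0:
        return wage * 0.03
    # Everything else gets 2%; note this also covers wages below 22.0
    return wage * 0.02


wages = {"Jackie Grainger": 22.22, "Jignesh Thrakkar": 25.25, "Dion Green": 28.75}
company_raises = [
    {"emp_name": name, "raise": round(get_raise(wage), 4)}
    for name, wage in wages.items()
]
print(company_raises)
```

If wages below 22.0 should not receive a raise at all, the final return would need its own condition.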
I hope this answers your question and gives you enough to go ahead.
Thanks and Have a nice day!!