Please use Python.
Step a: Create two dataframes df1 and df2 as follows:
import numpy as np
import pandas as pd
rng = np.random.RandomState(100)
df1 = pd.DataFrame(rng.randint(0, 100, (4, 3)), columns=['A', 'B', 'C'])
df2 = pd.DataFrame(rng.randint(0, 100, (3, 4)), columns=['A', 'B', 'C', 'D'])
Step b: Create a new dataframe df which is the sum of df1 and df2;
Step c: Subtract half of column 'C' of df1 from every column of df; (Remark: the values in df should be updated)
Step d: Replace the NaN values in df with 10; (Remark: the values in df should be updated)
Step e: Use df.apply() to calculate the sum of the numbers in each row of df, and show the result. (Remark: the result should be a vector of four values)


import pandas as pd
import numpy as np
rng = np.random.RandomState(100)
# Step a: build the two frames from the same seeded generator
df1 = pd.DataFrame(rng.randint(0, 100, (4, 3)), columns=['A', 'B', 'C'])
df2 = pd.DataFrame(rng.randint(0, 100, (3, 4)), columns=['A', 'B', 'C', 'D'])
print("Data of df1")
print(df1)
print("Data of df2")
print(df2)
df = df1 + df2
print("Data of df")
print(df)
# Step c: subtract half of df1's column 'C' (not df's own 'C') from every column.
# axis=0 aligns the Series with the row index, so each row loses the same value.
df = df.sub(df1['C'] / 2, axis=0)
print("Data of df after subtracting half of df1 column 'C'")
print(df)
# Step d: replace every NaN with 10
df = df.fillna(10)
print("Data of df after replacing NaN with 10")
print(df)
# Step e: row-wise sum via df.apply (axis=1 passes each row to the function)
vec = df.apply(lambda row: row.sum(), axis=1)
print("Summation")
print(vec)
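
To check the logic by hand, here is a minimal sketch of steps b to e on small made-up data. The helper name combine_and_sum and the frames small1 and small2 are hypothetical, chosen only so the arithmetic is easy to follow; they have the same shapes as df1 and df2 but are not the values produced by the seeded generator.

```python
import pandas as pd

def combine_and_sum(df1, df2):
    """Steps b to e for two frames that share a row index (illustrative helper)."""
    df = df1 + df2                       # unmatched rows and columns become NaN
    df = df.sub(df1['C'] / 2, axis=0)    # subtract df1['C'] / 2 from every column, row by row
    df = df.fillna(10)                   # replace every NaN with 10
    return df.apply(lambda row: row.sum(), axis=1)

# Hypothetical toy data with shapes (4, 3) and (3, 4), as in the question.
small1 = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8], 'C': [2, 4, 6, 8]})
small2 = pd.DataFrame({'A': [10, 20, 30], 'B': [1, 1, 1], 'C': [0, 0, 0], 'D': [7, 8, 9]})

row_sums = combine_and_sum(small1, small2)
print(row_sums.tolist())  # [26.0, 37.0, 48.0, 40.0]
```

Because df1 has no column 'D' and df2 has no row 3, pandas fills those positions with NaN when the frames are added, and they stay NaN through the subtraction. After the fill, column 'D' is 10 everywhere and the last row is 10 in all four columns, so its sum is 40 whatever the other values are.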