In Python, here's some fake data.
df = {'country': ['US', 'US', 'US', 'US', 'UK', 'UK', 'UK'],
      'year': [2008, 2009, 2010, 2011, 2008, 2009, 2010],
      'Happiness': [4.64, 4.42, 3.25, 3.08, 3.66, 4.08, 4.09],
      'Positive': [0.85, 0.7, 0.54, 0.07, 0.1, 0.92, 0.94],
      'Negative': [0.49, 0.09, 0.12, 0.32, 0.43, 0.21, 0.31],
      'LogGDP': [8.66, 8.23, 7.29, 8.3, 8.27, 6.38, 6.09],
      'Support': [0.24, 0.92, 0.54, 0.55, 0.6, 0.38, 0.63],
      'Life': [51.95, 55.54, 52.48, 53.71, 50.18, 49.12, 55.84],
      'Freedom': [0.65, 0.44, 0.06, 0.5, 0.52, 0.79, 0.63],
      'Generosity': [0.07, 0.01, 0.06, 0.28, 0.36, 0.33, 0.26],
      'Corruption': [0.97, 0.23, 0.66, 0.12, 0.06, 0.87, 0.53]}
I have a list containing Happiness and the six explanatory variables.
exp_vars = ['Happiness', 'LogGDP', 'Support', 'Life', 'Freedom', 'Generosity', 'Corruption']
1. Define a variable called explanatory_vars that contains the list of the 6 key explanatory variables.
2. Define a variable called plot_vars that contains Happiness and each of the explanatory variables. (Hint: recall that you can concatenate Python lists using the addition (+) operator.)
3. Using sns.pairplot, make a pairwise scatterplot for the WHR data frame over the variables of interest, namely the plot_vars. To add additional information, set the hue option to reflect the year of each data point, so that trends over time might become apparent. It will also be useful to include the options dropna=True and palette='Blues'.
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Fake data from the question: one row per country-year
df = {'country': ['US', 'US', 'US', 'US', 'UK', 'UK', 'UK'],
      'year': [2008, 2009, 2010, 2011, 2008, 2009, 2010],
      'Happiness': [4.64, 4.42, 3.25, 3.08, 3.66, 4.08, 4.09],
      'Positive': [0.85, 0.7, 0.54, 0.07, 0.1, 0.92, 0.94],
      'Negative': [0.49, 0.09, 0.12, 0.32, 0.43, 0.21, 0.31],
      'LogGDP': [8.66, 8.23, 7.29, 8.3, 8.27, 6.38, 6.09],
      'Support': [0.24, 0.92, 0.54, 0.55, 0.6, 0.38, 0.63],
      'Life': [51.95, 55.54, 52.48, 53.71, 50.18, 49.12, 55.84],
      'Freedom': [0.65, 0.44, 0.06, 0.5, 0.52, 0.79, 0.63],
      'Generosity': [0.07, 0.01, 0.06, 0.28, 0.36, 0.33, 0.26],
      'Corruption': [0.97, 0.23, 0.66, 0.12, 0.06, 0.87, 0.53]}
dataFrame = pd.DataFrame.from_dict(df)

# 1. The six key explanatory variables
explanatory_vars = ['LogGDP', 'Support', 'Life', 'Freedom', 'Generosity', 'Corruption']

# 2. Happiness plus the explanatory variables (list concatenation with +)
plot_vars = ['Happiness'] + explanatory_vars

# 3. Pairwise scatterplots over plot_vars, with points colored by year
sns.pairplot(dataFrame,
             vars=plot_vars,
             hue='year',
             dropna=True,
             palette='Blues')
plt.show()
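Before plotting, it can help to confirm that the variable lists line up with the data. The following is a minimal sketch using only plain Python, assuming the column names from the fake data above; the `columns` list and the `missing_columns` helper are made up for illustration and are not part of pandas or seaborn.

```python
# Column names copied from the question's fake data (values are not needed here).
columns = ['country', 'year', 'Happiness', 'Positive', 'Negative',
           'LogGDP', 'Support', 'Life', 'Freedom', 'Generosity', 'Corruption']

explanatory_vars = ['LogGDP', 'Support', 'Life', 'Freedom', 'Generosity', 'Corruption']

# The + operator builds a new list; explanatory_vars itself is left unchanged.
plot_vars = ['Happiness'] + explanatory_vars


def missing_columns(wanted, available):
    """Hypothetical helper: return the names in `wanted` that are absent from `available`."""
    return [name for name in wanted if name not in available]


# Check the plotted variables and the hue column ('year') against the data.
print(len(plot_vars))                                   # 7
print(missing_columns(plot_vars + ['year'], columns))   # []
```

An empty list from `missing_columns` means every name passed to `vars` and `hue` matches a column, so a misspelled column name is caught before the plot is drawn.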