/* Question 3: The following data contains five columns (variables) and five rows (observations). First, read the data into SAS to create a data set. Notice that the first, third, and the fifth variable have missing values. Please replace the missing values of the first, third, and fifth variable with 30, 40, and 50, respectively. Next, for all the variables, if a value is at least 100, make an adjustment to the value such that its new value is equal to its old value minus 50.

15 20 . 35 55
60 50 75 100 80
. 40 100 200 .
25 50 79 120 45
80 150 . 120 35
*/
All SAS code is indented (space is left at the left margin); the explanation and comments are not indented.
First, we need to create the data set. I will call it 'data_set' and name the columns 'first', 'second', 'third', 'fourth' and 'fifth'. The following code creates the data set:
    DATA data_set;
        input first second third fourth fifth;
        /* A period (.) in the data marks a missing numeric value */
        datalines;
    15 20 . 35 55
    60 50 75 100 80
    . 40 100 200 .
    25 50 79 120 45
    80 150 . 120 35
    ;
    run;
Next, we need to replace the missing values: 30 in the first column, 40 in the third column and 50 in the fifth column. A single DATA step can handle all three columns:
    DATA data_set;
        set data_set;
        /* A period (.) is the missing numeric value in SAS */
        if first = . then first = 30;
        if third = . then third = 40;
        if fifth = . then fifth = 50;
    run;
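One way to check that the replacement worked is to count the missing values that remain. This is only a quick sketch and assumes the data set is still named 'data_set'. PROC MEANS with the N and NMISS statistics should report 5 non-missing and 0 missing values for each of the three columns.

```sas
/* Sanity check: count non-missing (N) and missing (NMISS) values */
proc means data=data_set n nmiss;
    var first third fifth;
run;
```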
Next, we need to subtract 50 from every value in the data set that is 100 or greater. The code below loops through all the columns with an array and deducts 50 wherever the value is at least 100. Each value is adjusted only once, so 200 becomes 150:
    DATA data_set;
        set data_set;
        /* first -- fifth: every variable from first through fifth, in data set order */
        array cols{*} first -- fifth;
        do i = 1 to dim(cols);
            if cols{i} >= 100 then cols{i} = cols{i} - 50;
        end;
        drop i;  /* the loop index is not part of the result */
    run;
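For comparison, the whole exercise can also be done in one pass over the raw data. The following is a minimal sketch, not the only way to do it. The data set names 'raw' and 'adjusted' and the loop index 'i' are my own choices for illustration. It uses the COALESCE function, which returns its first non-missing argument, in place of the IF statements.

```sas
/* Read the raw data into a separate data set so this sketch stands alone */
data raw;
    input first second third fourth fifth;
    datalines;
15 20 . 35 55
60 50 75 100 80
. 40 100 200 .
25 50 79 120 45
80 150 . 120 35
;
run;

data adjusted;
    set raw;
    /* Step 1: fill in missing values (COALESCE keeps non-missing values as they are) */
    first = coalesce(first, 30);
    third = coalesce(third, 40);
    fifth = coalesce(fifth, 50);
    /* Step 2: subtract 50 from any value that is at least 100 */
    array cols{*} first second third fourth fifth;
    do i = 1 to dim(cols);
        if cols{i} >= 100 then cols{i} = cols{i} - 50;
    end;
    drop i;
run;

/* Expected rows after both steps:
   15  20 40  35 55
   60  50 75  50 80
   30  40 50 150 50
   25  50 79  70 45
   80 100 40  70 35 */
proc print data=adjusted noobs;
run;
```

The missing values must be filled in before the subtraction so that the order matches the question. In this data the replacement values (30, 40, 50) are all below 100, so they are not adjusted afterwards.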
I hope this helps and solves the problem at hand. Please let me know if you have any doubts or spot an error. Happy learning!