In: Statistics and Probability
We learned four kinds of combination. Now let us combine the above data sets. In the following step, please do not create the data sets again. Just use them.
1) Which of them do not require a sorting procedure before the combination? Do these kinds of combination.
2) Which of them do require a sorting procedure before the combination? Do these kinds of combination.
3) Do one of the combinations in 1) again. Compare the result with that in 1). Report and explain your findings.
DATA One;
INPUT ID age name $;
DATALINES;
1001 21 Jone
1002 20 Peter
1003 22 Mary
1004 19 Joe
1005 21 Mark
1006 23 Sue
1007 19 Harry
1008 18 Tom
1009 22 Andy
1020 21 Larry
;
RUN;
PROC PRINT DATA=One;
RUN;
DATA Two;
INPUT ID gender $ department $;
DATALINES;
1006 F Econ
1007 M Econ
1008 M Math
1009 M Stat
1010 M Stat
1001 M Econ
1002 M Stat
1003 F Econ
1004 F Econ
1005 M Fina
;
RUN;
PROC PRINT DATA=Two;
RUN;
Sol:
Joins and set statements do not require sorting
We will join the two datasets using sql joins and also we will append 2 datasets using set statement and PROC append .
Horizontal combination is using joins or merge
and vertically combining
SAS CODE:
proc sql;
select * from one, Two
where one.Id =Two.id;
run;
data ds;
set one Two;
run;
proc sql;
select * from one full join Two
on one.Id =Two.id;
run;
proc sql;
select * from one left join Two
on one.Id =Two.id;
run;
proc sql;
title "right join result";
select * from one right join Two
on one.Id =Two.id;
run;
data comb;
set one two;
run;
Solution2:
For merging 2 datasets we need sorting.
dataset one is already sorted Perform proc sort dataset two
SAS CODE
proc sort data=two;
by id;
run;
data ds3;
merge one two;
by id;
run;
RESULTS:
Solution3:
For merging we need sorting of datasets
by combining merge we get maximum records and merge and join results are same.
merge gives minimum records and set statement gives maximum records.
if there are no matching records in the combining dataset it is indicated with blank or null for that particular variable
l