Question

We learned four kinds of combination. Now let us combine the above data sets. In the following step, please do not create the data sets again. Just use them.

1) Which of them do not require a sorting procedure before the combination? Do these kinds of combination.

2) Which of them do require a sorting procedure before the combination? Do these kinds of combination.

3) Do one of the combinations in 1) again. Compare the result with that in 1). Report and explain your findings.

DATA One;
INPUT ID age name $;
DATALINES;
1001 21 Jone
1002 20 Peter
1003 22 Mary
1004 19 Joe
1005 21 Mark
1006 23 Sue
1007 19 Harry
1008 18 Tom
1009 22 Andy
1020 21 Larry
;
RUN;
PROC PRINT DATA=One;
RUN;
DATA Two;
INPUT ID gender $ department $;
DATALINES;
1006 F Econ
1007 M Econ
1008 M Math
1009 M Stat
1010 M Stat
1001 M Econ
1002 M Stat
1003 F Econ
1004 F Econ
1005 M Fina
;
RUN;
PROC PRINT DATA=Two;
RUN;

Solutions

Expert Solution

Solution1:

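SQL joins and the SET statement (concatenation) do not require sorting. A MERGE without a BY statement (one-to-one merge) does not require sorting either.

We join the two data sets with PROC SQL joins and stack them with a SET statement. PROC APPEND is another way to stack data sets.

Horizontal combination (putting variables side by side) uses joins or MERGE. Vertical combination (stacking observations) uses SET or PROC APPEND.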

SAS CODE:


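/* Inner join: only IDs present in both data sets */
proc sql;
select * from one, two
where one.id = two.id;
quit;

/* Concatenation: stack the observations of one and two */
data ds;
set one two;
run;

/* Full join: all IDs from either data set */
proc sql;
select * from one full join two
on one.id = two.id;
quit;

/* Left join: all IDs from one */
proc sql;
select * from one left join two
on one.id = two.id;
quit;

/* Right join: all IDs from two */
proc sql;
title "right join result";
select * from one right join two
on one.id = two.id;
quit;
title; /* clear the title so it does not carry over to later output */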

data comb;
set one two;
run;
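
The text above mentions PROC APPEND and MERGE, but the code does not show them. A minimal sketch of both, neither of which needs sorted input. The data set names one_copy and ds_one_to_one are made up for illustration, and the copy is an assumption so that data set One itself is not changed:

```sas
/* Copy one so the original data set stays as it is */
data one_copy;
set one;
run;

/* FORCE is needed because two has variables (gender, department)
   that are not in the base data set. PROC APPEND drops them with a
   warning, and age and name are missing for the appended rows. */
proc append base=one_copy data=two force;
run;

/* One-to-one merge: no BY statement, observations are paired by position.
   The ID from two overwrites the ID from one, so rows can be mismatched. */
data ds_one_to_one;
merge one two;
run;
```

Because the one-to-one merge pairs rows by position only, its result depends on the current order of the observations in each data set.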

Solution2:

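A match-merge (MERGE with a BY statement) requires both data sets to be sorted by the BY variable. Interleaving (SET with a BY statement) has the same requirement.

Data set One is already in ascending ID order, so only Two needs PROC SORT.

SAS CODE: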

proc sort data=two;
by id;
run;

data ds3;
merge one two;
by id;
run;
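
Interleaving is the other combination that needs sorted input. A minimal sketch, assuming Two has already been sorted by ID as above; the output name ds4 is made up for illustration:

```sas
/* Interleave: stack both data sets but keep the rows in ID order */
data ds4;
set one two;
by id;
run;

proc print data=ds4;
run;
```

Like the plain SET statement, this keeps every observation from both data sets; the only difference is that the rows come out ordered by ID.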

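RESULTS:

ds3 contains 11 observations. IDs 1001 to 1009 have values from both data sets, ID 1010 has missing age and name, and ID 1020 has missing gender and department.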

Solution3:

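Match-merging requires sorted data sets. Joins and the SET statement do not.

A match-merge by ID keeps every ID from either data set, so it gives the same 11 rows as the full join.

The inner join gives the fewest records (9, the IDs found in both data sets). The SET statement gives the most (20, because it stacks all observations from One and Two).

If an observation has no matching record in the other data set, the variables from that data set are missing: blank for character variables and a period (.) for numeric variables.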
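
To do one of the combinations from 1) again, the concatenation can be re-run now that PROC SORT has reordered Two. A minimal sketch, assuming ds from Solution 1 was created before the sort; the name ds_again is made up for illustration:

```sas
/* Same concatenation as in 1), run after two has been sorted by id */
data ds_again;
set one two;
run;

/* Compare the earlier result with the new one, observation by observation */
proc compare base=ds compare=ds_again;
run;
```

Both data sets have the same 20 observations, but the rows that come from Two are now in ID order (1001 to 1010) instead of the original input order (1006 first). Sorting changed Two in place, so a combination that depends on observation order gives a different result when it is run again.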
