Write up instructions for how to do the following 5 items in JMP
(or other statistical software). Cite sources if
needed.
1. How to import data and change data type (from nominal to
interval/ratio level)
2. How to find descriptive statistics (mean, median, mode,
variance, etc)
3. How to find confidence intervals
4. How to do linear regression
5. How to do ANOVA
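1. Importing data and changing the data type
You can bring data into JMP by importing it from another application, such as Excel, SAS, or text files. The basic steps to import data are as follows:
(i) Select File > Open.
(ii) Navigate to your file's location.
(iii) If your file is not listed in the Open Data File window, select the correct file type from the Files of type menu.
(iv) Click Open.
(v) To change a column from nominal to continuous (interval/ratio level), click the modeling type icon next to the column name in the Columns panel on the left-hand side of the data table and choose Continuous. If the column was read as text, first set its Data Type to Numeric under Cols > Column Info.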
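For readers using other software, the same two ideas (reading a file, then turning a text column into a numeric one) can be sketched in Python with only the standard library. The column names and the inline CSV text are made up for illustration; with a real file you would pass an opened file object to `csv.DictReader` instead.

```python
import csv
import io

# Hypothetical file contents, used here so the example is self-contained.
raw_text = "group,height\nA,170.5\nB,165.0\nA,180.2\n"

# csv.DictReader yields one dict per row, keyed by the header line.
rows = list(csv.DictReader(io.StringIO(raw_text)))

# Every field is read as text, much like a Character/Nominal column in JMP.
# Converting to float makes the column numeric (continuous).
heights = [float(row["height"]) for row in rows]

# The group labels are left as text, i.e. they stay categorical (nominal).
groups = [row["group"] for row in rows]

print(heights)
print(groups)
```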
2. Descriptive statistics
Take the numeric column you want to summarize (in the original example, a new column holding the text length of each entry), and then run
Analyze > Distribution
on that column. The report will give you the Mean, Std Dev, Std Err Mean, etc.
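As a rough equivalent outside JMP, the quantities reported by Distribution can be computed with Python's standard `statistics` module. The sample values below are invented for illustration.

```python
import statistics

values = [2.0, 4.0, 4.0, 5.0, 7.0]  # made-up sample

summary = {
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "mode": statistics.mode(values),
    "variance": statistics.variance(values),  # sample variance (divides by n - 1)
    "std_dev": statistics.stdev(values),      # sample standard deviation
}
# Standard error of the mean = sample standard deviation / sqrt(n).
summary["std_err_mean"] = summary["std_dev"] / len(values) ** 0.5

for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```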
3. Confidence intervals
• Load data into JMP.
• Go to Analyze → Distribution.
• Put the column you are interested in into the box [Y, Columns] and press OK.
• In the output, rotate the plot (to make it look better) by right clicking on the name of the column and choosing Histogram Options → Vertical.
• To make a CI, right click on the column name, go to Confidence Interval, and select the confidence level you want.
• The CI should now appear in the output.
• The first column of the CI table names the parameter; the second gives the estimate of that parameter (for example, the sample mean).
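To see what such an interval is made of, here is a minimal sketch in standard-library Python. It uses a normal (z) critical value, which is an assumption of this sketch and only reasonable for large samples; statistical packages typically report a t-based interval, which is wider for small samples. The function name and data are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def z_confidence_interval(sample, level=0.95):
    """Approximate CI for the mean: estimate +/- z * standard error."""
    n = len(sample)
    centre = mean(sample)
    std_err = stdev(sample) / n ** 0.5
    # Two-sided critical value, e.g. about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return centre - z * std_err, centre + z * std_err

sample = [170.5, 165.0, 180.2, 175.3, 168.8, 172.1]  # made-up heights
lower, upper = z_confidence_interval(sample, level=0.95)
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

Raising `level` (for example to 0.99) widens the interval, which matches what you see when you pick a higher confidence level in the JMP menu.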
4. Linear regression
(i) Suppose we want to fit the model Y = β0 + β1 x + ε to bivariate data. For example, we want to fit Height = β0 + β1 × (foot size) + ε to the data. First load the data into JMP.
(ii) Click on Analyze and then Fit Y by X.
(iii) Put the response variable (for example, height) into Y, Response. Put the regressor or explanatory variable (for example, foot size) into X, Factor.
(iv) Press OK. You should see a scatter plot of Y against X.
(v) To fit a line of best fit, right click on (or click the red triangle next to) Bivariate Fit of.... and select Fit Line. You will now see the line of best fit on the scatter plot together with a lot of output. Note that if you select Fit Mean you will get a horizontal line at the sample mean of Y.
(vi) To obtain the residuals, click on the small red triangle just below the scatter plot, next to the Linear Fit label, and select Save Residuals. A new column containing the residuals should appear in the data table (note that the sample mean of the residuals is zero).
(vii) To check whether the linear model is appropriate, plot the residuals against the regressors (explanatory variables).
(viii) To make this plot, go to Analyze and Fit Y by X. Put the residuals into Y, Response and the regressor (such as foot size) into X, Factor. Then press OK; you should get a scatter plot.
• Learn to interpret the output.
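The arithmetic behind the fitted line and the saved residuals can be sketched in plain Python using the usual least-squares formulas. The foot size and height values are invented, and `fit_line` is a hypothetical helper name, not a JMP function.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares estimates (intercept, slope) for y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

foot_size = [24.0, 25.5, 26.0, 27.5, 29.0]    # made-up data
height = [160.0, 168.0, 171.0, 178.0, 186.0]  # made-up data

b0, b1 = fit_line(foot_size, height)

# Residual = observed value minus fitted value; these sum to zero (up to rounding).
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(foot_size, height)]

print(f"Height = {b0:.2f} + {b1:.2f} * foot size")
print("sum of residuals:", round(sum(residuals), 10))
```

Plotting `residuals` against `foot_size` corresponds to the residual plot in steps (vii) and (viii): a patternless scatter around zero suggests the straight-line model is reasonable.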
5. ANOVA
• We want to test whether the means of all the populations are the same, against the alternative that at least one is different. For example, with five groups, H0 : µ1 = µ2 = µ3 = µ4 = µ5 against HA : at least one mean is different.
(i) The data needs to be entered into JMP as two columns, one containing all the observations and the other indicating which group (or population) each observation comes from.
(ii) Make sure that the observations column (for example, the speed of light data) is set to continuous and that the groups column is set to nominal or ordinal. You do this by going to the list of variables on the left-hand side of the data table and left clicking on the markers next to them.
(iii) Go to Analyze and click on Fit Y by X.
(iv) Put the observations into the Y, Response box. For example, the speed of light column goes here. Put the factor (the column which indicates the group that each observation belongs to) into X, Factor.
(v) Press OK.
(vi) You will get a chart which plots the groups side by side. From here you can see by eye whether the sample means look similar or not.
(vii) To get the Analysis of Variance table, right click on Oneway Analysis... and select Means/Anova.
• In the output you will get an ANOVA table. Please learn to interpret this.
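To help read the ANOVA table, the sketch below computes its main entries (sums of squares, degrees of freedom, and the F ratio) by hand in standard-library Python. The three groups and the name `one_way_anova` are made up for illustration. The p-value is left out because it needs the F distribution, which the standard library does not provide.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the main entries of a one-way ANOVA table for a dict of samples."""
    all_values = [v for g in groups.values() for v in g]
    grand_mean = mean(all_values)
    group_means = {name: mean(g) for name, g in groups.items()}

    # Between-group variation: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (group_means[name] - grand_mean) ** 2
                     for name, g in groups.items())
    # Within-group variation: how far observations are from their own group mean.
    ss_within = sum((v - group_means[name]) ** 2
                    for name, g in groups.items() for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_between": ss_between, "ss_within": ss_within,
            "df_between": df_between, "df_within": df_within,
            "f_ratio": f_ratio}

data = {  # made-up observations, one list per group
    "A": [1.0, 2.0, 3.0],
    "B": [2.0, 3.0, 4.0],
    "C": [5.0, 6.0, 7.0],
}
table = one_way_anova(data)
print(table)
```

A large F ratio means the group means differ by more than the within-group scatter would explain, which is evidence against H0.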