Discuss the major issues relating to types of variables used and sample size required in the application of discriminant analysis.

Expert Solution

Discriminant analysis is a technique for analyzing research data when the criterion (dependent) variable is categorical and the predictor (independent) variables are interval-scaled. A categorical dependent variable is one whose values fall into a set of distinct categories. For example, three brands of computers (Computer A, Computer B and Computer C) can serve as the categorical dependent variable.

The objective of discriminant analysis is to develop discriminant functions, which are linear combinations of the independent variables that discriminate between the categories of the dependent variable as well as possible. It enables the researcher to examine whether significant differences exist among the groups in terms of the predictor variables, and to evaluate the accuracy of the classification.
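As a rough illustration of a discriminant function as a linear combination of predictors, the sketch below computes Fisher's linear discriminant for two groups and two predictors in plain Python. The data, the group names (`loyal`, `not_loyal`) and the helper names are hypothetical, and the midpoint cutoff assumes equal prior probabilities for the two groups.

```python
from statistics import mean

def fisher_two_group(group_a, group_b):
    """Return (weights, cutoff) of Fisher's linear discriminant for 2 predictors.

    group_a, group_b: lists of (x1, x2) tuples (illustrative data only).
    """
    def centroid(g):
        return [mean(r[0] for r in g), mean(r[1] for r in g)]

    def scatter(g, m):
        # Sums of squared deviations and cross-products around the centroid.
        s11 = sum((r[0] - m[0]) ** 2 for r in g)
        s22 = sum((r[1] - m[1]) ** 2 for r in g)
        s12 = sum((r[0] - m[0]) * (r[1] - m[1]) for r in g)
        return s11, s12, s22

    ma, mb = centroid(group_a), centroid(group_b)
    a11, a12, a22 = scatter(group_a, ma)
    b11, b12, b22 = scatter(group_b, mb)
    dof = len(group_a) + len(group_b) - 2
    # Pooled within-group covariance matrix [[s11, s12], [s12, s22]].
    s11, s12, s22 = (a11 + b11) / dof, (a12 + b12) / dof, (a22 + b22) / dof
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("pooled covariance is singular (collinear predictors)")
    d1, d2 = ma[0] - mb[0], ma[1] - mb[1]
    # weights = inverse(pooled covariance) * (centroid_a - centroid_b)
    w = [(s22 * d1 - s12 * d2) / det, (-s12 * d1 + s11 * d2) / det]
    # Cutoff halfway between the projected centroids (equal priors assumed).
    cutoff = 0.5 * (w[0] * (ma[0] + mb[0]) + w[1] * (ma[1] + mb[1]))
    return w, cutoff

def classify(x, w, cutoff):
    """Assign observation x to group 'A' or 'B' by its discriminant score."""
    score = w[0] * x[0] + w[1] * x[1]
    return "A" if score > cutoff else "B"

# Hypothetical ratings on two interval-scaled predictors.
loyal = [(8, 7), (9, 6), (7, 8), (8, 9)]
not_loyal = [(3, 4), (2, 5), (4, 3), (3, 2)]
weights, cutoff = fisher_two_group(loyal, not_loyal)
print(classify((7, 7), weights, cutoff))
print(classify((3, 3), weights, cutoff))
```

The discriminant score is the weighted sum of the predictors; observations scoring above the cutoff are assigned to the first group, the rest to the second.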

Discriminant analysis is described by the number of categories of the dependent variable.

When the dependent variable has two categories, the technique is called two-group discriminant analysis. When it has three or more categories, the technique is called multiple discriminant analysis. The major distinction between the two is that in the two-group case only one discriminant function can be derived, whereas in multiple discriminant analysis more than one discriminant function can be computed.
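The standard upper bound on the number of discriminant functions is the smaller of the number of groups minus one and the number of predictors. A minimal sketch of that rule follows; the function name is hypothetical.

```python
def max_discriminant_functions(n_groups: int, n_predictors: int) -> int:
    """Upper bound on the number of discriminant functions: min(g - 1, p)."""
    if n_groups < 2:
        raise ValueError("discriminant analysis needs at least two groups")
    if n_predictors < 1:
        raise ValueError("at least one predictor is required")
    return min(n_groups - 1, n_predictors)

print(max_discriminant_functions(2, 5))  # two-group case: 1
print(max_discriminant_functions(3, 5))  # three groups, five predictors: 2
print(max_discriminant_functions(4, 2))  # limited by the predictors: 2
```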

Several examples illustrate where discriminant analysis fits. It can be used to determine whether heavy, medium and light users of soft drinks differ in their consumption of frozen foods. In the field of psychology, it can be used to differentiate between price-sensitive and non-price-sensitive buyers of groceries in terms of their psychological attributes or characteristics. In the field of business, it can be used to understand the characteristics or attributes that distinguish customers who are loyal to a store from customers who are not.

It is important for a researcher to understand how discriminant analysis relates to regression and analysis of variance (ANOVA), with which it shares similarities and differences. The similarity is that all three procedures involve a single dependent variable and multiple independent variables. The difference lies in the nature of the variables: the dependent variable is categorical (or binary) in discriminant analysis but metric in the other two procedures, while the independent variables are categorical in ANOVA but metric in regression and discriminant analysis.

Unequal group sizes do not influence the direct solution of the discriminant analysis problem. However, unequal group sizes can cause subtle changes during the classification phase. Normally, the sampling frequency of each group (the proportion of the total sample that belongs to that group) is used as its prior probability during the classification stage. If the relative group sample sizes are not representative of the group sizes in the overall population, the classification procedure will be erroneous.
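The effect of unrepresentative group proportions can be sketched with Bayes' rule, where each group's posterior probability is proportional to its prior times its likelihood. All numbers, group labels and function names below are hypothetical and chosen only to show how the same observation can be classified differently under sample-based and population-based priors.

```python
def priors_from_counts(counts):
    """Turn group sample sizes into prior probabilities (sampling frequencies)."""
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def classify_with_priors(likelihoods, priors):
    """Return (best group, posteriors) using posterior ~ prior * likelihood."""
    joint = {g: priors[g] * likelihoods[g] for g in likelihoods}
    total = sum(joint.values())
    posteriors = {g: v / total for g, v in joint.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors

# Hypothetical sample: loyal customers are heavily over-represented.
sample_priors = priors_from_counts({"loyal": 80, "not_loyal": 20})
# Hypothetical proportions in the overall population.
population_priors = {"loyal": 0.3, "not_loyal": 0.7}
# Hypothetical likelihood of one observation under each group's model.
likelihoods = {"loyal": 0.10, "not_loyal": 0.25}

by_sample, _ = classify_with_priors(likelihoods, sample_priors)
by_population, _ = classify_with_priors(likelihoods, population_priors)
print("sample-based priors:    ", by_sample)
print("population-based priors:", by_population)
```

When the sample proportions are known to differ from the population, supplying the population proportions as priors is the usual remedy.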

Discriminant analysis does not make the strong normality assumptions that MANOVA does because the emphasis is on classification. A sample size of at least twenty observations in the smallest group is usually adequate to ensure robustness of any inferential tests that may be made. Outliers can cause severe problems that even the robustness of discriminant analysis will not overcome. You should screen your data carefully for outliers using the various univariate and multivariate normality tests and plots to determine if the normality assumption is reasonable. You should perform these tests on one group at a time.

Multicollinearity occurs when one predictor variable is almost a weighted average of the others. This collinearity will only show up when the data are considered one group at a time. Forms of multicollinearity may show up when you have very small group sample sizes (when the number of observations is less than the number of variables). In this case, you must reduce the number of independent variables.
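The two sample-size guidelines above (at least twenty observations in the smallest group, and no group with fewer observations than variables) can be checked before running the analysis. This is a minimal sketch with hypothetical group sizes and a hypothetical function name; the thresholds are the rules of thumb stated in the text, not hard limits.

```python
def screen_group_sizes(group_sizes, n_predictors, min_per_group=20):
    """Flag groups that are small relative to the guidelines in the text.

    group_sizes: dict mapping group label -> number of observations.
    Returns a dict mapping each flagged group to a list of issues.
    """
    issues = {}
    for group, n in group_sizes.items():
        found = []
        if n < min_per_group:
            found.append("fewer than %d observations" % min_per_group)
        if n < n_predictors:
            found.append("fewer observations than predictors")
        if found:
            issues[group] = found
    return issues

# Hypothetical group sizes for heavy, medium and light users.
sizes = {"heavy": 45, "medium": 18, "light": 4}
report = screen_group_sizes(sizes, n_predictors=6)
for group, found in report.items():
    print(group, "->", "; ".join(found))
```

A group flagged for having fewer observations than predictors signals that the number of independent variables should be reduced.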

Rahul Sunny answered 1 year ago
