1. (a)
Transport for London is the local government body responsible for most aspects of the transport system in London. Its role is to implement the transport strategy and to manage transport services across London.
TfL launched the Night Tube service on the London Underground in August 2016. Currently, the Night Tube operates on Friday and Saturday nights across five lines (the Central, Jubilee, Northern, Piccadilly and Victoria lines). TfL management would consider expansion of the service to more lines and/or other nights of the week, subject to consumer demand.
Some 1.34 billion passenger journeys are made each year on London Underground (albeit with many individuals making multiple journeys), however there is no sampling frame with full coverage of all distinct passengers available. Despite this limitation, TfL management wants to conduct market research to assess consumer appetite for an extension to the Night Tube service, which may have fare implications. At present Night Tube fares are classified as off-peak, although a special night-off-peak fare could be introduced, depending on passengers’ price elasticities.
TfL has invited you to devise an appropriate sampling scheme to research attitudes towards developing the Night Tube. Explain in detail how each of the following sampling methods could be applied to the overall sampling strategy. Make sure you describe the merits and limitations of each as well as how each would be applied in practice.
i. Convenience sampling.
ii. Simple random sampling.
iii. Systematic sampling.
iv. Stratified sampling.
(b) Suppose we are interested in estimating the mean of a population with a finite variance using a simple random sample of size n.
i. State a suitable estimator of the population mean as well as its sampling distribution. Mention any assumptions which you make.
ii. Explain how the sampling distribution derived in i. should be interpreted.
iii. Explain how to determine the minimum sample size necessary to estimate a population mean to within e units assuming the population standard deviation is known. If the population standard deviation was unknown, how would you deal with this?
iv. Explain the purpose of the finite population correction factor (including a formula) and when it should be used.
1) Convenience Sampling - Convenience sampling is a type of sampling in which the sample is drawn from the part of the population that is close to hand or easily available. Units are chosen for ease of access rather than by any random mechanism.
Advantages-
• Easy availability - the researcher does not have to make any extra effort or go out of their way to get the data. The subjects of study are readily available close to the researcher.
• Saves time - using this sampling technique, the survey can be conducted in a very short period of time.
• Saves money - when funds are not available for more rigorous research, data can still be gathered by this method of sampling.
Disadvantages
• Biased data - the data collected can be biased because some groups in society are over-represented.
• Sampling error cannot be assessed - because selection is not random, the estimates may be far from the population values, and standard statistical theory cannot be used to quantify the sampling error.
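In practice, TfL could place interviewers at a few easily reached stations on the five Night Tube lines on a Friday or Saturday night and question whoever is willing to stop. This is quick and cheap, but it reaches only existing Night Tube users at those stations. People who would travel if the service were extended to other lines or other nights are missed.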
2) Simple Random Sampling - In this case the sample units are selected at random. A simple random sample is one in which each unit of the population has an equal chance of being included.
Advantages
• It is free from errors of classification, since units do not have to be divided into groups before selection.
• It supports inferential statistics, because the selection probabilities are known.
Disadvantages
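• It is inefficient where the population is heterogeneous in nature, because a single sample may under-represent some groups.
• It requires a complete list of the population (a sampling frame), which is sometimes difficult to obtain.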
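TfL management wants to conduct market research to assess consumer appetite for an extension to the Night Tube service, which may have fare implications. Simple random sampling would give every passenger an equal chance of selection, but it needs a sampling frame, and none with full coverage of all distinct passengers is available. It could therefore only be applied to a partial list (for example, passengers with registered travel cards, if TfL holds such a list), at the cost of excluding everyone not on that list.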
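As a rough illustration of the selection step, the sketch below draws a simple random sample from a made-up frame. The card IDs, the frame size and the function name `simple_random_sample` are hypothetical and are used only to show the mechanics. They are not TfL data.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame, each with an equal chance."""
    rng = random.Random(seed)      # a seeded generator makes the draw reproducible
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical sampling frame of 10,000 registered card IDs (illustrative only)
frame = [f"CARD-{i:05d}" for i in range(1, 10001)]
sample = simple_random_sample(frame, 200, seed=1)
print(len(sample), "passengers selected")
```

Every unit on the list has the same chance of selection, but passengers who are not on the list have no chance at all. That is the coverage problem noted in the question.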
3) Systematic Sampling - It is a type of sampling in which units are selected from an ordered population at a fixed periodic interval, starting from a randomly chosen point.
Advantages
• Easy to understand and execute - important for surveys that have a tight budget or fixed funding.
• Control and sense of process - with a fixed periodic interval, we know in advance when and which unit has to be selected next.
Disadvantages
• High risk of data manipulation - to reach a desired result, researchers might construct their own selection system instead of following the random start and fixed interval.
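In practice, interviewers could approach every k-th passenger passing through the ticket barriers at selected stations, starting from a randomly chosen passenger. This needs no sampling frame, which suits the situation here. At present Night Tube fares are classified as off-peak, so the selected passengers could also be asked about a special night off-peak fare to gauge their price elasticities.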
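The random start and fixed interval can be sketched as follows. This is a minimal illustration that assumes the units can be put in order (for example, passengers in order of arrival). The function name `systematic_sample` and the numbers used are hypothetical.

```python
import random

def systematic_sample(units, n, seed=None):
    """Select n units at a fixed interval k after a random start."""
    N = len(units)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the number of units")
    k = N // n                      # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)        # random start among the first k units
    return units[start::k][:n]      # every k-th unit from the start

# Hypothetical stream of 1,000 passengers, numbered in order of arrival
passengers = list(range(1000))
chosen = systematic_sample(passengers, 50, seed=3)
print(chosen[:5])
```

Once the start is fixed, every later selection is determined, so the interviewer has no discretion over who is approached.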
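(b) The sample mean is taken as the estimator of the population mean.
Assumptions
1. The population from which the sample is drawn follows a normal distribution. Under this assumption the sample mean is itself normally distributed, with mean equal to the population mean and variance equal to the population variance divided by n.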
If the population standard deviation is not known, we estimate it from the sample. The sample mean square is used as an estimator of the population mean square (the population variance), and its square root is used in place of the population standard deviation.
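For the sample size part of the question, the standard result with a known population standard deviation sigma is n >= (z * sigma / e)^2, where z is the standard normal critical value for the chosen confidence level. The sketch below only evaluates that formula. The function name `min_sample_size` and the numbers are illustrative assumptions, not values from the question.

```python
import math
from statistics import NormalDist

def min_sample_size(sigma, e, confidence=0.95):
    """Smallest n such that z * sigma / sqrt(n) <= e."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / e) ** 2)   # round up to a whole unit

# Illustrative values: sigma = 10, tolerance e = 2, 95% confidence
print(min_sample_size(10, 2))   # 97
```

When sigma is unknown, an estimate from a pilot sample or a previous study can be used in its place, and the resulting n treated as approximate.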
Finite Population Correction -
f.p.c. = 1 - f, where f = n/N is the sampling fraction (sample size n, population size N).
If the population size N is very large, or if n is small compared with N, then f = n/N tends to zero and consequently the f.p.c. tends to 1.
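The f.p.c. multiplies the variance of the sample mean when sampling without replacement from a finite population. It is needed when the sample is a non-negligible fraction of the population (a common rule of thumb is n/N above about 5%), because without it the standard error of the estimate will be too big.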
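The effect of the correction on the standard error can be sketched as below. This assumes the variance of the sample mean takes the form (1 - f) * sigma^2 / n, with the population variance defined using divisor N - 1. The function name `standard_error` and the numbers are illustrative only.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the sample mean, with optional f.p.c."""
    se = sigma / math.sqrt(n)        # uncorrected (infinite population) form
    if N is not None:
        f = n / N                    # sampling fraction
        se *= math.sqrt(1 - f)       # apply the finite population correction
    return se

print(standard_error(10, 100))              # no correction
print(standard_error(10, 100, N=400))       # f = 0.25, noticeably smaller
print(standard_error(10, 100, N=1_000_000)) # f near 0, almost unchanged
```

With a sampling fraction of 0.25 the corrected standard error is visibly smaller than the uncorrected one, while for a very large N the two are practically equal.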