Question

Briefly describe the cross-validation procedure used in Kriging.

Solutions

Expert Solution

The term "Cross Validation" seems to have been introduced into Geostatistical applications around the late 1970s, although the concept of comparing actual values with estimates is far older (cf. Krige 1959). David's Geostatistical Ore Reserve Estimation (1977, p. 56) gives a fully worked example of comparing estimates from two different estimation methods with the "true" values from sampled areas. The purpose in this example is to show that the Kriging estimator gives a smaller error variance than an Inverse Distance Squared method. He suggests comparing the histograms of the two sets of errors, in addition to their respective means and standard deviations.

By 1979, Parker et al. were using the term "cross-validation" for a check that their method of prediction was the correct one. In that case, the variable of interest was the proportion of mineralised composites in a uranium deposit. In the same volume, Davis & Borgman mention "crossvalidation" as a procedure available to check the validity of a semi-variogram model, and Rendu uses comparison of theoretical and observed means and errors to decide between kriging methods, as does Clark. In three out of four studies, therefore, the purpose of the cross validation was to justify the kriging technique chosen to perform the eventual evaluation.

This method of cross-checking a technique seems to have been welcomed by workers seeking a method of reducing the amount of subjectivity in Geostatistical estimation. By 1983, the NATO ASI on Geostatistics contained almost a dozen papers which referred to cross validation as a method of testing the fit of the semi-variogram model to the data. The interest in the problem is reflected, also, by the number of papers on "robust" estimators and statistical fitting procedures. However, these are outside the scope of the present paper.

Historically, then, Cross Validation has grown from a virtually unknown technique in the mid-1970s to a routine tool in the Geostatistician's armoury. In addition to published papers, it is now common practice amongst consultants to include a chapter in their reports justifying the choice of semi-variogram model and (sometimes) the kriging technique selected for estimation purposes.

What is Cross Validation?

The term "cross validation" is now generally accepted as describing the following procedure:

- One sample is eliminated from the data set.

- The surrounding samples are used to produce an estimate of the value at this (now) "unsampled" location, using a Geostatistical estimation method.

- The actual error incurred in this process is measured by: (Actual Value - Estimated Value)

- The "expected" or "theoretical" error is measured by the kriging variance calculated during the estimation process (or by its square root, the kriging standard error).

- The sample is returned to the data set, and the steps above are repeated for each of the other samples in turn.
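The steps above can be sketched in Python as a leave-one-out loop. This is only an illustrative sketch, not the method of any of the authors cited: it assumes ordinary kriging, an exponential semi-variogram with made-up parameters, and the use of all remaining samples (rather than a local search neighbourhood) for each estimate. The names exp_variogram, ordinary_krige and cross_validate are hypothetical.

```python
import numpy as np

def exp_variogram(h, nugget=0.0, sill=1.0, practical_range=10.0):
    """Exponential semi-variogram; model and parameters are assumptions."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / practical_range))
    return np.where(h > 0.0, gamma, 0.0)  # gamma(0) = 0 by definition

def ordinary_krige(coords, values, target, variogram):
    """Ordinary kriging estimate and kriging variance at one target point."""
    n = len(values)
    # Semi-variogram between every pair of samples, bordered by the
    # unbiasedness constraint (weights sum to one).
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(dists)
    A[n, n] = 0.0
    # Semi-variogram between each sample and the target.
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(coords - target, axis=1))
    solution = np.linalg.solve(A, b)
    weights, lagrange = solution[:n], solution[n]
    estimate = weights @ values
    kriging_variance = weights @ b[:n] + lagrange
    return estimate, kriging_variance

def cross_validate(coords, values, variogram):
    """Leave each sample out in turn; return actual errors and kriging variances."""
    n = len(values)
    actual_errors = np.empty(n)
    kriging_variances = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # eliminate sample i
        estimate, variance = ordinary_krige(
            coords[keep], values[keep], coords[i], variogram
        )
        actual_errors[i] = values[i] - estimate  # Actual Value - Estimated Value
        kriging_variances[i] = variance          # "theoretical" error
    return actual_errors, kriging_variances
```

Calling cross_validate(coords, values, exp_variogram), with coords an (n, 2) array of sample locations and values the n sample values, returns the two lists of actual and theoretical errors.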

The procedure produces a list of actual and theoretical errors. At this point, however, authors diverge on what should actually be done with this list.

The most common procedure, judging by the literature, is as follows. The actual errors are averaged. If the estimation is unbiased, this average should be zero. The variance of the errors is calculated and compared with the average kriging variance for all the estimations. The ratio between these two quantities is expected to be one, if the estimation procedure has been carried out correctly.
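As a rough sketch of this summary, the two statistics could be computed as below. The function name summarise_errors is hypothetical, and the use of the sample variance (divisor n - 1) is an assumption, since the text does not say which variance estimator is intended.

```python
import numpy as np

def summarise_errors(actual_errors, kriging_variances):
    """Return (mean error, ratio of error variance to mean kriging variance)."""
    actual_errors = np.asarray(actual_errors, dtype=float)
    kriging_variances = np.asarray(kriging_variances, dtype=float)
    mean_error = actual_errors.mean()            # expected to be near zero
    error_variance = actual_errors.var(ddof=1)   # assumed: sample variance
    variance_ratio = error_variance / kriging_variances.mean()  # expected near one
    return mean_error, variance_ratio
```

Here actual_errors and kriging_variances are the two lists produced by the cross validation, one entry per sample.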

A minor variation on this process was used by Clark (op cit) to take account of different standard errors where data are not taken on a regular grid. Each "actual error" is divided by the appropriate "theoretical standard error" to form a standardised (Z) statistic. These statistics should then average zero and have a standard deviation of one.
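A minimal sketch of this standardised variant follows. The function name standardised_errors is hypothetical, and the divisor n - 1 in the standard deviation is again an assumption.

```python
import numpy as np

def standardised_errors(actual_errors, kriging_variances):
    """Return (mean, standard deviation) of the standardised (Z) statistics."""
    actual_errors = np.asarray(actual_errors, dtype=float)
    kriging_std_errors = np.sqrt(np.asarray(kriging_variances, dtype=float))
    z = actual_errors / kriging_std_errors  # one Z statistic per sample
    return z.mean(), z.std(ddof=1)          # expected near 0 and 1
```

Because each error is scaled by its own kriging standard error, samples estimated from sparse surroundings and samples estimated from dense surroundings contribute on the same footing.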

In all cases, then, the actual error is compared with the expected error in such a way that two statistics are produced. These are expected to be zero and one respectively. Achieving (0.0, 1.0) becomes the "proof" that the original semi-variogram model "fits" the data. The logic which produces this conclusion is:

The correct model gives (0,1)

I get (0,1),

therefore the model is correct

It is with this logic that this paper concerns itself.

