The Programme for International Student Assessment (PISA) is a worldwide study by the Organisation for Economic Co-operation and Development (OECD) in member and non-member nations intended to evaluate education systems by measuring 15-year-old school pupils' scholastic performance in Mathematics, Science and Reading. Refer to the Math_and_gender dataset. The dataset contains the mean PISA Math scores for samples of 15-year-old male and female students from a number of randomly selected schools in each of various nations. You are to perform a regression analysis using mean male score as the explanatory variable and mean female score as the response variable. Perform the following:
Male mean | Female mean |
519 | 509 |
506 | 486 |
526 | 504 |
533 | 521 |
431 | 410 |
495 | 490 |
511 | 495 |
516 | 508 |
542 | 539 |
505 | 489 |
520 | 505 |
473 | 459 |
496 | 484 |
508 | 505 |
491 | 483 |
451 | 443 |
490 | 475 |
534 | 524 |
548 | 544 |
499 | 479 |
425 | 412 |
534 | 517 |
523 | 515 |
500 | 495 |
497 | 493 |
493 | 481 |
498 | 495 |
502 | 501 |
493 | 474 |
493 | 495 |
544 | 524 |
451 | 440 |
503 | 482 |
497 | 477 |
372 | 383 |
394 | 383 |
435 | 427 |
394 | 379 |
426 | 430 |
398 | 366 |
465 | 454 |
454 | 451 |
561 | 547 |
371 | 372 |
386 | 387 |
405 | 405 |
328 | 334 |
483 | 481 |
547 | 523 |
474 | 480 |
531 | 520 |
408 | 396 |
362 | 357 |
374 | 356 |
366 | 371 |
429 | 425 |
469 | 467 |
448 | 437 |
599 | 601 |
565 | 559 |
546 | 541 |
421 | 417 |
410 | 418 |
378 | 366 |
433 | 421 |
(f) Produce a histogram of the residuals.
(h) Identify two countries in which the female students perform much worse than predicted based on the country's male mean score.
The histogram is plotted using R.
Answer (f): Let y = female mean and x = male mean. The regression equation of y on x is $\hat{y} = b_0 + b_1 x$, where $b_1$ is the slope of the line and $b_0$ is the intercept.

For each of the 65 countries, the deviations $x - \bar{x}$ and $y - \bar{y}$, their product, and the squared deviation $(x - \bar{x})^2$ are computed. The column totals and means are:

Quantity | Value |
$\sum x$ | 30679.00 |
$\sum y$ | 30107.00 |
$\sum (x - \bar{x})$ | 0.00 |
$\sum (y - \bar{y})$ | 0.00 |
$\sum (x - \bar{x})(y - \bar{y})$ | 226829.18 |
$\sum (x - \bar{x})^2$ | 235484.98 |
$\bar{x}$ | 471.98 |
$\bar{y}$ | 463.18 |

By the least squares method, the regression coefficient of y on x is

$$b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{226829.18}{235484.98} \approx 0.96$$

The slope gives the change in the dependent variable (here, the female mean score) for a one-unit change in the independent variable (the male mean score).

Similarly, the intercept is

$$b_0 = \bar{y} - b_1 \bar{x} = 463.18 - b_1 \times 471.98 \approx 8.55$$

where the unrounded slope is carried through the calculation. So the required regression equation is

$$\hat{y} = 8.55 + 0.96x$$

The fitted values from this equation and the residuals (observed y minus fitted value) are given in the following table:

No. | Male mean (x) | Female mean (y) | Fitted value | Residual |
1 | 519 | 509 | 508.4718 | 0.53 |
2 | 506 | 486 | 495.9497 | -9.95 |
3 | 526 | 504 | 515.2145 | -11.21 |
4 | 533 | 521 | 521.9572 | -0.96 |
5 | 431 | 410 | 423.7065 | -13.71 |
6 | 495 | 490 | 485.3540 | 4.65 |
7 | 511 | 495 | 500.7659 | -5.77 |
8 | 516 | 508 | 505.5821 | 2.42 |
9 | 542 | 539 | 530.6264 | 8.37 |
10 | 505 | 489 | 494.9864 | -5.99 |
11 | 520 | 505 | 509.4351 | -4.44 |
12 | 473 | 459 | 464.1627 | -5.16 |
13 | 496 | 484 | 486.3173 | -2.32 |
14 | 508 | 505 | 497.8762 | 7.12 |
15 | 491 | 483 | 481.5010 | 1.50 |
16 | 451 | 443 | 442.9713 | 0.03 |
17 | 490 | 475 | 480.5378 | -5.54 |
18 | 534 | 524 | 522.9205 | 1.08 |
19 | 548 | 544 | 536.4059 | 7.59 |
20 | 499 | 479 | 489.2071 | -10.21 |
21 | 425 | 412 | 417.9270 | -5.93 |
22 | 534 | 517 | 522.9205 | -5.92 |
23 | 523 | 515 | 512.3248 | 2.68 |
24 | 500 | 495 | 490.1702 | 4.83 |
25 | 497 | 493 | 487.2805 | 5.72 |
26 | 493 | 481 | 483.4275 | -2.43 |
27 | 498 | 495 | 488.2437 | 6.76 |
28 | 502 | 501 | 492.0967 | 8.90 |
29 | 493 | 474 | 483.4275 | -9.43 |
30 | 493 | 495 | 483.4275 | 11.57 |
31 | 544 | 524 | 532.5529 | -8.55 |
32 | 451 | 440 | 442.9713 | -2.97 |
33 | 503 | 482 | 493.0600 | -11.06 |
34 | 497 | 477 | 487.2805 | -10.28 |
35 | 372 | 383 | 366.8752 | 16.12 |
36 | 394 | 383 | 388.0665 | -5.07 |
37 | 435 | 427 | 427.5595 | -0.56 |
38 | 394 | 379 | 388.0665 | -9.07 |
39 | 426 | 430 | 418.8903 | 11.11 |
40 | 398 | 366 | 391.9195 | -25.92 |
41 | 465 | 454 | 456.4567 | -2.46 |
42 | 454 | 451 | 445.8611 | 5.14 |
43 | 561 | 547 | 548.9280 | -1.93 |
44 | 371 | 372 | 365.9119 | 6.09 |
45 | 386 | 387 | 380.3606 | 6.64 |
46 | 405 | 405 | 398.6622 | 6.34 |
47 | 328 | 334 | 324.4925 | 9.51 |
48 | 483 | 481 | 473.7951 | 7.20 |
49 | 547 | 523 | 535.4426 | -12.44 |
50 | 474 | 480 | 465.1259 | 14.87 |
51 | 531 | 520 | 520.0308 | -0.03 |
52 | 408 | 396 | 401.5519 | -5.55 |
53 | 362 | 357 | 357.2427 | -0.24 |
54 | 374 | 356 | 368.8017 | -12.80 |
55 | 366 | 371 | 361.0957 | 9.90 |
56 | 429 | 425 | 421.7800 | 3.22 |
57 | 469 | 467 | 460.3097 | 6.69 |
58 | 448 | 437 | 440.0816 | -3.08 |
59 | 599 | 601 | 585.5313 | 15.47 |
60 | 565 | 559 | 552.7810 | 6.22 |
61 | 546 | 541 | 534.4794 | 6.52 |
62 | 421 | 417 | 414.0741 | 2.93 |
63 | 410 | 418 | 403.4784 | 14.52 |
64 | 378 | 366 | 372.6546 | -6.65 |
65 | 433 | 421 | 425.6330 | -4.63 |

[Figure: Histogram of Residuals. x-axis: Residuals (roughly -30 to 20); y-axis: Frequency.]

Answer (h): The residuals for country number 40 and country number 5 are the most negative (-25.92 and -13.71, respectively), i.e. the observed female mean falls furthest below the predicted value for country number 40, followed by country number 5. Hence, in these two countries the female students perform much worse than predicted based on the country's male mean score.
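The calculation above can be checked with a short script. The following is a minimal sketch in Python using only the standard library (the original answer used R, so this is not the author's code). The variable names and the residual bin width of 5 are illustrative assumptions, so the printed text histogram may not match the bins in the original figure.

```python
import math

# Mean PISA Math scores per country, in the order of the question's table.
male = [519, 506, 526, 533, 431, 495, 511, 516, 542, 505, 520, 473, 496,
        508, 491, 451, 490, 534, 548, 499, 425, 534, 523, 500, 497, 493,
        498, 502, 493, 493, 544, 451, 503, 497, 372, 394, 435, 394, 426,
        398, 465, 454, 561, 371, 386, 405, 328, 483, 547, 474, 531, 408,
        362, 374, 366, 429, 469, 448, 599, 565, 546, 421, 410, 378, 433]
female = [509, 486, 504, 521, 410, 490, 495, 508, 539, 489, 505, 459, 484,
          505, 483, 443, 475, 524, 544, 479, 412, 517, 515, 495, 493, 481,
          495, 501, 474, 495, 524, 440, 482, 477, 383, 383, 427, 379, 430,
          366, 454, 451, 547, 372, 387, 405, 334, 481, 523, 480, 520, 396,
          357, 356, 371, 425, 467, 437, 601, 559, 541, 417, 418, 366, 421]

n = len(male)
x_bar = sum(male) / n
y_bar = sum(female) / n

# Least squares slope and intercept for the regression of y on x.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(male, female))
s_xx = sum((x - x_bar) ** 2 for x in male)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Fitted values and residuals (observed minus fitted).
fitted = [b0 + b1 * x for x in male]
residuals = [y - f for y, f in zip(female, fitted)]
print(f"y_hat = {b0:.2f} + {b1:.2f} * x")

# Part (f): text histogram of the residuals.
# The bin width of 5 is an assumption, not taken from the source figure.
bin_width = 5
counts = {}
for r in residuals:
    lower = math.floor(r / bin_width) * bin_width
    counts[lower] = counts.get(lower, 0) + 1
for lower in sorted(counts):
    print(f"[{lower:>4}, {lower + bin_width:>4}) {'#' * counts[lower]}")

# Part (h): the two countries with the most negative residuals,
# reported as 1-based row numbers from the table.
order = sorted(range(n), key=lambda i: residuals[i])
worst_countries = [i + 1 for i in order[:2]]
print("Most negative residuals at countries:", worst_countries)
```

Because the slope is kept unrounded when the intercept is computed, the script reproduces the intercept of about 8.55; substituting the rounded slope 0.96 into the intercept formula would give a noticeably different value.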