Question

Question 1

  1. Data collected (including the normalized values) by a riding mower manufacturer on its existing customers are listed below.

    Customer ID   Income   Lot_Size   Normalized Income   Normalized Lot_Size   Ownership
    1             61.5     20.8       -0.351              0.762                 Owner
    2             82.8     22.4       0.726               1.421                 Owner
    3             52.8     20.8       -0.790              0.762                 Nonowner
    4             84       17.6       0.786               -0.556                Nonowner
    5             63       14.8       -0.275              -1.709                Nonowner

    The riding mower manufacturer wants to classify a new customer (data given below) as either an owner or a nonowner using the k-Nearest Neighbors method.

    Income   Lot_Size   Normalized Income   Normalized Lot_Size
    81       20         0.635               0.532

    Round your answers to 3 digits after the decimal point.

    The Euclidean distance between the new customer and Customer 1 is?

    The Euclidean distance between the new customer and Customer 2 is?

    The Euclidean distance between the new customer and Customer 3 is?

    The Euclidean distance between the new customer and Customer 4 is?

    The Euclidean distance between the new customer and Customer 5 is?

    If k is set to 3, should the new customer be classified as an owner or a nonowner?

Solutions

Expert Solution

Let X denote the features and y the target variable. The columns 'Income', 'Lot_Size', 'Normalized Income' and 'Normalized Lot_Size' are the feature columns Xi, and the column 'Ownership' is y. 'Customer ID' is only a row label and does not enter the distance calculation.

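The Euclidean distance between a row X of the data set and the new data point x is calculated using the formula

$$d(X, x) = \sqrt{\sum_{i=1}^{n} (X_i - x_i)^2}$$

where Xi are the column values of each row in the data set, xi are the corresponding column values of the new data, and n is the number of feature columns.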
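The distance formula described above can be sketched in Python as follows. This is a minimal illustration, not part of the original solution; the function name `euclidean_distance` is chosen here for the example.

```python
import math

def euclidean_distance(row, new_point):
    """Return the Euclidean distance between two equal-length numeric sequences."""
    if len(row) != len(new_point):
        raise ValueError("both points must have the same number of features")
    # Sum the squared column-wise differences, then take the square root.
    return math.sqrt(sum((X_i - x_i) ** 2 for X_i, x_i in zip(row, new_point)))

# Sanity check on a 3-4-5 right triangle.
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
```

The same function works for any number of feature columns, as long as both points list their columns in the same order.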

Given, xi = {81, 20, 0.635, 0.432} corresponding to Income, Lot_size, Normalized income, Normalized Lot_size respectively of the new customer.

We calculate the Euclidean distance of the new data point xi as shown below:

Cust. ID   Income   Lot_size   Normalized Income   Normalized Lot_size   Distance
1          61.5     20.8       -0.351              0.762                 19.544
2          82.8     22.4       0.726               1.421                 3.160
3          52.8     20.8       -0.790              0.762                 28.249
4          84       17.6       0.786               -0.556                3.969
5          63       14.8       -0.275              -1.709                18.879

Arranging the customers by their Euclidean distances in ascending order, we get:

Customer ID Euclidean Distance
2 3.160
4 3.969
5 18.879
1 19.544
3 28.249
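The ranking above can be reproduced with a short Python sketch. It assumes, as the solution does, that all four columns (raw and normalized) enter the distance and that the new customer's Normalized Lot_Size is 0.432 (the question lists 0.532). The variable names are illustrative.

```python
import math

# Feature rows from the customer table:
# (Income, Lot_Size, Normalized Income, Normalized Lot_Size)
customers = {
    1: (61.5, 20.8, -0.351, 0.762),
    2: (82.8, 22.4, 0.726, 1.421),
    3: (52.8, 20.8, -0.790, 0.762),
    4: (84.0, 17.6, 0.786, -0.556),
    5: (63.0, 14.8, -0.275, -1.709),
}

# Assumption: new-customer values as used in the solution (0.432, not 0.532).
new_customer = (81.0, 20.0, 0.635, 0.432)

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length tuples of numbers."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

distances = {cid: euclidean_distance(f, new_customer) for cid, f in customers.items()}

# Sort customer IDs by distance, nearest first.
ranking = sorted(distances.items(), key=lambda item: item[1])
for cid, d in ranking:
    print(f"{cid}  {d:.3f}")
```

Under these assumptions the computed order is 2, 4, 5, 1, 3, and each distance agrees with the listed value to within 0.001. The last digit can differ depending on whether values are rounded or truncated.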

Since k = 3 we have to consider 3 nearest neighbors to classify the new data point as owner or nonowner.

Customers 2, 4 and 5 are the 3 nearest neighbors of the new data point (the red circle in the plot).

Of these three neighbors, two (Customers 4 and 5) are purple circles, that is, nonowners, and one (Customer 2) is an owner. By majority vote, the new data point is therefore treated as a purple circle, or nonowner.

Thus, for k = 3, we classify the new customer with Income = 81, Lot_size = 20, Normalized income = 0.635 and Normalized Lot_size = 0.432 as a nonowner.
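The majority-vote step can be sketched in Python as below. This is an illustrative sketch rather than the solution's own code. The helper `knn_classify` and its `columns` parameter are hypothetical names introduced here, and the new customer's values are taken as stated in the solution (Normalized Lot_Size = 0.432). The `columns` argument selects which feature columns enter the distance, so the effect of that choice can be checked.

```python
import math
from collections import Counter

# (customer_id, Income, Lot_Size, Normalized Income, Normalized Lot_Size, Ownership)
customers = [
    (1, 61.5, 20.8, -0.351, 0.762, "Owner"),
    (2, 82.8, 22.4, 0.726, 1.421, "Owner"),
    (3, 52.8, 20.8, -0.790, 0.762, "Nonowner"),
    (4, 84.0, 17.6, 0.786, -0.556, "Nonowner"),
    (5, 63.0, 14.8, -0.275, -1.709, "Nonowner"),
]

# Assumption: new-customer values as used in the solution.
new_customer = (81.0, 20.0, 0.635, 0.432)

def knn_classify(rows, query, k, columns):
    """Majority vote among the k rows nearest to query.

    columns lists the indices of the feature columns used in the distance.
    Returns (predicted_label, ids_of_the_k_nearest_rows).
    """
    scored = []
    for cid, *rest in rows:
        features, label = rest[:-1], rest[-1]
        d = math.sqrt(sum((features[i] - query[i]) ** 2 for i in columns))
        scored.append((d, cid, label))
    scored.sort()  # nearest first
    nearest = scored[:k]
    votes = Counter(label for _, _, label in nearest)
    return votes.most_common(1)[0][0], [cid for _, cid, _ in nearest]

# All four columns, as in the solution above.
print(knn_classify(customers, new_customer, k=3, columns=(0, 1, 2, 3)))

# Only the two normalized columns.
print(knn_classify(customers, new_customer, k=3, columns=(2, 3)))
```

With all four columns the neighbors are Customers 2, 4 and 5 and the vote is Nonowner, matching the solution. The choice of columns matters, though: raw Income dominates the four-column distance. With only the two normalized columns the neighbors become Customers 2, 4 and 1 and the vote is Owner.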

