Question

Question 1 :

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

For a voice recognition learning problem determine the possible:

task T
performance measure P
and training experience E

Question 2 :

Explain what the curse of dimensionality is. What is the relation between machine learning problems and the curse of dimensionality?

Question 3 :

The following training dataset is the “reading email” dataset.

This dataset has four features: author, thread, length, and where to read the mail. Based on these features, the algorithm has to predict the user’s action: whether to read or skip the mail.

Use a Naïve Bayes classifier to predict the user’s action (skips or reads) when the author of the mail is known, the thread of the mail is follow up, the length of the mail is short, and where to read the email is home.

Author	Thread	Length	Where to read	User’s Action
Known	New	Long	Home	Skips
Unknown	New	Short	Work	Reads
Unknown	Follow up	Long	Work	Skips
Known	Follow up	Long	Home	Skips
Known	New	Short	Home	Reads
Known	Follow up	Long	Work	Skips
Unknown	New	Short	Work	Skips
Unknown	New	Short	Work	Reads
Known	Follow up	Long	Home	Skips
Known	New	Long	Work	Skips
Unknown	Follow up	Short	Home	Skips
Known	New	Long	Work	Skips
Known	Follow up	Short	Home	Reads
Known	New	Short	Work	Reads
Known	New	Short	Home	Reads
Known	Follow up	Short	Work	Reads
Known	New	Short	Home	Reads
Unknown	New	Short	Work	Reads

(35 points) Write Python code to implement a naïve Bayesian classifier to predict the user’s action (skips or reads) when the author of the mail is known, the thread of the mail is follow up, the length of the mail is short, and where to read the email is home. (Do not use Scikit-Learn)
(35 points) Use Scikit-Learn to predict the user’s action (skips or reads) when the author of the mail is known, the thread of the mail is follow up, the length of the mail is short, and where to read the email is home.

Hint: In the author feature, you can use 0 and 1 instead of unknown and known. In the thread feature, you can use 0 and 1 instead of follow up and new. In the length feature, you can use 0 and 1 instead of short and long. In the where-to-read feature, you can use 0 and 1 instead of home and work. In the target, you can use 0 instead of skips and 1 instead of reads.

Expert Solution

1) For a voice recognition learning problem,

Task: Convert input speech to text
Performance measure: Percentage of the tested speech commands that are converted to text correctly
Experience: Speech data paired with its text representation (mappings from speech signals to the correct textual representation of the spoken command).
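
As a rough illustration of the performance measure P above, the following Python sketch computes the percentage of commands transcribed correctly. The function name `command_accuracy`, the exact-match comparison (ignoring case and surrounding whitespace), and the sample transcripts are all assumptions made up for illustration, not part of the original answer.

```python
def command_accuracy(predicted, reference):
    """Return the percentage of commands whose transcript matches exactly.

    Matching ignores case and surrounding whitespace (an assumption).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one command is required")
    correct = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predicted, reference)
    )
    return 100.0 * correct / len(reference)


# Hypothetical transcripts, made up for illustration only.
reference = ["turn on the lights", "play music", "set a timer", "call home"]
predicted = ["turn on the lights", "play music", "set the timer", "call home"]

accuracy = command_accuracy(predicted, reference)
print(f"P = {accuracy:.1f}% of commands transcribed correctly")
```

A real speech system would more likely report a word-level error rate, but a command-level percentage matches the measure as stated here.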

2) The curse of dimensionality can be summarised as: "As the dimensionality of the feature space increases, the number of configurations can grow exponentially, and hence, the number of configurations covered by an observation also decreases." We can explain it with the help of Hughes' phenomenon, which states that, for a fixed number of training samples, a classifier's performance keeps improving as features are added, but only up to a particular number of features. Beyond that limit, adding more features degrades the performance of the classifier.

The Euclidean distance between two n-dimensional vectors with Cartesian coordinates p = (p1, p2, …, pn) and q = (q1, q2, …, qn) is computed using the distance formula:

$$d(p, q) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}$$
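
A minimal Python sketch of this distance computation, assuming the standard Euclidean definition (the square root of the summed squared coordinate differences). The function name `euclidean_distance` is chosen here for illustration.

```python
import math


def euclidean_distance(p, q):
    """Euclidean distance between two vectors of equal length n."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of dimensions")
    # One squared difference per dimension, so the work grows with n.
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))


print(euclidean_distance((0, 0), (3, 4)))        # 2 dimensions
print(euclidean_distance((1, 2, 3), (4, 6, 3)))  # 3 dimensions
```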

Notice that as n increases, so does the number of coordinate differences required to calculate the distance. More importantly, the feature space itself grows much faster.
Consider this example, assuming each feature can take 10 distinct values:
- With 1 dimension, there are only 10 possible configurations for the observations to cover.
- When we increase the dimensionality to 2, there are 10x10 = 100 possible configurations.
- When we up the number of dimensions to 3, there are 10x10x10 = 1000 possible configurations.

Thus, when we have more dimensions, the number of configurations, and with it the amount of data needed to cover the feature space, goes up exponentially!
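
The growth in this example can be checked with a few lines of Python. This is only a sketch: it assumes every feature takes the same number of distinct values, and the helper names `num_configurations` and `max_coverage` are made up for illustration.

```python
def num_configurations(values_per_feature, n_features):
    """Number of distinct feature combinations in the feature space."""
    return values_per_feature ** n_features


def max_coverage(n_observations, values_per_feature, n_features):
    """Upper bound on the fraction of configurations the data can cover."""
    total = num_configurations(values_per_feature, n_features)
    return min(1.0, n_observations / total)


# With a fixed budget of 1000 observations, coverage collapses as the
# number of features grows.
for d in (1, 2, 3, 6):
    print(d, num_configurations(10, d), max_coverage(1000, 10, d))
```

With 10 values per feature, 1000 observations can cover at most the whole space up to 3 features, but no more than 0.1% of it at 6 features.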

We can draw the following conclusions based on the above:

- As the distance between the various observations (rows of X, the dataset) increases, supervised machine learning becomes increasingly difficult, because predictions for incoming data are likely to be based on similar data the classifier has previously been trained on.
- The number of unique sets of observations grows exponentially with the number of features.
- Hence, detecting patterns within the data with the same model becomes exponentially harder, and the model's ability to generalise is hampered.
- The variance of the data increases as the dimensionality increases, so the model is more exposed to noise, which may mask useful features. This again hampers the model's ability to generalise.
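
One way to see why "similar data" becomes hard to find is to measure how much nearer the nearest point is than the farthest one. The sketch below assumes points drawn uniformly at random from the unit hypercube, which is an illustrative assumption and not something the answer specifies. The name `relative_contrast` is likewise hypothetical.

```python
import math
import random


def relative_contrast(n_points, n_dims, rng):
    """(max - min) / min distance from a random query to random points."""
    query = [rng.random() for _ in range(n_dims)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        distances.append(math.dist(query, point))
    return (max(distances) - min(distances)) / min(distances)


rng = random.Random(0)  # fixed seed so the run is repeatable
contrasts = {d: relative_contrast(200, d, rng) for d in (2, 10, 100, 1000)}
for d, c in contrasts.items():
    print(f"{d:>4} dimensions: relative contrast = {c:.2f}")
```

As the number of dimensions grows, the contrast shrinks toward zero: the nearest and farthest points end up almost equally far away, so "nearest" neighbours carry less information.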

(Please post question 3 as a separate question.)
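
For question 3, a minimal from-scratch sketch of the Naïve Bayes computation might look like the following. It is a sketch under stated assumptions, not a graded solution: it keeps the string labels from the table (lowercased) rather than the 0/1 encoding suggested in the hint, it uses plain frequency estimates with no smoothing, and the names `train` and `posterior` are chosen here for illustration.

```python
from collections import Counter, defaultdict

# Rows of the "reading email" table: (author, thread, length, where, action).
ROWS = [
    ("known", "new", "long", "home", "skips"),
    ("unknown", "new", "short", "work", "reads"),
    ("unknown", "follow up", "long", "work", "skips"),
    ("known", "follow up", "long", "home", "skips"),
    ("known", "new", "short", "home", "reads"),
    ("known", "follow up", "long", "work", "skips"),
    ("unknown", "new", "short", "work", "skips"),
    ("unknown", "new", "short", "work", "reads"),
    ("known", "follow up", "long", "home", "skips"),
    ("known", "new", "long", "work", "skips"),
    ("unknown", "follow up", "short", "home", "skips"),
    ("known", "new", "long", "work", "skips"),
    ("known", "follow up", "short", "home", "reads"),
    ("known", "new", "short", "work", "reads"),
    ("known", "new", "short", "home", "reads"),
    ("known", "follow up", "short", "work", "reads"),
    ("known", "new", "short", "home", "reads"),
    ("unknown", "new", "short", "work", "reads"),
]


def train(rows):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(row[-1] for row in rows)
    # feature_counts[label][feature_index][value] = number of matching rows
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    for *features, label in rows:
        for i, value in enumerate(features):
            feature_counts[label][i][value] += 1
    return class_counts, feature_counts


def posterior(query, class_counts, feature_counts):
    """Normalised P(label | query) under the naive independence assumption."""
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(label)
        for i, value in enumerate(query):
            # Likelihood P(feature_i = value | label), no smoothing (assumption).
            score *= feature_counts[label][i][value] / count
        scores[label] = score
    norm = sum(scores.values())
    return {label: score / norm for label, score in scores.items()}


class_counts, feature_counts = train(ROWS)
probs = posterior(("known", "follow up", "short", "home"),
                  class_counts, feature_counts)
prediction = max(probs, key=probs.get)
print(prediction, {label: round(p, 3) for label, p in probs.items()})
```

On this table the two classes are balanced (9 reads, 9 skips), and the unnormalised scores are proportional to 6·2·9·4 = 432 for reads and 6·5·2·4 = 240 for skips, so the sketch predicts reads with probability 432/672 ≈ 0.643.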

venereology answered 1 year ago
