In: Computer Science
1. Define the classification problem
2. What is the main difference between Simple Matching Coefficient (SMC) Similarity and Jaccard Similarity?
3. Explain in your own words how the Decision Tree Classifier works.
4. Explain in your own words how the SVM Classifier works.
1. Define the classification problem.
---> The classification problem arises because, for many real-world objects and systems, coming up with an iron-clad classification scheme (to determine whether an object is a member of a set or not, or which of several sets it belongs to) is difficult.
---> Classification is a central topic in machine learning that has to do with teaching machines how to group together data by particular criteria.
---> Classification is the process where computers group data together based on predetermined characteristics; this is called supervised learning.
---> There is an unsupervised version of classification, called clustering, in which computers find shared characteristics by which to group data when categories are not specified.
---> A common example of classification comes with detecting spam emails.
---> To write a program to filter out spam emails, a computer programmer can train a machine learning algorithm with a set of spam-like emails labelled as spam and regular emails labelled as not-spam.
---> The idea is to make an algorithm that can learn characteristics of spam emails from this training set so that it can filter out spam emails when it encounters new emails.
---> Classification is an important tool in today’s world, where big data is used to make all kinds of decisions in government, economics, medicine, and more.
---> Researchers have access to huge amounts of data, and classification is one tool that helps them to make sense of the data and find patterns.
---> While classification in machine learning requires the use of (sometimes) complex algorithms, classification is something that humans do naturally every day.
---> Classification is simply grouping things together according to similar features and attributes.
---> When you go to a grocery store, you can fairly accurately group the foods by food group (grains, fruit, vegetables, meat, etc.). In machine learning, classification is all about teaching computers to do the same.
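The spam-filter idea described above can be sketched as a toy supervised classifier. The training emails, their words, and the word-overlap scoring rule below are all invented for illustration; real spam filters use far richer features and models.

```python
# Toy supervised classification: label a new email as spam / not-spam
# by word overlap with a small labeled training set (invented data).

training = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "not-spam"),
    ("lunch with the project team", "not-spam"),
]

# Collect the set of words seen under each label.
vocab = {}
for text, label in training:
    vocab.setdefault(label, set()).update(text.split())

def classify(email):
    """Pick the label whose training vocabulary overlaps the email most."""
    words = set(email.split())
    return max(vocab, key=lambda label: len(words & vocab[label]))

print(classify("claim your free prize"))
print(classify("monday team meeting"))
```

The "learning" here is just memorizing word sets per label, but it mirrors the structure of the spam example: labeled training data in, a decision rule out.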
2. What is the main difference between Simple Matching Coefficient (SMC) Similarity and Jaccard Similarity?
Simple matching coefficient:
--------------------------------------
The simple matching coefficient (SMC) or Rand similarity coefficient is a statistic used for comparing the similarity and diversity of sample sets.
Given two objects, A and B, each with n binary attributes, SMC is defined as:
SMC = (number of matching attributes) / (number of attributes)
    = (M00 + M11) / (M00 + M01 + M10 + M11)
where:
M11 is the total number of attributes where A and B both have a value of 1.
M01 is the total number of attributes where the attribute of A is 0 and the attribute of B is 1.
M10 is the total number of attributes where the attribute of A is 1 and the attribute of B is 0.
M00 is the total number of attributes where A and B both have a value of 0.
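Following the definition above, here is a minimal sketch computing the SMC for two equal-length binary attribute vectors (the example vectors are made up for illustration):

```python
def smc(a, b):
    """Simple matching coefficient of two equal-length binary vectors."""
    assert len(a) == len(b)
    # Positions where the vectors agree account for both M00 and M11.
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)

a = [1, 0, 1, 1, 0, 0]
b = [1, 1, 1, 0, 0, 0]
print(smc(a, b))  # 4 matching positions out of 6
```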
Jaccard index:
-------------------
The Jaccard index, also known as Intersection over Union and the Jaccard similarity coefficient (originally given the French name coefficient de communauté by Paul Jaccard), is a statistic used for gauging the similarity and diversity of sample sets.
---> The Jaccard coefficient measures similarity between finite sample sets, and is defined as the size of the intersection divided by the size of the union of the sample sets:
J(A, B) = |A ∩ B| / |A ∪ B| = |A ∩ B| / (|A| + |B| - |A ∩ B|)
If A and B are both empty, define J(A, B) = 1.
The Jaccard distance, which measures dissimilarity between sample sets, is complementary to the Jaccard coefficient and is obtained by subtracting the Jaccard coefficient from 1, or, equivalently, by dividing the difference of the sizes of the union and the intersection of two sets by the size of the union:
d_J(A, B) = 1 - J(A, B) = (|A ∪ B| - |A ∩ B|) / |A ∪ B|
Difference between the SMC and the Jaccard index:
-----------------------------------------------------
---> The SMC is very similar to the more popular Jaccard index.
---> The main difference is that the SMC has the term M00 in its numerator and denominator, whereas the Jaccard index does not.
---> Thus, the SMC counts both mutual presences (when an attribute is present in both sets) and mutual absences (when an attribute is absent in both sets) as matches and compares them to the total number of attributes in the universe, whereas the Jaccard index only counts mutual presence as matches and compares it to the number of attributes that have been chosen by at least one of the two sets.
---> In market basket analysis, for example, the baskets of two consumers whom we wish to compare might only contain a small fraction of all the available products in the store, so the SMC will usually return very high values of similarity even when the baskets bear very little resemblance, thus making the Jaccard index a more appropriate measure of similarity in that context.
---> For example, consider a supermarket with 1000 products and two customers. The basket of the first customer contains salt and pepper and the basket of the second contains salt and sugar. In this scenario, the similarity between the two baskets as measured by the Jaccard index would be 1/3, but the similarity becomes 0.998 using the SMC.
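The supermarket numbers above can be checked directly, using exactly the universe size and baskets from the example:

```python
# Two customers in a store with 1000 products (the example above).
n_products = 1000
basket1 = {"salt", "pepper"}
basket2 = {"salt", "sugar"}

# Jaccard: only products present in at least one basket matter.
jaccard = len(basket1 & basket2) / len(basket1 | basket2)

# SMC: mutual absences (the ~997 products in neither basket) also count.
m11 = len(basket1 & basket2)               # present in both baskets
m00 = n_products - len(basket1 | basket2)  # absent from both baskets
smc = (m00 + m11) / n_products

print(jaccard)  # 1/3
print(smc)      # 0.998
```

The huge gap between 1/3 and 0.998 is entirely due to the 997 mutual absences that the SMC counts as matches.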
3. Explain in your own words how the Decision Tree Classifier works.
---> A decision tree is a decision support tool that uses a tree-like model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility.
---> It is one way to display an algorithm that only contains conditional control statements.
---> Decision trees are commonly used in operations research, specifically in decision analysis, to help identify a strategy most likely to reach a goal, but are also a popular tool in machine learning.
---> The tree can be explained by two entities, namely decision nodes and leaves. The leaves are the decisions or the final outcomes. And the decision nodes are where the data is split.
Decision rules:
-------------------
---> The decision tree can be linearized into decision rules, where the outcome is the contents of the leaf node, and the conditions along the path form a conjunction in the if clause. In general, the rules have the form: if condition1 and condition2 and condition3 then outcome.
---> Decision rules can be generated by constructing association rules with the target variable on the right. They can also denote temporal or causal relations.
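As a sketch of how decision nodes, leaves, and linearized rules fit together, here is a tiny hand-written decision tree (the features and the threshold are invented, not learned from data), followed by its equivalent if-then rules:

```python
# A tiny hand-written decision tree for "go outside?".
# Decision nodes test a feature; leaves hold the final outcome.

def tree_predict(weather):
    if weather["raining"]:            # decision node 1
        return "stay in"              # leaf
    if weather["temperature"] < 5:    # decision node 2
        return "stay in"              # leaf
    return "go out"                   # leaf

# The same tree linearized into decision rules, one per root-to-leaf path:
#   if raining                              then stay in
#   if not raining and temperature < 5      then stay in
#   if not raining and temperature >= 5     then go out

print(tree_predict({"raining": False, "temperature": 20}))
```

A learned tree differs only in that an algorithm (e.g. one based on information gain) chooses which feature and threshold to split on at each decision node.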
Advantages and disadvantages:
---------------------------------------------
Among decision support tools, decision trees (and influence diagrams) have several advantages. Decision trees:
1. Are simple to understand and interpret. People are able to understand decision tree models after a brief explanation.
2. Have value even with little hard data. Important insights can be generated based on experts describing a situation (its alternatives, probabilities, and costs) and their preferences for outcomes.
3. Help determine worst, best and expected values for different scenarios.
4. Use a white box model. If a given result is provided by a model, the explanation for that result can be traced through simple Boolean logic.
5. Can be combined with other decision techniques.
Disadvantages of decision trees:
1. They are unstable, meaning that a small change in the data can lead to a large change in the structure of the optimal decision tree.
2. They are often relatively inaccurate. Many other predictors perform better with similar data. This can be remedied by replacing a single decision tree with a random forest of decision trees, but a random forest is not as easy to interpret as a single decision tree.
3. For data including categorical variables with different numbers of levels, information gain in decision trees is biased in favor of attributes with more levels.
4. Calculations can get very complex, particularly if many values are uncertain and/or if many outcomes are linked.
4. Explain in your own words how the SVM Classifier works.
---> SVM is a supervised machine learning algorithm which can be used for classification or regression problems.
---> It uses a technique called the kernel trick to transform your data and then based on these transformations it finds an optimal boundary between the possible outputs.
---> Simply put, it does some extremely complex data transformations, then figures out how to separate your data based on the labels or outputs you've defined.
---> Support Vector Machines (SVMs) represent the cutting edge of ranking algorithms and have been receiving special attention from the international scientific community.
---> Many successful applications, based on SVMs, can be found in different domains of knowledge, such as in text categorization, digital image analysis, character recognition and bioinformatics.
---> SVMs are a relatively new approach compared to other supervised classification techniques; they are based on statistical learning theory developed by the Russian scientist Vladimir Naumovich Vapnik back in 1962, and since then his original ideas have been perfected by a series of new techniques and algorithms.
So what makes it so great?
---> Non-linear SVM means that the boundary that the algorithm calculates doesn't have to be a straight line.
---> The benefit is that you can capture much more complex relationships between your data points without having to perform difficult transformations on your own. The downside is that the training time is much longer, as it's much more computationally intensive.
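The kernel idea above can be illustrated without any SVM library. In the toy 1-D setup below (the points and the feature map are hand-picked for illustration), class A sits at x = -1 and x = 1 and class B at x = 0, so no single threshold on x separates them; after mapping x to (x, x²), the straight line x² = 0.5 does.

```python
# Class A at x = -1 and x = 1, class B at x = 0: not separable by any
# single threshold on x. The feature map x -> (x, x**2) fixes that.

points = [(-1.0, "A"), (0.0, "B"), (1.0, "A")]

def feature_map(x):
    return (x, x * x)

def classify(x):
    # In the mapped 2-D space, the horizontal line x2 = 0.5 is a
    # linear boundary that separates the two classes.
    _, x2 = feature_map(x)
    return "A" if x2 > 0.5 else "B"

assert all(classify(x) == label for x, label in points)
print("separable after mapping")
```

A real kernel SVM never computes the mapped coordinates explicitly; the kernel trick evaluates inner products in the mapped space directly, which is what makes very high-dimensional (even infinite-dimensional) maps affordable.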
---> Support vector machines are computational algorithms that construct a hyperplane or a set of hyperplanes in a high or infinite dimensional space.
---> SVMs can be used for classification, regression, or other tasks. Intuitively, a separation between two linearly separable classes is achieved by any hyperplane that misclassifies no data points of either class; that is, all points belonging to class A are labeled as +1, for example, and all points belonging to class B are labeled as -1.
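A full maximum-margin SVM solver is beyond a short sketch, but the +1/-1 labeling and the idea of a separating hyperplane w·x + b = 0 can be shown with the simpler perceptron update rule. Note the hedge: this finds *a* separating hyperplane for linearly separable data, not the maximum-margin one an SVM would choose; the 2-D points are invented.

```python
# Find a separating hyperplane w.x + b = 0 for two linearly separable
# classes labeled +1 and -1, using the perceptron rule (illustration
# only; an SVM would pick the maximum-margin hyperplane instead).

data = [((2.0, 2.0), +1), ((3.0, 1.0), +1),      # class A, labeled +1
        ((-1.0, -1.0), -1), ((-2.0, 0.0), -1)]   # class B, labeled -1

w, b = [0.0, 0.0], 0.0
for _ in range(100):  # repeat passes until a pass makes no mistakes
    mistakes = 0
    for (x1, x2), y in data:
        if y * (w[0] * x1 + w[1] * x2 + b) <= 0:  # misclassified point
            w[0] += y * x1                         # nudge the hyperplane
            w[1] += y * x2
            b += y
            mistakes += 1
    if mistakes == 0:
        break

def predict(x1, x2):
    return +1 if w[0] * x1 + w[1] * x2 + b > 0 else -1

assert all(predict(*x) == y for x, y in data)
print("hyperplane:", w, b)
```

The quantity y * (w·x + b) being positive for every training point is exactly the "no misclassification on either class" condition from the text; an SVM additionally maximizes the distance from the hyperplane to the nearest points of each class.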