Question

In: Computer Science

For two sets ? and ?, their Jaccard similarity, denoted ?(?,?), is defined as ?(?,?) =...

For two sets ? and ?, their Jaccard similarity, denoted ?(?,?), is defined as

?(?,?) =
|? ∩ ?| |? ∪ ?|


where |?| is the size of set ?; ? ∩ ? is the intersection of ? and ?, that is the elements in both of ? and ?; and ? ∪ ? is the union of ? and ?, that is the elements in at least one of ? and ?. For example, if ? = {1,2,3,4} and ? = {2,4,6,8}, then ? ∩ ? = {2,4} and ? ∪ ? = {1,2,3,4,6,8} so

?(?,?) =
|? ∩ ?| |? ∪ ?|
=
|{2,4}| |{1,2,3,4,6,8}|
=
2 6
=
1 3


Write a method, jaccard, that given two sets represented as HashSets, returns their Jaccard similarity. The skeleton for the method is provided in the file Jaccard.java.

The following are a few sample runs:

Input : ? = [1, 2, 3, 4], ? = [2, 4, 6, 8] Return: 0.333333⋯

Input : ? = ["???ℎ???", "?ℎ??", "????", "??????", "????ℎ"] ? = ["?ℎ??", "?????", "????", "??????", "????ℎ", "??????"] Return: 0.375

import java.util.*;

public class Jaccard<T> {
   /**
   * Computes the Jaccard similarity of two sets represented as HashSets.
   *
   * Time Complexity: O(n) where n is the smaller of the sizes of the two input sets.
   *
   * @param A HashSet representation of one of the two input sets
   * @param B HashSet representation of the other of the two input sets
   * @return the Jaccard similarity of A and B
   */
   public double jaccard(HashSet<T> A, HashSet<T> B) {
      
       //Replace this line with your return statement
       return -1;
   }
}

Solutions

Expert Solution

import java.util.*;

public class Jaccard<T> {
    /**
     * Computes the Jaccard similarity of two sets represented as HashSets.
     * <p>
     * Time Complexity: O(n) where n is the smaller of the sizes of the two input sets.
     *
     * @param A HashSet representation of one of the two input sets
     * @param B HashSet representation of the other of the two input sets
     * @return the Jaccard similarity of A and B
     */
    public double jaccard(HashSet<T> A, HashSet<T> B) {
        //holds the intersection of two sets
        HashSet<T> intersection = new HashSet<>();
        for (T x : A) {
            //adding intersection data
            if (B.contains(x)) {
                intersection.add(x);
            }
        }
        int lenOfIntersection = intersection.size(); //finding size of intersection
        // |A U B| = |A| + |B| - |A ^ B|
        int lenOfUnion = A.size() + B.size() - lenOfIntersection;
        return lenOfIntersection * 1d / lenOfUnion;
    }

    public static void main(String[] args) {

        HashSet<Integer> A = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        HashSet<Integer> B = new HashSet<>(Arrays.asList(2, 4, 6, 8));

        Jaccard<Integer> jaccard = new Jaccard<Integer>();

        System.out.println("Jaccard Similarity : " + jaccard.jaccard(A, B));
    }
}

Related Solutions

Let G be a Group. The center of, denoted by Z(G), is defined to be the...
Let G be a Group. The center of, denoted by Z(G), is defined to be the set of all elements of G that with every element of G. Symbolically, we have Z(G) = {x in G | ax=xa for all a in G}. (a) Prove that Z(G) is a subgroup of G. (b) Prove that Z(G) is an Abelian group.
The factorial of a natural number n is denoted by n! and is defined as follows:...
The factorial of a natural number n is denoted by n! and is defined as follows: the factorial of 0 is 1, the factorial of 1 is 1, and the factorial of any other positive integer is the product of all the integers from 2 to that integer. Write a VBA function called MyFactorial with one input of data type Integer which computes and returns the value of the factorial of the input if it is greater than or equal...
Each of the following four sets contains two electrons, each of which is defined by four...
Each of the following four sets contains two electrons, each of which is defined by four quantum numbers (n, l, ml, ms). For each set, indicate whether the quantum numbers for the two electrons are valid for the two electrons in the highest energy subshell of a neutral titanium atom. If they are not valid, list the principle that the quantum numbers violate (Pauli Exclusion, Aufbau, or Hund’s Rule) and explain below. (the numbers are shown below in the order...
Which of the following sets are not well defined? Explain.
Which of the following sets are not well defined? Explain. a. The set of wealthy school teachers b. The set of great books c. The set of natural numbers greater than 100 d. The set of subsets of \(\{1,2,3,4,5,6\}\) e. The set \(\{x \mid x \neq x\) and \(x \in N\}\)  
Let A be a collection of sets, then We defined A’ = Union of all elements...
Let A be a collection of sets, then We defined A’ = Union of all elements of A. Definition: If A = NOT the union of C and D , where C and D are non empty sub-collection, such that C’ intersect D’ = empty Then A is coherent. Prove: if A is coherent then A’ is connected (i.e A’ is not the union of two separated sets in standard topology)
we have defined open sets in R: for any a ∈ R, there is sigma >...
we have defined open sets in R: for any a ∈ R, there is sigma > 0 such that (a − sigma, a + sigma) ⊆ A. (i) Let A and B be two open sets in R. Show that A ∩ B is open. (ii) Let {Aα}α∈I be a family of open sets in R. Show that ∪(α∈I)Aα is open. Hint: Follow the definition of open sets. Please be specific and rigorous! Thanks!
Two sets are separated if the intersection of the closure of one of the sets with...
Two sets are separated if the intersection of the closure of one of the sets with the other is empty. a) Prove that two closed and disjoint sets of some metric space are separate. b) Prove that two open and disjoint sets of some metric space are separate.
) Eulerian and Hamiltonian cycles. Consider the graphs defined by the following four sets of edges:...
) Eulerian and Hamiltonian cycles. Consider the graphs defined by the following four sets of edges: a) 0-1 0-2 0-3 1-3 1-4 2-5 2-9 3-6 4-7 4-8 5-8 5-9 6-7 6-9 7-8 b) 0-1 0-2 0-3 1-3 0-3 2-5 5-6 3-6 4-7 4-8 5-8 5-9 6-7 6-9 8-8 c) 0-1 1-2 1-3 0-3 0-4 2-5 2-9 3-6 4-7 4-8 5-8 5-9 6-7 6-9 7-8 d) 4-1 7-9 6-2 7-3 5-0 0-2 0-8 1-6 3-9 6-3 2-8 1-5 9-8 4-5 4-7...
** Can I get the similarity and differences of the terms to seperasted by two parts...
** Can I get the similarity and differences of the terms to seperasted by two parts like "A. Similarities:", and "B. Differences" **Also this is due tonight by midnight so I need it asap please :) 1. calcium phosphate (hydroxyapatitie) crystals – collagen 3. periosteum - endosteum 4. epiphyseal plate - epiphyseal line 5. red marrow - yellow marrow 6. osteoblast - osteoclast 7. articular cartilage - synovial membrane 8. epiphysis - diaphysis 9. synarthrosis – diarthrosis 1 0. axial...
Alex has two funds A and B. The annual return of fund A is denoted by...
Alex has two funds A and B. The annual return of fund A is denoted by X (in %) while the annual return of fund B is denoted by Y (in %) . Assume X ∼ U (−10, 20) and Y ∼ N (10, 50). Further, suppose the probability that both stocks A and B have positive annual returns is 0.6. Find the probability that stock A has a negative return. Find the probability that stock B has a positive...
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT