In: Computer Science
For two sets ? and ?, their Jaccard similarity, denoted ?(?,?),
is defined as
?(?,?) =
|? ∩ ?| |? ∪ ?|
where |?| is the size of set ?; ? ∩ ? is the intersection of ? and
?, that is the elements in both of ? and ?; and ? ∪ ? is the union
of ? and ?, that is the elements in at least one of ? and ?. For
example, if ? = {1,2,3,4} and ? = {2,4,6,8}, then ? ∩ ? = {2,4} and
? ∪ ? = {1,2,3,4,6,8} so
?(?,?) =
|? ∩ ?| |? ∪ ?|
=
|{2,4}| |{1,2,3,4,6,8}|
=
2 6
=
1 3
Write a method, jaccard, that given two sets represented as
HashSets, returns their Jaccard similarity. The skeleton for the
method is provided in the file Jaccard.java.
The following are a few sample runs:
Input : ? = [1, 2, 3, 4], ? = [2, 4, 6, 8] Return: 0.333333⋯
Input : ? = ["???ℎ???", "?ℎ??", "????", "??????", "????ℎ"] ? =
["?ℎ??", "?????", "????", "??????", "????ℎ", "??????"] Return:
0.375
import java.util.*;
public class Jaccard<T> {
/**
* Computes the Jaccard similarity of two sets
represented as HashSets.
*
* Time Complexity: O(n) where n is the smaller of the
sizes of the two input sets.
*
* @param A HashSet representation of one of the two
input sets
* @param B HashSet representation of the other of the
two input sets
* @return the Jaccard similarity of A and B
*/
public double jaccard(HashSet<T> A,
HashSet<T> B) {
//Replace this line with your
return statement
return -1;
}
}
import java.util.*; public class Jaccard<T> { /** * Computes the Jaccard similarity of two sets represented as HashSets. * <p> * Time Complexity: O(n) where n is the smaller of the sizes of the two input sets. * * @param A HashSet representation of one of the two input sets * @param B HashSet representation of the other of the two input sets * @return the Jaccard similarity of A and B */ public double jaccard(HashSet<T> A, HashSet<T> B) { //holds the intersection of two sets HashSet<T> intersection = new HashSet<>(); for (T x : A) { //adding intersection data if (B.contains(x)) { intersection.add(x); } } int lenOfIntersection = intersection.size(); //finding size of intersection // |A U B| = |A| + |B| - |A ^ B| int lenOfUnion = A.size() + B.size() - lenOfIntersection; return lenOfIntersection * 1d / lenOfUnion; } public static void main(String[] args) { HashSet<Integer> A = new HashSet<>(Arrays.asList(1, 2, 3, 4)); HashSet<Integer> B = new HashSet<>(Arrays.asList(2, 4, 6, 8)); Jaccard<Integer> jaccard = new Jaccard<Integer>(); System.out.println("Jaccard Similarity : " + jaccard.jaccard(A, B)); } } |