Compare the TF-IDF pivoted normalization formula and the Okapi formula analytically.
Both formulas are given in the figure above. What statistical information about documents and queries do both formulas use? How are the two formulas similar to each other, and how do they differ?
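Since the figure is not reproduced here, the following is only a minimal sketch that assumes it shows the commonly cited textbook forms of the two formulas. The function names, argument names, and default parameter values (`s`, `k1`, `b`, `k3`) are illustrative assumptions, not taken from the question. Putting the two side by side may help when comparing them term by term.

```python
import math

# Assumed inputs (hypothetical names, same for both functions):
#   query_tf: dict term -> count of the term in the query (qtf)
#   doc_tf:   dict term -> count of the term in the document (tf)
#   df:       dict term -> number of documents containing the term
#   n_docs:   total number of documents in the collection (N)
#   dl, avdl: document length and average document length

def pivoted_score(query_tf, doc_tf, df, n_docs, dl, avdl, s=0.2):
    """Pivoted normalization, assuming the commonly cited form."""
    score = 0.0
    for term, qtf in query_tf.items():
        tf = doc_tf.get(term, 0)
        if tf == 0:
            continue  # only terms shared by query and document contribute
        tf_part = 1.0 + math.log(1.0 + math.log(tf))  # double-log TF damping
        norm = (1.0 - s) + s * dl / avdl               # pivoted length normalization
        idf = math.log((n_docs + 1) / df[term])        # always positive
        score += tf_part / norm * qtf * idf
    return score

def okapi_score(query_tf, doc_tf, df, n_docs, dl, avdl,
                k1=1.2, b=0.75, k3=1000.0):
    """Okapi (BM25), assuming the commonly cited form."""
    score = 0.0
    for term, qtf in query_tf.items():
        tf = doc_tf.get(term, 0)
        if tf == 0:
            continue
        # This IDF turns negative when the term occurs in more than half the documents.
        idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5))
        norm = (1.0 - b) + b * dl / avdl               # same pivoted length normalizer
        tf_part = (k1 + 1.0) * tf / (k1 * norm + tf)   # saturating TF, bounded by k1 + 1
        qtf_part = (k3 + 1.0) * qtf / (k3 + qtf)       # saturating query TF
        score += idf * tf_part * qtf_part
    return score
```

Under these assumed forms, both functions take exactly the same inputs: term frequency in the document, term frequency in the query, document frequency, collection size, document length, and average document length. The differences are in how each quantity is transformed: the TF damping (double logarithm versus a bounded ratio), the IDF (always positive versus possibly negative), and the treatment of query term frequency (linear versus saturating).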