Can I get a detailed explanation of the following topics:
Keyword Queries
Boolean Queries
Phrase Queries
Proximity Queries
Natural Language Queries
Wildcard Queries
I am looking for something other than the solution for Chapter 27, problem 18RQ in Fundamentals of Database Systems (6th Edition). Please also provide examples.
Regards
Document 1: I am a happy person.
Document 2: I want to be happy
Document 3: How to be happy in life
Document 4: I am not sad.
Keyword Queries: A query made up of one or more search terms. A document is retrieved only if it contains a word that exactly matches the search term.
Example: A keyword query for "happy" will retrieve documents 1, 2, and 3, because the search term "happy" occurs in all three. Document 4 is not retrieved because it does not contain the term.
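To make the keyword example concrete, here is a minimal Python sketch that builds a small inverted index (term -> document numbers) over the four documents above. The helper names (tokenize, build_index, keyword_query) are made up for illustration, and lowercasing the text before matching is an assumption on my part, not something the textbook definition requires.

```python
import re

# The four sample documents from this answer, keyed by document number.
DOCS = {
    1: "I am a happy person.",
    2: "I want to be happy",
    3: "How to be happy in life",
    4: "I am not sad.",
}

def tokenize(text):
    """Lowercase the text and split it into word tokens (punctuation is dropped)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each term to the set of document numbers that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index.setdefault(term, set()).add(doc_id)
    return index

def keyword_query(index, term):
    """Return the sorted document numbers whose text contains the exact term."""
    return sorted(index.get(term.lower(), set()))

INDEX = build_index(DOCS)
print(keyword_query(INDEX, "happy"))  # [1, 2, 3]
print(keyword_query(INDEX, "sad"))    # [4]
```

Because the match is on whole tokens, a query for "hap" returns nothing even though it is a prefix of "happy".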
Boolean Queries: A query that combines operands with the boolean operators AND, OR, and NOT. Each operand is either a simple search term or another (possibly complex) boolean query.
Example: A boolean query for term1 = "happy" and term2 = "I" executed as (term1 and term2) will retrieve documents 1 and 2 because both term1 and term2 are present in these two documents.
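A boolean query maps naturally onto set operations: AND is an intersection, OR is a union, and NOT is a complement against the full collection. The sketch below assumes case-insensitive whole-word matching, and the postings helper is a made-up name for illustration.

```python
import re

# The four sample documents from this answer, keyed by document number.
DOCS = {
    1: "I am a happy person.",
    2: "I want to be happy",
    3: "How to be happy in life",
    4: "I am not sad.",
}

def postings(term):
    """Return the set of document numbers that contain the term as a whole word."""
    term = term.lower()
    return {
        doc_id
        for doc_id, text in DOCS.items()
        if term in re.findall(r"[a-z0-9]+", text.lower())
    }

ALL_DOCS = set(DOCS)

happy_and_i = postings("happy") & postings("I")     # AND -> set intersection
happy_or_sad = postings("happy") | postings("sad")  # OR  -> set union
not_happy = ALL_DOCS - postings("happy")            # NOT -> complement

print(sorted(happy_and_i))   # [1, 2]
print(sorted(happy_or_sad))  # [1, 2, 3, 4]
print(sorted(not_happy))     # [4]
```

Document 3 is excluded from the AND result because it contains "in" but not the word "I".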
Phrase Queries: A query that retrieves documents containing a specific sequence of terms, in that exact order. In information retrieval, we usually parse the documents first, before executing any queries on them. Phrase queries differ from keyword queries because they search for whole phrases or sentences rather than individual terms. You can argue that a phrase query is just substring matching, but substring-matching solutions (such as suffix arrays and permutation indexes) are not used extensively in today's information retrieval systems.
Example: A query for "be happy" should retrieve documents 2 and 3.
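One common way to answer phrase queries without raw substring matching is a positional index, which records the position of every term in every document. The following is only a sketch under that assumption; the function names are hypothetical, and matching is case-insensitive by my own choice.

```python
import re

# The four sample documents from this answer, keyed by document number.
DOCS = {
    1: "I am a happy person.",
    2: "I want to be happy",
    3: "How to be happy in life",
    4: "I am not sad.",
}

def tokenize(text):
    """Lowercase the text and split it into word tokens (punctuation is dropped)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_positional_index(docs):
    """Map term -> {document number -> list of token positions}."""
    index = {}
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index.setdefault(term, {}).setdefault(doc_id, []).append(pos)
    return index

def phrase_query(index, phrase):
    """Return documents where the phrase's words occur at consecutive positions."""
    words = tokenize(phrase)
    if not words:
        return []
    hits = []
    # Try every occurrence of the first word as a possible start of the phrase.
    for doc_id, starts in index.get(words[0], {}).items():
        for start in starts:
            if all(start + k in index.get(w, {}).get(doc_id, [])
                   for k, w in enumerate(words)):
                hits.append(doc_id)
                break
    return sorted(hits)

POS_INDEX = build_positional_index(DOCS)
print(phrase_query(POS_INDEX, "be happy"))  # [2, 3]
print(phrase_query(POS_INDEX, "happy be"))  # []
```

The second call shows that word order matters: both words occur in documents 2 and 3, but never in the order "happy be".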
Proximity Queries: A query that retrieves documents in which the search terms occur within a specified distance of each other (usually counted in words), even if the terms are not adjacent.
Example: A proximity query for "I" and "happy" within a distance of 3 words returns document 1 ("I am a happy person": the two terms are 3 positions apart). Raising the distance to 4 also returns document 2 ("I want to be happy"). Document 3 is never returned because it does not contain "I". This way, we can match documents where the terms are not next to each other but are relatively close.
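A proximity check can be sketched using token positions: two terms match a document when some occurrence of one lies within max_distance positions of some occurrence of the other. The function names and the choice to measure distance as the difference between token positions are assumptions for illustration; real systems define distance in slightly different ways.

```python
import re

# The four sample documents from this answer, keyed by document number.
DOCS = {
    1: "I am a happy person.",
    2: "I want to be happy",
    3: "How to be happy in life",
    4: "I am not sad.",
}

def tokenize(text):
    """Lowercase the text and split it into word tokens (punctuation is dropped)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def within(text, term_a, term_b, max_distance):
    """True if term_a and term_b occur at most max_distance positions apart."""
    tokens = tokenize(text)
    pos_a = [i for i, t in enumerate(tokens) if t == term_a.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == term_b.lower()]
    return any(abs(a - b) <= max_distance for a in pos_a for b in pos_b)

def proximity_query(term_a, term_b, max_distance):
    """Return the document numbers where both terms are close enough together."""
    return [doc_id for doc_id, text in DOCS.items()
            if within(text, term_a, term_b, max_distance)]

print(proximity_query("I", "happy", 3))  # [1]
print(proximity_query("I", "happy", 4))  # [1, 2]
```

A document that is missing either term can never match, no matter how large the distance is.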
Wildcard Queries: A wildcard query contains a wildcard symbol (such as *) that stands in for one or more unknown characters. This is useful when the user is unsure of the exact spelling of the search term or when the search term has different spellings in different regions.
Example: "color" and "colour" are two words that have the same meaning but different spellings (American English and British English, respectively). To find documents containing either spelling, the user can use "colo*" as the wildcard query, where * matches any sequence of characters. Note that "colo*" also matches unrelated words such as "colony"; a tighter pattern such as "colo*r" avoids this.
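Wildcard matching can be tried out with Python's standard fnmatch module, where * matches any sequence of characters (including none). The vocabulary below is a made-up list for illustration, since the four sample documents do not contain either spelling; a real system would match the pattern against the terms in its index.

```python
from fnmatch import fnmatchcase

# Hypothetical vocabulary of indexed terms (not taken from the sample documents).
VOCABULARY = ["cold", "collar", "colony", "color", "colour"]

def wildcard_query(pattern, vocabulary=VOCABULARY):
    """Return the vocabulary terms that match the wildcard pattern."""
    return [term for term in vocabulary if fnmatchcase(term, pattern)]

print(wildcard_query("colo*"))   # ['colony', 'color', 'colour']
print(wildcard_query("colo*r"))  # ['color', 'colour']
```

The matching terms would then be looked up like ordinary keyword queries, and the results combined.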
Natural Language Queries: These are queries written the way the user would naturally say or type them, as ordinary sentences or questions in the user's own language, without any special operators or syntax. The system is responsible for interpreting the query.
Example: "Kahan pe ho" is a natural language query (Hindi typed in the Latin alphabet) and it translates to "where are you" in English. Some popular search engines support these kinds of queries.
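How a system interprets a natural language query varies a lot. One very simple approach, shown here purely as a sketch, is to drop common stop words from the question and rank documents by how many of the remaining terms they contain. The stop-word list and the scoring rule are assumptions chosen for this illustration, not how any particular search engine works.

```python
import re

# The four sample documents from this answer, keyed by document number.
DOCS = {
    1: "I am a happy person.",
    2: "I want to be happy",
    3: "How to be happy in life",
    4: "I am not sad.",
}

# Hypothetical stop-word list, kept tiny for the example.
STOP_WORDS = {"a", "am", "be", "can", "how", "i", "in", "is", "not", "the", "to"}

def tokenize(text):
    """Lowercase the text and split it into word tokens (punctuation is dropped)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def natural_language_query(question):
    """Rank documents by the number of non-stop-word query terms they contain."""
    terms = {t for t in tokenize(question) if t not in STOP_WORDS}
    scores = {}
    for doc_id, text in DOCS.items():
        overlap = terms & set(tokenize(text))
        if overlap:
            scores[doc_id] = len(overlap)
    # Highest score first; ties are broken by document number.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

print(natural_language_query("How can I be happy in life?"))  # [3, 1, 2]
```

Document 3 ranks first because it contains both "happy" and "life", while documents 1 and 2 contain only "happy".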