What is the process of removing data that you think is irrelevant (such as stop words) called?
Unsupervised learning
Cleaning data
Tabulating the census
Untheorized research
(ANSWER)
Cleaning data -> This is the process of removing stop words and other
irrelevant data. Stop words are words that carry little meaning on
their own and mainly serve to hold a sentence together grammatically.
Ex -> "Robert plays cricket at 10:00 AM each day." In this sentence,
a word like "at" adds no useful information when we apply machine
learning or try to draw inferences from the data.
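As a rough illustration of the cleaning step, here is a minimal Python sketch that filters stop words out of the example sentence. The STOP_WORDS set and the remove_stop_words function are hypothetical names made up for this example, and the word list is a small hand-picked one; real projects usually take a much longer list from an NLP library.

```python
# Small hand-picked stop word list, assumed for illustration only.
STOP_WORDS = {"a", "an", "the", "at", "in", "on", "of", "to", "is", "each"}

def remove_stop_words(sentence):
    """Return the words of `sentence` that are not in STOP_WORDS."""
    kept = []
    for word in sentence.split():
        # Drop surrounding punctuation, then compare in lowercase.
        cleaned = word.strip(".,!?;:")
        if cleaned and cleaned.lower() not in STOP_WORDS:
            kept.append(cleaned)
    return kept

print(remove_stop_words("Robert plays cricket at 10:00 AM each day."))
# prints ['Robert', 'plays', 'cricket', '10:00', 'AM', 'day']
```

The words "at" and "each" are dropped, while the words that carry the actual information are kept.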
Unsupervised learning means drawing conclusions from a given data
set without any prior knowledge (such as labels) about that data. It
identifies patterns, such as clusters, in the data and draws
conclusions from the properties of the data itself. It is a class of
algorithms used to achieve this.
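To make the idea of finding clusters without labels concrete, here is a minimal sketch of one common unsupervised algorithm, k-means, on one-dimensional numbers. The function name kmeans_1d, the sample points, and the starting centers are all assumptions chosen for this illustration, not something taken from the question.

```python
def kmeans_1d(points, centers, iterations=10):
    """Group numbers around the given centers; no labels are used."""
    clusters = []
    for _ in range(iterations):
        # Assign every point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers)   # prints [1.5, 10.0]
print(clusters)  # prints [[1.0, 1.5, 2.0], [9.0, 10.0, 11.0]]
```

The algorithm is never told which group a number belongs to; the two groups come out of the data alone.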
Tabulating the census means storing data in tabular form after
conducting a survey or segregating data based on some provided
input. The census here may be historical data or a fresh survey,
taken so that the stored data keeps the system as close to real time
as possible.
Untheorized research means trying to draw conclusions from a given
data set without applying any structured process for reaching them.
It is a brute-force approach, used in the absence of a solid
framework for categorizing and understanding the data.