Question

(a) What is the value of performing text analysis? How do companies benefit from this exercise?

(b) What are three challenges to performing text analysis?

(c) In your own words, discuss the text analysis steps (i.e., parsing, search and retrieval, and text mining).

(d) What are three major takeaways from text analysis?

Need 600 words

Solutions

Expert Solution

(a) Text analysis (also called text mining or text data mining) is a method for extracting useful, structured information from unstructured data by identifying and exploring patterns in large amounts of text.

Companies apply text mining techniques such as categorization, entity extraction, sentiment analysis, and natural language processing to transform text into data that can be used for further analysis. Applied to a corpus, or body of information, text mining makes large quantities of unstructured data accessible and useful: it extracts the information and knowledge hidden in text content and reveals patterns, trends, and insights.
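As a rough illustration of turning text into data, here is a minimal Python sketch of lexicon-based sentiment scoring. The word lists and the to_record function are assumptions made up for this example; real systems use much larger lexicons or trained models.

```python
import re

# Tiny illustrative word lists (assumed for this sketch, not a real lexicon).
POSITIVE = {"great", "love", "fast", "excellent", "helpful"}
NEGATIVE = {"slow", "broken", "bad", "terrible", "rude"}

def to_record(review):
    """Turn one free-text review into a structured record."""
    words = re.findall(r"[a-z']+", review.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        label = "positive"
    elif neg > pos:
        label = "negative"
    else:
        label = "neutral"
    return {"text": review, "positive_hits": pos,
            "negative_hits": neg, "sentiment": label}

if __name__ == "__main__":
    for review in ["Great product, fast delivery", "Support was rude and slow"]:
        print(to_record(review))
```

Each record can then be counted, filtered, or charted like any other structured data.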

(b) Three major challenges are:

1. Establish a Contextualizing Data Structure

2. Achieve Semantic Disambiguation and Decoding of Textual Content

3. Promote Data Quality and Veracity

(c) Seven basic steps are involved in preparing an unstructured text document for deeper analysis:

  1. Language Identification
  2. Tokenization
  3. Sentence Breaking
  4. Part of Speech Tagging
  5. Chunking
  6. Syntax Parsing
  7. Sentence Chaining

Each step is achieved on a spectrum between pure machine learning and pure software rules. Let’s review each step in order, and discuss the contributions of machine learning and rules-based NLP.

1. Language Identification

The first step in text analytics is identifying what language the text is written in. Spanish? Singlish? Arabic? Each language has its own idiosyncrasies, so it’s important to know what we’re dealing with.
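One simple rules-based way to guess a language is to count how many of its common words appear in the text. The sketch below assumes three tiny stopword lists and a hypothetical guess_language function; production systems typically rely on character n-gram models trained on large corpora.

```python
import re

# Very small stopword samples (assumed for illustration, far from complete).
STOPWORDS = {
    "english": {"the", "and", "is", "of", "to", "in", "it"},
    "spanish": {"el", "la", "y", "es", "de", "que", "en"},
    "french": {"le", "la", "et", "est", "de", "que", "les"},
}

def guess_language(text):
    """Return the language whose stopwords occur most often in the text."""
    words = re.findall(r"[^\W\d_]+", text.lower())  # runs of letters only
    scores = {
        lang: sum(word in stopwords for word in words)
        for lang, stopwords in STOPWORDS.items()
    }
    return max(scores, key=scores.get)

if __name__ == "__main__":
    print(guess_language("The cat is in the garden and it is happy"))
    print(guess_language("El gato es de la casa y es feliz"))
```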

2. Tokenization

Now that we know what language the text is in, we can break it up into pieces. Tokenization is the process of breaking a piece of text apart into pieces that a machine can understand.

We use the term “tokens”, and not “words”, because as well as being words, tokens can also be things like:

  • Punctuation (exclamation points intensify sentiment)
  • Hyperlinks (https://…)
  • Possessive markers (apostrophes)

Tokenization is language-specific, and each language has its own tokenization requirements. English, for example, uses white space and punctuation to denote tokens, and is relatively simple to tokenize.
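For English, a regular expression is enough to sketch the idea. The pattern below is an assumption for illustration: it keeps hyperlinks whole, splits the possessive marker into its own token, and treats each punctuation mark as a token.

```python
import re

# Alternatives are tried left to right, so URLs are matched before plain words.
TOKEN_PATTERN = re.compile(
    r"https?://\S+"   # hyperlinks kept as a single token
    r"|'s\b"          # possessive marker as its own token
    r"|\w+"           # words and numbers
    r"|[^\w\s]"       # any single punctuation mark
)

def tokenize(text):
    """Split English text into word, punctuation, possessive, and URL tokens."""
    return TOKEN_PATTERN.findall(text)

if __name__ == "__main__":
    print(tokenize("Anna's report is great! See https://example.com/report"))
```

Note that this sketch would absorb punctuation that directly follows a URL; a real tokenizer handles such edge cases with extra rules.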

3. Sentence Breaking

Before you can run deeper text analytics functions (such as syntax parsing), you must be able to tell where the sentence boundaries are. Sometimes it’s a simple process, and other times… not so much.

Certain communication channels <cough> Twitter <cough> are particularly complicated to break down. We have ways of sentence breaking for social media, but we’ll leave that aside for now.
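A minimal rules-based sentence breaker for well-formed English might look like the sketch below. The abbreviation list and the split_sentences function are assumptions for illustration; it does not attempt to handle social media text.

```python
import re

# Abbreviations that end in a period but do not end a sentence (assumed sample).
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "ms.", "e.g.", "i.e.", "etc."}

def split_sentences(text):
    """Split text on ., ! or ? unless the word is a known abbreviation."""
    sentences, current = [], []
    for word in text.split():
        current.append(word)
        ends_sentence = re.search(r"[.!?]$", word) is not None
        if ends_sentence and word.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # keep trailing text that lacks closing punctuation
        sentences.append(" ".join(current))
    return sentences

if __name__ == "__main__":
    print(split_sentences("Dr. Smith arrived late. Was the talk good? Yes!"))
```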

4. Part of Speech Tagging

Part of Speech tagging (or PoS tagging) is the process of determining the part of speech of every token in a document, and then tagging it as such.
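To make the idea concrete, here is a toy rules-based tagger. The lexicon, the suffix rules, and the tag names are all assumptions for this sketch; real taggers are usually statistical models trained on annotated text.

```python
# Small hand-written lexicon (assumed for illustration).
LEXICON = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "cat": "NOUN",
    "chased": "VERB",
    "small": "ADJ",
}

def pos_tag(tokens):
    """Tag each token using the lexicon first, then crude suffix rules."""
    tagged = []
    for token in tokens:
        word = token.lower()
        if word in LEXICON:
            tag = LEXICON[word]
        elif word.endswith("ly"):
            tag = "ADV"
        elif word.endswith("ed") or word.endswith("ing"):
            tag = "VERB"
        else:
            tag = "NOUN"  # fallback guess for unknown words
        tagged.append((token, tag))
    return tagged

if __name__ == "__main__":
    print(pos_tag(["The", "small", "dog", "barked", "loudly"]))
```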

5. Chunking

Let’s move on to the text analytics function known as Chunking (a few people call it light parsing, but we don’t). Chunking refers to a range of sentence-breaking systems that splinter a sentence into its component phrases (noun phrases, verb phrases, and so on).

Before we move forward, I want to draw a quick distinction between Chunking and Part of Speech tagging in text analytics.

  • PoS tagging means assigning parts of speech to tokens
  • Chunking means assigning PoS-tagged tokens to phrases
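The distinction can be sketched in code: the function below takes tokens that are already PoS-tagged and groups them into noun phrases. The tag names and the chunk_noun_phrases function are assumptions for illustration, and the rule (a run of determiners, adjectives, and nouns that ends in a noun) is deliberately simplistic.

```python
NP_TAGS = {"DET", "ADJ", "NOUN"}  # tags allowed inside a noun phrase

def chunk_noun_phrases(tagged):
    """Group consecutive DET/ADJ/NOUN tokens into noun phrases."""
    phrases, current = [], []
    # The sentinel pair at the end flushes the final run of tokens.
    for token, tag in list(tagged) + [("", "END")]:
        if tag in NP_TAGS:
            current.append((token, tag))
            continue
        # A run only counts as a noun phrase if it ends with a noun.
        if current and current[-1][1] == "NOUN":
            phrases.append(" ".join(tok for tok, _ in current))
        current = []
    return phrases

if __name__ == "__main__":
    tagged = [("The", "DET"), ("small", "ADJ"), ("dog", "NOUN"),
              ("chased", "VERB"), ("a", "DET"), ("cat", "NOUN")]
    print(chunk_noun_phrases(tagged))
```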

6. Syntax Parsing

The syntax parsing sub-function is a way to determine the structure of a sentence. In truth, syntax parsing is really just fancy talk for sentence diagramming. But it’s a critical preparatory step in sentiment analysis and other natural language processing features.

7. Sentence Chaining

The final step in preparing unstructured text for deeper analysis is sentence chaining, sometimes known as sentence relation.

Lexalytics utilizes a technique called “lexical chaining” to connect related sentences. Lexical chaining links individual sentences by each sentence’s strength of association to an overall topic.
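The sketch below is not Lexalytics' algorithm, which is not described here. It only illustrates the general idea under a simple assumption: a sentence's strength of association to a topic is the number of topic words it contains. The stopword list, function names, and threshold are hypothetical.

```python
import re

# Common words ignored when comparing sentences to the topic (assumed sample).
STOPWORDS = {"the", "a", "an", "is", "are", "was", "and",
             "of", "to", "in", "it", "on", "for"}

def content_words(sentence):
    """Return the set of lowercase non-stopword words in a sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return {w for w in words if w not in STOPWORDS}

def chain_sentences(sentences, topic_words, threshold=1):
    """Return (index, strength) for sentences associated with the topic."""
    topic = {w.lower() for w in topic_words}
    chain = []
    for index, sentence in enumerate(sentences):
        strength = len(content_words(sentence) & topic)
        if strength >= threshold:
            chain.append((index, strength))
    return chain

if __name__ == "__main__":
    sentences = ["The battery drains quickly.",
                 "I went hiking on Sunday.",
                 "Charging the battery takes hours."]
    print(chain_sentences(sentences, ["battery", "charging", "drains"]))
```

Here the first and third sentences are linked even though an unrelated sentence sits between them.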

(d)

1. Many marketers use gut instincts, not data.

Forty percent of marketers say that they use intuition to make decisions. That means a lot of businesses are rolling the dice on their marketing plans. Guessing wrong wastes time and resources, and many of these marketers aren’t measuring the outcomes of their decisions.

2. Analytics help marketers calculate the actual value of their efforts.

One of the biggest complaints that digital marketers have is how difficult it is to prove that their campaigns are successful. Putting a dollar value on social media campaigns or content downloads can be a challenge.

3. Analytics help reveal who customers are and what they want.

Marketers can use Google Analytics to better understand their customers' behavior. Tracking customer paths and running behavior reports helps you learn more about what people are doing on your website.


