Question

Exercise 1 (1 point) For this step, you will load the training and test sentiment datasets "twitdata_TEST.tsv" and "allTrainingData.tsv". The data should be loaded into 4 lists of strings: X_txt_train, X_txt_test, y_test, y_train. Note: when using csv.reader, you need to pass csv.QUOTE_NONE as the value of the "quoting" argument.

import csv
X_txt_train = ...
y_train = ...
X_txt_test = ...
y_test = ...

assert(type(X_txt_train) == type(list()))
assert(type(X_txt_train[0]) == type(str()))
assert(type(X_txt_test) == type(list()))
assert(type(X_txt_test[0]) == type(str()))
assert(type(y_test) == type(list()))
assert(type(y_train) == type(list()))
assert(len(X_txt_test) == 3199)
assert(len(y_test) == 3199)
assert(len(X_txt_train) == 8018)
assert(len(y_train) == 8018)
print("Asserts Completed Successfully!")
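The note about csv.QUOTE_NONE can be illustrated with a small sketch. The two rows below are made-up sample data, not taken from the real files, and the four-column layout is only an assumption for the demonstration. With the default quoting, a stray double quote at the start of a field makes the reader treat everything up to the next quote (or the end of the data) as one field, so rows get merged; with csv.QUOTE_NONE the quote is kept as an ordinary character.

```python
import csv
import io

# Made-up sample: the first tweet starts with an unbalanced double quote
sample = '1\t2\tpositive\t"I love this\n3\t4\tnegative\tso bad\n'

# Default quoting: the stray quote opens a quoted field that never closes
rows_default = list(csv.reader(io.StringIO(sample), delimiter="\t"))

# QUOTE_NONE: quote characters are treated as plain text
rows_none = list(
    csv.reader(io.StringIO(sample), delimiter="\t", quoting=csv.QUOTE_NONE)
)

print(len(rows_default))  # rows read with default quoting
print(len(rows_none))     # rows read with QUOTE_NONE
print(rows_none[0][3])    # the tweet text, quote character included
```

Tweets often contain unbalanced quotes, which is presumably why the exercise asks for this setting.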

Solutions

Expert Solution

ANSWER:

The code below is commented and indented so that it can be copied directly and its indentation checked.
An image of the output is included so that the result can be cross-checked.
Have a nice and healthy day!

CODE

import csv

# open the training file; the csv module recommends newline="" for file objects,
# and the with-block closes the file once reading is done
with open("allTrainingData.tsv", newline="") as file_train:
    # tab-delimited reader; QUOTE_NONE treats quote characters as ordinary text
    reader_train = csv.reader(file_train, delimiter="\t", quoting=csv.QUOTE_NONE)
    # empty lists for the tweet texts and their labels
    X_txt_train = []
    y_train = []
    # loop through each row and append its data to the lists
    for row in reader_train:
        # this solution takes the label from column index 2
        y = row[2]
        # join all remaining columns with the '\t' separator, in case the text contained tabs
        X_txt = "\t".join(row[3:])
        X_txt_train.append(X_txt)
        y_train.append(y)

# open the test file; the csv module recommends newline="" for file objects,
# and the with-block closes the file once reading is done
with open("twitdata_TEST.tsv", newline="") as file_test:
    # tab-delimited reader; QUOTE_NONE treats quote characters as ordinary text
    reader_test = csv.reader(file_test, delimiter="\t", quoting=csv.QUOTE_NONE)
    # empty lists for the tweet texts and their labels
    X_txt_test = []
    y_test = []
    # loop through each row and append its data to the lists
    for row in reader_test:
        # this solution takes the label from column index 2
        y = row[2]
        # join all remaining columns with the '\t' separator, in case the text contained tabs
        X_txt = "\t".join(row[3:])
        X_txt_test.append(X_txt)
        y_test.append(y)

assert(type(X_txt_train) == type(list()))
assert(type(X_txt_train[0]) == type(str()))
assert(type(X_txt_test) == type(list()))
assert(type(X_txt_test[0]) == type(str()))
assert(type(y_test) == type(list()))
assert(type(y_train) == type(list()))
assert(len(X_txt_test) == 3199)
assert(len(y_test) == 3199)
assert(len(X_txt_train) == 8018)
assert(len(y_train) == 8018)
print("Asserts Completed Successfully!")
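
The training and test loops in the solution differ only in the file name, so they could be factored into one function. The following is a minimal sketch, not part of the original answer: load_tsv is a hypothetical helper name, and the small temporary file is made-up data used only to show the behaviour. It assumes the same layout as the solution (label at column index 2, text from column index 3 onward).

```python
import csv
import os
import tempfile


def load_tsv(path):
    """Return (texts, labels) read from a tab-separated file (hypothetical helper)."""
    texts, labels = [], []
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE):
            labels.append(row[2])             # label column, as in the solution
            texts.append("\t".join(row[3:]))  # rejoin text that contained tabs
    return texts, labels


# Tiny made-up file for illustration only (not the real dataset)
with tempfile.NamedTemporaryFile(
    "w", suffix=".tsv", delete=False, newline=""
) as tmp:
    tmp.write("1\t2\tpositive\tgreat day\n3\t4\tnegative\tbad\tday\n")
    demo_path = tmp.name

demo_texts, demo_labels = load_tsv(demo_path)
os.remove(demo_path)  # clean up the temporary file
print(demo_texts, demo_labels)
```

With the real files, the four lists would then come from two calls such as X_txt_train, y_train = load_tsv("allTrainingData.tsv").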

OUTPUT IMAGE

