I have Python code that reads a text file, creates a word list, then calculates the frequency of each word. Please see below:
#Open file
f = open('example.txt', 'r')
#list created with all words
data=f.read().lower()
f.close()
list1=data.split()
#empty dictionary
d={}
# Adding all elements of the list to a dictionary and assigning its value as zero
for i in set(list1):
    d[i]=0
# checking and counting the values
for i in list1:
    for j in d.keys():
        if i==j:
            d[i]=d[i]+1
#Print the dictionary of word counts
print(d)
Question: How do I make my code count only a specific list of words rather than every single word in the file? For example, I only want to know how many times (Apple, Banana, Orange, Watermelon, Blueberry) occur throughout the text. Also, apple/Apple/Apple! should count as the same word. I appreciate your help. Please don't comment if you don't want to work on this question.
Here is the example text file, named example.txt:
I Love apple, I don't like banana, but blueberry for me too.
Apple, banana, orange, watermelon are my fav.
Banana can keep you full. Watermelon is good for summer.
Banana!
Python code with comments pasted below.
import string

#Open file in read mode; the with block closes it automatically
with open('example.txt', 'r') as fr:
    #list created with all fruits/words in lowercase
    data=fr.read().lower()
words=data.split()
#list of favourite fruits
fav_fruits=["apple", "banana", "orange", "watermelon", "blueberry"]
#Initialize an empty dictionary for storing the count of favourite fruits only
fruits={}
#Traversing through the list named words
for word in words:
    #Remove leading/trailing punctuation such as ! or , from the word.
    #Only then banana! and banana will be treated as the same.
    #(Slicing off the last character instead would also turn "apples" into "apple".)
    word=word.strip(string.punctuation)
    #Checking whether word is one of our favourite fruits
    if word in fav_fruits:
        #If word is not in the dictionary fruits, then
        #add word as the key of the dictionary and set its count to 1
        if word not in fruits:
            fruits[word]=1
        #If word is already in the dictionary fruits, then
        #increment the count stored under the key word by 1
        else:
            fruits[word]+=1
#printing the favourite fruits and count
for k,v in fruits.items():
    print(k,"=",v)
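For comparison, here is a shorter sketch of the same idea using re.findall and collections.Counter from the standard library. It is only one possible variation, not a replacement for the code above. The function name count_target_words and the variable sample are made up for illustration, and the sample string simply repeats the example.txt contents shown in the question so that the snippet runs without the file.

```python
import re
from collections import Counter


def count_target_words(text, targets):
    """Count case-insensitive whole-word occurrences of each target word.

    Hypothetical helper for illustration; not part of the original answer.
    """
    wanted = {t.lower() for t in targets}
    # \w+ extracts runs of letters/digits/underscores, so "Apple!" becomes "apple"
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tok for tok in tokens if tok in wanted)
    # Report every target word, including ones that never occur (count 0)
    return {t: counts[t.lower()] for t in targets}


# Same contents as the example.txt shown in the question
sample = (
    "I Love apple, I don't like banana, but blueberry for me too.\n"
    "Apple, banana, orange, watermelon are my fav.\n"
    "Banana can keep you full. Watermelon is good for summer.\n"
    "Banana!\n"
)

result = count_target_words(
    sample, ["apple", "banana", "orange", "watermelon", "blueberry"]
)
for fruit, n in result.items():
    print(fruit, "=", n)
```

To run it on the actual file, read the file contents into a string first and pass that string in place of sample. Unlike the loop above, this version also lists a fruit with a count of 0 if it never appears in the text.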
[Screenshot: Python code in IDLE, pasted for better understanding of the indentation]
[Screenshot: Output screen]
[Screenshot: Input file, example.txt]