Question

Write a script named countmatches that expects at least two arguments on the command line.

  • The first argument is the pathname of a dna file containing a valid DNA string with no newline characters or white space characters of any kind within it. (It will be terminated with a newline character.) This dna file contains nothing but a sequence of the bases a, c, g, and t in any order.
  • The remaining arguments are strings containing only the bases a, c, g, and t in any order.

Error checking:

The script should check that the first argument is a file name, and that there is at least one other argument after it. If the first argument is not a file name, or if nothing follows the file name, the script should output to the user

  • the appropriate user error,
  • a how-to-use-me message,

and then exit.
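
The checks described above could be sketched as a small bash function. This is only an illustration under assumptions: the function name check_args and the wording of the messages are hypothetical, and sending the messages to standard error is a choice the assignment does not require.

```shell
#!/bin/bash
# Hypothetical helper: validates the arguments as the assignment describes.
# Returns 0 when the arguments are usable, 1 otherwise.
check_args() {
    local usage="usage: countmatches <dnafile> <string> [<string> ...]"
    # At least a file name and one search string are required.
    if [ "$#" -lt 2 ]; then
        echo "error: expected a file name followed by at least one string" >&2
        echo "$usage" >&2
        return 1
    fi
    # The first argument must name an existing regular file.
    if [ ! -f "$1" ]; then
        echo "error: $1 is not a file" >&2
        echo "$usage" >&2
        return 1
    fi
    return 0
}
```

A script could call it as check_args "$@" || exit 1 before doing any counting.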

The script is not required to check that the file is in the proper form, or that the strings contain nothing but the letters a, c, g, and t.

The script is not required to check that the dna file contains a number of bases equal to a multiple of 3.

For each valid argument string, the program will search the DNA string in the file and count how many non-overlapping occurrences of that argument string are in the DNA string. To make sure you understand what non‐overlapping means, the string ata occurs just once in the string atata, not twice, because the two occurrences overlap.
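
One way to see non-overlapping counting in practice is grep's -o option, which prints each match on its own line and resumes the search after the end of the previous match. The sketch below assumes a grep that supports -o (GNU grep does); the function name count_nonoverlapping is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count non-overlapping occurrences of $2 in string $1.
count_nonoverlapping() {
    # -o prints every match on a separate line; -F treats the pattern as a
    # fixed string; wc -l then counts the lines. The arithmetic expansion
    # strips any padding that some wc implementations add.
    echo $(( $(printf '%s\n' "$1" | grep -oF -- "$2" | wc -l) ))
}

count_nonoverlapping atata ata    # prints 1
count_nonoverlapping aaaa aa      # prints 2
```

The second call shows that adjacent occurrences still count separately; only occurrences that share characters are merged into one.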

If your script is called correctly, it will output for each argument a line containing the argument string followed by how many times it occurs in the string. If it finds no occurrences, it should output 0 as a count.


For example, if the string aaccgtttgtaaccggaac is in a file named dnafile, then your script should work like this:

$ ./countmatches dnafile ttt
ttt 1

$ countmatches dnafile aac ggg aaccg
aac 3
ggg 0
aaccg 2
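
To try the example locally, the sample file can be created with printf, which writes the DNA string followed by the single terminating newline the assignment describes. This is a sketch; the scratch directory and the variable names workdir and dnafile are assumptions, chosen so that the example does not overwrite an existing file.

```shell
#!/bin/bash
# Create the sample dna file from the example in a scratch directory.
workdir=$(mktemp -d)
dnafile="$workdir/dnafile"
printf '%s\n' aaccgtttgtaaccggaac > "$dnafile"

# The file holds 19 bases plus one newline, so wc -c reports 20 bytes.
wc -c < "$dnafile"
```

The script can then be run against it, for example as ./countmatches "$dnafile" ttt.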

Warning: if it is given valid arguments, the script is not to output anything except the strings and their associated counts. No fancy messages, no words!


Testing: There are DNA text files in the cs132 course directory,

/data/biocs/b/student.accounts/cs132/data/dna_textfiles

to give to your script as the file argument.

Hint: You can write this script using grep and one filter command that appears in the course material. Although there are many filter commands, you do not need all of them to write the script. You will have to read more about grep to know how to use it.

Solutions

Expert Solution

Please find below a shell script that counts the DNA patterns in the input file.

Note:

I have added comments for better understanding.

Program:

#!/bin/bash

#assign the total arguments passed to the script
count=$#

#throw error if the count is less than 2
if [ "$count" -lt 2 ]
then
echo "Invalid number of arguments"
echo "Usage: $0 <file name> <arg1> [arg2] .."
exit 1
fi

#assign the $1 to fileName
fileName=$1

#shift skips the file name
shift

#check that the first argument is an existing regular file
if [ ! -f "$fileName" ]
then
echo "File: $fileName not found"
echo "Usage: $0 <file name> <arg1> [arg2] .."
exit 1
fi

#now go through all the dna patterns passed as arguments

#inside the for loop, grep -o prints each match on its own line and resumes
#searching after the end of the previous match, so the matches never overlap.
#for example, "ata" is counted once in "atata" and twice in "ataata"

for dnaPattern in "$@"
do
#get the count of pattern matches (-F treats the pattern as a fixed string)
patternCount=$(grep -oF -- "$dnaPattern" "$fileName" | wc -l)
#print the output to screen
echo "$dnaPattern" $patternCount
done
