Question

Write a script named countmatches that expects at least two arguments on the command line.

  • The first argument is the pathname of a dna file containing a valid DNA string with no newline characters or white space characters of any kind within it. (It will be terminated with a newline character.) This dna file contains nothing but a sequence of the bases a, c, g, and t in any order.
  • The remaining arguments are strings containing only the bases a, c, g, and t in any order.

Error checking:

The script should check that the first argument is a file name and that at least one other argument follows it. If the first argument is not a file name, or if nothing follows the file name, the script should output to the user

  • the appropriate user error,
  • a how-to-use-me message,

and then exit.

The script is not required to check that the file is in the proper form, or that the strings contain nothing but the letters a, c, g, and t.

The script is not required to check that the dna file contains a number of bases equal to a multiple of 3.

For each valid argument string, the program will search the DNA string in the file and count how many non-overlapping occurrences of that argument string are in the DNA string. To make sure you understand what non‐overlapping means, the string ata occurs just once in the string atata, not twice, because the two occurrences overlap.
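The non-overlapping rule can be checked directly at the command line. The following is a minimal sketch, assuming a grep that supports the -o option (GNU and BSD grep both do); the function name count_nonoverlapping is hypothetical and is not part of the assignment.

```shell
#!/bin/bash
# Hypothetical helper (not part of the assignment): count the non-overlapping
# occurrences of the pattern $2 in the string $1.
count_nonoverlapping() {
    # grep -o prints each match on its own line. After a match it resumes
    # scanning just past the end of that match, so matches never overlap.
    # The arithmetic expansion strips any padding that wc may add.
    echo $(( $(printf '%s\n' "$1" | grep -o -e "$2" | wc -l) ))
}

count_nonoverlapping atata ata      # prints 1, not 2
count_nonoverlapping atatata ata    # prints 2 (an overlapping count would be 3)
```

Counting the lines that grep -o prints is one way to get the required count, which fits the hint about using grep together with a single filter command.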

If your script is called correctly, it will output for each argument a line containing the argument string followed by how many times it occurs in the string. If it finds no occurrences, it should output 0 as a count.


For example, if the string aaccgtttgtaaccggaac is in a file named dnafile, then your script should work like this:

$ ./countmatches dnafile ttt
ttt 1

$ countmatches dnafile aac ggg aaccg
aac 3
ggg 0
aaccg 2

Warning: if it is given valid arguments, the script is not to output anything except the strings and their associated counts. No fancy messages, no words!


Testing: There are DNA text files in the cs132 course directory,

/data/biocs/b/student.accounts/cs132/data/dna_textfiles

to give to your script as the file argument.

Hint: You can write this script using grep and one filter command that appears in the course material. Although there are many filter commands, you do not need all of them to write the script. You will have to read more about grep to learn how to use it.

Solutions

Expert Solution

Below is a shell script that counts the non-overlapping occurrences of each DNA pattern in the input file.

Note:

I have added comments for better understanding.

Program:

#!/bin/bash

#assign the total arguments passed to the script
count=$#

#print an error and a usage message if fewer than 2 arguments are given
if [ "$count" -lt 2 ]
then
echo "Error: expected a dna file followed by at least one pattern"
echo "Usage: $0 <dna file> <pattern1> [pattern2] ..."
exit 1
fi

#assign the $1 to fileName
fileName=$1

#shift skips the file name
shift

#check that the first argument is an existing regular file
if [ ! -f "$fileName" ]
then
echo "Error: $fileName is not a file"
echo "Usage: $0 <dna file> <pattern1> [pattern2] ..."
exit 1
fi

#now go through all the dna patterns passed as arguments

#grep -o prints each match on its own line, scanning left to right without
#overlap, so "ata" is found once in "atata"; wc -l then counts those lines

for dnaPattern in "$@"
do
#get the count of non-overlapping matches
patternCount=$(grep -o -e "$dnaPattern" "$fileName" | wc -l)
#print the pattern and its count ($patternCount is unquoted so padding from wc is dropped)
echo "$dnaPattern" $patternCount
done
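
To check the counting step against the example in the question, the sample DNA string can be written to a temporary file and run through the same grep -o and wc -l pipeline. This is only a sketch: the function name sample_counts is hypothetical, and mktemp is assumed to be available.

```shell
#!/bin/bash
# Hypothetical check (not part of the required script): print each pattern
# followed by its non-overlapping count in the file given as $1.
sample_counts() {
    local file=$1
    shift
    for p in "$@"
    do
        # the unquoted command substitution drops any padding from wc
        echo "$p" $(grep -o -e "$p" "$file" | wc -l)
    done
}

# recreate the sample dna file from the question in a temporary file
dnafile=$(mktemp)
printf 'aaccgtttgtaaccggaac\n' > "$dnafile"

sample_counts "$dnafile" ttt aac ggg aaccg
# prints:
# ttt 1
# aac 3
# ggg 0
# aaccg 2

rm -f "$dnafile"
```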
