CSCI 132: Practical Unix and Programming
Adjunct: Trami Dang
Fall 2018

Assignment 4 [1]
This set of exercises will strengthen your ability to write
relatively simple shell scripts
using various filters. As always, your goals should be clarity,
efficiency, and simplicity. It
has two parts.
1. The background context that was provided in the previous
assignment is repeated here
for your convenience. A DNA string is a sequence of the letters
a, c, g, and t in any
order, whose length is a multiple of three [2]. For example,
aacgtttgtaaccagaactgt
is a DNA string of length 21. Each sequence of three
consecutive letters is called a codon.
For example, in the preceding string, the codons are aac, gtt,
tgt, aac, cag, aac,
and tgt.
Your task is to write a script named codonhistogram that
expects a file name on the
command line. This file is supposed to be a dna textfile, which
means that it contains
only a DNA string with no newline characters or white space
characters of any kind; it is
a sequence of the letters a, c, g, and t of length 3n for some n.
The script must count the
number of occurrences of every codon in the file, assuming the
first codon starts at
position 1 [3], and it must output the number of times each codon
occurs in the file, sorted
in order of decreasing frequency. For example, if dnafile is a
file containing the dna
string aacgtttgtaaccagaactgt, then the command
codonhistogram dnafile
should produce the following output:
3 aac
2 tgt
1 cag
1 gtt
because there are 3 aac codons, 2 tgt, 1 cag, and 1 gtt. Notice
that frequency comes
first, then the codon name.
[1] This is licensed under the Creative Commons
Attribution-NonCommercial-ShareAlike 4.0 International License.
To view a copy of this license, visit
http://creativecommons.org/licenses/by-nc-sa/4.0/.
[2] This is really just a simplification to make the assignment
easier. In reality, the length is not necessarily a multiple of 3.
[3] Those of you who know a little about genomics know that the
open reading frame can be shifted to get a different set of codons.
If you know this much, assume that there is only one open reading
frame: the one starting at position 1.
Important: If two or more codons have the same frequency, your
script should break the
tie using alphabetical order of the codons. In this example, cag
and gtt each occur just
once, but because c precedes g, cag comes before gtt above.
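One way to get this ordering, offered only as a sketch and not as the required solution, is to sort on two keys: the count in decreasing numeric order, then the codon in alphabetical order. The function name codon_counts is hypothetical, and it reads the DNA string from standard input rather than taking a file name argument.

```bash
#!/bin/bash
# Hypothetical helper (the name is not from the assignment): reads a DNA
# string on standard input and prints one "count codon" line per codon.
codon_counts() {
    fold -w 3 |                  # break the string into one codon per line
    sort |                       # bring identical codons together
    uniq -c |                    # prefix each distinct codon with its count
    awk '{ print $1, $2 }' |     # drop the padding that uniq -c adds
    sort -k1,1nr -k2,2           # count descending, then codon a to z
}
```

It could be tried with codon_counts < dnafile. The second sort key only matters when two counts are equal, which is exactly the tie-breaking case described above.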
Error checking: The script should check that it has at least one
argument. If it is missing
an argument, it should exit with the usage message
codonhistogram <dnafile>
If it has an argument, it must check that it is the name of an
ordinary file that it can read.
If it cannot, it must exit with the message
codonhistogram: cannot open file <filename argument> for reading
It must check that the file has a number of characters that is a
multiple of 3 and that it has
only the characters a, c, g, and t and no others. If the file ends
with a newline character,
then the length must be a multiple of 3 plus 1. For any file not
satisfying these
constraints, it must exit with an error message.
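The file checks above could be sketched as follows. The function name is_dna_file is hypothetical, and the sketch makes one simplifying assumption: because command substitution strips every trailing newline, a file ending in more than one newline is also accepted.

```bash
#!/bin/bash
# Hypothetical helper: succeeds (status 0) only if the file named by $1
# is a readable ordinary file holding a DNA string of length 3n.
is_dna_file() {
    local content
    [ -f "$1" ] && [ -r "$1" ] || return 1   # ordinary file we can read?
    content=$(cat "$1")                      # $(...) drops trailing newlines
    case $content in
        ''|*[!acgt]*) return 1 ;;            # empty, or a character other than a, c, g, t
    esac
    [ $(( ${#content} % 3 )) -eq 0 ]         # length must be a multiple of 3
}
```

A script could call it as is_dna_file "$1" || exit 1 after printing whatever error message it chooses.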
2. Write a script called atomcoordinates that will accept the
name of a PDB file as
its only command line argument. Given this PDB file, it will
find all lines that start with
the word ATOM and will display, for each line that it finds, a
line of output containing the
atom's serial number and coordinates. For example, a line in the
PDB file that looks like
this:
ATOM     18  CB  GLN A   3      83.556  52.126  45.080  1.00 26.06           C
would result in the following output line being displayed:
18 83.556 52.126 45.080
because the atom's serial number is 18 and its coordinates are
83.556, 52.126, and
45.080. How do you know where this information is? In a PDB
file, the data is in
specific columns. In particular, the atom's serial number is
always in columns 7 through
11, and the three coordinates start in column 31 and end in
column 54. Therefore, your
script has to extract the serial number and the coordinates from
these columns and display
them. Your job is to decide which filters can achieve this. This
will take some research.
Figure out which filters will work the best.
Error checking: Your script must check that it has at least one
command line argument,
and that it is a file that it can read. It must display a message if
either of these is not true.
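As a starting point for that research, here is a minimal sketch built from grep, cut, tr, and sed. It is one possible combination, not necessarily the best one. The function name atom_coords is hypothetical, and it assumes the input keeps the fixed column layout described above. If two coordinates ever fill their columns completely, they will run together in this output.

```bash
#!/bin/bash
# Hypothetical helper: reads PDB text on standard input and prints the
# serial number and coordinates of every line that starts with ATOM.
atom_coords() {
    grep '^ATOM ' |          # keep only lines that start with the word ATOM
    cut -c7-11,31-54 |       # columns 7-11 (serial) and 31-54 (coordinates)
    tr -s ' ' |              # squeeze each run of spaces down to one
    sed 's/^ //'             # remove the single leading space, if any
}
```

It could be tried with atom_coords < pdbfile, where pdbfile is any PDB file you have at hand.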
Grading Rubric
This homework is graded on a 100-point scale. Each script is
worth 50 points. Each script
will be graded on its correctness foremost. This means that it
does exactly what the
assignment states it must do, in detail. Correctness is worth
70% of the grade. Then it is
graded on its clarity, simplicity, and efficiency, as described
above. These qualitative
measures are worth 30% of the grade.
Submitting the Homework
Due Date: This assignment is due by the end of the day (i.e.,
11:59 PM EST) on Wednesday, October 31st. I will let the class know
when the Blackboard submission for this assignment opens.
If you complete the assignment before I announce that the Blackboard
submission is open, you may email your assignment to me, but only as
a zip archive.
Submission details
Submit either a PDF showing the commands you entered, with
screenshots of all output, or a zip file.
For remote logins: ssh to eniac.cs.hunter.cuny.edu with your
valid username
and password, and then ssh into any cslab host.
1. In your own home directory, create a directory named
assignment4_username
where username is your Linux Lab account username.
2. Put copies of the two scripts that you have written into this
directory. Make sure they
are named codonhistogram and atomcoordinates.
3. Run the commands:
$ zip -r assignment4_username.zip assignment4_username/
$ chmod 755 assignment4_username.zip
This will create the file assignment4_username.zip.
For Linux Lab users: once you have made the zip file, navigate
to its location in the file-
system and upload to Blackboard. For anyone working on the
assignment remotely, use
the scp command to securely copy it to your local computer, and
then upload the file to
Blackboard.
$ scp <Your.Username@eniac.cs.hunter.cuny.edu>:<path_of_zip_file> <desired_path>
There is no whitespace on either side of the colon. Your login,
Your.Username@eniac.cs.hunter.cuny.edu, goes before the colon. The
<path_of_zip_file> is the absolute path of the zip file on the remote
machine and goes after the colon. Then type a space and specify the
<desired_path> on your local file-system where you would like to put
your zip file. If you run the command properly, it should bring up a
password prompt from eniac.cs.hunter.cuny.edu. The zip file will be
placed in your specified location. Now you are ready to upload your
zip file to Blackboard.
CSCI 132: Practical Unix and Programming
Adjunct: Trami Dang
Fall 2018

Assignment 3 [1]
Summary
The purpose of this assignment is to give you some practice in
bash scripting. When
you write a bash script, you are really writing a program in the
bash programming
language. bash is not just a shell, but a programming language
as well, and a bash
script can just as well be called a bash program. It is mentioned
now because very soon
you will begin writing programs in another programming
language, Perl; this is the first
in a sequence of small steps in mastering Perl.
We have been calling your programs shell scripts. A script is a
program, make no bones
about it. Scripts are programs written in a scripting language,
which is a special kind of
programming language. All scripting languages are
programming languages, but not vice
versa. The distinction will be explained in a later lecture. In this
case, bash is both a
programming language and a scripting language.
This assignment will begin with a review of some of the things
that have been covered in
class, and then introduce a few things not covered in class.
Some Important bash Instructions
A bash instruction is also called a statement. For example, the
if-instruction
if test $# -ne 2
then
    echo "usage: $0 arg1 arg2"
    exit 1
fi
echo "User input: $1 and $2"
is usually referred to as the if-statement. The test condition in
this case is $# -ne 2. If the test is true, that is, if the number of
command line parameters ($#) is not equal to 2 (-ne 2), the
statements between then and fi execute. If the test is not true, bash
skips those statements and executes what comes after the
if-statement, in this case the second echo statement.
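The same test can also be written with square brackets, since [ ... ] is another spelling of the test command. The sketch below wraps it in a function so it can be tried without leaving the shell. The function name need_two_args is hypothetical, and inside a function $# counts the function's own arguments rather than the script's.

```bash
#!/bin/bash
# Hypothetical function: the same check as an if-statement using [ ... ].
need_two_args() {
    if [ $# -ne 2 ]          # true when the argument count is not 2
    then
        echo "usage: need_two_args arg1 arg2" >&2
        return 1             # a nonzero status tells the caller it failed
    fi
    echo "User input: $1 and $2"
}
```

Sending the usage message to standard error (>&2) is a design choice made here, so that it does not mix with the normal output.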
[1] This is licensed under the Creative Commons
Attribution-NonCommercial-ShareAlike 4.0 International License.
To view a copy of this license, visit
http://creativecommons.org/licenses/by-nc-sa/4.0/.
The bash programming language has several statements that are
known as looping
statements. A looping statement is one that makes it possible to
repeat a sequence of
statements one or more times.
bash has a looping statement called the while-statement, whose form
(syntax) is
while <expression>
do
<list-of-statements>
done
in which <expression> is a statement such as the test command,
or any other statement
that can be evaluated as being true or false, and <list-of-
statements> is any sequence of
statements (including looping statements.) The following
snippet (little piece) of a script
shows one example of a while-statement:
echo -n "Try to guess my favorite color: "
read guess rest_of_line
mycolor=`cat secretfile`
# read about backquoted commands like `cat file`
while [ "$guess" != "$mycolor" ]
do
    echo -n "Sorry, that is not my favorite color. Try again: "
    read guess rest_of_line
done
In the above script, the <expression> part of the while-statement is
[ "$guess" != "$mycolor" ]
and the <list-of-statements> is the list of two lines
echo -n "Sorry, that is not my favorite color. Try again: "
read guess rest_of_line
The above script will test whether guess is the same string as
mycolor, and if it is not,
it will execute the echo and read statements and then re-
evaluate the test command
that compares guess and mycolor. It will keep doing this until
the user enters a string
that is identical to the string stored in mycolor. When they do
match, the expression
becomes false and the “while loop” is exited.
A while-statement is usually called a while loop because if we
visualize the sequence of
executed statements as being connected by an imaginary thread,
then this thread loops
around and around the lines of the script.
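Here is a small variation on the guessing loop, offered as a sketch. It takes the secret color as an argument instead of reading secretfile, counts the guesses, and gives up when the input runs out. The function name count_guesses is hypothetical. The variables inside the brackets are quoted so that an empty guess is still compared as a (empty) string instead of disappearing from the test.

```bash
#!/bin/bash
# Hypothetical function: reads guesses from standard input until one
# matches the color given as $1, then reports how many it took.
count_guesses() {
    local mycolor=$1 guess rest_of_line tries=1
    read -r guess rest_of_line
    while [ "$guess" != "$mycolor" ]     # quoted, so an empty guess is safe
    do
        tries=$((tries + 1))
        read -r guess rest_of_line || return 1   # no more input: give up
    done
    echo "Guessed after $tries tries"
}
```

Run interactively, it waits for one guess per line. The loop body runs zero times if the first guess is already right.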
Bash also has a for-loop. (It has other loops too.) The for-loop
is very different from the
while-loop. It has two forms. One form (again the proper term is
syntax) is
for <variable> in <argument-list>
do
<list-of-statements>
done
and the other is
for <variable>
do
<list-of-statements>
done
The <variable> can be any valid variable name (words starting
with letters and
containing letters, digits, and the underscore character.) The
<argument-list> can be any
sequence of words, including words that look like numbers.
Examples of this are
for number in 1 2 3 4 5 6 7 8 9 10
for name in John Jacob Judy Jocelyn
for word in $*
As you can see, this can be very powerful.
As with the while-loop, the list of statements is any list of
statements, but the intention is
that the variable plays a role in this list. For example, the script
let sum=0
for number in 1 2 3 4 5 6 7 8 9 10
do
let square=$number*$number
let sum=$sum+$number
echo The square of $number is $square
done
echo The sum of the numbers is $sum.
displays ten lines showing the squares of the first ten positive
integers and then displays the sum of those integers (not of their
squares). Notice how the sum is calculated.
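The same accumulating pattern can be written with arithmetic expansion, $(( )), instead of let. The sketch below is a variant that adds up the squares rather than the numbers themselves. The function name sum_of_squares is hypothetical.

```bash
#!/bin/bash
# Hypothetical function: prints the sum of the squares of its arguments.
sum_of_squares() {
    local sum=0 number
    for number in "$@"                    # one iteration per argument
    do
        sum=$(( sum + number * number ))  # no $ needed inside $(( ))
    done
    echo "$sum"
}
```

For example, sum_of_squares 1 2 3 prints 14. The variable sum has to be set to 0 before the loop starts, just as in the script above.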
The second form of the for-loop does not need an argument list.
It automatically assigns
to the variable the successive words from the command line
arguments of the script when
it is run:
for word
do
    echo $word.
done
It is the same as
for word in "$@"
do
    echo $word.
done
It just prints the words found on the command line one after the
other on separate lines.
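One detail is worth trying for yourself: the form without an argument list keeps each command line argument whole, while an unquoted $* splits arguments again wherever they contain spaces. The two hypothetical functions below count loop iterations so the difference is visible. They assume the shell's default word splitting (IFS unchanged).

```bash
#!/bin/bash
# Hypothetical functions: each prints how many times its loop body ran.
count_unquoted() {
    local n=0 word
    for word in $*           # unquoted: arguments are split again at spaces
    do
        n=$((n + 1))
    done
    echo "$n"
}

count_default() {
    local n=0 word
    for word                 # no list: one iteration per argument, unsplit
    do
        n=$((n + 1))
    done
    echo "$n"
}
```

Called with the two arguments "John Jacob" and Judy, count_unquoted prints 3 and count_default prints 2.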
Your Tasks
This assignment consists of two exercises in writing relatively
simple shell scripts. The
objectives when writing any script are
clarity: the script should be easy to understand by someone with
a basic knowledge of UNIX;
efficiency: the script should use the least resources possible; and
simplicity: the script should be as simple as possible.
An example will demonstrate. Suppose we needed a script that
would count the number
of lines in a file named molecule containing the word 'ATOM'
anywhere on the line.
The following script would achieve this:
#!/bin/bash
grep ' ATOM ' molecule >| atomcount
wc -l atomcount >| answer
rm atomcount
cat answer
rm answer
but it is very inefficient (it needlessly creates files and then
removes them), it is hard to
understand because the reader spends more time reading it and
may not be familiar with
certain operators such as >|, and it is not as simple as it could
be. A simple, well-
documented, and efficient solution is
#!/bin/bash
# Displays how many lines in file molecule contain ATOM as
# a complete word
# Written by Stewart Weiss
grep -c ' ATOM ' molecule
# The -c option to grep counts matching lines
It has comments to explain what it does and it achieves it with a
single command that can
be looked up easily.
Your job is to apply these ideas as you create solutions to the
following exercises.
1. The last command lists information about who has logged
into the computer on
which it is run. In particular, it has a column with the username,
the terminal on which
the user was connected, the internet address (the IP address)
from which they connected
to the computer, and the date and time that they logged in and
then logged out if they did.
If they logged out it also displays the total time they were
logged in. For example, this is
an entry for Dr. Weiss on cslab12:
sweiss pts/11 146.95.214.131 Thu Sep 14 13:05 - 14:27 (01:22)
If the username is too long it is truncated, but there are
options to display the full
username. For this exercise you are to write a bash script named
logincount that
takes a list of usernames as its command line arguments and
displays on the screen, for
each user name, a message of the form
Number of times that <username> logged into this machine is
<N>
where <N> is to be replaced by the number of records that the
last command output that
match <username> exactly. For example, if I enter the command
logincount sweiss
it should output something like
Number of times that sweiss logged into this machine is 7
If a name given as an argument is not a username, nothing is
printed for that name. On
the other hand, if no names are given, it is an error and the
command should display the
error message, “Usage: logincount <list of usernames>”.
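The counting step by itself might be sketched like this. It reads records in the format of the last command from standard input, so that it can be tried on saved output. The function name count_logins is hypothetical, and the sketch assumes that the username starts each record and is followed by a space. It does not handle the rule about names that are not usernames.

```bash
#!/bin/bash
# Hypothetical helper: counts the records on standard input whose first
# word is exactly the username given as $1.
count_logins() {
    # -c prints the number of matching lines. The ^ anchors the match at
    # the start of the line, and the space after the name keeps sweiss
    # from also matching a longer name such as sweiss2.
    grep -c "^$1 "
}
```

It could be tried with last | count_logins sweiss.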
2. A DNA string is a sequence of the letters a, c, g, and t in any
order. For example,
aacgtttgtaaccag is a DNA string of length 15. Each sequence of
three consecutive
letters is called a codon. For example, in the preceding string,
the codons are aac, gtt,
tgt, aac, and cag. If we ignored the first letter and started listing
the codons starting at
the second a, the codons would be acg, ttt, gta, and acc, and we
would ignore the
last ag. The letters are called bases.
A DNA string can be hundreds of thousands of codons long,
even millions of codons
long, which means that it is infeasible to count them by hand. It
would be useful to have a
simple script that could count the number of occurrences of a
specific codon in such a
string. For instance, for the example string above such a script
would tell us that aac
occurs twice and tgt occurs once.
Generally, we want to be able to find occurrences of arbitrary
sequences of bases in a
given DNA string, such as how many times ttatg occurs, or how
many times
cgacgattag occurs.
Your job is to write a script named countmatches that expects at
least two arguments
on the command line. The first argument is the pathname of a
file containing a valid
DNA string with no newline characters or white space
characters of any kind within it. (It
will be terminated with a newline character.) This file contains
nothing but a sequence of
the letters a, c, g, and t. DNA text files that you can give to your
script as the file argument are located at
/data/biocs/b/student.accounts/cs132/data/dna_textfiles
The remaining arguments are strings containing only the bases
a, c, g, and t in any
order. For each valid argument string, it will search the DNA
string in the file and count
how many non-overlapping occurrences of that argument string
are in the DNA string. To
make sure you understand what non-overlapping means, the
string ata occurs just once
in the string atata, not twice, because the two occurrences
overlap.
If your script is called correctly, it will output for each
argument a line containing the
argument string followed by how many times it occurs in the
string. If it finds no
occurrences, it should output 0 as a count.
For example, if the string aaccgtttgtaaccggaac is in a file named
dnafile, then
your script should work like this:
$ countmatches dnafile ttt
ttt 1
$ countmatches dnafile aac ggg aaccg
aac 3
ggg 0
aaccg 2
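To check your understanding of non-overlapping counts, here is a sketch of the counting step alone. It relies on the -o option found in GNU and BSD grep, which prints each match on its own line and resumes searching after the end of the previous match. The function name count_nonoverlapping is hypothetical, and it reads the DNA string from standard input instead of from a file argument.

```bash
#!/bin/bash
# Hypothetical helper: prints how many non-overlapping occurrences of the
# string $1 appear in the text on standard input.
count_nonoverlapping() {
    grep -o -- "$1" |        # -o prints every match on a line of its own
    wc -l |                  # so the line count is the number of matches
    tr -d ' '                # some versions of wc pad the number with spaces
}
```

With the string atata on standard input, count_nonoverlapping ata prints 1, which agrees with the explanation above.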
Warning: if it is given valid arguments, the script is not to
output anything except
the strings and their associated counts. No fancy messages, no
words! The script
should check that the first argument is a file name and that there
is at least one other
argument after it. If the first argument is not a file name, or if it
is missing anything after
the filename, the script should print a how-to-use-me message
and then exit. It is not
required to check that the file is in the proper form, or that the
string contains nothing but
the letters a, c, g, and t.
Hint: You can solve this problem using grep and one other
command that appears in
this document. Although there are other filters, you do not need
them to solve this
problem. You have to read more about grep to know how to use
it. The other command
has appeared in the slides already.
Grading Rubric
This homework is graded on a 100-point scale. Each script is
worth 50 points. Each script
will be graded on its correctness foremost. This means that it
does exactly what the
assignment states it must do, in detail. Correctness is worth
70% of the grade. Then it is
graded on its clarity, simplicity, and efficiency, as described
above. These qualitative
measures are worth 30% of the grade.
Submitting the Homework
Due Date: This assignment is due by the end of the day (i.e.,
11:59 PM EST) on Wednesday, October 31st. I will let the class know
when the Blackboard submission for this assignment opens.
If you complete the assignment before I announce that the Blackboard
submission is open, you may email your assignment to me, but only as
a zip archive.
Submission details
Submit either a PDF showing the commands you entered, with
screenshots of all output, or a zip file.
For remote logins: ssh to eniac.cs.hunter.cuny.edu with your
valid username
and password, and then ssh into any cslab host.
1. In your own home directory, create a directory named
assignment3_username
where username is your Linux Lab account username.
2. Put copies of the two scripts that you have written into this
directory. Make sure they
are named logincount and countmatches.
3. Run the commands:
$ zip -r assignment3_username.zip assignment3_username/
$ chmod 755 assignment3_username.zip
This will create the file assignment3_username.zip.
For Linux Lab users: once you have made the zip file, navigate
to its location in the file-
system and upload to Blackboard. For anyone working on the
assignment remotely, use
the scp command to securely copy it to your local computer, and
then upload the file to
Blackboard.
$ scp <Your.Username@eniac.cs.hunter.cuny.edu>:<path_of_zip_file> <desired_path>
There is no whitespace on either side of the colon. Your login,
Your.Username@eniac.cs.hunter.cuny.edu, goes before the colon. The
<path_of_zip_file> is the absolute path of the zip file on the remote
machine and goes after the colon. Then type a space and specify the
<desired_path> on your local file-system where you would like to put
your zip file. If you run the command properly, it should bring up a
password prompt from eniac.cs.hunter.cuny.edu. The zip file will be
placed in your specified location. Now you are ready to upload your
zip file to Blackboard.
mydrynan
 
CSCI 561Research Paper Topic Proposal and Outline Instructions.docx
CSCI 561Research Paper Topic Proposal and Outline Instructions.docx
mydrynan
 
CSCI 561 DB Standardized Rubric50 PointsCriteriaLevels of .docx
CSCI 561 DB Standardized Rubric50 PointsCriteriaLevels of .docx
mydrynan
 
CryptographyLesson 10© Copyright 2012-2013 (ISC)², Inc. Al.docx
CryptographyLesson 10© Copyright 2012-2013 (ISC)², Inc. Al.docx
mydrynan
 
CSCI 352 - Digital Forensics Assignment #1 Spring 2020 .docx
CSCI 352 - Digital Forensics Assignment #1 Spring 2020 .docx
mydrynan
 
CSCE 1040 Homework 2 For this assignment we are going to .docx
CSCE 1040 Homework 2 For this assignment we are going to .docx
mydrynan
 
CSCE509–Spring2019Assignment3updated01May19DU.docx
CSCE509–Spring2019Assignment3updated01May19DU.docx
mydrynan
 
CSCI 2033 Elementary Computational Linear Algebra(Spring 20.docx
CSCI 2033 Elementary Computational Linear Algebra(Spring 20.docx
mydrynan
 
CSCE 3110 Data Structures & Algorithms Summer 2019 1 of .docx
CSCE 3110 Data Structures & Algorithms Summer 2019 1 of .docx
mydrynan
 
CSCI 340 Final Group ProjectNatalie Warden, Arturo Gonzalez, R.docx
CSCI 340 Final Group ProjectNatalie Warden, Arturo Gonzalez, R.docx
mydrynan
 
CSC-321 Final Writing Assignment In this assignment, you .docx
CSC-321 Final Writing Assignment In this assignment, you .docx
mydrynan
 
Cryptography is the application of algorithms to ensure the confiden.docx
Cryptography is the application of algorithms to ensure the confiden.docx
mydrynan
 
CSc3320 Assignment 6 Due on 24th April, 2013 Socket programming .docx
CSc3320 Assignment 6 Due on 24th April, 2013 Socket programming .docx
mydrynan
 
Cryptography KeysCryptography provides confidentiality, inte.docx
Cryptography KeysCryptography provides confidentiality, inte.docx
mydrynan
 
Ad

Recently uploaded (20)

What is FIle and explanation of text files.pptx
What is FIle and explanation of text files.pptx
Ramakrishna Reddy Bijjam
 
Introduction to Generative AI and Copilot.pdf
Introduction to Generative AI and Copilot.pdf
TechSoup
 
FEBA Sofia Univercity final diplian v3 GSDG 5.2025.pdf
FEBA Sofia Univercity final diplian v3 GSDG 5.2025.pdf
ChristinaFortunova
 
How to Manage Multi Language for Invoice in Odoo 18
How to Manage Multi Language for Invoice in Odoo 18
Celine George
 
Paper 107 | From Watchdog to Lapdog: Ishiguro’s Fiction and the Rise of “Godi...
Paper 107 | From Watchdog to Lapdog: Ishiguro’s Fiction and the Rise of “Godi...
Rajdeep Bavaliya
 
FIRST DAY HIGH orientation for mapeh subject in grade 10.pptx
FIRST DAY HIGH orientation for mapeh subject in grade 10.pptx
GlysdiEelesor1
 
june 10 2025 ppt for madden on art science is over.pptx
june 10 2025 ppt for madden on art science is over.pptx
roger malina
 
THERAPEUTIC COMMUNICATION included definition, characteristics, nurse patient...
THERAPEUTIC COMMUNICATION included definition, characteristics, nurse patient...
parmarjuli1412
 
ICT-8-Module-REVISED-K-10-CURRICULUM.pdf
ICT-8-Module-REVISED-K-10-CURRICULUM.pdf
penafloridaarlyn
 
How to Manage Inventory Movement in Odoo 18 POS
How to Manage Inventory Movement in Odoo 18 POS
Celine George
 
Sustainable Innovation with Immersive Learning
Sustainable Innovation with Immersive Learning
Leonel Morgado
 
ROLE PLAY: FIRST AID -CPR & RECOVERY POSITION.pptx
ROLE PLAY: FIRST AID -CPR & RECOVERY POSITION.pptx
Belicia R.S
 
Revista digital preescolar en transformación
Revista digital preescolar en transformación
guerragallardo26
 
BUSINESS QUIZ PRELIMS | QUIZ CLUB OF PSGCAS | 9 SEPTEMBER 2024
BUSINESS QUIZ PRELIMS | QUIZ CLUB OF PSGCAS | 9 SEPTEMBER 2024
Quiz Club of PSG College of Arts & Science
 
ABCs of Bookkeeping for Nonprofits TechSoup.pdf
ABCs of Bookkeeping for Nonprofits TechSoup.pdf
TechSoup
 
Introduction to problem solving Techniques
Introduction to problem solving Techniques
merlinjohnsy
 
How to Create an Event in Odoo 18 - Odoo 18 Slides
How to Create an Event in Odoo 18 - Odoo 18 Slides
Celine George
 
Assisting Individuals and Families to Promote and Maintain Health – Unit 7 | ...
Assisting Individuals and Families to Promote and Maintain Health – Unit 7 | ...
RAKESH SAJJAN
 
2025 June Year 9 Presentation: Subject selection.pptx
2025 June Year 9 Presentation: Subject selection.pptx
mansk2
 
How to Manage & Create a New Department in Odoo 18 Employee
How to Manage & Create a New Department in Odoo 18 Employee
Celine George
 
What is FIle and explanation of text files.pptx
What is FIle and explanation of text files.pptx
Ramakrishna Reddy Bijjam
 
Introduction to Generative AI and Copilot.pdf
Introduction to Generative AI and Copilot.pdf
TechSoup
 
FEBA Sofia Univercity final diplian v3 GSDG 5.2025.pdf
FEBA Sofia Univercity final diplian v3 GSDG 5.2025.pdf
ChristinaFortunova
 
How to Manage Multi Language for Invoice in Odoo 18
How to Manage Multi Language for Invoice in Odoo 18
Celine George
 
Paper 107 | From Watchdog to Lapdog: Ishiguro’s Fiction and the Rise of “Godi...
Paper 107 | From Watchdog to Lapdog: Ishiguro’s Fiction and the Rise of “Godi...
Rajdeep Bavaliya
 
FIRST DAY HIGH orientation for mapeh subject in grade 10.pptx
FIRST DAY HIGH orientation for mapeh subject in grade 10.pptx
GlysdiEelesor1
 
june 10 2025 ppt for madden on art science is over.pptx
june 10 2025 ppt for madden on art science is over.pptx
roger malina
 
THERAPEUTIC COMMUNICATION included definition, characteristics, nurse patient...
THERAPEUTIC COMMUNICATION included definition, characteristics, nurse patient...
parmarjuli1412
 
ICT-8-Module-REVISED-K-10-CURRICULUM.pdf
ICT-8-Module-REVISED-K-10-CURRICULUM.pdf
penafloridaarlyn
 
How to Manage Inventory Movement in Odoo 18 POS
How to Manage Inventory Movement in Odoo 18 POS
Celine George
 
Sustainable Innovation with Immersive Learning
Sustainable Innovation with Immersive Learning
Leonel Morgado
 
ROLE PLAY: FIRST AID -CPR & RECOVERY POSITION.pptx
ROLE PLAY: FIRST AID -CPR & RECOVERY POSITION.pptx
Belicia R.S
 
Revista digital preescolar en transformación
Revista digital preescolar en transformación
guerragallardo26
 
ABCs of Bookkeeping for Nonprofits TechSoup.pdf
ABCs of Bookkeeping for Nonprofits TechSoup.pdf
TechSoup
 
Introduction to problem solving Techniques
Introduction to problem solving Techniques
merlinjohnsy
 
How to Create an Event in Odoo 18 - Odoo 18 Slides
How to Create an Event in Odoo 18 - Odoo 18 Slides
Celine George
 
Assisting Individuals and Families to Promote and Maintain Health – Unit 7 | ...
Assisting Individuals and Families to Promote and Maintain Health – Unit 7 | ...
RAKESH SAJJAN
 
2025 June Year 9 Presentation: Subject selection.pptx
2025 June Year 9 Presentation: Subject selection.pptx
mansk2
 
How to Manage & Create a New Department in Odoo 18 Employee
How to Manage & Create a New Department in Odoo 18 Employee
Celine George
 
Ad

CSCI  132  Practical  Unix  and  Programming   .docx

  • 6. Fall 2018
    Assignment 4¹
    This set of exercises will strengthen your ability to write relatively simple shell scripts using various filters. As always, your goals should be clarity, efficiency, and simplicity. The assignment has two parts.
  • 7. 1. The background context that was provided in the previous assignment is repeated here for your convenience. A DNA string is a sequence of the letters a, c, g, and t in any order, whose length is a multiple of three². For example, aacgtttgtaaccagaactgt is a DNA string of length 21. Each sequence of three consecutive letters is called a codon. For example, in the preceding string, the codons are aac, gtt, tgt, aac, cag, aac, and tgt.
    Your task is to write a script named codonhistogram that expects a file name on the command line. This file is supposed to be a DNA text file, which means that it contains only a DNA string with no newline characters or white space characters of any kind; it is a sequence of the letters a, c, g, and t of length 3n for some n. The script must count the number of occurrences of every codon in the file, assuming the first codon starts at position 1³, and it must output the number of times each codon occurs in the file, sorted in order of decreasing frequency. For example, if dnafile is a file containing the DNA string aacgtttgtaaccagaactgt, then the command
        codonhistogram dnafile
    should produce the following output:
        3 aac
        2 tgt
        1 cag
        1 gtt
  • 8. because there are 3 aac codons, 2 tgt, 1 cag, and 1 gtt. Notice that the frequency comes first, then the codon name.
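One possible way to build such a histogram from standard filters is sketched below. It is an illustration only, not the assignment's official solution: the function name codon_counts is hypothetical, the DNA string is read from standard input rather than from a file argument, and none of the required error checking is included. The second sort key orders codons with equal counts alphabetically, which agrees with the sample output above.

```bash
#!/bin/bash
# Sketch: print "count codon" lines for a DNA string given on standard input.
# codon_counts is a hypothetical helper name used only for this illustration.
codon_counts() {
    tr -d '\n' |              # drop a trailing newline, if there is one
    fold -w 3 |               # break the string into 3-character lines (codons)
    sort |                    # bring identical codons together
    uniq -c |                 # prefix each distinct codon with its count
    sort -k1,1nr -k2,2 |      # count descending, then codon ascending
    awk '{ print $1, $2 }'    # remove the padding that uniq -c adds
}

# Demonstration with the example string from the assignment text
printf 'aacgtttgtaaccagaactgt\n' | codon_counts
```

Each stage does one small job, so the pipeline can be checked by running it one stage at a time and inspecting the intermediate output.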
  • 9. ¹ This is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/.
    ² This is really just a simplification to make the assignment easier. In reality, the length is not necessarily a multiple of 3.
  • 10. ³ Those of you who know a little about genomics know that the open reading frame can be shifted to get a different set of codons. If you know this much, assume that there is only one open reading frame: the one starting at position 1.
  • 16. Important: If two or more codons have the same frequency, your script should break the tie using alphabetical order of the codons. In this example, cag and gtt each occur just once, but because c precedes g, cag comes before gtt above.
    Error checking: The script should check that it has at least one argument. If it is missing an argument, it should exit with the usage message
        codonhistogram <dnafile>
    If it has an argument, it must check that it is the name of an ordinary file that it can read. If it cannot read the file, it must exit with the message
        codonhistogram: cannot open file <filename argument> for reading
    It must check that the file has a number of characters that is a multiple of 3 and that it has only the characters a, c, g, and t and no others. If the file ends with a newline character, then the length must be a multiple of 3 plus 1. For any file not satisfying these constraints, it must exit with an error message.
    2. Write a script called atomcoordinates that will accept the name of a PDB file as its only command line argument. Given this PDB file, it will find all lines that start with the word ATOM and will display, for each line that it finds, a line of output containing the atom's serial number and coordinates. For example, a line in the PDB file that looks like this:
  • 17. 
ATOM     18  CB  GLN A   3      83.556  52.126  45.080  1.00 26.06           C
    would result in the following output line being displayed:
        18 83.556 52.126 45.080
    because the atom's serial number is 18 and its coordinates are 83.556, 52.126, and 45.080.
    How do you know where this information is? In a PDB file, the data is in specific columns. In particular, the atom's serial number is always in columns 7 through 11, and the three coordinates start in column 31 and end in column 54. Therefore, your script has to extract the serial number and the coordinates from these columns and display them. Your job is to decide which filters can achieve this. This will take some research. Figure out which filters will work the best.
    Error checking: Your script must check that it has at least one command line argument, and that it is a file that it can read. It must display a message if either of these is not true.
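The column-based extraction described above might be sketched as follows. This is one possible approach under stated assumptions, not the required solution: the function name atom_coords is hypothetical, awk is only one of several filters that could be used, the three coordinates are assumed to occupy equal 8-character fields (columns 31-38, 39-46, and 47-54), and the required argument checks are left out. The sample line is built with printf field widths so that its columns are exact.

```bash
#!/bin/bash
# Sketch: print "serial x y z" for every line that starts with the word ATOM.
# atom_coords is a hypothetical name; it reads the files named as arguments,
# or standard input when no file is given.
atom_coords() {
    awk '/^ATOM / {
        serial = substr($0, 7, 5)    # columns 7-11
        x = substr($0, 31, 8)        # columns 31-38 (assumed field width 8)
        y = substr($0, 39, 8)        # columns 39-46
        z = substr($0, 47, 8)        # columns 47-54
        gsub(/ /, "", serial); gsub(/ /, "", x)
        gsub(/ /, "", y);      gsub(/ /, "", z)
        print serial, x, y, z
    }' "$@"
}

# Build the example line with exact column positions, then extract from it
printf '%-6s%5s %-4s%1s%3s %1s%4s%1s   %8s%8s%8s%6s%6s\n' \
    ATOM 18 ' CB' '' GLN A 3 '' 83.556 52.126 45.080 1.00 26.06 | atom_coords
```

Extracting by character position rather than by whitespace-separated field matters here because adjacent PDB fields are not guaranteed to have a space between them.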
  • 23. Grading Rubric
    This homework is graded on a 100-point scale. Each script is worth 50 points. Each script will be graded foremost on its correctness, meaning that it does exactly what the assignment states it must do, in detail. Correctness is worth 70% of the grade. It is then graded on its clarity, simplicity, and efficiency, as described above. These qualitative measures are worth 30% of the grade.
    Submitting the Homework
    Due Date: This assignment is due by the end of the day (i.e., 11:59 PM EST) on Wednesday, October 31st. I will let the class know when this assignment is to be submitted to Blackboard as an assignment submission.
  • 24. If you complete the assignment before I announce that the Blackboard assignment submission is open, you may send your assignment to my email, only as a zip archive.
    Submission details
    Submit either a PDF showing the commands you entered, with screenshots of all output, or a zip file. For remote logins: ssh to eniac.cs.hunter.cuny.edu with your valid username and password, and then ssh into any cslab host.
    1. In your own home directory, create a directory named assignment4_username where username is your Linux Lab account username.
    2. Put copies of the two scripts that you have written into this directory. Make sure they are named codonhistogram and atomcoordinates.
    3. Run the commands:
        $ zip -r assignment4_username.zip assignment4_username/
        $ chmod 755 assignment4_username.zip
    This will create the file assignment4_username.zip.
    For Linux Lab users: once you have made the zip file, navigate to its location in the file system and upload it to Blackboard. For anyone working on the assignment remotely, use the scp command to securely copy it to your local computer, and then upload the file to
  • 30. Blackboard.
        $ scp <Your.Username@eniac.cs.cuny.edu>:<path_of_zip_file> <desired_path>
    There is no whitespace on either side of the colon. Your login, Your.Username@eniac.cs.cuny.edu, is named before the colon.
  • 31. The <path_of_zip_file> is the absolute path on the remote machine, named after the colon. Then type a space and specify the <desired_path> on your local file system where you would like to put your zip file. If you run the command properly, it should bring up a password prompt from eniac.cs.hunter.cuny.edu. The zip file will be placed in your specified location. Now you are ready to upload your zip file to Blackboard.
  • 37. Assignment 3¹
    Summary
    The purpose of this assignment is to give you some practice in bash scripting. When you write a bash script, you are really writing a program in the bash programming language. bash is not just a shell, but a programming language as well, and a bash script can just as well be called a bash program. This is mentioned now because very soon you will begin writing programs in another programming language, Perl; this is the first in a sequence of small steps in mastering Perl.
    We have been calling your programs shell scripts. A script is a program, make no bones about it. Scripts are programs written in a scripting language, which is a special kind of programming language. All scripting languages are programming languages, but not vice versa. The distinction will be explained in a later lecture. In this case, bash is both a programming language and a scripting language.
    This assignment will begin with a review of some of the things that have been covered in class, and then introduce a few things not covered in class.
  • 38. Some Important bash Instructions
    A bash instruction is also called a statement. For example, the if-instruction
        if test $# -ne 2
        then
            echo "usage: $0 arg1 arg2"
            exit
        fi
        echo "User input: $1 and $2"
    is usually referred to as the if-statement. The test condition in this case is $# -ne 2. If the test is true, that is, if the number of command line parameters ($#) is not equal to 2 (-ne 2), the statements between then and fi execute. If the test is not true, bash will skip those statements and execute what comes after the if-statement, which here is the second echo statement.
  • 40. ¹ This is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/.
  • 46. The bash programming language has several statements that are known as looping statements. A looping statement is one that makes it possible to repeat a sequence of statements one or more times. bash has a looping statement called the while-statement, whose form (syntax) is
        while <expression>
        do
            <list-of-statements>
        done
    in which <expression> is a statement such as the test command, or any other statement that can be evaluated as being true or false, and <list-of-statements> is any sequence of statements (including looping statements). The following snippet (little piece) of a script shows one example of a while-statement:
        echo -n "Try to guess my favorite color:"
        read guess rest_of_line
        mycolor=`cat secretfile` # read about backquoted commands like `cat file`
        while [ $guess != $mycolor ]
        do
            echo -n "Sorry, that is not my favorite color. Try again: "
            read guess rest_of_line
        done
  • 47. In the above script, the <expression> part of the while-statement is
        [ $guess != $mycolor ]
    and the <list-of-statements> is the list of two lines
        echo -n "Sorry, that is not my favorite color. Try again: "
        read guess rest_of_line
    The above script will test whether guess is the same string as mycolor, and if it is not, it will execute the echo and read statements and then re-evaluate the test command that compares guess and mycolor. It will keep doing this until the user enters a string that is identical to the string stored in mycolor. When they do match, the expression becomes false and the "while loop" is exited. A while-statement is usually called a while loop because if we visualize the sequence of executed statements as being connected by an imaginary thread, then this thread loops around and around the lines of the script.
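A variant of the same loop is sketched below with the variables quoted inside the test. The function name guess_color and the idea of counting attempts are assumptions made for illustration; the secret color is passed as an argument instead of being read from secretfile, and guesses come from standard input. Quoting matters because if the user simply presses Enter, the unquoted test expands to [ != red ], which bash reports as an error and which ends the loop as though the guess had matched.

```bash
#!/bin/bash
# Sketch: keep reading guesses until one equals the secret color, then
# print how many guesses were needed. guess_color is a hypothetical name.
guess_color() {
    local mycolor=$1 guess rest_of_line
    local attempts=1
    read -r guess rest_of_line || return 1        # no input at all
    while [ "$guess" != "$mycolor" ]              # quoted: safe for empty guesses
    do
        read -r guess rest_of_line || return 1    # stop if the input runs out
        attempts=$((attempts + 1))
    done
    echo "$attempts"
}

# Four guesses, the second of which is an empty line
printf 'blue\n\ngreen extra words\nred\n' | guess_color red    # prints 4
```

As in the original snippet, the second variable given to read absorbs any extra words on the line, so only the first word is compared.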
  • 53. Bash also has a for-loop. (It has other loops too.) The for-loop is very different from the while-loop. It has two forms. One form (again, the proper term is syntax) is
        for <variable> in <argument-list>
        do
            <list-of-statements>
        done
    and the other is
        for <variable>
        do
            <list-of-statements>
        done
  • 54. The <variable> can be any valid variable name (words starting with letters and containing letters, digits, and the underscore character). The <argument-list> can be any sequence of words, including words that look like numbers. Examples of this are
        for number in 1 2 3 4 5 6 7 8 9 10
        for name in John Jacob Judy Jocelyn
        for word in $*
    As you can see, this can be very powerful. As with the while-loop, the list of statements is any list of statements, but the intention is that the variable plays a role in this list. For example, the script
        let sum=0
        for number in 1 2 3 4 5 6 7 8 9 10
        do
            let square=$number*$number
            let sum=$sum+$number
            echo The square of $number is $square
        done
        echo The sum of the numbers is $sum.
    displays ten lines showing the squares of the first ten positive integers and then displays their sum. Notice how the sum is calculated. The second form of the for-loop does not need an argument list. It automatically assigns to the variable the successive words from the command line arguments of the script when it is run:
  • 60. 
        for word
        do
            echo $word.
        done
    It is the same as
        for word in "$@"
        do
            echo $word.
        done
  • 61. (When no argument contains spaces, writing $* in place of "$@" gives the same result.) It just prints the words found on the command line one after the other on separate lines.
    Your Tasks
    This assignment consists of two exercises in writing relatively simple shell scripts. The objectives when writing any script are
    clarity: the script should be easy to understand by someone with a basic knowledge of UNIX;
    efficiency: the script should use the least resources possible; and
    simplicity: the script should be as simple as possible.
    An example will demonstrate. Suppose we needed a script that would count the number of lines in a file named molecule containing the word 'ATOM' anywhere on the line. The following script would achieve this:
        #!/bin/bash
        grep ' ATOM ' molecule >| atomcount
        wc -l atomcount >| answer
        rm atomcount
        cat answer
        rm answer
    but it is very inefficient (it needlessly creates files and then
  • 62. removes them), it is hard to understand because the reader spends more time reading it and may not be familiar with certain operators such as >|, and it is not as simple as it could be. A simple, well-documented, and efficient solution is
        #!/bin/bash
        # Displays how many lines in file molecule contain ATOM as
        # a complete word
        # Written by Stewart Weiss
        grep -c ' ATOM ' molecule # The -c option to grep counts matching lines
  • 68. It has comments to explain what it does and it achieves it with a single command that can be looked up easily. Your job is to apply these ideas as you create solutions to the following exercises.
    1. The last command lists information about who has logged into the computer on which it is run. In particular, it has a column with the username, the terminal on which the user was connected, the internet address (the IP address) from which they connected to the computer, and the date and time that they logged in and then logged out if they did. If they logged out, it also displays the total time they were logged in. For example, this is an entry for Dr. Weiss on cslab12:
        sweiss pts/11 146.95.214.131 Thu Sep 14 13:05 - 14:27 (01:22)
    If the username is too long it is truncated, but there are options to display the full username. For this exercise you are to write a bash script named logincount that takes a list of usernames as its command line arguments and displays on the screen, for each username, a message of the form
  • 69. Number of times that <username> logged into this machine is <N>
    where <N> is to be replaced by the number of records in the output of the last command that match <username> exactly. For example, if I enter the command
        logincount sweiss
    it should output something like
        Number of times that sweiss logged into this machine is 7
    If a name given as an argument is not a username, nothing is printed for that name. On the other hand, if no names are given, it is an error and the command should display the error message, “Usage: logincount <list of usernames>”.
    2. A DNA string is a sequence of the letters a, c, g, and t in any order. For example, aacgtttgtaaccag is a DNA string of length 15. Each sequence of three consecutive
  • 75. letters is called a codon. For example, in the preceding string, the codons are aac, gtt, tgt, aac, and cag. If we ignored the first letter and started listing the codons starting at the second a, the codons would be acg, ttt, gta, and acc, and we would ignore the last ag. The letters are called bases.
    A DNA string can be hundreds of thousands of codons long, even millions of codons long, which means that it is infeasible to count them by hand. It would be useful to have a simple script that could count the number of occurrences of a specific codon in such a string. For instance, for the example string above such a script would tell us that aac occurs twice and tgt occurs once. Generally, we want to be able to find occurrences of arbitrary sequences of bases in a
  • 76. given DNA string, such as how many times ttatg occurs, or how many times cgacgattag occurs.
    Your job is to write a script named countmatches that expects at least two arguments on the command line. The first argument is the pathname of a file containing a valid DNA string with no newline characters or white space characters of any kind within it. (It will be terminated with a newline character.) This file contains nothing but a sequence of the letters a, c, g, and t. DNA text files that you can give to your script as the file argument are located at
        /data/biocs/b/student.accounts/cs132/data/dna_textfiles
    The remaining arguments are strings containing only the bases a, c, g, and t in any order. For each valid argument string, the script will search the DNA string in the file and count how many non-overlapping occurrences of that argument string are in the DNA string. To make sure you understand what non-overlapping means, the string ata occurs just once in the string atata, not twice, because the two occurrences overlap.
    If your script is called correctly, it will output for each argument a line containing the argument string followed by how many times it occurs in the string. If it finds no occurrences, it should output 0 as a count. For example, if the string aaccgtttgtaaccggaac is in a file named dnafile, then
  • 77. your script should work like this:
        $ countmatches dnafile ttt
        ttt 1
        $ countmatches dnafile aac ggg aaccg
        aac 3
        ggg 0
        aaccg 2
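The counting behavior shown in this session can be sketched with grep and wc. This is a hedged illustration rather than the expected solution: the function name count_matches is hypothetical, the argument checks are omitted, and it relies on the -o option of grep, which is available in GNU and BSD grep but is not part of POSIX. With -o, grep prints each match on its own line, and because it resumes searching after the end of the previous match, the matches it reports do not overlap.

```bash
#!/bin/bash
# Sketch: for each pattern, print the pattern and the number of
# non-overlapping occurrences in the DNA file.
# count_matches is a hypothetical name used only for this illustration.
count_matches() {
    local dnafile=$1 pattern count
    shift                                   # the remaining arguments are patterns
    for pattern in "$@"
    do
        # grep -o prints one line per match; wc -l counts those lines.
        # tr removes the padding that some versions of wc add.
        count=$(grep -o -- "$pattern" "$dnafile" | wc -l | tr -d ' ')
        echo "$pattern $count"
    done
}

# Demonstration with the example string from the assignment text
demo=$(mktemp)
printf 'aaccgtttgtaaccggaac\n' > "$demo"
count_matches "$demo" aac ggg aaccg
rm -f "$demo"
```

Note that grep -c would not work here, since it counts matching lines rather than matches, and the whole DNA string is a single line.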
  • 83. Warning: if it is given valid arguments, the script is not to output anything except the strings and their associated counts. No fancy messages, no words! The script should check that the first argument is a file name and that there is at least one other argument after it. If the first argument is not a file name, or if nothing follows the file name, the script should print a how-to-use-me message and then exit. It is not required to check that the file is in the proper form, or that the string contains nothing but the letters a, c, g, and t.
    Hint: You can solve this problem using grep and one other command that appears in this document. Although there are other filters, you do not need them to solve this problem. You have to read more about grep to know how to use it. The other command has appeared in the slides already.
    Grading Rubric
    This homework is graded on a 100-point scale. Each script is worth 50 points. Each script will be graded foremost on its correctness, meaning that it does exactly what the assignment states it must do, in detail. Correctness is worth 70% of the grade. It is then graded on its clarity, simplicity, and efficiency, as described above. These qualitative measures are worth 30% of the grade.
  • 84. Submitting the Homework
    Due Date: This assignment is due by the end of the day (i.e., 11:59 PM EST) on Wednesday, October 31st. I will let the class know when this assignment is to be submitted to Blackboard as an assignment submission. If you complete the assignment before I announce that the Blackboard assignment submission is open, you may send your assignment to my email, only as a zip archive.
    Submission details
    Submit either a PDF showing the commands you entered, with screenshots of all output, or a zip file. For remote logins: ssh to eniac.cs.hunter.cuny.edu with your valid username and password, and then ssh into any cslab host.
    1. In your own home directory, create a directory named assignment3_username where username is your Linux Lab account username.
    2. Put copies of the two scripts that you have written into this directory. Make sure they
  • 90. are named logincount and countmatches.
    3. Run the commands:
        $ zip -r assignment3_username.zip assignment3_username/
        $ chmod 755 assignment3_username.zip
    This will create the file assignment3_username.zip.
    For Linux Lab users: once you have made the zip file, navigate to its location in the file system and upload it to Blackboard. For anyone working on the assignment remotely, use the scp command to securely copy it to your local computer, and then upload the file to
  • 91. Blackboard.
        $ scp <Your.Username@eniac.cs.cuny.edu>:<path_of_zip_file> <desired_path>
    There is no whitespace on either side of the colon. Your login, Your.Username@eniac.cs.cuny.edu, is named before the colon. The <path_of_zip_file> is the absolute path on the remote machine, named after the colon. Then type a space and specify the <desired_path> on your local file system where you would like to put your zip file. If you run the command properly, it should bring up a password prompt from eniac.cs.hunter.cuny.edu. The zip file will be placed in your specified location. Now you are ready to upload your zip file to Blackboard.