Day 3 Materials
Downloads
- Download Slides
- Download Exercise files
References
- Interpretting Error Messages
- Basic File Input/Output
-
Note that this reference uses
Python2
, and hence theprint()
function does not use parenthesis (instead theprint
statement is used)
-
Day 3 solutions can be found here
Part I: Functions
-
Write a function to compute the
GC
content of a DNA sequence. The function should accept a single argument, the DNA sequence, and return the GC percentage. Test your function with the nucleotide sequence"AGCTATAGCATAGC"
. -
Write a function that calculates the percentage of a given nucleotide from a DNA sequence. The function should accept two arguments: the nucleotide of interest and the DNA sequence. It should return the nucleotide percentage. Test your function with the nucleotide sequence
"AGCTATAGCATAGC"
. -
Write a function that calculates the percentage each nucleotide in a given DNA sequence. of a given nucleotide from a DNA sequence. The function should accept a single argument, the DNA sequence, and return a dictionary containing
key:value
pairs ofnucleotide:percentage
. You can assume that the provided sequence contains only A, C, G, T. Test your function with the nucleotide sequence"AGCTATAGCATAGC"
. - Write a function to guess whether a provided sequence is DNA or protein. For this task, assume that any sequence comprised of % A, C, G, T is a DNA sequence. Test your function with the following two sequences:
"AGCTATGCATACGAGCATAGC"
"AGIILLCPKLKKQWTATWCAGCATADSARCVLMKGC"
- Modify the previous function to ignore all ambiguities in calculations. Use this list of ambiguous characters for this task:
ambig = ["B", "J", "N", "O", "X", "Z"]
. Test your function with the following sequence:"APAPPPKKLRATNNYPOPPBXXXXXNTYGCTATLMQASDFTDTCATAGC"
Part II: File Input/Output
Files used in these exercises can be downloaded from the course website. Be sure to write your scripts in the same directory as these files!
-
Open the file
file1.txt
in read-mode, and print its contents to screen. Use the.read()
method, which saves the contents of the file to a single string. Perform this task twice: once usingopen
andclose
, and once usingwith
control-flow. -
Open the file
file1.txt
in read-mode, and save all lines in this file to a list using the.readlines()
method. Write a new file calledupper_file1.txt
which contains the same contents offile1.txt
but in upper-case. Try to do this task using a single for-loop. -
Open the newly created file
upper_file1.txt
in read-mode. Loop over the file lines without using.read()
or.readlines()
, and print out lines as you loop. -
Modify the previous for-loop to only print out lines in
upper_file1.txt
which contain at least (i.e. ) 5 letterE
’s. -
You should notice 20 files named
file1.txt, file2.txt, ..., file20.txt
. Write a for-loop to open each of these files (Hint: use therange()
function to loop over file names). For each file, print each line that contains more than 25 characters. -
Write another for-loop over the same 20 files. For each file, create a second file named
fileX_odd.txt
(where X=1-20) which contains only the odd-numbered lines from the original file. For this, use a counter in the for-loop that goes over file lines (this will count the line numbers), but be careful: Remember that python indexing starts from 0, but the first line is technically line #1! -
Convert our zoo-keeper dictionaries into a comma-separated file with the header
animal,vore,food
, and rows should contain corresponding information, i.e.lion,carnivore,meat
. Perform this task with a single for-loop.category = {"lion": "carnivore", "gazelle": "herbivore", "anteater": "insectivore", "alligator": "homovore", "hedgehog": "insectivore", "cow": "herbivore", "tiger": "carnivore", "orangutan": "frugivore"} feed = {"carnivore": "meat", "herbivore": "grass", "frugivore": "mangos", "homovore": "visitors", "insectivore": "termites"}
-
Create a second zoo-keeper file by converting the CSV into a tab-separated file. Perform this task by reading in the CSV, replacing commas with tabs, where tabs can be created as the string
"\t"
. For example, the following snippet will replace all commas with tabs in a string calledmystring
:mystring2 = mystring.replace(",", "\t")