Functions and modules
       Karin Lagesen

  karin.lagesen@bio.uio.no
Homework:
            TranslateProtein.py
●   Input files are in
    /projects/temporary/cees-python-course/Karin
      ●   translationtable.txt - tab separated
      ●   dna31.fsa
●   Script should:
      ●   Open the translationtable.txt file and read it into a
          dictionary
      ●   Open the dna31.fsa file and read the contents.
      ●   Translate the DNA into protein using the dictionary
      ●   Print the translation in fasta format to the file
          TranslateProtein.fsa. Each protein line should be 60
          characters long.
Modularization
●   Programs can get big
●   Risk of doing the same thing many times
●   Functions and modules encourage
     ●   re-usability
     ●   readability
     ●   maintainability
Functions
●   Most common way to modularize a
    program
●   Takes values as parameters, executes
    code on them, returns results
●   Python also has built-in functions:
     ●   open(filename, mode)
     ●   sum([list of numbers])
●   These do something with their parameters,
    and return the results
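A minimal sketch of calling built-in functions: values go in as parameters and a result comes back. The list `numbers` is made up for illustration, and `len` is a second built-in that the slide does not list. The parenthesised `print(...)` form is used so the snippet also runs under Python 3.

```python
# Built-in functions take parameters and return a result.
numbers = [1, 2, 3, 4]

total = sum(numbers)   # sum() adds up the numbers in the list
count = len(numbers)   # len() returns how many items the list holds

print(total)
print(count)
```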
Functions – how to define
    def FunctionName(param1, param2, ...):

          """ Optional Function desc (Docstring) """

          FUNCTION CODE ...

          return DATA



●   keyword: def – says this is a function
●   functions need names
●   parameters are optional, but common
●   docstring useful, but not mandatory
●   FUNCTION CODE does something
●   keyword: return – returns the results
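The template can be filled in as follows. This is only an illustration: the function name `gc_content` and its behaviour are hypothetical and not part of the course material.

```python
def gc_content(dna):
    """Return the fraction of G and C bases in a DNA string."""
    dna = dna.upper()
    if len(dna) == 0:
        return 0.0   # avoid dividing by zero for an empty string
    gc_count = dna.count("G") + dna.count("C")
    return float(gc_count) / len(dna)

print(gc_content("ATGC"))
print(gc_content.__doc__)   # the docstring is stored on the function
```

The docstring is what `help(gc_content)` displays, which is why it is useful even though it is optional.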
Function example
     >>> def hello(name):
     ...     results = "Hello World to " + name + "!"
     ...     return results
     ...
     >>> hello()
     Traceback (most recent call last):
       File "<stdin>", line 1, in <module>
     TypeError: hello() takes exactly 1 argument (0 given)
     >>> hello("Lex")
     'Hello World to Lex!'
     >>>



●   Task: make script from this – take name
    from command line
●   Print results to screen
Function example script
import sys

def hello(name):
  results = "Hello World to " + name + "!"
  return results

name = sys.argv[1]
functionresult = hello(name)
print functionresult




[karinlag@freebee]% python hello.py
Traceback (most recent call last):
  File "hello.py", line 8, in ?
   name = sys.argv[1]
IndexError: list index out of range
[karinlag@freebee]% python hello.py Lex
Hello World to Lex!
[karinlag@freebee]%
Returning values
●   Returning is not mandatory: without a return
    statement, None is returned by default
●   Can return more than one value - the results
    come back as a tuple

               >>> def test(x, y):
               ...     a = x*y
               ...     return x, a
               ...
               >>> test(1,2)
               (1, 2)
               >>>
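A short sketch of both points, using made-up function names (`no_return`, `min_and_max`): a function without a return statement gives back None, and a returned tuple can be unpacked into separate variables.

```python
def no_return(x):
    doubled = x * 2   # computed, but never returned

def min_and_max(values):
    return min(values), max(values)   # two values, packed into one tuple

nothing = no_return(3)
both = min_and_max([4, 1, 7])
lowest, highest = min_and_max([4, 1, 7])   # tuple unpacking

print(nothing)
print(both)
print(lowest)
print(highest)
```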
Function scope
●   Variables defined inside a function can
    only be seen there!
●   To use the value of a variable defined
    inside a function elsewhere: return it
Scope example
>>> def test(x):
...     z = 10
...     print "the value of z is " + str(z)
...     return x*2
...
>>> z = 50
>>> test(3)
the value of z is 10
6
>>> z
50
>>> x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
>>>
Parameters
●   Functions can take parameters – not
    mandatory
●   Arguments are matched to parameters in the
    order in which they are given
    >>> def test(x, y):
    ...     print x*2
    ...     print y + str(x)
    ...
    >>> test(2, "y")
    4
    y2
    >>> test("y", 2)
    yy
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in test
    TypeError: unsupported operand type(s) for +: 'int' and 'str'
    >>>
Named parameters
●   Can use named parameters
    >>> def test(x, y):
    ...     print x*2
    ...     print y + str(x)
    ...
    >>> test(2, "y")
    4
    y2
    >>> test("y", 2)
    yy
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in test
    TypeError: unsupported operand type(s) for +: 'int' and 'str'
    >>> test(y="y", x=2)
    4
    y2
    >>>
Default parameters
●   Parameters can be given a default value
●   With default, parameter does not have to
    be specified, default will be used
●   Can still name the parameter in the call
>>> def hello(name = "Everybody"):
...     results = "Hello World to " + name + "!"
...     return results
...
>>> hello("Anna")
'Hello World to Anna!'
>>> hello()
'Hello World to Everybody!'
>>> hello(name = "Annette")
'Hello World to Annette!'
>>>
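Defaults are most useful when one value is the usual choice. The sketch below assumes a hypothetical helper `wrap` whose `width` defaults to 60, matching the 60-character line length asked for in the homework.

```python
def wrap(sequence, width=60):
    """Split sequence into pieces of at most width characters."""
    pieces = []
    for i in range(0, len(sequence), width):
        pieces.append(sequence[i:i + width])
    return pieces

default_pieces = wrap("A" * 130)          # width not given: 60 is used
short_pieces = wrap("ABCDEFG", 3)         # width given by position
named_pieces = wrap("ABCDEFG", width=4)   # width given by name

print(len(default_pieces))
print(short_pieces)
print(named_pieces)
```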
Exercise
TranslateProteinFunctions.py
●   Use script from homework
●   Create the following functions:
     ●   get_translation_table(filename)
           –   return dict with codons and protein codes
     ●   read_dna_string(filename)
           –   return tuple with (descr, DNA_string)
     ●   translate_protein(dictionary, DNA_string)
           –   return the protein version of the DNA string
     ●   pretty_print(descr, protein_string, outname)
           –   write result to outname in fasta format
TranslateProteinFunctions.py
import sys



YOUR CODE GOES HERE!!!!

translationtable = sys.argv[1]
fastafile = sys.argv[2]
outfile = sys.argv[3]

translation_dict = get_translation_table(translationtable)
description, DNA_string = read_dna_string(fastafile)
protein_string = translate_protein(translation_dict, DNA_string)
pretty_print(description, protein_string, outfile)
get_translation_table
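def get_translation_table(translationtable):
  # open the file named by the parameter, not a hard-coded name
  fh = open(translationtable, 'r')
  trans_dict = {}
  for line in fh:
     # each line holds a codon and its amino acid, tab separated
     codon = line.split()[0]
     aa = line.split()[1]
     trans_dict[codon] = aa
  fh.close()
  return trans_dict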
read_dna_string

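def read_dna_string(fastafile):
  fh = open(fastafile, "r")
  # first line is the header: drop the leading ">" and the newline
  line = fh.readline()
  header_line = line[1:].rstrip("\n")

  # remaining lines are sequence: join them without line breaks
  seq = ""
  for line in fh:
     seq += line.rstrip("\n")
  fh.close()
  return (header_line, seq)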
translate_protein

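def translate_protein(translation_dict, DNA_string):
  aa_seq = ""

  # step through the DNA three bases at a time; stopping at
  # len - 2 includes the last complete codon and skips leftovers
  for i in range(0, len(DNA_string)-2, 3):
     codon = DNA_string[i:i+3]
     one_letter = translation_dict[codon]
     aa_seq += one_letter
  return aa_seq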
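A small check of codon-by-codon translation on made-up input. The three-entry table `tiny_table` and the function name `translate` are hypothetical stand-ins, since the real table comes from translationtable.txt. The loop bound `len(dna) - 2` is chosen so that every complete codon is visited and any incomplete trailing bases are ignored.

```python
# Hypothetical three-codon table, for illustration only.
tiny_table = {"ATG": "M", "GCC": "A", "TAA": "*"}

def translate(table, dna):
    protein = ""
    # i takes the start position of every complete codon
    for i in range(0, len(dna) - 2, 3):
        protein += table[dna[i:i + 3]]
    return protein

print(translate(tiny_table, "ATGGCCTAA"))   # three complete codons
print(translate(tiny_table, "ATGGCCTA"))    # last two bases are left over
```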
pretty_print

def pretty_print(description, protein_string, outfile):
  fh = open(outfile, "w")
  # fasta header line starts with ">"
  fh.write(">" + description + "\n")

  # write the protein in lines of 60 characters
  for i in range(0, len(protein_string), 60):
     fh.write(protein_string[i:i+60] + "\n")
  fh.close()
Modules
●   A module is a file with functions, constants
    and other code in it
●   Module name = filename without .py
●   Can be used inside another program
●   Needs to be import-ed into program
●   Lots of modules come with Python: sys, os, os.path ...
●   Can also create your own
Using module
●   One of two import statements:
         1: import modulename
         2: from module import function/constant
●   If method 1:
     ●   modulename.function(arguments)
●   If method 2:
     ●   function(arguments) – module name not
         needed
     ●   beware of function name collision
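Both import styles side by side, as a sketch. The path `data/dna31.fsa` is made up and `os.path.basename` is picked only as a convenient function to call. The last part shows the collision the slide warns about: a later definition with the same name silently replaces the imported one.

```python
import os.path                 # method 1: refer to the function via the module
from os.path import basename   # method 2: import the function by name

name_1 = os.path.basename("data/dna31.fsa")
name_2 = basename("data/dna31.fsa")

# Name collision: defining our own basename hides the imported one.
def basename(path):
    return "my own basename"

name_3 = basename("data/dna31.fsa")      # calls our function
name_4 = os.path.basename("data/dna31.fsa")   # method 1 is unaffected

print(name_1)
print(name_2)
print(name_3)
print(name_4)
```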
Operating system modules – os and os.path
●   Modules dealing with files and operating
    system interaction
●   Commonly used functions:
     ●   os.getcwd() – get working directory
     ●   os.chdir(path) – change working directory
     ●   os.listdir(path) – get a list of the names of all
         entries (files and directories) in path
     ●   os.mkdir(path) – create directory
     ●   os.path.join(dirname, dirname/filename...) – join
         path components into one path
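The functions listed above can be tried out safely in a scratch directory. In this sketch `tempfile.mkdtemp()` (not mentioned in the slides) creates that directory, and the names `fasta` and `dna31.fsa` are placeholders.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()              # scratch directory for the demo
subdir = os.path.join(workdir, "fasta")   # build the path portably
os.mkdir(subdir)                          # create the directory

start = os.getcwd()                       # remember where we started
os.chdir(subdir)                          # move into the new directory
here = os.getcwd()

fh = open("dna31.fsa", "w")               # created in the current directory
fh.write(">demo\nATG\n")
fh.close()

entries = os.listdir(".")                 # names only, not full paths
os.chdir(start)                           # go back to the starting directory

print(entries)
```

Note that `os.listdir` returns bare names, so `os.path.join` is needed to turn them back into paths that work from another directory.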
Your own modules
●   Three steps:
       1. Create file with functions in it. Module
       name is same as filename without .py
       2. In other script, do
         import modulename
       3. In other script, use function like this:
         modulename.functionname(args)
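The three steps in one runnable sketch. Normally the module file is written in an editor and sits next to the script; here it is written from code into a scratch directory, which is then added to `sys.path` so that Python can find it. The module name `greetings_demo` is hypothetical.

```python
import os
import sys
import tempfile

# Step 1: create a file with a function in it.
module_dir = tempfile.mkdtemp()
fh = open(os.path.join(module_dir, "greetings_demo.py"), "w")
fh.write('def hello(name):\n    return "Hello World to " + name + "!"\n')
fh.close()

# Only needed because the file is not in the script's own directory.
sys.path.insert(0, module_dir)

# Step 2: import the module (file name without .py).
import greetings_demo

# Step 3: call the function through the module name.
message = greetings_demo.hello("Lex")
print(message)
```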
Separating module use and main use
●   Files containing python code can be:
     ●   script file
     ●   module file
●   Module functions can be used in scripts
●   But: modules can also be scripts
●   Question is – how do you know if the code
    is being executed in the module script or
    an external script?
Module use / main use
●   When a script is being run, within that
    script a variable called __name__ will be
    set to the string "__main__"
●   Can test on this string to see if this script is
    being run
●   Benefit: can define functions in script that
    can be used in module mode later
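A sketch of what `__name__` holds in each situation. The helper `describe` is made up for illustration. An imported module carries its own module name in `__name__`, which is shown here with the standard `os` module.

```python
import os

def describe(name_value):
    """Say what a given value of __name__ means."""
    if name_value == "__main__":
        return "run as a script"
    return "imported as module " + name_value

print(describe(__name__))      # depends on how this file is being used
print(describe(os.__name__))   # os was imported, so its __name__ is "os"
```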
Module mode / main mode
import sys



<code as before>
# When this script is being used, this will always run,
# no matter what!
translationtable = sys.argv[1]
fastafile = sys.argv[2]
outfile = sys.argv[3]

translation_dict = get_translation_table(translationtable)
description, DNA_string = read_dna_string(fastafile)
protein_string = translate_protein(translation_dict, DNA_string)
pretty_print(description, protein_string, outfile)
Module use / main use
# this is a script
import sys
import TranslateProteinFunctions

description, DNA_string = TranslateProteinFunctions.read_dna_string(sys.argv[1])
print description


[karinlag@freebee]% python modtest.py dna31.fsa
Traceback (most recent call last):
  File "modtest.py", line 2, in ?
   import TranslateProteinFunctions
  File "TranslateProteinFunctions.py", line 44, in ?
   fastafile = sys.argv[2]
IndexError: list index out of range
[karinlag@freebee]Karin%
TranslateProteinFunctions.py with main
import sys



<code as before>
if __name__ == "__main__":
      translationtable = sys.argv[1]
      fastafile = sys.argv[2]
      outfile = sys.argv[3]

      translation_dict = get_translation_table(translationtable)
      description, DNA_string = read_dna_string(fastafile)
      protein_string = translate_protein(translation_dict, DNA_string)
      pretty_print(description, protein_string, outfile)
ConcatFasta.py
●   Create a script that has the following:
      ●   function get_fastafiles(dirname)
            –   gets all the files in the directory, checks if they are
                fasta files (end in .fsa), returns list of fasta files
            –   hint: you need os.path to create full relative file
                names
      ●   function concat_fastafiles(filelist, outfile)
            –   takes a list of fasta files, opens and reads each of
                them, writes them to outfile
      ●   if __name__ == "__main__":
            –   do what needs to be done to run script
●   Remember imports!
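One possible shape for the two functions, offered as a sketch and not as the course's reference answer. The `if __name__ == "__main__":` block that reads the directory and output name from `sys.argv` is left out; a scratch directory with made-up files stands in for it so that the functions can be tried directly. Sorting the file names is an extra assumption, made so the output order is predictable.

```python
import os
import tempfile

def get_fastafiles(dirname):
    """Return the paths of the files in dirname that end in .fsa."""
    fastafiles = []
    for entry in sorted(os.listdir(dirname)):
        path = os.path.join(dirname, entry)   # listdir gives names only
        if entry.endswith(".fsa") and os.path.isfile(path):
            fastafiles.append(path)
    return fastafiles

def concat_fastafiles(filelist, outfile):
    """Write the contents of every file in filelist to outfile, in order."""
    out = open(outfile, "w")
    for filename in filelist:
        fh = open(filename, "r")
        out.write(fh.read())
        fh.close()
    out.close()

# Small demonstration in a scratch directory with made-up files.
demo_dir = tempfile.mkdtemp()
demo_files = [("a.fsa", ">a\nATG\n"), ("b.fsa", ">b\nGCC\n"),
              ("notes.txt", "skip me\n")]
for name, text in demo_files:
    fh = open(os.path.join(demo_dir, name), "w")
    fh.write(text)
    fh.close()

found = get_fastafiles(demo_dir)
combined = os.path.join(demo_dir, "combined.out")
concat_fastafiles(found, combined)

fh = open(combined, "r")
combined_text = fh.read()
fh.close()
print(combined_text)
```

The output file is given a name that does not end in .fsa, so running the script twice on the same directory does not pick up its own output.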
