Kernel Methods and Relational Learning in
                  Bioinformatics

                                  ir. Michiel Stock
                               Dr. Willem Waegeman
                             Prof. dr. Bernard De Baets

                             Faculty of Bioscience Engineering
                                     Ghent University


                                   November 2012




                                      KERMIT



ir. Michiel Stock (KERMIT)         Kernels for Bioinformatics    November 2012   1 / 40
Outline


1    Introduction

2    Kernel methods

3    Learning relations

4    Case studies
      Enzyme function prediction
      Protein-ligand interactions
      Microbial ecology

5    Conclusions



Introduction


Introductory example

Problem statement
Predict protein-protein interactions based on high-throughput data.
        Based on a gold standard
        Typical features that can be
        used:
               Yeast two-hybrid
               Pfam profile
               Phylogenetic profile
               Localization
               PSI-BLAST
               Expression
               ...



Machine learning is widely used in bioinformatics

[Figure 1: Classification of the topics where machine learning methods are applied. Source: Larrañaga et al.]


Bioinformatics deals with complex data


Bioinformatics data is typically:
      high-dimensional (e.g., microarrays or proteomics data)
      structured (e.g., gene sequences, small molecules, interaction networks, phylogenetic trees...)
      heterogeneous (e.g., vectors, sequences and graphs to describe the same protein)
      abundant (e.g., more than 10^6 known protein sequences)
      noisy (e.g., many features are not relevant)




Kernel methods


Formal definition of a kernel

Kernels are non-linear functions defined over objects x ∈ X.
Definition
A function k : X × X → R is called a positive definite kernel if it is
symmetric, that is, k(x, x') = k(x', x) for any two objects x, x' ∈ X, and
positive semi-definite, that is,

    \[ \sum_{i=1}^{N} \sum_{j=1}^{N} c_i c_j k(x_i, x_j) \geq 0 \]

for any N > 0, any choice of N objects x_1, ..., x_N ∈ X, and any choice of
real numbers c_1, ..., c_N ∈ R.

Can be seen as generalized covariances.
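
The definition above can be checked numerically on a finite sample: the kernel (Gram) matrix must be symmetric with no negative eigenvalues. The sketch below is illustrative only; the helper names (`gram_matrix`, `is_positive_semidefinite`) and the two example functions are assumptions made here, not part of the slides. Passing the check on one sample is a necessary condition, not a proof that a function is a kernel.

```python
import numpy as np

def gram_matrix(kernel, objects):
    """Evaluate kernel(x, x') for every pair of objects."""
    n = len(objects)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(objects[i], objects[j])
    return K

def is_positive_semidefinite(K, tol=1e-9):
    """Symmetric and all eigenvalues >= 0 (up to a numerical tolerance)."""
    if not np.allclose(K, K.T):
        return False
    return bool(np.linalg.eigvalsh(K).min() >= -tol)

def rbf(x, y, gamma=0.5):
    # Gaussian (RBF) kernel, a standard positive definite kernel.
    return np.exp(-gamma * np.sum((x - y) ** 2))

def negative_squared_distance(x, y):
    # Symmetric similarity that is NOT a positive definite kernel.
    return -np.sum((x - y) ** 2)

rng = np.random.default_rng(0)
points = list(rng.normal(size=(6, 3)))  # six random 3-d objects
K_rbf = gram_matrix(rbf, points)
K_bad = gram_matrix(negative_squared_distance, points)
```

The second matrix has a zero diagonal, so its eigenvalues sum to zero and at least one of them must be negative.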




Interpretation of kernels

    Suppose an object x has an implicit feature representation φ(x) ∈ F.
    A kernel function can be seen as a dot product in this feature space:

        \[ k(x, x') = \langle \phi(x), \phi(x') \rangle \]

    [Figure: the map φ from the input space X to the feature space F, where k evaluates ⟨φ(x), φ(x')⟩.]

    Linear models in this feature space F can be made:

        \[ y(x) = w^T \phi(x) = \sum_n a_n k(x_n, x) \]
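
As a small worked example of the feature-space interpretation, the homogeneous quadratic kernel k(x, x') = (x · x')^2 on two-dimensional inputs has the explicit feature map φ(x) = (x1^2, √2 x1 x2, x2^2). This kernel and the names `poly2_kernel` and `phi` are chosen here for illustration; they do not appear in the slides.

```python
import numpy as np

def poly2_kernel(x, y):
    """Homogeneous polynomial kernel of degree 2."""
    return float(np.dot(x, y) ** 2)

def phi(x):
    """Explicit feature map of poly2_kernel for 2-d inputs."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

k_implicit = poly2_kernel(x, y)             # kernel evaluated in input space
k_explicit = float(np.dot(phi(x), phi(y)))  # dot product in feature space
```

Both routes give the same value (here 1, since x · y = 1), but the kernel never constructs φ(x) explicitly.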



Many kernel methods exist
       Examples of popular kernel methods:
              Support vector machine (SVM)
              Regularized least squares (RLS)
              Kernel principal component analysis (KPCA)
       The learning algorithm is independent of the kernel representation!

[Figure: illustrations of SVM and KPCA.]
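
To illustrate that the learning algorithm only needs the kernel matrix, here is a minimal sketch of regularized least squares in its dual form, where the coefficients a solve (K + λI) a = y. The function names, the toy data and the value of λ are assumptions for this example.

```python
import numpy as np

def rls_fit(K, y, lam=1.0):
    """Dual RLS coefficients a, solving (K + lam * I) a = y."""
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), y)

def rls_predict(K_test_train, a):
    """Predictions y(x) = sum_n a_n k(x_n, x)."""
    return K_test_train @ a

def linear_kernel(A, B):
    return A @ B.T

def rbf_kernel(A, B, gamma=1.0):
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 2))   # toy inputs
y = X[:, 0] - 2.0 * X[:, 1]    # toy targets

# The same fit/predict code is reused; only the kernel changes.
predictions = {}
for name, kernel in [("linear", linear_kernel), ("rbf", rbf_kernel)]:
    a = rls_fit(kernel(X, X), y, lam=1e-3)
    predictions[name] = rls_predict(kernel(X, X), a)
```

With the linear kernel, the dual solution reproduces ordinary ridge regression on the raw features; swapping in another kernel changes the model without touching the solver.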




Kernels for (protein) sequences

Spectrum kernel (SK)
The SK counts the k-mers m that two sequences s_i and s_j have in common:

    \[ SK_k(s_i, s_j) = \sum_{m \in \Sigma^k} N(m, s_i) \, N(m, s_j) \]

with N(m, s) the number of occurrences of k-mer m in sequence s, and Σ^k the set of all k-mers over the alphabet Σ.
       Used to predict the structure, function... of DNA, RNA or proteins.
       A discriminative alternative to hidden Markov models.
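
The spectrum kernel formula can be sketched directly by counting k-mers. The function names and the toy sequences are assumptions for this example; only k-mers that occur in the first sequence need to be visited, because all other terms of the sum are zero.

```python
from collections import Counter

def kmer_counts(sequence, k):
    """N(m, s): number of occurrences of every k-mer m in sequence s."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

def spectrum_kernel(s_i, s_j, k=3):
    """Sum over all k-mers m of N(m, s_i) * N(m, s_j)."""
    counts_i = kmer_counts(s_i, k)
    counts_j = kmer_counts(s_j, k)
    # A Counter returns 0 for k-mers that are absent from s_j.
    return sum(n * counts_j[m] for m, n in counts_i.items())

# "ABABA" contains AB twice and BA twice; so does "BABAB".
value = spectrum_kernel("ABABA", "BABAB", k=2)  # 2*2 + 2*2 = 8
```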



Kernels for graphs (1)
Graph
A graph is a set of objects, called vertices (or nodes), that
are connected through edges.

Graphs can show the structure of an object or interactions between
different objects.

                         Graphs are important in bioinformatics!


Kernels for graphs (2)

Graph kernel
Constructing a similarity between graphs.

    Based on performing a random walk on both graphs and counting the number of matching walks.
    Usually very computationally demanding!

[Figures: example graphs in chemoinformatics and in structural bioinformatics.]
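
One common way to count matching walks is the geometric random walk kernel on the direct product graph. The slides do not say which variant was used, so the sketch below is an assumption: it handles unlabeled, undirected graphs given as adjacency matrices, and the `decay` parameter is hypothetical. The product graph has |V1| · |V2| vertices, which shows why such kernels are computationally demanding.

```python
import numpy as np

def random_walk_kernel(A1, A2, decay=0.1):
    """Sum over walk lengths l of decay**l * (number of matching walks)."""
    # Walks in the direct product graph correspond to pairs of walks,
    # one in each graph, taken simultaneously.
    A_prod = np.kron(A1, A2)
    n = A_prod.shape[0]
    # Assumes symmetric adjacency matrices (undirected graphs).
    spectral_radius = np.max(np.abs(np.linalg.eigvalsh(A_prod)))
    if decay * spectral_radius >= 1:
        raise ValueError("decay too large: the walk series does not converge")
    ones = np.ones(n)
    # Closed form of the geometric series: 1^T (I - decay * A_prod)^{-1} 1
    return float(ones @ np.linalg.solve(np.eye(n) - decay * A_prod, ones))

triangle = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
k_tp = random_walk_kernel(triangle, path)
```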


Kernels for graphs (3)

Diffusion kernel
Constructing a similarity between vertices within the same graph.

    Also based on performing a
    random walk on a graph.
    Captures the long-range
    relationships between
    vertices.
    Inspired by the heat
    equation. The kernel
    quantifies how quickly ‘heat’
    can spread from one node to
    another.
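
A minimal sketch of a diffusion kernel, assuming the usual heat-equation form K = exp(-βL) with L = D - A the graph Laplacian. The slides do not give the formula, so this form, the function name and the value of β are assumptions.

```python
import numpy as np

def diffusion_kernel(A, beta=0.5):
    """K = exp(-beta * L) for the graph Laplacian L = D - A."""
    L = np.diag(A.sum(axis=1)) - A
    # L is symmetric, so the matrix exponential follows from its
    # eigendecomposition: V diag(exp(-beta * eigenvalues)) V^T.
    eigvals, eigvecs = np.linalg.eigh(L)
    return (eigvecs * np.exp(-beta * eigvals)) @ eigvecs.T

# Path graph on four vertices: 0 - 1 - 2 - 3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
K = diffusion_kernel(A)
```

Vertices 0 and 3 are not adjacent, yet K[0, 3] is positive: heat reaches vertex 3 through the intermediate vertices, which is how the kernel captures long-range relationships.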




Kernels for fingerprints


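    Objects that can be described by a long binary vector x (a fingerprint representation of the object) can be compared using the Tanimoto kernel:

        \[ K_{Tan}(x_m, x_n) = \frac{\langle x_m, x_n \rangle}{\langle x_m, x_m \rangle + \langle x_n, x_n \rangle - \langle x_m, x_n \rangle} \]

[Figure: fingerprint representation of an object.]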
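
The Tanimoto kernel translates into a few lines of code. For binary vectors it equals the number of shared bits divided by the number of bits set in either vector. Returning 0 for two all-zero fingerprints is a convention chosen for this sketch, not something the slides specify.

```python
import numpy as np

def tanimoto_kernel(x_m, x_n):
    """<x_m, x_n> / (<x_m, x_m> + <x_n, x_n> - <x_m, x_n>)."""
    x_m = np.asarray(x_m, dtype=float)
    x_n = np.asarray(x_n, dtype=float)
    inner = x_m @ x_n
    denom = x_m @ x_m + x_n @ x_n - inner
    if denom == 0:
        return 0.0  # assumption: two empty fingerprints get similarity 0
    return float(inner / denom)

a = [1, 1, 0, 1, 0]
b = [1, 0, 0, 1, 1]
similarity = tanimoto_kernel(a, b)  # 2 shared bits / 4 bits set in total = 0.5
```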




Learning relations


Kernels for pairs of objects


Problem statement
Predict the binding interaction between a given protein and a ligand
(small molecule), i.e., learn molecular docking.

        The problem deals with two types of objects:
               Proteins (graph kernel of structure, sequence kernel, fingerprints...)
               Ligands (fingerprints, graph kernel...)
        The label is for a pair of objects.


Kernels for pairs of objects

Pairwise kernel
Combine the kernel matrices of the individual objects to construct a kernel
matrix for pairs of objects.

Starting from individual kernels for the proteins and ligands:

[Figure: workflow from a data set of (protein, ligand) pairs, via the object kernels and the pairwise kernel, to a learning algorithm (SVM, RLS, ...).]

By optimizing a ranking loss, the algorithms can also be used for conditional ranking.
The framework is suited for bioinformatics challenges:
  - efficient learning process
  - can handle complex objects (graphs, trees, sequences...)
  - ability to deal with information retrieval problems
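
One standard way to build a kernel for pairs from two object kernels is the Kronecker product, where the similarity between (protein p, ligand l) and (protein p', ligand l') is K_protein[p, p'] · K_ligand[l, l']. The sketch below uses made-up kernel matrices; the numbers and the helper `pair_index` are assumptions for illustration.

```python
import numpy as np

# Toy object kernels (values are invented): 2 proteins and 3 ligands.
K_protein = np.array([[1.0, 0.2],
                      [0.2, 1.0]])
K_ligand = np.array([[1.0, 0.5, 0.1],
                     [0.5, 1.0, 0.3],
                     [0.1, 0.3, 1.0]])

# Pairwise kernel over all 2 * 3 = 6 (protein, ligand) pairs.
K_pair = np.kron(K_protein, K_ligand)

def pair_index(p, l, n_ligands):
    """Row/column of the pair (protein p, ligand l) in K_pair."""
    return p * n_ligands + l

# Similarity between (protein 0, ligand 1) and (protein 1, ligand 2):
value = K_pair[pair_index(0, 1, 3), pair_index(1, 2, 3)]  # 0.2 * 0.3
```

The Kronecker product of two positive semi-definite matrices is again positive semi-definite, so K_pair can be passed to any kernel method such as SVM or RLS.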
Conditional ranking (1)

Motivation
Suppose one is not particularly interested in the exact value of the
interaction but in the order of the proteins for a given ligand.

[Figure: database objects ranked by relevance for two different queries (Query 1 and Query 2).]
Conditional ranking (2)

       Based on a graph description, with e a pair of objects.
       Train the model:

       \[ h(e) = \langle w, \Phi(e) \rangle = \sum_{\bar{e} \in E} a_{\bar{e}} K^{\Phi}(e, \bar{e}) \]

       using the algorithm:

       \[ A(T) = \operatorname{argmin}_{h \in H} L(h, T) + \lambda \|h\|_H^2 \]

       where we use a ranking loss:

       \[ L(h, T) = \sum_{v \in V} \sum_{e, \bar{e} \in E_v} (y_e - y_{\bar{e}} - h(e) + h(\bar{e}))^2 \]

[Figure: example of a multi-graph used for conditional ranking. Figure reproduced from Pahikkala et al. (2010).]

The framework is based on the Kronecker product kernel, which has been proposed independently for modeling pairwise inputs in different application domains (Ben-Hur et al. 2005, among others).
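
The ranking loss can be evaluated with a few array operations. This sketch assumes that every pair e carries the identifier of the query vertex v it is conditioned on, and it sums over ordered pairs, so each unordered pair is counted twice. The function and argument names are hypothetical.

```python
import numpy as np

def conditional_ranking_loss(y, h, query):
    """Sum over queries v and pairs e, e' in E_v of
    (y_e - y_e' - h(e) + h(e'))^2."""
    y, h, query = (np.asarray(a) for a in (y, h, query))
    loss = 0.0
    for v in np.unique(query):
        mask = query == v
        dy = y[mask][:, None] - y[mask][None, :]  # label differences
        dh = h[mask][:, None] - h[mask][None, :]  # prediction differences
        loss += np.sum((dy - dh) ** 2)
    return float(loss)

y = np.array([3, 1, 2, 0, 5])      # true interaction values
h = np.array([2, 1, 3, 0, 4])      # model predictions
query = np.array([0, 0, 0, 1, 1])  # query that each pair belongs to
loss = conditional_ranking_loss(y, h, query)  # 12 (query 0) + 2 (query 1)
```

Only differences within a query matter: shifting all predictions of one query by a constant leaves the loss unchanged, which matches the motivation that the order, not the exact value, is of interest.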
Case studies    Enzyme function prediction


Predicting enzyme function

Problem statement
Predict the function (EC number) of an enzyme using structural
information of the active site.
      Data:
           1730 enzymes with 21 different functions
           four different structural similarities:
                     CavBase
                     maximum common subgraph
                     labeled point cloud superposition
                     fingerprints

[Figure: active site of an enzyme.]



EC numbers

EC number
A functional label of an enzyme, based on the reaction that is catalyzed.

Example: EC 2.7.6.1 = ribose-phosphate diphosphokinase






Defining catalytic similarity
Catalytic similarity
The catalytic similarity is the number of successive equal digits in the EC
number between two enzymes, starting from the first digit.
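
The definition maps directly onto a short function. EC numbers are assumed to be given as dot-separated strings such as "2.7.7.34"; the function name is chosen here for illustration, and unknown digits are not handled.

```python
def catalytic_similarity(ec_a, ec_b):
    """Number of successive equal EC digits, starting from the first."""
    similarity = 0
    for digit_a, digit_b in zip(ec_a.split("."), ec_b.split(".")):
        if digit_a != digit_b:
            break  # stop at the first mismatch
        similarity += 1
    return similarity

sim_3 = catalytic_similarity("2.7.7.34", "2.7.7.12")  # 3: first three digits match
sim_2 = catalytic_similarity("2.7.7.34", "2.7.1.12")  # 2
sim_1 = catalytic_similarity("4.2.3.90", "4.6.1.11")  # 1
sim_0 = catalytic_similarity("2.7.7.34", "4.6.1.11")  # 0
```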




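[Figure: graph of enzymes (EC 2.7.7.34, EC 4.2.3.90, EC 4.6.1.11, EC 2.7.1.12, EC 2.7.7.12 and an unlabeled enzyme EC ?.?.?.?) whose edges are weighted by catalytic similarity (values from 0 to 3).]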




Data exploration


[Figure: kernel PCA of the cb, fp, mcs and lpcs data (four panels).]
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        ●●● ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          ● ●
                                                                                                                                                                                                                                  ●                                                                                                                             ●●
                                                                                                                                                                                                                                                                                                                                                                ●●●
                                                                                                                                                                                                                                                                                                                                                               ●● ●
                                                                                                                                                                                                                                                                                                                                                               ●● ●
                                                                                                                                                                                                                                                                                                                                                              ●●●
                                                                                                                                                                                                                                                                                                                                                                   ●
                                                                                                                                                                                                                                                                                                                                                               ●●● ●
                                                                                                                                                                                                                                                                                                                                                                ●●●                   ●●●●
                                                                                                                                                                                                                                                                                                                                                                                         ●●
                                                                                                                                                                                                                                                                                                                                                                                  ● ● ●●● ●
                                                                                                                                                                                                                                                                                                                                                                                 ●●●●● ●●● ●●
                                                                                                                                                                                                                                                                                                                                                                              ● ●●●●●●●●●●● ●                                                                                                                                                           ●● ● ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           ● ●
                                                                                                                                                                                                                                  ●                                                                                                                           ● ●●
                                                                                                                                                                                                                                                                                                                                                              ●● ●
                                                                                                                                                                                                                                                                                                                                                              ●●●● ●
                                                                                                                                                                                                                                                                                                                                                               ●●              ●● ●● ●● ●● ●● ●
                                                                                                                                                                                                                                                                                                                                                                                ● ●● ●●●● ●●
                                                                                                                                                                                                                                                                                                                                                                                           ●
                                                                                                                                                                                                                                                                                                                                                                               ●● ● ●● ●● ● ● ●
                                                                                                                                                                                                                                                                                                                                                                            ●●●●●● ●● ●●●● ● ● ●
                                                                                                                                                                                                                                                                                                                                                             ●●●●● ●● ●●●●●●●●●●●●●●●●●●
                                                                                                                                                                                                                                                                                                                                                              ●●                   ●● ●● ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        ●●● ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        ●● ● ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       ● ●● ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        ●● ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          ●
                                                                                                                                                                                                                     ●●
                                                                                                                                                                                                                      ●●                                                                                                                                     ●●
                                                                                                                                                                                                                                                                                                                                                              ●●
                                                                                                                                                                                                                                                                                                                                                             ● ●●
                                                                                                                                                                                                                                                                                                                                                             ●●                ●● ●● ● ●●●● ● ●
                                                                                                                                                                                                                                                                                                                                                                                ● ● ● ●●
                                                                                                                                                                                                                                                                                                                                                                            ● ● ● ●● ● ●● ●●●●●
                                                                                                                                                                                                                                                                                                                                                                                           ●● ●●    ●                                                                                                                                                   ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        ●● ●




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  4
                                                                      ●                                                                                                                                             ●●
                                                                                                                                                                                                                     ●●
                                                                                                                                                                                                                     ●
                                                                                                                                                                                                                    ●● ●
                                                                                                                                                                                                                     ●
                                                                                                                                                                                                                    ●●                                                                                                                                       ● ●
                                                                                                                                                                                                                                                                                                                                                             ●●
                                                                                                                                                                                                                                                                                                                                                              ●
                                                                                                                                                                                                                                                                                                                                                             ●●                ● ● ●● ● ● ●
                                                                                                                                                                                                                                                                                                                                                                           ● ●● ●●●●●●●● ●● ●●
                                                                                                                                                                                                                                                                                                                                                                                      ●● ● ● ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         ●● ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       ●● ●
                                       ●●●●                          ●
                                                                     ●
                                                                    ●●
                                                                      ●
                                                                      ●                                                                                                                                           ●●●
                                                                                                                                                                                                                   ●●
                                                                                                                                                                                                                    ●●
                                                                                                                                                                                                                                                                                                                                                            ●●               ● ● ● ●● ●● ●● ● ●    ●
                                                                                                                                                                         0.6
                                       ●●●●●●
                                        ●●●● ●
                                         ●●
                                         ●●●●                        ●
                                                                     ●●
                                                                     ●●
                                                                    ●●                                                                                                                                            ●●
                                                                                                                                                                                                                ●●●●                                                                                                                                                                                ●                                                                                                                                                     ● ●
                                       ●●●●●●
                                          ● ● ●●
                                          ●● ●●●●
                                         ● ●●●●                      ●
                                                                    ●●
                                                                     ●
                                                                    ●●
                                                                     ●
                                                                     ●
                                                                     ●●
                                                                      ●
                                                                      ●                                                                                                                                        ● ●●
                                                                                                                                                                                                                ●●●
                                                                                                                                                                                                               ●●● ●●
                                                                                                                                                                                                               ● ●●●
                                                                                                                                                                                                                    ●                                                                                                                                       ●
                                                                                                                                                                                                         ●●
                                                                                                                                                                                                         ●●●                                    ●
                                                                                                                                                                                                                                               ●●
                                                                                                                                                                                                                                              ●●●          ●                                                                                                                                 ● ●●●
                                                                                                                                                                                                                                                                                                                                                                                              ●● ●
                                                                                                                                                                                                                                                                                                                                                                                            ●●●●● ● ●
                                                                                                                                                                                                                                                                                                                                                                                             ●● ●● ●
                                                                                                                                                                                                                                                                                                                                                                                        ● ●●●●●●●●
                                                                                                                                                                                                                                                                                                                                                                                            ●●●●● ●●                                                                                                                                                                  ● ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                ● ●●● ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      ● ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               ● ●●●●● ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     ●●
                                                                     ●●
                                                                      ●                                                                                                                                                                                                                                                                                                                     ●●●●●●●
                                                                                                                                                                                                                                                                                                                                                                                            ●● ● ●
                                                                                                                                                                                                                                                                                                                                                                                       ● ●●●●●●●●●
                                                                                                                                                                                                                                                                                                                                                                                             ●● ●                                                                                                                                                                   ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    ●●● ●●●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      ● ●● ●●●●●● ●●●●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     ●●




                                                                                                                                                                                                                                                                                                                                                                                                                                                            Second component
                                                                      ●                                                                                                                                                                                                                                                                                                                         ●
                                                                                                                                                                                                      ●●●●●
                                                                                                                                                                                                     ●●●●●
                                                                                                                                                                                                        ●● ●
                                                                                                                                                                                                       ●●
                                                                                                                                                                                                         ●●           ●●                      ●●      ● ●
                                                                                                                                                                                                                                                      ● ●●                                                                                                                                  ●●● ●●
                                                                                                                                                                                                                                                                                                                                                                                           ●●● ●● ●
                                                                                                                                                                                                                                                                                                                                                                                             ●● ●
                                                                                                                                                                                                                                                                                                                                                                                                ●
                                                                                                                                                                                                                                                                                                                                                                                           ●●●●● ●●                                                                                                                                                              ●●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 ●●●●● ●●●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 ●● ● ●●●●
                                                                                                                                                                                                      ●●●
                                                                                                                                                                                                      ●●●
                                                                                                                                                                                                       ●●
                                                                                                                                                                                                       ●●                                                                                                                                                                             ● ● ●●●●●●●●                                                                                                                                     ● ● ● ● ● ● ● ● ●● ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       ● ● ● ● ● ● ● ● ● ●● ● ●●●●●●●     ● ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           ●
                     0




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 ●●




                                                                                                                                                                                                                                                                                                                                 0.5
                                                                                                                                                                                                      ●●●
                                                                                                                                                                                                       ●●
                                                                                                                                                                                                       ●●            ● ●●                  ●   ●                                                                                                                             ● ●             ●●
                                                                                                                                                                                                                                                                                                                                                                                           ●●●●●●
                                                                                                                                                                                                                                                                                                                                                                                            ●●●●●                                                                                                                                                        ●                ●
                                                                                                                                                                                                      ●● ●
                                                                                                                                                                                                      ●●●
                                                                                                                                                                                                       ●●
                                                                                                                                                                                                       ●●
                                                                                                                                                                                                     ●● ●●
                                                                                                                                                                                                       ●●
                                                                                                                                                                                                    ●●● ●
                                                                                                                                                                                                                          ●
                                                                                                                                                                                                                                           ●●
                                                                                                                                                                                                                                              ●          ●
                                                                                                                                                                                                                                                                                                                                                                  ● ●
                                                                                                                                                                                                                                                                                                                                                                                 ●            ●●● ●                                                                                                                   ●●●●●●●●●● ● ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      ●●●●●●●●● ●●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       ●●●●●●●● ● ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       ●●●●●●●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       ●●●●● ●●●●●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       ●●                        ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 ●●
                                                                                                                                                                                                      ●●
                                                                                                                                                                                                    ●●● ●
                                                                                                                                                                                                    ●● ●●            ●●
                                                                                                                                                                                                                      ●
                                                                                                                                                                                                                      ●                  ●● ●     ●●                                                                                                               ●                         ●● ●
                                                                                                                                                                                                                                                                                                                                                                                        ● ●●● ●
                                                                                                                                                                                                                                                                                                                                                                                              ●                                                                                                                           ●●●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        ●●●●●●●
[Figure: scatter-plot panels with axes labelled "Second component" and "Third component"]
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          ●●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          ●●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           ●●
                                                                                                                                                                                                                                                                                                                                                       ●●●●● ●
                                                                                                                                                                                                                                                                                                                                                        ●● ●          ● ●●                                                                                                                                                                               ●●●




                                                                                                                                                                                                                                                                                                                                 0.0
                                                                                                                                                                                                  ●                                                                                                                                                      ●● ●
                                                                                                                                                                                                                                                                                                                                                         ●●            ●




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  1
                                                                                                                                                                                                  ●
                                                                                                                                                                                                  ●●● ●● ●●
                                                                                                                                                                                                ● ●●         ●                        ●●
                                                                                                                                                                                                                            ●●● ● ●●● ●
                                                                                                                                                                                                                              ●●                                                                                                                       ●●●● ●                                                                                                                                                                                             ●●
                                                                                                                                                                                                                             ●●                                                                                                                           ● ● ●● ● ●
                                                                                                                                                                                                                                                                                                                                                         ●●●●●●● ●●●●●●                                                                                                                                                                                 ●●
                                                                                                                                                                         0.0




                                                                                                                                                                                                 ● ●●
                                                                                                                                                                                                ●●●● ● ●                      ●●      ●                                                                                                                  ● ● ●●● ● ●    ● ●                                                                                                                                                                     ●        ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          ●
                                                                      ●                                                                                                                          ●● ●
                                                                                                                                                                                                    ●
                                                                                                                                                                                                                             ● ●●      ●●
                                                                                                                                                                                                                                        ●                                                                                                                ● ●● ●●● ● ● ●
                                                                                                                                                                                                                                                                                                                                                          ●●● ●●
                                                                                                                                                                                                                                                                                                                                                             ● ●●  ●                                                                                                                                                                                    ●● ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         ●
                                                                      ●                                                                                                                          ●●●
                                                                                                                                                                                                ●●●●
                                                                                                                                                                                                   ●●                          ●
                                                                                                                                                                                                                            ●●● ●●
                                                                                                                                                                                                                             ●●● ●
                                                                                                                                                                                                                            ●●● ●
                                                                                                                                                                                                                                ●      ●●
                                                                                                                                                                                                                                        ●
                                                                                                                                                                                                                                        ●
                                                                                                                                                                                                                                        ●
                                                                                                                                                                                                                                        ●●                                                                                                                  ●
                                                                                                                                                                                                                                                                                                                                                           ●● ●●●●● ● ●●
                                                                                                                                                                                                                                                                                                                                                             ● ●● ● ●
                                                                                                                                                                                                                                                                                                                                                            ● ●● ● ●                                                                                                                                                                                  ●●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        ●●
                                                                                                                                                                                              ●● ●●●●
                                                                                                                                                                                                ●●
                                                                                                                                                                                                ●●
                                                                                                                                                                                               ●● ●                       ●●●● ● ● ●●
                                                                                                                                                                                                                             ●●
                                                                                                                                                                                                                           ●●● ●
                                                                                                                                                                                                                           ●●●
                                                                                                                                                                                                                          ●● ●● ●
                                                                                                                                                                                                                                        ●
                                                                                                                                                                                                                                        ●●                                                                                                                          ●
                                                                                                                                                                                                                                                                                                                                                                  ●● ●● ●
                                                                                                                                                                                                                                                                                                                                                              ●● ● ● ●●●                                                                        2.0                                                                                            ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      ●●●                                     3
                                                                                                                                                                                                   ●                        ●                                                                                                                                              ●                                                                                                                                                                          ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       ●
                     −400




                                                                                                                                                                                               ● ●●
                                                                                                                                                                                             ●● ●●
                                                                                                                                                                                             ●●● ●●                       ●●
                                                                                                                                                                                                                          ●● ● ●
                                                                                                                                                                                                                            ●                                                                                                                                ● ● ● ● ●●
                                                                                                                                                                                                                                                                                                                                                                ●
                                                                                                                                                                                                                                                                                                                                                                           ●●
                                                                                                                                                                                                                                                                                                                                                                           ●●                                                                                                                                                                         ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       ●
                                                                                                                            1000                                                              ●●     ●                   ●
                                                                                                                                                                                                                        ●●● ●● ● ●
                                                                                                                                                                                                                            ●       ●                                                                                                                        ● ●●
                                                                                                                                                                                                                                                                                                                                                                 ● ● ●●●   ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              ●




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  −2
                                                                                                    0                                                                                                                                                                                                                                                                                                                                                                                                                                       ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             ●
                                                                                                                                                                                                                                                        −1.0                                                                                                                                                                                                                                                                                ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           ●●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             ●
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             ●                      −3
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            ●
                                                                                             −200                                                                                                                                                                                                                                                                                                                −0.5                                                                                                                      ●●                −4
                                                                                                                                                                                                                                                    −1.5                                                                                                                                                                                                                                                                                    ●
                     −1000




                                                                                                                                                                         −0.8




                                                                                                                                                                                                                                                                                                                                 −1.5




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  −3
                                                                                          −400                                                                                                                                                  −2.0                                                                                                                                                      −1.0                                                                                                                                       −5
                         −800   −600    −400   −200       0     200           400   600                                                                                      −1.0   −0.5    0.0     0.5       1.0      1.5       2.0      2.5                                                                                           −2   −1          0          1          2         3          4                                                                                                  −8   −6   −4    −2      0        2        4

                                            First component                                                                                                                                   First component                                                                                                                                           First component                                                                                                                                          First component




                             Hierarchical clustering of the cb data                                                                                                             Hierarchical clustering of the fp data                                                                                                               Hierarchical clustering of the mcs data                                                                                                                          Hierarchical clustering of the lpcs data
                   300




                                                                                                                                                                                                                                                                                                                                14




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 10
                                                                                                                                                                                                                                                                                                                                12
                   250




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 8
                                                                                                                                                                                                                                                                                                                                10
                                                                                                                                                                        10
                   200




                                                                                                                                                                                                                                                                                                                                8




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 6
                   150




                                                                                                                                                                                                                                                                                                                                6
                   100




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 4
                                                                                                                                                                        5




                                                                                                                                                                                                                                                                                                                                4
                   50




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 2
                                                                                                                                                                                                                                                                                                                                2
                   0




                                                                                                                                                                        0




                                                                                                                                                                                                                                                                                                                                0




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 0
                                                                      dist(D)                                                                                                                                               dist(D)                                                                                                                                                     dist(D)                                                                                                                                            dist(D)
                                                              hclust (*, "complete")                                                                                                                                hclust (*, "complete")                                                                                                                                      hclust (*, "complete")                                                                                                                             hclust (*, "complete")




Case studies    Enzyme function prediction


Ranking enzymes

For a query enzyme with unknown function, rank a database of annotated
enzymes based on structure. The enzymes at the top of the ranking likely
have the same function as the query.

     Unsupervised: for a given query enzyme with unknown function, rank
     the database according to structural similarity with the query.
     Supervised: first construct a ranking model h(v, v') from an
     independent training set. Then, for a given query enzyme v with
     unknown function, rank the enzymes v_i from the database
     according to h(v, v_i).
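
The two settings share the same ranking step and differ only in where the scoring function comes from. The following is a minimal Python sketch of that step, not the authors' implementation: `rank_database`, `structural_similarity`, and the toy similarity values are hypothetical names and numbers introduced purely for illustration.

```python
from typing import Callable, Dict, List, Tuple


def rank_database(
    query: str,
    database: List[str],
    score: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Sort database entries by decreasing score with respect to the query."""
    scored = [(entry, score(query, entry)) for entry in database]
    # sorted() is stable, so entries with equal scores keep their database order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical structural similarities between a query "q" and three enzymes.
TOY_SIMILARITY: Dict[Tuple[str, str], float] = {
    ("q", "enzyme_a"): 0.2,
    ("q", "enzyme_b"): 0.9,
    ("q", "enzyme_c"): 0.5,
}


def structural_similarity(v: str, v_i: str) -> float:
    """Unsupervised score: look up a precomputed similarity value."""
    return TOY_SIMILARITY[(v, v_i)]


ranking = rank_database(
    "q", ["enzyme_a", "enzyme_b", "enzyme_c"], structural_similarity
)
top_hit = ranking[0][0]  # the annotation of this entry is the predicted function
```

In the supervised setting, the same `rank_database` call would receive a learned model h in place of `structural_similarity`; how h is trained is not shown here.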





Results ranking enzymes

Table II. [T]he results obtained for unsupervised and supervised ranking.
For each combination of cavity-b[...] performance measure the performance is
averaged over the different folds and queries, with [...] between
parentheses. For every row the best ranking model is marked in bold, while
the worst mo[...] by an underscore.

                      cb                 fp                 mcs                lpcs
Unsupervised   RA     0.9062 (0.0603)    0.8815 (0.0689)    0.8923 (0.0692)    0.8877 (0.0607)
               MAP    0.9321 (0.1531)    0.7207 (0.235)     0.8846 (0.1578)    0.7339 (0.2074)
               AUC    0.9636 (0.0795)    0.8655 (0.1387)    0.9393 (0.0919)    0.8794 (0.1126)
               nDCG   0.9922 (0.0329)    0.9349 (0.1424)    0.9812 (0.0498)    0.9471 (0.1112)
Supervised     RA     0.9951 (0.017)     0.995 (0.015)      0.9944 (0.0112)    0.9952 (0.0156)
               MAP    0.9991 (0.0092)    0.9954 (0.0432)    0.9989 (0.0076)    0.9835 (0.0797)
               AUC    0.9976 (0.0005)    0.9967 (0.0184)    0.9975 (0.0024)    0.9934 (0.0368)
               nDCG   0.9968 (0.0171)    0.9942 (0.0424)    0.987 (0.0398)     0.9812 (0.0673)
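
To make the ranking measures in the table more concrete, here is a hedged Python sketch of the common textbook definitions of average precision (MAP is its mean over queries), AUC, and nDCG for a single ranking with binary relevance. The slides do not state which exact definitions were used, so these may differ in detail from the authors' evaluation; RA is left out because its definition is not given here. The function names and the toy ranking are illustrative assumptions.

```python
import math
from typing import List


def average_precision(relevant: List[int]) -> float:
    """Average precision; relevant[k] is 1 if the item at rank k+1 is relevant."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevant, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant item
    return precision_sum / hits if hits else 0.0


def auc(relevant: List[int]) -> float:
    """Fraction of (relevant, irrelevant) pairs with the relevant item ranked higher."""
    n_pos = sum(relevant)
    n_neg = len(relevant) - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")  # undefined without both classes
    correct_pairs = 0
    negatives_below = 0
    # Walk from the bottom of the ranking upwards, counting the
    # irrelevant items that lie below each relevant item.
    for rel in reversed(relevant):
        if rel:
            correct_pairs += negatives_below
        else:
            negatives_below += 1
    return correct_pairs / (n_pos * n_neg)


def ndcg(relevant: List[int]) -> float:
    """Normalized discounted cumulative gain with a log2 rank discount."""
    def dcg(labels: List[int]) -> float:
        return sum(
            rel / math.log2(rank + 1)
            for rank, rel in enumerate(labels, start=1)
        )

    ideal = dcg(sorted(relevant, reverse=True))
    return dcg(relevant) / ideal if ideal > 0 else 0.0


# Toy ranking (assumed): items at ranks 1 and 3 share the query's function.
toy_ranking = [1, 0, 1, 0]
ap_value = average_precision(toy_ranking)
auc_value = auc(toy_ranking)
ndcg_value = ndcg(toy_ranking)
```

All three measures equal 1 for a ranking that places every relevant item above every irrelevant one, which is why values close to 1 in the table indicate near-perfect rankings.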



[Figure (cropped): Kernel PCA of the cb data; Kernel PCA of the fp data; Kernel PCA of the mcs data; Kernel PCA of ...]
                                                                                                                                          1.0




    ●●
    ●●
    ●●
     ●
     ●
     ●●
      ●                                                              ●●●
                                                                     ● ●●
                                                                     ● ●
                                                                    ●●●
                                                                    ● ●●
                                                                         ●
                                                                    ● ●● ●
                                                                         ●
                                                                   ●●●● ● ●●
                                                                    ●●
                                                                                                             ●                                                          ●            ●      ●● ●
                                                                                                                                                                                            ●
                                                                                                                                                                                            ●
                                                                                                                                                                                          ●●●
                                                                                                                                                                                             ●
     ●                                                            ●●●●●
                                                                    ●●●
                                                                    ●●●
                                                                  ●●●● ●
                                                                    ●
                                                                  ●●●●
                                                                    ●●                                       ●●
                                                                                                            ●●                                             ● ●                          ● ●●
                                                                                                                                                                                            ●●●
                                                                 ●●●
                                                                 ●●●● ●
                                                                    ●
                                                                  ● ●●
                                                                ●●●●●
                                                                ●●●● ● ●
                                                                ●●● ●●
                                                                ●●●
                                                                 ●● ●                                       ●●●
                                                                                                           ●●●
                                                                                                          ●●●●
                                                                                                            ●●●                                                      ●          ● ●        ●
● ●●●                                                           ●●● ●
                                                               ●●●●
                                                                 ●●
                                                                ●●●
                                                               ●●●●
                                                                 ●●
                                                               ●●●●
                                                               ●●● ●                                   ●●●●●
                                                                                                       ●●●●●●●
                                                                                                       ●●●●●
                                                                                                          ●●
                                                                                                        ●●●
                                                                                                            ●
                                                                                                           ●●●                                                            ●                ●●
                                                                                                                                                                                            ●●
  ●●●                                                         ●●●● ●
                                                               ●● ●
                                                              ●●●● ●
                                                               ●●●●
                                                              ●●●●●                                    ●●●●●
                                                                                                         ● ●●




                                                                                                                                                                                                                                 3
  ●●●
   ●●
    ●●
   ●●                                                          ●●●●                                    ● ●● ●
                                                                                                          ●●
  ●●●
   ●●●
    ●●
    ●●
     ●●
    ●●
    ●●
   ●●
    ●●
     ●
    ●●
     ●●
     ●●                                                       ●●●●●
                                                               ●●●●
                                                              ●●●●
                                                              ●●●●            ●                       ●●●●
                                                                                                    ●●●●●●
                                                                                                       ●●●●
                                                                                                        ●●
                                                                                                      ●●● ●
                                                                                                                                                                                         ●● ●
                                                                                                                                                                                           ●
                                                                                                                                                                                         ●●● ● ●
                                              0.4




    ●●
     ●
   ●● ●
     ●
    ●●                                                       ●●●● ●
                                                               ●●●                                      ●●
                                                                                                       ●●
                                                                                                       ●●
    ●●
     ●●
    ●●
    ●●●
     ●●
   ●●●●                                                        ●● ●
                                                             ●●●●
                                                             ●●●●
                                                                ●
                                                               ●●
                                                              ●●●●
                                                              ●● ●
                                                              ●●●●
                                                              ●●●
                                                               ●●                                    ●●●
                                                                                                     ●●●●
                                                                                                      ●●●
                                                                                                     ●● ●
                                                                                                     ●●●●
                                                                                                      ●●
                                                                                                      ●                                                    ●                          ●●●●●●● ●
                                                                                                                                                                                         ● ● ●●●
                                                                                                                                                                                        ● ●●●●● ●
                                                                                                                                                                                            ●
    ●●●
     ●●
     ●●
     ●●
      ●●
       ●                                                     ●●●●
                                                              ●● ●
                                                                ●
                                                                ●●
                                                              ●●●
                                                            ●●●●
                                                            ●●●●●
                                                               ●●
                                                              ●●●
                                                               ●●
                                                              ●●●
                                                                 ●                                   ●●
                                                                                                    ●●●          ●                                                                     ●●●●●●●
                                                                                                                                                                                         ●● ●
                                                                                                                                                                                         ●● ●
                                                                                                                                                                                        ●●●●●
                                                                                                                                                                                         ●●● ● ●
                                                                                                                                                                                   ● ●●●●●●●●
                                                                                                                                                                                       ●●● ●●●
     ●●                                                                                                                                                                                ●●● ●● ●
                                                                                                                                                                                  ● ●●●●●●●●●
                                                                                                                                                                                       ●●●● ● ●




                                                                                                                                                                                                            component
      ●
      ●                                                       ●●
                                                            ●●●●●
                                                              ●● ●
                                                           ●●●●●
                                                              ●●
                                                               ●
                                                              ●●            ●●                      ●●      ● ●
                                                                                                            ● ●●                                                                        ●● ● ●
                                                                                                                                                                                           ●
                                                                                                                                                                                       ●●●●●●
                                                                                                                                                                                      ●● ●●●●
                                                                                                                                                                                        ●● ●
                                                                                                                                                                                        ●● ●
                                                                                                                                                                                           ●
                                                                                                                                                                                      ●●● ●●●●
                                                            ●●●
                                                            ●●●●                                                                                                                 ● ● ●●●●●●●●
                                                                                                                                          0.5




                                                            ●●●
                                                             ●●●
                                                              ●
                                                             ●●●
                                                             ●●●
                                                            ●●●
                                                             ●
                                                             ●             ● ●● ●                ●  ●●         ●                                                        ● ● ●            ●●●
                                                                                                                                                                                      ●● ● ●
                                                                                                                                                                                       ●● ● ●                                                             ●●
                                                                                                                                                                                                                                                          ●●
                                                                                                                                                                                                                                                          ● ●
                                                            ●●
                                                             ●
                                                            ●●
                                                             ●●
                                                          ●●● ●
                                                           ●● ●
                                                          ●●●              ●●
                                                                            ●                    ●●     ●●                                                       ● ●
                                                                                                                                                                  ●                                                                      ●●●●●●●●●● ● ●
                                                                                                                                                                                                                                         ●●●●●●●●● ●●●
                                                                                                                                                                                                                                          ●●●●●●●● ● ●
                                                                                                                                                                                                                                          ●●●●●●●●
                                                                                                                                                                                                                                          ●●●●● ●●●●●●
                                                                                                                                                                                                                                             ●●●●
                                                                                                                                                                                                                                           ●●●●●●●
                                                                                                                                                                                                                                             ● ●●
                                                                                                                                                                                                                                            ●●● ●
                                                                                                                                                                                                                                           ●●● ●●
                                                                                                                                                                                                                                           ●●●●●●
                                                                                                                                                                                                                                             ●●
                                                                                                                                                                                                                                            ●● ●●
                                                                                                                                                                                                                                            ●●
                                                                                                                                                                                                                                           ●●●●●●●
                                                                                                                                                                                                                                             ●●
                                                                                                                                                                                                                                             ●
                                                          ●● ●●             ●                  ●● ●                                                                                ● ●●● ●
                                                                                                                                                                                        ●●
                                                                                                                                                                                         ●
                                   omponent




                                                            ●
                                                         ●●●●●
                                                          ●●●              ●                                                                                 ● ●● ●                     ●




                                                                                                                                                                                                                                 2
                                                          ●●
                                                         ●●●
                                                         ●●●
                                                          ●● ●
                                                          ●               ●● ● ●             ● ●● ●                                                                                ● ●●●
                                                                                                                                                                                    ● ●●●
                                              0.2




                                                         ●● ●
                                                          ●●
                                                           ●●
                                                           ●●                                   ●                                                                  ●         ●      ●●● ●




                                                                                                                                                                                                                        ponent
                                                         ●●●●                                                                                     ●●
                                                                                                                                                   ●
                                                                                                                                                  ● ●●                              ●●
                          ponent




    ●                                                    ● ●
                                                       ●●●● ●
                                                           ●         ● ●                ●●      ●                                                  ●
                                                                                                                                                   ●
                                                                                                                                                  ● ●
                                                                                                                                                  ●
                                                        ●●●●
                                                          ● ●         ●                                                                            ●
                                                                                                                                                  ● ●●
                                                                                                                                                  ●●
                                                                                                                                                 ●●●● ● ● ●
                                                                                                                                                  ●● ●
                                                                                                                                                   ●
                                                                                                                                                ●●●●● ●●                          ●● ●●
                                                                                                                                                                                 ●● ● ●
                                                                                                                                  onent




                                                       ●●●●        ● ●              ●                                                            ●●●●● ● ●
                                                                                                                                                  ●● ●
                                                                                                                                                ●●●●●●● ● ● ●                         ●
                                                       ●● ●                                                                                     ●●●●●●●●●●●


Supervised ranking preserves hierarchies (1)






Supervised ranking preserves hierarchies (2)
[Figure: prediction plotted against cat. similarity (0, 1, 2, 4) for the cb, fp, mcs and lpcs measures. Top row: unsupervised; bottom row: supervised.]


Case studies      Protein-ligand interactions


Predicting protein-ligand interactions


Problem statement
Predict the binding interaction between a given protein and a ligand
(small molecule): learning molecular docking.


        Training uses the Karaman dataset:
               317 kinase targets
               38 kinase inhibitors
        For each combination, the dissociation constant Kd (in nM) is known.




Karaman dataset


                    © 2008 Nature Publishing Group http://www.nature.com/naturebiotechnology






Building a model


Features:
     CavBase similarity for the proteins
     Tanimoto kernel on fingerprints derived from the ligands
     Virtual docking results


Model types:
     Classification: specify a cutoff value and train with RLS.
     Conditional ranking: use one type of object as the query to rank
     objects of the other type according to binding energy.
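
A minimal sketch of two of the ingredients above, under assumptions the slides do not state: fingerprints are stored as sets of "on" bit indices, and a protein-ligand pair counts as a binder when its Kd is at most the cutoff. The names `tanimoto`, `kernel_matrix` and `binarize` are illustrative only.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of 'on' bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Convention assumed here: two empty fingerprints are identical.
        return 1.0
    return len(a & b) / union


def kernel_matrix(fingerprints):
    """Symmetric matrix of Tanimoto similarities between all ligands."""
    return [[tanimoto(x, y) for y in fingerprints] for x in fingerprints]


def binarize(kd_nM, cutoff_nM=1000.0):
    """Assumed labelling: +1 (binder) if Kd <= cutoff, otherwise -1."""
    return 1 if kd_nM <= cutoff_nM else -1


# Toy fingerprints (hypothetical bit indices, not taken from the Karaman data).
ligands = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
K = kernel_matrix(ligands)
```

The resulting matrix `K` can be handed to any kernel method, and `binarize` turns the measured Kd values into class labels for a chosen cutoff.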





Protein-ligand results: classification



              Test sampling   Cutoff [nM]   AUC
              new ligand             1000   0.621584 (0.104163)
              new ligand            10000   0.653330 (0.107727)
              new protein            1000   0.812184 (0.185627)
              new protein           10000   0.801310 (0.157205)




     The cutoff value hardly matters
     Generalizing to a new ligand is harder than to a new protein
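
The two test settings can be sketched as follows. This is one possible way to realise "new ligand" and "new protein" sampling (holding out every pair that involves one ligand or one protein at a time); the slides do not say that this exact scheme was used. The AUC is computed directly from its definition as the fraction of correctly ordered positive-negative pairs. All names are illustrative.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y != 1]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leave_one_out_splits(pairs, side):
    """Yield (held_out, train_idx, test_idx) for pairs of (protein, ligand).

    side=0 holds out one protein at a time, side=1 one ligand at a time,
    so the test object never occurs in the training pairs.
    """
    for value in sorted({pair[side] for pair in pairs}):
        test = [i for i, pair in enumerate(pairs) if pair[side] == value]
        train = [i for i, pair in enumerate(pairs) if pair[side] != value]
        yield value, train, test


# Toy data: two proteins and two ligands (hypothetical identifiers).
pairs = [("P1", "L1"), ("P1", "L2"), ("P2", "L1"), ("P2", "L2")]
ligand_splits = list(leave_one_out_splits(pairs, side=1))
```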






Protein-ligand results: ranking


Testing scheme: new query for the same database

                              Query type     Ranking error
                               Ligand        0.324000 (0.129307)
                               Protein       0.32799 (0.088344)




     The query type does not matter (much)
     Using a protein as the query is somewhat more reliable
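
A sketch of how a ranking error can be computed for a single query, assuming it is defined as the fraction of object pairs that the model orders differently from the measured values. The slide does not give the definition, so this is an assumption, and `ranking_error` is an illustrative name. Under this definition a random ranking scores about 0.5 and a perfect one scores 0.

```python
def ranking_error(true_values, predicted_scores):
    """Fraction of pairs ordered differently by the prediction (ties count 0.5).

    Pairs with identical true values carry no ordering and are skipped.
    """
    disagreements = 0.0
    comparable = 0
    n = len(true_values)
    for i in range(n):
        for j in range(i + 1, n):
            d_true = true_values[i] - true_values[j]
            if d_true == 0:
                continue
            comparable += 1
            d_pred = predicted_scores[i] - predicted_scores[j]
            if d_pred == 0:
                disagreements += 0.5
            elif (d_true > 0) != (d_pred > 0):
                disagreements += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return disagreements / comparable
```

For conditional ranking, this quantity would be computed per query (one protein or one ligand) and then averaged over the queries.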




Case studies    Microbial ecology


Predicting microbial interactions


Problem statement
How do heterotrophic bacteria influence the growth of methanotrophic
bacteria?


       Dataset:
              10 methanotrophs
              27 heterotrophs
       For each combination, a time series of the collective growth
       (optical density, OD) was measured over 14 days.




Concept


[Diagram: methane feeds the methanotrophs, which supply carbon compounds to the heterotrophs; what the heterotrophs return is an open question (vitamins? antibiotics?).]




   Features: (methanotroph features) ⊗ (heterotroph features)
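
The product symbol after "Features:" presumably denotes a Kronecker (tensor) product of the feature vectors of the two bacteria; that reading is an assumption. A small sketch of what it would mean: the joint features of a pair are all products of one methanotroph feature with one heterotroph feature, and the inner product of two such joint vectors equals the product of the two separate inner products. Function names and numbers are illustrative.

```python
def kronecker(u, v):
    """Kronecker product of two feature vectors, as one flat list."""
    return [a * b for a in u for b in v]


def dot(x, y):
    """Plain inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(x, y))


def pairwise_kernel(m1, h1, m2, h2):
    """Kernel between pairs (m1, h1) and (m2, h2) without the explicit product."""
    return dot(m1, m2) * dot(h1, h2)


# Toy feature vectors (hypothetical values).
m_a, h_a = [1, 2], [3, 4, 5]
m_b, h_b = [0, 1], [1, 0, 2]

explicit = dot(kronecker(m_a, h_a), kronecker(m_b, h_b))
implicit = pairwise_kernel(m_a, h_a, m_b, h_b)
```

Both routes give the same value, so the joint feature vector never has to be built explicitly.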



Experimental setup






Optical density time series

[Figure: optical density (OD) against time (days, 0 to 15) for two example combinations, "Meth_5 and Hetero_2" and "Meth_7 and Hetero_10". Each curve is annotated with its max OD and its max increase in OD.]




Three types of labels were derived from these plots:
    maximal optical density
    maximal increase in optical density
    time of maximal increase in optical density
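
A sketch of how the three labels could be read off one time series. It assumes that "increase" means the difference between consecutive measurements and that its time is reported as the end of that interval; the slides do not specify either choice. The function name and the example values are hypothetical.

```python
def growth_labels(times, od):
    """Return (max OD, max increase in OD, time of max increase)."""
    if len(times) != len(od) or len(od) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    max_od = max(od)
    # Increase between each pair of consecutive measurements.
    increases = [od[k + 1] - od[k] for k in range(len(od) - 1)]
    k_best = max(range(len(increases)), key=lambda k: increases[k])
    # Assumed convention: report the time at the end of the steepest interval.
    return max_od, increases[k_best], times[k_best + 1]


# Hypothetical measurements over 14 days (not taken from the slides).
days = [0, 2, 5, 7, 10, 14]
density = [0.0, 0.05, 0.10, 0.22, 0.25, 0.23]
labels = growth_labels(days, density)
```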


Labels for bacterial combinations
[Figure: heat map of the logarithm of the maximal density for each combination of methanotroph (M1 to M9 and NMS, columns) and heterotroph (H1 to H25 and NMS, rows), with a color key and histogram (values from about −6 to 0).]

Case studies    Microbial ecology


Regression results


Pairwise regression of the labels using support vector regression. Testing is
done by withholding each heterotroph in a leave-one-out scheme.

                 Label                     MSE/var            Spearman cor.
                 Max. OD                   0.8248             0.6875
                 Max. incr. OD             0.7888             0.57708
                 Time max. incr. OD        0.9694             0.3839

      This is a hard problem!
      Exact experimental conditions are very important!
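
The evaluation scheme above can be sketched as follows. This is a minimal illustration, not the code behind the reported numbers: it assumes that "MSE/var" means the mean squared error divided by the variance of the true labels, and it replaces the support vector regression by a trivial baseline (the mean training label per methanotroph). The strain names and label values are made up.

```python
from statistics import mean, pvariance


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(y_true, y_pred):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(y_true), average_ranks(y_pred))


def normalized_mse(y_true, y_pred):
    """Assumed meaning of 'MSE/var': MSE divided by the label variance."""
    mse = mean((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return mse / pvariance(y_true)


def leave_one_heterotroph_out(pairs):
    """Yield (train, test) splits, withholding all pairs of one heterotroph."""
    for h in sorted({p[0] for p in pairs}):
        yield ([p for p in pairs if p[0] != h],
               [p for p in pairs if p[0] == h])


# Hypothetical (heterotroph, methanotroph, label) triplets.
pairs = [("H1", "M1", 0.2), ("H1", "M2", 0.8),
         ("H2", "M1", 0.3), ("H2", "M2", 0.9),
         ("H3", "M1", 0.1), ("H3", "M2", 0.7)]

y_true, y_pred = [], []
for train, test in leave_one_heterotroph_out(pairs):
    for _, m, label in test:
        # Baseline stand-in for the SVR: mean training label of methanotroph m.
        y_pred.append(mean(l for _, m2, l in train if m2 == m))
        y_true.append(label)

print(normalized_mse(y_true, y_pred), spearman(y_true, y_pred))
```

A normalized MSE close to 1 means the model is barely better than always predicting the mean label, which is why the values in the table indicate a hard problem.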






Extra feature selection

Idea
Look for the most relevant genes for interaction in the heterotrophs using
lasso regression (in combination with the LARS algorithm) or
Regularized Random Forests.
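
A minimal sketch of how lasso regression selects features. Note the assumptions: the slides use the LARS algorithm to compute the lasso path, whereas this sketch uses plain coordinate descent for a single penalty value `lam`, without intercept or standardization. The gene presence matrix and labels are hypothetical.

```python
def soft_threshold(z, gamma):
    """Shrink z towards zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/2n) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    residual = list(y)  # y - Xw, with w = 0 initially
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # feature is constant zero, cannot be selected
            # Covariance of feature j with the residual that ignores feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j])
                      for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Hypothetical data: rows are heterotrophs, columns are gene presence flags.
# The label depends on gene 0 only.
X = [[1, 0, 1],
     [0, 1, 1],
     [1, 1, 0],
     [0, 0, 0],
     [1, 0, 0],
     [0, 1, 0]]
y = [2.0, 0.0, 2.0, 0.0, 2.0, 0.0]

w = lasso_coordinate_descent(X, y, lam=0.1)
selected_genes = [j for j, wj in enumerate(w) if wj != 0.0]
print(w, selected_genes)
```

The L1 penalty sets the coefficients of irrelevant genes exactly to zero, so the non-zero entries of `w` act as the selected features. Lowering `lam` lets more genes enter the model, which is what a coefficient path plot visualizes.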

        For example, max. OD seems to be determined by genes related to
        methenyltetrahydrofolate.
        Take with a large grain of salt!

[Figure: LASSO coefficient path computed with LARS. x-axis: |beta|/max|beta|; y-axis: standardized coefficients.]

Conclusions


Take-home messages



      Use kernels for complex structured data.
      Relations can be learned by treating a pair of objects as a special
        kind of structured object.
      Predicting a ranking is in many cases a more relevant answer to a
        research question.
      Posing the right research question is of vital importance when
        building models!
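
As an illustration of treating a pair as a structured object, the sketch below builds a kernel on pairs from two kernels on the individual objects. The Kronecker product form used here is one common choice and is an assumption of this example, as are the linear base kernels and the toy feature vectors.

```python
def linear_kernel(u, v):
    """Dot product between two feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def kronecker_pair_kernel(pair1, pair2,
                          k_first=linear_kernel, k_second=linear_kernel):
    """Kernel on pairs: k((a, b), (c, d)) = k_first(a, c) * k_second(b, d)."""
    (a, b), (c, d) = pair1, pair2
    return k_first(a, c) * k_second(b, d)


def gram_matrix(items, kernel):
    """Matrix of kernel values between all items."""
    return [[kernel(p, q) for q in items] for p in items]


# Hypothetical feature vectors for two methanotrophs and two heterotrophs.
m1, m2 = [1.0, 0.0], [1.0, 1.0]
h1, h2 = [2.0, 1.0], [0.0, 3.0]

pairs = [(m1, h1), (m1, h2), (m2, h1)]
K = gram_matrix(pairs, kronecker_pair_kernel)
for row in K:
    print(row)
```

Any kernel method (SVM, RLS, ...) can then be trained on the matrix `K` as if each pair were a single object.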






Further reading I

[1] A. Ben-Hur and W. S. Noble. Kernel methods for predicting
    protein-protein interactions. Bioinformatics, 21 Suppl 1:i38–46, June
    2005.
[2] S. Erdin, A. M. Lisewski, and O. Lichtarge. Protein function
    prediction: towards integration of similarity metrics. Current Opinion
    in Structural Biology, 21(2):180–8, Apr. 2011.
[3] L. Jacob and J.-P. Vert. Protein-ligand interaction prediction: an
    improved chemogenomics approach. Bioinformatics, 24(19):2149–56,
    Oct. 2008.
[4] T. Pahikkala, A. Airola, M. Stock, B. De Baets, and W. Waegeman.
    Efficient regularized least-squares algorithms for conditional ranking on
    relational data. Machine Learning, Submitted, 2012.
[5] B. Schölkopf, K. Tsuda, and J.-P. Vert. Kernel Methods in
    Computational Biology. 2004.


Further reading II



[6] M. Stock. Learning pairwise relations in bioinformatics: three case
    studies. Master’s thesis, Ghent University, 2012.
[7] J.-P. Vert, J. Qiu, and W. S. Noble. A new pairwise kernel for
    biological network inference with support vector machines. BMC
    Bioinformatics, 8(S-10), Jan. 2007.
[8] W. Waegeman, T. Pahikkala, A. Airola, T. Salakoski, M. Stock, and
    B. De Baets. A kernel-based framework for learning graded relations
    from data. IEEE Transactions on Fuzzy Systems, 99:1, 2012.





Bioinformatics kernels relations

  • 1. Kernel Methods and Relational Learning in Bioinformatics ir. Michiel Stock Dr. Willem Waegeman Prof. dr. Bernard De Baets Faculty of Bioscience Engineering Ghent University November 2012 KERMIT ir. Michiel Stock (KERMIT) Kernels for Bioinformatics November 2012 1 / 40
  • 2. Outline 1 Introduction 2 Kernel methods 3 Learning relations 4 Case studies Enzyme function prediction Protein-ligand interactions Microbial ecology 5 Conclusions ir. Michiel Stock (KERMIT) Kernels for Bioinformatics November 2012 2 / 40
  • 3. Introduction Introductory example Problem statement Predict protein-protein interactions based on high-throughput data. Based on a gold standard Typical features that can be used: Yeast two-hybrid Pfam profile Phylogenetic profile Localization PSI-BLAST Expression ... ir. Michiel Stock (KERMIT) Kernels for Bioinformatics November 2012 3 / 40
  • 4. Introduction Machine learning is widelyagaused in bioinformatics 88 Larran‹ et al. Downloaded from bib.oxfordjournals.org at Biomedische Bibliotheek o Figure 1: Classification of the topics where machine learning methods are applied. ir. Michiel Stock (KERMIT) Kernels for Bioinformatics November 2012 4 / 40
  • 5. Introduction Bioinformatics deals with complex data Bioinformatics data is typically: in large dimension (e.g., microarrays or proteomics data) structured (e.g., gene sequences, small molecules, interaction networks, phylogenetic trees...) heterogeneous (e.g., vectors, sequences, graphs to describe the same protein) in large quantities (e.g., more than 106 known protein sequences) noisy (e.g., many features are not relevant) ir. Michiel Stock (KERMIT) Kernels for Bioinformatics November 2012 5 / 40
  • 6. Kernel methods: Formal definition of a kernel. Kernels are non-linear functions defined over objects $x \in \mathcal{X}$. Definition: a function $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is called a positive definite kernel if it is symmetric, that is, $k(x, x') = k(x', x)$ for any two objects $x, x' \in \mathcal{X}$, and positive semi-definite, that is, $\sum_{i=1}^{N} \sum_{j=1}^{N} c_i c_j k(x_i, x_j) \geq 0$ for any $N > 0$, any choice of $N$ objects $x_1, \ldots, x_N \in \mathcal{X}$, and any choice of real numbers $c_1, \ldots, c_N \in \mathbb{R}$. Kernels can be seen as generalized covariances.
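The positive semi-definiteness condition can be checked numerically on a small example. The sketch below is only an illustration: the linear kernel, the random objects and the helper names are assumptions, not part of the slides, and a finite number of random coefficient vectors gives a sanity check rather than a proof.

```python
import random

def linear_kernel(x, y):
    """Dot product, a standard example of a positive definite kernel."""
    return sum(a * b for a, b in zip(x, y))

def quadratic_form(kernel, objects, coeffs):
    """Evaluate sum_i sum_j c_i c_j k(x_i, x_j), as in the definition."""
    return sum(
        ci * cj * kernel(xi, xj)
        for ci, xi in zip(coeffs, objects)
        for cj, xj in zip(coeffs, objects)
    )

rng = random.Random(0)
# Five hypothetical objects, each described by three real features.
objects = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(5)]

# Try many random coefficient vectors c_1, ..., c_N.
forms = [
    quadratic_form(linear_kernel, objects, [rng.gauss(0, 1) for _ in objects])
    for _ in range(100)
]
min_form = min(forms)  # should never be (meaningfully) negative
```

For the linear kernel the quadratic form equals the squared norm of the weighted sum of the objects, which is why it cannot be negative.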
  • 7. Kernel methods: Interpretation of kernels. Suppose an object $x$ has an implicit feature representation $\phi(x) \in \mathcal{F}$. A kernel function can be seen as a dot product in this feature space: $k(x, x') = \langle \phi(x), \phi(x') \rangle$. [Figure: the map from $\mathcal{X}$ to $\mathcal{F}$.] Linear models in this feature space $\mathcal{F}$ can be made: $y(x) = w^{T} \phi(x) = \sum_{n} a_n k(x_n, x)$.
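Both statements on this slide can be verified on a kernel whose feature map is small enough to write out. The sketch below assumes the homogeneous quadratic kernel on two-dimensional inputs, which the slides do not mention; the training points and dual coefficients are made-up values.

```python
import math

def phi(x):
    """Explicit feature map of the quadratic kernel for 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def quadratic_kernel(x, y):
    """k(x, y) = (x . y)^2, computed without building phi."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# 1) The kernel equals a dot product in feature space.
x, y = (1.0, 2.0), (3.0, -1.0)
via_kernel = quadratic_kernel(x, y)
via_features = dot(phi(x), phi(y))

# 2) A linear model in feature space has an equivalent kernel expansion.
train = [(1.0, 2.0), (0.5, -1.0), (-2.0, 0.0)]   # hypothetical x_n
a = [0.3, -1.2, 0.7]                             # hypothetical a_n
# w = sum_n a_n phi(x_n), built feature by feature.
w = [sum(an * f for an, f in zip(a, column))
     for column in zip(*(phi(xn) for xn in train))]
primal = dot(w, phi(x))                                            # w^T phi(x)
dual = sum(an * quadratic_kernel(xn, x) for an, xn in zip(a, train))
```

The dual form never touches `phi`, which is what makes kernels usable when the feature space is very large or only implicit.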
  • 8. Kernel methods: Many kernel methods exist. Examples of popular kernel methods: support vector machine (SVM), regularized least squares (RLS), kernel principal component analysis (KPCA). The learning algorithm is independent of the kernel representation!
  • 9. Kernel methods: Kernels for (protein) sequences. Spectrum kernel (SK): the SK considers the number of k-mers $m$ that two sequences $s_i$ and $s_j$ have in common: $SK_k(s_i, s_j) = \sum_{m \in \Sigma^k} N(m, s_i) \, N(m, s_j)$, with $N(m, s)$ the number of occurrences of k-mer $m$ in sequence $s$. Used to predict structure, function... of DNA, RNA or proteins. A discriminative alternative to hidden Markov models.
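The spectrum kernel formula translates almost directly into code. This is a minimal sketch, assuming sequences are plain strings and that overlapping k-mers are counted; the function names and toy sequences are illustrative only.

```python
from collections import Counter

def kmer_counts(seq, k):
    """N(m, s): number of occurrences of every k-mer m in sequence s."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(s_i, s_j, k):
    """Sum over shared k-mers of N(m, s_i) * N(m, s_j)."""
    counts_i = kmer_counts(s_i, k)
    counts_j = kmer_counts(s_j, k)
    # k-mers missing from s_j contribute zero (Counter returns 0).
    return sum(n * counts_j[m] for m, n in counts_i.items())

# Toy example: "ACAC" holds AC twice and CA once, "CACA" the reverse.
value = spectrum_kernel("ACAC", "CACA", 2)
```

Only k-mers that actually occur are visited, so the sum over all of $\Sigma^k$ never has to be enumerated.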
  • 10. Kernel methods: Kernels for graphs (1). Graph: a graph is a set of objects, called vertices (or nodes), that are connected through edges. Graphs can show the structure of an object or interactions between different objects. Graphs are important in bioinformatics!
  • 11. Kernel methods: Kernels for graphs (2). Graph kernel: constructing a similarity between graphs, with examples in chemoinformatics and in structural bioinformatics. Based on performing a random walk on both graphs and counting the number of matching walks. Usually very computationally demanding!
  • 12. Kernel methods: Kernels for graphs (3). Diffusion kernel: constructing a similarity between vertices within the same graph. Also based on performing a random walk on a graph. Captures the long-range relationships between vertices. Inspired by the heat equation: the kernel quantifies how quickly 'heat' can spread from one node to another.
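The slide gives no formula for the diffusion kernel. A common form, assumed here, is the matrix exponential $K = \exp(-\beta L)$ of the graph Laplacian $L = D - A$. The sketch below evaluates it with a truncated Taylor series in plain Python, which is adequate for a toy graph but not for large ones; `beta`, `terms` and the three-node path graph are illustrative choices.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diffusion_kernel(adjacency, beta, terms=40):
    """Approximate exp(-beta * L) with L = D - A by a Taylor series."""
    n = len(adjacency)
    # Graph Laplacian: vertex degree on the diagonal, minus the adjacency.
    lap = [[(sum(adjacency[i]) if i == j else 0.0) - adjacency[i][j]
            for j in range(n)] for i in range(n)]
    neg = [[-beta * v for v in row] for row in lap]
    kernel = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]          # zeroth term: identity
    for t in range(1, terms):
        # term becomes (-beta L)^t / t!
        term = [[v / t for v in row] for row in mat_mul(term, neg)]
        kernel = [[kernel[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return kernel

# Path graph 0 - 1 - 2.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
K = diffusion_kernel(path, beta=1.0)
```

On this path graph the end vertex 0 is more similar to its neighbour 1 than to the far end 2, which is the long-range behaviour the slide describes.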
  • 13. Kernel methods: Kernels for fingerprints. Objects that can be described by a long binary vector $x$ (a fingerprint representation of the object) can be compared using the Tanimoto kernel: $K_{Tan}(x_m, x_n) = \frac{\langle x_m, x_n \rangle}{\langle x_m, x_m \rangle + \langle x_n, x_n \rangle - \langle x_m, x_n \rangle}$.
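For binary fingerprints the Tanimoto kernel is the number of shared bits divided by the number of bits set in either vector. A minimal sketch, assuming fingerprints are lists of 0/1 values; returning 0.0 for two all-zero fingerprints is a convention chosen here, not something the slide specifies.

```python
def tanimoto_kernel(x_m, x_n):
    """<x_m, x_n> / (<x_m, x_m> + <x_n, x_n> - <x_m, x_n>)."""
    dot_mn = sum(a * b for a, b in zip(x_m, x_n))
    dot_mm = sum(a * a for a in x_m)
    dot_nn = sum(b * b for b in x_n)
    denominator = dot_mm + dot_nn - dot_mn
    # Assumed convention: two empty fingerprints get similarity 0.0.
    return dot_mn / denominator if denominator else 0.0

# Two hypothetical 5-bit fingerprints sharing 2 of their 4 distinct set bits.
fp_a = [1, 1, 0, 1, 0]
fp_b = [1, 0, 0, 1, 1]
similarity = tanimoto_kernel(fp_a, fp_b)
```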
  • 14. Learning relations: Kernels for pairs of objects. Problem statement: predict the binding interaction between a given protein and a ligand (small molecule). (Illustrated approaches: learning, molecular docking.) The problem deals with two types of objects: proteins (graph kernel of structure, sequence kernel, fingerprints...) and ligands (fingerprints, graph kernel...). The label is for a pair of objects.
  • 15. Learning relations: Kernels for pairs of objects. Pairwise kernel: combine the kernel matrices of the individual objects to construct a kernel matrix for pairs of objects, starting from individual kernels for the proteins and ligands. [Figure, cropped from a poster by Michiel Stock, Willem Waegeman and Bernard De Baets: data set → object kernels → pairwise kernel → learning algorithm (SVM, RLS, ...). Legible poster text: "By optimizing a ranking loss, our algorithms can also be used for conditional ranking, as shown on the right. In short, our framework is ideally suited for bioinformatics challenges: efficient learning process; can handle complex objects (graphs, trees, sequences...); ability to deal with information retrieval problems."]
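One way to combine two object kernels into a kernel on pairs, assumed here, is the Kronecker product: the similarity between (protein p, ligand l) and (protein q, ligand m) is the protein similarity times the ligand similarity. The sketch uses small made-up Gram matrices, and the pair indexing scheme is an arbitrary choice.

```python
def kronecker_pairwise_kernel(k_protein, k_ligand):
    """Gram matrix over all (protein, ligand) pairs.

    Pair (p, l) is stored at index p * n_ligands + l.
    """
    n_p, n_l = len(k_protein), len(k_ligand)
    return [
        [k_protein[p][q] * k_ligand[l][m]
         for q in range(n_p) for m in range(n_l)]
        for p in range(n_p) for l in range(n_l)
    ]

# Hypothetical object kernels: 2 proteins and 3 ligands.
k_protein = [[1.0, 0.2],
             [0.2, 1.0]]
k_ligand = [[1.0, 0.5, 0.0],
            [0.5, 1.0, 0.3],
            [0.0, 0.3, 1.0]]
k_pair = kronecker_pairwise_kernel(k_protein, k_ligand)  # 6 x 6 matrix
```

The pairwise matrix grows with the product of the two set sizes, so for larger data sets it is usually handled implicitly rather than stored as done here.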
  • 16. Learning relations: Conditional ranking (1). Motivation: suppose one is not particularly interested in the exact value of the interaction but in the order of the proteins for a given ligand. [Figure: database objects ordered by relevance ("more relevant") for two queries, Query 1 and Query 2.]
  • 17. Learning relations: Conditional ranking (2). Based on a graph description, with $e$ a pair of objects. Train the model $h(e) = \langle w, \Phi(e) \rangle = \sum_{\bar{e} \in E} a_{\bar{e}} K^{\Phi}(e, \bar{e})$ using the algorithm $A(T) = \operatorname{argmin}_{h \in \mathcal{H}} L(h, T) + \lambda \|h\|_{\mathcal{H}}^{2}$, where we use a ranking loss: $L(h, T) = \sum_{v \in V} \sum_{e, \bar{e} \in E_v} (y_e - y_{\bar{e}} - h(e) + h(\bar{e}))^{2}$. [Figure 1, cropped: example of a multi-graph used for conditional ranking; figure reproduced from (Pahikkala et al., 2010). The cropped accompanying text states that the proposed framework is based on the Kronecker product kernel, proposed independently for modeling pairwise inputs in different application domains (Basilico et al. 2004, Ben-Hur et al. 2005).]
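The ranking loss compares differences of labels with differences of predictions, separately for every query vertex. The sketch below evaluates that loss for given predictions; it does not train a model. Representing each edge as a (query, object) tuple, the function name and the toy values are all assumptions made for illustration.

```python
from collections import defaultdict

def conditional_ranking_loss(edges, labels, scores):
    """Sum over queries v and ordered edge pairs (e, e') in E_v of
    (y_e - y_e' - h(e) + h(e'))^2.

    edges  : list of (query, object) tuples
    labels : y_e for every edge, in the same order
    scores : predictions h(e) for every edge, in the same order
    """
    by_query = defaultdict(list)
    for (query, _), y, h in zip(edges, labels, scores):
        by_query[query].append((y, h))
    loss = 0.0
    for pairs in by_query.values():
        for y_e, h_e in pairs:
            for y_f, h_f in pairs:
                loss += (y_e - y_f - h_e + h_f) ** 2
    return loss

edges = [("lig1", "protA"), ("lig1", "protB"), ("lig2", "protA")]
labels = [3.0, 1.0, 2.0]
# Predictions shifted by a constant per query still give zero loss.
loss_shifted = conditional_ranking_loss(edges, labels, [2.5, 0.5, 9.0])
# Swapping the order for lig1 is penalized.
loss_swapped = conditional_ranking_loss(edges, labels, [1.0, 3.0, 0.0])
```

Because only differences within a query enter the loss, a constant offset in the predictions for one query has no effect, which matches the motivation of caring about order rather than exact values.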
  • 18. Case studies: Enzyme function prediction. Predicting enzyme function. Problem statement: predict the function (EC number) of an enzyme using structural information of the active site. Data: 1730 enzymes with 21 different functions; four different structural similarities: CavBase, maximum common subgraph, labeled point cloud superposition, fingerprints. [Figure: active site of an enzyme.]
  • 19. Case studies: Enzyme function prediction. EC numbers. EC number: a functional label of an enzyme, based on the reaction that is catalyzed. Example: EC 2.7.6.1 = ribose-phosphate diphosphokinase.
  • 20. Case studies: Enzyme function prediction. Defining catalytic similarity. Catalytic similarity: the number of successive equal digits in the EC numbers of two enzymes, starting from the first digit. [Figure: graph of the enzymes EC 2.7.7.34, EC 4.2.3.90, EC 4.6.1.11, EC 2.7.1.12, EC 2.7.7.12 and a query EC ?.?.?.?, with edges labeled by catalytic similarity.]
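The catalytic similarity is simple to compute from two EC strings. A minimal sketch, assuming EC numbers are passed as dot-separated strings without the "EC" prefix; the function name is illustrative, and the example values use EC numbers that appear in the slide's figure.

```python
def catalytic_similarity(ec_a, ec_b):
    """Count the leading EC fields that two enzymes share."""
    count = 0
    for field_a, field_b in zip(ec_a.split("."), ec_b.split(".")):
        if field_a != field_b:
            break          # stop at the first mismatch
        count += 1
    return count

same_subsubclass = catalytic_similarity("2.7.7.34", "2.7.7.12")
same_subclass = catalytic_similarity("2.7.1.12", "2.7.7.12")
same_class = catalytic_similarity("4.2.3.90", "4.6.1.11")
unrelated = catalytic_similarity("2.7.7.34", "4.2.3.90")
```

Comparing whole fields rather than characters matters for multi-digit fields such as 34 versus 3.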
  • 21. Case studies: Enzyme function prediction. Data exploration. [Figure: kernel PCA of the cb, fp, mcs and lpcs data (second and third components plotted against the first component) and hierarchical clustering (complete linkage) of the same four data sets.]
  • 22. Case studies: Enzyme function prediction. Ranking enzymes. For a query enzyme with unknown function, construct a ranking of a database of annotated enzymes, based on structure. The enzymes at the top of the ranking likely have the same function as the query. Unsupervised: for a given query enzyme with unknown function, rank the database according to the structural similarity with the query. Supervised: first a ranking model $h(v, v')$ is constructed using an independent training set; subsequently, for a given query enzyme $v$ with unknown function, rank the enzymes $v_i$ from the database according to $h(v, v_i)$.
  • 23. Case studies: Enzyme function prediction. Results ranking enzymes. Table II (caption partly cropped): The results obtained for unsupervised and supervised ranking. For each combination of cavity-b[...] performance measure the performance is averaged over the different folds and queries, with [...] between parentheses. For every row the best ranking model is marked in bold, while the worst model is marked by an underscore.
      Setting       Measure  cb               fp               mcs              lpcs
      Unsupervised  RA       0.9062 (0.0603)  0.8815 (0.0689)  0.8923 (0.0692)  0.8877 (0.0607)
      Unsupervised  MAP      0.9321 (0.1531)  0.7207 (0.235)   0.8846 (0.1578)  0.7339 (0.2074)
      Unsupervised  AUC      0.9636 (0.0795)  0.8655 (0.1387)  0.9393 (0.0919)  0.8794 (0.1126)
      Unsupervised  nDCG     0.9922 (0.0329)  0.9349 (0.1424)  0.9812 (0.0498)  0.9471 (0.1112)
      Supervised    RA       0.9951 (0.017)   0.995 (0.015)    0.9944 (0.0112)  0.9952 (0.0156)
      Supervised    MAP      0.9991 (0.0092)  0.9954 (0.0432)  0.9989 (0.0076)  0.9835 (0.0797)
      Supervised    AUC      0.9976 (0.0005)  0.9967 (0.0184)  0.9975 (0.0024)  0.9934 (0.0368)
      Supervised    nDCG     0.9968 (0.0171)  0.9942 (0.0424)  0.987 (0.0398)   0.9812 (0.0673)
  • 24. Case studies: Enzyme function prediction. Supervised ranking preserves hierarchies (1).
  • 25. Case studies: Enzyme function prediction. Supervised ranking preserves hierarchies (2). [Figure: prediction plotted against catalytic similarity (levels 0, 1, 2 and 4) for the unsupervised and the supervised models on the cb, fp, mcs and lpcs data.]
  • 26. Case studies: Protein-ligand interactions. Predicting protein-ligand interactions. Problem statement: predict the binding interaction between a given protein and a ligand (small molecule). (Illustrated approaches: learning, molecular docking.) Training uses the Karaman dataset: 317 kinase targets and 38 kinase inhibitors. For each combination the dissociation constant $K_d$ in nM is known.
  • 27. Case studies: Protein-ligand interactions. Karaman dataset. [Figure: © 2008 Nature Publishing Group, http://www.nature.com/naturebiotechnology]
  • 28. Case studies: Protein-ligand interactions. Building a model. Features: CavBase similarity for proteins; Tanimoto kernel from the fingerprints derived from ligands; virtual docking results. Model types: classification by specifying a cutoff value, using RLS; conditional ranking, which uses one type of object to construct a ranking of the other type according to binding energy.
  • 29. Case studies: Protein-ligand interactions. Protein-ligand results: classification.
      Test sampling  Cutoff [nM]  AUC
      new ligand     1000         0.621584 (0.104163)
      new ligand     10000        0.653330 (0.107727)
      new protein    1000         0.812184 (0.185627)
      new protein    10000        0.801310 (0.157205)
    The cutoff value hardly matters. Generalizing to a new ligand is harder than generalizing to a new protein.
  • 30. Case studies: Protein-ligand interactions. Protein-ligand results: ranking. Testing scheme: new query for the same database.
      Query type  Ranking error
      Ligand      0.324000 (0.129307)
      Protein     0.32799 (0.088344)
    The query type does not matter (much). Using a protein as query is somewhat more reliable.
  • 31. Case studies: Microbial ecology. Predicting microbial interactions. Problem statement: how do heterotrophic bacteria influence the growth of methanotrophic bacteria? Dataset: 10 methanotrophs and 27 heterotrophs. For each combination, a time series of their collective growth (optical density, OD) was measured for 14 days.
  • 32. Case studies: Microbial ecology. Concept. [Diagram relating methanotrophs, heterotrophs, methane and carbon compounds, with "vitamins?" and "antibiotics?" as possible interactions.] Features: ⊗
  • 33. Case studies: Microbial ecology. Experimental setup.
  • 34. Case studies: Microbial ecology. Optical density time series. [Figure: OD against time (days) for two example combinations, Meth_5 with Hetero_2 and Meth_7 with Hetero_10, annotated with the maximal OD and the maximal increase in OD.] Three types of labels were derived from these plots: maximal optical density; maximal increase in optical density; time of maximal increase in optical density.
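The three labels can be read off a measured time series with a few lines of code. The slides do not say how the increase was computed; the sketch below assumes it is the difference between consecutive measurements and reports the time at the end of the steepest interval. The function name and the example series are made up.

```python
def growth_labels(times, od):
    """Return (max OD, max increase in OD, time of max increase).

    Assumes the increase is taken between consecutive measurements and
    that its time is the end point of that interval.
    """
    max_od = max(od)
    increases = [later - earlier for earlier, later in zip(od, od[1:])]
    steepest = max(range(len(increases)), key=increases.__getitem__)
    return max_od, increases[steepest], times[steepest + 1]

# Hypothetical series: measurement days and optical densities.
days = [0, 2, 4, 6, 8]
density = [0.01, 0.02, 0.10, 0.15, 0.14]
max_od, max_increase, time_of_max_increase = growth_labels(days, density)
```

With unevenly spaced measurements, dividing each difference by the length of its time interval would be the more natural choice.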
  • 35. Case studies: Microbial ecology. Labels for bacterial combinations. [Figure: heat map of the logarithm of the maximal density, with heterotrophs (H1 to H25 and NMS) against methanotrophs (M1 to M9 and NMS), plus a color key and histogram.]
  • 36. Case studies: Microbial ecology. Regression results. Pairwise regression of the labels using support vector regression. Testing is done by withholding each heterotroph in a leave-one-out scheme.
      Label               MSE/var  Spearman cor.
      Max. OD             0.8248   0.6875
      Max. incr. OD       0.7888   0.57708
      Time max. incr. OD  0.9694   0.3839
    This is a hard problem! The exact experimental conditions are very important!
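The two reported measures can be computed as follows. This is a sketch under assumptions: "MSE/var" is read as the mean squared error divided by the variance of the true labels, so that always predicting the mean scores 1, and the Spearman correlation uses the shortcut formula that is only valid when there are no tied values. The example numbers are invented.

```python
def mse_over_variance(y_true, y_pred):
    """Mean squared error divided by the variance of the true labels."""
    n = len(y_true)
    mean = sum(y_true) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    variance = sum((t - mean) ** 2 for t in y_true) / n
    return mse / variance

def ranks(values):
    """Rank of every value (0 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result

def spearman(y_true, y_pred):
    """Spearman rank correlation via 1 - 6 sum(d^2) / (n (n^2 - 1))."""
    n = len(y_true)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(y_true), ranks(y_pred)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 1.0, 3.5, 3.0]
relative_error = mse_over_variance(y_true, y_pred)
rank_correlation = spearman(y_true, y_pred)
```

Under this reading, values of MSE/var close to 1 mean the model is barely better than predicting the average label.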
  • 37. Case studies: Microbial ecology. Extra: feature selection. Idea: look for the most relevant genes for interaction in the heterotrophs using lasso regression (in combination with the LARS algorithm) or Regularized Random Forests. For example, max. OD seems to be determined by genes related to methenyltetrahydrofolate. Take with a large grain of salt! [Figure: LARS lasso coefficient paths, standardized coefficients against |beta|/max|beta|.]
  • 38. Conclusions: Take-home messages. Use kernels for complex structured data. Relations can be learned by treating a pair of objects as a special kind of structured object. Predicting a ranking is in many cases a more relevant answer to a research question. Posing the right research question is of vital importance when building models!
  • 39. Conclusions: Further reading I. [1] A. Ben-Hur and W. S. Noble. Kernel methods for predicting protein-protein interactions. Bioinformatics, 21 Suppl 1:i38–46, June 2005. [2] S. Erdin, A. M. Lisewski, and O. Lichtarge. Protein function prediction: towards integration of similarity metrics. Current Opinion in Structural Biology, 21(2):180–8, Apr. 2011. [3] L. Jacob and J.-P. Vert. Protein-ligand interaction prediction: an improved chemogenomics approach. Bioinformatics, 24(19):2149–56, Oct. 2008. [4] T. Pahikkala, A. Airola, M. Stock, B. De Baets, and W. Waegeman. Efficient regularized least-squares algorithms for conditional ranking on relational data. Machine Learning, Submitted, 2012. [5] B. Schölkopf, K. Tsuda, and J.-P. Vert. Kernel Methods in Computational Biology. 2004.
  • 40. Conclusions: Further reading II. [6] M. Stock. Learning pairwise relations in bioinformatics: three case studies. Master's thesis, Ghent University, 2012. [7] J.-P. Vert, J. Qiu, and W. S. Noble. A new pairwise kernel for biological network inference with support vector machines. BMC Bioinformatics, 8(S-10), Jan. 2007. [8] W. Waegeman, T. Pahikkala, A. Airola, T. Salakoski, M. Stock, and B. De Baets. A kernel-based framework for learning graded relations from data. IEEE Transactions on Fuzzy Systems, 99:1, 2012.