AOT LAB, DII, UNIPR

SOCIAL NETWORK ANALYSIS
Enrico Franchi (efranchi@ce.unipr.it)

Outline

SNA = Complex Network Analysis on Social Networks

    Notation & Metrics
        Degree Distribution
        Path Lengths
        Transitivity
    Models
        Random Graphs
        Small-Worlds
        Preferential Attachment
    Models Discussion
    Conclusion

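Network
    $G = (V, E)$, $E \subset V^2$
    $\{(x, x) \mid x \in V\} \cap E = \emptyset$

Adjacency Matrix
    $A_{ij} = \begin{cases} 1 & \text{if } (i,j) \in E \\ 0 & \text{otherwise} \end{cases}$

Directed Network
    $k_i^{out} = \sum_j A_{ij}$, $k_i^{in} = \sum_j A_{ji}$
    $k_i = k_i^{in} + k_i^{out}$

Undirected Network
    $A$ symmetric
    $k_i = \sum_j A_{ji} = \sum_j A_{ij}$

Degree Distribution
    $p_x = \frac{1}{n} \#\{i \mid k_i = x\}$

Average Degree
    $\langle k \rangle = n^{-1} \sum_{x \in V} k_x$
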
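As a minimal sketch of the notation above, the degrees, the degree distribution and the average degree can be computed directly from an adjacency matrix. The function names and the toy graph are illustrative assumptions, not part of the slides.

```python
from collections import Counter

def degrees(A):
    """Return (k_out, k_in) for a 0/1 adjacency matrix A given as a list of rows."""
    n = len(A)
    k_out = [sum(A[i][j] for j in range(n)) for i in range(n)]  # k_i^out = sum_j A_ij
    k_in = [sum(A[j][i] for j in range(n)) for i in range(n)]   # k_i^in  = sum_j A_ji
    return k_out, k_in

def degree_distribution(k):
    """p_x = (1/n) * #{i : k_i = x}."""
    n = len(k)
    return {x: count / n for x, count in Counter(k).items()}

def average_degree(k):
    """<k> = (1/n) * sum of all degrees."""
    return sum(k) / len(k)

# Toy undirected graph (hypothetical): a star with centre 1 and leaves 0, 2, 3.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
k_out, k_in = degrees(A)
k = k_out  # A is symmetric, so k_i = sum_j A_ij = sum_j A_ji
```

For a directed network the total degree would instead be the element-wise sum of `k_in` and `k_out`.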
Measure of Transitivity

Local Clustering Coefficient
    $C_i = \binom{k_i}{2}^{-1} T(i)$
    $T(i)$: # distinct triangles with $i$ as vertex

Clustering Coefficient
    $C = \frac{1}{n} \sum_{i \in V} C_i$

    $C = \frac{\text{number of closed paths of length 2}}{\text{number of paths of length 2}} = \frac{(\text{number of triangles}) \times 3}{\text{number of connected triples}}$

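The two definitions of C generally give different values on the same graph. The sketch below computes both on a small example; the adjacency-dict representation, the function names and the convention $C_i = 0$ for $k_i < 2$ are assumptions made for illustration.

```python
from itertools import combinations

def local_clustering(adj, i):
    """C_i = T(i) / binom(k_i, 2); taken as 0 when k_i < 2 (a common convention)."""
    nbrs = adj[i]
    k = len(nbrs)
    if k < 2:
        return 0.0
    triangles = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])  # T(i)
    return triangles / (k * (k - 1) / 2)

def average_clustering(adj):
    """C = (1/n) * sum_i C_i."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)

def transitivity(adj):
    """C = (closed paths of length 2) / (paths of length 2)."""
    closed = 0
    triples = 0
    for centre, nbrs in adj.items():
        for u, v in combinations(nbrs, 2):
            triples += 1          # u - centre - v is a path of length 2
            if v in adj[u]:
                closed += 1       # the path is closed by the edge (u, v)
    return closed / triples if triples else 0.0

# Hypothetical example: triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
c_avg = average_clustering(adj)   # (1 + 1 + 1/3 + 0) / 4
c_glob = transitivity(adj)        # 3 closed paths out of 5
```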
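Shortest Path Length and Diameter

$(\mathcal{A}, +, \cdot)$: set of adjacency matrices $\mathcal{A}$ with scalar operations $+$ and $\cdot$

    $AB = A \mathbin{+.\cdot} B$
    $[AB]_{ij} = \sum_k A_{ik} \cdot B_{kj}$

The matrix product depends on the operations of the semiring.

Other matrix products make sense: e.g., $(\mathcal{A}, +, \wedge)$ or $(\mathcal{A}, \wedge, +)$, where $\wedge$ is min.

We consider: $S_k(M) = \left( M \mathbin{+.\wedge} M^k \mathbin{\wedge.+} M^k \right)$
Shortest path lengths matrix: $L = (S_n \circ \dots \circ S_1)(M)$

Diameter: $d = \max_{ij} L_{ij}$
Average shortest path: $\ell = \langle L_{ij} \rangle$
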
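To illustrate how changing the semiring changes what a matrix product computes, the sketch below uses the (min, +) product with repeated squaring to obtain all shortest path lengths. This is the textbook repeated-squaring scheme, not necessarily the exact $S_k$ operator of the slide; the function names are assumptions.

```python
import math

INF = math.inf

def min_plus(A, B):
    """Matrix product over the (min, +) semiring: [A min.+ B]_ij = min_k (A_ik + B_kj)."""
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shortest_path_lengths(adj):
    """All-pairs shortest path lengths of an unweighted graph given as a 0/1 matrix."""
    n = len(adj)
    # Distance matrix for paths of at most one edge: 0 on the diagonal, 1 per edge.
    L = [[0 if i == j else (1 if adj[i][j] else INF) for j in range(n)]
         for i in range(n)]
    reach = 1
    while reach < n - 1:      # each squaring doubles the maximum path length covered
        L = min_plus(L, L)
        reach *= 2
    return L

def diameter_and_aspl(L):
    """Diameter and average shortest path length over connected pairs i != j."""
    n = len(L)
    finite = [L[i][j] for i in range(n) for j in range(n) if i != j and L[i][j] < INF]
    return max(finite), sum(finite) / len(finite)

# Hypothetical example: the path graph 0 - 1 - 2 - 3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
L = shortest_path_lengths(adj)
d, aspl = diameter_and_aspl(L)
```

Each (min, +) product costs on the order of n^3 scalar operations in this naive form, so the approach is mostly interesting when the products can be parallelised.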
Computational Complexity of ASPL:

  All pairs shortest path matrix based (parallelizable): $O(n^{3+\alpha})$, $\alpha \approx 3/4$
  All pairs shortest path Bellman-Ford: $O(n^3)$
  All pairs shortest path Dijkstra w. Fibonacci Heaps: $O(n^2 \log n + nm)$

Computing the CPL

  $x = M_q(S)$: $q \cdot \#S$ elements are $\le x$ and $(1-q) \cdot \#S$ are $> x$
  $x = L_{q\delta}(S)$: $q \cdot \#S \cdot (1-\delta)$ elements are $\le x$ and $(1-q) \cdot \#S \cdot (1-\delta)$ are $> x$

Huber Algorithm

  $s = \frac{2}{q^2} \ln\frac{2}{\epsilon} \, \frac{(1-\delta)^2}{\delta^2}$

  Let $R$ be a random sample of $S$ such that $\#R = s$; then $L_{q\delta}(S) = M_q(R)$ with probability $p = 1 - \epsilon$.

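The sampling idea can be sketched as follows: compute the sample size s, draw that many elements at random, and take their exact q-quantile. The sketch assumes the sample-size formula reads $s = (2/q^2) \ln(2/\epsilon) (1-\delta)^2/\delta^2$, that S is available as a list of values (for instance, shortest path lengths), and that the sample is drawn without replacement; the function names are hypothetical.

```python
import math
import random

def sample_size(q, delta, eps):
    """s = (2 / q^2) * ln(2 / eps) * (1 - delta)^2 / delta^2, rounded up."""
    return math.ceil(2.0 / q**2 * math.log(2.0 / eps) * (1.0 - delta) ** 2 / delta**2)

def quantile(values, q):
    """M_q: the smallest element x such that at least q * #S elements are <= x."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, max(0, math.ceil(q * len(ordered)) - 1))
    return ordered[idx]

def approx_quantile(values, q, delta, eps, rng=random):
    """Estimate the q-quantile of `values` from a random sample of size s."""
    s = min(len(values), sample_size(q, delta, eps))  # cannot sample more than #S
    return quantile(rng.sample(values, s), q)
```

With q = 1/2 this estimates the median of the values, at a cost that depends on the accuracy parameters rather than on the size of S.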
Facebook Hugs Degree Distribution

[Figure: degree distribution on log-log axes; x axis from 1 to 1000, y axis from 1 to 10000000.]

Nodes: 1322631    Edges: 1555597
m/n: 1.17         CPL: 11.74
Clustering Coefficient: 0.0527
Number of Components: 18987
Isles: 0
Largest Component Size: 1169456

For large k we have statistical fluctuations.
For small k power-laws do not hold.

Many networks have power-law degree distribution.

    $p_k \propto k^{-\gamma}$, $\gamma > 1$
    $\langle k^r \rangle = ?$

•   Citation networks
•   Biological networks
•   WWW graph
•   Internet graph
•   Social Networks

[Figure: "Power-Law: gamma=3", log-log axes; x axis from 1 to 1000, y axis from 0.1 to 1000000.]

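The question about the moments $\langle k^r \rangle$ of a power-law distribution can be worked out under a continuous approximation. The lower cutoff $k_{\min}$ and the normalised density below are assumptions introduced here for illustration, not taken from the slides.

```latex
p(k) = (\gamma - 1)\, k_{\min}^{\gamma - 1}\, k^{-\gamma}, \qquad k \ge k_{\min},\ \gamma > 1

\langle k^r \rangle = \int_{k_{\min}}^{\infty} k^r\, p(k)\, dk
  = (\gamma - 1)\, k_{\min}^{\gamma - 1} \int_{k_{\min}}^{\infty} k^{r - \gamma}\, dk
  = \frac{\gamma - 1}{\gamma - 1 - r}\, k_{\min}^{r}
  \qquad \text{if } \gamma > r + 1
```

Under this approximation the r-th moment diverges when $\gamma \le r + 1$; for $\gamma = 3$ the mean is finite while the second moment diverges.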
Erdős-Rényi Random Graphs

$G(n, p)$
$G(n, m)$
$\Pr(A_{ij} = 1) = p$
Connectedness threshold: $\log n / n$

Ensembles of Graphs
When we describe values of properties, we actually mean the expected value of the property.

    $d := \langle d \rangle = \sum_G \Pr(G) \cdot d(G) \propto \frac{\log n}{\log \langle k \rangle}$
    $\Pr(G) = p^m (1-p)^{\binom{n}{2} - m}$
    $\langle m \rangle = \binom{n}{2} p$
    $\langle k \rangle = (n-1) p$
    $C = \langle k \rangle (n-1)^{-1}$
    $p_k = \binom{n-1}{k} p^k (1-p)^{n-1-k}$
    $n \to \infty$: $p_k = e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}$

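The ensemble view can be checked numerically: sample several graphs from G(n, p), measure a property on each, and compare the average with the expected value. The sketch below does this for the mean degree; the parameter values, the seed and the function names are arbitrary assumptions.

```python
import random
from itertools import combinations

def gnp(n, p, rng):
    """Sample G(n, p): each of the binom(n, 2) possible edges is present
    independently with probability p."""
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def mean_degree(adj):
    """Average degree of one sampled graph."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

rng = random.Random(42)                   # fixed seed, only for reproducibility
n, p = 400, 0.02
samples = [mean_degree(gnp(n, p, rng)) for _ in range(20)]
estimate = sum(samples) / len(samples)    # ensemble average of the mean degree
expected = (n - 1) * p                    # <k> = (n - 1) p
```

The same loop can be reused for any other property (clustering coefficient, diameter) by swapping `mean_degree` for another function of the sampled graph.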
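Watts-Strogatz Model
In the modified model, we only add the edges.

    $k_i = \kappa + s_i$    ($\kappa$: edges in the lattice; $s_i$: # added shortcuts)
    $p_s = e^{-\kappa p} \frac{(\kappa p)^s}{s!}$
    $p_k = e^{-\kappa p} \frac{(\kappa p)^{k-\kappa}}{(k-\kappa)!}$
    $C = \frac{3(\kappa - 2)}{4(\kappa - 1) + 8\kappa p + 4\kappa p^2}$
    $\ell \approx \frac{\log(n p \kappa)}{\kappa^2 p}$
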
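A minimal sketch of the modified model, in which shortcuts are added and no lattice edge is rewired (a variant usually attributed to Newman and Watts). It assumes a ring lattice with even $\kappa$, one shortcut trial per lattice edge, and rejection of self-loops and duplicate edges; the function name is hypothetical.

```python
import random

def small_world_add_only(n, kappa, p, rng):
    """Ring lattice of degree kappa (even) plus random shortcuts.

    Returns the adjacency dict and s, where s[i] is the number of
    shortcut ends at node i, so that k_i = kappa + s[i].
    """
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, kappa // 2 + 1):   # kappa / 2 neighbours on each side
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    s = [0] * n
    for _ in range(n * kappa // 2):          # one trial per lattice edge
        if rng.random() < p:
            u, v = rng.randrange(n), rng.randrange(n)
            if u != v and v not in adj[u]:   # keep the graph simple
                adj[u].add(v)
                adj[v].add(u)
                s[u] += 1
                s[v] += 1
    return adj, s

adj, s = small_world_add_only(2000, 4, 0.1, random.Random(7))
mean_shortcut_ends = sum(s) / len(s)         # should be close to kappa * p
```

Because about $n \kappa p / 2$ shortcuts are added and each has two ends, the mean of $s_i$ is close to $\kappa p$, which is the mean of the Poisson distribution $p_s$.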
Watts-Strogatz Model - 10000 nodes, k = 4

[Figure: CPL(p)/CPL(0) and C(p)/C(0) plotted against p, both axes from 0 to 1; annotations mark the "Short CPL Threshold" and the "Large Clustering Coefficient Threshold".]

[Image © Matt Britt]

Barabási-Albert Model

BARABASI-ALBERT-MODEL(G, M, STEPS)
  FOR K FROM 1 TO STEPS
    N0 ← NEW-NODE(G)
    ADD-NODE(G, N0)
    A ← MAKE-ARRAY()
    FOR N IN NODES(G)
      PUSH(A, N)
      FOR J FROM 1 TO DEGREE(N)
        PUSH(A, N)
    FOR J FROM 1 TO M
      N ← RANDOM-CHOICE(A)
      ADD-LINK(N0, N)

    $\Pr(V = x) = \sum_{e \in N(x)} \Pr(E = e) = \frac{k_x}{m} = \frac{2 k_x}{\sum_x k_x}$
    $p_k \propto k^{-3}$
    $\ell \approx \frac{\log n}{\log \log n}$
    $C \approx n^{-3/4}$

Connectedness threshold: $\log n / \log \log n$
Scale-free entails short CPL.
Transitivity disappears with network size.
No analytical proof available.

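The growth procedure can be sketched in runnable form as below. This is one possible reading, not the slide's exact procedure: it assumes a ring of `m0` nodes as seed graph (with `m0 >= 3` and `m <= m0`), weights nodes by their degree rather than by degree plus one, keeps the list of edge ends incrementally instead of rebuilding it at every step, and draws distinct targets so that no self-loop or duplicate edge appears. All names are hypothetical.

```python
import random

def barabasi_albert(m0, m, steps, rng):
    """Grow a graph by preferential attachment; returns an adjacency dict of sets."""
    # Seed graph (an arbitrary choice): a ring, so every node starts with degree 2.
    adj = {i: {(i - 1) % m0, (i + 1) % m0} for i in range(m0)}
    # Each node appears in `ends` once per incident edge, so a uniform draw
    # from `ends` picks a node with probability proportional to its degree.
    ends = [i for i, nbrs in adj.items() for _ in nbrs]
    for _ in range(steps):
        new = len(adj)
        targets = set()
        while len(targets) < m:            # m distinct existing nodes
            targets.add(rng.choice(ends))
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            ends.extend((new, t))          # the new edge contributes two ends
    return adj

g = barabasi_albert(5, 3, 200, random.Random(1))
n_edges = sum(len(nbrs) for nbrs in g.values()) // 2
```

Keeping `ends` up to date costs constant time per added edge, whereas rebuilding the array at each step, as in the pseudocode, costs time proportional to the current number of edges.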
OSN          Refs.            Users   Links    <k>    C     CPL   d     γ     r
Club Nexus   Adamic et al     2.5 K   10 K     8.2    0.2   4     13    n.a.  n.a.
Cyworld      Ahn et al        12 M    191 M    31.6   0.2   3.2   16          -0.1
Cyworld T    Ahn et al        92 K    0.7 M    15.3   0.3   7.2   n.a.  n.a.  0.4
LiveJournal  Mislove et al    5 M     77 M     17     0.3   5.9   20          0.2
Flickr       Mislove et al    1.8 M   22 M     12.2   0.3   5.7   27          0.2
Twitter      Kwak et al       41 M    1700 M   n.a.   n.a.  4     4.1         n.a.
Orkut        Mislove et al    3 M     223 M    106    0.2   4.3   9     1.5   0.1
Orkut        Ahn et al        100 K   1.5 M    30.2   0.3   3.8   n.a.  3.7   0.3
Youtube      Mislove et al    1.1 M   5 M      4.29   0.1   5.1   21          -0
Facebook     Gjoka et al      1 M     n.a.     n.a.   0.2   n.a.  n.a.        0.23
FB H         Nazir et al      51 K    116 K    n.a.   0.4   n.a.  29          n.a.
FB GL        Nazir et al      277 K   600 K    n.a.   0.3   n.a.  45          n.a.
BrightKite   Scellato et al   54 K    213 K    7.88   0.2   4.7   n.a.        n.a.
FourSquare   Scellato et al   58 K    351 K    12     0.3   4.6   n.a.        n.a.
LiveJournal  Scellato et al   993 K   29.6 M   29.9   0.2   4.9   n.a.        n.a.
Twitter      Java et al       87 K    829 K    18.9   0.1   n.a.  6           0.59
Twitter      Scellato et al   409 K   183 M    447    0.2   2.8   n.a.        n.a.

Model   Static   Deg       C         Rigid
ER      Yes      Poisson   Low       -
WS      Yes      Poisson   Ok        Yes
BA      No       PL γ=3    Fixable   Yes

Moreover:
•    Mostly no navigability
•    Uniformity assumption
•    Sometimes too complex for analytic study
•    Few features studied
•    Power-law?

Alternative models for degree distributions
Power-laws are difficult to fit.
When they do fit, other distributions often fit better.

Power-law with cutoff almost always fits better than plain power-law.
    $f(x; \gamma, \beta) = x^{-\gamma} e^{\beta x}$

Sometimes the log-normal distribution is more appropriate.
    $f(x; \sigma, m) = \frac{1}{x \sigma (2\pi)^{1/2}} \exp\left( -\frac{(\log(x/m))^2}{2\sigma^2} \right)$

Most of the time random and preferential attachment processes concur.
    $F(x; r) = 1 - (rm)^{1+r} (x + rm)^{-(1+r)}$
    $r \to 0$: scale-free
    $r \to \infty$: negative exponential dist.

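The two limiting regimes of $F(x; r)$ can be read off by rewriting its complement; the short derivation below is a sketch using only the formula for $F(x; r)$ given on this slide.

```latex
1 - F(x; r) = \left( \frac{rm}{x + rm} \right)^{1 + r}
            = \left( 1 + \frac{x}{rm} \right)^{-(1 + r)}

r \to \infty: \quad \left( 1 + \frac{x}{rm} \right)^{-(1 + r)} \to e^{-x/m}
  \quad\Rightarrow\quad F(x; r) \to 1 - e^{-x/m}

x \gg rm: \quad 1 - F(x; r) \approx \left( \frac{rm}{x} \right)^{1 + r} \propto x^{-(1 + r)}
```

For large r the distribution approaches an exponential with mean m, while for small r the power-law tail already holds for small x.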
Milgram’s Experiment
•     Random people from Omaha (Nebraska) & Wichita (Kansas) were asked to
      send a postcard to a person in Boston (Massachusetts):
•     Write the name on the postcard
•     Forward the message only to people personally known
      who were more likely to know the target

1st run: 64/296 arrived, most delivered to him by 2 men
2nd run: 24/160 arrived, 2/3 delivered by “Mr. Jacobs”
2 ≤ hops ≤ 10; µ=5.x

6 Degrees: CPL, hubs, ... and Kleinberg’s Intuition

Biased Preferential Attachment
At each step:

    A new node is added to the network and is assigned to one of the
    sets P, I and L according to a probability distribution h.

    $e_0 \in \mathbb{N}^+$ edges are added to the network.

    For each edge (u,v), u is chosen with distribution $D^0$ and:
        if u ∈ I, v is a new node and is assigned to P;
        if u ∈ L, v is chosen according to $D^\gamma$.

    $D^\beta(u) \propto \begin{cases} (\beta + 1)(k_u + 1) & u \in L \\ k_u + 1 & u \in I \\ 0 & u \in P \end{cases}$

No analytic results available.

Transitive Linking Model [Davidsen 02]

•    At each step:
         TL: a random node is chosen, and it introduces two other nodes that
             are linked to it; if the node does not have 2 edges, it introduces
             itself to a random node
         RM: with probability p a node is chosen and removed along with its edges
             and replaced with a node with one random edge
•    When $p \ll 1$ the TL dominates the process:
         •   the degree distribution is a power-law with cutoff
         •   $1 - C = p(\langle k \rangle - 1)$, i.e., quite large in practice
•    For larger values of p the two different processes concur to form an
     exponential degree distribution
•    For $p \approx 1$ the degree distribution is essentially a Poisson
     distribution

(Slide from: Bergenti, Franchi, Poggi (Univ. Parma), Models for Agent-based Simulation of SN, SNAMAS ’11)

Instead of p it would make sense to have distinct p and r
parameters for nodes leaving and entering the network.

Few analytic results available.

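The TL and RM steps can be sketched as a small simulation. The sketch assumes a fixed number of nodes n, an initially empty graph, and that a removed node is replaced by reusing its index; a newcomer that happens to draw itself simply starts without an edge. The function name and parameters are hypothetical.

```python
import random

def transitive_linking(n, p, steps, rng):
    """Simulate the TL and RM steps; returns an adjacency dict of sets."""
    adj = {i: set() for i in range(n)}

    def link(u, v):
        if u != v:                      # never create self-loops
            adj[u].add(v)
            adj[v].add(u)

    for _ in range(steps):
        # TL: a random node introduces two of its neighbours to each other.
        i = rng.randrange(n)
        if len(adj[i]) >= 2:
            u, v = rng.sample(sorted(adj[i]), 2)
            link(u, v)
        else:
            # Fewer than 2 edges: the node introduces itself to a random node.
            link(i, rng.randrange(n))
        # RM: with probability p a random node leaves with all its edges and
        # is replaced by a newcomer (same index) with one random edge.
        if rng.random() < p:
            j = rng.randrange(n)
            for nb in adj[j]:
                adj[nb].discard(j)
            adj[j] = set()
            link(j, rng.randrange(n))
    return adj

g = transitive_linking(200, 0.05, 5000, random.Random(3))
```

Using two separate probabilities, one for removal and one for arrival, would only require splitting the RM branch into two independent draws.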
[1] Dorogovtsev, S. N. and Mendes, J. F. F. 2003. Evolution of Networks: From Biological Nets to the Internet and WWW (Physics). Oxford University Press, USA.
[2] Watts, D. J. 2003. Small Worlds: The Dynamics of Networks between Order and Randomness (Princeton Studies in Complexity). Princeton University Press.
[3] Jackson, M. O. 2010. Social and Economic Networks. Princeton University Press.
[4] Newman, M. 2010. Networks: An Introduction. Oxford University Press, USA.
[5] Wasserman, S. and Faust, K. 1994. Social Network Analysis: Methods and Applications (Structural Analysis in the Social Sciences). Cambridge University Press.
[6] Scott, J. P. 2000. Social Network Analysis: A Handbook. Sage Publications Ltd.
[7] Kepner, J. and Gilbert, J. 2011. Graph Algorithms in the Language of Linear Algebra (Software, Environments, and Tools). Society for Industrial & Applied Mathematics.
[8] Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. 2009. Introduction to Algorithms. The MIT Press.
[9] Skiena, S. S. 2010. The Algorithm Design Manual. Springer.
[10] Bollobás, B. 1998. Modern Graph Theory. Springer.
[11] Watts, D. J. and Strogatz, S. H. 1998. Collective dynamics of ‘small-world’ networks. Nature. 393, 6684, 440-442.
[12] Barabási, A. L. and Albert, R. 1999. Emergence of scaling in random networks. Science. 286, 5439, 509.
[13] Kleinberg, J. 2000. The small-world phenomenon: an algorithm perspective. Proceedings of the thirty-second annual ACM symposium on Theory of computing. 163-170.
[14] Milgram, S. 1967. The small world problem. Psychology today. 2, 1, 60-67.

Thanks for your kind attention.

Enrico Franchi (efranchi@ce.unipr.it)
AOTLAB, Dipartimento Ingegneria dell’Informazione,
Università di Parma
