Course Instructor: Dr. C. Sreedhar
ADVANCED DATA STRUCTURES & ALGORITHM ANALYSIS
B.Tech III Sem
Scheme 2023
Source: Some of the images and content are adapted from Internet sources
Asymptotic Notation
 Big-O notation:
 provides upper bound of a function.
 represents worst-case scenario
 Omega notation:
 provides lower bound of a function.
 represents best-case scenario
 Theta notation:
 provides both an upper and a lower bound of a function (a tight bound)
 often associated with the average-case scenario
O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(2ⁿ) < O(n!)
Big O notation
f(n) = O(g(n)), iff there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0
Big O notation: Example
Consider f(n) = 3n+2
Can this function be represented as O(g(n))?
f(n) = O(g(n)), iff there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0
f(n) ≤ c·g(n)
3n+2 ≤ c·g(n)
3n+2 ≤ 4n for all n ≥ 2
f(n) = O(g(n)), i.e., 3n+2 = O(n) with c = 4 and n0 = 2
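As a quick numerical sanity check (not from the slides), the inequality 3n+2 ≤ 4n can be verified for the first few values of n ≥ n0 = 2; the loop bound 10 below is an arbitrary illustrative choice.

#include <stdio.h>

int main(void)
{
    /* check the Big O witness above: f(n) = 3n+2 <= c*g(n) = 4n for n >= 2 */
    for (int n = 2; n <= 10; n++)
        printf("n=%d  f(n)=%d  c*g(n)=%d  holds=%d\n", n, 3*n + 2, 4*n, (3*n + 2) <= 4*n);
    return 0;
}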
Big O notation: Example 2
 f(n) = 3n² + 2n + 4
 To find an upper bound of f(n), we have to find c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0
 c = 9 and n0 = 1
 0 ≤ 3n² + 2n + 4 ≤ 9n²
 f(n) = O(g(n)) = O(n²) for c = 9, n0 = 1
Big O notation (O)
 Big O notation is helpful in finding the worst-case
time complexity of a particular program.
 Examples of Big O time complexity:
 Linear search: O(N), where N is the number of elements in the given array
 Binary search: O(Log N), where N is the number of elements in the given array
 Bubble sort: O(N²), where N is the number of elements in the given array
Omega notation
f(n) = Ω(g(n)), iff there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0
Omega notation: Example
f(n) = 3n+2
Can this function be represented as Ω(g(n))?
f(n) = Ω(g(n)), iff there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0
c·g(n) ≤ f(n); c = 3 and n0 = 1
3n ≤ 3n+2
f(n) = Ω(g(n)) = Ω(n) for c = 3, n0 = 1
Omega notation: Example
f(n) = 8n² + 2n - 3
Can this function be represented as Ω(g(n))?
f(n) = Ω(g(n)), iff there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0
c·g(n) ≤ f(n); c = 7 and n0 = 1
7n² ≤ 8n² + 2n - 3
f(n) = Ω(g(n)) = Ω(n²) for c = 7, n0 = 1
Theta notation
f(n) = Θ(g(n)), iff there exist positive constants c1, c2 and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0
Theta notation: Example
f(n) = 4n+3
Can this function be represented as Θ(g(n))?
f(n) = Θ(g(n)), iff there exist positive constants c1, c2 and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0
Consider c1 = 4, c2 = 5 and n0 = 3
At n = 3: 0 ≤ 4·3 ≤ 4(3)+3 ≤ 5·3, and 4n ≤ 4n+3 ≤ 5n holds for every n ≥ 3
f(n) = Θ(g(n)) = Θ(n) for c1 = 4, c2 = 5, n0 = 3
Theta notation: Example
 Consider f(n) = 3n+2. Represent f(n) in terms of
Θ(n) and Find c1,c2 and n0
 Solution
 c1 = 3
 c2 = 4
 n0 = 2
O(1): Constant Time
O(n): Linear Time
O(log(n)) : Logarithmic Time
For example, when n is 8, the while loop will iterate log₂(8) = 3 times.
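A minimal C sketch of the kind of loop being described (the loop variable is halved each iteration); the variable names and the choice n = 8 are illustrative assumptions, not code from the slides.

#include <stdio.h>

int main(void)
{
    int n = 8, count = 0;
    int i = n;
    while (i > 1) {      /* i is halved on every pass, so the loop runs about log2(n) times */
        i = i / 2;
        count++;
    }
    printf("n = %d, iterations = %d\n", n, count);   /* prints 3 for n = 8 */
    return 0;
}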
O(n log(n)) Example
O(nᵐ): Polynomial Time
Factorial Time Algorithms – O(n!)
AVL Tree
 AVL tree: a binary search tree that uses modified add and remove
operations to stay balanced as its elements change.
 AVL trees are self-balancing binary search trees.
 Invented in 1962 by Adelson-Velskii and Landis (AVL)
 Properties of AVL:
 Sub-trees of every node differ in height by at most one level.
 Every sub-tree is an AVL tree
 In AVL trees, the balance factor of each node is either -1, 0, or 1
Self balancing trees
 make sure that a tree remains balanced as we insert
new nodes.
 Examples:
 AVL trees
 Red-black trees
 Splay trees
 B-trees
AVL Tree
 basic idea: When nodes are added to / removed
from the tree, if the tree becomes unbalanced,
repair the tree until balance is restored.
 Balance factor (b) of every node satisfies -1 ≤ b ≤ 1
AVL Tree
 At any node, there are 3 possibilities: the node is left heavy, balanced, or right heavy
Balance factor
 Bal_Factor(T) = Height(T.left) - Height(T.right)
 In AVL tree, no node's two child subtrees differ in height by more than 1.
 Balance factor values are -1, 0, or 1.
 If balance factor of any node is 1, it means that the left sub-tree is one
level higher than the right sub-tree.
 If balance factor of any node is 0, it means that the left sub-tree and the right sub-tree have equal height.
 If balance factor of any node is -1, it means that the left sub-tree is one
level lower than the right sub-tree
Defining node, Height and balance factor
struct Node
{
    int key;
    struct Node *left;
    struct Node *right;
    int height;                 /* height of the subtree rooted at this node */
};

/* Height of a subtree; an empty subtree has height 0 */
int getHeight(struct Node *n)
{
    if (n == NULL) return 0;
    return n->height;
}

/* Balance factor = height of left subtree - height of right subtree */
int getBalanceFactor(struct Node *n)
{
    if (n == NULL) return 0;
    return getHeight(n->left) - getHeight(n->right);
}
AVL Tree: Balance Factor Example 1
Balance Factor(X) = Height(left subtree of X) - Height(right subtree of X)
Bal_Factor(8) = H(LS(8)) - H(RS(8)) = 0 - 0 = 0
Bal_Factor(12) = H(LS(12)) - H(RS(12)) = 0 - 0 = 0
Bal_Factor(10) = H(LS(10)) - H(RS(10)) = 1 - 1 = 0
Bal_Factor(15) = H(LS(15)) - H(RS(15)) = 2 - 2 = 0
AVL Tree: Balance Factor Example 2
Balance Factor(X) = Height(left subtree of X) - Height(right subtree of X)
(Tree: root 30; left child 10 with children 8 and 21, where 21 has left child 12; right child 56 with right child 64.)
Bal_Factor(12) = 0 - 0 = 0
Bal_Factor(8) = 0 - 0 = 0
Bal_Factor(21) = 1 - 0 = 1
Bal_Factor(10) = 1 - 2 = -1
Bal_Factor(56) = 0 - 1 = -1
Bal_Factor(30) = 3 - 2 = 1
AVL Tree Rotations
Single Rotations: Left Rotation (LL Rotation), Right Rotation (RR Rotation)
Double Rotations: Left-Right Rotation (LR Rotation), Right-Left Rotation (RL Rotation)
Rotations convert an unbalanced tree into a balanced tree.
AVL Tree Insertion: Left unbalanced (RR)
Unbalanced tree: inserting 3, 2, 1 gives the chain 3 → 2 → 1 with balance factors 2, 1, 0.
The tree is left unbalanced, i.e., left heavy, so perform a Right Rotation.
Balanced tree: root 2 with children 1 and 3, all balance factors 0.
AVL Tree Insertion: Right unbalanced (LL)
Unbalanced tree: inserting 1, 2, 3 gives the chain 1 → 2 → 3, making the tree right heavy.
The tree is right unbalanced, i.e., right heavy, so perform a Left Rotation.
Balanced tree: root 2 with children 1 and 3, all balance factors 0.
AVL Tree: Double Rotation
IF tree is right heavy
{
IF tree's right subtree is left heavy
{ Perform Double Left rotation }
ELSE
{ Perform Single Left rotation }
}
ELSE IF tree is left heavy
{
IF tree's left subtree is right heavy
{ Perform Double Right rotation }
ELSE { Perform Single Right rotation } }
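The two single rotations used above can be sketched directly in C, reusing the struct Node and getHeight() defined earlier; the helper max() and the convention that an empty subtree has height 0 are assumptions made here for illustration, not code from the slides.

int max(int a, int b) { return (a > b) ? a : b; }

/* Right rotation: y's left child x becomes the new root of this subtree */
struct Node *rightRotate(struct Node *y)
{
    struct Node *x = y->left;
    struct Node *T2 = x->right;

    x->right = y;          /* y moves down to the right of x    */
    y->left = T2;          /* x's old right subtree moves to y  */

    /* recompute heights bottom-up */
    y->height = 1 + max(getHeight(y->left), getHeight(y->right));
    x->height = 1 + max(getHeight(x->left), getHeight(x->right));
    return x;
}

/* Left rotation: x's right child y becomes the new root of this subtree */
struct Node *leftRotate(struct Node *x)
{
    struct Node *y = x->right;
    struct Node *T2 = y->left;

    y->left = x;
    x->right = T2;

    x->height = 1 + max(getHeight(x->left), getHeight(x->right));
    y->height = 1 + max(getHeight(y->left), getHeight(y->right));
    return y;
}

A double rotation is then just two calls, e.g. an LR case is node->left = leftRotate(node->left); followed by rightRotate(node).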
AVL: Double Rotations: LR Rotation
 LR rotation is a combination of a left rotation followed by a right rotation. It is performed when a node becomes left heavy and the left child of that node is right heavy (example: insert 3, 1, 2 in that order).
AVL: LR Rotation
Insert 3, 1, 2: node 3 is left heavy and its left child 1 is right heavy. A left rotation at 1 followed by a right rotation at 3 yields root 2 with children 1 and 3 (all balance factors 0).
AVL: RL Rotation
 An RL rotation is a combination of a right rotation followed by a left rotation. It is performed when a node becomes right heavy and the right child of that node is left heavy (example: insert 1, 3, 2 in that order).
AVL: RL Rotation
Insert 1, 3, 2: node 1 is right heavy and its right child 3 is left heavy. A right rotation at 3 followed by a left rotation at 1 yields root 2 with children 1 and 3.
Exercise
 Construct an AVL Tree with the following keys: 1, 2, 3, 4, 5, 6
Construct AVL tree: 1,2,3,4,5,6
Insert 1, 2, 3: node 1's balance factor becomes -2.
Affected node: 1. The tree is right heavy, hence rotate left (LL).
After rotation: root 2 with children 1 and 3 (all balance factors 0).
Construct AVL tree: 1,2,3,4,5,6 (contd.)
Insert 4, then 5: node 3's balance factor becomes -2.
Affected node: 3. The tree is right heavy, hence rotate left (LL).
After rotation: root 2, left child 1, right child 4 with children 3 and 5.
Insert 6: node 2's balance factor becomes -2.
Affected node: 2. The tree is right heavy, hence rotate left (LL).
After rotation: root 4, left child 2 with children 1 and 3, right child 5 with right child 6.
LR Rotation (Double Rotation)
Before rotation: Grand Parent (GP), Parent (P) is GP's left child, Child (C) is P's right child; the newly inserted node is either the Left Child (LC) or the Right Child (RC) of C.
The left subtree is heavy, and within it the right subtree is heavy.
The new node may be inserted as the left child or as the right child of C; both cases are handled the same way, as shown next.
LR Rotation (Double Rotation): new node inserted as Left Child
Before rotation: Grand Parent (GP), Parent (P), Child (C), Left Child (LC). The left subtree is heavy, and within it the right subtree is heavy; the new node is inserted as the left child of C.
After rotation:
C is stored as the root
P is the left child of C
GP is the right child of C
C's right tree becomes GP's left tree
C's left tree (containing LC) becomes P's right tree
LR Rotation (Double Rotation): new node inserted as Right Child
Before rotation: Grand Parent (GP), Parent (P), Child (C), Right Child (RC). The left subtree is heavy, and within it the right subtree is heavy; the new node is inserted as the right child of C.
After rotation:
C is stored as the root
P is the left child of C
GP is the right child of C
C's right tree (containing RC) becomes GP's left tree
C's left tree becomes P's right tree
Construct AVL tree: 30,9,38,6,11 and insert the node 10
After Insert(10), node 30 becomes unbalanced (balance factor 2).
Affected node: 30. The tree is left heavy and the right subtree of its left child is heavy, hence a Left-Right (LR) rotation with GP = 30, P = 9, C = 11, LC = 10.
Final tree: root 11; left child 9 with children 6 and 10; right child 30 with right child 38.
Construct AVL tree: 30,9,38,6,11 and insert the node 12
After Insert(12), node 30 becomes unbalanced (balance factor 2).
Affected node: 30. The tree is left heavy and the right subtree of its left child is heavy, hence a Left-Right (LR) rotation with GP = 30, P = 9, C = 11, RC = 12.
Final tree: root 11; left child 9 with left child 6; right child 30 with children 12 and 38.
AVL: 60,50,80,40,52,70,90,30,42,51,54; insert node 56
After inserting 56 (as the right child of 54), node 60 becomes unbalanced (balance factor 2).
The tree is left heavy and the right subtree of its left child is heavy, hence a Left-Right (LR) rotation with GP = 60, P = 50, C = 52:
C is stored as the root, P is the left child of C, GP is the right child of C, C's right tree becomes GP's left tree, and C's left tree becomes P's right tree.
Final tree: root 52; left child 50 with children 40 (children 30, 42) and 51; right child 60 with children 54 (right child 56) and 80 (children 70, 90).
Construct AVL: 4, 2, 10, 7, 11, 1, 3, 6, 9, 12, Insert(8)
Inserting 4, 2, 10, 7, 11, 1, 3, 6, 9, 12 one by one requires no rotation.
Resulting tree: root 4; left child 2 with children 1 and 3; right child 10 with children 7 (children 6 and 9) and 11 (right child 12).
Construct AVL: 4, 2, 10, 7, 11, 1, 3, 6, 9, 12, 8
Insert 8 (as the left child of 9): node 4 becomes unbalanced (balance factor -2).
Affected node: 4. The tree is right heavy and the left subtree of its right child is heavy, hence a Right-Left (RL) rotation.
Before rotation: Grand Parent (GP) = 4, Parent (P) = 10, Child (C) = 7.
After rotation: C (7) is stored as the root, GP (4) becomes the left child of C, P (10) becomes the right child of C, C's right tree becomes P's left tree, and C's left tree becomes GP's right tree.
After the Right-Left (RL) rotation, the final tree is: root 7; left child 4 with children 2 (children 1 and 3) and 6; right child 10 with children 9 (left child 8) and 11 (right child 12).
Construct AVL tree: 5,2,7,1,4,6,9,3,16 and insert the node 15
After inserting 15 (as the left child of 16), node 9 becomes unbalanced (balance factor -2).
Affected node: 9. The tree is right heavy and the left subtree of its right child is heavy, hence a Right-Left (RL) rotation with GP = 9, P = 16, C = 15.
Final tree: root 5; left child 2 with children 1 and 4 (left child 3); right child 7 with children 6 and 15 (children 9 and 16).
Deletion in AVL Tree
 Deletion in an AVL Tree is similar to deletion in a BST
 Case 1: Deleting a node which is a leaf node
 Case 2: Deleting a node with only one child
 Case 3: Deleting a node with two children
 In an AVL Tree, after deleting a node, the balance factor of each ancestor should be checked; if it goes out of range, rotate
Ex1: Consider AVL Tree, Delete node: Case 1
Tree: root 4; left child 2 with children 1 and 3; right child 5 with right child 6.
Delete node 1 (a leaf node). After deletion every balance factor stays within -1..1, so no rotation is needed.
Ex2: Consider AVL Tree, Delete node: Case 1
Tree: root 46; left child 20 with children 18 (left child 7) and 23; right child 54 with right child 60.
Delete node 60 (a leaf node). Node 46's balance factor becomes 2, so perform an RR (right) rotation.
Ex2: Consider AVL Tree, Delete node: Case 1 contd..
After the RR rotation: root 20; left child 18 with left child 7; right child 46 with children 23 and 54. All balance factors are back within range.
Ex2: Consider AVL Tree, Delete node: Case 2
Tree: root 4; left child 2 with right child 3; right child 5 with right child 6.
Delete node 2 (a node with only one child): connect the grandparent to the child.
Result: root 4; left child 3; right child 5 with right child 6 (all balance factors within range).
Ex3: Consider AVL Tree, Delete node: Case 2
Tree: root 20; left child 12 with left child 8; right child 24 with children 21 and 30 (children 26 and 40).
Delete node 12 (a node with only one child): connect the grandparent 20 to the child 8.
After deletion, node 20's balance factor becomes -2.
Ex3: Consider AVL Tree, Delete node: Case 2 contd..
Node affected: 20. The tree is right heavy, so perform an LL (left) rotation.
Result: root 24; left child 20 with children 8 and 21; right child 30 with children 26 and 40 (all balance factors 0).
Ex4: Case 2; Delete node 30
Tree: root 20; left child 10 with children 5 and 15 (left child 12); right child 30 with left child 25.
Delete node 30 (a node with only one child): 25 takes its place.
Node affected: 20 (balance factor 2). The tree is left heavy and the right subtree of its left child is heavy, so perform an LR rotation.
Ex4: Case 2; Delete node 30 contd…
LR rotation with GP = 20, P = 10, C = 15: C is stored as the root, P becomes the left child of C, GP becomes the right child of C; C's left tree (key 12) becomes P's right tree and C's right tree becomes GP's left tree.
Result: root 15; left child 10 with children 5 and 12; right child 20 with right child 25.
Ex5: Case 3; Delete node 3
Tree: root 5; left child 3 with children 1 and 4; right child 15 with children 6 (right child 9) and 16.
The node to be deleted is 3, which has two children: replace it with either the largest key in its left subtree or the smallest key in its right subtree, then delete that key from its original position.
AVL Tree Applications
 AVL trees are commonly used to implement dictionaries or associative
arrays, where data is stored in key-value pairs.
 They are used in applications that require fast searching, not only in database applications.
 In databases, AVL trees can be used for indexing purposes. They
provide efficient retrieval of records based on keys, ensuring that
operations like searching for a specific record or range queries are
performed with optimal time complexity.
AVL Tree Applications
 In compilers and interpreters, symbol tables are used to store
identifiers (variables, functions, etc.) along with their associated
information (data type, scope, etc.). AVL trees can efficiently
handle symbol tables, ensuring quick lookup and insertion of
identifiers during the compilation or interpretation process.
 AVL trees are also used extensively in database applications in which insertions and deletions are infrequent but lookups are frequent.
B-Tree: Introduction
 Binary Tree:
 Each node has at most two children. Terms: root, leaf, height.
 Operations: Insertion, deletion, searching, traversals
 Applications: File system indexing, syntax tree parsing, BSTs
 Binary Search Tree:
 Same structure as a binary tree, but keys in the left subtree are less than the root and keys in the right subtree are greater than the root.
 Operations: Insertion, deletion, searching, traversals
 Applications: Spell check, decision trees, data compression
B-Tree: Introduction
 B-tree is a self-balancing tree data structure that
maintains sorted data and allows searches,
sequential access, insertions, and deletions in
logarithmic time.
 B-tree is well suited for storage systems that read
and write relatively large blocks of data, such as
databases and file systems.
B-Tree: Introduction
 B-tree is a special type of self-balancing search tree in which
each node can contain more than one key and can have more
than two children.
 Also known as a height-balanced m-way tree.
 The height of a B-Tree is kept low by putting maximum possible
keys in a B-Tree node.
 B-trees are balanced search trees designed to work well on
disks or other direct access secondary storage devices.
B-Tree: Example 1
 Construct B-tree of order 3 by inserting the following keys:
1,2,3,4,5,6,7,8.
 Solution:
 Given m =3
 Minimum keys per node: Mnk = ⌈m/2⌉ - 1 = ⌈1.5⌉ - 1 = 2 - 1 = 1
 Minimum no. of children per node: Mnc = ⌈m/2⌉ = 2
 Maximum keys per node: Mxk = (m-1) = (3-1) = 2
 Maximum no. of children per node: Mxc = m = 3
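For reference, a minimal C node layout for this order-3 B-tree could look like the sketch below; the field names and the extra overflow slot are assumptions made here for illustration, not definitions from the slides.

#define ORDER 3                       /* m */
#define MAX_KEYS (ORDER - 1)          /* Mxk = 2 */
#define MAX_CHILDREN ORDER            /* Mxc = 3 */

struct BTreeNode
{
    int nkeys;                                   /* number of keys currently stored     */
    int keys[MAX_KEYS + 1];                      /* one spare slot to hold an overflow  */
    struct BTreeNode *child[MAX_CHILDREN + 1];   /* child pointers (NULL in a leaf)     */
    int leaf;                                    /* 1 if the node has no children       */
};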
B-Tree: Example 1 contd…
Construct B-tree of order 3 by inserting the following keys: 1,2,3,4,5,6,7,8. (Order = 3, Mnk = 1, Mnc = 2, Mxk = 2, Mxc = 3)
Insert 1: [1]. Insert 2: [1 2]. Insert 3: [1 2 3] overflows; split at the median key (position ⌈m/2⌉ = 2), so 2 moves up.
Result: root [2] with leaves [1] and [3].
B-Tree: Example 1 contd…
Insert 4: leaf [3 4]. Insert 5: leaf [3 4 5] overflows; split, and the median 4 moves up.
Result: root [2 4] with leaves [1], [3], [5].
Insert 6: leaf [5 6].
B-Tree: Example 1 contd…
Insert 7: leaf [5 6 7] overflows; split, and the median 6 moves up, making the root [2 4 6]. The root now overflows, so it is split at its median 4.
Result: root [4], internal nodes [2] and [6], leaves [1], [3], [5], [7].
Construct B-Tree with keys 1 to 11 of order 3
After inserting 1 to 10: root [4], internal nodes [2] and [6 8], leaves [1], [3], [5], [7], [9 10].
After inserting 11: root [4 8], internal nodes [2], [6], [10], leaves [1], [3], [5], [7], [9], [11].
B-Tree with order 3: Example 3
Keys: 5,3,2,9,7,8,6,10,12,1 (Order = 3, Mnk = 1, Mnc = 2, Mxk = 2, Mxc = 3)
Insert 5: [5]. Insert 3: [3 5]. Insert 2: [2 3 5] overflows; split at the median (⌈3/2⌉ = 2nd key), so 3 moves up.
Result: root [3] with leaves [2] and [5].
B-Tree with order 3: Example 3 contd…
Insert 9: leaf [5 9]. Insert 7: leaf [5 7 9] overflows; split, and the median 7 moves up.
Result: root [3 7] with leaves [2], [5], [9].
B-Tree with order 3: Example 3 contd…
Insert 8: leaf [8 9]. Insert 6: leaf [5 6].
Result: root [3 7] with leaves [2], [5 6], [8 9].
B-Tree with order 3: Example 3 contd…
Insert 10: leaf [8 9 10] overflows; split, and the median 9 moves up, making the node [3 7 9], which also overflows; split, and 7 moves up.
Result: root [7], internal nodes [3] and [9], leaves [2], [5 6], [8], [10].
B-Tree with order 3: Example 3 contd…
Insert 12: leaf [10 12]. Insert 1: leaf [1 2].
Final tree: root [7], internal nodes [3] and [9], leaves [1 2], [5 6], [8], [10 12].
B-Tree: Deletion
Two cases: Case 1: Deletion from a leaf node; Case 2: Deletion from an internal node
Case 1: Deletion from a leaf node
1. Search for the value to delete.
2. If the value is in a leaf node, simply delete it from the node.
3. If underflow happens, rebalance the tree.
Case 2: Deletion from an internal node
1. Choose a new separator (either the largest element in the left subtree or the smallest element in the right subtree), remove it from the leaf node it is in, and replace the element to be deleted with the new separator.
2. The previous step deleted an element (the new separator) from a leaf node. If that leaf node is now deficient (has fewer than the required number of keys), then rebalance the tree starting from the leaf node.
B-Tree: Deletion
Rebalancing starts from a leaf and proceeds toward the root until the tree is balanced.
If deleting an element from a node has brought it under minimum size, then some
elements must be redistributed to bring all nodes up to the minimum. Usually, the
redistribution involves moving an element from a sibling node that has more than
the minimum number of nodes. That redistribution operation is called a rotation.
If no sibling can spare an element, then the deficient node must be merged
with a sibling. The merge causes the parent to lose a separator element, so
the parent may become deficient and need rebalancing. The merging and
rebalancing may continue all the way to the root.
The algorithm to rebalance the tree is as follows:
•If the deficient node's right sibling exists and has more than Mnk elements, then rotate left
a) Copy separator from parent to end of deficient node
b) Replace separator in parent with first element of right sibling (right
sibling loses one node but still has at least the minimum number of
elements)
c) The tree is now balanced
• Otherwise, if the deficient node's left sibling exists and has more than
the minimum number of elements, then rotate right
a) Copy separator from parent to start of deficient node
b) Replace separator in parent with last element of left sibling (left sibling
loses one node but still has at least the minimum number of elements)
c) The tree is now balanced
• Otherwise, if both immediate siblings have only Mnk elements, then merge with a sibling, sandwiching the separator taken from their parent
a) Copy separator to end of left node (the left node may be the deficient
node or it may be the sibling with the minimum number of elements)
b) Move all elements from the right node to the left node (the left node
now has the maximum number of elements, and the right node –
empty)
c) Remove separator from parent along with its empty right child (the
parent loses an element)
 If the parent is the root and now has no elements, then free it and
make the merged node the new root (tree becomes shallower)
Otherwise, if the parent has fewer than the required number of
elements, then rebalance the parent.
B-Tree deletion example 1
 Construct B-Tree with the following keys:
1,2,3,4,5,6,7,8 and perform deletion operation
B-Tree deletion example 1
Construct B-Tree (Order 3) with the following keys: 1,2,3,4,5,6,7,8 — this gives root [4], internal nodes [2] and [6], leaves [1], [3], [5], [7 8].
Delete 7: key 7 is in a leaf node; deleting it leaves [8] and violates no property.
B-Tree deletion example 1
Consider the B-Tree (Order 3) built from the keys 1,2,3,4,5,6,7,8.
Delete 8: key 8 is in a leaf node; deleting it leaves [7] and violates no property.
B-Tree deletion example 1
Consider the B-Tree (Order 3) built from the keys 1,2,3,4,5,6,7,8.
Delete 3: key 3 is in a leaf node, but deleting it makes that leaf deficient and its sibling has no key to spare. When siblings do not have sufficient keys to borrow, merge with the separator pulled down from the parent.
Result: root [4 6] with leaves [1 2], [5], [7 8].
Consider the B-Tree (Order 3) built from the keys 1,2,3,4,5,6,7,8.
Delete 1: the leaf [1] becomes deficient; after merging with the separator 2 and rebalancing, the result is root [4 6] with leaves [2 3], [5], [7 8].
Consider the B-Tree (Order 3) built from the keys 1,2,3,4,5,6,7,8.
Delete 5: the leaf [5] becomes deficient, but its right sibling [7 8] has a key to spare, so rotate left: the separator 6 moves down into the deficient leaf and 7 moves up to the parent.
Result: root [4], internal nodes [2] and [7], leaves [1], [3], [6], [8].
Consider the B-Tree (Order 3) built from the keys 1,2,3,4,5,6,7,8.
Delete 6: key 6 is in an internal node; replace it with a new separator from a child (here 7, the smallest key in its right subtree) and delete that key from its leaf.
Result: root [4], internal nodes [2] and [7], leaves [1], [3], [5], [8].
Consider the B-Tree (Order 3) built from the keys 1,2,3,4,5,6,7,8.
Delete 2: key 2 is in an internal node whose children [1] and [3] each hold only the minimum number of keys, so the children are merged into [1 3] and the tree is rebalanced.
Result: root [4 6] with leaves [1 3], [5], [7 8].
Consider the B-Tree (Order 3) built from the keys 1,2,3,4,5,6,7,8.
Delete 4: the key to be deleted is in the root; replace it with the largest key in its left subtree (3), delete that key from its leaf, and rebalance as in the earlier cases.
B-Tree deletion example 1
Consider the tree root [4], internal nodes [2] and [6], leaves [1], [3], [5], [8] (after 7 has been deleted).
Delete 6: the node to be deleted has only one left child [5] and one right child [8], so the two children are merged into [5 8].
Delete 5 (from the full tree with keys 1 to 8): handled earlier by borrowing from the sibling [7 8].
Priority Queue
 A priority queue is an abstract data type that behaves similarly to
the normal queue except that each element has some priority.
 The priority of the elements in a priority queue will determine the
order in which elements are removed from the priority queue.
 An element with higher priority is deleted before an element with lower priority.
 If two elements in a priority queue have the same priority, they will
be arranged using the FIFO principle.
Priority Queue
A Priority Queue can be implemented using –
Arrays
Linked List
Heaps
Array implementation of Priority Queue
 Consider the elements with priority:
Element:  5  10  3  20  15
Priority: 2   1  3   5   4
Array (in decreasing order of priority): 20, 15, 3, 5, 10
Array implementation of Priority Queue
 Insert new element 30 with priority 6:
Element:  5  10  3  20  15  30
Priority: 2   1  3   5   4   6
Array (in decreasing order of priority): 30, 20, 15, 3, 5, 10
Deletion: the element with the highest priority is deleted first.
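A small C sketch of the array implementation just described, keeping the array sorted in decreasing order of priority; the names and the fixed capacity are illustrative assumptions, not code from the slides.

#include <stdio.h>

#define CAPACITY 100

struct PQItem { int element; int priority; };

struct PQItem pq[CAPACITY];
int pq_size = 0;

/* Insert while keeping the array sorted in decreasing order of priority */
void pq_insert(int element, int priority)
{
    int i = pq_size - 1;
    while (i >= 0 && pq[i].priority < priority) {
        pq[i + 1] = pq[i];          /* shift lower-priority items to the right */
        i--;
    }
    pq[i + 1].element = element;
    pq[i + 1].priority = priority;
    pq_size++;
}

/* Delete and return the element with the highest priority (front of the array) */
int pq_delete(void)
{
    int e = pq[0].element;
    for (int i = 1; i < pq_size; i++)
        pq[i - 1] = pq[i];
    pq_size--;
    return e;
}

int main(void)
{
    int el[] = {5, 10, 3, 20, 15}, pr[] = {2, 1, 3, 5, 4};
    for (int i = 0; i < 5; i++) pq_insert(el[i], pr[i]);
    pq_insert(30, 6);                 /* the new element from the slide */
    printf("%d\n", pq_delete());      /* prints 30 (highest priority)   */
    return 0;
}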
Linked List implementation of Priority Queue
Element:  800  300  100  200  600
Priority:   5    1    3    2    4
List (ordered by priority): (300, 1) → (200, 2) → (100, 3) → (600, 4) → (800, 5)
Heap Trees
 A heap is a tree-like data structure in which the tree is a complete binary tree (each node can have at most two children) that satisfies the heap property.
 Heap property: Value of Parent is either greater than or equal to
(max heap) or less than or equal to (min heap) value of Child. Also
called a binary heap.
 Heap is essentially a complete binary tree and hence can be
efficiently represented using array-based representation.
 A heap is described in memory using linear arrays in a sequential
manner.
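A minimal sketch of the array representation of a max heap, using 0-based indexing where the children of index i sit at 2i+1 and 2i+2 and the parent at (i-1)/2; these conventions and names are illustrative assumptions, not code from the slides.

#include <stdio.h>

#define HEAP_CAPACITY 100

int heap[HEAP_CAPACITY];
int heap_size = 0;

/* Insert a key and sift it up until the max-heap property (parent >= child) holds */
void max_heap_insert(int key)
{
    int i = heap_size++;
    heap[i] = key;
    while (i > 0 && heap[(i - 1) / 2] < heap[i]) {
        int tmp = heap[i];
        heap[i] = heap[(i - 1) / 2];
        heap[(i - 1) / 2] = tmp;
        i = (i - 1) / 2;               /* move up to the parent */
    }
}

int main(void)
{
    int keys[] = {10, 40, 25, 5, 60};
    for (int i = 0; i < 5; i++) max_heap_insert(keys[i]);
    printf("%d\n", heap[0]);           /* the root holds the maximum: 60 */
    return 0;
}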
Unit 2
• Graphs: Terminology, Representations, Basic Search and Traversals - BFS, DFS, Biconnected Components & DFS.
• String Searching Algorithms: Brute-Force algorithm, Rabin-Karp algorithm, Boyer-Moore algorithm
Graph: Terminology
• graph: A data structure containing:
 a set of vertices V (sometimes called nodes)
 a set of edges E, where an edge represents a connection between 2 vertices.
• Graph G = (V, E)
• an edge is a pair (v, w) where v, w are in V
 V = {a, b, c, d}
 E = {(a, c), (b, c), (b, d), (c, d)}
Graph: Terminology
(Figures: b) Directed Graph, c) Weighted Directed Graph)
Graph: Terminology
• reachable: Vertex a is reachable from b if a path exists from b to a.
• connected: A graph is connected if every vertex is reachable from any other.
 Is the graph at top right connected?
• strongly connected: A directed graph is strongly connected when every vertex is reachable from every other vertex along directed paths.
Graph: Terminology
• cycle: A path that begins and ends at the same node.
 example: {b, g, f, c, a} or {V, X, Y, W, U, V}.
 example: {c, d, a} or {U, W, V, U}.
 acyclic graph: One that does not contain any cycles.
• A directed graph without cycles is called a directed acyclic graph or DAG.
• loop: An edge directly from a node to itself.
 Many graphs don't allow loops.
Graph: Terminology
• There are two commonly used methods for representing graphs: the Adjacency Matrix and the Adjacency List.
(Figure: a directed graph shown with its adjacency matrix and adjacency list.)
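A small C sketch of the adjacency-matrix representation for the undirected example graph V = {a, b, c, d}, E = {(a,c), (b,c), (b,d), (c,d)}, mapping the vertices to indices 0-3; the mapping and array sizes are assumptions for illustration (an adjacency list would instead keep, for each vertex, a linked list of its neighbours).

#include <stdio.h>

#define V 4    /* vertices: a=0, b=1, c=2, d=3 */

int adj_matrix[V][V];                 /* adj_matrix[u][v] = 1 if the edge (u,v) exists */

void add_edge(int u, int v)
{
    adj_matrix[u][v] = 1;
    adj_matrix[v][u] = 1;             /* undirected graph: store both directions */
}

int main(void)
{
    int edges[][2] = { {0, 2}, {1, 2}, {1, 3}, {2, 3} };   /* (a,c),(b,c),(b,d),(c,d) */
    for (int i = 0; i < 4; i++)
        add_edge(edges[i][0], edges[i][1]);

    for (int u = 0; u < V; u++) {     /* print the matrix row by row */
        for (int v = 0; v < V; v++)
            printf("%d ", adj_matrix[u][v]);
        printf("\n");
    }
    return 0;
}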
DFS Algorithm
DFS Example: vertices are visited in the order 1 → 2 → 4 → 8 → 5 → 6 → 3 → 7

Algorithm DFS(v)
{
    visited[v] := 1;
    for each vertex w adjacent from v do
    {
        if (visited[w] = 0) then DFS(w);
    }
}
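The pseudocode above translates almost directly into C; this sketch assumes an adjacency-matrix representation with illustrative names (adj, visited, nvertices) that are not from the slides.

#define MAXV 10

int adj[MAXV][MAXV];     /* adjacency matrix of the graph                  */
int visited[MAXV];       /* 0 = unvisited, 1 = visited                     */
int nvertices;           /* number of vertices, numbered 0 .. nvertices-1  */

/* Recursive depth first search starting from vertex v */
void dfs(int v)
{
    visited[v] = 1;
    for (int w = 0; w < nvertices; w++)     /* each vertex adjacent from v */
        if (adj[v][w] && !visited[w])
            dfs(w);
}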
BFS: Breadth First Search
• In breadth first search, we start at a vertex v and mark it as visited.
• The vertex v is at this time said to be unexplored. A vertex is said to
have been explored by an algorithm when the algorithm has visited
all vertices adjacent from it.
• All unvisited vertices adjacent from v are visited next. These are new
unexplored vertices. Vertex v is now said to be explored.
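A C sketch of BFS using a simple array-based queue, reusing the adj/visited/nvertices globals assumed in the DFS sketch above; the queue implementation is an illustrative assumption.

/* Breadth first search starting from vertex s */
void bfs(int s)
{
    int queue[MAXV], front = 0, rear = 0;

    visited[s] = 1;
    queue[rear++] = s;                    /* enqueue the start vertex */

    while (front < rear) {
        int v = queue[front++];           /* dequeue: v is now explored */
        for (int w = 0; w < nvertices; w++) {
            if (adj[v][w] && !visited[w]) {
                visited[w] = 1;           /* mark and enqueue each new unexplored neighbour */
                queue[rear++] = w;
            }
        }
    }
}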
Biconnected components and DFS
• Biconnected graph:
 A biconnected graph is a connected graph that has no articulation points.
• Articulation point:
 A vertex v of graph G is an articulation point iff the deletion of v, together with the deletion of all edges incident to v, leaves behind a graph that has at least two connected components.
• Biconnected component:
 A biconnected component of a connected graph G is a maximal biconnected subgraph H of G.
Articulation point
• A vertex v of graph G is an articulation point iff the deletion of v, together with the deletion of all edges incident to v, leaves behind a graph that has at least two connected components.
• Example: in the 5-vertex graph of the figure (vertices 0-4), the articulation points are 0 and 3.
Biconnected Components and DFS
(Figure: a 10-vertex graph with its depth-first spanning tree; the vertices are numbered by their depth-first numbers 1-10, and tree edges and back edges are distinguished.)
L(u) is the lowest depth first number that can be reached from u using a path of descendants followed by at most one back edge.
If u is not the root, then u is an articulation point iff u has a child w such that L[w] ≥ dfn[u].
L(6) = 10
L(10) = 4
L(9) = 5
L(5) = min(D(5), L(6), D(8), D(2)) = min(9, 10, 7, 6) = 6
L(7) = min(D(7), L(5), D(2)) = min(8, 6, 6) = 6
L(8) = min(D(8), L(7), D(5)) = min(7, 6, 9) = 6
L(2) = min(D(2), L(8), D(1), D(7), D(5)) = min(6, 6, 1, 8, 9) = 1
L(4) = min(D(4), L(3)) = min(2, 1) = 1
L(1) = min(D(1), L(2), D(2)) = min(1, 1, 6) = 1

u:    1  2  3  4  5  6   7  8  9  10
L(u): 1  1  1  1  6  10  6  6  5  4
Checking each non-root vertex u for a child w with L[w] ≥ dfn[u]:
u = 4:  L[3] ≥ dfn[4]?  1 ≥ 2  False
u = 3:  L[10] ≥ dfn[3]?  4 ≥ 3  True → (3,10);  L[9] ≥ dfn[3]?  5 ≥ 3  True → (3,9);  L[2] ≥ dfn[3]?  1 ≥ 3  False
u = 10: no child
u = 9:  no child
u = 2:  L[8] ≥ dfn[2]?  6 ≥ 6  True → (2,8)
u = 8:  L[7] ≥ dfn[8]?  6 ≥ 7  False
u = 5:  L[6] ≥ dfn[5]?  10 ≥ 9  True → (5,6)
u = 7:  L[5] ≥ dfn[7]?  6 ≥ 8  False
Articulation Points are: 2, 3, 5
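A compact C sketch of the dfn/L computation described above; the adjacency-matrix representation, the array names, and 0-based vertex numbering are assumptions for illustration. The separate test for the root (it is an articulation point only if it has more than one DFS child) is omitted here.

#define MAXV 20

int adj[MAXV][MAXV];        /* adjacency matrix of the undirected graph     */
int nvertices;
int dfn[MAXV], L[MAXV];     /* depth first number and lowest reachable dfn  */
int num = 1;                /* next depth first number to assign            */
int is_articulation[MAXV];

int minimum(int a, int b) { return (a < b) ? a : b; }

/* DFS from u; v is u's parent in the DFS tree (-1 for the root) */
void art(int u, int v)
{
    dfn[u] = L[u] = num++;
    for (int w = 0; w < nvertices; w++) {
        if (!adj[u][w]) continue;
        if (dfn[w] == 0) {                     /* (u, w) is a tree edge       */
            art(w, u);
            L[u] = minimum(L[u], L[w]);
            if (v != -1 && L[w] >= dfn[u])     /* non-root articulation test  */
                is_articulation[u] = 1;
        } else if (w != v) {                   /* (u, w) is a back edge       */
            L[u] = minimum(L[u], dfn[w]);
        }
    }
}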
String Searching
• A string is a sequence of characters. Ex: "Java Program", "Abracadabra", "East West".
• Let P be a string of size m:
 A substring P[i .. j] of P is the subsequence of P consisting of the characters with ranks between i and j.
• Given strings T (text) and P (pattern), the pattern matching problem consists of finding a substring of T equal to P.
String Searching: Brute Force
• The brute-force pattern matching algorithm compares the pattern P
with the text T for each possible shift of P relative to T, until either
– a match is found, or
– all placements of the pattern have been tried
• Brute-force pattern matching runs in time O(nm)
• worst-case running time is Θ((n-m+1)m)
String Matching: Brute Force
• Example 1:
•Text (T) = abccba baccab
•Pattern (P) = cab
• Example 2:
•Text (T) = eceaecebecec cseacsebcseccsed
•Pattern (P) = cseb
#include <stdio.h>

void string_search_brute(char *T, char *P, int n, int m)
{
    int i, j, found = 0;
    for (i = 0; i < n - m + 1; i++)
    {
        for (j = 0; j < m; j++)
        {
            if (T[i + j] != P[j]) break;
        }
        if (j == m)
        {   /* reached the end of the pattern: P occurs in T at shift i */
            found = 1;
            break;
        }
    }
    if (found) printf("Found pattern at index %d\n", i);
    else       printf("Pattern not found in Text\n");
}
String Searching: Rabin-Karp
• Given |Σ| = d, the Rabin-Karp algorithm treats each string in Σ* as if it were a number in radix-d notation.
• Ex: if Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} then d = 10, and the string "274" has value 2·d² + 7·d¹ + 4·d⁰ = 2·100 + 7·10 + 4·1.
• Similarly, if Σ = {a, b} then d = 2, and we map the characters of {a, b} to the values {0, 1}. Then "bab" = 1·d² + 0·d¹ + 1·d⁰ = 1·4 + 0·2 + 1·1.
• Hexadecimal notation uses Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F} and d = 16.
String Searching: Rabin-Karp
• Example 1:
 Text (T) = 123443215678
 Pattern (P) = 21
• Solution:
 T.len = 12; P.len = 2; choose the prime Q = 11
 Compute the hash of P: H(P) = P mod Q = 21 mod 11 = 10
 Compute the hash of each text window:
  12 mod 11 = 1, not a match (H(P) ≠ H(T))
  23 mod 11 = 1, not a match
  34 mod 11 = 1, not a match
  43 mod 11 = 10, hash match but "43" ≠ "21" → spurious hit
  …
  21 mod 11 = 10, hash match and the window equals the pattern → pattern found
String Searching: Rabin-Karp
• Example 2:
 Text (T) = ABCCDDAEFG
 Pattern (P) = CDD
• Solution (map the letters to digits: A=1, B=2, C=3, D=4, E=5, F=6, G=7, H=8, I=9, J=10):
 T.len = 10; P.len = 3; choose the prime Q = 13
 Hash of P: H(P) = (3·10² + 4·10¹ + 4·10⁰) mod 13 = 344 mod 13 = 6
 Hash of the first text window "ABC": (1·10² + 2·10¹ + 3·10⁰) mod 13 = 123 mod 13 = 6, a hash match but "ABC" ≠ "CDD" → spurious hit
 Next window "BCC": (2·10² + 3·10¹ + 3·10⁰) mod 13 = 233 mod 13 = 12, not a match
 …
Algorithm RabinKarp(char T[], char P[], int n, int m)
{   // The inputs are the pattern P (length m) and the text T (length n)
    hP = hash(P[0..m-1])        // hash of the pattern
    hT = hash(T[0..m-1])        // hash of the first m characters of the text
    for S = 0 to n - m do
    {
        if (hP == hT)           // the two hash values are compared
        {
            if (P[0..m-1] == T[S..S+m-1])   // character comparison rules out a spurious hit
            {
                Print "Pattern Found with shift S"
                return
            }
        }
        if (S < n - m)
            hT = hash(T[S+1..S+m])          // rolling hash of the next window
    }
}
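A runnable C sketch of the same idea with an explicit rolling hash, using radix d = 256 (byte alphabet) and the prime q = 101; these constants and names are illustrative assumptions, not values from the slides.

#include <stdio.h>
#include <string.h>

void rabin_karp(const char *T, const char *P, int q)
{
    int n = strlen(T), m = strlen(P), d = 256;
    int h = 1, hP = 0, hT = 0;

    for (int i = 0; i < m - 1; i++) h = (h * d) % q;   /* h = d^(m-1) mod q */
    for (int i = 0; i < m; i++) {                      /* hashes of P and of the first window of T */
        hP = (d * hP + P[i]) % q;
        hT = (d * hT + T[i]) % q;
    }
    for (int s = 0; s <= n - m; s++) {
        if (hP == hT && strncmp(T + s, P, m) == 0)     /* character check rules out spurious hits */
            printf("Pattern found with shift %d\n", s);
        if (s < n - m) {                               /* roll the hash to the next window */
            hT = (d * (hT - T[s] * h) + T[s + m]) % q;
            if (hT < 0) hT += q;
        }
    }
}

int main(void)
{
    rabin_karp("123443215678", "21", 101);             /* Example 1 from the slides */
    return 0;
}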
Boyer-Moore
• The Boyer-Moore pattern matching technique is based on two phases.
1. The looking-glass phase
 find P in T by moving backwards through P, starting at its end
2. The character-jump phase
 when a mismatch occurs at T[i] == x
 the character in the pattern, P[j], is not the same as T[i]
• There are 3 possible cases, tried in order.
Case 1
• If P contains x somewhere (to the left of j), then try to shift P right to align the last occurrence of x in P with T[i], and move i and j right so that j is again at the end of P.
Case 2
• If P contains x somewhere, but a shift right to the last occurrence is not possible (x occurs only after position j), then shift P right by 1 character to T[i+1], and move i and j right so that j is again at the end of P.
Case 3
• If cases 1 and 2 do not apply (there is no x in P), then shift P to align P[0] with T[i+1], and move i and j right so that j is again at the end of P.
Boyer Moore Example 1 (worked figure)
Boyer Moore Example 2 (worked figure)
Boyer Moore: Worst case
T: "aaaaa…a"
P: "baaaaa"
Boyer Moore: Last Occurrence Function
• Boyer-Moore's algorithm preprocesses the pattern P and the alphabet A to build a last occurrence function L()
 L() maps all the letters in A to integers
• L(x) is defined as: // x is a letter in A
 the largest index i such that P[i] == x, or
 -1 if no such index exists
Last Occurrence L(): Example
• A = {a, b, c, d}
• P: "abacab" (indices 0 1 2 3 4 5)

x      a   b   c   d
L(x)   4   5   3  -1

L() stores indexes into P[]
Algorithm Boyer_Moore(char T[], char P[], int n, int m)
{
    call BuildLast(P, m, L);            // build the last occurrence table L[]
    int i = m - 1, j = m - 1;
    do {
        if (P[j] == T[i])
        {
            if (j == 0) return i;       // match found at index i
            else {                      // looking-glass technique
                i--; j--;
            }
        }
        else {                          // character-jump technique
            int lo = L[T[i]];           // last occurrence of T[i] in P
            i = i + m - min(j, 1 + lo);
            j = m - 1;
        }
    } while (i <= n - 1);
    return -1;                          // no match
}
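The BuildLast step called above can be sketched in C as follows; the 256-entry table indexed by character code is an assumption made here for illustration.

/* Fill L[c] with the last index at which character c occurs in P, or -1 if it does not occur */
void BuildLast(const char P[], int m, int L[256])
{
    for (int c = 0; c < 256; c++)
        L[c] = -1;
    for (int j = 0; j < m; j++)
        L[(unsigned char)P[j]] = j;     /* later occurrences overwrite earlier ones */
}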

More Related Content

PPTX
AVL Tree Data Structure
PDF
Red black tree
PDF
Red black tree
PPTX
Heap sort
PPT
Unit 3-Greedy Method
PPT
Prim's Algorithm on minimum spanning tree
AVL Tree Data Structure
Red black tree
Red black tree
Heap sort
Unit 3-Greedy Method
Prim's Algorithm on minimum spanning tree

What's hot (20)

PDF
Database management systems Lecture Notes
PDF
DBMS Notes selection projection aggregate
PDF
DBMS Nested & Sub Queries Set operations
PPT
AVL Tree.ppt
PDF
Binomial queues
PPT
Red black tree
PPTX
AVL Tree in Data Structure
PPTX
Avl trees
PPTX
Divide and conquer 1
PPTX
PPTX
Tree in data structure
PPTX
Threaded binary tree
PPTX
Apriori algorithm
PPT
Asymptotic Notation and Complexity
PPTX
Decision trees
PPT
PPTX
Decision Tree - C4.5&CART
PPTX
Recursion in Data Structure
PPTX
Quick sort
PPT
Sum of subsets problem by backtracking 
Database management systems Lecture Notes
DBMS Notes selection projection aggregate
DBMS Nested & Sub Queries Set operations
AVL Tree.ppt
Binomial queues
Red black tree
AVL Tree in Data Structure
Avl trees
Divide and conquer 1
Tree in data structure
Threaded binary tree
Apriori algorithm
Asymptotic Notation and Complexity
Decision trees
Decision Tree - C4.5&CART
Recursion in Data Structure
Quick sort
Sum of subsets problem by backtracking 
Ad

Similar to Advanced Data Structures & Algorithm Analysi (20)

PPT
lec41.ppt
PPTX
Data structures trees and graphs - AVL tree.pptx
PPTX
Adelson velskii Landis rotations based on
PDF
AVL Tree lecture slide with all rotations
PPTX
Fourth Semester BE CSE BCS401 ADA Module 3 PPT.pptx
PDF
CS-102 AVLSv2.pdf
PDF
CS-102 AVLS.pdf
PPTX
Lecture3
PPTX
Avl trees
PPT
4.10.AVLTrees[1].ppt
PPT
Presentation1 data structure for CSE.ppt
PPT
PPT 2.5 AVL Trees is a ppt from trees topic dsa.ppt
PPTX
PPTX
Avl tress in data structures and algorithms
PPT
1.7 avl tree
PDF
DS_Mod4_2.pdf
PDF
9 chapter4 trees_avl
PPT
Data Structure and Algorithms AVL Trees
PPTX
Avl Tree Implementation
PPT
AVL Trees
lec41.ppt
Data structures trees and graphs - AVL tree.pptx
Adelson velskii Landis rotations based on
AVL Tree lecture slide with all rotations
Fourth Semester BE CSE BCS401 ADA Module 3 PPT.pptx
CS-102 AVLSv2.pdf
CS-102 AVLS.pdf
Lecture3
Avl trees
4.10.AVLTrees[1].ppt
Presentation1 data structure for CSE.ppt
PPT 2.5 AVL Trees is a ppt from trees topic dsa.ppt
Avl tress in data structures and algorithms
1.7 avl tree
DS_Mod4_2.pdf
9 chapter4 trees_avl
Data Structure and Algorithms AVL Trees
Avl Tree Implementation
AVL Trees
Ad

More from Sreedhar Chowdam (20)

PDF
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
PPTX
Design and Analysis of Algorithms Lecture Notes
PDF
Design and Analysis of Algorithms (Knapsack Problem)
PDF
DCCN Network Layer congestion control TCP
PDF
Data Communication and Computer Networks
PDF
DCCN Unit 1.pdf
PDF
Data Communication & Computer Networks
PDF
PPS Notes Unit 5.pdf
PDF
PPS Arrays Matrix operations
PDF
Programming for Problem Solving
PDF
Big Data Analytics Part2
PDF
Python Programming: Lists, Modules, Exceptions
PDF
Python Programming by Dr. C. Sreedhar.pdf
PDF
Python Programming Strings
PDF
Python Programming
PDF
Python Programming
PDF
C Recursion, Pointers, Dynamic memory management
PDF
C Programming Storage classes, Recursion
PDF
Programming For Problem Solving Lecture Notes
PDF
Big Data Analytics
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms Lecture Notes
Design and Analysis of Algorithms (Knapsack Problem)
DCCN Network Layer congestion control TCP
Data Communication and Computer Networks
DCCN Unit 1.pdf
Data Communication & Computer Networks
PPS Notes Unit 5.pdf
PPS Arrays Matrix operations
Programming for Problem Solving
Big Data Analytics Part2
Python Programming: Lists, Modules, Exceptions
Python Programming by Dr. C. Sreedhar.pdf
Python Programming Strings
Python Programming
Python Programming
C Recursion, Pointers, Dynamic memory management
C Programming Storage classes, Recursion
Programming For Problem Solving Lecture Notes
Big Data Analytics

Recently uploaded (20)

PDF
O5-L3 Freight Transport Ops (International) V1.pdf
PPTX
COMPUTERS AS DATA ANALYSIS IN PRECLINICAL DEVELOPMENT.pptx
PDF
Physiotherapy_for_Respiratory_and_Cardiac_Problems WEBBER.pdf
PDF
BÀI TẬP TEST BỔ TRỢ THEO TỪNG CHỦ ĐỀ CỦA TỪNG UNIT KÈM BÀI TẬP NGHE - TIẾNG A...
PDF
Insiders guide to clinical Medicine.pdf
PDF
Business Ethics Teaching Materials for college
PDF
102 student loan defaulters named and shamed – Is someone you know on the list?
PPTX
Pharmacology of Heart Failure /Pharmacotherapy of CHF
PDF
Pre independence Education in Inndia.pdf
PPTX
Pharma ospi slides which help in ospi learning
PDF
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
PPTX
Cell Structure & Organelles in detailed.
PPTX
Open Quiz Monsoon Mind Game Final Set.pptx
PDF
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
PDF
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
PDF
Abdominal Access Techniques with Prof. Dr. R K Mishra
PDF
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
PDF
Anesthesia in Laparoscopic Surgery in India
PPTX
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
PPTX
BOWEL ELIMINATION FACTORS AFFECTING AND TYPES
O5-L3 Freight Transport Ops (International) V1.pdf
COMPUTERS AS DATA ANALYSIS IN PRECLINICAL DEVELOPMENT.pptx
Physiotherapy_for_Respiratory_and_Cardiac_Problems WEBBER.pdf
BÀI TẬP TEST BỔ TRỢ THEO TỪNG CHỦ ĐỀ CỦA TỪNG UNIT KÈM BÀI TẬP NGHE - TIẾNG A...
Insiders guide to clinical Medicine.pdf
Business Ethics Teaching Materials for college
102 student loan defaulters named and shamed – Is someone you know on the list?
Pharmacology of Heart Failure /Pharmacotherapy of CHF
Pre independence Education in Inndia.pdf
Pharma ospi slides which help in ospi learning
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
Cell Structure & Organelles in detailed.
Open Quiz Monsoon Mind Game Final Set.pptx
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
Abdominal Access Techniques with Prof. Dr. R K Mishra
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
Anesthesia in Laparoscopic Surgery in India
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
BOWEL ELIMINATION FACTORS AFFECTING AND TYPES

Advanced Data Structures & Algorithm Analysi

  • 1. Course Instructor: Dr. C. Sreedhar ADVANCED DATA STRUCTURES & ALGORITHMS ANALYSIS B.Tech III Sem Scheme 2023 Source: Some of the images, content are copied from Internet source
  • 2. Asymptotic Notation  Big-O notation:  provides upper bound of a function.  represents worst-case scenario  Omega notation:  provides lower bound of a function.  represents best-case scenario  Theta notation:  Provides both an upper and lower bound  represents the average-case scenario
  • 4. O(n2 ) O(2n ) O(n!) O(1) < O(log(n)) < O(n) < O(n log(n)) < O(n2 ) < O(2n ) < O(n!) Source: Copied from internet
  • 5. Big O notation f(n) = O(g(n)), iff  positive constants c and n0, such that 0  f(n)  cg(n), n  n0 Source: Copied from internet
  • 6. Big O notation: Example Consider f(n) = 3n+2 Can this function be represented as O(g(n)) ? f(n) = O(g(n)), iff  positive constants c and n0, such that 0  f(n)  cg(n), n  n0 f(n)  c*g(n) 3n+2  c*g(n) 3n+2  4*n for all n>=2 f(n) = O(g(n)) ie., 3n+2 = O(n), c=4 and n0 =2
  • 7. Big O notation: Example 2  f(n) = 3n2 + 2n + 4.  To find upper bound of f(n), we have to find c and n0 such that 0 ≤ f (n) ≤ c × g (n) for all n ≥ n0  C=9 and n0 = 1  0 ≤ 3n2 +2n + 4 ≤ 9n2  f(n) = O (g(n)) = O (n2 ) for c = 9, n0 = 1
  • 8. Big O notation (O)  Big O notation is helpful in finding the worst-case time complexity of a particular program.  Examples of Big O time complexity:  Linear search: O(N), where N is the number of elements in the given array  Binary search: O(Log N), where N is the number of elements in the given array  Bubble sort: O(n2 ), where N is the number of elements in the given array
  • 9. Omega notation f (n) = Ω(g(n)), iff  positive constants c and n0, such that 0 c g(n) f ≤ ≤ (n), n  n0
  • 10. Omega notation: Example f (n) =3n+2 Can this function be represented as Ω (g(n)) ? f (n) = Ω(g(n)), iff  positive constants c and n0, such that 0 ≤ c g(n) ≤ f (n), n  n0 c*g(n)  f(n); c= 3 and n0 = 1 3n  3n+2 f(n) = Ω (g(n)) = Ω (n) for c = 3, n0 = 1
  • 11. Omega notation: Example f (n) =8n2 +2n-3 Can this function be represented as Ω (g(n)) ? f (n) = Ω(g(n)), iff  positive constants c and n0, such that 0 ≤ c g(n) ≤ f (n), n  n0 c*g(n)  f(n); c= 7 and n0 = 1 7*n2  8n2 +2n-3 f(n) = Ω (g(n)) = Ω (n2 ) for c = 7, n0 = 1
  • 12. Theta notation f (n) = Θ(g(n)), iff  positive constants c1, c2 and n0 ,such that 0 ≤ c1 *g(n) ≤ f(n) ≤ c2*g(n)  n ≥ n0 Source: Copied from internet
  • 13. Theta notation: Example f (n) = 4n+3 Can this function be represented as Θ (g(n)) ? f (n) = Θ(g(n)), iff  positive constants c1, c2 and n0 ,such that 0 ≤ c1 *g(n) ≤ f(n) ≤ c2*g(n)  n ≥ n0 Consider c1=4, c2=5 and n0 = 3 0 ≤ 4 *3 ≤ 4(3)+3 ≤ 5*3 f(n) = Θ (g(n)) = Θ (n) for c1 = 4, c2 = 5, n0 = 1
  • 14. Theta notation: Example  Consider f(n) = 3n+2. Represent f(n) in terms of Θ(n) and Find c1,c2 and n0  Solution  C1= 3  C2=4  n0 = 2
  • 15. O(1): Constant Time O(n): Linear Time Source: Copied from internet
  • 16. O(log(n)) : Logarithmic Time For example, when n is 8, the while loop will iterate for log_2(8) = 3 times Source: Copied from internet
  • 20. AVL Tree  AVL tree: a binary search tree that uses modified add and remove operations to stay balanced as its elements change.  AVL trees are self-balancing binary search trees.  Invented in 1962 by Adelson, Velskii and Landis (AVL)  Properties of AVL:  Sub-trees of every node differ in height by at most one level.  Every sub-tree is an AVL tree  In AVL trees, balancing factor of each node is either 0 or 1 or -1
  • 21. Self balancing trees  make sure that a tree remains balanced as we insert new nodes.  Examples:  AVL trees  Red-black trees  Splay trees  B-trees
  • 22. AVL Tree  basic idea: When nodes are added to / removed from the tree, if the tree becomes unbalanced, repair the tree until balance is restored.  Balance factor (b) of every node is -1 b  1
  • 23. AVL Tree  At any node, there are 3 possibilities
  • 24. Balance factor  Bal_Factor(T) = Height(T.left) - Height(T.right)  In AVL tree, no node's two child subtrees differ in height by more than 1.  Balance factor value are: -1, 0 or 1.  If balance factor of any node is 1, it means that the left sub-tree is one level higher than the right sub-tree.  If balance factor of any node is 0, it means that the left sub-tree and right sub-tree contain equal height.  If balance factor of any node is -1, it means that the left sub-tree is one level lower than the right sub-tree
  • 25. Defining node, Height and balance factor 2 1 3 struct Node { int key; struct Node *left; struct Node *right; int height; }; int getHeight(struct Node *n) { if(n==NULL) return 0; return n->height; } int getBalanceFactor(struct Node * n) { if(n==NULL) return 0; return getHeight(n->left) - getHeight(n->right); }
  • 26. AVL Tree: Balance Factor Example1 Balanced Factor(X)=Height(left Subtree (X)) – Height(right Subtree(X)) Bal_Factor(8)= H(LS(8)) - H(RS(8)) = 0 – 0 = 0 0 Bal_Factor(12)= H(LS(12)) - H(RS(12)) = 0 – 0 = 0 0 0 0 Bal_Factor(10)= H(LS(10)) - H(RS(10)) = 1 – 1 = 0 0 0 Bal_Factor(15)= H(LS(15)) - H(RS(15)) = 2 – 2 = 0 0
  • 27. AVL Tree: Balance Factor Example2 Balanced Factor(X)=Height(left Subtree (X)) – Height(right Subtree(X)) 30 10 8 21 56 64 12 Bal_Factor(12)= H(LS(12)) - H(RS(12)) = 0 – 0 = 0 0 Bal_Factor(8)= H(LS(8)) - H(RS(8)) = 0 – 0 = 0 0 0 Bal_Factor(21)= H(LS(21)) - H(RS(21)) = 1 – 0 = 1 1 Bal_Factor(10)= H(LS(10)) - H(RS(10)) = 1 – 2 = -1 -1 Bal_Factor(56)= H(LS(56)) - H(RS(56)) = 0 – 1 = -1 -1 Bal_Factor(30)= H(LS(30)) - H(RS(30)) = 3 – 2 = 1 1
  • 28. AVL Tree Rotations Single Rotations Double Rotations Left Rotations (LL Rotation) Right Rotations (RR Rotation) Left-Right Rotations (LR Rotation) Right-Left Rotations (RL Rotation) Unbalanced Tree  Balanced Tree
  • 29. AVL Tree Insertion: Left unbalanced (RR) 3 2 1 0 1 2 Left Unbalanced ie., Left Heavy Right Rotation 2 0 1 0 3 0 Unbalanced Tree Balanced Tree
  • 30. AVL Tree Insertion: Right unbalanced (LL) 1 2 3 0 1 2 Right Unbalanced ie., Right is heavy Left Rotation Unbalanced Tree Balanced Tree 2 0 1 0 3 0
  • 31. AVL Tree: Double Rotation IF tree is right heavy { IF tree's right subtree is left heavy { Perform Double Left rotation } ELSE { Perform Single Left rotation } } ELSE IF tree is left heavy { IF tree's left subtree is right heavy { Perform Double Right rotation } ELSE { Perform Single Right rotation } }
  • 32. AVL: Double Rotations: LR Rotation  LR rotation is a combination of a left rotation followed by a right rotation. It is performed when the left subtree of a node is unbalanced to the right, and the right subtree of the left child of that node is unbalanced to the left. 3 1 2
  • 34. AVL: RL Rotation  An RL rotation is a combination of a right rotation followed by a left rotation. It is performed when the right subtree of a node is unbalanced to the left, and the left subtree of the right child of that node is unbalanced to the right 3 1 2
  • 36. Exercise  Construct AVL Tree with the following trees 1, 2, 3, 4, 5, 6
  • 37. Construct AVL tree: 1,2,3,4,5,6 1 1 1 2 3 0 0 -1 2 2 0 -1 -2 Affected Node: 1 Tree is Right heavy Hence, Rotate Left (LL) 1 2 3 0 0 0
  • 38. Construct AVL tree: 1,2,3,4,5,6 1 2 3 4 0 0 -1 -1 4 1 2 3 4 5 0 -1 -2 0 -2 Affected Node: 3 Tree is Right heavy Hence, Rotate Left (LL) 1 2 4 3 5 0 0 0 0 -1
  • 39. 1 2 4 3 5 6 0 -1 0 -1 0 -2 Affected Node: 2 Tree is Right heavy Hence, Rotate Left (LL) 1 2 4 5 6 3 0 0 0 0 -1 0 Construct AVL tree: 1,2,3,4,5,6
  • 40. LR Rotation (Double Rotation) P GP C LC P GP C RC In LR Rotation (Before Rotation) Grand Parent (GP) Parent (P) Child (C) Left Child (LC) Right Child (RC) Left Tree is heavy in which Right subtree is heavy New node is inserted as Left child?? or Right child?? Before Before
  • 41. LR Rotation (Double Rotation) P GP C LC In LR Rotation (Before Rotation) Grand Parent (GP) Parent (P) Child (C) Left Child (LC) Left Tree is heavy in which Right subtree is heavy New node is inserted as Left child In LR Rotation (After Rotation) C is stored as Root P is Left Child of C GP is Right Child of C C’s right tree is GP’s left tree C’s left tree is P’s right tree C P GP LC Before After
  • 42. LR Rotation (Double Rotation) P GP C RC In LR Rotation (Before Rotation) Grand Parent (GP) Parent (P) Child (C) Right Child (RC) Left Tree is heavy in which Right subtree is heavy New node is inserted as Right child In LR Rotation (After Rotation) C is stored as Root P is Left Child of C GP is Right Child of C C’s right tree is GP’s left tree C’s left tree is P’s right tree C P GP RC Before After
  • 43. Construct AVL tree: 30,9,38,6,11 And Insert the node 10 9 30 38 6 11 10 0 0 0 1 -1 2 Affected Node: 30 Tree is Left heavy & Right subtree is heavy Hence, Left-Right (LR) Rotation P GP C LC C P LC GP 11 9 10 30 6 38 Insert(10) Before After Final Tree
  • 44. Construct AVL tree: 30,9,38,6,11 And Insert the node 12 9 30 38 6 11 12 0 0 0 -1 -1 2 Affected Node: 30 Tree is Left heavy & Right subtree is heavy Hence, Left-Right (LR) Rotation C P GP RC P GP C RC 11 9 30 12 6 38 Insert(10) Before After Final Tree
  • 45. AVL: 60,50,80,40,52,70,90,30,42,51,54 Insert Node 56 60 50 80 40 70 90 52 30 42 51 54 56 0 0 0 0 0 0 0 0 -1 -1 2 Affected Node: 30 Tree is Left heavy & Right subtree is heavy Hence, Left-Right (LR) Rotation C is stored as Root P is Left Child of C GP is Right Child of C C’s right tree is GP’s left tree C’s left tree is P’s right tree 52 50 60 40 51 30 42 54 56 80 70 90
  • 46. Construct AVL: 4, 2, 10, 7, 11, 1, 3, 6, 9, 12, Insert(8) 4 4 2 4 2 1 0 4 2 1 0 7 4 2 1 0 7 1 1 4 2 1 0 7 1 1 1 4 2 1 0 7 1 1 1 3 4 2 1 0 7 1 1 1 3 6 4 2 1 0 7 1 1 1 3 6 9
  • 47. Construct AVL: 4, 2, 10, 7, 11, 1, 3, 6, 9, 12, 8 4 2 1 0 7 1 1 1 3 6 9 1 2 4 2 1 0 7 1 1 1 3 6 9 1 2 8 0 0 0 0 0 -1 -1 1 1 0 -2 Affected Node: 4 Tree is Rigt heavy & Left subtree is heavy In RL Rotation Before Rotation Grand Parent (GP): 4 Parent (P): 10 Child (C): 7 After Rotation C is stored in Root : 7 GP as Left Child : 4 P as Right Child: 10 C’s right tree is P’s left tree C’s left tree is GP’s right tree
  • 48. 4 2 1 0 7 1 1 1 3 6 9 1 2 8 0 0 0 0 -1 -1 1 1 0 -2 Affected Node: 4 Tree is Rigt heavy & Left subtree is heavy Hence, Right-Left (RL) Rotation 7 4 1 0 2 1 3 1 1 1 2 6 9 8 C is stored as Root GP is Left Child of C P is Right Child of C C’s right tree is P’s left tree C’s left tree is GP’s right tree
  • 49. Construct AVL tree: 5,2,7,1,4,6,9,3,16 And Insert the node 15 2 5 7 1 4 6 9 3 16 15 0 0 0 0 1 -2 Affected Node: 9 Tree is Rigt heavy & Left subtree is heavy Hence, Right-Left (RL) Rotation 2 5 1 4 3 7 6 15 9 16
  • 50. Deletion in AVL Tree  Deletion in AVL Tree is similar to Deletion of BST  Case 1: Deleting a node which is a leaf node  Case 2: Deleting a node with only one child  Case 3: Deleting a node with two children  In AVL Tree, after deleting a node, balance should be checked, if out of range, rotate
  • 51. Ex1: Consider AVL Tree, Delete node: Case1 1 2 4 5 6 3 0 Delete Node 1 (Leaf Node) 2 4 5 6 3 0 -1 -1 0
  • 52. Ex2: Consider AVL Tree, Delete node: Case1 18 20 46 54 60 23 7 Delete Node 60 18 20 46 54 23 7 0 1 0 1 0 2 Perform RR Rotation
  • 53. Ex2: Consider AVL Tree, Delete node: Case1 contd.. 18 20 46 54 23 7 0 0 1 0 1 0 2 20 18 7 RR Rotation 46 54 23 0 0 0 0 1 0
  • 54. Ex2: Consider AVL Tree, Delete node: Case 2 2 4 5 6 3 Delete Node 2 (Only one child) Connect Grand Parent to Child 3 4 5 6 0 0 -1 -1
  • 55. Ex3: Consider AVL Tree, Delete node: Case 2 12 20 24 8 30 26 21 40 Delete Node 12 20 24 8 30 26 21 40 0 0 0 0 -1 0 -2
  • 56. Ex3: Consider AVL Tree, Delete node: Case 2 20 24 8 30 26 21 40 0 0 0 0 -1 0 -2 Node affected: 20 Tree is Right heavy LL Rotation 24 20 8 30 26 40 21 0 0 0 0 0 0 0
  • 57. Ex4: Case 2; Delete node 30 20 10 30 5 15 25 12 Delete Node 30 20 10 5 15 25 12 0 0 0 1 -1 2 Node affected: 20 Tree is Left heavy & Right subtree is heavy Perform LR Rotation
  • 58. 20 10 5 15 25 12 0 0 0 1 -1 2 Ex4: Case 2; Delete node 30 contd… LR Rotation P's right subtree left most child ie.,12 becomes right child of P after rotation 15 10 20 5 25 12 C is stored as Root GP is Right Child of C P is Left Child of C C’s right tree is GP’s left tree C’s left tree is P’s right tree
  • 59. Ex5: Case 3; Delete node 3 5 3 15 1 4 6 16 9 5 Node to be deleted is 3 Find the largest in left subtree or Find the smallest in right subtree
  • 60. AVL Tree Applications  AVL trees are commonly used to implement dictionaries or associative arrays, where data is stored in key-value pairs.  It is used in applications that require improved searching apart from the database applications.  In databases, AVL trees can be used for indexing purposes. They provide efficient retrieval of records based on keys, ensuring that operations like searching for a specific record or range queries are performed with optimal time complexity.
  • 61. AVL Tree Applications  In compilers and interpreters, symbol tables are used to store identifiers (variables, functions, etc.) along with their associated information (data type, scope, etc.). AVL trees can efficiently handle symbol tables, ensuring quick lookup and insertion of identifiers during the compilation or interpretation process.  AVL trees are also used extensively in database applications in which insertions and deletions are fewer but there are frequent lookups for data required.
  • 62. B-Tree : Introduction  Binary Tree:  Each node have max. of two children. Root, leaf, height  Operations: Insertion, deletion, searching, traversals  Applications: File system indexing, syntax tree parsing, BST,  Binary Search Tree:  Same property as binary tree. But Nodes in the left subtree are less than root and nodes in the right subtree are greater than root.  Operations: Insertion, deletion, searching, traversals  Applications: Spell check, Decision trees, data compression
  • 63. B-Tree: Introduction  B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time.  B-tree is well suited for storage systems that read and write relatively large blocks of data, such as databases and file systems.
  • 64. B-Tree: Introduction  B-tree is a special type of self-balancing search tree in which each node can contain more than one key and can have more than two children.  Also known as a height-balanced m-way tree.  The height of a B-Tree is kept low by putting maximum possible keys in a B-Tree node.  B-trees are balanced search trees designed to work well on disks or other direct access secondary storage devices.
  • 67. B-Tree: Example 1  Construct B-tree of order 3 by inserting the following keys: 1,2,3,4,5,6,7,8.  Solution:  Given m =3  Minimum keys per node: Mnk = [m/2]-1 = [1.5]-1=2-1=1  Minimum no. of children per node: Mnc = [m/2] = 2  Maximum keys per node: Mxk = (m-1) = (3-1) = 2  Maximum no. of children per node: Mxc = m = 3
  • 68. B-Tree: Example 1 contd… Construct B-tree of order 3 by inserting the following keys: 1,2,3,4,5,6,7,8. Orde r 3 Mnk 1 Mnc 2 Mxk 2 Mxc 3 I(1) 1 I(2) 1 2 I(3) 1 2 3 Split=[m/2]= 2 1 2 3 2 1 3
  • 69. B-Tree: Example 1 contd… Construct B-tree of order 3 by inserting the following keys: 1,2,3,4,5,6,7,8. Orde r 3 Mnk 1 Mnc 2 Mxk 2 Mxc 3 2 1 3 I(4) 4 4 I(5) 2 1 3 4 4 5 1 2 3 Split=[m/2]= 2 2 4 1 3 5 2 4 1 3 5 I(6) 6 6
  • 70. B-Tree: Example 1 contd… Construct B-tree of order 3 by inserting the following keys: 1,2,3,4,5,6,7,8. Orde r 3 Mnk 1 Mnc 2 Mxk 2 Mxc 3 I(7) 2 4 1 3 5 6 6 7 1 2 3 Split=[m/2]= 2 2 4 1 5 7 3 6 1 2 3 Split=[m/2]= 2 4 2 6 1 3 5 7
  • 71. Construct B-Tree with keys 1 to 11 of order 3 4 2 6 8 1 3 5 7 9 10 4 8 2 6 10 1 3 5 7 9 11
  • 72. B-Tree with order 3: Example 3 Keys: 5,3,2,9,7,8,6,10,12,1 I(5) 5 I(3) 5 3 I(2) 5 3 2 1 2 3 Split=[m/2]=3/2=2 3 2 5 Orde r 3 Mnk 1 Mnc 2 Mxk 2 Mxc 3
  • 73. 3 2 5 B-Tree with order 3: Example 1 Keys: 5,3,2,9,7,8,6,10,12,1 9 I(9) 9 I(7) 3 2 5 7 9 Orde r 3 Mnk 1 Mnc 2 Mxk 2 Mxc 3 1 2 3 Split=[m/2]=3/2=2 3 7 2 5 9
  • 74. 3 7 2 5 9 B-Tree with order 3: Example 1 Keys: 5,3,2,9,7,8,6,10,12,1 Orde r 3 Mnk 1 Mnc 2 Mxk 2 Mxc 3 I(8) 8 8 3 7 2 5 9 8 I(6) 6 6
  • 75. 3 7 2 5 9 8 6 6 B-Tree with order 3: Example 1 Keys: 5,3,2,9,7,8,6,10,12,1 Orde r 3 Mnk 1 Mnc 2 Mxk 2 Mxc 3 I(10) 10 1 2 3 Split=[m/2]=3/2=2 3 7 2 5 8 6 6 10 9 1 2 3 Split=[m/2]=3/2=2 7 3 9 2 5 6 8 10
  • 76. B-Tree with order 3: Example 1 Keys: 5,3,2,9,7,8,6,10,12,1 Orde r 3 Mnk 1 Mnc 2 Mxk 2 Mxc 3 7 3 9 2 5 6 8 10 I(12) 12 12 7 3 9 2 5 6 8 10 I(1) 12 12 1
  • 77. B-Tree: Deletion Two cases: Case 1: Deletion from a leaf node; Case 2: Deletion from a internal node Case1: Deletion from a leaf node 1. Search for the value to delete. 2. If the value is in a leaf node, simply delete it from the node. 3. If underflow happens, rebalance the tree. Case 2: Deletion from a internal node 1. Choose a new separator (either the largest element in the left subtree or the smallest element in the right subtree), remove it from the leaf node it is in, and replace the element to be deleted with the new separator. 2. The previous step deleted an element (the new separator) from a leaf node. If that leaf node is now deficient (has fewer than the required number of nodes), then rebalance the tree starting from the leaf node.
  • 78. B-Tree: Deletion Rebalancing starts from a leaf and proceeds toward the root until tree is balanced. If deleting an element from a node has brought it under minimum size, then some elements must be redistributed to bring all nodes up to the minimum. Usually, the redistribution involves moving an element from a sibling node that has more than the minimum number of nodes. That redistribution operation is called a rotation. If no sibling can spare an element, then the deficient node must be merged with a sibling. The merge causes the parent to lose a separator element, so the parent may become deficient and need rebalancing. The merging and rebalancing may continue all the way to the root.
  • 79. The algorithm to rebalance the tree is as follows: •If deficient node's right sibling exists and has more than ‘Mnk’, then rotate left a) Copy separator from parent to end of deficient node b) Replace separator in parent with first element of right sibling (right sibling loses one node but still has at least the minimum number of elements) c) The tree is now balanced • Otherwise, if the deficient node's left sibling exists and has more than the minimum number of elements, then rotate right a) Copy separator from parent to start of deficient node b) Replace separator in parent with last element of left sibling (left sibling loses one node but still has at least the minimum number of elements) c) The tree is now balanced
  • 80. • Otherwise, if both immediate siblings have only Mnk, then merge with a sibling sandwiching their separator taken off from their parent a) Copy separator to end of left node (the left node may be the deficient node or it may be the sibling with the minimum number of elements) b) Move all elements from the right node to the left node (the left node now has the maximum number of elements, and the right node – empty) c) Remove separator from parent along with its empty right child (the parent loses an element)  If the parent is the root and now has no elements, then free it and make the merged node the new root (tree becomes shallower) Otherwise, if the parent has fewer than the required number of elements, then rebalance the parent.
  • 81. B-Tree deletion example 1  Construct B-Tree with the following keys: 1,2,3,4,5,6,7,8 and perform deletion operation
  • 82. B-Tree deletion example 1. Construct a B-Tree of order 3 from the keys 1, 2, 3, 4, 5, 6, 7, 8, then delete 7. Explanation: key 7 is in a leaf node, and deleting it causes no violation because the leaf still holds key 8. [before/after diagrams]
  • 83. Delete 8. Explanation: key 8 is in a leaf node, and deleting it causes no violation. [diagram]
  • 84. Delete 3. Explanation: key 3 is in a leaf node holding only one key, so deleting it causes a violation (underflow). When the siblings do not have sufficient keys to borrow, the deficient leaf is merged with the separator pulled down from its parent. [diagram]
  • 85. Delete 1. [diagram of the resulting tree]
  • 86. Delete 5. [diagram of the resulting tree]
  • 87. Delete 6. [diagram of the resulting tree]
  • 88. Delete 2. [diagram of the resulting tree]
  • 89. Delete 4. Explanation: if the key to be deleted is in the root (an internal node), find the largest key in its left subtree (the in-order predecessor), replace the deleted key with it, and remove it from its leaf. [diagram]
  • 90. Delete 6. When the node to be deleted contains only one left child and one right child, the two children are merged. [diagram]
  • 91. Delete 5. [diagram]
  • 92. Priority Queue. A priority queue is an abstract data type that behaves like a normal queue except that each element has a priority. The priority of the elements determines the order in which elements are removed from the priority queue: an element with higher priority is deleted before elements with lower priority. If two elements in a priority queue have the same priority, they are served in FIFO order.
  • 93. Priority Queue. A priority queue can be implemented using: arrays, linked lists, or heaps.
  • 94. Array implementation of Priority Queue. Consider the elements with their priorities: Element: 5, 10, 3, 20, 15; Priority: 2, 1, 3, 5, 4. Stored in an array ordered by decreasing priority: 20, 15, 3, 5, 10.
  • 95. Array implementation of Priority Queue. Insert a new element 30 with priority 6; it now has the highest priority in the queue (a sketch of this implementation follows below). Deletion: the element with the highest priority is deleted first.
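A minimal C sketch of the array implementation; for simplicity it keeps the array unordered (insert appends in O(1), delete-max scans in O(n)). The names PQ, pq_insert and pq_delete_max are illustrative, not from the slides:

#include <stdio.h>

#define PQ_CAP 100

typedef struct { int element; int priority; } PQItem;
typedef struct { PQItem items[PQ_CAP]; int size; } PQ;

/* Insert at the end: O(1). */
int pq_insert(PQ *q, int element, int priority) {
    if (q->size == PQ_CAP) return -1;             /* queue full */
    q->items[q->size].element  = element;
    q->items[q->size].priority = priority;
    q->size++;
    return 0;
}

/* Remove and return the element with the highest priority: O(n) scan. */
int pq_delete_max(PQ *q) {
    int best = 0;
    for (int i = 1; i < q->size; i++)
        if (q->items[i].priority > q->items[best].priority) best = i;
    int element = q->items[best].element;
    q->items[best] = q->items[--q->size];         /* fill the hole with the last item */
    return element;
}

int main(void) {
    PQ q = { .size = 0 };
    int elems[] = {5, 10, 3, 20, 15}, prios[] = {2, 1, 3, 5, 4};
    for (int i = 0; i < 5; i++) pq_insert(&q, elems[i], prios[i]);
    pq_insert(&q, 30, 6);                         /* new element with priority 6 */
    printf("%d\n", pq_delete_max(&q));            /* prints 30: highest priority leaves first */
    return 0;
}

If the array is instead kept ordered by priority, as drawn on the slide, the costs swap: insertion becomes O(n) and delete-max O(1).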
  • 96. Linked List implementation of Priority Queue. Elements with priorities: 800(5), 300(1), 100(3), 200(2), 600(4). Stored as a linked list ordered by priority: 300(1) → 200(2) → 100(3) → 600(4) → 800(5). A sketch follows below.
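A C sketch of the linked-list variant: nodes are kept sorted by priority. The slide draws the list in ascending order of priority; here it is kept in descending order so that delete-max is O(1) at the head. Names are illustrative:

#include <stdlib.h>

typedef struct PNode {
    int element, priority;
    struct PNode *next;
} PNode;

/* Insert keeping the list sorted by decreasing priority: O(n). */
void pq_insert(PNode **head, int element, int priority) {
    PNode *n = malloc(sizeof *n);
    n->element = element; n->priority = priority;
    if (*head == NULL || (*head)->priority < priority) {   /* new highest priority */
        n->next = *head; *head = n; return;
    }
    PNode *cur = *head;
    while (cur->next && cur->next->priority >= priority) cur = cur->next;
    n->next = cur->next; cur->next = n;
}

/* Delete-max is O(1): the head always holds the highest-priority element. */
int pq_delete_max(PNode **head) {
    PNode *n = *head; int element = n->element;
    *head = n->next; free(n);
    return element;
}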
  • 97. Heap Trees. A heap is a tree-based data structure in which the tree is a complete binary tree (every node has at most two children) that satisfies the heap property. Heap property: the value of a parent is either greater than or equal to (max heap) or less than or equal to (min heap) the value of each of its children. Also called a binary heap. Because a heap is a complete binary tree, it can be represented efficiently using an array: the heap is stored in memory as a linear array in level order.
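Because a complete binary tree packs into an array level by level, the parent/child relations reduce to index arithmetic. A small sketch with 0-based indexing (helper names are illustrative):

#include <stdbool.h>

static int parent(int i)      { return (i - 1) / 2; }
static int left_child(int i)  { return 2 * i + 1; }
static int right_child(int i) { return 2 * i + 2; }

/* Max-heap property: every node is >= its children. */
bool is_max_heap(const int a[], int n) {
    for (int i = 1; i < n; i++)
        if (a[parent(i)] < a[i]) return false;
    return true;
}

With 1-based indexing, as some textbooks use, the formulas become parent = i/2 and children = 2i and 2i+1.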
  • 98. Unit 2. Graphs: Terminology, Representations, Basic Search and Traversals - BFS, DFS, Biconnected Components & DFS. String Searching Algorithms: Brute-Force algorithm, Rabin-Karp algorithm, Boyer-Moore algorithm.
  • 99. Graph: Terminology. graph: a data structure containing a set of vertices V (sometimes called nodes) and a set of edges E, where an edge represents a connection between 2 vertices. Graph G = (V, E); an edge is a pair (v, w) where v, w are in V. Example: V = {a, b, c, d}, E = {(a, c), (b, c), (b, d), (c, d)}. [diagram]
  • 100. Graph: Terminology. [diagrams: (b) directed graph, (c) weighted directed graph]
  • 101. Graph: Terminology. reachable: vertex a is reachable from b if a path exists from b to a. connected: a graph is connected if every vertex is reachable from every other vertex. (Is the graph at top right connected?) strongly connected: a directed graph is strongly connected if every vertex is reachable from every other vertex. [diagrams]
  • 102. Graph: Terminology. cycle: a path that begins and ends at the same node. Examples: {b, g, f, c, a} or {V, X, Y, W, U, V}; {c, d, a} or {U, W, V, U}. acyclic graph: one that does not contain any cycles; a directed graph without cycles is called a directed acyclic graph (DAG). loop: an edge directly from a node to itself; many graphs don't allow loops. [diagrams]
  • 103. Graph Terminology. There are two commonly used methods for representing graphs: the adjacency matrix and the adjacency list (a sketch of both for the example graph follows below). [diagram: a directed graph with its adjacency matrix and adjacency list]
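A C sketch of both representations for the undirected graph used on the earlier terminology slide, V = {a, b, c, d}, E = {(a,c), (b,c), (b,d), (c,d)}, mapping a..d to indices 0..3 (the array and type names are illustrative):

#include <stdio.h>

#define V 4   /* vertices a=0, b=1, c=2, d=3 */

/* Adjacency matrix: adj[i][j] = 1 iff edge (i, j) exists (undirected, so symmetric). */
int adj[V][V] = {
    /*        a  b  c  d */
    /* a */ { 0, 0, 1, 0 },
    /* b */ { 0, 0, 1, 1 },
    /* c */ { 1, 1, 0, 1 },
    /* d */ { 0, 1, 1, 0 },
};

/* Adjacency list: each vertex keeps a chain of its neighbours. */
typedef struct AdjNode { int v; struct AdjNode *next; } AdjNode;

void print_adjacency_list(AdjNode *list[], int n) {
    for (int u = 0; u < n; u++) {
        printf("%c:", 'a' + u);
        for (AdjNode *p = list[u]; p; p = p->next) printf(" %c", 'a' + p->v);
        printf("\n");
    }
}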
  • 105. DFS Example: visiting order 1 → 2 → 4 → 8 → 5 → 6 → 3 → 7. [diagram]
Algorithm DFS(v)
{
    visited[v] := 1;
    for each vertex w adjacent from v do
    {
        if (visited[w] = 0) then DFS(w);
    }
}
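A runnable C rendering of the DFS routine on the slide, assuming the graph is stored as an adjacency matrix g[][] with N vertices numbered 1..N mapped to indices 0..N-1 (the matrix contents and names are assumptions for illustration):

#include <stdio.h>

#define N 8
int g[N][N];          /* adjacency matrix; fill in before calling dfs() */
int visited[N];

void dfs(int v) {
    visited[v] = 1;
    printf("%d ", v + 1);                  /* print the 1-based vertex number */
    for (int w = 0; w < N; w++)            /* scan vertices adjacent from v */
        if (g[v][w] && !visited[w])
            dfs(w);
}

With an adjacency matrix, DFS takes O(n²) time; with adjacency lists it takes O(n + e).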
  • 107. BFS: Breadth First Search • In breadth first search, we start at a vertex v and mark it as visited. • The vertex v is at this time said to be unexplored. A vertex is said to have been explored by an algorithm when the algorithm has visited all vertices adjacent from it. • All unvisited vertices adjacent from v are visited next. These are new unexplored vertices. Vertex v is now said to be explored.
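A C sketch that mirrors this description: a vertex is marked visited when it is first reached and becomes explored once all vertices adjacent from it have been visited. The adjacency matrix and the fixed-size array queue are assumptions for illustration:

#include <stdio.h>

#define N 8
int graph[N][N];          /* adjacency matrix; fill in before calling bfs() */
int visited[N];

void bfs(int v) {
    int queue[N], front = 0, rear = 0;
    visited[v] = 1;                          /* v is visited but still unexplored */
    queue[rear++] = v;
    while (front < rear) {
        int u = queue[front++];              /* u is now explored */
        printf("%d ", u + 1);
        for (int w = 0; w < N; w++)
            if (graph[u][w] && !visited[w]) {
                visited[w] = 1;              /* visit every unvisited vertex adjacent from u */
                queue[rear++] = w;
            }
    }
}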
  • 110. Biconnected components and DFS. Biconnected graph: a biconnected graph is a connected graph that has no articulation points. Articulation point: a vertex v of a graph G is an articulation point iff the deletion of v, together with the deletion of all edges incident to v, leaves behind a graph that has at least two connected components. Biconnected component: a biconnected component of a connected graph G is a maximal biconnected subgraph H of G.
  • 111. Articulation point. A vertex v of a graph G is an articulation point iff the deletion of v, together with the deletion of all edges incident to v, leaves behind a graph that has at least two connected components. Example: in the 5-vertex graph shown, the articulation points are 0 and 3. [diagram]
  • 112. Biconnected Components and DFS. [diagram: the 10-vertex example graph and its DFS spanning tree with depth-first numbers 1-10; solid lines are tree edges, dashed lines are back edges]
  • 113. L(u) is the lowest depth-first number that can be reached from u using a path of descendants followed by at most one back edge: L(u) = min{ dfn(u), min{ L(w) : w is a child of u }, min{ dfn(w) : (u, w) is a back edge } }. If u is not the root, then u is an articulation point iff u has a child w such that L[w] ≥ dfn[u].
  • 114. Computing L values (D(·) denotes dfn):
L(6) = 10; L(10) = 4; L(9) = 5
L(5) = min(D(5), L(6), D(8), D(2)) = min(9, 10, 7, 6) = 6
L(7) = min(D(7), L(5), D(2)) = min(8, 6, 6) = 6
L(8) = min(D(8), L(7), D(5)) = min(7, 6, 9) = 6
L(2) = min(D(2), L(8), D(1), D(7), D(5)) = min(6, 6, 1, 8, 9) = 1
L(4) = min(D(4), L(3)) = min(2, 1) = 1
L(1) = min(D(1), L(2), D(2)) = min(1, 1, 6) = 1
Summary for vertices 1..10: L = 1, 1, 1, 1, 6, 10, 6, 6, 5, 4
  • 115. Applying the test "u (not the root) is an articulation point iff u has a child w with L[w] ≥ dfn[u]":
u = 4: L[3] ≥ dfn[4]? 1 ≥ 2 → False
u = 3: L[10] ≥ dfn[3]? 4 ≥ 3 → True (3,10); L[9] ≥ dfn[3]? 5 ≥ 3 → True (3,9); L[2] ≥ dfn[3]? 1 ≥ 3 → False
u = 10: no child; u = 9: no child
u = 2: L[8] ≥ dfn[2]? 6 ≥ 6 → True (2,8)
u = 8: L[7] ≥ dfn[8]? 6 ≥ 7 → False
u = 5: L[6] ≥ dfn[5]? 10 ≥ 9 → True (5,6)
u = 7: L[5] ≥ dfn[7]? 6 ≥ 8 → False
Articulation points are: 2, 3, 5
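A C sketch of the dfn/L computation and the articulation-point test applied above, for an undirected graph given as an adjacency matrix with vertices numbered 1..n (array names follow the slides; the matrix contents are assumptions). The root of the DFS is an articulation point iff it has two or more tree children, a case the slides' test does not cover and which is only noted in a comment here:

#include <stdio.h>

#define MAXV 11
int n = 10;                   /* number of vertices, numbered 1..n as in the example */
int g[MAXV][MAXV];            /* adjacency matrix of the undirected graph */
int dfn[MAXV], L[MAXV], num = 1;

static int min2(int a, int b) { return a < b ? a : b; }

/* DFS from u; parent is u's parent in the DFS tree (0 for the root). */
void art(int u, int parent) {
    dfn[u] = L[u] = num++;
    for (int w = 1; w <= n; w++) {
        if (!g[u][w] || w == parent) continue;
        if (dfn[w] == 0) {                       /* (u, w) is a tree edge */
            art(w, u);
            L[u] = min2(L[u], L[w]);
            if (parent != 0 && L[w] >= dfn[u])   /* non-root test from the slide */
                printf("%d is an articulation point (child %d)\n", u, w);
        } else {
            L[u] = min2(L[u], dfn[w]);           /* (u, w) is a back edge */
        }
    }
    /* The root (parent == 0) is an articulation point iff it has >= 2 tree children. */
}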
  • 117. String Searching. A string is a sequence of characters, e.g. "Java Program", "Abracadabra", "East West". Let P be a string of size m: a substring P[i .. j] of P is the subsequence of P consisting of the characters with ranks between i and j. Given strings T (text) and P (pattern), the pattern matching problem consists of finding a substring of T equal to P.
  • 118. String Searching: Brute Force. The brute-force pattern matching algorithm compares the pattern P with the text T for each possible shift of P relative to T, until either a match is found, or all placements of the pattern have been tried. Brute-force pattern matching runs in time O(nm); the worst-case running time is Θ((n-m+1)m).
  • 119. String Matching: Brute Force • Example 1: •Text (T) = abccba baccab •Pattern (P) = cab • Example 2: •Text (T) = cseacsebcseccsed •Pattern (P) = cseb
  • 120.
#include <stdio.h>

void string_search_brute(char *T, char *P, int n, int m)
{
    int i, j, found = 0;
    for (i = 0; i < n - m + 1; i++) {
        for (j = 0; j < m; j++) {
            if (T[i + j] != P[j])
                break;
        }
        if (j == m) {   // reached the end of the pattern: P occurs in T at shift i
            found = 1;
            break;
        }
    }
    if (found)
        printf("Found pattern at index %d\n", i);
    else
        printf("Pattern not found in Text\n");
}
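A minimal driver for the routine above, using the pattern/text pair from Example 2 on the previous slide (placed in the same file, after the function):

#include <string.h>

int main(void) {
    char T[] = "cseacsebcseccsed";
    char P[] = "cseb";
    string_search_brute(T, P, (int)strlen(T), (int)strlen(P));   /* prints: Found pattern at index 4 */
    return 0;
}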
  • 122. String Searching: Rabin-Karp. Given |Σ| = d, the Rabin-Karp algorithm treats each string in Σ* as if it were a number in radix-d notation. Ex: if Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} then d = 10, and the string "274" has value 2 × d² + 7 × d¹ + 4 × d⁰ = 2 × 100 + 7 × 10 + 4 × 1. Similarly, if Σ = {a, b} then d = 2, and we map the characters of {a, b} to the values {0, 1}; then "bab" = 1 × d² + 0 × d¹ + 1 × d⁰ = 1 × 4 + 0 × 2 + 1 × 1. Hexadecimal notation uses Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F} and d = 16.
  • 123. String Searching: Rabin-Karp. Example 1: Text (T) = 123443215678, Pattern (P) = 21.
Solution: T.len = 12; P.len = 2; choose prime Q = 11.
Compute the hash of P: H(P) = P mod Q = 21 mod 11 = 10.
Compute hashes of the length-2 windows of T:
12 mod 11 = 1, not a match (H(P) ≠ H(T));
23 mod 11 = 1, not a match;
34 mod 11 = 1, not a match;
… 43 mod 11 = 10, hash match but 43 ≠ 21 → spurious hit;
… 21 mod 11 = 10, match.
  • 124. String Searching: Rabin-Karp. Example 2: Text (T) = ABCCDDAEFG, Pattern (P) = CDD. Character mapping: A=1, B=2, C=3, D=4, E=5, F=6, G=7, H=8, I=9, J=10.
Solution: T.len = 10; P.len = 3; choose prime Q = 13.
Hash of P: H(P) = (3 × 10² + 4 × 10¹ + 4 × 10⁰) mod 13 = 344 mod 13 = 6.
Compute hashes of the length-3 windows of T:
ABC: (1 × 10² + 2 × 10¹ + 3 × 10⁰) mod 13 = 6 → hash match but ABC ≠ CDD, spurious hit;
BCC: (2 × 10² + 3 × 10¹ + 3 × 10⁰) mod 13 = 12, not a match;
…
  • 125.
Algorithm RabinKarp(char T[], char P[], int n, int m)
{
    // The inputs are the pattern P (length m) and the text T (length n)
    hP := hash(P[0..m-1])            // hash of the m characters of the pattern
    hT := hash(T[0..m-1])            // hash of the first m characters of the text
    for S := 0 to n - m do
    {
        if (hP == hT)                // the two hash values are compared
        {
            if (P[0..m-1] == T[S..S+m-1])   // verify character by character
            {
                Print "Pattern found with shift S"
                return
            }
        }
        if (S < n - m)
            hT := hash(T[S+1..S+m])  // in practice, updated in O(1) with a rolling hash
    }
}
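A C sketch of Rabin-Karp that fills in the rolling-hash update the pseudocode above leaves implicit; d is the radix and q a prime modulus. Here the raw character codes are hashed rather than the A=1, B=2, ... mapping used in the worked examples, and the function name is illustrative:

#include <stdio.h>
#include <string.h>

void rabin_karp(const char *T, const char *P, int d, int q) {
    int n = (int)strlen(T), m = (int)strlen(P);
    if (m > n) return;

    int h = 1;                              /* h = d^(m-1) mod q */
    for (int i = 0; i < m - 1; i++) h = (h * d) % q;

    int hP = 0, hT = 0;
    for (int i = 0; i < m; i++) {           /* initial hashes of P and T[0..m-1] */
        hP = (d * hP + P[i]) % q;
        hT = (d * hT + T[i]) % q;
    }

    for (int s = 0; s <= n - m; s++) {
        if (hP == hT && memcmp(T + s, P, m) == 0)   /* verify to rule out spurious hits */
            printf("Pattern found with shift %d\n", s);
        if (s < n - m) {                    /* roll the hash: drop T[s], add T[s+m] */
            hT = (d * (hT - T[s] * h) + T[s + m]) % q;
            if (hT < 0) hT += q;
        }
    }
}

For instance, rabin_karp("ABCCDDAEFG", "CDD", 10, 13) reports a match at shift 3; the explicit memcmp check guarantees that hash collisions (spurious hits) are never reported as matches.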
  • 126. Boyer-Moore. The Boyer-Moore pattern matching technique is based on two phases. 1. The looking-glass phase: find P in T by moving backwards through P, starting at its end. 2. The character-jump phase: when a mismatch occurs at T[i] == x, i.e. the character in the pattern P[j] is not the same as T[i]. There are 3 possible cases, tried in order. [diagram]
  • 127. Case 1. If P contains x somewhere, then try to shift P right to align the last occurrence of x in P with T[i]; then move i and j right so that j is at the end of P. [diagram]
  • 128. Case 2. If P contains x somewhere, but a shift right to the last occurrence is not possible (the last occurrence of x is after position j), then shift P right by 1 character to T[i+1]; then move i and j right so that j is at the end of P. [diagram]
  • 129. Case 3. If cases 1 and 2 do not apply (x does not occur in P), then shift P to align P[0] with T[i+1]; then move i and j right so that j is at the end of P. [diagram]
  • 132. Boyer Moore: Worst case. T: "aaaaa…a", P: "baaaaa"
  • 133. Boyer-Moore: Last Occurrence Function. Boyer-Moore's algorithm preprocesses the pattern P and the alphabet A to build a last-occurrence function L(); L() maps all the letters in A to integers. L(x) is defined as (x is a letter in A): the largest index i such that P[i] == x, or -1 if no such index exists.
  • 134. Last Occurrence L(): Example. A = {a, b, c, d}, P = "abacab" (indices 0 1 2 3 4 5).
x:      a   b   c   d
L(x):   4   5   3   -1
L() stores indexes into P[].
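A C sketch of the BuildLast preprocessing assumed by the algorithm on the next slide: L[x] is the last (rightmost) index of character x in P, or -1 if x does not occur. Indexing the table by ASCII code is an implementation choice here; for the example above it yields L['a'] = 4, L['b'] = 5, L['c'] = 3, L['d'] = -1:

#define ALPHABET 128   /* ASCII; the slides use a 4-letter alphabet, this is a superset */

void BuildLast(const char P[], int m, int L[ALPHABET]) {
    for (int c = 0; c < ALPHABET; c++) L[c] = -1;            /* -1: character not in P */
    for (int i = 0; i < m; i++) L[(unsigned char)P[i]] = i;  /* later occurrences overwrite earlier ones */
}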
  • 135.
Algorithm Boyer_Moore(char T[], char P[], int n, int m)
{
    int L[128];                       // last-occurrence table
    BuildLast(P, m, L);               // see previous slide
    int i = m - 1, j = m - 1;
    do {
        if (P[j] == T[i]) {
            if (j == 0)
                return i;             // match found at index i
            else {                    // looking-glass technique: keep moving backwards
                i--; j--;
            }
        } else {                      // character-jump technique
            int lo = L[T[i]];         // last occurrence of the mismatched character in P
            i = i + m - min(j, 1 + lo);
            j = m - 1;
        }
    } while (i <= n - 1);
    return -1;                        // no match
}