Prof P Sreenivasa Kumar
Department of CS&E, IITM
File Organization and Indexing
The data of an RDB is ultimately stored in disk files
Disk space management:
Should operating system services be used?
Should the RDBMS manage the disk space by itself?
The 2nd option is preferred, as the RDBMS requires complete
control over when a block or page in the main memory buffer
is written to the disk.
This is important for recovering data when a system
crash occurs
Structure of Disks
Disk
several platters stacked on a rotating spindle
one read/write head per surface for fast access
each platter has several tracks
• ~10,000 per inch
each track - several sectors
each sector - blocks
unit of data transfer - block
cylinder i - track i on all platters
Speed: 7000 to 10000 rpm
[Figure: disk with platters, read/write heads, a track and a sector]
Data Transfer from Disk
Address of a block: Surface No, Cylinder No, Block No
Data transfer:
Move the r/w head to the appropriate track
• time needed - seek time – ~ 12 to 14 ms
Wait for the appropriate block to come under r/w head
• time needed - rotational delay - ~3 to 4ms (avg)
Access time: Seek time + rotational delay
Blocks on the same cylinder are roughly close to each other, access time-wise
- blocks on cylinder (i + 1), cylinder (i + 2), etc. are progressively farther from cylinder i
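The access-time figures above can be checked with a small calculation. This is a minimal sketch, assuming (as is usual, though not stated on the slide) that the average rotational delay is the time for half a revolution; the function names are illustrative only.

```python
def avg_rotational_delay_ms(rpm: float) -> float:
    """Average rotational delay in ms, assumed to be half a revolution."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


def access_time_ms(seek_ms: float, rpm: float) -> float:
    """Access time = seek time + (average) rotational delay."""
    return seek_ms + avg_rotational_delay_ms(rpm)


if __name__ == "__main__":
    for rpm in (7000, 10000):
        print(rpm, "rpm:", round(avg_rotational_delay_ms(rpm), 2), "ms average delay")
    print("access time at 12 ms seek, 10000 rpm:", access_time_ms(12, 10000), "ms")
```

Under this assumption, 7000 to 10000 rpm gives roughly 4.3 ms down to 3 ms of average delay, which agrees with the "~3 to 4 ms" quoted above.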
Data Records and Files
Fixed length record type: each field is of fixed length
• in a file of records of this type, the record number can be
used to locate a specific record
• the number of records and the length of each field are available
in the file header
Variable length record type:
• arises due to missing fields, repeating fields, variable length
fields
• special separator symbols are used to indicate the field
boundaries and record boundaries
• the number of records and the separator symbols used are
recorded in the file header
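For fixed length records, locating record n is a matter of arithmetic on the record number. The sketch below assumes a hypothetical, simplified header holding only the record count and the total record length (the slide mentions the length of each field; a single total is used here for brevity). `HEADER`, `build_file` and `read_record` are illustrative names.

```python
import struct

# Hypothetical header layout: record count and record length, both 4-byte ints.
HEADER = struct.Struct("<II")


def build_file(records, record_length):
    """Pack equal-length records behind the header."""
    for rec in records:
        if len(rec) != record_length:
            raise ValueError("all records must have the fixed length")
    return HEADER.pack(len(records), record_length) + b"".join(records)


def read_record(data, n):
    """Locate record n (0-based) directly from its record number."""
    count, record_length = HEADER.unpack_from(data, 0)
    if not 0 <= n < count:
        raise IndexError("no such record")
    start = HEADER.size + n * record_length
    return data[start:start + record_length]


if __name__ == "__main__":
    data = build_file([b"AAAA", b"BBBB", b"CCCC"], 4)
    print(read_record(data, 1))
```

No scan and no separator symbols are needed, which is what makes the fixed length case simple.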
Packing Records into Blocks
Record length much less than block size
• The usual case
• Blocking factor b = ⌊B/r⌋
B - block size (bytes)
r - record length (bytes)
- maximum no. of records that can be stored in a block
Record length greater than block size
• spanned organization is used
[Figure: spanned organization; pieces of records 1, 2 and 3 laid out across blocks]
File blocks:
sequence of blocks containing all the records of the file
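The blocking factor can be sketched as a short calculation. The function names below are illustrative; the sample numbers (4096-byte blocks, 100-byte records, 90,000 records) are the ones used in the secondary index example later in these slides.

```python
import math


def blocking_factor(block_size, record_length):
    """Maximum number of whole records per block (unspanned organization)."""
    if record_length > block_size:
        raise ValueError("record longer than a block: spanned organization needed")
    return block_size // record_length


def blocks_needed(num_records, block_size, record_length):
    """Number of file blocks needed to hold all the records."""
    return math.ceil(num_records / blocking_factor(block_size, record_length))


if __name__ == "__main__":
    print(blocking_factor(4096, 100))
    print(blocks_needed(90_000, 4096, 100))
```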
Mapping File Blocks onto the Disk Blocks
Contiguous allocation
• Consecutive file blocks are stored in consecutive disk blocks
• Pros: file scanning can be done fast using double buffering
• Cons: expanding the file by including a new block in the middle
of the sequence is difficult
Linked allocation
• each file block is assigned to some disk block
• each disk block has a pointer to next block of the sequence
• file expansion is easy; but scanning is slow
Mixed allocation
Operations on Files
Insertion of a new record: may involve searching for the appropriate
location for the new record
Deletion of a record: locating the record may involve a search;
deleting the record may involve movement of other records
Update a record field/fields: equivalent to delete and insert
Search for a record: given the value of a key field / non-key field
Range search: given range values for a key / non-key field
How efficiently we can carry out these operations
depends on the organization of the file and the availability
of indexes
Primary File Organization
The logical policy / method used for placing records into file blocks
Example: Student file - organized to have student records sorted
in increasing order of the “rollNo” values
Goal: To ensure that operations performed frequently on the file
execute fast
• there may be conflicting demands
• example: on the student file, access based on rollNo and also
access based on name may both be frequent
• we choose to make rollNo access fast
• for making name access fast, additional access structures
are needed.
- more details later
Different File Organization Methods
We will discuss Heap files, Sorted files and Hashed files
Heap file:
Records are appended to the file as they are inserted
Simplest organization
Insertion - Read the last file block, append the record and
write back the block - easy
Locating a record given values for any attribute
• requires scanning the entire file – very costly
Heap files are often used only along with other access structures.
Sorted files / Sequential files (1/2)
Ordering field: The field whose values are used for sorting the
records in the data file
Ordering key field: An ordering field that is also a key
Sorted file / Sequential file:
Data file whose records are arranged such that the values of the
ordering field are in ascending order
Locating a record given the value X of the ordering field:
Binary search can be performed
Address of the nth file block can be obtained from
the file header
O(log N) disk accesses (N - no. of file blocks) to get the required block - efficient
Range search is also efficient
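Binary search over the blocks of a sorted file can be sketched as follows. This is an in-memory illustration, not a disk implementation: each inner list stands for one file block, and reading `blocks[mid]` stands for one disk access. The name `find_block` and the sample key values are assumptions made for the example.

```python
def find_block(blocks, x):
    """Binary search over the blocks of a sorted file.

    Returns (block number holding x, or None if absent; block reads used).
    """
    lo, hi = 0, len(blocks) - 1
    reads = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]          # stands for one disk access
        reads += 1
        if x < block[0]:
            hi = mid - 1             # x can only be in an earlier block
        elif x > block[-1]:
            lo = mid + 1             # x can only be in a later block
        else:
            return (mid if x in block else None), reads
    return None, reads


if __name__ == "__main__":
    blocks = [[101, 104], [121, 123], [129, 130], [240, 244]]
    print(find_block(blocks, 130))
    print(find_block(blocks, 125))
```

A range search can start the same way: locate the block for the lower bound, then read succeeding blocks in order.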
Sorted files / Sequential files (2/2)
Inserting a new record:
Ordering gets affected
• costly as all blocks following the block in which the insertion is
performed may have to be modified
Hence not done directly in the file
• all inserted records are kept in an auxiliary file
• periodically the file is reorganized - the auxiliary file and the main file
are merged
• locating a record: carried out first on the auxiliary file and then on the main file.
Deleting a record
• deletion markers are used.
Hashed Files
Very useful file organization, if quick access to the data record is
needed given the value of a single attribute.
Hashing field: The attribute on which quick access is needed and
on which hashing is performed
Data file: organized as buckets numbered 0, 1, …, (M − 1)
(bucket - a block or a few consecutive blocks)
Hash function h: maps the values from the domain of the hashing
attribute to bucket numbers
Inserting Records into a Hashed File
Insertion: for the given record R, apply h on the value of the hashing
attribute to get the bucket number r.
If there is space in bucket r, place R there; else place R in the
overflow chain of bucket r.
The overflow chains of all the buckets are maintained in the
overflow buckets.
[Figure: main buckets 0, 1, 2, …, M-1, with overflow chains leading into the overflow buckets]
Deleting Records from a Hashed File
Deletion: Locate the record R to be deleted by applying h.
Remove R from its bucket/overflow chain. If possible, bring a record from
the overflow chain into the bucket.
Search: Given the hash field value k, compute r = h(k). Get bucket r
and search for the record. If not found, search the overflow chain
of bucket r.
[Figure: main buckets 0, 1, 2, …, M-1, with overflow chains leading into the overflow buckets]
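Insertion, search and deletion with overflow chains can be sketched in a few lines. This is a minimal in-memory sketch, assuming integer hash field values and the hypothetical hash function h(k) = k mod M; the class name `StaticHashFile` and its attributes are illustrative, and each overflow chain is modelled as a plain list.

```python
class StaticHashFile:
    """Static hashing with M main buckets and one overflow chain per bucket."""

    def __init__(self, num_buckets, bucket_capacity):
        self.M = num_buckets
        self.capacity = bucket_capacity
        self.main = [[] for _ in range(num_buckets)]
        self.overflow = [[] for _ in range(num_buckets)]

    def h(self, key):
        # Assumed hash function: key mod M.
        return key % self.M

    def insert(self, key):
        r = self.h(key)
        if len(self.main[r]) < self.capacity:
            self.main[r].append(key)
        else:
            self.overflow[r].append(key)   # bucket full: use its overflow chain

    def search(self, key):
        r = self.h(key)
        return key in self.main[r] or key in self.overflow[r]

    def delete(self, key):
        r = self.h(key)
        if key in self.main[r]:
            self.main[r].remove(key)
            if self.overflow[r]:
                # Bring a record from the overflow chain into the bucket.
                self.main[r].append(self.overflow[r].pop(0))
            return True
        if key in self.overflow[r]:
            self.overflow[r].remove(key)
            return True
        return False


if __name__ == "__main__":
    f = StaticHashFile(num_buckets=3, bucket_capacity=2)
    for k in (3, 6, 9, 4):
        f.insert(k)
    print(f.main, f.overflow)
```

With 3 buckets of capacity 2, inserting 3, 6, 9 fills bucket 0 and pushes 9 into its overflow chain; deleting 3 then pulls 9 back into the main bucket.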
Performance of Static Hashing
Static hashing:
The hashing method discussed so far
The number of main buckets is fixed
Locating a record given the value of the hashing attribute
most often – one block access
Capacity of the hash file C = r * M records
(r - no. of records per bucket, M - no. of main buckets)
Disadvantage with static hashing:
If the actual number of records in the file is much less than C
• wastage of disk space
If the actual number of records in the file is much more than C
• long overflow chains – degraded performance
Hashing for Dynamic File Organization
Dynamic files
files where record insertions and deletions take place frequently
the file keeps growing and also shrinking
Hashing for dynamic file organization
Bucket numbers are integers
The binary representation of bucket numbers
is exploited cleverly to devise dynamic hashing schemes
Two schemes
• Extendible hashing
• Linear hashing
Extendible Hashing (1/2)
The k-bit sequence corresponding to a record R:
Apply the hashing function to the value of the hashing field of R
to get the bucket number r
Convert r into its binary representation to get the bit sequence
Take the trailing k bits
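Taking the trailing k bits amounts to a bit mask. The sketch below assumes the bucket number r has already been computed by the hash function; `k_bit_sequence` is an illustrative name.

```python
def k_bit_sequence(r: int, k: int) -> str:
    """Trailing k bits of the binary representation of r, as a string."""
    if k == 0:
        return ""
    mask = (1 << k) - 1
    return format(r & mask, f"0{k}b")


if __name__ == "__main__":
    # 45 is 101101 in binary, 22 is 10110, 12 is 1100.
    print(k_bit_sequence(45, 2), k_bit_sequence(22, 2), k_bit_sequence(12, 3))
```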
Extendible Hashing (2/2)
Global depth d: the number of trailing bits used in the directory (d = 3 in the figure)
Local depth of a bucket: the number of bits in the common suffix of the bit
sequences corresponding to the records in the bucket
[Figure: directory with entries 000 to 111 (global depth d = 3) pointing to buckets with local depths 2 and 3; e.g. one bucket holds all records with 3-bit sequence ‘111’, another all records with 2-bit sequence ‘01’]
Locating a record
Match the d-bit sequence with an entry in the directory and go to
the corresponding bucket to find the record
Insertion in Extendible Hashing Scheme (1/2)
2-bit sequence for the record to be inserted: 00
[Figure: before the insertion - directory with d = 2; entries 00 and 10 share bucket b0 (local depth 1, full); b1 and b2 have local depth 2]
b0 not full: the new record is placed in b0. No changes in the directory.
b0 full: bucket b0 is split
All records whose 2-bit sequence is ‘10’ are
sent to a new bucket b3. Others are retained in b0
Directory is modified.
[Figure: after the split - directory with d = 2; entry 10 now points to b3; all local depths = 2]
Insertion in Extendible Hashing Scheme (2/2)
2-bit sequence for the record to be inserted: 10
[Figure: before the insertion - directory with d = 2 and buckets b0, b1, b2, b3, all with local depth 2; b3 is full]
b3 not full: the new record is placed in b3. No changes.
b3 full: b3 is split, the directory is doubled, all records with 3-bit
sequence 110 are sent to b4. Others stay in b3.
[Figure: after the split - directory with d = 3 (entries 000 to 111); b3 and b4 have local depth 3; b0, b1 and b2 have local depth 2]
In general, if the local depth of the bucket to be split is equal to the
global depth, the directory is doubled
Deletion in Extendible Hashing Scheme
[Figure: the directory with d = 2 (buckets b0 to b3, all local depths = 2) and the directory with d = 3 (buckets b0 to b4; b3 and b4 with local depth 3), as in the insertion example]
Matching pair of data buckets:
k-bit sequences have a common (k−1)-bit suffix, e.g., b3 & b4
Due to deletions, if a pair of matching data buckets
-- becomes less than half full – try to merge them into one bucket
If the local depth of all buckets is one less than the global depth
-- reduce the directory to half its size
Extendible Hashing Example
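Bucket capacity = 2, initial buckets = 1
Keys and their binary representations: 45 = 101101, 22 = 10110, 12 = 1100, 11 = 1011
Insert 45, 22
global depth 0; the single bucket (local depth 0) holds 45, 22
Insert 12
Bucket overflows
local depth = global depth
⇒ Directory doubles and split image is created
global depth 1; bucket 0 (local depth 1) holds 22, 12; bucket 1 (local depth 1) holds 45
Insert 11
global depth 1; bucket 0 holds 22, 12; bucket 1 holds 45, 11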
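Keys and their binary representations: 45 = 101101, 22 = 10110, 12 = 1100, 11 = 1011, 15 = 1111, 10 = 1010
Insert 15
Overflow occurs.
Global depth = local depth
Directory doubles and split occurs
global depth 2; entries 00 and 10 share the bucket holding 22, 12 (local depth 1); 01 → 45 (local depth 2); 11 → 11, 15 (local depth 2)
Insert 10
Overflow occurs.
Since local depth < global depth
Split image is created
Directory is not doubled
global depth 2; 00 → 12; 01 → 45; 10 → 22, 10; 11 → 11, 15 (all local depths 2)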
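The insertions in this example can be traced with a compact in-memory sketch of extendible hashing. Several things here are assumptions made for illustration: the key itself is used as the hash value (as the example does), keys are distinct, the directory is a Python list indexed by the trailing global-depth bits, and the names `Bucket` and `ExtendibleHash` are hypothetical. Deletion and merging are left out.

```python
class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth
        self.keys = []


class ExtendibleHash:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.global_depth = 0
        self.directory = [Bucket(0)]      # indexed by trailing global_depth bits

    def _index(self, key):
        return key & ((1 << self.global_depth) - 1)

    def insert(self, key):
        bucket = self.directory[self._index(key)]
        if len(bucket.keys) < self.capacity:
            bucket.keys.append(key)
            return
        if bucket.local_depth == self.global_depth:
            # Double the directory: entry i and entry i + old size share a bucket.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        # Split the full bucket on one more trailing bit.
        bucket.local_depth += 1
        image = Bucket(bucket.local_depth)
        high_bit = 1 << (bucket.local_depth - 1)
        for i, b in enumerate(self.directory):
            if b is bucket and (i & high_bit):
                self.directory[i] = image
        pending = bucket.keys + [key]
        bucket.keys = []
        for k in pending:
            self.insert(k)                # may split again if all keys collide


if __name__ == "__main__":
    eh = ExtendibleHash(capacity=2)
    for key in (45, 22, 12, 11, 15, 10):
        eh.insert(key)
    for i, b in enumerate(eh.directory):
        print(format(i, f"0{eh.global_depth}b"), b.keys, "local depth", b.local_depth)
```

Running it on the keys 45, 22, 12, 11, 15, 10 with bucket capacity 2 ends with global depth 2 and the buckets 00 → 12, 01 → 45, 10 → 22, 10, 11 → 11, 15, the same final state as the example.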
Linear Hashing
Does not require a separate directory structure
Uses a family of hash functions h_0, h_1, h_2, …
• the range of h_i is double the range of h_(i-1)
• h_i(x) = x mod (2^i M)
M - the initial no. of buckets
(Assume that the hashing field is an integer)
Initial hash functions
h_0(x) = x mod M
h_1(x) = x mod 2M
Insertion (1/3)
Initially the structure has M main buckets
(0, …, M-1) and a few overflow buckets
To insert a record with hash field value x,
place the record in bucket h_0(x)
When the first overflow in any bucket occurs:
Say, the overflow occurred in bucket s
Insert the record in the overflow chain of bucket s
Create a new bucket M
Split bucket 0 by using h_1
Some records stay in bucket 0 and
some go to bucket M.
[Figure: main buckets 0, 1, 2, …, M-1 with overflow buckets; the new bucket M is the split image of bucket 0]
Insertion (2/3)
On first overflow,
irrespective of where it occurs, bucket 0 is split
On subsequent overflows
buckets 1, 2, 3, … are split in that order
(This is why the scheme is called linear hashing)
N: the next bucket to be split
After M overflows,
all the original M buckets are split.
We switch to hash functions h_1, h_2
and set N = 0.
Hash functions in use: (h_0, h_1) → (h_1, h_2) → … → (h_i, h_(i+1)) → …
[Figure: original buckets 0, 1, 2, …, M-1 followed by their split images M, M+1, …]
Nature of Hash Functions
h_i(x) = x mod (2^i M). Let M' = 2^i M
M' – the current number of original buckets.
Note that if h_i(x) = k then x = M'r + k, k < M'
and h_(i+1)(x) = (M'r + k) mod 2M' = k or M' + k
Since,
r even, r = 2s: (M'·2s + k) mod 2M' = k
r odd, r = 2s + 1: (M'(2s + 1) + k) mod 2M' = M' + k
Insertion (3/3)
Say the hash functions in use are h_i, h_(i+1)
To insert a record with hash field value x,
Compute h_i(x)
if h_i(x) < N, the original bucket is already split:
place the record in bucket h_(i+1)(x)
else place the record in bucket h_i(x)
Linear Hashing Example
Initial buckets = 1, bucket capacity = 2 records
Hash functions: h0 = x mod 1, h1 = x mod 2; split pointer N = 0
Insert 12, 11
bucket 0: 12, 11 (N = 0)
Insert 14
B0 overflows
Bucket pointed to by N is split
bucket 0: 12, 14; bucket 1: 11
Hash functions are changed: h0 = x mod 2, h1 = x mod 4 (N = 0)
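Hash functions: h0 = x mod 2, h1 = x mod 4
Insert 13
bucket 0: 12, 14; bucket 1: 11, 13 (N = 0)
Insert 9
B1 overflows
B0 is split using h1 and its split image (bucket 2) is created
bucket 0: 12; bucket 1: 11, 13 with 9 in its overflow chain; bucket 2: 14 (N = 1)
Insert 10
h1 is applied here, since h0(10) = 0 < N
bucket 0: 12; bucket 1: 11, 13 with 9 in its overflow chain; bucket 2: 14, 10 (N = 1)
Insert 18
overflow at B2
split B1
bucket 0: 12; bucket 1: 9, 13; bucket 2: 14, 10 with 18 in its overflow chain; bucket 3: 11
Hash functions are changed: h0 = x mod 4, h1 = x mod 8 (N = 0)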
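The linear hashing example can be replayed with a small in-memory sketch. The assumptions are: integer keys, one Python list per bucket standing for the main bucket plus its overflow chain, and a split triggered by every insertion that overflows a bucket. The class name `LinearHash` and its attributes (`level` for i, `next_split` for N) are illustrative, not taken from the slides.

```python
class LinearHash:
    def __init__(self, initial_buckets=1, capacity=2):
        self.m0 = initial_buckets          # M, the initial number of buckets
        self.capacity = capacity
        self.level = 0                     # i: functions in use are h_i, h_(i+1)
        self.next_split = 0                # N, the next bucket to be split
        # Each list models a main bucket together with its overflow chain.
        self.buckets = [[] for _ in range(initial_buckets)]

    def _h(self, i, x):
        return x % ((2 ** i) * self.m0)

    def _address(self, x):
        b = self._h(self.level, x)
        if b < self.next_split:            # bucket b has already been split
            b = self._h(self.level + 1, x)
        return b

    def insert(self, x):
        bucket = self.buckets[self._address(x)]
        bucket.append(x)
        if len(bucket) > self.capacity:    # overflow: split the bucket at N
            self._split()

    def _split(self):
        old = self.buckets[self.next_split]
        self.buckets[self.next_split] = []
        self.buckets.append([])            # split image of bucket N
        for x in old:
            self.buckets[self._h(self.level + 1, x)].append(x)
        self.next_split += 1
        if self.next_split == (2 ** self.level) * self.m0:
            self.level += 1                # all original buckets are split
            self.next_split = 0


if __name__ == "__main__":
    lh = LinearHash(initial_buckets=1, capacity=2)
    for key in (12, 11, 14, 13, 9, 10, 18):
        lh.insert(key)
    print(lh.buckets, "level", lh.level, "N", lh.next_split)
```

Inserting 12, 11, 14, 13, 9, 10, 18 leaves bucket 0 with 12, bucket 1 with 13 and 9, bucket 2 with 14, 10 and 18 (one record in overflow), and bucket 3 with 11, with N back at 0, as in the example.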
Index Structures
Index: A disk data structure
– enables efficient retrieval of a record
given the value(s) of certain attributes
– the indexing attributes
Primary Index:
Index built on the ordering key field of a file
Clustering Index:
Index built on the ordering non-key field of a file
Secondary Index:
Index built on any non-ordering field of a file
Primary Index
Can be built on ordered / sorted files
Index attribute – ordering key field (OKF)
Index entry: <value of OKF for the first record of a block Bj, disk address of Bj>
Index file: ordered file (sorted on OKF)
size - no. of blocks in the data file
Index file blocking factor BFi = ⌊B/(V + P)⌋
(B - block size, V - OKF size, P - block pointer size)
- generally more than the data file blocking factor
No. of index file blocks bi = ⌈b/BFi⌉
(b - no. of data file blocks)
[Figure: index entries 101, 121, 129, 240, …, each pointing to the data file block that starts with that ordering key (RollNo) value; the data file blocks hold 101, 104 | 121, 123 | 129, 130 | 240, 244 | …]
Record Access Using Primary Index
Given ordering key field (OKF) value: x
Carry out binary search on the index file
m – value of OKF for the first record in the middle block k of
the index file
x < m: do binary search on blocks 0 – (k − 1) of the index file
x ≥ m: if the last index entry with OKF value ≤ x is in block k,
use the corresponding block pointer,
get the data file block and
search for the data record with OKF value x
else do binary search on blocks k + 1, …, bi − 1 of the index file
Maximum index block accesses required: ⌈log2 bi⌉ (plus one access for the data block)
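A lookup through a primary index can be sketched with the standard `bisect` module. This is a simplification, assuming the whole index fits in a Python list rather than being searched block by block: the index holds one (first OKF value, block number) entry per data block, and the entry to follow is the last one whose value is ≤ x. The name `lookup` and the sample RollNo values are used for illustration.

```python
from bisect import bisect_right


def lookup(index, data_blocks, x):
    """Return the number of the data block holding OKF value x, or None.

    index: sorted list of (first OKF value in the block, block number).
    """
    anchors = [value for value, _ in index]
    pos = bisect_right(anchors, x) - 1     # last entry with anchor <= x
    if pos < 0:
        return None                        # x is smaller than every key
    block_no = index[pos][1]
    return block_no if x in data_blocks[block_no] else None


if __name__ == "__main__":
    data_blocks = [[101, 104], [121, 123], [129, 130], [240, 244]]
    index = [(block[0], i) for i, block in enumerate(data_blocks)]
    print(lookup(index, data_blocks, 123))
    print(lookup(index, data_blocks, 125))
```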
An Example
Data file:
No. of blocks b = 9500
Block size B = 4 KB
OKF length V = 15 bytes
Block pointer length P = 6 bytes
Index file:
No. of records ri = 9500
Size of entry V + P = 21 bytes
Blocking factor BFi = ⌊4096/21⌋ = 195
No. of blocks bi = ⌈ri/BFi⌉ = 49
Max no. of block accesses for getting a record
using the primary index: 1 + ⌈log2 bi⌉ = 7
Max no. of block accesses for getting a record
without using the primary index: ⌈log2 b⌉ = 14
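The arithmetic of this example can be reproduced as a short sketch. The function name `primary_index_costs` and the tuple it returns are illustrative choices; the inputs are the values given above.

```python
import math


def primary_index_costs(b, B, V, P):
    """Return (BFi, bi, accesses with index, accesses without index)."""
    bfi = B // (V + P)                          # index entries per block
    bi = math.ceil(b / bfi)                     # one index entry per data block
    with_index = 1 + math.ceil(math.log2(bi))   # binary search on index + data block
    without_index = math.ceil(math.log2(b))     # binary search on the data file
    return bfi, bi, with_index, without_index


if __name__ == "__main__":
    print(primary_index_costs(b=9500, B=4096, V=15, P=6))
```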
Making the Index Multi-level
Index file – itself an ordered file
– another level of index can be built
Multilevel index –
Successive levels of indices are built till the last level has one block
height – no. of levels
block accesses: height + 1
(no binary search required)
[Figure: second level index (1 block, 49 entries) → first level index (49 blocks, 9500 entries) → data file (9500 blocks)]
For the example data file:
No. of block accesses required
with multi-level primary index: 3
without any index: 14
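Building levels until the top level fits in one block can be sketched as a loop. The function name `index_levels` is illustrative; it takes the number of entries in the first level index and the index blocking factor, and returns the number of blocks at each level.

```python
import math


def index_levels(first_level_entries, entries_per_block):
    """Blocks per index level, first level first, until one block remains."""
    levels = []
    entries = first_level_entries
    while True:
        blocks = math.ceil(entries / entries_per_block)
        levels.append(blocks)
        if blocks <= 1:
            break
        entries = blocks          # the next level has one entry per block
    return levels


if __name__ == "__main__":
    levels = index_levels(9500, 195)
    print(levels, "block accesses:", len(levels) + 1)
```

For 9500 first level entries at 195 entries per block this gives two levels (49 blocks, then 1 block), hence 3 block accesses.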
Range Search, Insertion and Deletion
Range search on the ordering key field:
Get records with OKF value between x1 and x2 (inclusive)
Use the index to locate the record with OKF value x1 and read
succeeding records till the OKF value exceeds x2.
Very efficient
Insertion: Data file – keep 25% of space in each block free
-- to take care of future insertions;
the index doesn't get changed
-- or use overflow chains for blocks that overflow
Deletion: Handle using deletion markers so that the index doesn’t get
affected
Basically, avoid changes to the index
Clustering Index
Built on ordered files where the ordering field is not a key
Index attribute: ordering field (OF)
Index entry: <distinct value Vi of the OF, address of the first
block that has a record with OF value Vi>
Index file: ordered file (sorted on OF)
size – no. of distinct values of OF
Secondary Index
Built on any non-ordering field (NOF) of a data file.
Case I: NOF is also a key (secondary key)
index entry: <value of the NOF Vi, pointer to the record with Vi as the NOF value>
Case II: NOF is not a key: two options
(1) index entry: <value of the NOF Vi, pointer(s) to the record(s) with Vi as the NOF value>
(2) index entry: <value of the NOF Vi, pointer to a block that has pointer(s) to the record(s)
with Vi as the NOF value>
Remarks:
(1) index entry – variable length record
(2) index entry – fixed length – one more level of indirection
Secondary Index (key)
Can be built on ordered files and also on other types of files
Index attribute: non-ordering key field
Index entry: <value of the NOF Vi, pointer to the record with Vi as the NOF value>
Index file: ordered file (sorted on NOF values)
No. of entries – same as the no. of records in the data file
Index file blocking factor BFi = ⌊B/(V + Pr)⌋
(B: block size, V: length of the NOF,
Pr: length of a record pointer)
Index file blocks = ⌈r/BFi⌉
(r – no. of records in the data file)
An Example
Data file:
No. of records r = 90,000; block size B = 4 KB
Record length R = 100 bytes; BF = ⌊4096/100⌋ = 40;
b = 90000/40 = 2250
NOF length V = 15 bytes; length of a record pointer Pr = 7 bytes
Index file:
No. of records ri = 90,000; record length = V + Pr = 22 bytes
BFi = ⌊4096/22⌋ = 186; no. of blocks bi = ⌈90000/186⌉ = 484
Max no. of block accesses to get a record
using the secondary index: 1 + ⌈log2 bi⌉ = 10
Avg no. of block accesses to get a record
without using the secondary index: b/2 = 1125
A very significant improvement
Multi-level Secondary Indexes
Secondary indexes can also be converted to multi-level indexes
First level index
– as many entries as there are records in the data file
First level index is an ordered file
so, in the second level index, the number of entries will be
equal to the number of blocks in the first level index
rather than the number of records
Similarly in other higher levels
Making the Secondary Index Multi-level
Multilevel index –
Successive levels of indices are built
till the last level has one block
height – no. of levels
block accesses: height + 1
[Figure: third level index (1 block, 3 entries) → second level index (3 blocks, 484 entries) → first level index (484 blocks, 90000 entries) → data file (2250 blocks, 90000 records)]
For the example data file:
No. of block accesses required:
multi-level index: 4
single level index: 10
Index Sequential Access Method (ISAM) Files
ISAM files –
Ordered files with a multilevel primary/clustering index
Insertions:
Handled using overflow chains at data file blocks
Deletions:
Handled using deletion markers
Most suitable for files that are relatively static
If the files are dynamic, we need to go for dynamic multi-level
index structures based on B+-trees
B+-trees
Balanced search trees
• all leaves are at the same level
Leaf node entries point to the actual data records
• all leaf nodes are linked up as a list
Internal node entries carry only index information
In B-trees, internal nodes carry data records also
The fan-out in B-trees is less
Makes sure that blocks are always at least half filled
Supports both random and sequential access of records
Order
Order (m) of an Internal Node
• Order of an internal node is the maximum number of tree
pointers held in it.
• Maximum of (m-1) keys can be present in an internal node
Order (mleaf) of a Leaf Node
• Order of a leaf node is the maximum number of record
pointers held in it. It is equal to the number of keys in a
leaf node.
Internal Nodes
An internal node of a B+-tree of order m:
It contains at least ⌈m/2⌉ pointers, except when it is the root node
It contains at most m pointers.
If it has P1, P2, …, Pj pointers with
K1 < K2 < K3 … < Kj-1 as keys, where ⌈m/2⌉ ≤ j ≤ m, then
• P1 points to the subtree with records having key value x ≤ K1
• Pi (1 < i < j) points to the subtree with records having
key value x such that Ki-1 < x ≤ Ki
• Pj points to the subtree with records having key value x > Kj-1
Internal Node Structure
⌈m/2⌉ ≤ j ≤ m
Node layout: P1 K1 P2 K2 … Ki-1 Pi Ki … Kj-1 Pj
P1: x ≤ K1; Pi: Ki-1 < x ≤ Ki; Pj: Kj-1 < x
Example
Keys 2, 5, 12; the four pointers lead to subtrees with
x ≤ 2, 2 < x ≤ 5, 5 < x ≤ 12 and x > 12
Leaf Node Structure
Structure of a leaf node of a B+-tree of order mleaf:
It contains one block pointer P to point to the next leaf node
At least ⌈mleaf/2⌉ record pointers and key values
At most mleaf record pointers and key values
If a node has keys K1 < K2 < … < Kj with Pr1, Pr2, …, Prj as record
pointers and P as block pointer, then
Pri points to the record with Ki as the search field value, 1 ≤ i ≤ j
P points to the next leaf block
Node layout: K1 Pr1 K2 Pr2 … Kj Prj P
Order Calculation
Block size: B, Size of Indexing field: V
Size of block pointer: P, Size of record pointer: Pr
Order of internal node (m):
As there can be at most m block pointers and (m-1) keys
(m*P) + ((m-1) * V) ≤ B
m is the largest integer satisfying the above inequality.
Order of leaf node:
As there can be at most mleaf record pointers and keys
with one block pointer in a leaf node,
mleaf is the largest integer satisfying
(mleaf * (Pr + V)) + P ≤ B
Example Order Calculation
Given B = 512 bytes, V = 8 bytes,
P = 6 bytes, Pr = 7 bytes. Then
Internal node order m = ?
m * P + ((m-1) *V) ≤ B
m * 6 + ((m-1) *8) ≤ 512
14m ≤ 520
m ≤ 37
Leaf order mleaf = ?
mleaf (Pr + V) + P ≤ 512
mleaf (7 + 8) + 6 ≤ 512
15mleaf ≤ 506
mleaf ≤ 33
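Both orders follow from rearranging the inequalities and taking the integer part. The sketch below uses illustrative function names; the parameter names match the symbols above.

```python
def internal_order(B, V, P):
    """Largest m with m*P + (m - 1)*V <= B, i.e. m <= (B + V) / (P + V)."""
    return (B + V) // (P + V)


def leaf_order(B, V, P, Pr):
    """Largest mleaf with mleaf*(Pr + V) + P <= B."""
    return (B - P) // (Pr + V)


if __name__ == "__main__":
    print(internal_order(B=512, V=8, P=6))
    print(leaf_order(B=512, V=8, P=6, Pr=7))
```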
Example B+-tree
m = 3, mleaf = 2
[Figure: example B+-tree; the internal nodes hold the keys 3, 2, 7, 4, 9 and the linked leaves hold 1, 2, 3, 4, 6, 7, 8, 9, 12, 15]
Insertion into B+-trees
Every new entry is inserted at the leaf level
If the leaf node overflows, then
• Node is split at j = ⌈(mleaf + 1)/2⌉
• First j entries are kept in the original node
• Entries from j+1 onwards are moved to a new node
• jth key value is replicated in the parent of the leaf.
If an internal node overflows
• Node is split at j = ⌈(m + 1)/2⌉
• Values and pointers up to Pj are kept in the original node
• jth key value is moved to the parent of the internal node
• Pj+1 and the rest of the entries are moved to a new node.
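The two split rules can be sketched as small functions on sorted key lists. This is only the splitting step, not a full B+-tree: the function names are illustrative, the ceiling in j is an assumption consistent with j = 2 for mleaf = 2, and child pointers are represented by placeholder strings.

```python
import math


def split_leaf(keys, mleaf):
    """Split an overflowing leaf holding mleaf + 1 sorted keys.

    Returns (keys kept, keys moved to the new leaf, key replicated in the parent).
    """
    j = math.ceil((mleaf + 1) / 2)
    return keys[:j], keys[j:], keys[j - 1]


def split_internal(keys, pointers, m):
    """Split an overflowing internal node with m keys and m + 1 pointers.

    Returns ((keys, pointers) kept, (keys, pointers) moved, key moved up).
    """
    j = math.ceil((m + 1) / 2)
    kept = (keys[:j - 1], pointers[:j])      # up to pointer Pj
    moved = (keys[j:], pointers[j:])         # from pointer Pj+1 onwards
    return kept, moved, keys[j - 1]          # the jth key goes to the parent


if __name__ == "__main__":
    print(split_leaf([11, 14, 20], mleaf=2))
    print(split_internal([12, 14, 25], ["A", "B", "C", "D"], m=3))
```

The difference between the two cases is visible in the results: the leaf keeps a copy of the jth key, while the internal node gives it up to its parent.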
Example of Insertions
m = 3, mleaf = 2
Insert 20, 11
leaf: 11, 20
Insert 14
Overflow. The leaf is split
at j = ⌈(mleaf + 1)/2⌉ = 2
14 is replicated to the upper level
root: 14; leaves: 11, 14 | 20
Insert 25
Inserted at leaf level
root: 14; leaves: 11, 14 | 20, 25
Insert 30
Overflow.
split at 25.
25 is copied up
root: 14, 25; leaves: 11, 14 | 20, 25 | 30
Insert 12
Overflow at leaf level.
- Split at leaf level,
- Triggers overflow at internal node
- Split occurs at internal node
Internal node split
at j = ⌈m/2⌉:
split at 14 and 14 is
moved up
root: 14; internal nodes: 12 | 25; leaves: 11, 12 | 14 | 20, 25 | 30
Insert 22
root: 14; internal nodes: 12 | 22, 25; leaves: 11, 12 | 14 | 20, 22 | 25 | 30
Insert 23, 24
root: 14, 24; internal nodes: 12 | 22 | 25; leaves: 11, 12 | 14 | 20, 22 | 23, 24 | 25 | 30
Deletion in B+-trees
Delete the entry from the leaf node
If the entry is also present in an internal node, delete it there and replace it with
the entry to its left in that position.
If underflow occurs after deletion
• Distribute the entries from the left sibling
if not possible – distribute the entries from the right sibling
if not possible – merge the node with the left and right sibling
Example
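Before: root: 14, 24; internal nodes: 12 | 22 | 25; leaves: 11, 12 | 14 | 20, 22 | 23, 24 | 25 | 30
Delete 20
Removed entry from the leaf
After: root: 14, 24; internal nodes: 12 | 22 | 25; leaves: 11, 12 | 14 | 22 | 23, 24 | 25 | 30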
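Delete 22
Entry 22 is removed from the leaf and the internal node
Entries from the right sibling are distributed to the left
root: 14, 24; internal node keys: 12, 23, 25; leaf keys: 11, 12, 14, 23, 24, 25, 30
Delete 24
root: 14; internal node keys: 12, 23, 25; leaf keys: 11, 12, 14, 23, 25, 30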
Delete 14
root: 12; internal node keys: 11, 23, 25; leaf keys: 11, 12, 23, 25, 30
Delete 12
root: 23, 25; leaf keys: 11, 23, 25, 30
Level drop has occurred
Advantages of B+-trees:
1) Any record can be fetched in an equal number of disk accesses.
2) Range queries can be performed easily as the leaves are linked up
3) The height of the tree is less as only keys are used for indexing
4) Supports both random and sequential access.
Disadvantages of B+-trees:
Insert and delete operations are complicated
Root node becomes a hotspot

More Related Content

PDF
1 introduction
PDF
Introduction to DBMS and SQL Overview
PPT
Unit 02 dbms
PPT
Unit01 dbms 2
PDF
Relational Database Design
PDF
Query Processing, Query Optimization and Transaction
DOC
Basic IMS For Applications
PPT
23. Advanced Datatypes and New Application in DBMS
1 introduction
Introduction to DBMS and SQL Overview
Unit 02 dbms
Unit01 dbms 2
Relational Database Design
Query Processing, Query Optimization and Transaction
Basic IMS For Applications
23. Advanced Datatypes and New Application in DBMS

What's hot (18)

PDF
Data and File Structure Lecture Notes
PDF
Dms01
PPS
Database system-DBMS
PPT
Unit 04 dbms
PPTX
Distributed design alternatives
PPT
358 33 powerpoint-slides_16-files-their-organization_chapter-16
PPT
Chapter25
PDF
RDBMS Arch & Models
PPT
Normalization of database tables
PDF
DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007...
PDF
Final exam in advance dbms
PDF
1816 1819
PPT
Chapter16
PPTX
Database systems - Chapter 2 (Remaining)
PDF
Cs2305 programming paradigms lecturer notes
DOCX
Database Management System
PPT
Distributed D B
PPTX
Data Modeling
Data and File Structure Lecture Notes
Dms01
Database system-DBMS
Unit 04 dbms
Distributed design alternatives
358 33 powerpoint-slides_16-files-their-organization_chapter-16
Chapter25
RDBMS Arch & Models
Normalization of database tables
DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007...
Final exam in advance dbms
1816 1819
Chapter16
Database systems - Chapter 2 (Remaining)
Cs2305 programming paradigms lecturer notes
Database Management System
Distributed D B
Data Modeling
Ad

Viewers also liked (20)

PPTX
Managing your tech career
PDF
6 relational schema_design
PDF
4 the sql_standard
PPT
Best Practices for Database Schema Design
PPTX
Webinar: Build an Application Series - Session 2 - Getting Started
PDF
3 relational model
PDF
MySQL Replication: Pros and Cons
PDF
Distributed Postgres
ZIP
Week3 Lecture Database Design
PPTX
Database Design
PDF
2 entity relationship_model
PPTX
English gcse final tips
PDF
Postgres-XC Write Scalable PostgreSQL Cluster
PDF
Escalabilidade, Sharding, Paralelismo e Bigdata com PostgreSQL? Yes, we can!
PPTX
Database design concept
PPT
Database design
PPT
Best Practices for Database Schema Design
PDF
Database Schema
PPT
Database design
Managing your tech career
6 relational schema_design
4 the sql_standard
Best Practices for Database Schema Design
Webinar: Build an Application Series - Session 2 - Getting Started
3 relational model
MySQL Replication: Pros and Cons
Distributed Postgres
Week3 Lecture Database Design
Database Design
2 entity relationship_model
English gcse final tips
Postgres-XC Write Scalable PostgreSQL Cluster
Escalabilidade, Sharding, Paralelismo e Bigdata com PostgreSQL? Yes, we can!
Database design concept
Database design
Best Practices for Database Schema Design
Database Schema
Database design
Ad

Similar to 5 data storage_and_indexing (20)

PPTX
files,indexing,hashing,linear and non linear hashing
PPT
File organization 1
PPTX
DBMS Data Storage and Query Processing.
PPT
Data Indexing Presentation-My.pptppt.ppt
PPTX
Relational database management system file organisation.pptx
PPT
Database Management Systems full lecture
PPTX
file organization ppt on dbms types of f
PPTX
DBMS-Unit5-PPT.pptx important for revision
PDF
File Organization
PPTX
Elmasri Navathe Primary Files database A
PPT
Hashing gt1
PPTX
Relational Database Management System
PDF
Database management system chapter thirt
PPTX
normalization process in relational data base management
PDF
fileorganizationandintroductionofdbms-210313163900.pdf
PPTX
File organization and introduction of DBMS
PPT
File organization
PPTX
storage techniques_overview-1.pptx
PPTX
Ch 17 disk storage, basic files structure, and hashing
files,indexing,hashing,linear and non linear hashing
File organization 1

5 data storage_and_indexing

  • 1. Prof P Sreenivasa Kumar, Department of CS&E, IITM. File Organization and Indexing. The data of a relational database (RDB) is ultimately stored in disk files. Disk space management: should operating system services be used, or should the RDBMS manage the disk space by itself? The second option is preferred because the RDBMS requires complete control over when a block or page in the main memory buffer is written to disk. This is important for recovering data after a system crash.
  • 2. Structure of Disks. A disk has several platters stacked on a rotating spindle, with one read/write head per surface for fast access. Each platter has several tracks (~10,000 per inch), each track has several sectors, and each sector holds blocks. The unit of data transfer is the block. Cylinder i is track i on all platters. Rotation speed: 7000 to 10000 rpm. [Figure: platters, read/write head, track, sector]
  • 3. Data Transfer from Disk. Address of a block: surface no., cylinder no., block no. Data transfer takes two steps: (1) move the r/w head to the appropriate track (seek time, ~12 to 14 ms); (2) wait for the appropriate block to come under the r/w head (rotational delay, ~3 to 4 ms on average). Access time = seek time + rotational delay. Blocks on the same cylinder are roughly close to each other access-time-wise; next closest are blocks on nearby cylinders (cylinder i, cylinder i + 1, cylinder i + 2, etc.).
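The rotational-delay figure on this slide can be sanity-checked with a short calculation. The sketch below assumes the average rotational delay is half a revolution, a common approximation that the slide itself does not state; the function name is illustrative.

```python
def avg_rotational_delay_ms(rpm: float) -> float:
    """Average rotational delay in ms, assuming half a revolution on average."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


# The slides quote spindle speeds of 7000 to 10000 rpm.
for rpm in (7000, 10000):
    print(rpm, "rpm ->", round(avg_rotational_delay_ms(rpm), 2), "ms")
```

Under that assumption the two speeds give roughly 4.3 ms and 3 ms, in line with the "~3 to 4 ms" quoted on the slide.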
  • 4. Data Records and Files. Fixed length record type: each field is of fixed length. In a file of records of this type, the record number can be used to locate a specific record; the number of records and the length of each field are available in the file header. Variable length record type: arises due to missing fields, repeating fields, or variable length fields. Special separator symbols are used to indicate the field boundaries and record boundaries; the number of records and the separator symbols used are recorded in the file header.
  • 5. Packing Records into Blocks. Record length much less than block size (the usual case): the blocking factor $b = \lfloor B/r \rfloor$, where B is the block size in bytes and r is the record length in bytes, is the maximum number of records that can be stored in a block. Record length greater than block size: spanned organization is used, in which a record may continue into the next block. [Figure: records 1, 2 and 3 spanning block boundaries] File blocks: the sequence of blocks containing all the records of the file.
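As a small illustration of the blocking factor, the sketch below computes it with integer division and derives the number of file blocks for an unspanned file. The function names are illustrative; the sample numbers (4 KB blocks, 100-byte records, 90,000 records) are the ones used in the secondary-index example later in the deck.

```python
def blocking_factor(block_size: int, record_length: int) -> int:
    """Maximum number of whole records per block (unspanned organization)."""
    return block_size // record_length


def file_blocks(num_records: int, block_size: int, record_length: int) -> int:
    """Number of blocks needed to hold num_records records."""
    bfr = blocking_factor(block_size, record_length)
    return -(-num_records // bfr)  # ceiling division


print(blocking_factor(4096, 100))      # records per block
print(file_blocks(90_000, 4096, 100))  # blocks in the data file
```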
  • 6. Mapping File Blocks onto the Disk Blocks. Contiguous allocation: consecutive file blocks are stored in consecutive disk blocks. Pros: file scanning can be done fast using double buffering. Cons: expanding the file by including a new block in the middle of the sequence is difficult. Linked allocation: each file block is assigned to some disk block, and each disk block has a pointer to the next block of the sequence; file expansion is easy, but scanning is slow. Mixed allocation.
  • 7. Operations on Files. Insertion of a new record: may involve searching for the appropriate location for the new record. Deletion of a record: locating the record may involve a search; deleting it may involve movement of other records. Update of a record field or fields: equivalent to delete and insert. Search for a record: given the value of a key field or non-key field. Range search: given a range of values for a key or non-key field. How efficiently we can carry out these operations depends on the organization of the file and the availability of indexes.
  • 8. Primary File Organization. The logical policy or method used for placing records into file blocks. Example: a Student file organized so that student records are sorted in increasing order of the “rollNo” values. Goal: to ensure that operations performed frequently on the file execute fast. There may be conflicting demands; for example, on the student file, access based on rollNo and access based on name may both be frequent. We choose to make rollNo access fast; for making name access fast, additional access structures are needed (more details later).
  • 9. Different File Organization Methods. We will discuss heap files, sorted files and hashed files. Heap file: records are appended to the file as they are inserted; this is the simplest organization. Insertion is easy: read the last file block, append the record and write the block back. Locating a record given values for any attribute requires scanning the entire file, which is very costly. Heap files are therefore usually used together with other access structures.
  • 10. Sorted files / Sequential files (1/2). Ordering field: the field whose values are used for sorting the records in the data file. Ordering key field: an ordering field that is also a key. Sorted file / sequential file: a data file whose records are arranged such that the values of the ordering field are in ascending order. Locating a record given the value X of the ordering field: binary search can be performed on the file blocks, since the address of the nth file block can be obtained from the file header. This takes O(log N) disk accesses, where N is the number of file blocks, which is efficient. Range search is also efficient.
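The block-level binary search described on this slide can be sketched as follows. This is a minimal in-memory model, not a disk implementation: the file is assumed to be a list of non-empty blocks, each a sorted list of ordering-field values, and every inspected block is counted as one simulated disk access. The function name and the sample roll numbers are illustrative.

```python
def find_record(blocks, x):
    """Binary search over the blocks of a sorted file.

    Returns (block_number or None, number_of_block_reads).
    """
    lo, hi = 0, len(blocks) - 1
    reads = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]  # one simulated disk access
        reads += 1
        if x < block[0]:
            hi = mid - 1     # x can only be in an earlier block
        elif x > block[-1]:
            lo = mid + 1     # x can only be in a later block
        else:
            # x lies within this block's value range: search inside the block
            return (mid if x in block else None), reads
    return None, reads


data_file = [[101, 104], [121, 123], [129, 130], [240, 244]]
print(find_record(data_file, 130))
print(find_record(data_file, 125))
```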
  • 11. Sorted files / Sequential files (2/2). Inserting a new record: the ordering gets affected, and insertion is costly because all blocks following the block in which the insertion is performed may have to be modified. Hence insertion is not done directly in the file: all inserted records are kept in an auxiliary file, and periodically the file is reorganized by merging the auxiliary file and the main file. Locating a record is then carried out first on the auxiliary file and then on the main file. Deleting a record: deletion markers are used.
  • 12. Hashed Files. A very useful file organization if quick access to a data record is needed given the value of a single attribute. Hashing field: the attribute on which quick access is needed and on which hashing is performed. Data file: organized as buckets numbered 0, 1, …, (M − 1), where a bucket is a block or a few consecutive blocks. Hash function h: maps values from the domain of the hashing attribute to bucket numbers.
  • 13. Inserting Records into a Hashed File. Insertion: for the given record R, apply h to the value of the hashing attribute to get the bucket number r. If there is space in bucket r, place R there; otherwise place R in the overflow chain of bucket r. The overflow chains of all the buckets are maintained in the overflow buckets. [Figure: main buckets 0, 1, 2, …, M-1 with overflow chains leading into the overflow buckets]
  • 14. Deleting Records from a Hashed File. Deletion: locate the record R to be deleted by applying h. Remove R from its bucket or overflow chain. If possible, bring a record from the overflow chain into the bucket. Search: given the hash field value k, compute r = h(k). Get bucket r and search for the record. If it is not found, search the overflow chain of bucket r. [Figure: main buckets 0, 1, 2, …, M-1 with overflow chains leading into the overflow buckets]
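The insertion, deletion and search rules for a statically hashed file can be put together in a small in-memory sketch. Several things here are assumptions made for illustration only: keys are integers, the hash function is `key % M`, each bucket's overflow chain is modelled as a separate Python list, and the class and method names are invented.

```python
class StaticHashFile:
    """Toy model of a hashed file with M main buckets and overflow chains."""

    def __init__(self, num_buckets: int, capacity: int):
        self.M = num_buckets
        self.capacity = capacity  # records per main bucket
        self.main = [[] for _ in range(num_buckets)]
        self.overflow = [[] for _ in range(num_buckets)]

    def _h(self, key: int) -> int:
        return key % self.M  # assumed hash function

    def insert(self, key: int) -> None:
        r = self._h(key)
        if len(self.main[r]) < self.capacity:
            self.main[r].append(key)
        else:
            self.overflow[r].append(key)  # goes to bucket r's overflow chain

    def search(self, key: int) -> bool:
        r = self._h(key)
        return key in self.main[r] or key in self.overflow[r]

    def delete(self, key: int) -> bool:
        r = self._h(key)
        if key in self.main[r]:
            self.main[r].remove(key)
            if self.overflow[r]:
                # bring a record from the overflow chain into the bucket
                self.main[r].append(self.overflow[r].pop(0))
            return True
        if key in self.overflow[r]:
            self.overflow[r].remove(key)
            return True
        return False


f = StaticHashFile(num_buckets=3, capacity=2)
for k in (3, 6, 9):
    f.insert(k)
print(f.main[0], f.overflow[0])
f.delete(3)
print(f.main[0], f.overflow[0])
```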
  • 15. Performance of Static Hashing. Static hashing: the hashing method discussed so far, in which the number of main buckets is fixed. Locating a record given the value of the hashing attribute most often takes one block access. Capacity of the hash file: C = r * M records (r: no. of records per bucket, M: no. of main buckets). Disadvantages of static hashing: if the actual number of records in the file is much less than C, disk space is wasted; if it is much more than C, long overflow chains degrade performance.
  • 16. Hashing for Dynamic File Organization. Dynamic files: files where record insertions and deletions take place frequently, so the file keeps growing and shrinking. Hashing for dynamic file organization: bucket numbers are integers, and the binary representation of bucket numbers is exploited cleverly to devise dynamic hashing schemes. Two schemes: extendible hashing and linear hashing.
  • 17. Extendible Hashing (1/2). The k-bit sequence corresponding to a record R: apply the hashing function to the value of the hashing field of R to get the bucket number r; convert r into its binary representation to get the bit sequence; take the trailing k bits.
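Taking the trailing k bits of a bucket number amounts to masking with 2^k − 1. The helper below is a small sketch with an illustrative name; it returns the bits as a string so they can be compared with directory entries such as '01'.

```python
def trailing_bits(r: int, k: int) -> str:
    """Return the trailing k bits of bucket number r as a bit string."""
    if k == 0:
        return ""
    masked = r & ((1 << k) - 1)   # keep only the lowest k bits
    return format(masked, "b").zfill(k)


# 45 is 101101 in binary.
print(trailing_bits(45, 2))
print(trailing_bits(45, 3))
```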
  • 18. Extendible Hashing (2/2). Global depth d: the number of trailing bits used in the directory (d = 3 in the figure, with directory entries 000, 001, 010, 011, 100, 101, 110, 111). Local depth of a bucket: the number of bits in the common suffix of the bit sequences corresponding to the records in the bucket. [Figure: directory with global depth 3 pointing to buckets of local depth 2 or 3, for example one bucket holding all records with 3-bit sequence ‘111’ and one holding all records with 2-bit sequence ‘01’] Locating a record: match the d-bit sequence of the record with an entry in the directory and go to the corresponding bucket to find the record.
  • 19. Insertion in Extendible Hashing Scheme (1/2). Setting: global depth d = 2 with directory entries 00, 01, 10, 11; bucket b0 has local depth 1 and buckets b1, b2 have local depth 2. The 2-bit sequence for the record to be inserted is 00, which leads to b0. b0 not full: the new record is placed in b0, with no changes in the directory. b0 full: bucket b0 is split; all records whose 2-bit sequence is ‘10’ are sent to a new bucket b3, the others are retained in b0, and the directory is modified so that entry 10 points to b3. After the split all local depths are 2.
  • 20. Insertion in Extendible Hashing Scheme (2/2). Setting: global depth d = 2, buckets b0, b1, b2, b3, all with local depth 2. The 2-bit sequence for the record to be inserted is 10, which leads to b3. b3 not full: the new record is placed in b3, with no changes. b3 full: b3 is split and the directory is doubled (d = 3, entries 000 to 111); all records with 3-bit sequence 110 are sent to a new bucket b4 and the others stay in b3. After the split b3 and b4 have local depth 3 and the remaining buckets keep local depth 2. In general, if the local depth of the bucket to be split is equal to the global depth, the directory is doubled.
  • 21. Deletion in Extendible Hashing Scheme. Matching pair of data buckets: two buckets whose k-bit sequences have a common (k-1)-bit suffix, e.g. b3 and b4 in the previous example. If, due to deletions, a pair of matching data buckets becomes less than half full, try to merge them into one bucket. If the local depth of all buckets is one less than the global depth, reduce the directory to half its size. [Figure: directory with d = 3 and buckets b0 to b4 shrinking back to a directory with d = 2 and buckets b0 to b3, all of local depth 2]
  • 22. Extendible Hashing Example. Bucket capacity: 2; initial buckets: 1. Keys in binary: 45 = 101101, 22 = 10110, 12 = 1100, 11 = 1011. Insert 45, 22: both go into the single bucket (global depth 0, local depth 0). Insert 12: the bucket overflows and local depth = global depth ⇒ the directory doubles (global depth 1) and a split image is created; entry 0 now leads to the bucket with 22, 12 and entry 1 to the bucket with 45, both of local depth 1. Insert 11: trailing bit 1, so it joins 45 in the bucket of entry 1.
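  • 23. Keys in binary: 45 = 101101, 22 = 10110, 12 = 1100, 11 = 1011, 15 = 1111, 10 = 1010. Insert 15: overflow occurs in the bucket holding 45, 11. Global depth = local depth, so the directory doubles (global depth 2, entries 00, 01, 10, 11) and a split occurs: entry 01 leads to the bucket with 45 and entry 11 to the bucket with 11, 15 (local depth 2 each), while entries 00 and 10 share the bucket with 22, 12 (local depth 1). Insert 10: overflow occurs in the bucket holding 22, 12. Since local depth < global depth, a split image is created and the directory is not doubled: entry 00 leads to the bucket with 12 and entry 10 to the bucket with 22, 10 (local depth 2 each).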
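The example on the last two slides can be replayed with a compact in-memory sketch of extendible hashing. It is an illustration under stated assumptions, not the lecturer's implementation: the key itself is used as the hash value (as the slides do by writing keys in binary), directory entry i is addressed by the trailing global-depth bits, only insertion is modelled, and the class names are invented. The sketch would loop if more identical hash values than the bucket capacity were inserted.

```python
class Bucket:
    def __init__(self, local_depth: int):
        self.local_depth = local_depth
        self.keys = []


class ExtendibleHashFile:
    def __init__(self, capacity: int = 2):
        self.capacity = capacity
        self.global_depth = 0
        self.directory = [Bucket(0)]  # 2**global_depth entries

    def _entry(self, key: int) -> int:
        # trailing global_depth bits of the (identity) hash value
        return key & ((1 << self.global_depth) - 1)

    def insert(self, key: int) -> None:
        bucket = self.directory[self._entry(key)]
        if len(bucket.keys) < self.capacity:
            bucket.keys.append(key)
            return
        if bucket.local_depth == self.global_depth:
            # double the directory: entries i and i + 2**d share a bucket
            self.directory = self.directory + self.directory
            self.global_depth += 1
        self._split(bucket)
        self.insert(key)  # retry after the split

    def _split(self, bucket: Bucket) -> None:
        bucket.local_depth += 1
        image = Bucket(bucket.local_depth)        # the split image
        new_bit = 1 << (bucket.local_depth - 1)   # bit that now distinguishes the two
        old_keys, bucket.keys = bucket.keys, []
        for k in old_keys:
            (image if k & new_bit else bucket).keys.append(k)
        for i, b in enumerate(self.directory):
            if b is bucket and i & new_bit:
                self.directory[i] = image


f = ExtendibleHashFile(capacity=2)
for key in (45, 22, 12, 11, 15, 10):
    f.insert(key)
for i, b in enumerate(f.directory):
    print(format(i, "02b"), b.keys, "local depth", b.local_depth)
```

Running the six insertions from the slides leaves a directory of global depth 2 whose entries 00, 01, 10, 11 lead to buckets holding 12; 45; 22 and 10; 11 and 15, which matches the final state described above.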
  • 24. Linear Hashing. Does not require a separate directory structure. Uses a family of hash functions $h_0, h_1, h_2, \ldots$ such that the range of $h_i$ is double the range of $h_{i-1}$: $h_i(x) = x \bmod 2^i M$, where M is the initial number of buckets (assume that the hashing field is an integer). Initial hash functions: $h_0(x) = x \bmod M$ and $h_1(x) = x \bmod 2M$.
  • 25. Insertion (1/3). Initially the structure has M main buckets (0, …, M-1) and a few overflow buckets. To insert a record with hash field value x, place the record in bucket h0(x). When the first overflow in any bucket occurs, say in bucket s: insert the record in the overflow chain of bucket s, create a new bucket M, and split bucket 0 by using h1. Some records stay in bucket 0 and some go to bucket M. [Figure: buckets 0, 1, 2, …, M-1, the new bucket M as the split image of bucket 0, and the overflow buckets]
  • 26. Insertion (2/3). On the first overflow, irrespective of where it occurs, bucket 0 is split. On subsequent overflows buckets 1, 2, 3, … are split in that order (this is why the scheme is called linear hashing). N: the next bucket to be split. After M overflows, all the original M buckets are split; we then switch to hash functions h1, h2 and set N = 0. In general, each time a round of splits completes, the pair of hash functions in use moves from (hi, hi+1) to (hi+1, hi+2). [Figure: buckets 0, 1, 2, …, M-1 followed by their split images M, M+1, …]
  • 27. Nature of Hash Functions. $h_i(x) = x \bmod 2^i M$. Let $M' = 2^i M$, the current number of original buckets. If $h_i(x) = k$, then $x = M'r + k$ with $k < M'$, and $h_{i+1}(x) = (M'r + k) \bmod 2M'$, which is either $k$ or $M' + k$. This is because, for even $r = 2s$, $(2sM' + k) \bmod 2M' = k$, and for odd $r = 2s + 1$, $(M'(2s + 1) + k) \bmod 2M' = M' + k$.
  • 28. Insertion (3/3). Say the hash functions in use are hi and hi+1. To insert a record with hash field value x, compute hi(x). If hi(x) < N, the original bucket has already been split, so place the record in bucket hi+1(x); otherwise place the record in bucket hi(x).
  • 29. Linear Hashing Example. Initial buckets = 1; bucket capacity = 2 records; split pointer N = 0; hash functions h0 = x mod 1, h1 = x mod 2. Insert 12, 11: both go to bucket 0. Insert 14: B0 overflows, so the bucket pointed to by N is split using h1, giving bucket 0 = {12, 14} and bucket 1 = {11}. All original buckets are now split, so the hash functions are changed to h0 = x mod 2, h1 = x mod 4 and N returns to 0.
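  • 30. Hash functions in use: h0 = x mod 2, h1 = x mod 4, with N = 0. Insert 13: goes to bucket 1, giving {11, 13}. Insert 9: B1 overflows (9 goes to its overflow chain); B0 is split using h1 and a split image is created, giving bucket 0 = {12} and bucket 2 = {14}; N = 1. Insert 10: h0(10) = 0 < N, so h1 is applied here and 10 goes to bucket 2, giving {14, 10}. Insert 18: overflow at B2 (18 goes to its overflow chain); B1 is split, giving bucket 1 = {9, 13} and bucket 3 = {11}. All original buckets are now split, so the hash functions become h0 = x mod 4, h1 = x mod 8 and N = 0. Final state: bucket 0 = {12}, bucket 1 = {9, 13}, bucket 2 = {14, 10} with 18 in its overflow chain, bucket 3 = {11}.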
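The linear hashing example can be replayed with a short simulation. This is a sketch under simplifying assumptions: each bucket is a single Python list that stands for the main bucket plus its overflow chain, a split is triggered whenever an insertion lands in a bucket that already holds `capacity` records, and the class and attribute names are invented for illustration.

```python
class LinearHashFile:
    def __init__(self, initial_buckets: int = 1, capacity: int = 2):
        self.M = initial_buckets
        self.capacity = capacity
        self.level = 0       # i: functions in use are h_i and h_{i+1}
        self.next_split = 0  # N: next bucket to be split
        self.buckets = [[] for _ in range(initial_buckets)]

    def _h(self, i: int, x: int) -> int:
        return x % ((2 ** i) * self.M)

    def address(self, x: int) -> int:
        b = self._h(self.level, x)
        if b < self.next_split:           # bucket b has already been split
            b = self._h(self.level + 1, x)
        return b

    def insert(self, x: int) -> None:
        b = self.address(x)
        overflow = len(self.buckets[b]) >= self.capacity
        self.buckets[b].append(x)         # beyond capacity = overflow chain
        if overflow:
            self._split_next()

    def _split_next(self) -> None:
        n = self.next_split
        old, self.buckets[n] = self.buckets[n], []
        self.buckets.append([])           # split image, index n + 2**level * M
        for x in old:
            self.buckets[self._h(self.level + 1, x)].append(x)
        self.next_split += 1
        if self.next_split == (2 ** self.level) * self.M:
            self.level += 1               # round complete: advance hash functions
            self.next_split = 0


f = LinearHashFile(initial_buckets=1, capacity=2)
for x in (12, 11, 14, 13, 9, 10, 18):
    f.insert(x)
print(f.buckets, "level", f.level, "N", f.next_split)
```

After the seven insertions the simulation ends with four buckets holding 12; 9 and 13; 14, 10 and 18 (the last one standing for the overflow chain); and 11, with the functions x mod 4 and x mod 8 in use and N = 0.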
  • 31. Index Structures. Index: a disk data structure that enables efficient retrieval of a record given the value(s) of certain attributes, called the indexing attributes. Primary index: an index built on the ordering key field of a file. Clustering index: an index built on an ordering non-key field of a file. Secondary index: an index built on any non-ordering field of a file.
  • 32. Primary Index. Can be built on ordered / sorted files. Index attribute: the ordering key field (OKF). Index entry: (value of the OKF for the first record of a block Bj, disk address of Bj). Index file: an ordered file (sorted on OKF) whose number of entries equals the number of blocks in the data file. Index file blocking factor: $BF_i = \lfloor B/(V + P) \rfloor$ (B: block size, V: OKF size, P: block pointer size), generally more than the data file blocking factor. Number of index file blocks: $b_i = \lceil b/BF_i \rceil$ (b: no. of data file blocks). [Figure: index entries 101, 121, 129, 240, … pointing to data file blocks ordered on RollNo, whose contents begin (101, 104), (121, 123), (129, 130), (240, 244), …]
  • 33. Record Access Using Primary Index. Given the ordering key field (OKF) value x, carry out binary search on the index file. Let m be the value of the OKF for the first entry in the middle block k of the index file. If x < m: do binary search on blocks 0 to (k − 1) of the index file. If x ≥ m: if x falls within the range of OKF values covered by block k, take the entry with the largest OKF value not exceeding x, use the corresponding block pointer to get the data file block, and search it for the data record with OKF value x; otherwise do binary search on blocks k + 1, …, bi of the index file. Maximum block accesses required for the index search: ⌈log2 bi⌉, plus one access for the data file block.
  • 34. An Example. Data file: no. of blocks b = 9500; block size B = 4 KB; OKF length V = 15 bytes; block pointer length P = 6 bytes. Index file: no. of records ri = 9500; size of entry V + P = 21 bytes; blocking factor BFi = ⌊4096/21⌋ = 195; no. of blocks bi = ⌈ri/BFi⌉ = 49. Max no. of block accesses for getting a record using the primary index: 1 + ⌈log2 bi⌉ = 7. Max no. of block accesses for getting a record without using the primary index: ⌈log2 b⌉ = 14.
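The arithmetic of this example can be reproduced with a few lines of code. The function name and return layout are illustrative; the formulas are the ones on the slides, with floors and ceilings made explicit.

```python
import math


def primary_index_cost(b: int, B: int, V: int, P: int):
    """Return (BFi, bi, max accesses with index, max accesses without index)."""
    bfi = B // (V + P)                           # index entries per block
    bi = math.ceil(b / bfi)                      # index file blocks
    with_index = 1 + math.ceil(math.log2(bi))    # index search + 1 data block
    without_index = math.ceil(math.log2(b))      # binary search on the data file
    return bfi, bi, with_index, without_index


print(primary_index_cost(b=9500, B=4096, V=15, P=6))
```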
  • 35. Making the Index Multi-level. The index file is itself an ordered file, so another level of index can be built on it. Multilevel index: successive levels of indices are built till the last level has one block. Height: the number of levels. Block accesses: height + 1 (no binary search required). For the example data file: the data file has 9500 blocks, the first level index has 9500 entries in 49 blocks, and the second level index has 49 entries in 1 block. No. of block accesses required with the multi-level primary index: 3; without any index: 14.
  • 36. Range Search, Insertion and Deletion. Range search on the ordering key field: to get records with OKF value between x1 and x2 (inclusive), use the index to locate the record with OKF value x1 and read succeeding records till the OKF value exceeds x2. This is very efficient. Insertion: in the data file, keep 25% of the space in each block free to take care of future insertions, so that the index does not get changed; alternatively use overflow chains for blocks that overflow. Deletion: handle using deletion markers so that the index does not get affected. Basically, avoid changes to the index.
  • 37. Clustering Index. Built on ordered files where the ordering field is not a key. Index attribute: the ordering field (OF). Index entry: (distinct value Vi of the OF, address of the first block that has a record with OF value Vi). Index file: an ordered file (sorted on OF) whose number of entries equals the number of distinct values of the OF.
  • 38. Secondary Index. Built on any non-ordering field (NOF) of a data file. Case I: the NOF is also a key (secondary key); index entry: (value of the NOF Vi, pointer to the record with Vi as the NOF value). Case II: the NOF is not a key; there are two options. Option (1): index entry (value of the NOF Vi, pointer(s) to the record(s) with Vi as the NOF value). Option (2): index entry (value of the NOF Vi, pointer to a block that has pointer(s) to the record(s) with Vi as the NOF value). Remarks: with option (1) the index entry is a variable length record; with option (2) the index entry is of fixed length, at the cost of one more level of indirection.
  • 39. Secondary Index (key). Can be built on ordered files and also on other types of files. Index attribute: a non-ordering key field. Index entry: (value of the NOF Vi, pointer to the record with Vi as the NOF value). Index file: an ordered file (sorted on NOF values) with the same number of entries as there are records in the data file. Index file blocking factor: BFi = ⌊B/(V + Pr)⌋ (B: block size, V: length of the NOF, Pr: length of a record pointer). Index file blocks = ⌈r/BFi⌉ (r: no. of records in the data file).
  • 40. An Example. Data file: no. of records r = 90,000; block size B = 4 KB; record length R = 100 bytes; BF = ⌊4096/100⌋ = 40; b = 90000/40 = 2250; NOF length V = 15 bytes; length of a record pointer Pr = 7 bytes. Index file: no. of records ri = 90,000; record length = V + Pr = 22 bytes; BFi = ⌊4096/22⌋ = 186; no. of blocks bi = ⌈90000/186⌉ = 484. Max no. of block accesses to get a record using the secondary index: 1 + ⌈log2 bi⌉ = 10. Avg no. of block accesses to get a record without using the secondary index (linear scan): b/2 = 1125. A very significant improvement.
  • 41. Multi-level Secondary Indexes. Secondary indexes can also be converted to multi-level indexes. The first level index has as many entries as there are records in the data file. Since the first level index is an ordered file, the number of entries in the second level index equals the number of blocks in the first level index rather than the number of records. The same holds for the higher levels.
  • 42. Making the Secondary Index Multi-level. Multilevel index: successive levels of indices are built till the last level has one block. Height: the number of levels. Block accesses: height + 1. For the example data file: the data file has 90000 records in 2250 blocks; the first level index has 90000 entries in 484 blocks; the second level index has 484 entries in 3 blocks; the third level index has 3 entries in 1 block. No. of block accesses required with the multi-level index: 4; with the single level index: 10.
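The level-by-level construction used in both multi-level examples can be sketched as a loop that keeps building an index over the previous level until a level fits in one block. The function name is illustrative, and the same blocking factor is assumed at every level, as in the slides' examples.

```python
import math


def index_levels(first_level_entries: int, blocking_factor: int):
    """Blocks per index level, from the first level up to the one-block top level."""
    levels = []
    entries = first_level_entries
    while True:
        blocks = math.ceil(entries / blocking_factor)
        levels.append(blocks)
        if blocks == 1:
            return levels
        entries = blocks  # the next level has one entry per block of this level


for name, entries, bfi in (("primary", 9500, 195), ("secondary", 90_000, 186)):
    levels = index_levels(entries, bfi)
    print(name, levels, "block accesses:", len(levels) + 1)
```

The number of block accesses is the height (number of levels) plus one for the data block, which gives 3 for the primary index example and 4 for the secondary index example.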
  • 43. Index Sequential Access Method (ISAM) Files. ISAM files: ordered files with a multilevel primary / clustering index. Insertions: handled using overflow chains at data file blocks. Deletions: handled using deletion markers. Most suitable for files that are relatively static. If the files are dynamic, we need to go for dynamic multi-level index structures based on B+ trees.
  • 44. B+ trees. Balanced search trees: all leaves are at the same level. Leaf node entries point to the actual data records, and all leaf nodes are linked up as a list. Internal node entries carry only index information. In B-trees, by contrast, internal nodes carry data records also, so the fan-out in B-trees is less. A B+ tree makes sure that blocks are always at least half filled, and it supports both random and sequential access of records.
  • 45. Order. Order (m) of an internal node: the maximum number of tree pointers held in it; a maximum of (m-1) keys can be present in an internal node. Order (mleaf) of a leaf node: the maximum number of record pointers held in it, which is equal to the maximum number of keys in a leaf node.
  • 46. Internal Nodes. An internal node of a B+ tree of order m contains at least $\lceil m/2 \rceil$ pointers, except when it is the root node, and at most m pointers. If it has pointers P1, P2, …, Pj with K1 < K2 < K3 … < Kj-1 as keys, where $\lceil m/2 \rceil \le j \le m$, then P1 points to the subtree with records having key value x ≤ K1; Pi (1 < i < j) points to the subtree with records having key value x such that Ki-1 < x ≤ Ki; and Pj points to the subtree with records having key value x > Kj-1.
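  • 47. Internal Node Structure. Layout: P1 K1 P2 K2 … Ki-1 Pi Ki … Kj-1 Pj, with $\lceil m/2 \rceil \le j \le m$. P1 leads to keys x ≤ K1, Pi leads to keys Ki-1 < x ≤ Ki, and Pj leads to keys x > Kj-1. Example: a node with keys 2, 5, 12 has four pointers, leading to x ≤ 2, 2 < x ≤ 5, 5 < x ≤ 12 and x > 12 respectively.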
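Choosing which pointer to follow inside an internal node is a search for the first key that is greater than or equal to x. A minimal sketch using Python's standard `bisect_left` follows; pointers are numbered from 0 here (so P1 is index 0), and the function name is illustrative.

```python
from bisect import bisect_left


def child_index(keys, x):
    """Index of the pointer to follow for search value x.

    Pointer 0 covers x <= keys[0], pointer i covers keys[i-1] < x <= keys[i],
    and the last pointer covers x > keys[-1].
    """
    return bisect_left(keys, x)


keys = [2, 5, 12]  # the example node from the slide
for x in (1, 2, 3, 5, 12, 13):
    print(x, "-> pointer", child_index(keys, x) + 1)
```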
  • 48. Leaf Node Structure. A leaf node of a B+ tree of order mleaf contains one block pointer P to point to the next leaf node, at least $\lceil m_{leaf}/2 \rceil$ record pointers and key values, and at most mleaf record pointers and key values. If a node has keys K1 < K2 < … < Kj with Pr1, Pr2, …, Prj as record pointers and P as block pointer, then Pri points to the record with Ki as the search field value, 1 ≤ i ≤ j, and P points to the next leaf block. Layout: K1 Pr1 K2 Pr2 … Kj Prj P.
  • 49. Order Calculation. Block size: B; size of indexing field: V; size of block pointer: P; size of record pointer: Pr. Order of internal node (m): as there can be at most m block pointers and (m-1) keys, (m * P) + ((m-1) * V) ≤ B; m is the largest integer satisfying this inequality. Order of leaf node (mleaf): as there can be at most mleaf record pointers and keys together with one block pointer in a leaf node, mleaf is the largest integer satisfying (mleaf * (Pr + V)) + P ≤ B.
  • 50. Example Order Calculation. Given B = 512 bytes, V = 8 bytes, P = 6 bytes, Pr = 7 bytes. Internal node order m: m * P + ((m-1) * V) ≤ B, so m * 6 + ((m-1) * 8) ≤ 512, so 14m ≤ 520, so m ≤ 37. Leaf order mleaf: mleaf (Pr + V) + P ≤ 512, so mleaf (7 + 8) + 6 ≤ 512, so 15 mleaf ≤ 506, so mleaf ≤ 33.
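Solving the two inequalities for the largest integer order reduces to integer division. The sketch below uses illustrative function names and the parameter values from this slide.

```python
def internal_order(B: int, V: int, P: int) -> int:
    """Largest m with m*P + (m-1)*V <= B, i.e. m*(P+V) <= B + V."""
    return (B + V) // (P + V)


def leaf_order(B: int, V: int, P: int, Pr: int) -> int:
    """Largest mleaf with mleaf*(Pr+V) + P <= B."""
    return (B - P) // (Pr + V)


m = internal_order(B=512, V=8, P=6)
mleaf = leaf_order(B=512, V=8, P=6, Pr=7)
print(m, mleaf)
```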
  • 51. Example B+ tree. m = 3, mleaf = 2. [Figure: example B+ tree of order m = 3, mleaf = 2 with linked leaf nodes]
  • 52. Insertion into B+ trees. Every key is inserted at the leaf level. If a leaf node overflows: the node is split at $j = \lceil (m_{leaf} + 1)/2 \rceil$; the first j entries are kept in the original node; entries from j+1 onwards are moved to a new node; the jth key value is replicated in the parent of the leaf. If an internal node overflows: the node is split at $j = \lceil (m + 1)/2 \rceil$; values and pointers up to Pj are kept in the original node; the jth key value is moved to the parent of the internal node; Pj+1 and the rest of the entries are moved to a new node.
  • 53. Example of Insertions. m = 3, mleaf = 2. Step 1, insert 20, 11: a single leaf holds 11, 20. Step 2, insert 14: overflow; the leaf is split at $j = \lceil (m_{leaf} + 1)/2 \rceil = 2$ into leaves (11, 14) and (20), and 14 is replicated to the upper level, giving root (14). Step 3, insert 25: inserted at leaf level, giving leaves (11, 14) and (20, 25). Step 4, insert 30: overflow; the leaf is split at 25 into (20, 25) and (30), and 25 is replicated in the parent, giving root (14, 25).
  • 54. Step 5, insert 12: overflow at leaf level. The leaf (11, 14) is split into (11, 12) and (14) and 12 is replicated in the parent, which triggers overflow at the internal node (12, 14, 25). The internal node is split at $j = \lceil m/2 \rceil = 2$, that is at 14, and 14 is moved up. Resulting tree: root (14); internal nodes (12) and (25); leaves (11, 12), (14), (20, 25), (30).
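The two split rules used in these insertion examples can be sketched as small helper functions. They are illustrations only: nodes are plain Python lists, the split position for internal nodes follows the ⌈(m + 1)/2⌉ rule from the insertion slide (which coincides with the position used in the example for m = 3), and the function names and pointer labels are invented.

```python
import math


def split_leaf(keys, mleaf):
    """Split an overflowing leaf holding mleaf + 1 sorted keys.

    Returns (left_keys, key_replicated_in_parent, right_keys).
    """
    j = math.ceil((mleaf + 1) / 2)
    left, right = keys[:j], keys[j:]
    return left, left[-1], right  # the j-th key is copied up, not removed


def split_internal(keys, pointers, m):
    """Split an overflowing internal node holding m keys and m + 1 pointers.

    Returns ((left_keys, left_ptrs), key_moved_to_parent, (right_keys, right_ptrs)).
    """
    j = math.ceil((m + 1) / 2)
    left = (keys[:j - 1], pointers[:j])   # keys K1..Kj-1, pointers P1..Pj
    right = (keys[j:], pointers[j:])      # keys Kj+1.., pointers Pj+1..
    return left, keys[j - 1], right       # the j-th key moves up


print(split_leaf([11, 14, 20], mleaf=2))
print(split_internal([12, 14, 25], ["p1", "p2", "p3", "p4"], m=3))
```

The first call mirrors the insertion of 14 (leaf split with 14 replicated upward), and the second mirrors the internal split caused by inserting 12 (14 moved up, leaving nodes with keys 12 and 25).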
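  • 55. Step 6, insert 22: the leaf (20, 25) overflows and is split into (20, 22) and (25), with 22 replicated in the parent. Resulting tree: root (14); internal nodes (12) and (22, 25); leaves (11, 12), (14), (20, 22), (25), (30). Step 7, insert 23, 24: 23 joins the leaf (25); inserting 24 overflows that leaf and then its parent. Resulting tree: root (14, 24); internal nodes (12), (22), (25); leaves (11, 12), (14), (20, 22), (23, 24), (25), (30).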
  • 56. Deletion in B+ trees. Delete the entry from the leaf node. If the entry is also present in an internal node, delete it there and replace it with the entry to its left in that position. If underflow occurs after deletion: redistribute entries from the left sibling; if that is not possible, redistribute entries from the right sibling; if that is not possible either, merge the node with a sibling (left or right).
  • 57. Example. Before: root (14, 24); internal nodes (12), (22), (25); leaves (11, 12), (14), (20, 22), (23, 24), (25), (30). Delete 20: the entry is removed from the leaf, which becomes (22); the rest of the tree is unchanged.
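  • 58. Delete 22: entry 22 is removed from the leaf and from the internal node, and entries from the right sibling are distributed to the left. Resulting tree: root (14, 24); internal nodes (12), (23), (25); leaves (11, 12), (14), (23), (24), (25), (30). Delete 24: resulting tree: root (14); internal nodes (12) and (23, 25); leaves (11, 12), (14), (23), (25), (30).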
  • 59. Delete 14: resulting tree: root (12); internal nodes (11) and (23, 25); leaves (11), (12), (23), (25), (30). Delete 12: a level drop has occurred; resulting tree: root (23, 25); leaves (11, 23), (25), (30).
  • 60. Advantages of B+ trees: 1) Any record can be fetched in an equal number of disk accesses. 2) Range queries can be performed easily as the leaves are linked up. 3) The height of the tree is small as only keys are used for indexing. 4) Both random and sequential access are supported. Disadvantages of B+ trees: insert and delete operations are complicated; the root node becomes a hotspot.