Searching and Hashing
• Hash Table
• Hash Functions
• Collision Resolution Strategies
• Hash Table Implementation
Hashing
• The search time of linear and binary search depends on the
number of elements.
• Hashing is a search technique whose search time, ideally,
does not depend on the number of elements.
• In hashing, the required record is located by applying a
function to its key. Search time is independent of the position
of the record in the file. The function used to locate the
record is called the hash function.
• A hash function h transforms a key K into a table index L at
which the record with key K is placed; h(K) is called the hash
of key K.
h : K → L, with h(K) = L
• Hash functions
• The main criteria for selecting a hash function are:
– it should be easy and quick to compute
– it should distribute keys evenly across the range of indices
– it should, as far as possible, produce distinct indices for distinct keys
• There are several basic methods that can be used to
build a hash function.
• Division
An integer key is divided by the table size and the
remainder is taken as the hash value.
Hash value = (key) mod (table_size)
or
Hash value = (key) mod (table_size) + 1
The second form starts the hash values at 1 instead of 0.
The best distribution is usually obtained when table_size is a
prime.
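The two division formulas above can be sketched in Python as follows. The function name `division_hash`, the `one_based` flag, and the table size of 13 are assumptions made for this illustration only.

```python
def division_hash(key, table_size, one_based=False):
    """Division method: the remainder of key / table_size is the index."""
    index = key % table_size
    # The one-based variant shifts the range from 0..size-1 to 1..size.
    return index + 1 if one_based else index


# table_size = 13 is an arbitrary prime chosen for this example.
print(division_hash(2345678, 13))        # 10
print(division_hash(2345678, 13, True))  # 11
```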
• Truncation
Part of the key is ignored and the remaining portion is
used as the index. The method, though simple, often fails to
give a uniform distribution.
Ex: Given a seven-digit key, the first, fourth and seventh
digits can form the hash value, so that the key
2345678 maps to 258.
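A minimal sketch of the truncation example, assuming the key is a positive integer with at least seven digits. The name `truncation_hash` and the `positions` parameter are hypothetical. The second call shows why the distribution can be poor: any key that agrees on the three selected digits collides.

```python
def truncation_hash(key, positions=(1, 4, 7)):
    """Keep only the digits at the given 1-based positions of the key."""
    digits = str(key)
    return int("".join(digits[p - 1] for p in positions))


print(truncation_hash(2345678))  # 258
print(truncation_hash(2995998))  # 258 as well: the ignored digits differ
```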
• Folding
The key is divided into several parts and the parts are
combined in a convenient way to get the index. Addition or
multiplication is often used to combine the parts.
This process, termed folding, makes use of all the
information in the key and hence can produce a better
distribution of the indices.
Ex: A seven-digit key can be divided into groups of
three, two and two digits. The groups are added and the
result is either used as the index directly or processed
further, as required.
2345678 maps to 234 + 56 + 78 = 368.
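The folding example can be sketched as below, assuming addition is the combining operation and the groups are taken from left to right. The name `folding_hash` and the `group_sizes` parameter are illustrative, not part of the original slides.

```python
def folding_hash(key, group_sizes=(3, 2, 2)):
    """Split the key's digits into groups and add the groups together."""
    digits = str(key)
    total, start = 0, 0
    for size in group_sizes:
        total += int(digits[start:start + size])
        start += size
    return total


print(folding_hash(2345678))  # 368
```

If 368 is larger than the table allows, one possible further step is to reduce it with the division method, for example `folding_hash(key) % table_size`.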
• Midsquare method
The key is multiplied by itself and the middle few digits of
the square are taken as the index. The number of middle digits
taken depends on the number of digits allowed in
the index. Since the middle digits of a square depend
on all the digits of the key, the chance of different keys
hashing to the same index is expected to be small.
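A minimal sketch of the midsquare method. The slides do not say exactly which digits count as "middle", so the centering rule used here (round down when the split is uneven) is an assumption, as are the name `midsquare_hash` and the default of three index digits.

```python
def midsquare_hash(key, index_digits=3):
    """Square the key and take index_digits digits from the middle."""
    square = str(key * key)
    # Center the window; never start before the first digit.
    start = max(0, (len(square) - index_digits) // 2)
    return int(square[start:start + index_digits])


# 1234 * 1234 = 1522756, whose middle three digits are 227.
print(midsquare_hash(1234))  # 227
```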
• Hash collision
• Given a set of keys k1, k2, ..., kn, a perfect hash function is
one for which the hash value of ki is not equal to the hash value
of kj for all distinct i and j.
• Sometimes two or more distinct keys give the same hash value.
This is called a hash collision or hash clash. It can be resolved
in several ways.
• Linear probing or linear open addressing
• The simplest method of resolving hash clashes is to search the table
sequentially for the desired key or an empty location. The search
starts from the location where the collision occurs. The colliding record is
placed in the next available space. The storage space is treated
as circular, so that when the last location is reached
the search continues from the first location. The method is called linear
probing because of the linear nature of the search.
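Linear probing as described above can be sketched as follows. This assumes integer keys, the division method as the hash function, `None` as the marker for an empty location, and no deletions. The function names are hypothetical.

```python
def linear_probe_insert(table, key):
    """Place key in the first free slot at or after its home location."""
    size = len(table)
    home = key % size
    for step in range(size):
        slot = (home + step) % size  # wrap around: the table is circular
        if table[slot] is None or table[slot] == key:
            table[slot] = key
            return slot
    raise OverflowError("hash table is full")


def linear_probe_search(table, key):
    """Return the slot holding key, or -1 if the key is absent."""
    size = len(table)
    home = key % size
    for step in range(size):
        slot = (home + step) % size
        if table[slot] is None:  # an empty slot ends the probe sequence
            return -1
        if table[slot] == key:
            return slot
    return -1


demo = [None] * 7
# 20, 13 and 27 all hash to location 6 in a table of size 7.
print([linear_probe_insert(demo, k) for k in (20, 13, 27)])  # [6, 0, 1]
```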
• Rehashing or double hashing
• In the method called rehashing, a secondary
hash function is applied when a collision occurs. The
hash value is used as input to the rehash
function and a new hash value is computed.
The rehash function is applied successively until
an unoccupied location is found.
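A rough sketch of rehashing as worded above: the current hash value is fed to a rehash function until a free location turns up. The rehash function `(index + 3) mod table_size` is purely an assumption for illustration; the slides do not specify one. In many textbook formulations of double hashing the step is instead derived from the key by a second hash function.

```python
def rehash(index, table_size):
    """Hypothetical secondary hash function, chosen only for this example."""
    return (index + 3) % table_size


def rehash_insert(table, key):
    """Apply rehash to the hash value until a free location is found."""
    size = len(table)
    index = key % size  # primary hash: division method
    for _ in range(size):
        if table[index] is None or table[index] == key:
            table[index] = key
            return index
        index = rehash(index, size)  # old hash value in, new one out
    raise OverflowError("no free location found")


demo = [None] * 7
# 10, 17 and 24 all have primary hash value 3 in a table of size 7.
print([rehash_insert(demo, k) for k in (10, 17, 24)])  # [3, 6, 2]
```

With a table size of 7 and a step of 3, the sequence visits every location before repeating; other choices may not, which is why the loop is bounded.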
[Figure: worked collision-resolution example. Surviving annotations: "Put on 3rd pos. from 5th pos." and "Put on 5th pos. from 6th pos."]
• Quadratic probing
• This approach tries to correct the clustering problem
of linear probing by introducing a quadratic
increment function. Probing is done at the locations
given by
(Hash value + j²) mod (table_size), with j = 1, 2, 3, ...
• Quadratic probing reduces clustering considerably,
but not all locations are probed by this method.
When table_size is a prime, almost half of the
locations are probed. But if table_size is a power
of two, relatively few locations are probed.
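The coverage claim can be checked with a short sketch. The function names and the table sizes 11 and 16 are assumptions chosen for illustration. The first function lists the probe locations for a given home location; the second lists the distinct offsets j² mod table_size that the formula can ever reach.

```python
def quadratic_probe_sequence(home, table_size, attempts):
    """Locations (home + j*j) mod table_size for j = 1 .. attempts."""
    return [(home + j * j) % table_size for j in range(1, attempts + 1)]


def quadratic_probe_offsets(table_size):
    """Distinct values of j*j mod table_size for j = 1 .. table_size - 1."""
    return sorted({(j * j) % table_size for j in range(1, table_size)})


print(quadratic_probe_sequence(3, 11, 4))  # [4, 7, 1, 8]
print(quadratic_probe_offsets(11))         # [1, 3, 4, 5, 9]
print(quadratic_probe_offsets(16))         # [0, 1, 4, 9]
```

For the prime size 11, five of the eleven locations are reachable, which is close to half. For the power of two 16, only four of sixteen are reachable, and one of them is the home location itself.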
• Hashing with buckets
• In this approach multiple keys are hashed to a single
location. Each location has several slots and can therefore
hold more than one key. Such a multi-key location is called a
bucket. A bucket can hold entries only up to its fixed
capacity. When a bucket is full, further
collisions have to be handled again by another method.
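A minimal sketch of hashing with buckets. The slides do not say how a full bucket is handled, so this example assumes overflow moves on to the next bucket, in the manner of linear probing. The name `bucket_insert` and the capacity of 2 are also assumptions.

```python
def bucket_insert(table, key, capacity=2):
    """Insert key into its home bucket, or the next bucket with room."""
    size = len(table)
    home = key % size
    for step in range(size):
        index = (home + step) % size
        bucket = table[index]
        if key in bucket:
            return index
        if len(bucket) < capacity:
            bucket.append(key)
            return index
    raise OverflowError("all buckets are full")


demo = [[] for _ in range(5)]
# 5, 10 and 15 all hash to bucket 0; the third one overflows to bucket 1.
print([bucket_insert(demo, k) for k in (5, 10, 15)])  # [0, 0, 1]
```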
• Chaining
• In chaining, a linked list is built of all items
whose keys hash to the same value. During a
search, the hash function is first applied to the key and then
the linked list, called the chain, is searched sequentially for
the target key. In this technique an extra link field is
added to each table position.
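Chaining can be sketched with a small singly linked list per table position. The class and method names are hypothetical, keys are assumed to be integers hashed by the division method, and new nodes are placed at the head of the chain as one possible design choice.

```python
class Node:
    """One record in a chain; `next` is the extra link field."""

    def __init__(self, key, record, next_node=None):
        self.key, self.record, self.next = key, record, next_node


class ChainedHashTable:
    def __init__(self, table_size=7):
        self.heads = [None] * table_size  # one chain head per position

    def insert(self, key, record):
        slot = key % len(self.heads)
        node = self.heads[slot]
        while node is not None:  # update in place if the key exists
            if node.key == key:
                node.record = record
                return
            node = node.next
        self.heads[slot] = Node(key, record, self.heads[slot])

    def search(self, key):
        node = self.heads[key % len(self.heads)]
        while node is not None:  # sequential search of one chain only
            if node.key == key:
                return node.record
            node = node.next
        return None

    def delete(self, key):
        slot = key % len(self.heads)
        prev, node = None, self.heads[slot]
        while node is not None:
            if node.key == key:  # unlink the node from its chain
                if prev is None:
                    self.heads[slot] = node.next
                else:
                    prev.next = node.next
                return True
            prev, node = node, node.next
        return False
```

Deleting a record only relinks one chain, so no other record has to move.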
• Chaining has several advantages.
• Considerable space is saved when the records are large.
A hash table is an array, and if the array space is allocated
at compile time, a considerable amount of space is
wasted when some array elements are unoccupied. As the
space required for a pointer is small, little space is wasted
even if a table position remains empty.
• Adding a link to each record and organizing all the records
with the same hash address as a linked list handles collisions.
A good hash function gives short linked lists, enabling quick
searches. Clustering is prevented, as keys with distinct hash
addresses go to different lists.
• The average length of the linked lists remains small, so the
sequential search of the lists stays efficient.
• Deletion is easy and quick in a chained hash table.
• The chained hash table method also has disadvantages.
• When the records are small, the space used for
links becomes considerable in comparison with
the space required for storing the records.
• When the hash table is small, there are more
collisions, making some of the chains long. This
slows down searching.
• However, a good hash function minimizes
collisions and spreads the records uniformly
throughout the file. The larger the range of the hash
function, the lower the chance of hash clashes. This
involves a trade-off between time and space.
• Hashing facilitates direct access to a table. For
this reason the scheme is often preferable to other
search techniques. The biggest drawback of the
scheme is that the records in a hash table are
not stored in the sorted order of their keys.
• A poorly chosen hash function does not minimize hash
collisions, so a record can no longer be accessed directly
from its key, which defeats the basic purpose of hashing.
• In terms of speed, hash methods compare
favourably with other search methods when the
file is large.
