SlideShare a Scribd company logo
HASHING
• How to store Employee records, which consists of Telephone no.,
Name, City, Department, Salary, etc…
Telephone Name City Dept. Salary
2
Solution
• Use an Array:
• Search takes a linear time.
• If stored in sorted order this search can be done in O(log n) using
binary search but then insertion and deletion becomes costly because
we have to maintain the sorted order.
3
Solution
• Use an Linked List:
• Search takes again linear time to search for particular record as se need to go
through all the records until the required one is found.
• Use balanced BST to store records:
• Insertion takes O(log n).
• Search takes O(log n).
• Deletion takes O(log n).
4
Solution
• Create a Direct Access Table:
• Insertion takes O(1).
• Search takes O(1).
• Deletion takes O(1).
5
• Limitations:
• 1] Size of Table: m * 10^n
• Where, m = size of pointer to the record
n = digits in the telephone number
Telephone Memory Location
(Pointer)
…………………….
9856234178 0x1…
8695234512 0x2…
…………………….
Name City Dept …….
Tejas Pune IT
Name City Dept …….
Rahul Mumbai HR
• Limitations:
• 2] Integer may not hold the
size of n digits
6
Introduction to Hashing
• Provides O(1) time on average for insert, search and delete.
• Find items with keys matching a given search key
• Given an array A, containing n keys, and a search key x, find the
index i such as x=A[i]
• Example of Record
KEY Other Data
7
Continue..
• Hashing is an important Data Structure which is designed to use a
special function called the Hash function.
• Hash function is used to map a given value with a particular key for
faster access of elements.
• The efficiency of mapping depends of the efficiency of the hash
function used.
• Some examples of how hashing is used in our lives include:
• In universities, each student is assigned a unique roll number that can be
used to retrieve information about them.
• In libraries, each book is assigned a unique number that can be used to
determine information about the book, such as its exact position in the
library or the users it has been issued to etc.
8
Continue..
• Assume that you have an object and you want to assign a key to it
to make searching easy.
• To store the key/value pair, you can use a simple array like a data
structure where keys (integers) can be used directly as an index to
store values.
• However, in cases where the keys are large and cannot be used
directly as an index, you should use hashing.
9
Continue..
• In hashing, large keys are converted into small keys by using hash
functions.
• The values are then stored in a data structure called hash table.
• The idea of hashing is to distribute entries (key/value pairs)
uniformly across an array.
• Each element is assigned a key (converted key).
• By using that key you can access the element in O(1) time. Using
the key, the algorithm (hash function) computes an index that
suggests where an entry can be found or inserted.
10
Important Keywords:
• Hash Table:
• A data structure used for storing and retrieving data quickly.
Where each entry in hash table is made using hash function.
• Hash Function:
• It maps a big number or string into a small integer that can be
used as index in hash table.
• Used to place data in hash table and also to retrieve data
from hash table.
• A function that can be used to map a data set of an arbitrary
size to a data set of a fixed size, which falls into the hash
table.
• The values returned by a hash function are called hash values,
hash codes, hash sums, or simply hashes.
• Bucket:
• Each position of hash table is called bucket.
• Collision:
• It is a situation in which hash function returns the same
address for more than one record.
0 25
1 Bucket 16
2 42
3 33
4 54
Given: 25, 33, 54, 16, 42 and table
of size 5.
Hash Function is KEY mod 5
Key mod 5
25 mod 5 = 0
33 mod 5 = 3
54 mod 5 = 4
16 mod 5 = 1
42 mod 5 = 2
30 mod 5 = 0
This is collision.
Hash Table
11
Important Keywords:
• Probe:
• Each calculation of an address and test for success
is known as probe.
• Synonym:
• The set of keys that has the same location are
called synonyms.
• Ex: 25 and 30 are the synonyms.
• Overflow:
• When hash table becomes full and new record
needs to be inserted and resulting into more
collisions the it is called overflow.
• Perfect hash function:
• It is a function that maps distinct key elements into
the hash table with no collisions.
0 25
1 16
2 42
3 33
4 54
Entry of 25, 33, 54, 16, 42 in a table
of size 5 with probe 5
30
Where 30 mod 5 = 0
is a collision, synonym for 25 and it
also shows the overflow situation
Hash Table
Consider 25, 33, 54, 16, 42, 30.
Show Perfect Hash Function
12
• To achieve a good hashing mechanism, It is important to
have a good hash function with the following basic
requirements:
• Easy to compute
• Uniform distribution
• Less Collision
13
Important Keywords:
14
Types of Hash Functions
• Division Method
• Multiplication Method
• Extraction Method
• Mid Square Method
• Folding and Universal Method
15
Division Method
• It depends upon the remainder of division.
• Typically the divisor is table length.
• Ex: 54, 72, 89, 37 is to be placed in hash table & if table size 10.
0
1
2
3
4
5
6
7
8
9
Hash Function
h(key) = record % table_size
54 % 10 = 4
72 % 10 = 2
89 % 10 = 9
37 % 10 = 7
54
72
89
37
Disadvantage:
The consecutive keys map to
consecutive hash values in the
hash table. This leads to a poor
performance.
16
Multiplication Method
17
Extraction Method
• Some digits are extracted from the key to form the address location in
hash table.
• Ex:
Suppose 1st, 3rd and 4th digits from left are selected for hash key.
4 9 7 8 2 4
478
At 478 location in the
hash table of size 1000
the key can be stored.
18
Mid Square Method
• Square the key.
• Then extract middle part of the result.
• That will be the location of key element in hash table.
• Ex:
A] 3111 = Square(3111) B] 16 = Square(16)
9678321
At 783 location in the
hash table of size 1000
the key can be stored.
256
At 56 location in
the hash table.
19
Folding Method
• Two folding techniques:
• A] Fold Shift:
The key is divided into separate
parts based on size of required address.
Finally add those parts.
• Ex: 345678123
• And required address is of 3 digits.
• B] Fold Boundary:
The key is divided into separate
parts based on size of required address.
Then left & right parts are folded and
then add those parts.
• Ex: 345678123
345678123
345
678
123
-------
1146
345678123
543
678
321
-------
1542
20
Universal Hashing
• In this technique randomized algorithm is used to select the hash
function at random from family of hash functions with the help of
certain mathematical properties.
• Let U is the set of Universal Keys and H be the finite collection of Hash
functions mapping U into {0, 1, …….., m-1}.
• Say there is collision or your hash function perform badly for input
key or a pair (x, y) i.e. { (x,y) | h(x)=h(y)}.
U
0
.
.
m-1
Bad portion
of key
21
Universal Hashing
• The fundamental weakness of hash functions is collision so to avoid
that what we do we can choose the Universal hashing where the hash
function randomly at the runtime.
• Lets consider H = set of all hash functions h from U to {0,1,…,m-1}.
{ h | h : U  (0,1,…,m-1) }
H
Bad portion
22
Universal Hashing
23
24
Properties of Good Hash Functions
• Rules for choosing good hash function:
1. The hash function should be simple to compute.
2. Number of collisions should be less ideally no collision should occur (perfect
hash function).
3. Hash function should produce such keys which will get distributed uniformly
over an array.
4. The hash function should depend on every bit of the key and not on the
extracted portion of the key.
25
Collision Resolution Strategies
26
Separate chaining (open hashing)
• The open hashing is also known as separate chaining in which
collisions are stored outside the table.
• Separate chaining is one of the most commonly used collision
resolution techniques.
• It is usually implemented using linked lists. In separate chaining, each
element of the hash table is a linked list.
• To store an element in the hash table you must insert it into a specific
linked list.
• If there is any collision (i.e. two different elements have same hash
value) then store both the elements in the same linked list.
27
Input:
81, 0, 64, 25, 1, 36, 4, 49, 16, 9.
28
Advantages:
1. Simple to implement.
2. Hash table never fills up, we can always add more elements to chain.
3. Less sensitive to the hash function.
4. It is mostly used when it is unknown how many and how frequently
keys may be inserted or deleted
Disadvantages:
1. Wastage of space.
2. Cache performance of chaining is not good as keys are stored using
linked list.
3. If the chain becomes long then search time can become O(n) in worst
case.
4. Uses extra space for links.
29
Open Addressing
• A] Linear Probing with chaining (without
replacement):
• A separate chain table is maintained for
colliding data.
• The address of colliding data is stored with
first colliding element in chain table.
• We can place element at next available
bucket or slot and will keep track of that in
chain column.
Ex: 131, 3, 4, 21, 61, 6, 71, 8, 9.
Index Data Chain
0 -1
1 -1
2 -1
3 -1
4 -1
5 -1
6 -1
7 -1
8 -1
9 -1
131
3
4
21 mod 10 = 1 Collision
21
2
61 mod 10 = 1 Collision
61
5
6
71
7
8
9
71 mod 10 = 1 Collision
-1
Index Data Chain
0 -1 -1
1 131 2
2 21 5
3 3 -1
4 4 -1
5 61 7
6 6 -1
7 71 -1
8 8 -1
9 9 -1
• Drawback: Finding the next empty location and that results into the element actually belonging to
that empty location cannot obtain its location.
30
Linear Probing with chaining (without replacement)
Ex: Consider elements as 21, 31, 41, 2, 6, 5, 7, 55 with
table size of 10.
Index Data Chain
0 -1
1 -1
2 -1
3 -1
4 -1
5 -1
6 -1
7 -1
8 -1
9 -1
21
41
2
31
2
5
3
6
-1
Index Data Chain
0 -1 -1
1 21 2
2 31 3
3 41 4
4 2 -1
5 5 8
6 6 -1
7 7 -1
8 55 -1
9 -1 -1
7
55
-1
4
8
31
B]Linear Probing with chaining (with replacement)
• Ex: 131, 21, 31, 4, 5, 2.
Index Data Chain
0 -1
1 -1
2 -1
3 -1
4 -1
5 -1
6 -1
7 -1
8 -1
9 -1
131
31
4
21 mod 10 = 1 Collision
21
2
31 mod 10 = 1 Collision
5
3
2 mod 10 = 2
Index Data Chain
0 -1
1 131 2
2
3 31 -1
4 4 -1
5 5 -1
6
7 -1
8 -1
9 -1
Replace 21 by 2 and
accordingly chain table will
be updated.
2 21 3
21 3
-1
6
Index Data Chain
0 -1 -1
1 131 6
2 2 -1
3 31 -1
4 4 -1
5 5 -1
6 21 3
7 -1 -1
8 -1 -1
9 -1 -1
32
B]Linear Probing with chaining (with replacement)
Ex: 131, 3, 4, 21, 61, 6, 71, 16, 85.
Index Data Chain
0 -1
1 131 2
2 21 5
3 3 -1
4 4 -1
5 61 7
6 6 -1
7 71 -1
8 -1
9 -1
16 mod 10 = 6 85 mod 10 = 5
Index Data Chain
0 -1
1 131 2
2 21 5
3 3 -1
4 4 -1
5 61 7
6 6 8
7 71 -1
8 16 -1
9 -1
Index Data Chain
0 -1
1 131 2
2 21 5
3 3 -1
4 4 -1
5 85 -1
6 6 8
7 71 -1
8 16 -1
9 61 7
131, 3, 4, 21, 61, 6, 71
33
Quadratic Probing
• Problem of clustering
• Accumulation of Consecutive cells n hash table
• When we want to add element we have to go for
many probes means need to search for more
elements to put element in table.
Index Data
0 79
1 41
2 22
3 32
4 44
5 33
6
7
8 18
9 59
• Solution: Quadratic Probing
34
Clustering Example
• Ex: 2, 22, 12, 42, 32
Index Data Chain
0 -1
1 -1
2 2 3
3 22 4
4 12 5
5 42 6
6 32 -1
7 -1
8 -1
9 -1
Index Data
0
1
2 2
3
4
5
6
7
8
9
[Hash(key)+i^2] mod M
(2 + 0^2)mod 10 = 2
(2 + 1^2)mod 10 = 3
22
12
(2 + 2^2)mod 10 = 6
(2 + 3^2)mod 10 = 11  1
42
(2 + 4^2)mod 10 = 18  8
32
35
Quadratic Probing
• Example: 37, 90, 55, 22, 11, 17, 49, 87 with m = 10. Index Data
0 90
1 11
2 22
3
4
5 55
6
7 37
8
9
17
49
87
36
Clustering Example
• Ex: 2, 12, 22, 32, 42, 52, 62
Index Data
0
1 32
2 2
3 12
4
5
6 22
7 52
8 42
9
[Hash(key)+i^2] mod M
(2 + 6^2)mod 10 = 38  8
(2 + 4^2)mod 10 = 18  8
(2 + 7^2)mod 10 = 51  1
(2 + 8^2)mod 10 = 66  6
(2 + 9^2)mod 10 = 83  3
(2 + 10^2)mod 10 = 102  2
This is known as secondary
clustering.
Here we are unable to add
an element in quadrating
probing because the index
calculated is never empty.
37
Double Hashing
• Second hash function is applied to the key when collision occurs.
38
Example
K H1 H2
18
41
22
44
5 7 - (18 % 7) = 7 - 4 = 3
2 1
9 6
5 5
39
Example
K H1 H2
18 5 3
41 2 1
22 9 6
44 5 5
Index Data
0
1
2
3
4
5
6
7
8
9
10
11
12
18
For 41 = 2 + 0 * 1 = 2
41
For 22 = 9 + 0 * 6 = 9 22
For 44 = 5 + 0 * 5 = 5
For 44 = 5 + 1 * 5 = 10
42
40
Home Work
41
Ad

More Related Content

What's hot (20)

Huffman coding
Huffman coding Huffman coding
Huffman coding
Nazmul Hyder
 
Scan line method
Scan line methodScan line method
Scan line method
Pooja Dixit
 
Hashing In Data Structure
Hashing In Data Structure Hashing In Data Structure
Hashing In Data Structure
Meghaj Mallick
 
8 QUEENS PROBLEM.pptx
8 QUEENS PROBLEM.pptx8 QUEENS PROBLEM.pptx
8 QUEENS PROBLEM.pptx
sunidhi740916
 
Topological Sorting
Topological SortingTopological Sorting
Topological Sorting
ShahDhruv21
 
Shuffle exchange networks
Shuffle exchange networksShuffle exchange networks
Shuffle exchange networks
Lahiru Danushka
 
Searching and Sorting Techniques in Data Structure
Searching and Sorting Techniques in Data StructureSearching and Sorting Techniques in Data Structure
Searching and Sorting Techniques in Data Structure
Balwant Gorad
 
AVL Tree Data Structure
AVL Tree Data StructureAVL Tree Data Structure
AVL Tree Data Structure
Afaq Mansoor Khan
 
Dynamic storage allocation techniques
Dynamic storage allocation techniquesDynamic storage allocation techniques
Dynamic storage allocation techniques
Shashwat Shriparv
 
Demand paging
Demand pagingDemand paging
Demand paging
Trinity Dwarka
 
Segmentation in Operating Systems.
Segmentation in Operating Systems.Segmentation in Operating Systems.
Segmentation in Operating Systems.
Muhammad SiRaj Munir
 
Data Structures : hashing (1)
Data Structures : hashing (1)Data Structures : hashing (1)
Data Structures : hashing (1)
Home
 
Hash table in data structure and algorithm
Hash table in data structure and algorithmHash table in data structure and algorithm
Hash table in data structure and algorithm
Aamir Sohail
 
Multi ways trees
Multi ways treesMulti ways trees
Multi ways trees
SHEETAL WAGHMARE
 
Directed Acyclic Graph Representation of basic blocks
Directed Acyclic Graph Representation of basic blocksDirected Acyclic Graph Representation of basic blocks
Directed Acyclic Graph Representation of basic blocks
Mohammad Vaseem Akaram
 
0/1 knapsack
0/1 knapsack0/1 knapsack
0/1 knapsack
Amin Omi
 
Threaded Binary Tree.pptx
Threaded Binary Tree.pptxThreaded Binary Tree.pptx
Threaded Binary Tree.pptx
MOULIMANDALMCA
 
SCHEDULING DAGs WITHOUT CONSIDERING COMMUNICATION
SCHEDULING DAGs WITHOUT CONSIDERING COMMUNICATIONSCHEDULING DAGs WITHOUT CONSIDERING COMMUNICATION
SCHEDULING DAGs WITHOUT CONSIDERING COMMUNICATION
University of Technology - Iraq
 
Circle drawing algo.
Circle drawing algo.Circle drawing algo.
Circle drawing algo.
Mohd Arif
 
Lec 17 heap data structure
Lec 17 heap data structureLec 17 heap data structure
Lec 17 heap data structure
Sajid Marwat
 
Scan line method
Scan line methodScan line method
Scan line method
Pooja Dixit
 
Hashing In Data Structure
Hashing In Data Structure Hashing In Data Structure
Hashing In Data Structure
Meghaj Mallick
 
8 QUEENS PROBLEM.pptx
8 QUEENS PROBLEM.pptx8 QUEENS PROBLEM.pptx
8 QUEENS PROBLEM.pptx
sunidhi740916
 
Topological Sorting
Topological SortingTopological Sorting
Topological Sorting
ShahDhruv21
 
Shuffle exchange networks
Shuffle exchange networksShuffle exchange networks
Shuffle exchange networks
Lahiru Danushka
 
Searching and Sorting Techniques in Data Structure
Searching and Sorting Techniques in Data StructureSearching and Sorting Techniques in Data Structure
Searching and Sorting Techniques in Data Structure
Balwant Gorad
 
Dynamic storage allocation techniques
Dynamic storage allocation techniquesDynamic storage allocation techniques
Dynamic storage allocation techniques
Shashwat Shriparv
 
Segmentation in Operating Systems.
Segmentation in Operating Systems.Segmentation in Operating Systems.
Segmentation in Operating Systems.
Muhammad SiRaj Munir
 
Data Structures : hashing (1)
Data Structures : hashing (1)Data Structures : hashing (1)
Data Structures : hashing (1)
Home
 
Hash table in data structure and algorithm
Hash table in data structure and algorithmHash table in data structure and algorithm
Hash table in data structure and algorithm
Aamir Sohail
 
Directed Acyclic Graph Representation of basic blocks
Directed Acyclic Graph Representation of basic blocksDirected Acyclic Graph Representation of basic blocks
Directed Acyclic Graph Representation of basic blocks
Mohammad Vaseem Akaram
 
0/1 knapsack
0/1 knapsack0/1 knapsack
0/1 knapsack
Amin Omi
 
Threaded Binary Tree.pptx
Threaded Binary Tree.pptxThreaded Binary Tree.pptx
Threaded Binary Tree.pptx
MOULIMANDALMCA
 
Circle drawing algo.
Circle drawing algo.Circle drawing algo.
Circle drawing algo.
Mohd Arif
 
Lec 17 heap data structure
Lec 17 heap data structureLec 17 heap data structure
Lec 17 heap data structure
Sajid Marwat
 

Similar to Hashing techniques, Hashing function,Collision detection techniques (20)

Data Structures-Topic-Hashing, Collision
Data Structures-Topic-Hashing, CollisionData Structures-Topic-Hashing, Collision
Data Structures-Topic-Hashing, Collision
sailaja156145
 
Hash tables
Hash tablesHash tables
Hash tables
International Islamic University
 
LECT 10, 11-DSALGO(Hashing).pdf
LECT 10, 11-DSALGO(Hashing).pdfLECT 10, 11-DSALGO(Hashing).pdf
LECT 10, 11-DSALGO(Hashing).pdf
MuhammadUmerIhtisham
 
Lec12-Hash-Tables-27122022-125641pm.pptx
Lec12-Hash-Tables-27122022-125641pm.pptxLec12-Hash-Tables-27122022-125641pm.pptx
Lec12-Hash-Tables-27122022-125641pm.pptx
IqraHanif27
 
Working with python Nice PPT must try very good
Working with python Nice PPT must try very goodWorking with python Nice PPT must try very good
Working with python Nice PPT must try very good
MuhammadChala
 
hashing in data structures and its applications
hashing in data structures and its applicationshashing in data structures and its applications
hashing in data structures and its applications
manjeshbngowda
 
Hashing .pptx
Hashing .pptxHashing .pptx
Hashing .pptx
ParagAhir1
 
HASHING IS NOT YASH IT IS HASH.pptx
HASHING IS NOT YASH IT IS HASH.pptxHASHING IS NOT YASH IT IS HASH.pptx
HASHING IS NOT YASH IT IS HASH.pptx
JITTAYASHWANTHREDDY
 
Unit viii searching and hashing
Unit   viii searching and hashing Unit   viii searching and hashing
Unit viii searching and hashing
Tribhuvan University
 
Hashing
HashingHashing
Hashing
Dawood Faheem Abbasi
 
Hashing and Collision Advanced data structure and algorithm
Hashing and Collision Advanced data structure and algorithmHashing and Collision Advanced data structure and algorithm
Hashing and Collision Advanced data structure and algorithm
yogoso2948
 
Hashing Techniques in Data Strucures and Algorithm
Hashing Techniques in Data Strucures and AlgorithmHashing Techniques in Data Strucures and Algorithm
Hashing Techniques in Data Strucures and Algorithm
BipinNaik9
 
hashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdf
hashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdfhashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdf
hashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdf
timoemin50
 
Hashing And Hashing Tables
Hashing And Hashing TablesHashing And Hashing Tables
Hashing And Hashing Tables
Chinmaya M. N
 
Hashing
HashingHashing
Hashing
Amar Jukuntla
 
hashing in data structure for Btech.pptx
hashing in data structure for Btech.pptxhashing in data structure for Btech.pptx
hashing in data structure for Btech.pptx
soniasharmafdp
 
hashing in data structure for Btech .pptx
hashing in data structure for Btech .pptxhashing in data structure for Btech .pptx
hashing in data structure for Btech .pptx
soniasharmafdp
 
hashing in data structure for engineering.pptx
hashing in data structure for engineering.pptxhashing in data structure for engineering.pptx
hashing in data structure for engineering.pptx
soniasharmafdp
 
Unit 8 searching and hashing
Unit   8 searching and hashingUnit   8 searching and hashing
Unit 8 searching and hashing
Dabbal Singh Mahara
 
4.4 hashing
4.4 hashing4.4 hashing
4.4 hashing
Krish_ver2
 
Data Structures-Topic-Hashing, Collision
Data Structures-Topic-Hashing, CollisionData Structures-Topic-Hashing, Collision
Data Structures-Topic-Hashing, Collision
sailaja156145
 
Lec12-Hash-Tables-27122022-125641pm.pptx
Lec12-Hash-Tables-27122022-125641pm.pptxLec12-Hash-Tables-27122022-125641pm.pptx
Lec12-Hash-Tables-27122022-125641pm.pptx
IqraHanif27
 
Working with python Nice PPT must try very good
Working with python Nice PPT must try very goodWorking with python Nice PPT must try very good
Working with python Nice PPT must try very good
MuhammadChala
 
hashing in data structures and its applications
hashing in data structures and its applicationshashing in data structures and its applications
hashing in data structures and its applications
manjeshbngowda
 
HASHING IS NOT YASH IT IS HASH.pptx
HASHING IS NOT YASH IT IS HASH.pptxHASHING IS NOT YASH IT IS HASH.pptx
HASHING IS NOT YASH IT IS HASH.pptx
JITTAYASHWANTHREDDY
 
Hashing and Collision Advanced data structure and algorithm
Hashing and Collision Advanced data structure and algorithmHashing and Collision Advanced data structure and algorithm
Hashing and Collision Advanced data structure and algorithm
yogoso2948
 
Hashing Techniques in Data Strucures and Algorithm
Hashing Techniques in Data Strucures and AlgorithmHashing Techniques in Data Strucures and Algorithm
Hashing Techniques in Data Strucures and Algorithm
BipinNaik9
 
hashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdf
hashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdfhashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdf
hashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdf
timoemin50
 
Hashing And Hashing Tables
Hashing And Hashing TablesHashing And Hashing Tables
Hashing And Hashing Tables
Chinmaya M. N
 
hashing in data structure for Btech.pptx
hashing in data structure for Btech.pptxhashing in data structure for Btech.pptx
hashing in data structure for Btech.pptx
soniasharmafdp
 
hashing in data structure for Btech .pptx
hashing in data structure for Btech .pptxhashing in data structure for Btech .pptx
hashing in data structure for Btech .pptx
soniasharmafdp
 
hashing in data structure for engineering.pptx
hashing in data structure for engineering.pptxhashing in data structure for engineering.pptx
hashing in data structure for engineering.pptx
soniasharmafdp
 
Ad

Recently uploaded (20)

Introduction to Python and its basics.pdf
Introduction to Python and its basics.pdfIntroduction to Python and its basics.pdf
Introduction to Python and its basics.pdf
sumitt6_25730773
 
Jeff Menashe - A Dedicated Senior Software Engineer
Jeff Menashe - A Dedicated Senior Software EngineerJeff Menashe - A Dedicated Senior Software Engineer
Jeff Menashe - A Dedicated Senior Software Engineer
Jeff Menashe
 
1.10 Functions in C++,call by value .pdf
1.10 Functions in C++,call by value .pdf1.10 Functions in C++,call by value .pdf
1.10 Functions in C++,call by value .pdf
VikasNirgude2
 
Espresso PD Official MP_eng Version.pptx
Espresso PD Official MP_eng Version.pptxEspresso PD Official MP_eng Version.pptx
Espresso PD Official MP_eng Version.pptx
NingChacha1
 
Health & Safety .........................
Health & Safety .........................Health & Safety .........................
Health & Safety .........................
shadyozq9
 
860556374-10280271.pptx PETROLEUM COKE CALCINATION PLANT
860556374-10280271.pptx PETROLEUM COKE CALCINATION PLANT860556374-10280271.pptx PETROLEUM COKE CALCINATION PLANT
860556374-10280271.pptx PETROLEUM COKE CALCINATION PLANT
Pierre Celestin Eyock
 
Full document for AI powered resume Analyzer
Full document for AI powered resume AnalyzerFull document for AI powered resume Analyzer
Full document for AI powered resume Analyzer
4213SWARNABCSE
 
HSE Induction for heat stress work .pptx
HSE Induction for heat stress work .pptxHSE Induction for heat stress work .pptx
HSE Induction for heat stress work .pptx
agraahmed
 
Observability and Instrumentation via OpenTelemetry.pptx
Observability and Instrumentation via OpenTelemetry.pptxObservability and Instrumentation via OpenTelemetry.pptx
Observability and Instrumentation via OpenTelemetry.pptx
grahnkarljohan
 
Unleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptx
Unleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptxUnleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptx
Unleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptx
SanjeetMishra29
 
698642933-DdocfordownloadEEP-FAKE-PPT.pptx
698642933-DdocfordownloadEEP-FAKE-PPT.pptx698642933-DdocfordownloadEEP-FAKE-PPT.pptx
698642933-DdocfordownloadEEP-FAKE-PPT.pptx
speedcomcyber25
 
🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...
🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...
🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...
SanjeetMishra29
 
Full_Cybersecurity_Project_Report_30_Pages.pdf
Full_Cybersecurity_Project_Report_30_Pages.pdfFull_Cybersecurity_Project_Report_30_Pages.pdf
Full_Cybersecurity_Project_Report_30_Pages.pdf
Arun446808
 
Supplier_PFMEA_Workshop_rev 22_04_27.pptx
Supplier_PFMEA_Workshop_rev 22_04_27.pptxSupplier_PFMEA_Workshop_rev 22_04_27.pptx
Supplier_PFMEA_Workshop_rev 22_04_27.pptx
dariojaen1977
 
introduction to Rapid Tooling and Additive Manufacturing Applications
introduction to Rapid Tooling and Additive Manufacturing Applicationsintroduction to Rapid Tooling and Additive Manufacturing Applications
introduction to Rapid Tooling and Additive Manufacturing Applications
vijimech408
 
digital computing plotform synopsis.pptx
digital computing plotform synopsis.pptxdigital computing plotform synopsis.pptx
digital computing plotform synopsis.pptx
ssuser2b4c6e1
 
DeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFT
DeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFTDeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFT
DeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFT
Kyohei Ito
 
MS Project - Pelaksanaan Proyek Fisik 2020
MS Project - Pelaksanaan Proyek Fisik 2020MS Project - Pelaksanaan Proyek Fisik 2020
MS Project - Pelaksanaan Proyek Fisik 2020
Bagus ardian
 
Hostelmanagementsystemprojectreport..pdf
Hostelmanagementsystemprojectreport..pdfHostelmanagementsystemprojectreport..pdf
Hostelmanagementsystemprojectreport..pdf
RajChouhan43
 
Environment .................................
Environment .................................Environment .................................
Environment .................................
shadyozq9
 
Introduction to Python and its basics.pdf
Introduction to Python and its basics.pdfIntroduction to Python and its basics.pdf
Introduction to Python and its basics.pdf
sumitt6_25730773
 
Jeff Menashe - A Dedicated Senior Software Engineer
Jeff Menashe - A Dedicated Senior Software EngineerJeff Menashe - A Dedicated Senior Software Engineer
Jeff Menashe - A Dedicated Senior Software Engineer
Jeff Menashe
 
1.10 Functions in C++,call by value .pdf
1.10 Functions in C++,call by value .pdf1.10 Functions in C++,call by value .pdf
1.10 Functions in C++,call by value .pdf
VikasNirgude2
 
Espresso PD Official MP_eng Version.pptx
Espresso PD Official MP_eng Version.pptxEspresso PD Official MP_eng Version.pptx
Espresso PD Official MP_eng Version.pptx
NingChacha1
 
Health & Safety .........................
Health & Safety .........................Health & Safety .........................
Health & Safety .........................
shadyozq9
 
860556374-10280271.pptx PETROLEUM COKE CALCINATION PLANT
860556374-10280271.pptx PETROLEUM COKE CALCINATION PLANT860556374-10280271.pptx PETROLEUM COKE CALCINATION PLANT
860556374-10280271.pptx PETROLEUM COKE CALCINATION PLANT
Pierre Celestin Eyock
 
Full document for AI powered resume Analyzer
Full document for AI powered resume AnalyzerFull document for AI powered resume Analyzer
Full document for AI powered resume Analyzer
4213SWARNABCSE
 
HSE Induction for heat stress work .pptx
HSE Induction for heat stress work .pptxHSE Induction for heat stress work .pptx
HSE Induction for heat stress work .pptx
agraahmed
 
Observability and Instrumentation via OpenTelemetry.pptx
Observability and Instrumentation via OpenTelemetry.pptxObservability and Instrumentation via OpenTelemetry.pptx
Observability and Instrumentation via OpenTelemetry.pptx
grahnkarljohan
 
Unleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptx
Unleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptxUnleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptx
Unleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptx
SanjeetMishra29
 
698642933-DdocfordownloadEEP-FAKE-PPT.pptx
698642933-DdocfordownloadEEP-FAKE-PPT.pptx698642933-DdocfordownloadEEP-FAKE-PPT.pptx
698642933-DdocfordownloadEEP-FAKE-PPT.pptx
speedcomcyber25
 
🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...
🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...
🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...
SanjeetMishra29
 
Full_Cybersecurity_Project_Report_30_Pages.pdf
Full_Cybersecurity_Project_Report_30_Pages.pdfFull_Cybersecurity_Project_Report_30_Pages.pdf
Full_Cybersecurity_Project_Report_30_Pages.pdf
Arun446808
 
Supplier_PFMEA_Workshop_rev 22_04_27.pptx
Supplier_PFMEA_Workshop_rev 22_04_27.pptxSupplier_PFMEA_Workshop_rev 22_04_27.pptx
Supplier_PFMEA_Workshop_rev 22_04_27.pptx
dariojaen1977
 
introduction to Rapid Tooling and Additive Manufacturing Applications
introduction to Rapid Tooling and Additive Manufacturing Applicationsintroduction to Rapid Tooling and Additive Manufacturing Applications
introduction to Rapid Tooling and Additive Manufacturing Applications
vijimech408
 
digital computing plotform synopsis.pptx
digital computing plotform synopsis.pptxdigital computing plotform synopsis.pptx
digital computing plotform synopsis.pptx
ssuser2b4c6e1
 
DeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFT
DeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFTDeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFT
DeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFT
Kyohei Ito
 
MS Project - Pelaksanaan Proyek Fisik 2020
MS Project - Pelaksanaan Proyek Fisik 2020MS Project - Pelaksanaan Proyek Fisik 2020
MS Project - Pelaksanaan Proyek Fisik 2020
Bagus ardian
 
Hostelmanagementsystemprojectreport..pdf
Hostelmanagementsystemprojectreport..pdfHostelmanagementsystemprojectreport..pdf
Hostelmanagementsystemprojectreport..pdf
RajChouhan43
 
Environment .................................
Environment .................................Environment .................................
Environment .................................
shadyozq9
 
Ad

Hashing techniques, Hashing function,Collision detection techniques

  • 2. • How to store Employee records, which consists of Telephone no., Name, City, Department, Salary, etc… Telephone Name City Dept. Salary 2
  • 3. Solution • Use an Array: • Search takes a linear time. • If stored in sorted order this search can be done in O(log n) using binary search but then insertion and deletion becomes costly because we have to maintain the sorted order. 3
  • 4. Solution • Use an Linked List: • Search takes again linear time to search for particular record as se need to go through all the records until the required one is found. • Use balanced BST to store records: • Insertion takes O(log n). • Search takes O(log n). • Deletion takes O(log n). 4
  • 5. Solution • Create a Direct Access Table: • Insertion takes O(1). • Search takes O(1). • Deletion takes O(1). 5
  • 6. • Limitations: • 1] Size of Table: m * 10^n • Where, m = size of pointer to the record n = digits in the telephone number Telephone Memory Location (Pointer) ……………………. 9856234178 0x1… 8695234512 0x2… ……………………. Name City Dept ……. Tejas Pune IT Name City Dept ……. Rahul Mumbai HR • Limitations: • 2] Integer may not hold the size of n digits 6
  • 7. Introduction to Hashing • Provides O(1) time on average for insert, search and delete. • Find items with keys matching a given search key • Given an array A, containing n keys, and a search key x, find the index i such as x=A[i] • Example of Record KEY Other Data 7
  • 8. Continue.. • Hashing is an important Data Structure which is designed to use a special function called the Hash function. • Hash function is used to map a given value with a particular key for faster access of elements. • The efficiency of mapping depends of the efficiency of the hash function used. • Some examples of how hashing is used in our lives include: • In universities, each student is assigned a unique roll number that can be used to retrieve information about them. • In libraries, each book is assigned a unique number that can be used to determine information about the book, such as its exact position in the library or the users it has been issued to etc. 8
  • 9. Continue.. • Assume that you have an object and you want to assign a key to it to make searching easy. • To store the key/value pair, you can use a simple array like a data structure where keys (integers) can be used directly as an index to store values. • However, in cases where the keys are large and cannot be used directly as an index, you should use hashing. 9
  • 10. Continue.. • In hashing, large keys are converted into small keys by using hash functions. • The values are then stored in a data structure called hash table. • The idea of hashing is to distribute entries (key/value pairs) uniformly across an array. • Each element is assigned a key (converted key). • By using that key you can access the element in O(1) time. Using the key, the algorithm (hash function) computes an index that suggests where an entry can be found or inserted. 10
  • 11. Important Keywords: • Hash Table: • A data structure used for storing and retrieving data quickly. Where each entry in hash table is made using hash function. • Hash Function: • It maps a big number or string into a small integer that can be used as index in hash table. • Used to place data in hash table and also to retrieve data from hash table. • A function that can be used to map a data set of an arbitrary size to a data set of a fixed size, which falls into the hash table. • The values returned by a hash function are called hash values, hash codes, hash sums, or simply hashes. • Bucket: • Each position of hash table is called bucket. • Collision: • It is a situation in which hash function returns the same address for more than one record. 0 25 1 Bucket 16 2 42 3 33 4 54 Given: 25, 33, 54, 16, 42 and table of size 5. Hash Function is KEY mod 5 Key mod 5 25 mod 5 = 0 33 mod 5 = 3 54 mod 5 = 4 16 mod 5 = 1 42 mod 5 = 2 30 mod 5 = 0 This is collision. Hash Table 11
  • 12. Important Keywords: • Probe: • Each calculation of an address and test for success is known as probe. • Synonym: • The set of keys that has the same location are called synonyms. • Ex: 25 and 30 are the synonyms. • Overflow: • When hash table becomes full and new record needs to be inserted and resulting into more collisions the it is called overflow. • Perfect hash function: • It is a function that maps distinct key elements into the hash table with no collisions. 0 25 1 16 2 42 3 33 4 54 Entry of 25, 33, 54, 16, 42 in a table of size 5 with probe 5 30 Where 30 mod 5 = 0 is a collision, synonym for 25 and it also shows the overflow situation Hash Table Consider 25, 33, 54, 16, 42, 30. Show Perfect Hash Function 12
  • 13. • To achieve a good hashing mechanism, It is important to have a good hash function with the following basic requirements: • Easy to compute • Uniform distribution • Less Collision 13
  • 15. Types of Hash Functions • Division Method • Multiplication Method • Extraction Method • Mid Square Method • Folding and Universal Method 15
  • 16. Division Method • It depends upon the remainder of division. • Typically the divisor is table length. • Ex: 54, 72, 89, 37 is to be placed in hash table & if table size 10. 0 1 2 3 4 5 6 7 8 9 Hash Function h(key) = record % table_size 54 % 10 = 4 72 % 10 = 2 89 % 10 = 9 37 % 10 = 7 54 72 89 37 Disadvantage: The consecutive keys map to consecutive hash values in the hash table. This leads to a poor performance. 16
  • 18. Extraction Method • Some digits are extracted from the key to form the address location in hash table. • Ex: Suppose 1st, 3rd and 4th digits from left are selected for hash key. 4 9 7 8 2 4 478 At 478 location in the hash table of size 1000 the key can be stored. 18
  • 19. Mid Square Method • Square the key. • Then extract middle part of the result. • That will be the location of key element in hash table. • Ex: A] 3111 = Square(3111) B] 16 = Square(16) 9678321 At 783 location in the hash table of size 1000 the key can be stored. 256 At 56 location in the hash table. 19
  • 20. Folding Method • Two folding techniques: • A] Fold Shift: The key is divided into separate parts based on size of required address. Finally add those parts. • Ex: 345678123 • And required address is of 3 digits. • B] Fold Boundary: The key is divided into separate parts based on size of required address. Then left & right parts are folded and then add those parts. • Ex: 345678123 345678123 345 678 123 ------- 1146 345678123 543 678 321 ------- 1542 20
  • 21. Universal Hashing • In this technique randomized algorithm is used to select the hash function at random from family of hash functions with the help of certain mathematical properties. • Let U is the set of Universal Keys and H be the finite collection of Hash functions mapping U into {0, 1, …….., m-1}. • Say there is collision or your hash function perform badly for input key or a pair (x, y) i.e. { (x,y) | h(x)=h(y)}. U 0 . . m-1 Bad portion of key 21
  • 22. Universal Hashing • The fundamental weakness of hash functions is collision so to avoid that what we do we can choose the Universal hashing where the hash function randomly at the runtime. • Lets consider H = set of all hash functions h from U to {0,1,…,m-1}. { h | h : U  (0,1,…,m-1) } H Bad portion 22
  • 24. 24
  • 25. Properties of Good Hash Functions • Rules for choosing good hash function: 1. The hash function should be simple to compute. 2. Number of collisions should be less ideally no collision should occur (perfect hash function). 3. Hash function should produce such keys which will get distributed uniformly over an array. 4. The hash function should depend on every bit of the key and not on the extracted portion of the key. 25
  • 27. Separate chaining (open hashing) • The open hashing is also known as separate chaining in which collisions are stored outside the table. • Separate chaining is one of the most commonly used collision resolution techniques. • It is usually implemented using linked lists. In separate chaining, each element of the hash table is a linked list. • To store an element in the hash table you must insert it into a specific linked list. • If there is any collision (i.e. two different elements have same hash value) then store both the elements in the same linked list. 27
  • 28. Input: 81, 0, 64, 25, 1, 36, 4, 49, 16, 9. 28
  • 29. Advantages: 1. Simple to implement. 2. Hash table never fills up, we can always add more elements to chain. 3. Less sensitive to the hash function. 4. It is mostly used when it is unknown how many and how frequently keys may be inserted or deleted Disadvantages: 1. Wastage of space. 2. Cache performance of chaining is not good as keys are stored using linked list. 3. If the chain becomes long then search time can become O(n) in worst case. 4. Uses extra space for links. 29
  • 30. Open Addressing • A] Linear Probing with chaining (without replacement): • A separate chain table is maintained for colliding data. • The address of colliding data is stored with first colliding element in chain table. • We can place element at next available bucket or slot and will keep track of that in chain column. Ex: 131, 3, 4, 21, 61, 6, 71, 8, 9. Index Data Chain 0 -1 1 -1 2 -1 3 -1 4 -1 5 -1 6 -1 7 -1 8 -1 9 -1 131 3 4 21 mod 10 = 1 Collision 21 2 61 mod 10 = 1 Collision 61 5 6 71 7 8 9 71 mod 10 = 1 Collision -1 Index Data Chain 0 -1 -1 1 131 2 2 21 5 3 3 -1 4 4 -1 5 61 7 6 6 -1 7 71 -1 8 8 -1 9 9 -1 • Drawback: Finding the next empty location and that results into the element actually belonging to that empty location cannot obtain its location. 30
  • 31. Linear Probing with chaining (without replacement) Ex: Consider elements as 21, 31, 41, 2, 6, 5, 7, 55 with table size of 10. Index Data Chain 0 -1 1 -1 2 -1 3 -1 4 -1 5 -1 6 -1 7 -1 8 -1 9 -1 21 41 2 31 2 5 3 6 -1 Index Data Chain 0 -1 -1 1 21 2 2 31 3 3 41 4 4 2 -1 5 5 8 6 6 -1 7 7 -1 8 55 -1 9 -1 -1 7 55 -1 4 8 31
  • 32. B]Linear Probing with chaining (with replacement) • Ex: 131, 21, 31, 4, 5, 2. Index Data Chain 0 -1 1 -1 2 -1 3 -1 4 -1 5 -1 6 -1 7 -1 8 -1 9 -1 131 31 4 21 mod 10 = 1 Collision 21 2 31 mod 10 = 1 Collision 5 3 2 mod 10 = 2 Index Data Chain 0 -1 1 131 2 2 3 31 -1 4 4 -1 5 5 -1 6 7 -1 8 -1 9 -1 Replace 21 by 2 and accordingly chain table will be updated. 2 21 3 21 3 -1 6 Index Data Chain 0 -1 -1 1 131 6 2 2 -1 3 31 -1 4 4 -1 5 5 -1 6 21 3 7 -1 -1 8 -1 -1 9 -1 -1 32
  • 33. B]Linear Probing with chaining (with replacement) Ex: 131, 3, 4, 21, 61, 6, 71, 16, 85. Index Data Chain 0 -1 1 131 2 2 21 5 3 3 -1 4 4 -1 5 61 7 6 6 -1 7 71 -1 8 -1 9 -1 16 mod 10 = 6 85 mod 10 = 5 Index Data Chain 0 -1 1 131 2 2 21 5 3 3 -1 4 4 -1 5 61 7 6 6 8 7 71 -1 8 16 -1 9 -1 Index Data Chain 0 -1 1 131 2 2 21 5 3 3 -1 4 4 -1 5 85 -1 6 6 8 7 71 -1 8 16 -1 9 61 7 131, 3, 4, 21, 61, 6, 71 33
  • 34. Quadratic Probing • Problem of clustering • Accumulation of Consecutive cells n hash table • When we want to add element we have to go for many probes means need to search for more elements to put element in table. Index Data 0 79 1 41 2 22 3 32 4 44 5 33 6 7 8 18 9 59 • Solution: Quadratic Probing 34
  • 35. Clustering Example • Ex: 2, 22, 12, 42, 32 Index Data Chain 0 -1 1 -1 2 2 3 3 22 4 4 12 5 5 42 6 6 32 -1 7 -1 8 -1 9 -1 Index Data 0 1 2 2 3 4 5 6 7 8 9 [Hash(key)+i^2] mod M (2 + 0^2)mod 10 = 2 (2 + 1^2)mod 10 = 3 22 12 (2 + 2^2)mod 10 = 6 (2 + 3^2)mod 10 = 11  1 42 (2 + 4^2)mod 10 = 18  8 32 35
  • 36. Quadratic Probing • Example: 37, 90, 55, 22, 11, 17, 49, 87 with m = 10. Index Data 0 90 1 11 2 22 3 4 5 55 6 7 37 8 9 17 49 87 36
  • 37. Clustering Example • Ex: 2, 12, 22, 32, 42, 52, 62 Index Data 0 1 32 2 2 3 12 4 5 6 22 7 52 8 42 9 [Hash(key)+i^2] mod M (2 + 6^2)mod 10 = 38  8 (2 + 4^2)mod 10 = 18  8 (2 + 7^2)mod 10 = 51  1 (2 + 8^2)mod 10 = 66  6 (2 + 9^2)mod 10 = 83  3 (2 + 10^2)mod 10 = 102  2 This is known as secondary clustering. Here we are unable to add an element in quadrating probing because the index calculated is never empty. 37
  • 38. Double Hashing • Second hash function is applied to the key when collision occurs. 38
  • 39. Example K H1 H2 18 41 22 44 5 7 - (18 % 7) = 7 - 4 = 3 2 1 9 6 5 5 39
  • 40. Example K H1 H2 18 5 3 41 2 1 22 9 6 44 5 5 Index Data 0 1 2 3 4 5 6 7 8 9 10 11 12 18 For 41 = 2 + 0 * 1 = 2 41 For 22 = 9 + 0 * 6 = 9 22 For 44 = 5 + 0 * 5 = 5 For 44 = 5 + 1 * 5 = 10 42 40

Editor's Notes

  • #27: Collision resolution is the most important issue in hash table implementations.
  翻译: