SlideShare a Scribd company logo
Introduction
• Hashing is an important data structure designed to solve the problem of
efficiently finding and storing data in an array. For example, if you
have a list of 20000 numbers, and you have given a number to search
in that list- you will scan each number in the list until you find a match.
• Hashing is mainly used to overcome the drawbacks of linear sets
&binary sets.
• O(n)-time complexity of linear search
• O(logn)-time complexity for binary search
• O(1)-hashing
What is hashing?
• Hashing is a popular technique in computer science that involves mapping large data sets
to fixed-length values. It is a process of converting a data set of variable size into a data
set of a fixed size.
• It is also known as the message digest function.
• It is a technique that uniquely identifies a specific item from a collection of similar items.
• Hashing is a technique in which we can perform searching operation in a constant
computing time.
Static hashing
1.Hash tables
2.Hash functions
Hash tables
• Hash table is data structure(array) which contains fixed cells where
each cell contains key & value.
• It can be defined as a bucket where the data are stored in an
array format. These data have their own index value. If the
index values are known then the process of accessing the
data is quicker.
Hash function
• The mathematical function to be applied on keys to obtain
indexes for their corresponding values into the Hash Table.
• Hash Function can be defined as an algorithm or a function
that is used to map or convert data of bigger size or length
to a fixed or small index or hash value
• The hashing techniques in the data structure are very
interesting, such as:
• hash = hashfunc(key)
• index = hash % array_size
Hash functions
1.Division Method
2.Mid Square Method
3.Folding Method(pairing)
4.Multiplication Method
Division Method
• By performing module operation we have to insert record into a hash
table.
• Say that we have a Hash Table of size 'S', and we want to
store a (key, value) pair in the Hash Table. The Hash
Function, according to the Division method, would
be:H(key) = key mod M
• Here M is an integer value used for calculating the Hash
value, and M should be greater than S. Sometimes, S is used
as M.
Size of the Hash Table = 5 (M, S)
Key: Value pairs: {10: "Sudha", 11: "Venkat", 12: "Jeevani"}
For every pair:
{10: "Sudha"}
Key mod M = 10 mod 5 = 0
{11: "Venkat"}
Key mod M = 11 mod 5 = 1
{12: "Jeevani"}
Key mod M = 12 mod 5 = 2
Mid square method
• It is a two-step process of computing the Hash value. Given a {key:
value} pair, the Hash Function would be calculated by:
1.Square the key -> key * key
2.Choose some digits from the middle of the number to obtain the Hash
value.
• Suppose the size of the Hash Table is 10 and the key: value pairs are:
{10: "Sudha, 11: "Venkat", 12: "Jeevani"}
Number of digits to be selected: Indexes: (0 - 9), so 1
H(10) = 10 * 10 = 100 = 0
H(11) = 11 * 11 = 121 = 2
3. Folding Method
• Given a {key: value} pair and the table size is 100 (0 - 99
indexes), the key is broken down into 2 segments each
except the last segment. The last segment can have less
number of digits. Now, the Hash Function would be:
• H(x) = (sum of equal-
sized segments) mod (size of the Hash Table)
• For suppose "k" is a 10-digit key and the size of the table is
100(0 - 99), k is divided into:
sum = (k1k2) + (k3k4) + (k5k6) + (k7k8) + (k9k10)
Now, H(x) = sum % 100
Let us now take an example:
The {key: value} pairs: {1234: "Sudha", 5678: "Venkat"}
Size of the table: 100 (0 - 99)
For {1234: "Sudha"}:
1234 = 12 + 34 = 46
46 % 100 = 46
For {5678: "Venkat"}:
5678 = 56 + 78 = 134
134 % 100 = 34
Multiplication method
1.We must choose a constant between 0 and 1, say, A.
2.Multiply the key with the chosen A.
3.Now, take the fractional part from the product and multiply it by the
table size.
4.The Hash will be the floor (only the integer part) of the above result.
• So, the Hash Function under this method will be:
H(x) = floor(size(key*A mod 1))
For example:
{Key: value} pairs: {1234: "Sudha", 5678: "Venkat"}
Size of the table: 100
A = 0.56
For {1234: "Sudha"}:
H(1234) = floor(size(1234*0.56 mod 1))
= floor(100 * 0.04)
= floor(4) = 4
For {5678: "Venkat"}:
H(5678) = floor(size(5678*0.56 mod 1))
= floor(99 * 0.68)
= floor(67.32)
= 67
Collision in hashing
1.in this, the hash function is used to compute the index of the array.
2.The hash value is used to store the key in the hash table, as an index.
3.The hash function can return the same hash value for two or more keys.
4.When two or more keys are given the same hash value, it is called
a collision. To handle this collision, we use collision resolution techniques.
Collision resolution techniques
1.Separate chaining (open hashing)
2.Open addressing (closed hashing)->
1.Linear probing
2.Quadratic probing
3.Double hashing
Separate chaining technique
• Means maintaining linked lists for storing records
• In separate chaining technique, each bucket in hash table is associated with
a linked list or some other data structure that can store multiple elements.
• When a collision occurs, the colliding elements are added to the linked list
associated with that bucket.
• The key and its corresponding value are stored as a node in list
Separate chaining technique
Advantages
• There is no problem in overflow & collision.
Disadvantages
It requires more memory in order to store the data
Open Addressing
• In open addressing, when a collision occurs, the hash table’s slots
themselves are used to store the colliding elements. If a slot is already
occupied, the algorithm probes or searches for the next available slot
in a predetermined manner.
Linear probing
• In linear probing, if a collision occurs at a specific slot, the algorithm sequentially
searches for the next available (empty) slot by incrementing the index one by one
until it finds an empty slot.
• - The probing sequence is defined by the linear function:
• ‘hash(key) + i’
• where ‘hash(key)’ is the original hash value and ‘I’ is the probe number.
Drawbacks of linear probing
1.The main problem is primary clustering(most of the records stored in
a single cluster).
2.It takes too much time to find an empty slot.
Quadratic probing
• In this, when the collision occurs, we probe for i2th
slot in ith
iteration,
and this probing is performed until an empty slot is found.
• The cache performance in quadratic probing is lower than the linear
probing. Quadratic probing also reduces the problem of clustering.
• Drawbacks
Secondary clustering-whenever the half of the hash table is full it is
very difficult to find the location of the new element.
Double hashing
• H(K)=K%10
• H2(x) = P - (x%P), where P is a prime number smaller than N.
• It takes longer to determine two hash functions. The double probing
gives the very poor the cache performance, but there has no clustering
problem in it.
Dynamic hashing
• The drawback of static hashing is that it does not expand or shrink
dynamically as the size of the database grow or shrink.
• In dynamic hashing data buckets grow or shrinks(added or removed
dynamically) as records increases or decreases.
• Dynamic hashing is also called extended hashing.
Dynamic hashing/extended hashing
• It is a dynamic hashing method where in directories &buckets are used to
hash the data.
• It is aggressively flexible method in which the hash function also
experiences dynamic change.
• Directories store addresses of the buckets in pointer and id assigned to
each directory. Which may change each time when directory expansion
takes place.
• Buckets are used to hash the actual data.
hashing in data structures and its applications
hashing in data structures and its applications
Frequently used terms in Extendible Hashing:
• Directories: These containers store pointers to buckets. Each directory is given a unique id which may change
each time when expansion takes place.
• Buckets: They store the hashed keys. Directories point to buckets. A bucket may contain more than one pointers
to it if its local depth is less than the global depth.
• Global Depth: It is associated with the Directories. They denote the number of bits which are used by the hash
function to categorize the keys. Global Depth = Number of bits in directory id.
• Local Depth: It is the same as that of Global Depth except for the fact that Local Depth is associated with the
buckets and not the directories. Local depth in accordance with the global depth is used to decide the action that
to be performed in case an overflow occurs. Local Depth is always less than or equal to the Global Depth.
Frequently used terms in Extendible Hashing:
• Bucket Splitting: When the number of elements in a bucket exceeds a particular size, then the
bucket is split into two parts.
• Directory Expansion: Directory Expansion Takes place when a bucket overflows. Directory
Expansion is performed when the local depth of the overflowing bucket is equal to the global depth.
hashing in data structures and its applications
Ad

More Related Content

Similar to hashing in data structures and its applications (20)

hashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdf
hashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdfhashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdf
hashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdf
timoemin50
 
HASHING IS NOT YASH IT IS HASH.pptx
HASHING IS NOT YASH IT IS HASH.pptxHASHING IS NOT YASH IT IS HASH.pptx
HASHING IS NOT YASH IT IS HASH.pptx
JITTAYASHWANTHREDDY
 
Hashing in Data Structure and analysis of Algorithms
Hashing in Data Structure and analysis of AlgorithmsHashing in Data Structure and analysis of Algorithms
Hashing in Data Structure and analysis of Algorithms
KavitaSingh962656
 
LECT 10, 11-DSALGO(Hashing).pdf
LECT 10, 11-DSALGO(Hashing).pdfLECT 10, 11-DSALGO(Hashing).pdf
LECT 10, 11-DSALGO(Hashing).pdf
MuhammadUmerIhtisham
 
Hashing
HashingHashing
Hashing
Dawood Faheem Abbasi
 
Hash in datastructures by using the c language.pptx
Hash in datastructures by using the c language.pptxHash in datastructures by using the c language.pptx
Hash in datastructures by using the c language.pptx
my6305874
 
Hash based inventory system
Hash based inventory systemHash based inventory system
Hash based inventory system
DADITIRUMALATARUN
 
Hashing and Collision Advanced data structure and algorithm
Hashing and Collision Advanced data structure and algorithmHashing and Collision Advanced data structure and algorithm
Hashing and Collision Advanced data structure and algorithm
yogoso2948
 
Hashing .pptx
Hashing .pptxHashing .pptx
Hashing .pptx
ParagAhir1
 
Hashing and File Structures in Data Structure.pdf
Hashing and File Structures in Data Structure.pdfHashing and File Structures in Data Structure.pdf
Hashing and File Structures in Data Structure.pdf
JaithoonBibi
 
Hashing.pptx
Hashing.pptxHashing.pptx
Hashing.pptx
kratika64
 
Tojo Sir Hash Tables.pdfsfdasdasv fdsfdfsdv
Tojo Sir Hash Tables.pdfsfdasdasv fdsfdfsdvTojo Sir Hash Tables.pdfsfdasdasv fdsfdfsdv
Tojo Sir Hash Tables.pdfsfdasdasv fdsfdfsdv
2021csabhishekgdurga
 
8. Hash table
8. Hash table8. Hash table
8. Hash table
Mandeep Singh
 
unit-1-data structure and algorithms-hashing-2024-1 (1).pptx
unit-1-data structure and algorithms-hashing-2024-1 (1).pptxunit-1-data structure and algorithms-hashing-2024-1 (1).pptx
unit-1-data structure and algorithms-hashing-2024-1 (1).pptx
pritimalkhede
 
Hashing components and its laws 2 types
Hashing  components and its laws  2 typesHashing  components and its laws  2 types
Hashing components and its laws 2 types
abhinavkumar77723
 
Hashing notes data structures (HASHING AND HASH FUNCTIONS)
Hashing notes data structures (HASHING AND HASH FUNCTIONS)Hashing notes data structures (HASHING AND HASH FUNCTIONS)
Hashing notes data structures (HASHING AND HASH FUNCTIONS)
Kuntal Bhowmick
 
Hashing
HashingHashing
Hashing
VARSHAKUMARI49
 
358 33 powerpoint-slides_15-hashing-collision_chapter-15
358 33 powerpoint-slides_15-hashing-collision_chapter-15358 33 powerpoint-slides_15-hashing-collision_chapter-15
358 33 powerpoint-slides_15-hashing-collision_chapter-15
sumitbardhan
 
HASHING.ppt.pptx
HASHING.ppt.pptxHASHING.ppt.pptx
HASHING.ppt.pptx
MohammedAbdulNaseer5
 
Lecture14_15_Hashing.pptx
Lecture14_15_Hashing.pptxLecture14_15_Hashing.pptx
Lecture14_15_Hashing.pptx
SLekshmiNair
 
hashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdf
hashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdfhashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdf
hashtableeeeeeeeeeeeeeeeeeeeeeeeeeee.pdf
timoemin50
 
HASHING IS NOT YASH IT IS HASH.pptx
HASHING IS NOT YASH IT IS HASH.pptxHASHING IS NOT YASH IT IS HASH.pptx
HASHING IS NOT YASH IT IS HASH.pptx
JITTAYASHWANTHREDDY
 
Hashing in Data Structure and analysis of Algorithms
Hashing in Data Structure and analysis of AlgorithmsHashing in Data Structure and analysis of Algorithms
Hashing in Data Structure and analysis of Algorithms
KavitaSingh962656
 
Hash in datastructures by using the c language.pptx
Hash in datastructures by using the c language.pptxHash in datastructures by using the c language.pptx
Hash in datastructures by using the c language.pptx
my6305874
 
Hashing and Collision Advanced data structure and algorithm
Hashing and Collision Advanced data structure and algorithmHashing and Collision Advanced data structure and algorithm
Hashing and Collision Advanced data structure and algorithm
yogoso2948
 
Hashing and File Structures in Data Structure.pdf
Hashing and File Structures in Data Structure.pdfHashing and File Structures in Data Structure.pdf
Hashing and File Structures in Data Structure.pdf
JaithoonBibi
 
Hashing.pptx
Hashing.pptxHashing.pptx
Hashing.pptx
kratika64
 
Tojo Sir Hash Tables.pdfsfdasdasv fdsfdfsdv
Tojo Sir Hash Tables.pdfsfdasdasv fdsfdfsdvTojo Sir Hash Tables.pdfsfdasdasv fdsfdfsdv
Tojo Sir Hash Tables.pdfsfdasdasv fdsfdfsdv
2021csabhishekgdurga
 
unit-1-data structure and algorithms-hashing-2024-1 (1).pptx
unit-1-data structure and algorithms-hashing-2024-1 (1).pptxunit-1-data structure and algorithms-hashing-2024-1 (1).pptx
unit-1-data structure and algorithms-hashing-2024-1 (1).pptx
pritimalkhede
 
Hashing components and its laws 2 types
Hashing  components and its laws  2 typesHashing  components and its laws  2 types
Hashing components and its laws 2 types
abhinavkumar77723
 
Hashing notes data structures (HASHING AND HASH FUNCTIONS)
Hashing notes data structures (HASHING AND HASH FUNCTIONS)Hashing notes data structures (HASHING AND HASH FUNCTIONS)
Hashing notes data structures (HASHING AND HASH FUNCTIONS)
Kuntal Bhowmick
 
358 33 powerpoint-slides_15-hashing-collision_chapter-15
358 33 powerpoint-slides_15-hashing-collision_chapter-15358 33 powerpoint-slides_15-hashing-collision_chapter-15
358 33 powerpoint-slides_15-hashing-collision_chapter-15
sumitbardhan
 
Lecture14_15_Hashing.pptx
Lecture14_15_Hashing.pptxLecture14_15_Hashing.pptx
Lecture14_15_Hashing.pptx
SLekshmiNair
 

Recently uploaded (20)

Supplier_PFMEA_Workshop_rev 22_04_27.pptx
Supplier_PFMEA_Workshop_rev 22_04_27.pptxSupplier_PFMEA_Workshop_rev 22_04_27.pptx
Supplier_PFMEA_Workshop_rev 22_04_27.pptx
dariojaen1977
 
Comprehensive Guide to Distribution Line Design
Comprehensive Guide to Distribution Line DesignComprehensive Guide to Distribution Line Design
Comprehensive Guide to Distribution Line Design
Radharaman48
 
Evaluating Adaptive Neuro-Fuzzy Inference System (ANFIS) To Assess Liquefacti...
Evaluating Adaptive Neuro-Fuzzy Inference System (ANFIS) To Assess Liquefacti...Evaluating Adaptive Neuro-Fuzzy Inference System (ANFIS) To Assess Liquefacti...
Evaluating Adaptive Neuro-Fuzzy Inference System (ANFIS) To Assess Liquefacti...
Journal of Soft Computing in Civil Engineering
 
Health & Safety .........................
Health & Safety .........................Health & Safety .........................
Health & Safety .........................
shadyozq9
 
EHSS Orientation 2023 - Copy.Orientation
EHSS Orientation 2023 - Copy.OrientationEHSS Orientation 2023 - Copy.Orientation
EHSS Orientation 2023 - Copy.Orientation
GulfamShahzad11
 
An Explicit Formulation for Estimation of Structural Number (SN) of Flexible ...
An Explicit Formulation for Estimation of Structural Number (SN) of Flexible ...An Explicit Formulation for Estimation of Structural Number (SN) of Flexible ...
An Explicit Formulation for Estimation of Structural Number (SN) of Flexible ...
Journal of Soft Computing in Civil Engineering
 
Espresso PD Official MP_eng Version.pptx
Espresso PD Official MP_eng Version.pptxEspresso PD Official MP_eng Version.pptx
Espresso PD Official MP_eng Version.pptx
NingChacha1
 
Introduction to Python and its basics.pdf
Introduction to Python and its basics.pdfIntroduction to Python and its basics.pdf
Introduction to Python and its basics.pdf
sumitt6_25730773
 
Environment .................................
Environment .................................Environment .................................
Environment .................................
shadyozq9
 
Tech innovations management entreprenuer
Tech innovations management entreprenuerTech innovations management entreprenuer
Tech innovations management entreprenuer
Subramanyambharathis
 
🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...
🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...
🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...
SanjeetMishra29
 
ESP32 Air Mouse using Bluetooth and MPU6050
ESP32 Air Mouse using Bluetooth and MPU6050ESP32 Air Mouse using Bluetooth and MPU6050
ESP32 Air Mouse using Bluetooth and MPU6050
CircuitDigest
 
Particle Swarm Optimization by Aleksandar Lazinica (Editor) (z-lib.org).pdf
Particle Swarm Optimization by Aleksandar Lazinica (Editor) (z-lib.org).pdfParticle Swarm Optimization by Aleksandar Lazinica (Editor) (z-lib.org).pdf
Particle Swarm Optimization by Aleksandar Lazinica (Editor) (z-lib.org).pdf
DUSABEMARIYA
 
digital computing plotform synopsis.pptx
digital computing plotform synopsis.pptxdigital computing plotform synopsis.pptx
digital computing plotform synopsis.pptx
ssuser2b4c6e1
 
DeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFT
DeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFTDeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFT
DeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFT
Kyohei Ito
 
May 2025 - Top 10 Read Articles in Network Security and Its Applications
May 2025 - Top 10 Read Articles in Network Security and Its ApplicationsMay 2025 - Top 10 Read Articles in Network Security and Its Applications
May 2025 - Top 10 Read Articles in Network Security and Its Applications
IJNSA Journal
 
4 Renewable-Energy-Chemistry-ppt-PP.pptx
4 Renewable-Energy-Chemistry-ppt-PP.pptx4 Renewable-Energy-Chemistry-ppt-PP.pptx
4 Renewable-Energy-Chemistry-ppt-PP.pptx
maairapayongayong
 
Automatic Quality Assessment for Speech and Beyond
Automatic Quality Assessment for Speech and BeyondAutomatic Quality Assessment for Speech and Beyond
Automatic Quality Assessment for Speech and Beyond
NU_I_TODALAB
 
A Study of Bank Line Shifting of the Selected Reach of Jamuna River Using Mul...
A Study of Bank Line Shifting of the Selected Reach of Jamuna River Using Mul...A Study of Bank Line Shifting of the Selected Reach of Jamuna River Using Mul...
A Study of Bank Line Shifting of the Selected Reach of Jamuna River Using Mul...
Journal of Soft Computing in Civil Engineering
 
Electrical and Electronics Engineering: An International Journal (ELELIJ)
Electrical and Electronics Engineering: An International Journal (ELELIJ)Electrical and Electronics Engineering: An International Journal (ELELIJ)
Electrical and Electronics Engineering: An International Journal (ELELIJ)
elelijjournal653
 
Supplier_PFMEA_Workshop_rev 22_04_27.pptx
Supplier_PFMEA_Workshop_rev 22_04_27.pptxSupplier_PFMEA_Workshop_rev 22_04_27.pptx
Supplier_PFMEA_Workshop_rev 22_04_27.pptx
dariojaen1977
 
Comprehensive Guide to Distribution Line Design
Comprehensive Guide to Distribution Line DesignComprehensive Guide to Distribution Line Design
Comprehensive Guide to Distribution Line Design
Radharaman48
 
Health & Safety .........................
Health & Safety .........................Health & Safety .........................
Health & Safety .........................
shadyozq9
 
EHSS Orientation 2023 - Copy.Orientation
EHSS Orientation 2023 - Copy.OrientationEHSS Orientation 2023 - Copy.Orientation
EHSS Orientation 2023 - Copy.Orientation
GulfamShahzad11
 
Espresso PD Official MP_eng Version.pptx
Espresso PD Official MP_eng Version.pptxEspresso PD Official MP_eng Version.pptx
Espresso PD Official MP_eng Version.pptx
NingChacha1
 
Introduction to Python and its basics.pdf
Introduction to Python and its basics.pdfIntroduction to Python and its basics.pdf
Introduction to Python and its basics.pdf
sumitt6_25730773
 
Environment .................................
Environment .................................Environment .................................
Environment .................................
shadyozq9
 
Tech innovations management entreprenuer
Tech innovations management entreprenuerTech innovations management entreprenuer
Tech innovations management entreprenuer
Subramanyambharathis
 
🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...
🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...
🚀 TDX Bengaluru 2025 Unwrapped: Key Highlights, Innovations & Trailblazer Tak...
SanjeetMishra29
 
ESP32 Air Mouse using Bluetooth and MPU6050
ESP32 Air Mouse using Bluetooth and MPU6050ESP32 Air Mouse using Bluetooth and MPU6050
ESP32 Air Mouse using Bluetooth and MPU6050
CircuitDigest
 
Particle Swarm Optimization by Aleksandar Lazinica (Editor) (z-lib.org).pdf
Particle Swarm Optimization by Aleksandar Lazinica (Editor) (z-lib.org).pdfParticle Swarm Optimization by Aleksandar Lazinica (Editor) (z-lib.org).pdf
Particle Swarm Optimization by Aleksandar Lazinica (Editor) (z-lib.org).pdf
DUSABEMARIYA
 
digital computing plotform synopsis.pptx
digital computing plotform synopsis.pptxdigital computing plotform synopsis.pptx
digital computing plotform synopsis.pptx
ssuser2b4c6e1
 
DeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFT
DeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFTDeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFT
DeFAIMint | 🤖Mint to DeFAI. Vibe Trading as NFT
Kyohei Ito
 
May 2025 - Top 10 Read Articles in Network Security and Its Applications
May 2025 - Top 10 Read Articles in Network Security and Its ApplicationsMay 2025 - Top 10 Read Articles in Network Security and Its Applications
May 2025 - Top 10 Read Articles in Network Security and Its Applications
IJNSA Journal
 
4 Renewable-Energy-Chemistry-ppt-PP.pptx
4 Renewable-Energy-Chemistry-ppt-PP.pptx4 Renewable-Energy-Chemistry-ppt-PP.pptx
4 Renewable-Energy-Chemistry-ppt-PP.pptx
maairapayongayong
 
Automatic Quality Assessment for Speech and Beyond
Automatic Quality Assessment for Speech and BeyondAutomatic Quality Assessment for Speech and Beyond
Automatic Quality Assessment for Speech and Beyond
NU_I_TODALAB
 
Electrical and Electronics Engineering: An International Journal (ELELIJ)
Electrical and Electronics Engineering: An International Journal (ELELIJ)Electrical and Electronics Engineering: An International Journal (ELELIJ)
Electrical and Electronics Engineering: An International Journal (ELELIJ)
elelijjournal653
 
Ad

hashing in data structures and its applications

  • 1. Introduction • Hashing is an important data structure designed to solve the problem of efficiently finding and storing data in an array. For example, if you have a list of 20000 numbers, and you have given a number to search in that list- you will scan each number in the list until you find a match. • Hashing is mainly used to overcome the drawbacks of linear sets &binary sets. • O(n)-time complexity of linear search • O(logn)-time complexity for binary search • O(1)-hashing
  • 2. What is hashing? • Hashing is a popular technique in computer science that involves mapping large data sets to fixed-length values. It is a process of converting a data set of variable size into a data set of a fixed size. • It is also known as the message digest function. • It is a technique that uniquely identifies a specific item from a collection of similar items. • Hashing is a technique in which we can perform searching operation in a constant computing time.
  • 4. Hash tables • Hash table is data structure(array) which contains fixed cells where each cell contains key & value. • It can be defined as a bucket where the data are stored in an array format. These data have their own index value. If the index values are known then the process of accessing the data is quicker.
  • 5. Hash function • The mathematical function to be applied on keys to obtain indexes for their corresponding values into the Hash Table. • Hash Function can be defined as an algorithm or a function that is used to map or convert data of bigger size or length to a fixed or small index or hash value • The hashing techniques in the data structure are very interesting, such as: • hash = hashfunc(key) • index = hash % array_size
  • 6. Hash functions 1.Division Method 2.Mid Square Method 3.Folding Method(pairing) 4.Multiplication Method
  • 7. Division Method • By performing module operation we have to insert record into a hash table. • Say that we have a Hash Table of size 'S', and we want to store a (key, value) pair in the Hash Table. The Hash Function, according to the Division method, would be:H(key) = key mod M • Here M is an integer value used for calculating the Hash value, and M should be greater than S. Sometimes, S is used as M.
  • 8. Size of the Hash Table = 5 (M, S) Key: Value pairs: {10: "Sudha", 11: "Venkat", 12: "Jeevani"} For every pair: {10: "Sudha"} Key mod M = 10 mod 5 = 0 {11: "Venkat"} Key mod M = 11 mod 5 = 1 {12: "Jeevani"} Key mod M = 12 mod 5 = 2
  • 9. Mid square method • It is a two-step process of computing the Hash value. Given a {key: value} pair, the Hash Function would be calculated by: 1.Square the key -> key * key 2.Choose some digits from the middle of the number to obtain the Hash value. • Suppose the size of the Hash Table is 10 and the key: value pairs are: {10: "Sudha, 11: "Venkat", 12: "Jeevani"} Number of digits to be selected: Indexes: (0 - 9), so 1 H(10) = 10 * 10 = 100 = 0 H(11) = 11 * 11 = 121 = 2
  • 10. 3. Folding Method • Given a {key: value} pair and the table size is 100 (0 - 99 indexes), the key is broken down into 2 segments each except the last segment. The last segment can have less number of digits. Now, the Hash Function would be: • H(x) = (sum of equal- sized segments) mod (size of the Hash Table) • For suppose "k" is a 10-digit key and the size of the table is 100(0 - 99), k is divided into: sum = (k1k2) + (k3k4) + (k5k6) + (k7k8) + (k9k10) Now, H(x) = sum % 100
  • 11. Let us now take an example: The {key: value} pairs: {1234: "Sudha", 5678: "Venkat"} Size of the table: 100 (0 - 99) For {1234: "Sudha"}: 1234 = 12 + 34 = 46 46 % 100 = 46 For {5678: "Venkat"}: 5678 = 56 + 78 = 134 134 % 100 = 34
  • 12. Multiplication method 1.We must choose a constant between 0 and 1, say, A. 2.Multiply the key with the chosen A. 3.Now, take the fractional part from the product and multiply it by the table size. 4.The Hash will be the floor (only the integer part) of the above result. • So, the Hash Function under this method will be: H(x) = floor(size(key*A mod 1))
  • 13. For example: {Key: value} pairs: {1234: "Sudha", 5678: "Venkat"} Size of the table: 100 A = 0.56 For {1234: "Sudha"}: H(1234) = floor(size(1234*0.56 mod 1)) = floor(100 * 0.04) = floor(4) = 4 For {5678: "Venkat"}: H(5678) = floor(size(5678*0.56 mod 1)) = floor(99 * 0.68) = floor(67.32) = 67
  • 14. Collision in hashing 1.in this, the hash function is used to compute the index of the array. 2.The hash value is used to store the key in the hash table, as an index. 3.The hash function can return the same hash value for two or more keys. 4.When two or more keys are given the same hash value, it is called a collision. To handle this collision, we use collision resolution techniques.
  • 15. Collision resolution techniques 1.Separate chaining (open hashing) 2.Open addressing (closed hashing)-> 1.Linear probing 2.Quadratic probing 3.Double hashing
  • 16. Separate chaining technique • Means maintaining linked lists for storing records • In separate chaining technique, each bucket in hash table is associated with a linked list or some other data structure that can store multiple elements. • When a collision occurs, the colliding elements are added to the linked list associated with that bucket. • The key and its corresponding value are stored as a node in list
  • 17. Separate chaining technique Advantages • There is no problem in overflow & collision. Disadvantages It requires more memory in order to store the data
  • 18. Open Addressing • In open addressing, when a collision occurs, the hash table’s slots themselves are used to store the colliding elements. If a slot is already occupied, the algorithm probes or searches for the next available slot in a predetermined manner.
  • 19. Linear probing • In linear probing, if a collision occurs at a specific slot, the algorithm sequentially searches for the next available (empty) slot by incrementing the index one by one until it finds an empty slot. • - The probing sequence is defined by the linear function: • ‘hash(key) + i’ • where ‘hash(key)’ is the original hash value and ‘I’ is the probe number.
  • 20. Drawbacks of linear probing 1.The main problem is primary clustering(most of the records stored in a single cluster). 2.It takes too much time to find an empty slot.
  • 21. Quadratic probing • In this, when the collision occurs, we probe for i2th slot in ith iteration, and this probing is performed until an empty slot is found. • The cache performance in quadratic probing is lower than the linear probing. Quadratic probing also reduces the problem of clustering. • Drawbacks Secondary clustering-whenever the half of the hash table is full it is very difficult to find the location of the new element.
  • 22. Double hashing • H(K)=K%10 • H2(x) = P - (x%P), where P is a prime number smaller than N. • It takes longer to determine two hash functions. The double probing gives the very poor the cache performance, but there has no clustering problem in it.
  • 23. Dynamic hashing • The drawback of static hashing is that it does not expand or shrink dynamically as the size of the database grow or shrink. • In dynamic hashing data buckets grow or shrinks(added or removed dynamically) as records increases or decreases. • Dynamic hashing is also called extended hashing.
  • 24. Dynamic hashing/extended hashing • It is a dynamic hashing method where in directories &buckets are used to hash the data. • It is aggressively flexible method in which the hash function also experiences dynamic change. • Directories store addresses of the buckets in pointer and id assigned to each directory. Which may change each time when directory expansion takes place. • Buckets are used to hash the actual data.
  • 27. Frequently used terms in Extendible Hashing: • Directories: These containers store pointers to buckets. Each directory is given a unique id which may change each time when expansion takes place. • Buckets: They store the hashed keys. Directories point to buckets. A bucket may contain more than one pointers to it if its local depth is less than the global depth. • Global Depth: It is associated with the Directories. They denote the number of bits which are used by the hash function to categorize the keys. Global Depth = Number of bits in directory id. • Local Depth: It is the same as that of Global Depth except for the fact that Local Depth is associated with the buckets and not the directories. Local depth in accordance with the global depth is used to decide the action that to be performed in case an overflow occurs. Local Depth is always less than or equal to the Global Depth.
  • 28. Frequently used terms in Extendible Hashing: • Bucket Splitting: When the number of elements in a bucket exceeds a particular size, then the bucket is split into two parts. • Directory Expansion: Directory Expansion Takes place when a bucket overflows. Directory Expansion is performed when the local depth of the overflowing bucket is equal to the global depth.
  翻译: