Okay, this is going to be a fun one, we're going to be talking about hash tables and the linear probing technique. To get started, let's recap how open addressing works. So in general, if we have a table of size n, here's how we do open addressing, no matter what your probing function is. So we start our constant, or sorry, variable x at one, the key hash is just going to be whatever a hash function gives us for key. And our first index we're going to check is going to be the original hash position. And while the table at the index now equal to no meaning that position is already occupied, then we're going to offset the index by where the key hash is plus the probing function, mod and every time we do this, we increase The variable x so that our probing function pushes us along one extra position.
And then once we found a position, then we can insert the key value pair into the table at that position. Alright, so what is linear probing? So linear probing is simply a probing method, which probes according to some linear formula, specifically, the linear function, p of x equals MX plus B, and we have to make sure that a is equal to zero otherwise, we're just adding a constant which does nothing. Now, I have a small note right here, which says that the constant b is obsolete. And if you know why, post in the comments and have a discussion with the others and as we saw in the last video, there's a slight problem with this currently, and it's that Sunday functions are unable to produce a full cycle of order and, and we might end up getting stuck in a cycle. Here's an example of that.
If we picked our linear function to be p of x equals three x, and say our key, k hashed to four, and for some reason our table size was nine, then we would end up with the following cycle occurring. Assuming that positions, four, seven and one are already taken by other key value pairs. So the fact that we're only probing at those three locations means we're unable to reach all the other buckets, which is really bad. And hence we're again stuck in an infinite loop. We cannot get stuck in this situation ever. So how do we avoid it?
So that's the question which values of the constant A and P of x Because air x produce a full cycle modulo n, it turns out that this happens when the constant a and n the table size are relatively prime to each other. So two numbers are relatively prime if their greatest common denominator is equal to one. So that is a an N have a GCD of one. And when that happens, the probing function will always be able to generate a complete cycle. And we will always be able to find an empty bucket. Awesome.
Alright, here's an example with linear probing. So suppose we have an originally empty hash table and we want to insert some key value pairs. And we selected our proving function to be p of x equals six x Our table size v n equals nine. And then we also selected a max load factor of alpha equals about two thirds, and the threshold will then be six. So we resize once we hit six elements, okay, so based on the probing function we chose and a table size, are we likely to run into an infinite loop while inserting based on what I told you in the last few slides? And the answer is, yes, the greatest common denominator of nine and six is equal to three which is not one.
So let's go ahead and attempt to insert some elements regardless, we may or may not hit any problems. Okay. So first, let's insert. So I guess on the left, I have some key value pairs I want to insert and we're going going to insert k one first. So suppose that the hash value for K one is equal to two, then it would say, Alright, the hash of K one plus the probing sequence at zero mod nine is equal to two. So we're going to insert that key value pair at position two.
Alright, next one. So now suppose k of two is equal to two again. So we're going to try to insert this key value pair at position two. But oh snap, we have a hash collision. So we're going to increment x and now try to offset the probing function at one and so probing function is zero. So that is instead of inserting now two, we're going to insert it at eight.
Right? So that went in because that slot was free. Now let's try inserting key three. So Key three, suppose that hashes to three, then we can insert position three, and there's no issue. Now, notice that we're trying to re insert k two, which already exists within our hash table. So instead of inserting it, we're actually going to be updating its value because it exists in the hash table.
Alright, so from before, we know that the hash value for K two is two. So So then we look at position two, and K two is not there, and it was hash collision. So we increment x offset by a P of one. And now we're able to find q two and now we can update value right there. Let's go to K five. Okay, so suppose k five has a hash value of eight.
So eight is taken. So we're going to probe and that leaves to index five, so we're going to insert the key value pair there. Now Australian suitcase six. Suppose case six hashes the five, then let's probe ones now that so five plus six more nine gives us two. Okay, now the hash collision let's keep probing. So now, five plus 12 mod nine is equal to eight.
All right, another hash collision, so we have to increment x and keep probing, and now we're back to five. So we've hit a cycle. Alright, so we're trapped in a cycle. But we kind of expected this to happen because we knew that we picked two numbers whose GCD was equal to three and not one. So if we looked at all the possible a values we could have selected, instead of Six, we see that the ones that give a greatest common denominator of one with and the table size are 12457 and eight, while any multiple of three would give us something else. So this comes to the realization that for our choice of p of x, if we choose p of x to be one times x, then the greatest common denominator of an N one is always going to be one no matter what our choice of the table sizes, which is why p of x equals one times x is a very popular probing function choice.
All right. Now suppose we have another hash table, and we wish in certain more key value pairs, but this time, we're actually going to pick a probing function that works so that we don't ever have an issue. So I'm going to pick the table size b 12 and our probing function to be five x. So no cycle should occur. All right, so let's go with inserting the first guy. So suppose that k one has a hash value of 10.
Then at index 10, we'd put k one v one, suppose k two as a hash value of eight, then slot eight is free. So let's pop those right in there. So now suppose k three is equal to 10. Oh, hash collision with K, one in there. So we need to keep probing. Alright, so if we use our probing function, which is p of x equals five x, so they'll give us three module and when we add it to the hash value, moving on K four.
Now suppose the hash value for K forest and then here to keep probing, so then we hit k three, which we inserted last time. And then oh, we also hit k two. But we're able to pull out eventually when we hit the third probing value, however, now notice that we've actually reached the threshold of our table. If I go back a few slides, we can see that I picked alpha to be 0.35. So n, which is the table size times alpha gave us four. And we just finished inserting the fourth element.
So it's time that we resize the table. So how we usually resize the table is via some x Financial fashion such as doubling or tripling, or so on. But we need to double in such a way that the GCD of an A still holds. So after doubling, and is equal to 24, and the GCD properties is still good. Alpha is a constant, so it's still 3.5. So our new threshold is just going to be eight and we don't change the probing function.
All right, so let's allocate a new chunk of memory for this new table. And now we need to place all the old elements in our old table into this new table using the new table size for an All right, so we scan across all these elements we start at zero and nothing there and move along. So from before we knew the hash value for q4 was 10. So inside the new table as soon as position 10, scan along nothing here. from before, we know that q3 was 10. So it should go in position 10 in this new table, but we have a hash collision, so we have to keep probing.
So if we add our probing function at one, which is five x, then we get 10 plus five, which is 15. So we're going to insert a position 15 in the new table, keep probing nothing here, nothing here, nothing here. So eventually we hit k two. So we know from before k two is equal to eight. So we're going to insert position eight. Now we know k one is equal to 10.
So we're going to try and insert position 10. that's taken so to probe so the next position is also taken. So add five again, that gives us 20. So insert k one V 120. Alright, so now we throw away the old table and we keep this new table and this is the new hash table we're working with. And we were at inserting k five. Now suppose k five is equal to two.
And that spot is free. So we are good. So here's a frequently asked question. Sweet. I know how insertion works. Now, how do I remove key value pairs from the hash table using open addressing?
And my answer this is that this topic is going to merit its own video. And we're going to do it after we see all the other open addressing techniques because it's actually non trivial. Alright, guys, thanks for watching. If you like this video, please hit the like button and subscribe and drop a comment. See you later.