In this lecture, we will be talking about the HTTP flood, one of the most popular DDoS attacks on the application layer. This is a particularly important lecture because it covers an important and popular attack vector, and it is essentially about protecting your websites, which in most cases is what people want to protect. So let's start with how it works. As you can see, we have a sample GET request: the GET line, which is the request line, basically requesting a sample path.
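For reference, a minimal sketch of what such a request might look like is shown below; the path, host, and header values are made up for illustration and are not taken from an actual capture.

    GET /sample-path HTTP/1.1
    Host: www.example.com
    User-Agent: Mozilla/5.0
    Accept: */*
    Accept-Language: en-US
    Connection: keep-alive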
And these, as written here, are the request headers. Essentially, this attack is actually quite simple: after the three-way handshake, the attacker constantly sends those GET requests, one after another. I should also remind you here that it doesn't have to be just GET requests. Some bots, some attackers, can send GET and POST together as well. So it doesn't mean that it will be just GET; it can be GET and POST combined in the same attack.
Although, based on my experience, in most cases those bots use only one method, and they usually have the same predefined headers. Now, what happens when one of these requests arrives at the server? Basically, this is how your server responds, as you can see. But imagine that millions of these requests are arriving at your web destination, either on port 80 or 443. Then, of course, your servers will try to respond to all of them, not necessarily with a 200 response code, because, for instance, if the requested URL doesn't exist, the response will be something in the 4xx range; it can be a 404 or something like that. Here, the response code is not important.
What is important is that the server will try to respond to all of these requests. So how can we detect these, since we cannot just rely on the response code of your server? The first thing to check is the existence of the PSH flag. After the three-way handshake, these bots usually send their HTTP requests with the PSH (push) flag set in order to tell the server that the data should be delivered to the application immediately, so it's worth checking. And the second thing is filtering based on the method. For example, in Wireshark, it could be the GET method, the POST method, or the PUT method.
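As a rough illustration, and assuming a plain-HTTP capture on port 80, Wireshark display filters along the following lines can be used for these two checks. The first isolates segments with the PSH flag set, the second filters by request method, and the third combines both; adjust the port and method to your own traffic.

    tcp.dstport == 80 && tcp.flags.push == 1
    http.request.method == "GET"
    http.request.method == "GET" && tcp.flags.push == 1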
But those two are by far the most popular ones. And after proper filtering, when you check in Wireshark again, this is more or less what you will get. Basically, as you can see, 200s, 400s, and sometimes you will also see 404s for the paths that don't exist. And as you can see, the same source, or sources from the same subnet, keep sending GET requests for the same URLs over and over again. For the next step, you need to check every request to identify whether it is part of the attack, which in fact we are going to cover during the mitigation. Having said that, how do we mitigate these attacks? Well, I would say the first thing, and especially the best thing to do during an attack, is using a combination of these two checks: checking for common HTTP header patterns and URLs, and checking for a common lack of HTTP headers.
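For example, assuming the repeating requests all target the same path and come from a single subnet (both values below are placeholders), a display filter like this can narrow the capture down to the suspicious traffic before you start comparing headers:

    http.request.uri == "/sample-path" && ip.src == 203.0.113.0/24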
What I mean by that is, suppose that you started to observe a spike in the incoming HTTP traffic, and when you investigated it, you realized that the majority of the HTTP requests in the spike have, let's say, no Host header, and you don't see any Content-Length header either. All the requests have the same User-Agent header, let's say Mozilla. Likewise, the remaining three headers you see here also exist in every request, though not necessarily with the same values, unlike the User-Agent. Let's assume that their values change per request, but the headers themselves, Accept-Language and so on, are always in the request. For now, we are assuming that the attackers use simplistic bots that are not intelligent.
There are intelligent bots, and we will also discuss how to mitigate them, but for now, let's assume that those bots are not particularly clever and that they use a similar pattern of headers. Under such a circumstance, it would be wise to create, let's say, a Snort rule or a similar rule where we block all the requests which have neither a Content-Length nor a Host header, which always contain those three headers, and which carry the same User-Agent value. This will effectively protect your environment in real time when implemented correctly, while not blocking the legitimate users. It is also crucial to note that you don't want to just block all the GET requests or POST requests, since this will obviously block legitimate users as well.
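As a rough sketch rather than a drop-in configuration, a Snort rule built around this fingerprint might look like the following. The port variables, SID, and header values are placeholders, the missing-Host and missing-Content-Length conditions reflect the assumptions above, and the drop action assumes Snort is running inline as an IPS.

    # Hypothetical rule: GET requests with no Host and no Content-Length header
    # but with the fixed User-Agent value observed during the attack.
    drop tcp any any -> $HOME_NET $HTTP_PORTS (msg:"Possible HTTP flood - missing Host header, fixed User-Agent"; flow:to_server,established; content:"GET"; http_method; content:!"Host|3A|"; http_header; content:!"Content-Length|3A|"; http_header; content:"User-Agent|3A| Mozilla"; http_header; sid:1000001; rev:1;)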
So we are trying to filter out only the requests from the attackers here; we are trying to find the common pattern in the attackers' packets. Another important thing to note is that, even though the sample pattern we just investigated is simplistic, it doesn't always mean that it was generated by a bot, since this kind of pattern combination is legitimate; real users or real agents can also generate it.
But what if the attacker is actually clever and emulates a real browser? In such cases, these headers will actually be more diverse than usual, and therefore it will be quite difficult to find common patterns, because your attacker is, firstly, implementing browser emulation and, secondly, using better randomization. So what if your attacker uses browser emulation and better randomization, where it's almost impossible to find a common pattern of HTTP headers, or a common lack of HTTP headers? Well, then the answer is basically to respond by challenging the traffic in further ways, which we will be talking about in the next lecture.