Why TCP connection termination needs four-way handshake

Background

In the TCP/IP protocol stack, the three-way handshake for connection establishment and the four-way handshake for connection termination are concepts you must understand.

But after I learned the TCP/IP stack, one question came to my mind: why does connection termination need a four-way handshake? In this article, I will focus on this topic. To answer the question, we first need to walk through the details of the two processes and compare them to find the difference.

TCP connection establishment

In contrast to UDP, TCP is a connection-oriented, reliable, full-duplex protocol, which requires a logical connection to be established between the two processes before data is exchanged. The connection must be maintained for the entire time communication is taking place, and in TCP both sides can send data to their peer.

So four things need to happen to fully establish a TCP connection:

  • Each node has sent a SYN flag to its peer (that’s two things, one SYN in each direction)
  • Each node has received an ACK of its own SYN from its peer (that’s two more things)

Both SYN and ACK are flags in the TCP header, where SYN stands for synchronization and ACK for acknowledgement. In detail, the process goes as follows:

First, the client sends a packet with SYN: 1 to the server to start the connection. When the server receives it, it replies with a packet with ACK: 1 to confirm. Since in TCP the server can also send data to the client, the server needs to send its own SYN: 1 packet as well. Finally, the client sends an ACK: 1 packet back to the server, and the establishment process is done.

Note that the server sends two packets, ACK: 1 and SYN: 1, in sequence. Why not combine these two packets into one for better network performance? That is exactly what common TCP implementations do: the server replies with a single SYN+ACK packet, which is why establishment takes three packets in total, the well-known three-way handshake.

Note: besides the SYN and ACK flags, other TCP header fields need to be set in the establishment stage, such as the sequence number and window. I will discuss these important TCP header fields in other articles.

TCP connection termination

When the data stream transportation (which I will not cover in this article) is over, we need to terminate the TCP connection.

Same as above, four things need to happen to fully terminate a TCP connection:

  • Each node has sent a FIN flag to its peer (that’s two things, one FIN in each direction)
  • Each node has received an ACK of its own FIN from its peer (that’s two more things)

But the difference is that in most TCP implementations, four packets are used to terminate the connection: the ACK: 1 packet (marked as ②) and the FIN: 1 packet (marked as ③) are not combined into one.

The reason is that the ACK: 1 packet (marked as ②) is sent by the TCP stack automatically, while the next FIN: 1 packet (marked as ③) is controlled at the application level through the close socket API. The application decides when to terminate the connection, because it may still have data to send after its peer has finished sending. So in the common case, these two packets are not merged into one.

This gives the application flexibility in controlling the process; for example, the application can keep reusing an existing TCP connection to reduce overhead and improve performance. In the next article, I will talk about this topic.

How to build a Heap in linear time complexity

Background

In this article, I will focus on the topic of data structures and algorithms (in my eyes, one of the most important skills for software engineers). One day I came across a question that goes like this: how can building a heap be O(n) time complexity? This question confused me for a while, so I did some investigation and research on it. This article shares what I learned during that process, covering the following points:

  • What is a heap data structure? How does a heap behave?
  • How to implement a heap in C?
  • How to do the time complexity analysis on building the heap?

Basics of Heap

Before we dive into the implementation and time complexity analysis, let’s first understand the heap.

As a data structure, the heap was created for the heapsort sorting algorithm long ago. Besides heapsort, heaps are used in many famous algorithms such as Dijkstra’s algorithm for finding the shortest path. Essentially, heaps are the data structure you want to use when you want to be able to access the maximum or minimum element very quickly.

In computer science, a heap is a specialized tree-based data structure. A common implementation of a heap is the binary heap, in which the tree is a binary tree.

So a heap can be defined as a binary tree, but with two additional properties (that’s why we said it is a specialized tree):

  • Shape property: a binary heap is a complete binary tree. So what is a complete binary tree? All levels of the tree, except possibly the last one (the deepest), are fully filled, and if the last level is not complete, its nodes are filled from left to right. The complete binary tree is one type of binary tree; you can refer to this document to learn more about it.

  • Heap property: the key stored in each node is either greater than or equal to (max-heaps) or less than or equal to (min-heaps) the keys in the node’s children.

The following image shows a binary max-heap based on tree representation:

The heap is a powerful data structure because you can insert an element into, and extract (remove) the smallest or largest element from, a min-heap or max-heap in only O(log N) time. That’s why we said that if you want to access the maximum or minimum element very quickly, you should turn to heaps. In the next section, I will examine how heaps work by implementing one in C.

Note: The heap is closely related to another data structure called the priority queue. The priority queue can be implemented in various ways, but the heap is one maximally efficient implementation, and in fact priority queues are often referred to as “heaps”, regardless of how they may be implemented.

Implementation of Heap

Because of the shape property of heaps, we usually implement it as an array, as follows:

  • Each element in the array represents a node of the heap.
  • The parent/child relationship can be defined by the elements’ indices in the array. Given a node at index i, the left child is at index 2*i + 1 and the right child is at index 2*i + 2, and its parent is at index ⌊(i-1)/2⌋ (⌊⌋ means Floor operation).

Based on the above model, let’s start implementing our heap. As mentioned, there are two types of heaps: min-heap and max-heap; in this article, I will work on the max-heap. The difference between the two is trivial, so you can try to write out the min-heap after you understand this article.

The completed code implementation is inside this GitHub repo.

First, let’s define the interfaces of max-heap in the header file as follows:

// maxheap.h file
#ifndef _MAXHEAP_H
#define _MAXHEAP_H

typedef int key_type;
typedef struct _maxheap *maxheap; // opaque type

maxheap maxheap_create();
maxheap maxheap_heapify(const key_type *array, int n);
void maxheap_destroy(maxheap);

int maxheap_findmax(maxheap);
void maxheap_insert(maxheap, key_type);
void maxheap_deletemax(maxheap);

int maxheap_is_empty(maxheap);
int maxheap_size(maxheap);
void maxheap_clear(maxheap);

#endif

We define the max-heap as struct _maxheap and hide its implementation details from the header file, exposing the struct in the interfaces only through a handle (a pointer) maxheap. This technique in C is called an opaque type. An opaque type simulates the encapsulation concept of OOP, so that the internal details of a type can change without the code that uses it having to change. The detailed implementation goes as follows:

// maxheap.c file
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include "maxheap.h"

struct _maxheap {
    key_type* array;
    int max_size;
    int cur_size;
};

The max-heap elements are stored inside the array field. The capacity of the array is defined by the field max_size, and cur_size holds the index of the last element (so the heap contains cur_size + 1 elements, and -1 means empty).

Next, let’s go through the interfaces one by one (most of the interfaces are straightforward, so I will not explain too much about them). The first one is maxheap_create, which constructs an instance of maxheap by allocating memory for it.

// maxheap.c file
maxheap maxheap_create() {
    maxheap h = (maxheap)malloc(sizeof(struct _maxheap));
    if (h == NULL) {
        fprintf(stderr, "Not enough memory!\n");
        abort();
    }
    h->max_size = 64;
    h->cur_size = -1; // index of the last element; -1 means empty
    h->array = (key_type*)malloc(sizeof(key_type) * (h->max_size));
    if (h->array == NULL) {
        fprintf(stderr, "Not enough memory!\n");
        abort();
    }
    return h;
}

The initial capacity of the max-heap is set to 64; we can dynamically enlarge the capacity when more elements need to be inserted into the heap:

// maxheap.c file
static void maxheap_double_capacity(maxheap h) {
int new_max_size = 2 * h->max_size;
key_type* new_array = (key_type*) malloc(sizeof(key_type)* (new_max_size));

if (new_array == NULL) {
fprintf(stderr, "Not enough memory!\n");
abort();
}
for(int i = 0; i < h->cur_size; i++) {
new_array[i] = h->array[i];
}
free(h->array);
h->array = new_array;
h->max_size = new_max_size;
}

This is an internal API, so we define it as a static function, which limits the access scope to its object file.

When the program doesn’t use the max-heap data anymore, we can destroy it as follows:

// maxheap.c file
void maxheap_destroy(maxheap h) {
    assert(h);
    free(h->array);
    free(h);
}

Don’t forget to release the allocated memory by calling free.

Next, let’s work on the difficult but interesting part: inserting an element in O(log N) time. The solution goes as follows:

  • Add the element to the end of the array. (The end of the array corresponds to the leftmost open space of the bottom level of the tree).
  • Compare the added element with its parent; if they are in the correct order (in a max-heap the parent should be greater than or equal to the child, right?), stop.
  • If not, swap the element with its parent and return to the previous step, until the element reaches the top of the tree (the top of the tree corresponds to the first element in the array).

The first step, adding the element to the array's end, conforms to the shape property. Then the heap property is restored by traversing up the heap. This recursive traverse-up-and-swap process is called heapify-up. It can be illustrated by the following pseudo-code:

Max-Heapify-Up(A, i):
    parent ← ⌊(i - 1) / 2⌋

    if parent >= 0 and A[parent] < A[i] then:
        swap A[i] and A[parent]
        Max-Heapify-Up(A, parent)

The implementation goes as follows:

// maxheap.c file
static void maxheap_swap(maxheap h, int i, int j) {
    assert(h && i >= 0 && i <= h->cur_size && j >= 0 && j <= h->cur_size);
    key_type tmp = h->array[i];
    h->array[i] = h->array[j];
    h->array[j] = tmp;
}

static void maxheap_heapifyup(maxheap h, int k) {
    assert(h && k >= 0 && k <= h->cur_size);

    // the parent of node k is at index (k - 1) / 2
    while (k > 0 && h->array[k] > h->array[(k - 1) / 2]) {
        maxheap_swap(h, (k - 1) / 2, k);
        k = (k - 1) / 2;
    }
}

void maxheap_insert(maxheap h, key_type key) {
    assert(h);

    h->cur_size += 1;
    // make sure there is space
    if (h->cur_size == h->max_size) {
        maxheap_double_capacity(h);
    }

    // add at the end
    h->array[h->cur_size] = key;

    // restore the heap property by heapify-up
    maxheap_heapifyup(h, h->cur_size);
}

The number of operations required in heapify-up depends on how many levels the new element must rise to satisfy the heap property. So the worst-case time complexity is the height of the binary heap, which is log N. And appending the new element to the end of the array takes constant time, using cur_size as the index. As a result, the total time complexity of the insert operation is O(log N).

Similarly, next let’s work on extracting the root from the heap while retaining the heap property, also in O(log N) time. The solution goes as follows:

  • Replace the first element of the array with the element at the end. Then delete the last element.
  • Compare the new root with its children; if they are in the correct order, stop.
  • If not, swap the element with the larger of its two children and repeat the above step.

This similar traverse-down-and-swap process is called heapify-down. Heapify-down is a little more complex than heapify-up, since in a max-heap the parent needs to swap with the larger of its children. The implementation goes as follows:

// maxheap.c file
int maxheap_is_empty(maxheap h) {
    assert(h);
    return h->cur_size < 0;
}

static void maxheap_heapifydown(maxheap h, int k) {
    assert(h);

    // the left child of node k is at index 2*k + 1
    while (2 * k + 1 <= h->cur_size) {
        int j = 2 * k + 1;
        // pick the larger of the two children
        if (j < h->cur_size && h->array[j + 1] > h->array[j]) {
            j++;
        }
        if (h->array[k] >= h->array[j]) {
            break;
        }
        maxheap_swap(h, k, j);
        k = j;
    }
}

int maxheap_findmax(maxheap h) {
    if (maxheap_is_empty(h)) {
        fprintf(stderr, "Heap is empty!\n");
        abort();
    }

    // max is at the first position
    return h->array[0];
}

void maxheap_deletemax(maxheap h) {
    if (maxheap_is_empty(h)) {
        fprintf(stderr, "Heap is empty!\n");
        abort();
    }
    // move the last element to the root, then shrink the heap
    maxheap_swap(h, 0, h->cur_size);
    h->cur_size -= 1;

    maxheap_heapifydown(h, 0);
}

Based on the analysis of heapify-up, the time complexity of extract is similarly O(log n).

In the next section, let’s go back to the question raised at the beginning of this article.

The time complexity of building a heap

What’s the time complexity of building a heap? The first answer that comes to my mind is O(n log n): since inserting an element takes O(log n) time, and for n elements the insert is repeated n times, the time complexity is O(n log n). Right? That is a valid upper bound, but it is not tight.

We can use another, optimal solution to build the heap instead of inserting each element one by one. It goes as follows:

  • Arbitrarily put the n elements into the array; the shape property holds regardless of order.
  • Starting from the lowest level and moving upwards, sift the root of each subtree downward, as in the heapify-down process, until the heap property is restored.

This process can be illustrated with the following image:

This algorithm can be implemented as follows:

maxheap maxheap_heapify(const key_type* array, int n) {
    assert(array && n > 0);

    maxheap h = (maxheap)malloc(sizeof(struct _maxheap));
    if (h == NULL) {
        fprintf(stderr, "Not enough memory!\n");
        abort();
    }
    h->max_size = n;
    h->array = (key_type*)malloc(sizeof(key_type) * h->max_size);
    if (h->array == NULL) {
        fprintf(stderr, "Not enough memory!\n");
        abort();
    }
    h->cur_size = n - 1;
    for (int i = 0; i < n; i++) {
        h->array[i] = array[i];
    }
    // small trick: the leaves are already valid heaps, so we only need to
    // sift down the internal nodes, from the last one (n/2 - 1) up to the root
    for (int i = n / 2 - 1; i >= 0; i--) {
        maxheap_heapifydown(h, i);
    }
    return h;
}

Next, let’s analyze the time complexity of the above process. Suppose there are n elements in the heap, and the height of the heap is h (for the heap in the above image, the height is 3). Then we should have the following relationship:

$2^{h} \le n \le 2^{h+1} - 1$

When there is only one node in the last level, $n = 2^{h}$; and when the last level of the tree is fully filled, $n = 2^{h+1} - 1$.

Counting levels from the bottom as level 0 (so the root node is at level h), level j contains at most $2^{h-j}$ nodes, and each node at level j needs at most j swap operations to sift down. So the total number of operations at level j is $j \cdot 2^{h-j}$.

So the total running time for building the heap is proportional to:

$T(n) = \displaystyle\sum_{j=0}^{h} j \cdot 2^{h-j} = \displaystyle\sum_{j=0}^{h} j \cdot \frac{2^{h}}{2^{j}}$

If we factor out the $2^{h}$ term, then we get:

$T(n) = 2^{h} \displaystyle\sum_{j=0}^{h} \frac{j}{2^{j}}$

As we know, $\displaystyle\sum_{j=0}^{\infty} \frac{j}{2^{j}}$ is a series that converges to 2 (for details, you can refer to this wiki).
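For completeness, that value follows from differentiating the geometric series (a standard derivation, added here as a reminder):

```latex
\sum_{j=0}^{\infty} x^{j} = \frac{1}{1-x} \quad (|x| < 1)
\;\Longrightarrow\;
\sum_{j=1}^{\infty} j x^{j-1} = \frac{1}{(1-x)^{2}}
\;\Longrightarrow\;
\sum_{j=0}^{\infty} j x^{j} = \frac{x}{(1-x)^{2}}
```

Substituting $x = \tfrac{1}{2}$ gives $\frac{1/2}{(1/2)^{2}} = 2$.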

Using this we have:

$T(n) = 2^{h} \displaystyle\sum_{j=0}^{h} \frac{j}{2^{j}} \le 2^{h} \displaystyle\sum_{j=0}^{\infty} \frac{j}{2^{j}} \le 2^{h} \cdot 2 = 2^{h+1}$

Based on the condition $2^{h} \le n$, we have:

$T(n) \le 2^{h+1} \le 2n = O(n)$

Now we have proved that building a heap is a linear-time operation.

Summary

In this article, we examined what a heap is and how it behaves (heapify-up and heapify-down) by implementing one. More importantly, we analyzed the time complexity of building a heap and proved that it is a linear operation.