A vast collection of computers and components is networked together to form one of the most important technological resources in the world today: the internet.
Its technology and architecture let users around the world keep up with information and news at the click of a mouse. What is less widely understood is how this is achieved.
What is it about the devices used that enables users in different geographical areas to communicate? What is the mechanism behind the internet? Raheal Mazumder explains the elements and protocols that are used to help send and receive data across a network.
What is a protocol?
In networking, a protocol is a set of standards or rules that allows computers (which can be heterogeneous) to communicate with each other.
The International Organization for Standardization (ISO) called for a layered system consisting of various standard protocols, which became known as the 'open systems interconnection reference model' (or OSI Model for short).
Diagram 1: The OSI Model
The diagram above illustrates the use of the OSI model between two terminals in a network. Each layer performs a specific type of function that helps to promote communication.
Starting at the bottom of the stack, the 'physical' layer deals with the electrical and mechanical characteristics that allow an unstructured bit stream to be transmitted across the network via a physical medium.
The 'data-link' layer works closely with the physical layer to help transfer data across the medium. Blocks of data (frames) are sent over the network with the necessary synchronisation, error control and flow control. The 'network' and 'session' layers share a common function of establishing, maintaining and terminating connections.
However, the session layer provides a control structure for communication between applications, whereas the network layer deals with connection management across the network, including the routing of packets and traffic engineering.
The 'transport' layer is concerned with reliable data transfer, carrying out error checking and controlling the flow of traffic between the two terminals.
The 'presentation' layer, which sits between the session and application layers, deals with how data is represented, for example its encoding. Last but not least, the 'application' layer represents the services available to users. Such services include FTP (file transfer protocol), HTTP (hypertext transfer protocol, used by the World Wide Web) and e-mail.
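To make the idea of a layered stack concrete, here is a minimal Python sketch. It assumes that each layer simply adds a bracketed label to the data on the way down and removes it on the way up. Real layers add binary headers (and the physical layer adds none), so the function names and the label format are purely illustrative. The list includes the presentation layer (layer 6), which is part of the standard seven-layer model.

```python
# The seven OSI layers, listed from layer 1 (bottom) to layer 7 (top).
OSI_LAYERS = [
    "physical",
    "data-link",
    "network",
    "transport",
    "session",
    "presentation",
    "application",
]


def encapsulate(payload: str) -> str:
    """Wrap the payload in one illustrative label per layer."""
    wrapped = payload
    # The sender works down the stack, so the application label ends up
    # innermost and the physical label outermost.
    for layer in reversed(OSI_LAYERS):
        wrapped = f"[{layer}]{wrapped}"
    return wrapped


def decapsulate(wrapped: str) -> str:
    """Strip the labels again, bottom layer first, as the receiver would."""
    for layer in OSI_LAYERS:
        label = f"[{layer}]"
        if not wrapped.startswith(label):
            raise ValueError(f"expected {label} label")
        wrapped = wrapped[len(label):]
    return wrapped
```

Calling decapsulate(encapsulate("hello")) gives back "hello", which mirrors the way the receiving terminal undoes, layer by layer, what the sending terminal added.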
Out of all the seven layers in the OSI Model, the transport layer (layer 4) will be discussed in further detail.
Layer 4: The transport layer
Just to recap, information is transmitted across the internet by breaking the data down into packets, which are then transmitted one after the other.
Of course, there are factors that affect the transmission, such as the route taken and delay times. But the main thing is that the packets are transmitted from the source to the destination terminal.
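The idea of breaking data into packets and putting it back together can be sketched in a few lines of Python. This is a simplified illustration, not how any real protocol formats its packets: the function names packetize and reassemble, the packet size of 8 bytes and the use of a simple index per packet are all assumptions made for the example.

```python
import random


def packetize(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split data into (index, chunk) packets of at most `size` bytes."""
    return [(i // size, data[i:i + size]) for i in range(0, len(data), size)]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Put the packets back in index order and join their payloads."""
    return b"".join(chunk for _, chunk in sorted(packets))


message = b"Packets may arrive in any order."
packets = packetize(message, size=8)
random.shuffle(packets)  # simulate packets arriving out of order
restored = reassemble(packets)
```

Because every packet carries its own index, the destination can rebuild the original message regardless of the order in which the packets arrive.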
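The transport layer offers two protocols that aid the transmission of packets: TCP (transmission control protocol) and UDP (user datagram protocol). Both rely on IP (internet protocol), which strictly belongs to the network layer (layer 3) beneath them but is described here first because TCP and UDP depend on it.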
Internet protocol (IP):
The Internet Protocol is used to transmit packets from the source to the destination. This is achieved by devices called 'routers', which forward each packet along an appropriate route, one that is capable of reaching the other terminal so that the data is received.
Each router has a routing table listing the routes along which it will forward packets. Routers are also known as 'nodes' and act as intermediate systems (IS for short). Each terminal, or system within the network, has an individual IP address that (in IP version 4) is made up of four numbers between 0 and 255, each separated by a dot (e.g. 21.123.22.1).
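A routing table lookup can be sketched with Python's standard ipaddress module. The table below is hypothetical: the networks and the next-hop names (router-a, router-b, default-gateway) are invented for illustration, and the rule used to choose between overlapping routes (prefer the most specific network) is the common 'longest prefix match' approach rather than something stated in this article. The example address is written as 21.123.22.1 because the module may reject numbers with leading zeros.

```python
import ipaddress

# Hypothetical routing table: destination network -> next hop.
ROUTES = {
    ipaddress.ip_network("0.0.0.0/0"): "default-gateway",
    ipaddress.ip_network("21.0.0.0/8"): "router-a",
    ipaddress.ip_network("21.123.22.0/24"): "router-b",
}


def next_hop(destination: str) -> str:
    """Return the next hop of the most specific route containing the address."""
    address = ipaddress.ip_address(destination)
    # The 0.0.0.0/0 entry matches every IPv4 address, so there is always a match.
    matches = [network for network in ROUTES if address in network]
    best = max(matches, key=lambda network: network.prefixlen)
    return ROUTES[best]
```

With this table, next_hop("21.123.22.1") selects router-b, because the /24 network is the most specific entry that contains the address.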
To cross the network, a packet may have to pass through a series of other terminals (intermediate systems) before it reaches its final destination. Engineers and computer scientists can trace which intermediate systems a packet passes through simply by their IP addresses.
IP addresses have been incredibly useful for the RIAA (Recording Industry Association of America), as they helped it to track down users who had downloaded over 1000 songs illegally from P2P networks over the last 2 years.
Transmission control protocol (TCP):
The transmission control protocol plays a crucial part in transmitting packets accurately and efficiently across the network. Its function is to establish a connection between the source and destination terminals and to ensure that the connection is reliable.
This means making sure that data is not lost, duplicated or delivered out of order at any time during transmission. This service is assured by a number of technical tasks that the protocol carries out.
TCP: Task1: Establishing the connection
The first task that the protocol performs is establishing the connection, which begins with a procedure called the '3-way handshake'. The best way to explain the 3-way handshake is to use an example.
Let's say that a client computer wants to send data to a server computer. To find out whether the two computers can actually reach each other, a test is carried out. A packet is sent to the server across the network with a flag (an identifier) set in it.
This flag is called 'SYN'. Once it has been received, the server should reply to the client computer with a packet that has both the SYN flag and an ACK flag set (SYN-ACK).
The flags are important because the ACK confirms that the server has received the client's SYN packet, while the SYN asks the client to confirm in turn. Once that is done, the client sends a packet with the ACK flag back to the server, and the connection is established.
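The three steps of the handshake can be written out as a small Python sketch. It models only the flags and the numbers exchanged, not a real network connection. The Segment class and the function name are assumptions made for this example. The detail that each side picks an initial sequence number, and that an acknowledgement carries that number plus one, comes from the TCP specification rather than from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    sender: str
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(client_isn: int, server_isn: int) -> list[Segment]:
    """Return the three segments that open a connection.

    client_isn and server_isn are the initial sequence numbers chosen
    by the client and the server respectively.
    """
    # Step 1: the client asks to open a connection.
    syn = Segment("client", "SYN", seq=client_isn)
    # Step 2: the server acknowledges the client's SYN and sends its own.
    syn_ack = Segment("server", "SYN-ACK", seq=server_isn, ack=syn.seq + 1)
    # Step 3: the client acknowledges the server's SYN.
    ack = Segment("client", "ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]
```

For example, three_way_handshake(100, 300) produces a SYN, then a SYN-ACK that acknowledges 101, then an ACK that acknowledges 301.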
TCP: Task2: Transfer of data
As mentioned before, TCP resends lost packets, helps to prevent congestion, discards duplicated packets and delivers data in sequence order.
Data transfer is full duplex, meaning that both computers can send at the same time. A packet of data is made up of a collection of bytes, and each byte is assigned an individual sequence number. For example, if a packet consisted of four bytes of data, the sequence numbers of the bytes would be 01, 02, 03 and 04.
Once the packet containing bytes 01, 02, 03 and 04 has been sent, the receiving computer replies to the client with an acknowledgement carrying the number '05'. This confirms that it has received the bytes with sequence numbers 01, 02, 03 and 04, and that '05' is the sequence number of the next byte it expects, which will arrive with the next packet.
If for some reason the last two bytes were corrupted, the receiving computer will instead send an acknowledgement with the number 03, confirming that it received only the two bytes with sequence numbers 01 and 02 and not the last two, numbered 03 and 04. In this case, the missing bytes are re-sent.
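The acknowledgement rule in the example above can be sketched as follows. The function names and the use of a list of true/false values to mark which bytes arrived intact are assumptions made for illustration; a real TCP implementation works on ranges of bytes rather than checking them one at a time.

```python
def next_expected(first_seq: int, received_ok: list[bool]) -> int:
    """Return the acknowledgement number for a run of bytes.

    received_ok[i] says whether the byte numbered first_seq + i arrived
    intact. The acknowledgement is the number of the first byte that is
    still missing.
    """
    ack = first_seq
    for ok in received_ok:
        if not ok:
            break
        ack += 1
    return ack


def bytes_to_resend(first_seq: int, received_ok: list[bool]) -> list[int]:
    """Return the sequence numbers the sender must transmit again."""
    ack = next_expected(first_seq, received_ok)
    return list(range(ack, first_seq + len(received_ok)))
```

With all four bytes intact, next_expected(1, [True, True, True, True]) gives 5. If the last two bytes are corrupted, next_expected(1, [True, True, False, False]) gives 3 and bytes_to_resend reports bytes 3 and 4.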
Numbering the bytes in this way supports flow control and congestion throttling, and allows data to be put back in order according to its sequence numbers.
TCP: Task3: Terminating the connection
Finally, there is the termination stage of the connection. A connection can be terminated in much the same way as it is established, with an exchange similar to the 3-way handshake. In this case, the client computer sends the receiver a packet with a FIN flag set.
The receiving computer replies with a packet in which both the FIN and ACK flags are set, confirming that the connection will be terminated. Once that is done, the client computer sends an ACK back to the receiver and the TCP connection is terminated.
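The closing exchange can be sketched in the same style. The function name and its parameter are assumptions made for this example. The three-packet form matches the description above; in practice the receiving computer often sends its ACK and its FIN as two separate packets, which gives four packets in total, so the sketch shows both variants.

```python
def close_connection(combine_fin_and_ack: bool = True) -> list[tuple[str, str]]:
    """Return the (sender, flags) packets exchanged when the client closes."""
    packets = [("client", "FIN")]
    if combine_fin_and_ack:
        # The receiver acknowledges and closes its side in a single packet.
        packets.append(("server", "FIN-ACK"))
    else:
        # The receiver acknowledges first and closes its side later.
        packets.append(("server", "ACK"))
        packets.append(("server", "FIN"))
    # The client acknowledges the receiver's FIN; the connection is closed.
    packets.append(("client", "ACK"))
    return packets
```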
The transmission control protocol is generally used together with the internet protocol, mainly because IP on its own does not guarantee that packets will arrive in order at the destination terminal, or that they will arrive at all.
That is why TCP is used to provide reliable data transfer. Remember that IP is useful for forwarding packets across the network; TCP acts as a 'perfectionist' when it comes to packet transmission.
User datagram protocol (UDP):
The user datagram protocol is another protocol that simply transmits data from the source to the destination, much as IP does. However, it is unreliable compared with TCP.
There is no guarantee that packets will arrive in order, and delays and packet losses can occur. Packets can also be duplicated. This contrasts sharply with TCP, which has mechanisms to prevent these issues, as mentioned previously.
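The contrast with TCP shows up clearly in code. The sketch below uses Python's standard socket module to send a single UDP packet (a datagram) from one socket to another on the same machine. There is no handshake and no acknowledgement: the sender simply sends, and the receiver gives up if nothing arrives in time. The function name udp_roundtrip and the one-second timeout are assumptions made for this example.

```python
import socket


def udp_roundtrip(message: bytes, timeout: float = 1.0):
    """Send one datagram over the loopback interface and return what arrives.

    Returns None if nothing arrives before the timeout, because UDP
    itself never resends a lost datagram.
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Port 0 asks the operating system to pick any free port.
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(timeout)
        # No connection is set up: the datagram is sent straight away.
        sender.sendto(message, receiver.getsockname())
        try:
            payload, _source = receiver.recvfrom(4096)
        except socket.timeout:
            return None
        return payload
    finally:
        sender.close()
        receiver.close()
```

On a single machine the datagram will almost always arrive, but the code still has to allow for the case where it does not, since nothing in UDP will recover it.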
The applications of UDP include broadcasting. You will notice that when you stream audio or video in real time from the internet, parts of the media are sometimes not seen or heard properly because packets of data were lost along the way. Those which are lost aren't re-transmitted since there isn't time to.
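The effect of lost packets on real-time playback can be imitated with a short simulation. This is only a sketch under stated assumptions: the function name stream, the loss_rate parameter and the "<gap>" marker are invented for illustration, and losses are drawn at random from a seeded generator rather than caused by a real network.

```python
import random


def stream(frames: list[str], loss_rate: float, seed: int) -> list[str]:
    """Simulate real-time playback where lost frames are skipped, not resent."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    played = []
    for frame in frames:
        if rng.random() < loss_rate:
            played.append("<gap>")  # frame lost; playback moves on without it
        else:
            played.append(frame)
    return played
```

Each lost frame leaves a gap in the output instead of pausing playback, which is the trade-off that makes UDP suitable for live audio and video.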
In summary, communication between two computers in a network requires a standardised set of rules and procedures as a backbone for transferring data; the OSI model provides that framework.
One of the most important layers of the OSI model is the transport layer (layer 4), whose protocols, TCP and UDP, work on top of the internet protocol (IP). IP, which belongs to the network layer beneath, forwards packets via 'routing', sending data along an appropriate route through the network.
The transmission control protocol (TCP) works with IP to provide a reliable connection, preventing loss and duplication of packets and making sure that data is delivered in sequence order.
The user datagram protocol (UDP) is the alternative, which differs from TCP in that it has no congestion control and allows packets to be lost or duplicated. It is mainly used in broadcasting and in streaming audio and video.
Glossary
Protocol: A set of rules and procedures that are followed so that two or more computers can communicate in a network.
Router: A device used in a network to forward packets from the source towards the destination along a route chosen from those available. Routers contain a routing table that defines the routes.
Sequence number: A number assigned to each byte of data, which is used to ensure that data is delivered correctly and in order. Sequence numbers also apply to the packets involved in establishing a connection (a role of TCP).
Intermediate system: A device or terminal in a network through which packets pass on their way from the source to the destination terminal. A router may be considered an intermediate system.
IP address: A unique string of four numbers (separated by full stops) that represents a terminal in the network. It could be the source terminal, the destination terminal or an intermediate system (a terminal which packets pass through to reach their destination).
Note: Diagrams designed and produced by the author (Raheal Mazumder)