Data journey through the Internet - The OSI model approach

Amarachi Iheanacho - Jun 6 - Dev Community

In recent years, the Internet has become such a staple of everyday life that it is hard to imagine a world without it, let alone to remember that most of human history and its advancements happened without it.

But the Internet, as mystical as it seems, can simply be explained as a large network connecting countless diverse computers and devices, all looking to communicate with each other. Although these devices operate under different protocols, use different data formats, and use different addressing schemes, they can all exchange data seamlessly due to the standardization offered by the OSI model.

This article gets into what exactly the OSI model is and how it is used to send data through the internet.

What is the OSI model?

The Open Systems Interconnection (OSI) model, developed by the International Organization for Standardization (ISO) in 1984, acts as a blueprint for network communication. It breaks down communication between computer systems into seven layers: Physical, Data Link, Network, Transport, Session, Presentation, and Application.

The OSI model layers, image from Cloudflare

Before going further, note that while this article focuses on the OSI model, published in 1984, the way devices communicate over the Internet has evolved since then.

Today’s Internet runs on a related but more compact model called the Transmission Control Protocol/Internet Protocol (TCP/IP) model.

This TCP/IP model streamlines things by combining the top three OSI layers (Application, Presentation, and Session) into a single Application layer. It also merges the Data Link and Physical layers into a Network Access layer. In simpler terms, the TCP/IP model has four layers: Network Access, Internet, Transport, and Application.

Now, this streamlining does not mean that the tasks done by the Presentation Layer and the Session Layer are unimportant in this internet age; they just work a bit differently, and the next couple of sections will discuss this. The rest of the article will discuss the layers that make up the OSI model and the roles they play in network communication.

The OSI model layers

The OSI model is split into seven layers, and these layers are:

Physical Layer (Layer 1):
The physical layer is the foundation of data communication in the OSI model, defining the hardware elements involved in the network. This layer handles the transmission and reception of raw bitstreams (1s and 0s) that make up digital information. Its primary function is to transmit this data stream over physical media like cables (coaxial, fiber optic) or wireless signals (Wi-Fi). Depending on the medium, the physical layer converts the data into electrical signals for cables or radio waves for wireless transmission.

Data Link Layer (Layer 2):
The data link layer is responsible for error-free data transfer between nodes within the same network, using the physical layer beneath it. To see why it is needed, start with how hosts find each other:
Hosts - devices on a network that send, receive, and process data, such as computers, servers, and smartphones - use an addressing scheme called Internet Protocol (IP) addresses to identify each other; the Layer 3 section describes this in more detail.

These IP addresses act like unique mailing addresses on the network, allowing hosts to send information to each other.

However, as with some deliveries, the data does not travel from the sending host to the receiving host in a single trip. There are little helpers along the way that pass the information from one hop to the next until it reaches the IP address of the receiving host.
And that’s where the data link layer comes in.

Addressing and Devices:
The data link layer uses an addressing scheme called Media Access Control (MAC) addresses. These addresses are 48 bits long and typically displayed as 12 hexadecimal digits, like 00:1E:67:E4:CB:32. Every Network Interface Card (NIC) has a pre-assigned MAC address that is intended to be unique.
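
To make the format concrete, here is a small sketch that turns a MAC address string into its 48-bit integer value and back. The helper names `mac_to_int` and `int_to_mac` are made up for this illustration; they are not part of any networking library.

```python
def mac_to_int(mac: str) -> int:
    """Convert a colon-separated MAC address into its 48-bit integer value."""
    octets = mac.split(":")
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}")
    value = 0
    for octet in octets:
        # Each octet is two hexadecimal digits, i.e. 8 bits.
        value = (value << 8) | int(octet, 16)
    return value


def int_to_mac(value: int) -> str:
    """Format a 48-bit integer as a colon-separated, upper-case MAC address."""
    return ":".join(f"{(value >> shift) & 0xFF:02X}" for shift in range(40, -1, -8))


mac = "00:1E:67:E4:CB:32"
as_int = mac_to_int(mac)
print(as_int.bit_length() <= 48)   # the value fits in 48 bits
print(int_to_mac(as_int) == mac)   # the round trip gives back the same string
```

Six octets of 8 bits each give the 48 bits mentioned above, which is why the address is written as 12 hexadecimal digits.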

Now, you might wonder how data actually gets from one device to another on the same network. This is where switches come in.

Switches act as traffic directors within a network. They connect multiple devices to their various ports. To facilitate efficient data sharing, switches maintain MAC address tables that map specific ports to the corresponding MAC addresses of the devices connected to them.

To populate these MAC address tables, switches use a learn, flood, and forward method:

  • Learn: Whenever a host sends data over the network, it adds headers that hold the source and destination IP addresses and MAC addresses so that network devices know where to send the data.

So when a switch receives a frame, it examines the source MAC address in the frame header and the port on which the frame arrived. It then updates the MAC address table to associate this MAC address with that specific port. This way, the switch learns the location of devices on the network as they communicate.

  • Flood (if necessary): If the destination MAC address for the incoming frame isn't found in the table, the switch can't determine the recipient's location. In this case, the switch floods the frame out of all ports except the one it arrived on. This ensures that the frame reaches the intended recipient, even if the switch hasn't yet learned the destination address. While flooding seems inefficient, it's a temporary measure until the switch learns the right port.
  • Forward: Once the switch learns the destination MAC address and its port (either from the initial frame or a response), it can efficiently forward future frames for that device to the correct port.

In summary, the data link layer facilitates hop-to-hop delivery. It does this with the help of an addressing scheme, MAC addresses, and layer 2 devices called switches. Switches learn which MAC addresses are connected to which ports using the learn, flood, and forward technique.
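
The learn, flood, and forward steps can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: the `Switch` class is invented for this example, ports are plain integers, MAC addresses are shortened to "AA" and "BB", and real switches also age out old table entries.

```python
class Switch:
    """Toy layer 2 switch; ports are plain integers (an assumption for this sketch)."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Process one frame and return the list of ports it is sent out of."""
        # Learn: remember which port the source MAC address lives behind.
        self.mac_table[src_mac] = in_port

        # Forward: the destination is known, so use only its port.
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]

        # Flood: unknown destination, send out of every port except the ingress one.
        return [p for p in self.ports if p != in_port]


switch = Switch(ports=[1, 2, 3, 4])
first = switch.receive(in_port=1, src_mac="AA", dst_mac="BB")   # BB unknown: flooded
reply = switch.receive(in_port=2, src_mac="BB", dst_mac="AA")   # AA was learned: forwarded
second = switch.receive(in_port=1, src_mac="AA", dst_mac="BB")  # BB now learned: forwarded
print(first, reply, second)
```

The first frame is flooded because the table is empty; once the reply comes back, both addresses are in the table and later frames go out of a single port.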

Now that this article has explained how data travels hop-by-hop within a local network, how does the internet ensure these hops ultimately lead to the correct final destination, potentially located far away? And how do network devices choose the most efficient route for these hops?

That's where the network layer comes in. It plays a crucial role in directing data packets across the ocean of networks that is the Internet.

Network Layer (Layer 3)
The next layer in the OSI model is the Network Layer. This layer is responsible for ensuring end-to-end delivery of data between devices. It achieves this by using a unique addressing system called an IP address.

An IPv4 address is a 32-bit number, typically written in four sections (octets) separated by periods. Each section ranges from 0 to 255. An example of an IP address is 192.168.1.1.
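
As a quick illustration, Python's standard `ipaddress` module can show the same address both as four octets and as a single 32-bit number:

```python
import ipaddress

addr = ipaddress.IPv4Address("192.168.1.1")
as_int = int(addr)                     # the same address as one 32-bit number
octets = list(addr.packed)             # the four octets, each in the range 0-255
print(as_int, octets)
print(ipaddress.IPv4Address(as_int))   # back to dotted notation
```

The dotted form is only a human-friendly way of writing that one number.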

The Network Layer uses the receiving device's IP address to determine its location. Using this IP address, the network layer can determine if the target device is on the same local network (like your home network) or a different network somewhere on the Internet. This ability to identify network location relies on a technique called subnetting.

Knowing the target network allows the Network Layer to choose the most efficient way to send data:

  • Same Network: If the target device is on the same local network, the data is sent directly using a switch.
  • Different Network: If the target device is on a different network, the data is sent through a switch first. The switch then forwards the data to a router. Routers are responsible for directing data across different networks, ultimately reaching the target device on the internet.
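
The same-network check above can be sketched with the standard `ipaddress` module. The function name `next_hop`, the 192.168.1.x addresses, and the /24 prefix length are assumptions chosen for this example.

```python
import ipaddress


def next_hop(src_ip: str, prefix_len: int, dst_ip: str, gateway_ip: str) -> str:
    """Return the IP address the sender should deliver the data to next."""
    # The subnet mask (here given as a prefix length) defines the local network.
    local_net = ipaddress.ip_network(f"{src_ip}/{prefix_len}", strict=False)
    if ipaddress.ip_address(dst_ip) in local_net:
        return dst_ip       # same network: deliver directly
    return gateway_ip       # different network: hand over to the router


same = next_hop("192.168.1.10", 24, "192.168.1.20", "192.168.1.1")
other = next_hop("192.168.1.10", 24, "216.58.223.238", "192.168.1.1")
print(same, other)
```

In the first call the target shares the sender's network, so it is reached directly; in the second, the data goes to the router first.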

Understanding Routers and ARP
Routers act as intermediaries that connect different networks. They're similar to traditional hosts in that they have both IP and MAC addresses. However, unlike hosts, routers use this information specifically to route data packets to their intended destinations on the network.

To send data over a network, you need two types of addresses: the IP and MAC addresses. Finding the IP address is relatively easy; for a traditional host, you can obtain the IP address using systems like the Domain Name System (DNS).

A router’s IP address is typically configured along with the host's IP address during network setup. Here are some examples of other configurations you might set during network setup:

  • IP Address (IPv4): Unique identifier for the device on the network.
  • Subnet Mask: Defines the network portion of the IP address.
  • Default Gateway: The router's IP address, which acts as the gateway to the internet.

However, discovering the MAC address can be a little tricky. To find the MAC address of a particular host in a network, the sending host does the following:

  • Checking the ARP Cache: When a device needs to send data to another device on the same network, it first checks its Address Resolution Protocol (ARP) cache. This cache stores a list of known MAC addresses and their corresponding IP addresses. To see the ARP cache on your system, run:
arp -a

  • Sending an ARP Request (if not found): If the target device's IP address isn't found in the ARP cache, the sending device broadcasts an ARP request message on the network. This message essentially asks, "Who has the MAC address for [target device's IP address]?"
  • Receiving the MAC Address: All devices on the network receive the ARP request, but only the intended device responds. This response includes the target device's MAC address and is sent directly back to the sender.
  • Updating the ARP Cache: The sending device then updates its ARP cache with this new information. This allows the device to efficiently send future data packets directly to the target device without needing to repeat the ARP request.
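
Those four steps can be sketched as a toy model. This is not how an operating system implements ARP; the `Host` class, the list standing in for the network, and all addresses are invented for illustration.

```python
class Host:
    """Toy host with an ARP cache; the 'network' is just a list of hosts."""

    def __init__(self, ip, mac):
        self.ip = ip
        self.mac = mac
        self.arp_cache = {}  # IP address -> MAC address

    def resolve(self, target_ip, network):
        """Return (mac, source), where source is 'cache' or 'arp-request'."""
        # Step 1: check the ARP cache.
        if target_ip in self.arp_cache:
            return self.arp_cache[target_ip], "cache"

        # Step 2: broadcast a request; every host sees it, only the owner replies.
        for host in network:
            if host.ip == target_ip:
                # Steps 3 and 4: receive the reply and update the cache.
                self.arp_cache[target_ip] = host.mac
                return host.mac, "arp-request"
        return None, "arp-request"  # nobody answered


host_a = Host("192.168.1.10", "AA:AA:AA:AA:AA:AA")
host_b = Host("192.168.1.20", "BB:BB:BB:BB:BB:BB")
network = [host_a, host_b]

print(host_a.resolve("192.168.1.20", network))  # first lookup needs an ARP request
print(host_a.resolve("192.168.1.20", network))  # second lookup is answered from the cache
```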

With both the IP and MAC addresses in hand, the sending device can package the data into a frame and transmit it over the network.
Once your data has reached the receiving host, how do you ensure it is delivered to the correct application in the right format? This is the job of the transport layer.

Transport Layer (Layer 4)
This fourth OSI layer facilitates service-to-service delivery. If you are reading this article on your laptop or phone, you probably have multiple tabs open, all continuously sending data to and receiving data from hosts. How does each tab's data stay separate from the others? The transport layer achieves this using an addressing scheme called ports.

You can think of ports as numbered doorways on a computer. Each application that uses the network is assigned a port number between 0 and 65535 to send and receive data.
Here is a deeper look into how ports keep things organized on both the server and client side:

  • Servers: Servers act like well-known businesses with fixed addresses. They listen for incoming requests on predefined ports. For example, web servers typically listen on port 80 (HTTP) or 443 (HTTPS). This allows clients (like your web browser) to find and connect to the desired service reliably.
  • Clients: Unlike servers, clients choose random, temporary ports (called ephemeral ports) for their connections. IANA suggests the range 49152 to 65535 for these ports, although some operating systems use a different range. Choosing random ports helps avoid conflicts between apps trying to use the same port. It also adds an extra layer of security and allows a single device to make multiple connections simultaneously.

So, when a client wants to connect to a server, it sends its request to the server's IP address and its predefined port. The server then responds to the client's IP address and the temporary port used for the connection.
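
You can watch this happen on your own machine with Python's standard `socket` module. The sketch below runs entirely on the loopback address 127.0.0.1, and, as an assumption made so it needs no special privileges, the server asks the operating system for any free port instead of a well-known one like 80.

```python
import socket

# Server side: bind to a port and listen. Port 0 asks the OS for any free port,
# standing in for a well-known port such as 80.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_ip, server_port = server.getsockname()

# Client side: connect without choosing a source port; the OS picks an ephemeral one.
client = socket.create_connection((server_ip, server_port))
client_ip, client_port = client.getsockname()

# The server sees the client's IP address and ephemeral port as the peer address.
conn, peer = server.accept()
print("server listens on", server_port)
print("client connected from", client_port)
print("server sees peer", peer)

conn.close()
client.close()
server.close()
```

The port numbers differ on every run, but the peer address the server sees always matches the client's own IP address and temporary port.
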
The transport layer also uses two main protocols to handle data delivery: the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP).

  • TCP (Transmission Control Protocol): TCP is ideal for data transmissions where data integrity is crucial. It ensures that data is ordered and guaranteed to arrive, making it suitable for file transfers, emails, and web browsing.
  • UDP (User Datagram Protocol): Conversely, UDP sends data without guaranteed delivery or order, making it ideal for situations where speed is more important than data integrity, such as online gaming and live streaming.

Session Layer (Layer 5)
The fifth layer of the OSI model, the session layer, is responsible for establishing, managing, and terminating communication sessions between applications on different devices. It ensures that sessions remain open while data is being exchanged and can close them once the communication is complete.

While this article mentioned that layers 5-7 (Session, Presentation, and Application) are somewhat compressed into a single layer in modern protocols like TCP/IP, there's a historical reason for this. In the past, personal computers weren't as common. People primarily used large, centralized computers called mainframes. Since these mainframes were shared by multiple users, the Session layer played a crucial role in efficiently managing these concurrent sessions.

However, with advancements in technology and the democratization of computers, the distinctions between these layers are becoming less significant.

Here's a real-world example of session handling in today's world: When you connect to Wi-Fi at a restaurant or bar, your phone or laptop usually receives a new IP address because it's on a different network. However, thanks to session management, which modern apps typically handle themselves with mechanisms such as session tokens, you don't need to log in to your apps again; your session carries on.
The session layer's job is this kind of session continuity, ensuring seamless and efficient communication for applications that require ongoing interactions.

Presentation Layer (Layer 6):
Next, on the OSI model, you have the presentation layer. This layer acts as a translator for network communication. It ensures that the data sent from one application (like your web browser) can be understood by another application on a different system.

This is especially important when two communicating devices are using different encoding methods. The presentation layer converts data formats, handles encryption and decryption, and manages compression and decompression. By ensuring data is presented in a readable format, this layer allows applications to interpret and use the information correctly.
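
Here is a minimal sketch of two of those jobs, character encoding and compression, using only Python's standard library. Encryption is left out to keep the example short, and the sample text is arbitrary.

```python
import zlib

text = "Héllo, wörld! " * 20          # text containing non-ASCII characters

encoded = text.encode("utf-8")         # translate characters into bytes
compressed = zlib.compress(encoded)    # shrink the bytes before sending

# ... the bytes would travel over the network here ...

restored = zlib.decompress(compressed).decode("utf-8")
print(len(encoded), len(compressed), restored == text)
```

As long as both sides agree on the encoding (UTF-8 here) and the compression format, the receiver recovers exactly the text the sender started with.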

Application Layer (Layer 7):
The application layer is the topmost layer of the OSI model, and it's the only layer that directly interacts with data from the user. It provides network services directly to user applications, facilitating tasks such as sending emails, retrieving web pages, and transferring files.

This layer is responsible for identifying communication partners, ensuring resource availability, and synchronizing communication. It also translates user data into a format suitable for network transmission and vice versa; the lower layers eventually turn that data into bits (1s and 0s). Common protocols operating at this layer include HTTP, FTP, SMTP, and DNS. The application layer ensures that network communication is user-friendly and accessible, enabling effective interaction with network resources.

Putting it Together: A Practical Example

Now that this article has established the basics, how do these components work together to exchange data in the real world?

To start, note that sending data through the Internet involves the data traveling down the seven layers of the OSI model on the sending device, from the Application layer to the Physical layer, and then up the seven layers on the receiving device.

To simulate how data exchange would happen through the OSI model, let's take the example of visiting a webpage.

On the sending end

Visiting a webpage over the internet is largely just sending and receiving data. You send a request specifying the webpage you want to see, and in return, you receive the requested content.

Here is how the data transaction would work in an OSI model.

Application layer (Layer 7)
This journey starts at the Application layer when you type a website into your web browser and hit enter. In this layer, the browser uses the Domain Name System (DNS) protocol to get the website's IP address, facilitating end-to-end delivery.

DNS acts like a phonebook for the internet. It translates human-readable domain names like "google.com" into machine-readable IP addresses like "216.58.223.238" that computers use to connect to websites. This makes it much easier for people to remember and access websites. To learn how DNS works exactly, check out this article: What is DNS?

Next, the application layer prepares a Hypertext Transfer Protocol (HTTP) or Hypertext Transfer Protocol Secure (HTTPS) request to fetch the requested webpage.
This HTTP request includes an HTTP version, a URL, a method, request headers, and an optional body.
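
To see those parts side by side, the sketch below assembles a minimal HTTP/1.1 request by hand. The helper `build_http_request` is hypothetical and exists only for this illustration; real browsers send many more headers.

```python
def build_http_request(method: str, path: str, host: str, body: str = "") -> bytes:
    """Assemble a minimal HTTP/1.1 request as raw bytes."""
    body_bytes = body.encode("utf-8")
    lines = [
        f"{method} {path} HTTP/1.1",   # method, URL path, and version
        f"Host: {host}",               # request headers start here
        "Connection: close",
    ]
    if body_bytes:
        lines.append(f"Content-Length: {len(body_bytes)}")
    head = "\r\n".join(lines) + "\r\n\r\n"   # a blank line ends the headers
    return head.encode("ascii") + body_bytes


request = build_http_request("GET", "/", "google.com")
print(request.decode("ascii"))
```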

Presentation layer (Layer 6)
The browser formats the HTTP request so that the server can interpret it.
Additionally, this layer handles data encryption and compression if you're accessing a secure website (HTTPS). Encryption scrambles the data to protect it from prying eyes, while compression reduces the data size for faster transmission.

Session Layer (Layer 5):
The browser establishes a session with Google’s server. This layer manages the session creation, maintenance, and termination, ensuring a smooth back-and-forth flow of data.
For websites prioritizing security, like those with HTTPS in the address bar, the Session layer works hand-in-hand with protocols like SSL (Secure Sockets Layer) or its more modern successor, TLS (Transport Layer Security). These protocols create a secure encrypted tunnel to safeguard the data traveling between your browser and the server.

Transport Layer (Layer 4):
In this layer, the HTTP request is broken down into smaller, manageable segments. The browser relies on TCP to ensure these segments are delivered to the correct port on the server and that they all arrive safely and in the right order.
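
The idea of splitting data into numbered segments and restoring their order can be sketched in a few lines. This is a simplification, not real TCP: there are no acknowledgements or retransmissions, the segment size of 8 bytes is arbitrary, and the function names are made up for this example.

```python
import random


def segment(data: bytes, size: int):
    """Split data into (sequence_number, chunk) pairs of at most `size` bytes."""
    return [(offset, data[offset:offset + size]) for offset in range(0, len(data), size)]


def reassemble(segments):
    """Put segments back in order using their sequence numbers."""
    return b"".join(chunk for _, chunk in sorted(segments))


message = b"GET / HTTP/1.1\r\nHost: google.com\r\n\r\n"
segments = segment(message, size=8)
random.shuffle(segments)               # simulate out-of-order arrival
print(reassemble(segments) == message)
```

Because every segment carries its position in the original data, the receiver can rebuild the message no matter what order the segments arrive in.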

Network Layer (Layer 3):
The Network Layer wraps each segment in a packet by adding a header that includes the IP addresses of your computer, Host A, and the web server, Host B.
Using the IP address it got from DNS, Host A checks whether the receiving host, Host B, is on the same network. It also uses this IP address to determine the best route to send the packets across the Internet.

Data Link Layer (Layer 2):
The data link layer wraps each packet in a frame. The frame header contains the MAC address of Host A's network card as the source and the MAC address of the next hop as the destination: Host B's network interface if Host B is on the same network, or the router otherwise. This layer ensures the frames are delivered correctly within your local network.

How Layer 3 and Layer 2 will work together
After adding the IP header to the data, Host A determines whether the sending and receiving hosts are on the same network. If they are not, Host A needs to send the data to its default gateway, which is a router.

Host A uses an ARP request to find the router's MAC address and stores the ARP mapping (an IP address mapped to a MAC address) in its ARP cache. Once Host A has the MAC address, it adds the layer 2 header, turning each packet into a frame, and sends the frames to the next hop or node.

Typically, hosts are not directly connected to routers but to switches, which are then connected to a router.

When the data reaches the router, the layer 2 (MAC address) header is discarded, as this layer is not needed for the next hop, which has a different source and destination and thus would need a new layer 2 header.

The Internet is a web of interconnected networks, usually involving multiple connected routers. So, when Host A sends data to the router, the router forwards it to the next router until it reaches the router in Host B's network. To learn how routers send data to each other, check out this article, How Routers facilitate communication.

So the data moves from one hop to another, discarding and re-adding layer 2 headers as needed until the data packet gets to the final router in Host B’s network.

The router in Host B's network will then send an ARP request to figure out Host B's MAC address and send the data to a switch in the network. The switch then sends the data to Host B, which finally strips the Layer 2 and Layer 3 headers.
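
The hop-by-hop behavior described above can be sketched as a small simulation: the layer 3 packet stays the same from Host A to Host B, while a fresh layer 2 frame is built for every hop. The class names, the placeholder MAC labels, and the simplification that each router has a single MAC address are all assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:          # layer 3: addressed once by the sender
    src_ip: str
    dst_ip: str
    payload: bytes


@dataclass(frozen=True)
class Frame:           # layer 2: rebuilt on every hop
    src_mac: str
    dst_mac: str
    packet: Packet


def send_along(packet: Packet, hop_macs: list) -> list:
    """Wrap the same packet in a fresh frame for each hop along the path."""
    frames = []
    for src_mac, dst_mac in zip(hop_macs, hop_macs[1:]):
        frames.append(Frame(src_mac, dst_mac, packet))
    return frames


packet = Packet("192.168.1.10", "216.58.223.238", b"GET / HTTP/1.1")
path = ["MAC-HOST-A", "MAC-ROUTER-1", "MAC-ROUTER-2", "MAC-HOST-B"]
frames = send_along(packet, path)
for frame in frames:
    print(frame.src_mac, "->", frame.dst_mac, "|", frame.packet.src_ip, "->", frame.packet.dst_ip)
```

Each printed line shows different MAC addresses but the same pair of IP addresses, which is the difference between hop-to-hop and end-to-end delivery.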

Physical Layer (Layer 1):
Your network card converts the frames into electrical signals to travel over a cable, or into radio waves for Wi-Fi.

On the Receiving End

When the data reaches the receiving device (Host B), it goes back up the seven OSI model layers. Let's follow the journey of the data:

Physical Layer (Layer 1): The data arrives as electrical signals (for wired connections) or radio waves (for wireless connections) and is received by Host B's network interface card (NIC). The NIC converts these signals back into binary data.

Data Link Layer (Layer 2): The NIC processes the incoming frames. It checks the frame header for the MAC address to confirm it's the intended recipient. The frame header is then stripped off, and the remaining data is passed to the Network layer.

Network Layer (Layer 3): At this layer, the packets are examined to ensure they are addressed to Host B’s IP address. If the packet is for Host B, the network layer strips off the IP header and reassembles the fragments into the original segments.

The data is then passed up to the Transport layer.

Transport Layer (Layer 4): The Transport layer (typically using TCP) reassembles the segments into the original message. It checks for errors and ensures all data segments are correctly ordered.
Once reassembled, the data is passed to the Session layer.

Session Layer (Layer 5): This layer manages the session between the browser and the web server. It ensures the session remains open as long as data exchange is needed.

If the session uses SSL/TLS, it ensures that the data remains encrypted until it is passed to the Presentation layer.

Presentation Layer (Layer 6): If the data is encrypted (e.g., HTTPS), the Presentation layer decrypts it. This layer also handles any data formatting or translation needed to make the data understandable to the Application layer.

The formatted data is then passed to the Application layer.

Application Layer (Layer 7): The Application layer receives the HTTP response. The web browser processes the HTTP response, which contains the HTML, CSS, JavaScript, and other resources needed to render the webpage.

The browser then renders the webpage, displaying it to the user.

Wrapping it up

The internet is a deeply interesting web of devices. As more people around the world have clever ideas and develop new smart devices, the internet becomes even more complex and diverse.

The internet only works well because these diverse devices, despite their differences, can communicate with each other. This is thanks to standardized sets of rules called protocols. No matter where your devices are located or what kind of devices they are, they can talk to each other across the vast ocean of the internet, understand each other, and exchange information. That, my friends, is the beauty of the OSI model.

As discussed extensively in this article, the OSI model is a seven-layered framework that acts like a translator for all these devices. Each layer has its own specific protocol, like a specialized language, that allows it to communicate with the layers above and below it. These protocols ensure that data is packaged correctly, addressed for its destination, and delivered efficiently.
