How does the Internet really work?
The internet is a vast and complex system of interconnected networks that lets us access information from all over the world. In this blog, we'll look in detail at how the Internet really works and what happens behind the scenes when you open a website.
The internet has become an integral part of our lives, and we use it daily to access information, connect with others, and conduct business. However, have you ever wondered how the internet really works? In this blog, we will take a detailed look at the underlying technologies that power the internet, including DNS resolution, the TCP handshake, the TLS handshake, HTTP requests and responses, and browser rendering.
Domain Name System (DNS) Resolution
When you type a URL into your web browser, such as "www.smartutr.com," your computer needs to know the IP address of the server that hosts the website. The Domain Name System (DNS) is responsible for translating human-readable domain names into IP addresses that computers can understand.
The DNS is a distributed, hierarchical database spread across a very large number of servers worldwide. When you enter a URL, your computer first checks its local caches and then sends a request to a recursive DNS resolver, typically provided by your Internet Service Provider (ISP) or a public DNS service. If the resolver has no cached answer, it queries the root, top-level domain (TLD), and authoritative name servers in turn to find the IP address of the website's server. You can also read our blog for a detailed explanation of DNS resolution.
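To make the lookup concrete, here is a minimal Python sketch that asks the operating system's resolver for the addresses behind a hostname. The function name `resolve` and the hostname "example.com" are placeholders chosen for illustration, and the sketch does not show the individual queries the resolver makes on your behalf.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated IP addresses for a hostname."""
    # getaddrinfo hands the lookup to the operating system's resolver,
    # which normally consults local configuration and caches before
    # contacting a DNS resolver over the network.
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({sockaddr[0] for *_, sockaddr in results})


if __name__ == "__main__":
    # "example.com" is a placeholder; this call needs network access.
    print(resolve("example.com"))
```

Passing an IP address literal instead of a name returns that address unchanged, because there is nothing to look up.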
TCP Handshake
Once your computer has the IP address of the website's server, it uses the Transmission Control Protocol (TCP) to establish a connection with the server. TCP is a protocol that provides reliable, ordered, and error-checked delivery of data between applications running on different hosts.
The TCP handshake is a three-way process that establishes a connection between two hosts: the client (your computer) and the server (the website's computer). The handshake involves the following steps:
- The client sends a SYN (synchronize) packet to the server, indicating that it wants to establish a connection.
- The server responds with a SYN-ACK (synchronize-acknowledgment) packet, indicating that it has received the SYN packet and is willing to establish a connection.
- The client sends an ACK (acknowledgment) packet to the server, confirming that it has received the SYN-ACK packet and the connection is established.
Once the TCP handshake is complete, the client and server can exchange data using TCP.
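The three steps above can be sketched as a small Python simulation. This is only a model of the exchange, not real networking code: in practice the operating system performs the handshake for you when a program opens a connection. The `Segment` class, the function name, and the starting sequence numbers are all made up for illustration. The one detail added beyond the list above is that each acknowledgment number is the other side's sequence number plus one.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_SPACE = 2**32  # TCP sequence numbers are 32-bit and wrap around


@dataclass
class Segment:
    flags: str                 # "SYN", "SYN-ACK", or "ACK"
    seq: int                   # sender's sequence number
    ack: Optional[int] = None  # next sequence number expected from the peer


def three_way_handshake(client_isn: int, server_isn: int) -> list[Segment]:
    """Model the three segments exchanged to open a TCP connection."""
    # Step 1: the client proposes its initial sequence number (ISN).
    syn = Segment("SYN", seq=client_isn)
    # Step 2: the server acknowledges the client's ISN + 1 and
    # proposes its own ISN.
    syn_ack = Segment("SYN-ACK", seq=server_isn,
                      ack=(syn.seq + 1) % SEQ_SPACE)
    # Step 3: the client acknowledges the server's ISN + 1.
    ack = Segment("ACK", seq=(syn.seq + 1) % SEQ_SPACE,
                  ack=(syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(client_isn=1000, server_isn=5000):
        print(segment)
```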
TLS Handshake
While TCP provides reliable and ordered data delivery, it does not provide security. The Transport Layer Security (TLS) protocol is used to encrypt and authenticate data exchanged between the client and server to ensure confidentiality and integrity.
The TLS handshake takes place after the TCP handshake, over the connection that TCP has just established, and involves the following steps:
- The client sends a ClientHello message to the server, specifying the TLS version and the cryptographic algorithms it supports.
- The server responds with a ServerHello message, indicating the TLS version and cryptographic algorithms it will use for the session.
- The server sends its certificate, which contains its public key. The client verifies the certificate against the certificate authorities it trusts to confirm the server's identity.
- In the classic RSA key exchange (used in TLS 1.2 and earlier), the client generates a pre-master secret, encrypts it using the server's public key, and sends the encrypted pre-master secret to the server. TLS 1.3 replaces this step with an ephemeral Diffie-Hellman exchange.
- The server decrypts the pre-master secret using its private key. Both sides then combine it with the client and server random values to derive the session keys used for encryption and authentication.
Once the TLS handshake is complete, the client and server can exchange data securely using the agreed-upon encryption and authentication algorithms.
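In application code the handshake is usually handled by a TLS library. The sketch below uses Python's standard `ssl` module and assumes a client that verifies the server certificate and accepts TLS 1.2 or newer. The function names and the host "example.com" are placeholders. The TCP connection is opened first, and the TLS handshake then runs on top of it inside `wrap_socket`.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS configuration with certificate checks on."""
    # create_default_context enables certificate verification and
    # hostname checking using the system's trusted certificate authorities.
    ctx = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def negotiated_tls(host: str, port: int = 443) -> tuple:
    """Connect to host and return (TLS version, cipher suite name)."""
    ctx = make_client_context()
    # create_connection performs the TCP handshake.
    with socket.create_connection((host, port), timeout=10) as tcp_sock:
        # wrap_socket performs the TLS handshake over that connection.
        with ctx.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            return tls_sock.version(), tls_sock.cipher()[0]


if __name__ == "__main__":
    # "example.com" is a placeholder; this call needs network access.
    print(negotiated_tls("example.com"))
```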
HTTP Request and Response
After the TCP and TLS handshakes are complete, the client can send an HTTP request to the server, which will respond with an HTTP response. The Hypertext Transfer Protocol (HTTP) is the protocol used for transferring data between web servers and clients, and it works on top of the TCP and TLS protocols.
An HTTP request consists of several parts, including the request line, headers, and message body. The request line specifies the HTTP method (such as GET or POST), the requested URL, and the HTTP version. The headers provide additional information about the request, such as the user-agent, which specifies the type of client making the request, and the Accept-Encoding header, which specifies the compression algorithms supported by the client.
The message body, if present, contains data that the client wants to send to the server, such as form data or a file upload. Once the server receives the request, it processes it and sends an HTTP response back to the client.
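As a rough illustration of that layout, the following Python sketch assembles the bytes of an HTTP/1.1 request by hand. The helper name `build_request`, the host, and the header values are invented for the example. Real clients normally rely on an HTTP library rather than building requests this way.

```python
def build_request(method, host, path="/", headers=None, body=b""):
    """Assemble a raw HTTP/1.1 request as bytes."""
    # Request line: method, request target, and HTTP version.
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    all_headers = dict(headers or {})
    if body:
        # Tell the server how many bytes of body follow the headers.
        all_headers["Content-Length"] = str(len(body))
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    # A blank line separates the headers from the optional message body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


if __name__ == "__main__":
    request = build_request(
        "GET",
        "example.com",          # placeholder host
        "/index.html",
        {"User-Agent": "demo-client/0.1", "Accept-Encoding": "gzip"},
    )
    print(request.decode("ascii"))
```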
An HTTP response also consists of several parts, including the status line, headers, and message body. The status line indicates the result of the request through a status code, such as 200 (OK) for success or 404 (Not Found) for a missing resource. The headers provide additional information about the response, such as the Content-Type header, which specifies the type of data being returned, and the Cache-Control header, which specifies whether and for how long the response can be cached.
The message body contains the actual data being returned by the server, such as the HTML, CSS, JavaScript, or other content that makes up the website. The client then processes the response and displays the website to the user.
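The structure of a response can be illustrated with a small parser. This Python sketch splits a raw HTTP/1.1 response into its status line, headers, and body. It is deliberately simplified: it assumes the whole response is already in memory and ignores details such as chunked transfer encoding and compression. The function name and the sample response are made up for the example.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.1 response into its main parts."""
    # A blank line separates the status line and headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: HTTP version, numeric status code, optional reason phrase.
    parts = status_line.split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return version, code, reason, headers, body


if __name__ == "__main__":
    sample = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: text/html\r\n"
        b"Cache-Control: max-age=3600\r\n"
        b"\r\n"
        b"<h1>Hello</h1>"
    )
    print(parse_response(sample))
```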
Browser Rendering
When a client receives an HTTP response containing HTML, CSS, and JavaScript, the browser begins the process of rendering the webpage. This process involves multiple steps, including building the Document Object Model (DOM), applying the Cascading Style Sheets (CSS), and executing the JavaScript.
The DOM is a hierarchical representation of the webpage's structure, created by parsing the HTML document. The browser creates a tree-like structure of the HTML elements, where each element corresponds to a node in the tree. The browser then uses the CSS rules to apply styling to each element in the DOM, such as colors, fonts, and positioning. When several rules target the same element, the browser resolves the conflict using the CSS cascade, which weighs each rule's origin, specificity, and source order to determine the final style for the element.
The browser also executes the JavaScript code on the webpage. By default, a script runs as soon as the HTML parser reaches it, and parsing pauses until the script finishes; scripts marked with the defer or async attribute are downloaded without blocking the parser. JavaScript is a programming language that allows developers to create interactive elements on web pages, such as animations, form validation, and dynamic content. The browser runs a page's JavaScript on a single main thread, one task at a time, and the code can update the DOM and styles as it runs.
The final step in rendering a webpage is the layout and painting of the visual elements on the screen. The browser calculates the position and size of each element based on the CSS and DOM, and then paints the pixels on the screen. These stages, layout, paint, and compositing, form the last part of the browser's rendering pipeline.
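To give a feel for how parsing HTML produces a tree, here is a toy Python sketch built on the standard library's `html.parser`. It is not how a real browser builds the DOM: the `Node` class, `build_dom`, and `outline` are names invented for this example, and the parser skips the error recovery rules that browsers follow.

```python
from html.parser import HTMLParser

# Elements that never have children or a closing tag (a partial list).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root  # the element currently open

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node  # descend into the new element

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag and close it.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():
            self.current.text += data.strip()


def build_dom(html: str) -> Node:
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def outline(node: Node, depth: int = 0) -> list[str]:
    """Return the tree as indented tag names, one string per node."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines += outline(child, depth + 1)
    return lines


if __name__ == "__main__":
    page = "<html><body><h1>Hi</h1><p>Text<br>more</p></body></html>"
    print("\n".join(outline(build_dom(page))))
```

Each level of indentation in the output corresponds to one level of nesting in the tree.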
Conclusion
In conclusion, the internet is a complex system that relies on multiple protocols and technologies to function. The Domain Name System (DNS) resolves human-readable domain names into IP addresses that computers can understand, while the Transmission Control Protocol (TCP) establishes reliable connections between hosts. The Transport Layer Security (TLS) protocol provides encryption and authentication to ensure secure data transfer, and the Hypertext Transfer Protocol (HTTP) is used for transferring data between web servers and clients. Understanding these underlying technologies can help us appreciate the complexity and reliability of the internet, and how it has transformed our lives.