Computer Network Basics — Advancing Analytics

Written by Grace OHalloran | Dec 21, 2021 12:00:00 AM

Introduction

My motivation for writing this blog post was largely to share my own journey in understanding networking. I did not come from an on-prem network background, and my first exposure to networking was using virtual networks in the cloud. Although the cloud makes it easy to use a lot of these resources without fully understanding what's going on under the hood, I started to feel like I wanted to know more about how networks actually work. This became especially important when I started to get involved with things such as network peerings, VPNs, and route tables! I won't touch on any of that in this post, but read on for an introduction to computer networks.

What is a Network?

According to Wikipedia, the definition of a computer network is as follows:

Computer Network (noun)
A computer network is a set of computers sharing resources located on or provided by network nodes. The computers use common communication protocols over digital interconnections to communicate with each other. These interconnections are made up of telecommunication network technologies that may be arranged in a variety of network topologies.

I don't know about you, but I think there are a lot of complicated words in that definition, so let's break it down a bit.

If you're not too interested in breaking down this definition and just want the simple answer, skip ahead to the "Simple Answer" section.

Network Nodes

We can think of a node in the mathematical sense here: a network is a set of nodes that are connected together. So a node just describes the objects within a network. When talking about computer networks, this node can be thought of as a communication endpoint. A node could be a computer, a router, a switch, and other such things.

How do we identify each node within a network? That's where IP addresses come in. Each node in a network has an associated IP address which acts as a unique identifier. We'll discuss this in more detail later on, but it's important to note for now.

Communication Protocols

A communication protocol is just a defined system of rules that allows information to be transmitted between two or more network nodes. One you will have heard of is HTTPS, which secures the communication between a browser on one computer and a web server on another. One of the most common protocol suites is TCP/IP, which is what the internet (not surprisingly, the biggest computer network in the world) is built on. We can simply think of a communication protocol as a way of allowing information from one node to be passed to another node in the network.

Network Topology

"Network Topology" is a great phrase to use because it sounds pretty cool and complicated. In actual fact, a network topology simply describes what a network looks like: what the constituent parts of the network are, and how they are arranged and interrelated. A network topology could also be made up of multiple individual networks, and describe how they talk to each other.

 

Simple Answer

 

So let's go back to our original question: what is a computer network?
A simple answer would be a collection of computers that can talk to each other. Since we also know that each computer within a network has an associated unique IP address, we can also think of a computer network as a collection of IP addresses that can talk to each other.

 

CIDR Notation

If we think of a network as a collection of IP addresses which can talk to each other, how do we concisely label a particular network? We use CIDR Notation (Classless Inter-Domain Routing). You may have seen CIDR notation before; here is an example: 10.0.0.0/25. It's made up of an IP address, followed by a slash and a suffix. This particular CIDR range describes the network comprised of the IP addresses 10.0.0.0 - 10.0.0.127; 128 IP addresses in total. We'll explore this in a bit more detail.
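As a quick check of the example above, here is a minimal sketch using Python's standard `ipaddress` module. The variable names are just illustrative; the only input taken from the text is the range 10.0.0.0/25.

```python
import ipaddress

# Parse the example CIDR range from the text.
network = ipaddress.ip_network("10.0.0.0/25")

first_address = network[0]    # lowest address in the range
last_address = network[-1]    # highest address in the range
total_addresses = network.num_addresses

print(f"{network}: {first_address} - {last_address} ({total_addresses} addresses)")
```

Swapping in a different suffix (for example /24 or /26) is an easy way to see how the range grows or shrinks.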

The following table shows how we calculate the number of IP addresses in a given network from the CIDR suffix.

Although there is a formula to work this out (a network with CIDR suffix n contains 2^(32 - n) IP addresses), I usually just google "CIDR chart", which is much quicker! Note that the bigger the CIDR suffix, the smaller the network.
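If you would rather generate a small CIDR chart yourself, the calculation can be sketched in a few lines of Python. This assumes IPv4 (32-bit addresses); the function name `addresses_in_network` is made up for this example.

```python
def addresses_in_network(suffix: int) -> int:
    """Total number of IPv4 addresses in a network with the given CIDR suffix."""
    if not 0 <= suffix <= 32:
        raise ValueError("an IPv4 CIDR suffix must be between 0 and 32")
    # An IPv4 address has 32 bits. The suffix fixes the leading bits,
    # leaving (32 - suffix) bits free to vary.
    return 2 ** (32 - suffix)


# Print a small chart, from the smallest network (/32) up to /24.
for suffix in range(32, 23, -1):
    print(f"/{suffix}: {addresses_in_network(suffix)} addresses")
```

Each step down in suffix doubles the size of the network, which is why a bigger suffix means a smaller network.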

Usable IP Addresses

In our previous example of a network, 10.0.0.0/25, I said there were 128 IP addresses. Whilst this is true, there aren't actually 128 usable IP addresses (also referred to as available hosts).

Two IP addresses within a network are reserved: the first and the last. The first is used to refer to the network itself (a network ID), and other networks use this IP address to identify the network. The last IP address is called the broadcast address; devices connected to the network use it to send information to the rest of the network. Let's update our table:

Notice how the /32 and /31 networks are exceptions to the rule. In these cases, the network ID can also be used as a host IP address. There is also no broadcast address, so traffic cannot be broadcast across the network itself. For a /32 network this isn't relevant anyway, since there is only space for one device on the network. For a /31 network, you can have two devices, which is just enough for a point-to-point link between them (this usage is described in RFC 3021).

Since these network sizes have a lot of restrictions, you don't often see them being used. There are some specific use cases, but we won't go into those here.
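The usable-host rule, including the /31 and /32 exceptions, could be sketched like this. The function name `usable_hosts` is hypothetical, and the count is done by hand rather than by relying on library behaviour for the edge cases.

```python
import ipaddress


def usable_hosts(cidr: str) -> int:
    """Number of addresses in an IPv4 network that can be assigned to devices."""
    network = ipaddress.ip_network(cidr)
    if network.prefixlen >= 31:
        # /31 and /32: no network ID or broadcast address is set aside,
        # so every address in the range counts as a host address.
        return network.num_addresses
    # Otherwise the first address (network ID) and the last address
    # (broadcast address) are reserved.
    return network.num_addresses - 2


for cidr in ("10.0.0.0/25", "10.0.0.0/30", "10.0.0.0/31", "10.0.0.0/32"):
    print(f"{cidr}: {usable_hosts(cidr)} usable")
```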

Different Types of IP Addresses

Now that we understand the concept of a network and how to denote one, let's explore the actual IP addresses a bit more. For example, where did I pluck 10.0.0.0 from?

The full range of IP addresses is 0.0.0.0 - 255.255.255.255. Note: we are talking about IP addresses that are part of Internet Protocol version 4 (IPv4), which is the most commonly used internet protocol at the moment. We will talk a bit more about this later.

These IP addresses are split into two major categories: Private and Public IP addresses.

Private IP Addresses

Private IP addresses are used as unique identifiers for each device within a network; the IP address is used to identify the device to the network itself. Outside of the network, the private IP address is not relevant. This means that private IP addresses do not need to be globally unique, and can be reused across different networks, as long as they are unique within the network.

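There are three classes of Private IP address: classes A, B and C. Each class carves out an address range from within 0.0.0.0 - 255.255.255.255. Class A is 10.0.0.0 - 10.255.255.255 (10.0.0.0/8), class B is 172.16.0.0 - 172.31.255.255 (172.16.0.0/12), and class C is 192.168.0.0 - 192.168.255.255 (192.168.0.0/16).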

If you check the private IP address of your computer (on Windows: Settings > Network & Internet > Status > Properties > IPv4 address), you'll see that it falls under one of these address ranges.
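To check an address programmatically, here is a small sketch that tests it against the three standard private ranges (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16, as defined in RFC 1918). The names `PRIVATE_RANGES` and `private_class` are invented for this example, and the class labels follow the wording used in this post.

```python
import ipaddress

# The three private IPv4 ranges, labelled as in the text above.
PRIVATE_RANGES = {
    "Class A": ipaddress.ip_network("10.0.0.0/8"),
    "Class B": ipaddress.ip_network("172.16.0.0/12"),
    "Class C": ipaddress.ip_network("192.168.0.0/16"),
}


def private_class(address: str):
    """Return the private class an IPv4 address belongs to, or None."""
    ip = ipaddress.IPv4Address(address)
    for name, network in PRIVATE_RANGES.items():
        if ip in network:
            return name
    return None


# Total number of private addresses across the three ranges.
private_total = sum(n.num_addresses for n in PRIVATE_RANGES.values())

print(private_class("192.168.1.10"))
print(private_class("8.8.8.8"))
print(f"{private_total:,} private addresses in total")
```

The module also offers an `is_private` attribute, but it covers additional special-purpose ranges (such as loopback), so the explicit list is used here to match the three ranges discussed.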

Public IP Addresses

Public IP addresses are used for any device which is connected to the internet. This IP address uniquely identifies the device on the internet, and therefore must be globally unique. Usually, the device which is directly connected to the internet, e.g. a router on a home network, is assigned the public IP address and is then responsible for sharing this IP address across all other devices on the network.

You probably guessed it, but if private IP addresses are constrained to those three address ranges listed above, then public IP addresses are (almost) everything else. A few other ranges are reserved for special purposes such as loopback and multicast; once those are excluded too, 3,706,452,992 public IP addresses remain. This needs to be a large number, because as we noted before, public IP addresses must be globally unique.

Static vs Dynamic IP Addresses

Public IP addresses can also be split into two further categories: static and dynamic IPs. A static IP is simply one which does not change, whereas a dynamic IP address can change over time. Dynamic IP addresses are managed by a Dynamic Host Configuration Protocol (DHCP) server, which manages a pool of IPs and assigns them to devices as needed.

A requirement for a static IP address could be if it needs to be added to a whitelist somewhere; if it were constantly changing, the whitelist would have to be constantly updated.

The downside of a static IP address is that it's more hackable; hackers know exactly where you are on the internet. A dynamic IP address makes it harder to be tracked down. Static IPs also come at a higher cost. One of the main reasons is that there is a limited number of public IPv4 addresses out there, so taking one permanently for yourself comes at a price.

Is there enough space?

"Hang on a minute, I thought you said there were 3.7 billion public IPv4 IP addresses?" I did, but although this sounds like a very large number, is it actually big enough for the whole of the internet? IPv4 was introduced in the 1980s, and back then, 3.7 billion probably did sound like enough. However, the internet has exponentially grown in the last few decades, and we are actually running out of public IPv4 IP addresses. It's also a problem with private IPv4 IP addresses, as you can imagine the size and complexity of some of the private networks employed by companies such as Microsoft.

To combat this problem, IPv6 has been introduced, which accommodates around 340 trillion trillion trillion IP addresses (2^128, or roughly 3.4 × 10^38). IPv6 was officially ratified as an Internet Standard in 2017; however, moving from IPv4 to IPv6 is a mammoth task, since all devices currently using IPv4 need to be reconfigured. According to Google, we globally reached an IPv6 adoption rate of 34.15% as of January 2021, so we still have quite a way to go.
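To get a feel for the difference in scale, the two address spaces can be compared directly. This is just arithmetic on the address lengths (32 bits for IPv4, 128 bits for IPv6); the variable names are illustrative.

```python
# An IPv4 address is 32 bits long; an IPv6 address is 128 bits long.
ipv4_total = 2 ** 32
ipv6_total = 2 ** 128

print(f"IPv4: {ipv4_total:,} addresses")
print(f"IPv6: {ipv6_total:.2e} addresses")

# How many complete IPv4 address spaces would fit inside IPv6?
print(f"Ratio: {ipv6_total // ipv4_total:.2e}")
```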

Splitting up your network

So now that we know what a network is, and have a good understanding of IP addresses, let's discuss splitting up our network. Just like organising your files on your computer into folders, you often want to organise your network into logical partitions. To do this, we use subnets.

Subnets

You can split your network up into smaller parts called subnets. The splitting has to be done in a certain way, since any network (subnet or otherwise) must have a size equal to a power of 2 (remember our formula from before). Let's look at an example:

In the above example, we take our example network 10.0.0.0/25, which we know has 128 IP addresses, and we split it in half, giving us 64 IP addresses in each subnet. You can see that the range for the first subnet, 10.0.0.0/26, consists of the first 64 IP addresses of our original network. The second subnet, 10.0.0.64/26, takes the last 64 IP addresses. Some things to note:

  • As mentioned before, since we are splitting our network up, i.e. making it smaller, the CIDR suffix has increased.
  • The CIDR notation for our subnets comes from taking the new first IP address (the network ID) and combining it with the CIDR suffix.
  • Before we had any subnets, we know that 2 of our IP addresses - the network ID and the broadcast address - were reserved. Therefore, we had 126 available hosts. Now that we've split our network into two subnets, we have actually reduced the number of available hosts, since each subnet needs its own network ID and broadcast address. In this particular example, the following addresses are unavailable: 10.0.0.0, 10.0.0.63, 10.0.0.64, 10.0.0.127, leaving 124 available hosts.
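The same split can be reproduced with Python's standard `ipaddress` module. This is a sketch of the example above rather than a general subnetting tool; the variable names are illustrative.

```python
import ipaddress

parent = ipaddress.ip_network("10.0.0.0/25")

# prefixlen_diff=1 increases the CIDR suffix by one (/25 -> /26),
# which splits the network into two equal halves.
subnets = list(parent.subnets(prefixlen_diff=1))

reserved = []
for subnet in subnets:
    # Each subnet gives up its first address (network ID) and
    # its last address (broadcast address).
    reserved += [subnet.network_address, subnet.broadcast_address]
    print(f"{subnet}: {subnet[0]} - {subnet[-1]}")

total_usable = sum(subnet.num_addresses - 2 for subnet in subnets)
print("Reserved:", ", ".join(str(address) for address in reserved))
print("Available hosts:", total_usable)
```

Passing a larger `prefixlen_diff` splits the network into more, smaller subnets (for example, 2 gives four /27 subnets).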

Although you can mathematically work out your subnet splits, I find this online calculator very useful.

How much Address Space do you need?

Often you will have certain requirements for how much address space you need in different subnets, or in your network overall.

When adding things into your network, you need to consider how many IP addresses they will take up. For example, each server will need its own IP address, so make sure to provision plenty of space. It's usually best practice to leave some unallocated space in your network, in case your solution expands in the future.
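As a rough sizing aid, the sketch below finds the largest CIDR suffix (that is, the smallest network) that still leaves room for a given number of hosts. The function name `smallest_suffix_for` is hypothetical. It assumes the usual two reserved addresses per network, ignores the /31 and /32 special cases, and does not account for any extra addresses a particular platform might reserve.

```python
def smallest_suffix_for(hosts_needed: int) -> int:
    """Largest IPv4 CIDR suffix whose network has at least `hosts_needed` usable hosts."""
    if hosts_needed < 1:
        raise ValueError("hosts_needed must be at least 1")
    # We need 2**host_bits - 2 >= hosts_needed, i.e. 2**host_bits >= hosts_needed + 2.
    # For a positive integer n, (n - 1).bit_length() is the smallest b with 2**b >= n.
    host_bits = (hosts_needed + 1).bit_length()
    if host_bits > 32:
        raise ValueError("too many hosts for a single IPv4 network")
    return 32 - host_bits


for hosts in (10, 50, 126, 127):
    suffix = smallest_suffix_for(hosts)
    usable = 2 ** (32 - suffix) - 2
    print(f"{hosts} hosts -> /{suffix} ({usable} usable)")
```

To leave headroom for growth, pass in a larger number than you need today, or pick a suffix one smaller than the result to double the space.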

Networks in the Cloud

So how does all of this translate to the cloud? Well, the underlying concepts all remain the same. Networks deployed in the cloud are called virtual networks. A physical network uses hardware to connect the nodes, for example through cables. A virtual network is simply a network connected by software rather than by dedicated physical hardware, and is usually used because the nodes in the network are geographically dispersed. Everything we've discussed so far about IP addresses, CIDR notation, and subnets also applies to virtual networks.