How URLs Work on the Web

This tutorial explains how URLs work from a technical SEO perspective.

What Is a URL?

A Uniform Resource Locator, or URL, is the address of a resource on the Web.

An example URL for the home page of Google’s search engine might look like this:

https://www.google.com/index.html

It says to use the HTTPS protocol to contact the www subdomain of the domain google.com and fetch the index.html file in the root directory (/) of the server.

Most servers are configured to serve the index.html file automatically when the directory that contains it is requested, so requesting this:

https://www.google.com/

should produce the same result as requesting this:

https://www.google.com/index.html

Directories in a path are separated by forward slashes, and the root directory is a single forward slash, so the index.html file at the root of the server is located at the path /index.html.

Let’s take a closer look at the structure of URLs.

Anatomy of a URL

In its most basic form, a URL on the Web has three parts: a protocol, a domain name, and a path.

Here are some example URLs that use those three components:
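
As a rough sketch of that three-part split, the URL class that is built into browsers (and covered later in this tutorial) can pull the pieces apart. The google.com address comes from the example above; the two example.com addresses and the variable names are made up for illustration.

```javascript
// Split a few URLs into protocol, domain, and path.
// The example.com URLs are hypothetical, chosen only for illustration.
const exampleUrls = [
  "https://www.google.com/index.html",
  "https://example.com/blog/",
  "http://shop.example.com/products/shoes.html",
];

const parts = exampleUrls.map((address) => {
  const parsed = new URL(address);
  return {
    protocol: parsed.protocol, // includes the trailing colon, e.g. "https:"
    domain: parsed.hostname,   // subdomain plus domain, e.g. "www.google.com"
    path: parsed.pathname,     // always starts with "/"
  };
});

console.table(parts);
```

Note that the protocol field keeps its trailing colon, so it reads "https:" rather than "https".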

Port Numbers

Every HTTP or HTTPS URL also has a port number after the domain name, but it is usually omitted because HTTP uses port 80 by default and HTTPS uses port 443 by default.

Those three URLs from above would look like this if you included the (unnecessary) port numbers:

When we get into Web development, we’ll use various port numbers to run Web servers on our local computers.
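
To see how default ports behave, here is a small sketch using the built-in URL class. The example.com and localhost addresses are assumptions chosen for illustration, and 3000 is just one port a local server might use.

```javascript
// Compare URLs with and without explicit port numbers.
// All three addresses are hypothetical examples.
const withDefaultPort = new URL("https://example.com:443/index.html");
const withoutPort = new URL("https://example.com/index.html");
const localServer = new URL("http://localhost:3000/");

// The URL class drops a port that matches the protocol's default,
// so both spellings normalize to the same address.
console.log(withDefaultPort.href === withoutPort.href); // true
console.log(withDefaultPort.port); // "" (empty string for the default port)

// A non-default port, as on a local development server, is kept.
console.log(localServer.port); // "3000"
console.log(localServer.host); // "localhost:3000"
```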

Parameters

URL parameters are key-value pairs of data that can appear in URLs.

The parameters are separated from the rest of the URL by a question mark, which comes after the path. Each parameter is separated from other parameters by an ampersand, and each key is joined to its value by an equals sign.

Here’s a typical URL that contains several URL parameters:

https://example.com/?utm_source=newsletter&utm_medium=email&utm_campaign=spring_sale&utm_id=123

If we remove the separator characters (? and &), it’s easier to see the key-value pairs:

utm_source = newsletter
utm_medium = email
utm_campaign = spring_sale
utm_id = 123

The utm prefix stands for Urchin Tracking Module. Urchin was the product that became Google Analytics. We can make the pairs easier to read by removing the prefix:

source = newsletter
medium = email
campaign = spring_sale
id = 123

There are multiple reasons for using URL parameters that we’ll get into later. For now, you should know how to recognize them and how to separate them into their key-value pairs.
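
The same separation into key-value pairs can be done in code. This is a minimal sketch that uses the browser's built-in URL class and its searchParams property; the variable names campaignUrl and params are made up for illustration.

```javascript
// Parse the example URL from above.
const campaignUrl = new URL(
  "https://example.com/?utm_source=newsletter&utm_medium=email&utm_campaign=spring_sale&utm_id=123"
);

// searchParams yields [key, value] pairs; Object.fromEntries turns them
// into a plain object with one property per parameter.
const params = Object.fromEntries(campaignUrl.searchParams);
console.log(params);

// Look up a single value by its key. Values are always strings.
console.log(campaignUrl.searchParams.get("utm_id")); // "123"
```

Turning the pairs into a plain object assumes each key appears only once. If a key repeats, only its last value survives the conversion.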

URL Fragments

Another thing you’ll see in URLs is the URL fragment.

A URL fragment is the part that comes after a hash sign (#).

Fragments typically have two uses:

  1. You can link to a specific part of a page by using a URL fragment. If you’ve ever clicked on a link and had the page scroll to a different part of the same page, it’s likely that a URL fragment was involved. If the page content doesn’t change when the hash changes, you’re probably dealing with this case. For example, a link to this section of the page uses a URL fragment to scroll the section to the top of your browser window, and the URL then has #url-fragments on the end.
  2. In some JavaScript frameworks, the hash sign is used to navigate between pages. It might be a single hash sign like /#/ or it might have an exclamation point like /#!/. If the page content changes when the part after the hash changes, then you’re dealing with this case. Page navigation with URL fragments isn’t in fashion any more, because it isn’t good for SEO, but it’s still possible to find cases of it in the wild.

Browsers do not include the fragment in the request they send, so the server can’t see it.
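
Here is a small sketch of that split between what stays in the browser and what goes to the server. The example.com address, the lang parameter, and the variable names are assumptions for illustration; only the #url-fragments fragment comes from the text above.

```javascript
// Hypothetical page address with a fragment on the end.
const pageUrl = new URL("https://example.com/guide.html?lang=en#url-fragments");

// The fragment, including the hash sign. It stays in the browser.
console.log(pageUrl.hash); // "#url-fragments"

// Roughly what the browser asks the server for: the path and the
// parameters, without the fragment.
const requestTarget = pageUrl.pathname + pageUrl.search;
console.log(requestTarget); // "/guide.html?lang=en"
```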

How to Parse a URL with JavaScript

If you know a little JavaScript and want to experiment with URLs, you can inspect them right in the browser.

First open your browser console. On most computers you can press F12. Alternatively, right-click on a Web page and choose “Inspect”. Then, in the tool that pops up, go to the tab that says “Console”.

You can type JavaScript code in the console and the browser will run it.

Paste this code into the console:

var u = new URL(
  "https://www.example.com/?utm_source=newsletter&utm_medium=email&utm_campaign=spring_sale&utm_id=123#abc"
);

You can then access various parts of the URL on the variable named u.

Try this one:

u.origin

It should return the origin of the URL, which is the protocol, the domain, and any non-default port:

https://www.example.com

Here are some other fields to try:

URLs can contain usernames and passwords, but this isn’t common on the open Web, because a password in a URL is visible to anyone who sees the URL.
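
As a sketch, these are some of the standard properties of a URL object that you can read in the console. The variable names and the alice:secret credentials are made up for illustration. The first URL is the one used above.

```javascript
const exampleUrl = new URL(
  "https://www.example.com/?utm_source=newsletter&utm_medium=email&utm_campaign=spring_sale&utm_id=123#abc"
);

console.log(exampleUrl.protocol); // "https:"
console.log(exampleUrl.hostname); // "www.example.com"
console.log(exampleUrl.port);     // "" because no port was given
console.log(exampleUrl.pathname); // "/"
console.log(exampleUrl.search);   // the full parameter string, starting with "?"
console.log(exampleUrl.hash);     // "#abc"

// A hypothetical URL that carries credentials before the domain name.
const withLogin = new URL("https://alice:secret@example.com/");
console.log(withLogin.username); // "alice"
console.log(withLogin.password); // "secret"
```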

Takeaways

Here are some things you should remember from this section:
