It's been a long time since we covered one of the most fundamental building blocks of SEO—the structure of domain names and URLs—and I think it's high time to revisit. But, an important caveat before we begin: the optimal structures and practices I'll be describing in the tips below are NOT absolutely critical on any/every page you create. This list should serve as an "it would be great if we could," not an "if we don't do things this way, the search engines will never rank us well." Google and Bing have come a long way and can handle a lot of technical challenges, but as always in SEO, the easier we make things for them (and for users), the better the results tend to be.
#1: Whenever possible, use a single domain & subdomain
It's hard to argue with this given the preponderance of evidence and examples of folks moving their content from a subdomain to a subfolder and seeing improved results (or, worse, moving content to a subdomain and losing traffic). Whatever heuristics the engines use to judge whether content should inherit the ranking ability of its parent domain, they seem to have trouble consistently passing that ability to subdomains.
That's not to say it can't work, and if a subdomain is the only way you can set up a blog or produce the content you need, then it's better than nothing. But your blog is far more likely to perform well in the rankings and to help the rest of your site's content perform well if it's all together on one sub and root domain.
For more details and plenty of examples (in the post and comments), check out this recent Whiteboard Friday on the topic.
#2: The more readable by human beings, the better
It should come as no surprise that the easier a URL is to read for humans, the better it is for search engines. Accessibility has always been a part of SEO, but never more so than today, when engines can leverage advanced user and usage data signals to determine what people are engaging with vs. not.
Readability can be a subjective topic, but hopefully this illustration can help:
The requirement isn't that every aspect of the URL must be absolutely clean and perfect, but that at least it can be easily understood and, hopefully, compelling to those seeking its content.
#3: Keywords in URLs: still a good thing
It's still the case that using the keywords you're targeting for rankings in your URLs is a solid idea. This is true for several reasons.
First, keywords in the URL help indicate to those who see your URL on social media, in an email, or as they hover on a link to click that they're getting what they want and expect, as shown in the Metafilter example below (note how hovering on the link shows the URL in the bottom-left-hand corner):
Second, URLs get copied and pasted regularly, and when there's no anchor text used in a link, the URL itself serves as that anchor text (which is still a powerful input for rankings), e.g.:
Third, and finally, keywords in the URL show up in search results, and research has shown that the URL is one of the most prominent elements searchers consider when selecting which site to click.
#4: Multiple URLs serving the same content? Canonicalize 'em!
If you have two URLs that serve very similar content, consider canonicalizing them, using either a 301 redirect (if there's no real reason to maintain the duplicate) or a rel=canonical (if you want to maintain slightly different versions for some visitors, e.g. a printer-friendly page).
Duplicate content isn't really a search engine penalty (at least, not until/unless you start duplicating at very large scales), but it can cause a split of ranking signals that can harm your search traffic potential. If Page A has some quantity of ranking ability and its duplicate, Page A2, has a similar quantity of ranking ability, by canonicalizing them, Page A can have a better chance to rank and earn visits.
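To make the two options concrete, here's a minimal sketch of what each one sends to the browser or crawler. The function names and the print-friendly URL are hypothetical, made up for illustration; your CMS or server will have its own way of emitting these.

```python
from html import escape
from typing import Dict, Tuple


def redirect_response(canonical_url: str) -> Tuple[int, Dict[str, str]]:
    """Status code and headers for a permanent (301) redirect to the canonical URL."""
    return 301, {"Location": canonical_url}


def canonical_link_tag(canonical_url: str) -> str:
    """The <link> element a duplicate page keeps in its <head> to name the canonical URL."""
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


# Hypothetical duplicate pair: a main page and its printer-friendly version.
main_page = "https://randswhisky.com/lagavulin"

# No reason to keep the duplicate: redirect it.
print(redirect_response(main_page))

# Keeping the printer-friendly page: have it point at the main page instead.
print(canonical_link_tag(main_page))
```

The redirect removes the duplicate from circulation entirely, while the tag leaves it reachable for visitors and asks the engines to credit the main page.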
#5: Exclude dynamic parameters when possible
This kind of junk is ugly:
If you can avoid using URL parameters, do so. If you have more than two URL parameters, it's probably worth making a serious investment to rewrite them as static, readable text.
Most CMS platforms have become savvy to this over the years, but a few laggards remain. Check out tools like mod_rewrite and ISAPI rewrite or MS' URL Rewrite Module (for IIS) to help with this process.
Some dynamic parameters are used for tracking clicks (like those inserted by popular social sharing apps such as Buffer). In general, these don't cause a huge problem, but they may make for somewhat unsightly and awkwardly long URLs. Use your own judgement around whether the tracking parameter benefits outweigh the negatives.
A 2014 RadiumOne study suggests that social shares (which have positive, but usually indirect, impacts on SEO) using shorter URLs that clearly communicate the site and content perform better than shares using non-branded shorteners or long, unclear URL strings.
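If you want to audit your own URLs for parameters, a rough sketch along these lines can help. It uses only Python's standard library; the list of tracking parameters is an assumption (the common `utm_` set), and the function names are hypothetical, so adjust both to whatever your analytics and sharing tools actually append.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed set of tracking parameters; extend it for your own tools.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"}


def count_params(url: str) -> int:
    """Number of query-string parameters in the URL."""
    return len(parse_qsl(urlsplit(url).query, keep_blank_values=True))


def strip_tracking_params(url: str) -> str:
    """Return the URL with known tracking parameters removed, other parameters kept in order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


shared = "https://randswhisky.com/lagavulin?utm_source=buffer&utm_medium=social&page=2"
print(count_params(shared))           # 3
print(strip_tracking_params(shared))  # https://randswhisky.com/lagavulin?page=2
```

A count above two is the point at which, per the rule of thumb above, a rewrite to static paths is probably worth the investment.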
#6: Shorter > longer
Shorter URLs are, generally speaking, preferable. You don't need to take this to the extreme, and if your URL is already shorter than 50-60 characters, don't worry about it at all. But if you have URLs pushing 100+ characters, there's probably an opportunity to rewrite them and gain value.
This isn't a direct problem with Google or Bing—the search engines can process long URLs without much trouble. The issue, instead, lies with usability and user experience. Shorter URLs are easier to parse, to copy and paste, to share on social media, and to embed, and while these might all add up to only a fractional improvement in sharing or amplification, every tweet, like, share, pin, email, and link matters (either directly or, often, indirectly).
#7: Match URLs to titles most of the time (when it makes sense)
This doesn't mean that if the title of your piece is "My Favorite 7 Bottles of Islay Whisky (and how one of them cost me my entire Lego collection)" your URL has to be a perfect match. A shorter URL built from the title's main words, or a variation on it, would be just fine.
The matching accomplishes a mostly human-centric goal, i.e. to give the web user an excellent sense of what they will find on the page through the URL, and then to deliver on that expectation with the headline/title.
It's for this same reason that we strongly recommend keeping the page title (which engines display prominently on their search results pages) and the visible headline on the page a close match as well—one creates an expectation, and the other delivers on it.
For example, consider two URLs I shared on Facebook. In the first, it's wholly unclear what you might find on the page. It's in the news section of the BBC's website, but beyond that, there's no way to know what you might find there. In the second, however, Pacific Standard magazine has made it easy for the URL to give insight into the article's content, and then the title of the piece delivers.
We should aim for a similar level of clarity in our own URLs and titles.
#8: Including stop words isn't necessary
If your title/headline includes stop words (and, or, but, of, the, a, etc.), it's not critical to put them in the URL. You don't have to leave them out, either, but dropping them can sometimes make a URL shorter and more readable in some sharing contexts. Use your best judgement on whether to include them, weighing readability against length.
You can see in the URL of this particular post you're now reading, for example, that I've chosen to leave in "for" because I think it's easier to read with the stop word than without, and it doesn't extend the URL length too far.
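Here's a small sketch of how a title could be turned into a URL slug, with stop words kept or dropped. The `slugify` function and its parameters are hypothetical (most CMS platforms ship their own equivalent), and the stop-word set is just the six words listed above.

```python
import re
import unicodedata
from typing import Optional

# The stop words named in the text; a real list would likely be longer.
STOP_WORDS = {"and", "or", "but", "of", "the", "a"}


def slugify(title: str, drop_stop_words: bool = False, max_words: Optional[int] = None) -> str:
    """Lowercase the title, keep only letters and digits, and join the words with hyphens."""
    # Fold accented characters to their closest ASCII form before matching.
    ascii_title = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    words = re.findall(r"[a-z0-9]+", ascii_title.lower())
    if drop_stop_words:
        words = [word for word in words if word not in STOP_WORDS]
    if max_words is not None:
        words = words[:max_words]
    return "-".join(words)


title = "My Favorite 7 Bottles of Islay Whisky (and how one of them cost me my entire Lego collection)"
print(slugify(title, max_words=7))                        # my-favorite-7-bottles-of-islay-whisky
print(slugify(title, drop_stop_words=True, max_words=6))  # my-favorite-7-bottles-islay-whisky
```

Both results stay close to the title without reproducing all of it. Which one reads better is the judgement call described above.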
#9: Remove/control for unwieldy punctuation characters
There are a number of text characters that become nasty bits of hard-to-read cruft when inserted in the URL string. In general, it's a best practice to remove or control for these. There's a great list of safe vs. unsafe characters available on Perishable Press:
It's not merely the poor readability these characters might cause, but also the potential for breaking certain browsers, crawlers, or proper parsing.
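To see what these characters turn into, here's a quick sketch using Python's standard `urllib.parse.quote`. It doesn't reproduce the Perishable Press list; it simply treats anything outside letters, digits, and `_.-~` (the characters `quote` never encodes) as a candidate for cleanup. The function name is hypothetical.

```python
from typing import List
from urllib.parse import quote


def unwieldy_characters(segment: str) -> List[str]:
    """Characters in a URL path segment that would be percent-encoded."""
    # With safe="", quote() leaves only letters, digits, and "_.-~" untouched.
    return sorted({ch for ch in segment if quote(ch, safe="") != ch})


print(quote("canoe puppies", safe=""))        # canoe%20puppies
print(quote("whisky & lego?", safe=""))       # whisky%20%26%20lego%3F
print(unwieldy_characters("whisky & lego?"))  # [' ', '&', '?']
```

Anything the check flags is a character worth removing or replacing with a word separator before it reaches the URL.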
#10: Limit redirection hops to two or fewer
If a user or crawler requests URL A, which redirects to URL B, that's cool. It's even OK if URL B then redirects to URL C (not great, since it would be more ideal to point URL A directly to URL C, but not terrible). However, if the URL redirect string continues past two hops, you could get into trouble.
Generally speaking, search engines will follow these longer redirect jumps, but they've recommended against the practice in the past, and for less "important" URLs (in their eyes), they may not follow or count the ranking signals of the redirecting URLs as completely.
The bigger trouble is browsers and users, who are both slowed down and sometimes even stymied (mobile browsers in particular can occasionally struggle with this) by longer redirect strings. Keep redirects to a minimum and you'll set yourself up for fewer problems.
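If you keep your redirects in a simple source-to-target map, a sketch like the following can count hops and collapse chains so every old URL points straight at its final destination. The map, the paths, and the function names are all hypothetical; nothing here talks to a real server.

```python
from typing import Dict, List


def redirect_chain(start: str, redirects: Dict[str, str]) -> List[str]:
    """Follow redirects from start and return every URL visited, in order."""
    chain = [start]
    while chain[-1] in redirects:
        target = redirects[chain[-1]]
        if target in chain:  # stop rather than loop forever on a redirect cycle
            break
        chain.append(target)
    return chain


def flatten_redirects(redirects: Dict[str, str]) -> Dict[str, str]:
    """Point every source directly at the last URL in its chain (one hop each)."""
    return {source: redirect_chain(source, redirects)[-1] for source in redirects}


# Hypothetical redirect map with a three-hop chain: A -> B -> C -> D.
redirects = {"/a": "/b", "/b": "/c", "/c": "/d"}

chain = redirect_chain("/a", redirects)
print(chain, "hops:", len(chain) - 1)  # ['/a', '/b', '/c', '/d'] hops: 3
print(flatten_redirects(redirects))    # {'/a': '/d', '/b': '/d', '/c': '/d'}
```

Any chain reporting more than two hops is a candidate for flattening.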
#11: Fewer folders is generally better
It's not that the slashes (aka folders) will necessarily harm performance, but they can create a perception of site depth for both engines and users, as well as making edits to the URL string considerably more complex (at least, in most CMS' protocols).
There's no hard and fast requirement—this is another one where it's important to use your best judgement.
#12: Avoid hashes in URLs that create separate/unique content
The hash (or URL fragment identifier) has historically been a way to send a visitor to a specific location on a given page (e.g. Moz's blog posts use the hash to navigate you to a particular comment, like this one from my wife). Hashes can also be used like tracking parameters (e.g. randswhisky.com/lagavulin#src=twitter). Using URL hashes for something other than these, such as showing content that differs from what's available on the page without the hash, or serving wholly separate pages, is generally a bad idea.
There are exceptions, like those Google enables for developers seeking to use the hashbang format for dynamic AJAX applications, but even these aren't nearly as clean, visitor-friendly, or simple from an SEO perspective as statically rewritten URLs. Sites from Amazon to Twitter have found tremendous benefit in simplifying their previously complex and hash/hashbang-employing URLs. If you can avoid it, do.
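Part of the reason hashes are a poor home for unique content is that the fragment isn't sent to the server in the HTTP request, so two URLs that differ only after the `#` ask for the same resource. A minimal sketch with Python's standard `urllib.parse.urldefrag` shows the split (the `https://` scheme is added for illustration, and the helper name is hypothetical):

```python
from urllib.parse import urldefrag


def same_server_request(url_a: str, url_b: str) -> bool:
    """True if both URLs are identical once the fragment is dropped."""
    return urldefrag(url_a).url == urldefrag(url_b).url


tracked = "https://randswhisky.com/lagavulin#src=twitter"
base, fragment = urldefrag(tracked)
print(base)      # https://randswhisky.com/lagavulin
print(fragment)  # src=twitter

print(same_server_request(tracked, "https://randswhisky.com/lagavulin"))  # True
```

Whatever a page shows differently for a given hash has to be produced in the browser after the same base URL has loaded, which is why a statically rewritten URL is the cleaner choice.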
#13: Be wary of case sensitivity
A couple years back, John Sherrod of Search Discovery wrote an excellent piece noting the challenges and issues around case sensitivity in URLs. Long story short: if you're using Microsoft/IIS servers, you're generally in the clear. If you're hosting with Linux/UNIX, you can get into trouble, because those servers treat different cases as distinct, and thus randswhisky.com/AbC could be a different piece of content from randswhisky.com/aBc. That's bad biscuits.
In an ideal world, you want URLs that use the wrong case to automatically redirect/canonicalize to the right one. There are htaccess rewrite rules to assist (like this one), and they're highly recommended if you're facing this problem.
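The logic behind such a rule is simple enough to sketch. This version assumes your site standardizes on all-lowercase paths (an assumption; pick whatever convention you actually use), and the function name is hypothetical. It leaves the query string alone, since parameter values can legitimately be case-sensitive.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def lowercase_redirect_target(url: str) -> Optional[str]:
    """Return the lowercase URL to 301 to, or None if the URL is already canonical."""
    parts = urlsplit(url)
    # Lowercase only the host and path; query and fragment are left untouched.
    lowered = parts._replace(netloc=parts.netloc.lower(), path=parts.path.lower())
    target = urlunsplit(lowered)
    return target if target != url else None


print(lowercase_redirect_target("https://randswhisky.com/AbC"))  # https://randswhisky.com/abc
print(lowercase_redirect_target("https://randswhisky.com/aBc"))  # https://randswhisky.com/abc
print(lowercase_redirect_target("https://randswhisky.com/abc"))  # None
```

Both mixed-case variants end up at the same address, so ranking signals stop being split between them.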
#14: Hyphens and underscores are preferred word separators
Notably missing (for the first time in my many years updating this piece) is my recommendation to avoid underscores as word separators in URLs. In the last few years, the search engines have successfully overcome their previous challenges with this issue and now treat underscores and hyphens similarly.
Spaces can work, but they render awkwardly in URLs as %20, which detracts from the readability of your pages. Try to avoid them if possible (it's usually pretty easy in a modern CMS).
#15: Keyword stuffing and repetition are pointless and make your site look spammy
Check out the search result listing below, and you'll see a whole lot of "canoe puppies" in the URL. That's probably not ideal, and it could drive some searchers to bias against wanting to click.
Repetition like this doesn't help your search rankings—Google and Bing have moved far beyond algorithms that positively reward a keyword appearing multiple times in the URL string. Don't hurt your chances of earning a click (which CAN impact your rankings) by overdoing keyword matching/repetition in your URLs.
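If you'd like a quick way to spot this in your own URLs, here's a rough sketch that counts repeated words in the path. The example URL is made up for illustration (it isn't the one from the search result), and the function name is hypothetical.

```python
import re
from collections import Counter
from typing import Dict
from urllib.parse import urlsplit


def repeated_terms(url: str, min_count: int = 2) -> Dict[str, int]:
    """Words that appear at least min_count times in the URL's path."""
    words = re.findall(r"[a-z0-9]+", urlsplit(url).path.lower())
    return {word: count for word, count in Counter(words).items() if count >= min_count}


# Hypothetical keyword-stuffed URL.
stuffed = "https://example.com/canoe-puppies/canoe-puppies-photos/canoe-puppies.html"
print(repeated_terms(stuffed))  # {'canoe': 3, 'puppies': 3}

print(repeated_terms("https://randswhisky.com/my-favorite-islay-whisky"))  # {}
```

A word showing up once in a folder name and once in the page name can be perfectly natural, so treat the output as a prompt for a second look rather than a verdict.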
Best of luck with all your URL creation and optimization efforts! Please feel free to leave any additions, ideas, or observations in the comments below.