Have you ever wished you could get inside the mind of Google? To figure out what makes its search engine tick?
How great it would be if that were easy to do.
Well, actually it is. I realized that recently when I was doing research for one of my personal passions, which is finding invasive plants in local parks and eliminating them.
I wanted to see what information the Forest Service had on that topic, so I searched for “forest service invasive species removal.” Here’s a screen capture of the top four results:
Most of us notice right away that each title shown here contains the terms “invasive species” and “Forest Service.” That makes perfect sense—terms in the title should tell us what information that document contains, so search engines should pay attention to terms that appear in the title.
You might have also noticed that Google shows snippets of the document with the search terms bolded. It’s so obvious that Google would look for my search terms in the body of each document that I feel silly saying so. My larger point is this: Look through everything Google displays and you might find bolded text in places you hadn’t thought were important.
For instance, look at each URL. In the URL of three of the top four results, “invasive species” is bolded:
In studying a number of different examples of search results, I have noticed a pattern:
- When the search term is in the URL, the page ranks higher in the results. In fact, the other result in the top four also has “invasive species” in its URL; it’s just that the URL was too long for all of it to appear in the search results.
- All other things being equal, the page ranks higher when the search term is closer to the root of the URL.
So think about that. Depending on your method of producing a website, the URL of each page reflects your site’s directory structure, its information architecture, or both. The choices you make in those areas will have a direct effect on the ranking of your pages in every search your customers perform.
Does your website take advantage of this fact? Not if it follows some of the more common practices we see:
URLs that reflect the agency’s org chart:
URLs that reflect a document format and naming convention:
URLs that give information that means something to programmers or Web servers or cyborgs but is opaque to all the rest of us: www.agency.gov/documents/cgi-bin/filename.pdf (“cgi-bin”? Really?)
These and similar terms mean nothing to your customers, so don’t use them in your URLs:
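If you want to check a batch of URLs for terms like these, a short script can do the tedious part. What follows is a minimal sketch in Python, not a definitive tool. The term list is an assumption built from the terms tested in this article (cgi-bin, pubs, other), and the function name `opaque_segments` is hypothetical.

```python
from urllib.parse import urlparse

# Assumed term list, taken from the examples in this article.
# Extend it with whatever jargon shows up in your own site's URLs.
OPAQUE_TERMS = {"cgi-bin", "pubs", "other"}


def opaque_segments(url: str) -> list[str]:
    """Return the path segments of `url` that appear in OPAQUE_TERMS."""
    # urlparse only separates host from path when a scheme is present,
    # so add one for bare addresses such as "www.agency.gov/...".
    if "://" not in url:
        url = "http://" + url
    path = urlparse(url).path
    # Split the path on "/" and drop the empty pieces.
    segments = [s for s in path.split("/") if s]
    # Keep only the segments that match the term list, ignoring case.
    return [s for s in segments if s.lower() in OPAQUE_TERMS]


print(opaque_segments("www.agency.gov/documents/cgi-bin/filename.pdf"))
```

For the sample URL used earlier, this prints `['cgi-bin']`. A URL with no listed terms returns an empty list.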
How bad could their impact on your customers’ search results be? Consider these two quick experiments:
Searching for cgi bin invasive species, I found:
Drop the cgi bin, and the results are:
The top results in the first search aren’t among the results of the second—not in the top four pages of results, anyway. The opposite is also true—the top results in the second search don’t appear when cgi bin is used.
<li style="margin-bottom: 15px"> <strong>Invasive species in American forests</strong>: <p> Searching for <strong>pubs other invasive species American forests</strong>, I found: </p> <p> Without <strong>pubs other</strong>, the results are: </p> </li> </ol> <p> In both of these searches, the Nature Conservancy’s Fading Forests is the fifth result, but that is the only thing the top four pages of results have in common. </p> <p> These experiments show that Google not only considers the terms that appear in the URL of the page but also puts heavy weight on them in ranking the items it finds. If the URL didn’t matter, the results in each pair of searches should have been virtually the same. Instead, except for the one article by the Nature Conservancy, they have nothing in common. </p> <h2> Try It Out on Your Site </h2> <p> Do the URLs produced by your website get in the way when your customers try to find your content? Test it out and see: </p> <ol> <li> Using your website navigation, find several different Web pages or publications. Choose information that many people need or should know. </li> <li> Examine the URL of each item you found. Does it contain words or abbreviations that have nothing to do with the subject of the item? </li> <li> Using Google, search the Web for two or three words people who need this information would have in mind. Is your item on the first page of hits? The first two pages? (Very few people look at even the second page, so the third and fourth hardly matter.) </li> <li> Do the search again, but this time add two terms from the item’s URL. For the greatest impact, put them before the other terms. Are the results the same as you saw before? Have your items risen in the rankings? </li> </ol> <p> The differences you see will give you an idea of how important it could be to redesign your website with more meaningful URLs in mind. 
If your URLs aren’t meaningful, but the difference in the results of your experiment isn’t dramatic, it might be that other factors are counterbalancing the impact of the poor URLs. </p> <h2> Tip: Your Link Text Matters, Too </h2> <p> Google also puts heavy weight on the words used in links that point to your Web pages and documents. So even if you have a URL structure that works something like this: </p> <p> communications.agency.gov/publications/reports-and-studies/fy2014/lead-water.pdf </p> <p> people who need that document might still find it if enough links that point to it read “Lead in My Drinking Water: How Toxic Is It?” </p> <p> Other words can matter, too. Because of poor practices like these: </p> <p> <span style="text-decoration: underline"><span style="color: #0000ff;text-decoration: underline">Download the report</span></span>, “Lead in My Drinking Water: How Toxic Is It?” </p> <p> To read “Lead in My Drinking Water: How Toxic Is It?” <span style="text-decoration: underline"><span style="color: #0000ff;text-decoration: underline">click here</span></span>. </p> <p> Google also pays attention to words that are close to links in your Web pages and documents. After all, Google wants to help its customers find what they need no matter how little the publishers have done to help. </p> <p> So keep these two points in mind: </p> <ol> <li> If your URLs are a problem, but you can’t do anything about them right away, work on the wording of links and words close to the links to your most important information. Make sure that you include as many words that would be good search terms as you can without making the sentence hard to read. </li> <li> If you test your own URLs in front of people who need to get this point, be sure you have tested the searches on your own beforehand. You don’t want stunningly powerful link text to hide the impact of poor URLs, at least not until you’ve made your point. 
</li> </ol> <p> <em>Cliff Tyllick has worked on clear communication, usability, and Web development since the new authoring tools were HyperCard and Owl Guide—in other words, long before there was a World Wide Web. He is now the accessibility coordinator of the Texas Department of Aging and Disability Services.</em> </p>