If you were taking an English language test today, or a mathematics test, and you were asked to define “algorithm”, what definition would you provide? Do any of the following match your idea of what an “algorithm” is?
- A process for completing tasks.
- The means by which the end is reached.
- A problem for which there is no resolution.
- A method for solving problems.
- A method for defining methods.
A lot of people find the SEO Theory blog through search engine referrals for variations on “SEO algorithm”, “SEO algorithms”, “search engine algorithm”, etc. The funny thing about those referrals is that I haven’t actually written about SEO algorithms. I wrote an SEO algorithm roundup article last year (some of the advice in that article is now outdated, by the way). But what I called “SEO algorithms” in that post were not really SEO algorithms.
Search engine algorithms are complex things. One does not simply
detail a search engine algorithm in a single blog post. But one can recap (or attempt to recap) the basic steps in the search indexing process. A fair number of SEOs have done this, some even using pictures. None of them have really done an adequate job. Nor am I likely to do an adequate job.
Search engines don’t have much to work with when they are indexing billions of pages. They get only a few hundred pieces of information to pick from. If you have ever designed an inventory management system, you’ll immediately see the advantages you have over a search engine. If you have never designed an inventory management system, you may appreciate the comparison with a little explanation.
Let’s say you operate a warehouse for automobile parts. On average, I would say you have to stock around 100,000 individually identified parts. Each part comes with one or more unique identification strings or tags. The manufacturers provide their own model numbers and serial numbers, shippers and distributors may provide their own tracking IDs, and retailers (you, the guy with the warehouse) usually assign their own identification strings for internal tracking.
That one paragraph provides you with more detailed information about any given manufactured item intended for use in an automobile than any search engine knows about Web pages. If search engines could
know that every Web page was tagged with one or more unique identifiers other people had provided, that would make life so much easier for them. But as it is, anyone who has struggled with canonical URL issues knows that search engines can easily confuse one page with many.
In order to index and arrange billions of pages, search engines have to make up their own unique identifiers and manage those identifiers without the benefit of making sanity checks against other people’s identifiers. But the average inventory management system has more advantages over search engines than that.
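Before moving on to those other advantages, here is a minimal sketch (in Python, with made-up canonicalization rules of my own) of what “making up your own unique identifiers” might look like in practice: collapse the obvious URL variations into one canonical form, then hand that form an internal document ID.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical canonicalization rules -- real engines use many more signals.
def canonicalize(url: str) -> str:
    scheme, host, path, query, _ = urlsplit(url)
    host = host.lower()
    if host.startswith("www."):
        host = host[4:]                      # treat www/non-www as one page
    path = path.rstrip("/") or "/"           # collapse trailing-slash variants
    return urlunsplit((scheme.lower(), host, path, query, ""))

class DocIdAssigner:
    """Hands out internal IDs, since pages carry no externally supplied tag."""
    def __init__(self):
        self._ids = {}

    def doc_id(self, url: str) -> int:
        key = canonicalize(url)
        return self._ids.setdefault(key, len(self._ids))

ids = DocIdAssigner()
print(ids.doc_id("http://www.example.com/widgets/"))  # 0
print(ids.doc_id("http://example.com/widgets"))       # 0 -- same page, same ID
print(ids.doc_id("http://example.com/other-page"))    # 1
```

The inventory manager never has to guess like this; the part arrives with its identifiers already attached.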
Knowing that an auto parts warehouse needs to stock about 100,000 different types of parts, we can design our facilities, software, and procedures around 100,000 unique types of parts. A search engine has absolutely no idea of how many pages it will eventually be asked to index. Your resources have to be allocated very conservatively if you are dealing with an open-ended inventory rather than with a limited inventory.
An auto parts warehouse can track customer purchasing habits over time and find out which parts are most likely to be in high demand. A search engine can track queries and clicks, but because search engines see 20-25% new queries every month, they never really know which pages will be in high demand, or for how long. The typical auto parts warehouse doesn’t see 20-25% new parts requests every month.
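To put a number on that churn, here is a rough illustration with invented query logs (the 20-25% figure above is for real engines; these numbers are made up): count how many of this month’s distinct queries the engine has never seen before.

```python
# Invented query logs, just to illustrate the churn problem.
previous_months = {"buy red widgets", "widget repair", "blue widgets"}
this_month = ["buy red widgets", "widget repair near me", "widget repair",
              "left-handed widgets", "blue widgets"]

distinct = set(this_month)
never_seen = distinct - previous_months
print(f"{len(never_seen) / len(distinct):.0%} of this month's distinct queries are new")
# -> 40% of this month's distinct queries are new
```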
Predictability influences how you manage and organize data. Unpredictability also influences how you manage and organize data.
So think of how you might organize an endless supply of new Web pages as you find them AND how you might respond to an endless stream of new requests for information that your constantly growing (or changing) inventory of Web pages may or may not satisfy. In today’s world of search the major search engines rely on two major factors more than anything else: content and links.
Content is a fuzzy concept. Does content include the metadata that accompanies many Web pages? Does content include descriptive text that accompanies links (such as the descriptions we provide in directory listings)?
Links may seem more straightforward than content, but the answers we give to the content questions may make links more complex. After all, if we don’t associate all the text around a link with the destination page, should it be associated with the link? Have you ever thought about a search engine simply looking at a link for itself rather than for the relationship it creates between two documents?
A search engine can collect a lot of information about a link and some search engines may indeed be doing that. They may use that information to determine whether the link should be trusted, whether it should be given extra weight, or whether it should be followed (crawled). A search engine can record how it handles what it finds on the destination page and associate that finding with the link (or, perhaps more likely, with the linking page).
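As a rough sketch of what “a lot of information about a link” could mean (the field names are my own invention, not anything a search engine has documented), the record a crawler keeps might look something like this:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical record of what a crawler might note about a single link.
# Field names are illustrative only; no search engine documents its internals.
@dataclass
class LinkRecord:
    source_url: str
    target_url: str
    anchor_text: str
    rel_nofollow: bool = False           # did the linking page mark it nofollow?
    surrounding_text: str = ""           # text near the link on the source page
    first_seen: str = ""                 # when the crawler first saw the link
    target_status: Optional[int] = None  # HTTP status when the target was fetched

link = LinkRecord(
    source_url="http://example.com/reviews",
    target_url="http://example.org/blue-widgets",
    anchor_text="blue widgets",
    surrounding_text="We compared several blue widgets and liked this one best.",
    first_seen="2009-04-01",
)
# Decisions about trusting, weighting, or crawling the link can key off these
# attributes plus whatever the engine later records about the destination page.
```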
Ultimately, the search engine is trying to solve two problems: first, how to manage an ever-growing inventory of Web pages of unpredictable quantity, quality, and design; second, how to respond to a continuous stream of requests for information that it may or may not have seen before.
In mathematics, one algorithm can be used to solve more than one problem, but the problems all have to belong to the same group (or class) of problems. They have to share similar characteristics. For example, you could use the same algorithm to find out how fast two trains are traveling if you are given their relative speeds and directions AND to find out how fast a bullet is traveling toward a moving object if you are provided with similar information. But you would have to use a completely different algorithm to determine the volume of a sphere.
Managing data and searching data require different processes. Hence, every search engine requires at least two algorithms. When you speak of a search engine’s algorithm, therefore, you’re thinking of a mega-algorithm that incorporates many smaller algorithms. Your task as a search optimizer becomes more complex if you address that mega-algorithm rather than focus on each real algorithm separately.
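To make the two-algorithm point concrete, here is a toy sketch in Python: one routine manages the inventory of pages (indexing) and a separate routine answers requests against it (query resolution). Everything here is deliberately oversimplified; a real engine layers many more algorithms on top of each half.

```python
from collections import defaultdict

documents = {}               # doc ID -> original text (the inventory)
index = defaultdict(set)     # word  -> set of doc IDs (the catalog)

# Algorithm #1: manage the inventory (indexing).
def index_page(doc_id: int, text: str) -> None:
    documents[doc_id] = text
    for word in text.lower().split():
        index[word].add(doc_id)

# Algorithm #2: respond to requests (query resolution), here a plain boolean AND.
def search(query: str) -> set:
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index_page(0, "ruby rings with sapphire stones")
index_page(1, "silver rings without stones")
print(search("rings stones"))    # {0, 1}
print(search("sapphire rings"))  # {0}
```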
And that brings us to SEO algorithms. An SEO algorithm is the process by which you optimize content for search. Optimization doesn’t mean getting the best possible ranking. In our SEO glossary here on SEO Theory you’ll find this definition for search engine optimization: “The art of designing or modifying Web pages to rank well in search engines.”
That is the broadest, most comprehensive definition possible. I will occasionally clarify the definition by adding that we want converting traffic, but sometimes you optimize for something else that you hope to achieve through search. Spammers and SEOs alike prefer to optimize their link profiles (although the rules for link profile optimization have never been articulated, so basically no one knows what they are optimizing for).
In search engine optimization, you can rely on one algorithm to address the two types of search engine algorithms or you can rely on several algorithms. Most SEOs seem to prefer the several-algorithm approach, but let’s look at the one-algorithm approach first.
Your optimization problem can be described this way: how do you get a page indexed so that it is used to respond to as many queries as possible?
Our goal is to achieve maximum optimization, such as ranking a single page for 100 SEO questions (technically, I did not address all 100 questions; I got tired somewhere in the 80s or 90s, I think).
Maximum optimization is an ideal state in which a page ranks well (not necessarily
first) for every query to which it is relevant. I don’t think that is humanly possible, at least not with the current level of SEO theory we have available.
Your algorithm needs to be simple but it can be self-referring. That is, it can invoke itself. We don’t usually speak in terms of “invoking an algorithm for SEO” but that is essentially what we do. Maximum optimization requires that a page be strongly relevant to as many queries as its indexable words are relevant to. To achieve maximum optimization, you have to repeat and emphasize every word in every possible combination in as many ways as possible.
You could create a huge page that attempts to tackle everything, or you could look at how you construct your text, how you emphasize it, and how you repeat terms to determine a pattern that ensures every word (or nearly every word) is used optimally. Hence, you may find yourself emphasizing your emphasis, repeating your repetitions, and reorganizing your word patterns into more complex patterns.
We used to call that last part
power keyword optimization, where you construct complex keyword expressions that can be broken down into less complex keyword expressions. This method was proposed for the
keywords meta tag in the late 1990s. We can extend the method to the indexable copy of the page and call it
power content optimization. So, instead of using “keyword1 keyword2” you use “keyword3 keyword1 keyword2 keyword4” and optimize for several variations.
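Here is a small sketch of the idea (my own toy illustration, not the technique as originally published): break a “power” expression into its contiguous sub-expressions and you can see which shorter phrases the single string covers.

```python
def sub_expressions(expression: str, min_words: int = 2):
    """Every contiguous sub-phrase of a 'power' keyword expression."""
    words = expression.split()
    phrases = []
    for length in range(len(words), min_words - 1, -1):
        for start in range(len(words) - length + 1):
            phrases.append(" ".join(words[start:start + length]))
    return phrases

# "keyword3 keyword1 keyword2 keyword4" covers several shorter expressions,
# including the original "keyword1 keyword2" target.
for phrase in sub_expressions("keyword3 keyword1 keyword2 keyword4"):
    print(phrase)
```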
There is a little more to it but let me move on.
Most SEOs and eTailers are not interested in
maximum optimization. The algorithms one might employ for maximum optimization are more theoretical toys than anything else, as most people are looking for a return on investment. But many people are very interested in what we could call
extended optimization, where you design your content to rank well for many queries (but nothing like “all relevant queries”).
For example, let’s say you have a jewelry Web site and you have a category page that lists 20 different types of jewelry (perhaps they are all rings with stones). Although you want those individual ring-with-stone pages to rank well for their most specific queries, would it not be great to have the category page rank alongside them? Sure it would. That’s
extended optimization. Of course, not every search engine prefers to show category pages if it can serve up the detail pages.
Your algorithm is defined in terms of
what you do on the page,
what you do around the page, and
what you do to the page. “On the page” is self-evident to anyone who is familiar with the basic concepts of SEO page design.
What you do around the page is a little less familiar because most people don’t think in terms of “managing sibling relationships” but rather they focus on “theming a Web site”. You don’t need (or want) to theme a Web site, but you do want to cross-promote your most valuable content for a specific query. Put your best foot forward.
What you do to a page usually occurs as link building, but you can do other things to a page (such as embed it in a frame, embed it in an iframe, block it, replicate it across multiple ambiguous URLs, etc.). That is, most people focus on link building rather than on piggybacking content, although there are optimizers out there who have piggybacked plenty of my content.
A well-designed Web site should address the types of search engine algorithms (indexing and query resolution) adequately in most cases. However, if you’re the kind of person who wants to walk around the mountain rather than quickly fly over it, you can do what most of your fellow SEOs have been doing for years.
You can devise a links-to-get-indexed algorithm and a links-to-get-ranked algorithm. Remember that link building is the least efficient, least effective means of optimizing for search. It’s the most time-consuming and resource-hogging approach to search engine optimization. Therefore, everyone does it simply because it looks like it’s the
right way to do things. After all, the right way has to be harder than other ways, right?
So how do we manage to separate our algorithms for search engine optimization through link building? Because there are links where you control the anchor text and links where you cannot control the anchor text. If you cannot control the anchor text, all you can use the link for (with respect to search engine optimization; there are clearly other valuable uses) is to get crawled and indexed.
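If it helps to picture the split, here is a sketch (illustrative only; “controlled” simply means you chose the anchor text yourself) that sorts a link inventory into the two buckets:

```python
# Illustrative split of a link inventory into the two link-building "algorithms".
links = [
    {"anchor": "blue widgets", "controlled_anchor": True},
    {"anchor": "click here",   "controlled_anchor": False},
    {"anchor": "example.com",  "controlled_anchor": False},
    {"anchor": "ruby rings",   "controlled_anchor": True},
]

links_to_get_ranked  = [l for l in links if l["controlled_anchor"]]
links_to_get_indexed = [l for l in links if not l["controlled_anchor"]]

print(len(links_to_get_ranked), "links support ranking for chosen anchor text")
print(len(links_to_get_indexed), "links mainly help the page get crawled and indexed")
```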
Some people invest a great deal of time in building links with anchor text they cannot control. These types of links “look natural”, “confer trust”, “reflect editorial opinion”, and (my favorite) “are SEO friendly”.
Some people just go for the throat and grab every link they can get with the anchor text they want. These types of links usually “look spammy”, rarely confer PageRank (or trust), bypass editorial opinion, and (my favorite) “are SEO friendly”.
Anything that is
SEO friendly must be good, right?
You probably divide your link building time between asking for links and creating links. However, many SEOs are now chasing the dream of creating long-lasting link bait (something that rarely happens). Link bait provides you with “natural” links whose anchor text you cannot control. Link bait will get you crawled and will probably help you rank for expressions you never imagined, but it isn’t helping you optimize for both types of
search algorithm.
Good link bait should statistically attract more links with targeted anchor text than not. Great link bait creates a brand, but that’s another story.
If you divide your resources between creating link bait and building links, you’re not optimizing your content. Link bait can be optimized after the fact, but most link bait that I have looked at is not optimized. It’s designed to attract links, not to rank well in search results. A well-optimized page should rank well for a non-competitive query. A highly optimized page should not require many links to rank well even in competitive queries.
So if you’re not thinking in terms of “SEO algorithms” then you’re not looking at how you allocate your resources. You’re not looking at how you solve the problems of getting your content indexed and getting it to rank well.
Simply being indexed doesn’t guarantee a good ranking. Of course, simply ranking well doesn’t guarantee click-throughs and conversions but that leads to a problem that doesn’t have anything to do with the search engine algorithms.
In search engine optimization there is no
right way to optimize. Every query resolves the question of “which optimization methodology works best” only for itself. You cannot use one query to prove a point about optimization with another query. Your SEO algorithm therefore has to be immensely flexible but it also has to be replaceable.
That is, to do this right, you have to know more than one way to optimize. You have to be prepared to tackle your problems from different angles every time because sometimes the old tricks won’t do the job and sometimes the new tricks won’t do the job.
An algorithm is a method for solving problems. There is no universal algorithm in search engine optimization, although the SEO Method applies to all of them: experiment, evaluate, adjust.