About Dan Thies

Here are my most recent posts

Ding Ding, Google! Time For Cathedral v. Bazaar, Round 2?

When Jimmy Wales (one of the founders of Wikipedia) announced plans to create a search engine to compete with Google, some people took it seriously, while others dismissed it as a pie-in-the-sky fantasy. Here’s what Jimbo himself has to say:

"Search is part of the fundamental infrastructure of the Internet. And we are making it open source. Wikia Search will start to change search from being proprietary, top-down, and closed."

Well, an early alpha release of that "open source" search engine is now online, albeit with a very small data set, and you can see the first hints at how the user interface will differ.

Matt Cutts blogged about it with a screenshot of the Nutch relevance scoring display. Michael Arrington called it a big disappointment. I don’t know what he expected, really. It’s Nutch for the search engine and Grub for the crawler – but we knew that was coming before they released it.

What interests me about Wikia isn’t the current state of the index or SERPs. You’d expect those to be next to useless right now. What interests me is whether an open source, community effort can build a search engine to rival the best efforts of large commercial engines like Google. To put it another way, what are the limits of open source?

In The Cathedral and the Bazaar, which is one of the canons of the Open Source movement, Eric Raymond tells the story of how Linux managed to succeed, and of an open-source project that he himself led.

The parallels are interesting enough – like Linus Torvalds and Eric Raymond, the Wikia team has started with existing applications (Nutch and Grub). As with Linux, early releases don’t look like much (I followed Linux from the very beginning, thanks to a co-worker who was on it from the start).

Whether this initial Wikia release rises to the level that Eric Raymond described as a precondition for success, whether it will really get enough people excited, I am not qualified to say… but it doesn’t look good. Linux got people excited back in 1991, but the bar was lower, because nobody had done anything like it before.

Sixteen years later, the community may be more jaded, and less likely to contribute to an effort that isn’t necessarily truly open source. The community that they need to engage is also a lot broader.

For me, three big questions arise:

  1. How will Wikia engage the minds of information retrieval scientists, and not just coders? Writing software that runs and is dependable is one thing – Open Source has made the case that it can do this extremely well. But to build a great search engine, you need great algorithms, which means you need a lot of people who understand (for example) what BM25F does, the pros and cons of using it, etc. – and you need to somehow get these people to work together. Oh yeah, and they’re starting at least 5 years behind. How can Wikia keep the ‘best and brightest’ engaged in an open source project, when the major search engines are hiring talent as fast as they can?
     
  2. How will Wikia’s user-input improve search results, rather than helping the usual pissers game the system? It’s hard enough to get a large user community to edit the Wikipedia without melting down – allowing user feedback on every SERP will not happen without a lot of challenges related to scale. How will Wikia respond to spam in general? The problem will only get bigger if they actually gain traction with users. Eyeballs attract spammers like flies to a rotting carcass – and they’ll be probing for weaknesses every step of the way.
     
  3. How will Wikia survive success, in the unlikely event that they can solve the other problems? The scale of the physical operations for Google is simply staggering. The sound of every hard drive Google owns, moving at once, would probably be loud enough to knock down the walls of Jericho. Google has huge resources because of ad revenue, and hundreds of extremely talented people working full time (minus 20%?) on solving their growth and scale problems. If Wikia takes off, what will they have?

So is this really Cathedral vs. Bazaar, round 2?

Or is it just Google’s Cathedral vs. Jimbo Wales’ pet project? That will depend on what happens at Wikia, because building a search engine is not the same thing as building software. It’s orders of magnitude more difficult. I wish them luck. I sorta hope that they’re up to it. It would be nice to see the underdog at least make a good show of it.

Stop Words Are Dead! Did I Miss Another Memo?

That’s right folks… after years of telling the world to pretend that "stop words" don’t exist, at least when you’re writing copy, now you can really do it for real, because…

Stop Words Are Dead!

Well, they’re dead at Google and MSN anyway. Yahoo might want to get it in gear. I didn’t bother checking Ask, A9, or any of the other spares – I’ll leave that as an exercise for enterprising readers who actually care what the 4th-tier search engines are up to right now.

What are stop words? Well, stop words are (were) words that are so common that search engines have chosen to (sort of) ignore them, by not indexing them when they crawl a web page. Words like (a, and, is, or, the, was), etc.

This doesn’t (didn’t) mean that they have no effect on search results, because the index records the position of words, so even a "blank" in the word order created by a stop word could still affect the order and proximity of other words that you searched for. If this makes no sense, don’t worry – it didn’t matter much in the first place, and now stop words are dead.
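If the "blank" idea is hard to picture, here is a minimal sketch in Python, assuming a toy whitespace tokenizer and the six stop words listed above. The function name `build_positional_index` is made up for illustration; it is not how any real search engine builds its index.

```python
# Toy positional index: stop words are skipped, but they still use up a
# position, so the gap they leave changes the distance between other words.
STOP_WORDS = {"a", "and", "is", "or", "the", "was"}  # the examples listed above

def build_positional_index(text, stop_words=STOP_WORDS):
    """Map each indexed word to the list of positions where it occurs."""
    index = {}
    for position, word in enumerate(text.lower().split()):
        if word in stop_words:
            continue  # not indexed, but the position counter still advances
        index.setdefault(word, []).append(position)
    return index

index = build_positional_index("kick the bucket")
print(index)  # {'kick': [0], 'bucket': [2]}
```

Note that "bucket" lands at position 2, not 1. The word "the" is gone from the index, but the hole it left is still there.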

Did I Miss A Memo?

If this isn’t news to anyone, someone let me know, because I haven’t seen it written up anywhere. It was news to me anyway – not especially exciting news, but that’s beside the point. A couple years ago when I was working on the Search Engine Marketing Kit, stop words were definitely in effect. We demonstrated this with a bunch of comparative searches like:

  • cats or dogs vs. cats and dogs
  • kick the bucket vs. kick a bucket

These search queries will return the exact same search results, if the words (a, and, or, the) aren’t being indexed. If those words are being indexed, then we’d expect to see different search results – as we now see on MSN and Google. You have to do a bunch of searches to make sure, because common words, even when they are indexed, have a fairly small effect on ranking.
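The logic behind those comparative searches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `indexed_terms` and `same_to_engine` are hypothetical, and a real engine does far more than split on spaces.

```python
STOP_WORDS = {"a", "and", "or", "the"}  # the four words used in the test queries

def indexed_terms(query, stop_words):
    """Return (position, word) pairs the engine would actually match on."""
    return [(pos, word) for pos, word in enumerate(query.lower().split())
            if word not in stop_words]

def same_to_engine(query_a, query_b, stop_words):
    """True when two queries look identical once stop words are dropped."""
    return indexed_terms(query_a, stop_words) == indexed_terms(query_b, stop_words)

# With stop words in effect, each pair collapses to the same terms and positions.
print(same_to_engine("cats or dogs", "cats and dogs", STOP_WORDS))     # True
print(same_to_engine("kick the bucket", "kick a bucket", STOP_WORDS))  # True
# With every word indexed (an empty stop list), the pair no longer matches.
print(same_to_engine("cats or dogs", "cats and dogs", set()))          # False
```

When the two queries are identical to the engine, the results have to be identical too. When they are not, the results are free to differ, which is the behavior the test is looking for.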

So, last month, a friend of mine asked me to look over a list of stop words he’d received and let me know if they were correct. When I checked on Yahoo, most of them were correct. Although Yahoo does show different "universal search" type results (video, product, etc.) with different queries, the organic search results themselves did not appear to change on any of my test queries.

When I checked on MSN and Google though… none of the "stop words" on the list worked as stop words. Zippo. So it looks like somewhere along the line, 2 of the 3 major search engines stopped stopping, and started indexing every word.

Why should you even care about stop words?

Well, if you’ve been ignoring my advice all these years and fretting about stop words, you have one more reason to stop worrying and start writing naturally. If you use search engines, it’s at least a small improvement in the quality of search results when you’re using common words, as often happens when searching for books, lyrics, movies, music, Vogon poetry, and the like.

Fact check for me?

  • If you can find stop words that appear to still be "working" on MSN and Google, do let me know.
  • If you can find some indication that Yahoo is indexing commonly known stop words, do let me know.
  • If someone else has written this up and I really did just miss the memo, please post the original citation.

In other news…

My Stompernet colleague Don Crowther has put together a simply amazing video on how to leverage social media, social marketing, and Web 2.0 for traffic, conversion, and SEO. You need to watch the whole thing to fully understand the point, but it’s absolutely worth your time to do just that. Don’s also releasing a free PDF report with more information either today or tomorrow. While it’s in support of a coaching program that he’s launching soon, my friends at Stompernet really know how to "move the free line" and give away great information. It’s too bad they had to cut it down to only 50 minutes, but as I understand it the other 20-30 minutes was almost as good.

Designers and conversion specialists: Another Stompernet colleague, Andy Edmonds (our "chief scientist"), spent a good deal of time and treasure in 2007 developing a "vision simulation tool" called Stomper Scrutinizer that works just like a web browser, to show you how people see the colors, type, and navigational elements on your web page designs. Then our fearless CEO decided to, what the heck, give the software away as a holiday gift to the community. Nothing to buy, just go get this free software. The video linked from that blog post is a real eye-opener, by the way. Working with smart people is really cool.

Free Camtasia Studio – Make your own screencast videos

Just saw this today… Since we’ve got a video contest coming up, I thought you could all use a link to some free software. :D

Camtasia Studio 3, free: http://www.paulcolligan.com/2007/11/21/free-camtasia-studio/

No time to hunt down the original source, so Paul gets the link.

Camtasia Studio is the same software I use to make my own video trainings, like the free 6-week Link Building Clinic for registered subscribers.

BTW, quick tip… don’t try to record sound with a microphone plugged into a hardware audio jack.
Just about any USB mike (I use a Logitech gaming headset nowadays) will sound a lot cleaner.

Why Google Can't Just "Dump" PageRank

2008 SEMMY Finalist. Update: This post has been nominated for a SEMMY award – vote today!

When it comes to innovations in information retrieval, and there have been many over the years, none has achieved the legendary status of Page & Brin’s “PageRank” algorithm. Its elegance, its simplicity, its wicked subtlety… the sheer underlying truth of the thing. It’s beautiful.

Unfortunately, Google decided several years ago to release a search toolbar for web browsers, and they included a visual representation of PageRank. The green (or gray, or white) bar. The green pixels, as I’ve been calling them since it came out…  The obsession with those little green dots has loomed large in SEO ever since.

Now forget about the green pixels for a few minutes if you can. This is about the real PageRank score, not what they show you on the toolbar. It’s about why Google can’t just get rid of PageRank, why the Supplemental Index exists, and just for fun, why Robert Scoble shouldn’t talk about this stuff.

This may be a little bit heavy for some readers. Take your time. Read it twice if you have to.

Query-Dependent Ranking Factors: The Search Engines’ Secret Sauce

It’s probably safe to say that the vast majority of the factors used by search engines to rank web pages on search results are query-dependent. That is to say, the search query itself affects what factors are involved, the weight of different factors, etc.

For example, when I search for “purple widgets,” the weight of any occurrences of those terms in a given document depends on other documents in the index, etc. The importance of word proximity and order will depend on other documents.

I have discussed query-dependent factors before, in the context of explaining why statistical analysis doesn’t give you the “magic number.”

PageRank is different, because it is Query-Independent.

The PageRank score for a document exists independently of the search query – it is a property of the document (URL) itself. The algorithm is patented, and public. In theory, if you had a list of every URL in Google’s index, you could determine the PageRank score of every document (URL).

However, you would not be able to arrive at the same score that Google uses. Your number will be different, because you don’t know which links Google is ignoring due to paid link filtering and other link spam detection processes.
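To make that concrete, here is a small Python sketch of the textbook power-iteration form of PageRank on a four-page toy web. The graph, the damping factor of 0.85, and the choice of which link gets "ignored" are all assumptions for illustration; nothing here reflects Google's actual data or filtering.

```python
def pagerank(links, damping=0.85, iterations=100):
    """links: dict mapping each page to the list of pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank

# Toy web: the score depends only on the link graph, never on a query.
web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
full = pagerank(web)

# Same graph, but pretend the engine silently ignores the link d -> c.
filtered = {**web, "d": []}
guess = pagerank(filtered)

print(round(full["c"], 3), round(guess["c"], 3))
```

Dropping a single link changes the score for page "c", and the change ripples out to every page it links to. An outsider who cannot see which links are being filtered ends up with different numbers than the engine has.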

Side note: before Google announced nofollow, they had to have a working and thoroughly tested implementation. How would they test it? By using an “invisible nofollow” internally, to tag untrusted links on untrusted pages on untrusted sites.

They’ve been filtering links (e.g. paid links) for years, folks. As I have discussed and illustrated in prior posts, nofollow is also a useful tool for SEO.

Some might argue with me, but I believe that for most search results, query-dependent factors are more important than PageRank.

This is somewhat obscured by Google’s heavy reliance on anchor text (they are not alone in this), which can make it appear that PageRank is more important than it actually is. Since the same links that pass anchor text also pass PageRank, you do tend to see higher-PR pages towards the top of search results.

Now some news that the SEO world hasn’t gotten yet…

PageRank’s True Purpose Is Not Ranking Documents

I was talking with Andy Edmonds one day, and he was sort of wondering why nobody really talks about the real benefit of PageRank to someone trying to operate a large-scale hypertext search engine. In Andy’s world (he studies this stuff a lot more than I do), PageRank is partly a performance hack.

Quick side trip – there are at least 4 steps that a search engine must go through to deliver search results in response to a query. Query analysis (examining the user input), document retrieval (fetch 1000 documents from the index), ranking, and presentation (output to the user).

The ranking stage may consist of an initial ranking and then a reordering (Google’s -30 and -999 penalties, for example) before presenting search results. Universal search may add specialized search results (news, images, maps, local, etc.) depending on the query.

The biggest benefit of using PageRank doesn’t come in at the ranking step, it comes in at the document retrieval stage, when you’re trying to decide which of 27,438,902 documents that matched the query text are actually worth ranking. The search engine’s job at the retrieval stage is to pick 1,000 documents to rank, without losing any really important documents.
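As a rough Python sketch of that retrieval step, assuming each matching document already has a stored query-independent score: the function name `retrieve_candidates` and the scores are hypothetical, and the limit is shrunk from 1,000 to 3 so the output fits on one line.

```python
import heapq

def retrieve_candidates(matching_doc_ids, pagerank_of, limit=1000):
    """Retrieval stage: keep only the `limit` matching documents with the
    highest query-independent score, so the ranking stage has less to do."""
    return heapq.nlargest(limit, matching_doc_ids, key=pagerank_of)

# Hypothetical stored scores for five documents that matched the query text.
scores = {"doc1": 0.02, "doc2": 0.31, "doc3": 0.07, "doc4": 0.55, "doc5": 0.01}

shortlist = retrieve_candidates(scores.keys(), scores.get, limit=3)
print(shortlist)  # ['doc4', 'doc2', 'doc3']
```

The point of the design is that the score is already sitting there before the query arrives. Picking the shortlist is a cheap lookup and sort, and the expensive query-dependent work only runs on what survives the cut.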

PageRank excels at this. In fact, PageRank’s role in the retrieval process is why Google’s “Supplemental Index” exists. When they created the SI, they recognized that some documents had such low PageRank that they were unlikely to make the cut for most search queries. Those documents live in the supplemental index.

Maintaining a smaller main index improved the speed of document retrieval, and if they couldn’t get 1000 results from the main index, they could always go dip into the supplemental index to get more documents. Google didn’t run out of “document ID numbers” for the main index (sorry, Daniel Brandt), they did it to improve performance.
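The fallback described above can be sketched like this in Python. Everything in it is an assumption for illustration: the indexes are plain dictionaries mapping a document ID to its set of words, a "match" means the document contains every query term, and `retrieve` is a made-up name.

```python
def retrieve(query_terms, main_index, supplemental_index, wanted=1000):
    """Look in the main index first; dip into the supplemental index only
    if the main index cannot supply `wanted` matching documents."""
    def matches(index):
        # A document matches when it contains every query term.
        return [doc for doc, words in index.items() if query_terms <= words]

    results = matches(main_index)
    if len(results) < wanted:
        results += matches(supplemental_index)[: wanted - len(results)]
    return results[:wanted]

main = {"m1": {"purple", "widgets"}, "m2": {"purple", "gadgets"}}
supplemental = {"s1": {"purple", "widgets", "cheap"}, "s2": {"purple", "widgets"}}

print(retrieve({"purple", "widgets"}, main, supplemental, wanted=1))  # ['m1']
print(retrieve({"purple", "widgets"}, main, supplemental, wanted=2))  # ['m1', 's1']
```

When the main index has enough matches, the supplemental index is never touched, which is where the speed gain comes from.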

Google has made some changes to how they manage the supplemental index recently, and in many cases they may dip into the supplemental index even when there are more than 1000 matching documents in the main index. For some queries, the “best results” from the SI (based on query-dependent factors) are better than results 900-1000 from the main index. Google’s on top of that problem. They’re working on it. They’re doing stuff.

So… if you think that Google is going to just dump PageRank, now you know why they can’t. It’s probably the greatest performance hack in the history of information retrieval.

How PageRank Can Become (Somewhat) Query-Dependent

I wrote a “somewhat speculative” report on Topic-Sensitive PageRank (PDF) a few years back, after Google’s “Florida Update” shook up the SEO world. Now, it’s just possible (OK, probable) that I was wrong about that. It’s also possible that I was right, and I’ll cling to that hope until someone at Google actually denies it.

However, we do know that Google implemented something like Topic-Sensitive PageRank, because they offered a couple topical search products (personalized search and site-flavored search), where you picked from a list of approximately 50 topics to skew your search results. In order to offer a topical search product, they had to have something in place to help deliver those results.

As I described in that paper, with a topical PageRank score for each document, Google could map some search queries to topics at the query analysis stage. They could use that information to retrieve a more topically relevant set of documents at the retrieval stage, and of course, apply those topical scores in the same way that PageRank is used in the ranking of the final result set.
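Here is a minimal Python sketch of the general idea, not a claim about what Google runs. It uses the published topic-sensitive variant in which the random jump lands only on pages known to belong to a topic. The toy graph, the two topics, the seed pages, and the `QUERY_TOPICS` lookup standing in for query analysis are all assumptions for illustration.

```python
def topic_pagerank(links, topic_pages, damping=0.85, iterations=100):
    """PageRank where the random jump lands only on pages known to be
    about one topic, which biases every score toward that topic."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    jump = {p: (1.0 / len(topic_pages) if p in topic_pages else 0.0) for p in pages}
    rank = dict(jump)
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * jump[p] for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                for t in targets:
                    new_rank[t] += damping * rank[page] / len(targets)
            else:
                # Dangling page: send its rank back through the topic jump.
                for p in pages:
                    new_rank[p] += damping * rank[page] * jump[p]
        rank = new_rank
    return rank

web = {
    "pets-hub": ["cat-care", "dog-care"],
    "cat-care": ["pets-hub"],
    "dog-care": ["pets-hub"],
    "cars-hub": ["engine-tips"],
    "engine-tips": ["cars-hub"],
}

# One precomputed score table per topic (two topics here instead of ~50).
topic_scores = {
    "pets": topic_pagerank(web, {"pets-hub"}),
    "cars": topic_pagerank(web, {"cars-hub"}),
}

# Hypothetical query-analysis step: map a query to one of the known topics.
QUERY_TOPICS = {"kitten food": "pets", "oil change": "cars"}

def score_for(query, page):
    """Look up a page's topical score for the topic the query maps to."""
    return topic_scores[QUERY_TOPICS[query]][page]

print(score_for("kitten food", "cat-care") > score_for("kitten food", "engine-tips"))
```

Each topic costs one full score table over the whole index, computed ahead of time. That is manageable for a few dozen topics and is the reason the approach gets harder as the topic list grows.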

Could they do this? Absolutely, given a small enough set of topics, it’s not out of the question for Google to use Topic-Sensitive PageRank. Can they do it with enough topics to make it worth the effort? Well, I think that’s more difficult. 50 topics wasn’t enough to make their site-flavored and personalized search products terribly useful.

To really do more with topical analysis, search engines would need to understand more, and that means some other kind of analysis. It also means that you’d have a hell of a time scaling it up to work on 20 billion or so documents.

Why You Can’t Have A Different Score For Every Keyword

Scale is a huge problem for search engines. The web is really big, for one thing. For another, there are like a billion people searching for stuff all the time. There are a lot of things you might like to do, that you can even do in a lab, that just don’t work in a large-scale public search engine like Google.

So, when I heard that Robert Scoble was telling people that PageRank is different for every keyword, I laughed at the sheer stupidity. Then I went and read what he actually said. He didn’t say that at all, in fact, he didn’t say enough for me to even figure out what he was saying. This left it up to others in the blogosphere to interpret his words for him.

That, in a nutshell, is why Scoble just shouldn’t talk about this stuff, unless he wants to take the time to explain himself clearly – it’s too easy for the first misinterpretation to become the standard interpretation… and then everyone thinks you’re an idiot. On the other hand, maybe I should be mindful of my own glass house.

Anyway, Robert… in case you really did mean what they said you meant, here’s why that’s dumb: it won’t scale.

“That’s no moon… it’s a gigantic RAID array!” (Chewie, get us out of here…)

To store a PageRank score for every keyword for every document on the web, I calculated that this would take a disk array the size of a small moon. I did the math in my head, of course, but you get the idea. It’s 20 billion pages times 200-300 unique words per page, and you don’t just have to store it, you have to calculate it.

Technically speaking, search engines don’t operate on keywords, they operate on queries, which may be up to 10 words long at Google. Which makes the problem even harder. Let’s see, 300 factorial, squared, to the tenth power, draw the implied hypotenuse, carry the one… yep. The size of a small moon, requiring a Dyson Sphere to power it. At least.
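For anyone who wants to do the math somewhere other than in their head, here is a back-of-the-envelope version in Python. The page count, word count, and query length come from the text above. The 4 bytes per score and the way queries are counted (every ordered sequence of 1 to 10 words drawn from a page's own vocabulary, repeats allowed) are my assumptions, so treat the output as an order-of-magnitude sketch only.

```python
PAGES = 20_000_000_000   # "20 billion pages", from the text
WORDS_PER_PAGE = 250     # midpoint of "200-300 unique words per page"
BYTES_PER_SCORE = 4      # assumption: one 32-bit float per stored score
MAX_QUERY_WORDS = 10     # "up to 10 words long", from the text

# One score per (page, keyword) pair.
keyword_scores = PAGES * WORDS_PER_PAGE
keyword_bytes = keyword_scores * BYTES_PER_SCORE

# One score per (page, query): count every ordered sequence of 1..10 words
# taken from the page's own vocabulary (repeats allowed, so an upper bound).
queries_per_page = sum(WORDS_PER_PAGE ** k for k in range(1, MAX_QUERY_WORDS + 1))
query_scores = PAGES * queries_per_page

print(f"{keyword_scores:.1e} keyword scores, about {keyword_bytes / 1e12:.0f} TB")
print(f"{query_scores:.1e} query scores")
```

Under these assumptions the per-keyword case comes to 5 trillion scores, or about 20 TB of raw numbers before you count the cost of calculating and refreshing them. The per-query case is where it really blows up, by more than twenty orders of magnitude.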

They already have problems with supplying electricity and keeping the damned thing cooled down. No way.

Can Anchor Text Be Weighted On a Sliding Scale?

I get this question all the time, and I started to answer it in my last (SEO) post, but it really needs a longer explanation. Today, I’ll answer it, but I won’t explain the answer until next time. Actually, there are several variations of the question, but it goes something like this:

Is anchor text weighted differently based on the PageRank of the linking document, the amount of PageRank flowing through the link, etc.?

The short answer is no, but that might not be the right answer. The slightly longer answer is “maybe, but not the way you think.”

I will explain in great detail in my next post. I promise.

MacOS Leopard + VMWare Fusion + iPhone = Brave New World

As some readers know, I made a decision back in August to go “all in” with Stompernet. Stompernet is a membership community site for elite internet marketers, where we provide best-in-class training on a wide range of internet marketing subjects.

My primary role in Stompernet has been to develop and deliver the training on pay-per-click, with some additional contributions on SEO, analytics & conversion, copywriting, infoproduct marketing… I get to do a lot of fun stuff.

Anyway, this isn’t a post about Stompernet. It’s about how I became a Mac user, and why I think nearly everyone reading this should follow along as soon as you can.

Andy Jenkins and Brad Fallon, who started Stompernet last year, are both Mac users. Most of the Stompernet faculty (some of the smartest people I’ve ever worked with) are also Mac users. Andy “hinted” that I just might want to get one too.

My Windows laptop was in the early stages of death by overuse, so I figured I’d go get a Mac, and if it sucked, I could just stick it on a shelf and wait for a reason to use it. Cash flow not being a huge issue in my world, but still being a cheap bastard, I went out and bought the lowest-endest model of the new MacBook laptop.

My expectation was that the Mac would prove to be a “toy computer,” and that I’d be shopping for a top-of-the-line Windows Tablet PC soon enough. With XP, not Vista. Damn, but that Vista is slow.

“Damn, this sucker is fast!”

A couple things impressed me about the Mac right away. First of all, the time it took from powering on to actually having full control of the desktop environment was only 23 seconds, compared to several minutes on my Windows PC.

I repeat: the cheapest Mac you can buy was fully booted in 23 seconds. Which was nice, because if it did turn out to be a toy computer, at least it wouldn’t take long to boot when I actually needed to use it.

But if you do one amazing thing, I’ll usually give you a chance to impress me again. Since handling the mountain of email was by far the most time consuming task in my work day (2-3 hours easy just to clear the email in the morning), I decided to fire up the Mac email client and see what it could do.

I should probably mention that I use(d) Outlook on my PC, with years worth of mail filtering rules and stuff to make it go faster. Even with all that, with the huge volume of email I need to sort, I was at 2-3 hours PER DAY just dealing with email.

So how did the Mac do? I started work at 8:00am just like every day. At 8:00:23 I was in the email client, setting up my account. By 9:30am, I was done. Done with email for the day. Zero in the inbox. All replies sent. Everything filed. This includes the time I spent setting up folders and mail-filtering rules.

Random but important note: Apple’s built-in junk mail filter uses Bayesian whatchamahoozies to learn, and it learns fast. I didn’t have to go install some extra thing to get an intelligent junk mail filter. I get a false positive every couple weeks.

Plus, Apple actually realized that they should apply your mail-filtering rules BEFORE running the junk mail filter – so unlike Outlook, it didn’t randomly dump “known good” emails into the junk folder.

Once I figured out how to use “Smart Folders” I got my email processing down to about 45 minutes a day. If you’ve ever tried to use folders on Outlook/Windows and given up because of the time it takes for Outlook to “prepare the requested view,” you will appreciate the Mac mail application even more. It just works. Instantly.

“Ok, now what?”

My second day, it took less than an hour to handle the mail on my Mac… as it has every day since. So, there I was, stuck with a Mac because I needed it for email. The mission then became “how do I move all the stuff I do on Windows to the Mac?” Here are my solutions:

  1. Web browsing – Firefox for Mac works just fine, and Google Browser Sync let me move all my bookmarks, saved passwords, cookies, and all that crud over to the Mac. Problem solved, except for “what do I do if I need IE for some reason?”
  2. Recording & editing audio & video – Ed Dale sent me a list of cool apps that he runs. Easy enough to replace everything going forward, except I have a huge archive of stuff that was edited in Camtasia on Windows.
  3. All those little Windows apps (EDGE Diagrammer, HTML-Kit) that I am addicted to… what to do, what to do? Microsoft Office is kind of important – you can get an older version for the Mac, and iWork is so much better for presentations, but still – a lot of other people are using Office, so I need it.

Getting My PC Onto My Mac OS X System

Sorry Mac die-hards, I still need Windows. Not all the time, but I need to be able to use some stuff that you can only find on Windows. Not because Windows is so great but because so many people use it – I MUST be able to check web sites in IE. Not an option, it’s part of the job.

I had a Windows desktop, but sliding the chair back and forth wasn’t working for me. I wanted to get my Windows stuff into my Mac. You have some options for this:

  1. Boot Camp – basically, you set up a separate hard drive partition with Windows running on it, and you can boot up your Mac into Windows instead of MacOS. This is possible with the new Macs because they run on Intel-type hardware. I tried it. I didn’t like it. Why? Because even on a Mac, Windows takes too damn long to start up.
  2. Parallels – this was what the Apple store guy recommended. The Parallels software allows you to run Windows in a window, on your Mac, while still taking advantage of the Intel hardware. This was much better. I could fire up Windows in about 10 seconds, without leaving my Mac work environment. Unfortunately, USB audio support was weak, and I couldn’t record in Camtasia under Windows. Fortunately, you can hack Parallels to use a Boot Camp partition and boot into Windows when needed. Problem semi-solved… until…
  3. VMWare Fusion – released shortly after I got my Mac, this software solved all my problems with Windows in one shot. Their converter software cloned my Windows laptop and turned it into a virtual machine I could run on my Mac, in a window. USB support has been flawless – I can record audio and video. I can even watch videos, DVDs, TV, etc. in Windows in a window on my Mac.

If a Mac can do all that, can it make me mobile too?

I travel a lot, and do a lot of off site stuff locally. Because of that, my Windows laptop had become my primary computer. Unfortunately, this meant half an hour of switching cables, packing USB drives, and all that crap every time I wanted to take the computer out with me… and another half hour setting it all back up so that I could use a full size monitor and everything at my desk.

Over the years, I have tried many different “sync” solutions, to let me keep a laptop and desktop in sync with Windows.

With the Mac, I was able to buy a desktop model (iMac), get a .Mac account, and sync everything up – email (including filtering rules!!!), calendar, contacts, and the holy grail – files that I am working on. That’s right, with .Mac, I have an iDisk that lives somewhere up in the sky, and I can store all my files on it for $99 a year.

Now, when I want to take off and go somewhere, I just grab my laptop bag and walk out the door. That’s it. No copying files, no lugging USB drives. When I get where I’m going, I connect to the Internet, and everything syncs up without me touching a single button.

iPhone Completes the Package:

I had been using a Windows Mobile “smart phone” for a while, but since I was already in the Apple store to buy the “Leopard” upgrade for MacOS, I picked up an iPhone. Now, my email, calendar, and contacts are flawlessly synced up with my phone, and unlike my Windows phone, it all actually works.

Now, this isn’t all necessary for everyone. Some of us never leave our office. Some of us don’t want our phone to remind us of appointments, have the right phone number and email address for our contacts, etc. But for those who do need it, Mac + iPhone makes it all so easy. Everything just works.

Upgrading to Mac OS X Leopard & “Spaces”

There are a lot of cool features in Leopard. At the moment, I am so focused on one feature, that I can’t even remember what else they added. I’m talking about “Spaces,” which is a set of 4 virtual desktops that you can set up in Leopard.

One of my Spaces has Mail on it – always open, never in the way of other work. Another has my web browser – always open, never in the way. A third has my old Windows laptop, running full-screen with VMWare Fusion – instantly available, never in the way. The fourth Space is my “workspace” for video projects, writing, coding, etc. – in other words, it’s my desktop.

My Apple Mighty Mouse (got the bluetooth version) has a pair of side buttons. I configured it so I just squeeze the side buttons to bring up a view of my Spaces, so it’s easy to switch between them. I like the Apple mouse. It looks like a one-button mouse, but I can left click, right click, middle-click, scroll, just like my old Windows mouse. Which, BTW, I could also use.

Going Mac, Step By Step

If you are using Windows now, and want to switch over to enlightened computing, here’s a step by step guide of how I would do it, if I had to start from scratch:

  1. Get the right Mac. If you never travel with your computer, get an iMac. The low end version works great at around $1200. If you do video editing and stuff you’ll feel the difference if you upgrade the memory. If you do need to travel, get a MacBook – whichever one feels right because even the cheap one runs circles around your Windows box. If you’re going to “clone” your Windows system, get a hard drive that will absorb your Windows hard drives with room to spare. If you decide to get two Macs (laptop & desktop) get a firewire cable so you can just transfer the applications and settings from one to the other – this was a huge time saver for me.
  2. Keep your Windows stuff by cloning your Windows PC. I highly recommend VMWare Fusion. I used the instructions provided by VMWare to clone my physical machine. Follow the instructions carefully – when you first start up your virtual machine, Windows may try to get you to activate your Windows license, but don’t do it yet. Install VMWare tools first, then activate Windows. Windows can only be activated so many times before you have to buy another license, which I had to do via Windows Genuine Advantage because I didn’t follow the instructions, and had already activated my license a couple times with Parallels. VMWare will give you a complete Windows PC, inside of your Mac. Google Browser Sync will move your Firefox bookmarks and stuff across for you.
  3. If you are a mobile power user like me, get a .Mac account. Like I said already – .Mac allows you to sync up all your stuff. If you add a mail filtering rule while you’re using your laptop in some lonely hotel room, that rule will be in effect on your desktop when you get home. If you use the iDisk, your files will all be up to date. iDisk is really cool – it keeps a local copy and syncs up the changes. Way cool.
  4. If you want your phone to be part of the solution, get an iPhone. And be glad you waited until they dropped the price. :D It’s powerfully strange to use a phone to browse the web, and have it actually look like the web. It’s powerfully weird to read and answer emails on a phone, without it looking like SMS messaging. But it’s really nice. Everything syncs up. Everything just works. Added bonus: you can sell your iPod because the iPhone does that too. With iTunes I can grab a podcast or video and watch/listen on my phone, on a plane, in a taxi, wherever.

Reasons To Stick With Windows

There are a few reasons why you may want to stick with Windows:

  1. Your boss paid for the computer and won’t get you a Mac.
  2. You pay for the computer, and can’t afford to get a Mac.
  3. You like waiting 10 minutes for your computer to boot, because it gives you time for a cup of coffee.
  4. All of the great things a Mac will give you aren’t important because all you do is surf the web reading blogs.
  5. Steve Jobs is an arrogant bastard. (and Bill Gates isn’t)

That’s all I have for today. Back to SEO in a couple days… in the meantime, Aaron Wall has posted some nice new videos.

I Have PageRank Now! Is My Link Bait Working?

Like many of my readers, I’ve been following the hubbub about the “Google PageRank Penalties” for several days.

Unfortunately for me, all of my Google-provided toolbars seem to be malfunctioning in some strange way, because I haven’t been able to see the massive PageRank drops that these folks are claiming they’ve suffered. Search Engine Journal wobbled between 6 and 7 and is still showing PR7 today. Problogger still shows PR6. I went down the list of “victims” that’s been circulating, and nary a one ever showed up on my Google toolbar with a massively lower PageRank.

Can anyone out there confirm that they actually witnessed these “penalties” with a Google-provided toolbar, or did they only show up if you used some special tool to pull a score from some special data center? What am I missing here?

Just curious… I may be wrong, but this whole thing just smells like bait to me.

Mastering Both Kinds Of Link Building – Authority & Reputation

My wife and I still get a giggle out of that bit from the Blues Brothers movie:

Elwood: What kind of music do you usually have here?
Claire: Oh, we got both kinds. We got country *and* western.

What does this have to do with link building? Well, for starters, there are two kinds of link building that everyone needs to do. Not country and western link building, of course… I’m talking about authority (PageRank) and anchor text (reputation).

Why do you need to know the difference? I see people making the same mistakes over and over, wasting time and money on ineffective strategies. The economics of the text link ad "industry" at the moment seem to be driven by people who think that there’s only one kind of link building.

How does this help you? Well, if you’re promoting a real business and not some kind of made-for-adsense type of site, it means that you can better focus your link building efforts on the stuff that makes the needle move on your rankings and traffic.

The bottom line? To get good SEO results, you need two types of links… and how you go about getting those links can be very different.

Link Building For Authority & PageRank

PageRank and other link-based measurements of a web site or page’s "importance" serve two purposes for search engines.

First and foremost, they allow the search engine to reduce the number of pages that it needs to consider when trying to produce 1000 search results from potentially millions of pages that match the text of a searcher’s query. This helps the search engines solve the huge problems posed by the large scale of the web, especially when handling queries that return millions of potential results based on on-page factors alone.

Secondly, link-based measurements of authority, trust, or importance can be helpful in ranking web pages for presentation to a searcher. Depending on how sophisticated the search engine is, and how well their link approach can scale up to handle the web, they may be able to apply a certain amount of topical or semantic sensitivity – most likely very little.

The big concern for search engine optimizers is that first factor – selection of pages for ranking. If your pages aren’t important enough, they won’t even be considered, and all the fancy work you did on tweaking your title tags and "keyword density" won’t mean a thing. Therefore, a substantial portion of your link building effort needs to go toward creating more links, more authority, more PageRank… but this is actually the easiest part of the job.
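
Here is a toy sketch of that "selection first, ranking second" idea in Python. It is not how Google or any other engine actually works; the `Page` class, the two scores, the `select_then_rank` function, and the example URLs are all made up for illustration. The only point is that a page with great on-page work but too little authority never reaches the ranking step.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    authority: float  # hypothetical stand-in for PageRank or a similar link-based score
    relevance: float  # hypothetical stand-in for an on-page text match score


def select_then_rank(pages, shortlist_size):
    """Toy two-step model: shortlist by authority, then order by relevance."""
    # Step 1 (selection): keep only the most "important" matching pages.
    shortlist = sorted(pages, key=lambda p: p.authority, reverse=True)[:shortlist_size]
    # Step 2 (ranking): order the survivors by how well they match the query.
    return sorted(shortlist, key=lambda p: p.relevance, reverse=True)


# Made-up pages that all match the same query.
pages = [
    Page("a.example/perfect-title-tags", authority=0.1, relevance=0.95),
    Page("b.example/decent-page", authority=0.6, relevance=0.70),
    Page("c.example/well-linked", authority=0.9, relevance=0.55),
]

results = select_then_rank(pages, shortlist_size=2)
print([p.url for p in results])
```

In this made-up data set, the page with the best on-page score is dropped in step 1, so its relevance score never gets looked at.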

Why is it easier? It’s easier because any old links will do. It doesn’t matter what the text of the link is, as long as you get more links. That means that the entire spectrum of online (and to some extent offline) marketing strategies and tactics is at your disposal. That includes the obvious SEO-focused stuff, like directory submissions, and the not-so-obvious stuff, like actually doing marketing and promotions aimed at your target audience.

If, for example, you carry out a blogger relations campaign targeting those who speak to your potential customers, you may see real bottom line benefits. If a marketing campaign brings in some good traffic and a few links, you can evaluate that campaign based on its profitability, and consider SEO benefits as icing on the cake. As Mike Grehan said once upon a podium, presence builds presence – simply being present in more channels will lead to you getting more links, more authority, more PageRank.

Link building for authority is easier, because you can do just about anything you please. Link bait, public relations, press releases, viral marketing… It all works to get you more links. If you can find any ways to promote your business profitably, and also get links, you’ve got the "authority thing" more or less solved, because at the moment, relatively few of your SEO competitors are bothering with real marketing.

To build authority, you just need more links. Should you focus some attention on trying to get links from real authority sites in your market or niche? Of course. Should you buy a Yahoo directory listing? Of course. But you can get most of the job done just by promoting your business… and at the end of this article, I’ll explain why this pays off more than you might think.

Link Building For Anchor Text & Link Reputation

The hard part of the SEO’s job is getting the right anchor text pointing to the right pages… and it’s made even harder when you try to combine this type of link building with the first kind. People trying to do two different jobs with the same tool are the reason why the price of a "PR7 text link" is so high.

Once you liberate yourself from this kind of thinking, "getting anchor text" gets a lot easier.

Once you stop worrying about "PageRank" when you’re building anchor text & reputation, you’re only worried about getting links with the right text pointing to the right pages. As long as the search engines are indexing the page and finding the link, any anchor text link is a good one.

In terms of getting a page ranked higher for a specific search term, give me 10 "PR4" pages with the right text over one "PR7" page any day… and "give" is a whole lot more likely when your only standard for the linking page is that the search engines are going to index it.

I get text links given to me for free all the time, simply by providing unique content to webmasters – call it article marketing, content distribution, or whatever else you like.

When I say "free," I don’t mean that this activity has no cost. In fact, it does cost real money to create good unique content that will be attractive to content publishers… but there is no monthly rent for the link, and it’s easy to ensure that the content is going to get indexed. Slap links on a couple search keywords in the author’s bio on the article, and you’ve got yourself a nice long term link.

The main difference here is that our options for building "anchor text" links are considerably more limited. It’s no surprise to me, therefore, that so many people are hung up on using paid (or rented) links for building anchor text. It sure sounds tempting… but if you think you just need one more PR7 text link ad to get you over the hump, good luck, and enjoy your ever-increasing link rental bills.

Fortunately, in most markets, you probably don’t need nearly as much anchor text link building as you think.

When It Comes To Anchor Text, Don’t Forget Your Own Site

It’s not unusual to examine a search results page and see that the largest site is ranked #1. There’s a reason for that.

Naturally, for a large site to get all of its pages indexed, it needs to have more authority (PageRank, whatever) to make that happen. But the other advantage that a large site gains, as more pages are indexed, is that it has more opportunities to create its own anchor text and link reputation within its own pages.

I’ve seen countless examples where my students are able to take control of a huge share of search engine referrals in their own markets, simply by making better use of internal linking. In fact, some of our greatest success stories involve students who put almost no emphasis on anchor text in their link building campaigns, and focused instead on creating more unique content within their own sites.

It’s Up To You – Laugh, Cry, or Take Action

Some SEOs will applaud this article, others may disagree with me. If it doesn’t make sense to you, feel free to ignore it. This article, like the rest of the site, is primarily for the benefit of my "SEO Fast Start" readers. Any interest from the larger SEO community is definitely welcome, but that must be a secondary consideration if I am to serve my audience.

If you want to get out of the link rental rat race… if you’re interested in getting an education in creative link building that goes beyond "just buy ‘em," then I encourage you to register, download my free book, and watch the free link building clinic that all of my newsletter subscribers receive.

Thanks for reading, and we’ll talk more soon.

Rand's New SEO Quiz…

Since Rand has included more than a few things that are a matter of opinion, and at least one thing where he is just plain wrong, take it for what it’s worth… it’s fun anyway.

Watch out for #3 and #17, and try to think like Rand. Read #72 carefully, because monitor dyslexia had me all confused on that one.

They had some kind of coding problem on #65 because I have the right answer (even according to Rand!) yet it’s scored as wrong… so if you get 100% you didn’t really… and if you don’t get 100% (I scored 86%) you can just fix the code for your image and make yourself feel better, like this:

SEO Overlord – 100%

Take Rand’s Quiz

I, For One, Welcome Our New SEO Overlords

Matt Cutts & Google have sure stirred up a lot of mayhem by insisting that webmasters label paid links with "rel=nofollow." Their stated purpose is to create a "machine readable disclosure" that the links represent advertising.

Cutts has also added to the controversy by referring to past U.S. Federal Trade Commission (FTC) rulings on ad disclosures as justification for Google’s "nofollow plan." Apparently, there are other countries in the world that aren’t yet subject to U.S. laws and regulatory agencies.

The issue is clouded, as debates always are, by semantic quibbling and disputes over definitions. The most courageous (or stupid) thing to do in any divisive debate is take the middle ground, but I have nothing to lose either way.

In this article, I’ll try to bring some clarity to the issue by framing the discussion of what a paid link is, explaining why Google’s not going to win this one on nofollow, and wrapping it up with some observations on what we can expect from the FTC if they do weigh in.

What Defines A Paid Link Anyway?

From the FTC’s perspective, defining a "paid link" isn’t going to be as important as defining "advertising." When you look at it that way, all that really matters is that some financial consideration is given for the link. It "helps" if the link is sold as advertising, and in each of these cases it is:

  • Pay-Per-Click (Adsense, YPN, etc.): This is clearly advertising, not even debatable.

  • Pay-Per-Action (affiliate links): Clearly advertising, not even debatable.

  • Advertorial (Paid reviews, "buzz" marketing): Clearly advertising, not even debatable.

  • Paid Placement On Page (text link ads): Clearly advertising, not even debatable.

  • Paid Editorial Review (Yahoo Directory): Clearly advertising, not even debatable.

In case you doubt that the Yahoo directory is advertising, riddle me this: why else would you pay them? To perform a site review? The last time I checked, I could get a much better site review from Kim Krause for only $1 more, and when she’s done she actually lets me see a written review.

If the site review were the product, then Yahoo would give you something – like a copy of the review. No folks… all paid directories are advertising. Yahoo has been selling their directory as an advertising opportunity. End of discussion.

Friends trading links, SEOs buying each other drinks, linking to your employer’s site from your blog, Chamber of Commerce membership and the like are just not going to get the FTC excited. End of discussion.

Stop trying to muddy the waters, everyone – we don’t need specious arguments about the definition of a paid link. The inconsistency in Google’s position is clear enough if you just accept the definitions above.

Because, if you noticed, Google seems to be perfectly OK with high-quality directories like Yahoo from a "paid link" perspective, but clearly these links are advertising. 

By relying on past FTC statements (on advertising disclosure) Google further weakens their case. If advertising must be disclosed as such (this is why the FTC would weigh in), then Google’s nofollow plan won’t work, because nofollow does not (and cannot) explicitly mean "this is an ad."
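
To see what a machine actually gets out of "rel=nofollow," here is a minimal Python sketch using only the standard library’s `html.parser`. The `LinkAudit` class, the `audit_links` function, and the sample markup (including the example.com-style URLs) are my own illustration, not anything Google or the FTC has published. The third sample link assumes a typical unpaid blog comment link that carries nofollow, since nofollow was originally introduced for exactly that kind of link.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collects (href, has_nofollow) for every <a href=...> in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel can hold several space-separated tokens, e.g. rel="external nofollow".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def audit_links(html):
    parser = LinkAudit()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page: one paid link, one editorial link, one unpaid comment link.
sample = """
<p>Sponsored Links</p>
<a href="http://advertiser.example/" rel="nofollow">Blue widgets</a>
<a href="http://friend.example/">A site I like</a>
<a href="http://commenter.example/" rel="external nofollow">Some commenter</a>
"""

for href, has_nofollow in audit_links(sample):
    print(href, has_nofollow)
```

The paid link and the unpaid comment link come out looking identical to the machine: both are just "nofollow." Nothing in the attribute says which one is the ad. The only real disclosure in that sample is the plain text "Sponsored Links," which is aimed at people.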

What Disclosure Will Mean To The FTC

For text link ads (TLAs), plain text that says "Sponsored Links" above the links would probably be sufficient*.

That shade of gray apparently meets Google’s standards for disclosure, because it’s what they use to disclose the paid ads on their SERPs. Of course, if the paid ads are in a "top box" at Google, the disclosure is way over on the right side, well outside of the searcher’s foveal vision, but let’s not digress into how "evil" all of the search engines are in trying to "barely disclose" the ads that they sell.

I have no doubt that the FTC would frown on using "Sponsored Links" in an image (without equivalent alt text) because the disclosure would need to work with accessibility devices like screen readers. That’s about as far as Google’s going to get with the FTC on "machine readable disclosure."

After all, the FTC isn’t going to give a rat’s tail about the effect of paid links on Google’s organic results. Or at least, Google had better hope so, because if the FTC decided that organic listings are a form of advertising, that would put all of the search engines onto a very slippery slope.

I think we all understand why it’s important for Google to identify and filter paid links. I think we all understand that they have every right to filter the links from a site that doesn’t disclose in some form. But the nofollow plan is just plain bad.

If Google wants another bite at this apple, they better try to get it soon, and come up with a better plan… because one thing is for certain: there is no stopping them; the FTC will soon be here.

And I, for one, welcome our new hyperlink regulation overlords. I’d like to remind them that as a trusted blogger, I can be helpful in rounding up others who may have strayed from the true path… whatever that actually is.

* Disclaimer: I am not an attorney, this is not legal advice, it’s not even SEO advice, and I need a vacation… We’ll talk more soon.