Wednesday, February 27, 2008

Google’s Tag To Remove Content Spamming

Content spamming, in its simplest form, is the taking of content from other sites that rank well on the search engines, and then either using it as-is or using utility software like Articlebot to scramble the content to the point that it can’t be detected with plagiarism software. In either case, your good, search-engine-friendly content is stolen and used, often as part of a doorway page, to draw the attention of the search engines away from you.

Everyone has seen examples of this: the page that looks promising but contains nothing except lists of terms (like term, term paper, term papers, term limits) that link to other similar lists, each carrying Google advertising. Or the site that contains nothing but content licensed from Wikipedia. Or the site that plays well in a search but contains nothing more than SEO gibberish, often ripped off from the site of an expert and minced into word slaw.

These sites are created en masse to provide fertile ground for drawing eyeballs. It seems a waste of time when even the best-paying ads earn a penny a view, but when you put up five hundred sites at a time, and you’ve figured out how to get all of them to show up on the first page or two of a lucrative Google search term, it can be surprisingly profitable.

The losers are the people who click on these pages, thinking there is content of worth on these sites, and you, whose place in the top ten is stolen by these spammers. Google is working hard to lock them out, but there is more that you can do to help Google.

Using The Antispam Tag

But there is another loser. One of the strengths of the Internet is that it allows for two-way public communication on a scale never seen before. You post a blog, or set up a wiki; your audience comments on your blog, or adds and changes your wiki.

The problem? While you normally have complete control over a website and its contents, sites that allow for user communication hand part of that control to your readers. There is no way to prevent readers of an open blog from posting unwanted links, except by manually removing them. Even then, links can be hidden in commas or periods, making it nearly impossible to catch everything.
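To see how easily this slips past a moderator, consider a hypothetical spam comment in which the closing period itself is the link (example.com stands in for a real spam domain):

    <p>Great post, thanks for sharing<a href="http://example.com/spam-page">.</a></p>

The anchor text is a single period, nearly invisible on the rendered page, yet a search engine spider follows it like any other link.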

This leaves you open to the accusation of link spam for links you never put out there to begin with. And while you may police your most recent several posts, no one polices the ones from several years ago. Yet Google still looks at them and indexes them. By 2002, bloggers everywhere were begging Google for an ignore tag of some sort to prevent its spiders from indexing comment areas.

Not only, they said, would bloggers be grateful; everyone with two-way, uncontrolled communication (wikis, forums, guest books) needed this service from Google. Each of these types of sites has been inundated with spam at some point, forcing some to shut down completely. And Google itself needed it to help prevent the rampant spam in the industry.

In 2005, Google finally responded to these concerns. Though their solution is not everything the online community wanted (for instance, it leads to potentially good content being ignored as well as spam), it does at least allow you to section out the parts of your blog that are public. It is the “nofollow” attribute.

“Nofollow” allows you to mark the links in a portion of your web page, whether you’re running a blog or you want to section out paid advertising, as links that Google spiders should ignore. The great thing about it is that not only does it keep your rankings from suffering from spam, it also discourages spammers from wasting your valuable comments section with their junk text.

The most basic use of this attribute involves embedding it into a hyperlink. This allows you to manually flag links, such as those embedded in paid advertising, as links Google spiders should ignore. But what if the content is user-generated? That is still a problem, because you certainly don’t have time to go through and mark up all those links.
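Flagging an individual link is straightforward; the attribute goes right in the anchor tag. A minimal sketch, with a placeholder advertiser URL:

    <a href="http://example.com/advertiser" rel="nofollow">Sponsored: Example Advertiser</a>

Google’s spiders still see the link, but they do not follow it or count it toward the target site’s link popularity.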

Fortunately, blogging systems have been sensitive to this development. Whether you use WordPress or another blogging system, most have either implemented automated “nofollow” links in their comment sections or issued plugins you can install yourself to prevent this sort of spamming.

This does not solve every problem. But it’s a great start. Be certain you know how your user-generated content system provides this service to you. In most cases, a software update will implement this change for you.

Is This Spamming And Will Google Block Me?

There’s another problem with the spamming crowd. When you’re fighting search engine spam, seeing the different forms it can take, and, disturbingly, realizing that some of the techniques you use on your legitimate site are similar, you have to wonder: will Google block me for my search engine optimization techniques?

This happened recently to BMW’s corporate site. Their webmaster, dissatisfied with the company’s position when web users searched for several terms (such as “new car”), created and posted a gateway page: a page optimized with text that then redirects searchers to an often graphics-heavy page.

Google found it and, rightly or wrongly, promptly dropped their PageRank manually to zero. For weeks, searches for their site turned up plenty of spam and dozens of news stories, but to find their actual site, it was necessary to drop to the bottom of the search results, not easy to do in Googleworld.

This is why you really need to understand what Google counts as search engine spam, and adhere to their restrictions even if everyone else doesn’t. Never create a gateway page, particularly one with spammish data. Instead, use legitimate techniques like image alternate text and actual text in your page. Look for ways to get other pages to point to your site: article submission, for instance, or directory submission. And keep your content fresh, always.
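As a sketch of that legitimate alternative, here is how a graphics-heavy page can still carry real, indexable text; the file name and wording are invented for illustration:

    <img src="showroom.jpg" alt="New cars on display in our downtown showroom">
    <h1>New Cars</h1>
    <p>Browse our current lineup of new cars, with pricing and photos for each model.</p>

The alternate text and visible copy give the spiders something honest to index, with no redirect trickery involved.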

While duplicated text is often a sign of serious spammage, the Google engineers realize two things: first, the original text is probably still out there somewhere, and it’s unfair to drop that person’s rankings along with those who stole it from them; and second, certain types of duplicated text, like articles or blog entries, are to be expected.

Their answer to the first issue is to credit the site first catalogued with a particular text as the creator, and to drop sites obviously spammed from that one down in rank. The other issue is addressed by looking at the data surrounding the questionable text; if the entire site appears to be spammed, it, too, is dropped. Provided you are not duplicating text on many websites to fraudulently increase your ranking, you’re safe. Ask yourself: are you using the same content on several sites registered to you in order to maximize your chances of being read? If the answer is yes, this is a bad idea and will be classified as spamdexing. If your content would not be useful to the average Internet surfer, it is also likely to be classed as spamdexing.

There is a very thin line between search engine optimization and spam indexing. You should become very familiar with it. Start with understanding hidden/invisible text, keyword stuffing, metatag stuffing, gateway pages, and scraper sites.

Wednesday, February 20, 2008

Black-Hat SEO, Grey-Hat SEO, White-Hat SEO Tactics

Black-Hat SEO Tactics:

Keyword Stuffing
This is probably one of the most commonly abused forms of search engine spam. Essentially, this is when a webmaster or SEO places a large number of instances of the targeted keyword phrase on a page in hopes that the search engine will read this as relevant. To offset the fact that this text generally reads horribly, it is often placed at the bottom of a page in a very small font size. An additional tactic often associated with this practice is hidden text, which is covered below.
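Purely so you can recognize (and avoid) the pattern, here is a hypothetical example of what such a stuffed block typically looks like; the keywords are invented:

    <!-- Stuffed keyword block pushed below the real content, in a tiny font -->
    <p style="font-size: 2px;">term paper term papers buy term papers cheap term papers best term paper free term papers</p>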
Hidden Text
Hidden text is text set to the same color as the background, or very close to it. While the major search engines can easily detect text set to the same color as the background, some webmasters try to get around this by creating an image file the same color as the text and setting that image as the background. While undetectable at this time by the search engines, this is blatant spam, and websites using this tactic are usually quickly reported by competitors and blacklisted.
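The crudest variant, the one the engines catch automatically, is simply white-on-white text; a deliberately blunt sketch:

    <body style="background-color: #ffffff;">
      <!-- White text on a white background: invisible to visitors, blatant spam to reviewers -->
      <p style="color: #ffffff;">cheap widgets cheap widgets cheap widgets</p>
    </body>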
Cloaking
In short, cloaking is a method of presenting different information to the search engines than a human visitor would see. There are too many methods of cloaking to list here, and some of them are still undetectable by the search engines. That said, which methods still work, and for how long, is rarely set in stone; as with hidden text, when one of your competitors figures out what is being done (and don’t think they aren’t watching you if you hold one of the top search engine positions), they can and will report your site, and it will get banned.
Doorway Pages
Doorway pages are pages added to a website solely to target a specific keyword phrase or phrases, and they provide little in the way of value to a visitor. Generally the content on these pages provides no real information; the page is only there to promote a phrase in hopes that once visitors land there, they will go to the homepage and continue on from there. Often, to save time, these pages are generated by software and added to a site automatically. This is a very dangerous practice. Not only are many of the methods of injecting doorway pages banned by the search engines, but one quick report of this practice to a search engine and your website will simply disappear, along with all the legitimate ranks you have attained with your genuine content pages.
Redirects
Redirecting, when used as a black-hat tactic, is most commonly brought in as a complement to doorway pages. Because doorway pages generally have little or no substantial content, redirects are sometimes applied to automatically move a visitor to a page with actual content, such as the homepage of the site. As quickly as the search engines find ways of detecting such redirects, the spammers uncover ways around detection. That said, the search engines figure them out eventually, and your site will be penalized. That, or you’ll be reported by a competitor or a disgruntled searcher.
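One of the oldest forms, and among the easiest for the engines to detect, is an instant meta refresh placed in the head of a doorway page (the destination URL here is a placeholder):

    <!-- A zero-second meta refresh: the visitor never actually sees the doorway page -->
    <meta http-equiv="refresh" content="0; url=http://example.com/index.html">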
Duplicate Sites
A throwback tactic that rarely works these days. When affiliate programs became popular, many webmasters would simply create a copy of the site they were promoting, tweak it a bit, and put it online in hopes that it would outrank the site it was promoting and capture its sales. As the search engines would ideally like to see unique content across all of their results, this tactic was quickly banned, and the search engines now have methods for detecting and removing duplicate sites from their index. If the site is changed just enough to avoid automatic detection, with hidden text or the like, you can once again be reported to the search engines and banned that way.

Interlinking
As incoming links became more important for search engine positioning, the practice of building multiple websites and linking them together to build the overall link popularity of them all became common. This tactic is more difficult to detect than others when done “correctly” (we cannot give the method for “correct” interlinking here, as it is still undetectable at the time of this writing and we don’t want to provide a means to spam the engines). It is also difficult to detect from a user standpoint, unless you end up with multiple sites in the top positions on the search engines, in which case it is likely that you will be reported.
Grey-Hat SEO Tactics
The following tactics fall in the grey area between legitimate tactics and search engine spam. They include tactics such as cloaking, paid links, duplicate content, and a number of others. Unless you are on the correct side of this equation, these tactics are not recommended. Remember: even if the search engines cannot detect these tactics when they are used as spam, your competitors will undoubtedly be on the lookout and will report your site to the engines in order to eliminate you from the competition.
It is definitely worth noting that, while it may be tempting to enlist grey-hat and black-hat SEO tactics in order to rank well, doing so stands a very good chance of getting your website penalized. There are legitimate methods for ranking a website well on the search engines. It is highly recommended that webmasters and SEOs put in the extra time and effort to rank a website properly, ensuring that the site will not be penalized down the road or even banned from the search engines entirely.
Grey-Hat SEO Tactics:
Cloaking
There are times when cloaking is considered a legitimate tactic by users and search engines alike. Basically, if there is a logical reason why you should be allowed to present different information to the search engines than to the visitor (if you have content behind a “members only” area, for example), you are relatively safe. Even so, this tactic is very risky, and it is recommended that you contact each search engine, present your reasoning, and allow them the opportunity to approve its use.
Arguably, another example of a site legitimately using cloaking is one that is mainly image-based, such as an art site. In this event, provided that the text used to represent a page accurately describes the page and the image(s) on it, this could be considered a legitimate use of cloaking. As cloaking has often been abused, other methods, such as adding visible text to the page, are recommended where possible. If there are no other alternatives, it is recommended that you contact the search engine prior to adopting this tactic and explain your argument.
There is more information on cloaking on our black-hat SEO tactics page.
Paid Links
The practice of purchasing links on websites solely for the increase in link popularity they can bring has grown steadily over the last year or so, with link auction sites such as LinkAdage making the practice easier. (You can read more about LinkAdage on our SEO resources page.)
When links are purchased as pure advertising, the practice is considered legitimate; when links are purchased only for the increase in link popularity, it is considered an abuse, and efforts will be made to either discount the links or penalize the site (usually the seller’s, though not always).
As a general rule, if you are purchasing links, you should do so for the traffic they will yield and consider any increase in link popularity an “added bonus”.
Duplicate Content
Due primarily to the increase in popularity of affiliate programs, duplicate content on the web has become an increasingly significant problem for search engines and search engine users alike, with the same or similar sites dominating the top positions in the search engine results pages.
To address this problem, many search engines have added filters that seek out pages with the same or very similar content and eliminate the duplicates. Even when duplicate content is not detected by the search engines, it is often reported by competitors and the site’s rankings penalized.
There are times when duplicate content is considered legitimate by both search engines and visitors, and that is on resource sites. A site that consists primarily of an index of articles on a specific subject matter will not be penalized for posting articles that occur elsewhere on the net, though the weight given to them as additional content will likely not be as high as for a page of unique content.
White-Hat SEO Tactics:
Internal Linking

By far one of the easiest ways to stop your website from ranking well on the search engines is to make it difficult for search engines to find their way through it. Many sites use some form of script to enable fancy drop-down navigation and the like. Many of these scripts cannot be crawled by the search engines, resulting in unindexed pages.
While many of these effects add visual appeal to a website, if you are using scripts or some other form of navigation that hinders the spidering of your website, it is important to add text links to the bottom of at least your homepage, linking to all your main internal pages and including a sitemap of your internal pages.
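A sketch of such a footer, with placeholder page names:

    <!-- Plain text links, crawlable even when the main navigation is script-based -->
    <p>
      <a href="products.html">Products</a> |
      <a href="services.html">Services</a> |
      <a href="about.html">About Us</a> |
      <a href="sitemap.html">Site Map</a>
    </p>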
Reciprocal Linking
Exchanging links with other webmasters is a good way (not the best, but good) of attaining additional incoming links to your site. While the value of reciprocal links has declined a bit over the past year, they certainly still have their place.
A VERY important note: if you do plan on building reciprocal links, make sure that you do so intelligently. Random reciprocal link building, in which you exchange links with any and all sites that you can, will not help you over the long run. Link only to sites that are related to yours, whose content your visitors will be interested in, and preferably which contain the keywords that you want to target. Building relevancy through association is never a bad thing, unless you’re linking to bad neighborhoods (penalized industries and/or websites).
If you are planning to undertake, or currently undertake, reciprocal link building, you know how time-consuming the process can be. A useful tool that can speed it up is PRProwler. Essentially, this tool allows you to find related sites with high PageRank, weeding out many of the sites that would simply be a waste of time to even visit. You can read more about PRProwler on our search engine positioning tools page.
Content Creation
Don’t confuse “content creation” with doorway pages and the like. When we recommend content creation, we mean creating quality, unique content that will be of interest to your visitors and will add value to your site.
The more content-rich your site is, the more valuable it will appear to the search engines, to your human visitors, and to other webmasters, who will be far more likely to link to your website if they find you to be a solid resource on their subject.
Creating good content can be very time-consuming; however, it will be well worth the effort in the long run. As an additional bonus, these new pages can be used to target additional keywords related to the topic of the page.
Writing For Others
You know more about your business than those around you, so why not let everyone know? Whether it be in the form of articles, forum posts, or a spotlight piece on someone else’s website, creating content that other people will want to read and post on their sites is one of the best ways to build links to your website that don’t require a reciprocal link back.

Site Optimization

The manipulation of your content, wording, and site structure for the purpose of attaining high search engine positioning is the backbone of SEO and the search engine positioning industry. Everything from creating solid title and meta tags to tweaking the content to maximize its search engine effectiveness is key to any successful optimization effort.
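As a simple sketch of the former, here is a head section with a descriptive title and meta tags; the business name and wording are invented for illustration:

    <head>
      <title>Handmade Oak Furniture | Smith Woodworks</title>
      <meta name="description" content="Custom handmade oak tables, chairs, and cabinets, built to order.">
      <meta name="keywords" content="handmade oak furniture, custom oak tables, oak cabinets">
    </head>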
That said, it is of primary importance that the optimization of a website not detract from the message and quality of the content within it. There’s no point in driving traffic to a site that is so poorly worded it cannot possibly convey the desired message, and thus cannot sell. Site optimization must always maintain the salability and solid message of the site while maximizing its exposure on the search engines.