Panda

In February 2011, Google released their Panda algorithm, affecting nearly 12% of all search queries, making it one of the biggest changes ever made to their algorithm. Originally called the Farmer update, Panda was designed to go after content farms. Despite not being the target of Panda, many merchants have found themselves on the wrong side of a Panda update as well. Whether you've been negatively affected already, or whether you are looking to stay on the right side of Panda in the future, here is your guide to Panda-proofing your Magento store.

What is Panda?

Panda is a change to Google's algorithm that targeted "content farms", or sites that had thousands or hundreds of thousands of "thin" or "low quality" pages, usually blanketed in ads. To get an idea of what kinds of sites were targeted, check out this list of the top 20 websites most negatively impacted by Panda when it first rolled out. Google has provided this list of 23 questions to help webmasters understand what type of sites are targeted.

How Do I Know If My Site Was Affected?

A site that gets hit by Panda typically loses 30% to 50% of its organic traffic across all or most of its keywords, particularly in the long tail. When it was first released, there were specific "Panda" days where the data set the algorithm uses was updated, which made it fairly simple to determine if a site was hit by Panda, or had recovered. Now, however, Panda has been folded into the real-time algorithm, so there are no longer discrete update dates to compare a traffic drop against.

Why It Affects E-commerce Sites

So what does Panda have to do with your Magento store, if it was targeting ad-heavy content farms? Unfortunately, many e-commerce stores have a lot in common with the thin content sites that Panda targets: thousands of "thin" product pages and lots of duplicate content created by their e-commerce platform. Because of this, many e-commerce sites were also negatively impacted by Panda. Even if your site hasn't been hit yet, you need to be proactive in protecting it.

Panda Proofing Your Store

Luckily, much of the risk inherent in a default Magento installation can be removed with a few fairly simple changes. Remember, the goal of all of these changes is to reduce overall page count and eliminate duplicate content. Here's what you need to do:

Enable Canonical Tags on Product and Category Pages

In Magento, products and categories are available from many different URLs. For example, by default a product is available at each of the following:

yoursite.com/product.html
yoursite.com/category/product.html
yoursite.com/catalog/product/view/id/1/
yoursite.com/catalog/product/view/id/1/category/1/

Each of these is the same page, but to Google they appear as four different pages. With layered navigation, you get hundreds of different URL paths for the same category page, for example:

yoursite.com/category/cat1.html
yoursite.com/category/cat1.html?&filter1=123
yoursite.com/category/cat1.html?&filter1=123&filter2=456

The same thing happens with pagination:

yoursite.com/category/cat1.html?&p=2
yoursite.com/category/cat1.html?&p=3

When combined, this creates hundreds if not thousands of pages for Google to index, all of which have very similar or duplicate content. Google has provided a way to tell them when a given URL is the same as another URL: the "canonical" tag. Magento has an option to enable canonical tags. These tags will tell Google that every layered navigation and paginated page is the same as the first page of the category with no filters, and that all of the different product URLs are really the same page. To enable canonical tags, in Admin navigate to Configuration -> Catalog -> Search Engine Optimizations. There, change Use Canonical Link Meta Tag For Categories and Use Canonical Link Meta Tag For Products both to "Yes". Also, while there, change Use Categories Path for Product URLs to "No" - this will prevent Magento from creating even more duplicate product pages. Don't forget to click Save Config.
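To see how many of the URLs above collapse once a canonical target is declared, here is a minimal Python sketch. It is only an illustration of the idea, not Magento's implementation: the `canonical_url` helper is hypothetical, and it assumes the canonical target is simply the URL with its query string removed, which covers the layered navigation and pagination variants but not the alternate product paths.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Drop the query string and fragment so that filtered and
    paginated variants collapse to the bare category URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# The category variants listed above, with a scheme added so they parse.
variants = [
    "http://yoursite.com/category/cat1.html",
    "http://yoursite.com/category/cat1.html?&filter1=123",
    "http://yoursite.com/category/cat1.html?&filter1=123&filter2=456",
    "http://yoursite.com/category/cat1.html?&p=2",
    "http://yoursite.com/category/cat1.html?&p=3",
]

# Five crawlable URLs, but only one distinct canonical target.
unique_targets = {canonical_url(u) for u in variants}
print(len(variants), "URLs ->", len(unique_targets), "canonical page")
```

Running the same grouping over a crawl export of your own store gives a rough sense of how much of the indexed page count is duplication.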




Block Search and Tag Pages With Robots.txt

To further reduce your site's page count, you will want to block Google from indexing certain pages. The first set of pages to block are the search and tag pages. There are two ways to block Google from indexing a page: robots.txt or the meta robots tag. There are advantages and disadvantages to each. In the case of search results and tag pages, you probably want to use robots.txt. Robots.txt prevents Google from crawling the pages at all. By blocking these pages via robots.txt, you reduce the number of pages Googlebot needs to crawl. Place the following in a plain text file named "robots.txt" in the root of your Magento install to block these two sets of pages:

User-agent: *
Disallow: /catalogsearch/
Disallow: /tag/

Note you only need to block tag pages if you have them enabled in Admin. Also, depending on your site's implementation of tag pages, you may not want to block them. While in there, you may as well block these other non-essential Magento directories by adding them under the same User-agent line:

Disallow: /cgi-bin/
Disallow: /app/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /js/
Disallow: /lib/
Disallow: /pkginfo/
Disallow: /shell/
Disallow: /skin/
Disallow: /var/
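
Before uploading the file, it can help to confirm the rules block what you expect and nothing more. The sketch below uses Python's standard `urllib.robotparser` against a shortened copy of the rules above. The sample paths and the `is_crawlable` helper are made up for illustration, and the standard library parser only approximates how Googlebot interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

# A shortened copy of the rules from this section.
ROBOTS_TXT = """\
User-agent: *
Disallow: /catalogsearch/
Disallow: /tag/
Disallow: /app/
Disallow: /var/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(path: str) -> bool:
    """Return True if the rules let a crawler fetch this path."""
    return parser.can_fetch("Googlebot", "http://yoursite.com" + path)


# Hypothetical paths: a product page should stay crawlable,
# while search results and tag pages should be blocked.
for path in ["/category/product.html",
             "/catalogsearch/result/?q=dumbbell",
             "/tag/product/list/tagId/1/"]:
    print(path, "->", "crawlable" if is_crawlable(path) else "blocked")
```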

Block Non-Essential Pages From Being Indexed

You should also block pages that do not need to be indexed, such as the customer account, wishlist, compare, checkout and cart pages. Unlike with the search and tag pages, however, it is better to use meta robots to block these. The reason for this is that we still want Google to crawl the links on those pages to find additional pages in the site, and we want to maintain link equity flowing through the site according to the internal navigation. To block these pages, insert the following tag into the head of each page's template:

<meta name="robots" content="noindex,follow">

This tells Googlebot not to index the page, but to continue crawling the links on the page.
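
One way to confirm a template change took effect is to read the meta robots directives back out of the rendered page. The following is a rough sketch using Python's standard `html.parser`. The `robots_directives` helper and the sample markup are assumptions for illustration; in practice you would feed it the HTML of your own cart or account page.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma separated and case insensitive.
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical cart page markup.
sample = '<html><head><meta name="robots" content="NOINDEX,FOLLOW"></head><body></body></html>'
found = robots_directives(sample)
print("noindex" in found and "follow" in found)
```

A page that should stay out of the index but still pass link equity should report both `noindex` and `follow`.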

Block Any Development Environments

If you have any development environments that are publicly accessible, chances are Googlebot is going to find them. If you do not block these sites from being indexed, that creates a duplicate copy of the entire site. You can block any dev environment by going to General -> Design -> HTML Head and changing Default Robots to NOINDEX, FOLLOW. Remember to change this setting back to INDEX, FOLLOW before migrating to a live production environment, otherwise Google may deindex your entire site!

Set A Default Root URL For The Site And Implement A 301 Redirect

Magento by default will serve the homepage from several different URLs. This is not optimal, as Google will see multiple versions of the homepage that are identical (duplicate content) and will also split the link equity flowing to your homepage between the versions based on how it is linked, diluting the equity that flows to the rest of the site. To check if this is an issue for your store, visit the following and see if the URL in your browser's address bar changes:

yoursite.com
www.yoursite.com
www.yoursite.com/index.php/

A properly configured store will 301 redirect each of those to one version only. This tells Google which version you prefer to use as your homepage. Go to General -> Web -> URL Options and set Auto-redirect to Base URL to "Yes (301 Moved Permanently)".
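
The browser check above can also be scripted. The sketch below is one possible approach, not a definitive tool: `head` issues a plain HTTP HEAD request without following redirects, and `redirect_verdict` classifies the answer. Both function names and the verdict labels are hypothetical, and the example assumes the preferred homepage is `http://www.yoursite.com/`.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit


def head(host: str, path: str = "/") -> Tuple[int, Optional[str]]:
    """Send a HEAD request and return (status, Location header).
    http.client does not follow redirects, which is what we want."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def redirect_verdict(status: int, location: Optional[str], preferred: str) -> str:
    """Classify a response for a non-preferred homepage URL."""
    def normalize(url: str):
        parts = urlsplit(url)
        return (parts.scheme, parts.netloc.lower(), parts.path or "/")

    if status == 200:
        return "duplicate"            # served content instead of redirecting
    if status in (302, 303, 307):
        return "temporary redirect"   # redirects, but not permanently
    if status == 301 and location and normalize(location) == normalize(preferred):
        return "ok"
    return "unexpected"


# Offline example of the three outcomes you are most likely to see.
preferred = "http://www.yoursite.com/"
print(redirect_verdict(301, "http://www.yoursite.com/", preferred))
print(redirect_verdict(302, "http://www.yoursite.com/", preferred))
print(redirect_verdict(200, None, preferred))
```

Against a live store you would call something like `redirect_verdict(*head("yoursite.com"), preferred)` for each non-preferred variant. Anything other than "ok" means that variant is still competing with your chosen homepage URL.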




You also want to make sure that whatever version you choose (www vs non-www) is set as your Base URL in Admin, under General -> Web -> Unsecure and General -> Web -> Secure.




Other Things To Consider

The above changes will do a lot to prevent the default behavior of Magento from creating duplicate content and indexing non-content pages. However, that often is not sufficient to avoid Panda. You should also consider the following:

Consolidate Products As Much As Possible

When designing your catalog structure, try to consolidate product variations into as few products as possible. For example, if you sold gym equipment, instead of listing each dumbbell weight as a separate simple product, you might list one dumbbell product and use custom options to have the user select the weight they want, or use configurable products to combine them all into one page.

Create Unique Product Page Content

When looking at your product pages, consider how much of the content is the same on every product page versus how much is unique for that page. Don't forget to include the navigation, the header and footer, and any other common elements. You want to thicken out these pages as much as possible. You can create unique product copy (never use manufacturer's descriptions, as that will create duplicate content issues between your site and other sites that use the same descriptions), add guides and how-to information, or include video content with a transcription. Another great way to increase unique product page content is by collecting customer reviews (besides helping with conversion, these will help with getting long tail searches and avoiding Panda). Have a program in place to actively solicit reviews once your customer has received their product. Also, make sure that if you're using any third-party review plug-ins like Power Reviews, the reviews are indexable on the product page - do not use review plug-ins that load in an iframe or similar.
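
For a rough sense of how much of a product page is unique versus shared, you can compare the visible text of two pages. The sketch below uses three-word "shingles" as a crude heuristic. It is an assumption-laden illustration, not how Google measures thin or duplicate content, and the helper names and sample text are made up.

```python
def shingles(text: str, size: int = 3) -> set:
    """Split text into overlapping word sequences of the given size."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def unique_share(page_text: str, other_text: str) -> float:
    """Fraction of this page's shingles that do not appear on the other page."""
    page_shingles = shingles(page_text)
    if not page_shingles:
        return 0.0
    return len(page_shingles - shingles(other_text)) / len(page_shingles)


# Hypothetical page text: a shared banner followed by a short description.
dumbbell = "free shipping on all orders rubber hex dumbbell with a knurled chrome handle"
kettlebell = "free shipping on all orders cast iron kettlebell with a wide flat base"

print(round(unique_share(dumbbell, kettlebell), 2))
```

A low share across many product pairs suggests the pages are mostly template and would benefit from more unique copy, reviews, or guides.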

Engagement

As Google incorporates more engagement metrics into their algorithm, it's becoming increasingly important to pay attention to these metrics on a page and site-wide basis. Think of ways to make your site "sticky" and get users to engage beyond making a purchase: encourage sharing via Twitter or Facebook, and include reviews, instruction manuals, and other content that is interesting to your customer base. Periodically review your pages based on bounce rate and time on page; consider trimming products that are not viewed and content that is not read. While the above will not completely protect you, and as with all things involving Google's algorithm it's highly situation specific, it will get your site a long way toward staying on the right side of Panda.

Have you seen a site-wide traffic drop? Would you like us to see if your store is at risk of being hit by Panda, Penguin, or other Google penalties? Get in touch with us and ask about our e-commerce SEO audits.