Creating Unique Content For Google

Posted by Al Scillitani on June 10, 2008 – 8:02 am

Some of what I am about to write may not be new to some of you; others will be blown away. I have been researching what Google defines as duplicate content for a few years now, and I want to share what I have found.

On December 18, 2006, Adam Lasnik (you may remember him; he is the Google employee who sent me the pain killers) wrote an article about Duplicate Content.

Adam wrote: “Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar.” This all makes sense to me. Google does not want the top ten results to show identical content. That wouldn’t be good for the original writer of the content or for Google searchers. What I am looking for is the definition of “appreciably similar.” Google will not say what percentage of the content must be different for a page to be seen as different content.

In the latest article on the subject, Duplicate Content Due To Scrapers, posted June 9, 2008, Google team member Sven Naumann writes:

Generally, we can differentiate between two major scenarios for issues related to duplicate content:

• Within-your-domain-duplicate-content, i.e. identical content which (often unintentionally) appears in more than one place on your site
• Cross-domain-duplicate-content, i.e. identical content of your site which appears (again, often unintentionally) on different external sites

For “Within-your-domain-duplicate-content” Sven recommends blocking what you don’t want Google to claim as the main content and using the Sitemap to tell Google which page is the main one.

For “Cross-domain” content, Sven writes, “we look at various signals to determine which site is the original one, which usually works very well. This also means that you shouldn’t be very concerned about seeing negative effects on your site’s presence on Google if you notice someone scraping your content.”

One of the most important sentences in Sven’s entire post is, “I’d like to point out that in the majority of cases, having duplicate content does not have negative effects on your site’s presence in the Google index. It simply gets filtered out.”

So what does this mean to you? If you sell the same products that a major brand and 100 other sites sell, and your content is the same or similar, your page will most likely not rank, because Google will deem the brand’s website the “original one.” Your product page will be filtered out and not shown in the top results.

One thing I would like to add, which is not discussed in either article, is your title and description tags. Even though these are only a sentence or two, I have tested and noted that duplicate description tags will be considered duplicate content and will cause Google to “filter out” pages. You should have unique title and description tags anyway, but same or similar tags will have a dramatic effect on the indexing and ranking of your pages.
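If you want to audit your own pages for repeated tags, a short script can do the first pass. The following is a rough sketch in Python using only the standard library. The names `HeadTagParser` and `find_duplicate_tags` are made up for this example, and it assumes you already have each page's HTML in memory as a string keyed by URL. It only reports exact matches after lowercasing and collapsing whitespace; it says nothing about how Google itself compares tags.

```python
from collections import defaultdict
from html.parser import HTMLParser


class HeadTagParser(HTMLParser):
    """Collects the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def normalize(text):
    # Compare case-insensitively and ignore runs of whitespace.
    return " ".join(text.lower().split())


def find_duplicate_tags(pages):
    """pages maps URL -> HTML source.

    Returns {(kind, normalized_text): [urls]} for every title or
    description that appears on more than one page.
    """
    seen = defaultdict(list)
    for url, html in pages.items():
        parser = HeadTagParser()
        parser.feed(html)
        for kind, value in (("title", parser.title),
                            ("description", parser.description)):
            key = normalize(value)
            if key:  # pages with a missing tag are skipped, not flagged
                seen[(kind, key)].append(url)
    return {key: urls for key, urls in seen.items() if len(urls) > 1}
```

Pages with no description tag at all are skipped here rather than flagged; you may want to list those separately, since they are worth fixing too.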

So, what is the magic number that Google uses to deem a page similar? In my opinion, there isn’t one. Like the rest of Google’s algorithm, it is very complex and based on several factors (vertical, theme, how many words are on the page, site layout, density, sentence structure, proximity of words, etc.). You cannot say, for all websites, “if you change 20% of your copy, Google will treat it as unique.” Basically, you want the bot to see your page as unique: it may contain a lot of the same information as other sites, but it is worded differently.
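Even without a magic number, it helps to have some rough measure of how much your copy overlaps with another site's. One common textbook approach is to break each text into overlapping three-word sequences ("shingles") and compare the two sets. The sketch below does that in Python. To be clear, this is not Google's method, and the function names and the shingle size of 3 are arbitrary choices for illustration. Treat the score as a relative guide when comparing rewrites, not as a threshold.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Shared shingles divided by all shingles: 0.0 (no overlap) to 1.0 (same)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 0.0  # nothing to compare
    return len(a & b) / len(a | b)
```

For example, comparing "the quick brown fox jumps" with "the quick brown fox sleeps" gives 0.5: two of the four distinct three-word sequences are shared.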

To get similar cross-domain pages and articles indexed, you can either add totally new content or try my recommendations below:

1. 100% unique title and description tags. Don’t even look at the other sites’ tags.
2. Depending on the amount of copy, try not to have more than 2 sentences in a row exactly the same within the page. The first few sentences of the copy must be different. There are ways to change the wording but still say the same thing. The key is getting your point across without duplication.
3. Re-arrange the paragraph order. If there are 3 paragraphs, can the middle one be worded so that it can be moved to the first position? This, combined with re-arranging some of the sentences, will help the page appear unique to bots.
4. Remember, Google is looking at density, number of words, theme, etc., so adding a new, unique, SEO-optimized first paragraph will help. Change some of the words towards the end of the page using a thesaurus. For example, if you have a page selling a particular brand and model of shoe, towards the bottom of the page replace some of the “shoe” text with the term “footwear.”
5. If you have hundreds or thousands of pages like this, there are ways to automate the process. It is easier when adding new pages. The title tags can be created based on the unique new title you give to the product (again, rearrange the words). This product title will be placed on your page as well. When content is added, there are ways to rearrange how it is laid out and written; developers can help with this. It still needs to be readable, but if you are concerned about duplicate content, the tedious task of making these tweaks is a must.
6. When I am all done, I would like to see a new SEO-optimized first paragraph, with the remaining copy adhering to steps 1 through 5 above. Yes, it takes time, but it is better than adding all these product pages just for them to be “filtered out.”
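Step 2 above is easy to check by script once the copy has been rewritten. Here is a minimal sketch in Python, assuming you have your page copy and the other site's copy as plain text. The function names are hypothetical, the sentence splitter is deliberately naive (it breaks on ., ! or ? followed by whitespace), and the limit of 2 comes straight from the rule of thumb in step 2, not from anything Google has published.

```python
import re


def split_sentences(text):
    """Naive sentence split; lowercases and collapses whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [" ".join(p.lower().split()) for p in parts if p.strip()]


def longest_shared_run(my_copy, their_copy):
    """Length of the longest run of consecutive sentences found in both texts."""
    mine, theirs = split_sentences(my_copy), split_sentences(their_copy)
    best = 0
    # Longest-common-substring table, with sentences as the units.
    prev = [0] * (len(theirs) + 1)
    for a in mine:
        curr = [0] * (len(theirs) + 1)
        for j, b in enumerate(theirs, start=1):
            if a == b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def check_copy(my_copy, their_copy, max_run=2):
    """Report on step 2: identical-sentence runs and a different opening."""
    mine, theirs = split_sentences(my_copy), split_sentences(their_copy)
    run = longest_shared_run(my_copy, their_copy)
    return {
        "longest_identical_run": run,
        "run_ok": run <= max_run,
        "opening_differs": mine[:1] != theirs[:1],
    }
```

Run it over a handful of product pages first. Any page where `run_ok` is false or `opening_differs` is false is a candidate for another pass of rewording.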

Feel free to test. Test a category or a few product pages. Make the changes above and see if it makes a difference. There are no guarantees when it comes to Google’s algorithm, and there are hundreds of other factors involved in natural rankings, but making these changes should help Googlebot consider your pages unique.

  1. 22 Responses to “Creating Unique Content For Google”

  2. Great stuff Al - I’m actually working on a content problem at massive scale so this was GREAT TIMING!

    G

    By Garrett on Jun 10, 2008

  3. Good post. Dealing with this issue for a few clients.

    By Dan on Jun 10, 2008

  4. You’ve blown me away, Al :-)

    Another thing that I’ve noticed is that the absence of meta-tags and/or very little copy on the page may cause the search engines to look at content like navigation and headers that may be identical on each page and view them as duplicate. This could be especially damaging to sites that rely heavily on images to promote a conversion, sale, etc. In these cases, unique meta-tags are even more imperative.

    By Esoterick on Jun 10, 2008

  5. Esoterick,

    Great point! You are correct. If you do not have a description tag, Google will find content somewhere on your page and use that as the description for that page. If Google chooses text that is used on every page, then it will look like duplicate tags.

    By Al Scillitani on Jun 10, 2008

  6. great post. quick question: how does ‘duplicate content’ work for sites that aggregate content from a variety of sources from all over. i assume a story from the wash post will rank higher than the same story that can be found, in total, on a news aggregator.

    By derby on Jul 11, 2008

  7. Derby,

    You wrote, “i assume a story from the wash post will rank higher than the same story that can be found, in total, on a news aggregator.”

    That is correct. Google will take into consideration the origin of the original post and the authority of the site it is posted on. If you do not change that content, it is unlikely it will show up in the results.

    By Al Scillitani on Jul 11, 2008

  8. An in-site content problem, I think. Please, what if jewelry categories are pendants, rings, etc., but customers also want to see the items presented on pages by animal breed? It’s the same jewelry collection. Item #7 is on the pendants page, then also on the arabian horse page. Won’t Google see this as duplicate content because the product items are the same?
    The question is how to do this, since it is of value to be searched by both jewelry pendants and arabian horse jewelry. I can’t figure it out, and I think there are penalties in place.
    Appreciate any input or suggestions.

    By Pat on Jul 13, 2008

  9. Pat, it depends on the tags and content of each of those pages. If everything is the same except the image, then Google will most likely see it as dup content. You shouldn’t receive a penalty for this. Google will choose which page they feel is the most relevant and show that one based on the search terms used. If someone searches for horse pendants, then Google should be smart enough to show that page as long as you have the title, meta tags, image names, alt tags, etc. stating it is the horse pendant.
    If there are penalties in place, it may be due to something else.
    Wait! Did you say “Arabian Horse Jewelry” and animal jewelry? Where have I been? Never have seen this before. Nice niche!

    By Al Scillitani on Jul 13, 2008

  10. hmm, nice try at finding out an actual figure to measure duplicate content, but Google always stated it “wanted to reward good behaviour rather than penalise bad”.

    For this reason Google will never release info on their duplicate content guidelines, they don’t want us to design content to a minimum specification, they want copy to be as good and as useful as possible. Rather than trying to find the minimum amount, you should look to generate fantastic content. While this is a bit idealistic, that’s Google for you. At least it means it encourages better content to be created on the web. Imagine if there was no quality control, spam would reign and no-one would use the Internet because all they could find is shoddy content. For this reason, I do not envy Google for the money they have made, at least they try to reward great content!

    I’m going to have to get in touch with a pro copywriter!

    By SEO-PRO on Aug 1, 2008

  11. SEO-PRO

    I don’t think of it as trying to cheat the system. When you deal with large ecommerce sites, this is a real problem.

    If you sell golf clubs, you can have a name brand driver. That driver can have hundreds of different combinations: same driver with different lofts, lefty, righty, men’s, women’s, steel shaft, graphite shaft, etc. If the page content and descriptions were the same except, let’s say, only the loft degree is different, Google could see this as duplicate content. This is where creative copywriting comes in. You need to write UNIQUE copy for the user, describe the product accurately, and get the individual product ranked in the engines.

    By Al Scillitani on Aug 2, 2008

  12. Nice tips, this is very help…

    By Bali on Sep 3, 2008

  13. What if my content is 95-100% unique but the title is the exact same as all the other sites? My website is music related and a song name can only have one name. Same goes for the album and Artist.

    By Andrew W. on Sep 24, 2008

  14. Andrew,

    If you are referring to the title and description tags, I have tested this before and they have to be unique.
    If you are talking about the title in the body of the page, with 95-100% unique content, Google may see them as separate pages even though the titles are the same. However, if it were my site, I would change those titles. Why chance it?

    By Al Scillitani on Sep 25, 2008

  15. Al,

    Thanks for the reply. My site does Song/Artist/Album Reviews. I don’t see how I would be able to change the title even though my actual content may be unique.
    Example:
    Artist: 50 Cent
    Title: In Da Club
    Album: Get Rich or Die Tryin’

    I don’t think there is any way of getting around that? If so, feel free to let me know.

    By Andrew W. on Sep 25, 2008

  16. Andrew,

    With all that unique content, you may be ok. I would still try some tests on a few of them to see what happens. Try something like:

    Artist: 50 Cent - Curtis James Jackson III
    Title: InDaClub
    Album: Top Selling Album - Get Rich or Die Tryin

    By Al Scillitani on Sep 25, 2008

  17. Now that’s an idea. Thanks, let’s see what happens.

    By Andrew W. on Sep 26, 2008

  18. http://www.copyscape.com is a pretty good tool to check to see if anybody else has the same content as you.

    By Andrew W. on Sep 26, 2008

  19. This definitely does work. Proof: I switched to a new site but kept the old site up. I never considered that the NEW site was a duplicate. We literally copy-pasted everything over and had poor listings. After reading this two months ago, we changed the content on every page and bang! We’re up and listed.

    By Paul J on Nov 30, 2008

  20. Paul J.

    Glad I could help!

    By Al Scillitani on Dec 1, 2008

  21. Good info. But there are a lot of factors Google uses in its algorithms that we do not know. Only they and their staff know. And from time to time they change the indexing procedures. However, I found many sites with exactly identical content. I was shocked! How did no one notice them?

    By Dr. Altaf on Feb 16, 2009

  22. Quite an informative article. Having studied the highest ranking sites for our keywords, we are in the process of changing all our website builder content from pre-published free articles to 100% uniquely created articles. It will be interesting to see what impact this has on our business websites’ Google position, since Google stresses that unique website content is preferred.

    By website-builders on Apr 16, 2009

