
David Weiss

Duplicate Content and Your TypePad Domain

David Weiss April 12, 2008

There are many webmasters and bloggers who are vigilant about reducing or eliminating duplicate content on their blogs and web sites. Everyone seems concerned about assigning a post to multiple categories, their non-www domain name being seen as a duplicate of the www version, and so on.

In fact, the impetus behind this post comes from a comment on my previous post about fixing someone's "mistake" of using their TypePad subdomain instead of their own domain for their blog.

People are so vigilant that they might think I'm trying to be controversial (or that I'm just stupid) when I say to just chill out over the subject.  Before my inbox gets flamed with mail, let me explain where I'm coming from.

Google is the king of search because they focus better on the authority of sites and relevancy of content with regard to specific search terms.  Some might disagree with this statement, but go do the same search on multiple search engines and compare results.  In my opinion, Google wins the relevancy contest consistently.

Take this point and apply it to a site that springs up overnight with hundreds of pages of content that is nearly identical to existing content on other sites.  That new site generally is not going to rank well.  As I've mentioned in yet another previous post about making money with blogs, I tried going down this path a long while back, and it just doesn't work.  Overtly spammy sites reliant on duplicated content generally don't rank well or at all in Google.

What Google tries to do is figure out the one best source for a given piece of content and do their best to make sure the site producing the original content gets its props for doing so in the form of increased "page rank" and SERP performance.  It is my belief that Google's attention to duplicate content centers on inter-domain duplication much more than intra-domain duplication.

When you create a post in your TypePad blog, you're going to get duplicate content pretty much no matter what you do.  The post will appear on the main index page as well as the permalink page and category page(s), not to mention the RSS and ATOM feeds.

If your site is on the up-and-up and you're not relying on any black hat SEO hacks, don't worry about it.  Google and the other search engines will figure it all out, and the single best source for your content will find its way into the SERPs.  Now that doesn't mean you should start associating posts with 5 different categories.  Pick one category if you can, or two if you feel that readers would derive benefit from the multiple category post.

It also doesn't hurt to link to your own content in your posts where appropriate, as I've done here.  This will help the search engines find a path back to your original content if and when your posts are syndicated to other sites.

Now, let's address the inter-domain issues of duplicate content.

First, I believe non-www domain vs. www subdomain content falls in line with the previous points - Google and the other search engines will figure out the best source.  If all your internal links point to the www subdomain, you can be pretty sure that those pages will be determined to be the original, best source of content.  If you use Google Webmaster Tools, you can also explicitly tell Google which one is preferred.  But by setting up your domain correctly to begin with, you can eliminate any doubt about possible duplicate content with a 301 redirect from the non-www to the www subdomain.  Unfortunately, if you register your domain with Yahoo or Network Solutions, this 301 redirect capability is non-existent.  GoDaddy, on the other hand, does provide this functionality.
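For anyone curious what that non-www to www redirect amounts to, here is a minimal sketch in Python. It is not how GoDaddy or any other registrar actually implements it; the function name `www_redirect` and the example.com URLs are made up for illustration, and it assumes the address being checked is a bare registered domain (it would also happily prefix a subdomain such as a TypePad one, which you would not want).

```python
from urllib.parse import urlsplit, urlunsplit


def www_redirect(url):
    """Return (301, target_url) if the host lacks a leading 'www.', else None.

    Hypothetical helper: it only illustrates the decision a registrar or
    web server makes when forwarding the bare domain to the www version.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host or host.startswith("www."):
        return None  # already on the preferred host, nothing to do

    # Rebuild the host with the www prefix, keeping any explicit port.
    netloc = "www." + host
    if parts.port:
        netloc += f":{parts.port}"

    # Keep the path and query string so every old URL maps to its twin.
    target = urlunsplit(
        (parts.scheme, netloc, parts.path or "/", parts.query, parts.fragment)
    )
    return 301, target


print(www_redirect("http://example.com/2008/04/some-post.html"))
print(www_redirect("http://www.example.com/"))
```

The point of the 301 status (as opposed to a temporary redirect) is that it tells the search engines the move is permanent, so the www URL is the one that should be indexed.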

For those who have used their TypePad subdomain and want to switch to their own domain name, I think I outlined the best solution for making the transition as quickly and painlessly as possible without generating inter-domain duplicate content.  Note: I did not mention it in that previous post, but your TypePad domain's URLs can also be removed from the Yahoo index using Yahoo!'s Site Explorer.  Honestly, I haven't looked into MSN (or Live, or whatever they're calling it now).  I simply focus on Google because they are by far the dominant search engine today.

The one piece of functionality that is missing from the exercise of moving from your TypePad domain to your own domain is the ability to create 301 redirects from one to the other.  If TypePad could offer this one simple feature, it would make everyone's experience with their system so much better.


Comments

Pearl says:

Hi John. I want to thank you for your valuable contribution to the net community here for people like me who have just started "serious" blogging. In fact, I wrote to you before asking about a drop-down horizontal menu, and your tips saved my day. Other than the technical info here, which saves me a lot of research time, I learned from you that it's important to give back to the community. I mean, you set up typepadhacks and share so generously about your experience of blogging and the truth about internet marketing. I myself spent a good fortune to learn from the so-called gurus here in my home country. However, I didn't gain much, and the knowledge I have now is mostly based on my own reading. Your site is a sincere one which is not flooded with ads and banners.

Thank you so much! May God bless you!

Pearl
Singapore

David Weiss says:

I have to correct myself in this post. You CAN redirect your non-www domain to the www via Yahoo!

They just don't make it obvious for you, nor do they point out whether the redirect is a 301 redirect. I'll do some screen shots later this week and post them. I still prefer GoDaddy.com, as they present your options for this in a way that makes more sense to me.

best seo guide says:

Hi Dave, I just came across your blog and it's just what I have been looking for. I'm working on a site's blog that uses TypePad as its CMS. The format of the URL is: http://blog.site.com. This blog is presenting several issues related to duplicate content:

-Multiple post tagging (several categories)
-For some reason, when I run the site:blog.site.com command in Google, I get results with both http and https, so every URL on the site can be displayed either the http or the https way. I don't know where this can be solved.

I have not worked much with TypePad, only with the WordPress CMS. If you can shed some light on the subject I would really appreciate it, thanks in advance.

Gus.

David Weiss says:

Huh.

Both http and https? That seems strange, indeed.

I'll have to think about this a bit and come up with potential reasons why this might be happening. An advanced template could be altered to produce https URLs, but I don't know why one would do that, and that seems like an unlikely basis for what you're seeing.

I don't see how any of the search engines would index https URLs if they did not, at one time or another, appear on the site. Even if there were inbound https links, I don't see how that would happen if all the internal links on the site were http.

If anyone else has any ideas on this, I'd like to see them.

I wouldn't be overly concerned about multiple categories, unless they went wild with it. One, sometimes two categories for a post shouldn't be a big deal. Consistently three, four, five categories, well..

I've mentioned before that the greatest piece of missing functionality I'd like to see in TypePad (from an SEO perspective) is the ability to 301 redirect. Certainly, there are ways this can be implemented without opening up server side vulnerabilities and malformed .htaccess entries.

For instance, a form could be created that only allows redirection from permalink to permalink and from category folder to category folder, or from permalink to the root or category folder to the root.

All the redirect information could be stored in the database (and validated to enforce these rules). Then, the .htaccess file could be published just like any template, style sheet, etc. Seems to me that would be pretty much bulletproof.
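To make the idea concrete, here is a rough sketch in Python of the validation and publishing steps. None of this is real TypePad code. The URL shapes (permalinks like /2008/04/some-post.html, category folders like /seo/), the function names, and the example.com base URL are all assumptions for illustration. The only real-world piece is the `Redirect 301` directive, which is standard Apache mod_alias syntax.

```python
import re

# Assumed URL shapes, for illustration only.
PERMALINK = re.compile(r"/\d{4}/\d{2}/[a-z0-9_-]+\.html")
CATEGORY = re.compile(r"/[a-z0-9_-]+/")
ROOT = "/"

# The only source/target pairings the form would accept.
ALLOWED = {
    ("permalink", "permalink"),
    ("category", "category"),
    ("permalink", "root"),
    ("category", "root"),
}


def kind(path):
    """Classify a path as 'root', 'permalink', 'category', or None."""
    if path == ROOT:
        return "root"
    # fullmatch rejects anything with extra characters (spaces, newlines),
    # which is what keeps malformed entries out of the published file.
    if PERMALINK.fullmatch(path):
        return "permalink"
    if CATEGORY.fullmatch(path):
        return "category"
    return None


def is_allowed(source, target):
    """True if redirecting source to target follows the rules above."""
    return source != target and (kind(source), kind(target)) in ALLOWED


def render_htaccess(rules, base_url):
    """Turn validated (source, target) pairs into mod_alias directives."""
    lines = []
    for source, target in rules:
        if not is_allowed(source, target):
            raise ValueError(f"redirect not permitted: {source} -> {target}")
        # Note: mod_alias Redirect matches by path prefix.
        lines.append(f"Redirect 301 {source} {base_url}{target}")
    return "\n".join(lines) + "\n"


rules = [
    ("/2008/04/old-post.html", "/2008/04/new-post.html"),
    ("/old-category/", "/"),
]
print(render_htaccess(rules, "http://www.example.com"))
```

Because every path has to match one of a few strict patterns before it is written out, a user could never type arbitrary text into the .htaccess file, which is the whole concern with exposing redirects in a hosted system.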
