Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Duplicate Content for index.html
-
In the Crawl Diagnostics Summary, it says that I have two pages with duplicate content which are:
I read in a Dream Weaver tutorial that you should name your home page "index.html" and then you can let www.mywebsite.com automatically direct the user to index.html. Is this a bug in SEOMoz's crawler or is it a real problem with my site?
Thank you,
Dan
-
The code should definitely go into the websites root directory's .htaccess, however .htaccess can be weird, a few days ago I ran into a similar issue with a client's website, and I was able to remedy the issue with a variation of the code.
index Redirect RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)index.(php|html|htm|asp)\ HTTP/ RewriteRule ^(([^/]+/))index.(php|html|htm|asp)$ http://yoursite.com/$1 [R=301,L]
If you give me the URL for the site I will take a look at it and let you know what would be feasible.
-
Hi Daniel, can you share with us the URL of your site? We can take a look at it and give you a more precise answer that way. Thanks!
-
I eventually figured out that your method was a 301 redirect and I definitely broke my site trying to use the code you posted. .. haha. Its ok though. I just removed the code and it went back to normal. At first, I was editing the .htaccess file in the public_html folder which wasnt working. Then I tried the root folder for the site (I created the .htaccess file since it did not exist.) Neither of those worked. (I am using Bluehost so I do not think that I have root access and I am not sure if it is a Linux server or not.)
If there is an easy way to explain what I am doing wrong, please do so. Otherwise, I will use canonical.
Thanks for everything!
-
@Dan
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutt's blog where he discusses his preference for 301 redirects over rel canonical tag.
Where would you say your solution fits in?
sorry about the delay of this response, i didn't realize the that you were asking me a question right away. When placing the code I provided in my previous answer this will cause a 301 perminant redirect to the original URL. That's actually what the
[R=301,L]
portion of the code is stating (R) redirect (301) status is referring to. After reviewing the Matt Cutts video, I realize that I should have asked you if you were operating on a Linux server that you had root access to. We actually utilize both redirects and canonical tags since it was recommended by the on-page optimization reports. Heck Google uses them, I would assume because it's easier for the user to be referred to a single page URL. Obviously though if you don't have server header access, and are not familiar with .htaccess (you can accidentally break your site) then the canonical solution is appropriate
-
Josh,
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutt's blog where he discusses his preference for 301 redirects over rel canonical tag.
Where would you say your solution fits in?
Thanks,
Dan -
use the link rel tag for all my homepages for the http://www.yoursite.com
-
Odd enough I just recently answered this question. The SEOmoz crawler is correct, because without a redirect you will be able to access both versions of the page in your browser.
To resolve this issue simply rewrite the index.html to the root url by placing the following code into your .htaccess file into your root directory.
Options +FollowSymlinks RewriteEngine on
Index Rewrite RewriteRule ^index.(htm|html|php) http://www.yoursite.com/ [R=301,L] RewriteRule ^(.*)/index.(htm|html|php) http://www.yoursite.com/$1/ [R=301,L]
You can also do the same with the index file in any subdirectories that you might create, by simply placing a .htaccess into those sub directories and using variations of the above code. This is how you create nice tight URLs without the duplicate content issue that look like - http://www.semclix.com/design/business/
-
It is a problem which you need to fix. You need to canonicalize your pages.
Those are all various URLs which most likely lead to the same web page. I say "most likely" because these URLs can actually lead to different pages.
You need to tell crawlers and search engines how you organize your site. There are several ways to achieve canonicalization. The method I prefer is to add the following line of code to each page:
The URL provided should be the preferred URL for your page.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Migration from HTML to Wordpress - SEO Implications?
I am in the process of having a wordpress site developed to replace my current HTML site. (I currently have my website in html and a blog in wordpress in a sub directory). I am doing this in phases to try and preserve as much of my good rankings as possible. My first phase is to replicate my site with the exact same pages, meta data, and site structure. I'm hoping that google will see this as not much change and not change my rankings for the worse. I also made it a goal that my site speed tests be at least equal to what they are now. We will have to 301 all of the URLs however since it will be going from /example.html to /example. I believe my blog will also need to move into the root directory as well, so I need to 301 all of those pages. I plan to wait a couple months for Phase 2. Phase 2 involves replacing old content (photo galleries), and introducing new content (virtual tours, videos, new pages, etc.) One of my reasons for moving to wordpress is to keep up with current trends a little easier since I have very little time. (I am owner, website maintainer, SEO - all on my own). My question here is three parts. 1. Do you think this strategy will work to preserve my current rankings? 2. Do you have any lessons learned or advice to share with me to make this as smooth as possible? 3. Do I really need to wait to add new content? I might get antsy and want to do it sooner! 🙂 Thank you in advance!
Web Design | | CalicoKitty20001 -
How to prevent development website subdomain from being indexed?
Hello awesome MOZ Community! Our development team uses a sub-domain "dev.example.com" for our SEO clients' websites. This allows changes to be made to the dev site (U/X changes, forms testing, etc.) for client approval and testing. An embarrassing discovery was made. Naturally, when you run a "site:example.com" the "dev.example.com" is being indexed. We don't want our clients websites to get penalized or lose killer SERPs because of duplicate content. The solution that is being implemented is to edit the robots.txt file and block the dev site from being indexed by search engines. My questions is, does anyone in the MOZ Community disagree with this solution? Can you recommend another solution? Would you advise against using the sub-domain "dev." for live and ongoing development websites? Thanks!
Web Design | | SproutDigital0 -
Do I need to 301 redirect www.domain.com/index.html to www.domain.com/ ?
So, interestingly enough, the Moz crawler picked up my index.html file (homepage) and reported duplicate content, of course. But, Google hasn't seemed to index the www.domain.com/index.html version of my homepage, just the www.domain.com version. However, it looks like I do have links going specifically to www.domain.com/index.html and I want to make sure those are getting counted towards my overall domain strength. Is it necessary to 301 redirect in the scenario described above?
Web Design | | Small_Business_SEO0 -
Privacy Policy: index it/? And where to place it?
Hi Everyone, Two questions, first: should you allow google to index your privacy policy? Second: for a service based site (not e-commerce, not selling anything) should you put the policy in the footer so it's site wide or just on the "contact us" form page? Best, Ruben
Web Design | | KempRugeLawGroup0 -
Reasons Why Our Website Pages Randomly Loads Without Content
I know this is not a marketing question but this community is very dev savvy so I'm hoping someone can help me. At random times we're finding that our website pages load without the main body content. The header, footer and navigation loads just fine. If you refresh, it's fine but that's not a solution. Happens on Chrome, IE and Firefox, testing with multiple browser versions Happens across various page types - but seems to be only the main content section/container Happens while on the company network, as well as externally Happens after deleting cookies, temporary internet files and restarting computer We are using a CMS that is virtually unheard of - Bridgeline/Iapps Codebase is .net Our IT/Dev group keeps pushing back, blaming it on cookies or Chrome plugins because they apparently are unable to "recreate the problem". This has been going on for months and it's a terrible experience for the user to have. It's also not great when landing PPC visitors on pages that load with no content. If anyone has ideas as to why this may be happening I would really appreciate it. I'm not sure if links are allowed, by today the issue happened on this page serversdirect.com/dm/geek-biz Linking to an image example below knEUzqd
Web Design | | CliqStudios0 -
ECWID How to fix Duplicate page content and external link issue
I am working on a site that has a HUGE number of duplicate pages due to ECWID ecommerce platform. The site is built with Joomla! How can I rectify this situation? The pages also show up as "external " links on crawls... Is it the ECWID platform? I have never worked on a site that uses this. Here is an example of a page with the issue (there are 6280 issues) URL: http://www.metroboltmi.com/shop-spare-parts?Itemid=218&option=com_rokecwid&view=ecwid&ecwid_category_id=3560081
Web Design | | Atlanta-SMO0 -
Subdomains, duplicate content and microsites
I work for a website that generates a high amount of unique, quality content. This website though has had development issues with our web builder and they are going to separate the site into different subdomains upon launch. It's a scholarly site so the subdomains will be like history and science and stuff. Don't ask why aren't we aren't using subdirectories because trust me I wish we could. So we have to use subdomains and I'm wondering a couple questions. Will the duplication of coding, since all subdomains will have the same design and look, heavily penalize us and is there any way around that? Also if we generate a good amount of high quality content on each site could we link all those sites to our other site as a possible benefit for link building? And finally, would footer links, linking all the subdirectories, be a good thing to put in?
Web Design | | mdorville0 -
How will it affect my site if i link to a site with adult content?
We are currently working on creating 2 sites for a company, one with no adult content, one with adult content. Will it affect the non adult content site if i link to the other one in terms of Google and being blocked by some internet providers.
Web Design | | MattWheatcroft0