Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Duplicate Content for index.html
-
In the Crawl Diagnostics Summary, it says that I have two pages with duplicate content which are:
I read in a Dreamweaver tutorial that you should name your home page "index.html" and then you can let www.mywebsite.com automatically direct the user to index.html. Is this a bug in SEOmoz's crawler or is it a real problem with my site?
Thank you,
Dan
-
The code should definitely go into the .htaccess file in the website's root directory. However, .htaccess can be weird. A few days ago I ran into a similar issue with a client's website, and I was able to remedy it with a variation of the code.
# index Redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.(php|html|htm|asp)\ HTTP/
RewriteRule ^(([^/]+/)*)index\.(php|html|htm|asp)$ http://yoursite.com/$1 [R=301,L]
If you give me the URL for the site I will take a look at it and let you know what would be feasible.
-
Hi Daniel, can you share with us the URL of your site? We can take a look at it and give you a more precise answer that way. Thanks!
-
I eventually figured out that your method was a 301 redirect, and I definitely broke my site trying to use the code you posted... haha. It's OK though. I just removed the code and it went back to normal. At first, I was editing the .htaccess file in the public_html folder, which wasn't working. Then I tried the root folder for the site (I created the .htaccess file since it did not exist). Neither of those worked. (I am using Bluehost, so I do not think that I have root access, and I am not sure if it is a Linux server or not.)
If there is an easy way to explain what I am doing wrong, please do so. Otherwise, I will use canonical.
Thanks for everything!
-
@Dan
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutts's blog where he discusses his preference for 301 redirects over the rel canonical tag.
Where would you say your solution fits in?
Sorry about the delay in responding; I didn't realize right away that you were asking me a question. The code I provided in my previous answer causes a 301 permanent redirect to the original URL. That is what the [R=301,L] portion of the code states: R means redirect, 301 is the HTTP status code that is returned, and L marks the rule as the last one to process when it matches. After reviewing the Matt Cutts video, I realize that I should have asked you if you were operating on a Linux server that you had root access to. We actually use both redirects and canonical tags, since that was recommended by the on-page optimization reports. Heck, Google uses them, I would assume because it's easier for the user to be referred to a single page URL. Obviously, though, if you don't have server header access and are not familiar with .htaccess (you can accidentally break your site), then the canonical solution is appropriate.
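Since a mistake in .htaccess can take the site down, here is a rough sketch of how the flags could be tried out more cautiously. It assumes Apache with mod_rewrite enabled, a home page named index.html, and the placeholder domain yoursite.com; none of that is confirmed for your host.

```apache
# Sketch only: yoursite.com is a placeholder domain, index.html an assumed file name.
RewriteEngine On
# Act only when the visitor literally requested /index.html, so the server's
# internal lookup of the index file for "/" is not redirected a second time.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
# R=302 sends a temporary redirect, which browsers do not cache the way they cache a 301.
# L stops processing further rewrite rules for this request once this rule matches.
RewriteRule ^index\.html$ http://yoursite.com/ [R=302,L]
```

Once the redirect behaves as expected, changing R=302 to R=301 makes it permanent.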
-
Josh,
Thanks for your reply. It seems like there are lots of different ways to solve this problem. I just watched this video on Matt Cutts's blog where he discusses his preference for 301 redirects over the rel canonical tag.
Where would you say your solution fits in?
Thanks,
Dan
-
I use the link rel="canonical" tag on all my homepages, pointing to http://www.yoursite.com
-
Oddly enough, I just recently answered this question. The SEOmoz crawler is correct, because without a redirect you will be able to access both versions of the page in your browser.
To resolve this issue, simply rewrite index.html to the root URL by placing the following code into the .htaccess file in your root directory.
Options +FollowSymLinks
RewriteEngine on
# Index Rewrite
RewriteRule ^index\.(htm|html|php)$ http://www.yoursite.com/ [R=301,L]
RewriteRule ^(.*)/index\.(htm|html|php)$ http://www.yoursite.com/$1/ [R=301,L]
You can also do the same with the index file in any subdirectories that you might create, by simply placing a .htaccess file into those subdirectories and using variations of the above code. This is how you create nice, tight URLs without the duplicate content issue, such as http://www.semclix.com/design/business/
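As an illustration of such a variation, a subdirectory file might look roughly like the sketch below. The path /design/business/ and the domain www.yoursite.com are placeholders, and the THE_REQUEST condition is an added safeguard rather than part of the code above; it assumes Apache with mod_rewrite enabled.

```apache
# Hypothetical file: /design/business/.htaccess (placeholder path and domain)
RewriteEngine On
# Redirect only when the visitor literally asked for the index file, so the
# server's internal index lookup for /design/business/ is not redirected again.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /design/business/index\.(htm|html|php)\ HTTP/
# In a per-directory .htaccess the pattern is matched against the path
# relative to this directory, so only the file name appears here.
RewriteRule ^index\.(htm|html|php)$ http://www.yoursite.com/design/business/ [R=301,L]
```

Keep in mind that, unless rule inheritance is enabled, the rewrite rules in a subdirectory's .htaccess replace the ones from the parent directory for requests inside that subdirectory.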
-
It is a problem which you need to fix. You need to canonicalize your pages.
Those are all various URLs which most likely lead to the same web page. I say "most likely" because these URLs can actually lead to different pages.
You need to tell crawlers and search engines how you organize your site. There are several ways to achieve canonicalization. The method I prefer is to add a canonical link element to the head of each page, for example:
<link rel="canonical" href="http://www.mywebsite.com/" />
The URL provided should be the preferred URL for your page.