Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Magento: Should we disable old URL's or delete the page altogether
-
Our developer tells us that we have a lot of 404 pages that are being included in our sitemap and the reason for this is because we have put 301 redirects on the old pages to new pages.
We're using Magento and our current process is to simply disable, which then makes it a a 404. We then redirect this page using a 301 redirect to a new relevant page. The reason for redirecting these pages is because the old pages are still being indexed in Google.
I understand 404 pages will eventually drop out of Google's index, but was wondering if we were somehow preventing them dropping out of the index by redirecting the URL's, causing the 404 pages to be added to the sitemap.
My questions are:
1. Could we simply delete the entire unwanted page, so that it returns a 404 and drops out of Google's index altogether?
2. Because the 404 pages are in the sitemap, does this mean they will continue to be indexed by Google?
-
Brilliant thanks Dan.
This is what I'm going to do.
-
Hey Andy
To answer your questions:
1. So if you're 301'ing the page, it's not really a 404 page, it's a 301
So yes, you can remove the 301 redirect, making it a true 404 page (check that it returns a 404 code using fetch as google or a tool like urivalet.com).
2. If they are in the sitemap, this won't prevent Google from removing them from the index, but it will throw an error. And not that many people care about Bing, but Bing is apparently super picky about having XML sitemaps perfect.
So yes I would just 404 them without the redirects.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
After hack and remediation, thousands of URL's still appearing as 'Valid' in google search console. How to remedy?
I'm working on a site that was hacked in March 2019 and in the process, nearly 900,000 spam links were generated and indexed. After remediation of the hack in April 2019, the spammy URLs began dropping out of the index until last week, when Search Console showed around 8,000 as "Indexed, not submitted in sitemap" but listed as "Valid" in the coverage report and many of them are still hack-related URLs that are listed as being indexed in March 2019, despite the fact that clicking on them leads to a 404. As of this Saturday, the number jumped up to 18,000, but I have no way of finding out using the search console reports why the jump happened or what are the new URLs that were added, the only sort mechanism is last crawled and they don't show up there. How long can I expect it to take for these remaining urls to also be removed from the index? Is there any way to expedite the process? I've submitted a 'new' sitemap several times, which (so far) has not helped. Is there any way to see inside the new GSC view why/how the number of valid URLs in the indexed doubled over one weekend?
Intermediate & Advanced SEO | | rickyporco0 -
Is there any effect on ranking if i disable right click on page??
Hello , I have site, in which client needs right click on All his pages, his traffic is very Good, But worried, if right click hurts its traffic, ?? any expert can help ?? Thx in Advance
Intermediate & Advanced SEO | | ieplnupur0 -
URL Injection Hack - What to do with spammy URLs that keep appearing in Google's index?
A website was hacked (URL injection) but the malicious code has been cleaned up and removed from all pages. However, whenever we run a site:domain.com in Google, we keep finding more spammy URLs from the hack. They all lead to a 404 error page since the hack was cleaned up in the code. We have been using the Google WMT Remove URLs tool to have these spammy URLs removed from Google's index but new URLs keep appearing every day. We looked at the cache dates on these URLs and they are vary in dates but none are recent and most are from a month ago when the initial hack occurred. My question is...should we continue to check the index every day and keep submitting these URLs to be removed manually? Or since they all lead to a 404 page will Google eventually remove these spammy URLs from the index automatically? Thanks in advance Moz community for your feedback.
Intermediate & Advanced SEO | | peteboyd0 -
How do I get rel='canonical' to eliminate the trailing slash on my home page??
I have been searching high and low. Please help if you can, and thank you if you spend the time reading this. I think this issue may be affecting most pages. SUMMARY: I want to eliminate the trailing slash that is appended to my website. SPECIFIC ISSUE: I want www.threewaystoharems.com to showing up to users and search engines without the trailing slash but try as I might it shows up like www.threewaystoharems.com/ which is the canonical link. WHY? and I'm concerned my back-links to the link without the trailing slash will not be recognized but most people are going to backlink me without a trailing slash. I don't want to loose linkjuice from the people and the search engines not being in consensus about what my page address is. THINGS I"VE TRIED: (1) I've gone in my wordpress settings under permalinks and tried to specify no trailing slash. I can do this here but not for the home page. (2) I've tried using the SEO by yoast to set the canonical page. This would work if I had a static front page, but my front page is of blog posts and so there is no advanced page settings to set the canonical tag. (3) I'd like to just find the source code of the home page, but because it is CSS, I don't know where to find the reference. I have gone into the css files of my wordpress theme looking in header and index and everywhere else looking for a specification of what the canonical page is. I am not able to find it. I'm thinking it is actually specified in the .htaccess file. (4) Went into cpanel file manager looking for files that contain Canonical. I only found a file called canonical.php . the only thing that seemed like it was worth changing was changing line 139 from $redirect_url = home_url('/'); to $redirect_url = home_url(''); nothing happened. I'm thinking it is actually specified in the .htaccess file. (5) I have gone through the .htaccess file and put thes 4 lines at the top (didn't redirect or create the proper canonical link) and then at the bottom of the file (also didn't redirect or create the proper canonical link) : RewriteEngine on
Intermediate & Advanced SEO | | Dillman
RewriteCond %{HTTP_HOST} ^([a-z.]+)?threewaystoharems.com$ [NC]
RewriteCond %{HTTP_HOST} !^www. [NC]
RewriteRule .? http://www.%1threewaystoharems.com%{REQUEST_URI} [R=301,L] Please help friends.0 -
Delete or not delete old/unanswered forum threads?
Hello everyone, here is another question for you: I have several forum postings on my websites that are pretty old and so they are sort of "dead discussion" threads. Some of those old discussion threads are still getting good views (but not new postings), and so I presume may be valuable for some users. But most of them are just answers to personal questions that I doubt someone else could be interested in. Besides that, many postings are just single, unanswered questions still waiting for an answer, forgotten, they are just sitting there, and will probably stay unanswered for years.... I don't think this may be good for SEO, am I right? How do you suggest to approach this kind of issues on forums or discussions sections on a website? I am eager to know your thoughts on all this. Thank you in advance! All the best, Fab.
Intermediate & Advanced SEO | | fablau0 -
Do I need to use canonicals if I will be using 301's?
I just took a job about three months and one of the first things I wanted to do was restructure the site. The current structure is solution based but I am moving it toward a product focus. The problem I'm having is the CMS I'm using isn't the greatest (and yes I've brought this up to my CMS provider). It creates multiple URL's for the same page. For example, these two urls are the same page: (note: these aren't the actual urls, I just made them up for demonstration purposes) http://www.website.com/home/meet-us/team-leaders/boss-man/
Intermediate & Advanced SEO | | Omnipress
http://www.website.com/home/meet-us/team-leaders/boss-man/bossman.cmsx (I know this is terrible, and once our contract is up we'll be looking at a different provider) So clearly I need to set up canonical tags for the last two pages that look like this: http://www.omnipress.com/boss-man" /> With the new site restructure, do I need to put a canonical tag on the second page to tell the search engine that it's the same as the first, since I'll be changing the category it's in? For Example: http://www.website.com/home/meet-us/team-leaders/boss-man/ will become http://www.website.com/home/MEET-OUR-TEAM/team-leaders/boss-man My overall question is, do I need to spend the time to run through our entire site and do canonical tags AND 301 redirects to the new page, or can I just simply redirect both of them to the new page? I hope this makes sense. Your help is greatly appreciated!!0 -
Does Google crawl the pages which are generated via the site's search box queries?
For example, if I search for an 'x' item in a site's search box and if the site displays a list of results based on the query, would that page be crawled? I am asking this question because this would be a URL that is non existent on the site and hence am confused as to whether Google bots would be able to find it.
Intermediate & Advanced SEO | | pulseseo0 -
Is 404'ing a page enough to remove it from Google's index?
We set some pages to 404 status about 7 months ago, but they are still showing in Google's index (as 404's). Is there anything else I need to do to remove these?
Intermediate & Advanced SEO | | nicole.healthline0