Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Using the Google Remove URL Tool to remove https pages
-
I have found a way to get a list of 'some' of my 180,000+ garbage URLs now, and I'm going through the tedious task of using the URL removal tool to put them in one at a time. Between that and my robots.txt file and the URL Parameters, I'm hoping to see some change each week.
I have noticed when I put URL's starting with https:// in to the removal tool, it adds the http:// main URL at the front.
For example, I add to the removal tool:-
https://www.mydomain.com/blah.html?search_garbage_url_addition
On the confirmation page, the URL actually shows as:-
http://www.mydomain.com/https://www.mydomain.com/blah.html?search_garbage_url_addition
I don't want to accidentally remove my main URL or cause problems. Is this the right way this should look?
AND PART 2 OF MY QUESTION
If you see the search description in Google for a page you want removed that says the following in the SERP results, should I still go to the trouble of putting in the removal request?
www.domain.com/url.html?xsearch_...
A description for this result is not available because of this site's robots.txt – learn more.
-
Thanks so much for taking the time to respond.
I think I will add the https to WMT and remove them that way.
I will take a look through the .htaccess file and the creation of the ssl robots file. A while back, it seemed that Google was indexing a lot of my site as https and then the dropped it and went mainly back to http. I will get that sorted to make it clear.
-
Hi there
I'll start with question 2 first as it's a bit easier to answer. Robots.txt blocks the crawling of a page, but not necessarily indexing. Of course, if the page cannot be crawled it will be deindexed eventually anyway, but if you're getting that description for one of your URLs, Google has not been able to access it and will stop trying to. So that is usually enough, although if you want to remove it as well, you can by all means.
For question 1 - GWT is a bit awkward in the sense that it treats http and https versions of your site as different webmaster properties. Furthermore, if you want to remove a URL on your site, it will always prefix it with the http/https version of your site, no matter how you enter it.
If you added another WMT property that was https://www.yourdomain.com - you would be able to manage that domain as well and thus you would be able to remove any URLs under that prefix.
Incidentally, if you want to block all HTTPS pages from being accessed, you can do that with a special instruction in your htaccess file and robots txt. You can instruct the Googlebot and other bots to read a specific robots.txt file if they visit an HTTPS URL. To do that, you would first add this to your htaccess file:
RewriteCond %{HTTPS} ^on$
RewriteCond %{REQUEST_URI} ^/robots.txt$
RewriteRule ^(.*)$ /robots_ssl.txt [L]This command basically says "if the URL has https, read the robots_ssl.txt file". You then upload a file called robots_ssl.txt to your root domain. In the txt file you just add:
User-agent: *
Disallow: /So now, if a bot reaches an https URL, it has to read the robots_ssl.txt file and upon reading that, they are denied access. That would prevent all of your https URLs from being indexed.
That might be useful to you, but if you go ahead and use it please take care to backup all your files in case anything goes wrong - your htaccess file is very important!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is 301 redirect the only way when using Vanity URLs?
We have been using vanity urls for some of our pages. Mostly the pages that have a vanity URL have a long URL length. But now the problem is, the vanity URL is getting displayed on the search engine when the particular keyword related to the page is entered. I checked the google search console, the vanity URL is indexed and the original URL remains unindexed. What should I do? Is adding 301 redirect to the vanity URLs are solution? Since some of vanity URLs are not redirecting to the original. Some of the original pages are not getting traffic. Also, can using canonical tag help?
Technical SEO | | tejasbansode0 -
What happens when you replace a page with a new version that has the same URL?
a new page template was created the plan is to publish the new page (which has the same URL as before) to web and delete the old page that has the URL , will that have an SEO implications ?
Technical SEO | | lina_digital1 -
Create Page Titles from H1 using Yoast?
I'm working on a site that has 280 blog posts that have either been migrated from an old CMS site or created on the Dev version of the new WordPress site. We've written 280 unique meta descriptions so they don't truncate but it there a quick way I can export the current H1s and then import them into Yoast so they are set as the Page Titles? I've written unique Page Titles and meta descriptions for all the Service and Products page and just want a way to speed up the blog posts as their H1s are really good and what I would use as Page Titles anyway. Any help, greatly appreciated!
Technical SEO | | Marketing_Today0 -
How to check if an individual page is indexed by Google?
So my understanding is that you can use site: [page url without http] to check if a page is indexed by Google, is this 100% reliable though? Just recently Ive worked on a few pages that have not shown up when Ive checked them using site: but they do show up when using info: and also show their cached versions, also the rest of the site and pages above it (the url I was checking was quite deep) are indexed just fine. What does this mean? thank you p.s I do not have WMT or GA access for these sites
Technical SEO | | linklander0 -
Google Webmaster Tools - content keywords containing spam?
Hi all, When I looked in Google Webmaster Tools today I found under the menu Google Index, Content Keywords, that the list is full of spammy keywords (E.g. Viagra (no. 1) and stuff like that) Around april we built a whole new website, uploaded a new xml-sitemap, and did all the other things Google Webmaster Tools suggest when one is creating a Google Webmaster Account. Under the menu "Security Issues" nothing is mentioned. All together I find it har d to believe that the site is hacked - so WHY is Google finding these content keywords on our site?? Should I fear that this will harm my SEO efforts? Best regards, Christian
Technical SEO | | Henrik_Kruse0 -
How to inform Google to remove 404 Pages of my website?
Hi, I want to remove more than 6,000 pages of my website because of bad keywords, I am going to drop all these pages and making them ‘404’ I want to know how can I inform google that these pages does not exists so please don’t send me traffic from those bad keywords? Also want to know can I use disavow tool of google website to exclude these 6,000 pages of my own website?
Technical SEO | | renukishor4 -
Noindex vs. page removal - Panda recovery
I'm wondering whether there is a consensus within the SEO community as to whether noindexing pages vs. actually removing pages is different from Google Pandas perspective?Does noindexing pages have less value when removing poor quality content than physically removing ie. either 301ing or 404ing the page being removed and removing the links to it from the site? I presume that removing pages has a positive impact on the amount of link juice that gets to some of the remaining pages deeper into the site, but I also presume this doesn't have any direct impact on the Panda algorithm? Thanks very much in advance for your thoughts, and corrections on my assumptions 🙂
Technical SEO | | agencycentral0 -
Removing URL Parentheses in HTACCESS
Im reworking a website for a client, and their current URLs have parentheses. I'd like to get rid of these, but individual 301 redirects in htaccess is not practical, since the parentheses are located in many URLs. Does anyone know an HTACCESS rule that will simply remove URL parantheses as a 301 redirect?
Technical SEO | | JaredMumford0