Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Should I block robots from URLs containing query strings?
-
I'm about to block off all URLs that have a query string using robots.txt. They're mostly URLs with coremetrics tags and other referrer info. I figured that search engines don't need to see these as they're always better off with the original URL.
Might there be any downside to this that I need to consider?
Appreciate your help / experiences on this one.
Thanks
Jenni
-
Thanks for your suggestions. I've already got canonical tags on every page, but they're not all being adhered to and lots of URLs with query strings are still getting organic traffic.
Passing referrer info behind scenes isn't an option with Coremetrics I don't think. Is it?
Interested to know more about number 1 though. How would you do that in WMT other than blocking with robots.txt?
Thanks
-
Instead of blocking them with robots.txt (which isn't very effective), try using the canonical tag instead.
For instance, a URL like this:
http://wwww.testdomain.com/page.html?utm_source=Google&utm_medium=Banner&utm_campaign=CampaignYou could add this canonical tag in the head:
With this solution you don't have to worry about losing quality links OR having your query tracking show up in any of the major search engines.
Cheers- Kyle
-
The downside to this would be if someone linked to the page with the query string, the search engines wouldn't crawl the page and flow link juice properly to the rest of your site.
Other options:
-
Use Google and Bing WMT to ignore those parameter query strings.
-
Make sure the canoncial tag is on those pages, pointing back to the version without the query string
-
Try to pass referrer info behind the scenes if possible
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is there a limit to how many URLs you can put in a robots.txt file?
We have a site that has way too many urls caused by our crawlable faceted navigation. We are trying to purge 90% of our urls from the indexes. We put no index tags on the url combinations that we do no want indexed anymore, but it is taking google way too long to find the no index tags. Meanwhile we are getting hit with excessive url warnings and have been it by Panda. Would it help speed the process of purging urls if we added the urls to the robots.txt file? Could this cause any issues for us? Could it have the opposite effect and block the crawler from finding the urls, but not purge them from the index? The list could be in excess of 100MM urls.
Technical SEO | | kcb81780 -
Best URL format for pagination
We're currently changing the URL format of our website search, we have been discussing a lot and cannot decide the past way to pass the pagination parameter for SEO. We narrowed down to the options. www.website.com/apples/p2 - www.website.com/apples?page=2 - www.website.com/apples/page/2 What would give us best ranking returns? What do you think?
Technical SEO | | HelpSaude0 -
Google insists robots.txt is blocking... but it isn't.
I recently launched a new website. During development, I'd enabled the option in WordPress to prevent search engines from indexing the site. When the site went public (over 24 hours ago), I cleared that option. At that point, I added a specific robots.txt file that only disallowed a couple directories of files. You can view the robots.txt at http://photogeardeals.com/robots.txt Google (via Webmaster tools) is insisting that my robots.txt file contains a "Disallow: /" on line 2 and that it's preventing Google from indexing the site and preventing me from submitting a sitemap. These errors are showing both in the sitemap section of Webmaster tools as well as the Blocked URLs section. Bing's webmaster tools are able to read the site and sitemap just fine. Any idea why Google insists I'm disallowing everything even after telling it to re-fetch?
Technical SEO | | ahockley0 -
Duplicate Content and URL Capitalization
I have multiple URLs that SEOMoz is reporting as duplicate content. The reason is that there are characters in the URL that may, or may not, be capitalized depending on user input. A couple examples are: www.househitz.com/Pennsylvania/Houses-for-sale www.househitz.com/Pennsylvania/houses-for-sale www.househitz.com/Pennsylvania/Houses-for-rent www.househitz.com/Pennsylvania/houses-for-rent There are currently thousands of instances of this on the site. Is this something I should spend effort to try and resolve (may not be minor effort), or should I just ignore it and move on?
Technical SEO | | Jom0 -
Someone is redirecting their url to mine
Hello, I have just discovered that a company in Poland www.realpilot.pl is directing their domain to ours www.transair.co.uk. We have not authorised this, neither do we want this. I have contacted the company and the webmaster to get it removed. If you search for the domain name www.realpilot.pl we (www.transair.co.uk) come up top. My biggest worry is that we will get penalised by Google for this re-direct as it appears to be done using some kind of frame. Does anyone know anything about this kind of thing? Many Thanks Rob Martin
Technical SEO | | brightonseorob0 -
Duplicate canonical URLs in WordPress
Hi everyone, I'm driving myself insane trying to figure this one out and am hoping someone has more technical chops than I do. Here's the situation... I'm getting duplicate canonical tags on my pages and posts, one is inside of the WordPress SEO (plugin) commented section, and the other is elsewhere in the header. I am running the latest version of WordPress 3.1.3 and the Genesis framework. After doing some testing and adding the following filters to my functions.php: <code>remove_action('wp_head', 'genesis_canonical'); remove_action('wp_head', 'rel_canonical');</code> ... what I get is this: With the plugin active + NO "remove action" - duplicate canonical tags
Technical SEO | | robertdempsey
With the plugin disabled + NO "remove action" - a single canonical tag
With the plugin disabled + A "remove action" - no canonical tag I have tried using only one of these remove_actions at a time, and then combining them both. Regardless, as long as I have the plugin active I get duplicate canonical tags. Is this a bug in the plugin, perhaps somehow enabling the canonical functionality of WordPress? Thanks for your help everyone. Robert Dempsey0 -
How to handle sitemap with pages using query strings?
Hi, I'm working to optimize a site that currently has about 5K pages listed in the sitemap. There are not in face this many pages. Part of the problem is that one of the pages is a tool where each sort and filter button produces a query string URL. It seems to me inefficient to have so many items listed that are all really the same page. Not to mention wanting to avoid any duplicate content or low quality issues. How have you found it best to handle this? Should I just noindex each of the links? Canonical links? Should I manually remove the pages from the sitemap? Should I continue as is? Thanks a ton for any input you have!
Technical SEO | | 5225Marketing0