Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Good robots txt for magento
-
Dear Communtiy,
I am trying to improve the SEO ratings for my website www.rijwielcashencarry.nl (magento). My next step will be implementing robots txt to exclude some crawling pages.
Does anybody have a good magento robots txt for me? And what need i copy exactly?Thanks everybody!
Greetings,
Bob
-
This is fine, as long as you don't want to exclude robots from crawling any part of your site.
-
Me to have this problem, if someone can help with setting root.txt
my webcurrent configuration is
Sitemap: http://www.myweb/sitemap.xml
User-agent: *
Disallow:THIS IS GOOD ?
-
Hi Ruth,
Also thanks for your response!
Greetings,
Bob
-
Hi Peter,
Thanks for your response! I am going to follow up your advice and build a good Robots TXT.
Greetings,
Bob
-
Peter is correct - your search, admin and user pages are common pages to block for Magento. What you block is up to you, though. Don't forget that a page that is blocked by robots.txt can still be found by search engines, so if it's a page that will contain private information you should protect it with a password.
-
Hi there! Did Peter's response take care of this for you? If so, please mark it as a "Good Answer."
-
Hi,
Creating robots.txt file for the site is one of the most important thing, you need to understand your website or stores basic needs what to keep private and what to make public; I think you need to block some part in your magento site like your search pages (?*sid) and admin pages, user dashboard pages, here is an example links Robots.txt for Magento and Robots.txt File Examples
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Role of Robots.txt and Search Console parameters settings
Hi, wondering if anyone can point me to resources or explain the difference between these two. If a site has url parameters disallowed in Robots.txt is it redundant to edit settings in Search Console parameters to anything other than "Let Googlebot Decide"?
Technical SEO | | LivDetrick0 -
Multiple robots.txt files on server
Hi! I have previously hired a developer to put up my site and noticed afterwards that he did not know much about SEO. This lead me to starting to learn myself and applying some changes step by step. One of the things I am currently doing is inserting sitemap reference in robots.txt file (which was not there before). But just now when I wanted to upload the file via FTP to my server I found multiple ones - in different sizes - and I dont know what to do with them? Can I remove them? I have downloaded and opened them and they seem to be 2 textfiles and 2 dupplicates. Names: robots.txt (original dupplicate)
Technical SEO | | mjukhud
robots.txt-Original (original)
robots.txt-NEW (other content)
robots.txt-Working (other content dupplicate) Would really appreciate help and expertise suggestions. Thanks!0 -
Should I block Map pages with robots.txt?
Hello, I have a website that was started in 1999. On the website I have map pages for each of the offices listed on my site, for which there are about 120. Each of the 120 maps is in a whole separate html page. There is no content in the page other than the map. I know all of the offices love having the map pages so I don't want to remove the pages. So, my question is would these pages with no real content be hurting the rankings of the other pages on our site? Therefore, should I block the pages with my robots.txt? Would I also have to remove these pages (in webmaster tools?) from Google for blocking by robots.txt to really work? I appreciate your feedback, thanks!
Technical SEO | | imaginex0 -
Blocked jquery in Robots.txt, Any SEO impact?
I've heard that Google is now indexing links and stuff available in javascript and jquery. My webmastertools is showing that some links are blocked in robots.txt of jquery. Sorry I'm not a developer or designer. I want to know is there any impact of this on my SEO? and also how can I unblock it for the robots? Check this screenshot: http://i.imgur.com/3VDWikC.png
Technical SEO | | hammadrafique0 -
Two META Robots tags on a page - which will win?
Hi, Does anybody know which meta-robots tag will "win" if there is more than one on a page? The situation:
Technical SEO | | jmueller
our CMS is not very flexible and so we have segments of META-Tags on the page that originate from templates.
Now any author can add any meta-tag from within his article-editor.
The logic delivering the pages does not care if there might be more than one meta-robots tag present (one from template, one from within the article). Now we could end up with something like this: Which one will be regarded by google & co?
First?
Last?
None? Thanks a lot,
Jan0 -
Allow or Disallow First in Robots.txt
If I want to override a Disallow directive in robots.txt with an Allow command, do I have the Allow command before or after the Disallow command? example: Allow: /models/ford///page* Disallow: /models////page
Technical SEO | | irvingw0 -
Subdomain Removal in Robots.txt with Conditional Logic??
I would like to see if there is a way to add conditional logic to the robots.txt file so that when we push from DEV to PRODUCTION and the robots.txt file is pushed, we don't have to remember to NOT push the robots.txt file OR edit it when it goes live. My specific situation is this: I have www.website.com, dev.website.com and new.website.com and somehow google has indexed the DEV.website.com and NEW.website.com and I'd like these to be removed from google's index as they are causing duplicate content. Should I: a) add 2 new GWT entries for DEV.website.com and NEW.website.com and VERIFY ownership - if I do this, then when the files are pushed to LIVE won't the files contain the VERIFY META CODE for the DEV version even though it's now LIVE? (hope that makes sense) b) write a robots.txt file that specifies "DISALLOW: DEV.website.com/" is that possible? I have only seen examples of DISALLOW with a "/" in the beginning... Hope this makes sense, can really use the help! I'm on a Windows Server 2008 box running ColdFusion websites.
Technical SEO | | ErnieB0 -
Schema for Price Comparison Services - Good or Bad?
Hey guys, I was just wondering what the whole schema.org markup means for people that run search engines (i.e. for a niche, certain products) or price comparison engines in general. The intend behind schema.org was to help the engines better understand the pages content. Well, I guess such services don't necessarily want Google to understand that they're just another search engine (and thus might get thrown out of the index for polluting it with search result pages). I see two possible scenarios: either not implement them or implement them in a way that makes the site not look like an aggregator, i.e. by only marking up certain products with unique text. Any thoughts? Does the SEOmoz team has any advice on that? Best,
Technical SEO | | derderko
schuon0