Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
OK to block /js/ folder using robots.txt?
-
I know Matt Cutts suggestions we allow bots to crawl css and javascript folders (http://www.youtube.com/watch?v=PNEipHjsEPU)
But what if you have lots and lots of JS and you dont want to waste precious crawl resources?
Also, as we update and improve the javascript on our site, we iterate the version number ?v=1.1... 1.2... 1.3... etc.
And the legacy versions show up in Google Webmaster Tools as 404s. For example:
http://www.discoverafrica.com/js/global_functions.js?v=1.1
http://www.discoverafrica.com/js/jquery.cookie.js?v=1.1
http://www.discoverafrica.com/js/global.js?v=1.2
http://www.discoverafrica.com/js/jquery.validate.min.js?v=1.1
http://www.discoverafrica.com/js/json2.js?v=1.1Wouldn't it just be easier to prevent Googlebot from crawling the js folder altogether?
Isn't that what robots.txt was made for?
Just to be clear - we are NOT doing any sneaky redirects or other dodgy javascript hacks.
We're just trying to power our content and UX elegantly with javascript.
What do you guys say:
Obey Matt? Or run the javascript gauntlet?
-
Hey!
So, I listened to Matt's video. I see his point about wanting to crawl the JS files just in case something tricky is going on. Do understand that this is a risk you take. I don't see an issue blocking crawling of those files from a logical perspective, but if you or someone that takes over for you in the future does do something sneaky with JS and you are caught ... plus you have blacked access to the offending files ... it is going to take a lot more work to get back in good graces with them.
It's like a cop searching your car. You have every right to ban them from doing so, but if you have nothing to hide, why make trouble? Matt is right, banning crawling of these files is not going to save you much but if you think it's an issue, feel free. Just know that they might take it as a possible flag in the future.
Kate
-
Harald, it looks like the response you've quoted is from http://groups.google.com/a/googleproductforums.com/forum/#!category-topic/webmasters/crawling-indexing--ranking/9MGYEoROdkg, which is a question about a menu that has javascript. I think this poster has a slightly different question. I'll ask another associate to come on in and take a look.
-
Hi Discover,I think that whenever we access the web pages , we have seen number of times that there is run time error & they asking for debug. This error message is helpful for the developers only but not for the users.
I think that you should please refer to the following link:
The truth about non javascript
I hope that above content help to solve your query.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
In writing the url, it is better to use the language used by the people of my country or English?
We speak Persian and all people search in Persian on Google. But I read in some sources that the url should be in English. Please tell me which language to use for url writing?
Technical SEO | | ghesta
For example, I brought down two models: 1fb0e134-10dc-4737-904f-bfdf07143a98-image.png https://ghesta.ir/blog/how-to-become-rich/
2)https://ghesta.ir/blog/چگونه-پولدار-شویم/0 -
Robots.txt & meta noindex--site still shows up on Google Search
I have set up my robots.txt like this: User-agent: *
Technical SEO | | RoxBrock
Disallow: / and I have this meta tag in my on a Wordpress site, set up with SEO Yoast name="robots" content="noindex,follow"/> I did "Fetch as Google" on my Google Search Console My website is still showing up in the search results and it says this: "A description for this result is not available because of this site's robots.txt" This site has not shown up for years and now it is ranking above my site that I want to rank for this keyword. How do I get Google to ignore this site? This seems really weird and I'm confused how a site with little content, that has not been updated for years can rank higher than a site that is constantly updated and improved.1 -
Url folder structure
I work for a travel site and we have pages for properties in destinations and am trying to decide how best to organize the URLs basically we have our main domain, resort pages and we'll also have articles about each resort so the URL structure will actually get longer:
Technical SEO | | Vacatia_SEO
A. domain.com/main-keyword/state/city-region/resort-name
_ domain.com/family-condo-for-rent/orlando-florida/liki-tiki-village_ _ domain.com/main-keyword-in-state-city/resort-name-feature _
_ domain.com/family-condo-for-rent/orlando-florida/liki-tiki-village/kid-friend-pool_ B. Another way to structure would be to remove the location and keyword folders and combine. Note that some of the resort names are long and spaces are being replaced dynamically with dashes.
ex. domain.com/main-keyword-in-state-city/resort-name
_ domain.com/family-condo-for-rent-in-orlando-florida/liki-tiki-village_ _ domain.com/main-keyword-in-state-city/resort-name-feature_
_ domain.com/family-condo-for-rent-in-orlando-florida/liki-tiki-village-kid-friend-pool_ Question: is that too many folders or should i combine or break up? What would you do with this? Trying to avoid too many dashes.0 -
Disallow: /404/ - Best Practice?
Hello Moz Community, My developer has added this to my robots.txt file: Disallow: /404/ Is this considered good practice in the world of SEO? Would you do it with your clients? I feel he has great development knowledge but isn't too well versed in SEO. Thank you in advanced, Nico.
Technical SEO | | niconico1011 -
Double Slash // in URL
My client is using double forward slahes in URL like this "//" is this affecting SEO?
Technical SEO | | yanaiguana1110 -
Robots.txt to disallow /index.php/ path
Hi SEOmoz, I have a problem with my Joomla site (yeah - me too!). I get a large amount of /index.php/ urls despite using a program to handle these issues. The URLs cause indexation errors with google (404). Now, I fixed this issue once before, but the problem persist. So I thought, instead of wasting more time, couldnt I just disallow all paths containing /index.php/ ?. I don't use that extension, but would it cause me any problems from an SEO perspective? How do I disallow all index.php's? Is it a simple: Disallow: /index.php/
Technical SEO | | Mikkehl0 -
/~username
Hello, The utility on this site that crawls your site and highlights what it sees as potential problems reported an issue with /~username access seeing it as duplicate content i.e. mydomain.com/file.htm is the same as mydomain.com~/username/file.htm so I went to my server hosts and they disabled it using mod_userdir but GWT now gives loads of 404 errors. Have I gone about this the wrong way or was it not really a problem in the first place or have I fixed something that wasn't broken and made things worse? Thanks, Ian
Technical SEO | | jwdl0 -
How does Google find /feed/ at the end of all pages on my site?
Hi! In Google Webmaster Tools I find *.../feed/ as a 404 page in crawl errors. The problem is that none of these pages exist and they have no inbound links (except the start page). FYI, it´s a wordpress site. Example: www.mysite.com/subpage1/feed/ www.mysite.com/subpage2/feed/ www.mysite.com/subpage3/feed/ etc Does Google search for /feed/ by default or why do I keep getting these 404´s every day?
Technical SEO | | Vivamedia0