Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Stop google indexing CDN pages
-
Just when I thought I'd seen it all, Google hits me with another nasty surprise!
I have a CDN to deliver images, JS and CSS to visitors around the world. As far as I can tell, I have no links to the static HTML pages on the site, but someone else may have - perhaps a scraper site?
Google has decided the static pages they were able to access through the CDN have more value than my real pages, and they seem to be slowly replacing my pages in the index with the static pages.
Anyone got an idea on how to stop that?
Obviously, I have no access to the static area, because it is in the CDN, so there is no way I know of that I can have a robots file there.
It could be that I have to trash the CDN and change it to only allow the image directory, and maybe set up a separate CDN subdomain for content that only contains the JS and CSS?
Have you seen this problem and beaten it?
(Of course the next thing is Roger might look at Google results and start crawling them too, LOL)
P.S. The reason I am not asking this question in the Google forums is that others have asked it there many times over the past 5 months, nobody at Google has bothered to answer, and nobody who did try gave an answer that was remotely useful. So I'm not really hopeful of anyone here having a solution either, but I expect this is my best bet because you guys are always willing to try.
-
Thank you Edward.
I don't have quite that problem, but I think you are right too.
My CDN is set up as Origin Pull.
That means there is no need to FTP - the system just fetches content from the origin as it is requested.
You should check that out if you have to FTP everything.
But what you said that helped me is this: I should have had one CNAME for images and another CNAME for content, with the content CNAME limited to a folder called content. I can put the CSS and JS files in that folder, and that way the plain HTML pages at the root level will never be served through the CDN.
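To make that split concrete, here is a minimal sketch of the routing rule, assuming two hypothetical CNAMEs (images.example.com and content.example.com) and the folder names /images/ and /content/. None of these names come from my actual setup; swap in your own.

```python
from posixpath import normpath
from typing import Optional

# Hypothetical folder -> CDN hostname mapping (assumed for illustration).
CDN_HOSTS = {
    "/images/": "images.example.com",
    "/content/": "content.example.com",
}


def cdn_host_for(path: str) -> Optional[str]:
    """Return the CDN hostname allowed to serve this path, or None.

    None means the path (for example an HTML page at the root level)
    should only ever be served from the main site.
    """
    # Normalise first so "/content/../index.html" cannot sneak through.
    clean = normpath("/" + path.lstrip("/"))
    for prefix, host in CDN_HOSTS.items():
        if clean.startswith(prefix):
            return host
    return None
```

With a rule like this, /content/site.css maps to the content CNAME while /index.html maps to nothing, which is the point: the root-level pages have no CDN address for Google to find.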
I also realized, while checking the system, that I wasn't using a canonical tag in the intermediate pages, as I was in the story pages. So I just added code to add canonical tags for all the intermediate pages and the front page.
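For anyone wanting to do the same, this is roughly the idea as a minimal sketch, not my actual code. It assumes a hypothetical canonical host (www.example.com), assumes https, and assumes the query string should be dropped; adjust all three for your own site.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: the real host and scheme will differ.
CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.example.com"


def canonical_link(request_url: str) -> str:
    """Build a canonical <link> tag that always points at the main host.

    Whatever hostname the page was fetched through (for example the CDN),
    the tag names the same path on the canonical host.
    """
    parts = urlsplit(request_url)
    # Keep only the path; query string and fragment are dropped (assumption).
    canonical = urlunsplit(
        (CANONICAL_SCHEME, CANONICAL_HOST, parts.path or "/", "", "")
    )
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'
```

Because the CDN copy is pulled from the origin, it carries the same tag, so a page fetched through the CDN hostname still tells Google that the real page is the one on the main site.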
I do have a few other types of pages, so I will handle the code for them next.
I think adding the canonical tag might fix the problem, but I will also work on reconfiguring the CDN and switch over when the site is not too busy, in case the change takes a while to propagate.
-
It sounds like you have set up your CDN slightly wrong.
After setting up a few the way you have, I realised that I was actually making a complete duplicate of the site rather than just the images or assets.
I imagine you have your origin directory for the CDN in the public html folder.
Create a subdomain and set that as the origin.
E.g. I'm working on this site at the moment: http://looksfishy.co.uk/
I have a subdomain called assets: http://assets.looksfishy.co.uk/
The cdn content: http://cdn.looksfishy.co.uk/
Files uploaded here:
http://assets.looksfishy.co.uk/species/holder/pike.jpg
Displayed here:
http://cdn.looksfishy.co.uk/species/holder/pike.jpg
Check the IP address on them.
It does make uploading images by FTP a bit of a faff, but it does make your site better.
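If it helps, the mapping between the two hostnames can be sketched like this. It is only an illustration using the two subdomains above; the function name is made up, and how you wire it into your templates is up to you.

```python
from urllib.parse import urlsplit, urlunsplit

ORIGIN_HOST = "assets.looksfishy.co.uk"  # where the files are uploaded
CDN_HOST = "cdn.looksfishy.co.uk"        # where visitors fetch them


def to_cdn_url(url: str) -> str:
    """Swap the assets origin hostname for the CDN hostname.

    URLs on any other host (such as pages on the main site) are returned
    unchanged, so only files under the assets subdomain reach the CDN.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() != ORIGIN_HOST:
        return url
    return urlunsplit(parts._replace(netloc=CDN_HOST))
```

Since the CDN only ever pulls from the assets subdomain, the HTML pages on the main domain have no CDN copy at all.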