Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Do I have a problem with missing pages in Screaming Frog?
-
We have category pages, and some of them are paginated because they contain additional items. Screaming Frog could not find the items listed after page 1. Is this a problem for Google? These item pages are still in the sitemap. I am sure Google can find them to index them, but does it hurt rankings at all?
-
Check your settings in Screaming Frog for obeying robots.txt, obeying canonicals, etc. That might be your problem.
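One way to rule out robots.txt as the cause is to test the paginated URLs against the rules directly. The sketch below uses Python's standard `urllib.robotparser`; the robots.txt rules and the example.com URLs are hypothetical stand-ins, not taken from the site in question. Note that this parser does simple path-prefix matching, so it may not treat wildcard rules the way Google does.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_lines, urls, user_agent="*"):
    """Return the URLs that the given robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from memory, no network request
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


# Hypothetical rules: replace with the lines of the real robots.txt file.
robots_lines = [
    "User-agent: *",
    "Disallow: /category/page/",
]

# Hypothetical URLs: a category page and its second page of results.
urls = [
    "https://www.example.com/category/",
    "https://www.example.com/category/page/2",
]

print(blocked_urls(robots_lines, urls))
```

If the paginated URLs show up in the returned list, a crawler that obeys robots.txt will skip them, which would explain why the items after page 1 were never discovered.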
-
Big Fan!
Yes, the pages are in the Google index. I'll message you the URLs.
-
Hey Niners52! You a 49ers fan?
In my experience, yes: if Screaming Frog cannot crawl those pages, Google will likely have trouble crawling them as well. Are any of those pages/items currently in Google's index?
Would you mind sharing the URL with me? If you can, I'd be happy to take a look and see if I can help further.
Related Questions
-
Are thousands of 404s a problem?
An ecommerce site I work on has around 16,000 URLs that are 404s in Webmaster Tools. The vast majority are for products that are no longer stocked by the site, which is a natural occurrence in ecommerce. But my question is, could these possibly be harming rankings?
Technical SEO | | creativemay1 -
Exclude status codes in Screaming Frog
I have a very large ecommerce site I'm trying to spider using Screaming Frog. The problem is that the crawl keeps hanging even though I have turned off the high memory safeguard under Configuration. The site has approximately 190,000 pages according to the results of a Google site: command. The site architecture is almost completely flat. Limiting the crawl by depth is a possibility, but it will take quite a bit of manual labor as there are literally hundreds of directories one level below the root. There are many, many duplicate pages. I've been able to exclude some of them from being crawled using the exclude configuration parameters. There are thousands of redirects. I haven't been able to exclude those from the spider because they don't have a distinguishing character string in their URLs. Does anyone know how to exclude files using status codes? I know that would help. If it helps, the site is kodylighting.com. Thanks in advance for any guidance you can provide.
Technical SEO | | DonnaDuncan0 -
How Does Google's "index" find the location of pages in the "page directory" to return?
This is my understanding of how Google's search works, and I am unsure about one thing in particular:
1. Google continuously crawls websites and stores each page it finds (let's call it the "page directory").
2. Google's "page directory" is a cache, so it isn't the "live" version of the page.
3. Google has separate storage called "the index", which contains all the keywords searched. These keywords in "the index" point to the pages in the "page directory" that contain the same keywords.
4. When someone searches a keyword, that keyword is accessed in the "index" and returns all relevant pages in the "page directory".
5. These returned pages are given ranks based on the algorithm.
The one part I'm unsure of is how Google's "index" knows the location of relevant pages in the "page directory". The keyword entries in the "index" point to the "page directory" somehow. I'm thinking each page has a URL in the "page directory", and the entries in the "index" contain these URLs. Since Google's "page directory" is a cache, would the URLs be the same as the live website (and would the keywords in the "index" point to these URLs)? For example, if a webpage is found at wwww.website.com/page1, would the "page directory" store this page under that URL in Google's cache? The reason I want to discuss this is to understand the search process better and so know the effects of changing a page's URL.
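The mental model described in this question can be sketched as a toy inverted index. This only mirrors the poster's description and is not a claim about how Google's systems actually work; the `page_store` contents, URLs, and function names are hypothetical.

```python
from collections import defaultdict


def build_index(page_store):
    """Map each lowercase word to the set of URLs whose stored text contains it."""
    index = defaultdict(set)
    for url, text in page_store.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, keyword):
    """Return the URLs recorded for a keyword, sorted for stable output."""
    return sorted(index.get(keyword.lower(), set()))


# Hypothetical "page directory": cached page text keyed by URL.
page_store = {
    "www.website.com/page1": "blue widgets for sale",
    "www.website.com/page2": "red widgets and gadgets",
}

index = build_index(page_store)
print(search(index, "widgets"))
```

In this toy model the index entries hold URLs, so a page that moves to a new URL stays listed under its old one until the index is rebuilt from a fresh crawl.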
Technical SEO | | reidsteven750 -
How to identify orphan pages?
I've read that you can use Screaming Frog to identify orphan pages on your site, but I can't figure out how to do it. Can anyone help? I know that Xenu Link Sleuth works but I'm on a Mac so that's not an option for me. Or are there other ways to identify orphan pages?
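One tool-independent way to approach this, sketched below under stated assumptions, is to compare the URLs listed in the XML sitemap against the URLs a link-following crawl actually discovered (for example, a list exported from a crawler). Anything in the sitemap that the crawl never reached is an orphan candidate. The sitemap content, URLs, and function names here are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Extract the set of <loc> URLs from a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    }


def orphan_candidates(sitemap_xml, crawled_urls):
    """Return sitemap URLs that the link-following crawl did not discover."""
    return sorted(sitemap_urls(sitemap_xml) - set(crawled_urls))


# Hypothetical sitemap and crawl export.
sitemap_xml = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/old-landing-page</loc></url>
</urlset>"""
crawled_urls = ["https://www.example.com/"]

print(orphan_candidates(sitemap_xml, crawled_urls))
```

This only finds orphans that are listed in the sitemap; pages missing from both the sitemap and the internal link graph would need another source, such as server logs or analytics data.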
Technical SEO | | MarieHaynes0 -
Can you 301 redirect a page to an already existing/old page ?
If you delete a page (say a sub-department/category page on an ecommerce store), should you 301 redirect its URL to the nearest equivalent page still on the site, or just delete it and forget about it? Generally, should you try to 301 redirect any old pages you're deleting if you can find a suitable page with similar content to redirect to? Won't Google consider it weird if you say a page has moved permanently to such and such an address if that page/address existed before? I presume it's fine since, say, in the scenario of consolidating departments on your store, you want to redirect the department page you're going to delete to the existing page/department you are consolidating the old department's products into?
Technical SEO | | Dan-Lawrence0 -
We have set up 301 redirects for pages from an old domain, but they aren't working and we are having duplicate content problems - Can you help?
We have several old domains. One is http://www.ccisound.com - our "real" site is http://www.ccisolutions.com. The 301 redirect from the old domain to the new domain works. However, the 301 redirects for interior pages, like http://www.ccisolund.com/StoreFront/category/cd-duplicators, do not work. This URL should redirect to http://www.ccisolutions.com/StoreFront/category/cd-duplicators but, as you can see, it does not. Our IT director supplied me with this code from the .htaccess file in hopes that someone can help point us in the right direction and suggest how we might fix the problem:
RewriteCond %{HTTP_HOST} ccisound.com$ [NC]
RewriteRule ^(.*)$ http://www.ccisolutions.com/$1 [R=301,L]
Any ideas on why the 301 redirect isn't happening? Thanks all!
Technical SEO | | danatanseo0 -
Trailing Slash Problems
Link juice is being split between the trailing-slash and non-trailing-slash versions of my URLs, i.e. ldnwicklesscandles.com/scentsy-uk and ldnwicklesscandles.com/scentsy-uk/. I initially asked in here and was told to do a rewrite in the .htaccess file. I don't have access to this with Squarespace, nor can I add canonical tags on a page-by-page basis. A 301 redirect from scentsy-uk to scentsy-uk/ didn't work either; the browser showed an error message saying that the redirect wasn't completing. Squarespace hasn't been very helpful at all. My question is: is there another way to fix this? Or should I just call it a day with Squarespace and move to WordPress?
Technical SEO | | cmjolley0 -
Where to put Schema On Page
In what part of my page should I put Schema data? Header? Footer? Also, on all pages, or just the home page?
Technical SEO | | bozzie3114