Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Way to spider Wordpress site
-
I have an old Wordpress site and I want to move it to a new server and take it off Wordpress (too many hacks). I am trying to spider the site so as to get static, non-Wordpress, pages.
I am having trouble doing this. When I spider the site, it changes the URLs. For instance, if the URL is www.domain.com/page/ the URL I get out of the spider is /page/index.html And those are not the URLs in the search engine indices. There are about 2000 pages on this site, so it is not feasible to set up 301 redirects.
I tried using these spidering programs: WinHTTack Website Copier and PageNest
Does anyone know of another method of turning a Wordpress site into a non Wordpress site?
-
Hi Dan
Hmm that's a little strange. Two things;
- is WordPress updated? Do you get the normal URLs when viewing in your browser?
- have you tried Screaming Frog SEO Spider? It's free to crawl up to 500 pages
Although it won't get the actual HTML on the pages, it could solve the URL issue perhaps.
This blackhat world thread has a few options too.
-Dan
-
Hi Dan, I'm not so experienced in migrating a WP to non -wp but I understand that the issue you're having is that the spider is returning index.htmlfiles for urls like domain/page/.
IT's normal, any spider you will use you'll always have and index.html file. Every directory has it's index.html which is the default file to show if you're not establishing something different with rewrite rules.
If you write /page/ the browser will read the index.html file. What you have to be sure is that you'll set up a 301 redirect to avoid any index.html url to show and have it redirected to the main / page (with wildcards is a one line rule) and that your internal links are pointing all to / pages and not to index.html version of it. You can jsut find and replace the /index.html" string into the html code with the /" text (dreamweaver or any html editor will do that in bulk.
Only one commentary on you idea is that you may consider useful to build a php driven site, using includes for header, footer and nav/sidebar, jsut because thinking ahead if you're willing to make changes to a portion of the page repeating throughout the site you'll have to make changes in all pages and uplaod them all which is quite huge to do and also let space for many human/machine errors.
Hope that helped you out!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Not Indexing Pages (Wordpress)
Hello, recently I started noticing that google is not indexing our new pages or our new blog posts. We are simply getting a "Discovered - Currently Not Indexed" message on all new pages. When I click "Request Indexing" is takes a few days, but eventually it does get indexed and is on Google. This is very strange, as our website has been around since the late 90's and the quality of the new content is neither duplicate nor "low quality". We started noticing this happening around February. We also do not have many pages - maybe 500 maximum? I have looked at all the obvious answers (allowing for indexing, etc.), but just can't seem to pinpoint a reason why. Has anyone had this happen recently? It is getting very annoying having to manually go in and request indexing for every page and makes me think there may be some underlying issues with the website that should be fixed.
Technical SEO | | Hasanovic1 -
What should I name my Wordpress homepage?
I work almost exclusively in wordpress now. And I always hesitate when it comes to naming a site's homepage. I have to give it a name - right? I usually pick the business name or /home. And then that is identifies as the site's static homepage in the Wordpress settings and it works just fine. But I've started to get warning that it is an issue because it creates redirects. For example, I just ran the Ryte service analysis on a website and it warned me about "Non-indexable pages with high relevance" and it's basically my homepage that has 29 incoming links that "passes all pagerank to https://ourdomain/home But what am I supposed to call my homepage if not "Home"? It's not like the old days where anyone has to type it in. The root domain loads the homepage just as it should. Can anybody advise me regarding best practices for what to name a Wordpress homepage for good SEO? With thanks in advance for your help.
Technical SEO | | Dandelion0 -
Move a Wordpress Site to HTTPS with Bluehost
HI Guys, do you think that the following guide is enoght to move a bluehost wordpress site to https in a seo best practive way? https://www.shoutmeloud.com/free-ssl-certificate-bluehost-hosting.html Basically their steps are: Install SSL on Bluehost panel Install Really Simple SSL Wp Plugin Edit Your .htacess File & Add The Code For HTTP To HTTPS Redirection Update All HTTP URLs In Database To HTTPS Using Search and Replace Plugin Use Broken Link Checker plugin & use its redirection module to find links to 3rd party sites with HTTP that should now be HTTPS. Last thing to do Submit your new HTTPS site to Google Search Console & submit your sitemap. Update your profile link on Google Analytics. Update your website links on social media profiles & anywhere else they exist. This step you can do in pieces in the coming days. Read this guide to learn more about HTTP to HTTPS migration & fixing mixed content. If you disabled Who.Is guard for your domain name, you can enable it now. Do you know a better practical guide for wordrpess? in term of usefull plugins to handle the migration? Tx to everyone!
Technical SEO | | Dreamrealemedia0 -
What's the best way for users to upload their images to my wordpress site to promote UGC
I have looked at lots of different plugins and wanted a recommendation for an easy way for patients of ours to upload pictures of them out partying and having fun and looking beautiful so future users can see the final results instead of sometimes gory or difficult to understand before and after images. I'd like to give them the opportunity to write captions (like facebook or insta posts and would offer them incentives to do so. I don't want it to be too complicated for them or have too many steps or barriers but I do want it to look nice and slick and modern. Also do you think this would have a positive impact on SEO? I was also thinking of a Q&A app where dentists could get Q&A emails and respond - i've been doing AMA sessions and they've been really successful and I would like to bring it into out site and make it native. Thanks in advance 🙂
Technical SEO | | Smileworks_Liverpool1 -
Staging site and "live" site have both been indexed by Google
While creating a site we forgot to password protect the staging site while it was being built. Now that the site has been moved to the new domain, it has come to my attention that both the staging site (site.staging.com) and the "live" site (site.com) are both being indexed. What is the best way to solve this problem? I was thinking about adding a 301 redirect from the staging site to the live site via HTACCESS. Any recommendations?
Technical SEO | | melen0 -
Authorship and Publisher on WordPress
I successfully enabled rel=publisher on our WordPress blog, and as a test I also enabled rel=authorship for a set of blog posts. (Tested both in Google's Rich Snippets Tester.) However, on the individual blog posts the publisher credit disappears. Is there a way to enable both to appear on blog posts?
Technical SEO | | ufmedia0 -
What is the best way to find missing alt tags on my site (site wide - not page by page)?
I am looking to find all the missing alt tags on my site at once. I have a FF extension that use to do it page by page, but my site is huge and that will take forever. Thanks!!
Technical SEO | | franchisesolutions1 -
Sitefinity vs Wordpress
We're looking for a new CMS and out development company suggested Sitefinity. I've had great success with Wordpress. Is either system better. I love worpdress but have had no experience with Sitefinity. Thanks!
Technical SEO | | StandUpCubicles0