Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Duplicate Content - Bulk analysis tool?
-
Hi
I wondered if there's a tool to analyse duplicate content - within your own site or on external sites, but that you can upload the URL's you want to check in bulk?
I used Copyscape a while ago, but don't remember this having a bulk feature?
Thank you!
-
Great thank you!
I'll give both a go!
-
Great thanks
Yes I use screaming frog for this, but it was to look at actual page content. So yes to see if sites copy our content, but also to see whether we need to update our product content as some products are very similar.
I'll check the batch process on copyscape thanks!
-
I have not used this tool in this way, but have used it for other crawler projects related to content clean up and it is rock solid. They have been very responsive to me on questions related to use of the software. http://urlprofiler.com/
Duplicate content search is the project next on my list, here is how they do it.
http://urlprofiler.com/blog/duplicate-content-checker/
You let URL profiler crawl the section of your site that is most likely to be copied (say your blog) and you tell URL profiler what section of your HTML to compare against (i.e. the content section vs the header or footer). URL profiler then uses proxies (you have to buy the proxies) to perform Google searches on sentences from your content. It crawls those results to see if there is a site in the Google SERPs that has sentences from your content word for word (or pretty close).
I have played with Copyscape, but my markets are too niche for it to work for me. The logic here from URL profilers is that you are searching the database that most matters, Google.
Good luck!
-
I believe you might be able to use List Mode in ScreamingFrog to accomplish this, however it depends on ultimately what your goal is to check for duplicate content. Do you simply want to find duplicate titles or duplicate descriptions? Or do you want to find pages with sufficiently similar text as to warrant concern?
== Ooops! ==
It didn't occur to me that you were more interested in duplicate content caused by other sites copying your content rather than duplicate content among your list of URLs.
Copyscape does have a "Batch Process" tool but it is only available to paid subscribers. It does work quite nicely though.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to fix duplicate content for homepage and index.html
Hello, I know this probably gets asked quite a lot but I haven't found a recent post about this in 2018 on Moz Q&A, so I thought I would check in and see what the best route/solution for this issue might be. I'm always really worried about making any (potentially bad/wrong) changes to the site, as it's my livelihood, so I'm hoping someone can point me in the right direction. Moz, SEMRush and several other SEO tools are all reporting that I have duplicate content for my homepage and index.html (same identical page). According to Moz, my homepage (without index.html) has PA 29 and index.html has PA 15. They are both showing Status 200. I read that you can either do a 301 redirect or add rel=canonical I currently have a 301 setup for my http to https page and don't have any rel=canonical added to the site/page. What is the best and safest way to get rid of duplicate content and merge the my non index and index.html homepages together these days? I read that both 301 and canonical pass on link juice but I don't know what the best route for me is given what I said above. Thank you for reading, any input is greatly appreciated!
On-Page Optimization | | dreservices0 -
Does hover over content index well
i notice increasing cases of portfolio style boxes on site designs (especially wordpress templates) where you have an image and text appears after hover over (sorry for my basic terminology). does this text which appears after hover over have much search engine value or as it doesnt immediately appear on pageload does it carry slightly less weight like tabbed content? any advice appreciated thanks neil
On-Page Optimization | | neilhenderson0 -
Duplicate Content with ?Page ID's in WordPress
Hi there, I'm trying to figure out the best way to solve a duplicate content problem that I have due to Page ID's that WordPress automatically assigns to pages. I know that in order for me to resolve this I have to use canonical urls but the problem for me is I can't figure out the URL structure. Moz is showing me thousands of duplicate content errors that are mostly related to Page IDs For example, this is how a page's url should look like on my site Moz is telling me there are 50 duplicate content errors for this page. The page ID for this page is 82 so the duplicate content errors appear as follows and so on. For 47 more pages. The problem repeats itself with other pages as well. My permalinks are set to "Post Name" so I know that's not an issue. What can I do to resolve this? How can I use canonical URLs to solve this problem. Any help will be greatly appreciated.
On-Page Optimization | | SpaMedica0 -
Duplicate Content - Blog Rewriting
I have a client who has requested a rewrite of 250 blog articles for his IT company. The blogs are dispersed on a variety of platforms: his own website's blog, a business innovation website, and an IT website. He wants to have each article optimised with keyword phrases and then posted onto his new website thrice weekly. All of this is in an effort to attract some potential customers to his new site and also to establish his company as a leader in its field. To what extent would I need to rewrite each article so as to avoid duplicating the content? Would there even be an issue if I did not rewrite the articles and merely optimised them with keywords? Would the articles need to be completely taken by all current publishers? Any advice would be greatly appreciated.
On-Page Optimization | | StoryScout0 -
Duplicate Content when Using "visibility classes" in responsive design layouts? - a SEO-Problem?
I have text in the right column of my responsive layout which will show up below the the principal content on small devices. To do this I use visibility classes for DIVs. So I have a DIV with with a unique style text that is visible only on large screen sizes. I copied the same text into another div which shows only up only on small devices while the other div will be hidden in this moment. Technically I have the same text twice on my page. So this might be duplicate content detected as SPAM? I'm concerned because hidden text on page via expand-collapsable textblocks will be read by bots and in my case they will detect it twice?Does anybody have experiences on this issue?bestHolger
On-Page Optimization | | inlinear0 -
If I enbed the same video from my YouTube account on two different websites, will I get a duplicate content penalty?
I have a YouTube video I want to show my B2B and B2C customers. But I have a different websites for each. If I embed the video will I get duplicate content strike against me?
On-Page Optimization | | RoxBrock0 -
Webmaster tools
Hi there, I have access to my sebsite writeing www.piensapiensa.es or www.piensapiensa.com. What domain should I add to the webmasters tools? Should I have to do some kind of 301 direction from piensapiensa.com to piensapiensa.com as the main market is in Spain? Thanks.
On-Page Optimization | | juanmiguelcr0 -
Is there a tool that will "grade" content?
Does anybody know of a tool that can "grade" content for Panda compliance. For example, it might look at: • the total number of words on the page • the average number of words in sentences • grammar • spelling • repetitious words and/or phrases • Readability—using algorithms such as: Flesch Kincaid Reading Ease Flesch Kincaid Grade Level Gunning Fog Score Coleman Liau Index Automated Readability Index (ARI) For the last 5 months I've been writing and rewriting literally 100s of catalog descriptions—adhering to the "no duplicate content" and "adding value" rubrics—but in an extremely informal style. I would like to know if I'm at least meeting Google Panda's minimum standards.
On-Page Optimization | | RScime250