Content theft for dummies
by Cem Basman
You want to suck content from your favorite site? The author doesn't provide you with RSS? No problem, buddy. There is a remedy. Two alternative scrapers are now on the market: FeedYes and Feed43. They scrape any HTML site and build a neat feed for you in seconds. You'll never have to write your own content again. It's real easy. Content theft for dummies. Creative Commons rights? Who cares. Real bad.
Think about it.
Tags: content+theft rss feeds creative+commons
Comments
Does this mean there finally is a way to get full excerpts from vowe.net? ;-)
... either that or you wait a little and give the guy at http://www.rssextender.com a little bit more time; it's currently in alpha.
Does require .NET though :(
This is really just the tip of the iceberg. There's another scraper with the initials WS that takes it much further. It doesn't simply create a feed from a site: it scrapes an entire site, whether static or with an RSS feed, manipulates the text using a thesaurus program to substitute synonyms for key words, and then reposts the whole shebang on a site of your choosing. It's possible, for example, to completely scrape the IMDB site.
This is the direction scraping technology is heading. Very scary stuff.
If you want more info about this scraper, just email me through my site. I'll gladly share my info with you, but I don't want to post the name freely on the Web. The less advertising I give them, the better.
Some days ago, lots of people were complaining about a new Swiss web service that took (and still takes) the newsfeeds of various blogs and republished them; see e.g. BloggingTom's article (in German). There were, and still are, various discussions in the blogosphere about whether to publish only excerpts or the entire content as a feed. And now these aforementioned services just scrape any HTML content and provide it as a feed.
However, there are various tools available for grabbing website content and storing it locally on the user's hard disk. FeedYes and Feed43 work in a similar manner, but provide RSS instead of HTML files. And these tools are web services, typical of Web 2.0: they provide an online service for this purpose, replacing offline tools.
Thus, I see no problem with the services themselves, since they just replace offline tools and provide additional features.
But I agree: these services make it even easier for thieves to copy content. But that's life: when you publish something on the web, it is easy to copy.