Content theft for dummies

by Cem Basman

You want to suck content from your favorite site? The author doesn't provide you with RSS? No problem, buddy. There is a remedy. Two alternative scrapers are now on the market: Feedyes and Feed43. They scrape any HTML site and build a neat feed for you in seconds. You'll never have to write your own content again. It's real easy. Content theft for dummies. Creative Commons rights? Who cares. Real bad.

Think about it.

Comments

Does this mean there finally is a way to get full excerpts from vowe.net? ;-)

Karl Ostendorf, 2006-03-09

... either that or you wait a little and give the guy at http://www.rssextender.com a little bit more time; it's currently in alpha.

Does require .NET though :(

Steffen Gutermann, 2006-03-09

This is really just the tip of the iceberg. There's another scraper with the initials WS that takes it much further. It doesn't simply create a feed from a site: it scrapes an entire site, static or with an RSS feed, manipulates the text using a thesaurus program to substitute synonyms for key words, and then reposts the whole shebang on a site of your choosing. It's possible, for example, to completely scrape the IMDB site.

This is the direction scraping technology is heading. Very scary stuff.

If you want more info about this scraper, just email me through my site. I'll gladly share my info with you, but I don't want to post the name freely on the Web. The less advertising I give them, the better.

Jonathan Bailey, 2006-03-09

Some days ago, lots of people were complaining about a new Swiss web service that took (and still takes) the newsfeeds of various blogs and republished them; see e.g. BloggingTom's article (in German). There were, and still are, various discussions in the blogosphere about whether to publish only excerpts or the entire content as a feed. And now these aforementioned services just scrape any HTML content and provide it as a feed.
However, there are already various tools available for grabbing website content and storing it locally on the user's hard disk. FeedYes & Feed43 work in a similar manner, but provide RSS instead of HTML files. And these tools are web services, which is typical for Web 2.0: providing an online service for this purpose to get rid of offline tools.
Thus, I see no problem with the services themselves, since they just replace offline tools and provide additional features.
But I agree: these services make it even easier for thieves to copy the content. But that's life: when you publish something on the web, it is easy to copy it.

Michael Woehrer, 2006-03-10
