Tuesday, April 13, 2010

Finally, a way to block those pesky bots stealing content

We've been using a product over at MFG that works like an invisible CAPTCHA. The beauty of the product is that the end user doesn't even know it's running, yet the technology behind it is accurate and cutting edge. We first started speaking with Pramana (www.pramana.com) over a year ago. Initially there were issues with the technology, but it progressed quickly and has become rock solid. I was unable to produce a false positive in any of my testing and scripting.

We implemented the technology (Pramana HumanPresent - www.pramana.com/human-present/) because competitors that sell databases and information about manufacturing companies were essentially stealing our content. They use various methods, including screen scraping and SEO scraping bots. We have observed this on many occasions, and one company even wanted to sell out to us while it was stealing our data! (somewhat legally)

The product is not super simple to implement, but the benefits are great. They have SDKs for a bunch of languages (we use Java, which is more complex than the PHP API or the others they offer). The SDKs give you all kinds of granular control.

We are a paying customer of Pramana, and they had the great idea of letting people use the service for free (called BotAlert - http://www.pramana.com/botalert/) to detect and measure the bots; you get pretty daily reports from them. If you want to block the bots, then you have to pay. The cost is very reasonable considering that it doesn't inconvenience users and that it can let search engine crawlers index content while blocking homebuilt screen scrapers.
