Finally, a way to block those pesky bots stealing content

We've been using a product over at MFG that works like an invisible CAPTCHA. The beauty of it is that the end user doesn't even know it's running, yet the detection is accurate and the technology behind it is genuinely cutting edge. We first started speaking with Pramana (www.pramana.com) over a year ago. Initially there were issues with the technology, but it progressed quickly and has become rock solid. I was unable to produce a single false positive in all my testing and scripting.

We implemented the technology (Pramana HumanPresent - www.pramana.com/human-present/) because competitors that sell databases and information about manufacturing companies were essentially stealing our content. They use various methods, including screen scraping and SEO scraping bots. We have observed this on many occasions, and we even had one company that wanted to sell itself to us while it was stealing our data (somewhat legally)!

The product is not super simple to implement, but the benefits are great. They have SDKs for a bunch of languages (we use Java, which is more complex than the PHP API or the others they offer). The SDKs give you all kinds of granular control.

We are a paying customer of Pramana. They had the great idea of letting people use the service for free (called BotAlert - http://www.pramana.com/botalert/) to detect and measure bots, and you get nice-looking daily reports from them. If you want to block the bots, you have to pay. The cost is very reasonable considering that it doesn't inconvenience users, and it can allow search engine crawlers to index content while blocking homebuilt screen scrapers.
