
Performance Monitoring, another view

Originally posted on devops.com: http://devops.com/2015/05/20/performance-monitoring-another-view/

This is my first post on DevOps.com. For those who haven’t read my other writing, I currently work as the VP of Market Development and Insights at AppDynamics. I joined in February 2015 after four years at Gartner as a Research VP covering all things monitoring. During my time as a Gartner analyst I helped buyers make decisions about purchasing monitoring software. When I read articles on the internet that blatantly disregard best practices, I either smell that something is fishy (sponsorship) or conclude the authors simply didn’t follow a process. Conversely, some people do follow a proper process. I wrote this up specifically because I read a really fishy article.

Step 1 – Admit it

The first step to determining you have a problem is admitting it. Most monitoring tool buyers realize they have too many tools, and that none of them help isolate root cause. This is a result of most buying happening at the infrastructure layer, while the application layer is either lightly exercised synthetically or disregarded altogether. People soon realize they must find an application-layer monitoring tool or write their own instrumentation. The vendors who sell logging software would love for you to generate a lot of rich logs so they can help you solve problems, but realistically this doesn’t work in most enterprises: they run a combination of custom and packaged software, and forcing developers to write consistent logging is a frustrating and futile task.
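To make that last point concrete, here is a minimal sketch (my own illustration, not anything from the original article) of what “consistent logging” actually asks of every developer: a shared, structured format with identical field names so events can be correlated later. Getting every team, plus packaged software you don’t control, to emit logs like this is exactly the exercise that tends to fail.

```python
import json
import logging
import time
import uuid

# Hypothetical helper every team would have to adopt for logs to be correlatable.
def log_event(logger, level, message, **fields):
    """Emit a single structured (JSON) log line with consistent field names."""
    record = {
        "timestamp": time.time(),
        "message": message,
        # Field names like these must be present and identical everywhere, or
        # downstream log-analytics tools cannot stitch events back together.
        "transaction_id": fields.pop("transaction_id", str(uuid.uuid4())),
        **fields,
    }
    logger.log(level, json.dumps(record))

logging.basicConfig(level=logging.INFO)
app_logger = logging.getLogger("checkout-service")

# Every call site, in every service, has to pass the same field names.
log_event(app_logger, logging.INFO, "order placed",
          transaction_id="a1b2c3", user_id=42, latency_ms=117)
```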

Step 2 – Document it

The next phase is to document what you are using today. Most organizations have over 10 monitoring tools, and they typically want to buy another one. As they consider APM tools they should document what they already have and build a plan to consolidate and eliminate redundant tools. Tools will remain trapped within silos until you can fix the organizational structure.

Step 3 – Requirements

I can’t tell you how many times vendors set requirements for buyers. You can tell when this happens versus the vendor actually understanding the user’s requirements, whether the solution will scale, and whether the team can operate it. Purchasing a tool that requires full-time consultants on staff to keep it running is no longer feasible today; sorry to the old big ITOM vendors (formerly the big four), but people don’t want that anymore.

Step 4 – Test it!

If you are evaluating software you MUST test it and make sure it works in your environment. Better yet, load test the software to determine whether the vendor’s claims about overhead and visibility are in fact true. I see far too many people buy software they never tested.
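As a sketch of what “test it” can look like in practice (my own illustration; the endpoint and numbers are placeholders, not a prescribed methodology), the simplest overhead check is to drive identical load at the application with the APM agent disabled and then enabled, and compare the latency distributions:

```python
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Placeholder endpoint; point this at a representative transaction in your app.
URL = "http://test-app.internal/checkout"

def timed_request(_):
    start = time.perf_counter()
    with urllib.request.urlopen(URL, timeout=10) as resp:
        resp.read()
    return (time.perf_counter() - start) * 1000  # latency in milliseconds

def run_load(requests=500, concurrency=20):
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed_request, range(requests)))
    return {
        "median_ms": statistics.median(latencies),
        "p95_ms": latencies[int(len(latencies) * 0.95) - 1],
    }

# Run once with the agent disabled, once enabled, and compare the numbers
# under realistic load. The vendor's overhead claims should hold up here.
if __name__ == "__main__":
    print(run_load())
```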
Getting back to the article which prompted this post:
The author is located in Austria, and I believe there is a connection to Dynatrace; Dynatrace’s R&D and technical leadership are in Austria, but this is just conjecture. The article is written with a bias that is obvious to any reader who has used these tools. The evaluator did not test these products, or if he did, he didn’t have a testing methodology.
Finally, the author’s employer is a new company, and all of its employees have come from Automic Software within the past year. I don’t need to explain the state of things there to those who work with Automic. http://www.glassdoor.com/Reviews/Automic-Software-Reviews-E269431.htm#
I think I’ve spent enough time on the background; here are my specific issues with the content.
The screenshots are taken from various websites and are neither current nor accurate. If you write an evaluation of technology, at least test it, and use screenshots from your own testing!
The post was specifically meant to discuss DevOps, yet the author doesn’t cover the need to monitor microservices or asynchronous application patterns, which are clearly the current architectures of choice, especially in DevOps or continuous-release environments. In fact, Microsoft announced new products for building these patterns at Ignite last week, including Nano Server and Azure Service Fabric. The difficulty with microservices is that each external request spawns a large number of internal requests, which often start and stop out of order, making them hard to monitor. This causes major issues for APM products, and very few handle it effectively today.
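To illustrate why this is hard, here is a simplified sketch of the general pattern (my own illustration, not any particular vendor’s implementation): the only way to reassemble out-of-order internal calls into one external transaction is to propagate a correlation ID with every downstream call, including the asynchronous ones.

```python
import random
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

def call_internal_service(name, correlation_id):
    """A stand-in for an internal microservice call; these finish out of order."""
    time.sleep(random.uniform(0.01, 0.05))
    # Without the correlation ID, this event cannot be tied back to the
    # external request that caused it once calls interleave.
    return {"service": name, "correlation_id": correlation_id,
            "finished_at": time.time()}

def handle_external_request():
    """One external request fanning out to several asynchronous internal calls."""
    correlation_id = str(uuid.uuid4())
    services = ["inventory", "pricing", "fraud-check", "shipping"]
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Results come back whenever they finish, not in the order submitted;
        # the shared correlation_id is what lets a monitoring tool stitch them
        # back into a single end-to-end transaction.
        events = list(pool.map(
            lambda s: call_internal_service(s, correlation_id), services))
    return correlation_id, events

if __name__ == "__main__":
    cid, events = handle_external_request()
    for event in sorted(events, key=lambda e: e["finished_at"]):
        print(cid, event["service"])
```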
The article calls out user interfaces and deployment models inconsistently, and fails to mention that Dynatrace is an on-premises software product. Dynatrace is also a thick Java client built on Eclipse. That is the definition of a heavy client, which the author criticizes other products for in this same article.
Continuing the article’s incorrect facts, the author calls out Dell (formerly Quest) Foglight. This is not Dell’s current product for modern architectures; Dell has moved to a SaaS product called Foglight APM SaaS, which is not what was evaluated. The evaluator should look at the current product (notice a trend here?).
Finally the AppDynamics review is inaccurate:
Editor’s note: the following is the author’s opinion, not the opinion of DevOps.com. As the author states, he is employed by AppDynamics.
“code level visibility may be limited here to reduce overhead.”
The default behavior that allows AppDynamics to scale to monitoring thousands of systems with a single on-premises controller (something no other APM vendor delivers) is based on advanced smart instrumentation, but every transaction is still monitored and measured. All products limit the amount of data they capture in some way. AppDynamics has several modes, including one which captures full call traces for every execution.
“lack of IIS request visibility”
AppDynamics provides excellent visibility into IIS requests; I’m not sure what the author means here. AppDynamics also supports deployment on Azure using NuGet.
“features wall clock time with no CPU breakdown”
Once again, I’m not sure what the author is referring to here. The product provides several ways of viewing CPU usage.
“Its flash-based web UI and rather cumbersome agent configuration”
The agent installation is no different from other products’, which suggests the author has not actually installed the product. Finally, the UI is HTML5-based, and has been for a while; there are still some Flash views in configuration screens, but those are being removed with each release. I’d much rather have a web-based UI with a little remaining Flash than a fat-client UI requiring a large download.
AppDynamics has a single UI which goes far deeper into database performance than the other products in this review. AppDynamics also provides far more capabilities than were highlighted in this article, including synthetic monitoring. AppDynamics offers the same software via SaaS or on-premises installation.
Finally, this review did not look at analytics, which is clearly an area of increasing demand within APM. As it stands, this review is far from factual or useful.
Hopefully this sets some of the record straight. Please leave comments here, or reach me @jkowall on Twitter!
Thanks.
