This week we were offsite at the tech center in NYC for a day trip. We looked at IBM Tivoli Provisioning Manager (TPM), IBM's provisioning and deployment product. It is one of the products we are considering standardizing on. I wanted to get a clear picture of what it can and cannot do versus Opsware SAS. The product looks good, but I still need to write up the full gap analysis. It would definitely meet most of our patching, inventory, and deployment requirements, but it doesn't meet the system administration requirements, or the complex audit and control requirements we are given due to customer audits and regulatory compliance.

Last week IBM brought us a POC for IBM Tivoli Monitoring (ITM), which is their monitoring platform. It compares to HP OpenView Operations (OVO). We are going to bring it in house to do more testing, but after an initial day with the product, we found the following:


Issues in POC:

  1. The agent and the server each died multiple times. Nothing indicated what the error was, and the only remedy was a manual restart.
  2. The POC did not cover agent installation.


Pros to ITM:

  1. Reporting is nicer, and based on open standards.
  2. Multiple servers roll up into a single TEMS more easily than in OVO.
  3. More flexible about which operating systems, databases, and platforms the components can run on.
  4. IBM is quicker to support new component versions (OS, application server, etc.).

Cons to ITM:

  1. Email notifications outside of event escalation are not manageable, aside from using command line calls with the email addresses as arguments.
  2. Scenarios applied to groups are not easily manageable, meaning you end up managing the policy across a lot of individual notifications.
  3. The UI is not as easy to use. There are fewer wizards to guide the engineer through the workflow of making a change or implementing something new.
  4. Everything seems to run as a separate agent. So you will have a Windows OS agent, a Universal Agent, a Custom Agent, etc., all running as separate services and processes.

