Wednesday, December 20, 2006

Technology consolidation – Datacenter Automation

Across the company we lack tools for configuration management on the network, storage, and server sides. Datacenter automation (BladeLogic and Opsware) is badly needed, as the initial meetings of our cross-market group made clear.

The work I have done so far will be leveraged by the group as we select and roll out a technology across the corporation. The business side is clamoring for these tools to help with deployment, troubleshooting, and management of their servers. I am pushing back so we can make the right decision.

Do I roll something out with the intention of possibly replacing it when we are ready? It's a hard call to make, but we'll see how it pans out over the next 3 weeks.

More on this soon.

Technology alignment

My company is undergoing a realignment of our direction and technical operations groups across all of our market groups. This includes many initiatives, but the one I am involved in covers monitoring and the tools used to conduct operations. The other groups don't seem to have a solid strategy or a solid understanding of how to solve their operational problems. This is partly because they are always firefighting, and so are not able to analyze situations and determine a path.

The strategy and products I have selected in my roadmap are partly shaped by the fact that we have certain technologies ingrained in our process which are not problematic enough to replace. With this new initiative, replacing them becomes not only possible but probable. This means we'll have the BEST tools in place in our roadmap, judged on supportability, cost, and open standards (open source).

Now the tools and set of products will be aligned across the whole organization, covering the 20,000+ servers and 10,000+ network devices.

Wednesday, December 6, 2006

Business Intelligence for Infrastructure Monitoring

How cool would it be if you could look at your servers behind a VIP on a load balancer and tell which servers were serving more errors, serving traffic slower, or not getting the requests they should be in comparison to your other servers? I have a product in place to do just this, but it's not released, so I can't talk too much about it.
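As a rough illustration of that kind of comparison (not the unreleased product itself), here is a minimal Python sketch. It assumes you already have per-server request counts, error counts, and summed response times for the same time window. The function name `compare_pool`, the `ratio` threshold, the sample numbers, and the server names other than fcweb01 are all made up for the example.

```python
from statistics import median


def compare_pool(stats, ratio=2.0):
    """Flag servers that deviate from the pool median by more than `ratio`.

    `stats` maps a server name to a dict with 'requests', 'errors' and
    'total_ms' (summed response time) collected over the same window.
    """
    error_rate = {s: (v["errors"] / v["requests"] if v["requests"] else 0.0)
                  for s, v in stats.items()}
    avg_ms = {s: (v["total_ms"] / v["requests"] if v["requests"] else 0.0)
              for s, v in stats.items()}
    requests = {s: v["requests"] for s, v in stats.items()}

    med_err = median(error_rate.values())
    med_ms = median(avg_ms.values())
    med_req = median(requests.values())

    flags = {}
    for server in stats:
        reasons = []
        if error_rate[server] > ratio * med_err:
            reasons.append("more errors")
        if avg_ms[server] > ratio * med_ms:
            reasons.append("slower")
        if requests[server] * ratio < med_req:
            reasons.append("fewer requests")
        if reasons:
            flags[server] = reasons
    return flags


# Hypothetical numbers for four servers behind one VIP.
pool = {
    "fcweb01": {"requests": 1000, "errors": 10, "total_ms": 120000},
    "fcweb02": {"requests": 1020, "errors": 12, "total_ms": 125000},
    "fcweb03": {"requests": 980, "errors": 95, "total_ms": 118000},
    "fcweb04": {"requests": 300, "errors": 3, "total_ms": 150000},
}
result = compare_pool(pool)
for server, reasons in result.items():
    print(server, ", ".join(reasons))
```

Comparing against the pool median rather than a fixed threshold is just one option; it has the advantage that a pool-wide slowdown does not flag every server at once.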

The issue is that we have to embed some of the data in the HTTP requests, or cookies, which I need help from development to do. I am getting pushback about security and information disclosure, which I am fighting through. I understand it could be an issue, but it's the only way that we can properly track usage and performance. It would really help to have this as an Internet-based standard for tracking requests, backend connections, and usage. Using web logs is not feasible when you have thousands of servers that don't have any consistency. This is my first pass at what I want to add to the HTTP headers:

Type            Variable Name    Explanation                                                          Sample code
--------------  ---------------  -------------------------------------------------------------------  -----------
Application     ApplicationName  StockTickerApp
Application     ServiceName      TickerWebService
Application     PageName         Login_Page
Application     TransactionID    Unique ID of request
Infrastructure  ServerID         Something like fcweb01 + last 3 digits of the IP (.124): fcweb01124
User            UserName         Bsmith
User            CompanyName      Customer ABC
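To make the idea concrete, here is a minimal Python sketch of how an application might assemble these fields into HTTP headers. Nothing here is a standard or the real implementation: the `X-Track-` prefix, the function names, the sample IP address, and the use of a random UUID for the transaction ID are all assumptions made for illustration. The ServerID is built the way the table describes, hostname plus the last part of the IP.

```python
import uuid

# Hypothetical header prefix; the post does not define one.
HEADER_PREFIX = "X-Track-"


def server_id(hostname, ip):
    """Hostname plus the last octet of the IP, e.g. fcweb01 + .124."""
    return hostname + ip.rsplit(".", 1)[-1]


def tracking_headers(application, service, page, hostname, ip,
                     user, company, transaction_id=None):
    """Build a dict of tracking headers from the fields in the table."""
    fields = {
        "ApplicationName": application,
        "ServiceName": service,
        "PageName": page,
        # Assumption: a random UUID is unique enough to identify a request.
        "TransactionID": transaction_id or uuid.uuid4().hex,
        "ServerID": server_id(hostname, ip),
        "UserName": user,
        "CompanyName": company,
    }
    return {HEADER_PREFIX + name: value for name, value in fields.items()}


# Sample values from the table; the IP address is invented.
headers = tracking_headers("StockTickerApp", "TickerWebService", "Login_Page",
                           "fcweb01", "10.0.0.124", "Bsmith", "Customer ABC")
for name, value in headers.items():
    print(f"{name}: {value}")
```

Whatever sits in front of the servers could then read these headers per request and group errors, latency, and request counts by ServerID, ApplicationName, or CompanyName without parsing each server's own logs.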

 


 

More on this later.