Sunday, December 14, 2008

Akamai EDNS

My team is running yet another move from internally hosted DNS to Akamai EDNS. I've done the same at CCBN and Thomson on previous projects. The product is great, and the design is superb. We should start moving web forwarding (from godaddy) on Tuesday and start using Akamai EDNS on Wednesday. I expect improved response times and much better DNS resiliency than we had before. More when it's implemented :)
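Before a cutover like this, it helps to diff what the old and new name servers hand back for the same names. Here is a minimal sketch, assuming the answers have already been collected into plain dictionaries. The hostnames, addresses, and the `diff_zones` helper are made up for illustration; none of this comes from our zones or from Akamai tooling.

```python
def diff_zones(old, new):
    """Compare two {(name, rtype): set(values)} maps and list the mismatches."""
    problems = []
    for key in sorted(set(old) | set(new)):
        before = old.get(key, set())
        after = new.get(key, set())
        if before != after:
            problems.append((key, sorted(before), sorted(after)))
    return problems


# Hypothetical answers gathered from the old and the new DNS service.
old_dns = {
    ("www.example.com", "A"): {"192.0.2.10"},
    ("example.com", "MX"): {"10 mail.example.com"},
}
new_dns = {
    ("www.example.com", "A"): {"192.0.2.10"},
    ("example.com", "MX"): {"10 mx1.example.com"},
}

for (name, rtype), before, after in diff_zones(old_dns, new_dns):
    print(f"{name} {rtype}: old={before} new={after}")
```

An empty result means the two sides agree on every record that was collected, which is the state I'd want before pointing the registrar at the new servers.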

Been in Geneva doing a datacenter build out

I was in Geneva for the last 12 days building out our new datacenter. We are getting ready to move our US hosting facility to Geneva so we can merge the MFG and sourcingparts environments. It's going to take several months overall, but the first steps are going to occur in the next few weeks. Everything went really well. We installed and moved a lot of gear: servers, F5 Big-ips, (Unknown Vendor) HA firewalls, and a nice Netapp 3040 dual head cluster.

The only issue was that we got a quad 1gig card in it versus the 10G copper card we wanted. Netapp only makes a 10G fiber card, so we had to buy different switch modules as well. Netapp ate the extra cost on the cards for us, which was nice. We also can't run LACP on the 10G fiber cards, so we lose switch redundancy.

The Netapp is currently serving Vmware ESXi over NFS, and it screams. We also are using iSCSI for our MSSQL clusters. The speeds on Vmware and iSCSI are very good even with the current 4G we are running until we get the new cards in. Working perfectly, and very happy about it.

The new network and server design is nice and clean, and it's working perfectly as well. We still have a lot to build in the 1st environment we are working on, as well as the other environments we have to build out. I'm going to be back in Geneva to finish the last physical moves in Feb or so, but in the meantime we'll be doing things remotely and moving as much as possible. My team over there has been doing great, as has the US team, who were responsible for most of the implementation and design. Great job to all parties involved. I'm very excited by the progress, and by the flexibility we'll have with our new 70 disk Netapp system, networking improvements, and the larger Vmware environment.

More later.

Wednesday, December 3, 2008

Trip to Geneva

I arrived in lovely Geneva on Tuesday morning, and met some of the guys, as well as one of the engineers who works for me here. Great cool city, reminds me of Banff. We spent some time in the datacenter preparing, and getting acquainted with the staff and facility. We unpacked all of the gear, around 4 tons total, which was a lot of work. Today we started racking and configuring after working on some stuff in the office first. We are trying to install some new DCs, access points, a CA, and RADIUS authentication for 802.1x using dd-wrt. We still need to finish this work tomorrow, and we also have Netapp coming in to finish the configuration of the new FAS3040s we bought.

Had a great dinner with some of the guys here in France, and doing a bunch of work tonight to catch up. More as we progress, I'll be here for another 8 days working on the datacenter.

Nite.

Saturday, November 22, 2008

Week in review – Hosting Move planning, and other projects

Had a productive week, and didn't kill myself. I write this from the office on Saturday, where I am fixing our fileserver, which has an improper disk config. A colleague and I booked a trip to Geneva for 12/2 for 10 days. We finished getting everything out the door for that work, aside from the f5 cluster, which we are still working on. We swapped our two production boxes out for a loaner on Wednesday, and that went ok. We are still working on our new hosting contract in Geneva. Everyone wants no downtime, but we need time to build the environment. I didn't have the cash to buy everything, so we have to do some shuffling of equipment. Much of that is repurposing machines with ESXi and moving the old machines to VMs.

Other projects we are working on include a Microsoft licensing deal, new web conferencing, office videoconferencing, a new ticketing system, and some project planning.

Friday, November 14, 2008

Bad bad week, kinda…

From REALLY BAD to AWESOME :)

REALLY BAD : We had a little Exchange issue this week, which we did not recover from very well. That's fixed going forward.

SOMEWHAT BAD : On top of that my biggest project is starting to be realized, which is a datacenter move. That's going okay, but we have some interesting swapping to do. I need to move my f5s this week, and ship a whole bunch of networking gear. I have a Netapp, and lots of servers ready to be installed.

BAD: The issue is that my new contract isn't done yet, which is another thing I have to do. I also just realized we aren't going to do managed backups due to the $5k a month they want to charge us, so now I'm ordering a library as well.

BAD : Found out we opened a German office without any IT signoff, no firewall, connectivity issues, and the staff there speaks no English. Trying to get my hands on that, but it's a bad situation. Need to order some gear.

BAD : We have 4 web conferencing tools in use here, we waste a LOT of money. I want to get everyone on 1 or 2 web conferencing tools. Another annoying project to deal with.

GOOD : Just wrapped up signing a new and much better Akamai contract, which is pretty cool. Also got a LOT of purchases pushed through. Probably about 10 purchases this week for the company.

GOOD : Redoing and reassessing all of our backups due to the exchange issue in all of our offices.

KICKASS : Installed a new helpdesk app that we are evaluating and that I love: www.cerb4.com. It's awesome. I hope to get off the crappy ACE system we are using, which is like pulling teeth out of a rabid Rottweiler.

Changes at my Company

I'm now running the group as of 2 weeks ago. That means all operations, enterprise, production, networks, systems, etc. I have a good crew of guys, and I'm looking to better augment some of our staff. I'm happy to have been given the opportunity, and looking forward to making an even larger impact than I have up until this point. There are lots of things to get in order, which are eating my time on top of all the project management, purchasing, and other engineering I also have to do. It's been very crazy. I'll be posting more on some of the highlights of this week as well.

Friday, October 31, 2008

Monitoring, Reporting, and other stuff :)

Over the last week, I was sick a couple days. I used some of that time to build a cool management dashboard based on data coming from various tools:

  1. Coradiant – E2E, Host, Status Codes, Sessions, good versus errored
  2. Google Analytics – Pageviews
  3. ACE – Ticket metrics

We are also evaluating the Dejaclick product from Alertsite to enhance the basic remote reporting we are using. The tool is very slick, and the pricing is pretty reasonable for what you get. We are also going to be looking at Coradiant Truesight Edge next week to get more data from our Akamaized traffic (which most of it is).

Other than that just ordering a bunch of RAM for some boxes we are turning into ESXi machines, and also ordering some new Dell switches to run 10Gig Ethernet to the new Netapps we are putting in place.

Wednesday, October 22, 2008

Domain fun

I have been busy moving about 400 domains to godaddy. It's been a lot of fun (NOT) consolidating across our current 14 registrars. Once this is done we can start to do web forwarding and migrating onto our new DNS infrastructure. The new one is fully split, where we have a proper external, internal, and update system running. Good design, and should serve us much better.
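To show what "split" means here: the same name can answer differently depending on who is asking. The sketch below is a toy, not our actual setup. The internal ranges, names, and addresses are all hypothetical, and it leaves out the update system entirely.

```python
import ipaddress

# Hypothetical internal ranges; the real ones are not listed in this post.
INTERNAL_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Hypothetical records: each view holds its own copy of the zone.
VIEWS = {
    "internal": {
        "intranet.example.com": "10.1.2.3",
        "www.example.com": "10.1.2.10",
    },
    "external": {
        "www.example.com": "203.0.113.10",
    },
}


def pick_view(client_ip):
    """Return 'internal' for clients inside our ranges, else 'external'."""
    addr = ipaddress.ip_address(client_ip)
    return "internal" if any(addr in net for net in INTERNAL_NETS) else "external"


def resolve(name, client_ip):
    """Look the name up in the view matching the client; None if absent."""
    return VIEWS[pick_view(client_ip)].get(name)


print(resolve("www.example.com", "10.4.5.6"))
print(resolve("www.example.com", "198.51.100.7"))
print(resolve("intranet.example.com", "198.51.100.7"))
```

The point of the design is the last lookup: internal-only names simply do not exist for outside clients.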

Enterprise Fun

I am rewiring a bunch of servers, and reconfiguring a large fileserver on Friday. We need to repartition an 8TB volume into smaller slices. I'm going to use the knoppix CD and some of the tools on that distro to move and recut the NTFS partition so that we can run VSS.

Progressing on the Exchange 2007 migration, everything seems to be 100% now. All we have to do is get some of the mailboxes small enough so we can move them over properly. We are shooting for under 500MB, too bad some of our bad users have 7G mailboxes :(

Pain.
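For the mailbox cleanup, the first step is just knowing who is over the line and by how much. A minimal sketch, assuming the sizes have already been exported into a dictionary of megabytes per user (the user names and sizes below are invented):

```python
LIMIT_MB = 500  # the target size mentioned above


def over_limit(mailboxes, limit_mb=LIMIT_MB):
    """Return (user, size_mb) pairs above the limit, largest first."""
    big = [(user, size) for user, size in mailboxes.items() if size > limit_mb]
    return sorted(big, key=lambda item: item[1], reverse=True)


# Hypothetical mailbox sizes in MB.
sizes = {"alice": 120, "bob": 7168, "carol": 640, "dave": 499}

for user, size in over_limit(sizes):
    print(f"{user}: {size} MB, needs to shed {size - LIMIT_MB} MB")
```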

Wednesday, October 8, 2008

Outages

Our site has been having a lot of issues, so we offloaded the image serving onto its own 2 boxes. We also added another 15 spindles on the production DB server, which seems to be helping a lot. We are seeing higher traffic volumes on our Coradiant reports, and we are also seeing lower latency. Good work by the whole team to fix the environment.

Licensing

Working on some licensing for Redhat as well for our new environment. I don't really like dealing with Redhat, but in order to have a good supported OS we have no choice. They are already using it in some places. We were deciding if we were going to switch to CentOS, but figured it was smarter to have the fallback on a vendor.

Vmware and Firewalls

We've been pushing more esxi here, and we are going to run our remote offices off a single esxi host running a domain controller, exchange server, and vpn server. We did the same setup for both Shanghai and Geneva offices. We are going to ship them with the new firewalls which should be a good setup, and easy to manage.


 

Friday, September 26, 2008

Tons of progress this week – VM, Exchange, Firewalls, Security, Storage/NAS

Had a very productive and busy week. Between server crashes taking down the production site and the build-out project we are working on, the team was very busy. I built over a dozen VMs, and 3 ESXi boxes.

I also worked on some of the config to finish out the main Exchange 2007 implementation. We are waiting for ESXi boxes to ship to the remote offices, which will house a mailbox server, a domain controller, and possibly a client access server.

One of my colleagues (who knows the setup of the WAN well) is working on the configuration and testing of our new firewall infrastructure. We are putting in all Sonicwall NSA series appliances. We have a 2400 for Shanghai, and 3500's for Atlanta and Geneva. The production environment will be off a pair of HA 4500 series boxes. The features, ease of setup, and price were excellent. We also have good resources from the reseller and the vendor if we need them.

We are also running a pilot (which has been going quite slowly) of the Q1labs qradar product. I have used this product at previous companies, and it's been a great tool for security, network analysis, and troubleshooting. The problem is that running the pilot alongside these huge projects that need to be done over the next few months isn't really feasible. I hope to invest more time in testing it, but my priorities are dictating otherwise.

I also got rid of a couple older boxes this week by doing P2V using VMware converter starter edition. Now the enterprise ESXi boxes are pretty much full, so I need more disk space in order to do any more work. That should be happening as we get the NAS implemented.

We wrapped up the Bluearc v. Netapp stuff, we are going with a couple of Netapps. The solution looks very good, and we should have it up in a couple weeks. I'm looking forward to building out some Exchange, MSSQL, and ESXi clusters using it!

Talk to you all soon, have a nice weekend. Loving the weather in Georgia :)

Tuesday, September 16, 2008

Updates on some projects – NAS/Firewalls

Senior management has given us a date of December to have our current production hosting here in Atlanta (colo, downtown) moved to Geneva (hosting facility we have there). In order to do this we'll need to virtualize a lot, and build out a much better network and storage infrastructure. This will also help us in the future as we grow the business.

We pretty much decided at this point that Bluearc is a better solution for us. It was very close between them and Netapp, but it was a matter of performance over more advanced software. We are looking to wrap up the deal soon and get the purchase completed.

We have also decided to go with a Sonicwall solution at 5 locations. The product looks very good, and will enable us to have a proper VPN mesh between all sites. We will also be replacing some of our standalone web and spam filters with the new NSA UTM device.

Making some plans to move our Sharepoint and Spam filter from our colo to the corporate office.

Not much else going on at the moment…. That's the update.

Friday, August 29, 2008

Week updates

  1. Been working on an F5 which is having issues. I was going to reload the OS, but I think I can fix it. Still working on it. Waiting for a reply from support.
  2. Finally figured out a way to export from a Sharepoint collection and import into the root collection. It's something I'll need to do before we can reorganize the structure of our Sharepoint here. I should have this finished early next week. Have to schedule some downtime to do the export/import.
  3. Got my first ESXi machine, with the embedded hypervisor. Jamie and I will rack it up and get it running on Monday, then we should be good to start loading the backlog of about 5 VMs.
  4. Should be making a decision on NAS and Firewalls next week. Have to nail down the specific configs and pricing.
  5. Threw SQL 2008 (DB, SSAS, Report server) on my development box, seems pretty cool so far.
  6. Completed the implementation of DB monitoring and machine monitoring on production and enterprise hosts.

I think that covers this week's major project work. Still also doing some of the day-to-day work, fixing things as needed.

Wednesday, August 20, 2008

Wireless networks using certificates painful

Another annoying project is hooking up a few hacked dd-wrt boxes to my Active Directory CA by using IAS. Still doing some testing, and I'll post my findings and details once the team figures out why Windows XP doesn't seem to want to work (we are trying SP3 now). Works like a charm on Vista :)


 

Love Vista.

Sharepoint annoyances

Having a bit of a time trying to fix up this mess of a Sharepoint site. First thing, we got our 3GB database off SQL Express. Now that we are on Standard, I need to fix up the database itself. A co-worker (Steve) and I are having a tough time figuring out how to restructure the site, since it was set up in a very odd manner. I think we have some good ideas, so let's hope we can get a good structure to it. I would post the hierarchy we are moving towards here, but I probably shouldn't :)

Storage/NAS Evals

We are looking for a strong NAS system, I'm leaning towards a clustered scale up/out system versus buying a box that I have to replace every 4 years. I think Isilon is the proven leader in this space, and we've selected them to go up against the king (Netapp). Here are the criteria we used to get it down to these two. We are looking into them in depth now.


 

| Requirement     | Sub-Feature                                               | Weight |
|-----------------|-----------------------------------------------------------|--------|
| Architecture    |                                                           |        |
|                 | Centralized Management                                    | 6      |
|                 | Clustered Device (NAS, Controller, and Power)             | 10     |
|                 | Appliance Model                                           | 8      |
|                 | Add storage with no downtime                              | 10     |
|                 | Add bandwidth and nodes without downtime                  | 10     |
|                 | Add cache without downtime (increase IO)                  | 8      |
|                 | Auto balance IO across disks and connections              | 10     |
| Security        |                                                           |        |
|                 | AD Integration                                            | 10     |
| Data Management |                                                           |        |
|                 | Tiered design 2 or 3 tier                                 | 8      |
|                 | Migration of data based on usage and other vars           | 8      |
|                 | Content aware                                             | 5      |
|                 | Ability to snapshot multiple times, and replicate a snap  | 10     |
|                 | Rapid snapshot                                            | 6      |
|                 | NDMP support                                              | 5      |
|                 | Backup Exec Support                                       | 10     |
| Connectivity    |                                                           |        |
|                 | CIFS                                                      | 10     |
|                 | NFS                                                       | 10     |
|                 | Replication at file/block                                 | 8      |
|                 | Optimization for WAN replication                          | 10     |
|                 | Namespace virtualization and migration of namespaces      | 7      |
| Management      |                                                           |        |
|                 | Reporting via Web                                         | 8      |
|                 | Email reporting                                           | 6      |
|                 | Usage/compliance reporting per user (AD integration)      | 6      |
|                 | Monitoring and alerting of issues                         | 8      |
|                 | Thin provisioning                                         | 5      |
| General         |                                                           |        |
|                 | Ease of administration                                    | 10     |
|                 | Ease of use                                               | 10     |
|                 | Configuration and setup                                   | 8      |
|                 | Documentation quality                                     | 5      |
|                 | Speed of client                                           | 8      |
|                 | Resources used by client                                  | 8      |
| Company         |                                                           |        |
|                 | Viability                                                 | 10     |
|                 | Support                                                   | 8      |
|                 | Price                                                     | 10     |
| Total Score     |                                                           |        |
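For anyone curious how a weighted matrix like this turns into a single number per vendor, here is a rough sketch. The weights are a few real ones from the criteria above; the 0 to 5 rating scale, the vendor names, and the ratings themselves are invented for illustration and are not our actual grades.

```python
def weighted_score(weights, ratings, max_rating=5):
    """Return the weighted score as a percentage of the best possible score."""
    earned = sum(weight * ratings.get(feature, 0)
                 for feature, weight in weights.items())
    possible = sum(weights.values()) * max_rating
    return 100.0 * earned / possible


# A handful of weights taken from the criteria above.
weights = {
    "Add storage with no downtime": 10,
    "AD Integration": 10,
    "NDMP support": 5,
    "Thin provisioning": 5,
}

# Hypothetical 0-5 ratings per vendor (assumed scale, made-up values).
ratings = {
    "Vendor A": {"Add storage with no downtime": 5, "AD Integration": 4,
                 "NDMP support": 3, "Thin provisioning": 2},
    "Vendor B": {"Add storage with no downtime": 4, "AD Integration": 5,
                 "NDMP support": 5, "Thin provisioning": 4},
}

for vendor, marks in ratings.items():
    print(f"{vendor}: {weighted_score(weights, marks):.1f}%")
```

Normalizing to a percentage keeps the totals comparable even if a vendor could not be rated on every line; a missing rating simply counts as zero.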

  

  

Things happening - Firewalls

We decided to evaluate Sonicwall and Cisco based on our assessment. We are digging in depth over the next week or so. Here are the criteria. These are needed for both hosting/colocation and for our 3 major offices:


 

| Requirement  | Sub-Feature                                                  | Weight |
|--------------|--------------------------------------------------------------|--------|
| Architecture |                                                              |        |
|              | Centralized Management                                       | 10     |
|              | Clustered Device                                             | 8      |
|              | Appliance Model                                              | 8      |
|              | QOS management                                               | 6      |
|              | Appliance must support up to 2G (external) and 3G (internal) | 10     |
| Security     |                                                              |        |
|              | Stateful Firewall                                            | 10     |
|              | Full packet inspection and content filtering                 | 10     |
|              | IPSEC VPN Support                                            | 10     |
|              | SSL VPN Support                                              | 8      |
|              | OpenVPN Support                                              | 8      |
|              | PPTP Support                                                 | 9      |
|              | AD integration for authentication                            | 6      |
|              | Anti virus/Anti Spam/Anti spyware                            | 4      |
|              | IDS/IPS                                                      | 9      |
|              | Enforce desktop patchlevel and AV, Quarantine user           | 4      |
|              | Wireless security                                            | 4      |
|              | Behavioral analysis                                          | 7      |
| Management   |                                                              |        |
|              | Reporting via Web                                            | 8      |
|              | Email reporting/Monitoring and alerting of issues            | 8      |
|              | Usage/compliance reporting per user (AD integration)         | 6      |
|              | Backup/Restore                                               | 8      |
|              | Upgrades                                                     | 8      |
| General      |                                                              |        |
|              | Ease of administration                                       | 10     |
|              | Ease of use                                                  | 10     |
|              | Configuration and setup                                      | 8      |
|              | Documentation quality                                        | 5      |
|              | Speed of client                                              | 8      |
|              | Resources used by client                                     | 8      |
| Company      |                                                              |        |
|              | Viability                                                    | 10     |
|              | Support                                                      | 8      |
|              | Price                                                        | 10     |
| Total Score  |                                                              |        |

  

  

Wednesday, August 13, 2008

Missed a couple things

I'm moving us onto SQL Standard from SQL Express, which is what the corporate intranet site is running on. That upgrade was tested, and I'm doing production today. I've also set up some proper backup jobs for the database.

Starting tomorrow, another engineer here and I are redoing the Sharepoint structure, permissions, etc. It should be a lot clearer, and easier to understand what people are up to.

I haven't done a lot of Sharepoint administration, so it should be a good learning experience!

Recap of week 1.25 :)

So first week on the new job, and making some good progress. I am learning the infrastructure and some issues that have been bothering us. We have done the following items:

  1. Monitoring
    1. Redid the Coradiant Truesight setup to better catch items and view backend information. Got visibility to additional network areas.
    2. Implemented Solarwinds IPMonitor. We are installing it at the colo and at our enterprise office.
    3. Testing Idera DM, deciding if it will work for us. We need better DB monitoring and diagnostics.
  2. Infrastructure planning
    1. Did initial grading of clustered scale up/out NAS solutions. I will post more details as the project progresses.
    2. Did requirements for new firewall solutions, still have yet to nail this down and grade them.
    3. Built plan around fixing exchange, and moving to a multi-site international infrastructure on Windows 2008 and Exchange 2007.
    4. Started planning a DNS revamp, and proper split domain configurations.
    5. Working on a new wireless implementation as we speak, using dd-wrt and integration into AD with WPA.
  3. Ops
    1. Debugged issues going on with production website.
    2. Reviewed and did some updates on the Akamai configuration.
    3. Implemented VMware environment for HP Quality Center testing, have yet to have QA fully test the buildout.

I learned about the platform, and the upcoming new version of the platform. We also started looking at NBA/IDS systems such as Mazu and Q1Labs. It's not high priority, but it would help a lot with security, and with the ability to diagnose network issues and non-http issues. I really wish Coradiant would view SQL response time…. One can dream.

Tuesday, August 5, 2008

The fun begins

I had a smooth 15.5 hour drive down to Atlanta from Boston on Saturday. I have to do it again with my girl, cats, and other car in a couple weeks. Just working on renting my place in Boston, should be done in a couple days.

I got my place on Sunday, which is awesome. It's on Grant Park, and it's huge and empty. The city is great, and I'm doing a reverse commute.

I've been digging in on some things (yes, it's only day 2)

  1. Looked at some network issues, but I haven't determined anything really yet.
  2. Sharepoint migration plan from SQL Express to SQL Standard.
    1. Temporary VM setup for testing
    2. Purchasing a proper VMware box for real dev VM
  3. Looking at some NAS vendors, and putting together a requirements plan. I will post that once we nail it down today or tomorrow.
    1. Support for Virtualization (ESX clusters)
    2. Exchange clustering
    3. SQL clustering
    4. Located in Geneva development, Colocation, and Corporate office for DR/replication of data.
  4. Looking at deploying www.dd-wrt.com and integrating our wireless into AD with enterprise WPA. Should be fun and easy to do.

Thursday, July 24, 2008

NAS NAS NAS

I've only really dealt with a couple scalable NASes, and the EMC Celerra was a nightmare for many reasons. I always did like Netapp, they work well and are easy to manage. Now there are the "next gen" NASes out there, and at the new gig we need something good, which can scale easily and run clusters off MS and Linux technologies. I was looking at the following "cool" vendors. Has anyone used these before?

These are in order of which ones I think are coolest:

Isilon - I know we have some at my current company, and they work well.  Nice product, no idea on costs.
Onstor - Seen these guys when I was looking at 3par, product looked good, cheap, and solid.
Netapp - old faithful is something that's good with storage :)  Easy to find people who've used it.  Who hasn't?
Bluearc - seems good
Ibrix - Looks interesting since you don't need to buy hardware, but might be too "out there"
Pillar - looks okay...

Friday, July 18, 2008

News time- Moving jobs and moving locations

Now that I've told most of the people I work with, and the vendors I deal with often, I can share the news: I've decided to accept a position at a very exciting startup. The company is located in Atlanta, so I'll be moving down south. It should be interesting since, with the high growth, Atlanta is not as "southern" as it could be.

The company name is MFG.com (www.mfg.com) and it's a very interesting business. They match up companies who want to have parts manufactured with manufacturers. The kicker is that they have a good presence in Europe, and a large presence in China. It allows smaller shops to get the same benefits the big companies get from leveraging the global economy. It has some very high profile investors, and has a profitable, fast growing business.

I will be doing all kinds of infrastructure work there. There is a hot list of major issues which I hope to start to tackle right away, but I'll be moving back into a role further from the tools and technology and into general IT issues. I will use my skills to focus on system issues, application problems, network issues, and security problems. I already have to tackle some problems around Exchange, Sharepoint, and the hosting environment. There is a lack of specific kinds of tools there, and I hope to take a best of breed approach with a combination of open source and small company products (Sorry HP and IBM). There are a lot of cool products out there that I have tested while wishing I had a smaller enterprise to deal with. Now that I'll be working in a smaller environment in terms of production and enterprise, it will be a good chance to use these tools.

Since my current employer is large, public, and sensitive, I've avoided naming them.  I will be leaving there on 8/1, and starting my new position on 8/4.

Please keep reading, I will post anything that comes up in the meantime, and should resume in early August.

Thanks to you all, please leave comments, IMs, or email me!

Tuesday, July 15, 2008

A bit crazy

Sorry I haven't updated you all. I will be updating on Friday. Mostly it's to announce something which I am going to be undertaking. It should be interesting, and a bit of a shift back to my roots.

More on Friday!

Tuesday, June 24, 2008

HP Software Universe

I attended and spoke at HPSU this year. It was a good conference, and HP is making a lot of progress on the products which we are very interested in. I'm going to go into each of these and explain some of what I saw, and what we are up to with each product.

  • HP NAS
    • New UI is coming down the road.
    • Addition of bare metal provisioning.
    • Internally
      • We are moving from 7.0 to 7.2 here in the next few weeks.
      • We are just finishing up a MS-SQL Multimaster setup with the UK this week. It's been painful, and taken us 4 attempts now, but we are close.
  • HP SAS
    • Some interesting stuff with other customers, best practices, and other tips on it.
    • Internally
      • We are moving towards a small SAS deployment in 2008, probably 200-400 systems. Just firming up some budgets, since we are splitting the cost among a couple budgets.
  • HP OO
    • Tons of new content, which makes OO above and beyond.
    • Multimaster is coming down the pipe, which should be really nice. Since the platform of OO is very similar to NAS it should be easily done.
  • HP BAC/RUM
    • Lots of progress in RUM, especially generic TCP Monitoring.
    • Correlation of alarms coming into BAC.
    • Baselining and auto thresholding – REALLY GOOD TO SEE THIS IN BPM!
      • They have taken this another level higher, and they show you how it would have worked if you use the suggested thresholds.
    • Problem management and workflow ideas which should help the usage.
    • Integration of OM into UCMDB, and other feeds to and from the UCMDB.
  • HP BPM
    • Interesting customer presentation of a large distributed BPM they deploy at branch offices. Very good use case, and well managed. They deal with BPM problems in a good way.
  • HP OM
    • Sitescope integration in OM 8.1.
    • New enhancements around usability and reporting.
    • HTTP agent is excellent, and much easier to deal with than the old agent.
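The baselining and "how would it have worked" idea from the BAC/RUM notes above can be illustrated with a toy version: derive a threshold from past samples, then replay it over the same history to see what would have alerted. This is only my sketch of the concept using mean plus k standard deviations. It is not HP's algorithm, and the sample response times are invented.

```python
import statistics


def suggest_threshold(history, k=3.0):
    """Suggest a threshold: mean of past samples plus k standard deviations."""
    return statistics.mean(history) + k * statistics.pstdev(history)


def replay(history, threshold):
    """Return (index, value) for samples that would have alerted."""
    return [(i, value) for i, value in enumerate(history) if value > threshold]


# Hypothetical response times in milliseconds, with one obvious spike.
samples = [200, 210, 190, 205, 195, 200, 900, 198, 202, 200]

threshold = suggest_threshold(samples, k=2.0)
print(f"suggested threshold: {threshold:.1f} ms")
for index, value in replay(samples, threshold):
    print(f"sample {index} would have alerted at {value} ms")
```

One thing the toy makes obvious: a big spike inflates its own baseline, so with k=3.0 this same data would not have alerted at all. Replaying suggested thresholds against history is exactly how you catch that before going live.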

Thursday, June 19, 2008

HP Software Universe

I'm at HPSU this week. I will summarize the great stuff happening over here this weekend when I'm back east.

Friday, June 6, 2008

HP Software Universe 2008

I'm giving a talk at HPSU on our conversion off OVIS onto BAC, BPM, and Sitescope. Should be pretty good, and straightforward, it's taken a while to get it all baked and integrated, but things are progressing well now.

Are any readers planning on attending the show? It would be good to hear from you guys who are, and we can meet up.

Thanks!

HP – Real User Monitoring (RUM)

I like what I see from HP RUM, my issue is that the pricing and model for it don't work in a shared environment. I can't nail down each application, webserver, application server, or environment. I prefer to deploy it at the edge of the datacenter and just use it as needed. This is how we deploy and use our current real user monitoring from Coradiant. HP has been trying to be flexible, and create a pricing model that works for us over the last 6 months, but we haven't made the kind of progress that I need to see.

We can't test or deploy a solution which isn't cost effective compared to what I am getting from my current vendor. I don't want to POC the product if it doesn't make financial sense. Hopefully at HP Software Universe we'll make more progress, and enable us to test the product. Ultimately the solution is very compelling and fits nicely with our usage of Business Availability Center (BAC) and related products.

Tool Replacements

I have a couple of short term tool replacements we are gearing up for, to normalize toolsets across the company. We are an HP NNM shop, and we've decided to finally get rid of it, due to the numerous problems we've been dealing with over the past several years. The amount of work that this tool needs is immense. This is why HP is doing a ground up rewrite of it for 8.0. We can't wait for it, since it's not mature enough.

We are implementing IBM ITNM (Formerly Precision). We are also going forward with enhancing our event management by using IBM Impact. Both of these products will provide a lot of value for us in terms of efficiency and increasing the effectiveness of the event management layer we already use from IBM.

We are also looking to get HP SAS in for a small install which will scale out. It's looking like 400 systems to start with. More on that as we nail down the project and timelines. More on the HP side, we are also looking to scale out HP OO and start using it across the company for larger projects.

Friday, May 23, 2008

IBM Pulse 2008 - Review

I spent Monday-Wednesday at IBM Pulse in Orlando. It was a good show, but quite a few of the sessions were full when I arrived. It was frustrating because they didn't offer them more than once. The morning sessions were mostly pie in the sky, and not very useful to me. I got to spend a lot of time with senior people in engineering, architecture, and acquisitions/strategy. I also got to meet people I knew from online or other dealings with IBM. Overall, the show was a good use of my time, and I found it enjoyable.

Here are some of my highlights:

  • ITM 6.2.1 improvements including agentless capabilities and such.
  • New reporting framework based on BIRT which will be rolling forward.
  • New UI which is being pushed and was on display from TBSM 4.2.
  • Hearing about what other customers are up to (mostly bad decisions from what I've seen).
  • Affirmation of ITNM (Precision) as a best of breed tool, with an excellent roadmap.

Some things which are bad and make no sense:

  • Focus on manufacturing (due to MRO).
  • Pushing the MRO platform as IT, when it's clearly not ready for prime time.
  • Pushing TPM as datacenter automation. TPM is a worst of breed product in every way, and is years behind HP SAS and Bladelogic. The product will never catch up.
  • Allowing people to abuse Omnibus by turning it into a monitoring tool, and not a MOM or event manager.
  • Lack of clarity in strategy for Webtop vs TBSM (customers seem to build BSM on Webtop).
  • ITCAM is total junk, and is years behind competing products from HP (Mercury).

Week in review

Been working out some organizational stuff with my counterpart at the company we purchased. I've also been spending time figuring out the future toolset we will be using for server/app monitoring, event management, network monitoring and management. It will be a combination of our toolsets based on what makes sense. Automation is a huge goal for us of course.

Need to put together some documentation on what we have, and then complete a strategy in the next 4-5 weeks.

Wednesday, May 14, 2008

Conference Updates

I am speaking at HP Software Universe in June in Vegas. It should be interesting. I am talking about our conversion off OVIS onto HP Sitescope, HP BPM, and HP BAC. We are very happy with the new products, and most of all my monitoring customers absolutely love the visibility and clarity these new tools give them.

I am attending IBM Pulse as well in Orlando this weekend. This is my first IBM show, and I'm very interested in ITNM (which we are moving to from NNM in the next 12 months), Omnibus, Webtop, and some of the other IBM tools we use. The show doesn't seem as well put together as shows from CA, HP, or EMC. We'll see how it runs. The venue is definitely questionable… I'm not a big fan of the Disney junk :)

Been a little out of it – Updates on my job

Once again I've been quite busy dealing with the integration of a large company we purchased. They have a lot of major toolset issues which need to be worked through. I feel there are a lot of architecture problems with what they are using for monitoring. The project is 18 months old and not progressing well, yet they are still sinking money into a sinking ship. It should be interesting to see when they want to scale this back, or actually design a portfolio which is best of breed, versus trying to implement a full solution from a single large vendor. Should be a matter of time…. Then they will ask for my help and I'll be willing to assist. I don't want to be the architect when I can't make major changes to a design which has a lot of flaws.

I am interested in potentially running a transformation group, which would include global reach of automation, virtualization, etc. This would be a new area for us, and we need to get serious with products like HP OO, SAS, and NAS. Leveraging automation across the board would be a huge cost and time saver.

Other than that, not sure where I am off to.

Several close friends and excellent co-workers have been leaving my company. My thoughts are with you in your ventures; they sound like excellent moves which will further your learning and broaden your horizons. I hope to work with you again, and we'll be in touch on LinkedIn or Plaxo!!!

Several people I work with at vendors, whose software has been a great asset and learning experience, have also moved on to smaller up-and-coming companies. I wish them luck, and I hope to work with them again in the future.


 

Friday, April 25, 2008

Travels

Been travelling a bit the last week, and I am off to London for most of next week to work on strategy and organizational structure. I was offered a seat at IBM Pulse, but I don't think I'm going to make it. I will still be attending HP Software Universe in mid-June, which should be interesting given the advances in the HP products from Mercury and Opsware integrations.

Toolset Changes

As many of you know, we just completed a large purchase of another company, so in the next few weeks I will know what I am going to be doing. In the meantime we are figuring out how to converge our toolsets. There will be more activity here as we work through the issues and determine the final set of products we are going to be pushing out across the universe of the company. Based on some of the deals we make with IBM and other vendors it could also create other areas where we can standardize.

More to come in this area.

IBM POTs

This week we were offsite at the tech center in NYC for a day trip. We looked at IBM Tivoli Provisioning Manager (TPM), the provisioning and deployment product. It is one of the products we are considering standardizing on. I wanted to get a clear picture of what it can and cannot do versus Opsware SAS. The product looks good, but I still need to write up the full gap analysis. It definitely would meet most of our patching, inventory, and deployment requirements, but it doesn't cover the system administration or the complex audit and control requirements we are given due to customer audits and regulatory compliance.

Last week IBM brought us a POC for IBM Tivoli Monitoring (ITM), which is the monitoring platform. It compares to HP Openview Operations (OVO). We are going to bring it in house to do more testing, but after an initial day with the product, we found the following:


 

Issues in POC:

  1. The agent and the server each died multiple times. Nothing indicated the error, and recovery required a manual restart.
  2. The POC did not cover agent installation.

Environment:

    Pros to ITM:

  1. Reporting is nicer, and based on open standards.
  2. Multiple servers roll up into a single TEMS more easily than in OVO.
  3. More flexible on the operating system, database and platform the components can run on.
  4. IBM is quicker to support new component versions (OS, application server, etc.).

    Cons to ITM:

  1. Email notifications outside of event escalation are not manageable, aside from using command line calls with email addresses as arguments.
  2. Scenarios applied to groups are not easily manageable, meaning you have to manage the policy in a lot of notifications.
  3. The UI is not as easy to use; there are fewer wizards to guide the engineer through the workflow of making a change or implementing something new.
  4. Everything seems to run as a separate agent. So you will have a Windows OS agent, a Universal Agent, a Custom Agent, etc., all running as separate services and processes.

Friday, April 4, 2008

Realtime market data systems

I don't really talk about this much, but we deliver a lot of real-time market data to thousands of customers. Part of my responsibility is running a group which monitors the real-time environment, which involves very strange things like multicast and other abnormal requirements that most standard products don't deal with. A good example is tracking the exchange holidays and the open and close times for each of the 200+ exchanges whose feeds we bring into global POPs.
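To make the exchange calendar problem concrete, here is a minimal Python sketch of the kind of check a standard monitoring product doesn't give you: only alert on a silent feed if the exchange is supposed to be trading. This is not our actual tool. The `ExchangeCalendar` class, the `should_alert` function, and the sample hours are all made up for illustration, and the fixed UTC offset in the usage lines ignores daylight saving time.

```python
from dataclasses import dataclass, field
from datetime import date, datetime, time, timedelta, timezone, tzinfo


@dataclass(frozen=True)
class ExchangeCalendar:
    """Hypothetical per-exchange trading calendar (illustration only)."""
    name: str
    tz: tzinfo                      # local time zone of the exchange
    open_time: time                 # local opening time
    close_time: time                # local closing time
    holidays: frozenset = field(default_factory=frozenset)  # set of date objects

    def is_open(self, moment: datetime) -> bool:
        """Return True if the exchange should be trading at `moment`.

        `moment` must be timezone-aware; it is converted to exchange local time.
        """
        local = moment.astimezone(self.tz)
        if local.weekday() >= 5:            # Saturday or Sunday
            return False
        if local.date() in self.holidays:   # exchange holiday
            return False
        return self.open_time <= local.time() < self.close_time


def should_alert(calendar: ExchangeCalendar, moment: datetime, feed_is_silent: bool) -> bool:
    """Only raise an alarm for a silent feed while the exchange is open."""
    return feed_is_silent and calendar.is_open(moment)


# Example usage with invented values (fixed UTC-5 offset, no DST handling).
example = ExchangeCalendar(
    name="EXAMPLE",
    tz=timezone(timedelta(hours=-5)),
    open_time=time(9, 30),
    close_time=time(16, 0),
    holidays=frozenset({date(2008, 1, 1)}),
)
now = datetime(2008, 1, 2, 15, 0, tzinfo=timezone.utc)  # 10:00 local, a Wednesday
print(should_alert(example, now, feed_is_silent=True))
```

With 200+ exchanges the calendar data itself is the hard part; the check is trivial once the hours and holidays are kept current.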

The infrastructure has more changes by development than anything else at the company. Figuring out the proper "state" is next to impossible, which makes monitoring a challenge to say the least. They have a custom tool, and we are working with the developers to get a web services interface on it so we can better understand state before presenting false alarms to the Realtime operators.

We are cleaning up the rules by pushing the responsibility onto developers to write the proper rules. This should fix things as we audit the existing 75,000 rules in the custom monitoring tool. Going forward the rules are required with each software release.

Part of the problem with all of these custom, old, homebuilt, somewhat crappy tools is that we need to extract non-standard metrics from them and use them for capacity analysis. Aside from the standard system and network capacity planning, there also has to be software capacity planning. The team generates a monthly report, which is very large and complex and takes anywhere from 60-100 man hours to create. I want to automate the report more, or build some kind of self-service reporting or BI portal. These are initial thoughts, but something we need to start discussing.
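As a rough idea of what automating one piece of that report could look like, here is a small Python sketch that fits a straight-line trend to monthly peak values of some software metric and estimates how many months remain before it reaches a known limit. The function name `months_until_limit`, the metric, and the numbers are all invented for illustration; a simple linear fit is an assumption on my part, not what the team does today.

```python
from typing import Optional, Sequence


def months_until_limit(monthly_peaks: Sequence[float], limit: float) -> Optional[float]:
    """Estimate months from the latest sample until the trend reaches `limit`.

    Fits an ordinary least-squares line through (month index, peak value).
    Returns None when the trend is flat or falling, and 0.0 if the fitted
    trend has already reached the limit.
    """
    n = len(monthly_peaks)
    if n < 2:
        raise ValueError("need at least two monthly samples")

    mean_x = (n - 1) / 2
    mean_y = sum(monthly_peaks) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(monthly_peaks))

    slope = sxy / sxx
    if slope <= 0:
        return None  # no growth, so the limit is never reached on this trend

    intercept = mean_y - slope * mean_x
    crossing = (limit - intercept) / slope   # month index where trend hits limit
    return max(0.0, crossing - (n - 1))      # measured from the latest sample


# Example with invented numbers: peak messages per second over four months.
print(months_until_limit([100, 200, 300, 400], limit=700))
```

Something like this per metric, fed from whatever the custom tools can export, would be a first step toward a self-service view instead of a hand-built report.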

Have a good weekend, please leave comments!