Recap of week 1.25 :)

So, first week on the new job, and I'm making good progress. I am learning the infrastructure and some of the issues that have been bothering us. So far we have done the following:

  1. Monitoring
    1. Redid the Coradiant TrueSight setup to catch more issues and show backend information. Gained visibility into additional network areas.
    2. Implemented SolarWinds ipMonitor. We are installing it at the colo and at our enterprise office.
    3. Testing Idera DM to decide whether it will work for us. We need better DB monitoring and diagnostics.
  2. Infrastructure planning
    1. Did initial grading of clustered scale up/out NAS solutions. I will post more details as the project progresses.
    2. Wrote requirements for new firewall solutions; we have yet to nail these down and grade the candidates.
    3. Built a plan for fixing Exchange and moving to a multi-site international infrastructure on Windows Server 2008 and Exchange 2007.
    4. Started planning a DNS revamp with proper split-domain configurations.
    5. Working on a new wireless implementation as we speak, using DD-WRT with WPA and integration into AD.
  3. Ops
    1. Debugged ongoing issues with the production website.
    2. Reviewed and made some updates to the Akamai configuration.
    3. Implemented a VMware environment for HP Quality Center testing; QA has yet to fully test the buildout.

I learned about the platform and the upcoming new version of it. We also started looking at NBA/IDS systems such as Mazu and Q1Labs. It's not a high priority, but it would help a lot with security and with diagnosing network and non-HTTP issues. I really wish Coradiant could show SQL response time… One can dream.
