
Back from vacation and diving in head first!

My wife and I had an awesome 10-day cruise in the Caribbean. I am feeling very good and back in the swing of things. I missed my great colleagues, house, and cats.

On my first day back, we saw some odd packet loss in our office around 6pm. We tracked it down to a couple of bad switches and essentially rewired the whole office. It was a good time. We also put in an order for a couple of better Dell switches that support STP, management, and port spanning. We really wanted these features, but the crappy web-managed Dell switches we had cannot do much of anything. Yesterday was a fun 16-hour day, and thank you to Jamie for working so late with me to fix the problem. We still have one issue left to fix with the router -> firewall connection, but it should be done late tonight or tomorrow morning.

We have a software push to production systems tonight. It should go smoothly, but we keep taking the risk of not letting the code settle long enough before we push it.

Also, our VM environment is ever growing; supporting our enterprise product is really becoming a major drag on the infrastructure. We still have a lot of cleanup to do on the legacy environments, but the dates for it mostly keep getting pushed back. We are starting to take too many risks on this side of things for my comfort.

We have been seeing major database growth as well, so we are doing an online volume expansion using SnapDrive and the iSCSI LUNs we host on the NetApps. We have done this before without issue, but there is always some risk involved.

I hope all the readers are doing well, and I will update soon with what's been going on.

