Living in the IT Petri Dish
It occurs to me that problems in an IT infrastructure are like germs spreading in the Petri dish that is a kindergarten class. Germs grow and grow, kids get the sniffles, and then the entire class gets sick. Ultimately, the illness spreads to everyone in the school and at home (yes, I have young children). The same thing happens with applications that share infrastructure. One server operating in an unhealthy manner can, and usually does, impact its neighbors. This “spreading of germs” has only become more violent and unpredictable with the advent of server virtualization and shared storage.
It only takes one “unhealthy” VM to spoil a cluster, or one poorly performing VM to start others pinwheeling around based on simple thresholds.
How do you determine the root cause? If you are simply looking at CPU, memory, or disk utilization, that is equivalent to waiting for the kid in the class to have hives and a 102-degree fever before you start to take action. And when you do move the kid, how many more are already “unhealthy”? But in a VM world we move the unhealthy VM to another server, which is equivalent to keeping the sick child in the classroom for the rest of the day. So many more might become “unhealthy,” or at the very least annoyed … sound like the day a few people called and said, “My application is slow today”?
If it doesn’t fit there, we move it again to where it might fit more nicely. Isn’t the definition of insanity doing the same thing over and over again? Couldn’t we see that if the kid is complaining, maybe he should get checked out and fixed? OK, enough of the analogy. VMs don’t complain; end users do. So how do you know when workloads shift? What is a normal workload? What workloads play nicely together? Is it a heavy CPU, memory, or I/O workload? What combination is right? Is the VM performing unnatural acts … servicing more workload than it was spec’ed for? More on that next time.
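In the meantime, here is a rough sketch of the difference between waiting for the fever and noticing the sniffles. It is only an illustration under my own assumptions: the function names, the 90% threshold, the five-sample window, and the sample CPU numbers are all made up for this example, and a real environment would need something far more careful about what “normal” means.

```python
from statistics import mean, stdev


def static_alert(samples, threshold=90.0):
    """Index of the first sample at or above a fixed threshold, else None.

    This is the "wait for the 102-degree fever" approach.
    """
    for i, value in enumerate(samples):
        if value >= threshold:
            return i
    return None


def baseline_alert(samples, window=5, sigmas=3.0, min_spread=1.0):
    """Index of the first sample that strays from its recent baseline, else None.

    The baseline is the mean of the previous `window` samples. A sample is
    flagged when it differs from that mean by more than `sigmas` standard
    deviations. `min_spread` is a hypothetical floor so that a perfectly
    flat history does not flag every tiny wiggle.
    """
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        spread = max(stdev(history), min_spread)
        if abs(samples[i] - mean(history)) > sigmas * spread:
            return i
    return None


# Hypothetical CPU utilization (%) for one VM, sampled at regular intervals.
cpu = [20, 22, 21, 19, 20, 21, 45, 60, 75, 92]

print(static_alert(cpu))    # 9: fires only once the VM is already very sick
print(baseline_alert(cpu))  # 6: fires when the workload first shifts
```

With these made-up numbers, the baseline check notices the shift three samples before the fixed threshold does. That does not answer which workloads play nicely together, but it does show why raw utilization thresholds tend to speak up late.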
2 Responses to “Living in the IT Petri Dish”
Did you just find a way to compare the germ farm that is a room full of kids (think daycare, think 1st grade classroom, think 3-year-old birthday party, etc.) to virtual machines? Each is needy, resource hogging, possibly contagious, etc.
It’s a great analogy – I just want a real life version of DRS to pluck the misbehaving children away from the others.
Agree … very much agree … at least part of our life has DRS!