Monday, November 29, 2010

VMware Troubleshooting course review.

A review of the VMware troubleshooting course. This is a 4 day course, the very notion of which I find interesting. Invariably the “troubleshooting” module on almost every course I’ve ever done falls on the Friday afternoon, as the last or last but one module, when everyone wants to get away.

Yet troubleshooting is arguably one of the most important tasks of an IT pro. It’s when the spotlight falls on you (along with blame). But it’s hidden away on courses, as though nothing ever goes wrong. I know it’s partly a time constraint, but still, it’s so important and yet it’s invariably glossed over.

So here we have 4 days dedicated to it. It’s also a recommended course for the VCAP-DCA (another reason for taking it, as I aim to have a stab at this in early 2011). So, onwards, into the breach (or something) …

This course took place at Global Knowledge in Wakefield, Nov 23-26 2010.

Day 1
Intros, and understanding people’s experience and expectations. We get 3 manuals: course notes, labs and a troubleshooting reference guide which outlines procedures for problems such as “vCenter Server system cannot migrate a virtual machine with vMotion”.

As ever with courses, it’s not just the content that will determine the success or not of the course. It’s also the instructor and the fellow delegates. Fortunately the instructor is Scott, who was the instructor on my fast track course for 3.5 nearly 2 1/2 years ago, and I know he knows his stuff and it will be good.

The majority of the first day modules are in essence setting you up for being able to troubleshoot. We spend time understanding and configuring vMA (the vSphere Management Assistant) and log files. vMA basically allows you to do the work in ESXi that you did in the service console in ESX (given the absence of a service console in ESXi). It can also be used with ESX. At this stage it appears to be the future for this type of work, so it’s good to spend time on it and gain an understanding of it. Plus, of course, for ESXi it can be used for logging and also for running resxtop.

Most of the labs today are standard procedural labs - i.e. you follow the instructions, and you should be good at the end of it all.

Day 2
Networking today. First it’s a review of the things VMware expects you to know but may be rusty on (again, this is where the instructor’s experience of having taught the course previously comes in). So we run through this. It’s a good refresher for me in parts, and highlights that I REALLY need to do more work with distributed virtual switches (especially to prepare for the DCA). A straw poll showed that 3 of the 7 of us on the course were using dVS in production!

More procedural labs, and then break-fix. This is basically the core of the course: the instructor has scripts at his disposal which will “do things” to your environment. You’re given a user report such as “vmotion doesn’t work”, and you fix it. I think it’s likely the instructor has various degrees of difficulty at his disposal, so the course will kind of shape and evolve to fit the needs of the class. This may mean skipping some labs, and may mean labs that are fiendishly difficult or relatively simple.

Day 3
Finish the networking module - setting up a packet sniffer, setting switches/port groups to promiscuous mode to allow it to work etc. And then more break/fix on networking.

The afternoon is management and then storage, following a similar pattern of relatively brief notes, which really go over things people already know, then some more break/fix.

Day 4
Finish storage module, and a procedural set of labs configuring different iSCSI LUNs - CHAP, digest, then adding multipathing and using claimrules.

Then it’s into the final stretch with modules on vMotion, Storage vMotion, HA/DRS, FT, DPM and general VM troubleshooting. I stay on at the end for a couple more labs as I’ve frankly been dreadful at them, and I need more.

Overall Impressions

So, lessons learnt? Well, networking really is key, and I need to do much much more with dVS - a lot of the course kind of hinges on these.

My troubleshooting was dreadful. It was kind of a mix of embarrassing and humbling really. But in a way, that was probably the BEST part of the course for me. I’ve come out of it with a good idea of the areas I really need to get focussed on, and it’s not something that I can’t overcome. I guess it’s like this: my work environment is sufficiently small and reliable that I’ve never had to truly troubleshoot the VMware setup. That could be seen as a testament to the quality of the software and also the hardware in use. Maybe a small element can be attributed to what I’ve done in the setup and maintenance. But when you don’t do something frequently (fixing things in this instance) you can get rusty.

On the other hand, if you do troubleshoot every day, the course may not be as eye opening - there was clearly one guy who excelled - often getting the problem within a few minutes.

I guess my main complaint is that we’re paired up on labs. Personally I prefer to work alone as I can work at my own pace. I tend to work quite quickly, and so find myself waiting for my lab partner, which disrupts my flow (though of course working fast is no guarantee of working smart). Plus in the context of this course, I don’t want my mistakes to disrupt my lab partner. It’s not fair on them. And on these types of courses, I like to try stuff (it’s an opportunity to do so without breaking production kit AND with the safety net of an experienced VMware vExpert to help you out when you basically have a brainfart). But that’s how the labs are designed, so, so be it. Oh, and of course I’d love to have access to the scripts for my testlab, but that’s not to be. Still, there are enough suggestions within the manuals that I’m sure I can at least work back from those and create various scenarios.

So, recommend the course? Yep, I think so. As I said above, I was embarrassed and humbled by my performance, but you need to learn from these things, and set aspirations and goals accordingly. As preparation for the VCAP-DCA, well, I’ll let you know when I’ve tried it. My suspicion is it will be valuable, if for no other reason than that it emphasises once again that you have to get hands on - build, break, learn, repeat.

Thursday, September 30, 2010

VMware Manage for Performance course (VSMP)

This is primarily from memory now as I post this, so no real “details” as such, just a personal view. I hope there’s nothing that breaches the terms of the course with regard to describing content - it’s pretty much all taken from the readily available resources.

So, I took this course in the middle of August in London. It was switched from its original venue about a week before to the Regus Broadgate Tower. I believe from what was said, it was the first time Global Knowledge had run the course in the UK - if I’m wrong about that, I apologise.

The course was small in terms of numbers - 4 others besides myself and (of course) the instructor. I prefer this, as in larger number courses, there’s a tendency for time to drift, especially during lab sessions, as people finish at different times and time is lost as chatter descends. This was still a problem to an extent here, primarily as there’s a lot to cram into the 3 days. But the larger the numbers, the larger the problem in general.

The first day starts with your standard introductions. As most times, I get the overwhelming feeling of being the small time player here, when hearing about the other environments people are in. Interestingly, one of the students, in discussing his environment, highlights performance problems they’re experiencing which he hopes to have a better grasp on upon completing the course. This kind of real world thing (in my view) can help make or break a course. Courses by and large are designed to work. Designed to a roadmap, a schedule. The labs work. Usually (not always) when they don’t, it’s a layer 8 problem.

So, as the problem is described, the instructor starts making notes about it on the whiteboard. He has an idea already what he thinks the problem will be, but over the course of the 3 days, we will return to it on a fairly regular basis, to try and piece it together.

The course structure is:

Day 1
Module 1: Course Introduction

Module 2: Performance in a Virtualized Environment
Discuss the vSphere performance troubleshooting methodology
Monitor performance using vCenter Server performance graphs and the ESX/ESXi resxtop command

Module 3: Virtual Machine Monitor
Discuss software and hardware virtualization techniques and their impact on performance

Module 4: CPU Performance
Discuss the CPU scheduler, NUMA, and CPU cache contention
Monitor key CPU performance metrics
Troubleshoot common CPU performance problems

Day 2
Module 5: Memory Performance
Discuss memory reclamation techniques and memory overcommitment
Monitor key memory performance metrics
Troubleshoot common memory performance problems

Module 6: Network Performance
Discuss the performance features of modern network adapters
Monitor key network performance metrics
Troubleshoot common network performance problems

Day 3
Module 7: Storage Performance
Discuss how storage protocols, VMFS configuration, load balancing, and queuing affect performance
Monitor key storage performance metrics
Troubleshoot common storage performance problems

Module 8: Virtual Machine Performance
Discuss guidelines for configuring a virtual machine for optimal performance

Module 9: Application Performance
Discuss what applications can be virtualized
Discuss how VMware vCenter AppSpeed manages application performance

There’s some introductory time going through performance in a virtualised environment - monitor mode, CPU hardware virtualization, MMU virtualization etc, and a general troubleshooting methodology (based on the living vSphere 4 performance troubleshooting document).

There’s a look at the GUI and the performance aspects within it, but a lot of the key time is spent working through tools such as esxtop and resxtop, as these tend to dominate the remainder of the course. Although I was familiar with the tool beforehand, it is eye-opening just how many options it has and how many situations you can apply it to. I suspect I won’t be the only person who completes this course whose first action upon returning to work will be to fire it up.

A lab usually follows a lesson, or is sometimes held back to the end of the module. The labs entailed connecting by remote desktop to your own vSphere setup, and running everything from within your RDP session. Again, it was good to me that none of it was local, as it gives it a more real world feel. Looking back through the lab book, there are 12 labs in total, and they can take a bit of time, e.g. establishing a baseline, then generating load or contention or whatever (a single threaded program in a dual vCPU VM vs a dual threaded program in a dual vCPU VM, etc). A great deal of the course is taken up with CPU and memory - I’d say between 1 3/4 and 2 of the 3 days.

Lab 1 : VMware Monitor Modes
Lab 2 : VMware Monitoring Tools
Lab 3 : Monitoring CPU Performance
Lab 4 : Diagnosing CPU Performance
Lab 5 : Monitoring Memory Performance
Lab 6 : Diagnosing Memory Performance
Lab 7 : Working with Resource Controls
Lab 8 : Network Performance
Lab 9 : Diagnosing Network Performance
Lab 10 : Monitoring Storage Performance
Lab 11 : Using VMware vscsiStats
Lab 12 : Guest Operating System Timer Interrupt Rates
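
The baseline-then-load pattern from the labs is easy to play with back at your desk. esxtop and resxtop have a batch mode (-b) that writes samples out as CSV, so one rough way to compare a quiet capture with a busy one is to average a counter in each and look at the difference. What follows is only my own sketch in Python, not anything from the course materials. The column names and numbers in the sample data are invented to keep it short (real batch output has far longer column headers), and the helper names column_means and compare are made up for illustration.

```python
import csv
import io
from statistics import mean

# Invented sample data in the general shape of a batch-mode CSV export:
# one header row, then one row per sample. Names and values are made up.
BASELINE_CSV = '''"Time","testvm % Used","testvm % Ready"
"10:00:00","20.0","1.0"
"10:00:05","22.0","3.0"
'''

LOAD_CSV = '''"Time","testvm % Used","testvm % Ready"
"10:05:00","95.0","10.0"
"10:05:05","97.0","14.0"
'''


def column_means(csv_text, wanted):
    """Return {column name: mean value} for columns whose name contains `wanted`."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    picked = [i for i, name in enumerate(header) if wanted in name]
    samples = {header[i]: [] for i in picked}
    for row in reader:
        for i in picked:
            # Skip short rows and empty cells rather than failing on them.
            if i < len(row) and row[i]:
                samples[header[i]].append(float(row[i]))
    return {name: mean(vals) for name, vals in samples.items() if vals}


def compare(baseline_csv, load_csv, wanted):
    """Return {column: (baseline mean, load mean, difference)} for shared columns."""
    base = column_means(baseline_csv, wanted)
    load = column_means(load_csv, wanted)
    return {
        name: (base[name], load[name], load[name] - base[name])
        for name in base
        if name in load
    }


if __name__ == "__main__":
    for name, (b, l, delta) in compare(BASELINE_CSV, LOAD_CSV, "% Ready").items():
        print(f"{name}: baseline {b:.1f}, under load {l:.1f}, change {delta:+.1f}")
```

With real captures you would read the two CSV files from disk and pass in whatever substring matches the counter you care about. Averaging hides spikes, so treat this as a first look rather than a diagnosis.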

Prior to the course, I’d read a review, and collected the various documents referenced, and had a read through them - some were familiar, some not so. I also have some Linux experience (not great, but enough to get by, and also to know my comfort zone). This certainly helps as most of the work is done from the shell.

There were faults, though none were showstoppers.

As alluded to earlier, time was lost during labs as chatter ensued, and labs probably dragged on longer than they needed to (I’ve never attended a course where this wasn’t the case - hmmm, maybe it’s me then). When the chatter is directly related to the course, or VMware in general, it’s not so bad, but there were times when it wasn’t, and this tends to be distracting. We basically ended up skipping one lab over the duration of the course, and there was also a problem with another lab which the instructor and I worked together to resolve while everybody else went to lunch. Not a problem, but if everyone had been a bit more focussed at that point, we may have sorted it quicker (plus the others will have missed out on troubleshooting an actual problem - always better than scripted problems). No big problem, and I suspect it’s a reflection of this being the first, or one of the first, times they’d run the course. Timing will get tightened I’m sure.

I also suspect that someone with plenty of VMware experience in larger organisations, and with experience in Linux, won’t find too much new here. A lot of the information is already out there, e.g. the performance troubleshooting document I mentioned earlier, Duncan Epping’s esxtop pages etc. But it’s often just the little nuggets, the information having been pulled together in a focused manner, the time to actually spend on things, and the shared experience and tales that are told, that help to make a course worthwhile.

As for the venue, we received no lunch (snacks, yes), so it was a case of popping out to grab a bite. Not personally a big deal to me, but I know a lot of people want the added extras when they do a course (personally I’ll sacrifice a meal for a good course).

My main nitpick, well, I understand, but, I’d *really* love to be able to have a copy of the course notes as a PDF (or any other suitable electronic format). Most (all?) courses don’t do it - for understandable reasons, but I tend to find with all courses that the notes just get tossed aside once the course is complete, because they’re not convenient. When in truth, I know there are lots of times I’d like to reach for them to check something, jog the memory. I don’t know if there’s a way to reasonably make the notes available to course attendees, but I can continue to wish.

Overall, a very good course. Certainly one of the better ones I have attended, and if you work with VMware and have sufficient experience (or the VCP) to be able to skip the Fast Track, and take this instead, it’s highly recommended. It is packed - we finished at around 4:45pm on the final day. It is one of the recommended courses for the VCAP-DCA (which I hope to feel vaguely ready to attempt at some point in 2011), and you can understand why. If your employer gives you the opportunity, take it.