Monday, November 28, 2016

vSphere (and Some Other Products) Upgrade Notes

Recently, a lot of my customer are planning, doing, or have just done vSphere upgrade. Mostly due to vSphere 5.1 which already in end of general support phase per 24 August 2016. Technical guidance will still be provided for vSphere 5.1 until 24 August 2018 (For a complete important date on your product support phase, please check this VMware product lifecycle matrix.), but please note that no more security patches or bug fixes will be released for vSphere 5.1 in the future, unless stated otherwise. Other than that, during technical guidance phase, support request will only be given to low-severities issues on supported configuration only as stated in this VMware lifecycle policies. This is my personal notes on some information which can help in planning VMware environment upgrade.

Monday, November 14, 2016

Why Guest OS Task Manager is Showing Different Value Compare to vSphere Performance Monitor?

Demystifying CPU States in vCPU World

Have you experienced a situation where your guest OS task manager is showing different value compare to vSphere performance monitor? Or you get a request for additional vCPU from the application team which uses your VM because they see their VM utilizing almost all vCPU they have, but when you check vCPU usage of that VM in your vSphere web client, it only shows low utilization? Is there something wrong? Before you think there's something wrong with vSphere performance monitor, read this article to understand what's causing that situation.

Figure 1. Windows task manager shows ~100% CPU Utilization
Before going further, let me first describe the situation clearer. Figure 1 and 2 are coming from the same Virtual Machine, perf-worker-01b. The first figure shows Windows Task Manager where the CPU utilization is hitting 100% for most of the last 4 minutes. The second figure shows vSphere performance monitor which taken about the same time as Figure 1, and this figure reveals that for the last 4 minutes, VM CPU usage is only around 50%. FYI, I actually ran a CPU benchmark tool on perf-worker-01b for about 30 minutes, and for most of the time in that period, Windows task manager showed 100% CPU utilization, while vSphere performance monitor showed around 50% CPU usage. Why vSphere performance monitor only showed 50% CPU usage when Windows task manager showed ~100%?

Monday, November 7, 2016

Quickly Identify Whether My Virtual Machines Get All CPU Resources They Need


One of the capability brings by virtualization is the ability to run several virtual machines in one physical machine. This ability may lead to something called over provisioned, where we provisioned resources to VMs more than what we have in the physical layer. For instance, we can create 20 VMs, where each has 4 vCPUs - in total of 80 vCPUs provisioned, while the server we used only has 20 CPU cores. Wait.. wait.... If we only has 20, how can we give 80? How can we give more than what we actually had? Actually the answers is one of the reason why virtualization rose in the first place: most of our server has - in average - low CPU utilization and each server has different time in experiencing peak and low utilization. VMware vSphere manages how VMs get their turn utilizing physical CPU resources in efficient and fair manner by a component called CPU Scheduler. In simple word, CPU Scheduler is like traffic light. It rules who may go, or in this case who may use the physical CPU resources, and who need to stop and wait. More about CPU Scheduler can be found on this CPU Scheduler Technical Whitepaper.

Using the analogy of traffic light, we know that at one time, the number of cars can go will be defined by numbers of lanes available. If the road has 4 lanes, then only maximum of 4 cars can pass at the same time, other cars will queue behind the first row. This is also true in virtualization, even though we can do over provisioning, what CPU scheduler can schedule at a time will be limited to how many  logical CPUs available on the physical server. Means, if at any one time there are several VMs, with total vCPUs more than available logical CPUs, asking for their share to use logical CPUs, then some of those VMs will need to queue. By having to queue, it will takes longer for a VM to finish its job. Now the challenge is how to identify this queue, and furthermore how to manage that queue into an acceptable timeframe. This article will try to answer the first part, while for the latter will be discussed in the future article.

Monday, October 24, 2016

VMware Virtual Machine Virtual Disk Security

Last week, during VMworld 2016 Europe, VMware announces the latest release of vSphere 6.5. You can check this, amongst other things being announced, on this press release. One area of improvement for vSphere 6.5 is in virtual infrastructure security which you can read here. What interests me related to the new security features is VM encryption, as some customers which I met asked about this capability. So I dug out an old post which originally was a personal notes I wrote back in 2014 about some points of discussion regarding virtual disk security, and modify it to be relevant with the recent announcement. 

OK, let's understand the problem first. Remember one of the characteristic of virtualisation? Encapsulation. In other word, VM basically is only a set of files. If those files happened to be walked out the door, then people can mount it up, extract the files/information, or even have the VM up and running. Check this article if you want to get the idea on how that could be done.

You might say that if that situation happened, that means that company not applying a good security policy, and if that is the case, anything can happened, even in non virtualise world. Well you got that right, but let's see what we can do to prevent that situation, how VMware able to cater this situation, how VMware can make sure that if virtual disk leakage happened, the person who have it could not take advantage from it.

Monday, October 17, 2016

[lunar.lab] Build My Lab Network Using VyOS

Am trying to build my own lab. The idea is to have three "virtual datacenter" as described in the following figure. Datacenter A and datacenter B would be two independent datacenter, where later I can simulate DR failover, workload mobility, stretch network, etc across those two datacenter.  Each datacenter will have their own ESXi hosts and vCenter. Datacenter C is where I keep shared services which are required by either datacenter A or B, but not relevant to the test that I want to perform. Other than that, datacenter C will hosts some workload which mimic as user accessing workload on datacenter A or B. Each datacenter will have their own router, and dynamic routing should be configured between those 3 datacenter as later I want to explore NSX multi site capabilities. You can see the network and address that I plan to use on the following figure.