Monday, July 11, 2016

Setting QFullSampleSize and QFullThreshold by script in ESXi 5.1

[Repost of old entry that died under the weight of spam]

With ESXi 5.1, VMware have changed how LUN queue depth throttling is configured - see the VMware KB article "Controlling LUN queue depth throttling in VMware ESX/ESXi".

Previously these were global settings on the host, configured in Advanced Settings, and could be checked and set using PowerCLI, eg :

To check values :

Get-VMHostAdvancedConfiguration -vmhost <host> -name Disk.QFullSampleSize
Get-VMHostAdvancedConfiguration -vmhost <host> -name Disk.QFullThreshold

To set values :

Set-VMHostAdvancedConfiguration -vmhost <host> -name Disk.QFullSampleSize -value <int>
Set-VMHostAdvancedConfiguration -vmhost <host> -name Disk.QFullThreshold -value <int>

That’s no longer the case. The settings are now per LUN rather than global, so if you use them, each LUN you present needs setting, and on each host.

We use 3Par storage, and do set the recommended values :

QFullSampleSize = 32
QFullThreshold = 4

The question was asked whether these settings were retained on a host that had them, and was then upgraded to 5.1. An upgrade in one of the test labs seems to indicate they’re not - all LUNs were set to 0 for these parameters.
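One way to check the per-device values from the ESXi shell is to parse the output of esxcli storage core device list. The function below is a minimal sketch, not part of the original procedure: show_qfull is a hypothetical helper name, and it assumes each device block starts with an unindented device ID line and contains indented "Queue Full Sample Size:" and "Queue Full Threshold:" fields.

```sh
#!/bin/sh
# Sketch only: summarise queue-full settings per device.
# Reads `esxcli storage core device list` style output on stdin.
# Assumption: device ID lines are unindented, field lines are indented.
show_qfull() {
    awk '
        # Print one summary line for the device collected so far.
        function emit() { if (id != "") print id, "sample=" s, "threshold=" t }
        /^[^ \t]/                { emit(); id = $1; s = "?"; t = "?" }  # new device block
        /Queue Full Sample Size:/ { s = $NF }
        /Queue Full Threshold:/   { t = $NF }
        END                      { emit() }
    '
}
```

On a host this would be run as esxcli storage core device list | show_qfull, printing one line per device with its current sample size and threshold.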

So, I decided to try and script something. The result is below.

Basically, I read the LUN device IDs from a text file (one entry per line), and then apply the change to each. The problem then is generating the text file - we typically have 50+ LUNs per host. For this, I used RVTools, and exported the vDatastore tab, which includes the value I need in the second column, labelled Address. I take those values, and stick them in the text file - cumbersome, but it works.
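A possible alternative to the RVTools export is to build the list on the host itself. The sketch below is an assumption rather than the method used here: list_naa_ids is a hypothetical helper, and it assumes the device list output has an unindented naa ID line per device followed by an indented "Vendor:" field, and that 3Par LUNs report the vendor string 3PARdata (check this against your own output).

```sh
#!/bin/sh
# Sketch only: print the ID of every naa device whose Vendor field matches.
# Reads `esxcli storage core device list` style output on stdin.
# $1 is the vendor string to match; the default 3PARdata is an assumption.
list_naa_ids() {
    vendor="${1:-3PARdata}"
    awk -v vendor="$vendor" '
        /^[^ \t]/    { id = "" }                    # any new device block resets the ID
        /^naa\./     { id = $1 }                    # remember naa device IDs only
        /^ +Vendor:/ { if (id != "" && $2 == vendor) print id }
    '
}
```

Something like esxcli storage core device list | list_naa_ids > test.txt would then produce the input file, which is worth reviewing by eye before applying any changes.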

I call the script Set3Par.sh, and chmod +x it to set the execute bit.

#!/bin/sh

usage() {
        echo "Usage : ./Set3Par.sh filename.txt"
        echo "Where filename.txt is the file with the list of naa ids, eg test.txt"
}

# Check script is invoked with correct number of arguments, ie, filename.txt
# If not, give usage details then exit.
if [ $# -ne 1 ]
then
        usage
        exit 1
fi

# File with the naa entries. One per line.
filename=$1

# Check if the file specified exists, if not, exit
if [ ! -f "$filename" ]
then
        echo "$filename does not exist. Exiting"
        exit 1
fi

# Set the values, working through the input file one line at a time.
# -q is the queue full threshold, -s is the queue full sample size.
echo "Reading in the file $filename"
while read -r line
do
        # Skip blank lines
        [ -z "$line" ] && continue
        echo "Running the following ... esxcli storage core device set --device $line -q=4 -s=32"
        esxcli storage core device set --device "$line" -q=4 -s=32
done < "$filename"

echo "Completed the changes ... exiting"

How to run :
# ./Set3Par.sh
Usage : ./Set3Par.sh filename.txt
Where filename.txt is the file with the list of naa ids, eg test.txt

Sample input file : (values changed for writeup purposes)
# more test.txt
naa.50002aaaaaaaaaaa
naa.50002bbbbbbbbbbb
naa.50002ccccccccccc
naa.50002ddddddddddd
naa.50002eeeeeeeeeee
#

Sample run:
# ./Set3Par.sh test.txt
Reading in the file test.txt
Running the following ... esxcli storage core device set --device naa.50002aaaaaaaaaaa -q=4 -s=32
Running the following ... esxcli storage core device set --device naa.50002bbbbbbbbbbb -q=4 -s=32
Running the following ... esxcli storage core device set --device naa.50002ccccccccccc -q=4 -s=32
Running the following ... esxcli storage core device set --device naa.50002ddddddddddd -q=4 -s=32
Running the following ... esxcli storage core device set --device naa.50002eeeeeeeeeee -q=4 -s=32
Completed the changes ... exiting

New vApp fails to deploy in vCloud Director with “… does not exist in our inventory, but vCenter Server claims that it does” type error.

[Repost of old entry that died under the weight of spam]

While working on a vCloud Director 5.1 proof of concept at work, I ran into an issue where deploying a new vApp would fail after about a minute. The same happened when trying to import a template. All errors were of the type :

Folder vApp_system_25 (b220707f-7e73-401a-91b9-c74000c76a1a) does not exist in our inventory, but vCenter Server claims that it does..

I could indeed see the object in vCenter, but nothing in vCloud.

Searching around, I found this VMware communities article - http://communities.vmware.com/message/2202781 - which mentions “None of the cells have a vCenter proxy service running.” Checking our environment, this message was also present. The communities page indicated that the problem was then fixed by cleaning the QRTZ tables in the database, but gave no details on how. As I’m not a DBA, this bothered me :-)

Searching for details on how to do this led to this blog entry by Jason Boche - http://www.boche.net/blog/index.php/2011/12/16/vcloud-director-and-vcenter-proxy-service-failure/

It all looked similar, so I tried it on our setup and, pleasingly, it fixed the problem. We’re running the database on MSSQL, so I used the relevant script, adjusted to fit our database name.