Monday, March 15, 2010

Key Performance Monitor Counters


This article looks at key Performance Monitor counters you may want to watch to ensure your Windows Server 2003 machines are running smoothly. Performance monitoring is a complex subject and in some ways more of an art than a science. It can be daunting to be confronted with over a thousand performance counters to choose from. Which ones are important to monitor on a regular basis and which ones can be largely ignored? A lot depends on the role of the system you want to monitor and whether you're talking about capacity planning, ensuring availability, scaling upwards, monitoring for possible problems, or troubleshooting issues that have arisen. While the basic procedure for how to use the Performance console has been covered previously on WindowsNetworking in Andrew Tabona's article Windows 2003 Performance Monitor, I thought it might be useful to list a few key counters that administrators may want to monitor as far as general server health is concerned. So here are some of my top recommended perfmon counters, in no particular order, organized around five questions you might ask yourself concerning the health of your machines.

Is Your Server Available?

Availability means your system or application is up and running, and one way of determining the availability of your system is to view the System\System Up Time counter, which tells you how many seconds it's been since your server last rebooted. The easiest way to view this counter is in report view, which shows the actual numerical value of the counter in elapsed seconds. A better way however is to create a performance counter log and track this counter in the background so you can review it periodically when you need to generate your month-end report.
If you want to drill down even further though, you can monitor uptime for any process running on your machine using the Process(instance)\Elapsed Time counter, which tells you how long that particular process has been running on your machine. For example, Process(winlogon)\Elapsed Time will tell me how long it's been since the Winlogon process started running on my machine, and this should normally be a few seconds less than System\System Up Time since Winlogon starts running during the boot process. You can of course also use the Process(instance)\Elapsed Time counter to monitor processes associated with specific applications and services, and so track the availability of those applications and services. Be careful however, as some services are designed to start and stop under certain conditions, while other services are embedded in service host processes (svchost.exe) and you need to first identify which host process contains the service you want to monitor.
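As a rough illustration of working with these readings, here is a minimal Python sketch. It assumes you have already exported System\System Up Time samples (in seconds) from a counter log into a plain list; the function names format_uptime and count_reboots are my own and not part of any Windows tool.

```python
from datetime import timedelta


def format_uptime(seconds):
    # Turn an elapsed-seconds reading (as shown for System\System Up Time
    # in report view) into a days/hours/minutes string.
    delta = timedelta(seconds=int(seconds))
    hours, remainder = divmod(delta.seconds, 3600)
    minutes = remainder // 60
    return f"{delta.days}d {hours}h {minutes}m"


def count_reboots(uptime_samples):
    # The uptime counter only grows while the server stays up, so a sample
    # lower than the previous one means a reboot happened in between.
    return sum(
        1
        for previous, current in zip(uptime_samples, uptime_samples[1:])
        if current < previous
    )


if __name__ == "__main__":
    print(format_uptime(90061))
    print(count_reboots([100, 400, 700, 50, 350]))
```

The same reboot check could be applied to Process(instance)\Elapsed Time samples to spot process restarts, bearing in mind the caveat above about services that stop and start by design.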

How Busy Is It?

A server that's too busy may be unable to satisfactorily respond to client requests. That translates into unhappy users and let's face it, an important aspect of your job as an administrator is to ensure a satisfactory "experience" for the end-users you support. The simplest measure of a system's busyness is Processor(_Total)\% Processor Time, which measures the total utilization of your processor by all running processes. Note that if you have a multiprocessor machine, Processor(_Total)\% Processor Time actually measures the average processor utilization of your machine (i.e. utilization averaged over all processors).
If you're monitoring this counter and it's running at or near 100% for extended periods, you should drill down at the process level by examining the Process(instance)\% Processor Time counter for various process instances on your machine. For example, on an IIS web server you might track Process(inetinfo)\% Processor Time, while on an Exchange server a good counter to watch is Process(store)\% Processor Time, and so on. High processor utilization isn't always a sign of a problem however. For example, when a backup job is running it's typical for processor utilization to hit high levels for the duration of the backup, especially if the backup program is encrypting or compressing information before writing it to tape. In fact, if your server typically runs at around 70% or 80% processor utilization then this is normally a good sign and means your machine is handling its load effectively and is not underutilized. Average processor utilization of around 20% or 30% on the other hand suggests your machine is underutilized and may be a good candidate for server consolidation using Virtual Server or VMware.
Another thing you can do to investigate high processor utilization is to break it down into Processor(_Total)\% Privileged Time and Processor(_Total)\% User Time, which respectively show processor utilization for kernel-mode and user-mode execution on your machine. If kernel mode utilization is high, your machine is likely underpowered as it's too busy handling basic OS housekeeping functions to be able to effectively run other applications. And if user mode utilization is high, it may be that your server is running too many roles, and you should either beef up the hardware by adding another processor or migrate an application or role to another box.
If your machine is running several applications or handles several server roles on your network, another way to measure busyness is to measure processor contention, which is an indication of how different threads are fighting for the attention of the processors on your machine. If too many threads are contending for use of the same processor, the requests by these threads get queued up, and looking at the System\Processor Queue Length counter gives an indication of how many threads are waiting for execution. If this counter is consistently higher than around 5 when processor utilization approaches 100%, then this is a good indication that there is more work (active threads) available (ready for execution) than the machine's processors are able to handle. Note that this is not always a hard and fast indicator however, since some services like IIS 6 pool and manage their own worker threads, so on a busy web server for example you would want to look at other counters like ASP\Requests Queued or ASP.NET\Requests Queued as well. Furthermore, the larger the number of active services and applications running on your server, the busier the processor queue will normally be, so on a multi-role server running near 100% utilization contention may only be a significant factor once System\Processor Queue Length exceeds something like 10 instead of 5 as mentioned previously.
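To make these rules of thumb concrete, here is a small Python sketch that classifies a server from its average Processor(_Total)\% Processor Time and System\Processor Queue Length. The function name assess_cpu and the labels it returns are hypothetical, and the 95% cut-off is only my assumed reading of "approaches 100%". The other thresholds are the ones quoted above. Feed it values averaged over a sustained period rather than single samples.

```python
# NEAR_FULL_PCT is an assumption standing in for "approaches 100%".
NEAR_FULL_PCT = 95.0
# Upper end of the "20% or 30%" underutilization range mentioned in the text.
UNDERUTILIZED_PCT = 30.0


def assess_cpu(processor_time_pct, queue_length, multi_role=False):
    # Queue length rule of thumb: around 5, or around 10 on a multi-role server.
    queue_limit = 10 if multi_role else 5
    if processor_time_pct >= NEAR_FULL_PCT:
        if queue_length > queue_limit:
            return "contention"
        return "saturated"
    if processor_time_pct <= UNDERUTILIZED_PCT:
        return "underutilized"
    return "healthy"


if __name__ == "__main__":
    print(assess_cpu(98, 7))
    print(assess_cpu(98, 7, multi_role=True))
    print(assess_cpu(75, 2))
    print(assess_cpu(25, 0))
```

A "saturated" result is not necessarily a problem (think of the backup example), so treat these labels as prompts for a closer look rather than verdicts.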

Is Hardware Functioning Properly?

There are a couple of perfmon counters you can track to monitor for signs that your machine's hardware devices are functioning properly. One of these is System\Context Switches/sec, which measures how frequently the processors switch from one thread to another, for example when a thread is preempted or when execution moves from user mode to kernel mode to handle a request from a thread running in user mode. The heavier the workload running on your machine, the higher this counter will generally be, but over the long term the value of this counter should remain fairly constant. If this counter suddenly starts increasing however, it may be an indication of a malfunctioning device, especially if you are seeing a similar jump in the Processor(_Total)\Interrupts/sec counter on your machine. You may also want to check the Processor(_Total)\% Privileged Time counter and see if it shows a similar unexplained increase, as this may indicate problems with a device driver that is causing an additional hit on kernel mode processor utilization. In this case you can drill down and maybe find the culprit by examining the Process(instance)\% Processor Time counter for each process instance running on your machine. This won't directly tell you which driver is utilizing processor time, but it may indicate which calling application is indirectly causing the problem and may help you troubleshoot the issue further.
If Processor(_Total)\Interrupts/sec does not correlate well with System\Context Switches/sec however, your sudden jump in context switches may instead mean that your application is hitting its scalability limit on your particular machine and you may need to scale out your application (for example by clustering) or possibly redesign how it handles user mode requests. In any case, it's a good idea to monitor System\Context Switches/sec over a period of time to establish a baseline for this counter. Once you've done this, create a perfmon alert that will trigger when the counter deviates significantly from its observed mean value.
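The baseline idea can be sketched in a few lines of Python. This assumes you have exported System\Context Switches/sec samples from a counter log into a list. Treating "deviates significantly" as more than three standard deviations from the mean is my own assumption, not a figure from perfmon, and the function names are hypothetical.

```python
from statistics import mean, stdev


def build_baseline(samples):
    # Needs at least two samples, otherwise the spread is undefined.
    if len(samples) < 2:
        raise ValueError("need at least two samples to build a baseline")
    return mean(samples), stdev(samples)


def deviates(value, baseline_mean, baseline_stdev, k=3.0):
    # k is an assumed sensitivity: flag readings more than k standard
    # deviations away from the observed mean, in either direction.
    return abs(value - baseline_mean) > k * baseline_stdev


if __name__ == "__main__":
    history = [1000, 1100, 900, 1050, 950]
    avg, spread = build_baseline(history)
    print(deviates(1300, avg, spread))
    print(deviates(1100, avg, spread))
```

The resulting mean plus or minus k standard deviations could then be used as the limit you type into the perfmon alert itself.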

Got Enough RAM?

The Memory\Pages/sec counter indicates the number of paging operations to disk during the measuring interval, and this is the primary counter to watch for signs that your server has insufficient RAM to meet its needs. A good idea here is to configure a perfmon alert that triggers when the number of pages per second exceeds 50 per paging disk on your system. Another key counter to watch here is Memory\Available Bytes, and if this counter is greater than 10% of the actual RAM in your machine then you probably have more than enough RAM and don't need to worry.
You should do two things with the Memory\Available Bytes counter: create a performance log for this counter and monitor it regularly to see if any downward trend develops, and set an alert to trigger if it drops below 2% of the installed RAM. If a downward trend does develop, you can monitor Process(instance)\Working Set for each process instance to determine which process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. A related counter is Memory\Cache Bytes, which measures the working set for the system, i.e. the number of allocated pages kernel threads can address without generating a page fault.
Finally, another corroborating indicator of insufficient RAM is Memory\Transition Faults/sec, which measures how often recently trimmed pages on the standby list are re-referenced. If this counter slowly starts to rise over time then it could also indicate you're reaching a point where you no longer have enough RAM for your server to function well.
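Here is a minimal Python sketch of the memory checks described in this section, assuming the counter values have already been read from a log. The function names and alert labels are hypothetical, and the trend test is a plain least-squares slope over equally spaced samples, which is just one simple way to spot a downward drift.

```python
def memory_alerts(pages_per_sec, paging_disks, available_bytes, installed_bytes):
    # Thresholds from the text: 50 pages/sec per paging disk, and
    # Memory\Available Bytes below 2% of installed RAM.
    alerts = []
    if pages_per_sec > 50 * paging_disks:
        alerts.append("paging")
    if available_bytes < 0.02 * installed_bytes:
        alerts.append("low_available")
    return alerts


def trend_slope(samples):
    # Least-squares slope of equally spaced samples; a negative result for
    # Memory\Available Bytes means the counter is trending downward.
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    x_mean = (n - 1) / 2
    y_mean = sum(samples) / n
    numerator = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(samples))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


if __name__ == "__main__":
    gib = 1024 ** 3
    mib = 1024 ** 2
    print(memory_alerts(120, 2, 50 * mib, 4 * gib))
    print(trend_slope([800, 760, 700, 640, 610]) < 0)
```

The same slope check works for Memory\Transition Faults/sec, where a slowly rising (positive) slope is the warning sign.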

Disks Fast Enough?

Finally, let's look at a couple of indicators of well-functioning hard disks in your system. Watch the PhysicalDisk(instance)\Disk Transfers/sec counter for each physical disk, and if it goes above 25 disk I/Os per second then you've got poor response time for your disk. A bottleneck from a disk can significantly impact response time for applications running on your system, so you should investigate this further by tracking PhysicalDisk(instance)\% Idle Time, which measures the percentage of time that your hard disk is idle during the measurement interval. If you see this counter fall below 20% then you've likely got read/write requests queuing up for a disk that is unable to service them in a timely fashion. In this case it's time to upgrade your hardware to use faster disks or scale out your application to better handle the load.
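As a quick sketch, the two disk thresholds can be combined into one check per physical disk. The function name disk_status and its return labels are hypothetical. The numbers are the ones given above.

```python
def disk_status(transfers_per_sec, idle_time_pct):
    # Below 20% idle time suggests requests are queuing up for the disk.
    if idle_time_pct < 20:
        return "bottleneck"
    # More than 25 transfers/sec is the cue to start watching % Idle Time.
    if transfers_per_sec > 25:
        return "investigate"
    return "ok"


if __name__ == "__main__":
    print(disk_status(40, 10))
    print(disk_status(40, 60))
    print(disk_status(10, 85))
```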

Thursday, March 11, 2010

OCSetup Command-Line Options

OCSetup is available as part of the Windows Vista® and Windows Server® 2008 operating system. This tool replaces Sysocmgr.exe, which is included in the Windows XP and Windows Server 2003 operating systems.
You can use OCSetup.exe on a computer running Windows Vista or Windows Server 2008 to install or uninstall:
  • Microsoft System Installer (MSI) files that are passed to the Windows Installer service (MSIExec.exe)
  • Component-Based Servicing (CBS) components that are passed to Package Manager
  • CBS or MSI packages that have an associated custom installer .exe file
To use OCSetup and Package Manager, the system MSIs must be staged before they are installed, and the paths to the packages must be specified in an answer file.
Before running OCSetup at a command prompt, set the command prompt to run as an administrator (even if you are logged on as an administrator on the computer).
  1. Click Start, point to All Programs, and then click Accessories.
  2. Right-click Command Prompt, and then click Run as administrator.
The following command-line options are available for OCSetup.
ocsetup.exe [/?] [/h] [/help] component [/log:file] [/norestart] [/passive] [/quiet] [/unattendfile:file] [/uninstall] [/x: parameter]
The following table shows the syntax for the OCSetup command-line options.

 

Parameter             Description
/?, /h, /help         Displays help for all options when run with or without options.
component             The name of the component to be installed or uninstalled. The component name is case-sensitive.
/log:file             Specifies a non-default log file location.
/norestart            The computer is not rebooted even if required after component installation.
/passive              Unattended mode. Progress only.
/quiet                Quiet mode. No user interaction.
/unattendfile:file    The file contains overrides or additions to default configuration settings. Implies passive mode.
/uninstall            Uninstalls the component. Installation is default.
/x: parameter         Additional configuration parameters to be applied when installing a component that requires a custom installer. OCSetup will pass these parameters to the custom installer.
To determine if a component has a custom installer, review the following registry key:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Setup\OptionalComponents\componentName
The value of CustomSetup is the name of the custom installer. If the CustomSetup value is supplied, OCSetup defers to the indicated EXE to perform the installation of the package. The EXE is expected to conform to the same command line specification as OCSetup.exe, and must return the same exit codes.
Note: This registry key will not exist in Windows Vista if system MSI files are not installed. In addition, this registry key will not exist if you have not registered the packages that require a custom installer and have not defined the custom installer.
If you are installing or uninstalling a package that requires a custom installer, you must register the name of the package. When you register the name of the package, you also identify the type of package (CBS or MSI), any updates that have been applied, and the name of the custom installer .exe file.
To register a package that requires a custom installer, edit the following registry key.
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Setup\OCSetup\Components\ComponentName
Add the following values under this key:

 

Value          Type            Data
Type           REG_DWORD       Type of package: 0 - CBS, 1 - MSI
Component      REG_SZ          The name of the CBS update or the path to the MSI.
PatchFiles     REG_MULTI_SZ    (Optional) Path to the MSI update file.
CustomSetup    REG_SZ          (Optional) Path to the custom installer .exe file.
The custom installer is expected to conform to the same command line specification as OCSetup.exe, and must return the same exit codes.
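As an illustration only, the registration values could be written with the standard reg add command. The Python sketch below just builds the command strings so they can be reviewed before anything is run. The component name MyComponent and the package paths are hypothetical, the function name is my own, and the optional PatchFiles value is left out for brevity.

```python
# Registry key from the text, written with the HKLM abbreviation accepted by reg.exe.
KEY_ROOT = r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Setup\OCSetup\Components"


def registration_commands(component_name, package_type, component, custom_setup=None):
    # package_type maps to the Type value: 0 for CBS, 1 for MSI.
    if package_type not in ("CBS", "MSI"):
        raise ValueError("package_type must be 'CBS' or 'MSI'")
    key = f"{KEY_ROOT}\\{component_name}"
    type_code = 0 if package_type == "CBS" else 1
    values = [
        ("Type", "REG_DWORD", str(type_code)),
        ("Component", "REG_SZ", component),
    ]
    if custom_setup:
        values.append(("CustomSetup", "REG_SZ", custom_setup))
    return [
        f'reg add "{key}" /v {name} /t {reg_type} /d "{data}"'
        for name, reg_type, data in values
    ]


if __name__ == "__main__":
    # Hypothetical component name and paths, for illustration only.
    for command in registration_commands(
        "MyComponent", "MSI", r"C:\Packages\my.msi", r"C:\Packages\setup.exe"
    ):
        print(command)
```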
If you are configuring Windows features without using an answer file that was created in Windows System Image Manager (Windows SIM), use the command-line name in the Windows-Foundation-Package section of the Unattended Windows Setup Reference. The name is case-sensitive.

Examples

The following table gives examples of using OCSetup to establish server roles that have no dependencies.

 

Server Role                                          Command
Dynamic Host Configuration Protocol (DHCP) Server    start /w ocsetup DHCPServer
Domain Name System (DNS) Server                      start /w ocsetup DNS-Server-Full-Role
Windows Deployment Services (Windows DS)             start /w ocsetup Microsoft-Windows-Deployment-Services
The following table gives examples of using OCSetup to enable Windows features available in server editions.

 

Windows Feature          Command
Desktop Experience       start /w ocsetup DesktopExperience
Failover Clustering      start /w ocsetup FailoverCluster-FullServer
Windows Server Backup    start /w ocsetup WindowsServerBackup
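If you script many of these installs, a small helper that assembles the command line can cut down on typos. The following Python sketch is one possible approach. The function name build_ocsetup_command is hypothetical, it only uses options listed in the syntax table above (except /x), and it does not attempt to quote file paths that contain spaces.

```python
def build_ocsetup_command(component, log_file=None, norestart=False,
                          passive=False, quiet=False, unattend_file=None,
                          uninstall=False):
    # Component names are case-sensitive, so the name is passed through as given.
    if not component:
        raise ValueError("component name is required")
    # "start /w" waits for ocsetup to finish, as in the examples above.
    args = ["start", "/w", "ocsetup", component]
    if log_file:
        args.append(f"/log:{log_file}")
    if norestart:
        args.append("/norestart")
    if passive:
        args.append("/passive")
    if quiet:
        args.append("/quiet")
    if unattend_file:
        args.append(f"/unattendfile:{unattend_file}")
    if uninstall:
        args.append("/uninstall")
    return " ".join(args)


if __name__ == "__main__":
    print(build_ocsetup_command("DHCPServer"))
    print(build_ocsetup_command("DNS-Server-Full-Role", norestart=True,
                                quiet=True, uninstall=True))
```

The helper only returns a string. Running it still has to happen from an elevated command prompt as described earlier.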

Troubleshooting

To verify that a component is installed, do one of the following:
  • Verify in the Event Viewer that OCSetup raised the event OCSETUP_EVENT_INSTALLSUCCESS.
  • If you are enabling a Windows feature, in Control Panel, click Programs, under Programs and Features, click Turn Windows features on or off, and then confirm that the check box for the Windows feature is selected.
To verify that a component was removed, do one of the following:
  • Verify in the Event Viewer that OCSetup raised the event OCSETUP_EVENT_UNINSTALLSUCCESS.
  • If you are disabling a Windows feature, in Control Panel, click Programs, and then, under Programs and Features, click Turn Windows features on or off. Make sure the check box for the Windows feature is cleared.
Additional troubleshooting information can be found in the following log files:
  • Windows Update log (%WINDIR%\WindowsUpdate.log)
  • Component-Based Servicing Log (%WINDIR%\logs\cbs\cbs.log)

Updated list of ocsetup-able components for Longhorn Server Core

All of these are installable by executing *ocsetup*, and if you forget the exact syntax, you can always execute *oclist*. The only exception right now is Active Directory, for which you will want to use *dcpromo* with an unattended file.
BitLocker
BitLocker-RemoteAdminTool
ClientForNFS-Base
DFSN-Server
DFSR-Infrastructure-ServerEdition
DHCPServerCore
DirectoryServices-ADAM-ServerCore
DirectoryServices-DomainController-ServerFoundation
DNS-Server-Core-Role
FailoverCluster-Core
FRS-Infrastructure
Microsoft-Windows-RemovableStorageManagementCore
MultipathIo
NetworkLoadBalancingHeadlessServer
Printing-ServerCore-Role
Printing-LPDPrintService
QWAVE
ServerForNFS-Base
SIS
SNMP-SC
SUACore
TelnetClient
WindowsServerBackup
WINS-SC

I'll be curious to see just how many roles and features will end up being available at RTM for Core.
