Exchange 2007 Performance Troubleshooting


Perf Tips

  • Don’t stop at the first possible problem; continue on to be sure it’s not simply a symptom
  • Don’t make any detrimental changes, and ALWAYS have a backup!

If you’re having Exchange perf issues, here are some counters you should look at:

  1. First the RPC counters – these counters will show you if the clients are “feeling” a resource issue
    • MSExchangeIS\RPC Averaged Latency – should be under 50 ms (100 ms if in cached mode)
      • RPC Operations/sec – relative (baseline/trending)
      • RPC Requests – recommended under 70
    • If you see RPC ops go up at around the time latency rises, you may be adding too much load
  2. Exchange database health
    • MSExchange Database(Information Store)\Database Page Fault Stalls/sec (not page faults)
      • Check the health of the DB itself – page fault stalls indicate an issue writing to the DB; some are OK, many are not.
      • Cache size (memory minus 2 GB) – look at available memory vs. the cache size, and check write/read latency
      • (RTM = Database)
    • MSExchange Database\Log Record Stalls/sec – a large number = issues, <10 = workload – this indicates an issue writing to the log files
      • Correlate to disk and RPC
      • 10 ms writes recommended – solutions could be adding disks, additional storage groups (SG), or balancing servers.
      • Failure to add info into the log buffer
    • MSExchange Database\Log Threads Waiting/sec – disk issue
      • Correlate to \Log Record Stalls/sec
      • Log stall issues (disk or workload) – threads high along with log stalls indicate a workload issue; threads low indicate a disk issue
  3. Active Directory to Exchange
    • All should average 50 or less, and spikes should not be higher than 100; higher values indicate an issue accessing a GC
      • MSExchangeADAccess\LDAP Read Time (ms)
        • \LDAP Search Time
      • MSExchangeADAccess Processes
      • MSExchangeADAccess Domain Controllers
      • MSExchangeADAccess\LDAP Reads/sec
        • \LDAP Search/sec
  4. Hardware counters
    • Storage
      • Physical or Logical Disk read/write time – look at latency spikes in relation to the other counters (RPC latency, log stalls, etc.) – if RPC is OK, disk is immaterial (unless dealing with Transport or Edge)
      • Check PhysicalDisk, or LogicalDisk if on a SAN or mount point
    • Memory
      • Memory\Available MBytes – should always have physical memory available, otherwise you will be paging to disk
      • Process and Processor
        • \Working Set = RAM – see which process is using the most
        • \Virtual Bytes = RAM + page file – see which process is using the most
        • \Private Bytes, etc. – only it can use 486 (256 if /3GB used)
      • Note: x64 – will not crash but will start thrashing (memory leak)
    • Network
      • Network Interface\Output Queue Length – should be less than 2
        • \Packets Outbound Errors – this is cumulative, not a point in time; you may have to reboot to check for new errors
        • \Current Bandwidth – correlate with NIC capability
        • Note: don’t capture loopback
    • Processor
      • Processor(_Total)\% Processor Time – average < 75%
      • Processor(_Total)\% Privileged Time – should be less than half of % Processor Time; more than half = problem, 75% = real problem
      • Process(*)\% Processor Time – see which process is using the most
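
The log stall rule of thumb in item 2 (threads high along with stalls = workload, threads low = disk) can be sketched as a small Python helper. This is only an illustration: the function name, the returned labels, and the use of 10 as the cut-off for both counters are assumptions based on the "<10" guidance above, not anything that ships with Exchange.

```python
def classify_log_stalls(stalls_per_sec, threads_waiting,
                        stall_limit=10.0, thread_limit=10.0):
    """Rough triage of averaged log counters.

    stalls_per_sec  -- average of Log Record Stalls/sec
    threads_waiting -- average of Log Threads Waiting
    Both limits are assumed values, tune them against your own baseline.
    """
    if stalls_per_sec < stall_limit:
        # Stalls are within the normal range, nothing to chase here.
        return "ok"
    if threads_waiting >= thread_limit:
        # Many threads waiting while stalls are high: too much load.
        return "workload"
    # Stalls are high but few threads are waiting: suspect the log disks.
    return "disk"
```

For example, `classify_log_stalls(35.0, 22.0)` returns `"workload"` and `classify_log_stalls(35.0, 1.0)` returns `"disk"`. Either way, correlate the result with disk latency and RPC latency before acting on it.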

Counter Thresholds

MSExchangeIS\RPC Averaged Latency – below 25 ms
MSExchangeIS\RPC Operations/sec – use as a baseline; online: .75 and 1 RPC hop, cached mode higher
MSExchangeIS\RPC Requests – max 500, should be < 70
MSExchangeIS Client(*)\RPC Average Latency – < 50 ms on average
MSExchangeIS\RPC Client Backoff/sec – identifies that the server is rejecting connections
MSExchange Database\Database Page Fault Stalls/sec – 0
MSExchange Database\Database Cache Size – RAM in the system minus 2 GB; servers with sync, minus 3 GB
MSExchange Database\Log Record Stalls/sec – average of 10 or less, spikes should not be higher than 100
MSExchange Database\Log Threads Waiting/sec – average of 10 or less
MSExchange Database(Information Store)\Log Threads Waiting – should be less than 10 on average
MSExchangeIS Mailbox(_Total)\Messages Queued For Submission – below 50
MSExchangeADAccess*\LDAP Read Time – average of 50 or less, spikes should not be higher than 100
MSExchangeADAccess*\LDAP Search Time – average of 50 or less, spikes should not be higher than 100
MSExchangeADAccess*\LDAP Read/sec – average of 50 or less, spikes should not be higher than 100
MSExchangeADAccess*\LDAP Search/sec – average of 50 or less, spikes should not be higher than 100
Memory\Available MBytes – above 100 MB
Process\Working Set – review baseline, look for large changes
Process\Virtual Bytes – review baseline, look for large changes
Process\Private Bytes – review baseline, look for large changes
Processor(_Total)\% Processor Time – average < 75%
Processor(_Total)\% Privileged Time – remain below 75%
Processor(*)\% Processor Time – look for spikes
Network Interface\Packets Outbound – should not be > 10
Network Interface\Current Bandwidth – review baseline, look for large changes

Database Drives
LogicalDisk(*)\Avg. Disk sec/Read – below 50 ms (may need faster for 1000+ users)
PhysicalDisk(*)\Avg. Disk sec/Read – below 50 ms (may need faster for 1000+ users)
LogicalDisk(*)\Avg. Disk sec/Write – below 100 ms
PhysicalDisk(*)\Avg. Disk sec/Write – below 100 ms

Log Drives
LogicalDisk(*)\Avg. Disk sec/Read – below 20 ms
LogicalDisk(*)\Avg. Disk sec/Write – below 10 ms

Temp Drives
LogicalDisk(*)\Avg. Disk sec/Read – below 20 ms
LogicalDisk(*)\Avg. Disk sec/Write – below 10 ms

Network
Network Interface\Output Queue Length – below 2
Network Interface\Packets Outbound Errors – no greater than 0
Network Interface\Current Bandwidth – match NIC capability
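
If you have already collected samples for these counters, a few of the thresholds listed above can be checked with a short Python sketch. The names `LIMITS` and `check_counters`, the shape of the `samples` dictionary, and the choice to treat every limit as a simple "greater than" test are assumptions made for illustration. Only upper-bound thresholds are covered, and the counter names are written without the `*` wildcard.

```python
from statistics import mean

# Counter -> (average limit, spike limit or None).
# Values are copied from the threshold list above; the subset is arbitrary.
LIMITS = {
    r"MSExchangeIS\RPC Averaged Latency": (25, None),
    r"MSExchangeIS\RPC Requests": (70, 500),
    r"MSExchange Database\Log Record Stalls/sec": (10, 100),
    r"MSExchangeADAccess\LDAP Read Time": (50, 100),
    r"Processor(_Total)\% Processor Time": (75, None),
}


def check_counters(samples):
    """Return a list of findings for counters that exceed their limits.

    samples maps a counter name to a list of numeric readings.
    Counters that are not in LIMITS, or have no readings, are skipped.
    """
    findings = []
    for counter, values in samples.items():
        if counter not in LIMITS or not values:
            continue
        avg_limit, spike_limit = LIMITS[counter]
        avg = mean(values)
        if avg > avg_limit:
            findings.append(f"{counter}: average {avg:.1f} exceeds {avg_limit}")
        if spike_limit is not None and max(values) > spike_limit:
            findings.append(
                f"{counter}: spike {max(values):.1f} exceeds {spike_limit}"
            )
    return findings
```

Pass it something like `{r"MSExchangeADAccess\LDAP Read Time": [20, 30, 160]}` and it reports both the high average and the spike. Treat the findings as a starting point for correlation, not as a diagnosis.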
   
   
   
