How to Troubleshoot Slow Disk Read/write Interview Questions

In this article, we will discuss how to resolve I/O issues, which is a very important part of SQL Server troubleshooting. The storage subsystem is one of the most significant performance factors for databases. Detecting and identifying I/O problems in SQL Server can be a tough task for database administrators (DBAs). More often than not, the underlying reasons for I/O problems are:

  • Misconfigured or malfunctioning disk subsystems
  • Insufficient disk performance
  • Applications that generate redundant I/O activity
  • Poorly designed or unoptimized queries

Analyzing the symptoms should be the guiding principle for identifying the underlying cause of I/O problems on SQL Server. Otherwise, we can waste time dealing with irrelevant problems or discussing the problems with system or storage administrators unnecessarily. Wait types give very useful information for SQL Server troubleshooting. The following wait types can point to I/O issues, but on their own they are not sufficient to conclude that there is a problem with the disks.

  • PAGEIOLATCH_*
  • WRITELOG
  • ASYNC_IO_COMPLETION

First, we will briefly describe these wait types and their relation to I/O problems.

PAGEIOLATCH_*

SQL Server reserves an area of memory for itself and uses it to cache data and index pages in order to reduce disk activity. This reserved memory area is called the buffer pool. The working mechanism of the buffer pool is very simple: when a request to read or change data is received, the data is loaded from disk into memory and processed in the buffer pool. When the data is modified, it is written back to disk. In light of this, PAGEIOLATCH_* waits occur while data is being transferred from disk to the buffer pool. It is quite normal to see some PAGEIOLATCH_* waits; however, it indicates a problem when we see this wait type often and more than the other wait types. PAGEIOLATCH_* does not indicate disk problems by itself because this wait type can occur for a variety of reasons. For example:

  • Outdated statistics or poorly designed indexes can cause PAGEIOLATCH_* waits because these types of problems cause redundant disk activity
  • Enabling the CDC (Change Data Capture) option can cause extra I/O workload
  • Insufficient memory can cause PAGEIOLATCH_* waits because SQL Server cannot keep the data pages in the buffer cache long enough. A low Page Life Expectancy value is another sign of this problem

WRITELOG

When a modification is performed in the database, SQL Server writes it to the log buffer and then writes the buffer to disk. Therefore, this wait type is related to the physical disk that contains the log file (ldf). Placing log files on disks that are as fast and as dedicated as possible is the right approach to overcome these problems. At the same time, the performance statistics of the physical disks that store the ldf files should be considered when this problem occurs. Log data is written to disk sequentially, and it is also read sequentially. Because of this working principle, the disks selected for the log files must offer good sequential read and write throughput along with minimal latency.

ASYNC_IO_COMPLETION

This wait type occurs when SQL Server performs backup and restore operations; however, when these operations take more time than usual, it might be a warning sign of I/O problems. If BACKUPIO waits are seen together with ASYNC_IO_COMPLETION, we can suspect a disk problem.

I/O Stalls

I/O stall time is an indicator that can be used to detect I/O problems. sys.dm_io_virtual_file_stats is a dynamic management function that gives detailed information about the stall times of the data and log files, which simplifies the SQL Server troubleshooting process. This dynamic management function takes two parameters: the first is the database id, and the second is the id of the database file.

We can execute this dynamic management function for all databases by passing NULL for both parameters.

Using dm_io_virtual_file_stats function for SQL Server troubleshooting.
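The screenshot of the call is not reproduced here, so the following is only a minimal sketch of how it could look. It assumes a Python DB-API cursor connected to the SQL Server instance (for example from a driver such as pyodbc, which is an assumption and not something the article prescribes). The helper name fetch_file_stats is hypothetical.

```python
# T-SQL text: NULL for both parameters returns a row for every file
# of every database on the instance.
FILE_STATS_QUERY = """
SELECT database_id, file_id, sample_ms,
       num_of_reads, num_of_bytes_read, io_stall_read_ms,
       num_of_writes, num_of_bytes_written, io_stall_write_ms,
       io_stall
FROM sys.dm_io_virtual_file_stats(NULL, NULL);
"""


def fetch_file_stats(cursor):
    """Run the query on a DB-API cursor and return one dict per file.

    `cursor` is assumed to follow the Python DB-API (execute,
    description, fetchall); the connection setup is left to the caller.
    """
    cursor.execute(FILE_STATS_QUERY)
    # The first element of each description entry is the column name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]
```

Returning plain dictionaries keeps the later latency arithmetic independent of the database driver in use.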

database_id: This column represents the id number of the database, and we can use the sys.databases view to obtain all database id numbers.

file_id: This column represents the id number of the file, and we can use the sys.master_files view to obtain all file id numbers.

sample_ms: This column shows the number of milliseconds since the server was started.

num_of_reads: This column shows the number of physical reads that occurred since the server restarted.

num_of_bytes_read: This column shows the total amount of physical reads in bytes that occurred since the server restarted.

io_stall_read_ms: This column shows the total latency of the read operations in milliseconds.

num_of_writes: This column shows the number of writes that occurred since the server restarted.

num_of_bytes_written: This column shows the total amount of writes in bytes that occurred since the server restarted.

io_stall_write_ms: This column shows the total latency of the write operations in milliseconds.

io_stall: This column shows the total latency of all I/O operations in milliseconds.

High stall times point to I/O problems and busy disk activity. By dividing each stall time by the corresponding number of operations, we can find the average read, write, and total latency of the database files and so diagnose storage problems.

Analyzing I/O latency for SQL Server troubleshooting
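The original query is not available in this copy, so the arithmetic behind it is sketched here in Python under stated assumptions. Each row is assumed to be a dictionary keyed by the sys.dm_io_virtual_file_stats column names; the function names average_latency_ms and file_latency are hypothetical. Because the counters are cumulative, the results are averages over the whole period since startup and can hide short spikes.

```python
def average_latency_ms(stall_ms, operations):
    """Cumulative stall time divided by the operation count.

    Returns 0.0 for a file with no operations to avoid dividing by zero.
    """
    return stall_ms / operations if operations else 0.0


def file_latency(row):
    """Compute average read, write, and total latency for one file row."""
    reads = row["num_of_reads"]
    writes = row["num_of_writes"]
    return {
        "avg_read_latency_ms": average_latency_ms(row["io_stall_read_ms"], reads),
        "avg_write_latency_ms": average_latency_ms(row["io_stall_write_ms"], writes),
        # io_stall covers both reads and writes, so divide by their sum.
        "avg_total_latency_ms": average_latency_ms(row["io_stall"], reads + writes),
    }
```

For instance, a file with 100 reads and 500 ms of read stall time has an average read latency of 5 ms.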

The Average Total Latency column represents the total latency of the database files, and we can use the following table as a reference to evaluate disk performance against latency.

Rating       Average latency
Excellent    < 1 ms
Very good    < 5 ms
Good         5 – 10 ms
Poor         10 – 20 ms
Bad          20 – 100 ms
Very Bad     100 – 500 ms
Atrocious    > 500 ms
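The reference values above can be turned into a small lookup, sketched here in Python. The table does not say which rating a boundary value belongs to, so this sketch assumes every upper bound is exclusive (exactly 10 ms is rated Poor, exactly 500 ms falls into the last bucket). The name rate_latency is hypothetical.

```python
# (exclusive upper bound in ms, rating), in ascending order.
LATENCY_RATINGS = [
    (1, "Excellent"),
    (5, "Very good"),
    (10, "Good"),
    (20, "Poor"),
    (100, "Bad"),
    (500, "Very Bad"),
]


def rate_latency(latency_ms):
    """Map an average latency in milliseconds to a rating from the table."""
    if latency_ms < 0:
        raise ValueError("latency cannot be negative")
    for upper_bound_ms, rating in LATENCY_RATINGS:
        if latency_ms < upper_bound_ms:
            return rating
    # Anything at or above the last bound lands in the worst bucket.
    return "Atrocious"
```

Keeping the thresholds in one list makes it easy to adjust them for a particular storage environment.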

Using Performance Monitor to Analyze I/O Issues

Performance Monitor, also known as Perfmon, is a tool that helps track metrics about computer resources and installed applications. In particular, Perfmon assists in analyzing and troubleshooting SQL Server performance issues because it includes specific counters for SQL Server in addition to the general resource counters, which is why it plays a key role in SQL Server troubleshooting. When we focus on the I/O counters of Perfmon, some of them come to the forefront. First of all, we should keep an eye on the latency metrics because these values tell us a great deal about disk performance.

Latency is a performance metric that measures the time gap between requests and responses for the disks. We can use the following counters to measure the disk latency.

  • Avg. Disk sec/Transfer counter shows the total latency, and its values should be under 10 milliseconds
  • Avg. Disk sec/Read counter shows the read latency
  • Avg. Disk sec/Write counter shows the write latency

Measuring disk latency with Perfmon for SQL Server troubleshooting

When we analyze the image above, this server shows awful performance. The average latency is 0.229 seconds, which equals 0.229 × 1000 = 229 milliseconds.
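Perfmon reports the Avg. Disk sec/* counters in seconds, so the conversion above is worth keeping explicit. The following Python sketch repeats that calculation and compares the result with the 10 millisecond guideline mentioned for Avg. Disk sec/Transfer; the names used are hypothetical.

```python
LATENCY_GUIDELINE_MS = 10  # guideline for Avg. Disk sec/Transfer


def perfmon_seconds_to_ms(seconds):
    """Convert an Avg. Disk sec/* counter value from seconds to milliseconds."""
    return seconds * 1000


latency_ms = perfmon_seconds_to_ms(0.229)  # the value read from the image
exceeds_guideline = latency_ms > LATENCY_GUIDELINE_MS
print(f"{latency_ms:.0f} ms, above guideline: {exceeds_guideline}")
```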

IOPS (Input/Output operations per second) is a performance metric for the disks that measures the total number of input and output operations performed by the disk in one second.

  • Disk Reads/sec counter indicates read IOPS
  • Disk Writes/sec counter indicates write IOPS
  • Disk Transfers/sec counter indicates the total IOPS; this value equals the sum of the Disk Reads/sec and Disk Writes/sec counters

The throughput metric indicates how many MB can be read or written by the disk subsystem per second. The throughput value varies according to our disk infrastructure, types, and vendors. For this reason, an exact target value cannot be given for this counter.

  • Disk Bytes/sec counter shows the total throughput of the disk per second
  • Disk Read Bytes/sec counter shows the read throughput
  • Disk Write Bytes/sec counter shows the write throughput
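The relationships between the IOPS and throughput counters can be summarized in a short Python sketch. It assumes the four read and write counter values have already been sampled from Perfmon, and it treats 1 MB as 1,048,576 bytes, which is an assumption because the counters themselves report bytes. The name summarize_disk_counters is hypothetical.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes


def summarize_disk_counters(reads_per_sec, writes_per_sec,
                            read_bytes_per_sec, write_bytes_per_sec):
    """Derive total IOPS and throughput in MB/s from read and write counters."""
    total_bytes_per_sec = read_bytes_per_sec + write_bytes_per_sec
    return {
        # Corresponds to Disk Transfers/sec.
        "total_iops": reads_per_sec + writes_per_sec,
        "read_mb_per_sec": read_bytes_per_sec / BYTES_PER_MB,
        "write_mb_per_sec": write_bytes_per_sec / BYTES_PER_MB,
        # Corresponds to Disk Bytes/sec, expressed in MB.
        "total_mb_per_sec": total_bytes_per_sec / BYTES_PER_MB,
    }
```

Comparing these derived totals with the Disk Transfers/sec and Disk Bytes/sec counters is a quick sanity check that the samples were taken over the same interval.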

Measuring disk throughput with Perfmon for SQL Server troubleshooting

Conclusion

In this article, we learned the basic methods that help to diagnose and troubleshoot SQL Server I/O problems. To overcome this type of problem, we need to observe all the metrics that can help to figure out the main problem. Understanding the main problem is a very significant point in SQL Server troubleshooting.

Author: Esat Erkec

Source: https://www.sqlshack.com/sql-server-troubleshooting-disk-i-o-problems/