SPS Home    >   Dgreath    >   RDBMS    >   Storage

STORAGE

Sql Server stores everything in 8K pages which, after deducting required overhead, provides 8,060 bytes per page. Eight pages are arranged into an extent which is the smallest unit of disc space allocated by the file system. SQL Server will fill page one through eight, then allocate the next eight pages. Then it will fill page nine through sixteen in order and so on.  When the file nears capacity, the file will be expanded according to the file growth setting until it reaches its maximum size.

Computation of Record Size:
Compute the sum of all columns in the table and ensure that the total bytes does not exceed 8,060.

Computation of Rows per Page:
Divide the record size into 8,060 to determine the maximum rows per page and discard any fractional result as rows may not span pages. 

Computation of Number of Pages:
Divide the expected number of rows by the rows per page to determine the number of pages.

Computation of Number of Extents:
Divide number of pages by eight and round up to the next whole number.

Computation of Maximum Filesize:
Multiply the number of extents required by 64KB where 1KB is 1,024 bytes.

Computation of Initial Filesize:
A value in the range of five to twenty percent of maximum filesize is a suggested initial value. 1MB is the minimum value.

Computation of File Growth:
A value around ten percent of initial filesize is a suggested initial value.

Computation of Log File values:
Use values around 25% of the computed values for the associated datafile.

RAID ARRAYS -- Redundant Arrays of Inexpensive Discs

RAID arrays can be set up as a Windows feature or via a hardware RAID cluster controller to provide enhanced performance, fault tolerance, or both.. The hardware solution will provide higher performance and is the better choice for high end applications. RAID arrays are configured after the basic Windows installation and prior to the installation of applications or loading of data. The relevant RAID configurations are discussed below:

RAID 0 - Disc Striping
Disc striping provides faster data throughput by distributing the load across multiple discs in a sequential fashion. An array of three 100GB discs provides 300GB of total storage and since all three discs are accessed simultaneously, access time is slightly more than 1/3 that of the same drives operating independently. The downside is that a failure of any one disc in the array renders all data lost. Raid 0 is most appropriate where performance is critical, there is little transaction processing and an alternate method of fault tolerance is provided.

RAID 1 - Disc Mirroring
Disc mirroring involves provisioning a pair of discs on one disc controller. Each write to the master disc is subsequently written to the slave disc and provides an exact duplicate. Upon failure of the master disc, the slave can serve as an immediate replacement by either a hardware cable swap or by a drive reconfiguration in the operating system. When the defective disc has been replaced, the mirror can be reestablished to resume fault tolerant operation. The downside of this configuration is that a failure of the disc controller renders both drives inaccessible.  An array of two 100GB discs will provide 100GB of total storage.  Read access will be the same as for a single un-arrayed disc, however write time will be double because of the double write requirement. Raid 1 is most appropriate for servers where maximum fault tolerance is critical and the loss of performance is acceptable. Log files should always be mirrored.

RAID 1 - Disc Duplexing
Disc duplexing involves provisioning a pair of discs on two disc controllers. Each write to the master disc is subsequently written to the slave disc and provides an exact duplicate. Upon failure of the master disc, the slave can serve as an immediate replacement by either a hardware cable swap or by a drive reconfiguration in the operating system. When the defective disc has been replaced, the mirror can be reestablished to resume fault tolerant operation. The advantage over mirroring is that a failure of one disc controller doesn't render both drives inaccessible.  An array of two 100GB discs will provide 100GB of total storage.  Read access will be the same as for a single un-arrayed disc, however write time will be double because of the double write requirement. Raid 1 is most appropriate for servers where maximum fault tolerance is critical and the loss of performance is acceptable. To minimize performance loss, purchase the fastest discs and controllers available. Log files should always be mirrored. 

RAID 5 - Disc Striping with Parity
Disc striping with parity provides faster data throughput with some fault tolerance.  The data is distributed across multiple discs as with Raid 0, however, an additional disc is added to store parity information.  With this setup, any one disc in the array can be rebuilt from the parity bits but the loss of two discs results in total data destruction.  An array of three 100GB discs provides 200GB of total storage since the third discs only holds parity information. Larger arrays are more efficient than smaller arrays (ten 100GB discs yields 900GB) and possibly faster performance depending on the drive controllers. Raid 5 is most appropriate where performance is critical and transaction processing is a greater share of server load than is the case of Raid 0 because of its increased fault tolerance. An alternative method of fault tolerance is advisable. To minimize performance loss, purchase the fastest discs and controllers available.

RAID 10 - Disc Striping with Mirroring
Disc striping with mirroring is a greatly enhanced solution. Two arrays of discs are striped for maximum performance as with Raid 0 and then duplexed for maximum fault tolerance as with Raid 1. The performance loss of duplexing is offset by the performance gain of striping but the drive count is high resulting in low storage efficiency. Six 100GB discs yield a total storage capacity of 300GB.

Ultimate Solution
A excellent solution for a high performance, maximally fault tolerant single server would entail setting up two duplexed discs for the operating system, applications, and log files using the internal EIDE buss. Then use a hardware RAID SCSI controller to provision a cluster of discs for bulk database storage.

A high end solution would likely entail a cluster server to provide load sharing or failover, two or more SQL servers, and a shared NAS array.