The recommendations described in this section are general guidelines and best-practices to apply to a SharePoint implementation. The majority of the items listed here are not of my own discovery, but rather a summary of the information that can be found in the whitepaper Scaling SharePoint 2007: Storage Architecture (R2) created by Russ Houberg and KnowledgeLake.
Virtualization
In reality, any server in the farm can be virtualized and some servers in the farm definitely should be. However, in a large scale deployment where performance goals are a major factor not every server in the farm should be virtualized. The list below outlines some general recommendations when considering virtualization for a SharePoint environment.
- If possible use RAW LUNs when creating data volumes that will be used for databases and search indexes. Keep in mind that this may have a negative impact on the ability to create a VM snapshot.
- Never use dynamically expanding disks when RAW storage volumes will not be used.
- If attempting to create a virtualized SQL server before going to “bare-metal”, use SQL Aliases so that the change over to a physical storage subsystem is easier.
- Virtualized servers filling a similar role in the farm should be spread between multiple physical hosts for there to be true high availability.
Database Performance
tempdb
- The tempdb is the most read/write intensive database used during search operations and sorting of views.
- The tempdb should be pre-sized to 25% of the predicted size of the largest content database. Keep auto-growth enabled, but if the tempdb size should grow beyond the original setting the pre-sized value should be permanently increased.
- The tempdb should consist of multiple data files when the SQL Server has more than 1 processor core. Use 2 data files on a 2 or 4 core system and 4 or more data files on an 8 or 16 core system.
- Each tempdb data file should be equal in size.
- To optimize read and write performance for the tempdb database the database files should be placed on separate RAID 10 LUNs, this also includes the database log file on a separate RAID 10 LUN.
- If possible, data files for other databases should not reside on the same LUNs that contain a tempdb data file.
WSS Content Databases
- The content database overhead size estimates can be anywhere from 1.2 to 1.5 times that of the raw storage size of documents placed into a SharePoint site.
- Databases should be pre-sized to handle the initial data storage expectations. Autogrow should be enabled.
- Volumes that will host SharePoint content database should be on RAID 10 logical units when available, otherwise RAID 5 volumes can be used as an alternative.
- The content database should consist of multiple data files when the SQL Server has more than 1 processor core. Use 2 data files on a 2 or 4 core system and 4 or more data files on an 8 or 16 core system.
- Data files should be spread onto separate LUNs using unique spindle sets.
- Database log files for all content databases should be placed onto a single shared LUN with its own isolated spindle set.
- Log files should be pre-sized and set to grow in 4 or 8 GB increments to prevent fragmentation. If the log file begins to grow increase the initial size by another 4 or 8 GB or adjust the backup schedule.
- When performing a massive data load operation it may be wise to temporarily set the database recovery model to simple and change it back to full when the operation has completed.
- Do not share a LUN between data files of multiple content databases when possible. Otherwise, stripe content database data files across multiple LUNs.
Search & SSP Databases
- The search database will store every metadata value for every SharePoint asset served by the SSP. This includes values for all documents, lists, site, site collection, and web application. There is also information related to crawl statistics, the search log, etc.
- The ratio of content database size to search database size is indirectly proportional to the average file size of documents contained in the database. When file sizes are larger there are a smaller number of them contained within the database and less indexing information to track. When storing small files (1 KB) estimate the size of the search database to be 4 times that of the content indexed. As the average file size increases to 10 KB estimate the size of the search database by multiplying by 0.48, and when the average file size is 100 KB multiple the content size by 0.2.
- The SSP database includes content related to user profile information, BDC applications, Excel Services and Project Server data. The SSP database tends to be relatively small and will not usually exceed 4 GB.
- The search database is used heavily for both read and writes and should only be implemented on a RAID 10 array whenever possible.
- The search database can consist of a 2nd file group to improve the performance of crawl and query operations by placing these distinct operations on different disk spindles. http://blogs.msdn.com/enterprisesearch/archive/2008/09/16/sql-file-groups-and-search.aspx
- Each file group for the search database should consist of multiple data files when the SQL Server has more than 1 processor core. Use 2 data files on a 2 or 4 core system and 4 or more data files on an 8 or 16 core system.
- The search and SSP databases should reside on their own LUNs.
- Log files should be pre-sized and set to grow in 4 or 8 GB increments to prevent fragmentation. If the log file begins to grow increase the initial size by another 4 or 8 GB or adjust the backup schedule.
Configuration, Central Administration Content, and SSP Administration Content Databases
- These databases are rarely used and do not need multiple data files.
- These database data and log files can all reside on a single spindle.
Database Maintenance
- Perform an index rebuild after long running operations or massive migrations of data.
- Use the procedure outlined in KB 943345 http://support.microsoft.com/kb/943345/en-us.
- Don’t rely solely on SharePoint timer jobs to perform statistics and index maintenance. Create a good SQL maintenance plan. You can use the procedure outlined in KB 943345 http://support.microsoft.com/kb/943345/en-us to help along with Bill Bear’s whitepaper on SharePoint database maintenance http://technet.microsoft.com/en-us/library/cc262731.aspx.
Other
- Use SharePoint column indexes on columns that are used to sort and filter views of large lists or libraries. This can also help with performance when performing CAML queries. Create and maintain these indexes during a maintence schedule or during off-peak hours. The creation or deletion of these indexes are long-running processes that will lock database tables during their modification effectively rendering any SharePoint sites that share the same database unusable for users until the operation completes. These types of indexes are not true SQL indexes.