Suresh and I have blogged about these issues before but this post will put another spin on the information, as it applies to BIG DATA.
Previous Blog References
I ran into a 665 issue with a customer attempting to BCP data out of a database. The scenario was that it worked if one instance of BCP was running but if they started a second instance of BCP, at the same time, (using where clause to divide the table and queryout parameter) the BCP(s) would fail with the following error.
SQLState = S1000, NativeError = 0 Error = [Microsoft][SQL Server Native Client 10.0]I/O error while writing BCP data-file
After reproducing it in the lab and debugging it I found the problem to be the file system limitation (FILE SYSTEM LIMITATION = 665) during the WriteFile call. (I did file work items with the SQL Server development team to surface more error details so we don't have to troubleshoot with the debugger in the future.)
Tracking down the source of the problem it was the physical, disk cluster allocations and the way NTFS tracks allocations for an individual file. I highly recommend you read the following post, it describes the NTFS behavior really nicely: http://blogs.technet.com/b/askcore/archive/2009/10/16/the-four-stages-of-ntfs-file-growth.aspx
To summarize, when you are at the maximum allocation state for a single NTFS file you have a MTF allocation (1K) for the Attribute List Entries that points to ## of Child records (1K allocations) holding information about the starting physical cluster and how many contiguous clusters are allocated for that segment of the file. The more of these you have the more fragmented the file becomes.
A nice example I found was the following.
- Mapping Pair (Physical Cluster, ## of Clusters From Physical)
- The file segment starts at physical cluster 100 and used 8 clusters.
- The entry is 100, 8
The mapping pair information can be consolidated and compressed so it it not a simple division calculation of the MTF size / Mapping Pair it depends on the cluster locations, contagious cluster acquire capabilities and compression of the mapping pair (for example if cluster location can be stored in less than 8 bytes NTFS can compress the LARGE_INTEGER value.)
The cluster size and MTF sizes determine the maximum size an NTFS file can accommodate. The following articles highlight these options and limitations.
Best Case - Contiguous Allocations
Worst Case - Alternating Allocations between files
In the BCP scenario the customer had determined that running 6 instances of BCP on this target system maximized the rows per second transferred. However, the more instances of BCP they enabled the faster they encountered the file system limitation. As you can see the more allocations taking place on the same disk/LUN raises the fragmentation level chances and internally puts pressure in the mapping pair space.
Making it a bit more pronounced is that BCP uses a 4K buffer for its writes. It fills the buffer, writes it and repeats. When on a system that uses 4K clusters this aligns well but it allows 4K osculation of clusters between multiple copies to the same physical media.
Here are a few suggestions for handling Big Data.
|Apply QFE (Requires Format) || |
- Windows provides a QFE to extended the size of the MTF allocation to 4K. While this will not eliminate the problem the size of the tracking tree is much larger. Not only is the ATTRIBUTE ENTRY LIST larger so too are the allocations for the MAPPING PAIRs. Each of the two levels are 3x larger.
Be careful: Increasing the MTF size means any file will require 4K of physical storage.
|Avoid NTFS Compression || |
- Avoid using NTFS compression for the target files. NTFS compression is known to use more mapping pairs.
|Use 64K Clusters || |
- Allow large allocations for each cluster tracked.
|Defragmentation || |
- Reducing the physical media fragmentation always helps. In fact, if a file starts to encounter the limitations, defragmenting the file may restore functionality.
|BCP || |
- Avoid causing fragmentation by using separate physical paths when available
- Use queryout and a where clause to grab chunks of data in reasonable file sizes to avoid hitting the limitations
- Evaluate native vs character mode BCP to determine the smaller output format for the data
|Backup || |
- Use T-SQL compression to reduce storage space requirements
- Avoid increasing fragmentation when using stripes (Disk=File1, Disk=File2) by separating the physical I/O. Placing stripes on the same physical media can increase the likely hood of physical fragmentation.
|MDF/NDF/LDFs ||It is unlikely to encounter on database files but for completeness I have included some aspects of database files. |
- Avoid auto-grow activities that could lead to numerous fragments but do not pick a growth size for the LDFs that can lead to concurrency issues during the growth. It is best to avoid auto growth and do specific grows based on need.
- Create on drives known to be low on physical fragmentation
- Avoid files that will push against the limitations. For example, you may need to restore and the restore may not be able to acquire the same physical layout.
- Use caution with read only database on NTFS compressed volumes
Using Contig.exe from sysinternals (http://technet.microsoft.com/en-us/sysinternals/bb897428.aspx) and the -a parameter you can view the number of fragments used by a given file.
- 33 - The process cannot access the file because another process has locked a portion of the file. Note: This usually occurs when NTFS compression is enable on the target.
- 665 - The requested operation could not be completed due to a file system limitation
- 1450 - Insufficient system resources exist to complete the requested service
Bob Dorr - Principal SQL Server Escalation Engineer