To compare two filenames (Lucian Wischik)

I once visited an ancient tower in China. A sign said "tread carefully for you bear the weight of history on your shoulders". Our guide explained that the tower was over 800 years old! Oh yes, he said, it was built 800 years ago, had burnt to the ground three times in its history, had been moved to completely different locations twice, but it still counted as 800 years old.

Back in VB land I was making two different calls into some framework. Each call gave me back a filename, and I needed to judge whether the two results referred to the same file. But how? The framework docs gave no guarantee that the filenames would come back in the same format. What if one call returned a long filename and the other returned the same name with different casing? Or in 8.3 format? Or as a UNC path? Or as a long UNC path?

The following function is the best I could come up with. I think it only makes sense to compare filenames if they point to an existing file. I called the function "DidFileNamesPointToSameFile", in the past tense, because by the time the function returns they might no longer point to the same file!

If anyone has suggestions for neater ways to accomplish the same task, I'd love to hear them.

 

Module Module1

  Sub Main()
    ' Mismatched case
    Dim same1 = DidFileNamesPointToSameFile("C:\SHARE\LONGFILENAME.TXT", "c:\share\longfilename.txt")
    ' Absolute path vs path relative to the current drive
    Dim same2 = DidFileNamesPointToSameFile("c:\share\longfilename.txt", "\share\longfilename.txt")
    ' UNC path vs local path
    Dim same3 = DidFileNamesPointToSameFile("\\lwischikl\share\longfilename.txt", "c:\share\longfilename.txt")
    ' Super-long \\?\ prefix vs plain path
    Dim same4 = DidFileNamesPointToSameFile("\\?\c:\share\longfilename.txt", "c:\share\longfilename.txt")
    ' 8.3 short name vs long name
    Dim same5 = DidFileNamesPointToSameFile("c:\share\longfi~1.txt", "c:\share\longfilename.txt")
    ' Directories work too
    Dim same6 = DidFileNamesPointToSameFile("c:\share", "C:\SHARE")
    ' Identical strings, but the file does not exist, so the result is False
    Dim diff1 = DidFileNamesPointToSameFile("c:\share\NonExistantFile.txt", "c:\share\NonExistantFile.txt")
  End Sub

  ''' <summary>
  ''' Determines whether two filenames pointed to the same (existing) file/directory on disk.
  ''' This function deals correctly with mismatched case, with relative vs absolute paths, with
  ''' UNC-style \\server\filenames, with super-long \\?\filenames, and with long vs 8.3 filenames.
  ''' Note: it's conceivable that the result of this function has become out of date by the time
  ''' the function returns, e.g. if another process is deleting or creating or renaming files.
  ''' </summary>
  ''' <param name="FileName1">The first filename to compare</param>
  ''' <param name="FileName2">The second filename to compare</param>
  ''' <returns>True if both filenames referred to the same file and it existed; false otherwise</returns>
  ''' <remarks>This function uses BY_HANDLE_FILE_INFORMATION. This is the only way to determine authoritatively
  ''' whether two filenames are equivalent. MSDN explains:
  ''' The identifier (low and high parts) and the volume serial number uniquely identify a file on a single computer.
  ''' To determine whether two open handles represent the same file, combine the identifier and the volume serial number
  ''' for each file and compare them.</remarks>
  Function DidFileNamesPointToSameFile(ByVal FileName1 As String, ByVal FileName2 As String) As Boolean
    Dim desiredAccess As UInt32 = 0 ' We request neither read nor write access
    Dim fileShareMode As UInt32 = 7 ' FILE_SHARE_READ|FILE_SHARE_WRITE|FILE_SHARE_DELETE
    Dim createDisposition As UInt32 = 3 ' OPEN_EXISTING
    Dim fileAttributes As UInt32 = &H2000080 ' FILE_FLAG_BACKUP_SEMANTICS|FILE_ATTRIBUTE_NORMAL. BACKUP_SEMANTICS is needed to open a directory.
    ' Both handles stay open for the whole comparison, so the file identifiers are read while the files are open.
    Using handle1 = CreateFile(FileName1, desiredAccess, fileShareMode, IntPtr.Zero, createDisposition, fileAttributes, IntPtr.Zero)
      Using handle2 = CreateFile(FileName2, desiredAccess, fileShareMode, IntPtr.Zero, createDisposition, fileAttributes, IntPtr.Zero)
        ' A name that cannot be opened (e.g. a nonexistent file) never compares as "same"
        If handle1.IsInvalid OrElse handle2.IsInvalid Then Return False
        Dim info1 As New BY_HANDLE_FILE_INFORMATION, info2 As New BY_HANDLE_FILE_INFORMATION
        Dim r1 = GetFileInformationByHandle(handle1, info1)
        Dim r2 = GetFileInformationByHandle(handle2, info2)
        If r1 = 0 OrElse r2 = 0 Then Return False
        ' Same file means same file index (both halves) on the same volume
        Return info1.nFileIndexLow = info2.nFileIndexLow AndAlso info1.nFileIndexHigh = info2.nFileIndexHigh AndAlso info1.dwVolumeSerialNumber = info2.dwVolumeSerialNumber
      End Using
    End Using
  End Function

  Structure BY_HANDLE_FILE_INFORMATION
    Dim dwFileAttributes As Integer
    Dim ftCreationTime As System.Runtime.InteropServices.ComTypes.FILETIME
    Dim ftLastAccessTime As System.Runtime.InteropServices.ComTypes.FILETIME
    Dim ftLastWriteTime As System.Runtime.InteropServices.ComTypes.FILETIME
    Dim dwVolumeSerialNumber As Integer
    Dim nFileSizeHigh As Integer
    Dim nFileSizeLow As Integer
    Dim nNumberOfLinks As Integer
    Dim nFileIndexHigh As Integer
    Dim nFileIndexLow As Integer
  End Structure

  <System.Runtime.InteropServices.DllImport("kernel32.dll", SetLastError:=True, CharSet:=System.Runtime.InteropServices.CharSet.Auto)> _
  Function GetFileInformationByHandle(ByVal hFile As Microsoft.Win32.SafeHandles.SafeFileHandle, ByRef lpFileInformation As BY_HANDLE_FILE_INFORMATION) As Integer
  End Function

  <System.Runtime.InteropServices.DllImport("kernel32.dll", SetLastError:=True, CharSet:=System.Runtime.InteropServices.CharSet.Auto)> _
  Function CreateFile(ByVal lpFileName As String, ByVal dwDesiredAccess As UInt32, ByVal dwShareMode As UInt32, _
                      ByVal lpSecurityAttributes As IntPtr, ByVal dwCreationDisposition As UInt32, _
                      ByVal dwFlagsAndAttributes As UInt32, ByVal hTemplateFile As IntPtr) As Microsoft.Win32.SafeHandles.SafeFileHandle
  End Function

End Module

Comments

  • Although this will work "most of the time", it is impossible in Windows (and probably in other OSes) to be 100% sure that two paths or two handles point to the same file, because anything can be virtualized or redirected by file system drivers, antivirus software, virtual disks, etc. So it really depends on what you do with this kind of function, but if you're working for NASA or the DoD, beware :-)

  • Excellent (useful) algorithm. It could be a method of the System.IO.Path (or File, if more convenient) class :-)

  • If you're looking for a solution that allows you to find the "canonical" form of the name (for instance to put the names into a hash table or something), there's an example of that in Writing Secure Code (2nd ed.), pages 386-390. Basically it prepends "\\?\" and calls GetFullPathName followed by GetLongPathName. It also rejects out of hand any paths that are longer than MAX_PATH or that contain invalid characters (this is a security book after all, so it prefers to just call a.text::$DATA invalid than try to handle it).

    Anyhow, it's an interesting approach that I wish was wrapped in an API somewhere.

  • The documentation for GetFileInformationByHandle says: "nFileIndexLow: Low-order part of a unique identifier that is associated with a file. This value is useful ONLY WHILE THE FILE IS OPEN by at least one process. If no processes have it open, the index may change the next time the file is opened."
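
The canonical-name idea mentioned in the comments (GetFullPathName followed by GetLongPathName) might be sketched roughly as below. This is only an illustration of those two calls, not the code from the book: the module name, the TryGetCanonicalName helper and the Demo routine are hypothetical, and the sketch skips the "\\?\" prefix and the MAX_PATH and invalid-character checks that the comment describes. Case differences are left to a case-insensitive comparer on the hash table.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Runtime.InteropServices
Imports System.Text

Module CanonicalNameSketch

  ' Assumption: a buffer this large avoids the need for a resize-and-retry loop.
  Private Const MaxPathChars As Integer = 32767

  <DllImport("kernel32.dll", SetLastError:=True, CharSet:=CharSet.Unicode)> _
  Private Function GetFullPathName(ByVal lpFileName As String, ByVal nBufferLength As UInt32, _
                                   ByVal lpBuffer As StringBuilder, ByVal lpFilePart As IntPtr) As UInt32
  End Function

  <DllImport("kernel32.dll", SetLastError:=True, CharSet:=CharSet.Unicode)> _
  Private Function GetLongPathName(ByVal lpszShortPath As String, ByVal lpszLongPath As StringBuilder, _
                                   ByVal cchBuffer As UInt32) As UInt32
  End Function

  ''' <summary>Hypothetical helper: returns a normalized name, or Nothing if either call fails.</summary>
  Function TryGetCanonicalName(ByVal fileName As String) As String
    ' Step 1: resolve relative paths against the current drive and directory.
    Dim full As New StringBuilder(MaxPathChars)
    Dim n = GetFullPathName(fileName, CUInt(full.Capacity), full, IntPtr.Zero)
    If n = 0 OrElse n >= full.Capacity Then Return Nothing

    ' Step 2: expand 8.3 components to long names. This fails if the file does not exist.
    Dim expanded As New StringBuilder(MaxPathChars)
    n = GetLongPathName(full.ToString(), expanded, CUInt(expanded.Capacity))
    If n = 0 OrElse n >= expanded.Capacity Then Return Nothing

    Return expanded.ToString()
  End Function

  Sub Demo()
    ' Case is not normalized above, so the table itself compares keys case-insensitively.
    Dim seen As New Dictionary(Of String, Boolean)(StringComparer.OrdinalIgnoreCase)
    For Each name In New String() {"c:\share\longfi~1.txt", "C:\SHARE\LONGFILENAME.TXT"}
      Dim key = TryGetCanonicalName(name)
      If key IsNot Nothing Then seen(key) = True
    Next
    Console.WriteLine("Distinct names: " & seen.Count)
  End Sub

End Module
```

Unlike the handle-based comparison in the post, a string key like this can be stored after the file is closed, but it only normalizes the spelling of the path. It will not recognize that a UNC name and a local name refer to the same file.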
