I've been writing a number of helpful tools at work, such as a tool to extract transforms and cabinets from a patch, and wasn't satisfied relying on the file extension to identify a patch or, for that matter, any other Windows Installer file type. File extensions are only one way to help the shell invoke actions on files, filter file types in common dialogs, and more.

If you try to open an .msp file with the MsiOpenDatabase function, you must include the MSIDBOPEN_PATCHFILE flag in the szPersist parameter or the function fails. Passing this flag for other Windows Installer file types also results in an error. Changing the file extension makes no difference. Clearly, something other than the extension helps Windows Installer functions validate that files are of a certain type.

Recall from What's in a Patch that Windows Installer relies on OLE structured storage. Embedded transforms are stored as sub-storages, while embedded cabinets and binary data fields are streams. Storages are identified by class identifiers (CLSIDs), which you can get using the structured storage functions. You can open any of the supported Windows Installer file types with either the StgOpenStorage or the StgOpenStorageEx function. This provides a pointer to the IStorage interface, and with that pointer we can call the IStorage::Stat method. That fills in a STATSTG structure that contains the CLSID for the storage. If you compare the CLSID values for the various Windows Installer file types, you'll find a few differences, as seen in the table below.

| Description | Extension | CLSID |
| --- | --- | --- |
| Installer Package | msi | {000C1084-0000-0000-C000-000000000046} |
| Merge Module | msm | {000C1084-0000-0000-C000-000000000046} |
| Patch Package | msp | {000C1086-0000-0000-C000-000000000046} |
| Transform | mst | {000C1082-0000-0000-C000-000000000046} |
| Patch Creation Properties | pcp | {000C1084-0000-0000-C000-000000000046} |
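To make the comparison concrete, here is a minimal sketch that maps a CLSID string to the file types in the table above. The function name DescribeInstallerClsid is hypothetical, and it takes a narrow string for portability, whereas StringFromGUID2 produces a wide string, so treat this as an illustration rather than a drop-in helper.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not part of any Windows API): maps a CLSID string
// to the Windows Installer file types listed in the table.
std::string DescribeInstallerClsid(std::string clsid)
{
    // Normalize to upper case so that "000c1084" and "000C1084" compare equal.
    std::transform(clsid.begin(), clsid.end(), clsid.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });

    if (clsid == "{000C1082-0000-0000-C000-000000000046}")
        return "Transform (.mst)";
    if (clsid == "{000C1084-0000-0000-C000-000000000046}")
        return "Installer Package, Merge Module, or Patch Creation Properties (.msi, .msm, .pcp)";
    if (clsid == "{000C1086-0000-0000-C000-000000000046}")
        return "Patch Package (.msp)";

    return "Not a known Windows Installer file type";
}
```

Note that .msi, .msm, and .pcp files share a CLSID, so the CLSID alone cannot tell those three apart.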

Keep in mind that this information could change at any time. That said, the Windows Installer team is typically pretty good about maintaining backward compatibility. The following example code shows how to discover this information.

HRESULT hr = NOERROR;
LPSTORAGE pStorage = NULL;
STATSTG stg = { 0 };
const int cchCLSID = 40;
OLECHAR szCLSID[cchCLSID];

// Argument validation excluded for brevity.

// Open the storage file read-only and exclusively. Files like MSPs seem to require the exclusive flag.
hr = StgOpenStorage(argv[1], NULL, STGM_READ | STGM_SHARE_EXCLUSIVE, NULL, 0, &pStorage);
if (SUCCEEDED(hr) && pStorage)
{
    // Get the stats for the storage document.
    hr = pStorage->Stat(&stg, STATFLAG_DEFAULT);
    if (SUCCEEDED(hr))
    {
        // Print STATSTG.clsid.
        if (StringFromGUID2(stg.clsid, szCLSID, cchCLSID))
        {
            wprintf(L"%ls\n", szCLSID);
        }

        // STATFLAG_DEFAULT allocates the name, so the caller must free it.
        CoTaskMemFree(stg.pwcsName);
    }

    pStorage->Release();
    pStorage = NULL;
}

For use in C and C++ source code, example definitions follow:

EXTERN_C const CLSID CLSID_MsiTransform = {0xC1082, 0x0, 0x0, {0xC0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x46}};
EXTERN_C const CLSID CLSID_MsiPackage = {0xC1084, 0x0, 0x0, {0xC0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x46}};
EXTERN_C const CLSID CLSID_MsiPatch = {0xC1086, 0x0, 0x0, {0xC0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x46}};
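As a quick sanity check on how those initializers line up with the braced strings in the table, here is a small sketch that formats the same values. The Guid struct is only a stand-in that mirrors the layout of the Windows GUID type so the sample builds anywhere, and the names FormatGuid, kMsiTransform, kMsiPackage, and kMsiPatch are made up for this illustration. On Windows you would use the real CLSID type and StringFromGUID2 instead.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Stand-in for the Windows GUID/CLSID struct (same field layout),
// assumed here only so the sketch compiles without Windows headers.
struct Guid
{
    std::uint32_t Data1;
    std::uint16_t Data2;
    std::uint16_t Data3;
    std::uint8_t  Data4[8];
};

// Same initializer values as the CLSID definitions above.
const Guid kMsiTransform = {0xC1082, 0x0, 0x0, {0xC0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x46}};
const Guid kMsiPackage   = {0xC1084, 0x0, 0x0, {0xC0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x46}};
const Guid kMsiPatch     = {0xC1086, 0x0, 0x0, {0xC0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x46}};

// Formats a Guid in the braced registry style, e.g. {000C1084-0000-...}.
std::string FormatGuid(const Guid& g)
{
    char buf[39]; // 38 characters plus the terminating null
    std::snprintf(buf, sizeof(buf),
                  "{%08X-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X}",
                  static_cast<unsigned>(g.Data1),
                  static_cast<unsigned>(g.Data2),
                  static_cast<unsigned>(g.Data3),
                  static_cast<unsigned>(g.Data4[0]), static_cast<unsigned>(g.Data4[1]),
                  static_cast<unsigned>(g.Data4[2]), static_cast<unsigned>(g.Data4[3]),
                  static_cast<unsigned>(g.Data4[4]), static_cast<unsigned>(g.Data4[5]),
                  static_cast<unsigned>(g.Data4[6]), static_cast<unsigned>(g.Data4[7]));
    return buf;
}
```

The first three fields are zero-padded to 8, 4, and 4 hex digits, and the first two bytes of the last field form the fourth group, which is why 0xC1084 shows up as 000C1084 and the leading 0xC0 byte as C000.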

You can download the full implementation for a tool to print all STATSTG fields for structured storage files using the compound file implementation.