A very common pattern is to let a caller first ask for the number of bytes (or elements) required and then ask for the data. Many user mode Win32 APIs (like RegQueryValueEx) and kernel mode APIs (like IoGetDeviceProperty) implement it. You first ask for the number of bytes needed (passing NULL and a pointer to a size), allocate that much memory, and then ask again, passing the buffer and its size, to get the data. It is a very useful pattern. So how do you implement it in a driver? Well, there are 2 ways I can think of, and both use IOCTLs.

The first implementation is to define a structure which has a size as its first field and then return the number of bytes/elements required in the first field. For instance,

    typedef struct _MY_ELEMENT {
         ULONG Id;
         UCHAR Name[32];
    } MY_ELEMENT, *PMY_ELEMENT;
    
    typedef struct _MY_ELEMENT_LIST {
        ULONG NumElements;
        MY_ELEMENT ElementList[1]; // open ended array
    } MY_ELEMENT_LIST, *PMY_ELEMENT_LIST;
    
    // An easy way to figure out how many bytes are in the structure before the list
    // of elements
    #define MY_ELEMENT_LIST_HEADER_SIZE (FIELD_OFFSET(MY_ELEMENT_LIST, ElementList))
    
    size_t length;
    NTSTATUS status;
    PMY_ELEMENT_LIST pList = NULL;
    ULONG_PTR information = 0x0;
    
    switch (IoctlCode) {
    case IOCTL_GET_ELEMENTS:
        //  This will make sure that buffer length is at least sizeof(ULONG)
        status = WdfRequestRetrieveOutputBuffer(Request, sizeof(ULONG), (PVOID*) &pList, &length);
        if (!NT_SUCCESS(status)) {
            // do nothing since buffer length is < sizeof(ULONG), we will break out and return error after the if() clause
        } 
        else if (length == sizeof(ULONG)) {
            pList->NumElements = [get number of elements];
            // report sizeof(ULONG) bytes written so the count makes it back to the caller
            information = sizeof(ULONG);
        }
        else {
            // compute how many elements we can put into the buffer
            length -= MY_ELEMENT_LIST_HEADER_SIZE;

            // check to make sure the buffer is an integral number of MY_ELEMENTs
            if ((length % sizeof(MY_ELEMENT)) != 0) {
                status = STATUS_INVALID_PARAMETER;
            }
            else {
                pList->NumElements = [get number of elements];
                [copy elements into pList->ElementList]
    
                // compute how many bytes we copied over
                information = MY_ELEMENT_LIST_HEADER_SIZE  + 
                              [num elements copied]*sizeof(MY_ELEMENT);
                status = STATUS_SUCCESS;
            }
        }
        break;
    }
    ...
    WdfRequestCompleteWithInformation(Request, status, information);
 

In this example, we return the number of elements required in the list. We first check the size of the output buffer; WdfRequestRetrieveOutputBuffer helps with the initial check. If the buffer is smaller than a ULONG, it is an error. If it is exactly the size of a ULONG, we know that the caller is asking for the number of elements, so we return that count in the supplied output buffer. If it is larger than a ULONG, we first make sure that the space after the header holds an integral number of elements (what good is a partial element?) and then copy over as many elements as possible. The caller would have the following code to get the data.

    PMY_ELEMENT_LIST pList;
    ULONG numElements, listSize, size;

    DeviceIoControl(hDevice, IOCTL_GET_ELEMENTS, NULL, 0, &numElements, sizeof(numElements), ...);

    listSize = MY_ELEMENT_LIST_HEADER_SIZE + numElements * sizeof(MY_ELEMENT);
    pList = (PMY_ELEMENT_LIST) LocalAlloc(LPTR, listSize);

    DeviceIoControl(hDevice, IOCTL_GET_ELEMENTS, NULL, 0, pList, listSize, ...);
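
The size checks the driver performs in this first implementation can be tried outside the kernel. The following is a minimal sketch in portable C, assuming `uint32_t` and `uint8_t` as stand-ins for ULONG and UCHAR and `offsetof` in place of FIELD_OFFSET. The helper `elements_that_fit` is a hypothetical name used only for illustration; it is not part of KMDF.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable stand-ins for the structures above (assumption: ULONG is 32 bits). */
typedef struct {
    uint32_t Id;
    uint8_t  Name[32];
} MY_ELEMENT;

typedef struct {
    uint32_t   NumElements;
    MY_ELEMENT ElementList[1]; /* open ended array */
} MY_ELEMENT_LIST;

/* Bytes in the structure before the list of elements. */
#define MY_ELEMENT_LIST_HEADER_SIZE (offsetof(MY_ELEMENT_LIST, ElementList))

/*
 * Hypothetical helper that classifies an output buffer length the same way
 * the IOCTL handler does:
 *   -1  the length is invalid (too small, or a partial element)
 *    0  the caller is only asking for the element count
 *   >0  the number of whole elements that fit after the header
 */
long elements_that_fit(size_t length)
{
    if (length < sizeof(uint32_t)) {
        return -1;
    }
    if (length == sizeof(uint32_t)) {
        return 0;
    }
    if (length < MY_ELEMENT_LIST_HEADER_SIZE) {
        return -1;
    }

    length -= MY_ELEMENT_LIST_HEADER_SIZE;
    if ((length % sizeof(MY_ELEMENT)) != 0) {
        return -1;
    }
    return (long)(length / sizeof(MY_ELEMENT));
}
```

A buffer of `MY_ELEMENT_LIST_HEADER_SIZE + 2 * sizeof(MY_ELEMENT)` bytes has room for two elements, while one extra byte makes the length invalid.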

In the second implementation we do not use a size field at all. Instead we let the I/O manager return the size to the caller for us. In this implementation we return STATUS_BUFFER_OVERFLOW and the number of required bytes (vs. the required number of elements in the previous example, a very important distinction!). The value of STATUS_BUFFER_OVERFLOW, 0x80000005, is very important in this implementation. This value does not return TRUE for NT_SUCCESS() or NT_ERROR(); it returns TRUE for NT_WARNING(). When an NT_WARNING() value is returned in response to an IOCTL IRP, the I/O manager copies the value of Irp->IoStatus.Information to the lpBytesReturned parameter of DeviceIoControl (and does not copy anything to the caller's output buffer). This allows you to just have an array of elements as your output buffer.
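
To see why 0x80000005 lands in the warning category, here is a small sketch in portable C. The macros follow the definitions in the Windows headers (the top two bits of an NTSTATUS encode the severity), but they are repeated locally, with `int32_t` standing in for NTSTATUS, so that the snippet compiles anywhere. The helper `severity_name` is a hypothetical name for illustration only.

```c
#include <stdint.h>

typedef int32_t NTSTATUS; /* stand-in for the Windows typedef */

#define STATUS_SUCCESS           ((NTSTATUS)0x00000000)
#define STATUS_BUFFER_OVERFLOW   ((NTSTATUS)0x80000005)
#define STATUS_INVALID_PARAMETER ((NTSTATUS)0xC000000D)

/* Severity lives in the top two bits: 0 success, 1 informational, 2 warning, 3 error. */
#define NT_SUCCESS(Status)     (((NTSTATUS)(Status)) >= 0)
#define NT_INFORMATION(Status) ((((uint32_t)(Status)) >> 30) == 1)
#define NT_WARNING(Status)     ((((uint32_t)(Status)) >> 30) == 2)
#define NT_ERROR(Status)       ((((uint32_t)(Status)) >> 30) == 3)

/* Hypothetical helper that names the severity of a status value. */
const char *severity_name(NTSTATUS status)
{
    if (NT_ERROR(status))       return "error";
    if (NT_WARNING(status))     return "warning";
    if (NT_INFORMATION(status)) return "informational";
    return "success";
}
```

Note that NT_SUCCESS() is TRUE for both success and informational values, which is why the warning range is the only one that fails NT_SUCCESS() without being an NT_ERROR().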

NOTE: if Irp->IoStatus.Status is an NT_WARNING() and the output buffer is buffered (e.g. METHOD_BUFFERED), the I/O manager will also try to use Irp->IoStatus.Information as the number of bytes to copy to the output buffer! This means that you should only return the number of bytes required if the output buffer is not present; otherwise the I/O manager can overwrite your process's address space very easily. Checking for the presence of the output buffer is problematic. The easiest solution is to avoid using METHOD_BUFFERED in the IOCTL value and use METHOD_IN_DIRECT or METHOD_OUT_DIRECT instead. Otherwise, you can check for IRP_INPUT_OPERATION in the IRP's Flags field; see MSDN for details on how to do this. Thank you to Ishai Ben Aroya for pointing out this subtle problem!
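
The transfer method is baked into the IOCTL value itself, in its low two bits. The sketch below shows one way the IOCTL might be defined with METHOD_OUT_DIRECT. The CTL_CODE macro and the constants mirror the Windows headers but are repeated locally so the snippet is standalone; the function code 0x800 and the helper `ioctl_method` are assumptions made for illustration, not values from the original driver.

```c
#include <stdint.h>

/* Values as defined in the Windows headers, repeated here for a standalone sketch. */
#define FILE_DEVICE_UNKNOWN 0x00000022
#define FILE_ANY_ACCESS     0
#define METHOD_BUFFERED     0
#define METHOD_IN_DIRECT    1
#define METHOD_OUT_DIRECT   2
#define METHOD_NEITHER      3

#define CTL_CODE(DeviceType, Function, Method, Access)              \
    (((uint32_t)(DeviceType) << 16) | ((uint32_t)(Access) << 14) |  \
     ((uint32_t)(Function) << 2) | (uint32_t)(Method))

/* Hypothetical function code 0x800; the real driver would pick its own. */
#define IOCTL_GET_ELEMENTS \
    CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_OUT_DIRECT, FILE_ANY_ACCESS)

/* Hypothetical helper: extracts the transfer method from an IOCTL value. */
uint32_t ioctl_method(uint32_t code)
{
    return code & 0x3;
}
```

With this definition, `ioctl_method(IOCTL_GET_ELEMENTS)` yields METHOD_OUT_DIRECT, so the output buffer reaches the driver as an MDL rather than as a system buffer that the I/O manager copies back on completion.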

Here is an example implementation of the second approach, where I am assuming METHOD_BUFFERED is not being used:

    PMY_ELEMENT pList = NULL;

    switch (IoctlCode) {
    case IOCTL_GET_ELEMENTS:
        // A minimum size of 0 accepts any output buffer. KMDF documents
        // STATUS_BUFFER_TOO_SMALL for a zero length output buffer, so that
        // status is also treated as a query for the number of required bytes.
        status = WdfRequestRetrieveOutputBuffer(Request, 0, (PVOID*) &pList, &length);
        if (status == STATUS_BUFFER_TOO_SMALL ||
            (NT_SUCCESS(status) && length == 0x0)) {
            // no output buffer: return a warning and the number of required bytes
            status = STATUS_BUFFER_OVERFLOW;
            information = [get number of elements] * sizeof(MY_ELEMENT);
        }
        else if (!NT_SUCCESS(status)) {
            // do nothing, error will propagate after the switch
        }
        else {
            // check to make sure the buffer is an integral number of MY_ELEMENTs
            // this will also handle the case of length < sizeof(MY_ELEMENT)
            if ((length % sizeof(MY_ELEMENT)) != 0) {
                status = STATUS_INVALID_PARAMETER;
            }
            else {
                [copy elements into pList]

                // compute how many bytes we copied over
                information = [num elements copied]*sizeof(MY_ELEMENT);
                status = STATUS_SUCCESS;
            }
        }
        break;
    }
    ...
    WdfRequestCompleteWithInformation(Request, status, information);

And the calling code would look like

    DWORD bytesReturned, listSize;
    PMY_ELEMENT pList;
    BOOL result;

    // This will return FALSE since the driver will return an NT_WARNING value!
    result = DeviceIoControl(hDevice, IOCTL_GET_ELEMENTS, NULL, 0, NULL, 0, &bytesReturned, ...);
    ASSERT(result == FALSE);

    listSize = bytesReturned;
    pList = (PMY_ELEMENT) LocalAlloc(LPTR, listSize);

    DeviceIoControl(hDevice, IOCTL_GET_ELEMENTS, NULL, 0, pList, listSize, ...);
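
The whole round trip of the second implementation can be simulated in user mode to check the arithmetic. This is a sketch under stated assumptions, not driver code: `sim_get_elements` stands in for the IOCTL handler, `sim_query_all` stands in for the caller, the three elements are made-up data, and `malloc` replaces LocalAlloc.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef int32_t NTSTATUS; /* stand-in for the Windows typedef */
#define STATUS_SUCCESS           ((NTSTATUS)0x00000000)
#define STATUS_BUFFER_OVERFLOW   ((NTSTATUS)0x80000005)
#define STATUS_INVALID_PARAMETER ((NTSTATUS)0xC000000D)

typedef struct {
    uint32_t Id;
    uint8_t  Name[32];
} MY_ELEMENT;

/* Made-up data that the simulated driver exposes. */
static const MY_ELEMENT g_elements[3] = { {1, "one"}, {2, "two"}, {3, "three"} };
#define NUM_ELEMENTS (sizeof(g_elements) / sizeof(g_elements[0]))

/* Stand-in for the IOCTL handler: same decisions as the driver code above. */
NTSTATUS sim_get_elements(void *out, size_t length, size_t *information)
{
    size_t count;

    *information = 0;
    if (length == 0) {
        /* no output buffer: warning status plus the number of required bytes */
        *information = NUM_ELEMENTS * sizeof(MY_ELEMENT);
        return STATUS_BUFFER_OVERFLOW;
    }
    if ((length % sizeof(MY_ELEMENT)) != 0) {
        return STATUS_INVALID_PARAMETER;
    }

    /* copy as many whole elements as fit */
    count = length / sizeof(MY_ELEMENT);
    if (count > NUM_ELEMENTS) {
        count = NUM_ELEMENTS;
    }
    memcpy(out, g_elements, count * sizeof(MY_ELEMENT));
    *information = count * sizeof(MY_ELEMENT);
    return STATUS_SUCCESS;
}

/* Stand-in for the caller: query the size, allocate, then fetch the data. */
MY_ELEMENT *sim_query_all(size_t *count)
{
    size_t bytes = 0;
    MY_ELEMENT *list;

    *count = 0;
    if (sim_get_elements(NULL, 0, &bytes) != STATUS_BUFFER_OVERFLOW) {
        return NULL;
    }
    list = malloc(bytes);
    if (list == NULL) {
        return NULL;
    }
    if (sim_get_elements(list, bytes, &bytes) != STATUS_SUCCESS) {
        free(list);
        return NULL;
    }
    *count = bytes / sizeof(MY_ELEMENT); /* bytes returned, converted to elements */
    return list;
}
```

The caller gets back a byte count in both calls, so it divides by `sizeof(MY_ELEMENT)` only at the very end, which is the distinction from the first implementation.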