Larry Osterman's WebLog

Confessions of an Old Fogey

My favorite APIs: TransmitFile



Back in NT 3.5ish, Microsoft first deployed IIS, our web server.  The people working on it very quickly realized that the server wasn’t up to snuff performance-wise.

A bit of profiling and they realized the problem – the application was doing:

    CreateFile(inputFile);
    while (!eof(inputFile))
    {
        ReadFile(inputFile, &inputBuffer);   // kernel file cache -> user-mode buffer
        WriteFile(socket, inputBuffer);      // user-mode buffer -> kernel TCP buffer
    }

The problem was the transfer through inputBuffer.  The file was in the cache in the kernel (from the ReadFile call), but the data had to be copied from the cache into the user mode inputBuffer, then back into the kernel to the in-kernel TCP buffer.  All those copies were making the web server CPU bound.

Needless to say, this wasn’t a good situation, so for NT 3.51, a new system call was added: TransmitFile.  TransmitFile was tailored for the web server, because it allowed the user to specify a set of user buffers that were to be prepended and appended to the file before and after the transfer.  So a web server could put the HTTP response headers in the prepend portion, and any trailing data needed in the append portion.

It turns out that TransmitFile was absolutely perfect for Exchange, because we were able to take advantage of it for both our POP3 and IMAP servers.  For POP3, for example, we set the prepend portion to the +OK response and the append portion to “<crlf>.<crlf>”.
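As a rough illustration of the prepend/append idea, here is a minimal sketch of a POP3-style send. The helper names (format_pop3_head, send_pop3_message, POP3_TAIL) are made up for this example, the exact wording of the +OK line is an assumption, and the message file is assumed to be stored already in the form POP3 expects on the wire. The Win32 part is wrapped in #ifdef _WIN32 only so that the small formatting helper can be compiled on its own elsewhere.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the "+OK" line that goes in front of the file.
   Returns the number of bytes written, or -1 if the buffer is too small. */
static int format_pop3_head(char *out, size_t cap, unsigned long octets)
{
    int n;
    assert(out != NULL);
    n = snprintf(out, cap, "+OK %lu octets\r\n", octets);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* The terminator that goes after the file: <crlf>.<crlf> */
static char POP3_TAIL[] = "\r\n.\r\n";

#ifdef _WIN32
#include <winsock2.h>
#include <mswsock.h>   /* TransmitFile, TRANSMIT_FILE_BUFFERS (link with Mswsock.lib) */

/* Send head + file contents + tail with a single TransmitFile call.
   Sketch only: blocking call (no OVERLAPPED), minimal error handling. */
static BOOL send_pop3_message(SOCKET s, HANDLE file)
{
    char head[64];
    int headLen;
    TRANSMIT_FILE_BUFFERS bufs;
    DWORD size = GetFileSize(file, NULL);

    if (size == INVALID_FILE_SIZE)
        return FALSE;
    headLen = format_pop3_head(head, sizeof head, (unsigned long)size);
    if (headLen < 0)
        return FALSE;

    bufs.Head = head;                        /* sent before the file */
    bufs.HeadLength = (DWORD)headLen;
    bufs.Tail = POP3_TAIL;                   /* sent after the file  */
    bufs.TailLength = (DWORD)strlen(POP3_TAIL);

    /* 0 bytes-to-write means "the whole file"; 0 bytes-per-send means
       "use the default send size". */
    return TransmitFile(s, file, 0, 0, NULL, &bufs, 0);
}
#endif
```

A web server would do the same thing with the HTTP response headers in Head and, typically, nothing in Tail.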

Now TransmitFile can be tricky to use: you need to lock the TCP/IP connection for the duration of the call, so that nothing else writes to it until the TransmitFile completes.  If you attempt to send data on the connection before the TransmitFile completes, the results are “undefined”: it might work, or you might get the data interleaved with the file contents.
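One way to picture the locking is a per-connection critical section that every writer has to take. This is only a sketch under assumptions: the Connection wrapper and the conn_* functions are hypothetical names invented here, the socket is used in blocking fashion, and TransmitFile is called with a NULL OVERLAPPED pointer so that it is expected to return only after the transfer has finished. A server using overlapped I/O would instead have to hold off further sends until the TransmitFile completion arrives.

```c
#include <assert.h>

#ifdef _WIN32
#include <winsock2.h>
#include <mswsock.h>   /* TransmitFile (link with Mswsock.lib and Ws2_32.lib) */

/* Hypothetical wrapper: all writes to the socket go through sendLock. */
typedef struct Connection {
    SOCKET           sock;
    CRITICAL_SECTION sendLock;
} Connection;

static void conn_init(Connection *c, SOCKET s)
{
    assert(c != NULL);
    c->sock = s;
    InitializeCriticalSection(&c->sendLock);
}

static void conn_destroy(Connection *c)
{
    DeleteCriticalSection(&c->sendLock);
}

/* Ordinary send: waits here if a TransmitFile is in progress on this
   connection, so its bytes cannot land in the middle of the file. */
static int conn_send(Connection *c, const char *data, int len)
{
    int rc;
    EnterCriticalSection(&c->sendLock);
    rc = send(c->sock, data, len, 0);
    LeaveCriticalSection(&c->sendLock);
    return rc;
}

/* File send: the lock is held until TransmitFile returns, so no other
   writer can touch the socket while the file is going out. */
static BOOL conn_transmit_file(Connection *c, HANDLE file,
                               TRANSMIT_FILE_BUFFERS *bufs)
{
    BOOL ok;
    EnterCriticalSection(&c->sendLock);
    ok = TransmitFile(c->sock, file, 0, 0, NULL, bufs, 0);
    LeaveCriticalSection(&c->sendLock);
    return ok;
}
#endif
```

The cost is that every other writer on the connection stalls for the length of the file transfer, which is usually what you want anyway, since their data has to follow the file on the wire.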


  • Lock the TCP/IP connection?
  • Sorry, what I mean is that you can't send anything on the socket while the TransmitFile API is in progress. Unlike other socket I/O (where TCP will sequence the traffic in the order the requests are submitted), the data for TransmitFile will be sent immediately, regardless of what data is queued in the socket. Similarly, when you want to send data AFTER the TransmitFile is submitted, you need to wait for the TransmitFile to complete before you can submit the next write (or send()).
  • Interesting. Why was the decision not made to support non-blocking mode? Surely it would have been neater if sendfile had sent all of the data that it could possibly and then told you "I sent 10K of data out of the 30K that you wanted. Any more and I would have blocked", so that you could reissue it when you next got an available write?

    At the end of the day, I imagine it has the same mechanics, but how would you, for example, cancel a request in progress? I might be missing something here :)
  • As far as my reading of the documentation goes, TransmitFile supports overlapped i.e. asynchronous I/O. It signals once the whole file has been transferred.

    If you need to cancel a request in progress, call CancelIo, casting your socket handle to a HANDLE. It's a dirty secret of Winsock on NT-based systems that a SOCKET is a HANDLE in disguise. This means you can play tricks like associating a socket with an I/O completion port to get good concurrent behaviour. For more, see http://msdn.microsoft.com/msdnmag/issues/1000/Winsock/default.aspx.
  • Thanks Mike, and thanks for the link, I hadn't seen that one before.
  • You can say those 7 words on Australian TV after 9:30pm (or 9pm or something).

    I had a similar issue with extracting files stored as resources. The largest file was 60K and there were 7 altogether. Reading into memory (A$=A$ & readbyte) was taking half an hour. Switching to the file-writing APIs (ret=writefile(readbyte)) improved the speed to under a minute (still slow).

    I thought that in my case I was freaking out the VB memory manager, as each loop iteration would create multiple implied temporary variables.
  • Larry:
    Hmmm... that's odd. Does that apply even if you've queued up several asynchronous IO requests? Or does it play nice in that instance?


    Eg.

    Overlapped Send - 1
    Overlapped Send - 2
    Transmit File - 3
    Overlapped Send - 4
    Overlapped Send - 5

    ... will they all come out in the order 1,2,3,4,5 ?

    I'd imagine so and that this is only a problem with blocking sockets, but I figured I'd check :)
  • Nope, it doesn't play nice. If you did:

    Overlapped Send - 1
    Overlapped Send - 2
    Overlapped Send - 3

    they'd show up 1,2,3

    But TransmitFile starts transmitting immediately. If you do a send while the TransmitFile API is outstanding, then the send data might show up in the middle of the TransmitFile output.

    TransmitFile is extremely powerful, but as I mentioned, it can be tricky.
  • OK... I understand now.

    Ouch. I guess there's a limit where it's faster to just queue up the data transfer instead of using transmit file because of this locking overhead... but I get the feeling that it's probably a really tiny edge case.

    Thanks Larry - that's a gotcha I wouldn't have thought of.

    BTW: Looking through the MSDN doco, I don't see a mention of this gotcha... any chance that you could file a bug in their RAID bin?
