<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Pointless Blathering : DMA</title><link>http://blogs.msdn.com/peterwie/archive/tags/DMA/default.aspx</link><description>Tags: DMA</description><dc:language>en</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>GetScatterGatherList will coalesce contiguous SG list entries </title><link>http://blogs.msdn.com/peterwie/archive/2006/05/25/607608.aspx</link><pubDate>Fri, 26 May 2006 03:24:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:607608</guid><dc:creator>PeterWie</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/peterwie/comments/607608.aspx</comments><wfw:commentRss>http://blogs.msdn.com/peterwie/commentrss.aspx?PostID=607608</wfw:commentRss><description>&lt;P&gt;GregH asked me an interesting question at WinHEC today - one for which I didn't immediately know the answer.&amp;nbsp; He was wondering whether the new DMA DDIs (Get|PutScatterGatherList) would return a single SG entry for each page, or whether they would coalesce physically contiguous pages into a single entry.&lt;/P&gt;
&lt;P&gt;The answer is that the coalescing is "kind of complex and probably won't happen in a way that will make complete sense to someone just looking at the generated [SG] lists".&amp;nbsp; To anyone who is familiar with the DMA DDIs that sort of answer probably isn't surprising :)&lt;/P&gt;
&lt;P&gt;On X86 &amp;amp; for non-scatter-gather devices, the bounce buffers are allocated in big contiguous blocks and the DMA DDI will coalesce entries where possible.&amp;nbsp; Of course non-SG devices always get one big contiguous block no matter what, and so they'll only have a single entry.&lt;/P&gt;
&lt;P&gt;On AMD64 the bounce buffers for an SG adapter aren't necessarily contiguous so there may be less coalescing.&lt;/P&gt;
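&lt;P&gt;To make "coalescing" concrete, here is a minimal sketch of the general idea, written as a plain Python model rather than driver code.&amp;nbsp; It is not the HAL's actual algorithm (which, as noted above, is more complex).&amp;nbsp; The 4 KB page size, the function name coalesce and the sample addresses are all assumptions made up for illustration.&lt;/P&gt;
&lt;PRE&gt;
```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def coalesce(page_addresses):
    """Merge runs of physically contiguous pages into (address, length) pairs.

    This is a toy model of the idea only, not what the DMA DDI really does.
    """
    entries = []
    for address in page_addresses:
        if entries and entries[-1][0] + entries[-1][1] == address:
            # This page starts exactly where the previous entry ends: extend it.
            start, length = entries[-1]
            entries[-1] = (start, length + PAGE_SIZE)
        else:
            # Not adjacent to the previous entry: start a new one.
            entries.append((address, PAGE_SIZE))
    return entries


# Hypothetical physical page addresses: a run of three, a run of two, one alone.
pages = [0x10000, 0x11000, 0x12000, 0x40000, 0x41000, 0x90000]
for start, length in coalesce(pages):
    print(hex(start), length)
```
&lt;/PRE&gt;
&lt;P&gt;Six pages collapse into three entries here, but a fully fragmented buffer would still produce one entry per page, which is the worst case a driver has to size for.&lt;/P&gt;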
&lt;P&gt;Still, since you always have to plan for the worst case, this ends up being an optimization but shouldn't change your driver's behavior (unless your device uses single-page SG entries rather than address/length pairs).&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=607608" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/peterwie/archive/tags/Device+Drivers+General/default.aspx">Device Drivers General</category><category domain="http://blogs.msdn.com/peterwie/archive/tags/DMA/default.aspx">DMA</category></item><item><title>What is DMA (Part 9): I/O MMUs and why you'll wish you'd used the DMA DDI in 3 or 4 years</title><link>http://blogs.msdn.com/peterwie/archive/2006/05/11/595460.aspx</link><pubDate>Thu, 11 May 2006 18:19:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:595460</guid><dc:creator>PeterWie</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/peterwie/comments/595460.aspx</comments><wfw:commentRss>http://blogs.msdn.com/peterwie/commentrss.aspx?PostID=595460</wfw:commentRss><description>&lt;P&gt;&lt;FONT size=2&gt;&lt;EM&gt;Perhaps you've read through all of my posts about DMA and still think that using the DMA DDIs is optional.&amp;nbsp; After all you've built a 64-bit card, and you're not using DAC so you don't have to worry about buses being downgraded on you (or maybe you don't believe in the bogeyman either).&amp;nbsp; Sure someone might build an X86 machine that has actual map registers (companies have done such things before), but what are the chances you'll ever have to run on one of them?&amp;nbsp; The reason to use them anyway is coming in the next few years.&lt;/EM&gt;&lt;/FONT&gt;&amp;nbsp; &lt;/P&gt;
&lt;P&gt;There are certain problems I hate debugging.&amp;nbsp; Any problem that involves a mis-programmed DMA controller is high on my list.&amp;nbsp; These look like random memory corruption and usually don't reproduce well.&amp;nbsp; If you're lucky the corruption is identifiable (like a text file, or WAV data).&amp;nbsp; If not you sit back and collect repros until the pool of machines is large enough to determine that they all have the same network controller.&lt;/P&gt;
&lt;P&gt;And that's just DMA that's accidentally gone wrong.&amp;nbsp; Since DMA gets around all page protections it could be used to steal data as well.&amp;nbsp; Of course that's a moot point for a kernel-mode driver since it can steal anything it wants anyway.&amp;nbsp; But if we ever want to allow direct hardware access from a guest OS (on&amp;nbsp;a Virtual PC system) or from user mode, DMA is going to be the killer issue.&amp;nbsp; It's hard to build a secure system when one of the restricted components can tell its device to read any page on the computer.&lt;/P&gt;
&lt;P&gt;Enter the IOMMU*.&amp;nbsp; Just like the regular MMU creates virtual address spaces from physical memory, the IOMMU creates logical address spaces for each &lt;EM&gt;device&lt;/EM&gt; (function actually) on your PCI-X bus.&amp;nbsp; Smart people tell me these should be in "every" new system by the end of the decade, and I for one am very excited.&lt;/P&gt;
&lt;P&gt;In a nutshell - the IOMMU has page tables for each bus/device/function that describe a logical address space (similar to the CPU page tables).&amp;nbsp; When your device attempts to read a logical address L, the IOMMU does a lookup, finds the appropriate physical page, and returns that page's contents.&amp;nbsp; If there's no mapping, or if the protections only allow&amp;nbsp;write, then the DMA is "logged" and isn't allowed to happen.&amp;nbsp; It's much more complex than that, involving lots of electrons and transistors and such, and I've only seen a glimmer of it so I can't talk in detail about how it works.&lt;/P&gt;
&lt;P&gt;In short, the IOMMU lets us control which pages each device can access through DMA.&amp;nbsp; We could, for example, block access to core kernel information (interrupt &amp;amp;&amp;nbsp;system call tables, non-paged kernel code, etc.) by leaving them out of the device page maps.&amp;nbsp; We could allow a Virtual PC guest to directly access a device since we could now virtualize the DMA operations as well - ensuring the device can't access any pages that aren't in the guest's physical address pool.&amp;nbsp; We could coalesce fragmented transfers or make 32-bit devices work on 64-bit systems without any copies (if there are still 32-bit devices in three years :).&lt;/P&gt;
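&lt;P&gt;The lookup described above can be sketched as a toy software model.&amp;nbsp; This is only an illustration of the concept, not how any real IOMMU is implemented.&amp;nbsp; The class name ToyIommu, its methods, the permission strings and the 4 KB page size are all hypothetical.&lt;/P&gt;
&lt;PRE&gt;
```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class ToyIommu:
    """Toy model: one page table per (bus, device, function)."""

    def __init__(self):
        # Maps a (bus, device, function) tuple to a dict of
        # logical page number to (physical page number, permissions).
        self.tables = {}
        self.faults = []  # blocked accesses are "logged" here

    def map_page(self, bdf, logical_page, physical_page, perms):
        self.tables.setdefault(bdf, {})[logical_page] = (physical_page, perms)

    def translate(self, bdf, logical_address, access):
        """Return the physical address, or None if the access is blocked."""
        page, offset = divmod(logical_address, PAGE_SIZE)
        mapping = self.tables.get(bdf, {}).get(page)
        if mapping is None or access not in mapping[1]:
            # No mapping, or the protections do not allow this kind of access.
            self.faults.append((bdf, logical_address, access))
            return None
        return mapping[0] * PAGE_SIZE + offset


iommu = ToyIommu()
nic = (0, 3, 0)  # hypothetical bus/device/function
iommu.map_page(nic, logical_page=5, physical_page=0x1234, perms="r")
print(iommu.translate(nic, 5 * PAGE_SIZE + 16, "r"))  # mapped and readable
print(iommu.translate(nic, 5 * PAGE_SIZE + 16, "w"))  # blocked and logged
```
&lt;/PRE&gt;
&lt;P&gt;A device with no table at all cannot reach any page, which is the property that makes the protection and virtualization scenarios possible.&lt;/P&gt;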
&lt;P&gt;Of course we'd want to enable this in a way which didn't break existing drivers, or at least "well-behaved" ones.&amp;nbsp; To do this, we'd need to add code to some function calls the driver makes before and after every DMA operation.&amp;nbsp; Maybe something like GetScatterGatherList and PutScatterGatherList?&lt;/P&gt;
&lt;P&gt;We're only just starting to make decisions about what we'll do with this in the base OS.&amp;nbsp; It's clearly interesting, but how much of it we can use is up in the air.&amp;nbsp; However even now a couple things are clear:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;This can help with some of the reliability problems around DMA&lt;/LI&gt;
&lt;LI&gt;This helps most if we can tie it into every DMA operation a driver initiates&lt;/LI&gt;
&lt;LI&gt;The most logical place to do that is building it into the DMA DDI (whatever that may look like in the future).&lt;/LI&gt;&lt;/OL&gt;
&lt;P&gt;So to summarize: even if you don't see a reason you need to use the DMA DDI today, that reason is looming on the horizon.&amp;nbsp; I think we'll see demand from system administrators, IT departments &amp;amp; OEMs with high support costs to start using the IOMMU as a protection mechanism.&amp;nbsp; It's going to see usage in the virtualization space to allow direct hardware access (and on an enlightened OS I'm sure it will leak into the DMA functions).&lt;/P&gt;
&lt;P&gt;If you're using the DMA DDIs already you're probably in good shape.&amp;nbsp; If you're not, you should start thinking about how you'll integrate them during one of your future design changes.&lt;/P&gt;
&lt;HR&gt;

&lt;P&gt;&lt;FONT size=2&gt;* IOMMU is AMD's term for it.&amp;nbsp; Intel uses some different term that starts with V, which is less catchy and which I can't remember.&amp;nbsp; So for now assume IOMMU applies to both.&lt;/FONT&gt;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=595460" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/peterwie/archive/tags/Device+Drivers+General/default.aspx">Device Drivers General</category><category domain="http://blogs.msdn.com/peterwie/archive/tags/DMA/default.aspx">DMA</category></item><item><title>What is DMA (Part 8) - BuildScatterGatherList</title><link>http://blogs.msdn.com/peterwie/archive/2006/04/06/570431.aspx</link><pubDate>Fri, 07 Apr 2006 03:05:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:570431</guid><dc:creator>PeterWie</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/peterwie/comments/570431.aspx</comments><wfw:commentRss>http://blogs.msdn.com/peterwie/commentrss.aspx?PostID=570431</wfw:commentRss><description>&lt;P&gt;&lt;EM&gt;Since I'm on vacation next week, I thought I'd tackle something light this week.&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;Last time I talked about GetScatterGatherList and PutScatterGatherList and how much better they are than the older method of doing DMA.&amp;nbsp; But as much as I like these two functions, they have one major problem that hit us while we were working on the storage stack - they allocate memory.&lt;/P&gt;
&lt;P&gt;During the development of Windows XP one of our goals in the storage team was to ensure we could successfully issue disk I/O operations even if the pool was exhausted or the system address space was full.&amp;nbsp; In Windows even kernel memory is pageable, and when you can't page in a kernel thread's stack the system can't really continue.&amp;nbsp; The kernel can crash if a page-in fails due to a bad block on disk, or due to some driver returning STATUS_INSUFFICIENT_RESOURCES.&lt;/P&gt;
&lt;P&gt;AllocateAdapterChannel &amp;amp; MapTransfer would work for us, but the performance isn't good on modern systems since you can't issue more than one request at a time through the channel.&amp;nbsp; We needed something new.&lt;/P&gt;
&lt;P&gt;The trick to making forward progress even when you can't freely allocate memory is to preallocate all the resources you need for at least one I/O operation for use in emergencies.&amp;nbsp; When you get an I/O you try to allocate what you need, and if you can't get it you try to use the "emergency" resources.&amp;nbsp; If those aren't available you queue the request for later processing when the emergency resources are free.&lt;/P&gt;
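&lt;P&gt;The pattern described above can be sketched roughly as follows.&amp;nbsp; This is a plain Python model of the idea, not the storage stack's actual code.&amp;nbsp; The class ForwardProgressQueue, the try_allocate callback and the returned status strings are hypothetical names invented for this example.&lt;/P&gt;
&lt;PRE&gt;
```python
from collections import deque


class ForwardProgressQueue:
    """Toy model of the preallocate-for-emergencies pattern."""

    def __init__(self, try_allocate):
        self.try_allocate = try_allocate        # returns a context, or None on failure
        self.emergency = {"name": "emergency"}  # preallocated once, up front
        self.emergency_busy = False
        self.waiting = deque()                  # requests parked until emergency is free
        self.in_flight = []                     # (request, context) pairs

    def submit(self, request):
        context = self.try_allocate()
        if context is None:
            if self.emergency_busy:
                # Nothing available right now: queue for later processing.
                self.waiting.append(request)
                return "queued"
            self.emergency_busy = True
            context = self.emergency
        self.in_flight.append((request, context))
        return "emergency" if context is self.emergency else "normal"

    def complete(self, request):
        for index, (pending, context) in enumerate(self.in_flight):
            if pending == request:
                del self.in_flight[index]
                break
        else:
            raise KeyError(request)
        if context is self.emergency:
            if self.waiting:
                # Hand the emergency resources straight to the next queued request.
                self.in_flight.append((self.waiting.popleft(), self.emergency))
            else:
                self.emergency_busy = False


queue = ForwardProgressQueue(try_allocate=lambda: None)  # simulate exhausted pool
print(queue.submit("read A"), queue.submit("read B"))
queue.complete("read A")
print(queue.in_flight[0][0])
```
&lt;/PRE&gt;
&lt;P&gt;Even with every normal allocation failing, requests still complete one at a time through the emergency context, which is the forward-progress guarantee.&lt;/P&gt;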
&lt;P&gt;In order to do this around DMA, we needed the ability to pre-allocate a scatter-gather list, then hand that to the DMA engine to fill in.&amp;nbsp; This is exactly what BuildScatterGatherList does - it constructs the SG list within the supplied buffer but otherwise acts just like GetScatterGatherList.&lt;/P&gt;
&lt;P&gt;There's only&amp;nbsp;one problem.&amp;nbsp; GetScatterGatherList doesn't just allocate space for your scatter-gather list.&amp;nbsp; It also allocates private memory so that it can track the DMA mapping operation&amp;nbsp;- list entries for enqueuing it, the map register base, the number of map registers - all of those things you would normally have to keep track of yourself.&amp;nbsp; Obviously BuildScatterGatherList can't allocate memory, and your driver shouldn't have to guess how much extra space it might need.&amp;nbsp; So how do you know how big to make the buffer you hand in?&lt;/P&gt;
&lt;P&gt;You find that out by calling CalculateScatterGatherList().&amp;nbsp; It takes a CurrentVa and a Length along with an optional MDL.&amp;nbsp; This function determines the size of buffer that BuildScatterGatherList requires.&amp;nbsp; If you provide an MDL the function will compute the required size for any chained MDLs as well.&amp;nbsp; If you provide NULL for the MDL then it uses CurrentVa and Length to determine how many pages you're transferring &amp;amp; determines the rest from there.&lt;/P&gt;
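&lt;P&gt;As a rough illustration of the "how many pages you're transferring" step only, here is a small Python sketch of the page-span arithmetic, similar in spirit to the WDK's ADDRESS_AND_SIZE_TO_SPAN_PAGES macro.&amp;nbsp; The function name pages_spanned and the 4 KB page size are assumptions for this example.&amp;nbsp; The private bookkeeping space that the real function adds on top is not modeled here, so treat this as a lower-level building block rather than the actual size calculation.&lt;/P&gt;
&lt;PRE&gt;
```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def pages_spanned(current_va, length):
    """Number of pages touched by length bytes starting at current_va."""
    offset = current_va % PAGE_SIZE  # how far into its first page the buffer starts
    return (offset + length + PAGE_SIZE - 1) // PAGE_SIZE


# A page-aligned 4 KB buffer touches one page...
print(pages_spanned(0x1000, 4096))
# ...but the same 4 KB starting mid-page straddles two.
print(pages_spanned(0x1800, 4096))
```
&lt;/PRE&gt;
&lt;P&gt;Since the worst case is one SG entry per page, the page count gives an upper bound on the number of entries a transfer can need.&lt;/P&gt;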
&lt;P&gt;With these two functions you can ensure that you'll always have enough memory to handle one DMA mapping for some reasonable sized I/O operation (where that reasonable size is whatever you passed in for Length).&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=570431" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/peterwie/archive/tags/Device+Drivers+General/default.aspx">Device Drivers General</category><category domain="http://blogs.msdn.com/peterwie/archive/tags/DMA/default.aspx">DMA</category></item></channel></rss>