1/4/11 - Updated a couple of images and some of the code explanation

Adam Cogan sent me a question the other day that included (among other things) "How do you know if a doc has multiple sections?"

In Word, of course, you can break a document up into sections by inserting a section break from the Breaks button in the Page Setup group on the Page Layout tab:

Inserting section breaks manually in Word is easy

It turns out that counting these programmatically is really easy using the Open XML SDK 2.0 (download).

I created a new console application, added references to DocumentFormat.OpenXml and WindowsBase, and used this code:

Count Sections in Word Docx
  1. using System;
  2. using System.Collections.Generic;
  3. using System.Linq;
  4. using System.Text;
  5. using System.IO.Packaging;
  6. using DocumentFormat.OpenXml.Wordprocessing;
  7. using DocumentFormat.OpenXml.Packaging;
  8. using DocumentFormat.OpenXml.Office2010.Word;
  9.  
  10. namespace CountSections
  11. {
  12.   class Program
  13.   {
  14.     static void Main(string[] args)
  15.     {
  16.       if (args.Length != 1)
  17.       {
  18.         Console.WriteLine("Usage: CountSections <filename>");
  19.       }
  20.       else
  21.       {
  22.         using (WordprocessingDocument d =
  23.             WordprocessingDocument.Open(args[0], false))
  24.         {
  25.           Console.WriteLine("Found {0} section(s)",
  26.               d.MainDocumentPart.Document.Body.Descendants().
  27.               OfType<SectionProperties>().Count()
  28.               );
  29.         }
  30.         Console.WriteLine("Press any key ...");
  31.         Console.ReadKey();
  32.       }
  33.     }
  34.   }
  35. }

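In it I take the filename passed as an argument and open it read-only (lines 22-23; the second argument, false, means the document is not editable). I then count the sections by enumerating the descendants of the document body and filtering them to SectionProperties elements with OfType (lines 26-27).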
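If you want the count from other code rather than printed to the console, the same logic can be wrapped in a small method. This is only a sketch: SectionCounter and CountSections are names I made up for illustration (they are not part of the SDK), and it uses the generic Descendants<T>() overload on the body as a shorter equivalent of Descendants().OfType<T>().

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper class, not part of the Open XML SDK.
static class SectionCounter
{
    // Returns the number of SectionProperties elements found in the
    // body of the given .docx file.
    public static int CountSections(string path)
    {
        // The second argument (false) opens the document read-only.
        using (WordprocessingDocument doc =
            WordprocessingDocument.Open(path, false))
        {
            return doc.MainDocumentPart.Document.Body
                      .Descendants<SectionProperties>()
                      .Count();
        }
    }
}
```

Calling SectionCounter.CountSections(args[0]) from Main would then replace lines 22-29 of the listing above. The sketch assumes the file has a main document part with a body, which is true of ordinary Word documents, and does no error handling of its own.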

I also added the path to a file in the Command Line Arguments edit box in the Debug tab of the project Properties so there's a file being passed in when I press F5:

Pass a file name in as a command line argument

Running the program gives the answer very quickly:

Resulting output - we found 2 sections
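Why two sections when only one break was inserted? In the underlying WordprocessingML, each section break is stored as a w:sectPr element inside the paragraph properties of the last paragraph in that section, and the final section's w:sectPr sits directly under w:body. So one break gives two w:sectPr elements. The sketch below shows this without the SDK, using LINQ to XML on a hand-written, heavily simplified document.xml string. The SectPrDemo class and the sample XML are my own illustration, not taken from a real file (a real w:sectPr has child elements such as page size and margins).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical demo class for illustration only.
static class SectPrDemo
{
    // The main WordprocessingML namespace.
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Counts every w:sectPr element underneath w:body.
    public static int CountSections(string documentXml)
    {
        XDocument doc = XDocument.Parse(documentXml);
        XElement body = doc.Root.Element(W + "body");
        return body.Descendants(W + "sectPr").Count();
    }

    static void Main()
    {
        // Simplified stand-in for word/document.xml with one section break.
        string xml =
            "<w:document xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">" +
            "<w:body>" +
            // Last paragraph of section 1 carries the section break.
            "<w:p><w:pPr><w:sectPr/></w:pPr><w:r><w:t>First section</w:t></w:r></w:p>" +
            "<w:p><w:r><w:t>Second section</w:t></w:r></w:p>" +
            // The final section's properties are a direct child of the body.
            "<w:sectPr/>" +
            "</w:body></w:document>";

        Console.WriteLine("Found {0} section(s)", CountSections(xml));
        // prints "Found 2 section(s)"
    }
}
```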

Cool huh?

Sample file with 2 sections Download