Streaming From Text Files to XML

Quite some time ago, I wrote a blog post on how you can stream text files as input into LINQ queries by writing an extension method that yields lines using the yield return statement.

You can then write a LINQ query that processes the text file in a lazy, deferred fashion. If you also use XStreamingElement (System.Xml.Linq.XStreamingElement) to stream the output, you can create a transform from the text file to XML that uses a minimal amount of memory, regardless of the size of the source text file. You can transform a million records, and your working set will remain very small.

The following text file, People.txt, is the source for this example.

#This is a comment
1,Tai,Yee,Writer
2,Nikolay,Grachev,Programmer
3,David,Wright,Inventor

The following code contains an extension method that streams the lines of the text file in a deferred fashion.

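using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class StreamReaderExtension
{
    // Yields the lines of the reader one at a time. Nothing is read until
    // the caller starts enumerating the result. Because this is an iterator,
    // the null check below also runs only when enumeration begins.
    public static IEnumerable<string> Lines(this StreamReader source)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        string line;
        while ((line = source.ReadLine()) != null)
            yield return line;
    }
}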
class Program
{
    static void Main(string[] args)
    {
        using (StreamReader sr = new StreamReader("People.txt"))
        {
            XStreamingElement xmlTree = new XStreamingElement("Root",
                from line in sr.Lines()
                where !line.StartsWith("#")
                let items = line.Split(',')
                select new XElement("Person",
                           new XAttribute("ID", items[0]),
                           new XElement("First", items[1]),
                           new XElement("Last", items[2]),
                           new XElement("Occupation", items[3])
                       )
            );
            Console.WriteLine(xmlTree);
        }
    }

}

This example produces the following output:

<Root>
  <Person ID="1">
    <First>Tai</First>
    <Last>Yee</Last>
    <Occupation>Writer</Occupation>
  </Person>
  <Person ID="2">
    <First>Nikolay</First>
    <Last>Grachev</Last>
    <Occupation>Programmer</Occupation>
  </Person>
  <Person ID="3">
    <First>David</First>
    <Last>Wright</Last>
    <Occupation>Inventor</Occupation>
  </Person>
</Root>
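
One caveat about the example above: Console.WriteLine(xmlTree) calls ToString, which assembles the entire XML document as a single string before printing it. For a small file that is fine, but to keep the working set small for a large file, the output itself should be written incrementally. The following is a minimal sketch of that variation, not a definitive implementation. It uses the XStreamingElement.Save method to write directly to a file; the class name StreamingSaveSketch, the Transform method, and the output file name People.xml are assumptions made for illustration and do not come from the original example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

// Hypothetical class name, used only for this sketch.
public static class StreamingSaveSketch
{
    // Same technique as the Lines extension method shown earlier,
    // repeated here so that the sketch is self-contained.
    public static IEnumerable<string> Lines(this StreamReader source)
    {
        string line;
        while ((line = source.ReadLine()) != null)
            yield return line;
    }

    // Hypothetical helper: reads inputPath lazily and writes XML to outputPath.
    public static void Transform(string inputPath, string outputPath)
    {
        using (StreamReader sr = new StreamReader(inputPath))
        {
            XStreamingElement xmlTree = new XStreamingElement("Root",
                from line in sr.Lines()
                where !line.StartsWith("#")
                let items = line.Split(',')
                select new XElement("Person",
                           new XAttribute("ID", items[0]),
                           new XElement("First", items[1]),
                           new XElement("Last", items[2]),
                           new XElement("Occupation", items[3])
                       )
            );

            // The query is enumerated here, while the reader is still open.
            // Each Person element is written to the file as it is produced.
            xmlTree.Save(outputPath);
        }
    }

    public static void Main()
    {
        // Assumed file names: People.txt as above, People.xml as the output.
        Transform("People.txt", "People.xml");
    }
}
```

Note that Save must be called inside the using block. The query is deferred, so if the StreamReader were disposed before the XStreamingElement is serialized, enumeration would fail when it tried to read the first line.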

  • Can you translate this code into VB? This is useless unless you're a C# coder.

  • It is not very easy to translate this code into VB. It doesn't actually translate directly, as there is no yield return statement in VB. Instead, you have to write your own iterator, implementing the Current property, and the Reset and MoveNext methods.

    For more information on using the yield return keyword, and an example of an iterator that is implemented without the yield return keyword, see:

    http://blogs.msdn.com/ericwhite/pages/The-Yield-Contextual-Keyword.aspx

  • Hi Eric

    Thank you for your post. I understood almost everything but the first line:

    static void <_st13a_place _w3a_st="on">Main<_st13a_place><_st13a_place><_st13a_place />(string[] args)

    Can you explain it to me?

    Thanks