Performance Quiz #6 -- Chinese/English Dictionary reader

Published 10 May 05 04:22 PM | ricom 

Raymond Chen is running a series of articles about how to build a Chinese/English dictionary and optimize its startup time.

Actually, truth be told, I got a look at his article quite some time ago, as he was kind enough to ask me for comments well in advance. At the time I couldn't resist doing a managed version of the same program to see how it would do. So I encourage you to watch as Raymond works through the various steps of optimizing his program and see how it comes along.

The managed code below is a line-for-line conversion of his initial program, done in the dumbest possible way, with no attempt whatsoever to optimize anything.

And then, the question of the hour:  How does Raymond's program fare vs. the equivalent managed code below?

Feel free to comment on the code, the problem, or just the unfairness of it all, but please don't accuse me of concluding too much from the result of just this one benchmark :) :)

using System;
using System.IO;
using System.Text;
using System.Collections;

namespace NS
{
    class Test
    {
        [System.Runtime.InteropServices.DllImport("Kernel32.dll")]
        private static extern bool QueryPerformanceCounter(out long lpPerformanceCount);

        [System.Runtime.InteropServices.DllImport("Kernel32.dll")]
        private static extern bool QueryPerformanceFrequency(out long lpFrequency);

        static void Main(string[] args)
        {
            long startTime, endTime, freq;

            QueryPerformanceFrequency(out freq);
            QueryPerformanceCounter(out startTime);

            Dictionary dict = new Dictionary();        

            QueryPerformanceCounter(out endTime);

            Console.WriteLine("Length: {0}", dict.Length());
            Console.WriteLine("frequency: {0:n0}", freq);
            Console.WriteLine("time: {0:n5}s", (endTime - startTime)/(double)freq);
        }

        class DictionaryEntry
        {
            private string trad;
            private string pinyin;
            private string english;

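            // Parses one dictionary line of the form
            //     trad [pinyin] /english/
            // and returns null if any of the three fields cannot be found.
            static public DictionaryEntry Parse(string line)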
            {
                DictionaryEntry de = new DictionaryEntry();
               
                int start = 0;
                int end = line.IndexOf(' ', start);
               
                if (end == -1) return null;
                de.trad = line.Substring(start, end - start);
               
                start = line.IndexOf('[', end);
                if (start == -1) return null;
               
                end = line.IndexOf(']', ++start);
               
                if (end == -1) return null;
               
                de.pinyin = line.Substring(start, end - start);

                start = line.IndexOf('/', end);
               
                if (start == -1) return null;
                start++;
               
                end = line.LastIndexOf('/');
                if (end == -1) return null;
                if (end <= start) return null;
               
                de.english = line.Substring(start, end-start);

                return de;
            }
        }

        class Dictionary
        {
            ArrayList dict;
           
            public Dictionary()
            {
                StreamReader src = new StreamReader(
                    "cedict.b5",
                    System.Text.Encoding.GetEncoding(950)); // code page 950 = Big5 (Traditional Chinese)
                string s;
                DictionaryEntry de;
                dict = new ArrayList();

                while ((s = src.ReadLine()) != null)
                {
                    if (s.Length > 0 && s[0] != '#') {
                        if (null != (de = DictionaryEntry.Parse(s))) {
                            dict.Add(de);
                        }
                    }
                }
            }

            public int Length() { return dict.Count; }      
        }
    }
}


Comments

# Michael Kaplan said on May 10, 2005 6:58 PM:
Rico Mariani decided to try a managed version of the dictionary I talked about earlier today. According to Rico...
# mgrier's WebLog said on May 10, 2005 10:37 PM:
I want to go on the record and note that I will not be developing a Chinese/English Dictionary, in unmanaged...
# The Old New Thing said on May 11, 2005 8:56 AM:
Converting the file as we read it is taking a lot of time.
# Rico Mariani's Performance Tidbits said on May 12, 2005 4:43 PM:
Stefang jumped into the fray with his analysis in the comments from my last posting. Thank you...
# Programando .NET said on May 13, 2005 7:05 AM:
# `(joe (@ (version "2.0")) ,(mk-blog)) said on May 14, 2005 12:40 AM:
# Jonathan Hardwick said on May 20, 2005 7:43 PM:
Raymond Chen (aka "fixed more Windows bugs than you've had hot dinners") and Rico Mariani (aka "Mr .NET...
# Alan's Corner said on May 31, 2005 9:26 AM:
So I was reading through one of my favorite MSDN blogs (http://blogs.msdn.com/oldnewthing/)

And he...
# The Old New Thing said on July 31, 2006 10:00 AM:
I'm just not an expert.
# Högerkonspiration » Genier i arbete: C# vs C++ said on August 2, 2006 7:11 AM:
PingBack from http://w2k.fz.se/blog/?p=26
# Impersonation Failure said on November 30, 2006 5:35 AM:

• Closures and Continuations / c# .net continuations Continuations in their full glory capture more

# mikelehen: Rico and Raymond said on December 28, 2006 8:14 PM:

PingBack from http://www.livejournal.com/users/mikelehen/5481.html

# Rico and Raymond by mikelehen () | LjSEEK.COM said on December 28, 2006 8:14 PM:

PingBack from http://www.ljseek.com/rico-and-raymond_53939019.html

# Console.ReadLine() :: C++ is ugly after you’ve been doing C# for a while said on January 4, 2007 6:39 PM:

PingBack from http://console.writeline.net/blog/?p=6

# Rico Mariani's Performance Tidbits said on January 23, 2007 9:30 PM:

The fun continues as today we look at Raymond's third improvement. Raymond starts using some pretty

# Rico Mariani's Performance Tidbits said on January 23, 2007 9:31 PM:

Stefang jumped into the fray with his analysis in the comments from my last posting. Thank you Stefang.
