
Bill Li

Think globally, act locally.

News

These postings are provided "AS IS" with no warranties, and confer no rights.


Move
My blog has moved to http://www.BaliOnWeb.com. Most of the articles have also been moved to the new site. This blog is retired for now.

Posted Wednesday, July 29, 2009 1:20 PM by Bali | 0 Comments

My PM talk in SJTU

Finally, I delivered my first talk. Kind of excited, you can imagine. :-)

PPT can be found @

http://cid-0c5963e0b1a0be5d.skydrive.live.com/self.aspx/Public/PM%20Talk.ppt 

Let me know what you think of it.

Update 7/30/2009

This post has been moved to http://www.balionweb.com/my-pm-talk-in-sjtu.html

Posted Tuesday, June 02, 2009 12:44 AM by Bali | 10 Comments

If I were designing a new email service, I would…

All great designs come from a deep understanding of customers. In my case, I'd like to design the email service for information workers (IWs), as I am one of them. Basically, they are hired to get things done. Modern projects, or tasks at a smaller granularity, are getting too complex to be accomplished individually, so people have to work together; we call it collaboration. Consequently, they have to communicate. They are forced to communicate. Email solves the problems of logistics and synchronization, so that communication can happen across different times and locations, which is why it is still indispensable. But it is far from perfect, and sometimes it is a source of trouble. Can it be done any better? Let us give it a try. This new thing is called Pmail.

Entities

First of all, let us keep in mind that tasks are what IWs really care about. They would be much happier if they could complete an assigned task without touching email. Right? Completing tasks is what they are trying to achieve. Second, when they have to communicate, what they care about while writing or reading emails is the distilled information they are expecting. You will not get any additional credit for presenting your idea as a poem. Senders and recipients come last. Without them, communication can't happen, but these two entities get lower priority because if the distilled information can be obtained by other means, say a search engine or an anonymous DL, who cares about them?

Again - email for IWs is all about tasks, not messages. This is the most fundamental philosophy differentiating Pmail's design from others.

Problems - an abused tool in the workplace

By nature, current email systems allow anyone to send anything to any number of people, at any time, from anywhere. Gradually it turns out that something that can be anything ends up doing nothing well. Theoretically, any email you receive should be of interest to you; otherwise the sender would not have sent it. Realistically, email is very easy to overuse:

  • Email overload. You will probably find that everyone around you is complaining about too many emails. In doubt? Open your email client and see how many unread mails you have. The NY Times reports that e-mail has become the bane of some people's professional lives. This is likely due in part to the complicated nature of the job, but I'd say it is often due to poor email prioritization. Can you easily tell which emails are much more important than others? Or do emails tell you their priority?
  • Hard to map emails to one's day-to-day job. How many times have you had to search your mailbox to dig out a certain message because you needed something from it?
  • Mail thread discussions often go wild. Do you have trouble figuring out what is going on when you are suddenly looped into a thread?

At a glance

Pmail's primary design goal is to help a team get jobs done efficiently – with the right timing, priorities, order and resources. Everyone would have a clear picture of how his work contributes to the team's success.

Bird view

Pmail would also aim to minimize communication as much as possible. Leave users alone, please. In a task-centered email design, every email falls into one of the four categories below:

Category | Description | Example | Your Action
FYI | There is no immediate impact on any task on your plate at this moment | High-level org change announcement; knowledge sharing; newsletter | No – no immediate action required
Catch the ball | Someone requires you to do something to unblock his task(s) | Sign-off request; fix broken printer; mandatory code review for check-in | Action required – do something and then get back to the requester with results
Here you go | Someone provides you with what your task(s) require | Approval letter; team member phone number collection table | Action unblocked – you are ready to go complete specific task(s) if all dependencies are resolved
Collective efforts | Someone needs your input for task(s) he is working on | Execution plan review; brainstorming | Best efforts – do what you can to provide input
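The four categories could be modeled as a small enumeration. This is only a sketch: the post describes the categories but gives no code, so the names MailCategory, MailTriage, ExpectedAction and BlocksSender are hypothetical, and the action strings paraphrase the table.

```csharp
using System;

// Hypothetical names: the post defines the four categories, not this API.
public enum MailCategory { Fyi, CatchTheBall, HereYouGo, CollectiveEfforts }

public static class MailTriage
{
    // Returns the recipient's expected action, paraphrased from the table.
    public static string ExpectedAction(MailCategory category)
    {
        switch (category)
        {
            case MailCategory.Fyi:
                return "No immediate action required";
            case MailCategory.CatchTheBall:
                return "Action required: do it, then report results to the requester";
            case MailCategory.HereYouGo:
                return "Action unblocked: proceed once all dependencies are resolved";
            case MailCategory.CollectiveEfforts:
                return "Best efforts: provide input where you can";
            default:
                throw new ArgumentOutOfRangeException("category");
        }
    }

    // Only catch-the-ball mail asks the recipient to unblock the sender's task.
    public static bool BlocksSender(MailCategory category)
    {
        return category == MailCategory.CatchTheBall;
    }

    public static void Main()
    {
        foreach (MailCategory c in Enum.GetValues(typeof(MailCategory)))
        {
            Console.WriteLine("{0}: {1}", c, ExpectedAction(c));
        }
    }
}
```

A client built this way could, for instance, sort catch-the-ball mail ahead of everything else, since it is the only category where someone is waiting on you.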

Persona/User scenarios

Key scenarios that Pmail addresses include:

  • Know the big picture for better prioritization – Steven, a developer in a software development team, comes to his office in the morning. He opens the bird view of Pmail to check how his tasks (or WBS items, in more general terms) fit into the entire team's progress. He notices that one of his tasks, task8, is on the critical path according to the updated plan (never expect a plan to be really locked after lock-down). Pmail automatically prioritizes this task to p0 (the highest priority). The task is to implement a feature according to the design specification written by the feature PM.
  • Maintain dependencies – Steven goes ahead and opens task8, and finds that it depends on two resources: 1) an LHS virtual machine, which was delivered by task7, with the VM address and credentials attached. Nice! 2) The PM spec, which is also marked as completed.
  • Maintain email threads around tasks – Steven opens the spec document on the team SharePoint server and finds that several points are not clear to him. He starts a mail conversation about this task with the relevant PM, Joanne. The mail thread is linked to task8 and its status is open.
  • Drive team work flow – Joanne receives a catch-the-ball mail from Steven about the p0 task8, so it becomes a shadow p0 task for her as well. Joanne can check the task details in Pmail. She quickly comes up with answers to Steven's questions and writes a here-you-go email back to him. Steven receives the email and is unblocked. He closes the email thread with a few sentences. The thread remains traceable from task8, but it can't be replied to any more because it is closed. Several hours later, Steven completes task8 and closes it in Pmail. When he revisits the bird view page, he finds that task8 has turned green due to the status change. Steven then picks up another unblocked task, task9.
  • Maintain team discussion – Task9 is about designing a new feature, F9. The deliverable is a reviewable dev design spec. Steven has two options for solving a technical problem but is not sure which way to go, so he sends a collective-efforts discussion mail to the whole team. Since this is not a catch-the-ball mail, teammates treat it on a best-efforts basis, but Steven still gets some great feedback. When he feels the problem is solved, he summarizes the thread in a few sentences and closes it.
  • Check history – One year later, Steven transfers to another team, and his replacement, Eric, would like to understand why F9 was designed this way. He opens task9 in Pmail, checks the mail thread, and understands the original design decision.

Demo/UI Mock-up

The bird view will look like the diagram above. In the task view that follows, you can see that all mail threads about a task are at your fingertips.

Task view

The mail view will also be a bit different. Every mail has a TaskID field to help you get the whole context conveniently.

Mail view

Key features in Pmail

In addition to the biggest differences covered above, you might also get excited about the features below.

  • Sometimes feature-rich is not a good thing. It is easy to reach that point when, after several releases, an email system is designed for multiple purposes, say home use included. In any event, Pmail will provide a button to show/hide features (say, increase/decrease indent) which I haven't used in the last month. Give me a simple world.
  • Hovering the mouse over an email shows its abstract (assigned by the user or selected automatically), so I can decide whether to read it
  • Send me an SMS notification when a selected critical thread gets new replies. No special carrier service needed; enter my cell phone number and go
  • For discussion mails, send the brainstorming results or the decision made, and then close the thread by marking it as un-reply-able
  • Enforce sequential replies, for example when everyone fills in an Excel table
  • Automatically remember the folder or URL of the files you are editing, so that when you go to Pmail they can be attached to an email with a hot key, say "ctrl+shift+v"
  • A mailURL (say, steven-task9-show-me-money-88; don't use a GUID, please) which can be shared across clients. No need to forward again
  • Automatically send out a ping for dead catch-the-ball emails after a pre-configured interval times out
  • Play video/audio right inside the mail client
  • The mail web client also provides a web API, like Facebook. OWA kills programmability.
  • Tag everywhere – one mail thread might relate to more than one task. It is not a good idea to put a mail in one folder exclusively.
  • Don't allow sending active documents to the team for review; put them in a shared place (say, SharePoint) and send a link

 I am interested in your thoughts on it. Let me know. Thank you.

Posted Wednesday, March 11, 2009 7:49 PM by Bali | 1 Comments

Building Global Development Team

Nowadays software is getting so complex that it takes more and more people to build it. For example, there are 9000 engineers working on Vista simultaneously. In a certain sense, you can call it a labor-intensive industry. Ideally, it would be best to put all the people in one place; however, there are also lots of sound reasons to build global development teams.

Pros and Cons

Why bother to build a global development team?

  • New talent pool – Software development needs a great many people with similar attributes. Company-wide at Microsoft, the top 3 hiring criteria are being smart, having passion for technology, and fitting company values such as openness, continual self-improvement and mutual respect. According to various studies, the supply of talented IT staffers isn't keeping up with demand, and that won't change anytime soon. Even while Microsoft is cutting 5000 jobs, we still have another 2000~3000 job openings. What to do if enough qualified hires cannot be found within the US? Go outside.
  • Lower cost – let us go straight to the data. Annual pay for Level "59" is about $74,000 in the US, while the same level is paid about $20,000 (roughly RMB 150K) in China. Another small fact: it costs about $1.5 to get to most common places in Shanghai.
  • Close to market – The most innovative ideas often come out of interactions with customers, and customers' requirements vary from country to country. So the best way to serve a local market is to be in the market.
  • Specific knowledge – Cross-border acquisition of specific technologies is another reason to build global development teams. You can't (or aren't able to) move all the folks to HQ.

 

Like many other things, challenges always go hand in hand with benefits. You can't take only part of it. Fair enough.

  1. Distance – People who work on the same software can't work alone; they have to exchange information and make decisions. A conversation of less than 3 minutes is often enough for small but frequently arising communication needs, for example, "Could you show me the bug in your environment?" Distance makes it impossible to stop by someone's office. Network bandwidth is another issue introduced by distance. Why does it matter? Suppose one team has to copy large file sets from a remote team daily; that would be a problem. The file sets can grow amazingly big – for example, a daily SQL Server build is about 300GB.
  2. Time zone – Would communication really be a headache when modern technologies such as email and telephone have been around for quite a long time? The thing is, because of the time zone difference, no one is there when you are working. You have to wait another day to get an email response. When you rush to the office to check it, the most frustrating but concise reply could be: "What do you mean by <insert anything you assume others should understand>? How can I help you?"
  3. Culture/Language – The level of English mastery is uneven across global teams. There are dozens of ways to say "A beats B" in English, but only several of them are understandable to the general public; accent is another obstacle. For example, Chinese folks often have a hard time pronouncing the letter "L"; as a result, words like "girl" might sound weird sometimes.
  4. Junior team – In my organization, most hires come right out of college. We are raw smart, but less experienced. It is hard to deny that some key ingredients of great engineers just take time: design skills, debugging techniques, influencing capabilities. This is not an area where you can cut corners.
  5. Conflicts between hierarchical management and local branding – By the definition of the organization hierarchy, many business units have a presence in China: STB, Live, Online Service, etc. They function nearly independently of each other and report to the US. But from the perspective of customers/partners/talent, it looks a bit confusing because there are so many Microsofts.
  6. People development – folks at remote sites are not equally exposed to development resources such as face-to-face training, libraries, mentors, etc.
  7. Governance issue – due to well-known concerns, some confidential things can't be moved out of the US.

How-to

Several factors play a critical role in deciding on an appropriate distributed development model. At the least, you should consider project type, team seniority, team size, team culture, communication cost and history.

  • Build trust first. No matter how adaptable a new team is, it always takes time to fit into a specific team culture. So starting with easy jobs to build team morale and credibility is the safest step before moving forward. Beyond this, the new team can target increased ownership.
  • Decide on the team coupling level. From highest to lowest, the levels could be: pseudo-random assignment; same branch, same feature/component; same branch, different feature/component; different sub-branch but same main branch; different main branch with a clearly defined data contract. Avoid circular dependencies.
  • Come up with a communication plan. Minimized communication is not always best, especially for relatively junior teams. The best way to develop people is to have them work with senior folks on a daily basis as much as possible. So you have to make tradeoffs here based on various inputs.
  • Share a lab if possible. If you have to copy large files across the ocean frequently, consider doing your job in a lab located at the remote site. If you do have to copy, make sure the files meet the most critical quality requirements first, for example by passing a build verification test.
  • People development plan. Regular staff exchange plans such as the Marco Polo and Silk Road programs, getting a mentor, recorded trainings, etc.
  • Local brand management. Externally shown as one image; internally run by function as usual.

Posted Saturday, March 07, 2009 5:58 PM by Bali | 1 Comments

#3, Hulu, Why?

Fast Company recently published its list of the world’s 50 most innovative companies. Although I would question why Intel is in the top 10, what surprised me most is that Hulu is listed at #3. I know there might be politics behind the particular order, as with most rankings, but it would still be interesting to find out “why Hulu, not others”.

Origin

Hulu, an online video streaming company, managed to do something which YouTube failed to do. Copyright is one of YouTube’s headaches, but it is an incredibly positive thing for Hulu, because Hulu was built intentionally to serve proprietary content from two mainstream media giants, NBC Universal and Fox.

{YouTube, watch someone’s DIY video} VS { Hulu, watch TV & Movies online legally} -> Similar but different market niche.

People might ask, why don’t NBC and Fox executives rely on YouTube to serve their shows? You can imagine the following conversation, which probably happened behind the scenes:

NBC/Fox:

Hey, YouTube, are you interested in serving my video to the world?

YouTube:

Why not. Let’s sit down and take a look at this. Now we are owned by Google, and we are the overwhelmingly dominant player in the online video market. We have a great brand. We have great infrastructure. We have the most talented engineers. Blabla… (down to the point) so you have to pay x dollars for every minute of show.

NBC/Fox(think):

uh-um… let me do some math here. Plan A is to work with YouTube; plan B is to build something myself. In the next 3-5 years, if everything goes as predicted, plan B will bring much more money to our shareholders than plan A, and it is less risky.

YouTube(ping NBC/Fox):

What do you think of the plan?

NBC/Fox:

Nice plan, but no, thanks.

Hulu is the first site that delivers proprietary video to your computer for free. Traditional media companies gradually realized that they have to embrace changes if they are not able to prevent them. In this case, the change is online streaming. Hulu has more than 120 content sources now.

The key is that proprietary content sources are rare, nonrenewable resources.

Independence

Hulu’s CEO said to capital angels, “I don't think you'll be seeing the name Fox or NBC on the site hardly at all, Hulu is about the shows, not the networks. The shows are the brands that users care about." Another quote, “the key to Hulu's success is its freedom to operate essentially as a stand-alone company…”

From a startup’s perspective, capital can be a good thing or a bad thing. It can help you grow much faster, but it can also easily make you miss your initial goals. As part of the investment agreement, investors often appoint some seemingly smart guy, with an XYZ MBA degree or n years of experience at ABC company, to take over the company.

That is indeed one of the worst forms of investment risk control, although it happens again and again. Giving money to the most passionate guys, and letting them stay passionate, is the only way to maximize the probability of getting the most out of your investment.

Feature? Solution? Experience!

Question: If you were given the task of building a video streaming site in less than 3 months, what would you do?

We were taught this way:

1)      Identify who will be using your site

2)      Draw use case diagram

3)      List scenarios for each user role

4)      To support each scenario, figure out the needed features

5)      Design/Code/test your features

6)      Go live

7)      Yeah! Party! :- )

If you follow this in your next interview, I can almost guarantee a pass. Did we miss anything? Actually, we missed the most critical one – Experience! Experience is a combination of brand, feeling, ease of use, and an enjoyable process. For example, given the requirement below:

"Design something which is used to sit on, commonly for use by one person. It often has the seat raised above floor level, supported by legs."

People will respond immediately, "Chair!". But you may notice that there are at least hundreds of types of chairs in the world, if not thousands, if not millions. Only the most imaginative designers, who deeply understand a particular set of users’ needs, care about their feelings, and eventually apply those to product design, can do the best work. Let us take a look at the "art of chairs".

Art of Chairs

Be COOL, in show time, although you might have a similarly hard time figuring out what some of them really are. :-) Just as someone said about iTunes: “iTunes is not selling features. iTunes is selling experience.”

Hulu’s key experiences:

1)      Simple

2)      Larger screen

3)      High-resolution video

4)      Clutter-free

5)      Quality control

6)      Free to users

7)      No download

8)      Obsessed with users

Posted Monday, February 23, 2009 4:10 PM by Bali | 1 Comments

Lead Without Authority

In terms of the complexity of the problems they face, developers at Microsoft, who deal with coding, probably have one of the most challenging jobs in the world. But another engineering role, program manager, is not easier in any sense. Before I explain why, let me illustrate what typical product teams look like.

Typical MS Engineering Team Org Chart

In the diagram, a group manager often runs a team owning individual components or feature areas, say, a spell checker DLL in Office; a product unit manager owns a product, say Outlook; while a general manager runs a product family, say Office. One compelling advantage Microsoft possesses is that its products are designed to work together seamlessly. For example, in a recent CIO.com article, 10 Reasons to Use Microsoft Outlook for Your Company's E-Mail, 8 of the 10 points are actually about interoperability with other software.

PM's job

Managers define strategy, developers write code, testers test code; what do PMs do? Well, it is really hard to define – there is not even a consensus within Microsoft. However, one thing is clear – PMs do anything except coding (unless they have to :-) to drive projects toward their strategic goals. Make sure the right things get done right. Figure out gaps and fill them. The role is something like the mortar between the bricks in a wall. (cartoons on the right)

If you look at the PM's position in the organization hierarchy (the blocks with black bold borders), you will see the challenge here. PMs are supposed to lead their peers to work in the right direction.

"You are not my boss. Why do I have to listen to you?"

This question gets down to the topic of "Lead without Authority". Before you read on, I have to clarify that I am not a PM expert, but (0) I've been a PM for quite a while, (1) I work with great PMs, (2) I learn and try the most fundamental things that work well, and (3) I have faced similar situations many, many times, in schools and in various organizations. Here is what I have to share.

  • Charisma. You might be incredibly raw smart – you can optimize an algorithm from O(N*N*N) down to O(logN), or you might be able to design a system with .9999 availability that is extremely secure and super cheap compared with competitors… That doesn't mean you can be a good leader. In fact, many of the best leaders will point to the fact that they were "C" students (From Source).
  • Credibility. It is the key to making people follow you. It works like stock value – hard to build, easy to lose. Always be careful with it. Do the right things right. Bring something valuable to the team. Your credibility will accumulate gradually. Do what you say, and say what you mean.
  • Consistency. People hate to be randomized. Don't keep changing direction. Have a clear goal.
  • Partnership mentality. Low ego. Help others be great.
  • Help others be great. Most influence is indirect. People often complain about being "over-managed, but less supported". (Check out photo on the right)
  • Lead by showing, not by talking only. Get your hands a bit dirty. Leaders should be willing to do anything they tell their reports to do. Show demos, mock UIs, and prototypes to others when presenting a new way to do things.
  • Have a sense of people. The most basic thing is How to Win Friends and Influence People. Learn to inspire people. Always be positive in every aspect.
  • Communicate. Communicate. Communicate. Listen. Ask questions. You can be an introvert, but you have to be proactive. Be transparent. Understand your audience (strengths, weaknesses, style) and figure out the most effective approaches.
  • Relationships enhance communication. Invest time in talking to people interactively. Then when you need to request advice, an opinion, or a task, you can talk to people from a healthy and positive place.
  • Sharpen your vision. Look around. Look ahead. This helps in making sound decisions.
  • Start to drive teams with strong pseudo leaders
    • Talk to them, especially the leaders or submarine leaders, and get to know their likes and dislikes. Ask: How can I help you guys? Win their support.
    • Build credibility. Bring something invaluable to the team, for example, build up a strong relationship with a partner team, present a competitor research report, or improve team visibility
    • Before making changes, present them to the leaders in private, say in a 1:1, to reach rough consensus before they go public.
    • Help others be great. Let them get the credit.
  • Conflict Management
    • Focus on goals, not the conflicts themselves. It is not a question of smart or dumb.
    • If you can control your emotions well enough, 25% of conflicts will go away.
    • The overwhelming majority of conflicts come down to different perspectives and values. Try to figure out the other side's story. People often come to the same solution when presented with the same set of information. Use negotiation tips. Keep win-win in mind.
    • In the worst case, when you have to resolve unavoidable conflicts, say 5 pies for 10 kids, make sure to put a fair decision-making system in place, which is often the engineering way.
  • You can reward desired performance even without authority
    • Promotions, raises, and bonuses are not in your hands, but you can affect them in indirect ways
    • Fame. Complimenting an employee on a job well done at a staff meeting or in front of company officers can be extremely rewarding
    • Increased trust. For example, give top performers the freedom to work in a more self-directed manner.

Posted Friday, February 20, 2009 3:56 PM by Bali | 1 Comments

My AD Fun Experiment

Today I got a mail from lakequincy.com, saying:

“Hi Bill,

I noticed that you were never able to plug the Lake Quincy Media ad tags into your site. Are you still interested in earning revenue from displaying ads targeted to Microsoft developers? If so, you can get your tags from here: <certain link> 

Let me know if you have any questions or if you’ve decided against running the ads and I’ll set your account to inactive for you.”

I had even forgotten that I registered on their site. To encourage such great customer service, I went ahead and tried out how the AD really works for me, and finally decided to put a small square in the side bar. You can easily find it now if you scroll down a bit and pay attention to the left side bar. Let us see how many dollars I can make after one or two quarters. With an AD on the blog, it looks more businesslike, doesn’t it? Of course, you can bid for that AD position. :- )

Update(2/26/2009)

The AD is temporarily disabled due to a security alert.

Posted Wednesday, February 18, 2009 11:28 PM by Bali | 2 Comments

P2P Backup System w/o SPOF for Work Group

This is also one of my half-completed ideas from years ago. I was recently reminded of it by two stories:

Stories 

#1: One of my teammates lost his Outlook email archive due to a mistaken operation. He is very upset because all of his emails from the past two years are gone, unrecoverably. The suggested size for an Outlook mail file is 2GB, but people can easily drive it up to 10GB and often notice this too late. As he said in his MSN signature, “my archive, 555…”

#2: While I was browsing the blog of one of my favorite bloggers, Brad Abrams, I ran into a shocking post, Help! My hard disk crashed. He can afford a new disk, but he needs the data – especially “the pictures of the kids recent birthday party”. In the comments, people suggested various ways to recover it; unfortunately, it looks like he had no luck.

This leads to the most basic question: why don’t people back up their data even if they know they are at risk of losing it? Probably because:

1)      The chance is very tiny

2)      Current backup solutions are not “good enough”

For 1), although it might be true, remember that the result might be cataclysmic. Let us take a look at what is going on with 2).

Existing Backup Solutions

Traditional C/S – Most current backup solutions are based on a traditional client/server (C/S) architecture. Assume each person uses 40% of a 500GB disk drive on average; that is still a lot of capacity for an enterprise with thousands of staff. Backup time is another concern. Let me illustrate this with some math. For a backup server with two 1-Gbit NICs, the maximum throughput is about 200MB per second. If the backup window (while people are sleeping) is 10 hours, the server can back up about 7.2TB (2*100*3600*10 MB) of data. Obviously this hardly scales as the business expands.
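To make the arithmetic concrete, here is a small sketch of the same calculation. It uses the post's own figures (two NICs at roughly 100MB per second each, a 10-hour window, 40% of a 500GB disk per person) and decimal units (1GB = 1000MB); the class and method names are made up for illustration.

```csharp
using System;

public static class BackupWindow
{
    // Data (in MB) one server can ingest in a window:
    // NIC count * MB/s per NIC * seconds per hour * window hours.
    public static long CapacityMB(int nicCount, int mbPerSecondPerNic, int windowHours)
    {
        return (long)nicCount * mbPerSecondPerNic * 3600L * windowHours;
    }

    // Number of full per-user backups that fit into one window.
    public static long UsersPerWindow(long capacityMB, long perUserMB)
    {
        return capacityMB / perUserMB;
    }

    public static void Main()
    {
        long capacity = CapacityMB(2, 100, 10);   // 7,200,000 MB, about 7.2 TB
        long perUser = 500L * 1000 * 40 / 100;    // 40% of 500 GB = 200,000 MB
        Console.WriteLine("Window capacity: {0} MB", capacity);
        Console.WriteLine("Full backups per window: {0}", UsersPerWindow(capacity, perUser));
    }
}
```

Under these assumptions a single server completes only 36 full backups per night, which is the scalability problem in numbers. Incremental backups would change the picture, but the first full backup of every machine still has to go through the same pipe.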

Online backup – People can turn to AWS S3 or Azure SQL Data Services to simplify things on their own side. But this doesn’t make things any better if you look at current network speeds. Computing on data is much easier than moving data around.

My P2P Backup System

P2P backup comes with several inherent advantages:

1)      No additional hardware expense. People often have at least 30% of their disk capacity left; we can leverage that as backup space.

2)      No obvious network bottleneck.

Some people might also list “no single point of failure”, but that is not true. Let us take a look at how BT, a typical P2P network, works.

Basically, here is what happens in a P2P network: peers need to talk to a central coordination server (i.e., the Tracker in the BT system) to get information about other peers, and then talk to individual peers to exchange the real data. These two steps are essential. A single point of failure (SPOF) can occur in the coordination server. You may argue that you can enhance its reliability with techniques like mirroring, but doesn't that look too heavyweight? We already have many reliable services run by full-time IT guys, so why bother to re-invent the wheel? Several services that allow customized data to be written can be utilized, such as AD, SharePoint, and Exchange. The data that needs to be written into a central place would be:

1)      Peer list by <IP, port>

2)      Other metadata such as backup network name, software RAID level
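As a rough illustration, the record kept in the central place could be as small as the sketch below. Everything here is an assumption for illustration: the class name BackupNetworkRecord, the one-line text format, and the IPv4-only parsing are not part of the original idea, and writing the line into AD, SharePoint, or Exchange is left out entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

// Hypothetical record for the coordination data: a peer list plus metadata.
public sealed class BackupNetworkRecord
{
    public string NetworkName;
    public int RaidLevel;
    public List<IPEndPoint> Peers = new List<IPEndPoint>();

    // Flattens the record to one line: name|raidLevel|ip:port,ip:port
    public string Serialize()
    {
        List<string> peers = new List<string>();
        foreach (IPEndPoint p in Peers)
        {
            peers.Add(p.Address + ":" + p.Port);
        }
        return NetworkName + "|" + RaidLevel + "|" + string.Join(",", peers.ToArray());
    }

    // Parses a line produced by Serialize (IPv4 addresses only in this sketch).
    public static BackupNetworkRecord Parse(string line)
    {
        string[] parts = line.Split('|');
        BackupNetworkRecord record = new BackupNetworkRecord();
        record.NetworkName = parts[0];
        record.RaidLevel = int.Parse(parts[1]);
        string[] peers = parts[2].Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string peer in peers)
        {
            int colon = peer.LastIndexOf(':');
            IPAddress address = IPAddress.Parse(peer.Substring(0, colon));
            int port = int.Parse(peer.Substring(colon + 1));
            record.Peers.Add(new IPEndPoint(address, port));
        }
        return record;
    }

    public static void Main()
    {
        BackupNetworkRecord record = new BackupNetworkRecord();
        record.NetworkName = "team-backup";
        record.RaidLevel = 5;
        record.Peers.Add(new IPEndPoint(IPAddress.Parse("10.0.0.5"), 9000));
        record.Peers.Add(new IPEndPoint(IPAddress.Parse("10.0.0.6"), 9001));
        Console.WriteLine(record.Serialize());
    }
}
```

Because the record is tiny and changes rarely, it fits the kind of small custom field that an existing directory or list service can store, which is the point of reusing those services instead of running a dedicated tracker.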

Of course, we need to do compression, encryption, incremental backup, and scheduled backup in the solution. I may investigate later and update the post.

Also see other backup solutions

http://www.storegrid.com/index.html - select a folder, then backup, traditional

www.streamload.com - Large net disk, Upload your files to streamload, then view it, or email it to someone else.

http://www.beinsync.com/ - Access your computer: install a mini PHP web server onto it.

http://base.google.com/base/about.html - In about 15 minutes, your item will have a unique web address and be visible to the world.

http://www.openomy.com/ - openomy is an online file storage system designed to be a platform for Web 2.0 applications, built by Ian Sefferman

Posted Sunday, February 15, 2009 11:23 PM by Bali | 2 Comments

Memory Leaks Demo & Detection in .NET Application

Memory leaks are always a headache for developers. Do .NET developers no longer need to worry about memory leaks because of garbage collection? Yes and no. The GC periodically finds objects that can no longer be reached and reclaims the memory they use. It does this by tracing references from roots (such as static fields and local variables) to live objects. When an object that is no longer needed is still referenced from somewhere, it cannot be collected, and a memory leak happens.

There are many ways to leak memory. In addition to calling unmanaged code from managed code, another common case involves event handlers. If you do this:

     Foo.FooEvent += new EventHandler(MemoryLeaksHere.Method);

When you have finished with MemoryLeaksHere but are still using Foo, the MemoryLeaksHere object stays alive as well, because Foo's event delegate holds a reference to it. The object leaks memory because the GC cannot collect it.

Let us take a look at one simple example first.

using System;

namespace MemoryLeakSample

{

    class Foo

    {

        public static Foo myFoo;

        public event EventHandler FooEvent;

        public Foo()

        {

            myFoo = this;

        }

        public void FooMethod()

        {

            MemoryLeaksHere memLeak = new MemoryLeaksHere();

            memLeak.TryQuit();

        }

        public void FireEvent()

        {

            if (FooEvent != null) FooEvent(this, EventArgs.Empty);

        }

        static void Main(string[] args)

        {

            Foo foo = new Foo();

            for (int i = 0; i < 5; ++i)

            {

                foo.FooMethod();

            }

 

            GC.Collect();

            GC.WaitForPendingFinalizers();

            GC.Collect();

            Console.WriteLine("Check memory leak here.");

        }

    }

 

    /// <summary>

    /// This object will cause memory leak

    /// </summary>

    public class MemoryLeaksHere

    {

        public MemoryLeaksHere()

        {

            Foo.myFoo.FooEvent += new EventHandler(OnMyFooEventFired);

            Console.WriteLine("\nObject-{0}: Construct. Subscribe.", this.GetHashCode());

        }

        ~MemoryLeaksHere()

        {

            Console.WriteLine("Object-{0}: Deconstruct.", this.GetHashCode());

        }

        public void TryQuit()

        {

            Console.Write("Object-{0}: leak me?", this.GetHashCode());

            string input = Console.ReadLine();

            if (string.Equals(input, "no"))

            {

                Foo.myFoo.FooEvent -= new EventHandler(OnMyFooEventFired);

                Console.WriteLine("Object-{0}: Unsubscribe.", this.GetHashCode());

            }

            else

            {

                Console.WriteLine("Object-{0}: Not Unsubscribe", this.GetHashCode());

            }

        }

        private void OnMyFooEventFired(object sender, EventArgs e)

        {

            // Do something

        }

    }

}

In the MemoryLeaksHere constructor, Foo starts to hold a reference to the MemoryLeaksHere object because of the registered event handler. In MemoryLeaksHere.TryQuit(), if we don't unregister the handler, the memory leak happens.

To see this more directly, you can copy and paste the sample code into VS2008, and then enable unmanaged code debugging as follows:

Project->Properties->Debug->Enable Unmanaged Code debugging

Now set a breakpoint at “Check memory leak here”, and then build and start debugging. When asked whether to leak or not, you can answer either yes or no. For example:

 

Here, it looks like we leaked two of them. Finally the app hits the breakpoint and stops. At this point, we can go to the VS Immediate window to load sos.dll, and then check how many objects are in the heap:

!load sos.dll

extension C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\sos.dll loaded

!dumpheap -type MemoryLeaksHere

PDB symbol for mscorwks.dll not loaded

 Address       MT     Size

0132e7d0 00983104       12    

0132eba0 00983104       12    

total 2 objects

Statistics:

      MT    Count    TotalSize Class Name

00983104        2           24 MemoryLeakSample.MemoryLeaksHere

Total 2 objects

So now we know that two object instances were not collected. Why were they not GC-ed? Because something still holds a reference to them. Choose one of them, and use the gcroot command.

!gcroot 0132e7d0

Note: Roots found on stacks may be false positives. Run "!help gcroot" for

more info.

Error during command: Warning. Extension is using a callback which Visual Studio does not implement.

 

Scan Thread 7592 OSTHread 1da8

ESP:12f434:Root:01312d48(MemoryLeakSample.Foo)->

0132f704(System.EventHandler)->

0132f6ec(System.Object[])->

0132e7dc(System.EventHandler)->

0132e7d0(MemoryLeakSample.MemoryLeaksHere)

Scan Thread 4704 OSTHread 1260

Now we can see that MemoryLeakSample.Foo is still referencing MemoryLeakSample.MemoryLeaksHere via the event handler. Here it is only 5 iterations, but imagine what would happen if every incoming request leaked a slice of memory... Sooner or later, your online service will go down.
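The same reference chain can be modelled outside .NET. Here is a minimal sketch in JavaScript, only as an analogy: `Publisher` and `Subscriber` are hypothetical classes, not part of the sample above. The publisher's handler list plays the role of the `System.Object[]` in the gcroot output, and it is what keeps subscribers reachable until they unsubscribe.

```javascript
// Hypothetical analogy of the Foo / MemoryLeaksHere relationship.
function Publisher() {
    this.handlers = []; // like the delegate list behind FooEvent
}
Publisher.prototype.subscribe = function (handler) {
    this.handlers.push(handler);
};
Publisher.prototype.unsubscribe = function (handler) {
    var index = this.handlers.indexOf(handler);
    if (index !== -1) {
        this.handlers.splice(index, 1);
    }
};

function Subscriber(publisher) {
    var self = this;
    this.publisher = publisher;
    // The closure captures the subscriber, so the publisher's list now
    // references this object for as long as the handler stays registered.
    this.onEvent = function () { return self; };
    publisher.subscribe(this.onEvent);
}
Subscriber.prototype.dispose = function () {
    this.publisher.unsubscribe(this.onEvent);
};

// Five subscribers, only three of which unsubscribe.
var publisher = new Publisher();
var subscribers = [];
for (var n = 0; n < 5; n++) {
    subscribers.push(new Subscriber(publisher));
}
subscribers[0].dispose();
subscribers[1].dispose();
subscribers[2].dispose();
console.log(publisher.handlers.length); // 2 handlers still keep their subscribers alive
```

Unlike C# delegates, which compare equal when target and method match, the JavaScript version has to unsubscribe with the very same function object it registered, which is why the sketch stores it in `onEvent`.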

See also:

http://www.codeproject.com/KB/dotnet/Memory_Leak_Detection.aspx

http://blogs.msdn.com/jgoldb/archive/2008/02/04/finding-memory-leaks-in-wpf-based-applications.aspx

http://blogs.msdn.com/calvin_hsia/archive/2008/04/11/8381838.aspx

http://www.automatedqa.com/techpapers/net_allocation_profiler.asp

http://blogs.msdn.com/greg_schechter/archive/2004/05/27/143605.aspx

Posted Wednesday, February 11, 2009 12:32 PM by Bali | 8 Comments


Designing Your Own Recent Posts Widget for MSDN Blog

In my MSDN blog, I need “Recent Posts”, but I don’t need the archive sidebar. After playing with the template for a while, I still had no luck. Hmmm, looks like I have to DIY it. Fortunately, the News sidebar accepts raw HTML, including JavaScript. The next question is where to retrieve the post titles. The immediate idea is the current DOM document. Through experiments, I found this is impossible because the DOM is not fully loaded yet when the script is executed. Later, I figured out that all post titles can be obtained from RSS. For my blog, the address is http://blogs.msdn.com/bali_msft/rss.xml. One thing worth noting is that the RSS feed of an MSDN blog is not always up to date: your post will not appear in it instantly. Once I had all the posts in RSS format, things became much easier, and I went ahead and added more interesting things:

  • Post backgrounds alternate between two colors
  • Show a new tag for posts less than 3 days old
  • Show only the latest 8 posts
  • Show each post's age

So, the final thing will look like this:

If you find it useful, feel free to paste the code below into your blog’s News section. Remember to customize the “Configurable params” to your own needs and leave the other code intact. It works well at least in my IE 8 and Firefox 3.0.6.

<div id="RecentPosts"></div>

<Script>

// Configurable params

var recentPostNumber = 8;

var rssUrl = "http://blogs.msdn.com/bali_msft/rss.xml";

var title = "My Recent Posts";

var newPostAgeInHour = 72;

 

// Calculate the age of one post. It is all about getting a time span in JavaScript

// Return format: up to two adjacent units, e.g. x hours, y minutes; x days; x years, y months (ago)

// Refer to: http://www.w3schools.com/jsref/jsref_obj_date.asp

function calculateAge(postDate)

{

    var ret = "fresh!";

    var CurrentDate = new Date();

    var TimeSpan = new Date(CurrentDate - postDate);

    var mySpanArray = new Array();

    mySpanArray[0] = TimeSpan.getUTCFullYear()-1970;
    mySpanArray[1] = TimeSpan.getUTCMonth();
    mySpanArray[2] = TimeSpan.getUTCDate()-1;
    mySpanArray[3] = TimeSpan.getUTCHours();
    mySpanArray[4] = TimeSpan.getUTCMinutes();

    var TimeSpanTagArray_1 = new Array("years", "months", "days", "hours", "minutes");

    var TimeSpanTagArray_2 = new Array("year", "month", "day", "hour", "minute");

   

    // Starting from non-zero element and pick two significant values

    for(var i = 0; i < mySpanArray.length; i++) {

        if(mySpanArray[i] != 0) {

            var correctTag = (mySpanArray[i] == 1)?(TimeSpanTagArray_2[i]):(TimeSpanTagArray_1[i]);

            ret = mySpanArray[i] + " " + correctTag;

            if(i+1 < mySpanArray.length && mySpanArray[i+1] != 0) {

                correctTag = (mySpanArray[i+1] == 1)?(TimeSpanTagArray_2[i+1]):(TimeSpanTagArray_1[i+1]);

                ret = ret + ", " + mySpanArray[i+1] + " " + correctTag;

            }

            break;

        }

    }

   

    return ret;

}

 

// Display the recent posts

// Refer to:

// http://www.w3schools.com/DOM/dom_node.asp

// http://www.w3schools.com/DOM/dom_methods.asp

function displayPosts (xmldoc)

{

    var newTag = "<SPAN style=\"COLOR: red\">(New!)</SPAN>";

    var posts = xmldoc.getElementsByTagName("item");

    var displayText = "<h3>" + title + "</h3><UL>";   

    if (posts.length < recentPostNumber) {

        recentPostNumber = posts.length;

    }

    for(var i = 0; i < recentPostNumber; i++)

    {

        var PostTitle = posts[i].firstChild.firstChild.nodeValue;

        var PostLink = posts[i].firstChild.nextSibling.firstChild.nodeValue;

        var PostDateStr = posts[i].firstChild.nextSibling.nextSibling.firstChild.nodeValue;

        var PostDate = new Date(PostDateStr);

        var CurrentDate = new Date();

       

        // Calculate age

        var _PostAge = calculateAge(PostDate);

        var PostAge = "<SPAN style=\"font-size: 80%; color: black\"> (" +_PostAge + " ago)</SPAN>";

               

        // Show a new tag for posts happening last days defined by 'newPostAgeInHour'

        var myNewTag = "";

        if((CurrentDate.getTime() - PostDate.getTime())/1000/60 < newPostAgeInHour * 60) {

            myNewTag = newTag;

        }

       

        // Get background color

        var BKColor = (i%2 == 0)?("#B8CCE4"):("#DBE5F1");

        displayText = displayText + "<LI style=\"background-color:" + BKColor + "\"><A href=\"" + PostLink + "\">" + myNewTag + PostTitle + PostAge + "</A></LI>"       

    }

    displayText = displayText + "</UL>";

    var target = document.getElementById("RecentPosts");

    target.innerHTML=displayText;

}

 

// Call back

function complete(){

    if (req.readyState == 4) {

      if (req.status == 200) {

            displayPosts (req.responseXML);

        }

    }

}

 

// Initial async call

function getPosts()

{

  if (window.XMLHttpRequest) {

      req = new XMLHttpRequest();

  }else if (window.ActiveXObject) {

      req = new ActiveXObject("Microsoft.XMLHTTP");

  }

  if(req){

      req.open("GET", rssUrl, true);

      req.onreadystatechange = complete;

      req.send(null);

  }

}

 

// Entry point

getPosts();

</Script>
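The calculateAge function above reads its fields from a Date built on the millisecond difference, so its "months" follow the 1970 calendar. A sketch of a plain-arithmetic variant follows, assuming fixed unit lengths (a 30-day month and a 365-day year are approximations chosen here); `UNITS` and `formatAge` are hypothetical names, not part of the widget.

```javascript
// Hypothetical helper, not part of the original widget.
// Unit lengths in milliseconds, largest first. Month and year are approximations.
var UNITS = [
    ["year", 365 * 24 * 60 * 60 * 1000],
    ["month", 30 * 24 * 60 * 60 * 1000],
    ["day", 24 * 60 * 60 * 1000],
    ["hour", 60 * 60 * 1000],
    ["minute", 60 * 1000]
];

// Returns at most two adjacent units, like the original function:
// the first non-zero unit, plus the next one if it is non-zero too.
function formatAge(elapsedMs) {
    var parts = [];
    var remaining = elapsedMs;
    for (var i = 0; i < UNITS.length && parts.length < 2; i++) {
        var count = Math.floor(remaining / UNITS[i][1]);
        remaining -= count * UNITS[i][1];
        if (count > 0) {
            parts.push(count + " " + UNITS[i][0] + (count === 1 ? "" : "s"));
        } else if (parts.length === 1) {
            break; // the unit right after the first one is zero: show one unit only
        }
    }
    return parts.length ? parts.join(", ") : "fresh!";
}

console.log(formatAge(90 * 60 * 1000)); // "1 hour, 30 minutes"
```

To use it in the widget, one would pass `CurrentDate - PostDate` instead of the post date itself.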

 

Posted Saturday, February 07, 2009 9:42 PM by Bali | 1 Comments

Searching For a Number in Shifted Sorted Array within O(log(n)) time

I ran into this algorithm problem a long time ago, and now I am posting my answer here. Take a sorted array, say {1,2,3,4,5,6,7,8,9,10,11,12}, and rotate it to the right (with wraparound) an unknown number of times, so that it might become {6,7,8,9,10,11,12,1,2,3,4,5}. Now we need to get the index of a given number, say 4, from the array in O(log(n)) time. Obviously, even an 8-year-old can get it done in O(n) time.

We can think of it this way: take the middle element of the array. If it is the target, fine; if not, the array splits into two parts, one of which is a sorted array and the other a shifted sorted array, as illustrated in the diagram below:

If the target falls into the sorted half, we can simply do a binary search; otherwise, we repeat this operation on the other half recursively. You can see this is a divide-and-conquer algorithm. Each step halves the search range, so, assuming the elements are distinct, this is O(log(n)).

Code

//

// A typical binary search implementation

//

int _BinarySearch(unsigned int ShiftedArray[], unsigned int start, unsigned int end, unsigned int target)

{

    // Empty range: not found

    if( start > end ) {

       return -1;

    }

 

    unsigned int middle = start + (end - start)/2;

    if(target == ShiftedArray[middle])

    {

       return (int)middle;

    } else if (target > ShiftedArray[middle]) {

       return _BinarySearch(ShiftedArray, middle + 1, end, target);

    } else if (middle == start) {

       // Nothing is left of middle; this also avoids unsigned underflow of middle - 1

       return -1;

    } else {

       return _BinarySearch(ShiftedArray, start, middle - 1, target);

    }

}

 

//

// Select a given number from shifted array.

// ShiftedArray is something like = {6,7,8,9,10,11,12,1,2,3,4,5}

// If found, return index of the number; if not, return -1

// Require log(N)

//

int SearchShiftedArray(unsigned int ShiftedArray[], unsigned int start, unsigned int end, unsigned int target)

{

    // Start meets end

    if( start == end && ShiftedArray[start] != target) {

       return -1;

    }

 

    unsigned int middle = start + (end - start)/2;

    if(target == ShiftedArray[middle])

    {

       return middle;

    } else if(ShiftedArray[middle] < ShiftedArray[start]) { // Right half is sorted linearly

       if((target > ShiftedArray[middle]) && (target <= ShiftedArray[end])) {

           return _BinarySearch(ShiftedArray, middle + 1, end, target);

       } else {

           return SearchShiftedArray(ShiftedArray, start, middle-1, target);

       }

    } else { // Left half is sorted linearly

       if((target >= ShiftedArray[start]) && (target < ShiftedArray[middle])) {

           return _BinarySearch(ShiftedArray, start, middle - 1, target);

       } else {

           return SearchShiftedArray(ShiftedArray, middle + 1, end, target);

       }

    }

}

 

Test cases

Positive: {6,7,8,9,10,11,12,1,2,3,4,5}, target = 3, target = 8

Negative: {6,7,8,9,10,11,12,1,2,3,4,5}, target = 0, target = 13

Boundary: {6,7,8,9,10,11,12,1,2,3,4,5}, target = 6, target = 5

Exceptional: {…max}, target = max
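The positive, negative and boundary cases above can be checked mechanically. Below is a sketch of an iterative variant of the same idea in JavaScript, assuming distinct elements; `searchRotated` is a hypothetical name and this is not a port of the C code above, just one way to exercise the listed cases.

```javascript
// Hypothetical iterative variant: at every step one half of [lo..hi] is
// sorted, so we can tell by its end points whether the target lies there.
function searchRotated(arr, target) {
    var lo = 0;
    var hi = arr.length - 1;
    while (lo <= hi) {
        var mid = lo + Math.floor((hi - lo) / 2);
        if (arr[mid] === target) {
            return mid;
        }
        if (arr[lo] <= arr[mid]) {
            // Left half [lo..mid] is sorted
            if (arr[lo] <= target && target < arr[mid]) {
                hi = mid - 1;
            } else {
                lo = mid + 1;
            }
        } else {
            // Right half [mid..hi] is sorted
            if (arr[mid] < target && target <= arr[hi]) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
    }
    return -1; // not found
}

var shifted = [6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5];
console.log(searchRotated(shifted, 3));  // 9
console.log(searchRotated(shifted, 8));  // 2
console.log(searchRotated(shifted, 0));  // -1
console.log(searchRotated(shifted, 13)); // -1
console.log(searchRotated(shifted, 6));  // 0
console.log(searchRotated(shifted, 5));  // 11
```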

 One more interesting thing is the statement that “only about 10 percent of the professional programmers implemented binary search correctly.” Do you know why? Check this.

Posted Tuesday, February 03, 2009 11:50 PM by Bali | 1 Comments


To Next Cuil

Cuil, another so-called Google killer, is at its last gasp. I saw it coming, though I am not here to predict what has already happened. Cuil is not the first one, and apparently not the last. For the Cuils to come, here are my words.

Brand. Brand. Brand.

For many people, the word Google has a close sentimental connection with a bunch of splendid words such as cool, innovation, unselfish, impartial, revolution, and powerful. With brand, Google can claim that “People don’t work at Google for the money. They work at Google because they want to change the world!” With brand, the debut of every new Google service creates buzz, while few notice that Live also has a compelling equivalent. With brand, people think only Google can provide the best results, but often they can’t tell which search provider is which when presented with an anonymous result set. It is very interesting to take a look at the curve of Cuil’s daily unique visitors:

Cuil's Daily Unique Visitors

At launch, people rushed to see what this Google killer looked like because of Google’s brand. Ridiculous? Not really. It is human nature: people love to check out improbable events, such as shoes thrown at Bush or one crazy million-dollar idea. As part of a branding strategy, naming is essential, and Cuil might not be a good name. Let me share a story of mine. Several years ago, a group of my friends decided to build a website providing third-party services for franchising, called JiaMeng in Chinese. The guys, who had solid academic management backgrounds, came up with the domain name 51franchise.com. It turned out to be real trouble: hard to explain to customers and not localized. Even ordinary college students didn’t know the word franchise, not to mention clients with much less schooling. So ditu.live.com is a much better name for Chinese users than chinamap.live.com, given the average education level of internet users. All in all, BRAND works like religion, and it takes a lifetime to build.

Brand -> products

"A Google approach to email" - see how brand helps product marketing.

Infrastructure

GFS. BigTable. MapReduce. They can be competitive advantages. With these in place, Google can roll out new internet services faster, cheaper, and at a scale few others can compete with. They are designed solely for internet services. Users quit Cuil quickly after a disappointing performance experience. Microsoft software is mainly for the enterprise, where supporting 100K concurrent users is “good enough”, but that is far from perfect in the internet scenario.

Understand/Respect Customers

There is no one-size-fits-all solution given the increasingly diversified market. Of course you can educate customers, but never expect to change their inherent attributes, which come from culture, history, and level of economic development. If you doubt this claim, check out this article: Search site moves at the speed of China, which reports, “But appreciating such cultural differences is what Baidu.com Inc.’s chief financial officer, Shawn Wang, says gives the Chinese search giant unique insight into the country’s 1.3 billion people as it competes with American rivals such as Google Inc. and Yahoo Inc.” As a result:

Baidu beats Google in China market

Culture

Per Wikipedia, culture means the set of shared attitudes, values, goals, and practices that characterizes an institution, organization or group. Google’s business is built on top of the internet, so its organization and knowledge base are built for the internet, just as Microsoft is built for software, mainly enterprise software. I have met strong feature PMs with the deep knowledge needed for enterprise software: reporting, admin UI, DB admin UI, and information workflow. They understand their customers well after years of interaction with them. That takes time to accumulate. A top-down hierarchy, a heavyweight development process, and years of in-house development can hardly keep up with the pace of internet evolution. The same applies to Google: I am equally pessimistic about Google stepping into enterprise software, for the same reason: culture, the enterprise’s DNA.

Web Competition Strategy

What is Cuil’s selling point? (1) Fancy UI. UI is critical for adoption and usage, but it hardly provides a moat. This is shown by two case studies: Apple computers of the nineties and the X Window System on *nix OSes. Both had more attractive UIs, yet neither could beat Windows with its lower cost and rich set of available applications. (2) More relevant results. This is an ambiguous area that lacks widely accepted measurement criteria. (3) Cheaper solution. There is a question of sunk cost; of course you can claim you are 1/10 cheaper once you reach Google’s current scale. None of these is compelling from the users’ point of view. Why would users bother to go to your site instead? One significant difference between a web service (say, search) and a traditional software business (say, DB) is the purchasing decision process. DB vendors can send salesmen to target customers’ offices to argue the deal. Only a few key persons have the final call, and they are more analytical and love data. By comparison, everyone can be a customer of search, and we are more emotional. Unless I am missing something, it looks like the best strategy to monetize Cuil is to be acquired by Google.

No chance to win in search?

Definitely not. But you are doomed to fail if the following essentials are missing:

  1. Remember brand. Remember “winners take all”.
  2. Build your DNA towards internet. DNA = SUM(people, team arch, process, knowledge, …)
  3. Put infrastructure in place. This is the way to help turn your idea into profitable traffic. Not scale-up, scale-out instead.
  4. One rule of thumb for competing with a dominant market leader
    • Avoid playing games whose rules are set by opponents. You can hardly win. In this case, a better search engine as defined by Google means faster, more relevant results, simple UI, magic algorithms, PBs of data, … Let us think of solving the same problems with different approaches. Why search? To help explore and share information. If someone tries to solve this problem by following Yahoo’s tail lights to build yet another portal, he has little chance to take off. Another example is download: P2P technology solved the download problem without adding more expensive servers or bandwidth.
    • Attack opponents’ weak points. Google is designed to search everything, but it may not be good at every vertical industry, say shopping. Nibble at its market share if we can’t win head-to-head.
  5. Before rolling up our sleeves, why do we have to win? Why not step away and go find the next big thing? Let Google be Google.

Posted Monday, February 02, 2009 4:02 PM by Bali | 3 Comments

Principles for Building Secure Database Applications in Action

What I am talking about in this post might be well known to many people (too simple, sometimes naive?), but often the most basic things make a difference. OK, let's get down to business. Rules of thumb for DB security might be:

  • Define your security boundary(or attack surface)
  • All input is evil! Evaluate them with whitelist
  • Don't store blank password, even hard-coded in the source
  • Put DB in a dedicated server and access it with accounts with least privilege
  • Put connection string in registry and read it out from code
  • Use stored procedure
  • The attacker is told nothing
  • Save your resources
  • Specify least assembly permission requirements with attributes

FxCop is obviously a “must-have” for .NET developers, but with it we have to eliminate complaints one by one. Instead of memorizing all the “bad behavior” listed in various tutorials, why not build the good habits in on the way to becoming great developers? (If you are going to stay a developer, why not be a much better one?) Let us put the most significant principles into a few simple lines of sample code. Pay special attention to the numbered comments.

// <THIS IS UPDATED ON 2/5/2009 PER FEEDBACKS> 

using System;

using System.Data;

using System.Data.SqlTypes;

using System.Data.SqlClient;

using System.Security.Principal;

using System.Security.Permissions;

using System.Text.RegularExpressions;

using System.Threading;

using System.Web;

using Microsoft.Win32;

 

namespace Sample

{

    public class SecureDBAppSample

    {

        [SqlClientPermissionAttribute(SecurityAction.PermitOnly,

            AllowBlankPassword = false)] // (1) Blank password is never allowed

        [RegistryPermissionAttribute(SecurityAction.PermitOnly,

            Read = @"HKEY_LOCAL_MACHINE\SOFTWARE\MyApp")] // (2) Can read only one specific registry key

        static string GetName(string Id)

        {           

            string Status = "Name Unknown";

            try

            {

                // (3) Check for valid shipping ID with a whitelist

                // 4-10 digits only, anything else is bad. In most production environments,

                // input checks should be done at the attack boundary instead. Of course we can

                // also check here as defensive programming

                Regex r = new Regex(@"^\d{4,10}$");

                if (!r.Match(Id).Success)

                {

                    throw new Exception("Invalid ID");

                }

 

                // (8) Shut down connection--even on failure.

                using (SqlConnection sqlConn = new SqlConnection(ConnectionString))

                {

                    //Add shipping ID parameter.

                    // (4) Use a stored procedure to hide the application business logic

                    // in case the code is compromised

                    string str = "sp_GetName";

 

                    // (8) Release resources--even on failure.

                    using (SqlCommand cmd = new SqlCommand(str, sqlConn))

                    {

                        cmd.CommandType = CommandType.StoredProcedure;

 

                        // (5) Use parameters, instead of string concatenation, to build the query

                        // (6) Force the input to be a 64-bit integer

                        cmd.Parameters.AddWithValue("@ID", Convert.ToInt64(Id));

                        cmd.Connection.Open();

                        Status = cmd.ExecuteScalar().ToString();

                    }

                }

            }

            catch (Exception e)

            {

                // TODO: For better debugging purpose, we need log the exception with

                // something like Logger.Log(e);

 

                // (7) On error, the attacker is told nothing

                if (HttpContext.Current.Request.UserHostAddress == "127.0.0.1")

                {

                    Status = e.ToString();

                }

                else

                {

                    Status = "Error Processing Request";

                }

            }

            return Status;

        }

 

        //Get connection string.

        internal static string ConnectionString

        {

            get

            {

                // (9) Store connection string in a registry key instead of XML files

                return (string)Registry

                .LocalMachine

                .OpenSubKey(@"SOFTWARE\MyApp\")

                .GetValue("ConnectionString");

            }

        }

    }

}

The data in registry key is the connection string.

Data Source=MyDb008;     // (10) DB is on remote server.

                         // Compromised web service does not lead to SQL data access automatically

Integrated Security=SSPI;// (11) Use Windows authentication 

Initial Catalog=client

 

Instead of storing plain text, we can encrypt the above connection string. Keep in mind that I am not saying these are necessarily the best choices at all times, but many times they are.
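Principles (3), (6) and (7) from the sample do not depend on .NET. Here is a minimal sketch of them in JavaScript, assuming a caller-supplied `lookup` function that stands in for the stored-procedure call; `ID_PATTERN`, `parseId`, `getName` and `isLocalRequest` are hypothetical names used for illustration, not the API of the C# sample.

```javascript
// (3) Whitelist: 4-10 ASCII digits only, anything else is rejected.
var ID_PATTERN = /^\d{4,10}$/;

// (6) Force the validated input to be a number before it goes any further.
function parseId(id) {
    if (typeof id !== "string" || !ID_PATTERN.test(id)) {
        throw new Error("Invalid ID");
    }
    return Number(id); // at most 10 digits, so the value is represented exactly
}

// (7) On error, the attacker is told nothing; only a local caller sees details.
// `lookup` is a placeholder for the real data access and is an assumption here.
function getName(id, lookup, isLocalRequest) {
    try {
        return String(lookup(parseId(id)));
    } catch (e) {
        return isLocalRequest ? String(e) : "Error Processing Request";
    }
}

var fakeLookup = function (numericId) { return "Customer " + numericId; };
console.log(getName("12345", fakeLookup, false));            // "Customer 12345"
console.log(getName("1 OR 1=1", fakeLookup, false));         // "Error Processing Request"
```

Because the value handed to `lookup` is already a number, there is no string left for an attacker to smuggle SQL into, which is the same reason the C# sample converts the ID before adding it as a parameter.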

Reference: Writing Secure Code

Posted Thursday, January 29, 2009 11:10 PM by Bali | 2 Comments

Happy New Year of the OX!

Hi, my dear friends, Happy Chinese New Year! First of all, a small quiz for you:

Happiness

Can you guess what it means? Let me explain it a bit. The pic is actually a Chinese character written on a square piece of paper and then rotated 180 degrees. The character means happiness, and “stand-upside-down” and “come” have the same pronunciation in Chinese. So putting them together, the pic means my best wishes to you: happiness comes. Do you get it? :-)

During the holiday, I ran into a wiki page saying that the world cattle population is estimated to be about 995,838,000 head. How do they get that number? That is the next challenge for you.

I will post my answer when I get a moment.

Posted Wednesday, January 28, 2009 12:17 PM by Bali | 1 Comments


An AD System to Pay Content Generators

Not too long ago, I had a half-finished advertising idea related to social shopping. Now I am posting it here to collect more feedback. I call it HappyDog. (Just a name, not related to the DogFood widely used within Microsoft. :-)

Problems

As everybody knows, ‘You’ was named Time’s Person of the Year for 2006 for the growth and influence of user-generated content on the internet.  Why? In Wikipedia’s words:

“… chose the millions of anonymous contributors of user-generated content to Wikipedia, YouTube, MySpace, Facebook, Digg, Second Life, the Linux Operating System, and other providers, as Person of the Year, personified simply as You.”

But on the other hand, if we think carefully about people’s motives for generating content, we can easily find that people do this mostly out of curiosity, self-achievement or volunteerism. Problem #1: Per basic economic principles, these efforts can hardly last long. Content quality is another downside in such circumstances.

Another awkward problem with the currently most successful business model, online AD, is that people get tired of being spammed with ADs when they use an online service for free. Problem #2: Interestingly enough, people often still have trouble making the right purchasing decision even when exposed to so many ADs, probably because (1) the AD timing is not good, (2) they don't trust the publisher, or (3) the AD is not carefully targeted. This looks pretty twisted, doesn’t it?

At a glance

Taking the above problems into consideration, HappyDog is an innovative advertising system which provides the following unique values:

  • For general content generators – get paid in cash for sharing your experience with the world, even if you do not own a website.
  • For shoppers – make your shopping decisions more smartly
  • For businesses, either online or offline – market your product more directly to the most promising customers

Let us take a look at below diagram first:

DogFood Big Picture

A content generator could be anyone (professors, housewives, kids, ...) who contributes anything (answers, e-books, songs, tutorials, ...) in any format (wiki, blog post, video, ...), including but not limited to wiki and Q&A.

When people produce content, HappyDog helps insert contextual ADs into it. The big difference from existing advertising systems is that content generators gain real revenue in the HappyDog system. The revenue coming from advertisers would be divided into several parts: 60% to the content generator, 30% to the site owner, and 10% to HD. HD takes the smallest piece of each pie, but apparently we will gain the largest total, because there is only a single HD, versus M site owners and M*N content generators. In addition to real $s, the incentives here can also be points, happiness, etc. The point is to encourage contributors in some way. But the reason we highlight monetary incentives is the belief that they make something long-awaited possible. Let us think about this: sites like Wikipedia, IMDB, or Amazon have tons of high-quality content that has been contributed freely, but why only them? Because those contributions are cheap, handy offerings to the community. How about a book written online and sold online? Few will do this for free, mainly because of its costly nature. HD makes it possible though. Another interesting thing is how Wikipedia clones are doing in China. They never take off. It might be for cultural reasons related to general financial status.
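As a quick illustration of the 60/30/10 split, here is a small sketch in JavaScript. `splitRevenue` is a hypothetical helper; working in integer cents and giving HD the remainder is an assumption made here so that the three shares always add up to the amount charged.

```javascript
// Hypothetical helper for the 60% / 30% / 10% split described above.
// Amounts are integer cents; HD receives whatever is left after rounding down.
function splitRevenue(totalCents) {
    var generator = Math.floor(totalCents * 60 / 100);
    var siteOwner = Math.floor(totalCents * 30 / 100);
    var happyDog = totalCents - generator - siteOwner; // roughly 10%
    return { generator: generator, siteOwner: siteOwner, happyDog: happyDog };
}

var shares = splitRevenue(1000); // a $10.00 action
console.log(shares.generator);   // 600
console.log(shares.siteOwner);   // 300
console.log(shares.happyDog);    // 100
```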

Monetary incentives can help improve the quality of content, because once we put a rating system in place, a “thumbs down” carries a financial penalty.

HD helps people make better purchasing decisions in that HD ADs are more personal, and people are encouraged to advertise the products they sincerely love and have first-hand experience with. We will address this in detail next. Contributors will be wary of playing dirty because they also have to care about their reputation/credit as disclosed by the rating system.

Demo

Take Yahoo Answers as an example; a working page from the publisher side would be:

DogFood Demo

P0 features in 1st Milestone

In order to pay content generators effectively, the following features are essential in M0:

(1)    Users have the right to select their preferred AD – When you answer questions in forums, you care about your reputation in the same way as when you deliver a public speech. So you also care about the AD shown along with your content. Another reason for supporting user-selectable ADs is that people might like to present certain products more persuasively via real personal experience. In short, we prefer the “recommend your favorites to friends” scenario.

(2)    PPA (Pay Per Action) as the major pricing model along with PPC – Advantages of PPA: no click fraud, advertisers’ preference.

(3)    AD types – signature text, picture, inner-text popup, mini-cast

(4)    Support offline mode – Although internet users statistically number 0.8 billion, not every business has a website. This is especially true where people surf the net mainly for entertainment instead of business. HappyDog aims to benefit this kind of business via the so-called offline mode. We will talk about this later.

(5)    An open platform – Open API to foster a strong ecosystem around HappyDog.

How It Works

The conceptual architecture for HappyDog might look as follows:

DogFood Arch

In a real implementation, HappyDog doesn’t depend on any specific technology platform. Taking a general open source platform as an example, the related technologies could be:

  • Client
    • HTML
    • Use JavaScript/Flash extensively
  • Servlet container
    • Tomcat/JBoss
    • Spring or Struts
  • JAS (Java Application Server)
    • JBoss
    • Hibernate
  • File Server
    • Store static data, say photos, HTML, pics, JS
  • DB
    • MySQL

Its key use cases would be:

DogFood Use Cases

Buyers can be publishers or unregistered users. By online, we mean transactions whose completion can be confirmed online, say an online order, a user registration, a completed survey, a software download, etc. By contrast, offline means the remaining trade types, such as haircare, restaurants, and face-to-face trading.

PPA Implementation

PPA, as one of the core features of the HappyDog system, comes in two flavors: online and offline, as illustrated in the following sequence diagrams.

Online-PPA

Online PPA

The steps here are:

(1)    User browses and somehow clicks a link to the publisher site

(2)    Browser sends the request to the publisher

(3)    Publisher responds with the corresponding content, with JavaScript inside

(4)    Browser starts rendering the page, and then the JavaScript in the browser calls HD to get ADs

(5)    HD returns content with targeted ADs

(6)    Browser completes rendering

(7)    User continues browsing and clicks one of our ADs

(8)    Browser follows the link and sends the request to HD along with the needed parameters

(9)    HD does several things here:

a)        Writes “who is publisher, when, advertiser, campaign, etc” into the DB and gets back a TRANSACTION_ID

b)        Writes the TRANSACTION_ID into the user’s browser cookie

c)         Redirects the user to the advertiser website

(10)    Advertiser returns the commercial pages to the user’s browser

(11)    Browser just shows them

(12)    User is interested in something on the advertiser’s web site, fills in a form (sales order, lead, signup, etc.) and submits it

(13)    Browser submits the user inputs to the advertiser

(14)    Advertiser does the following:

a)        Does anything necessary to close the deal

b)        Reads the TRANSACTION_ID from the user’s browser

c)         Calls the beacon code to confirm the transaction with HD, with TRANSACTION_ID as the parameter

(15)    HD charges the advertiser and shares the revenue with the publisher

To take part in this advertiser’s AD promotion, the user must comment on this shopping experience.
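Steps (9), (14c) and (15) of the online flow can be sketched roughly as below. This is only an in-memory illustration, not a real implementation: the class name, the method names, the 50% publisher share and the dictionary standing in for the DB are all assumptions of mine, and how the advertiser obtains TRANSACTION_ID from the browser is left out. Amounts are in cents.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Transaction:
    publisher: str
    advertiser: str
    campaign: str
    clicked_at: datetime
    confirmed: bool = False


class HappyDogOnline:
    """In-memory stand-in for HD; a real system would use the DB."""

    def __init__(self, publisher_share_percent=50):  # hypothetical split
        self.publisher_share_percent = publisher_share_percent
        self.transactions = {}        # TRANSACTION_ID -> Transaction
        self.publisher_earnings = {}  # publisher -> cents earned

    def handle_click(self, publisher, advertiser, campaign, advertiser_url):
        """Step (9): record the click, set the cookie, redirect."""
        transaction_id = uuid.uuid4().hex
        self.transactions[transaction_id] = Transaction(
            publisher, advertiser, campaign, datetime.now(timezone.utc)
        )
        return {
            "set_cookie": {"TRANSACTION_ID": transaction_id},
            "redirect_to": advertiser_url,
        }

    def confirm(self, transaction_id, action_price_cents):
        """Steps (14c) and (15): beacon call, then charge and share."""
        tx = self.transactions.get(transaction_id)
        if tx is None or tx.confirmed:
            return None  # unknown ID, or already billed once
        tx.confirmed = True
        publisher_cut = action_price_cents * self.publisher_share_percent // 100
        self.publisher_earnings[tx.publisher] = (
            self.publisher_earnings.get(tx.publisher, 0) + publisher_cut
        )
        return {"charged_advertiser": action_price_cents,
                "paid_publisher": publisher_cut}


hd = HappyDogOnline()
response = hd.handle_click("blog.example", "shop.example", "spring-sale",
                           "https://shop.example/landing")
receipt = hd.confirm(response["set_cookie"]["TRANSACTION_ID"], 500)
```

Refusing to confirm the same TRANSACTION_ID twice is a design choice of the sketch, so that a repeated beacon call does not bill the advertiser again.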

Offline-PPA

Offline mode is a bit more complex.

[Figure: Offline PPA sequence diagram]

(1)    User browses and somehow clicks a link to the publisher site

(2)    Browser sends the request to the publisher

(3)    Publisher responds with the corresponding content, with JavaScript embedded

(4)    Browser starts rendering the page, and the JavaScript in the browser calls HD to get an AD

(5)    HD returns content with a targeted AD

(6)    Browser completes rendering

(7)    User continues browsing and clicks one of our ADs

(8)    Browser follows the link and sends the request to HD along with the needed parameters

(9)    HD does several things here:

a)        Write a record (publisher, time, advertiser, campaign, etc.) into the DB and get back a TRANSACTION_ID

b)        Give the user the advertiser’s profile and a coupon which contains TRANSACTION_ID, advertiser name, and promotion campaign

(10)  Browser displays the profile and coupon

(11)  User is interested in the advertiser’s promotion program and decides to print the coupon out

(12)  The print request goes to HD

(13)  HD records this as a successful transaction in the DB and waits for confirmation (TTL is 2 weeks)

(14)  User goes to the advertiser with the coupon to enjoy the service or product (say haircare, dinner, etc.), and comes back with a CONFIRMATION_CODE

(15)  Advertiser provides the service and a CONFIRMATION_CODE

(16)  User logs on to HD and submits TRANSACTION_ID and CONFIRMATION_CODE

(17)  After validation, HD gives the user a further kickback

HD issues a series of CONFIRMATION_CODEs to the advertiser when he/she enrolls in the program. Alternatively, in step (15), the advertiser could log on to HD and confirm the transaction directly.
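The offline bookkeeping (coupon, 2-week TTL, one-time CONFIRMATION_CODEs) could look something like the sketch below. Again, this is an in-memory illustration under my own assumptions: the class and method names, the code format and the number of codes issued at enrollment are all hypothetical, and the kickback itself is not modeled.

```python
import secrets
from datetime import datetime, timedelta, timezone

CONFIRMATION_TTL = timedelta(weeks=2)  # "TTL is 2 weeks" from step (13)


class HappyDogOffline:
    """In-memory stand-in for HD's offline bookkeeping."""

    def __init__(self):
        self.unused_codes = {}  # advertiser -> unused CONFIRMATION_CODEs
        self.coupons = {}       # TRANSACTION_ID -> advertiser
        self.printed_at = {}    # TRANSACTION_ID -> time the coupon was printed
        self.confirmed = set()  # TRANSACTION_IDs already confirmed

    def enroll_advertiser(self, advertiser, count=3):
        """Issue a batch of one-time codes when the advertiser enrolls."""
        codes = {secrets.token_hex(4) for _ in range(count)}
        self.unused_codes[advertiser] = set(codes)  # HD keeps its own copy
        return codes

    def issue_coupon(self, advertiser, campaign):
        """Step (9): create the transaction and the coupon that carries it."""
        transaction_id = secrets.token_hex(8)
        self.coupons[transaction_id] = advertiser
        return {"TRANSACTION_ID": transaction_id,
                "advertiser": advertiser, "campaign": campaign}

    def record_print(self, transaction_id, now):
        """Step (13): the coupon was printed; the TTL starts counting."""
        if transaction_id in self.coupons:
            self.printed_at[transaction_id] = now

    def confirm(self, transaction_id, code, now):
        """Steps (16) and (17): validate the ID, the TTL and the code."""
        printed = self.printed_at.get(transaction_id)
        if printed is None or transaction_id in self.confirmed:
            return False
        if now - printed > CONFIRMATION_TTL:
            return False  # confirmation window has expired
        codes = self.unused_codes.get(self.coupons[transaction_id], set())
        if code not in codes:
            return False
        codes.remove(code)  # each code is good for one transaction only
        self.confirmed.add(transaction_id)
        return True


hd = HappyDogOffline()
codes = hd.enroll_advertiser("salon.example")
coupon = hd.issue_coupon("salon.example", "new-year-haircut")
printed = datetime(2009, 1, 20, tzinfo=timezone.utc)
hd.record_print(coupon["TRANSACTION_ID"], printed)
used_code = next(iter(codes))
ok = hd.confirm(coupon["TRANSACTION_ID"], used_code, printed + timedelta(days=3))
```

Making each CONFIRMATION_CODE single-use is an assumption of the sketch; the post does not say whether a code can be reused.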

Open Questions

Several problems also deserve consideration at the current stage:

1.       How to get advertisers? Alimama? Google AdSense is such a closed system that it is hard to extend.

2.       It looks like current publishing places are not ready for HappyDog. They often filter AD content out. How to solve this?

3.       How to share revenue in a wiki scenario? Say 50 people contribute to a wiki page; how should revenue be distributed among them?

4.       Is the money too little to stimulate enthusiasm among high-quality content contributors, who are usually paid very well in their regular jobs?

5.       How to protect content, for example the first tutorial about jailbreaking the iPhone? Share revenue?

This post is mainly for collecting feedback. Feel free to Pai Zhuan ("throw bricks", i.e. criticize). :-)

Posted Tuesday, January 20, 2009 11:21 PM by Bali | 1 Comments
