
massively multiplayer online game site

I tripped across the Daedalus Project this morning while I was looking for player statistics for World of Warcraft. What a great site! The author, Nick Yee, has apparently been surveying MMOG players for years and compiling various articles and statistics based on the results. The ones I've looked at so far are quite interesting!

Unfortunately it looks like I discovered his site a little late. There was a post on March 8th indicating he's not going to be updating it anymore.

Nevertheless, it's a treasure trove of data to comb through. Thank you Nick Yee, for putting this together and leaving it up!

Interesting Cheating Talk

Jon Crowcroft pointed me at an extremely interesting talk by Dan Ariely on cheating. Take a peek if you're interested in cheating and cheater behavior, especially as influenced by group dynamics; it'll be 15 minutes well spent.
Posted by John L. Miller | 0 Comments

Second Life paper answers many questions

A recent paper by Matteo Varvello et al. answers a lot of the questions I've found myself asking about Second Life and its community. The paper is Is There Life In Second Life?, and if you're interested in Second Life's popularity, community, and performance, it is a must-read!

A quick inspection of recent web articles about Second Life shows a common sentiment that perhaps its popularity is waning - or was never as great as total user account statistics suggested. When user comments are enabled on these articles, you also see that whatever their numbers, Second Life aficionados are extremely vocal when defending the popularity and functionality of their world.

It'll be interesting to see how the Second Life community reacts to this paper.

It presents a lot of salient statistics about how many users are online at a time, and what those users are up to - at a macroscopic level.

So check it out! Is There Life In Second Life? is extremely accessible, so don't worry about needing a PhD to make heads or tails of it :)

 

Microsoft Research AutoCollage - try it now!

I work on an incubation team at Microsoft Research in Cambridge, UK. Today we released our first public offering, AutoCollage. In a nutshell, AutoCollage lets you select a folder containing images, and creates a collage synthesized from interesting bits of some of the pictures in that folder. You can see examples of collages here; they're pretty cool IMO.

You can download AutoCollage and try it for thirty days free. If you like it, you can buy it.

AutoCollage works best on Vista, but you can use it on Windows XP SP2 and later, as long as you've got the .NET Framework 3.0 or later installed.

AutoCollage is based on a lot of cool research technology from Microsoft. Carsten Rother developed the primary AutoCollage algorithm, incorporating research from Microsoft Research Asia, Redmond, and Cambridge. It's truly a global effort :)

I hope you enjoy it!

 

 

 

 


Crashing WoW servers

Yesterday I was forwarded a link about retirement of a long-time player from World of Warcraft. While that in itself might be interesting, the really juicy bit was the way he went out: with a mighty 'crash' from the server. Scalability issues, or something else?

Check out Boom's Goodbye Blog, where he describes his farewell, including the convergence of more than 1,000 players in a single capital city, and three different 40-player groups (in addition to lots of individuals) engaging in combat.

It's nice to see that WoW can scale past the 100-200 player limit (within mutual interaction range) I had privately hypothesized. It would be very interesting to find out exactly how many players it could support on a sustained basis!

 

DVE Scalability - More to be done?

Much of my last year has been spent reading about distributed virtual environment scalability. As it turns out, perhaps it shouldn't have been.

A lot of the research papers I've read begin like this:

"DVEs consume loads of bandwidth, and there are lots of opportunities to improve their network usage."

The network games researchers have source access to - such as Quake III - behave differently from the latest generation of multiplayer online games. Being an avid player of World of Warcraft, I finally spent some time analyzing traffic usage for a few basic scenarios:

  • Sitting in a capital city with a few hundred other players
  • Playing a battleground with 79 other players
  • Fighting in close proximity to 30+ players and several computer-controlled avatars.

I was shocked at how LOW the traffic usage was. Even when I was beating on the keyboard and mouse along with 30 other players next to each other in the battleground, my upload bandwidth stayed below 5 kbps, and download below 50 kbps. Better still, when I moved away from other players - out of interaction and viewing range - I stopped getting information about them.

There could still be issues with massively scaled games where thousands of players will be within interaction range of each other, but it's unclear if other computing resources (such as video cards) can keep up with the demand of displaying those avatars.

Does this make scalability research in DVEs obsolete? Absolutely not, but it does mean we need to be careful about what we assume is and is not already implemented.

I wonder if Blizzard talks to researchers about their networking design...

Distributed Virtual Environment Scalability

In the previous post I parroted scalability figures for World of Warcraft. While investigating DVEs, I tripped across interesting figures for WoW and several other environments.

Halo-3

From this press release, we can see that in the first week of Halo-3's release,

  • 2.7 million people played Halo-3 online
  • They logged 40 million hours of online play that week

That's absolutely astonishing for a single game! At 2,080 working hours per year (40 hours a week for 52 weeks), 40 million hours is about 19,230 US person YEARS of work, in a week!
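The person-year conversion is easy to check. Here is a minimal sketch, assuming a "US person year of work" means 40 hours a week for 52 weeks (2,080 hours); that definition is my reading of the figure, not something the press release states, and the function name is purely illustrative.

```python
# Assumption: one "US person year of work" = 40 hours/week * 52 weeks = 2,080 hours.
HOURS_PER_WORK_YEAR = 40 * 52


def person_years(total_hours: float) -> float:
    """Convert a total number of hours into full-time work years."""
    return total_hours / HOURS_PER_WORK_YEAR


# Figures quoted from the Halo-3 press release.
halo_players = 2_700_000   # people who played online in the first week
halo_hours = 40_000_000    # hours of online play logged that week

halo_person_years = person_years(halo_hours)
halo_hours_per_player = halo_hours / halo_players

print(f"{halo_person_years:,.1f} person years of work")
print(f"{halo_hours_per_player:.1f} hours per player that week")
```

The same function applies to any of the weekly play totals in this post. The per-player average is a side calculation from the two quoted figures, not a number taken from the press release.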

World of Warcraft 

From Nielsen video game figures for September 2007, we see that

  • WoW was the most popular PC game by a factor of 3
  • The average player played 1051 minutes per week - 17.5 hours!

If the math holds up, that's 10 million subscribers * 17.5 hours/wk = 175 MILLION hours of WoW play per week, or 84,135 US person years of work equivalent for each week of play.

As a side note, "Traffic Analysis and Modeling for World of Warcraft" describes WoW traffic overall, and says the median download bandwidth for a player is 6.9 kbps, and the median uplink is 2.1 kbps. If we accept the peak number of simultaneous active users as 900,000, that's a total of 6.2 Gbps peak average upload from WoW datacenters for gameplay. Imagine all the processing that goes into calculating what's being communicated... Zowie!
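As a rough check on that aggregate, here is a small sketch that multiplies the per-player medians by the peak population. It assumes decimal units (1 Gbps = 1,000,000 kbps) and treats the median as if it were a per-user average, which is a simplification; the variable and function names are illustrative only.

```python
KBPS_PER_GBPS = 1_000_000  # decimal units: 1 Gbps = 1,000,000 kbps


def aggregate_gbps(users: int, per_user_kbps: float) -> float:
    """Total bandwidth in Gbps if every user moved per_user_kbps."""
    return users * per_user_kbps / KBPS_PER_GBPS


peak_users = 900_000     # peak simultaneous players (US and EU)
median_down_kbps = 6.9   # per-player download, i.e. datacenter upload
median_up_kbps = 2.1     # per-player upload, i.e. datacenter download

datacenter_out_gbps = aggregate_gbps(peak_users, median_down_kbps)
datacenter_in_gbps = aggregate_gbps(peak_users, median_up_kbps)

print(f"Datacenter upload:   {datacenter_out_gbps:.2f} Gbps")
print(f"Datacenter download: {datacenter_in_gbps:.2f} Gbps")
```

The upload side rounds to the 6.2 Gbps quoted above; the download side is the matching estimate for traffic arriving from players.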

Second Life

It's difficult to find reliable statistics for Second Life, and I haven't gone far into the research literature. From my own observations and inferences, and a quick search on the internet:

  • Online user population tends to be between 25,000 and 50,000 at the times I connect.
  • A given region (a land parcel whose simulation is handled by a single server, and whose inhabitants can interact) looks able to support no more than one or two hundred users. I've seen limits of 63 attendees at invitation-only performance events, for 'technical reasons', which makes me wonder if perhaps the limit isn't 64 users per region?
  • Checking just now, it says 1,271,025 users have logged in in the last 60 days. Several months ago I saw a figure of 25,000 to 50,000 new accounts per day. If those numbers hold true on an average day today, and each new user logs in once, that would be 1.5M to 3M unique user logins from new accounts alone, more than the total reported. This leaves me uncertain of the active returning population in Second Life. Does anyone have better figures?

Anyways, enough random numbers for now. If you have anything to add, please leave a comment or send me mail; this sort of stuff is fascinating for me.
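The mismatch in the last bullet is easier to see when the signup rates are multiplied out over the 60-day window. This is only a sketch of that arithmetic, assuming the quoted signup rates still hold and that each new account logs in exactly once; the variable names are illustrative.

```python
WINDOW_DAYS = 60
logins_last_60_days = 1_271_025            # figure reported by Second Life
new_accounts_per_day = (25_000, 50_000)    # range quoted several months earlier

# New accounts created over the window at the low and high signup rates.
low, high = (rate * WINDOW_DAYS for rate in new_accounts_per_day)

print(f"New accounts in {WINDOW_DAYS} days: {low:,} to {high:,}")
print(f"Reported unique logins:  {logins_last_60_days:,}")
print(f"Low estimate exceeds reported logins by {low - logins_last_60_days:,}")
```

Even the low end, 1.5M, is larger than the reported login count, so at least one of the two assumptions cannot hold, and the returning population cannot be backed out from these numbers alone.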

World of Warcraft hits 9M + active subscribers!

The holidays gave me a chance to re-acquaint myself with World of Warcraft (WoW). I have to say, it's still the single most impressive online game I've ever seen. For my money, it does everything right. It literally *is* for my money, since I'm one of those 9 million paying subscribers.

I'm researching distributed virtual environment (DVE) scalability. As part of that, I was curious about the populations in WoW, both numbers supported per server, and total number of simultaneously active users. "A Measurement Study of Virtual Populations in Massively Multiplayer Online Games" provides a great peek into WoW, based on results from the CensusPlus UI add-in for WoW. Based on that paper, and on CensusPlus results published at http://www.warcraftrealms.com/, it appears:

  • There's a peak of almost 900,000 simultaneous users logged in and playing in the US and EU.
  • Average daily populations on a given server fluctuate by around a factor of 4-5 between minimum and maximum number of players online.
  • The peak number of simultaneous users on a given server appears to be around 4,000.
  • Data on maximum users in a zone (able to interact directly) isn't provided, but appears to be on the order of a couple hundred.

I've downloaded CensusPlus and added it to my characters' add-ins. It's out of date, but still works fine with 2.3. It's fun to see the numbers for your own server.

In summary, congratulations Blizzard on the incredibly successful World of Warcraft, and thank you researchers and add-in developers for giving us insight into player populations!

 

 

Content versus Form

If I could go back 30 years, I would tell myself to focus on the content and intention of each message, rather than its form.

I'm struggling through a technical report describing application of Bayesian techniques to a particular problem. Like papers in any new (to me) field, nomenclature and conventions for equations are quite confusing.

In the past I focused on - obsessed over, really - these nomenclature differences between fields. Why couldn't they be consistent in mathematical formulations, terminology, and symbols? In college inconsistencies between my pure math, physics, and chemistry texts drove me nuts. They frustrated me to the point where I had problems memorizing formulas, but more importantly, understanding the reason for those formulas and their derivation. My grades reflected my frustration and ignorance.

Today I find myself accepting that I don't necessarily understand nomenclature in the paper I'm reading, and assuming it will differ from what I already know, even if it looks the same. Instead of getting frustrated over three different interpretations for superscripts, I think "what is the author trying to tell me here?" It's a much more rewarding experience, and I'm already reaping dividends.

If you find yourself faced with inconsistencies in presentation of technical material, or in the formulas that back it up, remember the price I paid for focussing on form rather than content. Hopefully you can have a more productive 30 years to come as a result. :)

 

Peer-to-Peer Content Distribution and download speeds

When I talk to people about P2P content distribution, there's a common misperception. They assume that the more people there are downloading a file, the faster the download goes. This isn't usually true, as I'll explain below. What is true is that a peer-to-peer system in which servers participate should always be faster than just using the servers alone.

The following formula holds for all three systems described below, assuming everyone's download capacity exceeds the speed at which they can actually get the content.

[per-user average download speed] = [total upload speed] / [number of downloaders]

Let's call the system with no peer contribution - such as traditional web downloads - 'client-server.' P2P systems where peers serve files even when they're not actively downloading them we'll call 'always on.' Finally, if peers only serve files for as long as they're actively downloading that specific file, we'll call them 'greedy.'

All three systems will usually have servers (called 'seeds' in the peer-to-peer world), so there's always someone with a full copy of the file that can make sure people can download.

We need a few other numbers to illustrate the speed for these systems. Let's say

  • An average user has 4 Mbps download and 250 Kbps upload.
  • The seed has 10 Mbps upload, 40x as fast as an average user's upload. 
  • The always-on system has 2x as many users uploading as it has doing both upload & download, for a total of 3x as many uploading nodes as the greedy system.
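Plugging these assumptions into the formula gives a closed form for each system. The symbols are my own shorthand for this sketch: N is the number of downloaders, U_s is the seed's upload speed (10 Mbps), and u is an average user's upload speed (250 Kbps).

```latex
% s(N): per-user average download speed with N downloaders
% U_s : seed upload speed, u : average peer upload speed
\begin{align*}
s_{\text{client-server}}(N) &= \frac{U_s}{N} \\
s_{\text{greedy}}(N)        &= \frac{U_s + N u}{N}  = \frac{U_s}{N} + u \\
s_{\text{always-on}}(N)     &= \frac{U_s + 3 N u}{N} = \frac{U_s}{N} + 3u
\end{align*}
```

In each case the seed's contribution U_s / N shrinks toward zero as N grows, so the per-user speed settles toward 0, u = 250 Kbps, and 3u = 750 Kbps respectively. The 4 Mbps download capacity never becomes the limit under these numbers.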

Here's a table that shows average user download speed for 10, 100, 1000, and 10,000 users for each of the three systems

# Clients   Client-server   Always on    Greedy
10          1.000 Mbps      1.750 Mbps   1.250 Mbps
100         100 Kbps        850 Kbps     350 Kbps
1,000       10 Kbps         760 Kbps     260 Kbps
10,000      1 Kbps          751 Kbps     251 Kbps

From the table above, you can see that the P2P download *should* always be faster than the seed server on its own. However, the average download speed keeps dropping as the number of clients grows.
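The table can be reproduced directly from the formula and the example numbers. The sketch below is only an illustration of that arithmetic; the constant and function names are made up for this example and have nothing to do with the MSCD code.

```python
SEED_UPLOAD_KBPS = 10_000   # the seed uploads at 10 Mbps
PEER_UPLOAD_KBPS = 250      # an average user uploads at 250 Kbps

# Uploading peers per downloader in each system, per the assumptions above:
# none for client-server, 1 for greedy, 3 for always on.
UPLOADERS_PER_DOWNLOADER = {"client-server": 0, "always on": 3, "greedy": 1}


def avg_download_kbps(system: str, downloaders: int) -> float:
    """Per-user average download speed = total upload / number of downloaders."""
    uploading_peers = UPLOADERS_PER_DOWNLOADER[system] * downloaders
    total_upload = SEED_UPLOAD_KBPS + uploading_peers * PEER_UPLOAD_KBPS
    return total_upload / downloaders


for n in (10, 100, 1_000, 10_000):
    speeds = [avg_download_kbps(system, n) for system in UPLOADERS_PER_DOWNLOADER]
    print(f"{n:>6,}  " + "  ".join(f"{kbps:>6,.0f} Kbps" for kbps in speeds))
```

The sketch ignores the 4 Mbps download cap, which is safe here because no computed speed comes close to it. Changing the multipliers in UPLOADERS_PER_DOWNLOADER is a quick way to see how sensitive the result is to the share of always-on peers.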

The always-on system has significantly faster download speed than the greedy system. This looks great on the surface, but it comes at a price. Users in always-on systems are donating their system's bandwidth even when they're not immediately benefiting. As long as they're OK with this, the system can usually offer improved download speed when the user does want content. But it means a longer imposition on the user: the upload bandwidth being consumed can impact their other activities, such as web browsing, playing network games, etc. It's also more likely that the upload for a file retrieved earlier in an always-on system will interfere with the system's ability to provide a file the user wants to download NOW, at least in a system where users tend to want newer files rather than older ones.

The framework we use in MSCD leaves the choice of whether to behave as an always-on or a greedy system in the hands of the programmer. For the MSCD CTP which allows users to download Visual Studio 2008 Beta-2 images, we've configured the client to behave as a greedy client. In other words, you only share with other peers until you finish your download, and then you disconnect from the cloud.

Please post a comment if you have any questions in this area; I'm always on the look-out for new reasons to blog :)

 

 

'Managed Prototypes'

MSCD has a front-page story on research.microsoft.com. A friend of mine asked me about a quote in the article which could perhaps be misunderstood:

“It is as much as eight times faster than our original managed prototype, and it’s great that customers will have a chance to experience the benefits for themselves.”

The fact that our prototype was managed is orthogonal to the performance gains we've seen in our MSCD CTP. The speed-up is instead the result of algorithmic and architectural improvements that came out of our lengthy design and optimization efforts.

For the record, managed code is awesome. It does run a little slower (5% - 20%) for some things - especially if you're new to it and write your code in a way that makes the system do unnecessary work, for example appending to a string 50 times rather than using StringBuilder. Managed code also runs some things a little faster than C++, which surprised me at first. One thing that I think is incontrovertible: developing in C# and managed code is much, MUCH quicker than doing the same job in C or C++. My experience has been a factor of two or three speed-up in development for the same quality results.

So, please don't misinterpret the quote in the MSCD story: we're patting ourselves on the back for our algorithmic improvement ingenuity, not dissing managed code.


MSCD links to download Visual Studio 2008 Beta 2

If you're interested in using Microsoft Secure Content Distribution to download Visual Studio 2008 Beta 2, just click here, install and run the downloader, and you'll be off and running! This version of MSCD will be available for four weeks, so you have until 22-August to give it a try!

 

Microsoft Secure Content Distribution

A few years ago, Pablo Rodriguez and Christos Gkantsidis applied Network Coding to Peer-to-Peer file swarming, calling their system 'Avalanche'. I was lucky enough to be involved in their project. Over time, Cambridge Incubation at Microsoft Research Cambridge built a content distribution system around the Avalanche research results. Today that work culminates in a public customer technology preview (CTP) of the resulting system, Microsoft Secure Content Distribution (MSCD).

Before I go any further, let me stress: this is a four-week CTP. Microsoft has no announced plans to incorporate MSCD into any of its products, or to offer it as a separate product.

MSCD allows authorized content publishers to distribute their content to a large audience via file swarming. The publisher can choose to use MSCD to augment their existing server bandwidth, or use it to enable them to reach a much larger audience than they could have otherwise with a relatively small server investment. MSCD is NOT a file searching or file sharing technology: it's intended for a small number of publishers to distribute content to a large number of customers.

Security, preservation of publisher rights, and providing a good customer experience are core goals of MSCD.

The goal of the CTP is to gain real-world experience with MSCD. You can test and simulate to your heart's content, but with internet protocols, you never really know how well something works until you deploy it.

As you may have heard, Visual Studio 2008 Beta-2 is expected out soon. In addition to the existing distribution mechanisms for VS Beta-2, the Visual Studio team is allowing Cambridge Incubation to make it available via MSCD for the next four weeks. Visual Studio 2008 has a lot of great new features which should make it extremely popular. My personal hope is that some of the more adventurous customers will choose to download it via MSCD.

We've been developing and testing MSCD for quite a while, but this is our first public CTP, and so our first opportunity to see how it works 'in the wild.' There may be hiccups in the experience of downloading the VS Beta via MSCD. If so, please remember these are issues with MSCD, NOT with Visual Studio. Anyone who chooses to try downloading via MSCD can also use the server-based download mechanism at any time.

If you're interested in more information about MSCD, you can read this article to start. If you'd like to read more about some of Visual Studio 2008's great features, you can check out Soma's blog or ScottGu's blog.

Second Life - Reality sets in?

Earlier I commented about the disparity in numbers quoted for Second Life's population. It's not that any of the numbers are wrong - for what's being expressed, they're no doubt correct. Rather, it's a question of what's being measured. For my money, steady-state population and the number of people willing to pay for the experience are both fine metrics.

 A recent LA Times article talks about a trend of some large businesses either reducing or eliminating their presence in Second Life. The article is short, insightful, and worth reading IMO. Some key quotes from that article:

  • "Even at peak times, only about 30,000 to 40,000 users are logged on, said Brian Haven, an analyst with Forrester Research."
  • "[Philosophy professor Peter Ludlow] said most firms were more interested in the publicity they received from their ties with Second Life than in the digital world itself."
  • "Between May and June, the population of active avatars declined 2.5%, and the volume of U.S. money exchanged within the world fell from a high of $7.3 million in March to $6.8 million in June."

My two longest-term motivations for being a computer scientist are making games and enabling some sort of global cyberspace, not that I've done much for either of these in the last decade.

Where's the future of virtual communities and communications, and for that matter, is there one? I still think lack of a payoff for time invested in Second Life is its biggest conceptual challenge. World of Warcraft (WoW) has an incentive and reward structure built in, with social interaction as a side-benefit. WoW continues to do well, with more than 8.5 million subscribers, each paying more than $10 a month for the privilege.

World of Warcraft taps a large existing community - computer gamers - and offers an experience attractive to that community with significant value-adds. Perhaps the biggest challenge in Second Life is that the only existing communities it really taps into are people who frequent online social mechanisms such as ICQ, MUD/MOO, and others, plus the press and the curious inspired by press coverage, who are no doubt a highly transient population.

I definitely want to see Second Life (or similar environments) succeed. It seems like they need to offer something else, though, in order to be anything more than a niche application.

Data persistence in a digital world

A while back I read a news article pointing out an issue largely overlooked, namely the transience of digital data.

For thousands of years, institutional and personal memory were stored solely in physical written form, on paper, papyrus, wax, stone, you name it. The systems for writing these memories are well known and easily accessible: anyone who can read the language can read virtually anything. The down side is that accessing a piece of data in its native form requires a physical token, such as the book containing that data. How many of you remember taking a trip to the library when you needed to know something 30 years ago? And waiting for a book from inter-library loan if the local library didn't have it, sometimes being shipped from hundreds of miles away...

We've switched ease of access and permanence. Thanks to computers and networking, virtually any information you're interested in is no further away than your desktop, or perhaps your pocket if you're especially well connected. This comes at a price, though: most institutional and personal memory will be lost in just a few years, thanks to rapidly evolving digital storage formats and the lack of a long-lived, easily accessible physical store.

It's not uncommon to find written diaries from both common folks and historical figures going back hundreds of years. However, I challenge you to find an electronic diary of someone from the 80's that can still be read today. For example, I did a lot of writing on my Atari 520ST, and stored it on 3.5" floppies. Assuming the floppies are even working today, I don't have any machines with a floppy drive. I certainly don't have any that can read the data files created by whatever word processor I used, and it didn't save in plain text.

Is the web a temporally localized phenomenon, or will it last for centuries? It's too early to tell. But if it *doesn't* last, how much information will be lost when it fades into obscurity? Will digital archaeologists be able to make heads or tails of what came even a few decades before?

With luck, we'll never know. But our grandchildren will.
