
Deven Kampenhout's Tech Blog

Experiences of a Web Infrastructure Architect in the Hosting Industry
The Journey Begins Part 2 - The Early Days of Hosting

… Continued from part one:

At the beginning of 1999, the small hosting company I worked for (Virtual Servers, LLC) realized that it was losing business because it didn’t have a Windows-based hosting product. By this time we were using NT 4.0 on the workstations in the office, but we didn’t have any real expertise with IIS or with running NT as a web server. The company hired a New Yorker named “Vinnie” to put together our Windows-based hosting solution. He did as good a job as anyone could have. The goal was to make it a “virtual” product offering in the same style as our UNIX product. Unfortunately, true virtualization on Windows was impossible with the architecture at the time. To give customers the “feel” of having more control over their web server, we allowed them to register their own DLLs. As you might imagine, this caused all kinds of trouble for our Windows sysadmins. I also remember the company having all kinds of hardware trouble (only on the Windows side), as we had been used to building our own servers to save on hardware costs. Eventually, we had to invest in some Dell servers, and the major hardware problems went away.

While our Windows hosting product was getting started, I was busy building the Unix systems administration team. At first, I was the only sysadmin. I had to wear all kinds of hats in that role: security, abuse, system administration, backup/recovery, and top-tier support. All of our servers were hosted in a small NOC located in one of the Westin office towers in downtown Seattle. The NOC was owned and operated by our sister company, Lightrealm Communications. I wouldn’t quite call it a data center, as it couldn’t have been more than 1500 square feet in total. Our office, where all personnel worked, was located on the East Side of Seattle in the city of Kirkland. As such, we had a “dark” data center.

We had built a custom monitoring system that would automatically page me if one of the servers went down. Each server’s power was hooked through a device called an “e-commander”, which allowed one to telnet in and power-cycle a box remotely if the box stopped responding and needed a hard reboot. This worked decently *most* of the time. Unfortunately, the UFS filesystem that BSD/OS used wasn’t the most resilient to hard reboots. When a box came back up after a hard reboot, it would attempt to repair the corrupted inodes automatically, but every once in a while it wouldn’t be able to. If this happened, it meant that I had to make a trip downtown to give it personal attention. I think I can count on one hand the number of times I had to make a trip downtown during business hours, but the systems had an uncanny knack of going down between 1 and 5 AM. The first few times I had to take a trip downtown in the middle of the night to fix a server, I thought it was pretty exciting. It didn’t take long to become quite a chore, however. I had been married for less than a year, and I can still remember my wife getting very angry at my pager.

Besides early-morning file system recoveries, the biggest problem with the systems was performance. Since we allowed our customers a great deal of freedom in how they configured their servers, our virtual servers were a very popular hosting solution for people who wanted to push the limits. Because of the backend system that created the “virtual” server environment, the systems tended to use more resources than a standalone hosting solution would have. The problems ranged from customers running bad “runaway” scripts to sites that just got more traffic than the system could support. During the night, the systems tended to run pretty smoothly (when they weren’t crashing with filesystem problems), but during the day the loads could get pretty high. Any time the load averages rose above 7, system performance would degrade to the point where customers would start to complain. I had to come up with a number of creative ways to automate managing the growing number of systems…
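To make that load-average threshold concrete, here is a minimal sketch of the kind of check an automated monitor could run on each box. It is written in Python purely for illustration; the post does not say what the original tooling looked like, and the function names (`is_overloaded`, `describe_load`) are hypothetical. Only the threshold of 7 comes from the story above.

```python
import os

# Threshold taken from the post: above a load average of 7,
# customers started to complain.
LOAD_THRESHOLD = 7.0


def is_overloaded(load_averages, threshold=LOAD_THRESHOLD):
    """Return True when the 1-minute load average is above the threshold."""
    one_minute = load_averages[0]
    return one_minute > threshold


def describe_load(load_averages, threshold=LOAD_THRESHOLD):
    """Build a one-line status message for a (1, 5, 15 minute) load tuple."""
    status = "ALERT" if is_overloaded(load_averages, threshold) else "OK"
    one, five, fifteen = load_averages
    return f"{status}: load averages {one:.2f} {five:.2f} {fifteen:.2f}"


if __name__ == "__main__":
    # os.getloadavg() is available on Unix-like systems only.
    print(describe_load(os.getloadavg()))
```

Run from cron on each server, a check like this could feed a pager or a log instead of printing. Comparing only the 1-minute figure is a simplification; a real monitor would probably also look at the 5- and 15-minute averages to avoid paging on short spikes.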

[To be continued] part 3

Posted: Tuesday, January 11, 2005 10:21 PM by devenkamp

Comments

David Neal said:

Cool stuff! I look forward to reading the rest of the story. Subscribed :)
# January 11, 2005 11:17 PM

Alex Gadea said:

Too funny! I felt your pain! I actually started one of the first Windows hosting companies in 1997, and we did only NT and Access/SQL hosting until the demand for Linux became too much to ignore. It was called Virtualscape and was located on the other side of the country in NY. I can't even begin to count the number of times I or one of our sysadmins had to make the trip to the datacenter to reboot an NT server in the middle of the night. Man, did everyone hate being on "beeper duty." I can't even hear a beeper these days without feeling my heart rate increase rapidly. By the time we were up to 200 servers, we actually had a person sitting next to the servers 24x7 in an Exodus datacenter, with the rest of the staff doing remote management of the servers using PCAnywhere.

My, have we come a long way. Right now I'm working on a new company that is doing hosted CRM using ASP.NET/SQL Server. Our servers are Windows 2003, and I can count on one hand the number of times one of them has had to be rebooted in the last year.
# January 12, 2005 2:13 AM

Joshua Bentham said:

It will be interesting to see how the problem of excessive resource usage is solved using W2k3 Server.

With BSD (and Linux) it's very easy to limit resources (memory, CPU time, number of processes, disk space, file descriptors, etc.) on a per-user basis. It would be interesting to know why this was not done when the system began to slow down under the load.

Also, I'm curious about why the BSD boxes kept going down or needing to be rebooted. Were there hardware issues? It's noted that the boxes tended to go down between 1 and 5 AM, but the next paragraph indicates that this was not the busy period - the daytime was. Was BSDi called in to support their product?

I'll be very interested to see how the comparison between current technologies (eg Windows Server 2003 and RedHat Enterprise Linux 3/4) relates to what we've seen so far in Deven's blog.
# February 2, 2005 3:15 PM

Ovidiu said:

I can only imagine that managing consolidated servers used to be such a pain. I'm not that experienced in the IT field, but on Windows Server 2003 you have an excellent tool for similar issues, the System Resource Manager (http://www.microsoft.com/windowsserver2003/downloads/wsrm.mspx)

As a side note, as a developer, you can build the same kind of functionality into your own application by using NT Job objects, but that's quite close to rocket science (from Microsoft, the only apps that use them are WSRM and SQL Server, to my knowledge).
# February 5, 2005 4:27 AM

Deven Kampenhout's Tech Blog said:

… Continued from part two (part one):
As we last left my chronicle of my experiences in Web Hosting,...
# April 9, 2005 12:01 PM