MySpace Data Architecture: Hello Large Data
MySpace.com uses SQL Server in a big way. On Tuesday night MySpace Chief Data Architect Christa Stelzmuller spoke to the Silicon Valley SQL Server User Group in Mountain View. We had a record turnout. This was a rare opportunity to learn how a high profile company is using SQL Server to manage very large data. And I mean large – think 130 million active users a month!
It’s pretty well known that MySpace.com started out as a two-tier system. They used ColdFusion on the front-end, and SQL Server at the back-end. Traffic grew radically, and the technical team scrambled to adapt. Over the years, the technology has matured, but we’re talking about big data, heavy traffic, and continued rapid growth.
Now ColdFusion is gone, replaced by C# and ASP.NET. They added a middle tier, and are running mainly on SQL Server 2005, Standard Edition, with a few instances of Enterprise where required. They have about 4 petabytes of disk space, spread across 17,000+ disks. You can read more about the specifics in this MySpace Microsoft Case Study.
That volume of data pushes the database hard – and in some cases, beyond what SQL Server can handle out of the box. Load during replication was so high that they had to write their own replication mechanism. Likewise for many other processes. The load also impacts the development, testing, release, and backup routines. According to Christa, they literally invented their own processes and tools, as they are in uncharted territory.
Despite continued growth, MySpace is making real technical progress. For instance, when Christa joined the team from Yahoo 2.5 years ago, they were experiencing more than 2 million data integrity errors per day. Now that’s down to about 100,000 per day. My hat goes off to the MySpace engineering team!
The audience was so engaged that an extended Q&A broke out in the middle of the presentation. Christa fielded dozens of questions, ranging from hardware configurations to backup strategies, and then finished off her presentation. You can check out Christa’s slides here.
Christa will speak to the San Francisco SQL Server User Group on October 14, 2009, when her topic will be Service Dispatcher: The MySpace Implementation of Service Broker, and I expect we’ll see another record turnout.
2 Comments so far
[...] my recent post on how MySpace is using SQL Server, I mentioned that the original MySpace.com was built with ColdFusion. Even though MySpace moved [...]
Excellent slideshow link. It’s always interesting to read how SQL Server is being used in different organizations, especially a database infrastructure as large as MySpace.
If only I was in SF to come along to the user group!