SharePoint Data Storage: Beam Me Up Scotty
By default, when you upload a document or any other large file to SharePoint, it gets stored as a Binary Large Object (BLOB) in the content database in SQL Server. As revisions are made, each version of that file is also stored in full (not just the differences). The amount of BLOB data grows significantly faster than the associated metadata, causing SharePoint to consume large amounts of expensive SQL Server storage space. Burzin talked about externalizing BLOB storage, as well as options for storing infrequently used BLOBs in the cloud. These approaches can help ease the backup and storage cost problems that content-heavy SharePoint sites encounter.
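To get a feel for how quickly full-copy versioning adds up, here is a rough back-of-the-envelope sketch. The file size and version count are made-up numbers for illustration, not figures from Burzin's talk, and the function name is hypothetical.

```python
def full_copy_storage_mb(version_sizes_mb):
    """Total BLOB storage when every version is kept as a complete copy.

    Assumption: no differencing or deduplication is applied, so the total
    is simply the sum of the sizes of all stored versions.
    """
    return sum(version_sizes_mb)


# Hypothetical example: a 10 MB document saved 20 times at about the same size.
versions = [10.0] * 20
total_mb = full_copy_storage_mb(versions)
print(f"{len(versions)} versions -> {total_mb:.0f} MB of BLOB data")
```

Multiply that by every versioned document in a library and it becomes clear why the content database grows so much faster than the metadata that describes it.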
Burzin’s SharePoint Storage Best Practices talk also covered configuration, maintenance, and performance tuning. He explained some of the unusual stresses SharePoint puts on SQL Server, and offered suggestions on how to avoid degraded performance. If you’re planning a significant SharePoint implementation, you’ll want to take a close look at his specific recommendations on I/O capacity, database configuration and sizing, processors, and memory.
Given the headaches SharePoint BLOBs cause in many organizations, it makes sense that StorSimple has a complete solution to externalize them. Their storage-on-demand appliance provides tiered storage for SharePoint, with the option to secure infrequently updated BLOBs and store them in the cloud to achieve substantial cost savings. According to Ursheet Parikh, StorSimple’s Founder and CEO, Burzin’s extensive SQL Server and SharePoint experience makes him a key member of the StorSimple team.
I’ll write about StorSimple’s product in an upcoming post, and will follow that with a case study once DesignMind has had a chance to implement StorSimple’s Cloud Storage Solution for one of our clients. For data storage, Space is the Final Frontier.
MySpace: SQL Server at its Best
Christa Stelzmuller, Chief Data Architect at MySpace.com, spoke Wednesday night to the San Francisco SQL Server User Group about the MySpace Service Broker. Last summer, Christa spoke to the Silicon Valley SQL Server User Group about the MySpace Data Architecture. MySpace is an amazing example of what can be done with SQL Server.
Christa started her presentation with a description of Service Broker, and the challenges they faced creating it. She then covered basic features, advanced features, and the major use cases. She concluded with a roadmap of their continuing development plans, and some fun examples of how their developers have sometimes used Service Broker to solve their problems in somewhat misguided ways.
Keep an eye out on CodePlex, where her team will be posting their work. We’ll get a chance to speak more with Christa in early November at the PASS Community Summit in Seattle.
MySpace Data Architecture: Hello Large Data
MySpace.com uses SQL Server in a big way. On Tuesday night MySpace Chief Data Architect Christa Stelzmuller spoke to the Silicon Valley SQL Server User Group in Mountain View. We had a record turnout. This was a rare opportunity to learn how a high profile company is using SQL Server to manage very large data. And I mean large – think 130 million active users a month!
It’s pretty well known that MySpace.com started out as a two-tier system. They used ColdFusion on the front-end, and SQL Server at the back-end. Traffic grew radically, and the technical team scrambled to adapt. Over the years, the technology has matured, but we’re talking about big data, heavy traffic, and continued rapid growth.
Now ColdFusion is gone, replaced by C# and ASP.NET. They added a middle tier, and are running mainly on SQL Server 2005, Standard Edition, with a few instances of Enterprise where required. They have about 4 petabytes of disk space, spread across 17,000+ disks. You can read more about the specifics in this MySpace Microsoft Case Study.
That volume of data pushes the database hard – and in some cases, beyond what SQL Server can handle out of the box. Load during replication was so high that they had to write their own replication mechanism. Likewise for many other processes. The load also impacts the development, testing, release, and backup routines. According to Christa, they literally invented their own processes and tools, as they are in uncharted territory.
Despite continued growth, MySpace is making real technical progress. For instance, when Christa joined the team from Yahoo 2.5 years ago, they were experiencing more than 2 million data integrity errors per day. Now that’s down to about 100,000 per day. My hat goes off to the MySpace engineering team!
The audience was so engaged that an extended Q&A broke out in the middle of the presentation. Christa fielded dozens of questions, ranging from hardware configurations to backup strategies, and then finished off her presentation. You can check out Christa’s slides here.
Christa will speak to the San Francisco SQL Server User Group on October 14, 2009. Her topic will be Service Dispatcher: The MySpace Implementation of Service Broker, and I expect we’ll see another record turnout.




