Tuesday, August 12, 2008

EXACTLY where SharePoint documents are stored


Ever wonder EXACTLY where SharePoint stores documents? This isn't going to be news to developers or others who have followed SharePoint for a good while, but if you are relatively new to SharePoint, it might be somewhat eye-opening.

Every time you hit "Save" on a Microsoft Office document and the save path is a document library on a SharePoint site, the entire contents of the document are saved in binary format in a single image-type field in the SharePoint database. That's right, the entire document, up to 2 GB in size, is stuffed completely into a single field in the SQL Server database that SharePoint uses.
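To make the "whole file in one field" idea concrete, here is a minimal sketch in Python. It is an illustration only: it uses an in-memory SQLite database as a stand-in for SQL Server, and the table name (docs), the column names (leaf_name, size, content), and the helper functions are all hypothetical, not SharePoint's actual schema or API.

```python
import sqlite3


def save_document(conn, name, data):
    """Store the whole file as one binary value in a single row (hypothetical schema)."""
    conn.execute(
        "INSERT INTO docs (leaf_name, size, content) VALUES (?, ?, ?)",
        (name, len(data), data),
    )
    conn.commit()


def load_document(conn, name):
    """Read the file back from that single binary field, or None if it is missing."""
    row = conn.execute(
        "SELECT content FROM docs WHERE leaf_name = ?", (name,)
    ).fetchone()
    return None if row is None else bytes(row[0])


# SQLite stands in for SQL Server here; the table layout is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE docs (leaf_name TEXT PRIMARY KEY, size INTEGER, content BLOB)"
)

payload = b"PK\x03\x04 pretend these bytes are a .docx file"
save_document(conn, "report.docx", payload)
restored = load_document(conn, "report.docx")
```

The point of the sketch is that the metadata (name, size) and the raw bytes of the file sit side by side in the same row, and reading the document back is just a matter of selecting that one binary column.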

Want to know by the letter of the law that this is true? If you keep following the trail in the official documentation, you will find it on this page of the Microsoft Developer Network:

http://msdn.microsoft.com/en-us/library/ms998690.aspx

(Note: the description of the Docs table on this page says that its function is to store metadata for the document. That's true, but it also stores the full contents of the document.)

The documentation is definitely not an exciting read, but it does tell the truth (in this case at least :) )

So, are you thinking, "How can this possibly work in environments that have any kind of volume at all?" Well, as they say, that's a deep subject, especially depending on how far you want to delve into it. In this short post, all I can say is that it indeed DOES work if the environment is properly architected, and it performs very well at incredibly high volumes.

Maybe one of my colleagues at SharePoint Solutions will jump in and write an easy-to-understand post on some of the technical reasons why this approach is able to work even at very high volumes?

5 comments:

@binarybrewery said...

Isn't this the documentation for WSS v2 though? I realize there haven't been too many changes in the table structure, but I'm going to guess that it's not a one-for-one property list for WSS v3...

It has definitely come in handy in the past though to extract information from the Docs table :)

Anonymous said...

But where in the database would one find this? For example, in SharePoint v. 3.0, if you create a site you have an attendant port number associated with it, e.g., WSS_Content_6789 (with 6789 being the chosen port number for this particular site). Then within this database there are several tables, e.g.,

AllDocs
AllDocVersions
AllDocStreams

Is one of these the actual "location" of the document?

MK said...

Will PDF files also be stored in the SQL database?
If yes, how are they stored in the database?

Russell Wright said...

Yes, PDF files are stored in the SQL database. Any content that is loaded into SharePoint is stored in the content database for the site collection it belongs to. It is stored as a "binary blob" in the database.

Anonymous said...

Where does the AllDocStreams table fit in?

Isn't that where the blobs are stored?