Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations wOOdy-Soft on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Question on Page split

Status
Not open for further replies.

sujosh

Programmer
Nov 29, 2001
93
US
Though I know that "when a new row is added to a full index page, SQL Server moves approximately half the rows to a new page to make room for the new row."

I have few questions..

1) A full index page is a 8kb page that is indexed with a clustered or non-clustered index right?

Let us say I have a table with 25 columns ( and say 10 rows of data) with a clustered and a non -clustered index will I have a 8kb page that consists of all 25 columns or will the columns on this table be accross multiple 8kb pages?

2) I dont understand why this should happen ?

"...SQL Server moves approximately half the rows to a new page to make room for the new row."

Could some one enlighten me?
Thanks
 
Some clarifications: there are index pages and data pages. Most of pages have ordered values, with a notable exception of data pages for a table without clustered index (heap table). Ordered values help engine/optimizer a lot. Page split occurs whenever page (index or data) is full and order must be preserved. Here is a good article on this subject:

 
Ordered values allow a binary search which theoretically means you only have to seek N log 2 times (rounded up, where N = the number of records). This isn't an exact formula because the data is not stored 'flat' in one chunk but it does give you an idea of what's going on.
 
1. Each row is stored on a single page, not split across pages, many rows per page. "Text" (different to varchar) and other big-type fields are an exception. The proof that rows are not split is that it wont store a row bigger than 8K (page size).

2. Lets say there are three 'pages' either clustered or a secondary index - A,B,C. What ordering means is that every key value in A must be less than every key value in B, same for B<C. Then you want to insert data "b1" that within the range of keys already in B. Data "b1" cannot be stored in A or C (or any other existing page) because it would break the rules, eg if in A it would be greater than some B keys. A new page must be allocated, that will be logically inserted either between A&B or B&C. It probably cannot create a new page with just data "b1" in it, given that it falls in the middle of data already in page B. So the existing page B data is split in half and new data "b1" put in the old or new B page. The overhead is in allocating a new page on disk, not dividing up the data.

3. The indirect cost of page splits is that they cause fragmentation. So rebuild data regularly. Daily vs weekly rebuilds and setting FillFactor and fragmentation .. are a long story. The payoff id usually marginal. You can probably be content with weekly rebuilds unless you are squeezing every fraction of performance from the database or you have data growth >10%.
 
Where do I get started on finding out about rebuilding data? The database 'admin' here doesn't know nearly enough and I'd like to give her some hints. I can look in Books Online or search the web, but some keywords to search with would be nice.
 
1) A full index page is a 8kb page that is indexed with a clustered or non-clustered index right?

No. An index is a separate object and there doens't have to be one. For a clustered index the leaf level will be the data page.

>> will I have a 8kb page that consists of all 25 columns or will the columns on this table be accross multiple 8kb pages?

Columns are defined in syscolumns. The data will be held in slots on the page - one row per slot. Apart from blobs the row must be on a single page (otherwise no need for page splitting).

>> "...SQL Server moves approximately half the rows to a new page to make room for the new row."

When you rebuild an index you can specify room to leave to add entries. A clustered index will define how much space is left for rows on a page. If there is no clustered index then rows will be added at the end (a heap) and the only data page splitting will be due to updates increasing the size of existing rows.
If a page is full then the server has to move data off that page. It could just move the row affected but it's allocating a new page anyway so it will be better to move half the data to the new page so leaving room for growth on both pages. The same applies to indexes - new entries can cause page splits at any level.

If you are interested in the way data is stored have a look at

and get a copy of Inside SQL Server.


======================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top