Shadow Copies of Shared Folders (Part 2)

In the first part of Shadow Copies of Shared Folders we learned what shadow copies are and how they can be enabled on a server volume. To recap, we have set up a drive that contains users’ redirected My Documents folders (E: in this example), configured SCSF to make a shadow copy at 7:00 AM and 12:00 PM, and configured SCSF to store shadow copies on an alternative drive (F: in this example). In part two we are going to see how to restore a previous version of a file and how to restore a deleted file from a shadow copy. We will also look at how client software can be installed on previous versions of Windows to enable access to shadow copies from client workstations.

To save some time I have created a file named Important.doc in my My Documents folder and then made three modifications to the file, taking a manual shadow copy after each update. Although the shadow copies in these screenshots are only a few minutes apart, SCSF operates exactly the same when copies are taken on the normal twice-a-day schedule.

Let’s start with the simplest example – restoring a previous version of a file on the server itself (i.e. from the server console). To start you must navigate to the shared folder using a UNC path, mapped drive, or another method that accesses the shared folder. Note that you must access the files through the share – if you access the files directly from the server’s hard drive you will not see the “Previous Versions” tab in the following steps.

Right click the file in Windows Explorer, select properties, and then select the “Previous Versions” tab.

The buttons on the tab are self explanatory:

View – Allows you to view a copy of the file at the selected date/time

Copy – Allows you to copy the file at the selected date/time to a new location

Restore – Allows you to restore the file at the selected date/time over the current file

Let’s restore the copy from 10:29 PM, which looks like this:

When you click Restore you will be prompted to confirm the choice:

As you can see the Date Modified on our file now shows the time that the file was last modified when the 10:29 PM shadow copy was made. Cool, eh?

While being able to restore a file that still exists to a previous version is very useful, what if the file has been deleted? There would be no file to right click, so how can you access the Previous Versions tab in order to restore the file? Simple: right click the shared folder, or a folder inside the shared folder, and open its properties. You will find the same Previous Versions tab in folder properties that appears in file properties. You can then click the View button to see what the folder looked like (i.e. the files and other folders it contained) at the selected point in time. From there you can copy and paste the file(s) you want to restore from the previous version of the folder to the “real” folder. You can also use the Copy and Restore buttons on the Previous Versions tab to Copy/Restore the entire folder all at once – there is no need to go file by file.

For example, let’s say it is 11:00 PM and I just deleted Important.doc by accident (and yes, I used Shift + Delete so it’s not in the Recycle Bin). To get my file back I can right click the maubert folder, select the Previous Versions tab, choose a previous version of the folder (10:52 PM in this example), and click View:

(Note the date the Shadow Copy was made in the title and address bars.)

I can now copy and paste this file back to the folder, copy it to a new location, or open and view the file directly from the previous version of the folder. One thing to note when restoring an entire folder (i.e. using the “Restore” button on the Previous Versions tab) is that any files added after the shadow copy you are restoring was made are not removed.

For example, let’s say there is one file in a shared folder called fileA, of which I have several previous versions. I then create a new file in the same shared folder called fileB, added after the shadow copies that include fileA were made. If I then restore a previous copy of the folder that contains an older version of fileA but does not contain fileB, fileA will be overwritten with the older copy and fileB will be left alone. Continuing with this example, let’s say several more shadow copies are taken that now include both fileA and fileB. While the newer shadow copies contain both files, there are still older shadow copies that only contain fileA. Again, if I restore the folder from one of the older shadow copies that contains only fileA, fileA will be replaced with an older copy, but fileB will still be left untouched. On the other hand, if I restore the folder from a shadow copy that contains both fileA and fileB, both files will be replaced with previous versions.

If all of that is confusing, think of restoring a folder as a copy and paste operation from the previous version of the folder to the current version of the folder, answering “yes” to overwriting existing files – the same rules apply as in a typical copy/paste operation.

Shadow Copies of Shared Folders – Part 1

If you ask most network administrators to name the top five tasks they perform on a repeated basis, one that is sure to make the list is restoring a single file or a small number of files from backup (that and resetting passwords!). Users accidentally delete files, overwrite files, or may need an older copy of a file. To get these files back the user would most likely have to contact the IT department and give them a description of the file, where the file was stored, and the time or date when the deletion or modification occurred. Now unless you’re the CEO, the CIO, or the person responsible for backups has a crush on you, restoring a file from backup is not exciting and in most situations does not get top priority. Additionally, while we know users try their best to give as much information as possible, they can get confused and the information about the files they need may not be quite right – and the file hunt begins!

Sometimes we can get lucky – the file may have been deleted only an hour ago and there is a backup from last night on another hard drive in the server. In this situation restoring the file is a simple copy/paste operation and is relatively painless if you don’t have to do it a lot. But what happens if the file they need is from a week ago? Or what if the user is not sure if they need a copy from last week or two weeks ago? What about if they need a copy from Monday, Wednesday, and Friday of last week? While hunting through tapes is not the end of the world, I can sure think of a few other things I’d rather spend my time on!

During this whole process of restoring a file the network administrator’s time is occupied, the user may be unable to complete a task until they get the file, or the user may decide dealing with IT would just take too long and end up rebuilding the file by hand. Any way you look at it, restoring files causes lost productivity and in turn lost revenue for the company – but what if there was a better way?

Windows Server 2003 introduces a new feature called Shadow Copies of Shared Folders which solves many of the problems associated with restoring a small number of files from backup. To put it simply, Shadow Copies of Shared Folders provide point-in-time copies of files located in shared folders on a Windows Server 2003 server. These copies are accessible by end users and show what a shared folder, or a single file inside of the shared folder, looked like a few hours ago, yesterday, last week, or even a few weeks ago.

Shadow Copies of Shared Folders works by storing a copy of data that has changed since the last shadow copy. Because Shadow Copies of Shared Folders only stores block-level (a.k.a. cluster) changes to files rather than the entire file the amount of hard drive space needed is greatly reduced. The administrator can specify when shadow copies are made and the amount of disk space that is used to store changes – with newer copies replacing older copies as needed.

While this is a great new feature, there are a few things you need to be aware of when planning to use Shadow Copies of Shared Folders (called SCSF from here on out):

  • SCSF is set on a volume-by-volume basis and is only available on NTFS volumes. That is, SCSF is enabled for every shared folder on a given volume or none at all – you can’t pick and choose which shared folders on a given volume will or will not use SCSF. Additionally, the schedule of when shadow copies are made is also set at the volume level.
  • SCSF will not work on mount points (when a second hard drive is mounted as a folder on the first).
  • Dual booting the server into previous versions of Windows could cause loss of shadow copies.
  • SCSF is NOT a replacement for regular backups!

Let’s look at an example of how to setup and use SCSF. In the example we have three drives – C: contains the Windows Server 2003 installation, E: contains a shared folder with user documents that are redirected from the user’s My Documents folder (i.e. folder redirection setup in group policy), and F: is not being used.

Putting the shares I want to use SCSF with on another hard drive (or even another partition on the same physical drive) keeps us from wasting shadow copy disk space and I/O bandwidth on shares we don’t need SCSF for.

To enable SCSF on a volume, right click the drive in Windows Explorer, select Properties, and choose the “Shadow Copies” tab.

While SCSF will let you store the shadow copy data on the same drive as the shares being copied, this is not optimal – the drive head has to go back and forth in order to read the data in the shared folder and then write it to shadow copy storage. Lightly loaded file servers can deal with having everything on one drive, but adding the shadow copy storage area to the same drive on servers with high I/O loads can cause serious slowdowns and is not recommended.

If we wanted to set up SCSF with default settings (which are: store the shadow copy data on the same drive, set the maximum limit to 10% of the total drive space, schedule copies to be made Monday – Friday at 7:00 AM and 12:00 PM, and make the initial copy) we could simply select the drive and choose the “Enable” button. But because we have an additional physical hard drive separate from the one that contains the shared folder (convenient, wasn’t it?), we will configure SCSF to use drive F: as the shadow copy storage area. Note that a single volume can act as the storage area for multiple other volumes – the only limitation is the amount of free drive space available.

To continue configuring SCSF select drive E: from the list of volumes and click “Settings…”

In the “Located on this volume” drop-down list select the drive you want shadow copy data to be stored on. In this case we will choose drive F:. Next set the Maximum size of the storage area to an appropriate amount of disk space for your situation.

So just what is an “appropriate amount of disk space” anyway? Well it all depends on the situation (doesn’t it always). Although 10% of the total drive space is a good estimate, you need to take the following variables into consideration:

  • The amount of data in the shared folders
  • The frequency that different blocks change in the shared folders
  • The amount of free disk space on the drive that contains the storage area
  • How many past shadow copies you want to keep
  • The cluster size of the volume that holds the shared folders
  • The 100MB minimum size of the storage area

Note that when I say “frequency that different blocks change” I mean the number of blocks that change between shadow copies – not the number of times each block changes. For example, say I make a shadow copy at 7:00AM and 12:00PM and I have two files that change (let’s assume each file fits in a single block for simplicity): fileA and fileB. Between 7:00AM and 12:00PM, fileA is updated 4 times and fileB is updated 2 times. Because shadow copies take a “snapshot” of the shared folders at a point in time, there is only one copy per updated block, not one per change to each block. So in our example, when the 12:00PM shadow copy is made only the most recent versions of our two files are copied – the 4th update to fileA and the 2nd update to fileB, not all 6 individual updates.

Another example would be a large file made up of multiple blocks – let’s say 100. Next let’s open the file, modify the last two lines, and save it. Our modifications don’t change the entire file, just the last block (note this depends on whether the application rewrites the entire file to disk or not). Now let’s repeat our modifications – changing the last few lines of the file a half dozen more times. When the next shadow copy is made, only our final change to the last block of the file is copied – none of the first 99 blocks, and none of the first six modifications to the 100th block, are copied. In other words, whether you update a file 5 times or 5,000 times, the space needed to store the shadow copy of that file is the same (assuming the same blocks are modified in both cases). Got that?

Also, why does the cluster size on the volume matter? The simple answer is that when you defragment a volume, the clusters that make up files are reorganized. SCSF may see this as a modification and will make a copy at the next shadow copy. To minimize how often this occurs, Microsoft recommends using a cluster size of 16K, or any multiple of 16K, when you format the volume. The driver used to support shadow copies is aware of defragmentation and can optimize for it, but only if blocks are moved in multiples of 16K. Note however that the larger you make the cluster size, the more space you waste (space is allocated in whole clusters, so a 24K file on a volume with 64K clusters still occupies one full cluster, wasting 40K). Additionally, if your cluster size is over 4K you can’t use file compression – file compression requires cluster sizes of 4K or less. If you can’t use a larger cluster size don’t worry, just keep in mind you may need additional space in the storage area due to defragmentation.
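The cluster arithmetic above can be sketched as a quick calculation (a rough sketch; the 24K file and 64K/16K cluster figures come from the paragraph above):

```python
import math

def wasted_kb(file_kb, cluster_kb):
    """Space lost to cluster rounding: allocated size minus actual size."""
    clusters = max(1, math.ceil(file_kb / cluster_kb))  # always at least 1 cluster
    return clusters * cluster_kb - file_kb

# A 24K file on a 64K-cluster volume occupies one full cluster, wasting 40K.
print(wasted_kb(24, 64))  # 40
# The same file with 16K clusters needs two clusters, wasting only 8K.
print(wasted_kb(24, 16))  # 8
```

The trade-off is visible in the numbers: larger clusters help the shadow copy driver track defragmentation, but waste more slack space per file.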

Back to the settings screen – when you are done selecting a drive to use as the storage area and setting the limit click OK

The Shadow Copy tab now shows that we have set drive F: to be used as the storage area for shadow copies made on drive E:

To continue select the E: drive and click “Enable.”
We are informed that this will use the default settings – although this is true for the schedule, it *will* use our selection of drive F: as the storage area.

Click Yes to continue.

The initial shadow copy is then made and the schedule for updates is enabled.

If you would like to modify the schedule of when copies are made, select the drive, click Settings…, and then choose the Schedule tab.

When done, click OK on all screens to exit out of the drive properties.

One thing you may notice is that a new scheduled task appears in the Scheduled Tasks folder in Control Panel for each drive you set up SCSF on. While this is the same schedule that is available from Settings on the Shadow Copies tab, I recommend that you don’t modify the scheduled task directly – there are many more settings that could be accidentally “goofed.”

So speaking of schedules, how often should we make shadow copies? Again, this depends on the data and when your users use the data, but there are a few things to keep in mind:

  • Microsoft recommends a minimum of one hour between shadow copies, and even that is probably far too frequent for most situations.
  • One shadow copy per day is probably the longest you want to go between copies on weekdays.
  • The longer the time between shadow copies, the longer and more I/O intensive each shadow copy will be (because more blocks have changed).
  • Your goal should be to take snapshots of the data when they will be most useful to users and will impact the system the least.
  • There is a maximum limit of 64 shadow copies per volume, regardless of whether there is free space in the storage area.

Twice a day during weekdays is probably sufficient for most M-F 9-5 operations – once in the morning before anyone shows up and then once at lunch when a good number of people are out of the office. The times here are important – you want shadow copies taken when they will impact users as little as possible. By taking a shadow copy right before most people arrive, the number of blocks that have to be copied during the noon shadow copy is reduced, and in turn the I/O impact on the system is lower. If a shadow copy was only taken at noon, the I/O impact would cover blocks updated over the last 24 hours (and be a lot higher) rather than just the last 5-6 hours.

Some sites may need more than two shadow copies per day, maybe an additional copy made at the end of the day, or maybe even four copies a day in some situations. However keep the 64 shadow copies per volume maximum in mind – if you take two copies per day during weekdays you can store almost six and a half weeks of shadow copies (assuming you have the hard drive space), a little over four weeks if you take three copies per day, and a little over three weeks if you take four copies per day. The 64 limit should not be a problem for most situations, but it’s the reason why you don’t want to just take a copy every couple of hours – there would only be a little over a week of shadow copies before they start to get overwritten with newer shadow copies.
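The retention math above can be sketched as:

```python
MAX_COPIES = 64  # hard per-volume limit on stored shadow copies

def retention_weeks(copies_per_day, days_per_week=5):
    """Weeks of history kept before the oldest copies are overwritten."""
    return MAX_COPIES / (copies_per_day * days_per_week)

print(round(retention_weeks(2), 1))  # 6.4 -> almost six and a half weeks
print(round(retention_weeks(3), 1))  # 4.3 -> a little over four weeks
print(round(retention_weeks(4), 1))  # 3.2 -> a little over three weeks
```

Plugging in an every-two-hours weekday schedule (roughly 12 copies per day) gives just over one week of history, which is why very frequent schedules burn through the 64-copy limit so quickly.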

Well that about does it for this article, part two will cover the client side of Shadow Copies of Shared Folders.

SQL Server Tables Check Constraints

So just how do you add a check constraint to a table or column? As with just about everything in SQL Server you can use SQL statements or Enterprise Manager to make the modifications. I’m going to show you how to add a constraint using Enterprise Manager, but feel free to look at the CREATE TABLE and ALTER TABLE syntax in Books Online for more information.

What we are going to do is add a check constraint similar to the country/state one above to the Customers table of the Northwind database. Start out by opening Enterprise Manager, select your server, expand the Databases folder, expand the Northwind database, and select tables. Next right click the Customers table and select Design Table. The Design Table window should appear.

Next click the Manage Constraints icon on the toolbar.

Once the Properties window appears click the New button and enter the following statement for the Constraint expression (again, this is just a shortened list):

Country <> 'USA' OR Region IN ('AL', 'AK', 'CA', 'FL', 'GA', 'ID', 'MT', 'NM', 'OR', 'WA', 'WY')

Last, for Constraint name, enter CK_Customers_USA_Region. Your screen should look like this.
Before clicking Close, let’s look at what the three options at the bottom do:

  • Check existing data on creation will check all existing data in the table to see if any rows violate the check constraint. Note that the check is not performed until the table is saved and if there are any invalid rows an error is shown and the table modifications are not saved. You can leave this unchecked to only apply this constraint to new/updated rows.
  • Enforce constraint for replication will enforce this constraint if the table is replicated to another database. We will be getting into replication later in the series.
  • Enforce constraint for INSERT and UPDATE will enforce the constraint for any INSERT and UPDATE operations to the table.

Close the Properties window and save the table. If you receive the error “Unable to add constraint ‘CK_Customers_USA_Region’”, you probably have rows that were not part of the original Northwind database that have the Country USA and a Region abbreviation that is not in the check constraint we created.

One last thing to note about creating check constraints in Enterprise Manager is that the only place you add them is the Check Constraints tab of the table’s properties window (see above). In other words, there is no way in Enterprise Manager to specify that a check constraint should be tied to a specific column, whereas with the CREATE TABLE statement you can. However, the good news is that Enterprise Manager will automatically decide whether to add a column or table level check constraint based on the expression you enter.
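The same constraint can also be added with Transact-SQL instead of Enterprise Manager; a sketch (WITH NOCHECK corresponds to leaving “Check existing data on creation” unchecked):

```sql
-- Add the table-level check constraint from the Enterprise Manager example.
-- WITH NOCHECK skips validation of rows already in the table.
ALTER TABLE Customers WITH NOCHECK
ADD CONSTRAINT CK_Customers_USA_Region
CHECK (Country <> 'USA' OR Region IN
       ('AL', 'AK', 'CA', 'FL', 'GA', 'ID', 'MT', 'NM', 'OR', 'WA', 'WY'));
```

See the CHECK clause of ALTER TABLE in Books Online for the full syntax.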

SQL Server Table Logical Expressions

To put it simply, a check constraint is a Boolean expression that evaluates to TRUE or FALSE (a note about NULL in a second). When a row is inserted or updated each check constraint is evaluated. If any of the check constraints evaluate FALSE the transaction is rolled back. If all of the check constraints evaluate to any value other than FALSE the transaction continues. I say any other value because NULL introduces three-value logic into the mix. A NULL or unknown value will not cause the transaction to fail – only a check constraint that evaluates to FALSE will.
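A quick sketch of that behavior (the Trips table and column names here are hypothetical, used only to illustrate three-valued logic):

```sql
-- Hypothetical table: miles allows NULL and has a check constraint.
CREATE TABLE Trips (
    miles int NULL,
    CONSTRAINT CK_Trips_miles CHECK (miles <= 100)
);

INSERT INTO Trips (miles) VALUES (50);    -- evaluates TRUE: accepted
INSERT INTO Trips (miles) VALUES (NULL);  -- evaluates UNKNOWN: accepted
INSERT INTO Trips (miles) VALUES (200);   -- evaluates FALSE: rolled back
```

Only the third insert fails – the NULL row passes because the constraint evaluates to UNKNOWN, not FALSE.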

For example, say we had a table that was used by employees to track mileage to and from job sites. We can use the following check constraint to limit the maximum number of miles to 100:

miles <= 100

This statement will return TRUE if miles (the column name) is less than or equal to 100. The <= part of the check constraint is called a comparison operator and should be familiar to anyone who has done basic programming. SQL Server supports the following comparison operators, which can be used on all data types except text, ntext, and image:

= Equal to
> Greater than
< Less than
>= Greater than or equal to
<= Less than or equal to
<> Not equal to
!= Not equal to*
!< Not less than*
!> Not greater than*

* Not standard SQL comparison operators, but implemented by SQL Server.

Another type of operator you can use is the logical operator, such as AND and OR. For example, we can add more conditions on the number of miles by using the AND operator. Let’s add the restriction that each trip must be more than 0 miles:

miles > 0 AND miles <= 100

Our new check constraint now only allows miles to be in the range 1 to 100.

So far we have looked at column level check constraints, that is, constraints that only involve one column. We can also write constraints that involve multiple columns from the *same* table and the *same* row – known as table level check constraints. The word same in the last sentence is very important, because check constraints can’t reference columns from other tables, only the one they are created in. Additionally, a check constraint can only reference the current row in the table. For example, say we have a table that tracks customer address information with the columns country and region, and we want to add the constraint that if country is USA, region must contain a two letter state abbreviation. For this we can use the IN operator instead of having to type OR statements for each state abbreviation (state abbreviation list shortened for readability):

country <> 'USA' OR region IN ('AL', 'FL', 'GA')

There are at least a half dozen other ways you can write the above constraint that will accomplish the same thing.
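For instance, two equivalent formulations (a sketch; each enforces the same rule, including the UNKNOWN result when region is NULL):

```sql
-- Original form:
country <> 'USA' OR region IN ('AL', 'FL', 'GA')

-- Equivalent, negating the condition we want to forbid:
NOT (country = 'USA' AND region NOT IN ('AL', 'FL', 'GA'))

-- Equivalent, spelled out with OR instead of IN:
country <> 'USA' OR region = 'AL' OR region = 'FL' OR region = 'GA'
```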

Although understanding the basics of logical expressions are important, writing them is really more of a database programming issue rather than administration. There are many more operators you can use in your check constraints and I recommend checking out the topic operator precedence in Books Online if you are not used to writing logical expressions in SQL Server and want to learn more. Order of precedence and NULLS can cause unexpected results unless you have a good amount of experience.

SQL Server Tables and Referential Integrity

So all the tables have been added to the database, most if not all with logical relationships to one another – for example, each Order has many OrderDetail rows, each Salesperson has many Orders, etc. (if you’re not following, I would suggest you stop here and read up on logical database design)…what more do we need to do? Validate, validate, validate! Here is an example:

Say I have two tables called Order and OrderDetail. Each Order can have many OrderDetail rows. Each OrderDetail describes a single line item sold for a given Order. Now what happens if a program adds several OrderDetail rows and then freezes before adding the corresponding row to the Order table? Or what happens if an application deletes a row from Order without deleting all the corresponding rows from OrderDetail? In short, we end up with orphaned child rows (foreign key table) that have no corresponding parent (primary key table).

In a perfect world all applications would modify tables in the proper order. But this is not a perfect world and there are cases where situations like the above can (and will!) occur. For a small project, where the programmer is also the DBA, it is possible to diminish the odds of violating any relationships with careful application design…but what happens if a database is accessed by several dozen applications all written by multiple programmers? Or what if someone accesses the tables directly (gasp!) and makes some changes? A simple mistake could end up violating the integrity of the database with no one ever knowing.

Repeat after me: “It is the job of the RDBMS, not the end application, to ensure referential integrity.”
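That job is done with a foreign key constraint; a minimal sketch against the Order/OrderDetail example (the OrderID column name is an assumption):

```sql
-- Any OrderDetail row must reference an existing Order row, and an
-- Order row cannot be deleted while OrderDetail rows still point at it.
ALTER TABLE OrderDetail
ADD CONSTRAINT FK_OrderDetail_Order
FOREIGN KEY (OrderID) REFERENCES [Order] (OrderID);
```

(Order is a reserved word in T-SQL, hence the brackets – another argument for choosing table names carefully.)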

SQL Server Tables and Transactions

Back in More Database Creation Topics I gave an example of a transaction that moved money from one bank account to another. As the example illustrated, a transaction can prevent modification A from being made to the database unless modification B is also made. In other words, in a transaction, either all the modifications are made to the database (commit) or none of the modifications are made (roll back). Let’s take our definition of a transaction and expand on it.

The one line definition of a transaction is “a sequence of operations performed as a single logical unit of work.” To be considered a “transaction,” a single logical unit of work must exhibit the ACID (Atomicity, Consistency, Isolation, and Durability) properties:

Atomicity – A transaction must be an atomic unit of work where either all of its modifications are made, or none of them are made.

Consistency – When finished a transaction must leave all data in a consistent state. In this case “consistent state” means that all rules, or constraints, are applied to a transaction’s modifications.

Isolation – Changes made by simultaneous transactions must be isolated from the changes made by any and all other simultaneous transactions. In other words a transaction will either “see” the data in the state it was in before another simultaneous transaction modified it, or it will see the data after the second transaction has completed, but it will not see an intermediate state. This is known as serializability because it is possible to load the data back to a starting point, “replay” the transactions, and end up with data in the same state as it was after the original transactions were made.

Durability – Once a transaction is committed, the data modifications are permanent and can withstand system failure.

The good news is that from a physical/technical standpoint SQL Server does all of the work of implementing the ACID properties. However, the logic behind what is considered a transaction and if a modification is or is not valid is the job of the database administrator and/or programmer.
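To make the bank-account example concrete, here is a minimal T-SQL sketch (the Accounts table, column names, and account IDs are assumptions; production code would check @@ERROR after every statement):

```sql
BEGIN TRANSACTION;

-- Move $100 from account 1 to account 2.
UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 2;

IF @@ERROR = 0
    COMMIT TRANSACTION;    -- both modifications become permanent
ELSE
    ROLLBACK TRANSACTION;  -- neither modification is made
```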

Because choosing when a transaction begins and ends is more of a SQL Server programming issue than an administration issue, it will not be covered here. Instead, the rest of this article is going to focus on how to implement data validation.

There are many types of constraints that we can use to control the validation of data, including some that we have already used such as Allow Nulls and Data Types. There are however many more that we will look at over the next few articles. The rest of this article will focus on setting up foreign key constraints with check constraints following in my next article.

Creating SQL Server Tables

Once you have a logical design to work from, the process of creating your tables is quite simple. Start by opening Enterprise Manager, expand the server and database you would like to add a table too, then right click “Tables” and select “New Table…” from the menu.

The New Table window appears.

Each row represents an individual column of the table being created/edited.

“Column Name” is used to enter the name of the column

“Data Type” sets the type of data the column will contain

“Length” specifies the length ( or ‘n’ ) of the column. For some data types, such as char and varchar, this value can be changed and specifies the maximum number of characters the column will store. For other data types, such as int and decimal, that don’t have a settable length, “Length” shows the size (in bytes) the column will need.

“Allow Nulls” determines whether a particular column can be left empty. A check indicates NULLs are allowed in that column. That is, if a row is added/edited, the row is not required to have a value in that field. If there is no checkmark in Allow Nulls, any rows that are added or edited are required to have a value entered for that field. — Remember: 1, 0, and NULL (or ‘ABC’, ‘ ’, and NULL) are three different things. For example, while 0 and NULL may seem similar, 0 typically indicates “False” and NULL indicates “No Value.” It is important to keep this concept in mind; it will have implications for us further down the road.

A table that holds employee data may look something like this.

There are few more things to note here.

First, if you have ever used Access, you should recognize the key icon to the left of the EmployeeID row. This indicates which column(s) of the table make up the primary key (i.e. the column(s) used to uniquely identify every row in the table). To set the primary key, select the appropriate row(s) in the designer (you can hold down the Ctrl key to select more than one at a time) and then click the “Set primary key” icon in the toolbar.

Next, as you move from row to row, you will notice that the available options on the “Columns” tab change. Let’s take a look at these options individually:

Description – is a space for you to enter any comments about what the column is used for. Anything entered in this box will have no effect on SQL Server or the table.

Default Value – is the value that will be entered if no other value is specified when a row is added to the table. For example, I could set the default value of the Title column to “staff.” Each time an employee gets added and no title is entered, “staff” will automatically be substituted in place of NULL. In addition, you are not limited to literal text for the default value; you can also use functions. Common ones include GETDATE(), which returns the current system date/time, and NEWID(), which returns a new globally unique identifier. Note that if you set the column’s “Identity” property to yes, the default value is unavailable.

Precision/Scale – used to set the precision and scale values of the decimal and numeric data types. These options are not available for other data types. For more information on precision and scale see the decimal data type in the data types table at the beginning of this article.

Identity/Seed/Increment – Similar to the AutoNumber option in Access – if a column’s Identity option is set to true, SQL Server will automatically generate a new number for each row added to the table in this column. A table can have only one column defined with the Identity property set to yes and that column must use the decimal, int, numeric, smallint, bigint, or tinyint data type. The Seed value specifies what number SQL Server should start at. The Increment value specifies the number that should be added to the Seed value to determine successive identity numbers. Note that Identity columns are only guaranteed to generate unique numbers within one table – two tables both with an Identity column may (or may not) generate the same numbers. Another thing to keep in mind is that if the table experiences many deletions, large numbering gaps may occur because deleted identity values are not reused.

Is RowGuid – Specifies that this column contains the GUID for the rows in this table. A table can have only one column with its Is RowGuid value set to yes, and that column's data type must be uniqueidentifier. In addition, you must also set the default value for this column to use the NEWID() function. Unlike an Identity value, GUID values are guaranteed (or at least designed) to be unique for every row, in every table, in every database, on every computer in the world.
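In T-SQL the Is RowGuid setting corresponds to the ROWGUIDCOL property. A sketch, with a made-up table name:

```sql
-- RowID holds the GUID for each row; ROWGUIDCOL marks it as the
-- row GUID column, and NEWID() supplies a fresh GUID by default.
CREATE TABLE Documents (
    RowID uniqueidentifier ROWGUIDCOL NOT NULL DEFAULT NEWID(),
    Title varchar(100) NOT NULL
)
```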

Formula – used to enter a formula to make this a computed column. Instead of just storing data, a computed column takes a formula (like LastName + ', ' + FirstName) and generates a value based on the formula you entered. In addition, computed columns are generated on the fly, so updating a column that the formula references will update the computed column automatically.
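The same computed column in T-SQL might look like this (a sketch; the table is hypothetical):

```sql
-- FullName is computed from the other two columns on the fly;
-- it is not stored, and it reflects any change to LastName
-- or FirstName automatically.
CREATE TABLE People (
    FirstName varchar(50) NOT NULL,
    LastName  varchar(50) NOT NULL,
    FullName AS (LastName + ', ' + FirstName)
)
```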

Collation – gives the ability to set the collation for each individual column of a table. Leaving this at the default sets the column to use the same collation setting as the database the table is located in.
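A per-column collation can also be set in T-SQL with a COLLATE clause. A sketch (the table is made up; the collation shown is a standard case-insensitive, accent-sensitive Latin1 collation):

```sql
-- Body uses its own collation instead of the database default.
CREATE TABLE Notes (
    Body varchar(200) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
)
```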

Once you have added all the columns you can click the save icon (left most, looks like a disk) on the toolbar and you will be prompted to enter a name for the table.

Once you enter a name and click OK, you can close the New Table screen by clicking the lower, inner, X at the upper right. You can also use the “Window” menu to switch between windows or arrange them.

After refreshing the Tables list (F5) the table we just created appears in SQL Server Enterprise Manager.

You can always go back and edit a table by right clicking it in Enterprise Manager and selecting “Design Table”.

One last thing: using Enterprise Manager is not the only way to create a table – you can also use the CREATE TABLE statement. For more information, see CREATE TABLE and ALTER TABLE in the SQL Server Books Online.
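As a quick sketch, the same kind of table we built in the designer could be created, and later changed, entirely in T-SQL (all names here are illustrative):

```sql
-- Create the table without the designer...
CREATE TABLE Staff (
    StaffID  int IDENTITY(1, 1) NOT NULL,
    LastName varchar(50) NOT NULL,
    Title    varchar(50) NOT NULL DEFAULT 'staff'
)

-- ...and later modify it with ALTER TABLE.
ALTER TABLE Staff ADD Phone varchar(20) NULL
```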

SQL Server Data Types

Every column in your table must have a "data type," which is simply a property that defines what type of data is stored in that column. In addition, a data type causes SQL Server to reject data that is not of the correct type (i.e. attempting to store a letter in a column designed for numbers). SQL Server has over 25 different data types, some with more options than others. Let's look at the different data types and the options for each (much of this is copied from the SQL Server Books Online):

Data Type (Size) Description

Integers:


bigint (8 bytes) Holds integer (whole number) data from -2^63 (-9,223,372,036,854,775,808) through 2^63-1 (9,223,372,036,854,775,807).

int (4 bytes) Holds integer (whole number) data from -2^31 (-2,147,483,648) through 2^31 – 1 (2,147,483,647).

smallint (2 bytes) Holds integer data from -2^15 (-32,768) through 2^15 – 1 (32,767).

tinyint (1 byte) Holds integer data from 0 through 255.

bit (1 byte for up to 8 bit columns) Holds integer data with either a 1 or 0 value. Each set of up to 8 bit columns requires 1 byte. So if there are anywhere from 1 to 8 bit columns in a table, the storage space you will need is 1 byte. If there are anywhere from 9 to 16 bit columns in a table, the storage space you will need is 2 bytes. And so on…


Decimal:

decimal (Anywhere from 5 to 17 bytes depending on the precision) Holds fixed precision and scale numbers. When maximum precision is used, valid values are from -10^38 + 1 through 10^38 – 1. The Precision specifies the maximum total number of decimal digits that can be stored, both to the left and to the right of the decimal point; it must be a value from 1 through 38 (the maximum precision). The Scale specifies the maximum number of decimal digits that can be stored to the right of the decimal point; it must be a value from 0 through Precision. Examples:
if precision is set to 10 and scale is set to 3, the smallest (other than 0) / largest numbers we could store are 0.001 / 9999999.999
if precision is set to 8 and scale is set to 6, the smallest (other than 0) / largest numbers we could store are 0.000001 / 99.999999
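A small sketch of precision and scale in practice (variable name is made up):

```sql
-- decimal(10, 3): up to 10 total digits, 3 of them to the right of
-- the decimal point, so the largest value is 9999999.999.
DECLARE @price decimal(10, 3)

SET @price = 1234.567
SELECT @price          -- stored exactly

-- A value with more than 3 decimal digits is rounded to fit the scale.
SET @price = 1234.5678
SELECT @price
```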

numeric (Same as decimal data type)


Money:

money (8 bytes) Holds monetary data values from -2^63 (-922,337,203,685,477.5808) through 2^63 – 1 (922,337,203,685,477.5807), with accuracy to one ten-thousandth of a monetary unit.

smallmoney (4 bytes) Holds monetary data values from -214,748.3648 through +214,748.3647, with accuracy to a ten-thousandth of a monetary unit.


Floating Point:

float(n) (Anywhere from 4 to 8 bytes depending on the precision) Holds floating precision number data from -1.79E + 308 through 1.79E + 308. The value n is the number of bits used to store the mantissa of the float number and can range from 1 to 53.

real (4 bytes) Holds floating precision number data from -3.40E + 38 through 3.40E + 38. Real is the same as float(24).

Date and Time:

datetime (8 bytes) Holds date and time data from January 1, 1753 through December 31, 9999, with an accuracy of one three-hundredth of a second (3.33 milliseconds).

smalldatetime (4 bytes) Holds date and time data from January 1, 1900 through June 6, 2079, with accuracy to the minute.

Strings (non-Unicode):

char(n) (n bytes) Holds fixed-length non-Unicode character data with a length of n characters, where n is a value from 1 through 8000. If fewer than n characters are entered, the value is padded with trailing spaces to length n, so n bytes are still required.

varchar(n) (Depends on the length of data entered, 1 byte per character) Holds variable-length non-Unicode character data with a length of n characters, where n is a value from 1 through 8000. The storage size is the actual length in bytes of the data entered, not n bytes.

text (16 bytes for the pointer) Holds variable-length non-Unicode data with a maximum length of 2^31 – 1 (2,147,483,647) characters. The column entry is a pointer to the location of the data value; the data is stored separately from the table data.
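A quick way to see fixed-length padding versus variable-length storage is the DATALENGTH function, which returns the number of bytes stored. A sketch with a made-up table:

```sql
-- Compare char and varchar storage for the same value.
CREATE TABLE StringDemo (
    Fixed    char(10),
    Variable varchar(10)
)

INSERT INTO StringDemo VALUES ('abc', 'abc')

-- Fixed is padded with spaces to 10 bytes; Variable uses only 3.
SELECT DATALENGTH(Fixed)    AS FixedBytes,
       DATALENGTH(Variable) AS VariableBytes
FROM StringDemo
```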

Strings (Unicode):

nchar(n) (2 bytes * n) Holds fixed-length Unicode character data of n characters, where n is a value from 1 through 4000. Unicode characters use 2 bytes per character and can support all international characters. If fewer than n characters are entered, the value is padded with trailing spaces to length n, so 2 * n bytes are still required.

nvarchar(n) (Depends on the length of data entered, 2 bytes per character) Holds variable-length Unicode data of n characters, where n is a value from 1 through 4000. Unicode characters use 2 bytes per character and can support all international characters. The storage size is the actual length of the data entered * 2 bytes, not 2 * n bytes.

ntext (16 bytes for the pointer) Holds variable-length Unicode data with a maximum length of 2^30 – 1 (1,073,741,823) characters. The column entry for ntext is a pointer to the location of the data; the data is stored separately from the table data.


Binary:

binary(n) (n + 4 bytes) Holds fixed-length binary data of n bytes, where n is a value from 1 through 8000. Use binary when column data entries are consistent in size.

varbinary(n) (Depends on the length of data entered + 4 bytes) Holds variable-length binary data of n bytes, where n is a value from 1 through 8000. Use varbinary when column data entries are inconsistent in size.

image (16 bytes for the pointer) Holds variable-length binary data longer than 8000 bytes, with a maximum of 2^31 – 1 bytes. An image column entry is a pointer to the location of the image data value; the data is stored separately from the table data.


Other:

sql_variant (size varies) A column of type sql_variant may contain rows of different data types. For example, a column defined as sql_variant can store int, binary, and char values. The only types of values that cannot be stored using sql_variant are text, ntext, image, timestamp, and sql_variant.

timestamp (8 bytes) Timestamp is a data type that exposes automatically generated binary numbers, which are guaranteed to be unique within a database. Timestamp is used typically as a mechanism for version-stamping table rows. Each table can have only one timestamp column.

uniqueidentifier (16 bytes) Stores a 16-byte binary value that is a globally unique identifier (GUID).

That about covers it for the data types. There are two data types I left out (cursor and table), but they are really not applicable here because you can’t have a column that is made up of those data types.

Relational Database Design Concepts

Up to this point we have focused on the physical design of a database, which includes things like files and filegroups. The physical design is concerned with how data is stored and the most efficient way to access it. Now we are going to start implementing the logical design of a database, which includes objects like tables and the relationships between them. The logical design is only concerned with modeling a real-world data scenario – it is unaware of, and not dependent on, any one particular Database Management System. For example, I could use the same logical design to create a database in both SQL Server and Access, and I would come up with the exact same database as far as the logical design is concerned. However, the physical design for the two databases could be very different – the Access database would be made up of only one file, whereas the SQL Server database could be made up of many different files spread across several hard drives. Another example is a database on SQL Server that gets another data file added to it: while the physical design has changed by adding another file, the logical design is still the same. The important point here is that the physical implementation of a database and the logical implementation of a database are two distinct concepts that are, for the most part, independent of one another.

I think I’ve said this already, but if not, SQL Server is a Relational Database Management System (RDBMS). So it makes sense that the databases that we are going to be working with are relational databases.
If this is your first time working with a relational database of any kind or you never got a good grasp on the design concepts, I encourage you to stop here and get a book on designing relational databases. Check out the following links to get started learning about logical database design:

Support WebCast: Database Normalization Basics

Understanding Relational Database Design

ACC2000: Database Normalization Basics (Q209534)

Recovering and Restoring a SQL Server Database

Restoring a database from a file or tape backup is quite simple, but there are a few things you must think about when using the Full and Bulk_Logged recovery models. Note that we are not going to cover restoring a database that uses the simple recovery model. The first thing to understand is that restoring a database and recovering a database are two different operations entirely. Restoring is simply the process of copying data from backups into the database. Recovering, on the other hand, is the process of using the transaction log to bring the data in the database to a consistent state. Rather than just give you a one sentence description, let’s take a closer look at exactly what the recovery process is.

If you remember back a few articles (Understanding Transaction Logs), I talked about how the recovery process is used to recover modifications if SQL Server is improperly shut down, known as restart recovery. A similar process is used when we restore a database, known as restore recovery. If SQL Server is improperly shut down, we end up with a chunk of data that is in an inconsistent state; that is, we don't know which modifications had been saved and which had not before the unexpected shutdown. In addition to a chunk of inconsistent data, we also have a log file that contains a list of all the changes that were made to the data – this is where the recovery process comes in.

In order to bring the data into a consistent state, all transactions in the log that are marked as completed are re-applied to the data, known as "rolling forward," whereas all transactions in the log that had not completed at the time of the unexpected shutdown are undone, known as "rolling back." By rolling forward completed transactions and rolling back uncompleted transactions, we are left with data that is in a consistent state, meaning we never end up with half a transaction completed, which could result in, for example, funds being deducted from account A but never added to account B (i.e. a big mess).
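As a sketch of how a restore followed by recovery plays out in T-SQL, each RESTORE step uses WITH NORECOVERY to postpone recovery until the last backup is applied (the database name and backup file paths are hypothetical):

```sql
-- Restore the full backup, but do not run recovery yet,
-- so more backups can still be applied.
RESTORE DATABASE Sales
FROM DISK = 'E:\Backups\Sales_Full.bak'
WITH NORECOVERY

-- Apply the transaction log backup, then run recovery:
-- completed transactions are rolled forward, uncompleted
-- transactions are rolled back, leaving consistent data.
RESTORE LOG Sales
FROM DISK = 'E:\Backups\Sales_Log.trn'
WITH RECOVERY
```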