If you ask most network administrators what the top five tasks they perform on a repeated basis are, one task that is sure to be on the list is restoring a single file or a small number of files from backup (that and resetting passwords!). Users accidentally delete files, overwrite files, or may need an older copy of a file. To get these files back the user would most likely have to contact the IT department and give them a description of the file, where the file was stored, and the time or date when the deletion or modification occurred. Now unless you’re the CEO or CIO, or the person who is responsible for backups has a crush on you, restoring a file from backup is not exciting and in most situations does not get top priority. Additionally, while we know they try their best to give as much information as possible, users can get confused and the information about the files they need may not be quite right – and the file hunt begins!
Sometimes we can get lucky – the file may have been deleted only an hour ago and there is a backup from last night on another hard drive in the server. In this situation restoring the file is a simple copy/paste operation and is relatively painless if you don’t have to do it a lot. But what happens if the file they need is from a week ago? Or what if the user is not sure whether they need a copy from last week or two weeks ago? What if they need a copy from Monday, Wednesday, and Friday of last week? While hunting through tapes is not the end of the world, I can sure think of a few other things I would rather spend my time on!
During this whole process of restoring a file the network administrator’s time is occupied, the user may be unable to complete a task until they get the file, or the user may decide dealing with IT would just take too long and end up rebuilding the file by hand. Any way you look at it, restoring files ends up causing lost productivity and in turn lost revenue for the company – but what if there was a better way?
Windows Server 2003 introduces a new feature called Shadow Copies of Shared Folders which solves many of the problems associated with restoring a small number of files from backup. To put it simply, Shadow Copies of Shared Folders provide point-in-time copies of files located in shared folders on a Windows Server 2003 server. These copies are accessible by end users and show what a shared folder, or a single file inside of the shared folder, looked like a few hours ago, yesterday, last week, or even a few weeks ago.
Shadow Copies of Shared Folders works by storing a copy of data that has changed since the last shadow copy. Because Shadow Copies of Shared Folders only stores block-level (a.k.a. cluster) changes to files rather than the entire file the amount of hard drive space needed is greatly reduced. The administrator can specify when shadow copies are made and the amount of disk space that is used to store changes – with newer copies replacing older copies as needed.
While this is a great new feature, there are a few things you need to be aware of when planning to use Shadow Copies of Shared Folders (called SCSF from here on out):
- SCSF is configured on a volume-by-volume basis and is only available on NTFS volumes. That is, SCSF is enabled for every shared folder on a given volume or for none at all – you can’t pick and choose which shared folders on a given volume will or will not use SCSF. The schedule of when shadow copies are made is also set at the volume level.
- SCSF will not work on mount points (when a second hard drive is mounted as a folder on the first).
- Dual booting the server into previous versions of Windows could cause loss of shadow copies.
- SCSF is NOT a replacement for undertaking regular backups!
Let’s look at an example of how to set up and use SCSF. In the example we have three drives – C: contains the Windows Server 2003 installation, E: contains a shared folder with user documents that are redirected from the user’s My Documents folder (i.e. folder redirection set up in Group Policy), and F: is not being used.
Putting the shares I want to use SCSF with on their own hard drive (or even on another partition of the same physical drive) avoids wasting shadow copy disk space and I/O bandwidth on shares that don’t need SCSF.
To enable SCSF on a volume, right-click the drive in Windows Explorer, select Properties, and choose the “Shadow Copies” tab.
While SCSF will let you store the shadow copy data on the same drive as the shares being copied, this is not optimal – the drive head has to go back and forth in order to read the data in the shared folder and then write it to shadow copy storage. Lightly loaded file servers can deal with having everything on one drive, but putting the shadow copy storage area on the same drive on servers with high I/O loads can cause serious slowdowns and is not recommended.
If we wanted to set up SCSF with the default settings (which are: store the shadow copy data on the same drive, set the maximum limit to 10% of the total drive space, schedule copies to be made Monday – Friday at 7:00 AM and 12:00 PM, and make the initial copy) we could simply select the drive and choose the “Enable” button. But because we have a physical hard drive separate from the one that contains the shared folder (convenient, wasn’t that?) we will configure SCSF to use drive F: as the shadow copy storage area. Note that a single volume can act as the storage area for multiple other volumes – the only limitation is the amount of free drive space available.
To continue configuring SCSF select drive E: from the list of volumes and click “Settings…”
In the “Located on this volume” drop-down list select the drive you want shadow copy data to be stored on. In this case we will choose drive F:. Next set the Maximum size of the storage area to an appropriate amount of disk space for your situation.
So just what is an “appropriate amount of disk space” anyway? Well it all depends on the situation (doesn’t it always). Although 10% of the total drive space is a good estimate, you need to take the following variables into consideration:
- The amount of data in the shared folders
- The frequency that different blocks change in the shared folders
- The amount of free disk space on the drive that contains the storage area
- How many past shadow copies you want to keep
- The cluster size of the volume that holds the shared folders
- There is a 100MB minimum
Note that when I say “frequency that different blocks change” I mean the number of blocks that change between shadow copies – not the number of times each block changes. For example, say I make a shadow copy at 7:00AM and 12:00PM and I have two files that change (let’s assume each file fits in a single block for simplicity): fileA and fileB. Between 7:00AM and 12:00PM let’s say fileA is updated 4 times and fileB is updated 2 times. Because shadow copies take a “snapshot” of the shared folders at a point in time there is only one copy per updated block, not one per change to each block. So in our example when the 12:00PM shadow copy is made only the most recent versions of our two files are copied – the 4th update to fileA and the 2nd update to fileB, not all 6 different updates that were made.
Another example would be a large file that is made up of multiple blocks – let’s say 100. Next let’s open the file, modify the last two lines, and save the file. By doing so our modifications don’t modify the entire file, just the last block (note that this depends on the application and whether it rewrites the entire file to disk). Now let’s assume that we repeat our modifications – changing the last few lines of the file a half dozen more times. When the next shadow copy is made only our final change to the last block of the file is copied – none of the first 99 blocks or the first six modifications to the 100th block are copied. In other words, whether you update a file 5 times or 5,000 times, the space needed to store the shadow copy for that file is the same (assuming that the same blocks are modified in both cases). Got that?
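The two examples above can be sketched in a few lines of Python. This is only a toy model of the idea that space is consumed per changed block, not per save; the function name `changed_blocks` and the (file, block) pairs are made up for illustration and have nothing to do with how the shadow copy driver is actually implemented.

```python
def changed_blocks(writes):
    """Return the distinct (file, block) pairs written since the last shadow copy.

    `writes` is a sequence of (file_name, block_index) tuples, one per save.
    Repeated writes to the same block collapse into a single entry.
    """
    return set(writes)


# Example 1: fileA is saved 4 times and fileB 2 times, one block each.
small_files = [("fileA", 0)] * 4 + [("fileB", 0)] * 2
print(len(small_files), "writes,", len(changed_blocks(small_files)), "blocks to store")

# Example 2: a 100-block file whose last block (index 99) is rewritten 7 times.
big_file = [("bigfile", 99)] * 7
print(len(big_file), "writes,", len(changed_blocks(big_file)), "block to store")
```

Six saves cost two blocks in the first case and seven saves cost one block in the second, which is why the number of distinct changed blocks is the figure that matters when sizing the storage area.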
Also, why does the cluster size on the volume matter? The simple answer is that when you defragment a volume the clusters that make up files are reorganized. SCSF may see this as a modification and will make a copy at the next shadow copy. To minimize the number of times this occurs Microsoft recommends that you use a cluster size of 16K, or any multiple of 16K, when you format the volume. The driver used to support shadow copies is aware of defragmenting and can optimize for it, but only if blocks are moved in multiples of 16K. Note however that the larger you make the cluster size the more space you waste (if you have a 24K file the minimum space allocation is still 1 cluster, so if the cluster size was 64K we would end up wasting 40K). Additionally, if your cluster size is over 4K you can’t use file compression – file compression requires cluster sizes of 4K or less. If you can’t use a larger cluster size don’t worry, just keep in mind you may need additional space in the storage area due to defragmentation.
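To put numbers on the wasted-space trade-off, here is a small Python sketch of the arithmetic. It assumes the simple model used above, where a file always occupies a whole number of clusters; the function names are my own, and it ignores details such as very small files that NTFS can store inside the file record itself.

```python
KB = 1024


def allocated_size(file_bytes, cluster_bytes):
    """Space a non-empty file occupies when rounded up to whole clusters."""
    clusters = -(-file_bytes // cluster_bytes)  # ceiling division
    return clusters * cluster_bytes


def wasted_space(file_bytes, cluster_bytes):
    """Slack left over in the last cluster of the file."""
    return allocated_size(file_bytes, cluster_bytes) - file_bytes


# The 24K file from the text on three different cluster sizes.
for cluster_kb in (4, 16, 64):
    waste_kb = wasted_space(24 * KB, cluster_kb * KB) // KB
    print(f"{cluster_kb}K clusters: {waste_kb}K wasted")
```

For the 24K file this gives 0K wasted on 4K clusters, 8K on 16K clusters, and 40K on 64K clusters, so a 16K cluster size is a reasonable middle ground if most of your files are small.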
Back to the settings screen – when you are done selecting a drive to use as the storage area and setting the limit, click OK.
The Shadow Copy tab now shows that we have set drive F: to be used as the storage area for shadow copies made on drive E:
To continue select the E: drive and click “Enable.”
We are informed that this will use the default settings – although this is true for the schedule, it *will* use our selection of drive F: as the storage area.
Click Yes to continue.
The initial shadow copy is then made and the schedule for updates is enabled.
If you would like to modify the schedule of when copies are made you can select the drive, click Settings, and then choose schedule.
When done, click OK on all screens to exit out of the drive properties.
One thing that you may notice is that a new scheduled task appears in the Scheduled Tasks folder in Control Panel for each drive you set up SCSF on. While this is the same schedule that is available from the Settings on the Shadow Copies tab, I would recommend that you don’t directly modify the scheduled task itself – there are many more settings that could be accidentally “goofed.”
So speaking of schedules, how often should we make shadow copies? Again, this depends on the data and when your users use the data, but there are a few things to keep in mind:
- Microsoft recommends a minimum of one hour between shadow copies, and even that is probably way too short for most situations.
- One shadow copy per day is probably the least you want on weekdays; in other words, about 24 hours is the longest you want to go between shadow copies.
- The longer the time between shadow copies the longer and more I/O intensive the shadow copy will be (due to the fact there are more changed blocks).
- Your goal should be to take snapshots of data that would be most useful to users and made when they will impact the system the least.
- There is a maximum limit of 64 shadow copies per volume, regardless of whether there is free space in the storage area.
Twice a day during weekdays is probably sufficient for most M-F 9-5 operations – once in the morning before anyone shows up and then once at lunch when a good number of people are out of the office. The times here are important – any shadow copy taken while users are on the system should impact it as little as possible. By taking a shadow copy right before most people are in the office, the number of blocks that have to be copied during the noon shadow copy is reduced and in turn the I/O impact on the system is less. If a shadow copy was only taken at noon, the I/O impact would be for blocks updated over the last 24 hours (and a lot higher) rather than just the last 5-6 hours.
Some sites may need more than two shadow copies per day, maybe an additional copy made at the end of the day, or maybe even four copies a day in some situations. However keep the 64 shadow copies per volume maximum in mind – if you take two copies per day during weekdays you can store almost six and a half weeks of shadow copies (assuming you have the hard drive space), a little over four weeks if you take three copies per day, and a little over three weeks if you take four copies per day. The 64 limit should not be a problem for most situations, but it’s the reason why you don’t want to just take a copy every couple of hours – there would only be a little over a week of shadow copies before they start to get overwritten with newer shadow copies.
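If you want to check the retention numbers for your own schedule, the arithmetic is easy to sketch in Python. The only facts taken from above are the 64-copy limit and a weekday-only schedule; the function name and its `days_per_week` parameter are my own, and the sketch assumes the storage area never fills up before the 64-copy limit is reached.

```python
MAX_SHADOW_COPIES = 64  # per-volume limit


def weeks_of_history(copies_per_day, days_per_week=5):
    """Weeks of shadow copies kept before the oldest ones are replaced."""
    return MAX_SHADOW_COPIES / (copies_per_day * days_per_week)


for copies in (2, 3, 4, 12):
    print(f"{copies} copies per weekday: {weeks_of_history(copies):.1f} weeks")
```

Twelve copies per weekday (one every two hours around the clock) works out to just over one week of history, compared with 6.4 weeks at two copies per day.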
Well, that about does it for this article. Part two will cover the client side of Shadow Copies of Shared Folders.