


Magazines > Computers in Libraries > November/December 2004
Vol. 23 No. 10 — Nov/Dec 2004
THE SYSTEMS LIBRARIAN
Now That It's All Digital, Where Do I Put It? Exploring Data Storage Technologies
by Marshall Breeding

The thing about cutting-edge technology is that it dulls so quickly. The hardware, software, and technology concepts that today seem blazingly fast, superabundant in capacity, or transformative in their effects will in just a few short years be considered mediocre or passé. Many cutting-edge technologies fizzle out and slip into obscurity once the hype dies. Yet, it's important to follow the latest in technology and to ride as close to the leading edge as we dare—or at least as close as we can afford. It's also good to take note of the technologies that have passed their prime. Technology that's actually practical to use lies somewhere in between the cutting edge and the obsolete.

The area of technology that I struggle with the most is data storage. Like most libraries, we're involved in projects to digitize portions of our collections. Besides those projects, everyday computing, both at home and at work, involves the need to constantly store, transfer, and back up data. My involvement with the Vanderbilt Television News Archive causes me to think about all the available possibilities for storing large amounts of data. We are currently working on a project to digitize our videotape collection, and finding ways to store, archive, and move the data has proved to be an enormous challenge. We're producing almost 3.5 terabytes of content a month, and at the end of the project we'll have created over 130 terabytes. While the capacity of storage devices generally increases each year as the cost per megabyte goes down, at the scale of this project current capacities are inadequate and costs are too high.
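
As a rough illustration of that scale, the short Python sketch below estimates how long a project of this size runs at the stated pace. It uses only the rounded figures quoted above ("almost 3.5" and "over 130" terabytes), so the result is approximate; the function name is made up for this example.

```python
# Rounded figures from the text: "almost 3.5" TB per month, "over 130" TB in total.
MONTHLY_OUTPUT_TB = 3.5
PROJECT_TOTAL_TB = 130


def months_to_complete(total_tb: float, monthly_tb: float) -> float:
    """Months needed to produce total_tb at a steady monthly_tb per month."""
    return total_tb / monthly_tb


months = months_to_complete(PROJECT_TOTAL_TB, MONTHLY_OUTPUT_TB)
print(f"About {months:.0f} months, or roughly {months / 12:.1f} years")
```

At these rates the work spans a little over three years, which is long enough for storage prices and capacities to shift while the project is still under way.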

Yet not all storage needs are large-scale. There are times when the need centers on small and portable. Whether you want to store a megabyte, a gigabyte, or a terabyte, there are a lot of great technologies available today. I'll cover some here.

The Diskette Is Dead

First of all, let's recognize that the days of floppy disks, or diskettes, have passed. The diskette is dead. Diskette drives were once standard equipment on computers, but most new models offer them only as added options. The 1.44 MB offered by the latest generation of 3.5-inch diskettes just doesn't hold enough data to be useful in today's world. Therefore, it is important to transfer any information that you have on diskettes before it becomes hard to find drives that can read them. One of the realities of data storage in the digital world is the need to constantly refresh and transfer content to current technologies. The problem especially applies to librarians who may have items in their collections that include content delivered on now-obsolete media, such as books with supplementary materials supplied on diskettes. It might be a good idea to keep a computer equipped with a diskette drive available until you are positive that everything in your library's collection has been transferred.

Optical Storage Solutions

Optical discs have taken over as the preferred media for portable storage. CD-R and CD-RW are convenient and low-cost ways to store and transport data and music. Blank CD-R discs sell for mere pennies and offer up to 700 MB of storage. Almost any new computer comes equipped with a CD-R drive and the software for burning data or music onto discs. While no longer on the cutting edge, optical discs are solid, practical technology, and they are far from extinction.

While CD-R is holding its own, recordable DVD is a rising star. The storage capacity of CDs, while fine for data or audio files, falls short of what's needed for video. While commercially pressed DVDs have been around for quite some time, the hardware and software for burning your own have been a bit pricey until recently. The cost of recordable DVDs has come down enough in the last year to make them well-suited not just for video but also for a variety of applications that deal with large amounts of data. Three years ago, when I was planning and budgeting a video project, a DVD-R drive sold for just under $1,000 and the blank discs cost as much as $20 each. Now, drives are under $200 and a blank disc costs well below $1. At 15 to 25 cents per GB, DVD-R stores data relatively inexpensively. The discs function well as backup storage, but given their fragility, they aren't reliable enough for long-term archiving.
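
The price drop is easier to appreciate as a cost per gigabyte. The sketch below is only an illustration: it assumes the nominal 4.7 GB capacity of a single-layer DVD-R (a figure not given in the text) and a hypothetical current disc price of $0.75, picked to stand in for "well below $1."

```python
# Nominal capacity of a single-layer DVD-R; an assumption, not stated in the text.
DVD_R_CAPACITY_GB = 4.7


def cost_per_gb(price_per_disc: float, capacity_gb: float = DVD_R_CAPACITY_GB) -> float:
    """Media cost in dollars per gigabyte for one blank disc."""
    return price_per_disc / capacity_gb


then = cost_per_gb(20.00)  # "as much as $20 each" three years earlier
now = cost_per_gb(0.75)    # hypothetical price standing in for "well below $1"
print(f"Then: ${then:.2f}/GB, now: ${now:.2f}/GB")
```

With these inputs the media cost falls from more than $4 per gigabyte to about 16 cents, which sits inside the 15-to-25-cent range cited above.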

One of the maddening realities in the realm of DVD is the variety of media types and drives. Options include DVD-R and DVD-RW (supported by a group of manufacturers called the DVD Forum) and DVD+R and DVD+RW (supported by the DVD+RW Alliance). While each format has subtle advantages, the lack of a single standard makes selecting equipment and media more complicated than necessary. Fortunately, many DVD drives support all of the different media options.

While DVD-R has improved dramatically over its short lifecycle, I still find its ratio of cost to capacity unsatisfactory for large-scale video projects. The next generation of optical storage, based on blue lasers rather than the red ones used today, promises some improvement. Dubbed Blu-ray (http://www.blu-ray.com), these discs use a dual-layer approach to hold as much as 50 GB. Don't expect to see this technology on the shelves until the end of 2005. Unfortunately, by then, 50 GB of storage may not seem so impressive.

These days, we have lots of options for computer storage based on magnetic drives. Individual workstations now come with hard drives from 80 to 300 GB. While that seems generous, now that many people have graduated from listening to music in MP3 format to watching videos in MPEG, even today's largest drives can quickly become cramped. Even so, the vast majority of users will find that the amount of storage offered on most current computer models is quite sufficient for their needs. When more capacity is needed, the EIDE architecture used by most desktop computers makes it easy to install an additional drive. Having half a terabyte of storage on a PC is quite feasible. The ability of relatively low-cost PCs to offer large-scale storage meets my standards for great technology.

Storing Outside the Box

Large-capacity external drives that connect through USB or FireWire rate high on my list of useful technologies. These come in just about any size. I've seen them as small as 40 GB and as large as 1.6 TB. Remember the days of "sneakernet," when the only way to move files from one computer to another was by copying them to a diskette and walking them over to the recipient? Today, a large external drive can be extremely handy when you need to transfer very large data sets. Suppose you need to move 500 GB worth of data from one organization to another located across the country. Using a DSL connection at 1.5 megabits per second (Mbps), the transfer would take about 31.6 days, nonstop. It would definitely be faster to copy the data to an external drive and ship it. These drives can also serve as a fast and convenient way to make backup copies of data. I have a database of image files that takes about 30 DVDs to back up. Making an extra copy on a 250 GB external USB drive is fast and convenient, though a bit more expensive. (Expect costs of about $1 per gigabyte.)
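
The DSL estimate in the paragraph above can be checked with a little arithmetic. The sketch below assumes the line rate is 1.5 megabits (not megabytes) per second, since that reading reproduces the 31.6-day figure, and it lets you choose between binary units (1 GB = 1024^3 bytes) and decimal ones. The function name and its parameters are invented for this illustration.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(size_gb: float, link_mbps: float, binary_units: bool = True) -> float:
    """Days needed to send size_gb gigabytes over a link_mbps megabit/second line.

    Assumes the line runs nonstop at its full rated speed, with no protocol overhead.
    """
    bytes_per_gb = 1024**3 if binary_units else 1000**3
    bits_per_megabit = 1024**2 if binary_units else 1000**2
    total_bits = size_gb * bytes_per_gb * 8
    seconds = total_bits / (link_mbps * bits_per_megabit)
    return seconds / SECONDS_PER_DAY


print(f"Binary units:  {transfer_days(500, 1.5):.1f} days")
print(f"Decimal units: {transfer_days(500, 1.5, binary_units=False):.1f} days")
```

Either way the answer is about a month of uninterrupted transfer, and real connections rarely sustain their rated speed, so an overnight courier wins easily.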

Network-Based Storage

Network-based storage options available today include server-attached storage, network-attached storage (NAS), and storage area networks (SAN). Selecting and building a large-scale storage system is a complex issue and may ultimately involve a combination of these technologies. I won't go into the details here, but for my latest projects I have steered away from SAN technologies. My previous experience with them led me to the opinion that unless a project has very specific needs for high manageability, ultra-high availability, and extremely large capacity, the cost and complexity of a SAN may not be warranted. I'm currently using a cluster of servers with simple attached storage to provide about 14 terabytes of storage for our digital video system.

You Can Take It with You

You don't always need a lot of storage; sometimes you just need a modest amount, but you want it to be small and portable. USB-attached flash drives fill this niche superbly. With capacities from a few megabytes up to a couple of gigabytes, these devices provide a convenient way to carry your presentation to a conference, to move files between home and work, or to do any number of chores. And they're incredibly easy to use—just plug the tiny device into the USB port of any relatively current computer and it almost instantly shows up as an additional drive.

Choose It and Use It Wisely

Data storage is not a one-size-fits-all proposition. Every project or problem brings a different focus of concern: capacity, portability, long-term archiving, performance, economy, reliability, or flexibility. While data storage remains the one technology problem that I wrestle with the most, fortunately, solutions abound.

Keep in mind that digital data is fragile. Any given copy of a file can be instantly destroyed through a hardware or software failure or through human error. Don't ever leave yourself in the situation of having only one copy of your data—make multiple copies and keep them in separate locations. It is important to remember all the different technology options that we have available for safely storing the data we create.
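
One way to act on that advice is to script the copying and confirm that each copy matches the original. The Python sketch below is a minimal illustration, not a full backup system: it assumes the destinations are ordinary folders (for example, an external drive and a network share), and the function names and the example paths in the comments are hypothetical. Comparing SHA-256 checksums is one common verification technique chosen here for the example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 checksum of a file, reading it in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destinations: list[Path]) -> list[Path]:
    """Copy source into each destination folder and confirm every copy matches."""
    expected = sha256_of(source)
    verified = []
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / source.name
        shutil.copy2(source, target)  # copies the data and file timestamps
        if sha256_of(target) != expected:
            raise IOError(f"Copy at {target} does not match the original")
        verified.append(target)
    return verified


# Example call with hypothetical paths:
# copy_and_verify(Path("images.db"), [Path("/mnt/usb_drive"), Path("/mnt/offsite")])
```

Copies that live on the same machine share the same risks, so the destinations should point at physically separate devices or locations.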


Marshall Breeding is the library technology officer at Vanderbilt University in Nashville, Tenn., and a consultant, speaker, and writer in the field of library automation. His e-mail address is marshall.breeding@librarytechnology.org. You can also reach him through his Web site at http://staffweb.library.vanderbilt.edu/breeding.
