   DataBus - Vol 41 No. 4: June-July, 2001
  

Networked Storage

Part II of a two-part series on
deploying NAS or SANs in your
school or district's environment
Darryl La Gace
Lemon Grove School District
Technical contributions also provided by
Hewlett-Packard

 
Two storage solutions: There are two popular forms of networked storage today: Network Attached Storage or NAS and Storage Area Networks or SAN. Both NAS and SAN provide the ability for multiple servers or clients to share storage resources, but they differ greatly in their implementation. The following diagram shows how NAS and SAN can be differentiated based on connectivity:
[Figure: Storage Networking Diagram]
In client-server computing, multiple clients (PCs or workstations) talk to a server over a local area network or LAN. A single server can service hundreds of clients depending on the configuration. Now, imagine those same clients seamlessly accessing storage resources without going through a traditional general-purpose file server. This is Network Attached Storage, or NAS. Next, picture the same principle, except that instead of multiple clients seamlessly accessing storage, you now have multiple servers accessing the storage resources. This is a Storage Area Network, or SAN.

Editor's note: This is Part II of a two-part article on Networked Storage. Part I appeared in the April-May issue of the DataBus and can also be viewed online at http://cedpa-k12.org/databus-issues/v41n3/storage.shtml.

In Part II we'll cover Lemon Grove's storage issues, considerations, and strategy for the deployment of a Hewlett-Packard SAN solution.
 
As discussed in part one of this article, the Lemon Grove School District has developed a connected learning community model that allows students to log onto thin client workstations at their desks and begin working on the day's activities. Teachers use Intranet sites to deliver instruction, and after school, students can go home or to one of many access centers in the community to log on and do their homework. Parents can go online to check assignments or to email a teacher to find out how their child is doing in school. This project is known as LemonLink.
 
To support LemonLink, we have installed approximately 70 file servers located at the district's data center. These servers provide data processing and storage services for three organizations, including the school district and other local government agencies. Lemon Grove has more than 3,500 workstations connected to its network of servers, each of which depends on the data center for access to programs as well as to centrally stored information such as videos and other learning tools. Users save all their data files to the central server to allow for access to information anytime, anywhere.
 
In the past, this student and teacher data had been saved on seven disparate storage servers located in the data center. But we found that this was not a reliable environment, as each of the servers represented a weak link with little more than software RAID as a fault-tolerance measure. A typical storage server hosts the home directories for 1,000 users, and our application servers have hundreds of students accessing content as part of their class activities. If just one of these servers goes down, it is sure to impact daily instruction.
 
It is also not unusual for a teacher to call in a panic and say that they just realized they had deleted two years' worth of curriculum development, or that a student erased the class project by accident. Teachers have come to rely on our backup resources, as we probably restore more than 500 MB per week. This not only takes a significant amount of time, it also puts tremendous pressure on an IT shop that has to be able to handle those situations swiftly and confidently, not to mention the nightmare of managing internal storage for 70 servers. How do you realistically manage hundreds of disparate drives and the constant need for more storage capacity with the limited IT departments that most school districts have?
 
It took only a few crashes of major storage servers for us to realize that IT would be in for big problems in the future if we did not minimize the impact of such problems, especially as teachers became increasingly dependent on technology to deliver instruction. We had to confront serious questions: Do we continue to add more servers with local storage, only to create another point of failure? How can we dynamically reallocate existing storage? If, for example, an Exchange server needs more storage space, do we take the server down for hours while we add more local storage?
 
Lemon Grove worked with Hewlett-Packard Company (HP) to install a Storage Area Network (SAN) solution to address these needs.

Our biggest storage challenges were maintaining high availability and accessibility of stored data; ensuring scalability to grow the solution as students and teachers required more and more storage space and as the student population grew; and centralizing management of the storage resources for easier, quicker recovery or addition of storage devices. We also needed a more fault-tolerant environment that allowed for disk failures without service interruption. Although the IT department successfully delivered resources to thousands of users every day, it was also working under the certainty that, with 70 file servers with five to eight drives on each, it was going to have to contend with an average of 1.5 crashes per month. This did have an impact on the schools' instructional program.
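
The figures above imply a rough per-drive failure rate. The sketch below is only back-of-the-envelope arithmetic. It assumes that each of the 1.5 monthly crashes is a single drive failure and that failures are spread evenly across drives; neither assumption is stated in the article, and the function name is purely illustrative.

```python
def implied_annual_failure_rate(servers, drives_per_server, crashes_per_month):
    """Return the fraction of drives failing per year implied by the crash rate."""
    total_drives = servers * drives_per_server
    crashes_per_year = crashes_per_month * 12
    return crashes_per_year / total_drives

# Figures from the article: 70 servers, five to eight drives each, 1.5 crashes/month.
# ASSUMPTION: every crash corresponds to exactly one failed drive.
low = implied_annual_failure_rate(70, 8, 1.5)   # most drives, lowest rate
high = implied_annual_failure_rate(70, 5, 1.5)  # fewest drives, highest rate

print(f"Drives in service: {70 * 5} to {70 * 8}")
print(f"Implied annual failure rate: {low:.1%} to {high:.1%}")
```

Under these assumptions, a few percent of several hundred scattered drives failing each year is enough to produce a crash every few weeks, which is the argument for consolidating them behind redundant hardware.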
 
Consequently, HP recommended a SAN configuration using complete redundancy. The solution is based on the HP SureStore Disk Array FC60, capable of holding up to 4.4TB. Two FC60s each with two fibre channel controllers, redundant SCSI controllers, redundant 8-port HP Brocade switches, and redundant tape backup devices created a storage solution with no single point of failure.
 
The FC60s allow for user-configurable RAID that can be tailored to your availability requirements. In our case, the FC60s are implemented using RAID 5 for on-the-fly data redundancy, plus multiple hot-swap spare hard disk drives. If a disk drive fails, RAID 5 keeps the data stream flowing while a hot spare takes over in the failed drive's place. Our IT staff can then replace the failed drive at their convenience – not in the middle of the night, as was the case when we relied on internal storage.
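
To illustrate why RAID 5 can keep serving data after a drive fails, here is a minimal sketch of parity-based reconstruction. It shows the general XOR parity idea only and is not the FC60's actual firmware logic; the block contents and the `xor_blocks` helper are hypothetical, and a real RAID 5 array also rotates the parity block across drives from stripe to stripe.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks of one stripe (hypothetical contents), each on its own drive.
data = [b"LEMO", b"NLIN", b"K-01"]
parity = xor_blocks(data)  # parity block, stored on a fourth drive

# Simulate losing the drive that holds data[1], then rebuild it from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data[1]
print("Rebuilt block:", rebuilt)
```

Because XOR is its own inverse, any single missing block in a stripe can be recomputed from the remaining blocks, which is what lets a hot spare be filled in while users keep working.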
 
In our quest for a better enterprise storage solution, we wanted to consolidate several disparate storage servers. We also wanted to use the latest SAN features such as fibre channel, both so the application servers could be directly connected to the centralized storage solution and to gain the higher performance (up to 170MB per second transfer rates) offered by fibre channel. Performance is further enhanced by using 10,000rpm (6ms seek) or 15,000rpm (3.9ms seek) disk drives.
 
With the FC60s and SAN in place, we migrated all data off the individual disparate servers. As a result, data that used to sit on hundreds of small disk drives spread out all over the place now sits on fewer than 50 high-speed disk drives located in two racks in the data center.
 
Each FC60 enables you to divide up the storage capacity among up to eight Windows servers, all of which can be managed from a laptop – it's that easy. And with the Java-based SM60 software that came with the FC60s, we can easily see the status of all stored data and can dynamically allocate storage to a server if needed – without any downtime.
 
So in addition to high availability, we have a flexible growth strategy. We can even add servers to the SAN without taking anything down. And we can add up to 70 additional drives to the FC60s on the fly. Today, we are utilizing about 40 percent of the FC60s' full capacity, so we have a lot of room to grow. The new SAN solution has cut the amount of time I spend managing our storage by more than half.
 
Improved Backup Performance with an automated HP SureStore Tape Library

In addition to managing and consolidating our online storage, we needed to improve the performance of our offline storage, or backup, because we were finding we had to back up more data in less time. We installed two HP SureStore Tape Library 2/20s. These libraries have two DLT 8000 drives and 20 slots each, and one is connected to each FC60 on the fibre channel. This allows us to easily restore users' information, because it is stored at the data center and backed up regularly. In our environment, it's no problem if a user loses data; we can simply go to the backups from the day before and restore everything automatically, even from my laptop while I'm on the road, without ever touching tapes.
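
The "more data in less time" problem can be estimated with simple arithmetic. The following is a rough sketch, not a measurement from our environment: the 730 GB figure is the student-file allocation mentioned later in this article, the two drives per library come from the paragraph above, and the 6 MB per second per-drive rate is an assumed uncompressed DLT 8000 streaming speed that you should replace with your own drive's rating.

```python
def backup_hours(data_gb, drives, mb_per_sec_per_drive):
    """Hours to stream data_gb to tape with all drives writing in parallel."""
    total_mb = data_gb * 1000  # decimal gigabytes to megabytes
    seconds = total_mb / (drives * mb_per_sec_per_drive)
    return seconds / 3600

# 730 GB and two drives per library are figures from the article.
# ASSUMPTION: 6 MB/s sustained per drive, with no compression and no overhead.
hours = backup_hours(data_gb=730, drives=2, mb_per_sec_per_drive=6)
print(f"Estimated full backup: {hours:.1f} hours")
```

An estimate like this shows quickly whether a full backup fits in an overnight window or whether incremental backups, compression, or more drives are needed.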
 
One of the strategies we are now discussing, as a means to enhance our backup solution in the future, is to move our tape libraries to a local fire station about two blocks away. Given the possibility of a natural disaster, having an online copy of all of our data at a location physically remote from our data center makes sense, providing an extra layer of data security. Fibre channel's long-distance capability makes this a viable solution, without losing performance.
 
Gaining NAS Functionality Using SAN

Originally, we considered both Network Attached Storage (NAS) and SAN: SAN for consolidation of the internal storage on the 70 servers, and NAS for the student file serving for the 3,500 thin clients. But working with HP's engineering team, we decided that SAN storage is ideal for a thin-client or server-based computing model such as the one we have implemented to increase access to one computer for every two students.
 
A Storage Area Network or SAN is typically defined by workstations and servers attached to a storage device via a high-speed fibre channel network. Unlike NAS, where the drives are seen as network shares at Ethernet speeds, the storage allocated to a server from a SAN is seen as local storage at speeds that exceed typical SCSI configurations.
 
The reason NAS would not work well in our environment is that thin clients do not actually store data or run applications. Instead, multiple terminal sessions are run on servers. As such, thin clients do not access NAS storage directly; instead, a server would try to access it. The bottleneck becomes the network speed of the NAS, which is many times slower than direct SCSI access and, for that matter, fibre channel access.
 
The issue centers on the implementation of the internal NAS server, which manages data from clients that utilize a standard file structure such as Windows FAT32 or Unix (NFS). The thin clients send "screen pages" in HTML and IP commands that need to be processed by the host server to become storable data files. Thus the NAS device was not designed to be both a thin client host and a file manager.
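
The size of the bottleneck can be sketched with idealized transfer times. This is an illustration under stated assumptions, not a benchmark: the 170 MB per second figure is the peak fibre channel rate quoted earlier in this article, the 500 MB payload echoes our weekly restore volume, and the 100 Mbit per second Ethernet link is an assumption about a typical LAN, not a measured value from our network.

```python
def transfer_seconds(size_mb, link_mb_per_sec):
    """Ideal time to move size_mb over a link, ignoring protocol overhead."""
    return size_mb / link_mb_per_sec

FIBRE_CHANNEL_MB_S = 170      # peak rate quoted in the article
FAST_ETHERNET_MB_S = 100 / 8  # ASSUMED 100 Mbit/s LAN, i.e. 12.5 MB/s at best

size_mb = 500  # roughly one week of restores, per the article
for name, rate in [("Fibre channel", FIBRE_CHANNEL_MB_S),
                   ("Fast Ethernet", FAST_ETHERNET_MB_S)]:
    print(f"{name}: {transfer_seconds(size_mb, rate):.1f} s")
```

Real throughput on both links will be lower than these ideal numbers, but the order-of-magnitude gap is what matters when hundreds of terminal sessions on one server share the same path to storage.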
 
Instead of using a NAS solution, we added an additional server, an HP NetServer LT6000r with four processors, and allocated an additional 730GB of storage from one of the FC60s to this server. This server provides the processing capability for the student files, and their data is stored on the FC60 and backed up on the libraries.
 
Recommendations

Below are some recommendations we came up with that may help you in your future storage deployment strategies.
  • Making your data more manageable is the most essential thing you can do to improve your storage infrastructure. This can be achieved with consolidated storage and with software that enables remote monitoring and management, dynamic storage allocation, management of logical unit numbers (LUNs, for partitioning disk space) and SAN components, mapping and device discovery, and proactive troubleshooting.
  • Consolidate storage if it makes sense. An excessive number of storage devices in multiple locations consumes a lot of unnecessary management time.
  • Investigate the software options from HP for SAN management.
  • The amount of data doubles every 12 months. Make sure that the storage solution you put in place today can handle your growth for the near future.
  • Invest in a redundant architecture. You may think you don't need the highest levels of redundancy, but make sure you weigh the cost of downtime and the impact this has on the classroom. Make sure your teachers and administrators can stay focused on their students without the worry of losing critical data.
  • Consolidating redundant parts in centralized storage for high availability, rather than doubling the parts in numerous separate servers, provides full redundancy at a fraction of the cost.
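
As a rough planning aid for the growth recommendation above, the following sketch estimates how long existing capacity lasts if data keeps doubling. It combines two figures from this article (about 40 percent utilization and a 12-month doubling time) and assumes steady exponential growth with no added drives; the function name is illustrative only.

```python
import math

def months_until_full(used_fraction, doubling_months=12):
    """Months until usage reaches 100% under steady exponential growth."""
    return doubling_months * math.log2(1 / used_fraction)

# 40 percent utilization and 12-month doubling are both figures from the article.
# ASSUMPTION: growth is smooth and no capacity is added in the meantime.
months = months_until_full(0.40)
print(f"Capacity exhausted in about {months:.0f} months")
```

Even with more than half the array free, doubling growth uses it up in well under two years, which is why the ability to add drives on the fly matters as much as the starting capacity.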
Additional Resources

For more information on Lemon Grove's LemonLINK visit www.lgsd.k12.ca.us/LemonLINK
For more information on HP's storage solutions visit www.hp.com/storage