IBM Storage Tank™
A Distributed Storage System
IBM Corporation
January 24, 2002
Overview
IBM Storage Tank provides a complete storage management solution in a heterogeneous
distributed environment. IBM Storage Tank is designed to provide performance that is comparable
to that of file systems built on bus-attached, high-performance storage. In addition to high
performance, the goal of IBM Storage Tank is to provide high availability, increased scalability,
and centralized, automated storage and data management.
Storage Area Network Technology
Storage Area Network (SAN) technology allows an enterprise to connect large numbers of
devices, including clients, servers, and mass storage subsystems, to a high-performance network.
On a SAN, clients can access large volumes of data directly from storage devices, using
high-speed, low-latency connections. IBM Storage Tank is designed to be independent of the
actual SAN fabric technology. IBM Storage Tank will work with Fibre Channel networks as well as
emerging storage networking technologies such as Gigabit Ethernet (iSCSI) and InfiniBand.
By using SAN technology, IBM Storage Tank can meet the needs of general data sharing in a
distributed environment, as well as the needs of special, data-intensive applications, such as
imaging, animation, digital video, and large-scale distributed applications.
IBM Storage Tank vs. Traditional Distributed Systems
Traditional distributed file systems use a client/server data access model that requires servers to
access data from storage devices, and then send the data to clients. They have the additional
limitation of using conventional network bandwidth to transfer the data. While these systems allow
users to share data, they do not provide the high performance required for data-intensive
applications.
In contrast, IBM Storage Tank uses a data access model that requires clients to obtain only
metadata from a server. Clients can then access data directly from storage devices using the
high bandwidth provided by a Fibre Channel or other high-speed network. Direct data access
helps eliminate server bottlenecks and provides the performance necessary for data-intensive
applications.
Storage Virtualization
Storage virtualization masks the physical characteristics of storage devices and presents users
and applications with a unified, logical pool of shared storage. It gives storage administrators the
flexibility to create virtual disks that better meet the needs of users and their applications.
IBM Storage Tank provides storage virtualization through the use of storage pools. A storage pool
can consist of multiple disks that reside on any combination of heterogeneous storage devices. To
an application, a storage pool appears as a single storage space in which it can store data without
the need to know anything about the characteristics or boundaries of the physical disks.
A storage administrator sets up storage pools to meet specific needs of an enterprise. For
example, an enterprise might want to have storage pools that consist of disks located on fast
devices for transactional data and storage pools of disks on slower devices for backup data. An
enterprise might also want to have multiple storage pools that provide different availability
characteristics, or, perhaps, separate storage pools for each department within the enterprise.
After setting up storage pools, an administrator can increase or decrease the sizes of specific
storage pools to meet changing needs, and can easily move data from one storage pool to
another. All of these tasks are transparent and non-disruptive to users and applications.
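The idea of a pool that hides disk boundaries can be illustrated with a minimal Python sketch. It is not taken from IBM Storage Tank itself: the `Disk` and `StoragePool` classes, the disk names, and the simple concatenation of disks into one logical block space are all assumptions made for illustration, and moving data between pools is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """A physical disk; the name and capacity are illustrative values."""
    name: str
    capacity_blocks: int


@dataclass
class StoragePool:
    """Hypothetical pool: one logical block space laid over several disks."""
    name: str
    disks: list = field(default_factory=list)

    def add_disk(self, disk: Disk) -> None:
        # Appending a disk grows the pool; existing logical blocks keep their mapping.
        self.disks.append(disk)

    @property
    def capacity_blocks(self) -> int:
        return sum(d.capacity_blocks for d in self.disks)

    def locate(self, logical_block: int) -> tuple:
        """Translate a pool-wide block number into (disk name, block on that disk)."""
        if logical_block < 0:
            raise IndexError("negative block number")
        offset = logical_block
        for disk in self.disks:
            if offset < disk.capacity_blocks:
                return disk.name, offset
            offset -= disk.capacity_blocks
        raise IndexError("block beyond pool capacity")


# Two disks on different (hypothetical) devices form one pool.
pool = StoragePool("transactional")
pool.add_disk(Disk("array-a-lun0", 1000))
pool.add_disk(Disk("array-b-lun3", 500))
```

An application in this sketch addresses the pool only by logical block number; `locate` is the one place where the physical layout is visible, which is the property the text attributes to storage pools.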
System-Managed Storage
The IBM Storage Tank architecture makes it possible to bring the benefits of system-managed
storage (SMS) to a distributed environment. Features such as policy-based allocation, volume
management, and file management have long been available on IBM mainframe systems.
However, the infrastructure for such centralized, automated management has been lacking in the
open systems world. The centralized storage management architecture of IBM Storage Tank
makes it possible to realize the advantages of system-managed storage for all of the data that the
IBM Storage Tank system manages.
On conventional systems, storage management is platform dependent. While storage devices
may be attached to a SAN, they are still allocated to specific server machines and cannot be
centrally or consistently managed.
IBM Storage Tank provides a single, centralized point of control to better manage storage devices
and data. Centralized storage and data management simplifies administration and can result in
lower cost of ownership.
IBM Storage Tank Architecture
Figure 1 illustrates the basic IBM Storage Tank architecture.
Figure 1. IBM Storage Tank Architecture
Figure 1 shows that IBM Storage Tank clients and the administrative client communicate with
IBM Storage Tank servers over an enterprise's existing IP network using the IBM Storage Tank
protocol. It also shows that IBM Storage Tank clients, servers, and storage devices are all
connected to a high-speed Storage Area Network (SAN).
The IBM Storage Tank administrative client serves as the administrative control point. An
administrator can perform almost all administrative tasks online with no service interruption to
clients.
An installable file system (IFS) is installed on each IBM Storage Tank client. An IFS directs
requests for metadata and locks to an IBM Storage Tank server and sends requests for data to
storage devices on the SAN. Storage Tank clients can access data directly from any storage
device attached to the SAN.
IBM Storage Tank clients aggressively cache file data, as well as metadata and locks that they
obtain from a Storage Tank server, in memory. They do not cache files to disk.
An enterprise can use one IBM Storage Tank server, a cluster of IBM Storage Tank servers, or
multiple clusters of IBM Storage Tank servers. Clustered servers provide load balancing, fail-over
processing, and increased scalability. The IBM Storage Tank servers in a cluster are
interconnected on their own high-speed network or on the same IP network they use to
communicate with IBM Storage Tank clients. The private server storage that contains the
metadata managed by a cluster of IBM Storage Tank servers can be attached to a private network
connected only to the cluster of servers, or it can be attached to the IBM Storage Tank SAN.
[Figure 1 labels: an Admin Client and heterogeneous clients (workstations or servers) running
Win2000, AIX, Solaris, Linux, HP-UX, and other systems, each with an IFS with cache, on the
existing IP network for client/server communications (Storage Tank protocol); a cluster of
Storage Tank servers for load balancing, fail-over processing, and scalability; metadata on
private server storage shared among the Storage Tank servers; a Storage Area Network (Fibre
Channel network); shared storage devices holding active data and backups and migrated data,
with device-to-device copy for backup and migration.]
IBM Storage Tank Protocol
The IBM Storage Tank protocol is a locking and data consistency model that allows the IBM
Storage Tank distributed storage system to look and behave like a local file system. The objective
of the IBM Storage Tank protocol is to provide strong data consistency between clients and
servers in a distributed environment.
The IBM Storage Tank protocol provides locks that enable file sharing among IBM Storage Tank
clients, or, when necessary, provides locks that allow clients to have exclusive access to files. An
IBM Storage Tank server grants the locks to clients. With the IBM Storage Tank protocol, when a
client reads data from a particular file, it always reads the last data written to that file from
anywhere in the IBM Storage Tank distributed storage system.
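The difference between sharing locks and exclusive locks can be sketched as a small server-side lock table. This is only an illustration under stated assumptions: the `LockServer` class, its method names, and the mode strings are hypothetical, and a conflicting request is simply refused here, whereas the paper does not describe how the actual protocol resolves conflicts.

```python
class LockServer:
    """Hypothetical stand-in for the lock state kept by a server."""

    def __init__(self):
        # file path -> (mode, set of client names holding the lock)
        self._locks = {}

    def acquire(self, client: str, path: str, mode: str) -> bool:
        """Grant the lock if it is compatible with the locks already held."""
        if mode not in ("shared", "exclusive"):
            raise ValueError("mode must be 'shared' or 'exclusive'")
        held = self._locks.get(path)
        if held is None or held[1] == {client}:
            # No lock yet, or this client is the only holder: grant or change mode.
            self._locks[path] = (mode, {client})
            return True
        held_mode, holders = held
        if mode == "shared" and held_mode == "shared":
            holders.add(client)  # many clients may share a file
            return True
        return False  # any request that conflicts with another client's lock is refused

    def release(self, client: str, path: str) -> None:
        held = self._locks.get(path)
        if held is None:
            return
        holders = held[1]
        holders.discard(client)
        if not holders:
            del self._locks[path]


server = LockServer()
results = [
    server.acquire("client-a", "/proj/plan.doc", "shared"),
    server.acquire("client-b", "/proj/plan.doc", "shared"),
    server.acquire("client-c", "/proj/plan.doc", "exclusive"),
]
```

In this sketch the two shared requests succeed and the exclusive request is refused while other clients still hold the file; once both sharers release, an exclusive request would be granted.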
To open a file in the IBM Storage Tank distributed storage system, a client does the following:
1. Contacts an IBM Storage Tank server to obtain metadata and locks.
   Metadata supplies the client with information about a file, such as its attributes and location on
   storage device(s).
   Locks supply the client with the privileges it needs to open a file and read or write data. The
   IBM Storage Tank locking scheme is designed to ensure strong data consistency.
2. Accesses the data for the file directly from a shared storage device attached to a
   high-performance SAN.
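The two steps above can be sketched in Python to show which component handles which kind of request. Everything named here is an assumption for illustration: the `StorageDevice`, `MetadataServer`, and `Client` classes, the extent list of (device name, block number) pairs, and the file path are hypothetical and do not reflect the actual protocol messages or on-disk layout.

```python
class StorageDevice:
    """Hypothetical SAN-attached device: block number -> bytes."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.reads = 0  # counts direct reads issued by clients

    def read_block(self, block_no):
        self.reads += 1
        return self.blocks[block_no]


class MetadataServer:
    """Hypothetical server: holds metadata and lock state, never file data."""

    def __init__(self):
        self.files = {}  # path -> {"size": int, "extents": [(device name, block no), ...]}
        self.locks = {}  # path -> set of clients holding a lock

    def open(self, client, path):
        meta = self.files[path]
        self.locks.setdefault(path, set()).add(client)
        return meta


class Client:
    def __init__(self, name, server, devices):
        self.name = name
        self.server = server
        self.devices = devices  # device name -> StorageDevice reachable over the SAN

    def read_file(self, path):
        # Step 1: obtain metadata and a lock from the server.
        meta = self.server.open(self.name, path)
        # Step 2: read the data directly from the storage devices.
        data = b"".join(self.devices[dev].read_block(blk) for dev, blk in meta["extents"])
        return data[: meta["size"]]


disk0 = StorageDevice("disk0")
disk0.blocks = {7: b"hello ", 8: b"world!!!"}
meta_server = MetadataServer()
meta_server.files["/docs/a.txt"] = {"size": 11, "extents": [("disk0", 7), ("disk0", 8)]}
client = Client("client-1", meta_server, {"disk0": disk0})
content = client.read_file("/docs/a.txt")
```

Note that `MetadataServer` has no method that returns file contents; in this sketch the file bytes travel only between the client and the device, which mirrors the direct data access described in the text.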
IBM Storage Tank Clients
One of the goals of IBM Storage Tank is to enable full, transparent data sharing of files among
heterogeneous clients, such as those running the Windows 2000, AIX, Solaris, Linux, and HP-UX
operating systems.
All IBM Storage Tank