Q: What is a cluster?
A: A cluster is a group of independent systems that work together to provide a single, unified computing resource. Although comprised of multiple systems, clients and applications view the cluster as though it were a single system.
Clusters are often used to load balance work across multiple servers to enable transactions to be processed with speed. Clustering technology answers the business requirement for a highly available, highly scalable IT infrastructure.
Q: Why do businesses need high availability and high scalability?
A: In today's 24x7 business environment, users and customers demand that IT systems be continuously available. Businesses that experience unexpected downtime have to deal with huge revenue losses while placating frustrated customers. The realisation that system downtime is so costly to business has led to the concept of five-nines uptime (99.999%). This equates to roughly five minutes of unplanned downtime a year. Some vendors are even striving for the Holy Grail of six-nines uptime - roughly 30 seconds of unplanned outage in a year.
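The downtime figures behind "five nines" and "six nines" follow from simple arithmetic. The short Python sketch below shows the calculation; the function name and the use of an average 365.25-day year are assumptions made for illustration. Under those assumptions, five nines works out to a little over five minutes a year and six nines to a little over 30 seconds, so the figures quoted above are rounded.

```python
# Downtime per year implied by an availability percentage.
# Assumption: an average year of 365.25 days (leap years included).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the minutes of downtime per year allowed by an availability figure."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for label, pct in [("three nines", 99.9), ("five nines", 99.999), ("six nines", 99.9999)]:
    minutes = downtime_minutes_per_year(pct)
    print(f"{label} ({pct}%): {minutes:.2f} minutes/year, or {minutes * 60:.0f} seconds")
```

Each extra nine cuts the permitted downtime by a factor of ten, which is why the step from five to six nines is so demanding.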
High availability is not the only critical attribute of today's IT infrastructure. Scalability - the ability of computing systems to handle increased or changing loads and to increase performance - is equally essential. A successful e-business implementation will frequently experience unexpected and rapid growth. The system then has to support both more users and additional functionality. Businesses must be capable of managing this fluctuating growth effectively.
Q: How does clustering help?
A: Clustering has enabled new levels of availability because there is no single point of failure. Outage of any one component does not disrupt end-user services, because an automatic fail-over process redistributes the workload to another cluster member. But clustering is not just a safeguard against disaster. It is also highly scalable. Many of today's clusters are built using standard high-volume server building blocks. This enables the cluster's computing power to be significantly boosted by simply adding more readily available and cost-effective resources.
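The fail-over idea described here can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions: the node names, job names and the "hand each job to the least-loaded survivor" rule are all hypothetical, and real cluster software also has to detect the failure and recover in-flight work.

```python
def fail_over(assignments: dict[str, list[str]], failed_node: str) -> dict[str, list[str]]:
    """Return a new assignment with the failed node's jobs moved to the survivors.

    Each orphaned job goes to whichever surviving node currently has the fewest jobs.
    """
    # Copy the job lists so the caller's data is left untouched.
    survivors = {node: list(jobs) for node, jobs in assignments.items() if node != failed_node}
    if not survivors:
        raise RuntimeError("no surviving nodes: the cluster is down")
    for job in assignments.get(failed_node, []):
        target = min(survivors, key=lambda node: len(survivors[node]))
        survivors[target].append(job)
    return survivors


# Hypothetical three-node cluster.
cluster = {
    "node-a": ["orders", "billing"],
    "node-b": ["search"],
    "node-c": ["reports", "email"],
}
after = fail_over(cluster, "node-a")
print(after)
```

The same function also hints at the scalability point: adding a node to the dictionary gives the cluster one more place to put work.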
Q: How long has clustering been around?
A: Computer clusters have been around for the last two decades and today most leading hardware and software companies offer clustering solutions.
We expect it to become more popular as more and more people realise the benefits it brings. For example, a recent Meta Group report highlighted some of the benefits: "Application availability greatly impacts the bottom line. Organisations can potentially lower costs, support more users, and gain higher application availability through clustered configurations." (Server Infrastructure Strategies, Meta Group, 14 June 2001) Clustering is relatively well established for front-end servers (Web servers, firewall or proxy servers) and for the middle-tier (servers running the business applications). Traditionally back-end servers running database applications have been run on mainframes or large multiprocessor systems, but clusters of lower cost, standard, high-volume servers are emerging as a viable alternative solution.
Q: Why has clustering at the back-end been more of a challenge?
A: There are two clustering models and both have had their limitations. One approach to clustering is the 'shared-nothing' architecture. Each server is independent, owns and manages its local devices and shares nothing with the other cluster nodes. If failure occurs, the data must be redistributed to the other servers. This model has low communication traffic and scales well, as demonstrated by recent TPC-C benchmark results running Microsoft SQL Server 2000, but it can place constraints on the way the database must be structured to get the best performance out of the configuration. The other clustering technology in use today is 'shared-disk'. All the servers in a cluster are connected together and share common data storage. Fail-over is no problem as all the servers see the same data.
The challenge of running a big database on this architecture is the huge amount of inter-node communication. Each server is continuously updating the other cluster members and this high volume of traffic can degrade performance. A solution to this performance degradation problem has come with the Oracle9i Database, introduced in June 2001. It includes Oracle9i Real Application Clusters (RAC), a version for clustered configurations. This version is a relational database capable of supporting mainframe levels of users and transactions on low-cost clustered hardware. This huge step forward has been made possible by Oracle's development of Cache Fusion. Integral to Oracle9i RAC, Cache Fusion is a breakthrough technology that provides high bandwidth interconnection between clustered servers. With clustering technologies such as RAC, businesses now have a choice of platforms on which to run their large databases, including highly available, highly scalable clusters of high-volume standard servers, for example those built on Intel Architecture.
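To make the shared-nothing model described in this answer more concrete, the Python sketch below assigns every row to exactly one owning node by hashing its key, then counts how many rows must change owner when a node fails. It is an illustration only: the node names, key format and modulo-hash scheme are assumptions, and it does not represent how SQL Server, Oracle or any other product actually partitions data.

```python
import hashlib

NODES = ["node-0", "node-1", "node-2", "node-3"]  # hypothetical four-node cluster


def owner_of(key: str, nodes: list[str]) -> str:
    """Pick the single node that owns a row, using a stable hash of its key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def rows_that_move(keys: list[str], before: list[str], after: list[str]) -> list[str]:
    """Return the keys whose owning node changes when cluster membership changes."""
    return [k for k in keys if owner_of(k, before) != owner_of(k, after)]


keys = [f"customer-{i}" for i in range(1000)]
survivors = [n for n in NODES if n != "node-2"]  # simulate the loss of node-2
moved = rows_that_move(keys, NODES, survivors)
print(f"{len(moved)} of {len(keys)} rows change owner after node-2 fails")
```

With this naive scheme, many rows on healthy nodes change owner as well as those on the failed node, which is one way to see why the partitioning design matters so much in a shared-nothing cluster. In a shared-disk cluster no such redistribution is needed, because every node already sees all the data.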
Q: How does clustering on Intel Architecture compare to other platforms?
A: It compares excellently on price-performance with the proprietary platforms traditionally used for database servers. The cost of running an enterprise database can run into millions of dollars, but the cost of implementing a cluster like Oracle9i RAC on Intel Architecture-based servers is cheaper than on the competing proprietary platforms. And, by using high volume servers based on Intel Architecture for clustering, businesses are not tied into one vendor. In today's economic climate, that is an important factor.
To show how large database configurations can be built using Intel Architecture servers, Intel's Solution Centres in Russia, Germany, Sweden and the UK have built a demonstration cluster using Oracle9i RAC. In fact, we are so excited about the ability to implement high-end database solutions on Intel Architecture that we are offering live demonstrations at these centres. Visitors can see Oracle9i RAC in action on an Intel Pentium III Xeon processor-based four-node cluster. They can witness the performance, availability, scalability and manageability tests, and we will show them the analysis, reporting and results of the tests using Intel Pentium 4 processor-based clients.
Q: What is the difference between scale-out and scale-up strategies?
A: The 'shared-disk' cluster model used by the Oracle solution complements Intel's support of a 'scale-out' strategy. Scaling out, like clustering, is the addition of more servers to distribute and handle additional workload and so increase performance. The alternative is to 'scale up': consolidate onto one very large server, increasing performance and capacity with additional processors, memory and disks.
Q: What are the benefits and challenges of the scale-out approach?
A: As with clustering, a primary benefit of scaling out is that it builds on low-cost, standard computer systems. Each server is inexpensive, so adding another remains cost-effective. Today's IT managers are unsure what capacity and performance they will require in six months' time, so the ability to add, remove and re-deploy servers in the cluster is an important financial and competitive advantage.
However, clustering and the scale-out model are not without their critics, and two key criticisms are levelled against them. The first is that there is more logical distance between the components, which brings inter-node communication challenges. But with the development of technologies such as Oracle's Cache Fusion, this challenge is significantly reduced. The second is manageability. As the number of servers increases, so does the complexity of managing the system. Critics claim that identifying and responding to faults and performance slowdowns becomes harder, as does the overall maintenance of the whole system. However, there are tools that address this - for example, the Oracle Enterprise Manager, which is integrated into the Oracle9i Database. It detects performance slowdowns, isolates the root cause, determines business impacts, and deploys 'fix-it' operations before service interruptions affect end-users. Manageability, therefore, is not an issue.
Q: What does the future hold for clustering?
A: At Intel, we believe that the rest of the industry will soon share our enthusiasm for back-end clustering and its popularity will grow. There are signs already. For example, a recent Meta Group report said: "Through 2005+, IT organisations driven by the need to reduce infrastructure costs and improve application availability will increasingly explore clustered DBMS [database management system] architectures." Prime time for back-end clustering is now. It is not a question of whether it is here to stay or not. The question IT managers will increasingly be asking themselves is "Why are we not clustering?" The cost benefit is just too compelling to be ignored. And with major database software, including Oracle9i RAC, currently being ported to the Intel Itanium processor family, Intel's 64-bit EPIC architecture processor, the performance capabilities will be even better. In addition, new I/O technologies such as InfiniBand will enable faster inter-node communications within clusters. Clustering is definitely here to stay.
Joubert de Lange: business development manager, Intel South Africa
Top tips for clustering
1. Database structure is important - shared-nothing and shared-disk clusters place different requirements on the data structure. Take time to understand which is best for your needs.
2. Consider your future needs (as well as those of today) and build a solution that can be easily scaled with your business - both storage and transactions.
3. Choose open-standard servers to give yourself flexibility when choosing your platform and to ensure cost-effectiveness.
4. Work with your database vendor to understand your options and the tools they have to aid porting to new hardware and software platforms. Migration from old versions of hardware and software is easier than you think and database vendors offer special packages.
5. This is a rapidly evolving sector of the market with new capabilities becoming available from the platform and software vendors; work with your supplier to understand what they can offer you today - and tomorrow.
6. Check out some of the new I/O technologies, such as Gigabit Ethernet and InfiniBand, and how they can help speed up inter-node communication within your cluster.
7. Choose a software package that includes manageability-enabling features.
8. Choose a flexible/scalable cluster for changing and unpredictable needs.
9. If clustering at the back end, choose a database capable of supporting mainframe levels of transactions on clustered hardware.
10. Even simple fail-over clusters can aid your business by improving platform 'uptime'.