Monster VM Study: Moves 3 RAC Nodes Under 180 Seconds! Part 1

    By: George O'toole on Dec 16, 2013

    There seems to be a trend with the Oracle database community: The use of current commodity hardware enables a large System Global Area (SGA) and a correspondingly large database buffer cache. The SGA is a group of shared memory structures that contain data and control information for an Oracle database. The database buffer cache is a component of the SGA that holds copies of frequently used data blocks. For more information use Oracle’s documentation library. In this study by Principled Technologies each virtualized RAC node was allocated 148 GB of virtual memory. Why did we include so much memory for a virtualized Oracle RAC node?

    1. This acknowledges a trend in servers having more main memory (RAM), as the additional cost is relatively minimal. This additional memory is therefore available for Database Administrators (DBA) to allocate to the SGA.
    2. VMware’s vMotion is a foundational technology that enables the business to non-disruptively move a Virtual Machine (VM) from one server to another. As the study indicates, this can include vMotion of a virtualized production Oracle RAC node.
    3. The goal of the study was to demonstrate the simultaneous vMotion of multiple large virtual memory VMs which are running an OLTP workload. We proved that these vMotion operations could be done efficiently, quickly and without data loss or other instability. Since Oracle RAC is specifically designed to fail a node and fence it from the cluster in the event of the slightest hint of instability, this is an amazing accomplishment.

    DBAs manage the enterprise’s most mission critical applications. Thus, DBAs want clear and convincing evidence of the benefits virtualization offers before adding another layer to the Oracle stack. In this recent study by IOUG entitled, “From Database Clouds to Big Data: 2013 IOUG Survey on Database Manageability” the 4th greatest challenge of database administrators surveyed was, “Managing a larger number of databases with the same resources.” The value of the Principled Technologies research study is that it demonstrates that Oracle DBAs can virtualize their Oracle database servers including production RAC nodes, and then easily move those VMs without any loss of availability. Oracle RAC is a cluster of many servers which access one database, a many-to-one architecture. This provides resiliency: If one server fails the surviving servers continue to provide database availability. Another interesting finding from this study is, “We have already consolidated our database and infrastructure onto a single technology platform for our critical business applications”. The study adds more color by noting, “A developing trend may be taking hold, as many IT organizations move towards database consolidation onto a shared and/or cloud environment.” Virtualization is the platform which enables both consolidation and manageability for Oracle DBAs. In terms of manageability, virtualization offers the DBA:

    • The ability to proactively move database servers in preparation for planned outages: If a server, network switch, or other hardware within the infrastructure requires maintenance the virtualized database server / RAC node can be moved off of that hardware.
    • Consolidation: Multiple databases can efficiently and securely share the same hardware as they are isolated to individual VMs.
    • Improved Service Level Agreements (SLA): Improvements in proactive and automated reactive manageability will improve the SLAs for the business by improving database uptime.
    • Strong Storage Integration: VMware integrates with EMC storage and other vendors for increased manageability, performance and availability.

    There is another part of manageability picture that is equally important. That is quality of service (QoS). For example, if a given vMotion event takes too long and during that time database performance is significantly degraded then the QoS (Quality of Service) is low. The Principled Technologies study punches this up by saying, “Quick and successful virtual machine migrations are a key part of running a virtualized infrastructure.” So, if vMotion adds value in terms of manageability, then QoS must provide the following functionality:

    • vMotion windows should be kept to a minimum
    • The vMotion end-user performance impact should be minimal

    The Principled Technologies study was designed to validate both ease of manageability and QoS. In three different tests, multiple virtualized Oracle RAC nodes under a significant OLTP workload are simultaneously moved with vMotion. Diving into the details of the architecture:

    Software technology stack includes

    • Oracle 11g Release 2 (11.2.0.3)
    • Oracle Enterprise Linux 6.4 x86_64 Red Hat compatible kernel
    • VMware vSphere 5.1
      • VMware vCenter Server 5.1

     Monster VM

    Hardware infrastructure includes

    • Cisco UCS 5108 Blade Server Chassis
      • 3 x Cisco UCS B200 M3 Blade server
        • Each with two Xeon E5-2680 processors
        • Each with 384 GB RAM
        • 2 x Cisco UCS 6248UP Fabric Interconnects
        • 2 x Cisco Nexus 5548UP switches
        • EMC VMAX Cloud Edition
          • 2,994 GB in the Diamond level
          • 18,004 GB in the Gold level
          • 15,308 GB in the Silver level

     Hardware Infrastructure

    Workload Software

    Most of the above software and hardware above is self-explanatory but more detail is needed for the EMC VMAX Cloud Edition.  This storage array is unique, as it enables self-service provisioning of storage based upon performance levels. The concept is simple and powerful: Create a storage catalog of pre-engineered storage levels designed to provide different levels of performance. For example, the fastest storage level is Diamond offering a greater amount of Flash drives to ensure low latency and high IOPS. Below Diamond are the Platinum, Gold, Silver and Bronze storage performance levels. Let’s look at how the Oracle RAC database was easily configured on the EMC Cloud Edition:

    VMAX Cloud Edition

     

    The picture above shows how simple it was to provision the Oracle 11gR2 database used in this study. The datafiles and tempfiles were placed on the Diamond level, as this level offered the best latency and IOPS. In the study, the goal was to place a significant workload on the virtualized Oracle RAC nodes and configure the storage to support that workload. One notable metric: The average core utilization prior to vMotion of the RAC nodes was approximately 33% with no significant I/O waits. This is important: Core utilization was used to define the workload parameters, not storage performance. As noted in the study, “In the configuration we tested, Transactions Per Second (TPS) from the hardware and software stack reached a maximum of 3,654 TPS.” In terms of storage performance, there was plenty of TPS to spare and by no means was the VMAX CE maxed out.

    Everything else, i.e. the VM’s guest operating system, Oracle home, redo and undo, were placed on the Gold catalog level as these files didn’t require the same level of performance. Does this database layout seem relatively simple and straight forward? The strength of the VMAX CE is it enables a simplified database layout, as performance has been pre-engineered for the business. But there is another key consideration that some DBAs might not consider when working with storage administrators: Shared disk performance. It is entirely possible for the storage administrator to have multiple applications share the same disk meaning shared latency and IOPS. This has the potential for one application to impact another application using the same disk, and this is sometimes difficult to diagnose. The EMC VMAX CE isolates logical resources to secure the appropriate performance for each tenant.  It is this isolation that prevents applications from impacting each other on the same storage array. For a DBA managing multiple mission critical Oracle database servers, this means he or she can reserve resources for CPU, RAM and now storage. This in turn provides an improved SLA to the business.

    A very recent video by AAR Corp. has been posted at EMC in which Vice President of IT Operations Jim Gross discusses, “Why AAR is using the VMAX 10K for improved availability and decreased maintenance windows for the company.” The core of the VMAX CE storage array is the VMAX 10K, so the AAR video is a good example of how performance, availability and maintenance can be improved using EMC VMAX. For Oracle DBAs interested in an EMC VMAX / virtualized Oracle customer case study, Jim Gross reviews their use of VMware vSphere and Oracle on EMC VMAX.

    Continue on to Part 2 of the series.

    Released: December 16, 2013, 9:31 am | Updated: January 21, 2014, 12:57 pm
    Keywords: Announcements | EMC | manageability | virtualization


    Independent Oracle Users Group
    330 N. Wabash Ave., Suite 2000, Chicago, IL 60611
    phone: 312-245-1579 | email: ioug@ioug.org

    Copyright © 1993-2017 by the Independent Oracle Users Group
    Terms of Use | Privacy Policy