Fleet Maintenance Best Practices: Achieving Database Patching Success
Posted by Harry E Fowler
- Last updated 1/23/20
As part of Quest Experience Week (QXW) 2019, Gary Henderson and John Norman from Nationwide spoke about how Nationwide leveraged Oracle Enterprise Manager’s Fleet Maintenance feature to effectively and proactively patch its Oracle Database environment to address critical vulnerabilities and maintain version currency. Nationwide was able to patch more than 7,000 databases over a three-year period, with a success rate exceeding 99 percent. The company’s patching approach gets a lot done with few people, minimal downtime, and minimal risk.
Various Approaches for Patching
The Center for Internet Security (CIS) provides global standards for IT security. The following statement came from CIS:
“One of the best ways to ensure secure Oracle security is to implement Critical Patch Updates (CPUs) as they come out, along with any applicable OS patches that will not interfere with system operations. Therefore, using the most recent Oracle database software, along with all applicable patches can help limit the possibilities for vulnerabilities in the software. The installation version and/or patches applied during setup should be established according to the needs of the organization. Ensure you are using a release that is covered by a level of support that includes the generation of Critical Patch Updates.”
According to CIS, patching is essential to avoiding security breaches. It also keeps your organization up to date with the latest Oracle software. Oracle typically supports a version for five years and releases a new version every year, so staying on a supported version may necessitate an upgrade. While patches can be applied in a variety of ways, the Fleet Maintenance feature proved to be the best option for Nationwide.
Each approach is briefly explained below:
- OPatch: This is the only offering that does not have a licensing requirement or cost associated with it. It patches in place, meaning it requires less storage than out-of-place options. All databases in the Oracle Home must be patched together. Error recovery and rollback are challenging with this approach.
- Multi-Tenancy: This out-of-place option may require double the memory resources. Essentially, it involves unplugging from the old container database and plugging into the updated or patched one. A Multi-Tenant License is required.
- Fleet Patching and Provisioning (formerly Rapid Home Provisioning): This is out-of-place patching that utilizes gold image deployment and a quick switch from the current home to the new patch code set. There is a minimal outage window. A Lifecycle License is required.
- EM Fleet Maintenance: This is out-of-place patching that utilizes gold image deployment and a quick switch from the current home to the new patch code set. Version 1 is Switch. Version 2 is DB software maintenance. There is a minimal outage window. It utilizes EM’s deployment job system. This approach is scalable and requires a Lifecycle License.
- Cloud DBaaS: Cloud Provider applies patches.
Clustering Technologies vs. Database Versions
It is important to reduce and eliminate clusters that do not provide strategic value and capability. Try to stay on supported, current versions and keep the number of versions to a bare minimum. Making database versions more consistent benefits the patching effort, and the patching effort in turn helps maintain uniformity. Non-clustered solutions will become more common in the years ahead as the industry pivots to virtualized offerings both on-premises and in the Cloud.
Nationwide’s Patching Strategy
You Have to Crawl Before You Can Fly
Nationwide made adjustments to its system over a five-year period. The graphic below shows the slow and steady progress that led to the fast results Nationwide achieves today. Representatives said that the biggest challenge throughout was having faith in Oracle Enterprise Manager.
Quarterly Patching Windows
One of the best efficiency-bolstering decisions that Nationwide made was to assign a schedule for its patching activity. Since Oracle releases quarterly patch updates, Nationwide follows suit. Each quarter, Nationwide patches databases with the most recent PSU, Bundle Patch or Release Update. For instance, in October, Nationwide applied the patches that Oracle released in July. Preparing for a patching cycle requires some up-front work.
First, the team downloads the patches. Next, they create new Oracle Homes on Gold Image servers and create software images of those new Oracle Homes. They then make the new version current within the Gold Image. Finally, they test and validate fleet processes in a sandbox environment.
Once prep work is completed, they are ready for the patching cycle. Within a patching cycle, there should be a minimal number of patching windows. At Nationwide, non-production patching occurs on Wednesdays. Previously, when patching was done at various times, customers’ perception was that patching was constant, and on-call DBAs could not determine whether pages were due to patching activity or something else. By having a standard patching day, these problems were alleviated. Production patching is limited to one weekend day per month. The patching window is six hours long and begins at midnight. Most patching windows are completed in less than three hours.
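The window rules described above can be written down as a small check. The following is only a minimal sketch in Python: the function names are hypothetical, and it assumes that "one weekend day" means a Saturday or Sunday and that the six-hour production window starts at 00:00 local time on that day.

```python
from datetime import date, datetime, time, timedelta
from typing import Tuple

WINDOW_START = time(0, 0)            # production window begins at midnight
WINDOW_LENGTH = timedelta(hours=6)   # and is six hours long


def is_nonprod_patch_day(day: date) -> bool:
    """Non-production patching occurs on Wednesdays (weekday() == 2)."""
    return day.weekday() == 2


def production_window(day: date) -> Tuple[datetime, datetime]:
    """Return (start, end) of the production window on a weekend day.

    Assumption: "weekend day" means Saturday (5) or Sunday (6).
    """
    if day.weekday() not in (5, 6):
        raise ValueError("production patching is limited to a weekend day")
    start = datetime.combine(day, WINDOW_START)
    return start, start + WINDOW_LENGTH
```

A scheduler script could call `is_nonprod_patch_day` before submitting non-production jobs and use `production_window` to reject any production job whose planned start falls outside the returned range.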
To do this, Nationwide has a three-person team. One DBA is in charge of schedule coordination and prep work testing and is the primary DBA for clusterware patching. Another DBA is the primary DBA for Database Patching. The EM Administrator takes care of all things Enterprise Manager-related and serves as the developer for all automation surrounding fleet maintenance and the patching process.
The chart below shows the number of databases patched over the past three years. Prior to using EM to patch, Nationwide needed the entire 10-DBA staff patching multiple days per week for non-production and six DBAs scheduled every IRW. Even with that effort, they patched only about 70 percent of databases in a year.
As indicated by the chart, they have a much higher success rate with the enhanced approach. The green bars represent databases that were patched successfully in 15 minutes or less. Yellow bars represent databases that were patched successfully but took more than 15 minutes. Red bars signify that an issue was encountered and manual intervention was required. These numbers were achieved by multi-threading up to 25 concurrent patching jobs.
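To make the chart's three categories and the concurrency cap concrete, here is a minimal Python sketch. It does not call Enterprise Manager: `patch_one` is a hypothetical callable standing in for whatever submits and waits on a single patching job, and `PatchResult`, `bucket`, `run_patch_jobs` and `summarize` are illustrative names. It also assumes a job of exactly 15 minutes counts as green.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Iterable, List

MAX_CONCURRENT_JOBS = 25      # cap on simultaneous patching jobs
FAST_THRESHOLD_MINUTES = 15   # boundary between green and yellow


@dataclass
class PatchResult:
    database: str
    minutes: float
    needed_manual_fix: bool   # True when someone had to intervene


def bucket(result: PatchResult) -> str:
    """Map one result to the chart's colour categories."""
    if result.needed_manual_fix:
        return "red"
    if result.minutes <= FAST_THRESHOLD_MINUTES:
        return "green"
    return "yellow"


def run_patch_jobs(databases: Iterable[str],
                   patch_one: Callable[[str], PatchResult]) -> List[PatchResult]:
    """Run patch_one for every database, at most 25 at a time."""
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_JOBS) as pool:
        return list(pool.map(patch_one, databases))


def summarize(results: Iterable[PatchResult]) -> Counter:
    """Count how many databases landed in each colour."""
    return Counter(bucket(r) for r in results)
```

Tracking the red and yellow counts per cycle in this way makes it easier to see whether the same errors recur from one quarter to the next.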
Procedure Activity Monitoring
The Procedure Activity screen in Enterprise Manager is especially useful when it comes to patching and provisioning. Each deployment procedure creates an entry, which is broken down into smaller steps available for viewing. You can drill into each step to see the corresponding log. If a step encounters an issue, the log will help you identify the problem. After addressing the issue, an action button to the right of the screen allows you to either retry the step or ignore it to continue the process.
Advice on Dealing with Issues
After encountering issues during patching, Nationwide shared a few pieces of advice.
- Look at your environment and assess where you stand. The Software Standardization Advisor does a good job of this. To get there, follow this path: Targets > Databases > Administration > Software Standardization Advisor
- Download the latest patch set, the one you would typically apply with OPatch, and use it to create a Gold install.
- Begin the cycle by utilizing the Create Software Image operation within DB Software Maintenance to capture the gold image.
- Subscribe databases to the gold image that corresponds to their configuration.
- Deploy the gold image software to the database server.
- EM deploys the software based on the target type in the command: database instance, or RACDB. If multiple databases share the home, you only have to deploy it to one of them.
- When you issue an update on the database, it will patch or upgrade it based on the image it subscribes to.
- Once all databases have been updated to the successor software home, use the cleanup software operation to remove the old, unused software from the servers in EM.
All of these operations launch deployment procedures that can be monitored in Enterprise Manager.
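The ordering in the list above (subscribe, deploy, update, then clean up only after every database has been updated) can be sketched as a small tracker. This is an illustration in Python, not an EM or EMCLI interface: `Step`, `PatchCycle` and the database names are hypothetical, and for simplicity it tracks a deploy per database even though databases that share a home only need one.

```python
from enum import IntEnum
from typing import Dict, Iterable


class Step(IntEnum):
    NEW = 0          # not yet subscribed to the gold image
    SUBSCRIBED = 1   # subscribed to the image matching its configuration
    DEPLOYED = 2     # new home deployed to the database server
    UPDATED = 3      # database switched to the new home


class PatchCycle:
    """Tracks each database through one cycle for a single gold image."""

    def __init__(self, gold_image: str, databases: Iterable[str]) -> None:
        self.gold_image = gold_image
        self.state: Dict[str, Step] = {db: Step.NEW for db in databases}

    def advance(self, database: str, step: Step) -> None:
        """Move a database to the next step; skipping a step is an error."""
        current = self.state[database]
        if step != current + 1:
            raise ValueError(
                f"{database}: cannot go from {current.name} to {step.name}"
            )
        self.state[database] = step

    def ready_for_cleanup(self) -> bool:
        """Old homes may be removed only once every database is updated."""
        return all(s == Step.UPDATED for s in self.state.values())
```

For example, a cycle created with `PatchCycle("gold-19c", ["hr", "fin"])` reports `ready_for_cleanup()` as false until both databases have been advanced through all three steps.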
Tips and Tricks
Representatives at Nationwide shared the following tips and tricks for others to reference during this process.
- Reference the online Oracle Database Fleet Maintenance Manual via Google.
- Create Global Credentials (Sys, Oracle, Privileged Account)
- A private role can be given access to these credentials. Then the role is granted to DBAs.
- Set the preferred credentials on accounts doing provisioning and patching.
- Change your staging location (emStageDir) from /tmp (DOC ID 1610321.1)
- Ensure your oraInst.loc files are correct, especially in the Home and /etc.
- Ensure your cluster/has/database EM target properties are correct.
- Clean up your EM Oracle Home Targets. Add missing delete, decommed…
- Grant access to all DBAs to look at the procedure activity of others.
- EM Resource Privileges > Job System > Edit any procedure configuration
- Ensure your Oracle base can hold at least 3 homes (current, future, unzip).
- Study/use the gold agent provisioning.
- EMCLI commands:
- Switch_database, Switch_GI, > DB_Software_Maintenance
- Establish a clearly defined cycle.
- A cycle ends/begins when the next version is made current.
- Pick a model for subscribes and deploys.
- Mass operation at the beginning of the cycle, or Just-in-Time. Each has pros and cons.
- Have a directory naming convention to make versions obvious.
- …./12.1.0.2.190416
- …./12.1.0.2.190416-oneoff
- …./19.3.0.0.190416
- EM Book Keeping … Associations … Oracle Home to Target … “Installed At”
- Keep track of errors as you encounter them. You may be hitting them each cycle.
- Clean up software. (Default 3 homes) Set to 1.
- emctl set property -name oracle.sysman.emInternalSDK.db.gis.lineageLength
- NEW ability to delete home by home name
- Have a development EM install… Test out new patches… Stay current.
- Base Product patch (quarterly)
- Plugin Bundle patch (monthly)
- Agent Patch
- WebLogic (quarterly)
- New ability to move Gold Images between EM systems.
- Lineage: Rare but occasional problems (Doc ID 2346150.1)
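The directory naming convention in the tips above lends itself to a mechanical check. The following Python sketch parses the last path component into a release, a six-digit stamp and an optional suffix such as `oneoff`. The regular expression is inferred only from the three example names in the list, the function and type names are hypothetical, and the `/u01/app/oracle/product` prefix in the usage note is an illustrative placeholder for the elided part of the path.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Four dotted release numbers, a six-digit stamp, and an optional "-suffix".
HOME_NAME = re.compile(
    r"^(?P<release>\d+\.\d+\.\d+\.\d+)\.(?P<stamp>\d{6})(?:-(?P<suffix>[\w-]+))?$"
)


class HomeName(NamedTuple):
    release: str            # e.g. "12.1.0.2"
    stamp: str              # e.g. "190416"
    suffix: Optional[str]   # e.g. "oneoff", or None


def parse_home_name(path: str) -> HomeName:
    """Parse the final directory name of an Oracle Home path."""
    name = path.rstrip("/").rsplit("/", 1)[-1]
    match = HOME_NAME.match(name)
    if match is None:
        raise ValueError(f"{name!r} does not follow the naming convention")
    return HomeName(match["release"], match["stamp"], match["suffix"])


def sort_key(home: HomeName) -> Tuple[Tuple[int, ...], str]:
    """Order homes by numeric release first, then by stamp."""
    return tuple(int(part) for part in home.release.split(".")), home.stamp
```

With a hypothetical path such as `/u01/app/oracle/product/12.1.0.2.190416-oneoff`, `parse_home_name` separates the release, stamp and suffix, and `sort_key` lets a script list homes oldest to newest so that it is obvious which one is current.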
Conclusion
Based on Nationwide’s experience, security requirements are driving the need to patch a large number of assets more frequently, and automation is the key to accomplishing that. With Fleet Maintenance, the payoff is significant. Instead of dedicating the entire DBA staff to constantly patch and coordinate schedules, they have been able to achieve more patches in less time with less risk and less downtime. Take advantage of Fleet Maintenance for your company’s best approach to security and efficiency.
To learn more about Nationwide’s fleet maintenance best practices and use of the Oracle Enterprise Manager Fleet Maintenance feature, check out the QXW 2019 presentation attached below.