Guide to Migration Project Planning, Part 3: Migration Prep & Solution Design

Catching Up

In this third entry in “Device42’s Ultimate Guide to Migration Planning,” we’ll be covering Migration Preparation & Solution Design. We hope you’ve enjoyed reading our first two entries [Part 1: “Readiness Assessment & Creation of the Pre-Migration Plan” and Part 2: “Migration Project Planning”], but if you’re new to the series, those two links will take you to their respective articles.

Last, for those catching up, here’s a link to the Migration Roadmap Infographic. It visually lays out the migration process, from migration project planning through execution, and also serves as an overview of the migration project planning series.

In the last two articles, we covered topics ranging from ensuring migration viability to recruitment of key stakeholders to review and creation of configuration management policy, and vendor identification. We went on to discuss defining a central location for migration-related documents, and we touched on ensuring the readiness of monitoring systems, and their importance to the migration process.

Introduction

Migration Preparation and Solution Design is a very important step in the overall Migration process. This is the time when all the nuances and nitty gritty details are ground out, and it’s the very last planning step before the actual migration itself. Therefore, it must be executed both with extreme caution and attention to detail. The success or failure of your migration can depend heavily on proper execution of your Final Migration Preparation and Solution Design.

Migration Prep & Solution Design will discuss the following steps:

Gathering a detailed assessment of your current data center
Using the assessment to map dependencies for all applications involved in the migration
Breaking up the migration into smaller logical units, or ‘migration groups’
Ensuring a proper back out plan exists for each identified migration group
Finalizing SLA’s, setting the standards for go/no-go calls, and defining the criteria that trigger a back out
Making all hardware and software purchases and finalizing vendor agreements [both for assets and for labor]
Performing pre-migration testing and simulations of processes, procedures, and migration targets

When all of the above are complete, the only major thing left to do will be to execute the actual migration, which is what makes this final step so critical. This step may in fact take more time than either of the previous two steps, and that’s perfectly OK. This is the time to ‘cross your T’s and dot your i’s’. The phrase is fitting, as it conveys a level of refinement that should be the driving attitude during execution of this final stage of preparations.

Assess Your Current Data Center in Detail

The requirements that were defined during the last stage of planning [“Part 2: Project Planning – ‘Define the Hardware & Software Requirements’”] should make a good starting point for this task. Those requirements should have been based on real data, which is hopefully close to correct. However, hopefully close to correct isn’t going to cut it when it comes migration day, at least not if you’d care to succeed!

You should have been able to capture most of your IT assets when you ran discovery with your CMDB-type documentation tool. However, IT Departments often support systems that aren’t officially “on the books”, so to speak. These components, legacy or one-off, must also be identified. This includes the servers that live under engineers’ desks that ‘temporarily’ became part of production, and all the other secret databases, servers, and software that are rarely talked about but nonetheless depended upon by someone, somewhere. Verify counts of the known IT Assets, combine all the figures, and then double check to ensure they all make sense. Then check again to ensure not even one was missed, as the success of the migration could depend on it!
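As a rough illustration of that double check, the sketch below compares an official asset list against what a discovery run returned. The function name and hostnames are hypothetical, and it assumes both lists can be exported as plain hostnames; it is not tied to any particular CMDB’s export format.

```python
def reconcile_inventory(known_assets, discovered_assets):
    """Compare the official asset list against what discovery actually found."""
    known = set(known_assets)
    discovered = set(discovered_assets)
    return {
        # In the records and seen on the network
        "confirmed": sorted(known & discovered),
        # Seen on the network but missing from the records
        "off_the_books": sorted(discovered - known),
        # In the records but not found by discovery
        "unaccounted_for": sorted(known - discovered),
    }


# Hypothetical hostnames, for illustration only
known = ["db01", "web01", "web02"]
discovered = ["db01", "web01", "desk-srv-7"]

report = reconcile_inventory(known, discovered)
for category, hosts in report.items():
    print(category, hosts)
```

Anything in the “off the books” or “unaccounted for” buckets is a prompt for a follow-up conversation, not an automatic correction.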

Though this can be a lot of work, you can consider automating most of it with a comprehensive discovery tool like Device42. A good starting point is to run discovery against the “non-production” network segments that were found to be hosting one or more of the “off the books” services.

When finished, you’ll have a list of all hardware, software, network equipment, storage devices, air and cooling systems and their specifications, CRAC’s, power-related equipment like UPS’s and PDU’s, application components, and the applications they support. The specifics and correctness of this data will be critical to the next step – Dependency Mapping.

Map Application Dependencies for Target Apps

“Target Apps” refers to both applications that are part of the migration as well as any applications that will be affected by the migration. For example, if two applications share a database, and one application & the database are moving, both applications need to be listed and mapped as dependent upon that database.

The migration plan will need to include not only the steps required to move application #1 and its database, but also the steps to shut down application #2 while the shared database is being migrated. Add to this the steps to be performed to re-connect app #2 to the database’s new location, if any, and of course the steps required to bring app #1 up in its new location as well.
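The shared database example can be sketched in a few lines. This is a minimal illustration that assumes dependencies have already been exported as a simple mapping of application to the components it uses; the names (find_target_apps, app1, shared-db) are hypothetical, and only direct dependencies are checked.

```python
def find_target_apps(depends_on, moving):
    """Return every app that is moving, or that depends on something that is."""
    moving = set(moving)
    targets = {}
    for app, deps in depends_on.items():
        shared = set(deps) & moving
        if app in moving:
            targets[app] = "moving"
        elif shared:
            # Not moving itself, but it will be affected by the move
            targets[app] = "stays, but depends on moving: " + ", ".join(sorted(shared))
    return targets


# Hypothetical environment: two apps share one database, a third is unrelated
depends_on = {
    "app1": ["shared-db"],
    "app2": ["shared-db"],
    "wiki": ["wiki-db"],
}
moving = ["app1", "shared-db"]

targets = find_target_apps(depends_on, moving)
for app, status in targets.items():
    print(app, "->", status)
```

Here app2 shows up as a target app even though it is not on the list of things being moved, which is exactly the case the migration plan has to cover.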

Be very vigilant documenting these details in your CMDB, and ensure you don’t forget to run discovery against any involved network segments. After all, software can only do what it’s told to do, and any overlooked dependencies at this stage can cause systems that weren’t listed as ‘involved’ in the migration to go unexpectedly offline. This is especially true in the virtualized world most companies operate in today, where a given piece of physical hardware and its supporting systems can be responsible for the functionality of tens or hundreds of “virtual” servers.

The latest generation of CMDB’s [Device42, BMC, ServiceNow, and others] can identify dependencies in your environment as your infrastructure and the connections between it are auto-discovered. As such, your CMDB’s dependency mapping functionality should prove extremely helpful during this stage of planning. Do recall, as stated previously: the dependency information you get can only be as accurate as the data you are working with. The versatile CMDB tool that was chosen during Stage 2 (designating and assessing tools for centrally organizing migration documentation) will prove very helpful for dependency identification, and can clearly display the important information you need (see Device42’s Application Topology / Dependency Mapping page for a good example). The next planning step uses this data to produce groups of servers and associated hardware that share dependencies, creating move groups from that data.

Use Dependency Info To Create Migration Groups

A “migration group” (or ‘move group’ – considered the same thing for the purposes of this writing, and used interchangeably) is composed of a group of one or more applications and all the hardware and other dependencies that support that application. Any affected server and its services need to be considered when creating a given move group, and plans for those affected servers and services made.

Begin by choosing a target application for migration from the prior step, and identify the server or group of servers responsible for hosting it. Next, identify its network connectivity and the switches used by the hardware, the power connections and UPS’s, the shared storage, if any, and anything else that your setup depends on. If shared storage is utilized (very likely, as it’s quite common in today’s data center), mapping out shared storage utilization is a great technique for creating move groups.

Why and how? Well, a common configuration is for a group of N virtual servers to share Y physical servers, all sharing redundant storage array X [as all that algebra comes rushing back…]. Therefore, all of the virtual servers have a dependency on the physical servers, and all the physical servers are dependent on the storage.

Discovered dependencies and their commonalities will dictate the smallest possible groups as well as the makeup of the larger groups. You might find it convenient to start with all servers sharing a given storage appliance, for example. Once that list is made, if there is anything else left in the rack you are free to add it to the same group, or start anew. Progress down the dependency chain, identifying servers that share the same network switches as the current group, and then possibly those that share power, and so on.
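One way to picture this grouping is as finding the connected pieces of the dependency map: anything linked, directly or indirectly, through a shared dependency lands in the same group. The sketch below is a simplified illustration, not a prescribed method. The asset names and the build_move_groups function are hypothetical, and it assumes you choose which dependency types to feed in, since including shared switches or power could merge everything into one large group.

```python
from collections import defaultdict


def build_move_groups(dependencies):
    """Group assets that are linked through shared dependencies."""
    # Build an undirected graph: an asset and its dependency must move together
    graph = defaultdict(set)
    for asset, deps in dependencies.items():
        graph[asset]  # make sure assets without dependencies still appear
        for dep in deps:
            graph[asset].add(dep)
            graph[dep].add(asset)

    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Walk everything reachable from this asset
        group = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Hypothetical layout: virtual servers on physical hosts on storage arrays
dependencies = {
    "vm1": ["host1"],
    "vm2": ["host1"],
    "host1": ["array-x"],
    "vm3": ["host2"],
    "host2": ["array-y"],
}

move_groups = build_move_groups(dependencies)
for number, group in enumerate(move_groups, start=1):
    print("Move group", number, group)
```

Starting with storage dependencies only, as suggested above, tends to give the smallest sensible groups; network and power links can then be added selectively to see which groups would merge.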

Keep an eye out all the while for any applications that aren’t supposed to be migrating, if any. Your plan will have to include pre-migration steps or at least pre-move steps that re-locate the identified hardware, servers, services, or applications before the move commences.

Create SLA’s & A Back Out Plan

For Each Migration Group

Now that you’ve gone down the list of applications that must be moved, and fit each application into a migration group, it’s time to consider the ramifications of a failure of any individual application in a migration group. Identify those that are and are not business critical, and those failure modes which are “fatal” vs. those failure modes which are temporary, and steps to resolve and back out each. Designate someone to make the final call of go or no-go, and success or failure, for each migration group. There can be no doubt in anyone’s mind as to the success of a given part of the migration. The designated resource will have full authority and final call to initiate the rollback plan, so make sure the criteria for success are very clearly defined. There will be no time to argue this on migration day!

Consider, for any non-critical applications, or any critical applications that are only running during business hours, just how important they really, truly are. Many will jump to say that they “can’t afford any downtime”, but that just isn’t logical. Everyone and everything can tolerate some degree of downtime; the task is figuring out how much and, possibly just as important for this stage of planning, when. Certain departments, and even customer-facing applications, may have both allowable maintenance windows as well as peak and off-peak times. Planning migration schedules to take advantage of downtimes and off-peak times will be key to maximizing your move window and minimizing chances for rollbacks.

Gather existing SLA’s, and meet with business leaders to determine appropriate SLAs for any applications, both internal and external, for which one cannot be located. Customer agreements may need to be consulted in some cases, and in others the cost of downtime calculated to determine appropriate values. At the end of the day, these values are extremely important and once established are essentially gospel. With the assistance of said SLA’s and a bit of simple math, the criteria that put a back out plan into motion can in many cases be as simple as: SLA, minus time taken so far to migrate, minus time to execute the back out plan. When that remainder reaches zero, it is time to back out. Put even more simply, SLA – time to execute back out plan = Move SLA, i.e. the amount of time you have in which to migrate.
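The arithmetic is simple enough to write down directly. The sketch below is only an illustration with made-up durations; the function names are hypothetical, and it assumes the SLA, the elapsed time, and the back out time are all measured from the same starting point.

```python
from datetime import timedelta


def move_sla(sla, backout_time):
    """Time available to migrate: SLA minus the time needed to back out."""
    return sla - backout_time


def should_back_out(sla, elapsed, backout_time):
    """True once the time remaining no longer covers the back out plan."""
    remaining = sla - elapsed - backout_time
    return remaining <= timedelta(0)


# Hypothetical figures: 4 hours of allowed downtime, 1 hour to back out
sla = timedelta(hours=4)
backout_time = timedelta(hours=1)

window = move_sla(sla, backout_time)
print("Move SLA:", window)

print(should_back_out(sla, timedelta(hours=2, minutes=30), backout_time))
print(should_back_out(sla, timedelta(hours=3), backout_time))
```

With these figures the move SLA is three hours: at two and a half hours in there is still room to continue, and at three hours the back out has to begin.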

Once SLAs have been gathered and finalized, move SLA’s have been calculated, and the plans to abort migration of each move group and roll back are detailed out, it also needs to be determined whether the failure of one or more move groups is critical enough to cause the entire migration’s rollback plan to be initiated. No one outside your organization can help with this step. Only you can make these determinations on behalf of your business, and they should be made based on requirements dictated by the business for uptime and service availability. An external application supporting your main Line of Business is obviously going to be a lot more important here than, say, your company’s internal wiki, but nonetheless, these are exactly the things that need to be figured out and considered when creating the back out procedure for each component of the migration. Include in the plan any temporary, interim supporting equipment and other backup systems that will be required for the migration or as part of the contingency. Factor all of this into the migration plan, and always err on the cautious side here. Of all places, this is not the place to save a few dollars (at least not during my migration!).

Finalize Purchases and Vendor Agreements

For Hardware, Software & Labor

This step is rather straightforward, but important nonetheless. Finalize all purchase quantities, delivery dates, agreements and contracts for hardware, software, and labor. Ensure everything that you are going to need for the migration as planned will be where it needs to be when it needs to be there. Ensure that vendors are on board, and contractors know exactly what they’ll be doing, when they’ll be doing it, etc. Sweat the small details and ensure building access and systems access and revocation has all been accounted for.

Overstaff and have contingencies in place. People get sick, stuck in traffic, etc., especially people that don’t know your project from the inside out and don’t have the emotional and moral investment that you might. Try to have a few variations planned. The better you understand the ins and outs of your migration, the easier rearranging a few operations without going off plan will be. It’s best if these contingencies are part of the plan for just that reason.

The same thinking applies to hardware: it’s a good idea to factor both spare parts and a spare unit or two into the initial purchase shipment, as the middle of a migration isn’t an appropriate time to be scrambling to call vendors for warranty claims and replacement parts.

Run Pre-Migration Tests and Validate Targets

Before Migration Day, there’s a lot of testing to be done. Many things, if properly planned, should work fine as hopefully you are just updating to newer versions of similar hardware. Every so often, there can be cases where similar or “compatible” hardware doesn’t exist anymore, especially if your migration is dragging legacy systems around.

Now, and not during the migration, is the time to run pre-migration tests and related tasks, and validate migration targets for anything that is even a little bit iffy. You can pre-order a small-scale setup (a single server, an example or two of the new switch, router, or access point, etc.) and pre-configure and test it ahead of time. Put an example of the new hardware or new setup that is in doubt into “production” in your current data center well in advance of the move. You’ll gain valuable experience for migration day, prove out the setup, and ‘burn in’ the new hardware all in one fell swoop.

For migration processes, devise solutions and test them out locally for anything that is in doubt. You do not want to find out six months after migration that a customer database that appeared to migrate successfully only migrated names, but not customer contact info! Imagine the fiasco. Vet the migration independently, and work tests into the plan. Once the migration is over, and the hired hands leave, it can prove quite difficult to resolve issues with systems you have no in-house expertise on.

Though it’s been stated more than a few times throughout this article, this, too, is not something that can be left to be ‘figured out’ during the migration. Come up with the procedures ahead of time, test them, and validate them. Use the same validation methods you devise during testing to validate the live migrated data. Come migration day, it’ll be too late, and you don’t want to be the reason that the migration is declared a partial failure, or worse, enough of an overall failure to kick the rollback plan into gear.
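As one example of a validation method that can be rehearsed during testing and then reused on the live data, the sketch below compares row counts and the number of populated values per column between a source and a target. It is a minimal illustration that assumes both sides can be read into lists of records; the function names and sample data are hypothetical, and a real check would likely also compare the values themselves.

```python
def column_fill_counts(rows):
    """Count how many rows have a non-empty value in each column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts.setdefault(column, 0)
            if value not in (None, ""):
                counts[column] += 1
    return counts


def validate_migration(source_rows, target_rows):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if len(source_rows) != len(target_rows):
        problems.append(
            f"row count differs: {len(source_rows)} source vs {len(target_rows)} target"
        )
    source_counts = column_fill_counts(source_rows)
    target_counts = column_fill_counts(target_rows)
    for column, expected in sorted(source_counts.items()):
        actual = target_counts.get(column, 0)
        if actual != expected:
            problems.append(
                f"column '{column}': {expected} populated in source, {actual} in target"
            )
    return problems


# Hypothetical customer table where names migrated but contact info did not
source = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Grace", "email": "grace@example.com"},
]
target = [
    {"name": "Ada", "email": None},
    {"name": "Grace", "email": ""},
]

problems = validate_migration(source, target)
for problem in problems:
    print(problem)
```

A check like this would flag the “names but no contact info” scenario on migration day, when it can still feed into the go/no-go decision for that migration group.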

Synopsis

Even perfect planning can’t fix poor execution, but poor planning nearly guarantees it. When it comes to this final stage of Migration Preparation & Solution Design, take nothing for granted. The migration can only go as well as the plan dictates. Ensure your plan dictates success!

The ‘Devil is in the details’ – of the assessment of your current data center, of your application and supporting hardware interdependencies. Plan your move groups with care, and ensure that anything that isn’t supposed to move is accounted for as early as possible, and if that isn’t feasible, is part of the migration plan. Ensure every migration group has a back out plan, and designate someone to make the final call for each migration group.

Remember, too, that there is an order of operations that makes the most sense in all things, and that is also true of data center migrations. After the new systems are tested, and hardware comes in and is shipped to the new location, it needs to be unboxed and racked, racks having been set up already if the new location is not so equipped. Once power circuits and equipment are all verified, along with HVAC and ancillaries, network wiring is usually run, network connections are made, and then a small migration “dry run” can take place. The size and specifics of this dry run are only limited by planning. With proper planning, a lot of test ground can be covered, and many strategic migration tasks can possibly be accomplished ahead of time. All this testing and verification allows for time to roll back and make changes to plans should a significant problem be discovered during the testing, a saving grace compared with discovering the same during the live migration.

In our next entry, we’ll cover the actual migration itself. Come migration day, most of what is happening should leave little wiggle room for thought or judgement. It should be a lot of following planned and well-rehearsed steps and moving slowly, baby step after baby step, towards a well-defined goal. It’s truly the test of your plans and planning skills, but in all hopes, they’ve been put through the wringer and rehearsed more than a few times prior to actual migration day.

If all goes well, migration day should be quite boring and monotonous. As anticlimactic as that sounds, it’s the sign of a migration going well. As with those who repair cars that have been in accidents, most won’t even know you touched anything… unless there’s a problem. It’s thankless work, but then again, that’s not always a bad thing. When all’s said and done, at least you’ll sleep well at night, which is better than the so-called ‘excitement’ that comes along with the hard way.

See you next month for our final entry in the series: “The Migration”. Until then, ‘measure twice…’, and happy planning!
