Top Ten Considerations for Content Migration Projects
When migrating content (of any kind) from one content system to another, here are the top ten things you’ll want to think about.
A long list of things has to be taken into account in such a project, so this is far from exhaustive. Nonetheless, it should help with some of your planning, regardless of the type of content tools involved. All the tool types share similar challenges, so here’s the list.
1) Find your SMEs!
Finding people who know the old system and the content in it from both technical and business perspectives is critical. These people are your key to making sure nothing is missed or lost by mistake during the migration. Involve them in analyzing and resolving all of the items below.
2) Content Connectivity
While understanding how to connect to the old system and extract content is important, you need to know how content is put together to form a document, or how a link from one piece of content to another occurs. This structure will need to be echoed in the new system and therefore needs to be understood.
3) Content Format & Transformations
In the old system, content can be stored in a variety of formats which may not match the desired target format. Text-based content is usually easiest to transform, but even this can be problematic: HTML markup in the old system may not follow a strict HTML standard, template, or structure, making it harder to transform. Binary formats can be very hard to extract content from; see item 9 below for more on this.
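As a rough illustration of coping with loose markup, here is a minimal sketch that pulls readable text out of HTML with unclosed and inconsistently cased tags, using only Python's standard `html.parser`. The sample markup, the `TextExtractor` class, and the choice of block-level tags are assumptions made for this example; a real migration would need rules tuned to the old system's actual content.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and tolerates unclosed or mismatched tags."""

    # Tags treated as line breaks (an assumption for this sketch).
    BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "tr"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag names, so <P> and <p> are handled alike.
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)

    def text(self):
        # Collapse runs of whitespace and drop empty lines.
        lines = (" ".join(line.split()) for line in "".join(self.parts).splitlines())
        return "\n".join(line for line in lines if line)


def html_to_text(markup):
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()


# Hypothetical legacy fragment: mixed-case tags, nothing closed.
legacy = "<P>First paragraph<p>Second <b>bold text<br>Last line &amp; more"
print(html_to_text(legacy))
```

Because the extractor never relies on closing tags, it degrades gracefully on markup that a strict XML parser would reject outright.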
4) Presentation Transformations
It’s quite possible that a content system or the associated presentation mechanism (such as a portal) performs last-minute transformations of content in order to tailor it to the viewer or simply to make it readable. If this is overlooked, the imported content can look very different from what was seen on screen in the old system.
5) Extraction & Import
Pulling content out of the old platform can often be difficult, and depending on the target platform, similar problems can occur on import. Make sure there are APIs available that give you full access to the information you need. Otherwise, you will need SMEs who understand the database or file system schema of the old content system.
6) Meta Data
Meta data is critical to the operation of most portals and content systems. Carrying useful meta information over from the old system to the new is usually beneficial, if not required. Establish a clear understanding (a dictionary) of meta attributes in the old system as soon as possible. Ensure that the meta data dictionary is specific about what each attribute means, details such as date formats, and which attributes should be kept (this part of the work is a skill set all of its own).
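To make the idea of a meta data dictionary concrete, here is a minimal sketch in Python. The attribute names, the day/month/year date format, and the `map_metadata` helper are all hypothetical; the point is only that each old attribute gets an explicit rule and that anything not in the dictionary is reported rather than silently dropped.

```python
from datetime import datetime

# Hypothetical dictionary: one rule per attribute found in the old system.
META_DICTIONARY = {
    "doc_title": {"target": "title", "keep": True},
    "created_dt": {"target": "created", "keep": True, "date_format": "%d/%m/%Y"},
    "legacy_flag": {"target": None, "keep": False},  # agreed with SMEs to drop
}


def map_metadata(old_meta):
    """Return (new_meta, unknown_attributes) for one document."""
    new_meta, unknown = {}, []
    for name, value in old_meta.items():
        rule = META_DICTIONARY.get(name)
        if rule is None:
            unknown.append(name)  # needs a decision before migration
            continue
        if not rule["keep"]:
            continue
        date_format = rule.get("date_format")
        if date_format:
            # Normalize old dates to ISO 8601 (YYYY-MM-DD).
            value = datetime.strptime(value, date_format).date().isoformat()
        new_meta[rule["target"]] = value
    return new_meta, sorted(unknown)


sample = {
    "doc_title": "Pump manual",
    "created_dt": "03/11/2009",
    "legacy_flag": "Y",
    "mystery": "?",
}
print(map_metadata(sample))
```

Running this over the whole export early on gives the SMEs a list of unknown attributes to rule on while there is still time to do so.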
7) Permissioning or Authorization
It is important that content that was marked with permissioning rules continues to be shown only to authorized users. Be explicit about the permissioning / authorization mapping from the old to the new system, since the authorization models are likely to be quite different. Old system permission attributes may not reflect current business groups or people, so a plan will be required to deal with this.
8) Translations
Content is often stored in different languages for the different consumers of the system. Translation storage can differ greatly between content systems and should be carefully analyzed. Ensure that each translation can be associated with the original language document (or that they can all be associated together if there’s no explicit master).
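An explicit mapping can be as simple as a lookup table plus a rule for what happens when a group cannot be mapped. The sketch below is one possible approach, not a prescription: the group names, the `migration-review` holding group, and the fail-closed rule are assumptions chosen for illustration.

```python
# Hypothetical mapping from old permission groups to new ones.
# None marks a group that no longer exists in the business.
GROUP_MAP = {
    "SALES_EU": "sales-emea",
    "HR_ALL": "hr",
    "ENG_OLD": None,
}

# Hypothetical restricted group for content awaiting a manual decision.
REVIEW_GROUP = "migration-review"


def map_permissions(old_groups):
    """Return (new_groups, unresolved_old_groups) for one document."""
    mapped, unresolved = set(), set()
    for group in old_groups:
        target = GROUP_MAP.get(group)
        if target is None:
            unresolved.add(group)  # unknown or retired group
        else:
            mapped.add(target)
    if unresolved:
        # Fail closed: lock the document down until someone reviews it,
        # rather than risk showing it to the wrong audience.
        return [REVIEW_GROUP], sorted(unresolved)
    return sorted(mapped), []


print(map_permissions(["SALES_EU", "HR_ALL"]))
print(map_permissions(["SALES_EU", "ENG_OLD", "TEMP"]))
```

Failing closed means some content is temporarily over-restricted, which is usually the safer error when the old attributes no longer match real groups or people.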
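One way to check that translations stay associated is to group the exported records by a shared key and flag groups that lack a master language version. In this sketch the record fields (`id`, `group_key`, `language`) and the choice of English as master are assumptions; the old system may link translations in a completely different way.

```python
from collections import defaultdict


def group_translations(records, master_language="en"):
    """Group records by shared key; 'master' is None when no master exists."""
    by_group = defaultdict(dict)
    for record in records:
        by_group[record["group_key"]][record["language"]] = record["id"]
    return {
        key: {
            "master": languages.get(master_language),
            "translations": dict(sorted(languages.items())),
        }
        for key, languages in by_group.items()
    }


# Hypothetical export: two documents, one without an English master.
records = [
    {"id": 101, "group_key": "pump-manual", "language": "en"},
    {"id": 102, "group_key": "pump-manual", "language": "fr"},
    {"id": 201, "group_key": "safety-note", "language": "de"},
]
groups = group_translations(records)
orphans = [key for key, group in groups.items() if group["master"] is None]
print(groups)
print(orphans)
```

Groups without a master are not necessarily errors, but they are worth listing for the SMEs to confirm.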
9) Binary Data
Binary data is any content not stored in text formats. This includes Microsoft Office documents (especially older versions) such as Word, Excel, PowerPoint, and Visio. It also includes PDFs, images (of all types, and there are many), and CAD or other application-specific file formats. While these are often found in document management systems, this type of content often forms part of a web content or knowledge management system as well.
10) Manual Intervention
However similar the two systems are (the more similar, the easier the migration), and however well the migration goes, plan on at least a detailed manual review of the content when it arrives in the new system. In addition, plan on some manual cleanup and repair, even in the most successful project.
And here’s a bonus item for all the project planners and project managers out there:
11) Effort and Timing
Most of the time and effort in a content migration does not go into building the tool or into the run time of the export, transform, and import. Most of your time will be spent analyzing, understanding, testing, and cleaning up, and all of this work and rework takes time. You’ll want to plan for that! What can’t be overlooked are the timing issues that can occur if the old content tool is in production and in regular business use. When that’s the case, content is constantly changing as you’re trying to extract and import it, and depending on the volume, that can be a huge challenge if you’re not able to ‘freeze’ the old system.
Architech has experience managing content migration processes and building code to make the exports and imports happen. Every migration is different, and different challenges will arise. The likelihood of anything in the above list being a real risk area depends on the type of technologies involved, the quality of the content, and so on. However, every one of them will need to be tackled at some level or another.
Dealing with these challenges effectively is the difference between a successful migration and a problematic one. The benefits of updated content tools are many, but without quality content, the software becomes less important and realizing the ROI of a new tool will be a challenge. It’s possible to really devalue content during a migration, or pollute the new tool with badly described and structured, out of date, and ‘low value’ content. That’s something you always want to avoid.
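If the old system can’t be frozen, one common workaround is to fingerprint the content at extraction time and again just before cutover, then re-migrate only what changed in between. The sketch below assumes content can be read as an ID-to-text mapping; the function names and sample data are hypothetical.

```python
import hashlib


def snapshot(items):
    """Map each content ID to a SHA-256 fingerprint of its body."""
    return {
        item_id: hashlib.sha256(body.encode("utf-8")).hexdigest()
        for item_id, body in items.items()
    }


def diff_snapshots(before, after):
    """List IDs added, removed, or changed between two snapshots."""
    shared = before.keys() & after.keys()
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "changed": sorted(k for k in shared if before[k] != after[k]),
    }


# Hypothetical content at extraction time and again before cutover.
at_extract = snapshot({"a": "Intro", "b": "Pricing v1", "c": "Old news"})
at_cutover = snapshot({"a": "Intro", "b": "Pricing v2", "d": "New page"})
delta = diff_snapshots(at_extract, at_cutover)
print(delta)
```

The size of the delta over a trial period also gives a useful estimate of how much catch-up work a live cutover will involve.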
Are there any other problems you’ve run into with a migration project? We would love to hear your thoughts on any of the above, as well as any tips on how to overcome these challenges in a typical project.
