
Success! How We Migrated Legacy Data into Indaba


(Originally posted at getindaba.org)

As we blogged about a few months ago, Global Integrity has been eating its own dog food by serving as the first test subjects in an experiment to migrate legacy research and fieldwork data into the Indaba fieldwork platform.

The good news: it worked! It took longer than expected, but we learned some valuable lessons along the way that should help other organizations that want to migrate old PDF reports, Excel files, and who-knows-what-else into Indaba.

The Process

We had two big piles of information to move from legacy web servers and spreadsheets into Indaba: Global Integrity Report scorecard and survey data from 2006 through 2009 (covering a few hundred countries in total) and companion Corruption Notebooks essays from the same time period.

The data work was of course the heaviest lift. First, our developers came up with a new data import protocol in Indaba designed to import legacy data directly into Indaba’s master database in the correct format. The key here was to ensure that survey structures themselves — not just the results to survey questions — could be imported seamlessly so that we wouldn’t have to recreate a 320-question survey tree structure by hand once the questions and answers showed up in Indaba. We managed to do that and have simple spreadsheet templates available for others to use in the future. Those spreadsheets tell Indaba things like which questions should appear in which order, and which questions belong to which categories in a survey. This saves huge amounts of time downstream and allows for immediate re-publication of legacy data once the import is complete.
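To make the template idea concrete, here is a minimal sketch of how flat spreadsheet rows can carry enough information to rebuild a survey tree. The column names (category, order, question_id, text) and the sample questions are assumptions made up for illustration; they are not Indaba's actual template format.

```python
import csv
import io

# Hypothetical template columns and sample rows; Indaba's real import
# template may use different names and more fields.
TEMPLATE = """category,order,question_id,text
Elections,2,q2,Are election results published?
Civil Society,1,q3,Can NGOs register freely?
Elections,1,q1,Is there an election monitoring agency?
"""


def build_survey_tree(csv_text):
    """Group flat template rows into {category: [questions sorted by order]}."""
    tree = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        tree.setdefault(row["category"], []).append(
            {
                "id": row["question_id"],
                "order": int(row["order"]),
                "text": row["text"],
            }
        )
    # Sort each category's questions by their declared position.
    for questions in tree.values():
        questions.sort(key=lambda q: q["order"])
    return tree


tree = build_survey_tree(TEMPLATE)
for category, questions in tree.items():
    print(category, [q["id"] for q in questions])
```

Because each row states its own category and position, the rows can arrive in any order and the structure is still recovered without rebuilding it by hand.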

To crunch our old data and render it in that new import-ready format, we worked with Seaborne and their Delray platform. Delray is basically a customizable data processor that allows you to map messy, disparate data into a clean format of your choosing (in this case, the specific import-ready format we needed for Indaba). Delray worked, but it took many iterations of testing and debugging our own internal processes to get it right. The fault was not Delray’s: we kept discovering seemingly unimportant fields in Indaba’s data model that needed values from Delray. In our pre-Indaba days, many of those data elements simply didn’t exist, and we often had to run scripts to fill in dummy values to get the datasets ready for import into Indaba.
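A backfill script of the kind described here might look roughly like the sketch below. The field names and placeholder values are hypothetical, chosen only to illustrate the approach; they are not taken from Indaba's data model or from the scripts we actually ran.

```python
# Hypothetical required fields and placeholder values (assumptions for
# illustration only, not Indaba's actual data model).
REQUIRED_DEFAULTS = {
    "reviewer_id": "legacy-import",
    "submitted_at": "1970-01-01",
    "status": "published",
}


def fill_missing_fields(record, defaults=REQUIRED_DEFAULTS):
    """Return a copy of record with placeholders for missing or empty fields,
    plus the list of field names that had to be filled."""
    filled = dict(record)
    added = []
    for field, placeholder in defaults.items():
        if filled.get(field) in (None, ""):
            filled[field] = placeholder
            added.append(field)
    return filled, added


legacy_row = {"question_id": "q1", "score": "75", "status": ""}
filled_row, added_fields = fill_missing_fields(legacy_row)
print(filled_row)
print("filled with placeholders:", added_fields)
```

Returning the list of filled fields alongside the record makes it possible to log which values are placeholders, so they are not later mistaken for real legacy data.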

For the Corruption Notebooks, the process was simpler: we copied and pasted a few hundred Notebooks from published web pages into Indaba. Because these essays are just text combined with a small amount of metadata, it was faster to brute-force our way to completion than to build an automated import. This did require some nifty custom work in Indaba’s Control Panel, such as creating single-step workflows in live projects through which we could use Fieldwork Manager to paste text into Indaba, but we got it all done in an afternoon with four of us sitting around a conference room table eating popcorn. Hooray for team building via monotonous scut work!

What’s Next

We’ve been hearing growing interest from various organizations and current Indaba users who want to move their old research into Indaba so that they can clone existing work and quickly launch a new round of research. They also like the idea of using Indaba’s API and export tools to publish and manipulate data. We now have the experience and tools to help groups do that, and we’re excited to be able to offer import services to users moving forward.

If you want to take a sneak peek at the end results for Global Integrity, here’s a quick look at a soon-to-be-published web page on our not-yet-finished new website that contains some rich legacy data from Pakistan (just one random example).

Interested in learning more about how Indaba can help you liberate old research and data? Reach out to my colleague Monika Shepard to learn more.

– Nathaniel Heller

– Photo Credit: John Markos O’Neill

