Breaking your proprietary software habit: Best practices for data import into CiviCRM

Young-Jin Kim, Eileen McNaughton, Micah Lee


7 deadly sins of data migration

1. Wrath - Feeling you'll get if you don't plan!
2. Gluttony - Failure to restrict import scope
3. Greed - Failure to get rid of data
4. Sloth - Failure to iterate quickly, work cleanly
5. Pride - Failure to validate the import
6. Lust - Failure to dedupe
7. Envy - Failure to leave behind old ways
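To make sin 6 concrete, here is a minimal sketch of deduping before import. It assumes contacts are plain dicts and that a normalized email address is a good enough match key; both are illustrative assumptions, and the names normalize_email and dedupe_contacts are hypothetical. Real dedupe rules usually also look at names and addresses.

```python
def normalize_email(email):
    """Lowercase and strip whitespace so trivially different spellings compare equal."""
    return email.strip().lower()


def dedupe_contacts(contacts):
    """Keep the first record seen for each normalized email address.

    Records without an email are kept as-is, since there is no key to match on.
    """
    seen = set()
    unique = []
    for contact in contacts:
        key = normalize_email(contact.get("email") or "")
        if not key:
            unique.append(contact)
            continue
        if key in seen:
            continue  # duplicate of a record we already kept
        seen.add(key)
        unique.append(contact)
    return unique


# Hypothetical sample data.
contacts = [
    {"name": "Ada Example", "email": "ada@example.org"},
    {"name": "A. Example", "email": " ADA@example.org "},
    {"name": "No Email", "email": ""},
]
unique_contacts = dedupe_contacts(contacts)
```

Keeping the first record is the simplest policy; a real migration would more likely merge the duplicate's fields into the surviving record.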

liberate your data, set it free

Best practices for data migrations

1. Use a dedicated environment for data imports
2. Automate scripts for the full import early on! Use APIs!
3. Judiciously, with client input, limit data import scope
4. Data import is an iterative process: iterate, iterate!
5. Think about current workflow and future workflow as it impacts data mapping into CiviCRM
6. If you can, draw up a time horizon that will demarcate stale data from current data, e.g. 3 years in the past
7. Don't reinvent the wheel: make use of free tools, e.g. Migrate, Civimigrate, ETL tools, Google Refine, APIs
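The time horizon in practice 6 can be sketched as a simple partition. This assumes each legacy record carries a last-activity date; the field name last_activity, the function split_by_horizon and the sample data are all hypothetical.

```python
from datetime import date


def split_by_horizon(records, cutoff):
    """Partition records into (current, stale) by their last activity date."""
    current, stale = [], []
    for record in records:
        if record["last_activity"] >= cutoff:
            current.append(record)
        else:
            stale.append(record)
    return current, stale


# A fixed "today" keeps the example reproducible.
# Note: date.replace would raise on Feb 29 if the target year is not a leap year.
today = date(2012, 6, 1)
cutoff = today.replace(year=today.year - 3)  # 3 years in the past

# Hypothetical sample data.
records = [
    {"id": 1, "last_activity": date(2011, 11, 20)},
    {"id": 2, "last_activity": date(2007, 3, 5)},
]
current, stale = split_by_horizon(records, cutoff)
```

Stale records do not have to be thrown away; they can be archived outside CiviCRM and left out of the import scope.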

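Two possible migration workflows

[Diagram: two workflows from the legacy database to the CiviCRM database. Labels in the first: Legacy DB, Export, Cleanse, Transform, Import, Pentaho Kettle, CiviCRM DB. Labels in the second: Legacy DB, Export, Export DB, Transform, Import, Civimigrate Module, CiviCRM DB. Google Refine also appears as a label.]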

Google Refine

● Free, open source data cleaning tool written in Java, running on a local web server
● Uber-spreadsheet "on steroids" with a GUI
● Reads in many file types and data formats, including Google Docs spreadsheets
● Many built-in data transformations for merging, clustering, matching, faceting
● Ability to extend capabilities by writing custom transforms in GREL, Python or Clojure
● Cleaning procedure can be saved as JSON and replayed easily
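To give a feel for what clustering does, here is a rough Python approximation of key-collision clustering with a "fingerprint" key: values that reduce to the same normalized key are grouped as likely variants of one another. This is a sketch of the idea only, not Refine's own code, and the function names and sample values are hypothetical.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Approximate a key-collision fingerprint for a text value."""
    # Fold accented characters to ASCII by decomposing and dropping the marks.
    ascii_value = (
        unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    )
    # Trim, lowercase, and remove ASCII punctuation.
    cleaned = ascii_value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    # Split into tokens, drop duplicates, sort, and rejoin.
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group values sharing a fingerprint; keep groups with more than one distinct value."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


# Hypothetical sample data.
names = [
    "Electronic Frontier Foundation",
    "electronic frontier foundation.",
    "Foundation, Electronic Frontier",
    "CiviCRM",
]
clusters = cluster(names)
```

Here the three spellings of the same organization land in one cluster, which a person can then review and merge to a single value.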

Pentaho Data Integration

● Free, open source Extract-Transform-Load (ETL) tool written in Java on the Eclipse framework
● Visual programming interface (GUI) for pipelining data and inspecting data streams
● Comes with connectors to many existing data(base) formats for input and output
● Write custom JavaScript and Java steps
● Data stream is routed using transformation steps; transformations can be chained in a job
● Transformations and jobs are stored as XML
● Replay XMLs from the command line
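The idea of routing a data stream through chained steps can be sketched with Python generators. This only illustrates the pipelining concept; it is not how Pentaho is implemented or scripted, and every name below is hypothetical.

```python
def read_rows(rows):
    """Input step: emit each source row as a fresh dict."""
    for row in rows:
        yield dict(row)


def trim_fields(rows):
    """Transformation step: strip surrounding whitespace from string fields."""
    for row in rows:
        yield {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}


def drop_missing_email(rows):
    """Filter step: route rows without an email out of the stream."""
    for row in rows:
        if row.get("email"):
            yield row


def run_pipeline(source, steps):
    """Chain the steps so each one consumes the previous step's output stream."""
    stream = source
    for step in steps:
        stream = step(stream)
    return list(stream)


# Hypothetical sample data.
legacy_rows = [
    {"name": " Ada Example ", "email": "ada@example.org "},
    {"name": "No Email", "email": ""},
]
result = run_pipeline(legacy_rows, [read_rows, trim_fields, drop_missing_email])
```

Because each step is small and independent, steps can be reordered or swapped between iterations of the import without touching the others.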

What is Civimigrate?

It's a bandaid between the Migrate module and the CiviCRM API. More technically, it exposes the API as a Migrate destination.

● Maps source data (CSV, Oracle, XML, MySQL, JSON, ...) to Migrate destinations
● Supplies a framework to do trial imports, rollbacks and updates, via Drush or the GUI
● Map tables maintain relationships between source data and the resulting CiviCRM entities
● Allows you to use hooks to manipulate data during the migration (prepareRow plus callbacks, e.g. to sanitize data)
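The role of a map table can be sketched conceptually in Python. This is not the Migrate or Civimigrate API (those are PHP); it only shows why remembering source-to-destination IDs makes re-runs update instead of duplicate, and makes rollback safe. The class, the functions and the dict standing in for the CiviCRM database are all hypothetical.

```python
class MapTable:
    """Tracks which destination ID each source ID was imported as."""

    def __init__(self):
        self.source_to_dest = {}

    def record(self, source_id, dest_id):
        self.source_to_dest[source_id] = dest_id

    def lookup(self, source_id):
        return self.source_to_dest.get(source_id)


def prepare_row(row):
    """Hypothetical sanitizing hook, run on each row before it is saved."""
    row = dict(row)
    row["email"] = row["email"].strip().lower()
    return row


def run_import(rows, destination, map_table):
    """Import rows, updating records already mapped and creating the rest."""
    for row in rows:
        clean = prepare_row(row)
        dest_id = map_table.lookup(clean["source_id"])
        if dest_id is None:
            # Stand-in for an ID the CRM would assign on create.
            dest_id = max(destination, default=0) + 1
            map_table.record(clean["source_id"], dest_id)
        destination[dest_id] = clean


def rollback(destination, map_table):
    """Delete only the records this import created, then clear the map."""
    for dest_id in map_table.source_to_dest.values():
        destination.pop(dest_id, None)
    map_table.source_to_dest.clear()


# A dict stands in for the CiviCRM database; record 10 existed before the import.
destination = {10: {"source_id": None, "email": "existing@example.org"}}
map_table = MapTable()
rows = [{"source_id": "legacy-7", "email": " Ada@Example.org "}]

run_import(rows, destination, map_table)
run_import(rows, destination, map_table)  # re-run updates record 11, no duplicate
```

Calling rollback(destination, map_table) afterwards removes record 11 and leaves the pre-existing record 10 untouched, which is what makes trial imports cheap to repeat.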

What does Migrate do?

You've migrated your data, but what about your donors?

EFF had ~1,000 recurring donors in Convio, bringing in ~$20,000 per month. We spent a long, long time saving them, but in the end succeeded. Probably worth it.

Ways to save your recurring donors:

● Call them on the phone, ask them to re-donate (recommended)
● Get credit card numbers, carefully babysit a Selenium script
● Keep the old payment processor around until all cards expire, write CiviCRM integration code