How are Einstein Discovery predictive models typically built?
The Lightfold process for building and deploying an Einstein Discovery (ED) model typically follows these broad steps over six to eight weeks:
1. Determine the base hypothesis and predictive goal.
2. Source, connect, and prepare the necessary data to train the model.
3. Build the initial model and rapidly refine and iterate.
4. Deploy the operationalised model to a pilot user group.
5. Test the model.
6. Refine the model and deploy it for all relevant users.
7. Establish a program for monitoring and improving the model over time.
By far the longest and least efficient part of the process is step 2: sourcing, connecting, and preparing the data.
Why does it take so long to source the data and get it prepped? Well, partly it’s the complexity of the transformations and logic needed to structure data in a way that makes sense for a particular predictive model, but mostly it’s because usually only half the relevant data is stored in the CRM. The rest is stored in silos all over the business.

Negotiating access to that data can take weeks, and that is only the start. Next you need to convince one or more DBAs or product admins to coordinate with your Analytics Studio admin to set up however many connector types and individual connections you need. Even after all that, considerable value is usually left on the table because project timeframes need to keep rolling: either we accept that the model will lack potentially powerful predictors, or the project begins to stall and lose momentum.
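To make that data-prep complexity concrete, here is a minimal sketch of the kind of flattening typically involved, assuming a CRM opportunity extract plus two hypothetical siloed sources (an ERP invoice feed and a support-ticket export). Every file name, column, and join key below is illustrative, not a prescribed schema. The key point is that Einstein Discovery trains on a single flat table with one row per observation and one outcome column, so multi-row silo data has to be aggregated up to the right grain before it can join on.

```python
import pandas as pd

# Hypothetical extracts: one CRM source and two siloed sources.
# All file names, columns, and join keys are illustrative assumptions.
opps = pd.read_csv("crm_opportunities.csv")   # one row per opportunity
invoices = pd.read_csv("erp_invoices.csv")    # many rows per account
tickets = pd.read_csv("support_tickets.csv")  # many rows per account

# Aggregate the many-row silo data up to the account grain so it
# joins cleanly onto the opportunity rows.
invoice_features = invoices.groupby("account_id").agg(
    total_invoiced=("amount", "sum"),
    late_payments=("days_late", lambda s: (s > 0).sum()),
)
ticket_features = tickets.groupby("account_id").agg(
    open_tickets=("status", lambda s: (s == "Open").sum()),
)

# Build the single flat training table: one row per opportunity,
# predictor columns from every source, and one outcome column.
training = (
    opps.merge(invoice_features, on="account_id", how="left")
        .merge(ticket_features, on="account_id", how="left")
        .fillna({"total_invoiced": 0, "late_payments": 0, "open_tickets": 0})
)
training["won"] = (training["stage"] == "Closed Won").astype(int)  # outcome
training.to_csv("ed_training_dataset.csv", index=False)
```

Even in this toy version, most of the effort sits in decisions the code only hints at: which grain to aggregate to, how to handle accounts with no silo records, and which columns are trustworthy enough to use as predictors. Multiply that across every siloed system and the data-prep timeline starts to make sense.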