Why I am an Einstein Analytics and Discovery enthusiast
In the old days
Back in 2009, I took part in a BI project that used the tool shown in the picture below. At that time, the data-mining algorithms were laid out like dishes on a plate, and you had to pick each one manually, case by case, according to the user's requirements: classification, clustering, or association. Deep knowledge of data mining and statistics was therefore a prerequisite for using the tool well.
Nowadays, BI tools are significantly improved
When I first tried Einstein Analytics and Discovery, I knew it was a tool I should learn well. The reasons: the visualizations are varied, the data transformation is very flexible, and mining and prediction jobs are much easier to do. Here are a few samples I built in my own dev org, all picked from my favorite EA features.
(1) Dynamic Binding
As shown below, the number of records in the result chart and the average of total acres change depending on the selection in the filter (top left). This demo uses both selection binding and result binding.
(2) Trend analysis using a compare table
The two pictures show the month-over-month difference in amount.
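For readers who want to check such a chart by hand, here is a minimal Python sketch of the same month-over-month calculation. The month labels and amounts are made-up illustrative values, not data from my org, and the function name is my own.

```python
# Hypothetical monthly totals; the values are illustrative only.
monthly_amount = [
    ("2019-01", 120.0),
    ("2019-02", 150.0),
    ("2019-03", 135.0),
]


def month_over_month(rows):
    """Return (month, amount, change versus the previous month) per row.

    The first month has no previous month, so its change is None.
    """
    result = []
    previous = None
    for month, amount in rows:
        change = None if previous is None else amount - previous
        result.append((month, amount, change))
        previous = amount
    return result


for month, amount, change in month_over_month(monthly_amount):
    print(month, amount, change)
```

The compare table does the same subtraction for you, row against previous row, without any code.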
(3) Timeline chart with two dimensions
This picture explains a lot. E.g., you can quickly find which cause goes up or down between years and from state to state. Data source: https://catalog.data.gov
(4) Custom Map using GeoJSON
The interesting thing is that you can create a new map whenever you need one. Here is a map of Japan showing the census distribution. Data source: https://www.data.go.jp
(5) Prediction using timeseries
Salesforce Einstein Analytics recently introduced a new prediction function called timeseries. You can display multiple lines by partitioning the dataset by quarter or by month. Cool!
(6) Save time on Python (in many cases)
Take a simple linear model in one variable as an example. You can quickly calculate the slope, the intercept and R squared by using the functions regr_slope, regr_intercept and regr_r2 in a compare table.
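To show what those three functions save you from writing, here is a rough Python sketch of an ordinary least-squares fit in one variable. It is my own illustration of the standard formulas, not the code Salesforce runs, and the sample points are made up.

```python
def simple_linear_regression(xs, ys):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-variable fit, R squared is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Illustrative points that lie exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept, r_squared = simple_linear_regression(xs, ys)
print(slope, intercept, r_squared)
```

In the compare table the same three numbers come from one formula column each, which is why I rarely open a notebook for this kind of check any more.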
(7) The flexibility to join different datasets
The goal of this demo is to count the number of cases for each account. The left step uses SAQL directly; the right step is the output of a dataflow. So there are many ways to do complex join work. Joins that are not too complicated can also be done in a Recipe, and you can even use the connected data method.
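Whichever tool does the join, the logic is the same: group the cases by their account key, then look up each account. A small Python sketch of that idea follows. The records and the field names (Id, Name, AccountId) are hypothetical sample data, not the datasets used in the demo.

```python
from collections import Counter

# Hypothetical sample records; field names are assumptions for illustration.
accounts = [
    {"Id": "001A", "Name": "Acme"},
    {"Id": "001B", "Name": "Globex"},
    {"Id": "001C", "Name": "Initech"},
]
cases = [
    {"Id": "500-1", "AccountId": "001A"},
    {"Id": "500-2", "AccountId": "001A"},
    {"Id": "500-3", "AccountId": "001B"},
]


def count_cases_per_account(accounts, cases):
    """Left-join accounts to cases and count the cases for each account.

    Accounts without any case are kept with a count of 0.
    """
    counts = Counter(case["AccountId"] for case in cases)
    return {account["Name"]: counts.get(account["Id"], 0) for account in accounts}


print(count_cases_per_account(accounts, cases))
```

Keeping accounts with zero cases is a design choice; an inner join would drop them, so it is worth checking which behavior each join method gives you.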
(8) Fast modification in mobile
It is convenient to change the chart type in Salesforce mobile analytics.
Next part, the Einstein Discovery.
About Einstein Discovery
It is difficult to understand Einstein Discovery thoroughly because the processing is invisible to us. I would like to share some basic methodology after reading the related content(*). First, Einstein Discovery is supervised machine learning: the model learns from historical records in which the outcome is already known, and in many cases human intervention is still required. The most important question is how to measure model quality, in other words, how accurate the model created by the story is. I think we can build a better model if we know the fundamentals. Next, I want to dig a little deeper into this page: https://help.salesforce.com/articleView?id=bi_edd_model_about.htm&type=5
(1) Classification - logistic regression type
Salesforce uses the ROC (Receiver Operating Characteristic) curve for this type of prediction. If we read the Wikipedia article Receiver_operating_characteristic, or the post "ROC Area-Under-the-Curve Explained", we learn why, in ROC space, points above the diagonal represent good classification results (better than random) and points below it represent bad results (worse than random). Therefore the AUC (Area Under the Curve) metric is a very important one, and it is listed on the top panel. The cool thing here is that we can change the threshold to see how the other model metrics respond.
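To make AUC and the threshold less abstract, here is a minimal Python sketch. It computes AUC as the share of positive/negative pairs in which the positive record gets the higher score (ties count half), and it computes the true and false positive rates at a chosen threshold. The labels and scores are made-up values, and this is only my illustration of the textbook definitions, not how Einstein Discovery calculates them internally.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))


def rates_at_threshold(labels, scores, threshold):
    """Return (true positive rate, false positive rate) at a threshold.

    A record is predicted positive when its score is >= threshold.
    """
    pairs = list(zip(labels, scores))
    tp = sum(1 for label, s in pairs if label == 1 and s >= threshold)
    fn = sum(1 for label, s in pairs if label == 1 and s < threshold)
    fp = sum(1 for label, s in pairs if label == 0 and s >= threshold)
    tn = sum(1 for label, s in pairs if label == 0 and s < threshold)
    return tp / (tp + fn), fp / (fp + tn)


# Illustrative outcomes (1 = positive) and predicted scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
print(roc_auc(labels, scores))
print(rates_at_threshold(labels, scores, 0.5))
print(rates_at_threshold(labels, scores, 0.75))
```

Each threshold gives one point on the ROC curve, while AUC summarizes the whole curve in one number: 0.5 is what random guessing would give and 1.0 is a perfect ranking.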
(2) Continuous numeric - linear regression type
Salesforce uses piecewise linear ridge regression for this type of prediction. I will explain it someday in another blog post; for now, let me focus on the metrics MAE and R squared. MAE (mean absolute error) is the average of the absolute differences between the predicted values and the observed values. R squared is the share of the variance in the observed values that the model explains. There are many well-written blogs explaining these concepts, e.g. "Choosing the Right Metric for Evaluating Machine Learning Models - Part 1".
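Both metrics are easy to compute by hand on a small sample, which helps when reading the numbers on the model metrics page. The Python sketch below uses the standard definitions with made-up observed and predicted values; it is an illustration, not Salesforce's implementation.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between predicted and observed values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Illustrative values only.
observed = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 9.0]
print(mean_absolute_error(observed, predicted))
print(r_squared(observed, predicted))
```

MAE is in the same unit as the outcome, so it is easy to explain to business users, while R squared is unitless and makes it easier to compare models on different outcomes.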
Other metrics, such as coefficients, are also very important. One example: in finance, the coefficient of variation allows investors to determine how much risk is assumed in comparison to the amount of return expected from an investment.
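The coefficient of variation is simply the standard deviation divided by the mean. Here is a short Python sketch with made-up return figures; I use the population standard deviation here, which is an assumption on my side, since some sources use the sample standard deviation instead.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


# Hypothetical yearly returns in percent; illustrative values only.
returns = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(coefficient_of_variation(returns))
```

A lower value means less volatility per unit of expected return, which is why investors use it to compare investments.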
In short, the power, the convenience, the amazing speed, the wonderfully aesthetic data visualizations, the smart insights... all of these are reasons why I love this innovative product.
Other helpful materials: