Predicting Marketing Campaign with R

In my last blog I created a mechanism to fetch data from Salesforce using rJava and SOQL. In this blog I am going to use that mechanism to fetch ad campaign data from Salesforce and predict future ad campaign sales using R.

Let us assume that Salesforce has campaign data for the last eight quarters. This data consists of the total sales generated by Newspaper, TV and Online ad campaigns and the associated expenditure, as follows:

  Sales Newspaper   TV Online
1 16850      1000  500   1500
2 12010       500  500    500
3 14740      2000  500    500
4 13890      1000 1000   1000
5 12950      1000  500    500
6 15640       500 1000   1000
7 14960      1000 1000   1000
8 13630       500 1500    500

Thus, quarter 1 indicates that $1000, $500 and $1500 were spent on Newspaper, TV and Online ad campaigns respectively, and total sales during that quarter were $16,850.
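For readers who want to follow along without a Salesforce connection, the table above can be typed directly into R. This is only a sketch: the data frame name Campaigndata matches the one used later in this post, but here the values are entered by hand rather than fetched.

```r
# Hand-entered copy of the eight quarters shown in the table above
# (an offline stand-in for the data fetched from Salesforce).
Campaigndata <- data.frame(
  Sales     = c(16850, 12010, 14740, 13890, 12950, 15640, 14960, 13630),
  Newspaper = c(1000,  500, 2000, 1000, 1000,  500, 1000,  500),
  TV        = c( 500,  500,  500, 1000,  500, 1000, 1000, 1500),
  Online    = c(1500,  500,  500, 1000,  500, 1000, 1000,  500)
)

# Quick structural check: 8 observations of 4 numeric variables
str(Campaigndata)
```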

The first step is to find out if there is any relationship between sales and advertising expenditure. The tool I am going to use is Regression Analysis. In order to perform regression analysis, I am going to fetch the data using rJava as follows:


library(rJava)  # Load the rJava library
.jinit()        # Initialize the JVM

sfObj <- .jnew("SalesforceHelper") # Instantiate the Java helper object
CampaignVector <- sfObj$queryObject("SELECT Sales__c,Newspaper__c,TV__c,Online__c from CampaignData__c") # Run the SOQL query

Campaigndata <- getSFDataFrame(CampaignVector) # Convert vector to R data frame

(For more information on how to integrate R and Salesforce, please refer to my previous blog at: http://www.r-bloggers.com/r-and-salesforce/
or https://arungaikwad.wordpress.com/2012/02/25/r-and-salesforce/)

Let us ask R to perform regression analysis on Campaigndata using lm() function as follows:

attach(Campaigndata) 
Campaignmodel<-lm(Sales~Newspaper+TV+Online) # perform regression
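As an alternative to attach(), the data frame can be passed to lm() through its data argument, which avoids leaving columns on the search path. The sketch below assumes the fetch returned exactly the eight rows shown earlier and enters them by hand; note that with this form the Call line in the printed output also shows data = Campaigndata.

```r
# Hand-entered stand-in for the data fetched from Salesforce
Campaigndata <- data.frame(
  Sales     = c(16850, 12010, 14740, 13890, 12950, 15640, 14960, 13630),
  Newspaper = c(1000,  500, 2000, 1000, 1000,  500, 1000,  500),
  TV        = c( 500,  500,  500, 1000,  500, 1000, 1000, 1500),
  Online    = c(1500,  500,  500, 1000,  500, 1000, 1000,  500)
)

# Same regression, but the columns are looked up in Campaigndata
# instead of relying on attach()
Campaignmodel <- lm(Sales ~ Newspaper + TV + Online, data = Campaigndata)
print(Campaignmodel)
```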

After performing the regression analysis, R gives me the relationship in the form of the following equation:

Total Sales = (sales with no advertising) + (newspaper contribution per dollar*newspaper expenditure)+(TV contribution per dollar*TV expenditure)+(Online contribution per dollar*Online expenditure)

Sales with no advertising is called the intercept, while each per-dollar contribution is called a coefficient.
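In symbols, the same relationship can be written as below. The symbol names are notation chosen here for illustration and do not appear in the R output: \beta_0 is the intercept, \beta_N, \beta_{TV} and \beta_O are the per-dollar coefficients, and \varepsilon is the part of sales the model does not explain.

```latex
\[
\text{Sales} = \beta_0
  + \beta_N \cdot \text{Newspaper}
  + \beta_{TV} \cdot \text{TV}
  + \beta_O \cdot \text{Online}
  + \varepsilon
\]
```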

R also reports how strong this relationship is, through R^2 (R squared).
A campaign manager will be interested in the per-dollar contribution of each advertising medium. In other words, how much sales will be generated for each dollar of expenditure.

Let us find out this information from our model:

> Campaignmodel

Call:
lm(formula = Sales ~ Newspaper + TV + Online)

Coefficients:
(Intercept)    Newspaper           TV       Online  
  9561.4286       1.2465       0.9193       3.5161  
>

Sales without advertising (Rounded) = $9562
Newspaper returns = $1.25 per $1
TV returns = $0.92 per $1
Online returns = $3.52 per $1 of expenditure. (Clearly a winner)
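The per-dollar returns can also be pulled out of the model programmatically with coef() rather than read off the printout. A minimal sketch, assuming the eight quarters shown earlier are entered by hand:

```r
# Hand-entered stand-in for the data fetched from Salesforce
Campaigndata <- data.frame(
  Sales     = c(16850, 12010, 14740, 13890, 12950, 15640, 14960, 13630),
  Newspaper = c(1000,  500, 2000, 1000, 1000,  500, 1000,  500),
  TV        = c( 500,  500,  500, 1000,  500, 1000, 1000, 1500),
  Online    = c(1500,  500,  500, 1000,  500, 1000, 1000,  500)
)
Campaignmodel <- lm(Sales ~ Newspaper + TV + Online, data = Campaigndata)

# Drop the intercept, round to cents, and rank media by return per dollar
returns <- round(coef(Campaignmodel)[-1], 2)
ranked  <- sort(returns, decreasing = TRUE)
print(ranked)
```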

But how strong is the model? Let us find out

> summary(Campaignmodel)

Call:
lm(formula = Sales ~ Newspaper + TV + Online)

Residuals:
       1        2        3        4        5        6        7        8 
  308.32  -392.36   467.95 -1353.29   -75.59  1019.94  -283.29   308.32 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept) 9561.4286  1700.5869   5.622  0.00492 **
Newspaper      1.2465     0.8100   1.539  0.19865   
TV             0.9193     1.0766   0.854  0.44126   
Online         3.5161     0.9584   3.669  0.02141 * 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 938.2 on 4 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared: 0.7879,     Adjusted R-squared: 0.6289 
F-statistic: 4.954 on 3 and 4 DF,  p-value: 0.0781

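Look at the t values for each advertising medium. As a rule of thumb, an absolute t value above roughly 2 suggests a meaningful relationship, and the p-value in the last column makes this precise. Online advertising has the strongest relationship with sales (t = 3.669, p = 0.021), while TV has the weakest (t = 0.854, p = 0.441); neither Newspaper nor TV is statistically significant at the 5% level. The Multiple R-squared of 0.7879 means the model explains about 79% of the variation in sales, which is a reasonably good fit, although the adjusted R-squared of 0.6289 and the overall p-value of 0.0781 are a reminder that eight observations is a small sample.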
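To see how much uncertainty sits behind each coefficient, the fit statistics and 95% confidence intervals can be extracted from the model object. This is a sketch on hand-entered data, assuming the eight quarters shown earlier; an interval that contains zero means the data cannot rule out "no effect" for that medium.

```r
# Hand-entered stand-in for the data fetched from Salesforce
Campaigndata <- data.frame(
  Sales     = c(16850, 12010, 14740, 13890, 12950, 15640, 14960, 13630),
  Newspaper = c(1000,  500, 2000, 1000, 1000,  500, 1000,  500),
  TV        = c( 500,  500,  500, 1000,  500, 1000, 1000, 1500),
  Online    = c(1500,  500,  500, 1000,  500, 1000, 1000,  500)
)
Campaignmodel <- lm(Sales ~ Newspaper + TV + Online, data = Campaigndata)

# Share of the variation in Sales explained by the model
fit <- summary(Campaignmodel)
print(fit$r.squared)
print(fit$adj.r.squared)

# 95% confidence interval for the intercept and each per-dollar return
ci <- confint(Campaignmodel, level = 0.95)
print(ci)
```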

Obviously I am not expecting real world campaign or account managers to analyze R output. So I created a related custom object called Campaign_predictor__c with custom fields as follows:

Sales_without_ad__c NUMERIC initialized to 9562.00,
Newspaper_expenditure__c NUMERIC,
TV_expenditure__c NUMERIC,
Online_expenditure__c NUMERIC,
Predicted_sales__c FORMULA = (Sales_without_ad__c)+(3.52*Online_expenditure__c)+(1.25*Newspaper_expenditure__c)+(0.92*TV_expenditure__c)
Prediction_probability__c = 78

Now the managers just have to plug in the values to predict the sales. Suppose the manager has $3000 to spend on ad campaigns and, based on the model, decides to allocate $2000 to Online, $500 to Newspaper and $500 to TV. The predicted sales, from a model with an R-squared of about 0.79, are:

$9562 + (3.52*2000)+(1.25*500)+(0.92*500) = $17,687
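The same prediction can be cross-checked in R with predict(), which also returns a prediction interval as a rough error margin. This is a sketch on hand-entered data, assuming the eight quarters shown earlier; the data frame name budget is made up for this example. Because predict() uses the unrounded coefficients, its point estimate (roughly $17,677) differs slightly from a hand calculation with rounded ones. Note also that $2000 of Online spend is above the largest value in the data ($1500), so this is an extrapolation.

```r
# Hand-entered stand-in for the data fetched from Salesforce
Campaigndata <- data.frame(
  Sales     = c(16850, 12010, 14740, 13890, 12950, 15640, 14960, 13630),
  Newspaper = c(1000,  500, 2000, 1000, 1000,  500, 1000,  500),
  TV        = c( 500,  500,  500, 1000,  500, 1000, 1000, 1500),
  Online    = c(1500,  500,  500, 1000,  500, 1000, 1000,  500)
)
Campaignmodel <- lm(Sales ~ Newspaper + TV + Online, data = Campaigndata)

# Proposed allocation of the $3000 budget (hypothetical name: budget)
budget <- data.frame(Newspaper = 500, TV = 500, Online = 2000)

# Point prediction plus a 95% prediction interval (columns fit, lwr, upr)
pred <- predict(Campaignmodel, newdata = budget,
                interval = "prediction", level = 0.95)
print(pred)
```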

Thus we can move complex predictive analytics from the realm of super specialists and statisticians to marketing and sales managers using R and Salesforce.com.

16 thoughts on “Predicting Marketing Campaign with R”

hello,
nice work on the java side (previous post), but on the modeling side, your “prediction” seems quite dangerous to me…
it misses error margin, checking of lm() hypotheses (sales following a normal distribution…), and above all interaction effects between different marketing channels and with previous sales and ad expenditures (time series modeling),
furthermore, it is long known that sales do not respond to advertising in a linear way (check ADBUG model for instance)
your sales manager has some chance to be disappointed…

Arun Gaikwad says:

March 18, 2012 at 3:06 pm

Dr. Willart,
Thanks for your feedback.
The purpose of this post was to demonstrate how one can harness power of R and integrate it with Sales and Marketing.
If there is no linear relationship, then R will indicate it with t analysis.

-Arun Gaikwad

1. Theun says:
  
  March 18, 2012 at 6:48 pm
  
  You could use this linear regression model for rough descriptive purposes, but like the DrSylWil pointed out it is not good for prediction. Because in that case, shouldn’t the manager then not focus only on online advertising to optimize the advertising expenditures?

> Also Multiple R-squared of 0.7879 indicates that there is 79% probability with respect to predictability of the model.

What? Can you clarify what you mean by that sentence please?

Arun Gaikwad says:

March 19, 2012 at 9:05 pm

Please see line# 21: Multiple R-squared: 0.7879
R-square is between 0 and 1.
R-Squared is a statistical term saying how good one term is at predicting another. If R-Squared is 1.0 then given the value of one term, you can perfectly predict the value of another term. If R-Squared is 0.0, then knowing one term doesn’t help you know the other term at all. More generally, a higher value of R-Squared means that you can better predict one term from another.

1. efrique says:
  
  March 20, 2012 at 6:44 am
  
  Yes I am aware of the definition of r-squared. I asked for an explanation of your wording, which – I am sorry to be blunt – makes no sense to me.
  
  Can you explain how any of that definition relates to “there is 79% probability with respect to predictability of the model” – which I still cannot follow.
  
  79% probability of *what*?
  
  what do you mean by ‘predictability of the model’ here?

I want meeting utile info, this post has got me even more info! .

I think this is among the most important info for me. And i am glad reading your article. But wanna remark on few general things, The site style is ideal, the articles is really great : D. Good job, cheers

Intriguing blog – thank you. You often publish a interesting article. I hope to find others very soon.

Very interesting. I just started working with SFDC and also see a limitation in its analytic ability. What is your next step/goal for this R integration? Would it be possible to push information back up to SFDC and display with better graphics?

Arun Gaikwad says:

April 5, 2012 at 10:18 pm

Thanks!
Yes. The goal is to create a model in SF, send it to R for analysis, and update SF.


Marvelous blog. You consistently publish a riveting post. I wish to uncover more such in the near future.

Kudos for this post. Pretty intriguing and well penned blog. Thanks!

very interesting points you have mentioned , appreciate it for putting up.

A very interesting article and approach. I’m just starting on this journey so appreciate you sharing. Are you available for projects?

Reblogged this on deseRt scrolls.
