Customer Personality Analysis: R Studio Logistic Regression

OBJECTIVE:

Determine which variables have an impact on campaign acceptance.

Choose the best logistic regression model.

Y variable = response (campaign number 6)

X variables = campaign 1-5, income, recency, etc.

Run logistic regression
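
One possible way to approach the model choice above is to compare nested candidate models by AIC and a likelihood-ratio test. The sketch below uses simulated toy data in place of the Kaggle file. The coefficients used to simulate Response and the two candidate formulas are assumptions made for illustration only.

```r
# Toy data standing in for the Kaggle file; column names follow the data dictionary.
set.seed(1)
n <- 500
toy <- data.frame(
  Income       = rnorm(n, mean = 52000, sd = 15000),
  Recency      = sample(0:99, n, replace = TRUE),
  AcceptedCmp1 = rbinom(n, 1, 0.1)
)

# Simulated outcome: depends on Recency and AcceptedCmp1 only (toy assumption).
lin <- -1.5 - 0.02 * toy$Recency + 1.5 * toy$AcceptedCmp1
toy$Response <- rbinom(n, 1, plogis(lin))

# Two nested candidate models (hypothetical choices of predictors).
m_small <- glm(Response ~ Recency, data = toy, family = binomial)
m_large <- glm(Response ~ Recency + AcceptedCmp1 + I(Income / 1000),
               data = toy, family = binomial)

# Lower AIC is preferred; the likelihood-ratio test checks whether
# the extra terms improve the fit beyond what chance would explain.
aic_table <- AIC(m_small, m_large)
lr_test   <- anova(m_small, m_large, test = "Chisq")
print(aic_table)
print(lr_test)
```

Income is divided by 1000 inside the formula only to keep the coefficient on a readable scale; it does not change the fit.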

Link: https://www.kaggle.com/imakash3011/customer-personality-analysis

ASSIGNMENT (R STUDIO) OUTLINE

Data source?

Customer Personality Analysis:

· Run Logistic regression

· Y variable = response (campaign number 6)

· X variables = campaign 1-5, income, recency, etc.

OBJECTIVE:

Determine which variables have an impact on campaign acceptance.

· Which people are likely to accept a campaign?

1. Research question

Which customers should we send another campaign to?

3. Model

1. What is the outcome of interest (Y variable)?

   1. Campaign acceptance

2. What are the covariates or predictors (X variables) you plan on including in your model?

   1. Yes: Income, Recency, Complain, NumDealsPurchases (purchases made with a discount), NumWebVisitsMonth

   2. Maybe: Year_Birth, Education, Marital_Status, Kidhome/Teenhome, Dt_Customer

      1. Year_Birth, Education, and Marital_Status may be strongly correlated with Income. Could this cause bias?
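
The correlation question above can be checked before fitting. As a general point, correlated predictors that are all included in the model tend to inflate standard errors (multicollinearity) rather than bias the estimates; bias is more of a concern when a correlated variable is left out. A minimal sketch, assuming simulated data in which income rises with age; the variance inflation factor (VIF) is computed by hand as 1 / (1 - R^2).

```r
# Toy data: the link between Year_Birth and Income is simulated (assumption).
set.seed(2)
n <- 400
toy <- data.frame(Year_Birth = sample(1945:1995, n, replace = TRUE))
toy$Income  <- 40000 + 600 * (2000 - toy$Year_Birth) + rnorm(n, sd = 8000)
toy$Recency <- sample(0:99, n, replace = TRUE)

# Pairwise correlations among the candidate predictors.
print(round(cor(toy), 2))

# VIF for each predictor: regress it on the others and use 1 / (1 - R^2).
vif <- sapply(names(toy), function(v) {
  others <- setdiff(names(toy), v)
  r2 <- summary(lm(reformulate(others, response = v), data = toy))$r.squared
  1 / (1 - r2)
})
print(round(vif, 2))
```

A VIF close to 1 means the predictor is nearly unrelated to the others; larger values signal overlap. Categorical variables such as Education and Marital_Status would need to be encoded first.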

DATA CLEANING:

1. Create a variable for whether a customer accepted any campaign at all

2. Create a GLM model to predict overall acceptance based on

3. Compute the predicted acceptance probability
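
The three steps above can be sketched as follows. The notes do not say which predictors go into the GLM, so Income and Recency are used here purely as placeholders. Counting Response together with AcceptedCmp1 to AcceptedCmp5 as "any campaign" is also an assumption, and the data are simulated.

```r
# Toy data standing in for the Kaggle file.
set.seed(3)
n <- 500
toy <- data.frame(
  Income  = rnorm(n, mean = 52000, sd = 15000),
  Recency = sample(0:99, n, replace = TRUE)
)
cmp_cols <- c(paste0("AcceptedCmp", 1:5), "Response")
# Simulated 0/1 acceptance flags; recent buyers accept more often (toy assumption).
for (col in cmp_cols) {
  toy[[col]] <- rbinom(n, 1, plogis(-2 - 0.02 * toy$Recency))
}

# Step 1: 1 if the customer accepted at least one campaign, 0 otherwise.
toy$AcceptedAny <- as.integer(rowSums(toy[, cmp_cols]) > 0)

# Step 2: logistic regression (GLM with binomial family, logit link).
fit <- glm(AcceptedAny ~ I(Income / 1000) + Recency,
           data = toy, family = binomial)

# Step 3: predicted acceptance probability for each customer.
toy$p_accept <- predict(fit, type = "response")

# Customers with the highest predicted probability first.
print(head(toy[order(-toy$p_accept), c("Recency", "AcceptedAny", "p_accept")]))
```

Sorting by p_accept gives a simple ranking of which customers to target next.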

Analysis to Perform:

1. Data Cleaning/Preparation:

   1. Remove irrelevant columns?

   2. Combine purchases into one variable (web, catalog, store)

   3. Combine products purchased into one variable? (Amount spent vs. number of purchases: does this difference matter?)

   4. Change education into a numerical variable (approximate years of education, or a 1-x scale?). Only needed if education is used in the analysis.

   5. Research average campaign cost
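
A minimal sketch of preparation steps 2 to 4, using three made-up rows. The new column names (TotalPurchases, TotalSpent, EducationLevel) are hypothetical, and the education labels and their ordering are assumptions that should be checked against the actual values in the file.

```r
# Three made-up rows with a subset of the columns from the data dictionary.
toy <- data.frame(
  NumWebPurchases     = c(8, 1, 2),
  NumCatalogPurchases = c(10, 1, 0),
  NumStorePurchases   = c(4, 2, 4),
  MntWines            = c(635, 11, 426),
  MntMeatProducts     = c(546, 6, 127),
  Education           = c("Graduation", "PhD", "Basic"),
  stringsAsFactors    = FALSE
)

# Step 2: total number of purchases across web, catalog, and store.
purchase_cols <- c("NumWebPurchases", "NumCatalogPurchases", "NumStorePurchases")
toy$TotalPurchases <- unname(rowSums(toy[, purchase_cols]))

# Step 3: total amount spent across all Mnt* product columns.
mnt_cols <- grep("^Mnt", names(toy), value = TRUE)
toy$TotalSpent <- unname(rowSums(toy[, mnt_cols, drop = FALSE]))

# Step 4: education on a 1-x scale (labels and ordering are assumed).
edu_levels <- c("Basic", "2n Cycle", "Graduation", "Master", "PhD")
toy$EducationLevel <- as.integer(factor(toy$Education, levels = edu_levels))

print(toy[, c("TotalPurchases", "TotalSpent", "Education", "EducationLevel")])
```

Keeping TotalPurchases (a count) and TotalSpent (an amount) as separate variables leaves the amount vs. number question open for the modelling stage.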

Analysis:

Perform logistic regression
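
To see which variables have an impact on acceptance, one option is to read the coefficient table and convert the estimates to odds ratios. A minimal sketch on simulated data; the predictors chosen, the simulated effect sizes, and the Wald-type 95% intervals (estimate plus or minus 1.96 standard errors) are assumptions of this example, not results from the Kaggle data.

```r
# Toy data; Response is simulated from Recency and NumWebVisitsMonth only.
set.seed(4)
n <- 600
toy <- data.frame(
  Recency           = sample(0:99, n, replace = TRUE),
  NumWebVisitsMonth = rpois(n, 5),
  Complain          = rbinom(n, 1, 0.1)
)
lin <- -1 - 0.03 * toy$Recency + 0.15 * toy$NumWebVisitsMonth
toy$Response <- rbinom(n, 1, plogis(lin))

fit <- glm(Response ~ Recency + NumWebVisitsMonth + Complain,
           data = toy, family = binomial)

# Coefficient table: estimate, standard error, z value, p-value.
coefs <- summary(fit)$coefficients

# Odds ratios with approximate 95% Wald intervals.
odds <- data.frame(
  odds_ratio = exp(coefs[, "Estimate"]),
  ci_low     = exp(coefs[, "Estimate"] - 1.96 * coefs[, "Std. Error"]),
  ci_high    = exp(coefs[, "Estimate"] + 1.96 * coefs[, "Std. Error"]),
  p_value    = coefs[, "Pr(>|z|)"]
)
print(round(odds, 3))
```

An odds ratio below 1 means higher values of the variable go with lower odds of acceptance, and an interval that excludes 1 suggests the variable matters.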

Visualization:

Visualize regression data
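
One simple way to visualize the regression is to plot the fitted probability curve over the observed 0/1 outcomes. A sketch using base R graphics and simulated data; plotting against Recency alone is an assumption made to keep the example to one predictor.

```r
# Toy data: acceptance becomes less likely as Recency grows (toy assumption).
set.seed(5)
n <- 500
toy <- data.frame(Recency = sample(0:99, n, replace = TRUE))
toy$Response <- rbinom(n, 1, plogis(-0.5 - 0.03 * toy$Recency))

fit <- glm(Response ~ Recency, data = toy, family = binomial)

# Predicted probability on a grid of Recency values.
grid <- data.frame(Recency = 0:99)
grid$p_hat <- predict(fit, newdata = grid, type = "response")

# Jittered 0/1 outcomes as points, fitted curve as a line.
plot(jitter(toy$Recency), jitter(toy$Response, amount = 0.03),
     pch = 16, col = "grey60",
     xlab = "Recency (days since last purchase)",
     ylab = "Response / predicted probability")
lines(grid$Recency, grid$p_hat, lwd = 2)
```

With several predictors, the same idea works by varying one variable over a grid while holding the others at typical values.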

Data Information:

● Variables:

People

● ID: Customer’s unique identifier

● Year_Birth: Customer’s birth year

● Education: Customer’s education level

● Marital_Status: Customer’s marital status

● Income: Customer’s yearly household income

● Kidhome: Number of children in customer’s household

● Teenhome: Number of teenagers in customer’s household

● Dt_Customer: Date of customer’s enrollment with the company

● Recency: Number of days since customer’s last purchase

● Complain: 1 if customer complained in the last 2 years, 0 otherwise

Products

● MntWines: Amount spent on wine in last 2 years

● MntFruits: Amount spent on fruits in last 2 years

● MntMeatProducts: Amount spent on meat in last 2 years

● MntFishProducts: Amount spent on fish in last 2 years

● MntSweetProducts: Amount spent on sweets in last 2 years

● MntGoldProds: Amount spent on gold in last 2 years

Promotion

● NumDealsPurchases: Number of purchases made with a discount

● AcceptedCmp1: 1 if customer accepted the offer in the 1st campaign, 0 otherwise

● AcceptedCmp2: 1 if customer accepted the offer in the 2nd campaign, 0 otherwise

● AcceptedCmp3: 1 if customer accepted the offer in the 3rd campaign, 0 otherwise

● AcceptedCmp4: 1 if customer accepted the offer in the 4th campaign, 0 otherwise

● AcceptedCmp5: 1 if customer accepted the offer in the 5th campaign, 0 otherwise

● Response: 1 if customer accepted the offer in the last campaign, 0 otherwise

Place

● NumWebPurchases: Number of purchases made through the company’s web site

● NumCatalogPurchases: Number of purchases made using a catalogue

● NumStorePurchases: Number of purchases made directly in stores

● NumWebVisitsMonth: Number of visits to company’s web site in the last month
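
A minimal sketch for reading the file into R with the variables listed above. The helper name load_campaign_data is hypothetical, and the tab separator and the day-month-year format of Dt_Customer are assumptions to verify against the downloaded file. The demo writes a two-row temporary file so the example runs without the real data.

```r
# Hypothetical helper: read the campaign file and parse the enrollment date.
load_campaign_data <- function(path, sep = "\t") {
  df <- read.csv(path, sep = sep, stringsAsFactors = FALSE)
  # Assumed date format: day-month-year, e.g. 04-09-2012.
  df$Dt_Customer <- as.Date(df$Dt_Customer, format = "%d-%m-%Y")
  df
}

# Demo on a tiny made-up file (second row has a missing Income).
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "ID\tYear_Birth\tEducation\tIncome\tDt_Customer\tRecency\tResponse",
  "1\t1970\tMaster\t52000\t04-09-2012\t58\t1",
  "2\t1985\tPhD\t\t21-08-2013\t26\t0"
), tmp)

customers <- load_campaign_data(tmp)
str(customers)

# Missing values per column, worth checking before fitting a model.
print(colSums(is.na(customers)))
```

By default glm() drops rows with missing predictors, so it helps to know how many there are before fitting.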