Putting Groceries in the Trunk

E-COMMERCE PURCHASING BEHAVIOR ANALYSIS

Purpose and Context

This project analyzes a dataset from Instacart Grocery Basket, an online grocery store. It was completed as part of a Data Analytics course with Career Foundry, using Python to derive insights.

Objective

Analyze a dataset from Instacart Grocery Basket, an online grocery store, to derive insights and suggest strategies for better customer segmentation.
Offer insights that help the marketing team target diverse customer segments with relevant campaigns.

Period

September 2023

Data
  • Various open-source datasets from Instacart, including customers, orders, products, and departments.

  • Final dataset: 32,404,859 rows × 36 columns

Skills
  • Conducting analysis using Python

  • Data cleaning

  • Data wrangling & subsetting

  • Descriptive statistics analysis

  • Data consistency checks

  • Data combining & aggregating

  • Deriving new variables

  • Data visualization with Python

Tools

Language:      Python
Libraries:     Pandas, NumPy, SciPy, Seaborn, Matplotlib
Applications:  Jupyter Notebook, Excel

Population flow
image.png

The grey boxes in the first row represent the original data sets. The second row of colored boxes represents the data sets after manipulation, such as removing missing values and duplicates. The third row of the darkest colored boxes represents the merges between the datasets. In the end, the final dataset is in the yellow box. This provides a visual overview of how the data flows throughout the data consistency checks.
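The cleaning and merging steps in this flow can be sketched roughly as follows. This is a minimal illustration on toy data, not the project's actual notebook code: the table contents and the column names (`order_id`, `user_id`, `product_id`, `product_name`, `prices`) are assumptions made for the example.

```python
import pandas as pd

# Toy stand-ins for two of the original data sets (column names are assumed).
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3],
    "user_id": [10, 11, 11, 12],
    "product_id": [100, 101, 101, 102],
})
products = pd.DataFrame({
    "product_id": [100, 101, 102, 103],
    "product_name": ["Bananas", None, "Milk", "Bread"],
    "prices": [1.2, 3.5, 2.8, 4.0],
})

# Consistency checks: drop exact duplicate rows and rows missing a product name.
orders_clean = orders.drop_duplicates()
products_clean = products.dropna(subset=["product_name"])

# Merge the cleaned tables; the indicator column records where each row matched.
merged = orders_clean.merge(
    products_clean, on="product_id", how="inner", indicator=True
)
print(merged)
```

Checking the row counts before and after each step is one simple way to reproduce the numbers shown in a population flow chart.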

Key Insights

Here are some highlighted key insights I derived from the data.

For a comprehensive analysis, please refer to the Excel file.

The busiest days of the week and hours of the day

image.png

Orders peak on Saturday and Sunday, while Tuesday and Wednesday are the least busy days. The busiest hours are from 10 AM to 4 PM, and the quietest period runs from 11 PM to 7 AM. The marketing team can schedule ads and promotions during off-peak periods to lift demand when order volume is low.
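Counts like these can be obtained with a simple frequency table. The sketch below uses toy data and assumes integer-coded columns named `order_dow` (day of week) and `order_hour_of_day`; both the names and the values are illustrative, and which integer corresponds to which weekday depends on the dataset's encoding.

```python
import pandas as pd

# Toy order records (assumed column names, made-up values).
orders = pd.DataFrame({
    "order_dow": [0, 0, 0, 1, 1, 3, 4],
    "order_hour_of_day": [10, 11, 10, 14, 10, 23, 3],
})

# Number of orders per day of week and per hour, sorted by day/hour.
orders_per_day = orders["order_dow"].value_counts().sort_index()
orders_per_hour = orders["order_hour_of_day"].value_counts().sort_index()

# Day code and hour with the most orders in this toy sample.
busiest_day = orders_per_day.idxmax()
busiest_hour = orders_per_hour.idxmax()
print(busiest_day, busiest_hour)
```

Plotting `orders_per_day` and `orders_per_hour` as bar charts would give a view similar to the figure above.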

The ordering behaviors among different types of customers

image.png

Regular customers (10 to 40 orders) show the highest order frequency, followed by loyal and new customers. Order frequency distribution among departments remains consistent across loyalty statuses, with regular customers leading in every department, followed by loyal customers.

The marketing team can convert regular customers into loyal ones by targeting them with tailored campaigns. Analyzing campaign launch timings could offer insights into their effectiveness, particularly by assessing how new customers transition to regular ones and regular customers become loyal during these campaigns.
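A loyalty flag of this kind could be derived from each user's highest order number. The sketch below is an assumption-laden illustration: the column names (`user_id`, `order_number`) are hypothetical, and the exact boundaries (fewer than 10 orders as new, 10 to 40 as regular, more than 40 as loyal) are one reading of the "10 to 40 orders" range stated above.

```python
import pandas as pd

# Toy order history (assumed column names, made-up values).
orders = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "order_number": [1, 5, 12, 25, 41],
})

# Highest order number per user, repeated on every row of that user.
orders["max_order"] = orders.groupby("user_id")["order_number"].transform("max")


def loyalty_flag(max_order: int) -> str:
    """Map a user's maximum order number to a loyalty label (assumed cut-offs)."""
    if max_order > 40:
        return "Loyal customer"
    if max_order >= 10:
        return "Regular customer"
    return "New customer"


orders["loyalty_flag"] = orders["max_order"].apply(loyalty_flag)
print(orders)
```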

I generated a spending flag for each user based on the mean price of the products across all of their orders. Users are flagged as "Low spenders" if their mean product price is below $10 and as "High spenders" if it is $10 or higher. These flags can be used to tailor marketing campaigns to each spender segment. Regular customers make up the largest group among both low and high spenders.
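As a rough sketch of the spending flag, assuming hypothetical columns `user_id` and `prices` and toy values, the $10 threshold can be applied to each user's mean product price like this:

```python
import numpy as np
import pandas as pd

# Toy purchase records (assumed column names, made-up values).
orders = pd.DataFrame({
    "user_id": [1, 1, 2, 2],
    "prices": [4.0, 8.0, 9.0, 11.0],
})

# Mean product price per user, repeated on every row of that user.
orders["mean_price"] = orders.groupby("user_id")["prices"].transform("mean")

# Below $10 is a low spender; $10 or higher is a high spender.
orders["spending_flag"] = np.where(
    orders["mean_price"] < 10, "Low spender", "High spender"
)
print(orders)
```

User 2 has a mean price of exactly $10 and therefore lands in the "High spender" group, matching the "$10 or higher" rule.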

Ordering habits based on demographic information

image.png

The majority of frequent customers are married with dependents, followed by those who are single with no dependents. Because married customers place the most orders, the marketing team can craft targeted ads for this group, which spans ages from the early 20s to 80 and falls within the middle-income range.
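One way to compare order counts across such profiles is to group by family status and a dependents indicator. The example below is only a sketch: the column names (`fam_status`, `n_dependants`, `order_id`) and the data are hypothetical.

```python
import pandas as pd

# Toy merged customer/order records (assumed column names, made-up values).
customer_orders = pd.DataFrame({
    "fam_status": ["married", "married", "single", "married", "single"],
    "n_dependants": [2, 1, 0, 3, 0],
    "order_id": [1, 2, 3, 4, 5],
})

# Derive a simple indicator: does the customer have any dependents?
customer_orders["has_dependants"] = customer_orders["n_dependants"] > 0

# Count distinct orders per profile, largest group first.
orders_by_profile = (
    customer_orders
    .groupby(["fam_status", "has_dependants"])["order_id"]
    .nunique()
    .sort_values(ascending=False)
)
print(orders_by_profile)
```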

Munich, Germany

©2023 by apinya-b.
