APINYA BOONLAKSAKUL
![Putting Groceries in the Trunk](https://static.wixstatic.com/media/11062b_454800e19f2b4dfc9aa4c75c10931ce3~mv2.jpg/v1/fill/w_611,h_393,al_c,q_80,usm_0.66_1.00_0.01,enc_avif,quality_auto/11062b_454800e19f2b4dfc9aa4c75c10931ce3~mv2.jpg)
![](https://static.wixstatic.com/media/b83767_d0ebd9593f94401f9b47983044696e97~mv2.png/v1/fill/w_231,h_30,al_c,lg_1,q_85,enc_avif,quality_auto/b83767_d0ebd9593f94401f9b47983044696e97~mv2.png)
E-COMMERCE PURCHASING BEHAVIOR ANALYSIS
Purpose and Context
This project analyzes a dataset from Instacart Grocery Basket, an online grocery store, using Python. It is part of a Data Analytics course with CareerFoundry.
Objective
Analyze a dataset from Instacart Grocery Basket, an online grocery store, to derive insights and suggest strategies for better segmentation.
Offer insights to facilitate the targeting of diverse customer segments with relevant marketing campaigns.
Period
September 2023
Data
- Various open-source datasets from Instacart, including customers, orders, products, and departments.
- Final dataset: 32,404,859 rows × 36 columns
Skills
- Conducting analysis using Python
- Data cleaning
- Data wrangling & subsetting
- Descriptive statistics analysis
- Data consistency checks
- Data combining & aggregating
- Deriving new variables
- Data visualization with Python
Tools
Language: Python
Libraries: Pandas, NumPy, SciPy, Seaborn, Matplotlib
Applications: Jupyter Notebook, Excel
Population flow
![image.png](https://static.wixstatic.com/media/b83767_ad6581ccb19d4d2fb452905cee395a4a~mv2.png/v1/fill/w_811,h_285,al_c,lg_1,q_85,enc_avif,quality_auto/b83767_ad6581ccb19d4d2fb452905cee395a4a~mv2.png)
The grey boxes in the first row represent the original data sets. The second row of colored boxes represents the data sets after manipulation, such as removing missing values and duplicates. The third row of the darkest colored boxes represents the merges between the datasets. In the end, the final dataset is in the yellow box. This provides a visual overview of how the data flows throughout the data consistency checks.
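The cleaning and merging steps in the diagram can be sketched in pandas roughly as follows. This is a minimal illustration on tiny stand-in tables, not the project's actual notebook code; the table contents and column names (`order_id`, `product_id`, `product_name`, `prices`) are assumptions chosen to resemble the Instacart files.

```python
import pandas as pd

# Tiny stand-in tables (hypothetical values); the real files are far larger.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3],
    "user_id": [10, 10, 10, 11],
    "order_number": [1, 2, 2, 1],
})
order_products = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "product_id": [100, 101, 100, 102],
})
products = pd.DataFrame({
    "product_id": [100, 101, 102, 103],
    "product_name": ["Bananas", None, "Milk", "Bread"],
    "prices": [0.5, 2.0, 3.2, 2.5],
})

# Second row of the diagram: remove duplicates and rows with missing values.
orders_clean = orders.drop_duplicates()
products_clean = products.dropna(subset=["product_name"]).drop_duplicates()

# Third row: merge the cleaned tables step by step.
orders_products = orders_clean.merge(order_products, on="order_id", how="inner")

# indicator=True adds a "_merge" column showing where each row came from,
# which is a simple consistency check on the join.
final = orders_products.merge(
    products_clean, on="product_id", how="inner", indicator=True
)

print(final.shape)
print(final["_merge"].value_counts())
```

Checking row counts after each step makes it easier to see how many records each cleaning or merging operation removes.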
Key Insights
Here are some highlighted key insights I derived from the data.
For a comprehensive analysis, please refer to the Excel file.
The busiest days of the week and hours of the day
![](https://static.wixstatic.com/media/b83767_6218e95e69554e86a25b35825b29e632~mv2.png/v1/fill/w_310,h_233,al_c,q_85,usm_0.66_1.00_0.01,enc_avif,quality_auto/b83767_6218e95e69554e86a25b35825b29e632~mv2.png)
![](https://static.wixstatic.com/media/b83767_1fa338b15f434887bc0b421f2aabd7aa~mv2.png/v1/fill/w_310,h_233,al_c,q_85,usm_0.66_1.00_0.01,enc_avif,quality_auto/b83767_1fa338b15f434887bc0b421f2aabd7aa~mv2.png)
![image.png](https://static.wixstatic.com/media/b83767_92fe668937cb4968898e607a8f4e2a39~mv2.png/v1/fill/w_64,h_83,al_c,q_85,usm_0.66_1.00_0.01,enc_avif,quality_auto/b83767_92fe668937cb4968898e607a8f4e2a39~mv2.png)
Peak activity occurs on Saturday and Sunday, with Tuesday and Wednesday being the least busy days. The busiest hours are from 10 AM to 4 PM, while the quietest period spans from 11 PM to 7 AM. The marketing team can schedule promotions or targeted advertisements during off-peak periods to make them more effective.
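Counts like these can be produced with a simple frequency table. The sketch below uses made-up rows and assumes the columns are named `order_dow` and `order_hour_of_day`, with day 0 meaning Saturday; both the names and the day encoding are assumptions and should be checked against the actual data dictionary.

```python
import pandas as pd

# Hypothetical sample of orders; values are invented for illustration.
orders = pd.DataFrame({
    "order_dow": [0, 0, 0, 1, 1, 3, 4],
    "order_hour_of_day": [10, 10, 14, 11, 10, 3, 23],
})

# Assumed encoding: 0 = Saturday, 1 = Sunday, ..., 6 = Friday.
day_names = {
    0: "Saturday", 1: "Sunday", 2: "Monday", 3: "Tuesday",
    4: "Wednesday", 5: "Thursday", 6: "Friday",
}

# Number of orders per day of the week and per hour of the day.
orders_per_day = orders["order_dow"].map(day_names).value_counts()
orders_per_hour = orders["order_hour_of_day"].value_counts().sort_index()

busiest_day = orders_per_day.idxmax()
busiest_hour = orders_per_hour.idxmax()
print(busiest_day, busiest_hour)
```

Calling `orders_per_day.plot.bar()` or `orders_per_hour.plot.bar()` in a notebook would turn either table into a bar chart.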
The ordering behaviors among different types of customers
![](https://static.wixstatic.com/media/b83767_c4f25a2ce9234f089ca7cd711f37cb99~mv2.png/v1/fill/w_371,h_368,al_c,q_85,usm_0.66_1.00_0.01,enc_avif,quality_auto/b83767_c4f25a2ce9234f089ca7cd711f37cb99~mv2.png)
![image.png](https://static.wixstatic.com/media/b83767_559db4b8f38644ab8c8982d6173f2bd3~mv2.png/v1/fill/w_85,h_100,al_c,q_85,usm_0.66_1.00_0.01,enc_avif,quality_auto/b83767_559db4b8f38644ab8c8982d6173f2bd3~mv2.png)
![](https://static.wixstatic.com/media/b83767_7c5e303c2c40455a878000dadb617d7c~mv2.png/v1/fill/w_434,h_299,al_c,q_85,usm_0.66_1.00_0.01,enc_avif,quality_auto/b83767_7c5e303c2c40455a878000dadb617d7c~mv2.png)
Regular customers (10 to 40 orders) show the highest order frequency, followed by loyal and new customers. Order frequency distribution among departments remains consistent across loyalty statuses, with regular customers leading in every department, followed by loyal customers.
The marketing team can convert regular customers into loyal ones by targeting them with tailored campaigns. Analyzing campaign launch timings could offer insights into their effectiveness, particularly by assessing how new customers transition to regular ones and regular customers become loyal during these campaigns.
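One way to derive a loyalty flag like this is to take each user's highest order number and bin it. In the sketch below, the column names (`user_id`, `order_number`, `max_order`, `loyalty_flag`) and the exact bin edges are assumptions; the text only states that regular customers have 10 to 40 orders, so how the boundary values 10 and 40 are assigned is a guess here.

```python
import pandas as pd

# Hypothetical order history: one row per order, values invented.
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3],
    "order_number": [5, 12, 45, 3, 20, 2],
})

# Highest order number reached by each user, repeated on every row.
df["max_order"] = df.groupby("user_id")["order_number"].transform("max")

# Assumed bins: up to 10 orders = new, 11 to 40 = regular, above 40 = loyal.
df["loyalty_flag"] = pd.cut(
    df["max_order"],
    bins=[0, 10, 40, float("inf")],
    labels=["New customer", "Regular customer", "Loyal customer"],
)

# Number of order rows per loyalty group.
print(df["loyalty_flag"].value_counts())
```

Because the flag is stored on every order row, counting rows per flag gives order volume per customer type rather than the number of customers in each group.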
![Step1. Create avg_spent](https://static.wixstatic.com/media/b83767_025ead84e21a4e04972e62ea9d0401e3~mv2.png/v1/fill/w_980,h_528,al_c,q_90,usm_0.66_1.00_0.01,enc_avif,quality_auto/b83767_025ead84e21a4e04972e62ea9d0401e3~mv2.png)
![Step2. Create spending_flag](https://static.wixstatic.com/media/b83767_06f4cec2aac44043a49dd7f5f793ad00~mv2.png/v1/fill/w_980,h_631,al_c,q_90,usm_0.66_1.00_0.01,enc_avif,quality_auto/b83767_06f4cec2aac44043a49dd7f5f793ad00~mv2.png)
![Spending habit by loyalty type chart](https://static.wixstatic.com/media/b83767_73d4e5cf7e444e0fa6b1a3fafa1424d6~mv2.png/v1/fill/w_638,h_453,al_c,q_85,enc_avif,quality_auto/b83767_73d4e5cf7e444e0fa6b1a3fafa1424d6~mv2.png)
I generated a spending flag for each user based on the average product price across all of their orders. Users are flagged as "Low spenders" if their mean product price is below $10 and as "High spenders" if it is $10 or higher. These flags can be used to tailor marketing campaigns to different spender segments. The results show that regular customers make up the largest group among both low and high spenders.
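The two steps (creating `avg_spent`, then `spending_flag`) might look roughly like this in pandas. The sample rows and the `prices` and `loyalty_flag` column names are assumptions for illustration; only the $10 threshold and the flag labels come from the description above.

```python
import numpy as np
import pandas as pd

# Hypothetical purchase rows: one row per product bought, values invented.
df = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "prices": [4.0, 8.0, 12.0, 15.0, 10.0],
    "loyalty_flag": ["Regular customer", "Regular customer",
                     "Loyal customer", "Loyal customer", "New customer"],
})

# Step 1: mean product price per user, repeated on every row of that user.
df["avg_spent"] = df.groupby("user_id")["prices"].transform("mean")

# Step 2: below $10 is a low spender; $10 or higher is a high spender.
df["spending_flag"] = np.where(
    df["avg_spent"] < 10, "Low spenders", "High spenders"
)

# Cross-tabulate spending habit against loyalty type.
summary = pd.crosstab(df["loyalty_flag"], df["spending_flag"])
print(summary)
```

Using `transform("mean")` keeps the result aligned with the original rows, so the flag can be assigned without a separate merge.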
Ordering habits based on demographic information
![](https://static.wixstatic.com/media/b83767_85181e55bcd34e1f97b187e51acc49ec~mv2.png/v1/fill/w_424,h_272,al_c,q_85,usm_0.66_1.00_0.01,enc_avif,quality_auto/b83767_85181e55bcd34e1f97b187e51acc49ec~mv2.png)
![](https://static.wixstatic.com/media/b83767_6e58660b1f334a8cb5f7444793fd62d1~mv2.png/v1/fill/w_388,h_272,al_c,q_85,usm_0.66_1.00_0.01,enc_avif,quality_auto/b83767_6e58660b1f334a8cb5f7444793fd62d1~mv2.png)
![image.png](https://static.wixstatic.com/media/b83767_03e2a25e940d4e6c9b1179f8df9d5108~mv2.png/v1/fill/w_91,h_122,al_c,q_85,usm_0.66_1.00_0.01,enc_avif,quality_auto/b83767_03e2a25e940d4e6c9b1179f8df9d5108~mv2.png)
![](https://static.wixstatic.com/media/b83767_ad063e141084440b9fa0d99a7b3356e7~mv2.png/v1/fill/w_448,h_267,al_c,q_85,usm_0.66_1.00_0.01,enc_avif,quality_auto/b83767_ad063e141084440b9fa0d99a7b3356e7~mv2.png)
The largest group of frequent customers is married with dependents, followed by those who are single with no dependents. Since married customers show the highest order frequency, the marketing team can craft targeted ads for this demographic, which ranges from the early 20s to 80 years old and falls within the middle-income range.
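A household profile of this kind can be built by combining marital status with a dependents indicator and then counting orders per profile. The column names `fam_status` and `n_dependants` and the sample values below are assumptions, not taken from the project files.

```python
import pandas as pd

# Hypothetical rows: one row per product bought, values invented.
df = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 4, 5, 6],
    "fam_status": ["married", "married", "married", "married",
                   "single", "single", "married"],
    "n_dependants": [2, 2, 1, 3, 0, 0, 0],
})

# Label each row by whether the customer has any dependents.
dep_label = df["n_dependants"].gt(0).map(
    {True: "with dependents", False: "with no dependents"}
)
df["household"] = df["fam_status"] + " " + dep_label

# Count distinct orders per household profile, largest first.
orders_by_household = (
    df.groupby("household")["order_id"].nunique().sort_values(ascending=False)
)
print(orders_by_household)
```

Counting distinct `order_id` values avoids inflating a profile's total when a single order contains many products.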