top of page
Search

Olist E-Commerce Analysis

  • Writer: Rajwol Khadka
    Rajwol Khadka
  • 11 minutes ago
  • 3 min read

📌 Project Overview

This project leverages real commercial data from Olist, a leading Brazilian e-commerce platform that aggregates small merchants under a unified marketplace. The dataset spans approximately 100,000 orders placed between late 2016 and Q4 of 2018, capturing every aspect of the customer journey from order placement, product details, payment, and delivery logistics to customer satisfaction through reviews.


The primary objective of this analysis is to provide Olist stakeholders with actionable insights into marketplace performance, using SQL for data querying and Tableau to craft a modern, visually elegant dashboard that brings these insights to life.


Curious how these insights come to life visually? 👉 Click here to explore the tableau dashboard

If you'd like to dive deeper into the SQL logic behind this analysis, 👉 Click here to view all the created SQL queries

 

📊 Analysis


Revenue Distribution by Customer State

Business Question: Where does Olist generate the most revenue geographically?


Approach: Revenue was aggregated by customer_state, and cumulative revenue was calculated using a window function to reveal skewness in revenue contribution across regions.

SQL Query
SQL Query
SQL Query Results (Left), Tableau Visual (Right)
SQL Query Results (Left), Tableau Visual (Right)

Key Findings:

  • São Paulo (SP) and Rio de Janeiro (RJ) alone account for over 40% of total revenue.

  • The top 5 states contribute more than 70%, displaying a Pareto-like distribution.

  • Less populous states like Roraima (RR) and Amapá (AP) contribute marginally.


Strategic Takeaway:

  • Prioritize high-revenue regions (SP, RJ, MG) for marketing investments, logistics optimization, and seller onboarding.

  • Reevaluate resource allocation in underperforming states to maximize ROI.


 

Customer Retention Analysis

Business Question: How many customers return to make a second purchase?


Approach: Orders were grouped by unique customer ID, then segmented into one-time vs. repeat customers.

SQL Query (Right), Results Visualized (Left)
SQL Query (Right), Results Visualized (Left)

Key Findings:

  • A staggering 97% of customers purchase only once.

  • Only 3% are repeat buyers, indicating low retention.


Strategic Implications:

  • Retention is a critical growth opportunity.

  • Olist may be under-leveraging post-purchase engagement, loyalty incentives, or personalization.


Suggested Strategic Actions:

  • Implement email remarketing and loyalty incentives

  • Highlight "frequently bought together" products or subscriptions

  • Use personalized recommendations post-purchase to drive return traffic


 

Logistics Performance by Seller State

Business Question: Which regions incur the highest delivery costs and delays?


Approach: Calculated average freight value and delivery days per seller_state. Plotted states on a scatterplot by cost vs. speed.

SQL Query
SQL Query


Scatterplot
Scatterplot

Key Findings & Classification:


🚀 Efficient Shippers (Low Cost, Fast Delivery)

  • Low average freight value AND low average delivery days.

  • Example states: SP, RJ, DF

  • These states serve as our logistical role models.


⚠️ Inefficient Shippers (High Cost, Slow Delivery)

  • High average freight value AND high average delivery days.

  • Example states: RO, CE, AM

  • These represent key areas for logistics optimization


🤯 Trade-Off Zones (Mixed Performance)

  • One value is high, the other is low.

  • These states might justify a deeper dive:

    • High freight but fast delivery: Paying for speed.

    • Low freight but slow delivery: Potential for improved routing.

  • Example states: MA (High delivery days, mid freight), PB (high freight, fast)


Strategic Implications:

  • Benchmark top-performing regions to replicate logistics strategies elsewhere.

  • Use this to inform carrier negotiations, warehouse placement, and routing decisions.


 

Review Scores vs. Delivery Time

Business Question: Does delivery speed influence customer satisfaction?


Approach: Average delivery days were calculated for each review_score and plotted.

To emphasize the significance on customer satisfaction and how shipping plays a factor, I wanted to highlight the average delivery days by review score.

SQL Query
SQL Query

Tableau Visual
Tableau Visual

Key Insight:

  • Strong inverse correlation between delivery time and review score.

  • 5-star reviews have an average delivery time of ~10.6 days

  • 1-star reviews take on average ~21.2 days to arrive — double the wait!


Strategic Recommendations:

  • Proactively flag slow deliveries (>14 days) for customer service intervention or compensation.

  • Segment reviews further by product category or region to identify specific issues.


 

Conclusion

This analysis surfaces several key strategic insights for Olist:

  • Revenue is highly centralized: a small number of states (SP, RJ, MG) drive the majority of revenue. These regions warrant prioritized marketing, logistics, and seller support.

  • Retention is the biggest gap: only 3% of customers make repeat purchases. Olist must enhance lifecycle marketing to boost loyalty and CLTV (customer lifetime value).

  • Logistics vary widely by region: while some states ship fast and cheap, others lag significantly — hurting customer satisfaction and profitability.

  • Delivery time directly impacts reviews: faster deliveries lead to better customer ratings, reinforcing the need for logistics optimization and expectation management.


Final Reflection

This project was a highly engaging opportunity to work with rich, real-world e-commerce data and deliver meaningful business insights. Using SQL for extraction and Tableau for visual storytelling enabled a full-funnel analysis of Olist's marketplace — from transaction performance to customer satisfaction.

I'm excited to build upon this foundation, expand the dashboard, and continue exploring deeper aspects like seller performance, product category trends, and retention segmentation in future iterations.

 
 
 

Comments


bottom of page