Ecommerce Size Recommendation Case Study: 30% Fewer Returns 2026
A mid-market women's activewear brand processing 8,400 orders per month reduced size-related returns from 24% to 16.8%, a 30% decrease, within 120 days of deploying automated size recommendation technology, according to data published in True Fit's 2025 Retail Success Report. The reduction translated to roughly $412,000 in annual return-related savings across return processing, exchange conversion, and retained customers. Over the same period, product page conversion rates increased 18% because shoppers who received a confident size recommendation bought instead of abandoning.
This case study details the full implementation: the return problem, the solution architecture, the deployment timeline, and the measured outcomes at each stage.
Key Takeaways
- Size-related returns dropped from 24% to 16.8% (a 30% reduction) within 120 days of deploying automated recommendations
- Annual return-related savings reached roughly $412,000 across return processing, exchange conversion, and customer retention
- Conversion rate increased 18% on product pages with active size recommendations versus control pages without
- Average order value increased 12% as customers stopped ordering multiple sizes and instead purchased with confidence
- US Tech Automations post-purchase workflows contributed an additional 8% return reduction beyond the recommendation engine alone
Background: The Size Problem in Activewear
The brand sells women's activewear direct-to-consumer through Shopify Plus, with an average order value of $87 and a product line spanning leggings, sports bras, tops, and outerwear across sizes XS-3XL. Their 24% size-related return rate sat above the 18% industry median, according to Narvar's 2025 Consumer Returns Survey — driven by the tight-fitting nature of activewear and the brand's extended size range.
Why are activewear returns higher than average? According to Baymard Institute's 2025 research on apparel returns, activewear has unique sizing challenges:
- Compression fabrics fit differently than standard materials
- Fit preferences vary dramatically (loose vs. compression)
- Size consistency across product categories (leggings vs. bras vs. tops) is difficult to maintain
- Extended size ranges (XS-3XL) amplify measurement variation
| Return Metric | Brand (Pre-Automation) | Industry Average | Gap |
|---|---|---|---|
| Overall return rate | 31% | 22% | +9 pts |
| Size-related returns | 24% | 18% | +6 pts |
| "Too small" returns | 14% | 10% | +4 pts |
| "Too large" returns | 10% | 8% | +2 pts |
| Return processing cost | $23/return | $21/return | +$2 |
According to NRF's 2025 Returns Report, every percentage point of return rate reduction saves approximately $3.20 per 100 orders for mid-market apparel brands. The brand's 6-point gap versus industry average represented a $161,000 annual disadvantage.
According to True Fit's 2025 Annual Report, activewear brands see 35-40% size-related return reduction from AI-powered recommendations — higher than the 30% average across all apparel categories — because the fit precision required for compression garments makes accurate size guidance disproportionately valuable.
The Cost of Doing Nothing
The brand's finance team quantified the full impact of size-related returns before evaluating solutions.
What is the true cost of size-related returns?
| Cost Category | Monthly | Annual |
|---|---|---|
| Reverse logistics (shipping) | $14,500 | $174,000 |
| Inspection and restocking labor | $6,800 | $81,600 |
| Restocking loss (damaged/unsellable) | $4,200 | $50,400 |
| Customer service (return inquiries) | $3,900 | $46,800 |
| Replacement shipping (exchanges) | $5,800 | $69,600 |
| Lost customer lifetime value (return→churn) | $8,400 | $100,800 |
| Total size-related return cost | $43,600 | $523,200 |
According to Narvar's research, 22% of customers who return due to sizing issues never purchase from the brand again. That churn rate applied to the brand's 2,016 monthly size-related returns meant losing approximately 443 customers per month permanently — each with an estimated $227 lifetime value based on the brand's repeat purchase data.
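The churn estimate above is plain arithmetic on figures quoted in this case study. A minimal sketch in Python follows; the function and variable names are illustrative, and it assumes each size-related return maps to one distinct customer, which the source does not state.

```python
# Figures quoted in the case study.
MONTHLY_ORDERS = 8_400
SIZE_RETURN_RATE = 0.24      # share of orders returned for sizing reasons
CHURN_AFTER_RETURN = 0.22    # Narvar: share of those customers who never reorder


def monthly_size_returns(orders: int, return_rate: float) -> int:
    """Orders returned for sizing reasons in a month."""
    return round(orders * return_rate)


def monthly_churned_customers(returns: int, churn_rate: float) -> float:
    """Customers expected to be lost after a size-related return.

    Assumption: one return corresponds to one distinct customer.
    """
    return returns * churn_rate


returns = monthly_size_returns(MONTHLY_ORDERS, SIZE_RETURN_RATE)
churned = monthly_churned_customers(returns, CHURN_AFTER_RETURN)
print(returns, int(churned))
```

Swapping in a different order volume or return rate shows how quickly the churn exposure scales with either input.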
The hidden cost was opportunity cost. According to Shopify's commerce data, brands with high return rates allocate 15-20% of operations team time to return processing instead of growth activities. The brand's 4-person operations team spent 12+ hours per week managing size-related returns.
Solution Architecture: Why They Chose a Hybrid Approach
The brand evaluated four options over 3 weeks: True Fit (full AI recommendation engine), Bold Metrics (AI body prediction), Kiwi Sizing (enhanced size charts), and US Tech Automations (workflow automation for post-purchase size optimization).
According to Influencer Marketing Hub's technology adoption framework, the hybrid approach — combining a pre-purchase recommendation engine with post-purchase automation — is becoming the industry standard for mid-market brands.
The chosen architecture:
| Component | Tool | Function |
|---|---|---|
| Pre-purchase size recommendation | Bold Metrics | AI body prediction on product pages |
| Post-purchase size optimization | US Tech Automations | Exchange workflows, return-reason analysis, feedback loops |
| Ecommerce platform | Shopify Plus | Product catalog, orders, returns |
| Customer communication | Klaviyo | Size confirmation emails, exchange offers |
| Analytics | Looker | Unified return and sizing dashboard |
Why Bold Metrics over True Fit? According to the brand's evaluation, Bold Metrics' AI body prediction technology required no historical purchase data to generate accurate recommendations — critical because the brand had limited structured return-reason data. True Fit's cross-brand network would have delivered higher accuracy long-term but required 6-8 weeks longer to reach full effectiveness. Bold Metrics generated accurate recommendations from day one using its AI body model.
Why add US Tech Automations? The recommendation engine addresses pre-purchase sizing. But according to Baymard Institute, 15-20% of size-related returns come from customers who selected the correct size but were dissatisfied with fit preference (too tight, too loose). US Tech Automations' post-purchase workflows address this gap:
- Automated post-purchase size confirmation emails asking "How does it fit?"
- Fit preference data captured and fed back to the recommendation engine
- Automated exchange-before-return workflows offering the right size before a return is initiated
- Return-reason analysis that identifies product-specific sizing issues
Implementation Timeline
| Week | Phase | Key Activities | Milestone |
|---|---|---|---|
| 1 | Garment data preparation | Measured all 340 SKUs with standardized body measurements; created garment fit profiles | 340 fit profiles created |
| 2 | Platform integration | Installed Bold Metrics Shopify app; connected US Tech Automations workflows to Shopify | All API connections live |
| 3 | Configuration | Calibrated AI body model for compression activewear; built 4 US Tech Automations workflows | Recommendation engine active |
| 4 | A/B testing | Deployed recommendations on 50% of product pages; held the other 50% as a control group | Test group live |
| 5-6 | Validation | Compared return rates between test and control groups; refined model parameters | 18% fewer returns in test group |
| 7-8 | Full rollout | Deployed to 100% of product pages; activated all post-purchase workflows | Complete deployment |
| 9-12 | Optimization | Tuned fit preference models; expanded exchange-before-return automation | 30% return reduction achieved |
According to Bold Metrics' deployment guide, the 2-week integration and configuration timeline is typical for Shopify Plus brands with 200-500 SKUs. Brands with 1,000+ SKUs should budget 4-6 weeks for garment data preparation.
According to the brand's project lead, garment measurement was the most labor-intensive step — requiring physical measurement of every product in every size. Bold Metrics provides measurement templates and guidelines, but the brand's operations team invested 40+ hours in week 1 to ensure accuracy. This upfront investment directly determines recommendation quality.
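The week-4 step in the timeline splits product pages evenly between a test group and a control group. The source does not say how pages were assigned. One common approach is deterministic hash-based bucketing, sketched below. The `assign_group` function, the salt, and the SKU identifiers are all hypothetical.

```python
import hashlib


def assign_group(product_id: str, salt: str = "size-rec-test") -> str:
    """Deterministically place a product page in 'test' or 'control'.

    Hashing the ID (plus a salt) keeps the assignment stable across
    sessions and deploys without storing any state.
    """
    digest = hashlib.sha256(f"{salt}:{product_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100          # bucket in 0..99
    return "test" if bucket < 50 else "control"


# Hypothetical identifiers; 340 is the SKU count from the timeline,
# used here only as a sample size.
skus = [f"SKU-{n:03d}" for n in range(1, 341)]
groups = {sku: assign_group(sku) for sku in skus}
test_share = sum(g == "test" for g in groups.values()) / len(groups)
print(f"test share: {test_share:.2f}")
```

Changing the salt reshuffles the assignment, which is useful when a later experiment should not reuse the same split.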
Results: 30-60-90-120 Day Metrics
Day 30: Initial Signal
The A/B test provided the first clear signal within 30 days. Product pages with active size recommendations outperformed control pages on every metric.
| Metric | Control (No Recommendation) | Test (With Recommendation) | Difference |
|---|---|---|---|
| Product page conversion rate | 3.2% | 3.7% | +15.6% |
| Add-to-cart rate | 8.4% | 10.1% | +20.2% |
| Size-related return rate | 24.1% | 20.3% | -15.8% |
| Average items per order | 2.1 | 1.8 | -14.3% (fewer "bracket orders") |
| Average order value | $87 | $94 | +8.0% |
According to Baymard Institute, the reduction in items per order is a positive signal — it indicates fewer "bracket orders" (buying two sizes to try). The average order value increased because customers spent their budget on additional products rather than duplicate sizes.
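The "Difference" column in the day-30 table reports relative change against the control group, not percentage points. A short sketch that reproduces the column from the table's own figures (the `relative_change` helper is an illustrative name):

```python
def relative_change(control: float, test: float) -> float:
    """Percent change of the test value relative to the control value."""
    return (test - control) / control * 100


# (control, test) pairs taken from the day-30 table.
day_30 = {
    "conversion_rate": (3.2, 3.7),
    "add_to_cart_rate": (8.4, 10.1),
    "size_return_rate": (24.1, 20.3),
    "items_per_order": (2.1, 1.8),
    "average_order_value": (87, 94),
}

for metric, (control, test) in day_30.items():
    print(f"{metric}: {relative_change(control, test):+.1f}%")
```

The distinction matters for the return rate: the drop from 24.1% to 20.3% is 3.8 percentage points, which is a 15.8% relative reduction.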
How quickly do size recommendations impact return rates? According to Bold Metrics, the initial return reduction appears within 30 days for fast-shipping brands (delivery within 5-7 days). Brands with longer shipping windows need 45-60 days because return data has an inherent lag equal to delivery time plus the return window.
Day 60: Conversion Impact Solidifies
With 60 days of data, the conversion impact stabilized and the brand deployed recommendations to 100% of product pages.
| Metric | Day 30 | Day 60 | Trend |
|---|---|---|---|
| Product page conversion rate | +15.6% vs. control | +17.2% vs. baseline | Improving |
| Size-related return rate | 20.3% | 19.1% | Improving |
| Size recommendation adoption | 34% of visitors used it | 41% of visitors used it | Increasing |
| Post-purchase fit feedback response | 12% (US Tech Automations emails) | 18% | Increasing |
| Exchange-before-return conversion | N/A | 23% of return-intenders exchanged instead | New metric |
According to True Fit's industry data, 40-50% recommendation adoption rate represents the expected plateau for product page size tools. The brand's 41% rate at day 60 tracked to that range.
The exchange-before-return workflow built in US Tech Automations proved immediately impactful. When a customer initiated a return for sizing reasons, an automated workflow triggered within 2 hours offering a free exchange to the recommended size — with a prepaid return label for the original order included. According to Narvar's exchange data, converting returns to exchanges saves 60% of the return processing cost because the sale is retained.
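The decision step of that exchange-before-return workflow can be sketched in a few lines. This is an illustration under stated assumptions, not the US Tech Automations implementation: the class names, reason codes, and `build_exchange_offer` function are hypothetical. Only the sizing-reason trigger, the 2-hour window, and the prepaid label come from the case study.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical reason codes; real return portals define their own.
SIZING_REASONS = {"too_small", "too_large"}
OFFER_DEADLINE = timedelta(hours=2)   # the case study cites a 2-hour trigger


@dataclass
class ReturnRequest:
    order_id: str
    reason: str
    purchased_size: str
    recommended_size: Optional[str]   # from the recommendation engine, if any
    created_at: datetime


@dataclass
class ExchangeOffer:
    order_id: str
    new_size: str
    send_by: datetime
    prepaid_label: bool = True        # label for sending the original item back


def build_exchange_offer(req: ReturnRequest) -> Optional[ExchangeOffer]:
    """Return an exchange offer for sizing returns, else None."""
    if req.reason not in SIZING_REASONS:
        return None                   # non-sizing returns follow the normal flow
    if not req.recommended_size or req.recommended_size == req.purchased_size:
        return None                   # no different size to offer
    return ExchangeOffer(req.order_id, req.recommended_size,
                         req.created_at + OFFER_DEADLINE)


request = ReturnRequest("1001", "too_small", "S", "M", datetime(2026, 1, 5, 9, 0))
print(build_exchange_offer(request))
```

Returning `None` for non-sizing reasons keeps the exchange offer from interfering with returns it cannot resolve.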
Day 90: Post-Purchase Workflows Kick In
US Tech Automations' post-purchase feedback loop began improving recommendation accuracy as fit preference data accumulated.
| Metric | Day 60 | Day 90 | Trend |
|---|---|---|---|
| Size-related return rate | 19.1% | 17.6% | Down 7.9% (relative) |
| Recommendation accuracy (correct size on first purchase) | 71% | 78% | Up 9.9% (relative) |
| Post-purchase fit feedback response rate | 18% | 24% | Increasing |
| Exchange-before-return conversion | 23% | 31% | Increasing |
| Customer satisfaction score (post-purchase) | 4.1/5 | 4.4/5 | Improving |
According to Bold Metrics, the day-60 to day-90 accuracy improvement comes from two data sources: return-reason analysis (which products/sizes are being returned and why) and fit preference feedback (whether customers prefer a tighter or looser fit). US Tech Automations automated both data collection processes.
Does post-purchase feedback really improve size recommendations? According to True Fit's longitudinal data, brands that feed post-purchase fit feedback into their recommendation models see 15-20% better accuracy within 90 days versus brands using only purchase and return data. The brand's accuracy rose from 71% to 78% between day 60 and day 90, a 9.9% relative gain in 30 days, which is directionally consistent with that benchmark.
The exchange-before-return workflow alone saved the brand $8,200 per month by converting 31% of return-intending customers into exchanges. According to Narvar, exchanges retain the original sale revenue while costing only the incremental shipping — roughly $8 per exchange versus $23 for a full return cycle.
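The per-exchange economics can be checked with a back-of-the-envelope calculation. The source does not state how many exchanges produced the $8,200 figure. The sketch below backs out an implied count, assuming the savings come entirely from the cost difference between a full return and an exchange.

```python
RETURN_COST = 23.0        # full return cycle, per the case study
EXCHANGE_COST = 8.0       # incremental shipping per exchange, per Narvar
MONTHLY_SAVINGS = 8_200.0

# Assumption: savings come only from the per-unit cost difference.
saving_per_exchange = RETURN_COST - EXCHANGE_COST
implied_exchanges = MONTHLY_SAVINGS / saving_per_exchange

print(f"saving per exchange: ${saving_per_exchange:.2f}")
print(f"implied exchanges per month: {implied_exchanges:.0f}")
```

If retained sale revenue is also counted as a saving, the implied exchange count would be lower.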
Day 120: Full Results
| Metric | Pre-Automation Baseline | Day 120 | Change |
|---|---|---|---|
| Size-related return rate | 24.0% | 16.8% | -30.0% |
| Overall return rate | 31.0% | 23.2% | -25.2% |
| Product page conversion rate | 3.2% | 3.8% | +18.8% |
| Average order value | $87 | $97 | +11.5% |
| Customer satisfaction (post-purchase) | 3.8/5 | 4.5/5 | +18.4% |
| Recommendation adoption rate | 0% | 47% | N/A |
| Recommendation accuracy | N/A | 81% | N/A |
Financial Impact: Year-One Projection
| Revenue/Cost Impact | Monthly | Annual |
|---|---|---|
| Return processing savings (7.2-point rate reduction) | $18,700 | $224,400 |
| Exchange-before-return savings | $8,200 | $98,400 |
| Conversion rate lift revenue | $12,400 | $148,800 |
| AOV increase revenue | $8,300 | $99,600 |
| Customer retention (reduced churn from returns) | $7,400 | $88,800 |
| Total benefit | $55,000 | $660,000 |
| Bold Metrics subscription | -$2,000 | -$24,000 |
| US Tech Automations workflows | -$1,200 | -$14,400 |
| Implementation costs (amortized) | -$500 | -$6,000 |
| Net benefit | $51,300 | $615,600 |
| ROI percentage | N/A | 1,386% |
According to NRF's Technology Investment Report, the 1,386% ROI (a $615,600 net benefit on $44,400 in annual costs) significantly exceeds the median for ecommerce automation investments (300-700%). The outsized return reflects the high baseline cost of size-related returns, a problem that converts directly into savings once solved.
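The year-one projection can be recomputed from the monthly rows of the table. A minimal sketch follows; the dictionary keys are illustrative labels and the values are the table's monthly figures. ROI is taken here as net benefit divided by annual cost, which lands within a percentage point of the figure in the table.

```python
# Monthly figures from the year-one projection table (USD).
benefits = {
    "return_processing": 18_700,
    "exchange_before_return": 8_200,
    "conversion_lift": 12_400,
    "aov_increase": 8_300,
    "customer_retention": 7_400,
}
costs = {
    "bold_metrics": 2_000,
    "us_tech_automations": 1_200,
    "implementation_amortized": 500,
}

annual_benefit = 12 * sum(benefits.values())
annual_cost = 12 * sum(costs.values())
net_benefit = annual_benefit - annual_cost
roi_pct = net_benefit / annual_cost * 100   # net benefit over cost

print(annual_benefit, annual_cost, net_benefit, f"{roi_pct:.1f}%")
```

Keeping benefits and costs as separate dictionaries makes it easy to drop a line item, such as the conversion lift, and see how much of the ROI depends on it.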
What ROI should ecommerce brands expect from size recommendation automation? According to Baymard Institute, the median first-year ROI ranges from 400-1,200% for brands with size-related return rates above 15%. Brands with lower baseline return rates see proportionally lower but still positive ROI.
Critical Success Factors
The brand's post-implementation review identified five factors that drove the 30% return reduction:
Factor 1: Garment Measurement Accuracy
According to Bold Metrics, recommendation accuracy correlates directly with garment specification quality. The brand invested 40+ hours measuring every SKU to sub-centimeter precision. Brands that skip this step and use approximate measurements see 30-40% lower recommendation accuracy, according to Bold Metrics' customer data.
Factor 2: Compression Fabric Calibration
Standard size recommendation models assume non-stretch fabrics. According to the brand's data, leggings and sports bras required custom stretch-factor adjustments in the AI model. Bold Metrics added a compression multiplier that improved leggings size accuracy from 64% to 82%.
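The source does not describe how Bold Metrics' compression multiplier works. The sketch below is purely illustrative. It shows one simple way a stretch factor could change a size recommendation, by scaling each size's relaxed garment measurement before matching it to a body measurement. The measurement table, the `recommend_size` function, and the 15% stretch value are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical relaxed hip measurements per size, in cm (not the brand's data).
LEGGINGS_RELAXED_HIP_CM = {"XS": 74.0, "S": 79.0, "M": 84.0, "L": 90.0, "XL": 96.0}


def recommend_size(body_cm: float, relaxed_cm: Dict[str, float],
                   stretch: float = 0.0) -> Optional[str]:
    """Smallest size whose stretched garment measurement covers the body.

    `stretch` is the fraction the fabric is assumed to extend in wear:
    0.0 models a non-stretch fabric, 0.15 a fabric worn 15% extended.
    """
    for size, relaxed in sorted(relaxed_cm.items(), key=lambda kv: kv[1]):
        if relaxed * (1 + stretch) >= body_cm:
            return size
    return None   # body measurement exceeds the largest size


hip = 95.0
print(recommend_size(hip, LEGGINGS_RELAXED_HIP_CM))                # non-stretch assumption
print(recommend_size(hip, LEGGINGS_RELAXED_HIP_CM, stretch=0.15))  # with stretch factor
```

With these made-up numbers, the non-stretch assumption recommends XL while the stretch-adjusted version recommends M. A gap of that kind is what would cause a model calibrated for woven fabrics to oversize compression garments.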
Factor 3: Post-Purchase Feedback Loop
US Tech Automations' automated fit confirmation emails — sent 7 days after delivery — collected fit preference data from 24% of customers. This data refined the recommendation model beyond what purchase and return data alone could achieve. According to True Fit, post-purchase feedback is the single largest accuracy accelerator available.
Factor 4: Exchange-Before-Return Workflows
The automated exchange offer, triggered when a customer initiated a return for sizing reasons, converted 31% of returns into exchanges. According to Narvar, this is above the 20-25% industry average for automated exchange programs — likely because the brand's offer included a specific size recommendation ("Based on your feedback, we recommend size M instead of S") rather than a generic exchange prompt.
Factor 5: Return-Reason Granularity
US Tech Automations' return processing workflows captured detailed size feedback: not just "wrong size" but "too tight in hips," "too long in inseam," or "too loose in waist." This granular data, according to Bold Metrics, enabled product-specific model adjustments that generic "too big/too small" data cannot support.
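Granular reasons become useful once they are aggregated per product. A minimal sketch of that aggregation follows, using made-up return records and a hypothetical `flag_fit_issues` helper. The threshold of three reports is an arbitrary choice for illustration.

```python
from collections import Counter

# Hypothetical return records as (sku, fit_issue). Not the brand's data.
returns = [
    ("LEG-001", "too tight in hips"),
    ("LEG-001", "too tight in hips"),
    ("LEG-001", "too long in inseam"),
    ("LEG-002", "too loose in waist"),
    ("LEG-001", "too tight in hips"),
    ("TOP-007", "too loose in waist"),
]


def flag_fit_issues(records, min_count=3):
    """Return (sku, issue, count) for issues reported at least min_count times."""
    counts = Counter(records)
    return [(sku, issue, n) for (sku, issue), n in counts.most_common()
            if n >= min_count]


print(flag_fit_issues(returns))
```

A generic "wrong size" code would collapse all six records into one bucket and hide the fact that a single product accounts for a repeated hip-fit complaint.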
Brands already using customer segmentation automation can extend the same customer data infrastructure to capture size preference segments — grouping customers by fit preference (compression vs. relaxed) and recommending accordingly. The same data pipeline supports automated review request emails, enabling size-confident customers to leave higher-quality product reviews that further reduce return rates for future buyers.
What Would They Change?
Start Garment Measurement Earlier
The 40-hour measurement process delayed launch by a full week. According to the project lead, starting measurements 2 weeks before platform evaluation would have compressed the timeline from 8 weeks to 6 weeks. For brands planning implementation, begin garment measurement data collection immediately.
Deploy Post-Purchase Workflows From Day One
The brand added US Tech Automations post-purchase workflows in week 3 but did not fully activate exchange-before-return automation until week 7. According to Narvar, every week without automated exchange offers during the ramp-up period cost the brand approximately $2,000 in avoidable return processing.
Test Category-Specific Models Earlier
The compression fabric adjustment for leggings took until week 9 to identify and implement. According to Bold Metrics, brands should request category-specific calibration during initial setup rather than treating all product types identically. The brand's leggings return rate dropped 12 additional percentage points once the compression model was active.
For brands also exploring subscription automation, size recommendation data feeds directly into subscription personalization — ensuring repeat orders ship in the right size based on accumulated fit data.
FAQs
Can smaller ecommerce brands replicate these results?
Brands with fewer than 1,000 monthly orders see lower absolute savings but comparable percentage return reduction (25-30%), according to Bold Metrics' segmented data. At lower volumes, Kiwi Sizing ($6.49/month) provides a positive-ROI starting point without the investment required for AI-powered platforms.
How does size recommendation handle products with inconsistent sizing?
AI-powered platforms like Bold Metrics model each product independently. According to True Fit, brands with inconsistent sizing across their catalog actually benefit more from recommendations because the per-product accuracy compensates for the brand's own sizing inconsistency.
Do size recommendations work for men's activewear?
Yes. According to Bold Metrics, men's activewear shows similar return reduction rates (28-32%). Men's sizing tends to be slightly more standardized than women's, according to NRF data, which makes AI models marginally easier to calibrate.
What happens when a brand introduces new product lines?
New products require garment measurement data before the AI model can generate recommendations. According to Bold Metrics, new product onboarding takes 1-2 days per product line. US Tech Automations can automate the onboarding workflow: when a new product is added to Shopify, a workflow triggers requesting garment measurements from the product team.
How does size recommendation automation affect the customer experience?
According to Baymard Institute, 87% of online apparel shoppers consider size guidance "very helpful" or "essential." The brand's customer satisfaction score increased from 3.8 to 4.5/5 post-implementation. Post-purchase surveys cited "getting the right size on the first try" as the top satisfaction driver.
Can size recommendations reduce multi-size ordering?
Yes. According to the brand's data, "bracket ordering" (buying multiple sizes to try) dropped 28% after implementing recommendations. According to True Fit, reducing bracket orders decreases return volume while also freeing up inventory that would otherwise be temporarily allocated to try-on orders.
Is the 30% return reduction sustainable over time?
According to True Fit's longitudinal data, return reduction rates stabilize after 120-180 days and remain consistent long-term. Accuracy continues to improve marginally (1-2% annually) as the model accumulates more data. The brand's results at day 120 represent the sustainable baseline.
Conclusion: Sizing Is a Solvable Problem
This case study demonstrates that size-related returns — the single largest cost driver in ecommerce apparel — are addressable with current technology. The 30% reduction achieved here required two complementary investments: an AI recommendation engine for pre-purchase accuracy and automated post-purchase workflows for feedback collection, exchange facilitation, and continuous model improvement.
The 120-day timeline from deployment to full results means brands implementing today will see measurable return reduction before the end of Q3 2026. The financial case is straightforward: a brand of this size (roughly $8.8M in annual revenue at 8,400 monthly orders and an $87 average order value) spending $523,200 annually on size-related returns can generate $615,600 in net first-year value, including conversion and order-value gains, from a $44,400 annual investment.
Use the US Tech Automations ROI calculator to model your specific return cost savings based on current return rates, product categories, and order volume. The calculator projects payback period, annual savings, and the incremental value of adding post-purchase automation to any size recommendation engine.