The problem: You've got data everywhere. Transactions in one system, inventory in another, carbon reporting someone's doing manually in Excel. Nothing talks to anything else, and when leadership asks for a number, someone spends three days pulling it together.

Sound familiar?

What I built: End-to-end ETL pipelines processing 1M+ transactions across retail, ticketing, and merchandise — five years of historical data, 10+ source systems, unified into one clean pipeline on Palantir Foundry and Apache Spark.
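To make "unified into one pipeline" concrete, here's a minimal sketch in plain Python. It is not the production Foundry/Spark code; the source names, field names, and sample rows are all hypothetical, and it only shows the core idea of mapping each system's columns onto one shared schema.

```python
from datetime import date

# Hypothetical per-source mappings: source column -> unified column.
FIELD_MAPS = {
    "retail": {"txn_id": "transaction_id", "sold_on": "date", "total": "amount"},
    "ticketing": {"order_ref": "transaction_id", "event_date": "date", "price": "amount"},
}


def unify(source, row):
    """Rename one source row's fields to the shared schema and tag its origin."""
    mapping = FIELD_MAPS[source]
    unified = {target: row[src] for src, target in mapping.items()}
    unified["date"] = date.fromisoformat(unified["date"])  # one date type everywhere
    unified["amount"] = round(float(unified["amount"]), 2)  # one numeric type everywhere
    unified["source"] = source
    return unified


# Made-up sample rows standing in for two of the source systems.
retail_rows = [{"txn_id": "R-1", "sold_on": "2023-04-01", "total": "19.99"}]
ticket_rows = [{"order_ref": "T-7", "event_date": "2023-04-02", "price": "45"}]

unified_rows = [unify("retail", r) for r in retail_rows]
unified_rows += [unify("ticketing", r) for r in ticket_rows]
```

Keeping the mappings as data means a new source system is a new dictionary entry rather than new transformation logic.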

I also automated the review of a 1,000-row spreadsheet of carbon factors that was being checked and renamed by hand. What used to take a week now takes 10 minutes. That's not an exaggeration: the process genuinely went from days to a single script run.
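A script like that can be surprisingly small. The sketch below is an illustration under assumptions, not the original script: the column names (`factor_name`, `kg_co2e_per_unit`), the naming rule, and the sample values are all made up. It standardizes factor names and sets aside rows a person still needs to look at.

```python
import csv
import io
import re


def normalize_name(raw):
    """Lowercase a factor name and collapse anything non-alphanumeric into '_'."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", raw.strip().lower())
    return cleaned.strip("_")


def clean_factors(csv_text):
    """Split rows into cleaned records and rows flagged for manual review."""
    cleaned, flagged = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            value = float(row["kg_co2e_per_unit"])
        except (TypeError, ValueError):
            flagged.append(row)  # non-numeric factor: leave it for a human
            continue
        cleaned.append(
            {"factor_name": normalize_name(row["factor_name"]), "kg_co2e_per_unit": value}
        )
    return cleaned, flagged


# Illustrative rows only; the values are not real emission factors.
SAMPLE = """factor_name,kg_co2e_per_unit
 Electricity (Grid) ,0.233
Natural Gas - kWh,0.183
Diesel/Litre,n/a
"""

cleaned, flagged = clean_factors(SAMPLE)
```

The point of the flagged list is that the script doesn't guess: anything it can't parse goes back to a person, so the manual work shrinks to the handful of rows that actually need judgment.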

Results:

  • 95% reduction in ETL prep time

  • $40K+ per hour in operational time recovered

  • Data health monitoring that flagged issues before they became expensive decisions
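For a sense of what data health monitoring means in practice, here's a minimal plain-Python sketch. It is not Foundry's health-check API and not the production checks; the `health_check` function, the 5% threshold, and the field names are assumptions chosen for illustration.

```python
def health_check(rows, required, max_null_rate=0.05):
    """Return a list of human-readable issues found in a batch of rows."""
    if not rows:
        return ["batch is empty"]
    issues = []
    # Flag any required field that is missing too often.
    for field in required:
        missing = sum(1 for r in rows if r.get(field) in (None, ""))
        rate = missing / len(rows)
        if rate > max_null_rate:
            issues.append(f"{field}: {rate:.0%} missing (limit {max_null_rate:.0%})")
    # Flag duplicate transaction IDs, which usually mean a double load.
    ids = [r.get("transaction_id") for r in rows]
    dupes = len(ids) - len(set(ids))
    if dupes:
        issues.append(f"transaction_id: {dupes} duplicate value(s)")
    return issues


# Made-up batch with one missing amount and one repeated ID.
batch = [
    {"transaction_id": "A1", "amount": 10.0},
    {"transaction_id": "A2", "amount": None},
    {"transaction_id": "A2", "amount": 5.5},
]
issues = health_check(batch, required=["transaction_id", "amount"])
```

Run on every load, a check like this surfaces a bad batch before anyone builds a report on top of it.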

If your team is spending more time wrangling data than using it, something's broken upstream. That's usually fixable.
