West Atlas — Extensiv Sales ETL Pipeline Case Study

Replacing a fragile export
with a production sales pipeline

How I moved a daily sales dashboard off a manual Extensiv export and onto a direct Order Manager API pipeline - turning a 45-minute morning ritual into an hourly scheduled job that runs unattended, validates its own schema, and keeps the Power BI report fresh all day.

Built by West Atlas LLC
Use case: CPG operations & sales reporting
Stack: Python · Extensiv Order Manager API · SharePoint · Power BI
Status: Production - running hourly
45 min - daily task eliminated
0 - manual exports required
1 - scheduled job replaces it
Hourly - refresh cadence
100% - schema match with Power BI report
<2 min - end-to-end runtime

A fragile morning ritual, one sick day from failure

The team's Power BI sales dashboard depended on a manual workflow that had been running for years. Someone logged into Extensiv Order Manager every morning, ran a saved export to Excel, dropped the file into a SharePoint folder, and waited for Power BI to pick it up and refresh the report.

It worked - until it didn't. If the person was out, the dashboard didn't update. If Extensiv timed out mid-export, the file was corrupt. If the saved export filter drifted, the downstream schema broke silently and the report came out wrong. Leadership was depending on a dashboard that had a single human failure point buried at the top of its pipeline.

Manual workflow
Log in, export, drop, pray

One person logs into Extensiv each morning, runs the saved export, waits for Excel to finish, uploads to SharePoint. No backup. No run history. Silent failures when filters drift.

Automated workflow
Scheduled API pull, validated, delivered

Python hits the Extensiv Order Manager API on an hourly schedule, transforms the response into the exact Power BI schema, validates it, writes it to SharePoint, and logs the run. Zero humans required.

A four-stage pipeline, end-to-end in under two minutes

01

Scheduled trigger - hourly, unattended

A scheduled task fires every hour throughout the business day. No human involvement. Survives machine reboots via Windows Service registration.

scheduled-task
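Registering the hourly trigger can be done with the built-in Windows Task Scheduler CLI. A minimal sketch; the task name, script path, and start time below are placeholders, not the production values:

```shell
:: Register an hourly, unattended run of the pipeline (illustrative paths).
schtasks /Create /TN "ExtensivSalesETL" ^
  /TR "py C:\etl\pipeline.py" ^
  /SC HOURLY /ST 07:00 /RU SYSTEM
```

Running as SYSTEM means the job fires whether or not anyone is logged in, which is what makes it survive reboots and sick days alike.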
02

Data fetch - Extensiv Order Manager API

Authenticates with OAuth2 and pulls orders incrementally - only records modified since the last run. Paginated to survive high-volume days - post-promo Mondays that would have broken the old Excel export don't break this.

extensiv-api · oauth2
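The incremental, paginated pull can be sketched as a generator that keeps requesting pages until the API says it's done. The `get_page` callable and the `orders` / `totalPages` response fields below are assumptions for illustration, not the documented Extensiv response shape:

```python
from typing import Callable, Iterator

def fetch_orders(get_page: Callable[[str, int], dict], since: str) -> Iterator[dict]:
    """Yield every order modified since `since`, one page at a time.

    `get_page(since, page)` returns a decoded JSON body shaped like
    {"orders": [...], "totalPages": N} -- illustrative field names.
    In production this would wrap a `requests` call with the OAuth2 token.
    """
    page = 1
    while True:
        body = get_page(since, page)
        yield from body["orders"]       # stream records, never hold all pages
        if page >= body["totalPages"]:  # stop when the last page is consumed
            break
        page += 1
```

Because the caller injects `get_page`, the pagination loop can be exercised against canned pages without touching the network, which is also how a post-promo high-volume day gets simulated in tests.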
03

Transform - normalize to Power BI schema

Python transforms the API response into the exact column layout, types, and channel mapping the downstream Power BI report expects. Every field is type-coerced and date-normalized. Schema is validated before the write, not after.

python · schema-validation
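The transform stage can be sketched with pandas. The column names, channel codes, and target schema below are illustrative stand-ins; the real report's contract isn't public:

```python
import pandas as pd

# Hypothetical mapping and target layout -- placeholders for the report's real contract.
CHANNEL_MAP = {"AMZ": "Amazon", "SHOP": "Shopify"}
TARGET_COLUMNS = ["order_id", "order_date", "channel", "units", "revenue"]

def transform(raw: pd.DataFrame) -> pd.DataFrame:
    """Coerce types, normalize dates, and map channels into the target schema."""
    out = pd.DataFrame()
    out["order_id"] = raw["orderId"].astype(str)
    out["order_date"] = pd.to_datetime(raw["orderDate"], utc=True).dt.date
    out["channel"] = raw["channel"].map(CHANNEL_MAP).fillna("Other")
    out["units"] = pd.to_numeric(raw["quantity"], errors="coerce").fillna(0).astype(int)
    out["revenue"] = pd.to_numeric(raw["total"], errors="coerce").round(2)
    return out[TARGET_COLUMNS]  # exact column order the report expects
```

Every field goes through an explicit coercion, so a stray string quantity or an un-mapped channel code surfaces here rather than downstream in Power BI.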
04

Load - SharePoint write + run log

The output file lands in the same SharePoint location the manual export used to. Prior version is archived. Every run writes a log entry with timestamp, record count, and status. The existing Power BI report picks it up unchanged.

sharepoint · run-log

One job, one contract, zero downstream changes

Scheduled Task (hourly, unattended)
    │
    ▼
pipeline.py             ← entry point / orchestrator
├── auth.py             ← OAuth2 client credentials · token refresh
├── extract.py          ← Extensiv Order Manager API · incremental pull (orders · line items · ship dates · channels)
├── transform.py        ← normalize to Power BI schema (type coercion · date normalization · channel mapping)
├── validate.py         ← schema check before write (column names · types · non-null fields · row count sanity)
├── load.py             ← SharePoint write + prior-version archive
└── logger.py           ← run log · timestamp · record count · status
    │
    ▼
SharePoint (fixed location, identical schema)
    │
    ▼
Existing Power BI report picks it up unchanged (zero downstream changes)

What each file does

File          Role
pipeline.py   Orchestrates the full extract-transform-validate-load sequence end-to-end
auth.py       OAuth2 client credentials flow with automatic token refresh
extract.py    Paginated, incremental Extensiv Order Manager API pull with explicit field selection
transform.py  Normalizes API response into Power BI schema: type coercion, channel mapping, date handling
validate.py   Pre-write schema validation: column names, types, non-null checks, row-count sanity
load.py       SharePoint write with prior-version archive; preserves downstream file location exactly
logger.py     Structured run log with timestamp, record count, and pass/fail status per stage
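The auth.py token-refresh behavior can be sketched as a small cache that re-fetches shortly before expiry. The `fetch` callable stands in for the real OAuth2 client-credentials POST, whose endpoint and payload details are assumed here:

```python
import time

class TokenCache:
    """Client-credentials token cache with early refresh.

    `fetch()` returns (access_token, expires_in_seconds); in production it
    would POST to the token endpoint with the client id and secret.
    """
    def __init__(self, fetch, skew: float = 60.0, clock=time.monotonic):
        self._fetch, self._skew, self._clock = fetch, skew, clock
        self._token, self._expires_at = None, 0.0

    def token(self) -> str:
        # Refresh `skew` seconds early so an in-flight request never
        # carries a token that expires mid-call.
        if self._token is None or self._clock() >= self._expires_at - self._skew:
            self._token, ttl = self._fetch()
            self._expires_at = self._clock() + ttl
        return self._token
```

Injecting the clock makes expiry behavior testable without sleeping, the same trick used for the paginated fetch.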

Why it's built this way

Scope
Honor the downstream contract

The temptation was to rebuild the Power BI report from scratch. The better move was to match the exact schema the existing report expected - byte-compatible output, zero downstream changes. One week of work instead of three months.

Validation
Fail before you write

Schema validation runs before the SharePoint write, not after. Silent schema drift is the single most common failure mode in data pipelines. Better to fail loudly and leave the prior file in place than corrupt the downstream report.
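A fail-before-write check can be sketched as a single function that raises on any contract violation; the expected columns, dtypes, and row-count floor below are illustrative, not the report's actual contract:

```python
import pandas as pd

# Illustrative contract -- placeholders for the real report schema.
EXPECTED = {"order_id": "object", "units": "int64", "revenue": "float64"}
REQUIRED_NON_NULL = ["order_id", "units"]
MIN_ROWS = 1

def validate(df: pd.DataFrame) -> None:
    """Raise before anything is written if the frame breaks the contract."""
    if list(df.columns) != list(EXPECTED):
        raise ValueError(f"column mismatch: {list(df.columns)}")
    for col, dtype in EXPECTED.items():
        if str(df[col].dtype) != dtype:
            raise TypeError(f"{col}: expected {dtype}, got {df[col].dtype}")
    if df[REQUIRED_NON_NULL].isna().any().any():
        raise ValueError("nulls in required columns")
    if len(df) < MIN_ROWS:
        raise ValueError("suspiciously few rows")
```

Because the exception fires before the SharePoint write, a failed run leaves the prior file untouched and the dashboard showing the last good data.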

Pagination
Build it in on day one

On a normal hour one API call is enough. On the hour after a promo drop it isn't. Pagination goes into the first version or it gets built under pressure in production at the worst possible time.

Observability
Run logs turn questions into one-liners

"Did the dashboard update this hour?" should take one second to answer, not ten minutes. Every run writes a structured log entry with timestamp, record count, and status. Failures alert immediately.

On the bigger pattern: If a human is clicking "export" every morning for a scheduled report, you have a single point of failure disguised as a routine. These are the easiest wins in any operations stack - high-impact, low-risk, invisible to the people consuming the output.

What changed

Dimension          Before                                              After
Trigger            Human logs in, runs saved export                    Scheduled task, hourly, unattended
Freshness          Once a day, if it ran at all                        Hourly - dashboard stays current through the day
Time cost          ~45 minutes of attention every morning              Zero - runs unattended
Failure mode       Silent - corrupt file, wrong schema, missed export  Loud - schema validation blocks bad writes, run log flags errors
Coverage           Breaks on sick days, holidays, PTO                  Runs every day, including weekends
Volume handling    Excel export timed out on high-volume days          Paginated API pull, survives any day's volume
Audit trail        None - no one knew whether the export ran           Structured run log with timestamp, count, status
Downstream impact  N/A                                                 Zero - existing Power BI report unchanged

Technologies used

Deterministic ETL - no LLM agents. The job is structured data in, structured data out, schema-validated at the boundary. Python does the work, the scheduler handles the timing, and the existing Microsoft stack carries it from SharePoint into Power BI.

Python 3.12 · Extensiv Order Manager API · OAuth2 · SharePoint · Power BI · Azure Function · Windows Scheduled Task · pandas · requests · pydantic