Automated retailer PO processing & upload
How we replaced a manual, error-prone PO conversion workflow with a browser-based tool that parses TJX/HomeGoods/Marshalls PDFs, validates every line, and exports Extensiv-ready upload files in three steps — with a human review gate before anything touches the WMS.
45 minutes of manual work per PO — brittle, slow, and invisible to QC
Processing a single TJX purchase order took roughly 45 minutes from PDF to Extensiv-ready CSV. TJX Companies sends purchase orders as multi-page PDFs: a header page with styles, quantities, and unit costs, followed by a distribution section that breaks those quantities down by destination DC. To get those orders into Extensiv (formerly Skubana), someone had to cross-reference the two sections by hand, build one row per SKU per DC, calculate order totals, and format all 18 columns correctly every time.
Without a tool, this process lived in spreadsheets. DC addresses were typed by hand. Unit prices were looked up from the header and matched to each distribution row manually. Date fields were reformatted to survive Excel's CSV parser. Any mistake — a transposed quantity, a mismatched SKU, a dropped leading zero on a zip code — flowed directly into Extensiv and required a manual correction. The same PO now processes in under 3 minutes.
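The cross-referencing step described above is essentially a join between header lines and distribution lines. The sketch below illustrates that idea only; it is not the tool's actual code. The function name `buildRows`, the object shapes, and the sample prices are all assumptions made up for the example.

```javascript
// Hypothetical shapes: header lines carry the unit cost per SKU,
// distribution lines carry the quantity per SKU per DC.
function buildRows(headerLines, distributionLines) {
  // Index unit costs by SKU so each distribution line can look up its price.
  const priceBySku = new Map(headerLines.map((h) => [h.sku, h.unitCost]));

  return distributionLines.map((d) => {
    const unitPrice = priceBySku.has(d.sku) ? priceBySku.get(d.sku) : null;
    return {
      dc: d.dc,
      sku: d.sku,
      qty: d.qty,
      unitPrice, // null marks a SKU missing from the header, for QC to flag
      rowTotal:
        unitPrice === null ? null : Math.round(d.qty * unitPrice * 100) / 100,
    };
  });
}

// Made-up sample data, one row per SKU per DC.
const builtRows = buildRows(
  [
    { sku: "SKU-001", unitCost: 4.5 },
    { sku: "SKU-002", unitCost: 3.25 },
  ],
  [
    { dc: "A", sku: "SKU-001", qty: 24 },
    { dc: "A", sku: "SKU-002", qty: 24 },
    { dc: "B", sku: "SKU-001", qty: 12 },
    { dc: "C", sku: "SKU-003", qty: 18 },
  ]
);
```

Leaving an unmatched SKU with a `null` price, rather than guessing or dropping the row, keeps the problem visible for a later review step.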
Three steps, one human review gate, zero server
The tool runs entirely in the browser. PDF.js extracts text client-side; a parsing pipeline handles the layout variation across TJX, HomeGoods, and Marshalls documents; the result lands in an editable review table before anything is exported. The buyer never touches a spreadsheet until the file is done.
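PDF.js returns text as positioned fragments rather than lines, so a parsing pipeline typically starts by rebuilding visual lines. The sketch below shows one way to do that, assuming items shaped like those from PDF.js `getTextContent()` (a `str` plus a six-element `transform` whose last two entries are x and y). The function name `groupIntoLines` and the tolerance value are hypothetical, and the tool's real pipeline may work differently.

```javascript
// Groups PDF.js-style text items into visual lines.
// Assumes transform[4] is the x position and transform[5] is the y position.
function groupIntoLines(items, yTolerance = 2) {
  const lines = [];
  for (const item of items) {
    const x = item.transform[4];
    const y = item.transform[5];
    // Reuse an existing line if its baseline is within the tolerance.
    let line = lines.find((l) => Math.abs(l.y - y) <= yTolerance);
    if (!line) {
      line = { y, parts: [] };
      lines.push(line);
    }
    line.parts.push({ x, str: item.str });
  }
  // PDF y coordinates grow upward, so sort descending to read top to bottom.
  lines.sort((a, b) => b.y - a.y);
  return lines.map((l) =>
    l.parts
      .sort((a, b) => a.x - b.x) // left to right within a line
      .map((p) => p.str)
      .join(" ")
  );
}

// Made-up fragments, deliberately out of reading order.
const sampleItems = [
  { str: "24", transform: [1, 0, 0, 1, 200, 700] },
  { str: "SKU-001", transform: [1, 0, 0, 1, 50, 700.5] },
  { str: "SKU-002", transform: [1, 0, 0, 1, 50, 680] },
  { str: "24", transform: [1, 0, 0, 1, 200, 680] },
];
const textLines = groupIntoLines(sampleItems);
// textLines is ["SKU-001 24", "SKU-002 24"]
```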
Three layout bugs that required format-aware fixes
TJX/MarMaxx PO PDFs are not structured data exports; they are print-formatted documents extracted as raw text. The parser had to be hardened against three layout failure modes found in production documents before its output was reliable: the columnar HomeGoods layout, SKUs split across lines, and cancel dates that appear on the line after their label.
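Two of the fixes named in the tech stack table, positional pairing for columnar layouts and next-line lookahead for cancel dates, might look roughly like the sketch below. The input shapes, the `CANCEL` label match, and the `MM/DD/YYYY` date pattern are assumptions for illustration; the real documents and parser may differ.

```javascript
// Columnar layout: SKUs and quantities arrive as two separate runs of text,
// so they are paired by position rather than read from a single line.
function pairColumns(skus, qtys) {
  if (skus.length !== qtys.length) {
    // Refuse to guess when the columns do not line up.
    throw new Error(
      `Column length mismatch: ${skus.length} SKUs vs ${qtys.length} quantities`
    );
  }
  return skus.map((sku, i) => ({ sku, qty: Number(qtys[i]) }));
}

// Assumed date pattern (MM/DD/YYYY or similar); hypothetical for this sketch.
const DATE_RE = /\b(\d{1,2}\/\d{1,2}\/\d{2,4})\b/;

// Next-line lookahead: if the label line carries no date, check the line after it.
function findCancelDate(lines) {
  for (let i = 0; i < lines.length; i++) {
    if (!/cancel/i.test(lines[i])) continue;
    const sameLine = lines[i].match(DATE_RE);
    if (sameLine) return sameLine[1];
    const nextLine = (lines[i + 1] || "").match(DATE_RE);
    if (nextLine) return nextLine[1];
  }
  return null; // nothing found; leave it for human review
}

const paired = pairColumns(["SKU-001", "SKU-002"], ["24", "12"]);
const cancelDate = findCancelDate([
  "CONSOLIDATOR",
  "CANCEL DATE:",
  "03/15/2025 SHIP VIA TRUCK",
]);
```

Throwing on a length mismatch is a deliberate choice in this sketch: silently pairing misaligned columns would produce rows that look valid but carry the wrong quantities.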
What the QC screen looks like before export
After the parser runs, the user lands on a review table with every parsed row editable inline. The QC panel runs three checks automatically: per-order total verification, missing unit price detection, and date field validation. The export button only activates once QC passes or the user explicitly overrides a flag.
| Order Number | DC | SKU | Qty | Unit Price | Row Total | Order Total | Check |
|---|---|---|---|---|---|---|---|
| 91-XXXXXX | Retailer DC [A] | SKU-001 | 24 | [redacted] | [redacted] | [redacted] | ✓ PASS |
| 91-XXXXXX | Retailer DC [A] | SKU-002 | 24 | [redacted] | [redacted] | [redacted] | ✓ PASS |
| 96-XXXXXX | Retailer DC [B] | SKU-001 | 12 | [redacted] | [redacted] | [redacted] | ✓ PASS |
| 85-XXXXXX | Retailer DC [C] | SKU-003 | 18 | [redacted] | [redacted] | [redacted] | ✓ PASS |
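The three QC checks and the export gate can be sketched as plain functions over the review rows. This is an illustrative sketch under assumptions: the row fields (`orderNumber`, `qty`, `unitPrice`, `orderTotal`, `cancelDate`), the `MM/DD/YYYY` date format, and every function name are hypothetical, and the sample values are made up.

```javascript
// Accepts only real calendar dates in the assumed MM/DD/YYYY text format.
function isValidDate(text) {
  const m = /^(\d{2})\/(\d{2})\/(\d{4})$/.exec(text || "");
  if (!m) return false;
  const month = Number(m[1]);
  const day = Number(m[2]);
  const year = Number(m[3]);
  const d = new Date(Date.UTC(year, month - 1, day));
  // Date rolls invalid days over (02/30 becomes 03/02), so compare the parts back.
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}

function runQc(rows) {
  const flags = [];
  const sums = new Map(); // order number -> summed row totals, in integer cents
  rows.forEach((r, i) => {
    if (r.unitPrice == null || Number.isNaN(r.unitPrice)) {
      flags.push({ row: i, check: "missing-unit-price" });
      return;
    }
    const cents = Math.round(r.qty * r.unitPrice * 100);
    sums.set(r.orderNumber, (sums.get(r.orderNumber) || 0) + cents);
  });
  rows.forEach((r, i) => {
    if (sums.get(r.orderNumber) !== Math.round(r.orderTotal * 100)) {
      flags.push({ row: i, check: "order-total-mismatch" });
    }
    if (!isValidDate(r.cancelDate)) {
      flags.push({ row: i, check: "invalid-date" });
    }
  });
  return flags;
}

// Export is allowed only when every flag has been explicitly overridden.
function canExport(flags, overridden = new Set()) {
  return flags.every((f) => overridden.has(`${f.row}:${f.check}`));
}

const qcRows = [
  { orderNumber: "91-000001", qty: 24, unitPrice: 2.5, orderTotal: 108, cancelDate: "03/15/2025" },
  { orderNumber: "91-000001", qty: 24, unitPrice: 2, orderTotal: 108, cancelDate: "03/15/2025" },
  { orderNumber: "96-000002", qty: 12, unitPrice: null, orderTotal: 30, cancelDate: "02/30/2025" },
];
const qcFlags = runQc(qcRows);
```

Summing in integer cents avoids floating-point drift when comparing totals. In this sample the first order passes all three checks and the third row fails all three.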
Distribution center logic baked into the parser
The parser maintains a lookup table mapping two-digit distribution prefixes to their retailer and DC identity. Order numbers are automatically formatted as XX-<HeaderPO>. DC names in the Ship To Name column always have the DC number appended, eliminating any ambiguity in the Extensiv order list.
| Prefix range | Retailer | Notes |
|---|---|---|
| 881–887, 890 | HomeGoods | Columnar layout — positional array pairing required |
| 891, 896–898 | TJX | Standard distribution block format |
| 6610 | TJX Chino | 4-digit prefix — special-cased in order number formatter |
| 885–886 | Marshalls | CONSOLIDATOR block — cancel date on next-line lookahead |
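The formatting rules above can be sketched as two small helpers. The source does not spell out how the prefix is derived or how the four-digit case is handled, so this sketch rests on two loudly labeled assumptions: that `XX` is the last two digits of a three-digit DC number (consistent with the `91-`, `96-`, and `85-` sample orders in the QC table), and that the four-digit DC 6610 is used in full. The function names and the space separator in the Ship To Name are also hypothetical.

```javascript
// ASSUMPTION: "XX" is the last two digits of a three-digit DC number
// (e.g. 891 -> "91"), and a four-digit DC such as 6610 is kept whole.
// Neither rule is stated in the source; adjust to the real behavior.
function formatOrderNumber(dcNumber, headerPo) {
  const dc = String(dcNumber);
  const prefix = dc.length === 4 ? dc : dc.slice(-2);
  return `${prefix}-${headerPo}`;
}

// Appends the DC number to the DC name so entries stay distinguishable
// in an order list. The space separator is an assumption.
function formatShipToName(dcName, dcNumber) {
  return `${dcName} ${dcNumber}`;
}

// Made-up header PO number for illustration.
const standardOrder = formatOrderNumber(891, "123456");
const chinoOrder = formatOrderNumber(6610, "123456");
const shipToName = formatShipToName("Retailer DC", 896);
```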
Runs in the browser on infrastructure the brand already uses
The entire parsing and export pipeline is client-side: no upload, no server, no credentials at risk. The Extensiv API integration planned for Phase 2 adds a thin Vercel serverless proxy as the only backend component, so credentials stay server-side and the browser never sees them.
| Layer | Tool | Notes |
|---|---|---|
| PDF extraction | PDF.js | Client-side text extraction · no file upload · works offline |
| UI framework | React | Three-step wizard · inline editable review table · warm peach/cream design system |
| Parsing logic | Vanilla JS | Layout-aware: columnar HomeGoods, line-split SKU repair, next-line date lookahead |
| Export format | CSV (quoted) | Zip codes quoted · dates as text · 18-column Extensiv template · named per spec |
| WMS | Extensiv (Skubana) | Order upload target · OAuth2 integration via Vercel proxy (Phase 2) |
| Hosting | Vercel | Static React app + serverless proxy for Extensiv API credentials |
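The quoted-CSV export row in the table can be illustrated with a small serializer that quotes every field, so a zip code such as `01234` keeps its leading zero when the file is opened elsewhere. This is a minimal sketch, not the tool's exporter: the function names are hypothetical, the three column names stand in for the real 18-column template (which is not reproduced here), and how the tool keeps dates as text is not specified in the source.

```javascript
// Quote every field and double any embedded quotes (RFC 4180 style).
function quoteField(value) {
  const text = value == null ? "" : String(value);
  return `"${text.replace(/"/g, '""')}"`;
}

// Serializes rows in a fixed column order, with a header line.
function toCsv(columns, rows) {
  const header = columns.map(quoteField).join(",");
  const body = rows.map((row) =>
    columns.map((col) => quoteField(row[col])).join(",")
  );
  return [header, ...body].join("\r\n") + "\r\n";
}

// Hypothetical column names; values must already be strings so that
// leading zeros and date text survive up to this point.
const csvText = toCsv(
  ["Ship To Name", "Ship To Zip", "Cancel Date"],
  [
    {
      "Ship To Name": 'Retailer DC "A"',
      "Ship To Zip": "01234",
      "Cancel Date": "03/15/2025",
    },
  ]
);
```

Quoting alone does not stop a spreadsheet application from reinterpreting a value on import. It only guarantees the file itself carries the exact text.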
What gets built next
The v1 pipeline covers the full lifecycle from PDF upload through validated CSV export. The next phase connects the output directly to Extensiv via API, eliminating the manual upload step entirely, and expands the platform to support other brands and WMS platforms as a SaaS offering.