ETL Playground
A browser-only CSV cleaning lab: deterministic steps, a quick quality report, and download. No backend, no uploads to any server.
Data in
Try sample
Cleaning pipeline
Six steps run after parsing: (1) trim whitespace and normalize null tokens → (2) dedupe (identical rows first, then rows with the same normalized email, when an email column exists and the table is not event-log-shaped) → (3) infer column types → (4) coerce dates and numbers → (5) winsorize numeric columns → (6) finalize: impute missing letter+number ids when there is a single gap, format money-like columns to two decimals (empty amounts → 0.00), drop rows that still have no id, and fill the remaining blanks with N/A.
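The first two steps can be sketched in plain JavaScript. This is a minimal illustration, not the demo's actual source: the function names (`trimAndNull`, `dedupeIdentical`, `dedupeByEmail`), the `NULL_TOKENS` list, and the `email` column name are all assumptions, and the event-log-shape check is left out because the page does not say how it is detected.

```javascript
// Hypothetical null tokens; the demo's real list is not documented.
const NULL_TOKENS = new Set(["", "null", "none", "n/a", "na", "nan"]);

// Step 1: trim every cell and turn null-like tokens into an empty string.
function trimAndNull(rows) {
  return rows.map((row) =>
    Object.fromEntries(
      Object.entries(row).map(([key, value]) => {
        const text = String(value ?? "").trim();
        return [key, NULL_TOKENS.has(text.toLowerCase()) ? "" : text];
      })
    )
  );
}

// Step 2a: drop rows identical to an earlier row (first occurrence wins).
function dedupeIdentical(rows) {
  const seen = new Set();
  return rows.filter((row) => {
    const key = JSON.stringify(Object.entries(row));
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Step 2b: drop rows whose normalized (trimmed, lowercased) email
// was already seen. Rows with a blank email are always kept.
function dedupeByEmail(rows, emailColumn = "email") {
  const seen = new Set();
  return rows.filter((row) => {
    const email = String(row[emailColumn] ?? "").trim().toLowerCase();
    if (email === "") return true;
    if (seen.has(email)) return false;
    seen.add(email);
    return true;
  });
}

const sample = [
  { id: "A1", email: " Ann@Example.com ", amount: "10" },
  { id: "A1", email: "Ann@Example.com", amount: "10" },
  { id: "A2", email: "ann@example.com", amount: "NULL" },
  { id: "A3", email: "bob@example.com", amount: " 7 " },
];

const cleaned = dedupeByEmail(dedupeIdentical(trimAndNull(sample)));
console.log(cleaned);
```

Each function returns a new array and never mutates its input, which is one simple way to keep the steps deterministic and to keep the "before" table available for the preview.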
Load data to see pipeline steps.
Data quality report
No report yet.
Preview
Before (parsed)
—
After (cleaned)
—
// Technical summary
What this demo uses
This demo cleans messy CSV files in the browser and shows exactly what changed, so users can trust the result without sending data to a server.
- Methodology: A fixed ETL pipeline runs step-by-step (parse, clean, type handling, outlier control, missing-value fill). The same in-memory table powers reports, previews, and downloads.
- Technical terms:
- ETL: Extract, Transform, Load - collect data, clean it, and prepare it for use.
- Winsorize: Caps extreme values at chosen percentile limits so outliers do less damage.
- Deterministic: Same input gives the same output every time.
- Toolsets/technology used: JavaScript and Papa Parse for CSV processing, with data held in client-side browser storage/memory only.
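To make the winsorize term concrete, here is a minimal sketch of percentile-based capping in JavaScript. The `percentile` and `winsorize` helpers are hypothetical names, and both the percentile limits and the linear interpolation are assumptions; the page does not state which limits the demo applies.

```javascript
// Linear-interpolated percentile of an ascending-sorted array (p in [0, 1]).
function percentile(sorted, p) {
  const pos = (sorted.length - 1) * p;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Clamp every value into [lower percentile, upper percentile].
// The 5%/95% defaults are an assumption, not the demo's documented limits.
function winsorize(values, lower = 0.05, upper = 0.95) {
  if (values.length === 0) return [];
  const sorted = [...values].sort((a, b) => a - b); // copy, then numeric sort
  const min = percentile(sorted, lower);
  const max = percentile(sorted, upper);
  return values.map((v) => Math.min(Math.max(v, min), max));
}

const amounts = [10, 12, 11, 13, 12, 11, 10, 12, 13, 500];
const capped = winsorize(amounts, 0.1, 0.9);
console.log(capped); // the 500 is pulled down to the 90th-percentile limit
```

Unlike dropping outliers, winsorizing keeps every row, so row counts in the before and after previews stay comparable. Values already inside the limits pass through unchanged.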