HK
AutoSweep

Details

Duration

Completed

Role

Full Stack Developer

Tools

Python, Scikit-learn, Pandas, NumPy, PyPI

Summary

AutoSweep is a lightweight Python preprocessing library, published on PyPI, that automates data cleaning, outlier detection, and scaling through a single flexible API. It reduces preprocessing boilerplate code by 80% and saves developers 2+ hours per project.

Install via pip:

    pip install autosweep-preprocessing

Usage:

    from autosweep_preprocessing import AutoSweep

    result = AutoSweep(
        file_path="data.csv",
        target_column="target",
        encode_categorical="onehot",
        remove_correlated=True,
        structured_output=True,
    )

    X = result["X"]
    y = result["y"]
    info = result["info"]

AutoSweep supports:

• CSV/Excel loading
• Missing value handling and imputation
• Numeric scaling (standard, minmax, robust)
• Categorical encoding (onehot, ordinal, label)
• Optional datetime feature extraction
• Optional outlier handling (iqr, zscore)
• Optional correlation and low-variance filtering
• Structured output for pipeline diagnostics
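For context, here is a rough sketch of the kind of hand-written boilerplate that a few of the steps listed above (imputation, IQR outlier handling, correlation filtering, min-max scaling, one-hot encoding) usually involve in plain pandas. This is not AutoSweep's implementation: the function name `manual_preprocess`, the 1.5 × IQR clipping rule, and the 0.95 correlation threshold are assumptions chosen for illustration.

```python
import pandas as pd


def manual_preprocess(df: pd.DataFrame, target_column: str, corr_threshold: float = 0.95):
    """Hypothetical hand-rolled pipeline; illustrative only, not AutoSweep's code."""
    y = df[target_column]
    features = df.drop(columns=[target_column])

    numeric_cols = features.select_dtypes(include="number").columns
    categorical_cols = [c for c in features.columns if c not in numeric_cols]

    # Missing values: fill numeric columns with the column median.
    num = features[numeric_cols].astype(float)
    num = num.fillna(num.median())

    # Outlier handling (assumed rule): clip values to [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
    q1, q3 = num.quantile(0.25), num.quantile(0.75)
    iqr = q3 - q1
    num = num.clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr, axis=1)

    # Correlation filtering: drop a column if it is highly correlated
    # with any column that appears before it.
    corr = num.corr().abs()
    to_drop = [
        col
        for i, col in enumerate(corr.columns)
        if (corr.iloc[:i, i] > corr_threshold).any()
    ]
    num = num.drop(columns=to_drop)

    # Min-max scaling to [0, 1]; constant columns are left at 0 instead of dividing by zero.
    span = (num.max() - num.min()).replace(0, 1)
    num = (num - num.min()) / span

    parts = [num]
    if categorical_cols:
        cat = features[categorical_cols].copy()
        for col in categorical_cols:
            modes = cat[col].mode()
            if not modes.empty:
                # Missing values: fill with the most frequent category.
                cat[col] = cat[col].fillna(modes.iloc[0])
        # One-hot encode the categorical columns.
        parts.append(pd.get_dummies(cat))

    return pd.concat(parts, axis=1), y
```

Each step here is a separate decision about ordering, thresholds, and edge cases; a single call with keyword options like the `AutoSweep(...)` usage shown above is meant to replace this sort of code.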