This section covers replacing NaNs, clipping outliers, and filling missing values using np.where and NumPy's NaN-aware routines. We contrast array-only cleaning with pandas and set expectations for when to stay in NumPy and when to move to a DataFrame. The goal is to support pipelines in which scikit-learn expects ndarray inputs with consistent shapes and no hidden missing values at fit time.
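
As a rough illustration of the array-only approach, the sketch below fills NaNs with per-column medians and then clips each column to percentile bounds. The matrix X, the choice of the median as the fill value, and the 5th/95th percentile limits are assumptions made for this example, not recommendations from the text.

```python
import numpy as np

# Hypothetical feature matrix: rows are samples, columns are features.
X = np.array([
    [1.0, 200.0],
    [np.nan, 220.0],
    [3.0, np.nan],
    [100.0, 210.0],
])

# Per-column medians computed while ignoring NaN entries.
col_median = np.nanmedian(X, axis=0)

# Where X is NaN, take the column median (broadcast across rows);
# otherwise keep the original value.
X_filled = np.where(np.isnan(X), col_median, X)

# Per-column clipping bounds from the observed (non-NaN) values.
# The 5th and 95th percentiles are an arbitrary choice for this sketch.
lower, upper = np.nanpercentile(X, [5, 95], axis=0)

# Clip each column to its own bounds; shape is unchanged.
X_clean = np.clip(X_filled, lower, upper)

# Cheap guard before handing the array to an estimator.
has_missing = bool(np.isnan(X_clean).any())
print(X_clean.shape, has_missing)
```

The result keeps the shape of the input and contains no NaNs, which is the property an estimator's fit step relies on. Computing the fill values and clipping bounds on training data only, and reusing them on new data, is a design choice left to the surrounding pipeline.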