Amazon SageMaker Data Wrangler is a purpose-built data aggregation and preparation tool for machine learning (ML). It allows you to use a visual interface to access data and perform exploratory data analysis (EDA) and feature engineering. The EDA feature comes with built-in data analysis capabilities for charts (such as scatter plot or histogram) and time-saving model analysis capabilities such as feature importance, target leakage, and model explainability. The feature engineering capability has over 300 built-in transforms and can perform custom transformations using either Python, PySpark, or Spark SQL runtime.
For custom visualizations and transforms, Data Wrangler now provides example code snippets for common types of visualizations and transforms. In this post, we demonstrate how to use these code snippets to quickstart your EDA in Data Wrangler.
Solution overview
At the time of this writing, you can import datasets into Data Wrangler from Amazon Simple Storage Service (Amazon S3), Amazon Athena,

Continue reading



At FusionWeb, we aim to look at the future through the lenses of imagination, creativity, expertise and simplicity in the most cost effective ways. All we want to make something that brings smile to our clients face. Let’s try us to believe us.