KDD20 Tutorial: Practical Automated Machine Learning with Tabular, Text, and Image Data
Automated machine learning (AutoML) offers the promise of translating raw data into accurate predictions with just a few lines of code. Rather than relying on human time/effort and manual experimentation, models can be improved by simply letting the AutoML system run for more time.
In this hands-on tutorial, we demonstrate fundamental techniques that enable powerful AutoML. We consider standard supervised learning tasks on various types of data including tables, text, and images. Rather than technical descriptions of individual ML models, we emphasize how to best use models within an overall ML pipeline that takes in raw training data and outputs predictions for test data. A major focus of our tutorial is on automating deep learning, a class of powerful techniques that are cumbersome to manage manually. Each topic covered in the tutorial is accompanied by a hands-on Jupyter notebook that implements best practices.
Most of this code is adapted from AutoGluon, a recent AutoML toolkit that makes it easy to translate your data into highly accurate models: autogluon.mxnet.io
Before running the hands-on tutorials on your own machine, please install AutoGluon, then confirm that the installed version is 0.0.13.
You'll also need to install MXNet by following this guide.
Tutorial #7 also requires you to install PyTorch and torchvision.
A Linux machine with a GPU is recommended, although you should be able to easily run the tabular data tutorials (#1-4) on a Mac laptop as well. All tutorials should be run in either Python 3.6 or 3.7.
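The requirements above can be checked before opening the notebooks. The following is a minimal sketch, not part of the tutorial materials: the helper names (`python_ok`, `installed_version`, `environment_problems`) are hypothetical, and it assumes AutoGluon is installed under the pip distribution name `autogluon` and that MXNet, PyTorch, and torchvision are importable as `mxnet`, `torch`, and `torchvision`.

```python
import importlib.util
import sys

REQUIRED_AUTOGLUON = "0.0.13"            # version named in this README
SUPPORTED_PYTHONS = {(3, 6), (3, 7)}     # Python versions named in this README


def python_ok(version_info=sys.version_info):
    """Return True if the (major, minor) version is one the tutorials support."""
    return tuple(version_info[:2]) in SUPPORTED_PYTHONS


def installed_version(package):
    """Return the installed version string of a pip distribution, or None."""
    try:
        # Available in the standard library from Python 3.8 onward.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Fallback for Python 3.6/3.7, which lack importlib.metadata.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(package).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def environment_problems():
    """Collect human-readable descriptions of unmet requirements."""
    problems = []
    if not python_ok():
        problems.append("Python 3.6 or 3.7 is expected, found %d.%d"
                        % tuple(sys.version_info[:2]))
    # Assumption: the pip distribution is named "autogluon".
    found = installed_version("autogluon")
    if found != REQUIRED_AUTOGLUON:
        problems.append("AutoGluon %s is expected, found %s"
                        % (REQUIRED_AUTOGLUON, found))
    # torch and torchvision are only needed for Tutorial #7.
    for module in ("mxnet", "torch", "torchvision"):
        if importlib.util.find_spec(module) is None:
            problems.append("module '%s' is not importable" % module)
    return problems


if __name__ == "__main__":
    for problem in environment_problems():
        print(problem)
```

An empty result from `environment_problems()` suggests the setup matches the requirements listed here. The check looks for importable modules rather than pip names for MXNet because GPU builds of MXNet are published under several distribution names.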