PYTORCH DEVELOPER DAY: DAY 2 INFORMATION

Posters

xaitk-saliency: Saliency built for analytics and autonomy applications
Brian Hu, Paul Tunison, Elim Schenck, Roddy Collins, Anthony Hoogs

Despite significant progress in the past few years, machine learning-based systems are still often viewed as “black boxes,” which lack the ability to explain their output decisions to human users. Explainable artificial intelligence (XAI) attempts to help end-users understand and appropriately trust machine learning-based systems. One commonly used technique involves saliency maps, which are a form of visual explanation that reveals what an algorithm pays attention to during its decision process. We introduce the xaitk-saliency python package, an open-source, explainable AI framework and toolkit for visual saliency algorithm interfaces and implementations, built for analytics and autonomy applications. The framework is modular and easily extendable, with support for several image understanding tasks, including image classification, image similarity, and object detection. We have also recently added support for the autonomy domain, by creating saliency maps for pixel-based deep reinforcement-learning agents in environments such as ATARI. Several example notebooks are included that demo the current capabilities of the toolkit. xaitk-saliency will be of broad interest to anyone who wants to deploy AI capabilities in operational settings and needs to validate, characterize and trust AI performance across a wide range of real-world conditions and application areas using saliency maps. To learn more, please visit: https://github.com/XAITK/xaitk-saliency.
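To make the idea concrete, here is a minimal plain-PyTorch sketch of occlusion-style saliency for an image classifier. It illustrates the general technique only and does not use the xaitk-saliency interfaces themselves; `model` and `image` are assumed placeholders.

    import torch

    def occlusion_saliency(model, image, target_class, patch=16, stride=8):
        """Slide a gray patch over the image and record the drop in class score."""
        model.eval()
        _, h, w = image.shape
        with torch.no_grad():
            base = torch.softmax(model(image.unsqueeze(0)), dim=1)[0, target_class]
        saliency = torch.zeros(h, w)
        counts = torch.zeros(h, w)
        for y in range(0, h - patch + 1, stride):
            for x in range(0, w - patch + 1, stride):
                occluded = image.clone()
                occluded[:, y:y + patch, x:x + patch] = 0.5  # neutral gray occluder
                with torch.no_grad():
                    score = torch.softmax(model(occluded.unsqueeze(0)), dim=1)[0, target_class]
                saliency[y:y + patch, x:x + patch] += (base - score)  # score drop = importance
                counts[y:y + patch, x:x + patch] += 1
        return saliency / counts.clamp(min=1)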

https://github.com/XAITK/xaitk-saliency

MEDICAL & HEALTHCARE, RESPONSIBLE AI

CovRNN—A collection of recurrent neural network models for predicting outcomes of COVID-19 patients using their EHR data
Laila Rasmy, Ziqian Xie, Bingyu Mao, Khush Patel, Wanheng Zhang, Degui Zhi

CovRNN is a collection of recurrent neural network (RNN)-based models to predict COVID-19 patients' outcomes using their available electronic health record (EHR) data on admission, without the need for specific feature selection or missing data imputation. CovRNN is designed to predict three outcomes: in-hospital mortality, need for mechanical ventilation, and long length of stay (LOS >7 days). Predictions are made as time-to-event risk scores (survival prediction) and all-time risk scores (binary prediction). Our models were trained and validated using heterogeneous and de-identified data of 247,960 COVID-19 patients from 87 healthcare systems, derived from the Cerner® Real-World Dataset (CRWD), and 36,140 de-identified patients' data derived from the Optum® de-identified COVID-19 Electronic Health Record v. 1015 dataset (2007 - 2020). CovRNN shows higher performance than traditional models. It achieved an area under the receiver operating characteristic curve (AUROC) of 93% for mortality and mechanical ventilation predictions on the CRWD test set (vs. 91.5% and 90% for the light gradient boosting machine (LGBM) and logistic regression (LR), respectively) and 86.5% for prediction of LOS > 7 days (vs. 81.7% and 80% for LGBM and LR, respectively). For survival prediction, CovRNN achieved a C-index of 86% for mortality and 92.6% for mechanical ventilation. External validation confirmed AUROCs in similar ranges. https://www.medrxiv.org/content/10.1101/2021.09.27.2126
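For illustration only (this is not the released CovRNN code), a minimal sketch of an RNN that maps a padded sequence of medical-code embeddings to a binary outcome score such as in-hospital mortality; the vocabulary size and dimensions are invented.

    import torch
    import torch.nn as nn

    class EHRSequenceClassifier(nn.Module):
        def __init__(self, n_codes=20000, emb_dim=128, hidden=256):
            super().__init__()
            self.embed = nn.Embedding(n_codes, emb_dim, padding_idx=0)
            self.rnn = nn.GRU(emb_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, codes, lengths):
            x = self.embed(codes)                                 # (batch, seq, emb)
            packed = nn.utils.rnn.pack_padded_sequence(
                x, lengths.cpu(), batch_first=True, enforce_sorted=False)
            _, h = self.rnn(packed)                               # h: (1, batch, hidden)
            return torch.sigmoid(self.head(h[-1])).squeeze(-1)    # per-patient risk score

    model = EHRSequenceClassifier()
    codes = torch.randint(1, 20000, (4, 50))                      # 4 patients, 50 codes each
    lengths = torch.tensor([50, 32, 10, 47])
    risk = model(codes, lengths)                                   # all-time (binary) risk scores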

https://github.com/ZhiGroup/CovRNN

MEDICAL & HEALTHCARE, RESPONSIBLE AI

Farabio - Deep learning for Biomedical Imaging
Sanzhar Askaruly, Nurbolat Aimakov, Alisher Iskakov, Hyewon Cho, Yujin Ahn, Myeong Hoon Choi, Hyunmo Yang, Woonggyu Jung

Deep learning has recently transformed many aspects of industrial pipelines. Scientists involved in biomedical imaging research are also benefiting from the power of AI to tackle complex challenges. Although the academic community has widely adopted image processing tools such as scikit-image and ImageJ, there is still a need for a tool that integrates deep learning into biomedical image analysis. We propose a minimal but convenient Python package based on PyTorch with common deep learning models, extended by flexible trainers and medical datasets. In this work, we also share a theoretical dive in the form of a course, as well as minimal tutorials to run Android applications containing models trained with Farabio.

https://github.com/tuttelikz/farabio

MEDICAL & HEALTHCARE, RESPONSIBLE AI

TorchIO: Pre-processing & Augmentation of Medical Images for Deep Learning Applications
Fernando Pérez-García, Rachel Sparks, Sébastien Ourselin

Processing of medical images such as MRI or CT presents different challenges compared to RGB images typically used in computer vision: a lack of labels for large datasets, high computational costs, and the need for metadata to describe the physical properties of voxels. Data augmentation is used to artificially increase the size of the training datasets. Training with image patches decreases the need for computational power. Spatial metadata needs to be carefully taken into account in order to ensure correct alignment and orientation of volumes. We present TorchIO, an open-source Python library enabling efficient loading, preprocessing, augmentation and patch-based sampling of medical images for deep learning. TorchIO follows the style of PyTorch and integrates standard medical image processing libraries to efficiently process images during the training of neural networks. TorchIO transforms can be easily composed, reproduced, traced and extended. We provide multiple generic preprocessing and augmentation operations as well as simulation of MRI-specific artifacts. TorchIO was developed to help researchers standardize medical image processing pipelines and allow them to focus on the deep learning experiments. It encourages good open-science practices, as it supports experiment reproducibility and is version-controlled so that the software can be cited precisely. Due to its modularity, the library is compatible with other frameworks for deep learning with medical images.
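A short usage sketch in the spirit of the TorchIO documentation; the file paths are placeholders and exact transform arguments may differ across versions.

    import torchio as tio

    subject = tio.Subject(
        t1=tio.ScalarImage('subject01_t1.nii.gz'),
        seg=tio.LabelMap('subject01_seg.nii.gz'),
    )
    transform = tio.Compose([
        tio.RescaleIntensity(out_min_max=(0, 1)),   # preprocessing
        tio.RandomAffine(),                         # spatially consistent augmentation
        tio.RandomMotion(p=0.2),                    # MRI-specific artifact simulation
    ])
    dataset = tio.SubjectsDataset([subject], transform=transform)

    # Patch-based training: sample 64^3 patches from a queue of transformed volumes.
    sampler = tio.UniformSampler(patch_size=64)
    queue = tio.Queue(dataset, max_length=100, samples_per_volume=8, sampler=sampler)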

https://github.com/fepegar/torchio/

MEDICAL & HEALTHCARE, RESPONSIBLE AI

MONAI: A Domain Specialized Library for Healthcare Imaging
Michael Zephyr, Prerna Dogra, Richard Brown, Wenqi Li, Eric Kerfoot

Healthcare image analysis for both radiology and pathology is increasingly being addressed with deep-learning-based solutions. These applications have specific requirements to support various imaging modalities like MR, CT, ultrasound, digital pathology, etc. It is a substantial effort for researchers in the field to develop custom functionalities to handle these requirements. Consequently, there has been duplication of effort, and as a result, researchers have incompatible tools, which makes it hard to collaborate. MONAI stands for Medical Open Network for AI. Its mission is to accelerate the development of healthcare imaging solutions by providing domain-specialized building blocks and a common foundation for the community to converge in a native PyTorch paradigm.
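As a hedged illustration of the kind of domain-specialized building blocks MONAI provides, the sketch below combines a 3D UNet with a Dice loss; class and argument names follow the public documentation, but dimensions and data are illustrative.

    import torch
    from monai.networks.nets import UNet
    from monai.losses import DiceLoss

    model = UNet(
        spatial_dims=3, in_channels=1, out_channels=2,
        channels=(16, 32, 64, 128), strides=(2, 2, 2),
    )
    loss_fn = DiceLoss(to_onehot_y=True, softmax=True)

    volume = torch.rand(1, 1, 64, 64, 64)                 # batch of one CT/MR volume
    label = torch.randint(0, 2, (1, 1, 64, 64, 64))       # voxel-wise segmentation label
    loss = loss_fn(model(volume), label)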

https://monai.io/

MEDICAL & HEALTHCARE, RESPONSIBLE AI

A Framework for Bayesian Neural Networks
Sahar Karimi, Beliz Gokkaya, Audrey Flower, Ehsan Emamjomeh-Zadeh, Adly Templeton, Ilknur Kaynar Kabul, Erik Meijer

We are presenting a framework for building Bayesian Neural Networks (BNNs). One of the critical use cases of BNNs is uncertainty quantification of ML predictions in deep learning models. Uncertainty quantification leads to more robust and reliable ML systems, which are often employed to prevent the catastrophic outcomes of overconfident predictions, especially in sensitive applications such as integrity, medical imaging and treatments, and self-driving cars. Our framework provides tools to build BNN models, estimate the uncertainty of their predictions, and transform existing models into their BNN counterparts. We discuss the building blocks and API of our framework along with a few examples and future directions.
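The framework's own API is not shown in the abstract, so the following is only a generic sketch of the core BNN idea: weights are distributions, predictions are sampled, and the spread of the samples quantifies uncertainty.

    import torch
    import torch.nn as nn

    class BayesianLinear(nn.Module):
        """Mean-field Gaussian posterior over weights, sampled via reparameterization."""
        def __init__(self, in_features, out_features):
            super().__init__()
            self.w_mu = nn.Parameter(torch.zeros(out_features, in_features))
            self.w_rho = nn.Parameter(torch.full((out_features, in_features), -5.0))
            self.b_mu = nn.Parameter(torch.zeros(out_features))
            self.b_rho = nn.Parameter(torch.full((out_features,), -5.0))

        def forward(self, x):
            w_sigma = torch.nn.functional.softplus(self.w_rho)
            b_sigma = torch.nn.functional.softplus(self.b_rho)
            w = self.w_mu + w_sigma * torch.randn_like(w_sigma)   # sample weights
            b = self.b_mu + b_sigma * torch.randn_like(b_sigma)
            return torch.nn.functional.linear(x, w, b)

    layer = BayesianLinear(10, 1)
    x = torch.rand(32, 10)
    samples = torch.stack([layer(x) for _ in range(50)])           # 50 stochastic forward passes
    mean, uncertainty = samples.mean(0), samples.std(0)            # predictive mean and spread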

MEDICAL & HEALTHCARE, RESPONSIBLE AI

Revamp of torchvision datasets and transforms
Philip Meier, torchvision team, torchdata team

torchvision provides a lot of image and video datasets as well as transformations for research and prototyping. In fact, the very first release of torchvision in 2016 was all about these two submodules. Since their inception, their scope has grown organically, making them hard to maintain and sometimes also hard to use. Over the years we have gathered a lot of user feedback and decided to revamp the datasets and transforms. This poster will showcase the current state of the rework and compare it to the (hopefully soon-to-be) legacy API.

https://pytorchvideo.org/

AUDIO, IMAGE & VIDEO, VISION

OpenMMLab: Open-Source Toolboxes for Artificial Intelligence
Wenwei Zhang, Han Lyu, Kai Chen

OpenMMLab builds open-source toolboxes for computer vision. It aims to 1) provide high-quality codebases to reduce the difficulty of algorithm reimplementation; 2) create efficient deployment toolchains targeting a variety of inference engines and devices; 3) build a solid foundation for the community to bridge the gap between academic research and industrial applications. Based on PyTorch, OpenMMLab develops MMCV to provide unified abstract interfaces and common utilities, which serve as the foundation of the whole system. Since the initial release in October 2018, OpenMMLab has released 15+ toolboxes covering different research areas, implemented 200+ algorithms, and released 1,800+ pre-trained models. With tighter collaboration with the community, OpenMMLab will open source more toolboxes and full-stack toolchains in the future.

openmmlab.com

AUDIO, IMAGE & VIDEO, VISION

Flood Segmentation on Sentinel-1 SAR Imagery with Semi-Supervised Learning
Siddha Ganju, Sayak Paul

Floods wreak havoc throughout the world, causing billions of dollars in damages and uprooting communities, ecosystems and economies. Aligning flood extent mapping with local topography can provide a plan of action that the disaster response team can consider, so remote flood level estimation via satellites like Sentinel-1 can prove remedial. The Emerging Techniques in Computational Intelligence (ETCI) competition on Flood Detection tasked participants with predicting flooded pixels after training with synthetic aperture radar (SAR) images in a supervised setting. We use a cyclical approach: (1) train an ensemble of multiple UNet architectures on the available high- and low-confidence labeled data and generate pseudo labels (low-confidence labels) for the entire unlabeled test dataset; (2) filter the generated labels, keeping only high-quality ones; and (3) combine the retained generated labels with the previously available high-confidence labeled dataset. This assimilated dataset is used for the next round of ensemble training, and the cycle is repeated until the performance improvement plateaus. Additionally, we post-process our results with Conditional Random Fields. Our approach achieves the second-highest score on the public hold-out test leaderboard for the ETCI competition with an IoU of 0.7654. To the best of our knowledge, this is one of the first works to apply semi-supervised learning to improve flood segmentation models.
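A condensed, hypothetical sketch of one pseudo-labeling round (ensembling, exact thresholds and CRF post-processing omitted); `model`, `unlabeled_loader`, `labeled_loader` and `train_one_round` are placeholders, not the released code.

    import torch

    def pseudo_label_round(model, unlabeled_loader, confidence=0.9, device="cuda"):
        """Generate masks on unlabeled SAR tiles and keep only confident pixels."""
        model.eval()
        kept = []
        with torch.no_grad():
            for images in unlabeled_loader:
                probs = torch.sigmoid(model(images.to(device)))
                # Pixels the model is confident about (either very high or very low probability).
                conf_mask = (probs > confidence) | (probs < 1 - confidence)
                kept.append((images.cpu(), (probs > 0.5).float().cpu(), conf_mask.cpu()))
        return kept

    # for cycle in range(num_rounds):
    #     pseudo = pseudo_label_round(model, unlabeled_loader)
    #     train_one_round(model, labeled_loader, pseudo)   # retrain on the combined dataset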

https://github.com/sidgan/ETCI-2021-Competition-on-FLood-Detection

AUDIO, IMAGE & VIDEO, VISION

Real time Speech Enhancement
Xiaoyu Liu, James Wagner, Roy Fejgin, Joan Serra, Santiago Pascual, Cong Zhou, Jordi Pons, Vivek Kumar

Speech enhancement is a fundamental audio processing task that has experienced a radical change with the advent of deep learning technologies. We will overview the main characteristics of the task and the key principles of existing deep learning solutions. We will be presenting the past and present work done by our group with the overall goal of delivering the best possible intelligibility and sound quality. Finally, we will provide our view on the future of speech enhancement and show how our current long-term research aligns with such a view.

AUDIO, IMAGE & VIDEO, VISION

Kornia AI: Low Level Computer Vision for AI
Edgar Riba, Dmytro Mishkin, Jian Shi, Luis Ferraz

Kornia is a differentiable library that allows classical computer vision to be integrated into deep learning models. It consists of a set of routines and differentiable modules to solve generic computer vision problems. At its core, the package uses PyTorch as its main backend both for efficiency and to take advantage of the reverse-mode auto-differentiation to define and compute the gradient of complex functions.
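A small sketch of what differentiable classical vision looks like in practice; the operator names follow the Kornia documentation, and the pipeline and values are illustrative.

    import torch
    import kornia

    image = torch.rand(1, 3, 128, 128, requires_grad=True)

    gray = kornia.color.rgb_to_grayscale(image)
    blurred = kornia.filters.gaussian_blur2d(gray, kernel_size=(5, 5), sigma=(1.5, 1.5))
    edges = kornia.filters.sobel(blurred)

    # Because every operator is differentiable, gradients flow back to the input
    # (or to upstream network parameters) through the classical pipeline.
    edges.mean().backward()
    print(image.grad.shape)  # torch.Size([1, 3, 128, 128])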

https://kornia.github.io//

AUDIO, IMAGE & VIDEO, VISION

Video Transformer Network
Daniel Neimark, Omri Bar, Maya Zohar, Dotan Asselmann

This paper presents VTN, a transformer-based framework for video recognition. Inspired by recent developments in vision transformers, we ditch the standard approach in video action recognition that relies on 3D ConvNets and introduce a method that classifies actions by attending to the entire video sequence information. Our approach is generic and builds on top of any given 2D spatial network. In terms of wall runtime, it trains 16.1× faster and runs 5.1× faster during inference while maintaining competitive accuracy compared to other state-of-the-art methods. It enables whole video analysis, via a single end-to-end pass, while requiring 1.5× fewer GFLOPs. We report competitive results on Kinetics-400 and present an ablation study of VTN properties and the trade-off between accuracy and inference speed. We hope our approach will serve as a new baseline and start a fresh line of research in the video recognition domain. Code and models are available at: https://github.com/bomri/SlowFast/blob/master/projects/vtn/README.md. See paper: https://arxiv.org/abs/2102.00719
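A rough schematic of this design (not the released VTN code, which uses an efficient long-sequence attention module): a 2D backbone embeds each frame and a standard transformer encoder attends over the whole sequence; dimensions are illustrative.

    import torch
    import torch.nn as nn
    from torchvision.models import resnet50

    class TinyVTN(nn.Module):
        def __init__(self, num_classes=400, dim=2048, heads=8, layers=3):
            super().__init__()
            backbone = resnet50(pretrained=False)
            self.backbone = nn.Sequential(*list(backbone.children())[:-1])  # drop the FC head
            encoder_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
            self.temporal = nn.TransformerEncoder(encoder_layer, num_layers=layers)
            self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
            self.head = nn.Linear(dim, num_classes)

        def forward(self, video):                     # video: (batch, frames, 3, H, W)
            b, t = video.shape[:2]
            feats = self.backbone(video.flatten(0, 1)).flatten(1)  # per-frame features (b*t, dim)
            feats = feats.view(b, t, -1)
            tokens = torch.cat([self.cls_token.expand(b, -1, -1), feats], dim=1)
            return self.head(self.temporal(tokens)[:, 0])          # classify from the CLS token

    logits = TinyVTN()(torch.rand(2, 8, 3, 224, 224))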

https://github.com/bomri/SlowFast/blob/master/projects/vtn/README.md

AUDIO, IMAGE & VIDEO, VISION

DLRT: Ultra Low-Bit Precision Inference Engine for PyTorch on CPU
Dr. Ehsan Saboori, Dr. Sudhakar Sah, MohammadHossein AskariHemmat, Saad Ashfaq, Alex Hoffman, Olivier Mastropietro, Davis Sawyer

The emergence of Deep Neural Networks (DNNs) on embedded and low-end devices holds tremendous potential to expand the adoption of AI technologies to wider audiences. However, making DNNs applicable for inference on such devices using techniques such as quantization and model compression, while maintaining model accuracy, remains a challenge for production deployment. Furthermore, there is a lack of inference engines available in any AI framework to run such low-precision networks. Our work presents a novel inference engine and model compression framework that automatically enables PyTorch developers to quantize and run their deep learning models at 2-bit and 1-bit precision, making them faster, smaller and more energy-efficient in production. DLRT empowers PyTorch developers to unlock advanced AI on low-power CPUs, starting with ARM CPUs and MCUs. This work allows AI researchers and practitioners to achieve 10x faster inference and near-GPU-level performance at a fraction of the power and cost.

https://github.com/deeplite

PERFORMANCE, PRODUCTION & DEPLOYMENT

Serving PyTorch Models in Production at Walmart Search
Adway Dhillo, Nidhin Pattaniyil

This poster is for data scientists or ML engineers looking to productionize their PyTorch models. It covers post-training steps that should be taken to optimize the model, such as quantization and TorchScript, and walks the user through packaging and serving the model with Facebook's TorchServe. It also covers the benefits of script mode and the PyTorch JIT. Benefits of TorchServe include high-performance serving, multi-model serving, model versioning for A/B testing, server-side batching, and support for pre- and post-processing.
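A hedged sketch of the post-training steps mentioned above using standard PyTorch APIs; the model and paths are placeholders, and the archiver command is shown only as a comment.

    import torch
    import torchvision

    model = torchvision.models.resnet18(pretrained=False).eval()

    # Dynamic quantization of the linear layers (weights stored as int8).
    quantized = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)

    # Script mode: serialize to TorchScript so the server does not need the Python class.
    scripted = torch.jit.script(quantized)
    scripted.save("resnet18_quantized.pt")

    # Packaging for TorchServe (shell command, shown here as a comment):
    #   torch-model-archiver --model-name my_classifier --version 1.0 \
    #       --serialized-file resnet18_quantized.pt \
    #       --handler image_classifier --export-path model_store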

https://pytorch.org/serve/

PERFORMANCE, PRODUCTION & DEPLOYMENT

CleanRL: high-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features
Shengyi Huang, Rousslan Fernand Julien Dossa, Chang Ye, Jeff Braga

CleanRL is an open-source library that provides high-quality single-file implementations of Deep Reinforcement Learning algorithms. It provides a simpler yet scalable development experience by having a straightforward codebase and integrating production tools to help interact with and scale experiments. In CleanRL, we put all details of an algorithm into a single file, making these performance-relevant details easier to recognize. Additionally, an experiment tracking feature is available to help log metrics, hyperparameters, videos of an agent's gameplay, dependencies, and more to the cloud. Despite the succinct implementations, we have also designed tools to help scale, at one point orchestrating experiments on more than 2000 machines simultaneously via Docker and cloud providers. The source code can be found at https://github.com/vwxyzjn/cleanrl.

https://github.com/vwxyzjn/cleanrl/

PERFORMANCE, PRODUCTION & DEPLOYMENT

Deploying a Food Classifier on PyTorch Mobile
Nidhin Pattaniyil, Reshama Shaikh

As technology improves, so does the use of deep learning models. Additionally, since time spent on mobile devices is greater than on desktop, the demand for applications running natively on mobile devices is also high. This demo goes through a complete example of training a deep learning vision classifier on the Food-101 dataset using PyTorch, and then deploying it on web and mobile using TorchServe and PyTorch Mobile.
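A brief, hypothetical sketch of the mobile export step: a trained classifier is scripted, optimized for mobile, and saved for the lite interpreter; the Food-101 training itself is omitted and the model here is a placeholder.

    import torch
    import torchvision
    from torch.utils.mobile_optimizer import optimize_for_mobile

    model = torchvision.models.mobilenet_v2(pretrained=False).eval()
    scripted = torch.jit.script(model)                           # script mode for deployment
    optimized = optimize_for_mobile(scripted)                    # fuse ops, drop training-only code
    optimized._save_for_lite_interpreter("food_classifier.ptl")  # loaded by the Android/iOS app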

https://github.com/npatta01/pytorch-food

PERFORMANCE, PRODUCTION & DEPLOYMENT

Torch-TensorRT: Accelerating Inference Performance Directly from PyTorch using TensorRT
Naren Dasan, Nick Comly, Dheeraj Peri, Anurag Dixit, Abhiram Iyer, Bo Wang, Arvind Sridhar, Boris Fomitchev, Josh Park

Learn how to accelerate PyTorch inference, directly from the framework, for model deployment. The PyTorch integration for TensorRT makes the performance of TensorRT's GPU optimizations available in PyTorch for any model. We will walk you through how, with 3 lines of code, you can go from a trained model to optimized TensorRT-embedded TorchScript, ready to deploy to a production environment.
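The "3 lines" in spirit, following the Torch-TensorRT README; treat this as a hedged sketch, since exact argument names can vary between releases.

    import torch
    import torch_tensorrt
    import torchvision

    model = torchvision.models.resnet50(pretrained=False).eval().cuda()
    trt_model = torch_tensorrt.compile(
        model,
        inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
        enabled_precisions={torch.half},            # compile with FP16 kernels
    )
    torch.jit.save(trt_model, "resnet50_trt.ts")    # TensorRT-embedded TorchScript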

https://github.com/NVIDIA/Torch-TensorRT/

PERFORMANCE, PRODUCTION & DEPLOYMENT

Tensorized Deep Learning with TensorLy-Torch
Jean Kossaifi

Most of the data in modern machine learning (e.g., fMRI, videos, etc.) is inherently multi-dimensional, and leveraging that structure is crucial for good performance. Tensor methods are the natural way to achieve this and can improve deep learning, enabling i) large compression ratios through a reduction in the number of parameters, ii) computational speedups, iii) improved performance and iv) better robustness. The TensorLy project provides the tools to manipulate tensors, including tensor algebra, regression and decomposition. TensorLy-Torch builds on top of this and enables tensor-based deep learning by providing out-of-the-box tensor-based PyTorch layers that can be readily combined with any deep neural network architecture, taking care of details such as initialization and tensor dropout.
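The following is not the TensorLy-Torch API itself, only a plain-PyTorch sketch of the parameter reduction a factorized (here, low-rank) layer provides.

    import torch
    import torch.nn as nn

    class LowRankLinear(nn.Module):
        """Replace a (out x in) weight matrix with two factors of rank r."""
        def __init__(self, in_features, out_features, rank):
            super().__init__()
            self.u = nn.Linear(in_features, rank, bias=False)
            self.v = nn.Linear(rank, out_features, bias=True)

        def forward(self, x):
            return self.v(self.u(x))

    dense = nn.Linear(4096, 4096)
    factored = LowRankLinear(4096, 4096, rank=64)
    print(sum(p.numel() for p in dense.parameters()))     # ~16.8M parameters
    print(sum(p.numel() for p in factored.parameters()))  # ~0.5M parameters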

http://tensorly.org/quantum

PERFORMANCE, PRODUCTION & DEPLOYMENT

Catalyst-Accelerated Deep Learning R&D
Sergey Kolesnikov

Catalyst is a PyTorch framework for Deep Learning Research and Development. It focuses on reproducibility, rapid experimentation, and codebase reuse so you can create something new rather than write yet another train loop.

https://catalyst-team.com/

EXTENDING PYTORCH, APIs, PARALLEL & DISTRIBUTED TRAINING

Ray Lightning: Easy Multi-node PyTorch Lightning training
Amog Kamsetty, Richard Liaw, Will Drevo, Michael Galarnyk

PyTorch Lightning is a library that provides a high-level interface for PyTorch which helps you organize your code and reduce boilerplate. By abstracting away engineering code, it makes deep learning experiments easier to reproduce and improves developer productivity. PyTorch Lightning also includes plugins to easily parallelize your training across multiple GPUs. This parallel training, however, depends on a critical assumption: that you already have your GPU(s) set up and networked together in an efficient way for training. While you may have a managed cluster like SLURM for multi-node training on the cloud, setting up the cluster and its configuration is no easy task. Ray Lightning was created with this problem in mind to make it easy to leverage multi-node training without needing extensive infrastructure expertise. It is a simple and free plugin for PyTorch Lightning with a number of benefits like simple setup, easy scale-up, seamless creation of multi-node clusters on AWS/Azure/GCP via the Ray Cluster Launcher, and an integration with Ray Tune for large-scale distributed hyperparameter search using state-of-the-art algorithms.

https://github.com/ray-project/ray_lightning

EXTENDING PYTORCH, APIs, PARALLEL & DISTRIBUTED TRAINING

Supercharge your Federated Learning with Synergos
Jin Howe Teo, Way Yen Chen, Najib Ninaba, Choo Heng Chong Mark

Data sits as the centerpiece of any machine learning endeavour, yet in many real-world projects, a single party's data is often insufficient and needs to be augmented with data from other sources. This is unfortunately easier said than done, as there are many innate concerns (be they regulatory, ethical, commercial etc.) stopping parties from exchanging data. Fortunately, there exists an emerging privacy-preserving machine learning technology called Federated Learning. It enables multiple parties holding local data to collaboratively train machine learning models without actually exchanging their data with one another, hence preserving the confidentiality of different parties' local data. Today, we will be showcasing Synergos, a distributed platform built here at AI Singapore to facilitate the adoption of Federated Learning. Specifically, it strives to make the complex mechanisms involved in any federated endeavour simple, accessible and sustainable.

https://github.com/aimakerspace/synergos_simulator

EXTENDING PYTORCH, APIs, PARALLEL & DISTRIBUTED TRAINING

AdaptDL: An Open-Source Resource-Adaptive Deep Learning Training and Scheduling Framework
Aurick Qiao, Omkar Pangarkar, Richard Fan

AdaptDL is an open source framework and scheduling algorithm that directly optimizes cluster-wide training performance and resource utilization. By elastically re-scaling jobs, co-adapting batch sizes and learning rates, and avoiding network interference, AdaptDL improves shared-cluster training compared with alternative schedulers. AdaptDL can automatically determine the optimal number of resources given a job's needs, and efficiently adds or removes resources dynamically to ensure the highest level of performance. The AdaptDL scheduler will automatically figure out the most efficient number of GPUs to allocate to your job, based on its scalability. When the cluster load is low, your job can dynamically expand to take advantage of more GPUs. AdaptDL offers an easy-to-use API to make existing PyTorch training code elastic with adaptive batch sizes and learning rates. We have also ported AdaptDL to Ray/Tune, which can automatically scale trials of an experiment and can be used to schedule stand-alone PyTorch training jobs on the cloud in a cost-effective way.

https://github.com/petuum/adaptdl

EXTENDING PYTORCH, APIs, PARALLEL & DISTRIBUTED TRAINING

Define-by-run quantization
Vasiliy Kuznetsov, James Reed, Jerry Zhang

Describes a prototype PyTorch workflow that performs quantization syntax transforms define-by-run in Eager mode, with no model changes needed (unlike the existing Eager mode workflow, which requires manual quant/dequant insertion and fusion) and almost no model syntax restrictions (unlike FX graph mode, which requires symbolic traceability).
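For contrast, the sketch below shows the existing Eager mode post-training workflow that the define-by-run prototype aims to simplify (qconfig assignment, manual prepare/calibrate/convert); the model and calibration data are placeholders.

    import torch
    import torchvision

    # Quantizable variant of ResNet-18, with QuantStub/DeQuantStub already inserted.
    model = torchvision.models.quantization.resnet18(pretrained=False, quantize=False).eval()
    model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
    prepared = torch.quantization.prepare(model)

    # Calibration pass with representative data so observers can collect statistics.
    with torch.no_grad():
        prepared(torch.rand(8, 3, 224, 224))

    quantized = torch.quantization.convert(prepared)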

https://pytorch.org/docs/stable/quantization.html

EXTENDING PYTORCH, APIs, PARALLEL & DISTRIBUTED TRAINING

Fx Numeric Suite Core APIs
Charles Hernandez, Vasiliy Kuznetzov, Haixin Liu

A quantized model can be considered wrong when it doesn't satisfy the accuracy we expect, and debugging the accuracy issues of quantization is not easy and is time consuming. The Fx Numeric Suite Core APIs allow users to better diagnose the source of their quantization error for both statically and dynamically quantized models. This poster gives an overview of the core APIs and techniques available to users through the Fx Numeric Suite, and how they can use them to improve quantization performance.

EXTENDING PYTORCH, APIs, PARALLEL & DISTRIBUTED TRAINING

snnTorch: Training spiking neural networks using gradient-based optimization
J.K. Eshraghian, M. Ward, E.O. Neftci, G. Lenz, X. Wang, G. Dwivedi, M. Bennamoun, D.S. Jeong, W.D. Lu

The brain is the perfect place to look for inspiration to develop more efficient neural networks. One of the main differences with modern deep learning is that the brain encodes and processes information as spikes rather than continuous activations. Combining the training methods intended for neural networks with the sparse, spiking activity inspired by biological neurons has shown the potential to improve the power efficiency of training and inference by several orders of magnitude. snnTorch is a Python package for performing gradient-based learning with spiking neural networks. It extends the capabilities of PyTorch, taking advantage of its GPU accelerated tensor computation and applying it to networks of event-driven spiking neurons. snnTorch is designed to be intuitively used with PyTorch, as though each spiking neuron were simply another activation in a sequence of layers. It is therefore agnostic to fully-connected layers, convolutional layers, residual connections, etc. The classical challenges that have faced the neuromorphic engineering community, such as the non-differentiability of spikes, the dead neuron problem, vanishing gradients in backpropagation-through-time, are effectively solved in snnTorch and enable the user to focus on building applications that leverage sparsity and event-driven data streams.
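A usage sketch in the style of the snnTorch tutorials (argument names may differ across versions): leaky integrate-and-fire neurons are dropped into a network like any other activation, with membrane state carried across time steps.

    import torch
    import torch.nn as nn
    import snntorch as snn

    fc1 = nn.Linear(784, 100)
    lif1 = snn.Leaky(beta=0.9)        # decay rate of the membrane potential
    fc2 = nn.Linear(100, 10)
    lif2 = snn.Leaky(beta=0.9)

    mem1, mem2 = lif1.init_leaky(), lif2.init_leaky()
    spikes_out = []
    data = torch.rand(25, 32, 784)    # (time steps, batch, features)

    for step in range(data.shape[0]):                 # unroll the network over time
        spk1, mem1 = lif1(fc1(data[step]), mem1)
        spk2, mem2 = lif2(fc2(spk1), mem2)
        spikes_out.append(spk2)

    output = torch.stack(spikes_out)  # output spike trains, decodable into a prediction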

https://snntorch.readthedocs.io/en/latest/

EXTENDING PYTORCH, APIs, PARALLEL & DISTRIBUTED TRAINING

PyTorch for R
Daniel Falbel

Last year, the PyTorch for the R language project was released, allowing R users to benefit from PyTorch's speed and flexibility. Since then, a growing community of contributors has been improving the torch for R interface, building research and products on top of it, and using it to teach deep learning methods. In this poster we will showcase the past and current developments in the PyTorch for R project, as well as our plans for the future.

https://torch.mlverse.org/

EXTENDING PYTORCH, APIs, PARALLEL & DISTRIBUTED TRAINING

ocaml-torch and tch-rs: writing and using PyTorch models using OCaml or Rust
Laurent Mazare

The main front-end for using PyTorch is its Python API; however, LibTorch provides a lower-level C++ API to manipulate tensors, perform automatic differentiation, etc. ocaml-torch and tch-rs are two open-source projects providing wrappers for this C++ API in OCaml and Rust, respectively. Users can then write OCaml and Rust code to create new models, perform inference and training, and benefit from the guarantees provided by strongly typed programming languages and functional programming. They can also use TorchScript to leverage existing Python models. The libraries provide various examples, ranging from the main computer vision models to a minimalist GPT implementation. The main challenges for these bindings are to provide idiomatic APIs adapted to each language's specificities; to automatically generate most of the binding code, as there are thousands of C++ functions to expose; and to interact properly with the memory model of each language.

https://github.com/laurentMazare/ocaml-torch

EXTENDING PYTORCH, APIs, PARALLEL & DISTRIBUTED TRAINING

PyTorch Lightning Flash - Your PyTorch AI Factory
Ari Bornstein

Flash is a high-level deep learning framework for fast prototyping, baselining, finetuning and solving deep learning problems. It features a set of tasks for you to use for inference and finetuning out of the box, and an easy-to-implement API to customize every step of the process for full flexibility. Flash is built for beginners with a simple API that requires very little deep learning background, and for data scientists, Kagglers, applied ML practitioners and deep learning researchers who want a quick way to get a deep learning baseline with the advanced features PyTorch Lightning offers. Flash enables you to easily configure and run complex AI recipes for over 15 tasks across 7 data domains.

https://github.com/PyTorchLightning

EXTENDING PYTORCH, APIs, PARALLEL & DISTRIBUTED TRAINING

PyTorch-Ignite: Training and evaluating neural networks flexibly and transparently
Victor Fomin, Taras Savchyn, Priyansi

PyTorch-Ignite is a high-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently. PyTorch-Ignite is designed to be at the crossroads of high-level Plug & Play features and under-the-hood expansion possibilities. The tool aims to improve the deep learning community's technical skills by promoting best practices where things are not hidden behind a divine tool that does everything, but remain within the reach of users. PyTorch-Ignite differs from other similar tools by allowing users to compose their applications without being focused on a super multi-purpose object, but rather on weakly coupled components allowing advanced customization.
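A minimal sketch of Ignite's loosely coupled pieces (engines, events and metrics); the model, optimizer and data loaders are placeholders.

    import torch
    import torch.nn as nn
    from ignite.engine import Events, create_supervised_trainer, create_supervised_evaluator
    from ignite.metrics import Accuracy, Loss

    model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    criterion = nn.CrossEntropyLoss()

    trainer = create_supervised_trainer(model, optimizer, criterion)
    evaluator = create_supervised_evaluator(model, metrics={"acc": Accuracy(), "loss": Loss(criterion)})

    @trainer.on(Events.EPOCH_COMPLETED)
    def log_validation(engine):
        evaluator.run(val_loader)                  # val_loader: a placeholder DataLoader
        print(engine.state.epoch, evaluator.state.metrics)

    # trainer.run(train_loader, max_epochs=5)      # train_loader: a placeholder DataLoader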

https://pytorch-ignite.ai/ecosystem/

EXTENDING PYTORCH, APIs, PARALLEL & DISTRIBUTED TRAINING

Benchmarking the Accuracy and Robustness of Feedback Alignment Methods
Albert Jimenez, Mohamed Akrout

Backpropagation is the default algorithm for training deep neural networks due to its simplicity, efficiency and high convergence rate. However, its requirements make it impossible to implement in the human brain. In recent years, more biologically plausible learning methods have been proposed. Some of these methods can match backpropagation accuracy and simultaneously provide other benefits, such as faster training on specialized hardware (e.g., ASICs) or higher robustness against adversarial attacks. While interest in the field is growing, there is a need for open-source libraries and toolkits to foster research and benchmark algorithms. In this poster, we present BioTorch, a software framework to create, train, and benchmark biologically motivated neural networks. In addition, we investigate the performance of several feedback alignment methods proposed in the literature, thereby unveiling the importance of the forward and backward weight initialization and optimizer choice. Finally, we provide a novel robustness study of these methods against state-of-the-art white- and black-box adversarial attacks.
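A generic sketch of the feedback alignment idea (not the BioTorch API): the backward pass uses a fixed random feedback matrix instead of the transpose of the forward weight, which is the core difference from backpropagation.

    import torch
    import torch.nn as nn

    class FeedbackAlignmentFn(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x, weight, backward_weight):
            ctx.save_for_backward(x, weight, backward_weight)
            return x @ weight.t()

        @staticmethod
        def backward(ctx, grad_out):
            x, weight, backward_weight = ctx.saved_tensors
            grad_x = grad_out @ backward_weight      # fixed random feedback, not weight.t().t()
            grad_w = grad_out.t() @ x                # the forward weight still learns
            return grad_x, grad_w, None

    class FALinear(nn.Module):
        def __init__(self, in_features, out_features):
            super().__init__()
            self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
            # Random feedback matrix: a buffer, so it is never updated by the optimizer.
            self.register_buffer("backward_weight", torch.randn(out_features, in_features) * 0.01)

        def forward(self, x):
            return FeedbackAlignmentFn.apply(x, self.weight, self.backward_weight)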

https://github.com/jsalbert/biotorch

EXTENDING PYTORCH, APIs, PARALLEL & DISTRIBUTED TRAINING

Salina: Easy programming of Sequential Decision Learning and Reinforcement Learning Models in pytorch
Ludovic Denoyer, Alfredo de la Fuente, Song Duong, Jean-Baptiste Gaya, Pierre-Alexandre Kamienny, Daniel H. Thompson

salina is a lightweight library extending PyTorch modules for the development of sequential decision models. It can be used for reinforcement learning (including model-based RL with differentiable environments, multi-agent RL, etc.), but also in supervised/unsupervised learning settings (for instance for NLP, computer vision, etc.).

https://github.com/facebookresearch/salina

ML Ops, MODELS, MODEL OPTIMIZATION & INTERPRETABILITY

Structured and Unstructured Pruning Workflow in PyTorch
Zafar Takhirov, Karen Zhou, Raghuraman Krishnamoorthi

Two new toolflows for model pruning are introduced: Sparsifier and Pruner, which enable unstructured and structured pruning of the model weights, respectively. The toolflows can be combined with other optimization techniques, such as quantization, to achieve even higher levels of model compression. In addition, the Pruner toolflow can also be used for "shape propagation", where the physical structure of the model is modified after structured pruning (in FX graph mode only). This poster gives a high-level overview of the prototype API, a usage example, and the currently supported sparse quantized kernels, and provides a brief overview of future plans.
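The prototype Sparsifier/Pruner APIs are not spelled out in the abstract; as a point of reference only, this is the existing torch.nn.utils.prune workflow for unstructured and structured pruning of a layer's weights.

    import torch.nn as nn
    import torch.nn.utils.prune as prune

    layer = nn.Conv2d(16, 32, kernel_size=3)

    prune.l1_unstructured(layer, name="weight", amount=0.5)              # zero 50% of weights by |w|
    prune.ln_structured(layer, name="weight", amount=0.25, n=2, dim=0)   # remove 25% of output channels

    prune.remove(layer, "weight")   # make the pruning permanent (bake the mask into the weight)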

https://github.com/pytorch/pytorch

ML Ops, MODELS, MODEL OPTIMIZATION & INTERPRETABILITY

Torch-CAM: class activation explorer
François-Guillaume Fernandez

One of the core inconveniences of Deep Learning comes from its interpretability, which remains obscure for most non-trivial convolutional models. Their performance is achieved through optimization processes that have high degrees of freedom and no constraints on explainability. Fortunately, the mechanisms of modern frameworks grant access to the information flow in their components, which paves the way to building intuition about result interpretability in CNN models. The main contributions of the author are: building a flexible framework for class activation computation, providing high-quality implementations of the most popular methods, and making these methods usable by entry-level users as well as researchers. The open-source project is available here: https://github.com/frgfm/torch-cam
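A plain-PyTorch sketch of the class-activation idea the library packages up behind its extractor classes: hook the last convolutional block and weight its activations by the gradients of a target class score; the model and class index are placeholders.

    import torch
    import torchvision

    model = torchvision.models.resnet18(pretrained=False).eval()
    activations, gradients = {}, {}

    def fwd_hook(module, inp, out):
        activations["feat"] = out

    def bwd_hook(module, grad_in, grad_out):
        gradients["feat"] = grad_out[0]

    model.layer4.register_forward_hook(fwd_hook)
    model.layer4.register_full_backward_hook(bwd_hook)

    image = torch.rand(1, 3, 224, 224)
    score = model(image)[0, 283]                 # score of an arbitrary target class
    score.backward()

    weights = gradients["feat"].mean(dim=(2, 3), keepdim=True)    # global-average-pooled gradients
    cam = torch.relu((weights * activations["feat"]).sum(dim=1))  # (1, 7, 7) class activation map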

https://github.com/frgfm/torch-cam

ML Ops, MODELS, MODEL OPTIMIZATION & INTERPRETABILITY

moai: A Model Development Kit to Accelerate Data-driven Workflows
Nikolaos Zioulis

moai is a PyTorch-based AI Model Development Kit (MDK) that seeks to improve data-driven model workflows, design and understanding. It relies on hydra for handling configuration and lightning for handling infrastructure. As a kit, it offers `train` and `evaluate` actions that consume configuration files. Apart from the definition of the model, data, training scheme, optimizer, visualization and logging, these configuration files also use named tensors to define tensor-processing graphs. These are created by chaining various building blocks called monads, which are functional units or otherwise single-responsibility modules. Monad parameters and input/output tensors are defined in the configuration file, allowing the entire model to be summarized in a single file. This opens up novel functionalities like querying for inter-model differences using the `diff` action, or aggregating the results of multiple models using the `plot` action, which uses hiplot to compare models in various ways. moai facilitates high-quality reproduction (using the `reprod` action): apart from automatically handling all related boilerplate, it standardizes the process of developing modules/monads and implicitly logs all hyperparameters. Even though no code is required, moai exploits Python's flexibility to allow developers to integrate their own code into its engine from external projects, vastly increasing their productivity.

https://github.com/ai-in-motion/moai

ML Ops, MODELS, MODEL OPTIMIZATION & INTERPRETABILITY

Building Production ML Pipelines for PyTorch Models
Vaibhav Singh, Rajesh Thallam, Jordan Totten, Karl Weinmeister

Machine Learning Operationalization has rapidly evolved in the last few years, with a growing set of tools for each phase of development. From experimentation to automated model analysis and deployment, each of these tools offers some unique capabilities. In this work we survey a slice of these tools and demonstrate an opinionated example of an end-to-end CI/CD pipeline for PyTorch model development and deployment using the Vertex AI SDK. The goal of this session is to aid an informed conversation on the choices available to PyTorch industry practitioners who are looking to operationalize their ML models, and to researchers who are simply trying to organize their experiments. Although our implementation example will make tool choices at various stages, we will be focused on ML design patterns that are applicable to a wide variety of commercial and open-source offerings.

https://github.com/GoogleCloudPlatform/vertex-ai-samples

ML Ops, MODELS, MODEL OPTIMIZATION & INTERPRETABILITY

Customizing MLOps pipelines with JSON-AI: a declarative syntax to streamline ML in the database
George Hosu, Particio Cerda-Mardini, Natasha Seelam, Jorge Torres

Nearly 64% of companies take over a month to a year to deploy a single machine learning (ML) model into production [1]. Many of these companies cite key challenges integrating with complex ML frameworks as a root cause [1], as there is still a gap between where data lives, how models are trained, and how downstream applications access predictions from models [1, 2]. MindsDB is a PyTorch-based ML platform that aims to solve fundamental MLOps challenges by abstracting ML models as "virtual tables", allowing models to be queried in the same natural way users work with data in databases. As data is diverse and varied, we recently developed an open-source declarative syntax, named "JSON-AI", to allow others to customize ML model internals without changing source code. We believe that the key elements of the data science (DS)/ML pipeline, namely data pre-processing/cleaning, feature engineering, and model-building [2], should be automated in a robust, reliable, and reproducible manner with simplicity. JSON-AI gives you refined control over each of these steps and enables users to bring custom routines into their ML pipeline. In our poster, we will show how a user interfaces with JSON-AI to bring original approaches to each of the aforementioned parts of the DS/ML pipeline, along with control over analysis and explainability tools. [1] Algorithmia (2021). 2021 state of enterprise machine learning. [2] "How Much Automation Does a Data Scientist Want?" arXiv (2021).

https://github.com/mindsdb/mindsdb/

ML Ops, MODELS, MODEL OPTIMIZATION & INTERPRETABILITY

TorchStudio, a full featured IDE for PyTorch
Robin Lobel

TorchStudio is an open-source, full-featured IDE for PyTorch. It aims to simplify the creation, training and iteration of AI models. It can load, analyze and explore datasets from TorchVision or TorchAudio, or custom datasets with any format and number of inputs and outputs. TorchVision, TorchAudio or custom models can then be loaded or written from scratch, debugged, visualized as a graph, and trained using local hardware, a remote server or GPUs in the cloud. Trainings can then be compared in the dashboard with several analysis tools to help you identify the best-performing set of models and hyperparameters and export it as TorchScript or ONNX files. TorchStudio is also highly customizable, with 90% of its functionality accessible as open-source scripts and independent modules, to fit as many AI scenarios as possible.

https://torchstudio.ai/

ACCELERATORS, TOOLS, LIBRARY, DATA

Accelerate TorchServe with Intel Extension for PyTorch
Mark Saroufim, Hamid Shojanazeri, Patrick Hu, Geeta Chauhan, Jing Xu, Jianan Gu, Jiong Gong, Ashok Emani, Eikan Wang, Min Jean Cho, Fan Zhao

Intel is collaborating with Meta to bring the performance boost of Intel® Extension for PyTorch* to TorchServe, so that users can easily deploy their PyTorch models with satisfying out-of-the-box performance. With these software advancements, we demonstrate the ease of use of the IPEX user-facing API and showcase the speed-ups of Intel® Extension for PyTorch* FP32 and INT8 inference over stock PyTorch.
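A hedged sketch following the Intel® Extension for PyTorch* documentation; the optimize() call and its arguments can vary between releases, and the model is a placeholder.

    import torch
    import torchvision
    import intel_extension_for_pytorch as ipex

    model = torchvision.models.resnet50(pretrained=False).eval()
    model = ipex.optimize(model)                     # apply CPU-friendly optimizations

    with torch.no_grad():
        out = model(torch.rand(1, 3, 224, 224))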

www.intel.com/Performanceindex

ACCELERATORS, TOOLS, LIBRARY, DATA

Kaolin Library
Clement Fuji Tsang, Jean-Francois Lafleche, Charles Loop, Masha Shugrina, Towaki Takikawa, Jiehan Wang

NVIDIA Kaolin is a suite of tools for accelerating 3D Deep Learning research. The Kaolin library provides a PyTorch API for working with a variety of 3D representations and includes a growing collection of GPU-optimized operations such as modular differentiable rendering, fast conversions between representations, loss functions, data loading, 3D checkpoints and more. The library also contains a lightweight 3D visualizer Dash3D and can work with an Omniverse companion app for dataset/checkpoint visualization and synthetic data generation.

ACCELERATORS, TOOLS, LIBRARY, DATA

Accelerate PyTorch training with Cloud TPUs
Jack Cao, Milad Mohammadi, Zak Stone, Vaibhav Singh, Calvin Pelletier, Shauheen Zahirazami

PyTorch / XLA offers PyTorch users the ability to train their models on XLA devices including Cloud TPUs. This compiled path often makes it possible to utilize creative optimizations and achieve top performance on target XLA devices. With the introduction of Cloud TPU VMs, users have direct access to TPU host machines and therefore a great level of flexibility. In addition, TPU VMs make debugging easier and reduce data transfer overheads. Google has also recently announced the availability of Cloud TPU v4 Pods, which are exaflop-scale supercomputers for machine learning. Cloud TPU v4 Pods offer a whole new level of performance for large-scale PyTorch / XLA training of ML models.
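A minimal sketch of moving a training step onto an XLA device, following the PyTorch / XLA documentation; the model and data are placeholders.

    import torch
    import torch.nn as nn
    import torch_xla.core.xla_model as xm

    device = xm.xla_device()                         # e.g. a Cloud TPU core
    model = nn.Linear(10, 2).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for step in range(10):
        data = torch.rand(32, 10, device=device)
        target = torch.randint(0, 2, (32,), device=device)
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(data), target)
        loss.backward()
        # barrier=True marks the step so the lazily built graph is compiled and executed.
        xm.optimizer_step(optimizer, barrier=True)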

ACCELERATORS, TOOLS, LIBRARY, DATA

Accelerating PyTorch on the largest chip ever built (WSE)
Antonio Kim, Behzad Abghari, Chris Oliver, Cynthia Liu, Mark Browning, Vishal Subbiah, Kamran Jafari, Emad Barsoum, Jessica Liu, Sean Lie

The Cerebras Wafer Scale Engine (WSE) is the largest processor ever built, dedicated to accelerating deep learning models for training and inference. A single chip in a single CS-2 system provides the compute power of a cluster of GPUs but acts as a single processor, making it also much simpler to use. We present the current PyTorch backend architecture for the Cerebras CS-2 and how we go all the way from PyTorch to laying out the model graph on the wafer. Additionally, we will discuss the advantages of training on Cerebras hardware and its unique capabilities.

https://cerebras.net

ACCELERATORS, TOOLS, LIBRARY, DATA