Dibakar Gope

Principal Engineer
Machine Learning & AI
Arm Inc., Austin

GitHub
Google Scholar
DBLP


About Me

I am a Principal Engineer in the Machine Learning research group at Arm Research in Austin, TX. I work on model and runtime optimization of generative AI networks, such as large language models (LLMs) and vision transformers, for on-device, hardware-aware deployment. I have developed resource-efficient computer vision (CV) and natural language processing (NLP/NLU) models, neural network compression methods, and neural architecture search techniques for networks running on highly constrained platforms such as microcontrollers and personal mobile devices. In that context, I have developed novel compact model architectures, sub-byte quantization, low-rank matrix factorization, dynamic execution, and pruning techniques. In addition, I contribute to the design of a next-generation neural hardware accelerator and develop new instruction definitions and kernel optimizations for high-throughput matrix multiplication. My research has resulted in multiple high-impact publications in top-tier machine learning conferences and workshops (ICCV, MLSys, CVPR, NeurIPS).
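
As a concrete illustration of one of these compression primitives, the following is a minimal, self-contained NumPy sketch of low-rank matrix factorization (illustrative only, not code from any of the projects above): a dense layer's weight matrix is approximated by two thin factors obtained from a truncated SVD, trading a small approximation error for fewer parameters and multiply-accumulates.

    import numpy as np

    # Hypothetical dense layer weight matrix (out_features x in_features); values are illustrative.
    W = np.random.randn(512, 1024).astype(np.float32)

    r = 64                                 # target rank, chosen for illustration only
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * S[:r]                   # 512 x r factor (columns scaled by singular values)
    B = Vt[:r, :]                          # r x 1024 factor

    x = np.random.randn(1024).astype(np.float32)
    y_full = W @ x                         # original layer: 512*1024 multiply-accumulates
    y_lowrank = A @ (B @ x)                # factorized layer: r*(512+1024) multiply-accumulates
    print("relative error:", np.linalg.norm(y_full - y_lowrank) / np.linalg.norm(y_full))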

Before joining Arm in July 2017, I received a Ph.D. in Electrical Engineering, specializing in Computer Architecture, Machine Learning, and AI-Assisted Systems Design, from the University of Wisconsin-Madison in 2017. During my Ph.D., I conducted original research on highly accurate machine learning-guided neural branch prediction for CPU microarchitectures, on improving the execution efficiency of the PHP scripting language through hardware accelerators and compiler optimizations, and on efficient memory consistency models for modern processor architectures. This work resulted in multiple high-impact publications in top-tier computer architecture conferences (ISCA, MICRO, HPCA, PACT).
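
For context, the sketch below is a toy perceptron-style branch predictor in Python (a minimal illustration of the general idea behind neural branch prediction, not the Bias-Free predictor listed in the publications below): each static branch hashes to a vector of small signed weights, the dot product of those weights with the global outcome history gives the taken/not-taken prediction, and the weights are trained only on a misprediction or a low-magnitude output.

    # Toy perceptron branch predictor (illustrative sketch; all parameters are assumptions).
    HIST_LEN = 16                          # global history length
    THETA = int(1.93 * HIST_LEN + 14)      # training threshold from the perceptron-predictor literature
    TABLE_SIZE = 1024

    weights = [[0] * (HIST_LEN + 1) for _ in range(TABLE_SIZE)]   # index 0 holds the bias weight
    history = [1] * HIST_LEN               # +1 = taken, -1 = not taken

    def predict(pc):
        w = weights[pc % TABLE_SIZE]
        y = w[0] + sum(wi * hi for wi, hi in zip(w[1:], history))
        return y, y >= 0                   # perceptron output and predicted direction

    def update(pc, taken):
        y, pred = predict(pc)
        t = 1 if taken else -1
        w = weights[pc % TABLE_SIZE]
        if pred != taken or abs(y) <= THETA:
            w[0] = max(-128, min(127, w[0] + t))                  # clamp weights to 8-bit range
            for i in range(HIST_LEN):
                w[i + 1] = max(-128, min(127, w[i + 1] + t * history[i]))
        history.pop(0)                     # shift the new outcome into the global history
        history.append(t)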

I received my Master's Degree (M.S.) in Computer Engineering from Texas A&M University in 2011. During my Master's, I conducted original research in design-for-test and dynamic CMOS circuits: I designed and implemented a path delay test generator that maximizes crosstalk-induced slowdown in modern digital circuits, and I performed a detailed evaluation of a circuit and layout fabric based on dynamic circuits. This work resulted in publications in top-tier circuit/VLSI design conferences (ICCD, MWSCAS).

I received my Bachelor's Degree (B.E.) in Electrical and Electronics Engineering from the Birla Institute of Technology & Science (Pilani), India, in 2008.


Research Interests

Machine learning for constrained systems; theory and design of deep neural networks for CV and NLP/NLU applications; neural network compression; neural architecture search; computer vision; natural language understanding; neural network kernel optimizations; AI-optimized processor and system architecture; computer architecture


Education


Industrial Experience


Publications / Patents

  1. Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers [Paper]
    Natalia Frumkin, Dibakar Gope, Diana Marculescu
    International Conference on Computer Vision (ICCV), Oct. 2023.

  2. PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep Learning Models on Edge Devices [Paper]
    Yuji Chai, Devashree Tripathy, Chuteng Zhou, Dibakar Gope, Igor Fedorov, Ramon Matas, David Brooks, Gu-Yeon Wei, Paul Whatmough
    ArXiv, 2023.

  3. A Neural Processing Unit for Attention-Based Inference
    Shounak Datta, Dibakar Gope, Jesse Beu, and Mark O’Connor
    US Patent Application, 2022.

  4. Restructurable Activation Networks [Paper]
    Kartikeya Bhardwaj, James Ward, Caleb Tung*, Dibakar Gope*, Lingchuan Meng, Igor Fedorov, Alex Chalfin, Paul Whatmough, Danny Loh (* Equal Contribution)
    ArXiv, 2022.

  5. Collapsible Linear Blocks for Super-Efficient Super Resolution [Paper]
    Kartikeya Bhardwaj, Milos Milosavljevic, Liam O'Neil, Dibakar Gope, Ramon Matas, Alex Chalfin, Naveen Suda, Lingchuan Meng, Danny Loh
    Fifth Conference on Machine Learning and Systems (MLSys), Aug. 2022.

  6. Super-Efficient Super Resolution for Fast Adversarial Defense at the Edge [Paper]
    Kartikeya Bhardwaj, Dibakar Gope, James Ward, Paul N. Whatmough, Danny Loh
    Special Initiative on Autonomous Systems Design (ASD) in conjunction with Design, Automation & Test in Europe (DATE), Mar. 2022.

  7. System and Method for Accelerating Neural Networks
    Dibakar Gope, Jesse Beu, and Milos Milosavljevic
    US Patent Application, 2021.

  8. MicroNets: Neural Network Architectures for Deploying TinyML Applications on Commodity Microcontrollers [Paper]
    Colby Banbury, Chuteng Zhou, Igor Fedorov, Ramon Matas Navarro, Urmish Thakker, Dibakar Gope, Vijay Janapa Reddi, Matthew Mattina, Paul N. Whatmough
    Fourth Conference on Machine Learning and Systems (MLSys), Apr. 2021.

  9. System, Devices and/or Processes for Adapting Neural Network Processing Devices
    Urmish Thakker, Jesse Beu, Dibakar Gope, and Mark O’Connor
    US Patent Application, 2021.

  10. Rank and Run-time aware compression of NLP Applications [Paper]
    Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika, Matthew Mattina
    First Workshop on Simple and Efficient Natural Language Processing in conjunction with Empirical Methods in Natural Language Processing (EMNLP), Nov. 2020.

  11. Compressing RNNs to Kilobyte budget for IoT devices using Kronecker Products [Paper]
    Urmish Thakker, Jesse Beu, Dibakar Gope, Chu Zhou, Igor Fedorov, Ganesh Dasika, and Matthew Mattina
    ACM Journal on Emerging Technologies in Computing Systems, 2021.

  12. High Throughput Matrix-Matrix Multiplication between Asymmetric Bit-Width Operands [arXiv]
    Dibakar Gope, Jesse Beu, and Matthew Mattina.
    ArXiv, 2020.

  13. Understanding the Impact of Dynamic Channel Pruning on Conditionally Parameterized Convolutions [Paper]
    Ravi Raju*, Dibakar Gope*, Urmish Thakker, and Jesse Beu (* Equal Contribution)
    2nd International Workshop on Challenges in Artificial Intelligence and Machine Learning for Internet of Things (AIChallengeIoT), in conjunction with ACM SenSys, Nov. 2020.

  14. Pushing the Envelope of Dynamic Spatial Gating technologies [Paper]
    Xueqin Huang, Urmish Thakker, Dibakar Gope, and Jesse Beu
    2nd International Workshop on Challenges in Artificial Intelligence and Machine Learning for Internet of Things (AIChallengeIoT), in conjunction with ACM SenSys, Nov. 2020.

  15. Ternary MobileNets via Per-Layer Hybrid Filter Banks [Paper] [Supplemental][arXiv]
    Dibakar Gope, Jesse Beu, Urmish Thakker, and Matthew Mattina
    Joint Workshop on Efficient Deep Learning in Computer Vision, in conjunction with CVPR 2020, Jun. 2020.

  16. Aggressive Compression of MobileNets Using Hybrid Ternary Layers [Paper] [Poster]
    Dibakar Gope, Jesse Beu, Urmish Thakker, and Matthew Mattina
    tinyML Summit, Feb. 2020.

  17. Mixed-Element-Size Instruction
    Jesse Beu, Dibakar Gope, and David Mansell
    US Patent Application, 2020.

  18. Mixed-Precision Computation Unit
    Dibakar Gope, Jesse Beu, Paul Whatmough, and Matthew Mattina
    US Patent Application, 2020.

  19. Hybrid Filter Banks for Artificial Neural Networks
    Dibakar Gope, Jesse Beu, Paul Whatmough, and Matthew Mattina
    US Patent Application, 2020.

  20. Ternary Hybrid Neural-Tree Networks for Highly Constrained IoT Applications (Wake Word Detection) [Paper] [Poster]
    Dibakar Gope, Ganesh Dasika, and Matthew Mattina
    Second Conference on Machine Learning and Systems (MLSys), Mar. 2019.

  21. Pushing the Limits of RNN Compression [Paper]
    Urmish Thakker, Igor Fedorov, Jesse Beu, Dibakar Gope, Chu Zhou, Ganesh Dasika, and Matthew Mattina
    5th Workshop on Energy Efficient Machine Learning and Cognitive Computing, Co-located with the 33rd Conference on Neural Information Processing Systems (NeurIPS), Dec. 2019.

  22. Run-Time Efficient RNN Compression for Inference on Edge Devices [Paper]
    Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika, and Matthew Mattina
    4th Workshop on Energy Efficient Machine Learning and Cognitive Computing for Embedded Applications, Co-located with the 46th Int. Symp. on Computer Architecture (ISCA), Jun. 2019.

  23. RNN Compression using Hybrid Matrix Decomposition
    Urmish Thakker, Ganesh Dasika, Jesse Beu, Dibakar Gope, and Matthew Mattina
    tinyML Summit, Mar. 2019.

  24. Scoped Persistence Barriers for Non-Volatile Memories
    Arkaprava Basu, Mitesh Meswani, Dibakar Gope, and Sooraj Puthoor
    US Patent, 2019.

  25. A Case for Scoped Persist Barriers in GPUs [Paper]
    Dibakar Gope, Arkaprava Basu, Sooraj Puthoor, and Mitesh Meswani
    11th Workshop on General Purpose Processing using GPU (GPGPU), In conjunction with Symp. on Principles and Practice of Parallel Programming (PPoPP), Feb. 2018.

  26. Apparatus and Method for Bias-Free Branch Prediction
    Mikko Lipasti, and Dibakar Gope
    US Patent, 2018.

  27. The CURE: Cluster Communication Using Registers [Paper]
    Vignyan Reddy Kothinti Naresh, Dibakar Gope, and Mikko H. Lipasti
    Proceedings of the Int. Conf. on Compilers, Architecture, and Synthesis for Embedded Systems (CASES), Oct. 2017.

  28. Architectural Support for Server-Side PHP Processing [Paper]
    Dibakar Gope, David J. Schlais, and Mikko H. Lipasti
    Proceedings of the 44th Int. Symp. on Computer Architecture (ISCA), Jun. 2017.

  29. Hash Map Inlining [Paper]
    Dibakar Gope, and Mikko H. Lipasti
    Proceedings of the 25th Int. Conf. on Parallel Architectures and Compilation Techniques (PACT), Sep. 2016.

  30. Statement-Level Parallelism for Scripting Languages [Paper]
    Dibakar Gope, and Mikko H. Lipasti
    1st Workshop on the High Performance Scripting Languages, In conjunction with Symp. on Principles and Practice of Parallel Programming (PPoPP), Feb. 2015.

  31. Bias-Free Branch Predictor [Paper]
    Dibakar Gope, and Mikko H. Lipasti
    Proceedings of the 47th IEEE/ACM Int. Symp. on Microarchitecture (MICRO), Dec. 2014.

  32. Bias-Free Neural Predictor [Paper] [Code]
    Dibakar Gope, and Mikko H. Lipasti
    Proceedings of the 4th JILP Workshop on Computer Architecture Competitions (JWAC-4): Championship Branch Prediction (CBP), Jun. 2014.

  33. Atomic SC for Simple In-order Processors [Paper]
    Dibakar Gope, and Mikko H. Lipasti
    Proceedings of the 20th IEEE Int. Symp. on High Performance Computer Architecture (HPCA), Feb. 2014.
    *Nominated for best paper award

  34. Maximizing Crosstalk-Induced Slowdown during Path Delay Test [Paper]
    Dibakar Gope, and Duncan M. (Hank) Walker
    Proceedings of the 30th IEEE Int. Conf. on Computer Design (ICCD), Sep. 2012.

  35. Exploring a Circuit Design Approach Based on One-Hot Multi-Valued Domino Logic [Paper]
    Dibakar Gope, Kent Lin, and Sunil P. Khatri
    Proceedings of the 53rd IEEE Int. Midwest Symp. on Circuits & Systems (MWSCAS), Aug. 2010.

  36. Detection of High Resistance Bridge Defects using Slack Based Dynamic Bridging Fault Model [Paper]
    Dibakar Gope, Srinivasulu Alampally, Srinivas Kumar Vooka, and Rubin A. Parekhji
    Proceedings of the Synopsys Users Group India (SNUG), 2008.

ArXiv Preprints / Technical Reports

  1. The gem5 Simulator: Version 20.0+ [arXiv]
    Jason Lowe-Power, Abdul Mutaal Ahmad, Ayaz Akram, Mohammad Alian, Rico Amslinger, Matteo Andreozzi, Adrià Armejach, Nils Asmussen, Srikant Bharadwaj, Gabe Black, Gedare Bloom, Bobby R. Bruce, Daniel Rodrigues Carvalho, Jeronimo Castrillon, Lizhong Chen, Nicolas Derumigny, Stephan Diestelhorst, Wendy Elsasser, Marjan Fariborz, Amin Farmahini-Farahani, Pouya Fotouhi, Ryan Gambord, Jayneel Gandhi, Dibakar Gope, Thomas Grass, Bagus Hanindhito, Andreas Hansson, Swapnil Haria, Austin Harris, Timothy Hayes, Adrian Herrera, Matthew Horsnell, Syed Ali Raza Jafri, Radhika Jagtap, Hanhwi Jang, Reiley Jeyapaul, Timothy M. Jones, Matthias Jung, Subash Kannoth, Hamidreza Khaleghzadeh, Yuetsu Kodama, Tushar Krishna, Tommaso Marinelli, Christian Menard, Andrea Mondelli, Tiago Mück, Omar Naji, Krishnendra Nathella, Hoa Nguyen, Nikos Nikoleris, Lena E. Olson, Marc Orr, Binh Pham, Pablo Prieto, Trivikram Reddy, Alec Roelke, Mahyar Samani, Andreas Sandberg, Javier Setoain, Boris Shingarov, Matthew D. Sinclair, Tuan Ta, Rahul Thakur, Giacomo Travaglini, Michael Upton, Nilay Vaish, Ilias Vougioukas, Zhengrong Wang, Norbert Wehn, Christian Weis, David A. Wood, Hongil Yoon, Éder F. Zulian.


Theses

  1. Architectural Support for Scripting Languages [PDF]
    Ph.D. Dissertation: School of Electrical and Computer Engineering, University of Wisconsin-Madison, Jun. 2017
    Advisor: Professor Mikko H. Lipasti

  2. Maximizing Crosstalk-Induced Slowdown During Path Delay Test [PDF]
    Master's Thesis: School of Electrical and Computer Engineering, Texas A&M University, College Station, Jun. 2011
    Advisor: Professor Duncan M. (Hank) Walker


Courses (Machine Learning)


Professional Services


Contact