Juhan Bae

jbae [at] cs [dot] toronto [dot] edu
CV    Github    Twitter    Google Scholar   

Last updated: 08.09.2023


I am a PhD student in the Machine Learning Group at the University of Toronto and the Vector Institute, supervised by Roger Grosse. I received my HBSc in computer science and statistics from the same university in 2019. I previously interned at Anthropic and Microsoft Research.

I am interested in a wide range of applications in deep learning. I am currently working on hyperparameter optimization, optimization methods, and interpretability for deep learning.


[P2] What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions

Sang Keun Choe, Hwijeen Ahn, Juhan Bae, Kewen Zhao, Minsoo Kang, Youngseog Chung, Adithya Pratapa, Willie Neiswanger, Emma Strubell, Teruko Mitamura, Jeff Schneider, Eduard Hovy, Roger Grosse, Eric Xing

arXiv 2024
Paper Code

[P1] Training Data Attribution via Approximate Unrolled Differentiation

Juhan Bae, Wu Lin, Jonathan Lorraine, Roger Grosse

arXiv 2024
Paper Code

[C8] Can We Remove the Square-Root in Adaptive Gradient Methods?

Wu Lin, Felix Dangel, Runa Eschenhagen, Juhan Bae, Richard Turner, Alireza Makhzani

ICML 2024

[W4] Using Large Language Models for Hyperparameter Optimization

Michael Zhang, Nishkrit Desai, Juhan Bae, Jonathan Lorraine, Jimmy Ba

NeurIPS 2023, Foundation Models for Decision Making
Paper Code

[T2] Studying Large Language Model Generalization with Influence Functions

Roger Grosse*, Juhan Bae*, Cem Anil*, Nelson Elhage, Alex Tamkin, Amirhossein Tajdini, Benoit Steiner, Dustin Li, Esin Durmus, Ethan Perez, Evan Hubinger, Kamilė Lukošiūtė, Karina Nguyen, Nicholas Joseph, Sam McCandlish, Jared Kaplan, Samuel R. Bowman

arXiv 2023
Paper Blog Code

[T1] Benchmarking Neural Network Training Algorithms

George E. Dahl*, Frank Schneider*, Zachary Nado*, Naman Agarwal*, Chandramouli Sastry, Philipp Hennig, Sourabh Medapati, Runa Eschenhagen, Priya Kasimbeg, Daniel Suo, Juhan Bae, Justin Gilmer, and 13 more authors

arXiv 2023
Paper Code

[C7] Efficient Parametric Approximations of Neural Network Function Space Distance

Nikita Dhawan, Sicong Huang, Juhan Bae, Roger Grosse

ICML 2023
Paper Code

[C6] Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve

Juhan Bae, Michael R. Zhang, Michael Ruan, Eric Wang, So Hasegawa, Jimmy Ba, Roger Grosse

ICLR 2023  (Oral Presentation)
Paper Code

[C5] If Influence Functions are the Answer, Then What is the Question?

Juhan Bae, Nathan Ng, Alston Lo, Marzyeh Ghassemi, Roger Grosse

NeurIPS 2022
Paper Code 1 Code 2

[C4] Amortized Proximal Optimization

Juhan Bae*, Paul Vicol*, Jeff Z. HaoChen, Roger Grosse

NeurIPS 2022
Paper Code

[C3] Analyzing Monotonic Linear Interpolation in Neural Network Loss Landscapes

James Lucas, Juhan Bae, Michael R. Zhang, Stanislav Fort, Richard Zemel, Roger Grosse

ICML 2021
Paper Code

[C2] Delta-STN: Efficient Bilevel Optimization for Neural Networks using Structured Response Jacobians

Juhan Bae and Roger Grosse

NeurIPS 2020
Paper Code

[W3] On Monotonic Linear Interpolation of Neural Network Parameters

James Lucas, Juhan Bae, Michael R. Zhang, Richard Zemel, Jimmy Ba, Roger Grosse

NeurIPS 2020, Optimization for Machine Learning Workshop
Paper Code

[W2] Eigenvalue Corrected Noisy Natural Gradient

Juhan Bae, Guodong Zhang, Roger Grosse

NeurIPS 2019, Bayesian Deep Learning Workshop
Paper Code

[C1] Fast 6DOF Pose Estimation with Synthetic Textureless CAD Model for Mobile Applications

Bowen Chen, Juhan Bae, Dibyendu Mukherjee

ICIP 2019

[W1] Learnable Pooling Methods for Video Classification

Sebastian Kmiec, Juhan Bae, Ruijian An

ECCV 2018, Workshop on YouTube-8M Large-Scale Video Understanding  (Oral Presentation)
Paper Code