Title: Are Particle-Based Methods the Future of Sampling in Joint Energy Models? A Deep Dive into SVGD and SGLD
Author: Shah, Vedant Rajiv
Type: Thesis (ETD)
Language: en
Date issued: 2024-08-19
Date accessioned/available: 2024-08-20
Identifier: vt_gsexam:41322
URI: https://hdl.handle.net/10919/120961
Rights: Creative Commons Attribution-NonCommercial 4.0 International
Keywords: Stein Variational Gradient Descent; Joint Energy Models; Stochastic Gradient Langevin Dynamics; Wide Residual Networks; Energy Based Models

Abstract: This thesis investigates the integration of Stein Variational Gradient Descent (SVGD) with Joint Energy Models (JEMs), comparing its performance to Stochastic Gradient Langevin Dynamics (SGLD). We incorporate a generative loss term with an entropy component to enhance sample diversity, and a smoothing factor to mitigate the numerical instabilities commonly associated with the energy function in energy-based models. Experiments on the CIFAR-10 dataset demonstrate that SGLD, particularly when paired with Sharpness-Aware Minimization (SAM), outperforms SVGD in classification accuracy. However, SVGD without SAM, despite its lower classification accuracy, exhibits lower calibration error, underscoring its potential for building the well-calibrated classifiers required in safety-critical applications. Our results emphasize the importance of adaptively tuning the SVGD smoothing factor ($\alpha$) to balance generative and classification objectives. The thesis also highlights the trade-off between computational cost and performance, with SVGD demanding significant computational resources. Our findings stress the need for adaptive scaling and robust optimization techniques to improve the stability and efficacy of JEMs. This work lays the groundwork for exploring more efficient and robust sampling techniques within the JEM framework and offers insights into integrating SVGD with JEMs.
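
To make the sampling method concrete, here is a minimal sketch of one SVGD update step with an RBF kernel. This is illustrative only, not the thesis's implementation: the names `svgd_step`, `rbf_kernel`, and `score_fn` are hypothetical, and `alpha` is shown here as a simple damping of the score term as a stand-in for the thesis's smoothing factor, whose exact placement in the JEM energy is not specified in this record.

```python
# Minimal SVGD update sketch (NumPy). Assumptions: RBF kernel with fixed
# bandwidth h, and a hypothetical smoothing factor `alpha` applied as a
# damping of the score term (the thesis's exact use of alpha may differ).
import numpy as np

def rbf_kernel(X, h=1.0):
    """Return the RBF kernel matrix K and the summed kernel gradients.

    X: (n, d) particle positions.
    K: (n, n) with K[j, i] = exp(-||x_j - x_i||^2 / h).
    grad_K: (n, d) with grad_K[i] = sum_j grad_{x_j} k(x_j, x_i).
    """
    diffs = X[:, None, :] - X[None, :, :]                  # diffs[j, i] = x_j - x_i
    sq_dists = np.sum(diffs ** 2, axis=-1)                 # (n, n)
    K = np.exp(-sq_dists / h)
    # grad_{x_j} k(x_j, x_i) = -(2/h) * (x_j - x_i) * k(x_j, x_i)
    grad_K = np.sum(-2.0 / h * diffs * K[:, :, None], axis=0)  # sum over j
    return K, grad_K

def svgd_step(X, score_fn, step_size=1e-2, alpha=1.0, h=1.0):
    """One SVGD update: x_i <- x_i + eps * phi(x_i), where
    phi(x_i) = (1/n) sum_j [k(x_j, x_i) * score(x_j) + grad_{x_j} k(x_j, x_i)].

    score_fn(X) returns grad_x log p(x) per particle, shape (n, d).
    """
    n = X.shape[0]
    K, grad_K = rbf_kernel(X, h)
    scores = alpha * score_fn(X)            # damped score (driving/attractive term)
    phi = (K @ scores + grad_K) / n         # kernel-weighted scores + repulsion
    return X + step_size * phi
```

A quick usage check on a toy target, a standard 2D Gaussian whose score is simply -x: starting particles far from the mode drift toward it while the repulsive kernel-gradient term keeps them spread out, which is the diversity-preserving behavior that distinguishes SVGD from single-chain SGLD.

```python
rng = np.random.default_rng(0)
X = rng.normal(loc=4.0, scale=1.0, size=(100, 2))   # particles start off-target
for _ in range(500):
    X = svgd_step(X, score_fn=lambda X: -X, step_size=0.1)
print(X.mean(axis=0))   # approaches [0, 0]
```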