Bayesian Optimization on Stacked Autoencoders and Deep Boltzmann Machines
Target Audience: Software architects, software developers, IT consultants, IT project managers, business development
Thursday, 6 February 2020 | 14:30 - 15:30 | FDo 4.4
Bayesian Optimization (BO) is generally used to optimize an expensive black-box function: it fits a cheaper-to-evaluate surrogate model, typically a Gaussian Process, and optimizes that surrogate instead of the expensive function itself. Like grid search or random search, BO can be used to find good hyperparameters for Multilayer Perceptrons (MLPs) and CNNs. In our research we applied BO to the number of nodes and the number of layers of Stacked Autoencoders (SAEs) and Deep Boltzmann Machines (DBMs). The question we initially wanted to answer was: can the reconstruction error be decreased by adding more than one layer to an Autoencoder or a Restricted Boltzmann Machine?

In this talk I will first cover Bayesian Optimization in general, briefly explain Gaussian Processes, and demonstrate with the library hyperopt how anyone can use Bayesian Optimization techniques to improve existing deep neural nets.

Next I will talk about Autoencoders (AEs) and Restricted Boltzmann Machines (RBMs). Both are unsupervised learning techniques for discovering latent structure in the input data. They are mostly used for feature extraction; for example, the features learned by an AE or RBM can be fed into an MLP for a classification task.

After explaining these two concepts I will present my research results, which consist of several experiments with different architectures on MNIST. Taking reconstruction error as the performance measure, I will show how Bayesian Optimization of the architecture (layers and nodes) helped us define SAEs and DBMs, and whether those SAEs and DBMs could perform better than AEs and RBMs.

The talk focuses on explaining the concepts of Bayesian Optimization and unsupervised learning, and on some interesting discoveries made while experimenting with the hyperparameter settings (i.e. layers, nodes, batch size and regularization).
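
To make the surrogate idea concrete, here is a minimal sketch of a Bayesian Optimization loop in plain Python. It is not the code from the talk and does not use hyperopt (which, as far as I know, defaults to a tree-structured Parzen estimator rather than a Gaussian Process). Everything in it is an assumption made for illustration: `toy_reconstruction_error` is a hypothetical stand-in for actually training an autoencoder, and the kernel, its length scale, the candidate grid and the expected-improvement acquisition are arbitrary choices.

```python
import math

def rbf(a, b, length_scale=0.15):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(max(A[i][i] - s, 1e-12))
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(xs, ys, x_new, jitter=1e-4):
    """Posterior mean and std of a zero-mean GP fitted to standardized ys."""
    m = sum(ys) / len(ys)
    s = math.sqrt(sum((y - m) ** 2 for y in ys) / len(ys)) or 1.0
    z = [(y - m) / s for y in ys]
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, z))
    k_star = [rbf(a, x_new) for a in xs]
    v = solve_lower(L, k_star)
    var = max(rbf(x_new, x_new) - sum(t * t for t in v), 1e-12)
    mu = m + s * sum(k * a for k, a in zip(k_star, alpha))
    return mu, s * math.sqrt(var)

def expected_improvement(mu, sigma, best):
    """Expected improvement over `best` when minimizing."""
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf

def toy_reconstruction_error(units):
    """Hypothetical objective: NOT a real autoencoder, just a cheap curve."""
    return 0.5 * math.exp(-units / 64.0) + 0.0005 * units

# Hypothetical search space: number of hidden units in one layer.
CANDIDATES = list(range(16, 513, 16))

def bayes_opt(objective, candidates, n_init=3, n_iter=7):
    """Evaluate n_init spread-out points, then n_iter points chosen by EI."""
    scale = float(max(candidates))
    idx = [i * (len(candidates) - 1) // (n_init - 1) for i in range(n_init)]
    history = [(candidates[i], objective(candidates[i])) for i in idx]
    for _ in range(n_iter):
        xs = [u / scale for u, _ in history]
        ys = [e for _, e in history]
        best = min(ys)
        tried = {u for u, _ in history}
        remaining = [u for u in candidates if u not in tried]
        nxt = max(remaining, key=lambda u: expected_improvement(
            *gp_posterior(xs, ys, u / scale), best))
        history.append((nxt, objective(nxt)))
    return min(history, key=lambda p: p[1]), history

if __name__ == "__main__":
    (best_units, best_err), history = bayes_opt(toy_reconstruction_error, CANDIDATES)
    print("evaluated:", [u for u, _ in history])
    print("best units:", best_units, "error:", round(best_err, 4))
```

In a real experiment the objective would train an SAE or DBM with the proposed architecture and return its reconstruction error, which is exactly the expensive step that the surrogate is meant to call as rarely as possible.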