Saturday, April 29, 2017

Saturday Morning Videos: #ICLR2017 videos


Here are the videos of this year's ICLR2017 meeting.

Monday April 24, 2017

Morning Session – Session Chair: Dhruv Batra 

Opening remarks, video starts at 12:15

9.00 - 9.40 Invited talk 1: Eero Simoncelli, Elucidating and testing hierarchical sensory models through synthesis, Video starts at 26:00
10.30 - 12.30 Poster Session 1 (Conference Papers, Workshop Papers)
Afternoon Session – Session Chair: Joan Bruna (sponsored by Baidu)

14.30 - 15.10 Invited talk 2: Benjamin Recht, What can Deep Learning learn from linear regression, Video starts at 18:30
15.10 - 15.30 Contributed Talk 3: Understanding deep learning requires rethinking generalization - BEST PAPER AWARD, Video starts at 53:30
16.30 - 18.30 Poster Session 2 (Conference Papers, Workshop Papers)

Tuesday April 25, 2017

Morning Session – Session Chair: Tara Sainath (sponsored by Google)

9.00 - 9.40 Invited talk 1: Chloé-Agathe Azencott, High dimensional feature selection in precision medicine, Video starts at 13:24
9.40 - 10.00 Contributed talk 1: Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data - BEST PAPER AWARD, Video starts at 53:10
10.00 - 10.20 Contributed talk 2: Learning Graphical State Transitions, Video starts at 1:14:50
10.30 - 12.30 Poster Session 1 (Conference Papers, Workshop Papers)

Afternoon Session – Session Chair: Raia Hadsell (sponsored by Amazon)

14.00 - 16.00 Poster Session 2 (Conference Papers, Workshop Papers)
16.15 - 17.00 Invited talk 2: Riccardo Zecchina, Video starts at 7:05
17.00 - 17.20 Contributed Talk 3: Learning to Act by Predicting the Future, Video starts at 53:50
17.20 - 17.40 Contributed Talk 4: Reinforcement Learning with Unsupervised Auxiliary Tasks, Video starts at 1:15:30
17.40 - 18.00 Contributed Talk 5: Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic, Video starts at 1:37:10


18.00 - 18.30 Group photo at the RCT Stadium




Wednesday April 26, 2017

Morning Session – Session Chair: Slav Petrov
9.00 - 9.40 Invited talk 1: Regina Barzilay, Moving beyond supervised realm, Video starts at 3:15 + last 8 minutes of this presentation on this video
9.40 - 10.00 Contributed talk 1: Learning End-to-End Goal-Oriented Dialog, Video starts at 27:34
10.00 - 10.20 Contributed talk 2: Multi-Agent Cooperation and the Emergence of (Natural) Language, Video starts at 6:00
10.30 - 12.30 Poster Session 1 (Conference Papers, Workshop Papers)

Afternoon Session – Session Chair: Navdeep Jaitly
14.30 - 15.10 Invited talk 2: Alex Graves, New Direction for Recurrent Neural Networks, Video starts at 4:10


15.10 - 15.30 Contributed Talk 3: Making Neural Programming Architectures Generalize via Recursion - BEST PAPER AWARD, Video starts at 50:12
16.30 - 18.30 Poster Session 2 (Conference Papers, Workshop Papers)




Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Friday, April 28, 2017

Thesis: Randomized Algorithms for Large-Scale Data Analysis by Farhad Pourkamali-Anaraki

Image 1

Stephen just sent me the following:

Hi Igor, 
It's a pleasure to write to you again and announce the graduation of my PhD student Farhad Pourkamali-Anaraki.

It contains a lot of good things, some published, some not. In particular (see attached image 1) he has great work on a 1-pass algorithm for K-means that seems to be one of the only 1-pass algorithms to accurately estimate cluster centers (implementation at https://github.com/stephenbeckr/SparsifiedKMeans ), and also has very recent work on efficient variations of the Nystrom method for approximating kernel matrices that seem to give the high accuracy of the clustered Nystrom method at a fraction of the computational cost (see image 2).
Best,
Stephen

Image 2




Thanks Stephen, but I think the following paper also does 1-pass for K-Means (Keriven N., Tremblay N., Traonmilin Y., Gribonval R., "Compressive K-means" and its implementation SketchMLbox: A MATLAB toolbox for large-scale mixture learning), even though the construction seems different. Both of these implementations will be added to the Advanced Matrix Factorization Jungle page.

Anyway, congratulations Dr. Pourkamali-Anaraki !
Randomized Algorithms for Large-Scale Data Analysis by Farhad Pourkamali-Anaraki. The abstract reads:

Massive high-dimensional data sets are ubiquitous in all scientific disciplines. Extracting meaningful information from these data sets will bring future advances in fields of science and engineering. However, the complexity and high-dimensionality of modern data sets pose unique computational and statistical challenges. The computational requirements of analyzing large-scale data exceed the capacity of traditional data analytic tools. The challenges surrounding large high-dimensional data are felt not just in processing power, but also in memory access, storage requirements, and communication costs. For example, modern data sets are often too large to fit into the main memory of a single workstation and thus data points are processed sequentially without a chance to store the full data. Therefore, there is an urgent need for the development of scalable learning tools and efficient optimization algorithms in today’s high-dimensional data regimes.

A powerful approach to tackle these challenges is centered around preprocessing high-dimensional data sets via a dimensionality reduction technique that preserves the underlying geometry and structure of the data. This approach stems from the observation that high-dimensional data sets often have intrinsic dimension which is significantly smaller than the ambient dimension. Therefore, information-preserving dimensionality reduction methods are valuable tools for reducing the memory and computational requirements of data analytic tasks on large-scale data sets.

Recently, randomized dimension reduction has received a lot of attention in several fields, including signal processing, machine learning, and numerical linear algebra. These methods use random sampling or random projection to construct low-dimensional representations of the data, known as sketches or compressive measurements. These randomized methods are effective in modern data settings since they provide a non-adaptive data-independent mapping of high-dimensional data into a low-dimensional space. However, such methods require strong theoretical guarantees to ensure that the key properties of original data are preserved under a randomized mapping.

This dissertation focuses on the design and analysis of efficient data analytic tasks using randomized dimensionality reduction techniques. Specifically, four efficient signal processing and machine learning algorithms for large high-dimensional data sets are proposed: covariance estimation and principal component analysis, dictionary learning, clustering, and low-rank approximation of positive semidefinite kernel matrices. These techniques are valuable tools to extract important information and patterns from massive data sets. Moreover, an efficient data sparsification framework is introduced that does not require incoherence and distributional assumptions on the data. A main feature of the proposed compression scheme is that it requires only one pass over the data due to the randomized preconditioning transformation, which makes it applicable to streaming and distributed data settings.

The main contribution of this dissertation is threefold: (1) strong theoretical guarantees are provided to ensure that the proposed randomized methods preserve the key properties and structure of high-dimensional data; (2) tradeoffs between accuracy and memory/computation savings are characterized for a large class of data sets as well as dimensionality reduction methods, including random linear maps and random sampling; (3) extensive numerical experiments are presented to demonstrate the performance and benefits of our proposed methods compared to prior works.
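
To make the idea of a non-adaptive, data-independent random mapping concrete, here is a minimal sketch of a plain Gaussian random projection in pure Python. It is a generic illustration, not the sparsification or preconditioning scheme of the thesis; the function names `gaussian_sketch` and `dist`, the dimensions and the seeds are all assumptions made for the example.

```python
import math
import random


def gaussian_sketch(points, k, seed=0):
    """Project d-dimensional points to k dimensions with a random Gaussian map.

    The map is drawn without looking at the data (non-adaptive), and its
    entries are N(0, 1/k) so that squared norms are preserved in expectation.
    """
    rng = random.Random(seed)
    d = len(points[0])
    scale = 1.0 / math.sqrt(k)
    proj = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]
    return [[sum(row[j] * p[j] for j in range(d)) for row in proj] for p in points]


def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical data: 10 points in 500 dimensions, sketched down to 200.
data_rng = random.Random(1)
data = [[data_rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(10)]
sketch = gaussian_sketch(data, k=200)

# Ratio of sketched to original pairwise distances; values near 1 mean
# the geometry of the point set survived the projection.
ratios = [
    dist(sketch[i], sketch[j]) / dist(data[i], data[j])
    for i in range(len(data))
    for j in range(i + 1, len(data))
]
print(min(ratios), max(ratios))
```

The ratios should cluster around 1, with a spread that shrinks as `k` grows. That is the accuracy versus memory tradeoff the abstract refers to, shown here only for this simple linear map.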


Wednesday, April 26, 2017

ICLR2017, third and last day.

This is the last day of ICLR 2017. The meeting is being featured live on Facebook here at: https://www.facebook.com/iclr.cc/ . If you want to say hi, I am around, and we're hiring.


Morning Session – Session Chair: Slav Petrov
7.30 – 9.00 Registration
9.00 - 9.40 Invited talk 1: Regina Barzilay
9.40 - 10.00 Contributed talk 1: Learning End-to-End Goal-Oriented Dialog
10.00 - 10.20 Contributed talk 2: Multi-Agent Cooperation and the Emergence of (Natural) Language
10.20 - 10.30 Coffee Break
10.30 - 12.30 Poster Session 1 (Conference Papers, Workshop Papers)
12.30 - 14.30 Lunch provided by ICLR

Afternoon Session – Session Chair: Navdeep Jaitly
14.30 - 15.10 Invited talk 2: Alex Graves
15.10 - 15.30 Contributed Talk 3: Making Neural Programming Architectures Generalize via Recursion - BEST PAPER AWARD
15.30 - 15.50 Contributed Talk 4: Neural Architecture Search with Reinforcement Learning
15.50 - 16.10 Contributed Talk 5: Optimization as a Model for Few-Shot Learning
16.10 - 16.30 Coffee Break
16.30 - 18.30 Poster Session 2 (Conference Papers, Workshop Papers)







Tuesday, April 25, 2017

#ICLR2017 Tuesday Afternoon Program

 
ICLR 2017 continues this afternoon in Toulon. There will be a blog post for each half day that features direct links to papers from the OpenReview section. The meeting will be featured live on Facebook here at: https://www.facebook.com/iclr.cc/ . If you want to say hi, I am around, and we're hiring.
 
14.00 - 16.00 Poster Session 2 (Conference Papers, Workshop Papers)
16.00 - 16.15 Coffee Break
16.15 - 17.00 Invited talk 2: Riccardo Zecchina
17.00 - 17.20 Contributed Talk 3: Learning to Act by Predicting the Future
17.20 - 17.40 Contributed Talk 4: Reinforcement Learning with Unsupervised Auxiliary Tasks
17.40 - 18.00 Contributed Talk 5: Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic
18.00 - 18.10 Group photo at the Stade Félix Mayol
19.00 - 24.00 Gala dinner offered by ICLR

C1: Sigma Delta Quantized Networks (code)
C2: Paleo: A Performance Model for Deep Neural Networks
C3: DeepCoder: Learning to Write Programs
C4: Topology and Geometry of Deep Rectified Network Optimization Landscapes
C5: Incremental Network Quantization: Towards Lossless CNNs with Low-precision Weights
C6: Learning to Perform Physics Experiments via Deep Reinforcement Learning
C7: Decomposing Motion and Content for Natural Video Sequence Prediction
C8: Calibrating Energy-based Generative Adversarial Networks
C9: Pruning Convolutional Neural Networks for Resource Efficient Inference
C10: Incorporating long-range consistency in CNN-based texture generation (code)
C11: Lossy Image Compression with Compressive Autoencoders
C12: LR-GAN: Layered Recursive Generative Adversarial Networks for Image Generation
C13: Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data
C14: Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data
C15: Mollifying Networks
C16: beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework
C17: Categorical Reparameterization with Gumbel-Softmax
C18: Online Bayesian Transfer Learning for Sequential Data Modeling
C19: Latent Sequence Decompositions
C20: Density estimation using Real NVP
C21: Recurrent Batch Normalization
C22: SGDR: Stochastic Gradient Descent with Restarts
C23: Variable Computation in Recurrent Neural Networks
C24: Deep Variational Information Bottleneck
C25: SampleRNN: An Unconditional End-to-End Neural Audio Generation Model
C26: TopicRNN: A Recurrent Neural Network with Long-Range Semantic Dependency
C27: Frustratingly Short Attention Spans in Neural Language Modeling
C28: Offline Bilingual Word Vectors, Orthogonal Transformations and the Inverted Softmax
C29: LEARNING A NATURAL LANGUAGE INTERFACE WITH NEURAL PROGRAMMER
C30: Designing Neural Network Architectures using Reinforcement Learning
C31: Metacontrol for Adaptive Imagination-Based Optimization (spaceship dataset)
C32: Recurrent Environment Simulators
C33: EPOpt: Learning Robust Neural Network Policies Using Model Ensembles

W1: Lifelong Perceptual Programming By Example
W2: Neu0
W3: Dance Dance Convolution
W4: Bit-Pragmatic Deep Neural Network Computing
W5: On Improving the Numerical Stability of Winograd Convolutions
W6: Fast Generation for Convolutional Autoregressive Models
W7: THE PREIMAGE OF RECTIFIER NETWORK ACTIVITIES
W8: Training Triplet Networks with GAN
W9: On Robust Concepts and Small Neural Nets
W10: Pl@ntNet app in the era of deep learning
W11: Exponential Machines
W12: Online Multi-Task Learning Using Biased Sampling
W13: Online Structure Learning for Sum-Product Networks with Gaussian Leaves
W14: A Theoretical Framework for Robustness of (Deep) Classifiers against Adversarial Samples
W15: Compositional Kernel Machines
W16: Loss is its own Reward: Self-Supervision for Reinforcement Learning
W17: REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models
W18: Precise Recovery of Latent Vectors from Generative Adversarial Networks
W19: Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization (code)
 
 
 
 

#ICLR2017 Tuesday Morning Program

 
 
 
So ICLR 2017 continues today in Toulon. There will be a blog post for each half day that features direct links to papers from the OpenReview section. The meeting will be featured live on Facebook here at: https://www.facebook.com/iclr.cc/ . If you want to say hi, I am around, and we're hiring.


7.30 – 9.00 Registration
9.00 - 9.40 Invited talk 1: Chloé-Agathe Azencott
9.40 - 10.00 Contributed talk 1: Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data - BEST PAPER AWARD
10.00 - 10.20 Contributed talk 2: Learning Graphical State Transitions
10.20 - 10.30 Coffee Break
10.30 - 12.30 Poster Session 1 (Conference Papers, Workshop Papers)


 Conference posters (1st floor)
 
C1: DeepDSL: A Compilation-based Domain-Specific Language for Deep Learning (code)
C2: A SELF-ATTENTIVE SENTENCE EMBEDDING
C3: Deep Probabilistic Programming
C4: Lie-Access Neural Turing Machines
C5: Learning Features of Music From Scratch
C6: Mode Regularized Generative Adversarial Networks
C7: End-to-end Optimized Image Compression (web)
C8: Variational Recurrent Adversarial Deep Domain Adaptation
C9: Steerable CNNs
C10: Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning (code)
C11: PixelVAE: A Latent Variable Model for Natural Images
C12: A recurrent neural network without chaos
C13: Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
C14: Tree-structured decoding with doubly-recurrent neural networks
C15: Introspection: Accelerating Neural Network Training By Learning Weight Evolution
C16: Hyperband: Bandit-Based Configuration Evaluation for Hyperparameter Optimization (page)
C17: Quasi-Recurrent Neural Networks (Keras)
C18: Attend, Adapt and Transfer: Attentive Deep Architecture for Adaptive Transfer from multiple sources in the same domain
C19: A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks
C20: Trusting SVM for Piecewise Linear CNNs
C21: Maximum Entropy Flow Networks
C22: The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables
C23: Unrolled Generative Adversarial Networks
C24: A Simple but Tough-to-Beat Baseline for Sentence Embeddings (blog entry)
C25: Query-Reduction Networks for Question Answering (code)
C26: Machine Comprehension Using Match-LSTM and Answer Pointer (code)
C27: Words or Characters? Fine-grained Gating for Reading Comprehension
C28: Dynamic Coattention Networks For Question Answering (code)
C29: Multi-view Recurrent Neural Acoustic Word Embeddings
C30: Episodic Exploration for Deep Deterministic Policies for StarCraft Micromanagement
C31: Training Agent for First-Person Shooter Game with Actor-Critic Curriculum Learning
C32: Generalizing Skills with Semi-Supervised Reinforcement Learning
C33: Improving Policy Gradient by Exploring Under-appreciated Rewards
 
3rd Floor
 
W1: Programming With a Differentiable Forth Interpreter
W2: Unsupervised Feature Learning for Audio Analysis
W3: Neural Functional Programming
W4: A Smooth Optimisation Perspective on Training Feedforward Neural Networks
W5: Synthetic Gradient Methods with Virtual Forward-Backward Networks
W6: Explaining the Learning Dynamics of Direct Feedback Alignment
W7: Training a Subsampling Mechanism in Expectation
W8: Deep Kernel Machines via the Kernel Reparametrization Trick
W9: Encoding and Decoding Representations with Sum- and Max-Product Networks
W10: Embracing Data Abundance
W11: Variational Intrinsic Control
W12: Fast Adaptation in Generative Models with Generative Matching Networks
W13: Efficient variational Bayesian neural network ensembles for outlier detection
W14: Emergence of Language with Multi-agent Games: Learning to Communicate with Sequences of Symbols
W15: Adaptive Feature Abstraction for Translating Video to Language
W16: Delving into adversarial attacks on deep policies
W17: Tuning Recurrent Neural Networks with Reinforcement Learning
W18: DeepMask: Masking DNN Models for robustness against adversarial samples
W19: Restricted Boltzmann Machines provide an accurate metric for retinal responses to visual stimuli

 

Monday, April 24, 2017

#ICLR2017 Monday Afternoon Program

 
ICLR 2017 is taking place this week in Toulon. There will be a blog post for each half day that features direct links to papers and attendant code, if there is any. The meeting will be featured live on Facebook here at: https://www.facebook.com/iclr.cc/ . If you want to say hi, I am around.
 
Afternoon Session – Session Chair: Joan Bruna (sponsored by Baidu)
14.30 - 15.10 Invited talk 2: Benjamin Recht
15.10 - 15.30 Contributed Talk 3: Understanding deep learning requires rethinking generalization - BEST PAPER AWARD
15.30 - 15.50 Contributed Talk 4: Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
15.50 - 16.10 Contributed Talk 5: Towards Principled Methods for Training Generative Adversarial Networks
16.10 - 16.30 Coffee Break
16.30 - 18.20 Poster Session 2 (Conference Papers, Workshop Papers)
18.20 - 18.30 Group photo at stadium attached to Neptune Congress Center.
 
C1: Neuro-Symbolic Program Synthesis
C2: Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy (code)
C3: Trained Ternary Quantization (code)
C4: DSD: Dense-Sparse-Dense Training for Deep Neural Networks (code)
C5: A Compositional Object-Based Approach to Learning Physical Dynamics (code, project site)
C6: Multilayer Recurrent Network Models of Primate Retinal Ganglion Cells
C7: Improving Generative Adversarial Networks with Denoising Feature Matching (chainer implementation)
C8: Transfer of View-manifold Learning to Similarity Perception of Novel Objects
C9: What does it take to generate natural textures?
C10: Emergence of foveal image sampling from learning to attend in visual scenes
C11: PixelCNN++: A PixelCNN Implementation with Discretized Logistic Mixture Likelihood and Other Modifications
C12: Learning to Optimize
C13: Do Deep Convolutional Nets Really Need to be Deep and Convolutional?
C14: Optimal Binary Autoencoding with Pairwise Correlations
C15: On the Quantitative Analysis of Decoder-Based Generative Models (evaluation code)
C16: Adversarial machine learning at scale
C17: Transfer Learning for Sequence Tagging with Hierarchical Recurrent Networks
C18: Capacity and Learnability in Recurrent Neural Networks
C19: Deep Learning with Dynamic Computation Graphs  (TensorFlow code)
C20: Exploring Sparsity in Recurrent Neural Networks
C21: Structured Attention Networks (code)
C22: Learning to Repeat: Fine Grained Action Repetition for Deep Reinforcement Learning
C23: Variational Lossy Autoencoder
C24: Learning to Query, Reason, and Answer Questions On Ambiguous Texts
C25: Deep Biaffine Attention for Neural Dependency Parsing
C26: A Compare-Aggregate Model for Matching Text Sequences (code)
C27: Data Noising as Smoothing in Neural Network Language Models
C28: Neural Variational Inference For Topic Models
C29: Bidirectional Attention Flow for Machine Comprehension (code, page)
C30: Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic
C31: Stochastic Neural Networks for Hierarchical Reinforcement Learning
C32: Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning (video)
C33: Third Person Imitation Learning
 
W1: Audio Super-Resolution using Neural Networks (code)
W2: Semantic embeddings for program behaviour patterns
W3: De novo drug design with deep generative models : an empirical study
W4: Memory Matching Networks for Genomic Sequence Classification
W5: Char2Wav: End-to-End Speech Synthesis
W6: Fast Chirplet Transform Injects Priors in Deep Learning of Animal Calls and Speech
W7: Weight-averaged consistency targets improve semi-supervised deep learning results
W8: Particle Value Functions
W9: Out-of-class novelty generation: an experimental foundation
W10: Performance guarantees for transferring representations (presentation, video)
W11: Generative Adversarial Learning of Markov Chains
W12: Short and Deep: Sketching and Neural Networks
W13: Understanding intermediate layers using linear classifier probes
W14: Symmetry-Breaking Convergence Analysis of Certain Two-layered Neural Networks with ReLU nonlinearity
W15: Neural Combinatorial Optimization with Reinforcement Learning (TensorFlow code)
W16: Tactics of Adversarial Attacks on Deep Reinforcement Learning Agents
W17: Adversarial Discriminative Domain Adaptation (workshop extended abstract)
W18: Efficient Sparse-Winograd Convolutional Neural Networks
W19: Neural Expectation Maximization 

 
 
 
 
 

#ICLR2017 Monday Morning Program

 
So ICLR 2017 is taking place this week in Toulon. There will be a blog post for each half day that features direct links to papers from the OpenReview section. The meeting will be featured live on Facebook here at: https://www.facebook.com/iclr.cc/ . If you want to say hi, I am around.

Monday April 24, 2017

Morning Session – Session Chair: Dhruv Batra

7.00 - 8.45 Registration
8.45 - 9.00 Opening Remarks
9.00 - 9.40 Invited talk 1: Eero Simoncelli
9.40 - 10.00 Contributed talk 1: End-to-end Optimized Image Compression
10.00 - 10.20 Contributed talk 2: Amortised MAP Inference for Image Super-resolution
10.20 - 10.30 Coffee Break
10.30 - 12.30 Poster Session 1

C1: Making Neural Programming Architectures Generalize via Recursion (slides, code, video)
C2: Learning Graphical State Transitions (code)
C3: Distributed Second-Order Optimization using Kronecker-Factored Approximations
C4: Normalizing the Normalizers: Comparing and Extending Network Normalization Schemes
C5: Neural Program Lattices
C6: Diet Networks: Thin Parameters for Fat Genomics
C7: Unsupervised Cross-Domain Image Generation (TensorFlow implementation)
C8: Towards Principled Methods for Training Generative Adversarial Networks
C9: Recurrent Mixture Density Network for Spatiotemporal Visual Attention
C10: Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer (PyTorch code)
C11: Pruning Filters for Efficient ConvNets
C12: Stick-Breaking Variational Autoencoders
C13: Identity Matters in Deep Learning
C14: On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
C15: Recurrent Hidden Semi-Markov Model
C16: Nonparametric Neural Networks
C17: Learning to Generate Samples from Noise through Infusion Training
C18: An Information-Theoretic Framework for Fast and Robust Unsupervised Learning via Neural Population Infomax
C19: Highway and Residual Networks learn Unrolled Iterative Estimation
C20: Soft Weight-Sharing for Neural Network Compression (Tutorial)
C21: Snapshot Ensembles: Train 1, Get M for Free
C22: Towards a Neural Statistician
C23: Learning Curve Prediction with Bayesian Neural Networks
C24: Learning End-to-End Goal-Oriented Dialog
C25: Multi-Agent Cooperation and the Emergence of (Natural) Language
C26: Efficient Vector Representation for Documents through Corruption (code)
C27: Improving Neural Language Models with a Continuous Cache
C28: Program Synthesis for Character Level Language Modeling
C29: Tracking the World State with Recurrent Entity Networks (TensorFlow implementation)
C30: Reinforcement Learning with Unsupervised Auxiliary Tasks (blog post, an implementation)
C31: Neural Architecture Search with Reinforcement Learning (slides, some implementation of appendix A)
C32: Sample Efficient Actor-Critic with Experience Replay
C33: Learning to Act by Predicting the Future
 
 
 
 

Saturday, April 22, 2017

Sunday Morning Insight: "No Need for the Map of a Cat, Mr Feynman" or The Long Game in Nanopore Sequencing.



About 5 weeks ago, we wondered how we could tell if the world was changing right before our eyes. Well, it is happening; instance #2 just got more real:


Nanopore sequencing is a promising technique for genome sequencing due to its portability, ability to sequence long reads from single molecules, and to simultaneously assay DNA methylation. However until recently nanopore sequencing has been mainly applied to small genomes, due to the limited output attainable. We present nanopore sequencing and assembly of the GM12878 Utah/Ceph human reference genome generated using the Oxford Nanopore MinION and R9.4 version chemistry. We generated 91.2 Gb of sequence data (~30x theoretical coverage) from 39 flowcells. De novo assembly yielded a highly complete and contiguous assembly (NG50 ~3Mb). We observed considerable variability in homopolymeric tract resolution between different basecallers. The data permitted sensitive detection of both large structural variants and epigenetic modifications. Further we developed a new approach exploiting the long-read capability of this system and found that adding an additional 5x-coverage of "ultra-long" reads (read N50 of 99.7kb) more than doubled the assembly contiguity. Modelling the repeat structure of the human genome predicts extraordinarily contiguous assemblies may be possible using nanopore reads alone. Portable de novo sequencing of human genomes may be important for rapid point-of-care diagnosis of rare genetic diseases and cancer, and monitoring of cancer progression. The complete dataset including raw signal is available as an Amazon Web Services Open Dataset at: https://github.com/nanopore-wgs-consortium/NA12878.
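
The abstract quotes contiguity figures as N50 and NG50 values. As a reminder of what these statistics measure, here is a small sketch using the usual definitions: N50 is the length L such that sequences of length at least L account for half of the total bases, and NG50 uses an assumed genome size in place of the total. The helper names and the toy lengths are made up for illustration and have nothing to do with the paper's data or pipeline.

```python
def nx_length(lengths, target_total, fraction=0.5):
    """Return the length at which the running sum of the sorted lengths
    (longest first) reaches `fraction` of `target_total`, or None if it
    never does."""
    threshold = fraction * target_total
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    return None


def n50(lengths):
    """N50: the target is the total length of the sequences themselves."""
    return nx_length(lengths, sum(lengths))


def ng50(lengths, genome_size):
    """NG50: the target is an assumed genome size, not the assembly size."""
    return nx_length(lengths, genome_size)


# Toy contig lengths (hypothetical): total is 54, so half is 27.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))       # 10 + 9 + 8 = 27 reaches half of 54, so 8
print(ng50(toy_lengths, 60))  # half of 60 is 30, reached at 10 + 9 + 8 + 7, so 7
```

Because NG50 is normalized by genome size rather than assembly size, it allows assemblies of different total length to be compared on the same footing.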
Here is some context:

And previously on Nuit Blanche:
 
Credit: NASA, JPL





Friday, April 21, 2017

Random Feature Expansions for Deep Gaussian Processes / AutoGP: Exploring the Capabilities and Limitations of Gaussian Process Models - implementation -

[I will be at ICLR next week, let's grab some coffee if you are there]



Random Feature Expansions for Deep Gaussian Processes by Kurt Cutajar, Edwin V. Bonilla, Pietro Michiardi, Maurizio Filippone
The composition of multiple Gaussian Processes as a Deep Gaussian Process (DGP) enables a deep probabilistic nonparametric approach to flexibly tackle complex machine learning problems with sound quantification of uncertainty. Existing inference approaches for DGP models have limited scalability and are notoriously cumbersome to construct. In this work, we introduce a novel formulation of DGPs based on random feature expansions that we train using stochastic variational inference. This yields a practical learning framework which significantly advances the state-of-the-art in inference for DGPs, and enables accurate quantification of uncertainty. We extensively showcase the scalability and performance of our proposal on several datasets with up to 8 million observations, and various DGP architectures with up to 30 hidden layers.
A python / TensorFlow implementation can be found here: https://github.com/mauriziofilippone/deep_gp_random_features
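
For readers who have not met random feature expansions before, the sketch below shows the standard random Fourier feature construction for a single RBF kernel in pure Python. It only illustrates the building block of approximating a kernel by an inner product of random features. It is not the authors' DGP model or their TensorFlow code, and `make_rff`, `rbf`, the lengthscale and the feature count are assumptions chosen for the example.

```python
import math
import random


def make_rff(dim, num_features, lengthscale=1.0, seed=0):
    """Build a random Fourier feature map for the RBF kernel
    k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2))."""
    rng = random.Random(seed)
    # Frequencies are drawn from N(0, I / lengthscale^2), phases from U[0, 2*pi].
    ws = [
        [rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
        for _ in range(num_features)
    ]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    norm = math.sqrt(2.0 / num_features)

    def features(x):
        return [
            norm * math.cos(sum(wj * xj for wj, xj in zip(w, x)) + b)
            for w, b in zip(ws, bs)
        ]

    return features


def rbf(x, y, lengthscale=1.0):
    """Exact RBF kernel value, for comparison."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * lengthscale ** 2))


phi = make_rff(dim=3, num_features=5000)
x = [0.1, -0.4, 0.7]
y = [0.3, 0.2, -0.5]

# The inner product of the random features approximates the kernel value.
approx = sum(a * b for a, b in zip(phi(x), phi(y)))
exact = rbf(x, y)
print(approx, exact)
```

The approximation error shrinks roughly like one over the square root of the number of features, which is what makes such expansions attractive when exact kernel computations do not scale.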

We investigate the capabilities and limitations of Gaussian process models by jointly exploring three complementary directions: (i) scalable and statistically efficient inference; (ii) flexible kernels; and (iii) objective functions for hyperparameter learning alternative to the marginal likelihood. Our approach outperforms all previously reported GP methods on the standard MNIST dataset; performs comparatively to previous kernel-based methods using the RECTANGLES-IMAGE dataset; and breaks the 1% error-rate barrier in GP models using the MNIST8M dataset, showing along the way the scalability of our method at unprecedented scale for GP models (8 million observations) in classification problems. Overall, our approach represents a significant breakthrough in kernel methods and GP models, bridging the gap between deep learning approaches and kernel machines.
Here is a recent presentation by one of the authors: "Practical and Scalable Inference for Deep Gaussian Processes"

Wednesday, April 19, 2017

Stochastic Gradient Descent as Approximate Bayesian Inference

[I will be at ICLR next week, let's grab some coffee if you are there]


The recent Distill publication Why Momentum Really Works by Gabriel Goh provides some insight into why gradient descent might work. Overviews such as the ones listed below also help:
But today, we have an additional insight into the mapping of SGD to Bayesian inference: Stochastic Gradient Descent as Approximate Bayesian Inference by Stephan Mandt, Matthew D. Hoffman, David M. Blei
Stochastic Gradient Descent with a constant learning rate (constant SGD) simulates a Markov chain with a stationary distribution. With this perspective, we derive several new results. (1) We show that constant SGD can be used as an approximate Bayesian posterior inference algorithm. Specifically, we show how to adjust the tuning parameters of constant SGD to best match the stationary distribution to a posterior, minimizing the Kullback-Leibler divergence between these two distributions. (2) We demonstrate that constant SGD gives rise to a new variational EM algorithm that optimizes hyperparameters in complex probabilistic models. (3) We also propose SGD with momentum for sampling and show how to adjust the damping coefficient accordingly. (4) We analyze MCMC algorithms. For Langevin Dynamics and Stochastic Gradient Fisher Scoring, we quantify the approximation errors due to finite learning rates. Finally (5), we use the stochastic process perspective to give a short proof of why Polyak averaging is optimal. Based on this idea, we propose a scalable approximate MCMC algorithm, the Averaged Stochastic Gradient Sampler.
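The first sentence of the abstract can be checked on a toy problem. The sketch below is not taken from the paper. It assumes a 1D quadratic loss 0.5 * a * theta^2 and additive Gaussian gradient noise of standard deviation sigma, and the function names are made up for illustration. Under these assumptions constant SGD is an AR(1) process whose stationary variance is step * sigma^2 / (a * (2 - step * a)), which is roughly step * sigma^2 / (2a) for small steps.

```python
import random


def run_constant_sgd(curvature, noise_std, step, n_steps, burn_in, seed=0):
    """Run SGD with a constant step on 0.5 * curvature * theta^2.

    The gradient is corrupted by Gaussian noise (an assumption of this toy
    model). Iterates after the burn-in period are returned as samples.
    """
    rng = random.Random(seed)
    theta = 0.0
    samples = []
    for t in range(n_steps):
        noisy_grad = curvature * theta + rng.gauss(0.0, noise_std)
        theta -= step * noisy_grad
        if t >= burn_in:
            samples.append(theta)
    return samples


def stationary_variance(curvature, noise_std, step):
    """Variance of the AR(1) chain theta <- (1 - step*a) * theta - step * xi."""
    return step * noise_std ** 2 / (curvature * (2.0 - step * curvature))


def empirical_variance(samples):
    """Plain (biased) sample variance of the iterates."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


if __name__ == "__main__":
    samples = run_constant_sgd(curvature=1.0, noise_std=1.0, step=0.1,
                               n_steps=200_000, burn_in=1_000)
    print("empirical variance:", empirical_variance(samples))
    print("predicted variance:", stationary_variance(1.0, 1.0, 0.1))
```

The iterates never converge to the minimum. They keep hovering around it with a spread set by the step size, and that spread is the knob one would tune to match a posterior.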





Phase Transitions of Spectral Initialization for High-Dimensional Nonconvex Estimation


  [Personal message: I will be at ICLR next week, let's grab some coffee if you are there]
Yue just sent me the following:
Dear Igor,

I hope all is well.

We recently posted a paper on arXiv on analyzing the exact asymptotic performance of a popular spectral initialization method for various nonconvex signal estimation problems (such as phase retrieval). We think you and readers of your blog might be interested in this research.

The paper can be found here:

https://arxiv.org/abs/1702.06435

Best regards,
Yue
Thanks Yue, two phase transitions ! I like it: Phase Transitions of Spectral Initialization for High-Dimensional Nonconvex Estimation by Yue M. Lu, Gen Li
We study a spectral initialization method that serves a key role in recent work on estimating signals in nonconvex settings. Previous analysis of this method focuses on the phase retrieval problem and provides only performance bounds. In this paper, we consider arbitrary generalized linear sensing models and present a precise asymptotic characterization of the performance of the method in the high-dimensional limit. Our analysis also reveals a phase transition phenomenon that depends on the ratio between the number of samples and the signal dimension. When the ratio is below a minimum threshold, the estimates given by the spectral method are no better than random guesses drawn from a uniform distribution on the hypersphere, thus carrying no information; above a maximum threshold, the estimates become increasingly aligned with the target signal. The computational complexity of the method, as measured by the spectral gap, is also markedly different in the two phases. Worked examples and numerical results are provided to illustrate and verify the analytical predictions. In particular, simulations show that our asymptotic formulas provide accurate predictions for the actual performance of the spectral method even at moderate signal dimensions.
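For readers who have not met the spectral method before, here is a minimal sketch on a real-valued phase retrieval toy problem. It assumes Gaussian sensing vectors, noiseless measurements y_i = (a_i . x)^2 and no preprocessing of y_i. The paper covers general sensing models, so this is one simple instance and not the authors' setup. All function names are illustrative. The estimate is the leading eigenvector of D = (1/m) * sum_i y_i a_i a_i^T, computed here by power iteration.

```python
import math
import random


def make_phase_retrieval_problem(n, m, seed=1):
    """Hypothetical helper: unit-norm signal x, Gaussian a_i, y_i = (a_i . x)^2."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in x))
    x = [c / norm for c in x]
    vectors = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    ys = [sum(ai * xi for ai, xi in zip(a, x)) ** 2 for a in vectors]
    return x, vectors, ys


def spectral_init(sensing_vectors, measurements, n_iters=100, seed=0):
    """Leading eigenvector of D = (1/m) sum_i y_i a_i a_i^T via power iteration."""
    m, n = len(sensing_vectors), len(sensing_vectors[0])
    D = [[0.0] * n for _ in range(n)]
    for a, y in zip(sensing_vectors, measurements):
        for j in range(n):
            weight = y * a[j] / m
            row = D[j]
            for k in range(n):
                row[k] += weight * a[k]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(n_iters):
        w = [sum(D[j][k] * v[k] for k in range(n)) for j in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


def cosine_similarity(u, v):
    """Absolute cosine of the angle between u and v (the sign of x is lost)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return abs(dot) / (nu * nv)


if __name__ == "__main__":
    x, vectors, ys = make_phase_retrieval_problem(n=10, m=2000)
    estimate = spectral_init(vectors, ys)
    print("alignment with the true signal:", cosine_similarity(estimate, x))
```

Here the sampling ratio m/n is 200, far above any threshold, so the estimate should line up well with the signal. Lowering m toward n is a quick way to watch the alignment degrade, although a toy at n = 10 is much too small to reveal the sharp transitions described in the abstract.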



