Friday, February 28, 2014

Nuit Blanche in Review (February 2014)

This past month, we remembered the Past (Remembering Columbia STS-107, The Graceful Rendez-Vous Pitch Maneuvers) and saw the Future (Predicting the Future: The Upcoming Stephanie Events). We learned that Compressive Sensing Skews Everything, including Terry Tao's h-index. We wondered whether Bad Learning was Learning. I submitted From Direct Imaging to Machine Learning … and more, in 15 minutes to JIONC, whose organizers decided to put it on a poster (how do you convert 15 minutes into a poster?). I also provided some tips on Organizing Meetups and got Nuit Blanche featured in the February issue of La Recherche. But this is nothing compared to the slow recognition that Machine Learning techniques like RKHS may be a way to connect nonlinear sensing and compressive sensing more deeply.

In the meantime, a large number of implementations were made available by their respective authors:
We had two Sunday Morning Insights

Specific Focused entries
But also 

CAI:

Meetups:

Sparse Estimation From Noisy Observations of an Overdetermined Linear System / On the Randomized Kaczmarz Algorithm

I wonder why I never see the Donoho-Tanner phase transition in these overdetermined linear system papers. Anyway, today we have some solvers in that area:

Sparse Estimation From Noisy Observations of an Overdetermined Linear System by Liang Dai, Kristiaan Pelckmans
This note studies a method for the efficient estimation of a finite number of unknown parameters from linear equations, which are perturbed by Gaussian noise.
In case the unknown parameters have only few nonzero entries, the proposed estimator performs more efficiently than a traditional approach. The method consists of three steps:
(1) a classical Least Squares Estimate (LSE),
(2) the support is recovered through a Linear Programming (LP) optimization problem which can be computed using a soft-thresholding step,
(3) a de-biasing step using a LSE on the estimated support set.
The main contribution of this note is a formal derivation of an associated ORACLE property of the final estimate.
That is, when the number of samples is large enough, the estimate is shown to equal the LSE based on the support of the true parameters.
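
To make the three-step procedure concrete, here is a minimal numpy sketch of the same pipeline on a toy problem. The threshold used in the support-recovery step is a rough plug-in guess, not the value derived in the paper, so take it as an illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k, sigma = 200, 50, 5, 0.1                 # overdetermined: m > n, only k nonzero entries

A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
y = A @ x_true + sigma * rng.standard_normal(m)

# (1) classical least squares estimate
x_ls, *_ = np.linalg.lstsq(A, y, rcond=None)

# (2) support recovery by (soft-)thresholding the LSE; the paper derives the proper
#     threshold from the noise level, here we just use a rough plug-in value
tau = 3 * sigma / np.sqrt(m)
support = np.flatnonzero(np.abs(x_ls) > tau)

# (3) de-biasing: refit by least squares restricted to the estimated support
coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
x_hat = np.zeros(n)
x_hat[support] = coef

print("estimated support:", np.sort(support))
print("true support:     ", np.sort(np.flatnonzero(x_true)))
```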

On the Randomized Kaczmarz Algorithm by Liang Dai, Mojtaba Soltanalian, Kristiaan Pelckmans
The Randomized Kaczmarz Algorithm is a randomized method which aims at solving a consistent system of overdetermined linear equations. This note discusses how to find an optimized randomization scheme for this algorithm, which is related to the question raised by \cite{c2}. Illustrative experiments are conducted to support the findings.
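
As a point of reference, here is a quick sketch of the classical randomized Kaczmarz iteration with the usual squared-row-norm sampling (the Strohmer-Vershynin scheme); the note above is precisely about whether one can choose a better sampling distribution than this baseline:

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=5000, seed=0):
    """Randomized Kaczmarz for a consistent overdetermined system Ax = b.

    Rows are sampled with probability proportional to their squared norm,
    the classical scheme; optimizing this distribution is the paper's topic.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms2 = np.einsum("ij,ij->i", A, A)
    probs = row_norms2 / row_norms2.sum()
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        # project the current iterate onto the hyperplane <a_i, x> = b_i
        x += (b[i] - A[i] @ x) / row_norms2[i] * A[i]
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 20))
x_true = rng.standard_normal(20)
print(np.linalg.norm(randomized_kaczmarz(A, A @ x_true) - x_true))
```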




Join the CompressiveSensing subreddit or the Google+ Community and post there!
Liked this entry? Subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on LinkedIn.

Thursday, February 27, 2014

MISO: Incremental Majorization-Minimization Optimization with Application to Large-Scale Machine Learning - implementation -




Majorization-minimization algorithms consist of successively minimizing a sequence of upper bounds of the objective function. These upper bounds are tight at the current estimate, and each iteration monotonically drives the objective function downhill. Such a simple principle is widely applicable and has been very popular in various scientific fields, especially in signal processing and statistics. In this paper, we propose an incremental majorization-minimization scheme for minimizing a large sum of continuous functions, a problem of utmost importance in machine learning. We present convergence guarantees for non-convex and convex optimization when the upper bounds approximate the objective up to a smooth error; we call such upper bounds "first-order surrogate functions". More precisely, we study asymptotic stationary point guarantees for non-convex problems, and for convex ones, we provide convergence rates for the expected objective function value. We apply our scheme to composite optimization and obtain a new incremental proximal gradient algorithm with linear convergence rate for strongly convex functions. In our experiments, we show that our method is competitive with the state of the art for solving large-scale machine learning problems such as logistic regression, and we demonstrate its usefulness for sparse estimation with non-convex penalties.

The implementation is in SPAMS: SPArse Modeling Software
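
To give a flavor of the incremental majorization-minimization idea on smooth functions, here is a small numpy sketch: each f_i is majorized by a quadratic surrogate anchored at the last point where it was visited, and the minimizer of the sum of surrogates is updated incrementally. This is only an illustration of the principle on a toy least-squares problem, not the SPAMS implementation:

```python
import numpy as np

def miso_smooth(grad, L, n_samples, dim, epochs=50, seed=0):
    """Sketch of incremental MM for (1/n) sum_i f_i(x), each f_i being L-smooth.

    Each f_i is majorized by g_i(x) = f_i(z_i) + <grad f_i(z_i), x - z_i> + (L/2)||x - z_i||^2;
    the sum of surrogates is minimized at the average of the anchors z_i - grad f_i(z_i)/L,
    and each step refreshes one surrogate at the current iterate.
    """
    rng = np.random.default_rng(seed)
    anchors = np.zeros((n_samples, dim))            # stores z_i - grad f_i(z_i) / L
    x = anchors.mean(axis=0)
    for _ in range(epochs * n_samples):
        i = rng.integers(n_samples)
        new_anchor = x - grad(i, x) / L
        x += (new_anchor - anchors[i]) / n_samples  # incremental update of the average
        anchors[i] = new_anchor
    return x

# toy problem: f_i(x) = 0.5 * (a_i^T x - b_i)^2
rng = np.random.default_rng(1)
A = rng.standard_normal((500, 20))
b = A @ rng.standard_normal(20)
L = (A ** 2).sum(axis=1).max()                      # smoothness bound valid for every f_i
x_hat = miso_smooth(lambda i, x: (A[i] @ x - b[i]) * A[i], L, 500, 20)
print(np.linalg.norm(A @ x_hat - b))                # residual shrinks as the number of epochs grows
```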

Also of related interest are the following two implementations:

Wednesday, February 26, 2014

Improved Recovery Guarantees for Phase Retrieval from Coded Diffraction Patterns / Sparse phase retrieval via group-sparse optimization

Continuing on the more theoretical streak:

This paper deals with sparse phase retrieval, i.e., the problem of estimating a vector from quadratic measurements under the assumption that few components are nonzero. In particular, we consider the problem of finding the sparsest vector consistent with the measurements and reformulate it as a group-sparse optimization problem with linear constraints. Then, we analyze the convex relaxation of the latter based on the minimization of a block l1-norm and show various exact recovery and stability results in the real and complex cases. Invariance to circular shifts and reflections are also discussed for real vectors measured via complex matrices.
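
A rough reading of that formulation in code: lift the unknown to X ~ x x^T so that the quadratic measurements become linear constraints in X, and minimize the block l1-norm (the sum of the l2-norms of the rows of X), which vanishes outside the support of x. The cvxpy sketch below follows that reading on a tiny real-valued instance and is not the authors' exact program:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 20, 60, 3
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n))
y = (A @ x) ** 2                                   # phaseless (quadratic) measurements

X = cp.Variable((n, n), symmetric=True)            # lifted variable, stands in for x x^T
constraints = [cp.sum(cp.multiply(np.outer(A[i], A[i]), X)) == y[i] for i in range(m)]
objective = cp.Minimize(cp.sum(cp.norm(X, 2, axis=1)))   # block l1-norm over the rows
cp.Problem(objective, constraints).solve()

# read an estimate of x back (up to a global sign) from the leading eigenvector of X
w, V = np.linalg.eigh(X.value)
x_hat = np.sqrt(max(w[-1], 0)) * V[:, -1]
print(min(np.linalg.norm(x_hat - x), np.linalg.norm(x_hat + x)))
```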



Improved Recovery Guarantees for Phase Retrieval from Coded Diffraction Patterns by David Gross, Felix Krahmer, Richard Kueng
In this work we analyze the problem of phase retrieval from Fourier measurements with random diffraction patterns. To this end, we consider the recently introduced PhaseLift algorithm, which expresses the problem in the language of convex optimization. We provide recovery guarantees which require O(log^2 d) different diffraction patterns, thus improving on recent results by Candes et al. [arXiv:1310.3240], which require O(log^4 d) different patterns.
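
As a reminder of what the measurement model looks like, here is a tiny numpy sketch of coded diffraction patterns: the signal is multiplied by a few random masks and only the squared magnitude of the Fourier transform of each masked copy is recorded. The mask distribution below is just a plausible stand-in, not necessarily the admissible distribution analyzed in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_patterns = 64, 6                                   # signal length and number of masks

x = rng.standard_normal(d) + 1j * rng.standard_normal(d)
masks = rng.choice([1, -1, 1j, -1j], size=(n_patterns, d))   # random phase masks

# coded diffraction measurements: |FFT(mask * signal)|^2, one row per pattern
measurements = np.abs(np.fft.fft(masks * x, axis=1)) ** 2
print(measurements.shape)        # (n_patterns, d) intensities, the input that PhaseLift works from
```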









Small ball probabilities for linear images of high dimensional distributions / Circular law for random matrices with exchangeable entries



Today, we are more on the theoretical side of things:

We study concentration properties of random vectors of the form AX, where X = (X_1,..., X_n) has independent coordinates and A is a given matrix. We show that the distribution of AX is well spread in space whenever the distributions of X_i are well spread on the line. Specifically, assume that the probability that X_i falls in any given interval of length T is at most p. Then the probability that AX falls in any given ball of radius T \|A\|_{HS} is at most (Cp)^{0.9 r(A)}, where r(A) denotes the stable rank of A.
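
As a side note, the stable rank r(A) that appears in the bound is just the squared Hilbert-Schmidt (Frobenius) norm divided by the squared operator norm, which is easy to compute:

```python
import numpy as np

# stable rank r(A) = ||A||_HS^2 / ||A||^2; it always lies between 1 and rank(A)
A = np.random.default_rng(0).standard_normal((100, 40))
stable_rank = np.linalg.norm(A, "fro") ** 2 / np.linalg.norm(A, 2) ** 2
print(stable_rank)
```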


An exchangeable random matrix is a random matrix with distribution invariant under any permutation of the entries. For such random matrices, we show, as the dimension tends to infinity, that the empirical spectral distribution tends to the uniform law on the unit disc. This is an instance of the universality phenomenon known as the circular law, for a model of random matrices with dependent entries, rows, and columns. It is also a non-Hermitian counterpart of a result of Chatterjee on the semi-circular law for random Hermitian matrices with exchangeable entries. The proof relies in particular on a reduction to a simpler model given by a random shuffle of a rigid deterministic matrix, on Hermitization, and also on combinatorial concentration of measure and combinatorial Central Limit Theorem. A crucial step is a polynomial bound on the smallest singular value of exchangeable random matrices, which may be of independent interest.
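
If you want to see the phenomenon for yourself, the "random shuffle of a rigid deterministic matrix" mentioned in the abstract is easy to simulate: shuffle a fixed array of +1/-1 entries into an n x n matrix, rescale by sqrt(n), and look at the spectrum. The eigenvalues should roughly fill the unit disc (sketch only):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 500
entries = np.concatenate([np.ones(n * n // 2), -np.ones(n * n // 2)])   # empirical mean 0, variance 1
rng.shuffle(entries)                          # uniform random shuffle -> exchangeable entries
M = entries.reshape(n, n) / np.sqrt(n)

eig = np.linalg.eigvals(M)
plt.scatter(eig.real, eig.imag, s=2)
plt.gca().set_aspect("equal")
plt.show()                                    # the cloud of eigenvalues approximates the unit disc
```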




Image Credit: NASA/JPL/Space Science Institute
Full-Res: N00220668.jpg

N00220668.jpg was taken on February 22, 2014 and received on Earth February 24, 2014. The camera was pointing toward TITAN at approximately 2,340,288 miles (3,766,329 kilometers) away, and the image was taken using the CL1 and UV3 filters.





Tuesday, February 25, 2014

Predicting the Future: The Upcoming Stephanie Events

You probably recall this entry on The Steamrollers, technologies that are improving faster than Moore's law. It turns out that at the Paris Machine Learning Meetup #5, Jean-Philippe Vert [1] provided some insight as to what happened in 2007-2008 on this curve:


Namely, the fact that the genomic pipelines went parallel. The curve has been going down ever since but one wonders if we will see another phase transition.

The answer is yes. 

Why? From [3]:
However, nearly all of these new techniques concomitantly decrease genome quality, primarily due to the inability of their relatively short read lengths to bridge certain genomic regions, e.g., those containing repeats. Fragmentation of predicted open reading frames (ORFs) is one possible consequence of this decreased quality.

It is one thing to go fast; it is another to produce good results. Both of these issues, speed and accuracy with short reads, may well be answered by nanopore technology, specifically now that Quantum Biosystems Provides Raw Data Access to New Sequencing Technology, and I am hearing through the grapevine that their data release program is successful. What are the consequences of the current capabilities? Eric Schadt calls it the Stephanie Event (see also [4]). How much more often are we going to have those events in the future if we have very accurate long read lengths and the attendant algorithms [2, 5]? I am betting more often.

Avoiding pathologies in very deep networks - implementation -

Deeper kernels can now use tricks from deep learning with dropout regularization. From the attendant slides:





Choosing appropriate architectures and regularization strategies for deep networks is crucial to good predictive performance. To shed light on this problem, we analyze the analogous problem of constructing useful priors on compositions of functions. Specifically, we study the deep Gaussian process, a type of infinitely-wide, deep neural network. We show that in standard architectures, the representational capacity of the network tends to capture fewer degrees of freedom as the number of layers increases, retaining only a single degree of freedom in the limit. We propose an alternate network architecture which does not suffer from this pathology. We also examine deep covariance functions, obtained by composing infinitely many feature transforms. Lastly, we characterize the class of models obtained by performing dropout on Gaussian processes.
The attendant code is here.
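
If you just want to see the pathology, composing a few independent draws from a one-dimensional RBF Gaussian process already shows it: after a handful of layers the composition is nearly flat over large regions, i.e. it retains very few effective degrees of freedom. This is only a caricature of the deep GP construction studied in the paper, not the authors' code:

```python
import numpy as np
import matplotlib.pyplot as plt

def gp_draw(inputs, lengthscale=1.0, jitter=1e-6, rng=None):
    # one sample from a zero-mean GP with an RBF kernel, evaluated at the given 1-D inputs
    d = inputs[:, None] - inputs[None, :]
    K = np.exp(-0.5 * (d / lengthscale) ** 2) + jitter * np.eye(len(inputs))
    return np.linalg.cholesky(K) @ rng.standard_normal(len(inputs))

rng = np.random.default_rng(3)
x = np.linspace(-5, 5, 300)
h = x.copy()
for layer in range(1, 6):
    h = gp_draw(h, rng=rng)          # feed the previous layer's output in as the next layer's input
    plt.plot(x, h, label=f"{layer} layer(s)")
plt.legend()
plt.show()
```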

Potentially interesting reading:

Monday, February 24, 2014

So Compressive Sensing Skews Everything

It's not just your worldview, consider Terry Tao's h-index:



"..For another example, consider T. Tao’s Google scholar profile. Since he has 30 053 citations, the rule of thumb predicts his h-index is 93.6. This is far from his actual h-index of 65. Now, his top five citations (joint with E. Candes on compressed sensing) are applied. Removing the papers on this topic leaves 13 942 citations. His new estimate is therefore 63.7 and his revised h-index is 61..."

The h-index was introduced by the physicist J.E. Hirsch in 2005 as a measure of a researcher's productivity. We consider the "combinatorial Fermi problem" of estimating h given the citation count. Using the Euler-Gauss identity for integer partitions, we compute confidence intervals. An asymptotic theorem about Durfee squares, due to E.R. Canfield-S. Corteel-C.D. Savage from 1998, is reinterpreted as the rule of thumb h = 0.54 x (citations)^{1/2}. We compare these intervals and the rule of thumb to empirical data (primarily using mathematicians).
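
The quoted numbers check out against that rule of thumb:

```python
import math

# h ≈ 0.54 * sqrt(citations)
print(0.54 * math.sqrt(30053))   # ≈ 93.6, the prediction for the full profile
print(0.54 * math.sqrt(13942))   # ≈ 63.8, the prediction once the compressed sensing papers are removed
```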

Around the blogs in 78 Summer hours

It's been a while since the last Around the blogs in 78 hours, here is what I could gather from the past month and a half:






Moritz


Carson


Vladimir

Deep Learning



Danny


Josh



Greg
Let's build something together, the Tin Can Radar Forum




Carson



While on Nuit Blanche, we had:


Image Credit: NASA/JPL-Caltech
This image was taken by Rear Hazcam: Left B (RHAZ_LEFT_B) onboard NASA's Mars rover Curiosity on Sol 552 (2014-02-24 07:57:39 UTC).
Full Resolution



Friday, February 21, 2014

Organizing Meetups: Some Tips

Meetups are a moment in time and space where people from very different worlds can get together and exchange ideas on a specific topic. The crowd can be much more diverse than what one sees in a traditional academic meeting.

In the course of organizing the Paris Machine Learning Meetup series, here are some of the things Franck Bardol, Frederic Dembak and I have noted, which ought to be considered by whoever wants to organize similar events.

Use the meetup.com platform. I don't know eventbrite, but having organized different series of academic meetings in the past, I can say that meetup.com takes a lot of the headaches away so that you can concentrate on getting the best speakers and the best discussions during the networking events. On top of being a good platform for organizing meetups, there is an underlying recommender system that lets users know of new meetup groups in the "vicinity" of their current interests and of the meetups they already attend.

You also need a place that provides:
  • high-speed internet and wifi
  • a projector that connects to any laptop
  • a microphone that connects to loudspeakers
  • people who support your technical community
DojoEvents has provided this in the past few meetups.

We generally use only one laptop for the projector so that no computer switching breaks the flow of the presentations. All presentations are generally available in the archive section before the meetup. This serves two purposes, one of which is that there is a backup on the interweb.
 
We have taken the initiative of streaming each meetup, for different reasons (see one of them below). This only requires a laptop. The microphone used by the speaker to be heard by the rest of the room will be enough for the streaming, provided the laptop is at the right location. We have also used the suitcase from Parisson. If you only have access to a laptop, you ought to use Google+ Hangout on Air (not just the regular hangout where several people can interact). When the streaming is done, you can then put the video on YouTube or Vimeo.

We have also maintained an archive of all the previous meetups so that potential speakers can gauge the level of technical detail of their future presentations accordingly. That archive serves another purpose: damping attendance issues.

Attendance:

Initially things will be OK, but soon enough you may run into the following problem: people will RSVP yes irrespective of their actual physical attendance.
Meetup.com allows one to set limits on attendance. Looking at how other Paris-area meetups handle this, we have noticed that if one sets a limit on the number of attendees, people will RSVP yes automatically without any actual regard as to whether they will be in town that day. The attendance list fills up very rapidly (for some Parisian meetups, I have heard of events being full four hours after the attendance list opened). Because the list fills up so rapidly, other people get put on a waiting list. The problem with this is that being on the attendance list becomes something artificially "precious". As a consequence, few people will reset their RSVP to NO, even an hour before the meetup, because nobody gives away something that "precious".

Our strategy so far (we may change it in the future) has been a combination of:
  • streaming the event 
  • setting up an archive of previous meetups with the presentations
  • making available the presentations on the archive before the event
  • sending constant reminders to attendees to change their RSVP to NO, a week, a day, an hour before the meetup.
  • allowing people to RSVP yes even an hour before the start of the meetup
The first three items make attending less "precious" and make it more likely that people will change their RSVP to NO well before the meetup starts.
In terms of organization of the meetup:
  • We generally have sodas, champagne and pizza before the presentations allowing for about 30 to 45 minutes of meet and greet / networking between attendees.
  • We generally try to have a word or two at the beginning of the meetup to give some context about what the speakers will talk about. We also mention local initiatives. 
  • We generally ask that the presentation slides be in English (a good thing for the archives) but we leave the choice of spoken language open. It goes without saying that listening to and understanding a foreign language (English) is not the same as really understanding what is said. Most of our speakers have given their presentations in French. We are absolutely open to both languages, even for those speakers whose mother tongue is French.  
  • We started a series of remote presentations by having remote speakers give us a pitch or a presentation through Skype. Those are very convenient on two accounts: they require 20 minutes of the time of somebody who is sometimes 8,000 miles away, with no travel required, and a local crowd is exposed to some of the most innovative stuff on Earth. Who would say no to that? 
    • In the case of a pitch, we had two speakers give us pitches for their crowdfunding projects. It required that we play the video of the crowdfunding campaign locally on the projector, followed by a 5 to 10 minute Q&A with the presenters directly on Skype. The Skype call was made directly from the laptop connected to the projector so that the faces of the presenters would be big enough for the attendees to feel like they were talking to a real person. This happened at Meetup 7 and it went flawlessly. As an organizer, it is important that you prepare the crowd to ask relevant questions. 
    • In the case of a remote presentation, we used an iPad for the video Skype call and put the microphone next to the loudspeakers of the iPad (we could also have connected the local loudspeakers to the iPad). The full screen of the iPad was placed at about the spot where a local presenter would stand, giving the impression to most that the remote presenter was not that remote. We had the presentation locally on the laptop connected to the projector, and the speaker would ask the local organizers to go through each slide. A good thing is for those slides to be numbered. This happened at Meetup 8 and it went pretty much flawlessly. Again, as an organizer, it is important that you prepare the crowd to ask relevant questions. 
  • At the last meetup, we had five presentations. This was a bit much; three or four might be more optimal. 
  • We ask presenters for their presentations ahead of time and we also ask them to keep in mind the general topic of the meetup, which for us is Machine Learning. We also generally ask the speakers to provide ways for people to get in touch with them after the meetup: email, webpages, twitter...
  • Finally, we have also created a mirror group on LinkedIn in order to provide a means of advertising jobs, etc., which is not the main purpose of a platform like meetup.com. Currently the meetup group has more than 597 members, while the Paris Machine Learning LinkedIn group has 294 members and has seen about 10 job offers since it started about five months ago.
So what's stopping you from organizing meetups?




Image Credit: NASA/JPL/Space Science Institute
W00086955.jpg was taken on February 19, 2014 and received on Earth February 20, 2014. The camera was pointing toward SATURN at approximately 1,762,178 miles (2,835,951 kilometers) away, and the image was taken using the CB2 and CL2 filters. 



Interest Zone Matrix Approximation - implementation -



Interest Zone Matrix Approximation by Gil Shabat, Amir Averbuch

Abstract. We describe an algorithm for matrix approximation when only some of its entries are taken into consideration. The approximation constraint can be any whose approximated solution is known for the full matrix. For low rank approximations, this type of algorithms appears recently in the literature under different names, where it usually uses the Expectation-Maximization algorithm that maximizes the likelihood for the missing entries without any relation for mean square error minimization. In this paper, we show that the algorithm can be extended to different cases other than low rank approximations under Frobenius norm such as minimizing the Frobenius norm under nuclear norm constraint, spectral norm constraint, orthogonality constraint and more. We also discuss the geometric interpretation of the proposed approximation process and its optimality for convex constraints and show how the approximation algorithm can be used for matrix completion under a variety of spectral regularizations. Its applications to physics, electrical engineering and data interpolation problems are also described.

Abstract—We describe several algorithms for matrix completion and matrix approximation when only some of its entries are known. The approximation constraint can be any whose approximated solution is known for the full matrix. For low rank approximations, similar algorithms appear recently in the literature under different names. In this work, we introduce new theorems for matrix approximation and show that these algorithms can be extended to handle different constraints such as nuclear norm, spectral norm, orthogonality constraints and more that are different than low rank approximations. As the algorithms can be viewed from an optimization point of view, we discuss their convergence to global solution for the convex case. We also discuss the optimal step size and show that it is fixed in each iteration. In addition, the derived matrix completion flow is robust and does not require any parameters. This matrix completion flow is applicable to different spectral minimizations and can be applied to physics, mathematics and electrical engineering problems such as data reconstruction of images and data coming from PDEs such as Helmholtz’s equation used for electromagnetic waves.
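
Here is a minimal sketch of the alternating scheme both abstracts describe, specialized to the simplest case of a rank-r approximation in the Frobenius norm (for which the full-matrix solution is a truncated SVD): apply the full-matrix approximation, re-impose the entries you actually know, and repeat. The papers handle many other constraints (nuclear norm, spectral norm, orthogonality, ...); this toy loop is an illustration of the idea, not the authors' code:

```python
import numpy as np

def izma_low_rank(M, mask, rank, iters=200):
    X = mask * M                                   # start from the known entries, zeros elsewhere
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # full-matrix step: best rank-r approximation
        X = mask * M + (1 - mask) * X              # interest-zone step: restore the known entries
    return X

rng = np.random.default_rng(0)
A = rng.standard_normal((60, 4)) @ rng.standard_normal((4, 60))    # true rank 4
mask = (rng.random(A.shape) < 0.5).astype(float)                   # observe half the entries
A_hat = izma_low_rank(A, mask, rank=4)
print(np.linalg.norm((1 - mask) * (A_hat - A)) / np.linalg.norm((1 - mask) * A))
```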

Thursday, February 20, 2014

REMODE: Probabilistic, Monocular Dense Reconstruction in Real Time - implementation -



Abstract— In this paper, we solve the problem of estimating dense and accurate depth maps from a single moving camera. A probabilistic depth measurement is carried out in real time on a per-pixel basis and the computed uncertainty is used to reject erroneous estimations and provide live feedback on the reconstruction progress. Our contribution is a novel approach to depth map computation that combines Bayesian estimation and recent development on convex optimization for image processing. We demonstrate that our method outperforms state-of-the-art techniques in terms of accuracy, while exhibiting high efficiency in memory usage and computing power. We call our approach REMODE (REgularized MOnocular Depth Estimation). Our CUDA-based implementation runs at 30Hz on a laptop computer and is released as open-source software.
The accompanying video and source code are available at: http://rpg.ifi.uzh.ch/software



Wednesday, February 19, 2014

Randomized LU Decomposition - implementation -

Gil Shabat sent me the following recently:

Hi Igor,

I am following your posts on the internet about matrix factorizations, random matrices, etc.
In fact, one of my works (which is expected to publish soon) even appeared in your blog (Nuit Blanche) and this is how I was first exposed to it.

Anyway, I have recently submitted for publication an algorithm for Randomized LU factorization, which is quite accurate, but perhaps more interesting is that it can fully run in parallel, which makes it suitable for GPUs.

I think it can also be added to your matrix factorization jungle page. What do you think?

Here's a link to the paper:
Gil
I responded to Gil that indeed it could be added to the Advanced Matrix Factorization page as long as some implementation was made available. Gil then kindly made it available on the interweb and it is now linked in the Advanced Matrix Factorization Jungle page. Thanks Gil. Here is the paper:

Randomized LU Decomposition by Gil Shabat, Yaniv Shmueli, Amir Averbuch
We present a fast randomized algorithm that computes a low rank LU decomposition. Our algorithm uses random projections type techniques to efficiently compute a low rank approximation of large matrices. The randomized LU algorithm can be parallelized and further accelerated by using sparse random matrices in its projection step. Several different error bounds are proven for the algorithm approximations. To prove these bounds, recent results from random matrix theory related to subgaussian matrices are used. As an application, we also show how the algorithm can be utilized to solve problems such as the rank-deficient least squares problem. Numerical examples, which illustrate the performance of the algorithm and compare it to other decomposition methods, are presented.
The implementation is here.
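
For those who want a feel for the random-projection-then-pivoted-LU idea before digging into the paper, here is a rough sketch based on my reading of it; the authors' actual algorithm, bounds and code are the references to trust, and the "L" factor below is lower triangular only up to the inner row permutation:

```python
import numpy as np
from scipy.linalg import lu, pinv

def randomized_lu(A, k, oversample=8, rng=None):
    # sketch the column space with a Gaussian matrix, take a pivoted LU of the sketch
    # to get a basis Ly, project A onto that basis, then take a second pivoted LU
    rng = rng or np.random.default_rng()
    m, n = A.shape
    G = rng.standard_normal((n, k + oversample))
    Y = A @ G                                    # random sketch of the column space of A
    P, Ly, Uy = lu(Y)                            # Y = P @ Ly @ Uy
    Ly = Ly[:, :k]                               # keep a rank-k basis
    B = pinv(Ly) @ P.T @ A                       # coordinates of (the row-permuted) A in that basis
    Pb, Lb, Ub = lu(B)                           # B = Pb @ Lb @ Ub
    return P, Ly @ Pb @ Lb, Ub                   # so that P.T @ A ~= L @ U

rng = np.random.default_rng(0)
A = rng.standard_normal((300, 20)) @ rng.standard_normal((20, 200))   # exactly rank 20
P, L, U = randomized_lu(A, k=20, rng=rng)
print(np.linalg.norm(P.T @ A - L @ U) / np.linalg.norm(A))            # tiny for this rank-20 matrix
```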
