Friday, November 09, 2012

A note from Jelani Nelson on OSNAP

Jelani Nelson responded in the comment section of the post I featured this morning (Low Rank Approximation and Regression in Input Sparsity Time - implementation) and pointed out my misunderstanding of a certain issue in a paper I featured yesterday (OSNAP: Faster numerical linear algebra algorithms via sparser subspace embeddings). Here is what Jelani had to say:

Hi, 
"My bad, indeed, David states so in the video, the log factor is gone."
Sorry to add more confusion to the mix, but I don't think this is what Ken meant (in particular, the video you link to is from before what Ken is talking about). Their work was always, from the first version, able to get nnz(A) + poly(d/eps) time for regression, without the log n. What Huy and I addressed was: how big is the poly? Originally the CW12 poly was d^5*polylog(d), and they were able to make it d^3*polylog(d) (even replacing d with rank(A)) at a sacrifice: by turning nnz(A) into nnz(A)*log n. I think Ken's point is that they no longer have to make that sacrifice with their Oct. 31 version: their sharpened analysis can make it d^3*polylog(d) while still keeping the log n factor off the nnz(A) term. (See footnote 2 in our manuscript.)
The point of what we did is that (a) it's nnz(A) + d^3*log d, with only one log d multiplying the d^3 term (this difference used to be d^3*log d vs. d^5*polylog(d), and is now d^3*log d vs. d^3*polylog(d) given the Oct. 31 version that Ken referred to; I think this was Ken's point in his comment), and (b) if you're willing to make the sacrifice of multiplying nnz(A) by log n, then you can make the additive term d^{omega+gamma} for arbitrarily small gamma > 0 (and again, d can be replaced with rank(A) in this bound), where omega < 2.373 is the exponent of matrix multiplication. There are some other differences also, e.g., our analyses don't need as strong hash functions to go through, which has some slight advantages in streaming settings (see the paper).

Later he added in a second comment:

I guess I should have mentioned that even for (b) above, you actually don't even have to multiply nnz(A) by log n to get d^{omega+gamma}; rather, you have to multiply nnz(A) by a constant depending on 1/gamma. You do multiply it by log n if you want rank(A)^{omega+gamma} time though.
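For readers who want to see what these sparse subspace embeddings look like in practice, here is a minimal NumPy sketch of a CountSketch-style embedding applied to least squares. It is only an illustration: the problem sizes and the sketch dimension m below are arbitrary choices of mine, not the constants from either paper, and the dense test matrix hides the nnz(A) advantage that a sparse A would enjoy.

import numpy as np

rng = np.random.default_rng(0)

def countsketch(A, m, rng):
    """Apply a CountSketch-style embedding S (m x n, a single random
    +/-1 per column of S) to A; the scatter-add does O(nnz(A)) work."""
    n = A.shape[0]
    h = rng.integers(0, m, size=n)        # target row for each row of A
    s = rng.choice([-1.0, 1.0], size=n)   # random sign for each row of A
    SA = np.zeros((m, A.shape[1]))
    np.add.at(SA, h, s[:, None] * A)      # SA[h[i]] += s[i] * A[i]
    return SA, h, s

# Toy overdetermined problem: min_x ||Ax - b||_2 with n >> d
n, d = 100_000, 20
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.01 * rng.standard_normal(n)

m = 10 * d * d                            # theory wants m = O(d^2/eps^2) rows
SA, h, s = countsketch(A, m, rng)
Sb = np.zeros(m)
np.add.at(Sb, h, s * b)                   # sketch b with the same S

x_sketch = np.linalg.lstsq(SA, Sb, rcond=None)[0]
x_exact = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_exact - b))

An OSNAP-style embedding, as discussed above, would instead place s > 1 nonzeros per column of S, trading a slightly denser sketch for far fewer rows (roughly d^{1+gamma} rather than d^2).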
Thank you, Jelani!
