Comments for Minimizing Regret
https://minimizingregret.wordpress.com
Google Princeton AI and Hazan Lab @ Princeton University
Wed, 17 Apr 2019 03:05:16 +0000

Comment on "A mathematical definition of unsupervised learning?" by Wilhelm Duncan
https://minimizingregret.wordpress.com/2016/10/06/a-mathematical-definition-of-unsupervised-learning/comment-page-1/#comment-21
Tue, 27 Feb 2018 09:43:29 +0000

This comment has been removed by the author.

Comment on "The complexity zoo and reductions in optimization" by Jessica Ellis
https://minimizingregret.wordpress.com/2016/05/26/the-complexity-zoo-and-reductions-in-optimization/comment-page-1/#comment-28
Tue, 16 Jan 2018 11:05:43 +0000

I think this makes it really complicated to teach basic optimization: which variant/proof of gradient descent should one start with? This is especially acute for courses that do not deal directly with optimization, where it is described as a tool for learning or as a basic building block for other algorithms. One can find academic papers that derive various optimization improvements many times over, each for only one of the settings, leaving the other settings open.
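The "basic building block" the comment refers to can be made concrete. Below is a minimal sketch (illustrative only, not from the post) of fixed-step gradient descent on a smooth convex least-squares objective:

```python
import numpy as np

def gradient_descent(grad, x0, step_size, n_iters):
    """Fixed-step gradient descent: the simplest variant to start with."""
    x = x0
    for _ in range(n_iters):
        x = x - step_size * grad(x)
    return x

# f(x) = 0.5 * ||Ax - b||^2 is smooth and convex; its gradient is A^T (Ax - b).
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([2.0, 1.0])
grad = lambda x: A.T @ (A @ x - b)

x_star = gradient_descent(grad, np.zeros(2), step_size=0.2, n_iters=200)
# Converges to the least-squares solution [1, 1].
```

Even in this simplest setting, the admissible step size already depends on the smoothness of f (here, the largest eigenvalue of A^T A), which is exactly the kind of per-setting detail the comment complains about.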

Comment on "A mathematical definition of unsupervised learning?" by Csaba Szepesvari
https://minimizingregret.wordpress.com/2016/10/06/a-mathematical-definition-of-unsupervised-learning/comment-page-1/#comment-20
Sun, 30 Oct 2016 23:57:02 +0000

Cool :) Looking forward to it (and I see the followup will have a followup :))

Comment on "A mathematical definition of unsupervised learning?" by ethan fetaya
https://minimizingregret.wordpress.com/2016/10/06/a-mathematical-definition-of-unsupervised-learning/comment-page-1/#comment-19
Sun, 09 Oct 2016 06:55:30 +0000

Interesting post. This reminds me of the joke "Classification of mathematical problems as linear and nonlinear is like classification of the Universe as bananas and non-bananas." Like nonlinear math, unsupervised learning is a very large class of loosely connected ideas, so I am intrigued as to what can be said about it in general, given that even mathematically formulating sub-problems like clustering is hard: while some work has been done, it only captures part of what we consider clustering.

Comment on "A mathematical definition of unsupervised learning?" by Elad Hazan
https://minimizingregret.wordpress.com/2016/10/06/a-mathematical-definition-of-unsupervised-learning/comment-page-1/#comment-18
Sat, 08 Oct 2016 05:07:30 +0000

Thanks Csaba! Your paper looks cool, and it goes in the direction of changing the statistical assumptions of ICA to some form of "closeness" to a signal in standard input-ICA form. This is certainly the correct direction for removing statistical assumptions. What I'm arguing for is even more extreme: can we "step out of the model" (ICA in this case) completely and, regardless of any special form of the input, attain worst-case guarantees? The analogy would be performing low-rank matrix completion (usually studied under a uniform distribution over the inputs and incoherence assumptions) by low-trace (or low max-norm) relaxations. More to come soon…
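The low-trace relaxation mentioned above can be sketched in a few lines. The following is an illustrative toy (not the full matrix-completion setup): singular-value soft-thresholding, the proximal operator of the nuclear (trace) norm that sits at the core of solvers such as singular value thresholding (SVT):

```python
import numpy as np

def svt(X, tau):
    """Soft-threshold the singular values of X: the prox of tau * ||X||_*."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

# A rank-1 matrix plus small noise: thresholding suppresses the spurious
# small singular values, so the relaxation prefers a low-rank estimate.
rng = np.random.default_rng(0)
M = np.outer([1.0, 2.0, 3.0], [1.0, 1.0]) + 0.01 * rng.standard_normal((3, 2))
X = svt(M, tau=0.5)
print(np.linalg.matrix_rank(X))  # 1
```

The point of the relaxation is that minimizing the trace norm is convex, while constraining the rank directly is not.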

Comment on "A mathematical definition of unsupervised learning?" by Csaba Szepesvari
https://minimizingregret.wordpress.com/2016/10/06/a-mathematical-definition-of-unsupervised-learning/comment-page-1/#comment-17
Fri, 07 Oct 2016 23:04:39 +0000

I am really looking forward to the post :) In the meanwhile, let me point out one work that I happen to know about, as I was part of it (sorry for the plug, I could not help it). So what is happening in this work? Well, we study ICA *without* any generative assumptions. I think what we do can be done in many unsupervised settings, but we chose ICA as it is a setting that seems to be closely tied to generative and stochastic assumptions, so in some sense it looked especially challenging. So what do we do? We derive performance bounds for an algorithm (building on previous work, including Arora's) in terms of how close the data that the algorithm sees is to "ideal data". I find this approach quite satisfactory: finally we don't need to make any generative/stochastic assumptions, and the bounds tell one exactly what we need to know, namely how performance (recovery of some mixing matrix in this case) is impacted by deviations from the ideal situation. As a bonus, one can show that the bounds also recover the usual bounds available in the generative setting. Details are here: Huang, R., György, A., and Szepesvári, Cs., "Deterministic Independent Component Analysis," ICML, pp. 2521–2530, 2015. https://goo.gl/N1pnML

Comment on "Making second order methods practical for machine learning" by David McAllester
https://minimizingregret.wordpress.com/2016/03/02/making-second-order-methods-practical-for-machine-learning/comment-page-1/#comment-31
Tue, 08 Mar 2016 10:57:48 +0000

This comment has been removed by the author.

Comment on "Making second order methods practical for machine learning" by Elad Hazan
https://minimizingregret.wordpress.com/2016/03/02/making-second-order-methods-practical-for-machine-learning/comment-page-1/#comment-30
Fri, 04 Mar 2016 03:37:28 +0000

Thanks David! The loss for DNNs is a composition of such functions, and indeed not rank one, although it still has special structure that can perhaps be exploited. The problem with applying Newton's method per se to non-convex optimization is that it sends you in the wrong direction: you actually want to move along negative eigendirections of the Hessian. However, there is definitely potential for second order methods even there, e.g. the paper of Nesterov and Polyak on cubic regularization (also addressed in the recent paper of Ge et al. from COLT 2015).
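The "wrong direction" phenomenon is easy to see on a two-dimensional saddle. A hypothetical toy example (not from the post): on f(x, y) = x^2 - y^2, the Newton step jumps straight onto the saddle point, whereas moving along the negative eigendirection of the Hessian is what actually decreases f:

```python
import numpy as np

# f(x, y) = x^2 - y^2 has a saddle at the origin; its Hessian is indefinite.
f = lambda p: p[0] ** 2 - p[1] ** 2
grad = lambda p: np.array([2.0 * p[0], -2.0 * p[1]])
H = np.array([[2.0, 0.0], [0.0, -2.0]])

p = np.array([1.0, 1.0])
newton_step = p - np.linalg.solve(H, grad(p))  # lands exactly on the saddle [0, 0]

# A step along the negative eigendirection (0, 1) of H decreases f instead:
descent = p + 0.5 * np.array([0.0, 1.0])
print(f(newton_step), f(descent))  # 0.0 -1.25
```

This attraction to saddle points is the failure mode that cubic regularization is designed to avoid.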

Comment on "Making second order methods practical for machine learning" by David McAllester
https://minimizingregret.wordpress.com/2016/03/02/making-second-order-methods-practical-for-machine-learning/comment-page-1/#comment-29
Fri, 04 Mar 2016 02:42:15 +0000

This comment has been removed by the author.
