Upgrade all regressors to use TT#3319
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3319 +/- ##
==========================================
- Coverage 72.7% 72.69% -0.01%
==========================================
Files 807 807
Lines 145172 145172
Branches 16225 16225
==========================================
- Hits 105545 105536 -9
- Misses 35215 35221 +6
- Partials 4412 4415 +3
|
| // Label: 0.155, Prediction: 0.164 | ||
| // Label: 0.515, Prediction: 0.470 | ||
| // Label: 0.566, Prediction: 0.501 | ||
| // Label: 0.096, Prediction: 0.138"; |
There was a problem hiding this comment.
I think @rogancarr tried to make sure we have some distinction between real output and just comment, and he put three (3) spaces for actual output.
Would be nice you can keep that pattern.
EVERYWHERE #Closed
| // Create testing examples. Use different random seed to make it different from training data. | ||
| var testData = mlContext.Data.LoadFromEnumerable(GenerateRandomDataPoints(500, seed:123)); | ||
| // Create testing data. Use different random seed to make it different from training data. | ||
| var testData = mlContext.Data.LoadFromEnumerable(GenerateRandomDataPoints(500, seed: 123)); |
There was a problem hiding this comment.
500 [](start = 86, length = 3)
is this too much for testing? Would 100 be better? #Resolved
There was a problem hiding this comment.
I doubt it's great idea to have this lines in our samples. In reply to: 483336891 [](ancestors = 483336891) Refers to: docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/Regression/OnlineGradientDescent.cs:43 in 6721259. [](commit_id = 6721259, deletion_comment = False) |
|
|
||
| namespace Samples.Dynamic.Trainers.Regression | ||
| { | ||
| public static class OrdinaryLeastSquaresAdvanced |
There was a problem hiding this comment.
OrdinaryLeastSquaresAdvanced [](start = 24, length = 28)
is there a tt for this? #ByDesign
There was a problem hiding this comment.
| // MedianHomeValue CrimesPerCapita PercentResidental PercentNonRetail CharlesRiver NitricOxides RoomsPerDwelling PercentPre40s | ||
| // 24.00 0.00632 18.00 2.310 0 0.5380 6.5750 65.20 | ||
| // 21.60 0.02731 00.00 7.070 0 0.4690 6.4210 78.90 | ||
| // 34.70 0.02729 00.00 7.070 0 0.4690 7.1850 61.10 |
There was a problem hiding this comment.
There was a problem hiding this comment.
In your code no one printing actual content of the dataview/file, which is not align with how we write samples right now (right now we have code which prints something to console and //Expected output comments).
No one prints this lines, right?
So can we remove them.
And in general can we get rid of calling Download file and run in-memory sample instead?
You spend good chunk of time in attempts to convince me what everything should be IN-MEMORY NO EXCEPTION.
It's weird to see what you actually have references on SampleUtils.
Make sense?
In reply to: 275581463 [](ancestors = 275581463,275458742)
There was a problem hiding this comment.
This file is not an API sample at all. This is a record of a meaningful pipeline.
In reply to: 275582648 [](ancestors = 275582648,275581463,275458742)
| LabelColumnName = nameof(DataPoint.Label), | ||
| FeatureColumnName = nameof(DataPoint.Features), | ||
| L2Regularization = 0.1f, | ||
| CalculateStatistics = false |
There was a problem hiding this comment.
your one-line comments for the options in the other files were so nice :) #Pending
There was a problem hiding this comment.
Thanks. But I add them only if I don't need to dig into their code to understand their meanings.
In reply to: 275458769 [](ancestors = 275458769)
|
|
||
| // Create testing data. Use different random seed to make it different from training data. | ||
| var testData = mlContext.Data.LoadFromEnumerable(GenerateRandomDataPoints(500, seed:123)); | ||
| var testData = mlContext.Data.LoadFromEnumerable(GenerateRandomDataPoints(500, seed: 123)); |
There was a problem hiding this comment.
500 [](start = 86, length = 3)
100, maybe? #Resolved
There was a problem hiding this comment.
|
|
||
| // Evaluate the overall metrics | ||
| var metrics = mlContext.Regression.Evaluate(transformedTestData); | ||
| Microsoft.ML.SamplesUtils.ConsoleUtils.PrintMetrics(metrics); |
There was a problem hiding this comment.
SamplesUtils [](start = 25, length = 12)
please remove SamplesUtils and just print the metric here with regular console.writeline. we're deprecating SamplesUtils altogether. #Closed
| } | ||
| } No newline at end of file | ||
| } | ||
|
|
There was a problem hiding this comment.
It's caused by TT system. There is no such a line in my ttinclude and tt files.
In reply to: 275562731 [](ancestors = 275562731)
| /// <format type="text/markdown"> | ||
| /// <] | ||
| /// ]]> |
There was a problem hiding this comment.
please don't include this sample until we fix #2425 #ByDesign
There was a problem hiding this comment.
There are two major meanings to have a sample.
- Learn how to call it
- Explore from that sample.
A sample with bad prediction ability doesn't break any of them.
In reply to: 275564038 [](ancestors = 275564038)
| /// <format type="text/markdown"> | ||
| /// <] | ||
| /// ]]> |
There was a problem hiding this comment.
I agree with Shahab. Have documentation which states "this learner is broken here is tracker number" is worse than no documentation at all.
I would probably even make OGD internal until we figure out why it's so bad.
In reply to: 275592181 [](ancestors = 275592181,275564113)
| namespace Samples.Dynamic.Trainers.Regression | ||
| { | ||
| public static class StochasticDualCoordinateAscent | ||
| public static class Sdca |
There was a problem hiding this comment.
we kept going back and forth for sdca. The final version is using the acronym. please also change the filenames to Sdca (don't forget to update xmls referring this file) #Closed
There was a problem hiding this comment.
Code won't compile if we do In reply to: 483331666 [](ancestors = 483331666) Refers to: docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/Regression/RegressionSamplesTemplate.ttinclude:69 in 6721259. [](commit_id = 6721259, deletion_comment = False) |
Yes. It'd be much worse if we show numbers. Let me spend another 10 mins on tuning its parameters. [Update] I gave up. This is the worest linear trainer I have ever seen. In reply to: 483337067 [](ancestors = 483337067,483336891) Refers to: docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/Regression/OnlineGradientDescent.cs:43 in 6721259. [](commit_id = 6721259, deletion_comment = False) |
Ivanidzo4ka
left a comment
There was a problem hiding this comment.
I'm fine with everything except OGD.
That thing makes me worry. I doubt it's great idea to have documentation which claims it's currently under construction. I would rather prefer to hide whole trainer from users until we figure out problem/ in what conditions it works fine. We have plenty of options to for users anyway.
Part of #2522.