Add V1 Introspective Training Tests #2859

Merged

rogancarr merged 10 commits into dotnet:master from rogancarr:2817_introspective_training_scenarios

Mar 7, 2019

Contributor

rogancarr commented Mar 5, 2019

This PR adds tests to cover the Introspective Training scenarios we want fully supported in V1.

I can take an existing model file and inspect what transformers were included in the pipeline.
I can inspect the coefficients (weights and bias) of a linear model without much work. Easy to find via auto-complete.
I can inspect the normalization coefficients of a normalizer in my pipeline without much work. Easy to find via auto-complete.
I can inspect the trees of a boosted decision tree model without much work. Easy to find via auto-complete.
I can inspect the topics after training an LDA transform. Easy to find via auto-complete.
I can inspect a categorical transform and see which feature values map to which key values. Easy to find via auto-complete.
P1: I can access the GAM feature histograms through APIs

Fixes: #2498

rogancarr requested review from artidoro and sfilipi

March 5, 2019 20:54

rogancarr force-pushed the 2817_introspective_training_scenarios branch from 6d3fdb6 to 55e7966 Compare

March 5, 2019 21:15

singlis reviewed

test/Microsoft.ML.Functional.Tests/Validation.cs Outdated

               using Microsoft.ML.Trainers.FastTree;
               using Microsoft.ML.Trainers;
               using Xunit;
+              using Microsoft.ML.Functional.Tests.Datasets;

Member

singlis Mar 5, 2019 •

edited by rogancarr

sort usings #Resolved

singlis reviewed

test/Microsoft.ML.Functional.Tests/Common.cs Outdated

+                      /// Verify that a numerical array has no NaNs or infinities.
+                      /// </summary>
+                      /// <param name="array">An array of doubles.</param>
+                      public static void AssertFiniteNumbers(double[] array, int ignoreElementAt = -1)

Member

singlis Mar 5, 2019 •

edited by rogancarr

AssertFiniteNumbers

Where is this function being used? #Resolved

Contributor

artidoro Mar 5, 2019

It's used here IntrospectGamShapeFunctions

Contributor Author

rogancarr Mar 6, 2019

That's right. I put it in Common because I imagine that I'll use it again. Although ignoreElementAt is definitely a binning-only kind of thing.
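
For readers following the thread, a helper with this signature could look roughly like the sketch below. It is not the code from the PR: the class name `FiniteCheck` is hypothetical, and the sketch throws a plain exception instead of using Xunit so that it stays self-contained. The reading of `ignoreElementAt` as a way to skip a single expected non-finite entry (for example, an unbounded last bin boundary) is an assumption based on the "binning-only" remark above.

```csharp
using System;

// Hypothetical stand-in for the helper discussed in this thread.
public static class FiniteCheck
{
    // Verifies that no element of the array is NaN or infinite.
    // The element at ignoreElementAt (if non-negative) is skipped;
    // the default of -1 never matches an index, so nothing is skipped.
    public static void AssertFiniteNumbers(double[] array, int ignoreElementAt = -1)
    {
        for (int i = 0; i < array.Length; i++)
        {
            if (i == ignoreElementAt)
                continue;

            if (double.IsNaN(array[i]) || double.IsInfinity(array[i]))
                throw new InvalidOperationException(
                    $"Element {i} is not a finite number: {array[i]}");
        }
    }
}
```

Under these assumptions, `FiniteCheck.AssertFiniteNumbers(new[] { 0.5, 1.0, double.PositiveInfinity }, ignoreElementAt: 2)` passes, while the same call without `ignoreElementAt` throws.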

singlis approved these changes

Member

singlis left a comment

artidoro reviewed

test/Microsoft.ML.Functional.Tests/IntrospectiveTraining.cs Outdated

+                      }
+                      /// <summary>
+                      /// I can take an existing model file and inspect what transformers were included in the pipeline.

Contributor

artidoro Mar 5, 2019 •

edited by rogancarr

"I can take an existing model file"

You are not taking a model file. You are constructing the pipeline in the test. #Resolved

Contributor Author

rogancarr Mar 6, 2019

Good point. I am updating the summary. I changed this test to just look at pipelines, and not necessarily at serialization / deserialization. There will be model-file-specific tests that test serialization and deserialization, so I decided to not test that here.

artidoro reviewed

test/Microsoft.ML.Functional.Tests/IntrospectiveTraining.cs

+                                      var column = currentSchema.GetColumnOrNull(expectedColumn);
+                                      Assert.Null(column);
+                                  }
+                              i++;

Contributor

artidoro Mar 5, 2019 •

edited by rogancarr

This seems a bit complex and overkill. We only have two transforms in the chain, so this will run only for the first transform and check that its output schema does not contain Score. #Resolved

artidoro reviewed

test/Microsoft.ML.Functional.Tests/IntrospectiveTraining.cs Outdated

+                          // Transform the data.
+                          var transformedData = model.Transform(data);
+                          // Verify that the slotnames cane be used to backtrack by confirming that

Contributor

artidoro Mar 5, 2019 •

edited by rogancarr

can #Resolved

artidoro reviewed

test/Microsoft.ML.Functional.Tests/IntrospectiveTraining.cs

+                      }
+                      [Fact]
+                      public void InspectNestedPipeline()

Contributor

artidoro Mar 5, 2019 •

edited by rogancarr

InspectNestedPipeline

Missing summary. #Resolved

artidoro approved these changes

Contributor

artidoro left a comment

After you address the comments I think it's ready to go!

artidoro reviewed

test/Microsoft.ML.Functional.Tests/IntrospectiveTraining.cs Outdated

+                          var model = pipeline.Fit(data);
+                          // Extract the normalizer from the trained pipeline.
+                          // TODO #2854: Extract the normalizer parameters.

Contributor

artidoro Mar 5, 2019 •

edited by rogancarr

2854

See the issue and the sample on normalizers; I think we can extract the parameters. #Resolved

wschin reviewed

test/Microsoft.ML.Functional.Tests/Datasets/Adult.cs Outdated

+                      public float HoursPerWeek { get; set; }
+                      /// <summary>
+                      /// The list of columns commonly used as numerical features.

Contributor

wschin Mar 6, 2019 •

edited by rogancarr

Suggested change

-                      /// The list of columns commonly used as numerical features.
+                      /// The list of columns commonly used as categorical features.

#Resolved

Rogan Carr added 7 commits

March 6, 2019 12:28


          work in progress

ee25218


          Adding introspective training scenario tests.

d3df179


          Adding an adult dataset.

c80f14f


          Fixing merge issues.

ec11f35


          Addressing PR comments.

13d9db7


          Addressing PR comments.

b29f4a3


          Address merge issues

4f7d8f5

rogancarr force-pushed the 2817_introspective_training_scenarios branch from 29371f4 to 4f7d8f5 Compare

March 6, 2019 20:50

codecov bot commented Mar 6, 2019 •

edited

Codecov Report

❗ No coverage uploaded for pull request base (master@0075757).
The diff coverage is 99.65%.

@@            Coverage Diff            @@
##             master    #2859   +/-   ##
=========================================
  Coverage          ?   71.72%           
=========================================
  Files             ?      812           
  Lines             ?   142678           
  Branches          ?    16124           
=========================================
  Hits              ?   102330           
  Misses            ?    35936           
  Partials          ?     4412

Flag	Coverage Δ
#Debug	`71.72% <99.65%> (?)`
#production	`67.9% <ø> (?)`
#test	`85.99% <99.65%> (?)`

Impacted Files	Coverage Δ
...osoft.ML.Functional.Tests/IntrospectiveTraining.cs	`100% <100%> (ø)`
...st/Microsoft.ML.Functional.Tests/Datasets/Adult.cs	`100% <100%> (ø)`
test/Microsoft.ML.TestFramework/Datasets.cs	`100% <100%> (ø)`
test/Microsoft.ML.Functional.Tests/Evaluation.cs	`100% <100%> (ø)`
test/Microsoft.ML.Functional.Tests/Validation.cs	`100% <100%> (ø)`
test/Microsoft.ML.Functional.Tests/Common.cs	`98.06% <94.44%> (ø)`

Rogan Carr added 3 commits

March 6, 2019 23:43


          Fixing cross-platform build errors.

25aded6


          Merge branch 'master' into 2817_introspective_training_scenarios

fa3f2bd


          Fix merge issues.

83747ba

rogancarr merged commit 10c4fc6 into dotnet:master

rogancarr deleted the 2817_introspective_training_scenarios branch

March 7, 2019 19:36

This was referenced Mar 9, 2019

Create functional tests for all V1 Introspective Training scenarios #2817

Closed

V1 Scenarios need to be covered by tests #2498

Open

OVA Multiclass Classification can be instantiated for variety of sub-trainer training tasks #2920

Closed

ghost locked as resolved and limited conversation to collaborators
