Added tests for text featurizer options (Part 2). #3036
    var prediction = engine.Predict(data[0]);
    Assert.Equal("this is some text in english", string.Join(" ", prediction.OutputTokens));
    Assert.Equal(1.0f, prediction.Features[0]);
Assert.Equal(1.0f, prediction.Features[0])
Doesn't Assert have an option to compare two arrays or enumerables? #Resolved
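For reference, xUnit's `Assert.Equal` compares arrays and other enumerables element by element, so the whole feature vector can be checked in one call. The following is a minimal sketch only: the class name, test name, and values are hypothetical, and `actual` stands in for `prediction.Features`.

```csharp
using Xunit;

public class FeatureVectorAssertionSketch
{
    [Fact]
    public void ComparesWholeFeatureVector()
    {
        // Hypothetical stand-in for prediction.Features; values are illustrative only.
        float[] actual = { 1.0f, 1.0f, 0.0f };
        float[] expected = { 1.0f, 1.0f, 0.0f };

        // Passes only if both sequences have the same length and equal elements in order.
        Assert.Equal(expected, actual);
    }
}
```

Compared with asserting on `Features[0]` alone, a mismatch anywhere in the vector fails the test.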
    prediction = engine.Predict(data[1]);
    Assert.Equal("xyz", string.Join(" ", prediction.OutputTokens));
    Assert.Equal(1.0f, prediction.Features[0]);
Assert.Equal(1.0f, prediction.Features[0]);
So I expect the ngrams to be a, b, c, e, f, g, x, y, z, plus end of string and empty string (I'm honestly not sure), for 12 ngrams in total.
Features 0 and 8 are the end-of-string and empty-string markers, right? #Resolved
Index 0 is the start marker and index 8 is the end marker. Then there are 10 characters in total, including the space.
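The slot layout described in this reply can be reproduced with a small standalone sketch. It is not ML.NET's implementation: it assumes the two inputs are "abc efg" and "xyz", that slots are assigned in order of first appearance, and that the markers are the STX/ETX control characters. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class CharUnigramSketch
{
    // Assumed start/end marker characters (STX and ETX); illustrative only.
    private const char Start = '\u0002';
    private const char End = '\u0003';

    // Assigns a slot index to each distinct character in order of first appearance.
    public static Dictionary<char, int> BuildVocabulary(IEnumerable<string> corpus)
    {
        var slots = new Dictionary<char, int>();
        foreach (var text in corpus)
            foreach (var ch in Start + text + End)
                if (!slots.ContainsKey(ch))
                    slots.Add(ch, slots.Count);
        return slots;
    }

    // Counts how often each known character (markers included) occurs in the text.
    public static float[] Featurize(string text, Dictionary<char, int> slots)
    {
        var features = new float[slots.Count];
        foreach (var ch in Start + text + End)
            if (slots.TryGetValue(ch, out var index))
                features[index] += 1.0f;
        return features;
    }

    public static void Main()
    {
        var corpus = new[] { "abc efg", "xyz" };
        var slots = BuildVocabulary(corpus);

        Console.WriteLine(slots.Count);                                    // 12
        Console.WriteLine(string.Join(", ", Featurize(corpus[0], slots))); // 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0
    }
}
```

Under these assumptions the start marker lands in slot 0, the characters of "abc efg" in slots 1 to 7, the end marker in slot 8, and x, y, z in slots 9 to 11, which gives 2 markers plus 10 characters.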
Codecov Report
@@            Coverage Diff            @@
##           master    #3036     +/-   ##
=========================================
+ Coverage    72.5%    72.5%   +<.01%
=========================================
  Files         804      804
  Lines      144077   144150      +73
  Branches    16179    16179
=========================================
+ Hits       104462   104519      +57
- Misses      35198    35220      +22
+ Partials     4417     4411       -6
    var engine = model.CreatePredictionEngine<TestClass, TestClass>(ML);

    var prediction = engine.Predict(data[0]);
    Assert.Equal("this is some text in english", string.Join(" ", prediction.OutputTokens));
"this is some text in english"
Nit, but for maintainability I'd create a variable for this. #Resolved
    var prediction = engine.Predict(data[0]);
    Assert.Equal("abc efg", string.Join(" ", prediction.OutputTokens));
    var expected = new float[] { 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 0.0f, 0.0f, 0.0f };
1.0f
Should this be 0? Or is this the end marker? #Resolved
    var options = new TextFeaturizingEstimator.Options()
    {
        CharFeatureExtractor = new WordBagEstimator.Options() { NgramLength = 1 },
CharFeatureExtractor = new WordBagEstimator.Options()
Is this correct? It doesn't read right to initialize a CharFeatureExtractor with the options of a WordBagEstimator... #Resolved
This PR finally fixes #2967. The tests created in this PR are for the following parameters in the options class
The intent here is to test that TextFeaturizer is instantiated for every parameter in the options class. We are not testing the internal components of TextFeaturizer here.