Conversation
| view = CompositeDataLoader.Create(env, loader, | ||
| new KeyValuePair<string, SubComponent<IDataTransform, SignatureDataTransform>>(null, component)); | ||
| } | ||
| // REVIEW: This verbose constructor should be replaced with zeahmed's enhancements once #405 is committed. |
There was a problem hiding this comment.
As mentioned, can be made more brief with #405. #Closed
| view = CompositeDataLoader.Create(env, loader, | ||
| new KeyValuePair<string, SubComponent<IDataTransform, SignatureDataTransform>>(null, component)); | ||
| } | ||
| // REVIEW: This verbose constructor should be replaced with zeahmed's enhancements once #405 is committed. |
There was a problem hiding this comment.
zeahmed [](start = 76, length = 7)
I though we have tendency to remove aliases from codebase... #Resolved
do we also need this line? it just if you remove labelInfo, groupInfo looks weird if it remaind untouched. #Resolved Refers to: src/Microsoft.ML.FastTree/FastTree.cs:1361 in 08c9001. [](commit_id = 08c9001, deletion_comment = False) |
Yes, I should definitely get rid of it -- it's not actually used for anything. In reply to: 401122953 [](ancestors = 401122953) Refers to: src/Microsoft.ML.FastTree/FastTree.cs:1361 in 08c9001. [](commit_id = 08c9001, deletion_comment = False) |
| view = CompositeDataLoader.Create(env, loader, | ||
| new KeyValuePair<string, SubComponent<IDataTransform, SignatureDataTransform>>(null, component)); | ||
| } | ||
| IDataView ApplyNormalizer(IHostEnvironment innerEnv, IDataView input) |
There was a problem hiding this comment.
If my PR goes in first, you will be able to use convenience constructor here...:) #Closed
There was a problem hiding this comment.
I know, I even had a comment about that. :) But Ivan made me remove it. #Closed
| /// This contains information about a column in an <see cref="IDataView"/>. It is essentially a convenience cache | ||
| /// containing the name, column index, and column type for the column. The intended usage is that users of <see cref="RoleMappedSchema"/> | ||
| /// will have a convenient method of getting the index and type without having to separately query it through the <see cref="ISchema"/>, | ||
| /// since practically the first thing a consumer of a <see cref="RoleMappedSchema"/> will want to do once they get a mappping is |
There was a problem hiding this comment.
I feel comment/summary is not complete here as sentence is ending in is? #Resolved
There was a problem hiding this comment.
Sure. You go first then I'll adjust. This work in some ways will be helped by what you're doing as well, as you have noted. #Closed
There was a problem hiding this comment.
| /// This method will not modify <paramref name="data"/> if the return from that is <c>null</c> or | ||
| /// <c>false</c>.</param> | ||
| /// <returns>True if the normalizer was applied and <paramref name="data"/> was modified</returns> | ||
| public static bool CreateIfNeeded(IHostEnvironment env, ref RoleMappedData data, ITrainer trainer) |
There was a problem hiding this comment.
Great! I will be using this method in #424 just before calling train method, correct? #Closed
There was a problem hiding this comment.
I think so. Also probably before the hypothetical cache-if-needed. #Closed
| { | ||
| // REVIEW: The role mapped data has the ability to have multiple columns fill the role of features, which is | ||
| // useful in some trainers that are nonetheless parameteric and can therefore benefit from normalization. | ||
| var featInfo = schema.Feature; |
There was a problem hiding this comment.
(nit) check schema for null? #Resolved
| } | ||
|
|
||
| /// <summary> | ||
| /// Returns whether a column has the <see cref="Kinds.IsNormalized"/> metadata set to true. |
There was a problem hiding this comment.
(minor) Do we instead want to document what IsNormalized means? "Returns whether a column <insert what IsNormalized means>". #Resolved
There was a problem hiding this comment.
Sure, I can add a bit more information. Though that information, I might prefer to make that part of the Kinds static class, since the documentation of those are the primary source of truth. This is just something we added as a convenience on top of that.
In reply to: 199596084 [](ancestors = 199596084)
eerhardt
left a comment
There was a problem hiding this comment.
LGTM. Just some minor questions.
d9e0b4d to
3f1e454
Compare
* API conveniences for the Normalize transform
In which I introduce some helpers for normalization, and generally try to clean up the code-base. Addresses #433 . Hopefully will be used in #424, though I've changed the code here to use some of it where it made sense.
The
Microsoft.ML.Datatransform had a "hidden" dependency onMicrosoft.ML.Transformproject via dependency injection, for its existing "helper" for normalization (the console-app centric version). This has been resolved and replaced with direct instantiation. It required moving the normalizer files, however.Introduction of helpers on
NormalizeTransformfor API-centric operations. (Not necessarily useful directly for console-application/GUI usage.)Some documentation changes on
RoleMappedSchemaandRoleMappedData, though more non-cosmetic changes I'd expected would come with Direct API: RoleMappedSchema/Data Cleanup, Improvement #445 .