To successfully run tests, you'll need to create an azuremltestsettings.json file in this folder.
This file contains credentials and lists the Azure resources to use when running the tests. A complete example:
    {
        "workspace": {
            "id": "11111111111111111111111111111111",
            "token": "00000000000000000000000000000000",
            "endpoint": "https://studio.azureml.net"
        },
        "storage": {
            "accountName": "mystorageaccount",
            "accountKey": "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==",
            "container": "mydatasettestcontainer",
            "mediumSizeBlob": "MediumSizeDataset_NH.csv",
            "unicodeBomBlob": "DatasetWithUnicodeBOM.txt",
            "blobs": [
                "Dataset_NH.csv",
                "Dataset_NH.tsv",
                "Dataset_WH.csv",
                "Dataset_WH.tsv",
                "Dataset.txt"
            ]
        },
        "intermediateDataset": {
            "experimentId": "11111111111111111111111111111111.f-id.22222222222222222222222222222222",
            "nodeId": "33333333-3333-3333-3333-333333333333-333",
            "portName": "Results dataset",
            "dataTypeId": "GenericCSV"
        },
        "diagnostics": {
            "writeBlobContents": "True",
            "writeSerializedFrame": "True"
        }
    }
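As a quick sanity check before running the tests, the file can be loaded and inspected with a few lines of Python. This is only a sketch: the helper name `load_test_settings` is hypothetical, and treating all four top-level sections as required is an assumption based on the sample above, not a description of how the test suite itself reads the file.

```python
import json
from pathlib import Path

# Top-level sections that appear in the sample settings file.
# Assumption: all four are expected to be present.
REQUIRED_SECTIONS = ("workspace", "storage", "intermediateDataset", "diagnostics")


def load_test_settings(path="azuremltestsettings.json"):
    """Load the settings file and fail early if a section is missing."""
    with Path(path).open(encoding="utf-8") as f:
        settings = json.load(f)
    missing = [name for name in REQUIRED_SECTIONS if name not in settings]
    if missing:
        raise KeyError("missing sections in {}: {}".format(path, ", ".join(missing)))
    return settings
```

Calling `load_test_settings()` from this folder returns the parsed dictionary, or raises `KeyError` naming the sections that are absent.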
From the Azure portal, create a new ML workspace and open it in Studio. The workspace id appears in the URL.
The settings page lists 2 authorization tokens; you can use either one.
Set the id and token in the JSON:
    "workspace": {
        "id": "11111111111111111111111111111111",
        "token": "00000000000000000000000000000000",
        "endpoint": "https://studio.azureml.net"
    },
The storage section is used for some tests that load dataset files from Azure blob storage.
You'll need to create an Azure storage account, create a container and upload dataset files to it.
The round-trip tests rely on a naming convention for the files listed in the blobs array:
    "blobs": [
        "Dataset_NH.csv",
        "Dataset_NH.tsv",
        "Dataset_WH.csv",
        "Dataset_WH.tsv",
        "Dataset.txt"
    ]
NH means no header; WH means with header.
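To make the convention concrete, here is a small sketch that reads the header marker and extension out of a blob name. The function `describe_blob` is hypothetical and is not part of the test suite; how the round-trip tests actually interpret the names is not shown here, so treat this as an illustration only.

```python
import os


def describe_blob(name):
    """Split a blob name such as 'Dataset_NH.csv' into its convention parts."""
    stem, ext = os.path.splitext(name)
    # The marker, if any, is the text after the last underscore in the stem.
    suffix = stem.rsplit("_", 1)[-1] if "_" in stem else ""
    if suffix == "WH":
        has_header = True
    elif suffix == "NH":
        has_header = False
    else:
        has_header = None  # no marker in the name, e.g. 'Dataset.txt'
    return {"extension": ext.lstrip("."), "has_header": has_header}
```

For example, `describe_blob("Dataset_WH.tsv")` reports the extension `tsv` with a header, while `describe_blob("Dataset.txt")` reports `txt` with no marker.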
Create a new experiment. Add the following modules and connect them:
- Airport Codes Dataset
- Split
- Convert to CSV
Run the experiment and save it.
You'll need the experiment id (it appears in the URL), the node id (found in the HTML DOM), the port name (shown as a tooltip when you hover over the output port) and the data type id.
    "intermediateDataset": {
        "experimentId": "11111111111111111111111111111111.f-id.22222222222222222222222222222222",
        "nodeId": "33333333-3333-3333-3333-333333333333-333",
        "portName": "Results dataset",
        "dataTypeId": "GenericCSV"
    },
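In the sample values, the experiment id starts with the same 32 characters as the workspace id, followed by `.f-id.` and a second identifier. Assuming that pattern holds for real ids (it is inferred from the placeholders, not documented here), a copy-paste mistake can be caught with a hypothetical helper like this:

```python
def split_experiment_id(experiment_id):
    """Split '<workspace part>.f-id.<experiment part>' into its two halves."""
    workspace_part, separator, experiment_part = experiment_id.partition(".f-id.")
    if not separator:
        raise ValueError("experiment id does not contain '.f-id.'")
    return workspace_part, experiment_part
```

The first element of the returned tuple can then be compared with the `id` value in the `workspace` section.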
Some of the tests can write intermediate results to disk, which can help with debugging.
    "diagnostics": {
        "writeBlobContents": "True",
        "writeSerializedFrame": "True"
    }
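Note that the sample stores these flags as the string "True" rather than as a JSON boolean. A minimal sketch of reading such a flag follows; the helper `flag_enabled` is hypothetical, and treating any value other than "True" (case-insensitive) as disabled is an assumption, since the accepted values are not documented here.

```python
def flag_enabled(diagnostics, name):
    """Return True when the named flag is set to the string 'True'."""
    # Missing flags are treated as disabled (assumption).
    value = diagnostics.get(name, "False")
    return str(value).strip().lower() == "true"
```

For example, `flag_enabled({"writeBlobContents": "True"}, "writeBlobContents")` is true, and a missing key is treated as disabled.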