Commit 9eebcd0

Merge pull request microsoft#23 from microsoft/develop
Documentation update
2 parents 4d4ccaf + e32366e commit 9eebcd0

1 file changed

Lines changed: 202 additions & 67 deletions

README.md

@@ -1,14 +1,79 @@
11

2-
32
# Nutter
43

4+
5+
- [Overview](#overview)
6+
- [Nutter Runner](#nutter-runner)
7+
* [Cluster Installation](#cluster-installation)
8+
* [Nutter Fixture](#nutter-fixture)
9+
* [Test Cases](#test-cases)
10+
  * [before_all and after_all](#before_all-and-after_all)
11+
- [Nutter CLI](#nutter-cli)
12+
* [Getting Started with the Nutter CLI](#getting-started-with-the-nutter-cli)
13+
- [Examples](#examples)
14+
* [1. Listing Test Notebooks](#1-listing-test-notebooks)
15+
* [2. Executing Test Notebooks](#2-executing-test-notebooks)
16+
* [Run single test notebook](#run-single-test-notebook)
17+
* [Run multiple tests notebooks](#run-multiple-tests-notebooks)
18+
* [Parallel Execution](#parallel-execution)
19+
- [Nutter CLI Syntax and Flags](#nutter-cli-syntax-and-flags)
20+
* [Run Command](#run-command)
21+
* [List Command](#list-command)
22+
- [Integrating Nutter with Azure DevOps](#integrating-nutter-with-azure-devops)
23+
- [Contributing](#contributing)
24+
* [Contribution Tips](#contribution-tips)
25+
* [Contribution Guidelines](#contribution-guidelines)
526
## Overview
6-
The Nutter framework makes it easy to test Databricks notebooks. The framework enables a simple inner dev loop, but also easily integrates with Azure DevOps Build/Release pipelines, among others. When data or ML engineers want to test a notebook, they simply create a test notebook called *test_*<notebook_under_test>.
727

28+
The Nutter framework makes it easy to test Databricks notebooks. The framework enables a simple inner dev loop and easily integrates with Azure DevOps Build/Release pipelines, among others. When data or ML engineers want to test a notebook, they simply create a test notebook called *test_*<notebook_under_test>.
29+
30+
Nutter has 2 main components:
31+
32+
1. Nutter Runner - this is the server-side component that is installed as a library on the Databricks cluster
33+
2. Nutter CLI - this is the client CLI that can be installed both on a developer's laptop and on a build agent
834

935
The tests can be run from within that notebook or executed from the Nutter CLI, useful for integrating into Build/Release pipelines.
1036

37+
## Nutter Runner
38+
39+
### Cluster Installation
40+
41+
The Nutter Runner can be installed as a cluster library, via PyPI.
42+
43+
For more information about installing libraries on a cluster, review [Install a library on a cluster](https://docs.microsoft.com/en-us/azure/databricks/libraries#--install-a-library-on-a-cluster).
44+
45+
### Nutter Fixture
46+
47+
The Nutter Runner is simply a base Python class, NutterFixture, that test fixtures implement. The runner runtime is a module you can use once you install Nutter on the Databricks cluster. The NutterFixture base class can then be imported in a test notebook and implemented by a test fixture:
48+
49+
``` Python
50+
from runtime.nutterfixture import NutterFixture, tag
51+
class MyTestFixture(NutterFixture):
52+
53+
```
54+
55+
To run the tests:
56+
57+
``` Python
58+
result = MyTestFixture().execute_tests()
59+
```
60+
61+
To view the results from within the test notebook:
62+
63+
``` Python
64+
print(result.to_string())
65+
```
66+
67+
To return the test results to the Nutter CLI:
68+
69+
``` Python
70+
result.exit(dbutils)
71+
```
72+
73+
__Note:__ Behind the scenes, the call to result.exit calls dbutils.notebook.exit, passing the serialized TestResults back to the CLI. Currently, print statements produce no output when dbutils.notebook.exit is called in a notebook, even if they appear before the call. For this reason, you must *temporarily* comment out result.exit(dbutils) when running the tests locally.
74+
1175
The following defines a single test fixture named 'MyTestFixture' that has 1 TestCase named 'test_name':
76+
1277
``` Python
1378
from runtime.nutterfixture import NutterFixture, tag
1479
class MyTestFixture(NutterFixture):
@@ -28,6 +93,7 @@ result.exit(dbutils)
2893
```
2994

3095
To execute the test from within the test notebook, run the cell containing the above code. Currently, to see the test result below, you have to comment out the call to result.exit(dbutils). That call is required to send the results when the test is run from the CLI, so do not forget to uncomment it after testing locally.
96+
3197
``` Python
3298
Notebook: (local) - Lifecycle State: N/A, Result: N/A
3399
============================================================
@@ -39,45 +105,22 @@ test_name (19.43149897100011 seconds)
39105
============================================================
40106
```
41107

42-
## Components
43-
Nutter has 2 main components:
44-
1. Nutter Runner - this is the server-side component that is installed as a library on the Databricks cluster
45-
2. Nutter CLI - this is the client CLI that can be installed both on a developer's laptop and on a build agent
108+
### Test Cases
46109

47-
## Nutter Runner
48-
The Nutter Runner is simply a base Python class, NutterFixture, that test fixtures implement. The runner is installed as a library on the Databricks cluster. The NutterFixture base class can then be imported in a test notebook and implemented by a test fixture:
49-
``` Python
50-
from runtime.nutterfixture import NutterFixture, tag
51-
class MyTestFixture(NutterFixture):
52-
53-
```
110+
A test fixture can contain 1 or more test cases. Test cases are discovered when execute_tests() is called on the test fixture. Every test case comprises 2 required and 2 optional methods and is discovered by the following convention: prefix_testname, where valid prefixes are: before_, run_, assertion_, and after_. A test fixture that has run_fred and assertion_fred methods has 1 test case called 'fred'. The following are details about test case methods:
54111

55-
To run the tests:
56-
``` Python
57-
result = MyTestFixture().execute_tests()
58-
```
112+
* _before\_(testname)_ - (optional) - if provided, is run prior to the 'run_' method. This method can be used to set up any test pre-conditions
59113

60-
To view the results from within the test notebook:
61-
``` Python
62-
print(result.to_string())
63-
```
114+
* _run\_(testname)_ - (required) - run after 'before_' if before was provided, otherwise run first. This method typically runs the notebook under test
64115

65-
To return the test results to the Nutter CLI:
66-
``` Python
67-
result.exit(dbutils)
68-
```
116+
* _assertion\_(testname)_ - (required) - run after 'run_'. This method typically contains the test assertions
69117

70-
__Note:__ The call to result.exit, behind the scenes calls dbutils.notebook.exit, passing the serialized TestResults back to the CLI. At the current time, print statements do not work when dbutils.notebook.exit is called in a notebook, even if they are written prior to the call. For this reason, it is required to *temporarily* comment out result.exit(dbutils) when running the tests locally.
118+
__Note:__ You can assert test scenarios using the standard ``` assert ``` statement or the assertion capabilities from a package of your choice.
71119

72-
### Test Cases
73-
A test fixture can contain 1 or more test cases. Test cases are discovered when execute_tests() is called on the test fixture. Every test case is comprised of 2 required and 2 optional methods and are discovered by the following convention: prefix_testname, where valid prefixes are: before_, run_, assertion_, and after_. A test fixture that has run_fred and assertion_fred methods has 1 test case called 'fred'. The following are details about test case methods:
74-
75-
* before_(testname) - (optional) - if provided, is run prior to the 'run_' method. This method can be used to setup any test pre-conditions
76-
* run_(testname) - (required) - run after 'before_' if before was provided, otherwise run first. This method typically runs the notebook under test
77-
* assertion_(testname) (required) - run after 'run_'. This method typically contains the test assertions
78-
* after_(testname) (optional) - if provided, run after 'assertion_'. This method typically is used to clean up any test data used by the test
120+
* _after\_(testname)_ - (optional) - if provided, run after 'assertion_'. This method is typically used to clean up any test data used by the test
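
The prefix_testname convention described above can be illustrated with a small, self-contained sketch. This is not Nutter's own discovery code: `discover_test_cases` and `ExampleFixture` are hypothetical names, and the choice to ignore names that lack a `run_`/`assertion_` pair is an assumption of this sketch rather than documented Nutter behavior.

```python
import inspect
from collections import defaultdict

# Valid method prefixes, as listed in the README text.
PREFIXES = ("before_", "run_", "assertion_", "after_")


def discover_test_cases(fixture_cls):
    """Group a class's methods by test name using prefix_testname.

    Hypothetical helper for illustration only; Nutter's real
    discovery logic may differ.
    """
    cases = defaultdict(dict)
    for name, _ in inspect.getmembers(fixture_cls, predicate=inspect.isfunction):
        for prefix in PREFIXES:
            if name.startswith(prefix):
                test_name = name[len(prefix):]
                # before_all / after_all are fixture-level hooks, not test cases.
                if test_name == "all" and prefix in ("before_", "after_"):
                    break
                cases[test_name][prefix] = name
                break
    # Assumption: a test case exists only when both required methods are present.
    return {
        test: methods
        for test, methods in cases.items()
        if "run_" in methods and "assertion_" in methods
    }


class ExampleFixture:
    """Plain class standing in for a NutterFixture subclass."""

    def run_fred(self):
        pass

    def assertion_fred(self):
        pass

    def before_wilma(self):
        pass  # no run_/assertion_ pair, so 'wilma' is not reported


print(sorted(discover_test_cases(ExampleFixture)))  # ['fred']
```

The sketch reports a single test case, 'fred', which matches the run_fred/assertion_fred example in the text.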
79121

80122
A test fixture can have multiple test cases. The following example shows a fixture called MultiTestFixture with 2 test cases: 'test_case_1' and 'test_case_2' (assertion code omitted for brevity):
123+
81124
``` Python
82125
from runtime.nutterfixture import NutterFixture, tag
83126
class MultiTestFixture(NutterFixture):
@@ -95,11 +138,13 @@ class MultiTestFixture(NutterFixture):
95138

96139
result = MultiTestFixture().execute_tests()
97140
print(result.to_string())
98-
result.exit(dbutils)
141+
#result.exit(dbutils)
99142
```
100143

101144
### before_all and after_all
145+
102146
Test fixtures can also have a before_all() method, which runs before all tests, and an after_all() method, which runs after all tests.
147+
103148
``` Python
104149
from runtime.nutterfixture import NutterFixture, tag
105150
class MultiTestFixture(NutterFixture):
@@ -116,41 +161,31 @@ class MultiTestFixture(NutterFixture):
116161
117162
```
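
To make the ordering concrete, here is a minimal sketch that records the order in which the lifecycle methods would run. It does not import Nutter: `OrderRecordingFixture` and `run_in_documented_order` are hypothetical, and running each test case's methods back to back (rather than interleaving them across test cases) is an assumption of this sketch.

```python
class OrderRecordingFixture:
    """Hypothetical stand-in for a fixture; each method logs its own name."""

    def __init__(self):
        self.calls = []

    def before_all(self):
        self.calls.append("before_all")

    def run_test_case_1(self):
        self.calls.append("run_test_case_1")

    def assertion_test_case_1(self):
        self.calls.append("assertion_test_case_1")

    def before_test_case_2(self):
        self.calls.append("before_test_case_2")

    def run_test_case_2(self):
        self.calls.append("run_test_case_2")

    def assertion_test_case_2(self):
        self.calls.append("assertion_test_case_2")

    def after_test_case_2(self):
        self.calls.append("after_test_case_2")

    def after_all(self):
        self.calls.append("after_all")


def run_in_documented_order(fixture, test_names):
    """Call before_all, then before_/run_/assertion_/after_ per test, then after_all."""
    if hasattr(fixture, "before_all"):
        fixture.before_all()
    for name in test_names:
        for prefix in ("before_", "run_", "assertion_", "after_"):
            method = getattr(fixture, prefix + name, None)
            if method is not None:  # optional methods may be absent
                method()
    if hasattr(fixture, "after_all"):
        fixture.after_all()
    return fixture.calls


order = run_in_documented_order(OrderRecordingFixture(), ["test_case_1", "test_case_2"])
print("\n".join(order))
```

The printed list starts with before_all and ends with after_all, with each test case's optional before_ and after_ methods wrapped around its run_ and assertion_ methods.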
118163

119-
### Installing the Nutter Runner on Azure Databricks
120-
Perform the following steps to install the Nutter wheel file on your Azure Databricks cluster:
121-
1. Open your Azure Databricks workspace
122-
2. Click on the 'Clusters' link (on the left)
123-
3. Click on the cluster you wish to install Nutter on
124-
4. Click 'Libraries' (at the top)
125-
5. Click 'Install New'
126-
6. Drag the Nutter whl file
127-
128164
## Nutter CLI
129165

130-
###
131-
### Getting Started
132-
Install the Nutter CLI from the source.
166+
The Nutter CLI is a command line interface that allows you to execute and list tests via a Command Prompt.
167+
168+
### Getting Started with the Nutter CLI
169+
170+
Install the Nutter CLI.
133171

134172
``` bash
135-
pip install setuptools
136-
git clone https://github.com/microsoft/nutter
137-
cd nutter
138-
python setup.py bdist_wheel
139-
cd dist
140-
pip install nutter-<LATEST_VERSION>-py3-none-any.whl
173+
pip install nutter
141174
```
142175

143176
__Note:__ It's recommended to install the Nutter CLI in a virtual environment.
144177

145178
Set the environment variables.
146179

147-
Linux
180+
Linux
181+
148182
``` bash
149183
export DATABRICKS_HOST=<HOST>
150184
export DATABRICKS_TOKEN=<TOKEN>
151185
```
152186

153187
Windows PowerShell
188+
154189
``` cmd
155190
$env:DATABRICKS_HOST="HOST"
156191
$env:DATABRICKS_TOKEN="TOKEN"
@@ -183,11 +218,13 @@ nutter list /dataload --recursive
183218
The ```run``` command schedules the execution of test notebooks and waits for their result.
184219

185220
### Run single test notebook
221+
186222
The following command executes the test notebook ```/dataload/test_sourceLoad``` in the cluster ```0123-12334-tonedabc```.
187223

188224
```bash
189225
nutter run dataload/test_sourceLoad --cluster_id 0123-12334-tonedabc
190226
```
227+
191228
__Note:__ In Azure Databricks you can get the cluster ID by selecting a cluster name from the Clusters tab and clicking on the JSON view.
192229

193230
### Run multiple tests notebooks
@@ -225,9 +262,9 @@ __Note:__ Running tests notebooks in parallel introduces the risk of data race c
225262

226263
## Nutter CLI Syntax and Flags
227264

228-
*Run Command*
265+
### Run Command
229266

230-
```
267+
``` bash
231268
SYNOPSIS
232269
nutter run TEST_PATTERN CLUSTER_ID <flags>
233270

@@ -236,20 +273,20 @@ POSITIONAL ARGUMENTS
236273
CLUSTER_ID
237274
```
238275

239-
```
276+
``` bash
240277
FLAGS
241278
--timeout Execution timeout. Default 120s
242279
--junit_report Create a JUnit XML report from the test results.
243280
--tags_report Create a CSV report from the test results that includes the test cases tags.
244281
--max_parallel_tests Sets the level of parallelism for test notebook execution.
245282
--recursive Executes all tests in the hierarchical folder structure.
246-
```
283+
```
247284

248285
__Note:__ You can also use flags syntax for POSITIONAL ARGUMENTS
249286

250-
*List Command*
287+
### List Command
251288

252-
```
289+
``` bash
253290
NAME
254291
nutter list
255292

@@ -260,7 +297,7 @@ POSITIONAL ARGUMENTS
260297
PATH
261298
```
262299

263-
```
300+
``` bash
264301
FLAGS
265302
--recursive Lists all tests in the hierarchical folder structure.
266303
```
@@ -271,19 +308,117 @@ __Note:__ You can also use flags syntax for POSITIONAL ARGUMENTS
271308

272309
You can run the Nutter CLI within an Azure DevOps pipeline. The Nutter CLI will exit with a non-zero code when a test case fails or the execution of the test notebook is not successful.
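
Outside Azure DevOps, the same exit-code behavior can be used from any script that launches the CLI. The following is a rough sketch, not part of Nutter: the helper names are hypothetical, and it only uses the `run` command, the `--recursive` and `--junit_report` flags, and the `DATABRICKS_HOST` / `DATABRICKS_TOKEN` variables that appear in this README.

```python
import subprocess

# Environment variables the README says the CLI needs.
REQUIRED_SETTINGS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")


def missing_databricks_settings(env):
    """Return the required variable names that are absent or empty in env."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


def build_nutter_run_command(test_pattern, cluster_id, recursive=False, junit_report=False):
    """Build the argument list for 'nutter run' (hypothetical helper)."""
    command = ["nutter", "run", test_pattern, cluster_id]
    if recursive:
        command.append("--recursive")
    if junit_report:
        command.append("--junit_report")
    return command


def run_nutter(command):
    """Run the CLI and return its exit code; non-zero signals a failure."""
    return subprocess.run(command).returncode


# Example usage (commented out because it needs the CLI and a workspace):
# import os, sys
# if missing_databricks_settings(os.environ):
#     sys.exit("Set DATABRICKS_HOST and DATABRICKS_TOKEN first")
# sys.exit(run_nutter(build_nutter_run_command(
#     "/Shared/", "0123-12334-tonedabc", recursive=True, junit_report=True)))
```

Propagating the return code with `sys.exit` lets a calling build step fail whenever the CLI reports a failed test case.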
273310

274-
For full integration of the test results with Azure DevOps you can set the flag ```--junit_report```. When this flag is set, the Nutter CLI outputs the results of the tests cases as a JUnit XML compliant file.
311+
The following Azure DevOps pipeline installs Nutter, recursively executes all tests in the workspace folder ```/Shared/```, and publishes the test results.
312+
313+
__Note:__ The pipeline expects the Databricks cluster ID, host and API token as pipeline variables.
314+
315+
316+
317+
```yaml
318+
# Starter Nutter pipeline
319+
320+
trigger:
321+
- develop
322+
323+
pool:
324+
vmImage: 'ubuntu-latest'
325+
326+
steps:
327+
- task: UsePythonVersion@0
328+
inputs:
329+
versionSpec: '3.5'
330+
331+
- script: |
332+
pip install nutter
333+
displayName: 'Install Nutter'
334+
335+
- script: |
336+
nutter run /Shared/ $CLUSTER --recursive --junit_report
337+
displayName: 'Execute Nutter'
338+
env:
339+
CLUSTER: $(clusterID)
340+
DATABRICKS_HOST: $(databricks_host)
341+
DATABRICKS_TOKEN: $(databricks_token)
342+
343+
- task: PublishTestResults@2
344+
inputs:
345+
testResultsFormat: 'JUnit'
346+
testResultsFiles: '**/test-*.xml'
347+
testRunTitle: 'Publish Nutter results'
348+
```
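
If you want to inspect a JUnit report yourself before publishing it, a short script can count the test cases and failures. This is a sketch under assumptions: the README does not show the layout of the file Nutter writes, so the code relies only on the common JUnit `testcase`, `failure` and `error` elements, and the sample XML is hand-written for illustration, not real Nutter output.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text):
    """Return (total_test_cases, failed_test_cases) for a JUnit XML string."""
    root = ET.fromstring(xml_text)
    total = 0
    failed = 0
    for case in root.iter("testcase"):
        total += 1
        # A test case counts as failed if it has a <failure> or <error> child.
        if case.find("failure") is not None or case.find("error") is not None:
            failed += 1
    return total, failed


# Hand-written sample; not produced by Nutter.
SAMPLE_REPORT = """<testsuites>
  <testsuite name="sample">
    <testcase name="test_name" time="1.0"/>
    <testcase name="test_other" time="2.0"><failure message="boom"/></testcase>
  </testsuite>
</testsuites>"""

print(summarize_junit(SAMPLE_REPORT))  # (2, 1)
```

In a pipeline, the string would come from reading a report file that matches the `**/test-*.xml` pattern used by the publish step.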
349+
350+
In some scenarios, the notebooks under test must be executed in a pre-configured test workspace, separate from the development one, that contains the necessary prerequisites such as test data, tables or mount points. In such scenarios, you can use the pipeline to deploy the notebooks to the test workspace before executing the tests with Nutter.
351+
352+
The following sample pipeline uses the Databricks CLI to publish the notebooks from the triggering branch to the test workspace.
353+
354+
355+
```yaml
356+
# Starter Nutter pipeline
357+
358+
trigger:
359+
- develop
360+
361+
pool:
362+
vmImage: 'ubuntu-latest'
363+
364+
steps:
365+
- task: UsePythonVersion@0
366+
inputs:
367+
versionSpec: '3.5'
368+
369+
- task: configuredatabricks@0
370+
displayName: 'Configure Databricks CLI'
371+
inputs:
372+
url: $(databricks_host)
373+
token: $(databricks_token)
374+
375+
- task: deploynotebooks@0
376+
displayName: 'Publish notebooks to test workspace'
377+
inputs:
378+
notebooksFolderPath: '$(System.DefaultWorkingDirectory)/notebooks/nutter'
379+
workspaceFolder: '/Shared/nutter'
380+
381+
- script: |
382+
pip install nutter
383+
displayName: 'Install Nutter'
384+
385+
- script: |
386+
nutter run /Shared/ $CLUSTER --recursive --junit_report
387+
displayName: 'Execute Nutter'
388+
env:
389+
CLUSTER: $(clusterID)
390+
DATABRICKS_HOST: $(databricks_host)
391+
DATABRICKS_TOKEN: $(databricks_token)
392+
393+
- task: PublishTestResults@2
394+
inputs:
395+
testResultsFormat: 'JUnit'
396+
testResultsFiles: '**/test-*.xml'
397+
testRunTitle: 'Publish Nutter results'
398+
```
399+
400+
## Contributing
401+
402+
### Contribution Tips
403+
404+
- There's a known issue with VS Code and the latest version of pytest.
405+
- Please make sure that you install pytest 5.0.1
406+
- If you installed pytest using VS Code, then you are likely using the incorrect version. Run the following command to fix it:
275407
276-
# Contributing
277-
## Using VS Code
278-
- There's a known issue with VS Code and the latest version of pytest.
279-
- Please make sure that you install pytest 5.0.1
280-
- If you installed pytest using VS Code, then you are likely using the incorrect version. Run the following command to fix it:
281408
``` Python
282409
pip install --force-reinstall pytest==5.0.1
283410
```
284411

285-
## Creating the wheel file and manually test wheel locally
412+
To create the wheel file and manually test the wheel locally:
413+
286414
1. Change directory to the root that contains setup.py
287415
2. Update the version in the setup.py
288416
3. Run the following command: python3 setup.py sdist bdist_wheel
289417
4. (optional) Install the wheel locally by running: python3 -m pip install <path-to-whl-file>
418+
419+
### Contribution Guidelines
420+
421+
If you would like to become an active contributor to this project please follow the instructions provided in [Microsoft Azure Projects Contribution Guidelines](http://azure.github.io/guidelines/).
422+
423+
-----
424+
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
