LabelingUX

Introduction

This open-source project provides a sample template for data labeling. You can build other features on top of it, e.g., uploading data or modifying the UI.

You are welcome to report any issues you encounter. The Microsoft Azure Form Recognizer team will update the source code periodically.

Get Started

Install node packages

Install server's node packages

Run the command below in the root folder to install the node packages for the server:

npm i

Install client's node packages

Run the commands below to install the node packages for the client:

cd Client
npm i

Start the application

Switch back to the root folder and start the application. The server runs on port 4000 and the client on port 3000:

cd ..
npm run dev

You should see "Compiled successfully!" in your CLI, and the application should open automatically in your default browser at http://localhost:3000/label

Add labeling data

All labeling data is read from and written to Server/data.

SampleLabelingUX
│   README.md
│
└───Server
│   │
│   └───data
│       │   fields.json
│       │   example.pdf
│       │   example.pdf.ocr.json
│       │   example.pdf.labels.json
│       │   ...
│
└───Client

Labeling data in `data` folder contains:

- document files of a supported type
- *.ocr.json
- *.labels.json
- fields.json

Note that to start labeling you need to provide the documents and their corresponding .ocr.json files. If you don't have an .ocr.json file for a document you want to label, see the instructions at the end of this README for how to generate one.

You can also provide a document's .labels.json file and the overall fields.json file if you have them.

After you label your documents in this sample labeling tool, the labeling results, i.e. the .labels.json files and fields.json, are stored in this folder as well.

More details about the labeling data

- Supported document types for labeling: PDF, JPG, JPEG, PNG, TIFF, TIF.
- .ocr.json: MUST be provided along with the document to start labeling.
- .labels.json: auto-generated after you label the document with a field. If you have provided one, the tool reads from and writes to the file you provided.
- fields.json: auto-generated after a new field is created. If you have provided one, the tool reads from and writes to the file you provided.
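
To make the file roles above concrete, here is a minimal sketch that classifies a file name from the data folder by its suffix. The helper `classifyDataFile` and the type `DataFileKind` are hypothetical names used for illustration and are not part of this project; matching extensions case-insensitively is also an assumption.

```typescript
// Hypothetical helper, not part of this project: the kinds of file the
// README describes for Server/data.
type DataFileKind = "fields" | "ocr" | "labels" | "document" | "other";

// Classify a file name by its suffix.
// Assumption: extensions are matched case-insensitively.
function classifyDataFile(fileName: string): DataFileKind {
  if (fileName === "fields.json") return "fields";
  if (/\.ocr\.json$/i.test(fileName)) return "ocr";
  if (/\.labels\.json$/i.test(fileName)) return "labels";
  // Document types listed as supported: PDF, JPG, JPEG, PNG, TIFF, TIF.
  if (/\.(pdf|jpe?g|png|tiff?)$/i.test(fileName)) return "document";
  return "other";
}

// Example: classify a few file names.
const exampleKinds = ["fields.json", "test.pdf", "test.pdf.ocr.json", "notes.txt"]
  .map((name) => classifyDataFile(name));
console.log(exampleKinds); // logs the kinds: fields, document, ocr, other
```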

Note: How to create `.ocr.json` file for your documents

You can create the .ocr.json file in either of two ways:

- Upload the document to Form Recognizer Studio and analyze it with Layout. The .ocr.json file is created in your blob storage container, named after the document's filename.
- Use a Form Recognizer SDK to analyze the document with the Layout API, and save the result as a JSON file named <document name>.<document extension>.ocr.json. For example, the document test.pdf should have a corresponding test.pdf.ocr.json file holding its OCR results.
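
The naming convention can be checked before you start labeling. The sketch below derives the expected OCR file name for a document and lists the documents in a folder listing that have no OCR file yet. Both `ocrFileNameFor` and `documentsMissingOcr` are hypothetical helpers for illustration and are not part of this project.

```typescript
// Expected OCR file name for a document, following the convention
// <document name>.<document extension>.ocr.json.
function ocrFileNameFor(documentFileName: string): string {
  return documentFileName + ".ocr.json";
}

// Hypothetical helper: given the file names found in the data folder,
// return the documents that have no matching .ocr.json file.
// Assumption: extensions are matched case-insensitively.
function documentsMissingOcr(fileNames: string[]): string[] {
  const isDocument = (name: string) => /\.(pdf|jpe?g|png|tiff?)$/i.test(name);
  return fileNames.filter(
    (name) => isDocument(name) && fileNames.indexOf(ocrFileNameFor(name)) === -1
  );
}

// Example: b.png has no OCR file, so it is the only name reported.
const missingOcr = documentsMissingOcr([
  "a.pdf",
  "a.pdf.ocr.json",
  "b.png",
  "fields.json",
]);
console.log(missingOcr); // logs a list containing only b.png
```

In a Node script, the input list could come from reading the data folder, for example with `fs.readdirSync("Server/data")`.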
