itsmefurzy/core-intelligent-document-identifier

Intelligent Document Identifier

A NestJS application that provides an API for analyzing document images using AI with LangChain and the Gemma-3-12b-it model via LMStudio.

Description

The API exposes an endpoint that accepts a base64-encoded document image, passes it through LangChain to the Gemma-3-12b-it model running on LMStudio, and returns a quality assessment. The assessment contains an overall confidence value and conclusion, plus a score and explanation for each of image sharpness, blur, obscured text, and overall text quality.
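As a rough illustration of how an image can reach a vision-capable model through an OpenAI-compatible API, the sketch below builds the text-plus-image content parts that such chat endpoints generally accept. It is not taken from this repository: the function name, the prompt text, and the use of a data URL are all assumptions.

```typescript
// One part of a multimodal chat message in the OpenAI-style format.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

// Hypothetical helper: wraps a prompt and a base64 image into content parts.
// The image is embedded as a data URL (an assumption about the transport).
function buildImageContent(
  prompt: string,
  base64Image: string,
  contentType: string,
): ContentPart[] {
  return [
    { type: "text", text: prompt },
    {
      type: "image_url",
      image_url: { url: `data:${contentType};base64,${base64Image}` },
    },
  ];
}
```

With LangChain, an array like this could be passed as the `content` of a `HumanMessage` before invoking the chat model; whether the service in this project does exactly that is not confirmed by the README.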

Prerequisites

  • Node.js (v16 or higher)
  • npm or yarn
  • LMStudio installed locally with the Gemma-3-12b-it model

Project setup

Install dependencies

$ npm install

Set up LMStudio

  1. Download and install LMStudio
  2. Download the Gemma-3-12b-it model in LMStudio
  3. Start the local server in LMStudio with the Gemma-3-12b-it model
  4. Ensure the server is running on http://localhost:1234
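One way to confirm step 4 before starting the API is to query the OpenAI-compatible model listing that the LMStudio local server exposes at /v1/models. The sketch below assumes Node.js 18 or newer (for the global fetch) and that the model is listed under an id containing gemma-3-12b-it; the helper names are hypothetical and not part of this repository.

```typescript
// Shape of an OpenAI-style model listing (only the fields used here).
interface ModelList {
  data: { id: string }[];
}

// Returns true if the listing contains a model whose id includes `modelId`.
function hasModel(list: ModelList, modelId: string): boolean {
  return list.data.some((m) => m.id.includes(modelId));
}

// Asks the local server for its models and checks for the expected one.
async function checkLmStudio(
  baseUrl = "http://localhost:1234/v1",
  modelId = "gemma-3-12b-it",
): Promise<boolean> {
  const res = await fetch(`${baseUrl}/models`);
  if (!res.ok) {
    throw new Error(`LMStudio responded with HTTP ${res.status}`);
  }
  return hasModel((await res.json()) as ModelList, modelId);
}
```

If checkLmStudio() rejects, the server is probably not running or is listening on a different port; if it resolves to false, the server is up but the expected model is not listed under that id.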

Running the application

# development
$ npm run start

# watch mode
$ npm run start:dev

# production mode
$ npm run start:prod

The API will be available at http://localhost:3000.

API Endpoints

Analyze Image

Analyzes a document image and returns a quality assessment.

  • URL: /document-identifier/analyze

  • Method: POST

  • Content-Type: application/json

  • Request Body:

    {
      "image": "base64_encoded_image_data",
      "contentType": "image/jpeg"
    }
  • Response:

    {
      "confident": "0.95",
      "conclusion": "The image appears to be a scanned ID card and the text is generally readable...",
      "reason": {
        "image_sharpness": {
          "score": "0.85",
          "reason": "The image has decent sharpness, though some areas show slight blurring."
        },
        "blurred": {
          "score": "0.75",
          "reason": "There's a noticeable blur around the edges and signature, which could impact OCR accuracy in those regions."
        },
        "obscured": {
          "score": "1",
          "reason": "The text is not obscured by any major objects or reflections."
        },
        "text_quality": {
          "score": "0.9",
          "reason": "The text quality is good overall, but some minor artifacts and shadowing are present due to the scanning process."
        }
      }
    }
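The request and response above could be exercised from a small client like the following sketch. Only the URL, method, and field names come from this README; the helper names are hypothetical, the global fetch assumes Node.js 18 or newer, and the example assumes the image field takes plain base64 without a data-URL prefix.

```typescript
// Request and response shapes as documented above.
interface AnalyzeRequest {
  image: string; // base64-encoded image data
  contentType: string; // e.g. "image/jpeg"
}

interface AnalyzeResponse {
  confident: string;
  conclusion: string;
  reason: Record<string, { score: string; reason: string }>;
}

// Encodes raw image bytes into the documented request body.
function buildAnalyzeRequest(bytes: Uint8Array, contentType: string): AnalyzeRequest {
  return { image: Buffer.from(bytes).toString("base64"), contentType };
}

// The API returns scores as strings; convert them to numbers for comparisons.
function scoresAsNumbers(res: AnalyzeResponse): Record<string, number> {
  const out: Record<string, number> = {};
  for (const [name, entry] of Object.entries(res.reason)) {
    out[name] = Number(entry.score);
  }
  return out;
}

// Posts the image to the endpoint and returns the parsed assessment.
async function analyzeImage(
  bytes: Uint8Array,
  contentType: string,
  baseUrl = "http://localhost:3000",
): Promise<AnalyzeResponse> {
  const res = await fetch(`${baseUrl}/document-identifier/analyze`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildAnalyzeRequest(bytes, contentType)),
  });
  if (!res.ok) {
    throw new Error(`Analyze request failed with HTTP ${res.status}`);
  }
  return (await res.json()) as AnalyzeResponse;
}
```

A caller would typically read the file with fs.readFile, pass the bytes to analyzeImage, and then use scoresAsNumbers to apply its own thresholds, for example rejecting an upload when the text_quality score is low.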

LMStudio Configuration

The application is configured to connect to LMStudio running locally on port 1234. If your LMStudio instance is running on a different port or host, update the configuration in src/document-identifier/langchain.service.ts:

this.model = new ChatOpenAI({
  modelName: 'gemma-3-12b-it',
  temperature: 0.2,
  maxTokens: 1000,
  openAIApiKey: 'dummy-key',
  configuration: {
    baseURL: 'http://localhost:1234/v1', // Update this URL if needed
  },
});
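To avoid editing the source when the host or port changes, the same options could be derived from environment variables. This is only a sketch: the variable names LMSTUDIO_BASE_URL and LMSTUDIO_MODEL and the helper lmStudioOptions are hypothetical, and the defaults mirror the values shown above.

```typescript
// Options object with the same shape as the ChatOpenAI configuration above.
interface LmStudioOptions {
  modelName: string;
  temperature: number;
  maxTokens: number;
  openAIApiKey: string;
  configuration: { baseURL: string };
}

// Builds the options, letting hypothetical environment variables override
// the model name and base URL while keeping the documented defaults.
function lmStudioOptions(env: Record<string, string | undefined>): LmStudioOptions {
  return {
    modelName: env["LMSTUDIO_MODEL"] ?? "gemma-3-12b-it",
    temperature: 0.2,
    maxTokens: 1000,
    openAIApiKey: "dummy-key", // placeholder value, as in the snippet above
    configuration: {
      baseURL: env["LMSTUDIO_BASE_URL"] ?? "http://localhost:1234/v1",
    },
  };
}
```

Inside the service this would be used as this.model = new ChatOpenAI(lmStudioOptions(process.env)), so a different LMStudio host only requires setting LMSTUDIO_BASE_URL before starting the application.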

Run tests

# unit tests
$ npm run test

# e2e tests
$ npm run test:e2e

# test coverage
$ npm run test:cov

License

This project is MIT licensed.
