ffzz/match-name-lambda

🌟 NameMatcher Lambda: Bridging Names Across Languages

AWS Lambda Node.js OpenAI

An AWS Lambda-deployable project for matching English and Chinese names, with AI-powered matching via OpenAI.

🚀 Introduction

This project provides an API for matching English and Chinese names, deployable on AWS Lambda. Users query a name via GET request parameters, and the service returns the full matching name or names. By default it uses OpenAI for matching, which additionally handles abbreviations and traditional/simplified Chinese character variants.
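The request flow described above can be sketched as a small router. This is a minimal sketch, not the repository's actual handler: `selectMode`, `createHandler`, and the event shape are hypothetical names chosen for illustration, and the `isManual` parameter is the one documented in the Live Demo section.

```typescript
type QueryParams = Record<string, string | undefined>;
type Mode = "ai" | "manual";

interface MatchResult {
  bestMatchName: string;
  message: string;
}

type Matcher = (name: string) => Promise<MatchResult>;

// AI matching is the default; only the literal string "true" selects manual matching.
function selectMode(params: QueryParams): Mode {
  return params.isManual === "true" ? "manual" : "ai";
}

// Builds a Lambda-style handler from the two matchers (hypothetical wiring).
function createHandler(matchers: Record<Mode, Matcher>) {
  return async (event: { queryStringParameters?: QueryParams | null }) => {
    const params: QueryParams = event.queryStringParameters ?? {};
    const name = (params.name ?? "").trim();
    if (name === "") {
      const body: MatchResult = {
        bestMatchName: "",
        message: "Query parameter 'name' is required.",
      };
      return { statusCode: 400, body: JSON.stringify(body) };
    }
    const result = await matchers[selectMode(params)](name);
    return { statusCode: 200, body: JSON.stringify(result) };
  };
}
```

Keeping the mode selection in a pure function makes the default-to-AI behaviour easy to unit test without invoking either matcher.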

🔑 Key Features

  • Dual Matching Modes:
    • 🧠 AI-powered matching
    • 🖥️ Code-based function matching
  • Priority Matching: Returns results as soon as found, even with lower match quality
  • Match Probability: Calculates and returns the best match with quality assessment
  • Fuzzy Chinese Matching: E.g., "月林" can match "Yueling Zhang 月林张"
  • Multiple Result Support: Returns all top matches with equal probability
  • Mixed Language Input: Supports queries like "David 大卫" to match "David Smith 大卫 斯密斯"
  • Partial Mixed Matching: Matches on either language component, e.g., "david 世界" still matches "David Smith 大卫 斯密斯"
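One way the code-based features above could work is a token-overlap score. This is a sketch under stated assumptions, not the repository's `manualHandler`: `tokenize`, `score`, and `bestMatches` are hypothetical helpers, the "probability" is simply the fraction of query tokens matched, and only the basic CJK block is treated as Chinese.

```typescript
// Predefined list copied from this README.
const NAME_LIST: string[] = [
  "David Smith 大卫 斯密斯",
  "Yueling Zhang 月林张",
  "Huawen Wu 华文吴",
  "Annie Lee 李安妮",
  "John Lee 约翰李",
  "Benjamin Lee 本雅明李",
];

// Splits a query on whitespace and lowercases it.
function tokenize(query: string): string[] {
  return query.trim().toLowerCase().split(/\s+/).filter((t) => t.length > 0);
}

const isChinese = (token: string): boolean => /^[\u4e00-\u9fff]+$/.test(token);

// English tokens must equal a whole word; Chinese tokens only need to be a
// substring of a word. The score is the fraction of query tokens matched.
function score(query: string, entry: string): number {
  const tokens = tokenize(query);
  if (tokens.length === 0) return 0;
  const words = entry.toLowerCase().split(/\s+/);
  const hits = tokens.filter((t) =>
    isChinese(t) ? words.some((w) => w.includes(t)) : words.includes(t)
  ).length;
  return hits / tokens.length;
}

// Returns every entry that shares the highest non-zero score.
function bestMatches(
  query: string,
  names: string[] = NAME_LIST
): { names: string[]; probability: number } {
  const scored = names.map((n) => ({ n, s: score(query, n) }));
  const top = Math.max(0, ...scored.map((x) => x.s));
  const winners = top > 0 ? scored.filter((x) => x.s === top).map((x) => x.n) : [];
  return { names: winners, probability: top };
}
```

With this scoring, "月林" matches one entry fully, "david 世界" matches "David Smith 大卫 斯密斯" at half strength, and "Lee" returns all three Lee entries as a tie.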

🧠 AI-Exclusive Features

  • Traditional/Simplified Chinese Support: "約翰" can match "John Lee 约翰李"
  • English Name Abbreviations: "Ben" can match "Benjamin"
  • Chinese Name Inversion: "张月林" can match "月林张"

🌐 Live Demo

Experience the API live on AWS Lambda:

Endpoint: https://myr3z4n0w7.execute-api.ap-southeast-2.amazonaws.com/Dev/name

AI Model: GPT-4o

Requirements:

  • Valid API Key in the request header
  • Query Parameters:
    • name (required): The name to match (default: AI matching)
    • isManual (optional): Set to "true" for code-based manual matching
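A request against the endpoint above might be built as follows. This is a sketch only: the README does not name the API key header, so `x-api-key` (the usual API Gateway header) is an assumption, and `buildRequestUrl` and `queryName` are hypothetical helper names.

```typescript
const ENDPOINT =
  "https://myr3z4n0w7.execute-api.ap-southeast-2.amazonaws.com/Dev/name";

// Builds the request URL; isManual is only added when manual matching is requested.
function buildRequestUrl(name: string, isManual = false): string {
  const url = new URL(ENDPOINT);
  url.searchParams.set("name", name);
  if (isManual) url.searchParams.set("isManual", "true");
  return url.toString();
}

// Sends the GET request. The "x-api-key" header name is an assumption;
// the README only says a valid API key must be in the request header.
async function queryName(
  name: string,
  apiKey: string,
  isManual = false
): Promise<unknown> {
  const response = await fetch(buildRequestUrl(name, isManual), {
    headers: { "x-api-key": apiKey },
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

Using `URLSearchParams` avoids hand-encoding Chinese characters and spaces in the `name` parameter.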

Postman Testing

[Screenshots: Postman parameters, Postman API key, manual match]

📋 Predefined Name List

The current version can only match against the following predefined names:

- David Smith 大卫 斯密斯
- Yueling Zhang 月林张
- Huawen Wu 华文吴
- Annie Lee 李安妮
- John Lee 约翰李
- Benjamin Lee 本雅明李

🚀 Getting Started

Prerequisites

  • Node.js 20

Installation

  1. Clone the repository:

    git clone git@github.com:ffzz/match-name-lambda.git
    # or
    gh repo clone ffzz/match-name-lambda
  2. Install dependencies:

    npm install
  3. Configure OpenAI API Key: Create a .env file in the root directory and add your OpenAI API key:

    OPENAI_API_KEY=sk-Your-OpenAI-Key-Here
    

🧪 Local Testing

Run the test suite:

npm run test

Test Coverage

[Screenshot: test coverage report]

📁 Project Structure

.
├── __test__
│   ├── CustomErrorClass.test.ts
│   ├── aiHandler.test.ts
│   ├── getOpenAiclient.test.ts
│   ├── index.test.ts
│   ├── manualHandler.test.ts
│   └── utils.test.ts
├── commitlint.config.ts
├── index.ts
├── jest.config.ts
├── package-lock.json
├── package.json
├── project_structure.txt
├── readme.md
├── src
│   ├── constant
│   │   ├── nameList.ts
│   │   └── promtForAI.ts
│   ├── customError
│   │   └── CustomErrorClass.ts
│   ├── handlers
│   │   ├── aiHandler.ts
│   │   └── manualHandler.ts
│   ├── types
│   │   └── reponseType.ts
│   └── utils
│       ├── getOpenAiClient.ts
│       └── index.ts
└── tsconfig.json

The project follows a well-organized structure:

  • __test__: Contains all test files for comprehensive coverage
  • src: Houses the core application logic
    • constant: Stores constant values like name lists and AI prompts
    • customError: Defines custom error classes for better error handling
    • handlers: Implements AI and manual matching logic
    • types: Defines TypeScript types for better code consistency
    • utils: Utility functions for OpenAI client and other helpers
  • Root files handle configuration for TypeScript, Jest, and other project settings
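As an illustration of what a custom error class in `customError` might look like, the sketch below attaches an HTTP status code to an error and converts it into a response. The class name `MatchError`, its `statusCode` field, and `toErrorResponse` are hypothetical; the contents of `CustomErrorClass.ts` are not shown in this README.

```typescript
// Hypothetical error type carrying an HTTP status code alongside the message.
class MatchError extends Error {
  constructor(message: string, public readonly statusCode: number = 500) {
    super(message);
    this.name = "MatchError";
    // Restore the prototype chain so instanceof works when compiled to ES5.
    Object.setPrototypeOf(this, MatchError.prototype);
  }
}

// Converts any thrown value into a response-shaped object.
function toErrorResponse(err: unknown): { statusCode: number; body: string } {
  const statusCode = err instanceof MatchError ? err.statusCode : 500;
  const message = err instanceof Error ? err.message : "Unknown error";
  return { statusCode, body: JSON.stringify({ bestMatchName: "", message }) };
}
```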

🤖 AI Prompt

Below is the original prompt used for the AI-powered name matching:

## Role:
You are a powerful name-matching assistant for Chinese and English names.
Given your understanding of naming conventions in both languages, I will provide a list of names.
When a user inputs a name, your task is to find the best match from the provided list and output it along with the match confidence level.

## Name List:
David Smith 大卫 斯密斯
Yueling Zhang 月林张
Huawen Wu 华文吴
Annie Lee 李安妮
John Lee 约翰李
Benjamin Lee 本雅明李

## Objective:
Based on the user's input, identify the best matching name from the list and provide the match confidence level.

## Necessary Background Knowledge:

In English names, the surname typically comes after the given name, whereas in Chinese names, the surname usually comes first. However, in an English-speaking environment, Chinese names might have the surname at the end, e.g., "Yuelin Zhang" or "Zhang Yuelin" or "张月林" or "月林张" refer to the same person but with different surname positions.
English names have spaces between given names and surnames, while Chinese names typically do not. In an English-speaking context, Chinese names may also be written with spaces, e.g., "张月林" could be written as "月林 张","Zhang Yuelin" could be written as "Yuelin Zhang".
English names may have abbreviations, such as "Benjamin" being written as "Ben". If an exact match is not found, consider matching the abbreviation.
Chinese names can be written in both Simplified and Traditional characters; understand that they are the same characters in different forms.
Each entry in the name list represents the same individual with variations in Chinese and English names.
Some names in the list are transliterations, such as "David Smith" being translated to "大卫 斯密斯" and "Yuelin Zhang" to "月林张" or "张月林".

## Matching Rules:

- English Name Input: Prefer matching the English part first. For example, "David", "david", "Smith", or "David Smith" should all match "David Smith 大卫 斯密斯".
- Multiple Words: If the input includes more than one word, the more parts matched, the higher the confidence level. For example, "Annie Lee" has a higher match confidence than "Annie" or "Lee", and "李安妮" has a higher match confidence than "李" or "安妮".
- Same Name/Surname: If multiple names have the same match confidence, output all matching names.
- Chinese Name Input: Distinguish between the surname and given name. For instance, "李安妮" consists of the surname "李" and the given name "安妮". If the input is "李安", it should not match "李安妮", but if the input is "安妮", it can match "李安妮" if no better match exists.
- Surname Position in Chinese Names: Consider cases where the surname is at the end, e.g., "月林张" should match "月林" if no better match exists, as the surname "张" is at the end.
- English Name Abbreviations: Recognize abbreviations, e.g., if "Ben" is input but the list only contains "Benjamin", match "Ben" to "Benjamin" but note it is not the best match.

## Requirements(rules must be followed):

- Output the full name, e.g., "Yueling Zhang 月林张" is complete, while "月林张", "月林", or "zhang" are incomplete.
- Prefer the name with the highest match confidence. If there is a tie, output all names with the highest confidence.
- Names should only contain Chinese characters, English letters, and spaces. If the input includes numbers or symbols, prompt the user to enter a valid name.
- You can only provide name-matching functionality. If a match is found, output the name. If no match is found, return "no match". If the input is invalid, prompt the user to enter a valid name.
- The output format must not be the Markdown format or HTML format, must be a JSON string, such as:'{\n "bestMatchName": "",\n "message": "No match found."\n}'.
- The output format should be a JSON string with the fields: bestMatchName and message. The fields are strings. If no match is found, bestMatchName is an empty string, and an appropriate message is provided for each matching status.
- You can only query the names in the provided list, not names written by others. Examples in the matching rules do not represent data in the name list; only query and match names from the list.

## Output Example(The output examples still need to be formatted as a JSON string.):
- Example 1:
{
    "bestMatchName": "Yueling Zhang 月林张",
    "message": "Match found, perfect match!"
}

- Example 2:
{
    "bestMatchName": "Yueling Zhang 月林张",
    "message": "Match found, partial match!"
}

- Example 3:
{
    "bestMatchName": "",
    "message": "No match found."
}

- Example 4:
{
    "bestMatchName": "",
    "message": "Invalid input name. The input name can only consist of Chinese and English characters, along with spaces; numbers, symbols, or other characters are not allowed."
}

- Example 5:
{
    "bestMatchName": "Annie Lee 李安妮, John Lee 约翰李, Benjamin Lee 本雅明李",
    "message": "Match found, Lee is a common surname and part of several names."
}

Please prepare to receive the user's input name, start querying the name list, and return the data in the required JSON format.

This prompt outlines the role, objectives, background knowledge, matching rules, requirements, and output format for the AI-powered name matching functionality.
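The prompt's input rule and output contract can also be enforced in code on either side of the model call. The sketch below is an assumption about how that might look, not the repository's `aiHandler`: `isValidName`, `parseMatchResponse`, and `splitMatches` are hypothetical helpers, and the character check covers only the basic CJK block, ASCII letters, and spaces.

```typescript
interface MatchResponse {
  bestMatchName: string;
  message: string;
}

// Accepts only CJK ideographs (basic block), ASCII letters, and spaces,
// and rejects input that is empty or only spaces.
function isValidName(input: string): boolean {
  return /^[A-Za-z\u4e00-\u9fff ]+$/.test(input) && input.trim().length > 0;
}

// Parses the model's reply and checks that both required string fields exist.
function parseMatchResponse(raw: string): MatchResponse {
  const data: unknown = JSON.parse(raw.trim());
  if (typeof data !== "object" || data === null) {
    throw new Error("Response is not a JSON object");
  }
  const { bestMatchName, message } = data as Record<string, unknown>;
  if (typeof bestMatchName !== "string" || typeof message !== "string") {
    throw new Error("Response is missing bestMatchName or message");
  }
  return { bestMatchName, message };
}

// Splits a tied result such as the one in Example 5 into separate entries.
function splitMatches(bestMatchName: string): string[] {
  return bestMatchName
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}
```

Validating before the call avoids spending a model request on input the prompt would reject anyway, and validating after it guards against replies that drift from the required JSON shape.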
