Authors: Varshith Gude
Framework: PyTorch
Date: 2025
This project presents a sequence-to-sequence (Seq2Seq) model capable of converting numeric strings into their corresponding word representations. The system efficiently maps sequences of digits (0–9) into sequences of English words (e.g., 123 → one two three) using a GRU-based encoder-decoder architecture. This approach demonstrates the feasibility of applying neural sequence models to structured symbolic translation tasks.
Automatic conversion of numeric data into natural language is essential in applications such as:
- Voice assistants reading numbers aloud.
- Financial document processing.
- Educational tools for learning numerical literacy.
The task highlights sequence modeling challenges with variable-length inputs and outputs, providing a controlled environment to study encoder-decoder architectures with greedy decoding.
- Encoder: GRU-based network that encodes the input numeric sequence into a hidden representation.
- Decoder: GRU-based network that generates the word sequence conditioned on the encoder hidden states.
- Seq2Seq Framework: Connects encoder and decoder for end-to-end sequence translation.
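The architecture described above can be sketched in PyTorch as follows. This is a minimal illustration, not the project's `models.py`: the class names, constructor arguments, and batch-first tensor layout are assumptions.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """GRU encoder: compresses a digit-token sequence into a hidden state."""
    def __init__(self, input_size, emb_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(input_size, emb_size)
        self.gru = nn.GRU(emb_size, hidden_size, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len) digit token ids
        embedded = self.embedding(src)
        _, hidden = self.gru(embedded)
        return hidden  # (1, batch, hidden_size) summary of the input

class Decoder(nn.Module):
    """GRU decoder: emits one word token at a time, conditioned on the hidden state."""
    def __init__(self, output_size, emb_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(output_size, emb_size)
        self.gru = nn.GRU(emb_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, token, hidden):
        # token: (batch, 1) previous word token; hidden: (1, batch, hidden_size)
        embedded = self.embedding(token)
        output, hidden = self.gru(embedded, hidden)
        logits = self.fc(output.squeeze(1))  # (batch, output_size)
        return logits, hidden
```

A Seq2Seq wrapper would simply run the encoder once and then step the decoder token by token.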
Digit-to-word mapping:
```python
digit_to_word = {
    '0': 'zero', '1': 'one', '2': 'two', '3': 'three', '4': 'four',
    '5': 'five', '6': 'six', '7': 'seven', '8': 'eight', '9': 'nine',
}
```
- `config.py` – Project configuration and hyperparameters.
- `dataset.py` – Dataset generation, save/load utilities, and PyTorch Dataset class.
- `generate_dataset.py` – Script to generate training data.
- `train.py` / `main.py` – Training script and/or interactive number converter.
- `models.py` – Encoder, decoder, and Seq2Seq model definitions.
- `train_utils.py` – Training helper functions.
- `vocab.py` – Input/output vocabularies and token mappings.
- `inference.py` – Greedy decoding function for inference.
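Because every target follows directly from the digit-to-word mapping, training pairs can be generated synthetically. A minimal sketch of what `generate_dataset.py` might do (the function names and the maximum length of 5 are assumptions):

```python
import random

# Digit-to-word mapping from the section above
digit_to_word = {str(i): w for i, w in enumerate(
    "zero one two three four five six seven eight nine".split())}

def generate_pair(max_len=5):
    # Sample a random digit string and its word-sequence target,
    # e.g. "123" -> ["one", "two", "three"]
    length = random.randint(1, max_len)
    digits = ''.join(random.choice('0123456789') for _ in range(length))
    return digits, [digit_to_word[d] for d in digits]

def generate_dataset(n=10000, max_len=5):
    # Build a list of (digit string, word list) training pairs
    return [generate_pair(max_len) for _ in range(n)]
```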
- Loss function: Cross-entropy between predicted and true word sequences.
- Optimization: Adam optimizer, learning rate 0.001.
- Batch size: 128
- Embedding size: 64
- Hidden size: 128
- Teacher forcing ratio: 0.5
- Epochs: 20
- Greedy decoding: generates exactly as many words as there are input digits, avoiding early stopping and `<eos>`-token issues.
# Test The Model
```shell
python test.py
```