perf: Parallelize DynamoDB batch reads in sync online_read#6024

Merged
ntkathole merged 6 commits into feast-dev:master from
abhijeet-dhumal:perf/parallel-dynamodb-batch-reads
Mar 4, 2026

Conversation

@abhijeet-dhumal abhijeet-dhumal commented Feb 25, 2026

Summary

Execute DynamoDB BatchGetItem requests in parallel using ThreadPoolExecutor instead of sequentially. This significantly reduces latency when reading features for many entities that span multiple batches.

Changes

  • Pre-split entity IDs into batches upfront
  • Use ThreadPoolExecutor to execute batch requests concurrently
  • Skip parallelization for single batch (no overhead)
  • Merge results in original order after parallel fetch
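The four changes above can be sketched as follows. This is a minimal illustration, not the code from the PR: `split_into_batches`, `read_batches`, `fake_fetch`, and the `max_workers` default of 10 are hypothetical names and values, and `fake_fetch` stands in for a real BatchGetItem call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def split_into_batches(entity_ids: Sequence[str], batch_size: int) -> List[List[str]]:
    """Pre-split entity IDs into consecutive batches of at most batch_size."""
    return [
        list(entity_ids[i : i + batch_size])
        for i in range(0, len(entity_ids), batch_size)
    ]


def read_batches(
    entity_ids: Sequence[str],
    batch_size: int,
    fetch_batch: Callable[[List[str]], List[str]],
    max_workers: int = 10,  # hypothetical default, not taken from the PR
) -> List[str]:
    batches = split_into_batches(entity_ids, batch_size)
    if not batches:
        return []
    if len(batches) == 1:
        # Single batch: call directly and avoid thread-pool overhead.
        return fetch_batch(batches[0])
    with ThreadPoolExecutor(max_workers=min(len(batches), max_workers)) as executor:
        # executor.map yields results in the order of its inputs,
        # so the merged list keeps the original entity order.
        responses = list(executor.map(fetch_batch, batches))
    return [row for response in responses for row in response]


def fake_fetch(batch: List[str]) -> List[str]:
    """Stand-in for a BatchGetItem call; returns one value per entity ID."""
    return [f"features:{entity_id}" for entity_id in batch]


ids = [f"e{i}" for i in range(500)]
rows = read_batches(ids, batch_size=100, fetch_batch=fake_fetch)
```

Because `executor.map` returns results in input order, no extra bookkeeping is needed to restore the original entity order after the parallel fetch.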

Expected Behavior

For multiple batches, DynamoDB BatchGetItem requests should execute in parallel, reducing total latency from roughly N × network_latency to approximately 1 × network_latency when all N batches fit within the worker pool.

Current Behavior

The sync online_read method executes batch requests sequentially in a while loop:

while True:
    batch = list(itertools.islice(entity_ids_iter, batch_size))
    if len(batch) == 0:
        break
    response = dynamodb_resource.batch_get_item(RequestItems=...)  # Sequential!
    result.extend(batch_result)

For 500 entities with batch_size=100, this makes 5 sequential network calls.
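To make the call count concrete, here is a self-contained version of the sequential loop with a stub in place of the DynamoDB call. The function name `sequential_read` and the `fetch_batch` callback are illustrative assumptions, not Feast APIs.

```python
import itertools
import math
from typing import Callable, Iterable, List, Tuple


def sequential_read(
    entity_ids: Iterable[str],
    batch_size: int,
    fetch_batch: Callable[[List[str]], List[str]],
) -> Tuple[List[str], int]:
    """Read batches one after another; return (rows, number_of_calls)."""
    entity_ids_iter = iter(entity_ids)
    result: List[str] = []
    calls = 0
    while True:
        batch = list(itertools.islice(entity_ids_iter, batch_size))
        if not batch:
            break
        # Each call starts only after the previous one has returned.
        result.extend(fetch_batch(batch))
        calls += 1
    return result, calls


rows, calls = sequential_read(
    (f"e{i}" for i in range(500)), 100, lambda batch: [x.upper() for x in batch]
)
print(calls, math.ceil(500 / 100))  # prints "5 5"
```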

Steps to Reproduce

  1. Configure DynamoDB online store with batch_size=100
  2. Call get_online_features for 500 entities
  3. Profile network latency - observe 5 sequential calls

Specifications

  • Version: 0.47.0+
  • Platform: All
  • Subsystem: sdk/python/feast/infra/online_stores/dynamodb.py

Performance Impact

For 500 entities with batch_size=100 (5 batches), assuming roughly 10-30ms per BatchGetItem round trip:

  • Before: 5 sequential network calls = 50-150ms
  • After: 5 parallel network calls = 10-30ms
  • Estimated savings: 40-120ms for large entity sets
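The estimates above follow from a simple wave model, sketched here under the assumption that every call takes the same time and that thread overhead, throttling, and connection-pool limits are negligible. `estimated_latency_ms` is a hypothetical helper, not part of Feast.

```python
import math


def estimated_latency_ms(num_batches: int, per_call_ms: float, max_workers: int = 1) -> float:
    """Batches run in waves of max_workers; each wave costs one round trip."""
    waves = math.ceil(num_batches / max_workers)
    return waves * per_call_ms


# Sequential (one worker) versus fully parallel (one worker per batch).
before = [estimated_latency_ms(5, ms, max_workers=1) for ms in (10, 30)]
after = [estimated_latency_ms(5, ms, max_workers=5) for ms in (10, 30)]
savings = [b - a for b, a in zip(before, after)]
```

With fewer workers than batches, the model predicts more than one wave, so the speedup is smaller than N×.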

Possible Solution

Already implemented in this PR using ThreadPoolExecutor:

with ThreadPoolExecutor(max_workers=min(len(batches), batch_size)) as executor:
    responses = list(executor.map(fetch_batch, batches))

Related

  • RHOAIENG-46061 (60ms p99 SLA target for online feature serving)
  • Note: The async path (online_read_async) already uses asyncio.gather() for parallel execution
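For comparison, the `asyncio.gather()` pattern mentioned in the note looks roughly like this. It is a generic sketch with a placeholder coroutine, not the actual `online_read_async` implementation; `fake_fetch_async` and `read_batches_async` are hypothetical names.

```python
import asyncio
from typing import List


async def fake_fetch_async(batch: List[str]) -> List[str]:
    """Placeholder for an awaited network call returning one row per entity."""
    await asyncio.sleep(0)
    return [f"features:{entity_id}" for entity_id in batch]


async def read_batches_async(batches: List[List[str]]) -> List[str]:
    # gather runs the coroutines concurrently and returns their results
    # in the same order as the awaitables passed in.
    responses = await asyncio.gather(*(fake_fetch_async(b) for b in batches))
    return [row for response in responses for row in response]


batches = [["a", "b"], ["c"], ["d", "e"]]
rows = asyncio.run(read_batches_async(batches))
```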

@abhijeet-dhumal abhijeet-dhumal requested a review from a team as a code owner February 25, 2026 14:47
@abhijeet-dhumal abhijeet-dhumal changed the title from "perf: parallelize DynamoDB batch reads in sync online_read" to "perf: Parallelize DynamoDB batch reads in sync online_read" Feb 25, 2026
@abhijeet-dhumal abhijeet-dhumal force-pushed the perf/parallel-dynamodb-batch-reads branch from 2a3e9b8 to f75c446 Compare February 25, 2026 14:49
devin-ai-integration[bot]

This comment was marked as resolved.

@abhijeet-dhumal abhijeet-dhumal force-pushed the perf/parallel-dynamodb-batch-reads branch from 703606c to 30bb162 Compare March 2, 2026 06:21
@ntkathole ntkathole force-pushed the perf/parallel-dynamodb-batch-reads branch from 30bb162 to f4eebb4 Compare March 2, 2026 06:47
@abhijeet-dhumal abhijeet-dhumal force-pushed the perf/parallel-dynamodb-batch-reads branch from dbaf799 to 344305c Compare March 2, 2026 13:45
@ntkathole ntkathole force-pushed the perf/parallel-dynamodb-batch-reads branch from 344305c to 6e04b50 Compare March 2, 2026 14:18
devin-ai-integration[bot]

This comment was marked as resolved.

@ntkathole ntkathole force-pushed the perf/parallel-dynamodb-batch-reads branch from 6e04b50 to 91a5bdb Compare March 4, 2026 04:13
@abhijeet-dhumal abhijeet-dhumal force-pushed the perf/parallel-dynamodb-batch-reads branch from 91a5bdb to 8ead314 Compare March 4, 2026 06:16
@ntkathole ntkathole merged commit 9699944 into feast-dev:master Mar 4, 2026
34 of 36 checks passed
franciscojavierarceo pushed a commit that referenced this pull request Mar 10, 2026
# [0.61.0](v0.60.0...v0.61.0) (2026-03-10)

### Bug Fixes

* Add grpcio dependency group to transformation server Dockerfile ([2c2150a](2c2150a))
* Add https readiness check for rest-registry tests ([ea85e63](ea85e63))
* Add website build check for PRs and fix blog frontmatter YAML error ([#6079](#6079)) ([30a3a43](30a3a43))
* Added MLflow metric charts across feature selection ([#6080](#6080)) ([a403361](a403361))
* Check duplicate names for feature view across types ([#5999](#5999)) ([95b9af8](95b9af8))
* Fix integration tests ([#6046](#6046)) ([02d5548](02d5548))
* Fix non-specific label selector on metrics service ([a1a160d](a1a160d))
* Fixed IntegrityError on SqlRegistry ([#6047](#6047)) ([325e148](325e148))
* Fixed pre-commit check ([114b7db](114b7db))
* Fixed uv cache permission error for docker build on mac ([ad807be](ad807be))
* Fixes a `PydanticDeprecatedSince20` warning for trino_offline_store ([#5991](#5991)) ([abfd18a](abfd18a))
* Integration test failures ([#6040](#6040)) ([9165870](9165870))
* Ray offline store tests are duplicated across 3 workflows ([54f705a](54f705a))
* Reenable tests ([#6036](#6036)) ([82ee7f8](82ee7f8))
* Use commitlint pre-commit hook instead of a separate action ([35a81e7](35a81e7))

### Features

* Add complex type support (Map, JSON, Struct) with schema validation ([#5974](#5974)) ([1200dbf](1200dbf))
* Add materialization, feature freshness, request latency, and push metrics to feature server ([2c6be18](2c6be18))
* Add non-entity retrieval support for ClickHouse offline store ([4d08ddc](4d08ddc)), closes [#5835](#5835)
* Add OnlineStore for MongoDB ([#6025](#6025)) ([bf4e3fa](bf4e3fa)), closes [golang/go#74462](golang/go#74462)
* Added CodeQL SAST scanning and detect-secrets pre-commit hook ([547b516](547b516))
* Adding optional name to Aggregation (feast-dev[#5994](#5994)) ([#6083](#6083)) ([56469f7](56469f7))
* Feature Server High-Availability on Kubernetes ([#6028](#6028)) ([9c07b4c](9c07b4c))
* **go:** Implement metrics and tracing for http and grpc servers ([#5925](#5925)) ([2b4ec9a](2b4ec9a))
* Horizontal scaling support to the Feast operator ([#6000](#6000)) ([3ec13e6](3ec13e6))
* Making feature view source optional (feast-dev[#6074](#6074)) ([#6075](#6075)) ([76917b7](76917b7))
* Support arm docker build ([#6061](#6061)) ([1e1f5d9](1e1f5d9))
* Use orjson for faster JSON serialization in feature server ([6f5203a](6f5203a))

### Performance Improvements

* Optimize protobuf parsing in Redis online store ([#6023](#6023)) ([59dfdb8](59dfdb8))
* Optimize timestamp conversion in _convert_rows_to_protobuf ([33a2e95](33a2e95))
* Parallelize DynamoDB batch reads in sync online_read ([#6024](#6024)) ([9699944](9699944))
* Remove redundant entity key serialization in online_read ([d87283f](d87283f))
ntkathole pushed a commit to red-hat-data-services/feast that referenced this pull request Mar 16, 2026
…#6024)

* perf: Parallelize DynamoDB batch reads in sync online_read

Signed-off-by: abhijeet-dhumal <[email protected]>

* test: add unit tests for DynamoDB parallel batch reads

Signed-off-by: abhijeet-dhumal <[email protected]>

* fix: address thread-safety and max_workers issues in parallel DynamoDB reads

Signed-off-by: abhijeet-dhumal <[email protected]>

* fix: Improve DynamoDB parallel reads: shared client, configurable workers

Signed-off-by: abhijeet-dhumal <[email protected]>

* docs: add max_read_workers config documentation for DynamoDB online store

Signed-off-by: abhijeet-dhumal <[email protected]>

* docs: Fix default max worker count in docs

Signed-off-by: abhijeet-dhumal <[email protected]>

---------

Signed-off-by: abhijeet-dhumal <[email protected]>
ntkathole pushed a commit to red-hat-data-services/feast that referenced this pull request Mar 16, 2026
# [0.61.0](feast-dev/feast@v0.60.0...v0.61.0) (2026-03-10)

### Bug Fixes

* Add grpcio dependency group to transformation server Dockerfile ([2c2150a](feast-dev@2c2150a))
* Add https readiness check for rest-registry tests ([ea85e63](feast-dev@ea85e63))
* Add website build check for PRs and fix blog frontmatter YAML error ([feast-dev#6079](feast-dev#6079)) ([30a3a43](feast-dev@30a3a43))
* Added MLflow metric charts across feature selection ([feast-dev#6080](feast-dev#6080)) ([a403361](feast-dev@a403361))
* Check duplicate names for feature view across types ([feast-dev#5999](feast-dev#5999)) ([95b9af8](feast-dev@95b9af8))
* Fix integration tests ([feast-dev#6046](feast-dev#6046)) ([02d5548](feast-dev@02d5548))
* Fix non-specific label selector on metrics service ([a1a160d](feast-dev@a1a160d))
* Fixed IntegrityError on SqlRegistry ([feast-dev#6047](feast-dev#6047)) ([325e148](feast-dev@325e148))
* Fixed pre-commit check ([114b7db](feast-dev@114b7db))
* Fixed uv cache permission error for docker build on mac ([ad807be](feast-dev@ad807be))
* Fixes a `PydanticDeprecatedSince20` warning for trino_offline_store ([feast-dev#5991](feast-dev#5991)) ([abfd18a](feast-dev@abfd18a))
* Integration test failures ([feast-dev#6040](feast-dev#6040)) ([9165870](feast-dev@9165870))
* Ray offline store tests are duplicated across 3 workflows ([54f705a](feast-dev@54f705a))
* Reenable tests ([feast-dev#6036](feast-dev#6036)) ([82ee7f8](feast-dev@82ee7f8))
* Use commitlint pre-commit hook instead of a separate action ([35a81e7](feast-dev@35a81e7))

### Features

* Add complex type support (Map, JSON, Struct) with schema validation ([feast-dev#5974](feast-dev#5974)) ([1200dbf](feast-dev@1200dbf))
* Add materialization, feature freshness, request latency, and push metrics to feature server ([2c6be18](feast-dev@2c6be18))
* Add non-entity retrieval support for ClickHouse offline store ([4d08ddc](feast-dev@4d08ddc)), closes [feast-dev#5835](feast-dev#5835)
* Add OnlineStore for MongoDB ([feast-dev#6025](feast-dev#6025)) ([bf4e3fa](feast-dev@bf4e3fa)), closes [golang/go#74462](golang/go#74462)
* Added CodeQL SAST scanning and detect-secrets pre-commit hook ([547b516](feast-dev@547b516))
* Adding optional name to Aggregation (feast-dev[feast-dev#5994](feast-dev#5994)) ([feast-dev#6083](feast-dev#6083)) ([56469f7](feast-dev@56469f7))
* Feature Server High-Availability on Kubernetes ([feast-dev#6028](feast-dev#6028)) ([9c07b4c](feast-dev@9c07b4c))
* **go:** Implement metrics and tracing for http and grpc servers ([feast-dev#5925](feast-dev#5925)) ([2b4ec9a](feast-dev@2b4ec9a))
* Horizontal scaling support to the Feast operator ([feast-dev#6000](feast-dev#6000)) ([3ec13e6](feast-dev@3ec13e6))
* Making feature view source optional (feast-dev[feast-dev#6074](feast-dev#6074)) ([feast-dev#6075](feast-dev#6075)) ([76917b7](feast-dev@76917b7))
* Support arm docker build ([feast-dev#6061](feast-dev#6061)) ([1e1f5d9](feast-dev@1e1f5d9))
* Use orjson for faster JSON serialization in feature server ([6f5203a](feast-dev@6f5203a))

### Performance Improvements

* Optimize protobuf parsing in Redis online store ([feast-dev#6023](feast-dev#6023)) ([59dfdb8](feast-dev@59dfdb8))
* Optimize timestamp conversion in _convert_rows_to_protobuf ([33a2e95](feast-dev@33a2e95))
* Parallelize DynamoDB batch reads in sync online_read ([feast-dev#6024](feast-dev#6024)) ([9699944](feast-dev@9699944))
* Remove redundant entity key serialization in online_read ([d87283f](feast-dev@d87283f))