Releases: moreh-dev/mif

v0.4.0

09 Apr 14:43
d17fc91

Dependency Version Changes

MIF Helm Charts

Component                  v0.3.0    v0.4.0
moai-inference-framework             v0.4.0
moai-inference-preset                v0.5.0

Core Components

Component            v0.3.0    v0.4.0
Odin                 v0.6.0    v0.8.0
Odin CRD             v0.6.0    v0.8.0
Heimdall                       v0.7.1
heimdall-proxy                 v0.7.0
LWS                  0.7.0     0.8.0
moreh-vLLM preset              0.15.0-260226-rc2
Istio                          1.29.1

Infrastructure Dependencies

Bundled as sub-charts in moai-inference-framework:

Component                 v0.3.0    v0.4.0
kube-prometheus-stack     80.7.0    80.7.0
KEDA                      2.18.0    2.18.0
kubernetes-replicator     2.12.2    2.12.2
Loki                                6.30.0
Vector                              0.39.0
MinIO                               5.4.0
Node Feature Discovery              0.18.3
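
As a quick way to see which pinned dependencies actually moved between the two releases, the rows that list a version in both columns can be compared programmatically. The following is a minimal sketch: the helper names (`parse_version`, `upgraded_components`) are hypothetical, and the data is copied only from rows above that show both a v0.3.0 and a v0.4.0 value.

```python
def parse_version(text):
    """Turn 'v0.8.0' or '0.8.0' into a tuple of ints such as (0, 8, 0)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


# (v0.3.0 value, v0.4.0 value), copied from the tables in these release notes.
# Rows that list only a single version are left out on purpose.
VERSIONS = {
    "Odin": ("v0.6.0", "v0.8.0"),
    "Odin CRD": ("v0.6.0", "v0.8.0"),
    "LWS": ("0.7.0", "0.8.0"),
    "kube-prometheus-stack": ("80.7.0", "80.7.0"),
    "KEDA": ("2.18.0", "2.18.0"),
    "kubernetes-replicator": ("2.12.2", "2.12.2"),
}


def upgraded_components(versions):
    """Return the sorted names of components whose version increased."""
    return sorted(
        name
        for name, (old, new) in versions.items()
        if parse_version(new) > parse_version(old)
    )


if __name__ == "__main__":
    for name in upgraded_components(VERSIONS):
        old, new = VERSIONS[name]
        print(f"{name}: {old} -> {new}")
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as "0.10.0" sorting before "0.9.0".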

Highlights

Observability Stack

  • Integrated Loki + MinIO + Vector for centralized log collection (#64, #67, #69)
  • Added Heimdall TTFT, NTPOT, ITL metric panels to Grafana dashboard (#72)
  • Added AMD GPU usage monitor dashboard (#71)
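
For readers unfamiliar with the latency metrics named above, the sketch below shows how they are commonly derived from per-token timestamps. It assumes the usual definitions: TTFT as time to first token, ITL as the mean gap between consecutive output tokens, and NTPOT as end-to-end latency divided by the number of output tokens. Heimdall's exact definitions are not stated in these notes and may differ, and the function name `latency_metrics` is hypothetical.

```python
from statistics import mean


def latency_metrics(request_start, token_times):
    """Compute TTFT, mean ITL, and NTPOT (all in seconds) for one request.

    request_start: time the request was received.
    token_times:   arrival time of each output token, in order.
    The definitions used here are assumptions, not Heimdall's documented ones.
    """
    if not token_times:
        raise ValueError("need at least one output token timestamp")
    # Time to first token: delay before the first output token arrives.
    ttft = token_times[0] - request_start
    # Inter-token latency: mean gap between consecutive output tokens.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    itl = mean(gaps) if gaps else 0.0
    # Normalized time per output token: total latency spread over all tokens.
    ntpot = (token_times[-1] - request_start) / len(token_times)
    return {"ttft": ttft, "itl": itl, "ntpot": ntpot}


if __name__ == "__main__":
    print(latency_metrics(0.0, [0.5, 0.75, 1.0, 1.25]))
```

Under these definitions NTPOT includes the first-token delay while ITL does not, so the two can diverge noticeably for short responses.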

Hardware Support

  • Added Node Feature Discovery integration with automatic accelerator detection (#62, #77)
  • Added NodeFeatureRule for NVIDIA accelerators (H100 NVL, L40S, and more) (#97, #103)
  • Guarded NodeFeatureRule with nfd.enabled condition (#92)

Preset Expansion

  • Added presets for DeepSeek-R1 max throughput (#46)
  • Added InferenceServiceTemplates for Qwen and other models (#45, #49)
  • Added quickstart presets for MI250 / MI300X (#94, #100)
  • Added vLLM v0.15.1 and v0.17.0 E2E presets for H100 / H200 (#98, #101)
  • Added model and framework fields to preset templates (#84)
  • Added PYTHONHASHSEED env var to runtime-base templates (#86)
  • Standardized preset naming convention (#75)
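
These notes do not say why PYTHONHASHSEED was added to the runtime-base templates. One general reason to pin it, offered here as an assumption, is that Python randomizes `str` hashes per process unless the variable is set, so separate processes only agree on hash-derived values when they share a seed. The sketch below demonstrates that behavior with the standard library; the helper name and the sample string are hypothetical.

```python
import os
import subprocess
import sys


def string_hash_in_fresh_interpreter(text, seed):
    """Return hash(text) as computed by a new Python process.

    The child process runs with PYTHONHASHSEED set to `seed`, so the
    result does not depend on the current process's hash randomization.
    """
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    first = string_hash_in_fresh_interpreter("prefix-block", seed=0)
    second = string_hash_in_fresh_interpreter("prefix-block", seed=0)
    # Two independent processes with the same seed compute the same hash.
    print(first == second)
```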

Documentation (Website)

  • Set up the full Docusaurus documentation site (#54)
  • Added guides: Getting Started, Features, Operations, PD Disaggregation, Prefix Cache-Aware Routing (#59, #61, #74, #91, #93)
  • Added Heimdall plugin reference and multi-model serving guide (#57, #88)
  • Added metrics monitoring and log collection docs (#66, #69)
  • Added DeepSeek R1 max throughput blog post (#85)

E2E Testing

  • Enhanced E2E testing framework with functional tests (#81)
  • Added Heimdall tag support and inference service configurations (#56)
  • Refactored E2E workflows for performance and quality benchmarks (#51, #53)

Agent Skills

  • Introduced agent skills documentation and guides (#68)
  • Added bump-dependency skill (#80)

What's Changed

Full Changelog: v0.3.0...v0.4.0

v0.3.0

01 Feb 17:35
71623c2

What's Changed

  • MAF-19187: feat(config): add tenstorrent and nvidia devices to nfd rule by @hhk7734 in #33
  • MAF-19136: feat(deploy): add NVIDIA GPU to dashboard by @hhk7734 in #34
  • NO-ISSUE: feat(deploy): bump odin to v0.6.0 by @hhk7734 in #44

Full Changelog: v0.2.0...v0.3.0

v0.2.0

20 Jan 16:04
11a45b4

What's Changed

  • feat(deploy): add DeepSeek R1 MI300 data-parallel inference preset by @bongwoobak in #12
  • MAF-19096: feat(config): add nvidia a100 to accelerator rule by @hhk7734 in #13
  • MAF-19066: feat(test): add end-to-end testing framework and workflows by @seongsukwon-moreh in #14
  • MAF-19066: feat(e2e): enhance end-to-end testing framework and workflows by @seongsukwon-moreh in #15
  • MAF-19066: feat(e2e): add pd-disaggregation and run inference-perf afterwards by @seongsukwon-moreh in #16
  • MAF-19066: refactor(e2e): update CI workflow by @seongsukwon-moreh in #17
  • MAF-19066: refactor(e2e): enhance template rendering and structure for inference services by @seongsukwon-moreh in #18
  • NO-ISSUE: chore(workflow): Update pr-title-checker.yaml by @ibpark-moreh in #21
  • MAF-19066: fix(e2e): fix github action fail by @seongsukwon-moreh in #19
  • MAF-19139: feat(preset): add vllm pd dp bases by @hhk7734 in #22
  • MAF-19066: feat(e2e): add S3 configuration for inference-perf by @seongsukwon-moreh in #23
  • chore: refactor comment guidelines and add test section by @hhk7734 in #24
  • NO-ISSUE: style(e2e): indent go template syntax in yaml files by @hhk7734 in #25
  • NO-ISSUE: test(e2e): assume fully controlled cluster by @hhk7734 in #26
  • MAF-19141: feat(deploy): conditionally enable ecr-token-refresher and move init to pre-install by @hhk7734 in #27
  • NO-ISSUE: test(e2e): streamline gateway setup by @hhk7734 in #28
  • MAF-19057: feat(helm): add grafana dashboard by @seongsukwon-moreh in #20
  • NO-ISSUE: feat(deploy): bump odin and odin-crd to v0.5.1 by @hhk7734 in #29

New Contributors

  • @bongwoobak made their first contribution in #12
  • @seongsukwon-moreh made their first contribution in #14
  • @ibpark-moreh made their first contribution in #21

Full Changelog: v0.1.0...v0.2.0

v0.1.0

03 Jan 14:28
0b78e71

What's Changed

  • feat(deploy): bump odin and odin-crd to v0.4.0 by @hhk7734 in #7
  • chore: add CODEOWNERS by @hhk7734 in #8
  • feat(config): add moai-accelerator rule by @hhk7734 in #9
  • MAF-19069: feat(deploy): add moai-inference-preset chart by @hhk7734 in #10
  • MAF-19069: feat(deploy): bump odin and odin-crd to v0.5.0 by @hhk7734 in #11

Full Changelog: v0.0.1...v0.1.0

v0.0.1

30 Dec 02:06
3d2122c

What's Changed

  • feat(deploy): add moai-inference-framework helm chart by @hhk7734 in #1
  • feat(deploy): add odin to mif by @hhk7734 in #2
  • feat(deploy): add kube-prometheus-stack to mif by @hhk7734 in #3
  • feat(deploy): add keda to mif by @hhk7734 in #4
  • chore(workflow): add prod cd action by @hhk7734 in #5
  • chore(workflow): fix release action by @hhk7734 in #6

Full Changelog: https://github.com/moreh-dev/mif/commits/v0.0.1