Releases: microsoft/SynapseML
v1.1.3
What's Changed
- feat: Add token usage to OpenAIPrompt and OpenAIEmbedding by @ranadeepsingh in #2444
- chore: Remove redundant text message for file input in OpenAIPrompt by @levscaut in #2448
- fix: Make translation test less flaky by @BrendanWalsh in #2449
- fix: Fix geospatial test error caused by invalid auth by @BrendanWalsh in #2450
- fix: Remove test for deprecated bing search service by @BrendanWalsh in #2454
- feat: Add OpenAI API Type to OpenAIDefaults by @levscaut in #2453
- fix: Null cases for openai prompt and embeddings by @ranadeepsingh in #2457
- chore: Replace placeholder with file name in prompt template by @levscaut in #2456
- feat: OpenAI responses API: Add store and previous_response_id params by @ranadeepsingh in #2460
- feat: OpenAIPrompt Branchout ResponseId and Usage cols by @ranadeepsingh in #2461
- chore: remove redundant sas token from public test file by @BrendanWalsh in #2462
- chore: null value passthrough for path columns in OpenAIPrompt by @levscaut in #2463
- chore: Add argument to limit file size, put file error to errorCol by @levscaut in #2464
- fix: remove random monikers by @ranadeepsingh in #2465
- chore: Bump version to v1.1.1 by @smamindl in #2466
- chore: Bump version to v1.1.2 by @smamindl in #2468
- fix: Master release bug fixes by @BrendanWalsh in #2485
- chore: Add granular CI job toggles and verification scripts by @BrendanWalsh in #2486
- ci: Speed up builds with coverage skip and pre-built horovod by @BrendanWalsh in #2487
- fix: Make TranslateSuite resilient to API translation changes by @BrendanWalsh in #2491
- fix: Remove overly tight polling timeout from AbstractiveSummarizationSuite by @BrendanWalsh in #2493
- ci: Skip WebsiteAutoDeployment on scheduled builds by @BrendanWalsh in #2492
- chore: Remove deprecated Bing Search API v7 components by @BrendanWalsh in #2494
- test: Port Fabric E2E tests from SynapseML-Internal by @BrendanWalsh in #2495
- feat: Add attributes parameter to logToCertifiedEvents by @BrendanWalsh in #2496
- test: add 145 unit tests for 22 untested core source files by @BrendanWalsh in #2497
- fix: use BUILD_SOURCEBRANCH tag for version when available by @BrendanWalsh in #2499
- ci: switch dead link checker from wget to lychee by @BrendanWalsh in #2501
- ci: fix security issues in reopen-issue-on-comment workflow by @BrendanWalsh in #2502
- ci: add dependency review and PR validation workflows by @BrendanWalsh in #2503
- ci: remove unused SYNAPSE_ENVIRONMENT pipeline parameter by @BrendanWalsh in #2500
- ci: add copilot-instructions.md for AI coding agents by @BrendanWalsh in #2504
- ci: migrate website deployment to GitHub Actions by @BrendanWalsh in #2505
- fix: OpenAI Prompt output for reasoning models by @ranadeepsingh in #2510
- chore: Updated reasoning_effort tests by @ranadeepsingh in #2511
- fix: Prevent Python RCE via `__import__` on metadata classnames by @ranadeepsingh in #2514
- fix: Mitigate unsafe Java deserialization (CWE-502) by @ranadeepsingh in #2513
- fix: Fix CVE-2023-44487 in mmlspark/release Docker image by @BrendanWalsh in #2520
- chore(deps): bump brace-expansion from 1.1.11 to 1.1.13 in /website by @dependabot[bot] in #2530
- chore(deps): bump lodash from 4.17.21 to 4.17.23 in /website by @dependabot[bot] in #2488
- chore(deps): bump svgo from 2.8.0 to 2.8.2 in /website by @dependabot[bot] in #2509
- chore(deps): bump yaml from 1.10.2 to 1.10.3 in /website by @dependabot[bot] in #2521
- chore(deps): bump node-forge from 1.3.1 to 1.4.0 in /website by @dependabot[bot] in #2523
- chore(deps): bump picomatch from 2.3.1 to 2.3.2 in /website by @dependabot[bot] in #2522
- fix(ci): make dead links workflow robust against silent curl failures by @BrendanWalsh in #2534
- ci: fix CodeQL Python analysis failure — upgrade to v4 and set explicit source-root by @BrendanWalsh in #2535
- perf: split speech tests into 2 parallel jobs by @BrendanWalsh in #2528
- perf: split Databricks E2E into 3 parallel CPU partitions, tune timeouts by @BrendanWalsh in #2527
- docs: update LightGBM links by @jameslamb in #2512
- fix: migrate Docker push from deprecated MSI to WIF service connection by @BrendanWalsh in #2536
- feat: add maxCompletionTokens param for reasoning model compatibility by @BrendanWalsh in #2531
- fix: improve 429 retry handling with exponential backoff by @BrendanWalsh in #2524
- perf: optimize serialization tests for HTTP-backed services by @BrendanWalsh in #2525
- chore: Add version bump automation script by @smamindl in #2519
- test: fix flaky key-order assertions in structured output tests by @BrendanWalsh in #2537
- fix: unblock master build — 3 test/style fixes from #2531 by @BrendanWalsh in #2538
- fix: stop retrying 429s caused by Fabric capacity limits by @BrendanWalsh in #2539
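Several of the fixes above (#2524, #2539) concern retry behavior for HTTP 429 responses: back off exponentially on transient throttling, but stop immediately when the 429 signals a hard capacity limit. A minimal sketch of that pattern, with illustrative names rather than SynapseML's actual internals:

```python
import time

def call_with_backoff(request, max_retries=5, base_delay=1.0,
                      is_retryable=lambda resp: True, sleep=time.sleep):
    """Retry `request` on HTTP 429 with exponential backoff.

    `request` returns an object with a `status` field. 429 responses that
    `is_retryable` rejects (e.g. a hard capacity limit) fail immediately
    instead of burning through the retry budget.
    """
    for attempt in range(max_retries):
        resp = request()
        if resp.status != 429:
            return resp
        if not is_retryable(resp):
            raise RuntimeError("429 is not retryable (capacity limit)")
        sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    raise RuntimeError("exhausted retries")
```

A real client would also honor any `Retry-After` header the service returns rather than relying on the computed delay alone.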
New Contributors
- @smamindl made their first contribution in #2466
- @jameslamb made their first contribution in #2512
Full Changelog: v1.1.0...v1.1.3
SynapseML v1.1.0
We are excited to announce the release of SynapseML v1.1, marking a host of powerful new features introduced since the initial v1.0 release. SynapseML is an open-source library that aims to streamline the development of massively scalable machine learning pipelines. It unifies several existing ML frameworks and new Microsoft algorithms in a single, scalable API usable across Python, R, Scala, and Java. SynapseML runs on any Apache Spark platform, with first-class enterprise support on Microsoft Fabric.
Highlights
| Microsoft Fabric | AI Functions | OneLake |
|---|---|---|
| Build and operationalize distributed ML with SynapseML in Fabric | Apply Pandas and Spark LLM transformations with one line of code | Automatically derive AI insights for unstructured data in OneLake |
| Build Your First Model | Explore AI Functions | Learn More |
| Hugging Face | Azure AI Foundry |
|---|---|
| Use open source models hosted on Hugging Face | Run Azure AI Foundry models in your notebook |
| Try an Example | View Notebook |
More Highlights
Spark 3.5 Support – In this version we transitioned to Spark 3.5 as our main Spark platform.
OpenAI Ecosystem – Comprehensive improvements including global parameter defaults, GPT-4 enablement, custom endpoints/headers, GPU-accelerated embeddings with KNN, and fine-grained control over model parameters (top_p, seed, responseFormat, temperature).
ML Innovation – HuggingFaceCausalLM transformer for distributed language model evaluation, custom embedder support, and synthetic difference-in-differences causal inference module.
Platform features – Spark Native OneLake support; MSI for Azure Storage; OpenAITranslate transformer.
AI Functions in Data Wrangler on Fabric – AI Functions built into Data Wrangler in Fabric allow you to apply LLM-powered operations to your dataframe without writing a single line of code.
New Features
Documentation 📚
- AI Functions
- AI Powered Transforms in OneLake
- Azure OpenAI for Big Data in SynapseML
- AI Functions in Data Wrangler
- AI Foundry
- Hugging Face
AI Functions ⚡
- Added support for AI Functions in Pandas (#1579613, #1585011, #1596611, #1509195, #1501185, #1494610, #1494951)
- Added support for AI Functions in PySpark (#1460790, #1572928, #1599735, #1439858, #1463533)
- Added support for async AI Functions execution (#1529058, #1523727)
- JSON response support & improved language validation. (#1551823, #1566189)
- Seed param for reproducible chat/completions (API 2024-10-21). (#1551883)
- Fuzzy case-insensitive matching for Classify. (#1515064)
- Add AI Functions Operations to Data Wrangler (#1590130, #1638257, #1718967, #1725101, #1730446)
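The fuzzy, case-insensitive matching added to Classify (#1515064) addresses the common case where a model returns a label that differs from the configured categories only in casing, whitespace, or minor spelling. A sketch of that idea using only the standard library (this is an illustration of the technique, not the Fabric implementation):

```python
import difflib

def match_label(model_output, categories):
    """Map a model's free-text label onto one of `categories`.

    Tries an exact case-insensitive match first, then falls back to
    closest-match fuzzy comparison; returns None if nothing is close.
    """
    lowered = {c.lower(): c for c in categories}
    key = model_output.strip().lower()
    if key in lowered:
        return lowered[key]
    close = difflib.get_close_matches(key, lowered, n=1, cutoff=0.8)
    return lowered[close[0]] if close else None
```

The cutoff keeps genuinely unrelated outputs (hallucinated labels) from being coerced into a category, which is usually preferable to a silent wrong match.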
Azure OpenAI 🌸
- Enhanced Model Parameters – Added top_p, seed, responseFormat, temperature, and subscription key support (#2410, #2329, #2324)
- GPT-4 Enablement – Full GPT-4 support in OpenAIPrompt (#2248)
- Custom Endpoints & Headers – Support for custom URL endpoints and HTTP headers (#2232)
- GPU-Accelerated Embeddings – OpenAI embeddings with GPU-based KNN pipeline (#2157)
- Embedding Dimensions Control – Configurable dimensions parameter for OpenAIEmbedding (#2215)
- Global Parameter Defaults – Centralized OpenAI parameter management with Python wrapper support (#2318, #2327)
- Updated OpenAI API version to 2024 (#2190)
- Updated OpenAIDefaults implementation (#2415)
- OpenAIPrompt bug fixes and improvements (#2334)
- Added responseFormat parameter to Chat Completion (#2329)
- Optimized getOptionalParams in HasOpenAITextParams (#2315)
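The global parameter defaults above (#2318, #2327, #2415) let users set values such as `temperature` or a deployment name once and have every OpenAI transformer fall back to them. As a generic illustration of that defaults-registry pattern (class and method names here are hypothetical, not the SynapseML `OpenAIDefaults` API):

```python
class Defaults:
    """A process-wide registry of fallback parameter values."""
    _values = {}

    @classmethod
    def set(cls, name, value):
        cls._values[name] = value

    @classmethod
    def resolve(cls, name, explicit=None):
        # A parameter set explicitly on a transformer always wins
        # over the global default; unset parameters resolve to None.
        if explicit is not None:
            return explicit
        return cls._values.get(name)

Defaults.set("temperature", 0.0)
Defaults.set("deployment_name", "gpt-4o")
```

The key design choice is the resolution order: explicit per-transformer settings shadow the globals, so defaults never silently override user intent.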
OneLake 🌊
- Add Spark Native OneLake support (#1190687)
Machine Learning 🕸️
- HuggingFaceCausalLM – Transformer for evaluating language models on Spark clusters (#2301)
- Custom Embedder – Extensible custom embedding transformer support (#2236)
- Synthetic DiD – Synthetic difference-in-differences module for causal inference (#2095)
Azure AI Foundry 🔨
- AIFoundryChatCompletion – New transformer for Azure AI Foundry chat models (#2398)
- AI Foundry + OpenAI Prompt – Unified interface for OpenAI and Foundry deployments (#2404)
General ✨
- Add Spark 3.5 Support – Added full Spark 3.5 compatibility with new build variants (#2052)
- Python 3.11 Baseline – Upgraded to Python 3.11 as minimum version (#2193)
- Fabric Billing Integration – Enhanced Fabric Cognitive Service token for billing support (#2291)
- Fabric WSPL FQDN Selection – Configurable Fabric workspace FQDN endpoints (#2376)
- Added Bool input support for ONNX models (#2130…)
SynapseML v1.0.14
Changes:
- 123ead2 chore: bump to v1.0.14 (#2418)
- 1ceb3b1 chore: update OpenAIDefaults (#2415)
- 26220da feat: add aifoundry to openai prompt (#2404)
- aac2ed6 fix: fix error handling in networking layer (#2412)
- 7c2f22a feat: Add top_p and seed params to OpenAIDefaults (#2410)
- a9133aa noop commit to Stop library releases in build
- a394061 chore: Bump library to version to v1.0.13 (#2405)
- 85a6687 chore: fix propagation of fabric telemetry (#2403)
- 298c7ed chore: Add devcontainer configuration for vscode and Copilot (#2400)
- 873884d feat: Add AIFoundaryChatCompletion (#2398)
SynapseML v1.0.14 Spark 3.5
v1.0.14-spark3.5 Enabling Synapse 3.5 tests
SynapseML v1.0.13
Changes:
- a394061 chore: Bump library to version to v1.0.13 (#2405)
- 85a6687 chore: fix propagation of fabric telemetry (#2403)
- 298c7ed chore: Add devcontainer configuration for vscode and Copilot (#2400)
- 873884d feat: Add AIFoundaryChatCompletion (#2398)
- 83ebb5a Update README.md
This list of changes was auto generated.
SynapseML v1.0.13-spark3.5
Enabling Synapse 3.5 tests
v1.0.12-spark3.5
Enabling Synapse 3.5 tests
SynapseML v1.0.12
Changes:
- 9c91148 chore: bump to v1.0.12 (#2397)
- e71fed8 chore: fix vw benchmarks (#2396)
- 2b23d7c chore: split deepLearning tests (#2387)
- 1639b14 fix: testing telemetry-properties header (#2375)
- 9e0fde4 chore: Fixing logic for polling in LRO tasks (#2389)
- b672aa4 chore: Updating codecov pipeline step (#2391)
- 84d7d65 feat: Support Choosing Fabric WSPL FQDN (#2376)
- ffa0383 docs: default dataTransferMode is streaming, not bulk (#2377)
- 04d9dc4 fix: auto-convert DateType/TimestampType to ISO8601 in Azure Search (#2381)
- 5bbbbc7 chore: fix failing openai tests (#2385)
See More
- 9af855e fix: support scoring profiles in Azure Search index parsing (#2383)
- 326988d chore: update sbt version to allow for amd local builds (#2384)
- 3865e71 fix: fix model checking logic (#2379)
- 1eec70d fix: fix bug where token cannot be acquired on system context (#2378)
- 141039b fix: add hf causal LM python tests, fix build (#2374)
- 6c95bf0 fix: add case for Python only envs (#2368)
- c1cef65 chore: limit adb concurrency (#2370)
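One fix above (04d9dc4, #2381) auto-converts DateType/TimestampType values to ISO 8601 strings before they are written to an Azure Search index. Outside Spark, the same normalization can be sketched with the standard library (illustrative only; the assumption that naive timestamps are UTC is a convention, not part of the fix):

```python
from datetime import date, datetime, timezone

def to_iso8601(value):
    """Render date/datetime values as ISO 8601 strings; pass others through.

    datetime is checked before date because datetime is a date subclass.
    Naive datetimes are assumed to be UTC.
    """
    if isinstance(value, datetime):
        if value.tzinfo is None:
            value = value.replace(tzinfo=timezone.utc)
        return value.isoformat()
    if isinstance(value, date):
        return value.isoformat()
    return value
```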
This list of changes was auto generated.
SynapseML v1.0.11
Changes:
- 58b945f chore: fix yml (#2373)
- 2ec8db4 chore:fix release (#2372)
- 33b5a39 chore: bump to v1.0.11 (#2371)
- e12b0a1 fix: Adding trailing / to URLs set (#2364)
- 53b76a8 feat: Add HuggingFaceCausalLM Transformer for evaluating language models on cluster (#2301)
- 25b49cd chore: free up space in build (#2365)
- 26baa09 fix: Cannot Load LightGBM Model When Placed in a Spark Pipeline with Custom Transformers (#2357)
- b7971eb chore: Making unit tests green again (#2360)
- 57d15ec chore: Removing unnecessary logs (#2359)
- 6c9c9ce Add correct workspace ID for spark jobs (#2358)
See More
- a4dec08 fix: update openai compeletion doc, fix failed OpenAI test (#2348) [ #2351, #2353, #2354 ]
- 7878fec chore: streamline cache fix (#2354)
- 8787c82 chore: free up space on build machines prior to caching (#2353)
- b2f4080 Make sure library isnt released twice
- 0240895 chore: bump synapseml to v1.0.10 (#2351)
- 1b5df70 feat: Adding capability use Cognitive Service Language Service asynchronously for Summarization (#2342)
- bab6aed chore: fix github release yml (#2339)
This list of changes was auto generated.
SynapseML v1.0.11-spark3.5
chore: Adding Spark35 support