Releases: microsoft/SynapseML
v1.1.3
What's Changed
- feat: Add token usage to OpenAIPrompt and OpenAIEmbedding by @ranadeepsingh in #2444
- chore: Remove redundant text message for file input in OpenAIPrompt by @levscaut in #2448
- fix: Make translation test less flaky by @BrendanWalsh in #2449
- fix: Fix geospatial test error caused by invalid auth by @BrendanWalsh in #2450
- fix: Remove test for deprecated bing search service by @BrendanWalsh in #2454
- feat: Add OpenAI API Type to OpenAIDefaults by @levscaut in #2453
- fix: Null cases for openai prompt and embeddings by @ranadeepsingh in #2457
- chore: Replace placeholder with file name in prompt template by @levscaut in #2456
- feat: OpenAI responses API: Add store and previous_response_id params by @ranadeepsingh in #2460
- feat: OpenAIPrompt Branchout ResponseId and Usage cols by @ranadeepsingh in #2461
- chore: remove redundant sas token from public test file by @BrendanWalsh in #2462
- chore: null value passthrough for path columns in OpenAIPrompt by @levscaut in #2463
- chore: Add argument to limit file size, put file error to errorCol by @levscaut in #2464
- fix: remove random monikers by @ranadeepsingh in #2465
- chore: Bump version to v1.1.1 by @smamindl in #2466
- chore: Bump version to v1.1.2 by @smamindl in #2468
- fix: Master release bug fixes by @BrendanWalsh in #2485
- chore: Add granular CI job toggles and verification scripts by @BrendanWalsh in #2486
- ci: Speed up builds with coverage skip and pre-built horovod by @BrendanWalsh in #2487
- fix: Make TranslateSuite resilient to API translation changes by @BrendanWalsh in #2491
- fix: Remove overly tight polling timeout from AbstractiveSummarizationSuite by @BrendanWalsh in #2493
- ci: Skip WebsiteAutoDeployment on scheduled builds by @BrendanWalsh in #2492
- chore: Remove deprecated Bing Search API v7 components by @BrendanWalsh in #2494
- test: Port Fabric E2E tests from SynapseML-Internal by @BrendanWalsh in #2495
- feat: Add attributes parameter to logToCertifiedEvents by @BrendanWalsh in #2496
- test: add 145 unit tests for 22 untested core source files by @BrendanWalsh in #2497
- fix: use BUILD_SOURCEBRANCH tag for version when available by @BrendanWalsh in #2499
- ci: switch dead link checker from wget to lychee by @BrendanWalsh in #2501
- ci: fix security issues in reopen-issue-on-comment workflow by @BrendanWalsh in #2502
- ci: add dependency review and PR validation workflows by @BrendanWalsh in #2503
- ci: remove unused SYNAPSE_ENVIRONMENT pipeline parameter by @BrendanWalsh in #2500
- ci: add copilot-instructions.md for AI coding agents by @BrendanWalsh in #2504
- ci: migrate website deployment to GitHub Actions by @BrendanWalsh in #2505
- fix: OpenAI Prompt output for reasoning models by @ranadeepsingh in #2510
- chore: Updated reasoning_effort tests by @ranadeepsingh in #2511
- fix: Prevent Python RCE via `__import__` on metadata classnames by @ranadeepsingh in #2514
- fix: Mitigate unsafe Java deserialization (CWE-502) by @ranadeepsingh in #2513
- fix: Fix CVE-2023-44487 in mmlspark/release Docker image by @BrendanWalsh in #2520
- chore(deps): bump brace-expansion from 1.1.11 to 1.1.13 in /website by @dependabot[bot] in #2530
- chore(deps): bump lodash from 4.17.21 to 4.17.23 in /website by @dependabot[bot] in #2488
- chore(deps): bump svgo from 2.8.0 to 2.8.2 in /website by @dependabot[bot] in #2509
- chore(deps): bump yaml from 1.10.2 to 1.10.3 in /website by @dependabot[bot] in #2521
- chore(deps): bump node-forge from 1.3.1 to 1.4.0 in /website by @dependabot[bot] in #2523
- chore(deps): bump picomatch from 2.3.1 to 2.3.2 in /website by @dependabot[bot] in #2522
- fix(ci): make dead links workflow robust against silent curl failures by @BrendanWalsh in #2534
- ci: fix CodeQL Python analysis failure — upgrade to v4 and set explicit source-root by @BrendanWalsh in #2535
- perf: split speech tests into 2 parallel jobs by @BrendanWalsh in #2528
- perf: split Databricks E2E into 3 parallel CPU partitions, tune timeouts by @BrendanWalsh in #2527
- docs: update LightGBM links by @jameslamb in #2512
- fix: migrate Docker push from deprecated MSI to WIF service connection by @BrendanWalsh in #2536
- feat: add maxCompletionTokens param for reasoning model compatibility by @BrendanWalsh in #2531
- fix: improve 429 retry handling with exponential backoff by @BrendanWalsh in #2524
- perf: optimize serialization tests for HTTP-backed services by @BrendanWalsh in #2525
- chore: Add version bump automation script by @smamindl in #2519
- test: fix flaky key-order assertions in structured output tests by @BrendanWalsh in #2537
- fix: unblock master build — 3 test/style fixes from #2531 by @BrendanWalsh in #2538
- fix: stop retrying 429s caused by Fabric capacity limits by @BrendanWalsh in #2539
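Several of the fixes above (#2524, #2539) concern retry behavior for HTTP 429 responses: back off exponentially on transient throttling, but stop immediately when the 429 signals a hard capacity limit. A minimal sketch of that pattern, with illustrative names rather than SynapseML's actual internals:

```python
import time

def call_with_backoff(request, max_retries=5, base_delay=1.0,
                      is_retryable=lambda resp: True, sleep=time.sleep):
    """Retry `request` on HTTP 429 with exponential backoff.

    `request` returns an object with a `status` field. 429 responses that
    `is_retryable` rejects (e.g. a hard capacity limit) fail immediately
    instead of burning through the retry budget.
    """
    for attempt in range(max_retries):
        resp = request()
        if resp.status != 429:
            return resp
        if not is_retryable(resp):
            raise RuntimeError("429 is not retryable (capacity limit)")
        sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    raise RuntimeError("exhausted retries")
```

A real client would also honor any `Retry-After` header the service returns rather than relying on the computed delay alone.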
New Contributors
- @smamindl made their first contribution in #2466
- @jameslamb made their first contribution in #2512
Full Changelog: v1.1.0...v1.1.3
SynapseML v1.1.0
We are excited to announce the release of SynapseML v1.1, marking a host of powerful new features introduced since the initial v1.0 release. SynapseML is an open-source library that aims to streamline the development of massively scalable machine learning pipelines. It unifies several existing ML frameworks and new Microsoft algorithms in a single, scalable API usable across Python, R, Scala, and Java. SynapseML runs on any Apache Spark platform, with first-class enterprise support on Microsoft Fabric.
Highlights
| Microsoft Fabric | AI Functions | OneLake |
|---|---|---|
| Build and operationalize distributed ML with SynapseML in Fabric | Apply Pandas and Spark LLM transformations with one line of code | Automatically derive AI insights for unstructured data in OneLake |
| Build Your First Model | Explore AI Functions | Learn More |
| Hugging Face | Azure AI Foundry |
|---|---|
| Use open source models hosted on Hugging Face | Run Azure AI Foundry models in your notebook |
| Try an Example | View Notebook |
More Highlights
Spark 3.5 Support – In this version we transitioned to Spark 3.5 as our main Spark platform.
OpenAI Ecosystem – Comprehensive improvements including global parameter defaults, GPT-4 enablement, custom endpoints/headers, GPU-accelerated embeddings with KNN, and fine-grained control over model parameters (top_p, seed, responseFormat, temperature).
ML Innovation – HuggingFaceCausalLM transformer for distributed language model evaluation, custom embedder support, and synthetic difference-in-differences causal inference module.
Platform features – Spark Native OneLake support; MSI for Azure Storage; OpenAITranslate transformer.
AI Functions in Data Wrangler on Fabric – AI Functions built into Data Wrangler in Fabric allow you to apply LLM-powered operations to your dataframe without writing a single line of code.
New Features
Documentation 📚
- AI Functions
- AI Powered Transforms in OneLake
- Azure OpenAI for Big Data in SynapseML
- AI Functions in Data Wrangler
- AI Foundry
- Hugging Face
AI Functions ⚡
- Added support for AI Functions in Pandas (#1579613, #1585011, #1596611, #1509195, #1501185, #1494610, #1494951)
- Added support for AI Functions in PySpark (#1460790, #1572928, #1599735, #1439858, #1463533)
- Added support for async AI Functions execution (#1529058, #1523727)
- JSON response support & improved language validation. (#1551823, #1566189)
- Seed param for reproducible chat/completions (API 2024-10-21). (#1551883)
- Fuzzy case-insensitive matching for Classify. (#1515064)
- Add AI Functions Operations to Data Wrangler (#1590130, #1638257, #1718967, #1725101, #1730446)
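The fuzzy, case-insensitive matching added to Classify (#1515064) addresses the common case where a model returns a label that differs from the configured categories only in casing, whitespace, or minor spelling. A sketch of that idea using only the standard library (this is an illustration of the technique, not the Fabric implementation):

```python
import difflib

def match_label(model_output, categories):
    """Map a model's free-text label onto one of `categories`.

    Tries an exact case-insensitive match first, then falls back to
    closest-match fuzzy comparison; returns None if nothing is close.
    """
    lowered = {c.lower(): c for c in categories}
    key = model_output.strip().lower()
    if key in lowered:
        return lowered[key]
    close = difflib.get_close_matches(key, lowered, n=1, cutoff=0.8)
    return lowered[close[0]] if close else None
```

The cutoff keeps genuinely unrelated outputs (hallucinated labels) from being coerced into a category, which is usually preferable to a silent wrong match.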
Azure OpenAI 🌸
- Enhanced Model Parameters – Added top_p, seed, responseFormat, temperature, and subscription key support (#2410, #2329, #2324)
- GPT-4 Enablement – Full GPT-4 support in OpenAIPrompt (#2248)
- Custom Endpoints & Headers – Support for custom URL endpoints and HTTP headers (#2232)
- GPU-Accelerated Embeddings – OpenAI embeddings with GPU-based KNN pipeline (#2157)
- Embedding Dimensions Control – Configurable dimensions parameter for OpenAIEmbedding (#2215)
- Global Parameter Defaults – Centralized OpenAI parameter management with Python wrapper support (#2318, #2327)
- Updated OpenAI API version to 2024 (#2190)
- Updated OpenAIDefaults implementation (#2415)
- OpenAIPrompt bug fixes and improvements (#2334)
- Added responseFormat parameter to Chat Completion (#2329)
- Optimized getOptionalParams in HasOpenAITextParams (#2315)
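The global parameter defaults above (#2318, #2327, #2415) let users set values such as `temperature` or a deployment name once and have every OpenAI transformer fall back to them. As a generic illustration of that defaults-registry pattern (class and method names here are hypothetical, not the SynapseML `OpenAIDefaults` API):

```python
class Defaults:
    """A process-wide registry of fallback parameter values."""
    _values = {}

    @classmethod
    def set(cls, name, value):
        cls._values[name] = value

    @classmethod
    def resolve(cls, name, explicit=None):
        # A parameter set explicitly on a transformer always wins
        # over the global default; unset parameters resolve to None.
        if explicit is not None:
            return explicit
        return cls._values.get(name)

Defaults.set("temperature", 0.0)
Defaults.set("deployment_name", "gpt-4o")
```

The key design choice is the resolution order: explicit per-transformer settings shadow the globals, so defaults never silently override user intent.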
OneLake 🌊
- Add Spark Native OneLake support (#1190687)
Machine Learning 🕸️
- HuggingFaceCausalLM – Transformer for evaluating language models on Spark clusters (#2301)
- Custom Embedder – Extensible custom embedding transformer support (#2236)
- Synthetic DiD – Synthetic difference-in-differences module for causal inference (#2095)
Azure AI Foundry 🔨
- AIFoundryChatCompletion – New transformer for Azure AI Foundry chat models (#2398)
- AI Foundry + OpenAI Prompt – Unified interface for OpenAI and Foundry deployments (#2404)
General ✨
- Add Spark 3.5 Support – Added full Spark 3.5 compatibility with new build variants (#2052)
- Python 3.11 Baseline – Upgraded to Python 3.11 as minimum version (#2193)
- Fabric Billing Integration – Enhanced Fabric Cognitive Service token for billing support (#2291)
- Fabric WSPL FQDN Selection – Configurable Fabric workspace FQDN endpoints (#2376)
- Added Bool input support for ONNX models (#2130…)
SynapseML v1.0.14
Changes:
- 123ead2 chore: bump to v1.0.14 (#2418)
- 1ceb3b1 chore: update OpenAIDefaults (#2415)
- 26220da feat: add aifoundry to openai prompt (#2404)
- aac2ed6 fix: fix error handling in networking layer (#2412)
- 7c2f22a feat: Add top_p and seed params to OpenAIDefaults (#2410)
- a9133aa noop commit to Stop library releases in build
- a394061 chore: Bump library to version to v1.0.13 (#2405)
- 85a6687 chore: fix propagation of fabric telemetry (#2403)
- 298c7ed chore: Add devcontainer configuration for vscode and Copilot (#2400)
- 873884d feat: Add AIFoundaryChatCompletion (#2398)
SynapseML v1.0.14 Spark 3.5
v1.0.14-spark3.5 Enabling Synapse 3.5 tests
SynapseML v1.0.13
Changes:
- a394061 chore: Bump library to version to v1.0.13 (#2405)
- 85a6687 chore: fix propagation of fabric telemetry (#2403)
- 298c7ed chore: Add devcontainer configuration for vscode and Copilot (#2400)
- 873884d feat: Add AIFoundaryChatCompletion (#2398)
- 83ebb5a Update README.md
This list of changes was auto generated.
SynapseML v1.0.13-spark3.5
Enabling Synapse 3.5 tests
v1.0.12-spark3.5
Enabling Synapse 3.5 tests
SynapseML v1.0.12
Changes:
- 9c91148 chore: bump to v1.0.12 (#2397)
- e71fed8 chore: fix vw benchmarks (#2396)
- 2b23d7c chore: split deepLearning tests (#2387)
- 1639b14 fix: testing telemetry-properties header (#2375)
- 9e0fde4 chore: Fixing logic for polling in LRO tasks (#2389)
- b672aa4 chore: Updating codecov pipeline step (#2391)
- 84d7d65 feat: Support Choosing Fabric WSPL FQDN (#2376)
- ffa0383 docs: default dataTransferMode is streaming, not bulk (#2377)
- 04d9dc4 fix: auto-convert DateType/TimestampType to ISO8601 in Azure Search (#2381)
- 5bbbbc7 chore: fix failing openai tests (#2385)
See More
- 9af855e fix: support scoring profiles in Azure Search index parsing (#2383)
- 326988d chore: update sbt version to allow for amd local builds (#2384)
- 3865e71 fix: fix model checking logic (#2379)
- 1eec70d fix: fix bug where token cannot be acquired on system context (#2378)
- 141039b fix: add hf causal LM python tests, fix build (#2374)
- 6c95bf0 fix: add case for Python only envs (#2368)
- c1cef65 chore: limit adb concurrency (#2370)
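One fix above (04d9dc4, #2381) auto-converts DateType/TimestampType values to ISO 8601 strings before they are written to an Azure Search index. Outside Spark, the same normalization can be sketched with the standard library (illustrative only; the assumption that naive timestamps are UTC is a convention, not part of the fix):

```python
from datetime import date, datetime, timezone

def to_iso8601(value):
    """Render date/datetime values as ISO 8601 strings; pass others through.

    datetime is checked before date because datetime is a date subclass.
    Naive datetimes are assumed to be UTC.
    """
    if isinstance(value, datetime):
        if value.tzinfo is None:
            value = value.replace(tzinfo=timezone.utc)
        return value.isoformat()
    if isinstance(value, date):
        return value.isoformat()
    return value
```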
This list of changes was auto generated.
SynapseML v1.0.11
Changes:
- 58b945f chore: fix yml (#2373)
- 2ec8db4 chore:fix release (#2372)
- 33b5a39 chore: bump to v1.0.11 (#2371)
- e12b0a1 fix: Adding trailing / to URLs set (#2364)
- 53b76a8 feat: Add HuggingFaceCausalLM Transformer for evaluating language models on cluster (#2301)
- 25b49cd chore: free up space in build (#2365)
- 26baa09 fix: Cannot Load LightGBM Model When Placed in a Spark Pipeline with Custom Transformers (#2357)
- b7971eb chore: Making unit tests green again (#2360)
- 57d15ec chore: Removing unnecessary logs (#2359)
- 6c9c9ce Add correct workspace ID for spark jobs (#2358)
See More
- a4dec08 fix: update openai compeletion doc, fix failed OpenAI test (#2348) [ #2351, #2353, #2354 ]
- 7878fec chore: streamline cache fix (#2354)
- 8787c82 chore: free up space on build machines prior to caching (#2353)
- b2f4080 Make sure library isnt released twice
- 0240895 chore: bump synapseml to v1.0.10 (#2351)
- 1b5df70 feat: Adding capability use Cognitive Service Language Service asynchronously for Summarization (#2342)
- bab6aed chore: fix github release yml (#2339)
This list of changes was auto generated.
SynapseML v1.0.11-spark3.5
chore: Adding Spark35 support