<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Data Engineering Weekly]]></title><description><![CDATA[The Weekly Data Engineering Newsletter]]></description><link>https://www.dataengineeringweekly.com</link><image><url>https://substackcdn.com/image/fetch/$s_!AdQk!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6f51c1bf-abc9-4cd3-ad69-22cc8e3f1ef2_1080x1080.png</url><title>Data Engineering Weekly</title><link>https://www.dataengineeringweekly.com</link></image><generator>Substack</generator><lastBuildDate>Tue, 14 Apr 2026 10:07:10 GMT</lastBuildDate><atom:link href="https://www.dataengineeringweekly.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Ananth Packkildurai]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[dataengineeringweekly@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[dataengineeringweekly@substack.com]]></itunes:email><itunes:name><![CDATA[Ananth Packkildurai]]></itunes:name></itunes:owner><itunes:author><![CDATA[Ananth Packkildurai]]></itunes:author><googleplay:owner><![CDATA[dataengineeringweekly@substack.com]]></googleplay:owner><googleplay:email><![CDATA[dataengineeringweekly@substack.com]]></googleplay:email><googleplay:author><![CDATA[Ananth Packkildurai]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Data Engineering Weekly #265]]></title><description><![CDATA[The Weekly Data Engineering Newsletter]]></description><link>https://www.dataengineeringweekly.com/p/data-engineering-weekly-265</link><guid 
isPermaLink="false">https://www.dataengineeringweekly.com/p/data-engineering-weekly-265</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Mon, 13 Apr 2026 03:24:30 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AdQk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6f51c1bf-abc9-4cd3-ad69-22cc8e3f1ef2_1080x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1>This week: Multi-Tenancy for Modern Data Platforms</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2erQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fad9d9a-b67e-4e81-a00e-e22a0a9a7603_1920x1080.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2erQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fad9d9a-b67e-4e81-a00e-e22a0a9a7603_1920x1080.heic 424w, https://substackcdn.com/image/fetch/$s_!2erQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fad9d9a-b67e-4e81-a00e-e22a0a9a7603_1920x1080.heic 848w, https://substackcdn.com/image/fetch/$s_!2erQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fad9d9a-b67e-4e81-a00e-e22a0a9a7603_1920x1080.heic 1272w, https://substackcdn.com/image/fetch/$s_!2erQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fad9d9a-b67e-4e81-a00e-e22a0a9a7603_1920x1080.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!2erQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fad9d9a-b67e-4e81-a00e-e22a0a9a7603_1920x1080.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0fad9d9a-b67e-4e81-a00e-e22a0a9a7603_1920x1080.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:20580,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/194026123?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fad9d9a-b67e-4e81-a00e-e22a0a9a7603_1920x1080.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2erQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fad9d9a-b67e-4e81-a00e-e22a0a9a7603_1920x1080.heic 424w, https://substackcdn.com/image/fetch/$s_!2erQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fad9d9a-b67e-4e81-a00e-e22a0a9a7603_1920x1080.heic 848w, https://substackcdn.com/image/fetch/$s_!2erQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fad9d9a-b67e-4e81-a00e-e22a0a9a7603_1920x1080.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!2erQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0fad9d9a-b67e-4e81-a00e-e22a0a9a7603_1920x1080.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Join Brooklyn Data Co. and Dagster Labs for a live deep dive on multi-tenancy for modern data platforms. 
We&#8217;ll cover:</p><p><br>- Code location isolation and project structure patterns<br>- Managing dependencies across tenants (including AI models)<br>- Operational strategies that scale with your organization<br>- Lessons learned from real production implementations<br><br>Save your spot for practical guidance that you can apply immediately.</p><p><strong><a href="https://dagster.io/events/multi-tenancy-for-modern-data-platforms?utm_campaign=39250561-26-04-WBNR_DEEP_DIVE_BROOKYLN_DATA&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_brooklyn_data&amp;utm_content=04_12_26_data_engineering_weekly">Reserve your spot now</a></strong></p><div><hr></div><h1>dbt: Semantic Layer vs. Text-to-SQL: 2026 Benchmark Update</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vWVf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f305db0-e60f-4200-86d2-10e804d1ef7d_970x480.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vWVf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f305db0-e60f-4200-86d2-10e804d1ef7d_970x480.heic 424w, https://substackcdn.com/image/fetch/$s_!vWVf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f305db0-e60f-4200-86d2-10e804d1ef7d_970x480.heic 848w, https://substackcdn.com/image/fetch/$s_!vWVf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f305db0-e60f-4200-86d2-10e804d1ef7d_970x480.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!vWVf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f305db0-e60f-4200-86d2-10e804d1ef7d_970x480.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vWVf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f305db0-e60f-4200-86d2-10e804d1ef7d_970x480.heic" width="970" height="480" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4f305db0-e60f-4200-86d2-10e804d1ef7d_970x480.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:480,&quot;width&quot;:970,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12492,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/194026123?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f305db0-e60f-4200-86d2-10e804d1ef7d_970x480.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vWVf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f305db0-e60f-4200-86d2-10e804d1ef7d_970x480.heic 424w, https://substackcdn.com/image/fetch/$s_!vWVf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f305db0-e60f-4200-86d2-10e804d1ef7d_970x480.heic 848w, 
https://substackcdn.com/image/fetch/$s_!vWVf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f305db0-e60f-4200-86d2-10e804d1ef7d_970x480.heic 1272w, https://substackcdn.com/image/fetch/$s_!vWVf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f305db0-e60f-4200-86d2-10e804d1ef7d_970x480.heic 1456w" sizes="100vw"></picture></div></a></figure></div><p>AI will write all the code in data engineering. That is the fundamental shift we all have to prepare for. 
dbt published its benchmark update, claiming that GPT-5.3-Codex with the Semantic Layer achieves 100.0% accuracy.</p><p><strong><a href="https://docs.getdbt.com/blog/semantic-layer-vs-text-to-sql-2026?version=1.10">https://docs.getdbt.com/blog/semantic-layer-vs-text-to-sql-2026?version=1.10</a></strong></p><div><hr></div><h1>Rill: Introducing Metrics SQL: A SQL-based semantic layer for humans and agents</h1><p>Staying with metrics and the semantic layer, Rill introduces Metrics SQL to define logic once in YAML and expose it through standard SQL, automating aggregations, enforcing row-level security, and serving governed definitions to AI agents via an MCP server without exposing raw schemas. Deterministic metric resolution eliminates inconsistencies across consumers, while a semantic pushdown roadmap targets native MEASURE support in OLAP engines like ClickHouse and Snowflake.</p><p><strong><a href="https://www.rilldata.com/blog/introducing-metrics-sql-a-sql-based-semantic-layer-for-humans-and-agents">https://www.rilldata.com/blog/introducing-metrics-sql-a-sql-based-semantic-layer-for-humans-and-agents</a></strong></p><div><hr></div><h1>Meta: How Meta Used AI to Map Tribal Knowledge in Large-Scale Data Pipelines</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5FAg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76224763-0a16-4b46-81a4-469a86ea5cef_1580x1354.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5FAg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76224763-0a16-4b46-81a4-469a86ea5cef_1580x1354.heic 424w, 
https://substackcdn.com/image/fetch/$s_!5FAg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76224763-0a16-4b46-81a4-469a86ea5cef_1580x1354.heic 848w, https://substackcdn.com/image/fetch/$s_!5FAg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76224763-0a16-4b46-81a4-469a86ea5cef_1580x1354.heic 1272w, https://substackcdn.com/image/fetch/$s_!5FAg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76224763-0a16-4b46-81a4-469a86ea5cef_1580x1354.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5FAg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76224763-0a16-4b46-81a4-469a86ea5cef_1580x1354.heic" width="1456" height="1248" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/76224763-0a16-4b46-81a4-469a86ea5cef_1580x1354.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1248,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23001,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/194026123?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76224763-0a16-4b46-81a4-469a86ea5cef_1580x1354.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!5FAg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76224763-0a16-4b46-81a4-469a86ea5cef_1580x1354.heic 424w, https://substackcdn.com/image/fetch/$s_!5FAg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76224763-0a16-4b46-81a4-469a86ea5cef_1580x1354.heic 848w, https://substackcdn.com/image/fetch/$s_!5FAg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76224763-0a16-4b46-81a4-469a86ea5cef_1580x1354.heic 1272w, https://substackcdn.com/image/fetch/$s_!5FAg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76224763-0a16-4b46-81a4-469a86ea5cef_1580x1354.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>AI coding assistants fail in proprietary codebases because tribal knowledge&#8212;implicit design decisions and legacy constraints&#8212;remains absent from training data and documentation. Meta Platforms deploys a swarm of 50 specialized agents to map a 4,100-file pipeline into concise context artifacts, using tiered explorer, analyst, critic, and fixer roles with automated decay detection. This system increases context coverage from 5% to 100%, captures over 50 non-obvious patterns, reduces tool-call volume by 40%, and cuts codebase research time from two days to 30 minutes.</p><p><strong><a href="https://engineering.fb.com/2026/04/06/developer-tools/how-meta-used-ai-to-map-tribal-knowledge-in-large-scale-data-pipelines/">https://engineering.fb.com/2026/04/06/developer-tools/how-meta-used-ai-to-map-tribal-knowledge-in-large-scale-data-pipelines/</a></strong></p><div><hr></div><h1>Sponsored: The AI Modernization Guide</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=04_12_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xtWw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97b6788c-3818-4bff-b818-cf8000dba970_2400x1260.heic 424w, 
https://substackcdn.com/image/fetch/$s_!xtWw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97b6788c-3818-4bff-b818-cf8000dba970_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!xtWw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97b6788c-3818-4bff-b818-cf8000dba970_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!xtWw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97b6788c-3818-4bff-b818-cf8000dba970_2400x1260.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xtWw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97b6788c-3818-4bff-b818-cf8000dba970_2400x1260.heic" width="1456" height="764" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/97b6788c-3818-4bff-b818-cf8000dba970_2400x1260.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14581,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=04_12_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/194026123?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97b6788c-3818-4bff-b818-cf8000dba970_2400x1260.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xtWw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97b6788c-3818-4bff-b818-cf8000dba970_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!xtWw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97b6788c-3818-4bff-b818-cf8000dba970_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!xtWw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97b6788c-3818-4bff-b818-cf8000dba970_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!xtWw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97b6788c-3818-4bff-b818-cf8000dba970_2400x1260.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>AI is reshaping how data teams operate. But legacy pipelines, brittle workflows, and fragmented tooling weren&#8217;t designed for this shift.<br><br>Learn how leading teams are future-proofing their infrastructure before AI demands overwhelm it.</p><p><strong><a href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=04_12_26_data_engineering_weekly">Download the free guide</a></strong></p><div><hr></div><h1>Netflix: Stop Answering the Same Question Twice: Interval-Aware Caching for Druid at Netflix Scale</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QyEH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0ec2667-cecb-4d9d-82fb-5f542ae3d2ce_1400x774.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QyEH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0ec2667-cecb-4d9d-82fb-5f542ae3d2ce_1400x774.heic 424w, https://substackcdn.com/image/fetch/$s_!QyEH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0ec2667-cecb-4d9d-82fb-5f542ae3d2ce_1400x774.heic 848w, 
https://substackcdn.com/image/fetch/$s_!QyEH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0ec2667-cecb-4d9d-82fb-5f542ae3d2ce_1400x774.heic 1272w, https://substackcdn.com/image/fetch/$s_!QyEH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0ec2667-cecb-4d9d-82fb-5f542ae3d2ce_1400x774.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QyEH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0ec2667-cecb-4d9d-82fb-5f542ae3d2ce_1400x774.heic" width="1400" height="774" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b0ec2667-cecb-4d9d-82fb-5f542ae3d2ce_1400x774.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:774,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17702,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/194026123?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0ec2667-cecb-4d9d-82fb-5f542ae3d2ce_1400x774.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QyEH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0ec2667-cecb-4d9d-82fb-5f542ae3d2ce_1400x774.heic 424w, 
https://substackcdn.com/image/fetch/$s_!QyEH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0ec2667-cecb-4d9d-82fb-5f542ae3d2ce_1400x774.heic 848w, https://substackcdn.com/image/fetch/$s_!QyEH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0ec2667-cecb-4d9d-82fb-5f542ae3d2ce_1400x774.heic 1272w, https://substackcdn.com/image/fetch/$s_!QyEH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb0ec2667-cecb-4d9d-82fb-5f542ae3d2ce_1400x774.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Real-time dashboards with rolling windows invalidate traditional caches because shifting intervals cause repeated misses on otherwise unchanged historical data. Netflix builds an interval-aware caching proxy that decomposes Druid queries into one-minute buckets, serves historical segments from Cassandra, and fetches only the uncached tail from Druid using exponential TTLs ranging from 5 seconds to 1 hour. The system achieves 82% partial cache hit rates, reduces Druid query volume by 33%, improves P90 latency by 66%, and shifts the bottleneck from compute-heavy Druid to low-cost Cassandra storage.</p><p><strong><a href="https://netflixtechblog.com/stop-answering-the-same-question-twice-interval-aware-caching-for-druid-at-netflix-scale-22fadc9b840e">https://netflixtechblog.com/stop-answering-the-same-question-twice-interval-aware-caching-for-druid-at-netflix-scale-22fadc9b840e</a></strong></p><div><hr></div><h1>Booking.com: Scaling Experimentation Quality at Booking.com</h1><p>Underpowered experiments, premature peeking, and inconsistent reporting degrade decision quality as experimentation scales without statistical rigor. Booking.com embeds experimental quality across design, execution, and decision-making through data science ambassadors, peer-review practices, and tooling such as a Quality Tab that enforces power calculations and pre-registered hypotheses in real time. 
These changes increase the share of high-quality experiments, with the largest gains in design, where proper power improves the reliability of results and decision confidence.</p><p><strong><a href="https://booking.ai/scaling-experimentation-quality-at-booking-com-726152ee4ef0">https://booking.ai/scaling-experimentation-quality-at-booking-com-726152ee4ef0</a></strong></p><div><hr></div><h1>Andros Fenollosa: From zero to a RAG system: successes and failures</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ncCX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87ae63e3-a099-4ae2-af0a-9add719a524a_1670x552.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ncCX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87ae63e3-a099-4ae2-af0a-9add719a524a_1670x552.heic 424w, https://substackcdn.com/image/fetch/$s_!ncCX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87ae63e3-a099-4ae2-af0a-9add719a524a_1670x552.heic 848w, https://substackcdn.com/image/fetch/$s_!ncCX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87ae63e3-a099-4ae2-af0a-9add719a524a_1670x552.heic 1272w, https://substackcdn.com/image/fetch/$s_!ncCX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87ae63e3-a099-4ae2-af0a-9add719a524a_1670x552.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!ncCX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87ae63e3-a099-4ae2-af0a-9add719a524a_1670x552.heic" width="1456" height="481" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/87ae63e3-a099-4ae2-af0a-9add719a524a_1670x552.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:481,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12400,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/194026123?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87ae63e3-a099-4ae2-af0a-9add719a524a_1670x552.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ncCX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87ae63e3-a099-4ae2-af0a-9add719a524a_1670x552.heic 424w, https://substackcdn.com/image/fetch/$s_!ncCX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87ae63e3-a099-4ae2-af0a-9add719a524a_1670x552.heic 848w, https://substackcdn.com/image/fetch/$s_!ncCX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87ae63e3-a099-4ae2-af0a-9add719a524a_1670x552.heic 1272w, https://substackcdn.com/image/fetch/$s_!ncCX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87ae63e3-a099-4ae2-af0a-9add719a524a_1670x552.heic 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Legacy engineering knowledge locked in unstructured simulation files and technical documents remains inaccessible when confidentiality requirements prohibit sending proprietary data to external LLM APIs. The author writes about building a local RAG system using Ollama, LlamaIndex, and ChromaDB&#8212;filtering out non-text files to reduce indexable load by 54% and serving source documents from Azure Blob Storage while staying within a 100GB disk constraint.
The architecture delivers confidential retrieval over 1TB of legacy engineering data while establishing batch checkpointing and error-tolerant ingestion as the critical patterns for production RAG deployments at scale.</p><p><strong><a href="https://en.andros.dev/blog/aa31d744/from-zero-to-a-rag-system-successes-and-failures/">https://en.andros.dev/blog/aa31d744/from-zero-to-a-rag-system-successes-and-failures/</a></strong></p><div><hr></div><h1>All Things Distributed: S3 Files and the changing face of S3</h1><p>The introduction of S3 Files generated a lot of interest on my reading list last week. I&#8217;m still studying the impact of S3 Files on data pipeline engineering. One thing to note: S3 Files breaks the read-after-write consistency model (write in S3 Files, but read in S3). I wonder whether data infrastructure really wants to go back to that world; nonetheless, this is an exciting blog post to read to understand the thought process behind S3 Files.</p><p><strong><a href="https://www.allthingsdistributed.com/2026/04/s3-files-and-the-changing-face-of-s3.html">https://www.allthingsdistributed.com/2026/04/s3-files-and-the-changing-face-of-s3.html</a></strong></p><div><hr></div><h1>Apache Kafka: KIP-848: The Next Generation of the Consumer Rebalance Protocol</h1><p>Whether you like or dislike Apache Kafka, its KIPs are among the best learning materials for distributed systems. Consumer rebalancing is one of the most hotly debated topics in the Kafka world.
KIP-848 moves rebalance logic from consumer clients to the Group Coordinator&#8212;introducing a ConsumerGroupHeartbeat API, three-layered epochs for group, assignment, and member state, and server-side Range and Uniform assignors that drive incremental partition reconciliation without global synchronization barriers.</p><p><strong><a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-848%3A+The+Next+Generation+of+the+Consumer+Rebalance+Protocol">https://cwiki.apache.org/confluence/display/KAFKA/KIP-848%3A+The+Next+Generation+of+the+Consumer+Rebalance+Protocol</a></strong></p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[Data Engineering Weekly #264]]></title><description><![CDATA[The Weekly Data Engineering Newsletter]]></description><link>https://www.dataengineeringweekly.com/p/data-engineering-weekly-264</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/data-engineering-weekly-264</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Mon, 06 Apr 2026 01:55:25 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AdQk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6f51c1bf-abc9-4cd3-ad69-22cc8e3f1ef2_1080x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://dagster.io/events/multi-tenancy-for-modern-data-platforms?utm_campaign=39250561-26-04-WBNR_DEEP_DIVE_BROOKYLN_DATA&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_brooklyn_data&amp;utm_content=04_05_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DOKS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bf59c3-807b-49c0-9155-94987c87402c_2880x1620.heic 424w, https://substackcdn.com/image/fetch/$s_!DOKS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bf59c3-807b-49c0-9155-94987c87402c_2880x1620.heic 848w, https://substackcdn.com/image/fetch/$s_!DOKS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bf59c3-807b-49c0-9155-94987c87402c_2880x1620.heic 1272w, https://substackcdn.com/image/fetch/$s_!DOKS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bf59c3-807b-49c0-9155-94987c87402c_2880x1620.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DOKS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bf59c3-807b-49c0-9155-94987c87402c_2880x1620.heic" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/69bf59c3-807b-49c0-9155-94987c87402c_2880x1620.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:72035,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/events/multi-tenancy-for-modern-data-platforms?utm_campaign=39250561-26-04-WBNR_DEEP_DIVE_BROOKYLN_DATA&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_brooklyn_data&amp;utm_content=04_05_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/193304081?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bf59c3-807b-49c0-9155-94987c87402c_2880x1620.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DOKS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bf59c3-807b-49c0-9155-94987c87402c_2880x1620.heic 424w, https://substackcdn.com/image/fetch/$s_!DOKS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bf59c3-807b-49c0-9155-94987c87402c_2880x1620.heic 848w, https://substackcdn.com/image/fetch/$s_!DOKS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bf59c3-807b-49c0-9155-94987c87402c_2880x1620.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!DOKS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69bf59c3-807b-49c0-9155-94987c87402c_2880x1620.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h1>How data teams are solving multi-tenancy</h1><p>As data teams grow and serve multiple teams, clients, or business units from a shared platform, maintaining isolation and velocity without sacrificing either becomes a defining architectural challenge.<br><br>In this Deep Dive, Dagster Labs and Brooklyn Data Co.
will cover the patterns, trade-offs, and real-world implementations behind multi-tenant data platforms built on Dagster. Attendees will leave this session with practical guidance they can take back to their own teams.</p><p><strong><a href="https://dagster.io/events/multi-tenancy-for-modern-data-platforms?utm_campaign=39250561-26-04-WBNR_DEEP_DIVE_BROOKYLN_DATA&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_brooklyn_data&amp;utm_content=04_05_26_data_engineering_weekly">Reserve your spot now</a></strong></p><div><hr></div><h1>Editorial Note: Help Us Make Data Engineering Weekly Better</h1><p>We&#8217;re working to make Data Engineering Weekly more useful, more relevant, and more worth your time every Sunday. If you have 2 minutes, please share your feedback through this short survey. Your input will directly shape what we cover, how we write, and where we improve next.</p><p><strong><a href="https://forms.gle/cgeww7czFAVBiVmV7">https://forms.gle/cgeww7czFAVBiVmV7</a></strong></p><div><hr></div><h1>Meta: Inside Meta&#8217;s Home-Grown AI Analytics Agent</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!E2lG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ff2220-e21f-42f3-9ece-339ac8e89958_1136x1152.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!E2lG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ff2220-e21f-42f3-9ece-339ac8e89958_1136x1152.heic 424w, https://substackcdn.com/image/fetch/$s_!E2lG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ff2220-e21f-42f3-9ece-339ac8e89958_1136x1152.heic 848w, 
https://substackcdn.com/image/fetch/$s_!E2lG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ff2220-e21f-42f3-9ece-339ac8e89958_1136x1152.heic 1272w, https://substackcdn.com/image/fetch/$s_!E2lG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ff2220-e21f-42f3-9ece-339ac8e89958_1136x1152.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!E2lG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ff2220-e21f-42f3-9ece-339ac8e89958_1136x1152.heic" width="1136" height="1152" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d2ff2220-e21f-42f3-9ece-339ac8e89958_1136x1152.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1152,&quot;width&quot;:1136,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12310,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/193304081?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ff2220-e21f-42f3-9ece-339ac8e89958_1136x1152.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!E2lG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ff2220-e21f-42f3-9ece-339ac8e89958_1136x1152.heic 424w, 
https://substackcdn.com/image/fetch/$s_!E2lG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ff2220-e21f-42f3-9ece-339ac8e89958_1136x1152.heic 848w, https://substackcdn.com/image/fetch/$s_!E2lG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ff2220-e21f-42f3-9ece-339ac8e89958_1136x1152.heic 1272w, https://substackcdn.com/image/fetch/$s_!E2lG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2ff2220-e21f-42f3-9ece-339ac8e89958_1136x1152.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Routine analytical queries dominate enterprise data science workloads, yet agents fail as warehouse scale grows without a bounded, structured context. Meta Platforms seeds per-user memory from historical query logs and organizes domain knowledge into cookbooks, recipes, and ingredients that encode validated analyst logic. This approach drives 77% weekly adoption within six months as community-authored recipes expand coverage across domains.</p><p><strong><a href="https://medium.com/@AnalyticsAtMeta/inside-metas-home-grown-ai-analytics-agent-4ea6779acfb3">https://medium.com/@AnalyticsAtMeta/inside-metas-home-grown-ai-analytics-agent-4ea6779acfb3</a></strong></p><div><hr></div><h1>Michel Tricot: Beyond ETL - The Case for Context</h1><p>Agentic data infrastructure exposes a meaning gap that traditional ETL never addressed, as autonomous agents propagate poor context across queries at scale. The author validates the ECL framework through real-world failures and reframes the Context Store as a materialized view that pre-replicates SaaS data into versioned semantic structures for agent consumption.
Existing data engineering primitives&#8212;incremental replication, schema normalization, and tenant isolation&#8212;support this model, shifting the data engineer&#8217;s role from data movement to context architecture.</p><p><strong><a href="https://agentblueprint.substack.com/p/beyond-etl-the-case-for-context">https://agentblueprint.substack.com/p/beyond-etl-the-case-for-context</a></strong></p><div><hr></div><h1>Chris Gambill: Medallion Architecture Isn&#8217;t As New As You Think</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!H16B!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dba8a06-0dfc-4949-b164-3b2b16abbcaa_1456x794.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!H16B!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dba8a06-0dfc-4949-b164-3b2b16abbcaa_1456x794.heic 424w, https://substackcdn.com/image/fetch/$s_!H16B!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dba8a06-0dfc-4949-b164-3b2b16abbcaa_1456x794.heic 848w, https://substackcdn.com/image/fetch/$s_!H16B!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dba8a06-0dfc-4949-b164-3b2b16abbcaa_1456x794.heic 1272w, https://substackcdn.com/image/fetch/$s_!H16B!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dba8a06-0dfc-4949-b164-3b2b16abbcaa_1456x794.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!H16B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dba8a06-0dfc-4949-b164-3b2b16abbcaa_1456x794.heic" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6dba8a06-0dfc-4949-b164-3b2b16abbcaa_1456x794.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:28242,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/193304081?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dba8a06-0dfc-4949-b164-3b2b16abbcaa_1456x794.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!H16B!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dba8a06-0dfc-4949-b164-3b2b16abbcaa_1456x794.heic 424w, https://substackcdn.com/image/fetch/$s_!H16B!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dba8a06-0dfc-4949-b164-3b2b16abbcaa_1456x794.heic 848w, https://substackcdn.com/image/fetch/$s_!H16B!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dba8a06-0dfc-4949-b164-3b2b16abbcaa_1456x794.heic 1272w, https://substackcdn.com/image/fetch/$s_!H16B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dba8a06-0dfc-4949-b164-3b2b16abbcaa_1456x794.heic 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>AI tools amplify data-quality failures at scale when pipelines lack clear boundaries among the raw capture, transformation, and business-consumption layers. The author reframes Medallion Architecture as a disciplined evolution of staging and reporting models: Bronze preserves raw audit trails, Silver enforces schema contracts, and Gold delivers business-ready KPIs.
This separation provides a reliable context for AI systems and reduces the downstream cost of bad data beyond incremental storage overhead.</p><p><strong><a href="https://gambilldataengineering.substack.com/p/medallion-architecture-isnt-as-new">https://gambilldataengineering.substack.com/p/medallion-architecture-isnt-as-new</a></strong></p><div><hr></div><h1>Sponsored: The Data Platform Fundamentals Guide</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/how-to-build-data-platforms-ebook?utm_campaign=8626009-25-02-eBook_Data-Platform-Fundamentals&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=data_fundamentals_ebook&amp;utm_content=04_05_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bwfu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F116e691c-b991-4de6-8071-f6badff33555_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!Bwfu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F116e691c-b991-4de6-8071-f6badff33555_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!Bwfu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F116e691c-b991-4de6-8071-f6badff33555_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!Bwfu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F116e691c-b991-4de6-8071-f6badff33555_3840x2160.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Bwfu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F116e691c-b991-4de6-8071-f6badff33555_3840x2160.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/116e691c-b991-4de6-8071-f6badff33555_3840x2160.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:20296,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/how-to-build-data-platforms-ebook?utm_campaign=8626009-25-02-eBook_Data-Platform-Fundamentals&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=data_fundamentals_ebook&amp;utm_content=04_05_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/193304081?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F116e691c-b991-4de6-8071-f6badff33555_3840x2160.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Bwfu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F116e691c-b991-4de6-8071-f6badff33555_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!Bwfu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F116e691c-b991-4de6-8071-f6badff33555_3840x2160.heic 848w, 
https://substackcdn.com/image/fetch/$s_!Bwfu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F116e691c-b991-4de6-8071-f6badff33555_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!Bwfu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F116e691c-b991-4de6-8071-f6badff33555_3840x2160.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Learn the fundamental concepts to build a data platform in your organization.<br><br>- Tips and tricks for data modeling and data
ingestion patterns<br>- Explore the benefits of an observation layer across your data pipelines<br>- Learn the key strategies for ensuring data quality for your organization</p><p><strong><a href="https://dagster.io/how-to-build-data-platforms-ebook?utm_campaign=8626009-25-02-eBook_Data-Platform-Fundamentals&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=data_fundamentals_ebook&amp;utm_content=04_05_26_data_engineering_weekly">Download the free guide</a></strong></p><div><hr></div><h1>Zapier: Lessons from using the outbox pattern at scale</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Q0Q6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab0180f-962f-4f77-a42e-82f5e768fae0_2266x1602.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Q0Q6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab0180f-962f-4f77-a42e-82f5e768fae0_2266x1602.heic 424w, https://substackcdn.com/image/fetch/$s_!Q0Q6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab0180f-962f-4f77-a42e-82f5e768fae0_2266x1602.heic 848w, https://substackcdn.com/image/fetch/$s_!Q0Q6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab0180f-962f-4f77-a42e-82f5e768fae0_2266x1602.heic 1272w, https://substackcdn.com/image/fetch/$s_!Q0Q6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab0180f-962f-4f77-a42e-82f5e768fae0_2266x1602.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Q0Q6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab0180f-962f-4f77-a42e-82f5e768fae0_2266x1602.heic" width="1456" height="1029" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fab0180f-962f-4f77-a42e-82f5e768fae0_2266x1602.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1029,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17976,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/193304081?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab0180f-962f-4f77-a42e-82f5e768fae0_2266x1602.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Q0Q6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab0180f-962f-4f77-a42e-82f5e768fae0_2266x1602.heic 424w, https://substackcdn.com/image/fetch/$s_!Q0Q6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab0180f-962f-4f77-a42e-82f5e768fae0_2266x1602.heic 848w, https://substackcdn.com/image/fetch/$s_!Q0Q6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab0180f-962f-4f77-a42e-82f5e768fae0_2266x1602.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!Q0Q6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffab0180f-962f-4f77-a42e-82f5e768fae0_2266x1602.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>High-throughput event pipelines require durable buffering between producers and brokers to prevent data loss during failures or maintenance windows. 
Zapier implements a transactional outbox in its Go-based Events API using sharded SQLite on EBS-backed Kubernetes StatefulSets, with WAL mode, 50 shards per pod, and per-shard mutexes sustaining 15,000 events per second during Kafka outages. Operational limits from static sharding and StatefulSet constraints push a shift toward a sidecar pattern that writes to S3 on failure and replays via SQS.</p><p><strong><a href="https://zapier.com/blog/lessons-from-using-outbox-pattern-at-scale/">https://zapier.com/blog/lessons-from-using-outbox-pattern-at-scale/</a></strong></p><div><hr></div><h1>Lyft: Predicting Rider Conversion in Sparse Data Environments with Bayesian Trees</h1><p>Sparse contextual data causes standard ML models to overfit and generate unstable predictions across long-tail combinations of location, time, and demand. Lyft models rider conversion using a Bayesian Tree that organizes context hierarchically and applies Gaussian priors with L2 regularization to balance sparse leaf signals against stable parent trends. 
This approach delivers localized accuracy in dense data and degrades to broader signals in sparse regions, while enforcing monotonicity constraints to ensure consistent, interpretable predictions.</p><p><strong><a href="https://eng.lyft.com/predicting-rider-conversion-in-sparse-data-environments-with-bayesian-trees-07227ff92789">https://eng.lyft.com/predicting-rider-conversion-in-sparse-data-environments-with-bayesian-trees-07227ff92789</a></strong></p><div><hr></div><h1>LinkedIn: Building LinkedIn&#8217;s CTV Ads: Scaling professional reach to the big screen</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!M-Xu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd419cb91-72e2-43d8-a4c7-ac3016c10661_683x425.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!M-Xu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd419cb91-72e2-43d8-a4c7-ac3016c10661_683x425.heic 424w, https://substackcdn.com/image/fetch/$s_!M-Xu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd419cb91-72e2-43d8-a4c7-ac3016c10661_683x425.heic 848w, https://substackcdn.com/image/fetch/$s_!M-Xu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd419cb91-72e2-43d8-a4c7-ac3016c10661_683x425.heic 1272w, https://substackcdn.com/image/fetch/$s_!M-Xu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd419cb91-72e2-43d8-a4c7-ac3016c10661_683x425.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!M-Xu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd419cb91-72e2-43d8-a4c7-ac3016c10661_683x425.heic" width="683" height="425" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d419cb91-72e2-43d8-a4c7-ac3016c10661_683x425.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:425,&quot;width&quot;:683,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7012,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/193304081?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd419cb91-72e2-43d8-a4c7-ac3016c10661_683x425.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!M-Xu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd419cb91-72e2-43d8-a4c7-ac3016c10661_683x425.heic 424w, https://substackcdn.com/image/fetch/$s_!M-Xu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd419cb91-72e2-43d8-a4c7-ac3016c10661_683x425.heic 848w, https://substackcdn.com/image/fetch/$s_!M-Xu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd419cb91-72e2-43d8-a4c7-ac3016c10661_683x425.heic 1272w, https://substackcdn.com/image/fetch/$s_!M-Xu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd419cb91-72e2-43d8-a4c7-ac3016c10661_683x425.heic 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>B2B advertisers struggle to reach professional audiences on connected TV while maintaining the targeting precision and measurement fidelity of digital environments. LinkedIn extends its identity graph to CTV through private marketplace supply, cross-device household mapping, and transcoding pipelines that meet CBR encoding and native frame rate standards. 
The platform delivers 99% brand-safe inventory, achieves 11x cost efficiency over linear TV, and scales from manual deals to self-serve inventory via Campaign Manager.</p><p><strong><a href="https://www.linkedin.com/blog/engineering/marketing/building-linkedins-ctv-ads">https://www.linkedin.com/blog/engineering/marketing/building-linkedins-ctv-ads</a></strong></p><div><hr></div><h1>Netflix: Synchronizing the Senses: Powering Multimodal Intelligence for Video Search</h1><p>Video search across large productions requires unifying outputs from multiple ML models into a low-latency retrieval system that editors can query in real time. Netflix pipelines multimodal annotations through Cassandra for high-throughput ingestion, Kafka for temporal bucketing into one-second intervals, and Elasticsearch for hierarchical indexing that combines character, scene, and dialogue signals. The system enables semantic vector search via HNSW, supports match-phrase dialogue queries, and applies union&#8211;intersection logic to reconstruct scene boundaries across billions of data points.</p><p><strong><a href="https://netflixtechblog.com/powering-multimodal-intelligence-for-video-search-3e0020cf1202">https://netflixtechblog.com/powering-multimodal-intelligence-for-video-search-3e0020cf1202</a></strong></p><div><hr></div><h1>Salesforce: Inside Informatica&#8217;s Spark-Based Data Integration Platform: Running 250K Enterprise Pipelines Daily</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nnTZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a42a2f-149f-476f-94db-cb5f5f4fa81e_652x440.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!nnTZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a42a2f-149f-476f-94db-cb5f5f4fa81e_652x440.heic 424w, https://substackcdn.com/image/fetch/$s_!nnTZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a42a2f-149f-476f-94db-cb5f5f4fa81e_652x440.heic 848w, https://substackcdn.com/image/fetch/$s_!nnTZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a42a2f-149f-476f-94db-cb5f5f4fa81e_652x440.heic 1272w, https://substackcdn.com/image/fetch/$s_!nnTZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a42a2f-149f-476f-94db-cb5f5f4fa81e_652x440.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nnTZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a42a2f-149f-476f-94db-cb5f5f4fa81e_652x440.heic" width="652" height="440" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/98a42a2f-149f-476f-94db-cb5f5f4fa81e_652x440.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:440,&quot;width&quot;:652,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12133,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/193304081?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a42a2f-149f-476f-94db-cb5f5f4fa81e_652x440.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nnTZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a42a2f-149f-476f-94db-cb5f5f4fa81e_652x440.heic 424w, https://substackcdn.com/image/fetch/$s_!nnTZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a42a2f-149f-476f-94db-cb5f5f4fa81e_652x440.heic 848w, https://substackcdn.com/image/fetch/$s_!nnTZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a42a2f-149f-476f-94db-cb5f5f4fa81e_652x440.heic 1272w, https://substackcdn.com/image/fetch/$s_!nnTZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98a42a2f-149f-476f-94db-cb5f5f4fa81e_652x440.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Enterprise data integration platforms struggle at petabyte scale when single-node execution engines lack distributed compute and automated resource optimization. Informatica migrates CDI to Spark++ on Kubernetes, preserving graphical mappings while introducing row-level fault isolation, ephemeral VPC-bound clusters, and automated FinOps tuners that optimize infrastructure and Spark parameters based on historical workloads. The distributed system supports 5,500 enterprise clients across 250,000 daily tasks, reduces infrastructure costs by 1.65x, and maintains 99.9% control plane availability.</p><p><strong><a href="https://engineering.salesforce.com/inside-informaticas-spark-based-data-integration-platform-running-250k-enterprise-pipelines-daily/">https://engineering.salesforce.com/inside-informaticas-spark-based-data-integration-platform-running-250k-enterprise-pipelines-daily/</a></strong></p><div><hr></div><h1>ZeroToOne: Taming S3 Shuffle at Scale</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d3xC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb409518c-0636-4080-8238-160c2da22ac0_1020x510.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d3xC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb409518c-0636-4080-8238-160c2da22ac0_1020x510.heic 424w, 
https://substackcdn.com/image/fetch/$s_!d3xC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb409518c-0636-4080-8238-160c2da22ac0_1020x510.heic 848w, https://substackcdn.com/image/fetch/$s_!d3xC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb409518c-0636-4080-8238-160c2da22ac0_1020x510.heic 1272w, https://substackcdn.com/image/fetch/$s_!d3xC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb409518c-0636-4080-8238-160c2da22ac0_1020x510.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!d3xC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb409518c-0636-4080-8238-160c2da22ac0_1020x510.heic" width="1020" height="510" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b409518c-0636-4080-8238-160c2da22ac0_1020x510.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:510,&quot;width&quot;:1020,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9844,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/193304081?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb409518c-0636-4080-8238-160c2da22ac0_1020x510.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!d3xC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb409518c-0636-4080-8238-160c2da22ac0_1020x510.heic 424w, https://substackcdn.com/image/fetch/$s_!d3xC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb409518c-0636-4080-8238-160c2da22ac0_1020x510.heic 848w, https://substackcdn.com/image/fetch/$s_!d3xC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb409518c-0636-4080-8238-160c2da22ac0_1020x510.heic 1272w, https://substackcdn.com/image/fetch/$s_!d3xC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb409518c-0636-4080-8238-160c2da22ac0_1020x510.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>S3-based Spark shuffle suffers from quadratic scaling of GET requests, prefix throttling, and executor hangs, which drive high API costs and instability at production scale. ZeroToOne reduces shuffle costs by 95% by coalescing map tasks and expanding S3 prefixes from 10 to 500, then hardens the shuffle plugin with ConcurrentHashMap-based atomic locking and prefetch iterator timeouts to eliminate race conditions and deadlocks. These changes stabilize spot instance execution and reduce per-stage API costs from $72 to near zero in large backfill workloads.</p><p><strong><a href="https://blog.platform.zerotoone.ai/blog/taming-s3-shuffle-at-scale/">https://blog.platform.zerotoone.ai/blog/taming-s3-shuffle-at-scale/</a></strong></p><div><hr></div><h1>Radim Marek: Production query plans without production data</h1><p>Just the other day, I was in a design discussion about building a routing engine for SQL query execution based on the query plan, and how to back this up in the CI pipeline to catch expensive queries earlier. It is one of the critical capabilities I wish every data warehouse and lakehouse provided out of the box.</p><p><strong><a href="https://boringsql.com/posts/portable-stats/">https://boringsql.com/posts/portable-stats/</a></strong></p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. 
All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[The Missing Interface in Data Platform Engineering]]></title><description><![CDATA[How data leaders should design the boundary between platforms and dependent teams.]]></description><link>https://www.dataengineeringweekly.com/p/the-missing-interface-in-data-platform</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/the-missing-interface-in-data-platform</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Thu, 02 Apr 2026 03:36:07 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!f16i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae8ea51-3898-4ca0-aa70-eb4580cce90e_1920x2161.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RZl6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae408719-1fee-45d6-8455-64c78acf6219_1536x1024.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RZl6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae408719-1fee-45d6-8455-64c78acf6219_1536x1024.heic 424w, https://substackcdn.com/image/fetch/$s_!RZl6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae408719-1fee-45d6-8455-64c78acf6219_1536x1024.heic 848w, 
https://substackcdn.com/image/fetch/$s_!RZl6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae408719-1fee-45d6-8455-64c78acf6219_1536x1024.heic 1272w, https://substackcdn.com/image/fetch/$s_!RZl6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae408719-1fee-45d6-8455-64c78acf6219_1536x1024.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RZl6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae408719-1fee-45d6-8455-64c78acf6219_1536x1024.heic" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ae408719-1fee-45d6-8455-64c78acf6219_1536x1024.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:13817,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/192920698?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae408719-1fee-45d6-8455-64c78acf6219_1536x1024.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RZl6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae408719-1fee-45d6-8455-64c78acf6219_1536x1024.heic 424w, 
https://substackcdn.com/image/fetch/$s_!RZl6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae408719-1fee-45d6-8455-64c78acf6219_1536x1024.heic 848w, https://substackcdn.com/image/fetch/$s_!RZl6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae408719-1fee-45d6-8455-64c78acf6219_1536x1024.heic 1272w, https://substackcdn.com/image/fetch/$s_!RZl6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae408719-1fee-45d6-8455-64c78acf6219_1536x1024.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>A familiar pattern plays out inside many platform organizations. A data platform team ships what it sees as a milestone: a self-service stack with governed datasets, reusable pipelines, access automation, lineage, templates, and documentation. Leadership sees leverage. The platform team sees scale.</p><p>Then requests start arriving.</p><p>Can someone help model the first few datasets? Can someone validate the ownership setup? Can someone walk us through the abstractions? Can the platform team handle the initial rollout for this one use case?</p><p>Platform teams often call that resistance to self-service. The real problem is simpler: the interface is incomplete.</p><p>The tooling exists. The technical path exists. But the consumer team still cannot tell where responsibility begins, where support ends, what failure looks like, or what the team must operate independently.</p><p>The platform team sees a reusable capability. The consumer team sees a system that still depends on human interpretation.</p><p>Both teams are acting rationally. The platform team has built a technical interface. The consumer team is still looking for an operating interface.</p><p>That gap accounts for more platform friction than most platform strategies acknowledge.</p><p>Data platform engineering often gets framed as a systems problem: storage layers, compute engines, orchestration, metadata, governance, access control, and developer tooling. Those components matter. Once a platform becomes shared infrastructure, however, the harder problem shifts. 
The platform becomes a dependency surface across teams, applications, workflows, and operational responsibilities.</p><p>At that point, the key question shifts from &#8220;What did we build?&#8221; to &#8220;How should other teams depend on it?&#8221;</p><p>That question defines the real interface in data platform engineering.</p><h1><strong>Data Platform Engineering is Coordination Engineering</strong></h1><p>A mature data platform is not just a collection of capabilities. A mature data platform creates a shared system that other teams must trust, integrate with, and operate against. Every dependency on that platform carries assumptions: what stays stable, what can change, who responds when something breaks, how fast a team can expect support, what a consumer must understand, and what remains the platform team&#8217;s responsibility.</p><p>Teams carry those assumptions whether they write them down or not. When teams leave them implicit, engineers reconstruct them through tickets, Slack threads, tribal knowledge, escalations, and repeated misunderstandings. When teams make them explicit, the assumptions become part of the platform&#8217;s operating interface.</p><p>The operating interface has two parts.</p><p>One part defines the explicit rules that govern the relationship: schemas, APIs, freshness guarantees, ownership boundaries, compatibility expectations, escalation paths, and adoption responsibilities.</p><p>The other part defines the communication pattern through which teams use those rules: reactive ticketing, temporary embedding, joint execution, self-service federation, or community contribution.</p><p>Most platform failures begin when teams underdesign one or both parts.</p><p>The platform team thinks it has published a reusable capability. The consumer team experiences an ambiguous boundary. The schema may be documented, but the operational expectations are not. The self-service path may exist, but the adoption model does not. 
The API may be stable, but teams still negotiate failure semantics socially every time they matter.</p><p>Platform maturity, then, depends on more than better tooling. Platform maturity depends on how well teams design the dependency boundary between the platform and the groups that rely on it.</p><h1><strong>A contract is only one layer of the operating interface</strong></h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yJJ5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed1b094b-cbaa-40b9-a01f-f791b5ea8c52_1536x1024.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yJJ5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed1b094b-cbaa-40b9-a01f-f791b5ea8c52_1536x1024.heic 424w, https://substackcdn.com/image/fetch/$s_!yJJ5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed1b094b-cbaa-40b9-a01f-f791b5ea8c52_1536x1024.heic 848w, https://substackcdn.com/image/fetch/$s_!yJJ5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed1b094b-cbaa-40b9-a01f-f791b5ea8c52_1536x1024.heic 1272w, https://substackcdn.com/image/fetch/$s_!yJJ5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed1b094b-cbaa-40b9-a01f-f791b5ea8c52_1536x1024.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yJJ5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed1b094b-cbaa-40b9-a01f-f791b5ea8c52_1536x1024.heic" width="1456" 
height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ed1b094b-cbaa-40b9-a01f-f791b5ea8c52_1536x1024.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:18015,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/192920698?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed1b094b-cbaa-40b9-a01f-f791b5ea8c52_1536x1024.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yJJ5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed1b094b-cbaa-40b9-a01f-f791b5ea8c52_1536x1024.heic 424w, https://substackcdn.com/image/fetch/$s_!yJJ5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed1b094b-cbaa-40b9-a01f-f791b5ea8c52_1536x1024.heic 848w, https://substackcdn.com/image/fetch/$s_!yJJ5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed1b094b-cbaa-40b9-a01f-f791b5ea8c52_1536x1024.heic 1272w, https://substackcdn.com/image/fetch/$s_!yJJ5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed1b094b-cbaa-40b9-a01f-f791b5ea8c52_1536x1024.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Platform discussions often stall because engineers use the word <em>contract</em> too narrowly. In data work, many engineers hear &#8220;contract&#8221; and immediately think of a data contract: schema shape, field semantics, compatibility rules between producer and consumer, and perhaps a validation mechanism.</p><p>That category matters. That category does not cover the problem.</p><p>A stack of interface layers governs a data platform dependency, and each layer answers a different question.</p><h2><strong>1. Technical interface</strong></h2><p>The technical interface is the layer most teams already know how to discuss. It includes APIs, schemas, tables, events, payloads, SDKs, versioning rules, authentication mechanisms, and compatibility expectations. 
The technical interface defines the shape of interaction.</p><p>When people say a platform has a clear interface, they often mean only that layer.</p><p>Teams can still fail operationally even when the technical interface is clear.</p><h2><strong>2. Operational contract</strong></h2><p>The operational contract defines runtime expectations. How fresh should the data be? What latency matters for a given workflow? How should retries behave? What happens when a dependency degrades? Which failures does the platform absorb, and which failures propagate to consumers? Which SLOs, error budgets, or maintenance windows apply?</p><p>The operational contract separates descriptive interoperability from dependable interoperability.</p><p>Two teams may agree on a schema and still disagree completely on whether a six-hour delay is acceptable, whether stale reads are tolerable, or whether a breaking change in behavior requires a coordinated rollout.</p><h2><strong>3. Ownership model</strong></h2><p>The ownership model defines authority and accountability. Who approves interface changes? Who owns backward compatibility? Who responds during incidents? Who decides when a consumer must migrate? Who can reject a new use case because it violates platform constraints?</p><p>Many recurring platform conflicts are ownership failures disguised as technical disputes.</p><p>A consumer team says, &#8220;The platform changed under us.&#8221; The platform team says, &#8220;You were never supposed to rely on that behavior.&#8221; In most cases, unclear ownership boundaries create the conflict long before the disagreement surfaces.</p><h2><strong>4. Adoption model</strong></h2><p>The adoption model defines what a consuming team must do to use the platform successfully. Is the platform truly self-service? Does first adoption require embedding? Must the consumer own pipeline logic, operational monitoring, data quality checks, and incident response? 
How much platform literacy must a team build before independence becomes realistic?</p><p>Most platform design documents ignore that layer even though it often determines whether adoption succeeds.</p><p>A workflow is not self-service because a platform engineer no longer types the commands. Self-service begins when a consumer team can understand, operate, and recover within the platform&#8217;s boundaries independently.</p><h2><strong>5. Communication pattern</strong></h2><p>Every platform also has a practical communication mode. Teams may collaborate through tickets, pairing, embedded work, shared planning, interfaces, or contribution models. Those patterns are not secondary to the platform. Those patterns determine how the platform behaves in practice.</p><p>When teams do not consciously design that layer, habits and local workarounds define it by default.</p><p>Together, those layers form the platform&#8217;s operating interface: the real boundary through which teams depend on one another.</p><h1><strong>Every Platform Already has a Communication Model</strong></h1><p>Platform teams often speak as though better tooling will eventually eliminate communication. Tooling never eliminates communication. Tooling only changes its shape.</p><p>A ticket queue is a communication system. An embedding is a communication system. An API with onboarding guides and escalation rules is a communication system. An internal RFC process also serves as a communication system.</p><p>For that reason, a platform maturity model is also a communication maturity model. The model describes not only what the platform team has built, but also how dependency information moves between the platform and its consumers.</p><p>Communication may remain human-mediated, partially codified, or increasingly interface-led. No single mode is always superior. 
The right mode depends on the capability&#8217;s maturity, the consumer&#8217;s readiness, and the complexity of the work.</p><h1><strong>Five ways platform teams and dependent teams actually work</strong></h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!f16i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae8ea51-3898-4ca0-aa70-eb4580cce90e_1920x2161.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!f16i!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae8ea51-3898-4ca0-aa70-eb4580cce90e_1920x2161.heic 424w, https://substackcdn.com/image/fetch/$s_!f16i!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae8ea51-3898-4ca0-aa70-eb4580cce90e_1920x2161.heic 848w, https://substackcdn.com/image/fetch/$s_!f16i!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae8ea51-3898-4ca0-aa70-eb4580cce90e_1920x2161.heic 1272w, https://substackcdn.com/image/fetch/$s_!f16i!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae8ea51-3898-4ca0-aa70-eb4580cce90e_1920x2161.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!f16i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae8ea51-3898-4ca0-aa70-eb4580cce90e_1920x2161.heic" width="1456" height="1639" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dae8ea51-3898-4ca0-aa70-eb4580cce90e_1920x2161.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1639,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:27221,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/192920698?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae8ea51-3898-4ca0-aa70-eb4580cce90e_1920x2161.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!f16i!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae8ea51-3898-4ca0-aa70-eb4580cce90e_1920x2161.heic 424w, https://substackcdn.com/image/fetch/$s_!f16i!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae8ea51-3898-4ca0-aa70-eb4580cce90e_1920x2161.heic 848w, https://substackcdn.com/image/fetch/$s_!f16i!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae8ea51-3898-4ca0-aa70-eb4580cce90e_1920x2161.heic 1272w, https://substackcdn.com/image/fetch/$s_!f16i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdae8ea51-3898-4ca0-aa70-eb4580cce90e_1920x2161.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>No single collaboration pattern fits every internal platform. The right mode depends on the capability in question, the consuming team, and the type of dependency between them. Strong platform organizations usually operate across several modes at once.</p><p>Teams rarely fail because they sit at the &#8220;wrong&#8221; level. Teams fail because they assume every consumer can interact with the platform through the same interface, when reality clearly shows otherwise.</p><h2><strong>Level 1: Reactive &#8212; The service desk</strong></h2><p>At Level 1, the platform team operates primarily through request fulfillment. Teams file tickets. 
Platform engineers provision resources, troubleshoot access, define ingestion patterns, or implement parts of the first workflow manually. Knowledge about how the platform works lives mostly in people&#8217;s heads.</p><p>Many people dismiss that mode as immaturity. That judgment misses the point. Level 1 is where many new platform capabilities should begin.</p><p>When a capability is still emerging, the platform team does not yet know what the stable interface ought to be. Manual repetition helps the team discover the pattern worth codifying. The first few onboarding efforts reveal which inputs remain stable, which edge cases arise frequently, which assumptions break under real-world workloads, and which abstractions are premature.</p><p>The real danger is not Level 1 itself. The real danger is staying there after repetition becomes obvious.</p><p>Level 1 fails when demand scales linearly. Every new consumer increases direct demand on the platform team. The team becomes a fulfillment bottleneck. Consumers experience the platform as a queue instead of a leverage point. Platform engineers spend more time context-switching than building reusable capabilities.</p><p>Teams should move beyond Level 1 when repetition becomes predictable. Once the same task repeats enough times to reveal a stable pattern, some part of the operating interface should move out of people&#8217;s heads and into a reusable form.</p><h2><strong>Level 2: Coordinated &#8212; The embedding</strong></h2><p>At Level 2, the platform team transfers capability through direct collaboration. A platform engineer temporarily works with a consumer team to bootstrap adoption, interpret abstractions, and help the team operate inside the intended boundary.</p><p>Level 2 is not just support. Level 2 is a deliberate adoption model.</p><p>Embedding works when the platform capability is ready enough to be reused but still requires high-context interpretation. 
Embedding also works when the platform team needs to learn from consumers before it can fully stabilize the interface. The interaction runs in both directions: the platform teaches the intended path, and the consumer exposes where the path is incomplete.</p><p>Level 2 fails when dependency persists. The embedded engineer becomes the permanent translator. The team learns to route every ambiguity to a familiar person rather than building independent platform fluency. Once the embedding ends, the team slides back into ticketing.</p><p>Teams should move beyond Level 2 when understanding becomes repeatable. Once the same questions keep recurring, lack of exposure is no longer the main problem. Interface clarity is the problem. The platform then needs better runbooks, better failure handling, clearer ownership boundaries, or a more legible self-service path.</p><h2><strong>Level 3: Partnership &#8212; The joint mission</strong></h2><p>At Level 3, platform and consumer teams align around a shared objective for a bounded period. Level 3 is not request fulfillment, nor is it simple enablement. Level 3 is a temporary joint execution model.</p><p>Level 3 works when the dependency boundary itself is part of the problem. Teams often need Level 3 when they launch a new real-time product feature that requires changes across ingestion, serving, governance, and application behavior; when they stand up an experimentation platform that affects both platform architecture and domain logic; or when they build a new cross-cutting data product whose responsibilities cannot yet be separated cleanly.</p><p>Level 3 creates speed under complexity. Instead of negotiating everything through a queue, the teams create a shared execution context.</p><p>Level 3 fails when the temporary mission becomes a permanent entanglement. What was supposed to be a time-boxed collaboration becomes a staffing model. The platform roadmap drifts toward one team&#8217;s local priorities. 
The consumer team stops building independent ownership.</p><p>Teams should move beyond Level 3 when reusable patterns emerge. Once joint work starts producing structures that other teams will need, the organization should ask which parts belong in a generalized operating interface rather than in a persistent bespoke relationship.</p><h2><strong>Level 4: Federation &#8212; The self-service operating interface</strong></h2><p>At Level 4, teams collaborate primarily through explicit interfaces rather than constant human mediation. The platform publishes technical interfaces, operational expectations, ownership rules, onboarding guidance, and support boundaries clearly enough that consuming teams can adopt capabilities independently.</p><p>Level 4 is where platform economics start to work.</p><p>The marginal cost of onboarding new teams drops because the interface does more of the teaching. The platform team shifts away from request fulfillment and toward maintaining compatibility, reliability, documentation, tooling, and interface evolution.</p><p>Many organizations fool themselves at Level 4.</p><p>A team can publish an API, a portal, or a template and claim self-service while leaving the actual operating interface incomplete. The consumer can create the resource, but does not know how to handle failure. The documentation describes the happy path, but not the migration path. The schema is versioned, but the escalation model is still social. The ownership boundary exists in theory, but not in behavior.</p><p>That condition is not federation. That condition is a ticket queue with better branding.</p><p>Level 4 fails when self-service arrives too early. The platform exposes an interface before it has done enough repeated work to understand which parts are stable and which parts still require human judgment. 
Consumers adopt the easy 60% and escalate the hard 40%, forcing the platform team to run Level 1 and Level 4 simultaneously.</p><p>A healthy Level 4 shows more than usage. A healthy Level 4 shows independent operation. A consuming team should be able to adopt the capability, reason about normal failure, understand the support model, and make routine changes without renegotiating the relationship each time.</p><h2><strong>Level 5: Ecosystem &#8212; The internal commons</strong></h2><p>At Level 5, teams do more than consume the platform. They extend it. The platform becomes a stewarded commons with contribution pathways, governance, standards, RFCs, and shared maintenance expectations.</p><p>That operating model looks attractive in strategy decks because it suggests scale through internal open-source behavior. In practice, organizations struggle to sustain it.</p><p>Contribution requires more than technical maturity. Contribution requires governance maturity. Teams need clarity on how to adopt the contribution, how to review its quality, who maintains the artifact over time, how support obligations are assigned, and how the platform distinguishes production-grade extensions from abandoned experiments.</p><p>Level 5 fails when the commons turns into unmanaged sprawl. Shared repositories fill with unevenly maintained components. The boundary between the core platform and the contributed surface becomes blurry. Consumers cannot distinguish between governed and incidental capabilities.</p><p>For many organizations, Level 4 is the durable steady state. Level 5 becomes valuable only when culture, incentives, and governance can support shared stewardship.</p><h1><strong>The missing variable is contract literacy</strong></h1><p>Most discussions of platform maturity focus on the platform side. That view is incomplete.</p><p>A platform can be highly mature in its own design and still fail in practice because the consuming team is not ready to operate against that interface. 
A Level 4 platform paired with a Level 1 consumer often behaves like a Level 1 system.</p><p>Consumer readiness matters, but teams should define the term more precisely than &#8220;platform familiarity&#8221; alone.</p><p>Consumer readiness is really a form of contract literacy. Consumer readiness measures a team&#8217;s ability to understand the operating interface, interpret the support boundaries, reason about failure modes, absorb ownership, and use the self-service path without relying on informal rescue.</p><p>A mature platform with a new team often needs Level 2 interaction first. The capability may be stable, but the team lacks the context to operate within it.</p><p>An early platform with a strong consumer may benefit from a Level 3 partnership. The capability has yet to reach production, but the team is strong enough to co-develop the future interface.</p><p>A mature platform with a mature consumer can operate effectively through Level 4 and, in some cases, Level 5.</p><p>An early platform with a new consumer should not pretend to be anything other than Level 1 for a while.</p><p>The diagnostic question is not &#8220;How mature is the platform?&#8221; The better question is, &#8220;How mature is the dependency relationship, given both sides of the interface?&#8221;</p><h1><strong>Why platforms fail even when the interface exists</strong></h1><p>Many platform incidents seem surprising only when the platform team mistakes the technical interface for the whole interface.</p><p>A schema can remain stable even as the operational contract breaks down. A consumer receives the expected fields but cannot tolerate the freshness lag introduced by the new implementation.</p><p>An API can be correct while the ownership model remains unclear. Both teams assume the other team is responsible for migration sequencing, and the rollout fails in the gap.</p><p>A portal can be self-service while the adoption model remains incomplete. 
The consumer can provision the resource, but does not know which observability, alerting, backfill policy, or quality checks now belong to the team.</p><p>Documentation can be extensive while the communication pattern remains reactive. The written material explains the happy path, but the only reliable way to get edge-case answers is still to message a platform engineer directly.</p><p>Each example points to the same problem. The platform appears mature on paper and unstable in practice because one layer of the operating interface is missing.</p><p>That pattern also explains why many arguments about &#8220;data contracts&#8221; feel unsatisfying. A schema contract may remove one class of ambiguity while leaving the dependency relationship fundamentally underdesigned. Platforms do not scale on descriptive clarity alone. Platforms scale when the operating interface becomes explicit enough for teams to coordinate predictably.</p><h1><strong>Why this matters more in an agentic enterprise</strong></h1><p>As organizations move toward AI-mediated operations, autonomous workflows, and increasingly automated decision loops, the cost of implicit interfaces rises sharply.</p><p>Human teams can absorb ambiguity through judgment, relationships, escalation habits, and informal context. Human teams can often compensate for missing rules because they know whom to ask, which unwritten assumptions are in place, and when a local exception is acceptable.</p><p>Automated systems do not compensate in the same way. Automated systems require explicit state, explicit boundaries, and explicit expectations. A platform boundary that depends on tribal knowledge, undocumented ownership, or socially negotiated failure handling already strains human teams. That same boundary becomes structurally limiting when an organization wants automation to operate consistently across it.</p><p>No company needs a grand operating model of the enterprise to benefit from that insight. 
Organizations do need a simpler discipline. Teams need to make the dependency interface between systems and teams legible enough to operate without constant interpretation.</p><h2><strong>Maturity is reversible</strong></h2><p>It is tempting to view a maturity model as a ladder. Real organizations behave more like shifting systems.</p><p>A platform that reached Level 4 can slide back toward Level 1 when documentation rots, examples stop working, and teams stop trusting the interface. An embedding that once worked can decay when the engineers who learned the system leave, and the knowledge never becomes fully externalized. A clear ownership model can collapse after a reorganization. A rearchitecture can reset operational assumptions so thoroughly that teams must return to manual coordination until the new boundary stabilizes.</p><p>Those events are not unusual. Those events are the normal dynamics of organizational life.</p><p>A maturity model is useful not because it promises to eliminate regression. A maturity model is useful because it helps teams name regression quickly and respond deliberately.</p><p>If a self-service platform has quietly reverted to a ticket queue, the problem is not just support volume. Some part of the operating interface has decayed: the technical surface, the operational contract, the ownership model, the adoption model, or the communication pattern.</p><p>Once teams name that decay, they can turn frustration back into design work.</p><h1><strong>The real platform interface is organizational</strong></h1><p>The hardest part of a data platform is rarely the infrastructure itself. The hardest part is designing the boundary between the platform and the teams that depend on it.</p><p>That boundary cannot be reduced to schemas or APIs alone. 
The boundary includes operational expectations, support models, ownership rules, adoption assumptions, and the communication pattern through which those expectations become real.</p><p>When teams leave those layers implicit, they do not avoid design work. They push the work into tickets, escalations, workarounds, and repeated misunderstandings. When teams make those layers explicit, the platform becomes easier to trust, adopt, and scale.</p><p>Data platform engineering is not just infrastructure engineering. Data platform engineering is interface design at the level of teams, systems, and responsibilities.</p><p>A maturity model should not rank organizations morally or insist that every platform must reach some final stage. A maturity model should give teams a vocabulary for the dependency relationships they actually have, the interfaces they are really exposing, and the ones they need to design next.</p><p>The missing interface in data platform engineering is not another layer of tooling. The missing interface is the operating interface that defines how dependent teams rely on one another.</p><p>Until teams make that interface explicit, most platform scale remains performative.</p><p>Once teams make that interface explicit, platform scale becomes real.</p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. 
All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[Data Engineering Weekly #263]]></title><description><![CDATA[The Weekly Data Engineering Newsletter]]></description><link>https://www.dataengineeringweekly.com/p/data-engineering-weekly-263</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/data-engineering-weekly-263</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Mon, 30 Mar 2026 02:18:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AdQk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6f51c1bf-abc9-4cd3-ad69-22cc8e3f1ef2_1080x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/events/multi-tenancy-for-modern-data-platforms?utm_campaign=39250561-26-04-WBNR_DEEP_DIVE_BROOKYLN_DATA&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_brooklyn_data&amp;utm_content=03_29_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aZgx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac92e0d0-af5f-40dc-87dc-c959cb6157c3_2880x1620.heic 424w, https://substackcdn.com/image/fetch/$s_!aZgx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac92e0d0-af5f-40dc-87dc-c959cb6157c3_2880x1620.heic 848w, 
https://substackcdn.com/image/fetch/$s_!aZgx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac92e0d0-af5f-40dc-87dc-c959cb6157c3_2880x1620.heic 1272w, https://substackcdn.com/image/fetch/$s_!aZgx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac92e0d0-af5f-40dc-87dc-c959cb6157c3_2880x1620.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aZgx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac92e0d0-af5f-40dc-87dc-c959cb6157c3_2880x1620.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ac92e0d0-af5f-40dc-87dc-c959cb6157c3_2880x1620.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:72035,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/events/multi-tenancy-for-modern-data-platforms?utm_campaign=39250561-26-04-WBNR_DEEP_DIVE_BROOKYLN_DATA&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_brooklyn_data&amp;utm_content=03_29_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/192563052?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac92e0d0-af5f-40dc-87dc-c959cb6157c3_2880x1620.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!aZgx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac92e0d0-af5f-40dc-87dc-c959cb6157c3_2880x1620.heic 424w, https://substackcdn.com/image/fetch/$s_!aZgx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac92e0d0-af5f-40dc-87dc-c959cb6157c3_2880x1620.heic 848w, https://substackcdn.com/image/fetch/$s_!aZgx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac92e0d0-af5f-40dc-87dc-c959cb6157c3_2880x1620.heic 1272w, https://substackcdn.com/image/fetch/$s_!aZgx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac92e0d0-af5f-40dc-87dc-c959cb6157c3_2880x1620.heic 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><h1>How data teams are solving multi-tenancy</h1><p>As data teams grow and serve multiple teams, clients, or business units from a shared platform, maintaining isolation and velocity without sacrificing either becomes a defining architectural challenge.<br><br>In this Deep Dive, Dagster Labs and Brooklyn Data Co. will cover the patterns, trade-offs, and real-world implementations behind multi-tenant data platforms built on Dagster. Attendees will leave this session with practical guidance they can take back to their own teams.</p><p><strong><a href="https://dagster.io/events/multi-tenancy-for-modern-data-platforms?utm_campaign=39250561-26-04-WBNR_DEEP_DIVE_BROOKYLN_DATA&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_brooklyn_data&amp;utm_content=03_29_26_data_engineering_weekly">Reserve your spot now</a>.</strong></p><div><hr></div><h1>Aurimas Grici&#363;nas: State of Context Engineering in 2026</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ewfo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb3a1207-28b8-4518-adab-e7648764dd68_1456x826.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ewfo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb3a1207-28b8-4518-adab-e7648764dd68_1456x826.heic 424w, 
https://substackcdn.com/image/fetch/$s_!ewfo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb3a1207-28b8-4518-adab-e7648764dd68_1456x826.heic 848w, https://substackcdn.com/image/fetch/$s_!ewfo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb3a1207-28b8-4518-adab-e7648764dd68_1456x826.heic 1272w, https://substackcdn.com/image/fetch/$s_!ewfo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb3a1207-28b8-4518-adab-e7648764dd68_1456x826.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ewfo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb3a1207-28b8-4518-adab-e7648764dd68_1456x826.heic" width="1456" height="826" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb3a1207-28b8-4518-adab-e7648764dd68_1456x826.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:826,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16223,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/192563052?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb3a1207-28b8-4518-adab-e7648764dd68_1456x826.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!ewfo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb3a1207-28b8-4518-adab-e7648764dd68_1456x826.heic 424w, https://substackcdn.com/image/fetch/$s_!ewfo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb3a1207-28b8-4518-adab-e7648764dd68_1456x826.heic 848w, https://substackcdn.com/image/fetch/$s_!ewfo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb3a1207-28b8-4518-adab-e7648764dd68_1456x826.heic 1272w, https://substackcdn.com/image/fetch/$s_!ewfo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb3a1207-28b8-4518-adab-e7648764dd68_1456x826.heic 1456w" sizes="100vw"></picture>
</div></a></figure></div><p>LLM reasoning degrades with oversized context, forcing developers to manage attention through structured context engineering rather than scaling model size. The author outlines five patterns&#8212;<strong>progressive disclosure, compression, routing, agentic RAG, and tool management</strong>&#8212;that control how context is selected and applied. Layered orchestration across discovery, activation, and execution enables complex agent behavior within fixed context limits while preserving reasoning quality.</p><p><strong><a href="https://www.newsletter.swirlai.com/p/state-of-context-engineering-in-2026">https://www.newsletter.swirlai.com/p/state-of-context-engineering-in-2026</a></strong></p><div><hr></div><h1>Joe Reis: AI Is Here, But The Hard Parts Haven&#8217;t Changed</h1><p>AI is accelerating coding velocity, but it&#8217;s also exposing structural weaknesses that data teams have ignored for years&#8212;legacy systems, misaligned leadership, and poor business context modeling. Data from Joe Reis&#8217;s March 2026 survey reinforces the gap: teams are shipping code faster, yet many still lack clarity on production value, while data modeling and semantic layers are emerging as the next critical frontier. 
Data engineering now faces a reset moment&#8212;improving end-to-end delivery efficiency matters more than optimizing isolated pipelines, a direction I&#8217;ve been exploring in &#8220;<strong><a href="https://www.dataengineeringweekly.com/p/data-engineering-after-ai">Data Engineering After AI</a></strong>&#8221; and &#8220;<strong><a href="https://www.dataengineeringweekly.com/p/etl-is-dead">ETL is Dead</a>.</strong>&#8221;</p><p><strong><a href="https://joereis.substack.com/p/ai-is-here-but-the-hard-parts-havent">https://joereis.substack.com/p/ai-is-here-but-the-hard-parts-havent</a></strong></p><div><hr></div><h1>Hamel Husain: The Revenge of the Data Scientist</h1><p>LLM API accessibility enables rapid AI feature development but obscures reliability requirements grounded in evaluation and experimental design. The author argues modern AI development reuses core data science practices: analyzing production traces, validating LLM-as-judge with precision and recall, grounding test sets in real data, and using domain experts to define criteria. 
Teams that avoid synthetic benchmarks and over-automation focus on inspecting data to identify failure modes, reinforcing the role of data scientists as reliability gatekeepers.</p><p><strong><a href="https://hamel.dev/blog/posts/revenge/">https://hamel.dev/blog/posts/revenge/</a></strong></p><div><hr></div><h1>Sponsored: The Data Platform Fundamentals Guide</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/how-to-build-data-platforms-ebook?utm_campaign=8626009-25-02-eBook_Data-Platform-Fundamentals&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=data_fundamentals_ebook&amp;utm_content=03_29_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Nyg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91567ab5-d5ff-469b-b90e-b288422e2c4a_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!0Nyg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91567ab5-d5ff-469b-b90e-b288422e2c4a_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!0Nyg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91567ab5-d5ff-469b-b90e-b288422e2c4a_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!0Nyg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91567ab5-d5ff-469b-b90e-b288422e2c4a_3840x2160.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0Nyg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91567ab5-d5ff-469b-b90e-b288422e2c4a_3840x2160.heic" 
width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/91567ab5-d5ff-469b-b90e-b288422e2c4a_3840x2160.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22370,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/how-to-build-data-platforms-ebook?utm_campaign=8626009-25-02-eBook_Data-Platform-Fundamentals&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=data_fundamentals_ebook&amp;utm_content=03_29_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/192563052?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91567ab5-d5ff-469b-b90e-b288422e2c4a_3840x2160.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0Nyg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91567ab5-d5ff-469b-b90e-b288422e2c4a_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!0Nyg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91567ab5-d5ff-469b-b90e-b288422e2c4a_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!0Nyg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91567ab5-d5ff-469b-b90e-b288422e2c4a_3840x2160.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!0Nyg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91567ab5-d5ff-469b-b90e-b288422e2c4a_3840x2160.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>We wrote an eBook on Data Platform Fundamentals to help you be like the happy data teams, operating under a single platform. 
<br><br>In this book, you&#8217;ll learn:<br>- How composable architectures allow teams to ship faster<br>- Why data quality matters and how you can catch issues before they reach users<br>- What observability means, and how it will help you solve problems more quickly</p><p><strong><a href="https://dagster.io/how-to-build-data-platforms-ebook?utm_campaign=8626009-25-02-eBook_Data-Platform-Fundamentals&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=data_fundamentals_ebook&amp;utm_content=03_29_26_data_engineering_weekly">Download your free copy now</a>.</strong></p><div><hr></div><h1>Figma: Redefining impact as a data scientist</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1EWS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb670c688-fcae-4d85-b859-05f71a62feb6_2160x1215.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1EWS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb670c688-fcae-4d85-b859-05f71a62feb6_2160x1215.heic 424w, https://substackcdn.com/image/fetch/$s_!1EWS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb670c688-fcae-4d85-b859-05f71a62feb6_2160x1215.heic 848w, https://substackcdn.com/image/fetch/$s_!1EWS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb670c688-fcae-4d85-b859-05f71a62feb6_2160x1215.heic 1272w, https://substackcdn.com/image/fetch/$s_!1EWS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb670c688-fcae-4d85-b859-05f71a62feb6_2160x1215.heic 
1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1EWS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb670c688-fcae-4d85-b859-05f71a62feb6_2160x1215.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b670c688-fcae-4d85-b859-05f71a62feb6_2160x1215.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23211,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/192563052?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb670c688-fcae-4d85-b859-05f71a62feb6_2160x1215.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1EWS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb670c688-fcae-4d85-b859-05f71a62feb6_2160x1215.heic 424w, https://substackcdn.com/image/fetch/$s_!1EWS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb670c688-fcae-4d85-b859-05f71a62feb6_2160x1215.heic 848w, https://substackcdn.com/image/fetch/$s_!1EWS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb670c688-fcae-4d85-b859-05f71a62feb6_2160x1215.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!1EWS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb670c688-fcae-4d85-b859-05f71a62feb6_2160x1215.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Data science impact in mission-critical systems like billing depends on domain expertise and observability rather than experimentation, shifting focus from models to correctness and clarity. 
The author describes Figma&#8217;s full-stack approach, where data scientists build consistency checks, create applications that explain system behavior, and define correctness criteria. Embedding these practices into operational systems scales their impact through tools rather than reports.</p><p><strong><a href="https://www.figma.com/blog/redefining-impact-as-a-data-scientist/">https://www.figma.com/blog/redefining-impact-as-a-data-scientist/</a></strong></p><div><hr></div><h1>BlaBlaCar: Beyond the dashboard: how BlaBlaCar PMs use AI to self-serve data</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UGBw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc7f9a29-a7ed-4b76-8406-95f90bbb0ebb_1400x764.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UGBw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc7f9a29-a7ed-4b76-8406-95f90bbb0ebb_1400x764.heic 424w, https://substackcdn.com/image/fetch/$s_!UGBw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc7f9a29-a7ed-4b76-8406-95f90bbb0ebb_1400x764.heic 848w, https://substackcdn.com/image/fetch/$s_!UGBw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc7f9a29-a7ed-4b76-8406-95f90bbb0ebb_1400x764.heic 1272w, https://substackcdn.com/image/fetch/$s_!UGBw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc7f9a29-a7ed-4b76-8406-95f90bbb0ebb_1400x764.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!UGBw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc7f9a29-a7ed-4b76-8406-95f90bbb0ebb_1400x764.heic" width="1400" height="764" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bc7f9a29-a7ed-4b76-8406-95f90bbb0ebb_1400x764.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22161,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/192563052?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc7f9a29-a7ed-4b76-8406-95f90bbb0ebb_1400x764.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UGBw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc7f9a29-a7ed-4b76-8406-95f90bbb0ebb_1400x764.heic 424w, https://substackcdn.com/image/fetch/$s_!UGBw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc7f9a29-a7ed-4b76-8406-95f90bbb0ebb_1400x764.heic 848w, https://substackcdn.com/image/fetch/$s_!UGBw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc7f9a29-a7ed-4b76-8406-95f90bbb0ebb_1400x764.heic 1272w, https://substackcdn.com/image/fetch/$s_!UGBw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc7f9a29-a7ed-4b76-8406-95f90bbb0ebb_1400x764.heic 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Data analyst bottlenecks in fast-moving organizations require enabling non-technical users to self-serve without compromising data integrity or introducing hallucinations. BlaBlaCar evolves its approach from generic LLM usage to structured JSON schema documentation and few-shot learning on expert query histories, teaching the system to map natural language to business rules. 
A three-zone autonomy framework&#8212;safe, risky, and dead zones&#8212;combined with SQL literacy training for PMs reduces error rates from 32% to 15% and shifts analysts from reactive ticket handling to strategic work.</p><p><strong><a href="https://medium.com/blablacar/beyond-the-dashboard-how-blablacar-pms-use-ai-to-self-serve-data-95ccd33ab1f9">https://medium.com/blablacar/beyond-the-dashboard-how-blablacar-pms-use-ai-to-self-serve-data-95ccd33ab1f9</a></strong></p><div><hr></div><h1>Expedia: Operating Trino at Scale With Trino Gateway</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xi8G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7f51d7-f538-486a-a47f-fa1b65e51922_1038x583.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xi8G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7f51d7-f538-486a-a47f-fa1b65e51922_1038x583.heic 424w, https://substackcdn.com/image/fetch/$s_!xi8G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7f51d7-f538-486a-a47f-fa1b65e51922_1038x583.heic 848w, https://substackcdn.com/image/fetch/$s_!xi8G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7f51d7-f538-486a-a47f-fa1b65e51922_1038x583.heic 1272w, https://substackcdn.com/image/fetch/$s_!xi8G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7f51d7-f538-486a-a47f-fa1b65e51922_1038x583.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!xi8G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7f51d7-f538-486a-a47f-fa1b65e51922_1038x583.heic" width="1038" height="583" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/af7f51d7-f538-486a-a47f-fa1b65e51922_1038x583.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:583,&quot;width&quot;:1038,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12041,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/192563052?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7f51d7-f538-486a-a47f-fa1b65e51922_1038x583.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xi8G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7f51d7-f538-486a-a47f-fa1b65e51922_1038x583.heic 424w, https://substackcdn.com/image/fetch/$s_!xi8G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7f51d7-f538-486a-a47f-fa1b65e51922_1038x583.heic 848w, https://substackcdn.com/image/fetch/$s_!xi8G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7f51d7-f538-486a-a47f-fa1b65e51922_1038x583.heic 1272w, https://substackcdn.com/image/fetch/$s_!xi8G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf7f51d7-f538-486a-a47f-fa1b65e51922_1038x583.heic 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Managing Trino at scale requires isolating workloads to prevent resource contention across analytical, ETL, and BI queries. Expedia writes about operating Trino Gateway&#8212;a fork of Lyft's Presto Gateway&#8212;as a single-endpoint proxy that routes queries to dedicated clusters using configurable rules. 
This design eliminates noisy-neighbor failures, supports zero-downtime deployments, and provides real-time visibility into cluster health.</p><p><strong><a href="https://medium.com/expedia-group-tech/operating-trino-at-scale-with-trino-gateway-41824af788de">https://medium.com/expedia-group-tech/operating-trino-at-scale-with-trino-gateway-41824af788de</a></strong></p><div><hr></div><h1>LangChain: How we build evals for Deep Agents</h1><p>Building reliable AI agents requires evals that target specific production behaviors rather than optimizing for aggregate benchmark scores. LangChain's Deep Agents harness defines behavior-first evals sourced from production errors, BFCL, and hand-written unit tests &#8212; then scores agents on correctness and Ideal Trajectory ratios for step and tool-call efficiency. Teams run tagged eval subsets via pytest in GitHub Actions and trace every run in LangSmith to isolate failure modes and control evaluation cost.</p><p><strong><a href="https://blog.langchain.com/how-we-build-evals-for-deep-agents/">https://blog.langchain.com/how-we-build-evals-for-deep-agents/</a></strong></p><div><hr></div><h1>LinkedIn: The LinkedIn Generative AI Application Tech Stack: Personalization with Cognitive Memory Agent</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7gFb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2b6aa4-229c-4de3-af46-8398b3079b18_1024x571.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7gFb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2b6aa4-229c-4de3-af46-8398b3079b18_1024x571.heic 424w, 
https://substackcdn.com/image/fetch/$s_!7gFb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2b6aa4-229c-4de3-af46-8398b3079b18_1024x571.heic 848w, https://substackcdn.com/image/fetch/$s_!7gFb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2b6aa4-229c-4de3-af46-8398b3079b18_1024x571.heic 1272w, https://substackcdn.com/image/fetch/$s_!7gFb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2b6aa4-229c-4de3-af46-8398b3079b18_1024x571.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7gFb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2b6aa4-229c-4de3-af46-8398b3079b18_1024x571.heic" width="1024" height="571" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bf2b6aa4-229c-4de3-af46-8398b3079b18_1024x571.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:571,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:8816,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/192563052?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2b6aa4-229c-4de3-af46-8398b3079b18_1024x571.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!7gFb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2b6aa4-229c-4de3-af46-8398b3079b18_1024x571.heic 424w, https://substackcdn.com/image/fetch/$s_!7gFb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2b6aa4-229c-4de3-af46-8398b3079b18_1024x571.heic 848w, https://substackcdn.com/image/fetch/$s_!7gFb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2b6aa4-229c-4de3-af46-8398b3079b18_1024x571.heic 1272w, https://substackcdn.com/image/fetch/$s_!7gFb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf2b6aa4-229c-4de3-af46-8398b3079b18_1024x571.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>AI agents lose personalization across sessions because they lack structured memory that separates conversational, episodic, semantic, and procedural signals. LinkedIn&#8217;s Cognitive Memory Agent ingests activity traces through streaming and batch pipelines, then uses an LLM-based orchestrator to retrieve and reason across all four memory layers. This architecture enables the Hiring Assistant to auto-populate role requirements and generate recruiter-specific insights from historical hiring activity.</p><p><strong><a href="https://www.linkedin.com/blog/engineering/ai/the-linkedin-generative-ai-application-tech-stack-personalization-with-cognitive-memory-agent">https://www.linkedin.com/blog/engineering/ai/the-linkedin-generative-ai-application-tech-stack-personalization-with-cognitive-memory-agent</a></strong></p><div><hr></div><h1>Ilia Gusev: Change Data Capture: Stop Copying 50M Rows to Move 5K Changes</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!M3Z_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82f01620-35ea-4a36-a7c2-30d0ad5f6e3c_1456x813.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!M3Z_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82f01620-35ea-4a36-a7c2-30d0ad5f6e3c_1456x813.heic 424w, 
https://substackcdn.com/image/fetch/$s_!M3Z_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82f01620-35ea-4a36-a7c2-30d0ad5f6e3c_1456x813.heic 848w, https://substackcdn.com/image/fetch/$s_!M3Z_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82f01620-35ea-4a36-a7c2-30d0ad5f6e3c_1456x813.heic 1272w, https://substackcdn.com/image/fetch/$s_!M3Z_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82f01620-35ea-4a36-a7c2-30d0ad5f6e3c_1456x813.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!M3Z_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82f01620-35ea-4a36-a7c2-30d0ad5f6e3c_1456x813.heic" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/82f01620-35ea-4a36-a7c2-30d0ad5f6e3c_1456x813.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24582,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/192563052?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82f01620-35ea-4a36-a7c2-30d0ad5f6e3c_1456x813.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!M3Z_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82f01620-35ea-4a36-a7c2-30d0ad5f6e3c_1456x813.heic 424w, https://substackcdn.com/image/fetch/$s_!M3Z_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82f01620-35ea-4a36-a7c2-30d0ad5f6e3c_1456x813.heic 848w, https://substackcdn.com/image/fetch/$s_!M3Z_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82f01620-35ea-4a36-a7c2-30d0ad5f6e3c_1456x813.heic 1272w, https://substackcdn.com/image/fetch/$s_!M3Z_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82f01620-35ea-4a36-a7c2-30d0ad5f6e3c_1456x813.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Nightly full-table copies introduce fragility and place increasing load on source databases as data volumes scale. The article contrasts timestamp, trigger, and log-based CDC approaches, recommending Debezium with Postgres WAL or MySQL binlog as the production standard for near-real-time replication without impacting OLTP performance. Log-based CDC captures hard deletes, handles DDL changes, and decouples replication throughput from transactional write load.</p><p><strong><a href="https://podostack.com/p/change-data-capture-cdc-intro">https://podostack.com/p/change-data-capture-cdc-intro</a></strong></p><div><hr></div><h1>Micheal Lanham: The Markdown File That Beat a $50M Vector Database</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bpVg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb38c0966-c4ad-48b6-a0d6-b58eda873aed_1376x768.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bpVg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb38c0966-c4ad-48b6-a0d6-b58eda873aed_1376x768.heic 424w, https://substackcdn.com/image/fetch/$s_!bpVg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb38c0966-c4ad-48b6-a0d6-b58eda873aed_1376x768.heic 848w, 
https://substackcdn.com/image/fetch/$s_!bpVg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb38c0966-c4ad-48b6-a0d6-b58eda873aed_1376x768.heic 1272w, https://substackcdn.com/image/fetch/$s_!bpVg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb38c0966-c4ad-48b6-a0d6-b58eda873aed_1376x768.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bpVg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb38c0966-c4ad-48b6-a0d6-b58eda873aed_1376x768.heic" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b38c0966-c4ad-48b6-a0d6-b58eda873aed_1376x768.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24703,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/192563052?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb38c0966-c4ad-48b6-a0d6-b58eda873aed_1376x768.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bpVg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb38c0966-c4ad-48b6-a0d6-b58eda873aed_1376x768.heic 424w, 
https://substackcdn.com/image/fetch/$s_!bpVg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb38c0966-c4ad-48b6-a0d6-b58eda873aed_1376x768.heic 848w, https://substackcdn.com/image/fetch/$s_!bpVg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb38c0966-c4ad-48b6-a0d6-b58eda873aed_1376x768.heic 1272w, https://substackcdn.com/image/fetch/$s_!bpVg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb38c0966-c4ad-48b6-a0d6-b58eda873aed_1376x768.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Agentic workflows expose the cost and operational overhead of managed vector databases when used for single-threaded memory and state management. The author shows how Manus, OpenClaw, and Claude Code converge on Markdown files as the primary memory layer, leveraging KV-cache efficiency, filesystem hierarchy for scoped retrieval, and sqlite-vec for lightweight semantic search. This file-first architecture reduces token costs by nearly 10x and defers vector database adoption to scenarios that require multi-user concurrency.</p><p><strong><a href="https://medium.com/@Micheal-Lanham/the-markdown-file-that-beat-a-50m-vector-database-38e1f5113cbe">https://medium.com/@Micheal-Lanham/the-markdown-file-that-beat-a-50m-vector-database-38e1f5113cbe</a></strong></p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. 
All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[Data Engineering Weekly #262]]></title><description><![CDATA[The Weekly Data Engineering Newsletter]]></description><link>https://www.dataengineeringweekly.com/p/data-engineering-weekly-262</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/data-engineering-weekly-262</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Mon, 23 Mar 2026 01:31:45 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AdQk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6f51c1bf-abc9-4cd3-ad69-22cc8e3f1ef2_1080x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/events/deep-dive-building-a-cross-workspace-control-plane-for-databricks?utm_campaign=39250579-26-03-WBNR_Deep_DIVE_DATABRICKS&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_databricks&amp;utm_content=03_22_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5Wb-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84708f0c-572e-443b-b9c6-432c2fc67219_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!5Wb-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84708f0c-572e-443b-b9c6-432c2fc67219_3840x2160.heic 848w, 
https://substackcdn.com/image/fetch/$s_!5Wb-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84708f0c-572e-443b-b9c6-432c2fc67219_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!5Wb-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84708f0c-572e-443b-b9c6-432c2fc67219_3840x2160.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5Wb-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84708f0c-572e-443b-b9c6-432c2fc67219_3840x2160.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84708f0c-572e-443b-b9c6-432c2fc67219_3840x2160.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23726,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/events/deep-dive-building-a-cross-workspace-control-plane-for-databricks?utm_campaign=39250579-26-03-WBNR_Deep_DIVE_DATABRICKS&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_databricks&amp;utm_content=03_22_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191816580?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84708f0c-572e-443b-b9c6-432c2fc67219_3840x2160.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!5Wb-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84708f0c-572e-443b-b9c6-432c2fc67219_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!5Wb-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84708f0c-572e-443b-b9c6-432c2fc67219_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!5Wb-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84708f0c-572e-443b-b9c6-432c2fc67219_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!5Wb-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84708f0c-572e-443b-b9c6-432c2fc67219_3840x2160.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h1>This week: Orchestrating Databricks across multiple workspaces</h1><p>In this hands-on deep dive, you'll learn how to build a cross-workspace control plane for Databricks using Dagster &#8212; connecting multiple workspaces, dbt, and Fivetran into a single observable asset graph with zero code changes to get started.</p><p><strong><a href="https://dagster.io/events/deep-dive-building-a-cross-workspace-control-plane-for-databricks?utm_campaign=39250579-26-03-WBNR_Deep_DIVE_DATABRICKS&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_databricks&amp;utm_content=03_22_26_data_engineering_weekly">Register now</a></strong></p><div><hr></div><h1>Pinterest: Building an MCP Ecosystem at Pinterest</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EeQV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17e84bf7-acff-4f77-a138-5bb677b30f09_1400x824.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EeQV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17e84bf7-acff-4f77-a138-5bb677b30f09_1400x824.heic 424w, https://substackcdn.com/image/fetch/$s_!EeQV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17e84bf7-acff-4f77-a138-5bb677b30f09_1400x824.heic 848w, 
https://substackcdn.com/image/fetch/$s_!EeQV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17e84bf7-acff-4f77-a138-5bb677b30f09_1400x824.heic 1272w, https://substackcdn.com/image/fetch/$s_!EeQV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17e84bf7-acff-4f77-a138-5bb677b30f09_1400x824.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EeQV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17e84bf7-acff-4f77-a138-5bb677b30f09_1400x824.heic" width="1400" height="824" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/17e84bf7-acff-4f77-a138-5bb677b30f09_1400x824.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:824,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:11677,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191816580?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17e84bf7-acff-4f77-a138-5bb677b30f09_1400x824.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EeQV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17e84bf7-acff-4f77-a138-5bb677b30f09_1400x824.heic 424w, 
https://substackcdn.com/image/fetch/$s_!EeQV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17e84bf7-acff-4f77-a138-5bb677b30f09_1400x824.heic 848w, https://substackcdn.com/image/fetch/$s_!EeQV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17e84bf7-acff-4f77-a138-5bb677b30f09_1400x824.heic 1272w, https://substackcdn.com/image/fetch/$s_!EeQV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17e84bf7-acff-4f77-a138-5bb677b30f09_1400x824.heic 1456w" sizes="100vw"></picture></div></a></figure></div><p>Agent tooling at scale requires decentralizing context across domain-specific servers while maintaining security, discoverability, and governance across production systems. Pinterest&#8217;s MCP ecosystem deploys specialized servers (Presto, Spark, Airflow) behind a central registry, routes requests via JWT end-user tokens and SPIFFE mesh identities, and enforces human approval for sensitive actions such as data overwrites. The system handles 66,000+ monthly invocations while saving engineers 7,000 hours monthly, validating decentralized tooling as the production pattern for agentic workflows.</p><p><strong><a href="https://medium.com/pinterest-engineering/building-an-mcp-ecosystem-at-pinterest-d881eb4c16f1">https://medium.com/pinterest-engineering/building-an-mcp-ecosystem-at-pinterest-d881eb4c16f1</a></strong></p><div><hr></div><h1>Julien Simon: Still Missing Critical Pieces</h1><p>Tool standardization protocols face re-fragmentation when architectural constraints&#8212;token overhead, stateless scaling, weak auth&#8212;force enterprises toward custom implementations for production workloads. The author argues that MCP won adoption but lacks enterprise readiness: Cloudflare's native MCP costs 244,000 tokens versus 1,000 in "Code Mode"; sticky routing defeats load balancers; and missing governance leaves security to individual teams. 
Companies like Perplexity and Cloudflare are abandoning MCP's tool-calling layer in favor of direct APIs and code generation, signaling that production-scale enterprises require deterministic execution patterns that MCP cannot provide.</p><p><strong><a href="https://julsimon.medium.com/still-missing-critical-pieces-7a78077235e5">https://julsimon.medium.com/still-missing-critical-pieces-7a78077235e5</a></strong></p><div><hr></div><h1>Databricks: Breaking the Microbatch Barrier: The Architecture of Apache Spark Real-Time Mode</h1><p>Real-time analytics infrastructure traditionally required separate engines for throughput (Spark) and sub-100ms latency (Flink), leading to duplicated tooling and operational complexity. Apache Spark 4.1's Real-Time Mode eliminates this trade-off by using longer epochs with boundary checkpointing, concurrent map-reduce stages, and non-blocking operators that emit results continuously rather than buffering them. The unified engine handles both massive ETL and low-latency fraud detection while preserving Spark's lineage-based fault tolerance, consolidating the data stack for single-engine architectures.</p><p><strong><a href="https://www.databricks.com/blog/breaking-microbatch-barrier-architecture-apache-spark-real-time-mode">https://www.databricks.com/blog/breaking-microbatch-barrier-architecture-apache-spark-real-time-mode</a></strong></p><div><hr></div><h1>Sponsored: The AI Modernization Guide</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=03_22_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!tvJc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615b85af-451f-47b0-a8d2-6c215f0195d2_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!tvJc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615b85af-451f-47b0-a8d2-6c215f0195d2_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!tvJc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615b85af-451f-47b0-a8d2-6c215f0195d2_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!tvJc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615b85af-451f-47b0-a8d2-6c215f0195d2_2400x1260.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tvJc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615b85af-451f-47b0-a8d2-6c215f0195d2_2400x1260.heic" width="1456" height="764" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/615b85af-451f-47b0-a8d2-6c215f0195d2_2400x1260.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25914,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=03_22_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191816580?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615b85af-451f-47b0-a8d2-6c215f0195d2_2400x1260.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tvJc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615b85af-451f-47b0-a8d2-6c215f0195d2_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!tvJc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615b85af-451f-47b0-a8d2-6c215f0195d2_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!tvJc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615b85af-451f-47b0-a8d2-6c215f0195d2_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!tvJc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F615b85af-451f-47b0-a8d2-6c215f0195d2_2400x1260.heic 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>AI is reshaping how data teams operate. 
But legacy pipelines, brittle workflows, and fragmented tooling weren&#8217;t designed for this shift.<br><br>Learn how leading teams are future-proofing their infrastructure before AI demands overwhelm it.</p><p><strong><a href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=03_22_26_data_engineering_weekly">Download the free guide</a></strong></p><div><hr></div><h1>Etsy: Making Ads Count: Using MMoE and Auxiliary Tasks to Better Connect Buyers &amp; Sellers</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pGk0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e02470a-a6c5-459b-908c-0d17a415dbab_346x658.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pGk0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e02470a-a6c5-459b-908c-0d17a415dbab_346x658.heic 424w, https://substackcdn.com/image/fetch/$s_!pGk0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e02470a-a6c5-459b-908c-0d17a415dbab_346x658.heic 848w, https://substackcdn.com/image/fetch/$s_!pGk0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e02470a-a6c5-459b-908c-0d17a415dbab_346x658.heic 1272w, https://substackcdn.com/image/fetch/$s_!pGk0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e02470a-a6c5-459b-908c-0d17a415dbab_346x658.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!pGk0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e02470a-a6c5-459b-908c-0d17a415dbab_346x658.heic" width="346" height="658" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3e02470a-a6c5-459b-908c-0d17a415dbab_346x658.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:658,&quot;width&quot;:346,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5716,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191816580?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e02470a-a6c5-459b-908c-0d17a415dbab_346x658.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pGk0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e02470a-a6c5-459b-908c-0d17a415dbab_346x658.heic 424w, https://substackcdn.com/image/fetch/$s_!pGk0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e02470a-a6c5-459b-908c-0d17a415dbab_346x658.heic 848w, https://substackcdn.com/image/fetch/$s_!pGk0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e02470a-a6c5-459b-908c-0d17a415dbab_346x658.heic 1272w, https://substackcdn.com/image/fetch/$s_!pGk0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e02470a-a6c5-459b-908c-0d17a415dbab_346x658.heic 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Multi-objective ranking in marketplaces faces metric conflicts (optimizing for clicks often degrades conversions), so models need task-specific expert routing and a way to handle data sparsity across event hierarchies. Etsy's Multi-gate Mixture-of-Experts (MMoE) architecture routes CTR and purchase prediction tasks through specialized experts with gated selection, then bridges sparse purchase signals using auxiliary add-to-cart tasks that correlate strongly with intent. 
The system achieved a 3.5% lift in purchase AUC and a 1% lift in click AUC while reducing inference cost through model pruning, enabling more accurate auto-bidding for sellers.</p><p><strong><a href="https://www.etsy.com/codeascraft/making-ads-count-using-mmoe-and-auxiliary-tasks-to-better-connect-buyers--sellers">https://www.etsy.com/codeascraft/making-ads-count-using-mmoe-and-auxiliary-tasks-to-better-connect-buyers--sellers</a></strong></p><div><hr></div><h1>Meta: Ranking Engineer Agent (REA): The Autonomous AI Agent Accelerating Meta&#8217;s Ads Ranking Innovation</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jBFq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F338f85d1-de4a-4734-9722-fd66e1033277_1463x720.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jBFq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F338f85d1-de4a-4734-9722-fd66e1033277_1463x720.heic 424w, https://substackcdn.com/image/fetch/$s_!jBFq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F338f85d1-de4a-4734-9722-fd66e1033277_1463x720.heic 848w, https://substackcdn.com/image/fetch/$s_!jBFq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F338f85d1-de4a-4734-9722-fd66e1033277_1463x720.heic 1272w, https://substackcdn.com/image/fetch/$s_!jBFq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F338f85d1-de4a-4734-9722-fd66e1033277_1463x720.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!jBFq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F338f85d1-de4a-4734-9722-fd66e1033277_1463x720.heic" width="1456" height="717" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/338f85d1-de4a-4734-9722-fd66e1033277_1463x720.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:717,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:21461,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191816580?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F338f85d1-de4a-4734-9722-fd66e1033277_1463x720.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jBFq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F338f85d1-de4a-4734-9722-fd66e1033277_1463x720.heic 424w, https://substackcdn.com/image/fetch/$s_!jBFq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F338f85d1-de4a-4734-9722-fd66e1033277_1463x720.heic 848w, https://substackcdn.com/image/fetch/$s_!jBFq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F338f85d1-de4a-4734-9722-fd66e1033277_1463x720.heic 1272w, https://substackcdn.com/image/fetch/$s_!jBFq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F338f85d1-de4a-4734-9722-fd66e1033277_1463x720.heic 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>ML model iteration at production scale requires balancing hypothesis generation, resource constraints, and infrastructure resilience across multi-day experiment cycles. Meta&#8217;s Ranking Engineer Agent combines a Dual-Source Hypothesis Engine (historical experiments + novel ML research proposals) with autonomous debugging and cost-aware planning to execute long-horizon ranking workflows without human supervision. 
The system doubled average model accuracy across six models while enabling three engineers to maintain eight production models&#8212;a 5x productivity gain over traditional team structures.</p><p><strong><a href="https://engineering.fb.com/2026/03/17/developer-tools/ranking-engineer-agent-rea-autonomous-ai-system-accelerating-meta-ads-ranking-innovation/">https://engineering.fb.com/2026/03/17/developer-tools/ranking-engineer-agent-rea-autonomous-ai-system-accelerating-meta-ads-ranking-innovation/</a></strong></p><div><hr></div><h1>Rahul Garg: Context Anchoring</h1><p>AI-assisted development degrades over long sessions as models lose reasoning context ("why") despite retaining technical choices ("what"), trapping developers in single conversations to avoid context loss. The author proposes Context Anchoring&#8212;externalizing decision rationale, rejected alternatives, and constraints into lightweight Feature Documents outside the chat interface. Teams adopting external anchoring achieve warm starts in seconds, reduce token costs by 98%, enable multi-developer alignment, and validate logic through forced documentation, eliminating session anxiety as a design anti-pattern.</p><p><strong><a href="https://martinfowler.com/articles/reduce-friction-ai/context-anchoring.html">https://martinfowler.com/articles/reduce-friction-ai/context-anchoring.html</a></strong></p><div><hr></div><h1>Dropbox: How we optimized Dash&#8217;s relevance judge with DSPy</h1><p>Relevance scoring at scale faces a model-cost-quality trade-off: premium models like o3 are accurate but expensive, while cheaper open-weight models degrade without model-specific prompt tuning. Dropbox's DSPy-based optimization automates prompt refinement against human judgments using NMSE metrics and GEPA feedback loops, reducing manual tuning from weeks to 1&#8211;2 days. 
The system cut relevance error by 45%, reduced JSON formatting failures by 97%, and enabled 10&#8211;100x data scaling by shifting to cheaper models while maintaining quality through systematic prompt compilation.</p><p><strong><a href="https://dropbox.tech/machine-learning/optimizing-dropbox-dash-relevance-judge-with-dspy">https://dropbox.tech/machine-learning/optimizing-dropbox-dash-relevance-judge-with-dspy</a></strong></p><div><hr></div><h1>Zalando: Search Quality Assurance with AI as a Judge</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mPcT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3c2467-a469-46ab-8b86-d517403946cc_1500x973.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mPcT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3c2467-a469-46ab-8b86-d517403946cc_1500x973.heic 424w, https://substackcdn.com/image/fetch/$s_!mPcT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3c2467-a469-46ab-8b86-d517403946cc_1500x973.heic 848w, https://substackcdn.com/image/fetch/$s_!mPcT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3c2467-a469-46ab-8b86-d517403946cc_1500x973.heic 1272w, https://substackcdn.com/image/fetch/$s_!mPcT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3c2467-a469-46ab-8b86-d517403946cc_1500x973.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!mPcT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3c2467-a469-46ab-8b86-d517403946cc_1500x973.heic" width="1456" height="944" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf3c2467-a469-46ab-8b86-d517403946cc_1500x973.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:944,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22409,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191816580?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3c2467-a469-46ab-8b86-d517403946cc_1500x973.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mPcT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3c2467-a469-46ab-8b86-d517403946cc_1500x973.heic 424w, https://substackcdn.com/image/fetch/$s_!mPcT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3c2467-a469-46ab-8b86-d517403946cc_1500x973.heic 848w, https://substackcdn.com/image/fetch/$s_!mPcT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3c2467-a469-46ab-8b86-d517403946cc_1500x973.heic 1272w, https://substackcdn.com/image/fetch/$s_!mPcT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3c2467-a469-46ab-8b86-d517403946cc_1500x973.heic 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Search quality assurance in a new domain often lacks historical user data, forcing teams to rely on manual testing and reactive fixes post-launch. Zalando's framework automates evaluation by generating NER-clustered queries translated into target languages, then routes the results through GPT-4o as a multimodal judge that assesses product metadata and images against a 0&#8211;4 relevance scale. 
The system evaluates 1,500 search segments (37,500 results) in 3&#8211;5 hours for $250, enabling proactive root-cause identification across languages and ensuring Day 1 quality without local market expertise.</p><p><strong><a href="https://engineering.zalando.com/posts/2026/03/search-quality-assurance-with-llm-judge.html">https://engineering.zalando.com/posts/2026/03/search-quality-assurance-with-llm-judge.html</a></strong></p><div><hr></div><h1>Andrey Novitskiy: Volga - A Rust Rewrite of a Real-Time AI/ML Data Engine (DataFusion, Arrow, SlateDB) with a Chronon + OpenMLDB&#8211;Style Architecture</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6UyA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb0b8f8-9c48-4fa9-b8f0-6cd21d900048_1456x1069.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6UyA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb0b8f8-9c48-4fa9-b8f0-6cd21d900048_1456x1069.heic 424w, https://substackcdn.com/image/fetch/$s_!6UyA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb0b8f8-9c48-4fa9-b8f0-6cd21d900048_1456x1069.heic 848w, https://substackcdn.com/image/fetch/$s_!6UyA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb0b8f8-9c48-4fa9-b8f0-6cd21d900048_1456x1069.heic 1272w, https://substackcdn.com/image/fetch/$s_!6UyA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb0b8f8-9c48-4fa9-b8f0-6cd21d900048_1456x1069.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!6UyA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb0b8f8-9c48-4fa9-b8f0-6cd21d900048_1456x1069.heic" width="1456" height="1069" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/feb0b8f8-9c48-4fa9-b8f0-6cd21d900048_1456x1069.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1069,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:26732,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191816580?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb0b8f8-9c48-4fa9-b8f0-6cd21d900048_1456x1069.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6UyA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb0b8f8-9c48-4fa9-b8f0-6cd21d900048_1456x1069.heic 424w, https://substackcdn.com/image/fetch/$s_!6UyA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb0b8f8-9c48-4fa9-b8f0-6cd21d900048_1456x1069.heic 848w, https://substackcdn.com/image/fetch/$s_!6UyA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb0b8f8-9c48-4fa9-b8f0-6cd21d900048_1456x1069.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!6UyA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeb0b8f8-9c48-4fa9-b8f0-6cd21d900048_1456x1069.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Real-time ML feature computation needs unified streaming-batch execution with low-latency serving, yet platforms typically end up stitching together specialized engines (Flink for latency, Spark for batch, Redis for serving). 
Volga's Rust implementation pairs DataFusion SQL execution with SlateDB (an embedded LSM on object storage) and Request Mode, embedding serving logic directly into operator state to eliminate external cache round trips. The system handles month-year windows via tiling, includes native ML aggregations (top-k, categorical sums), and achieves compute-storage separation by consolidating the feature pipeline, batch training, and real-time serving into a single Rust binary.</p><p><strong><a href="https://volgaai.substack.com/p/volga-a-rust-rewrite-of-a-real-time">https://volgaai.substack.com/p/volga-a-rust-rewrite-of-a-real-time</a></strong></p><div><hr></div><h1>Max Halford: Lower your warehouse costs via DuckDB transpilation</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k0yk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F992e1193-d2a8-4957-8af6-7c820a7acb1e_2151x932.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k0yk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F992e1193-d2a8-4957-8af6-7c820a7acb1e_2151x932.heic 424w, https://substackcdn.com/image/fetch/$s_!k0yk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F992e1193-d2a8-4957-8af6-7c820a7acb1e_2151x932.heic 848w, https://substackcdn.com/image/fetch/$s_!k0yk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F992e1193-d2a8-4957-8af6-7c820a7acb1e_2151x932.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!k0yk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F992e1193-d2a8-4957-8af6-7c820a7acb1e_2151x932.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k0yk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F992e1193-d2a8-4957-8af6-7c820a7acb1e_2151x932.heic" width="1456" height="631" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/992e1193-d2a8-4957-8af6-7c820a7acb1e_2151x932.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:631,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:68792,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191816580?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F992e1193-d2a8-4957-8af6-7c820a7acb1e_2151x932.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!k0yk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F992e1193-d2a8-4957-8af6-7c820a7acb1e_2151x932.heic 424w, https://substackcdn.com/image/fetch/$s_!k0yk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F992e1193-d2a8-4957-8af6-7c820a7acb1e_2151x932.heic 848w, 
https://substackcdn.com/image/fetch/$s_!k0yk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F992e1193-d2a8-4957-8af6-7c820a7acb1e_2151x932.heic 1272w, https://substackcdn.com/image/fetch/$s_!k0yk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F992e1193-d2a8-4957-8af6-7c820a7acb1e_2151x932.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Cloud warehouse compute costs escalate during development and testing cycles, incentivizing hybrid approaches that separate cheap 
storage from expensive query execution. Max Halford&#8217;s &#8220;Quack Mode&#8221; transpiles warehouse SQL (BigQuery &#8594; DuckDB) using SQLGlot, pulls only upstream dependencies into local DuckDB instances, and executes transformations at near-zero cost, optionally pushing results back. The pattern dramatically reduces development compute spend while maintaining warehouse portability, though pulling tables &gt;100GB remains a bottleneck until zero-copy solutions like Iceberg are available.</p><p><strong><a href="https://maxhalford.github.io/blog/warehouse-cost-reduction-quack-mode/">https://maxhalford.github.io/blog/warehouse-cost-reduction-quack-mode/</a></strong></p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[Data Engineering Weekly #261]]></title><description><![CDATA[The Weekly Data Engineering Newsletter]]></description><link>https://www.dataengineeringweekly.com/p/data-engineering-weekly-261</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/data-engineering-weekly-261</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Mon, 16 Mar 2026 00:49:14 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AdQk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6f51c1bf-abc9-4cd3-ad69-22cc8e3f1ef2_1080x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://dagster.io/events/deep-dive-building-a-cross-workspace-control-plane-for-databricks?utm_campaign=39250579-26-03-WBNR_Deep_DIVE_DATABRICKS&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_databricks&amp;utm_content=03_15_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Lnqf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb4b54f-52d4-4bdd-a7ee-b8cfed5fee04_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!Lnqf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb4b54f-52d4-4bdd-a7ee-b8cfed5fee04_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!Lnqf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb4b54f-52d4-4bdd-a7ee-b8cfed5fee04_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!Lnqf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb4b54f-52d4-4bdd-a7ee-b8cfed5fee04_3840x2160.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Lnqf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb4b54f-52d4-4bdd-a7ee-b8cfed5fee04_3840x2160.heic" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bfb4b54f-52d4-4bdd-a7ee-b8cfed5fee04_3840x2160.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24982,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/events/deep-dive-building-a-cross-workspace-control-plane-for-databricks?utm_campaign=39250579-26-03-WBNR_Deep_DIVE_DATABRICKS&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_databricks&amp;utm_content=03_15_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191078036?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb4b54f-52d4-4bdd-a7ee-b8cfed5fee04_3840x2160.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Lnqf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb4b54f-52d4-4bdd-a7ee-b8cfed5fee04_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!Lnqf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb4b54f-52d4-4bdd-a7ee-b8cfed5fee04_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!Lnqf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb4b54f-52d4-4bdd-a7ee-b8cfed5fee04_3840x2160.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!Lnqf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfb4b54f-52d4-4bdd-a7ee-b8cfed5fee04_3840x2160.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h1>How to Orchestrate Databricks Across Multiple Workspaces</h1><p>As Databricks deployments scale, a familiar pattern emerges: multiple workspaces, multiple teams, and no reliable way to manage the dependencies between them.<br><br>In this hands-on deep dive, we'll show you how to build a cross-workspace control plane using Dagster on top of your 
existing Databricks environment. Demo-heavy and practitioner-focused, you'll leave with working patterns you can apply to your own platform the same day.</p><p><strong><a href="https://dagster.io/events/deep-dive-building-a-cross-workspace-control-plane-for-databricks?utm_campaign=39250579-26-03-WBNR_Deep_DIVE_DATABRICKS&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_databricks&amp;utm_content=03_15_26_data_engineering_weekly">Register now</a></strong></p><div><hr></div><h1>Editor&#8217;s Note: Introducing Data Engineering After AI Podcast Series</h1><p>Lately, I&#8217;ve been thinking a lot about the intersection of data architecture and AI. To dig deeper into this, I&#8217;m launching a <strong>new podcast series</strong> called <strong><a href="https://www.dataengineeringweekly.com/p/data-engineering-after-ai">Data Engineering After AI.</a></strong></p><p>I&#8217;m looking for guests who are in the trenches. If you have strong opinions on where the industry is heading, or if you are actively building solutions in this space (either in-house or as a product), let&#8217;s talk.</p><p><strong>Please note:</strong> my goal is to foster an authentic discussion about how AI is reshaping data engineering from the ground up. This isn&#8217;t a space for promotional product pitches, and I want to keep the conversation strictly focused on the technology, the challenges, and the architectural shifts.</p><p>If you are passionate about the future of our field and want to share your insights, DM me on <strong><a href="https://www.linkedin.com/in/ananthdurai/">LinkedIn</a></strong>.</p><div><hr></div><h1>Joseph M. Hellerstein: AI and the Mixed-Consistency Future</h1><p>In my recent article, <strong><a href="https://www.dataengineeringweekly.com/p/etl-is-dead">ETL is dead</a></strong>, I projected that the data modeling techniques that got us here may not be sufficient for the AI era. 
The consistency model is one of the biggest gaps in the emerging file-based system designs built around AI agents. We have seen this kind of shift before, in the move from the Hadoop file system to the Lakehouse model. The author suggests that we may be entering a mixed-consistency future.</p><p><strong><a href="https://jhellerstein.github.io/blog/ai-mixed-consistency/">https://jhellerstein.github.io/blog/ai-mixed-consistency/</a></strong></p><div><hr></div><h1>Milan Mosny: Ontology, Taxonomy, Data Model, Context Graph &amp; Friends</h1><p>Context engineering is a hot topic in the industry. The author gives an excellent recap of ontology, taxonomy, data model &amp; context graph. As the famous saying goes, it is all data engineering.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YbOb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78af57ca-d7bc-4843-a7b4-26f1009d92ae_590x202.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YbOb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78af57ca-d7bc-4843-a7b4-26f1009d92ae_590x202.heic 424w, https://substackcdn.com/image/fetch/$s_!YbOb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78af57ca-d7bc-4843-a7b4-26f1009d92ae_590x202.heic 848w, https://substackcdn.com/image/fetch/$s_!YbOb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78af57ca-d7bc-4843-a7b4-26f1009d92ae_590x202.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!YbOb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78af57ca-d7bc-4843-a7b4-26f1009d92ae_590x202.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YbOb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78af57ca-d7bc-4843-a7b4-26f1009d92ae_590x202.heic" width="590" height="202" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/78af57ca-d7bc-4843-a7b4-26f1009d92ae_590x202.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:202,&quot;width&quot;:590,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5085,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191078036?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78af57ca-d7bc-4843-a7b4-26f1009d92ae_590x202.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YbOb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78af57ca-d7bc-4843-a7b4-26f1009d92ae_590x202.heic 424w, https://substackcdn.com/image/fetch/$s_!YbOb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78af57ca-d7bc-4843-a7b4-26f1009d92ae_590x202.heic 848w, 
https://substackcdn.com/image/fetch/$s_!YbOb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78af57ca-d7bc-4843-a7b4-26f1009d92ae_590x202.heic 1272w, https://substackcdn.com/image/fetch/$s_!YbOb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78af57ca-d7bc-4843-a7b4-26f1009d92ae_590x202.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong><a href="https://medium.com/response42/ontology-taxonomy-data-model-context-graph-friends-56a605e14355">https://medium.com/response42/ontology-taxonomy-data-model-context-graph-friends-56a605e14355</a></strong></p><div><hr></div><h1>Jason Cui &amp; Jennifer Li: Your Data Agents Need Context</h1><p>Contextual grounding&#8212;standardized terminology, data lineage, operational semantics&#8212;determines whether natural language agents answer analytics questions reliably. The authors propose a &#8220;Context Layer&#8221; combining LLM-powered metadata construction with human refinement to map business knowledge onto warehouse schemas. 
Organizations adopting context-aware agent architectures unlock self-serve analytics without brittleness, enabling agents to reason consistently across disparate schemas.</p><p><strong><a href="https://www.a16z.news/p/your-data-agents-need-context">https://www.a16z.news/p/your-data-agents-need-context</a></strong></p><div><hr></div><h1>Sponsored: The AI Modernization Guide</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=03_15_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IM5z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a3722ad-fd1c-4213-8232-bb4fff037001_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!IM5z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a3722ad-fd1c-4213-8232-bb4fff037001_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!IM5z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a3722ad-fd1c-4213-8232-bb4fff037001_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!IM5z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a3722ad-fd1c-4213-8232-bb4fff037001_2400x1260.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IM5z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a3722ad-fd1c-4213-8232-bb4fff037001_2400x1260.heic" 
width="1456" height="764" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2a3722ad-fd1c-4213-8232-bb4fff037001_2400x1260.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25914,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=03_15_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191078036?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a3722ad-fd1c-4213-8232-bb4fff037001_2400x1260.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IM5z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a3722ad-fd1c-4213-8232-bb4fff037001_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!IM5z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a3722ad-fd1c-4213-8232-bb4fff037001_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!IM5z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a3722ad-fd1c-4213-8232-bb4fff037001_2400x1260.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!IM5z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a3722ad-fd1c-4213-8232-bb4fff037001_2400x1260.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>AI is reshaping how data teams operate. 
But legacy pipelines, brittle workflows, and fragmented tooling weren&#8217;t designed for this shift.<br><br>Learn how leading teams are future-proofing their infrastructure before AI demands overwhelm it.</p><p><strong><a href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=03_15_26_data_engineering_weekly">Download the free guide</a></strong></p><div><hr></div><h1>Robin Moffatt: Claude Code isn&#8217;t going to replace data engineers (yet)</h1><p>Claude Code has seen some degree of success in software engineering. Is it ready for prime time in data engineering? The author notes gaps in trust &amp; accuracy, silent data loss, non-determinism, technical flaws, and maintenance. Data engineering still lacks an efficient sandbox environment to bridge these gaps, and one is a must for brownfield projects.</p><p><strong><a href="https://rmoff.net/2026/03/11/claude-code-isnt-going-to-replace-data-engineers-yet/">https://rmoff.net/2026/03/11/claude-code-isnt-going-to-replace-data-engineers-yet/</a></strong></p><div><hr></div><h1>Snap: Agent Format: A Declarative Standard for AI Agents</h1><p>Speed and correctness in execution always trade off against each other. Snap writes about how its teams adopted different AI frameworks to move fast, and about the standard interface design that makes everything work together. I believe that as long as the pendulum swings between speed and efficiency, software engineering is safe. 
We will always build the next best abstraction.</p><p><strong><a href="https://eng.snap.com/agent-format">https://eng.snap.com/agent-format</a></strong></p><div><hr></div><h1>LinkedIn: Engineering the next generation of LinkedIn&#8217;s Feed</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!V6EO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ec2daf-48a5-4f61-bc52-3180ecd05256_459x510.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!V6EO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ec2daf-48a5-4f61-bc52-3180ecd05256_459x510.heic 424w, https://substackcdn.com/image/fetch/$s_!V6EO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ec2daf-48a5-4f61-bc52-3180ecd05256_459x510.heic 848w, https://substackcdn.com/image/fetch/$s_!V6EO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ec2daf-48a5-4f61-bc52-3180ecd05256_459x510.heic 1272w, https://substackcdn.com/image/fetch/$s_!V6EO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ec2daf-48a5-4f61-bc52-3180ecd05256_459x510.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!V6EO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ec2daf-48a5-4f61-bc52-3180ecd05256_459x510.heic" width="459" height="510" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/05ec2daf-48a5-4f61-bc52-3180ecd05256_459x510.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:510,&quot;width&quot;:459,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7915,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191078036?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ec2daf-48a5-4f61-bc52-3180ecd05256_459x510.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!V6EO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ec2daf-48a5-4f61-bc52-3180ecd05256_459x510.heic 424w, https://substackcdn.com/image/fetch/$s_!V6EO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ec2daf-48a5-4f61-bc52-3180ecd05256_459x510.heic 848w, https://substackcdn.com/image/fetch/$s_!V6EO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ec2daf-48a5-4f61-bc52-3180ecd05256_459x510.heic 1272w, https://substackcdn.com/image/fetch/$s_!V6EO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ec2daf-48a5-4f61-bc52-3180ecd05256_459x510.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Feed personalization at massive scale requires unifying disparate retrieval signals into semantic representations while maintaining sub-second latency for more than a billion members. LinkedIn's architecture consolidates keyword matching, collaborative filtering, and engagement signals into dual-encoder LLM retrieval paired with a Generative Recommender transformer that sequences 1,000+ historical interactions to capture professional trajectories. 
Custom infrastructure&#8212;Flash Attention variants, GPU-optimized data loaders, decoupled nearline pipelines&#8212;enables semantic ranking at sub-second latency for 1.3 billion members while reducing training memory by 37%.</p><p><strong><a href="https://www.linkedin.com/blog/engineering/feed/engineering-the-next-generation-of-linkedins-feed">https://www.linkedin.com/blog/engineering/feed/engineering-the-next-generation-of-linkedins-feed</a></strong></p><div><hr></div><h1>Spotify: Inside the Archive: The Tech Behind Your 2025 Wrapped Highlights</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WeU7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82b0f593-056b-4088-a8de-ee34c04bb6ef_1920x1080.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WeU7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82b0f593-056b-4088-a8de-ee34c04bb6ef_1920x1080.heic 424w, https://substackcdn.com/image/fetch/$s_!WeU7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82b0f593-056b-4088-a8de-ee34c04bb6ef_1920x1080.heic 848w, https://substackcdn.com/image/fetch/$s_!WeU7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82b0f593-056b-4088-a8de-ee34c04bb6ef_1920x1080.heic 1272w, https://substackcdn.com/image/fetch/$s_!WeU7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82b0f593-056b-4088-a8de-ee34c04bb6ef_1920x1080.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!WeU7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82b0f593-056b-4088-a8de-ee34c04bb6ef_1920x1080.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/82b0f593-056b-4088-a8de-ee34c04bb6ef_1920x1080.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17157,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191078036?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82b0f593-056b-4088-a8de-ee34c04bb6ef_1920x1080.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WeU7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82b0f593-056b-4088-a8de-ee34c04bb6ef_1920x1080.heic 424w, https://substackcdn.com/image/fetch/$s_!WeU7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82b0f593-056b-4088-a8de-ee34c04bb6ef_1920x1080.heic 848w, https://substackcdn.com/image/fetch/$s_!WeU7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82b0f593-056b-4088-a8de-ee34c04bb6ef_1920x1080.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!WeU7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82b0f593-056b-4088-a8de-ee34c04bb6ef_1920x1080.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Generating personalized narratives at billion scale requires balancing creative consistency, latency constraints, and data fidelity without human review. 
Spotify's Wrapped Archive distills frontier LLM outputs into smaller production models via DPO, grounds narratives in heuristic-ranked "remarkable days" from distributed pipelines, and uses layered prompts to enforce tone while preventing hallucinations. Column-oriented storage with per-day qualifiers, pre-scaled compute, and automated Judge-model sampling of 165,000 reports enable 1.4 billion unique narratives at launch latency while catching systemic failures such as timezone bugs.</p><p><strong><a href="https://engineering.atspotify.com/2026/3/inside-the-archive-2025-wrapped">https://engineering.atspotify.com/2026/3/inside-the-archive-2025-wrapped</a></strong></p><div><hr></div><h1>LinkedIn: Driving data enhancement &amp; recruitment success with LinkedIn&#8217;s unified integrations</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iXUn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F484e89d1-98e6-4217-8e01-22b4d4f8990f_1200x465.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iXUn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F484e89d1-98e6-4217-8e01-22b4d4f8990f_1200x465.heic 424w, https://substackcdn.com/image/fetch/$s_!iXUn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F484e89d1-98e6-4217-8e01-22b4d4f8990f_1200x465.heic 848w, https://substackcdn.com/image/fetch/$s_!iXUn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F484e89d1-98e6-4217-8e01-22b4d4f8990f_1200x465.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!iXUn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F484e89d1-98e6-4217-8e01-22b4d4f8990f_1200x465.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iXUn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F484e89d1-98e6-4217-8e01-22b4d4f8990f_1200x465.heic" width="1200" height="465" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/484e89d1-98e6-4217-8e01-22b4d4f8990f_1200x465.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:465,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9979,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191078036?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F484e89d1-98e6-4217-8e01-22b4d4f8990f_1200x465.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iXUn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F484e89d1-98e6-4217-8e01-22b4d4f8990f_1200x465.heic 424w, https://substackcdn.com/image/fetch/$s_!iXUn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F484e89d1-98e6-4217-8e01-22b4d4f8990f_1200x465.heic 848w, 
https://substackcdn.com/image/fetch/$s_!iXUn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F484e89d1-98e6-4217-8e01-22b4d4f8990f_1200x465.heic 1272w, https://substackcdn.com/image/fetch/$s_!iXUn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F484e89d1-98e6-4217-8e01-22b4d4f8990f_1200x465.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Recruitment data fragmentation&#8212;disparate ATS schemas, semantic conflicts, and partner integration overhead&#8212;blocks AI 
agents from reliably reasoning across hiring pipelines. LinkedIn's unified platform standardizes partner data into canonical schemas via hybrid push/pull models (BuildIn for speed, BuildOut with Temporal orchestration for reliability), assigns stable Integration IDs to decouple identity, and reconciles multi-source conflicts into single-truth serving layers. The system cut onboarding from 12 months to 4, expanded job field coverage 1.8x, and dropped resume gaps below 10%, enabling agents to reason and act consistently across enterprise hiring systems.</p><p><strong><a href="https://www.linkedin.com/blog/engineering/talent/driving-data-enhancement-and-recruitment-success-with-linkedins-unified-integrations">https://www.linkedin.com/blog/engineering/talent/driving-data-enhancement-and-recruitment-success-with-linkedins-unified-integrations</a></strong></p><div><hr></div><h1>Uber: Transforming Ads Personalization with Sequential Modeling and Hetero-MMoE at Uber</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lZPM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf070bf-2430-4223-b746-637eb1edc9e7_1536x746.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lZPM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf070bf-2430-4223-b746-637eb1edc9e7_1536x746.heic 424w, https://substackcdn.com/image/fetch/$s_!lZPM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf070bf-2430-4223-b746-637eb1edc9e7_1536x746.heic 848w, 
https://substackcdn.com/image/fetch/$s_!lZPM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf070bf-2430-4223-b746-637eb1edc9e7_1536x746.heic 1272w, https://substackcdn.com/image/fetch/$s_!lZPM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf070bf-2430-4223-b746-637eb1edc9e7_1536x746.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lZPM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf070bf-2430-4223-b746-637eb1edc9e7_1536x746.heic" width="1456" height="707" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fdf070bf-2430-4223-b746-637eb1edc9e7_1536x746.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:707,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12616,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191078036?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf070bf-2430-4223-b746-637eb1edc9e7_1536x746.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lZPM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf070bf-2430-4223-b746-637eb1edc9e7_1536x746.heic 424w, 
https://substackcdn.com/image/fetch/$s_!lZPM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf070bf-2430-4223-b746-637eb1edc9e7_1536x746.heic 848w, https://substackcdn.com/image/fetch/$s_!lZPM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf070bf-2430-4223-b746-637eb1edc9e7_1536x746.heic 1272w, https://substackcdn.com/image/fetch/$s_!lZPM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdf070bf-2430-4223-b746-637eb1edc9e7_1536x746.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Ads ranking at scale requires capturing sequential user intent over long behavioral histories while simultaneously optimizing competing objectives such as clicks and conversions. Uber's system pairs target-aware transformers with Multi-Head Latent Attention (reducing sequence complexity from O(N&#178;) to O(N&#215;L)) to compress engagement histories, then routes the compressed signals through Hetero-MMoE&#8212;blending DCN and CIN experts to capture low- to high-order feature interactions across multimodal inputs. Online experiments yielded +0.93% AUC on predicted CTR and +0.66% AUC on predicted click-to-order, validating sequential modeling at ranking scale.</p><p><strong><a href="https://www.uber.com/en-EG/blog/transforming-ads-personalization/">https://www.uber.com/en-EG/blog/transforming-ads-personalization/</a></strong></p><div><hr></div><h1>Databricks: LogSentinel: How Databricks uses Databricks for LLM-Powered PII Detection and Governance</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FI4x!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F294b3671-32e1-4898-813e-de0295ae45f2_1999x1250.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FI4x!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F294b3671-32e1-4898-813e-de0295ae45f2_1999x1250.heic 424w, https://substackcdn.com/image/fetch/$s_!FI4x!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F294b3671-32e1-4898-813e-de0295ae45f2_1999x1250.heic 848w, 
https://substackcdn.com/image/fetch/$s_!FI4x!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F294b3671-32e1-4898-813e-de0295ae45f2_1999x1250.heic 1272w, https://substackcdn.com/image/fetch/$s_!FI4x!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F294b3671-32e1-4898-813e-de0295ae45f2_1999x1250.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FI4x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F294b3671-32e1-4898-813e-de0295ae45f2_1999x1250.heic" width="1456" height="910" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/294b3671-32e1-4898-813e-de0295ae45f2_1999x1250.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:910,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19202,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/191078036?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F294b3671-32e1-4898-813e-de0295ae45f2_1999x1250.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FI4x!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F294b3671-32e1-4898-813e-de0295ae45f2_1999x1250.heic 424w, 
https://substackcdn.com/image/fetch/$s_!FI4x!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F294b3671-32e1-4898-813e-de0295ae45f2_1999x1250.heic 848w, https://substackcdn.com/image/fetch/$s_!FI4x!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F294b3671-32e1-4898-813e-de0295ae45f2_1999x1250.heic 1272w, https://substackcdn.com/image/fetch/$s_!FI4x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F294b3671-32e1-4898-813e-de0295ae45f2_1999x1250.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>PII discovery and compliance monitoring at data warehouse scale require automating label classification across schema evolution without manual audit cycles. Databricks&#8217; LogSentinel orchestrates multiple LLM &#8220;experts&#8221; in parallel&#8212;augmented with Vector Search context and AI-generated column comments&#8212;to classify data across 100+ granular, hierarchical, and residency labels, selecting predictions by confidence voting. The system achieves 92% precision and 95% recall while reducing manual review cycles from weeks to hours, enabling real-time governance as schemas drift.</p><p><strong><a href="https://www.databricks.com/blog/logsentinel-how-databricks-uses-databricks-llm-powered-pii-detection-and-governance">https://www.databricks.com/blog/logsentinel-how-databricks-uses-databricks-llm-powered-pii-detection-and-governance</a></strong></p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. 
All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[ETL is Dead]]></title><description><![CDATA[Why the shift from human-operated to agent-operated data warehouses demands a new architecture]]></description><link>https://www.dataengineeringweekly.com/p/etl-is-dead</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/etl-is-dead</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Wed, 11 Mar 2026 14:42:11 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!BI6v!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5feea529-8813-4b35-b17c-a8580d63201a_2232x886.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>More ETL pipelines will run in 2027 than in any year in history. AI will generate more extraction jobs, more transformation logic, and more loading routines than any team of data engineers could write by hand. The volume of ETL will explode.</p><p>And ETL is still dead.</p><p>Not dead the way Latin is dead &#8212; no one speaks it. Dead, the way landlines are dead &#8212; they still work, millions exist, but nobody builds their communication strategy around one. ETL is dead as the defining work of data engineering. Dead as the thing we hire for, build careers around, and organize teams to do. The pipelines keep running. The professional identity built around them does not survive.</p><h1>The Warehouse Was Always a Metaphor. 
Now the Metaphor Is Breaking.</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BI6v!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5feea529-8813-4b35-b17c-a8580d63201a_2232x886.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BI6v!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5feea529-8813-4b35-b17c-a8580d63201a_2232x886.heic 424w, https://substackcdn.com/image/fetch/$s_!BI6v!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5feea529-8813-4b35-b17c-a8580d63201a_2232x886.heic 848w, https://substackcdn.com/image/fetch/$s_!BI6v!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5feea529-8813-4b35-b17c-a8580d63201a_2232x886.heic 1272w, https://substackcdn.com/image/fetch/$s_!BI6v!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5feea529-8813-4b35-b17c-a8580d63201a_2232x886.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BI6v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5feea529-8813-4b35-b17c-a8580d63201a_2232x886.heic" width="1456" height="578" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5feea529-8813-4b35-b17c-a8580d63201a_2232x886.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:578,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:26399,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/190620055?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5feea529-8813-4b35-b17c-a8580d63201a_2232x886.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BI6v!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5feea529-8813-4b35-b17c-a8580d63201a_2232x886.heic 424w, https://substackcdn.com/image/fetch/$s_!BI6v!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5feea529-8813-4b35-b17c-a8580d63201a_2232x886.heic 848w, https://substackcdn.com/image/fetch/$s_!BI6v!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5feea529-8813-4b35-b17c-a8580d63201a_2232x886.heic 1272w, https://substackcdn.com/image/fetch/$s_!BI6v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5feea529-8813-4b35-b17c-a8580d63201a_2232x886.heic 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We literally called it a data <em>warehouse</em>. And that wasn&#8217;t just naming &#8212; we replicated the entire physical warehouse operating model into the digital world. Racks became tables. Inventory management became catalogs. Forklifts became ETL pipelines. Floor workers became data engineers. 
Shift supervisors became analytics leads.</p><p>Every technique we built &#8212; <strong><a href="https://en.wikipedia.org/wiki/Dimensional_modeling">star schemas</a></strong>, slowly changing dimensions, <strong><a href="https://www.databricks.com/glossary/medallion-architecture">medallion architectures</a></strong>, conformed dimensions &#8212; served the same purpose as aisle markers and shelf labels in a physical warehouse: help a <em>human</em> walk in, find what they need, and carry it out.</p><p>Data modeling organized information so humans could discover it. Data catalogs provided wayfinding to help humans navigate it. The medallion architecture created a pick-pack-ship assembly line where humans inspected and validated data at each station. Naming conventions &#8212; fact_orders, dim_customers &#8212; acted as signage so humans could read the shelves at a glance.</p><p>Every design decision was optimized for human cognition. And then the operator changed.</p><h1>What Happened When Robots Entered the Physical Warehouse</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YGa4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ee87f2-b944-4306-b222-399c99f16632_2182x842.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YGa4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ee87f2-b944-4306-b222-399c99f16632_2182x842.heic 424w, https://substackcdn.com/image/fetch/$s_!YGa4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ee87f2-b944-4306-b222-399c99f16632_2182x842.heic 848w, 
https://substackcdn.com/image/fetch/$s_!YGa4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ee87f2-b944-4306-b222-399c99f16632_2182x842.heic 1272w, https://substackcdn.com/image/fetch/$s_!YGa4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ee87f2-b944-4306-b222-399c99f16632_2182x842.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YGa4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ee87f2-b944-4306-b222-399c99f16632_2182x842.heic" width="1456" height="562" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f4ee87f2-b944-4306-b222-399c99f16632_2182x842.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:562,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24175,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/190620055?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ee87f2-b944-4306-b222-399c99f16632_2182x842.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YGa4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ee87f2-b944-4306-b222-399c99f16632_2182x842.heic 424w, 
https://substackcdn.com/image/fetch/$s_!YGa4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ee87f2-b944-4306-b222-399c99f16632_2182x842.heic 848w, https://substackcdn.com/image/fetch/$s_!YGa4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ee87f2-b944-4306-b222-399c99f16632_2182x842.heic 1272w, https://substackcdn.com/image/fetch/$s_!YGa4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4ee87f2-b944-4306-b222-399c99f16632_2182x842.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>When Amazon deployed <strong><a href="https://en.wikipedia.org/wiki/Amazon_Robotics">Kiva robots</a></strong>, they didn&#8217;t replace human tasks one-for-one. They <strong><a href="https://spectrum.ieee.org/amazon-ai-robotics">redesigned the entire warehouse</a></strong> around a different operator.</p><p>Physical warehouses built for humans had wide aisles because humans need space to walk. They grouped items logically because humans need to remember where things are. They placed high-demand products at eye level because humans have ergonomic constraints. They posted signage everywhere because humans need wayfinding.</p><p>Robotic warehouses <strong><a href="https://www.aboutamazon.com/news/operations/amazon-robotics-robots-fulfillment-center">threw all of that out</a></strong>. Aisles shrank because robots don&#8217;t need shoulder width. Shelving went floor-to-ceiling because robots don&#8217;t have ergonomic limits. Logical grouping became unnecessary because robots navigate by coordinates, not memory. Signage disappeared because robots don&#8217;t read signs &#8212; they read instructions.</p><p>But the biggest gains weren&#8217;t physical. They were <em>cognitive</em>. Human warehouse workers carried an enormous cognitive load &#8212; remembering locations, making routing decisions, prioritizing picks, and mentally handling exceptions. Robots eliminated that cognitive burden entirely. The warehouse didn&#8217;t just move faster. 
It became a fundamentally different system that could handle complexity no human floor operation could manage.</p><h1>The Data Warehouse Is Still Designed for Human Forklift Operators</h1><p>Now look at our data warehouse through this lens.</p><p><strong><a href="https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/books/data-warehouse-dw-toolkit/">Star schemas and dimensional modeling</a></strong> exist so a human analyst can visualize how tables relate. A human needs to see the star &#8212; the fact table at the center, dimensions radiating outward. An agent doesn&#8217;t need a star. It needs a validated semantic definition of what each entity means and how entities connect.</p><p>Data catalogs are digital signage. We built them because humans need to browse and discover what&#8217;s in the warehouse. An agent doesn&#8217;t browse a catalog the way a human walks an aisle. It queries for a validated meaning.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7Ube!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06c6557-89ed-45d5-b20c-e36a700aa888_2078x662.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7Ube!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06c6557-89ed-45d5-b20c-e36a700aa888_2078x662.heic 424w, https://substackcdn.com/image/fetch/$s_!7Ube!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06c6557-89ed-45d5-b20c-e36a700aa888_2078x662.heic 848w, 
https://substackcdn.com/image/fetch/$s_!7Ube!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06c6557-89ed-45d5-b20c-e36a700aa888_2078x662.heic 1272w, https://substackcdn.com/image/fetch/$s_!7Ube!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06c6557-89ed-45d5-b20c-e36a700aa888_2078x662.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7Ube!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06c6557-89ed-45d5-b20c-e36a700aa888_2078x662.heic" width="1456" height="464" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b06c6557-89ed-45d5-b20c-e36a700aa888_2078x662.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:464,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12449,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/190620055?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06c6557-89ed-45d5-b20c-e36a700aa888_2078x662.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7Ube!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06c6557-89ed-45d5-b20c-e36a700aa888_2078x662.heic 424w, 
https://substackcdn.com/image/fetch/$s_!7Ube!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06c6557-89ed-45d5-b20c-e36a700aa888_2078x662.heic 848w, https://substackcdn.com/image/fetch/$s_!7Ube!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06c6557-89ed-45d5-b20c-e36a700aa888_2078x662.heic 1272w, https://substackcdn.com/image/fetch/$s_!7Ube!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06c6557-89ed-45d5-b20c-e36a700aa888_2078x662.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>The <strong><a href="https://learn.microsoft.com/en-us/azure/databricks/lakehouse/medallion">medallion architecture</a></strong> &#8212; Bronze to Silver to Gold &#8212; is an assembly line designed for human inspection at each station. Raw data lands, gets progressively cleaned, and arrives ready for consumption. Each station assumes a human will inspect, validate, and pass the data forward. And at each handoff, context erodes &#8212; the original meaning collapses a little more, like a game of telephone played silently in the pipeline.</p><p>We optimized every layer of the data warehouse for human cognitive constraints. And just like the physical warehouse, those very optimizations become limitations when the operator changes.</p><h1>Where the Analogy Holds &#8212; and Where It Breaks</h1><p>I want to be precise about this, because imprecise analogies are how our industry ends up with decade-long hype cycles built on half-truths.</p><p>The analogy holds powerfully for <em>navigation and discovery</em>. Physical warehouses organized shelves for human wayfinding. Data warehouses organize tables for human querying. Robots don&#8217;t need aisle signs. Agents don&#8217;t need star schemas to find data. That part maps cleanly.</p><p>But here&#8217;s where it breaks: physical goods don&#8217;t change meaning based on how you store them. A box of shoes is a box of shoes, whether it sits on shelf A3 or shelf Z9. Data is different. How you structure data shapes what questions you can ask of it. A normalized schema enables different analytical patterns than a denormalized one. A slowly changing dimension preserves the temporal context that a snapshot table destroys.</p><p>Structure still matters for agent-operated data. It just serves a different purpose. 
Instead of organizing for human navigation &#8212; &#8220;how do I find the data?&#8221; &#8212; you organize for agent operation &#8212; &#8220;what data and context does this agent need for this task?&#8221; Think about how AI tools work with a scoped working folder. You don&#8217;t reorganize your filesystem into an agent-friendly layout. You give the agent a well-scoped boundary, and it operates within it. The structure shifts from navigational to operational &#8212; from shelf labels to access boundaries.</p><h1>The Thinking Survives. The Format May Not</h1><p>I took the last class Ralph Kimball taught before his retirement. I remember the vivid conversation around HBase (which was popular at the time) and the notion of versioning to handle slowly changing dimensions. I&#8217;ve internalized dimensional modeling deeply enough to know which parts are permanent and which parts are artifacts of their era.</p><p>Kimball didn&#8217;t start the training with the star schema or slowly changing dimensions. His <strong><a href="https://www.kimballgroup.com/wp-content/uploads/2013/08/2013.09-Kimball-Dimensional-Modeling-Techniques11.pdf">dimensional modeling process</a></strong> starts with two steps: <em><strong>identify the business process and select the grain</strong></em>. These steps ask the most fundamental questions in data engineering &#8212; what does the business actually do, and at what level of detail does it matter? Only after answering those do you design the dimensions, the facts, and the star schema.</p><p>Steps one and two are context architecture. They always were. Identifying the business process means understanding the semantic reality of what the organization does. Selecting the grain means choosing the level of meaning that matters. 
That thinking is more relevant today than it was in <strong><a href="https://www.wiley.com/en-us/The+Data+Warehouse+Toolkit:+The+Definitive+Guide+to+Dimensional+Modeling,+3rd+Edition-p-9781118530801">1996</a></strong>.</p><p>Steps three and four &#8212; the star schema, the dimension tables, the fact tables &#8212; were a rendering choice. They were the best output format for the consumer of that era: a human analyst writing SQL against a relational database. The star schema serialized business understanding into a structure that humans could query using the available tools.</p><p><em><strong>The consumer has changed or is changing.</strong></em> The rendering should too. When the consumer is an AI agent, the same analytical thinking about business processes and grain produces a Context Store entry &#8212; a validated, versioned, queryable semantic definition &#8212; not a fact table. The thinking survives. The format may not.</p><p>Dismissing dimensional modeling entirely would be ignorant. 
Clinging to its output format when the consumer has fundamentally changed would be equally so.</p><h1>The Pendulum</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dIWJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F087d9551-a494-41bd-a648-57493f9bf87f_2242x1288.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dIWJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F087d9551-a494-41bd-a648-57493f9bf87f_2242x1288.heic 424w, https://substackcdn.com/image/fetch/$s_!dIWJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F087d9551-a494-41bd-a648-57493f9bf87f_2242x1288.heic 848w, https://substackcdn.com/image/fetch/$s_!dIWJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F087d9551-a494-41bd-a648-57493f9bf87f_2242x1288.heic 1272w, https://substackcdn.com/image/fetch/$s_!dIWJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F087d9551-a494-41bd-a648-57493f9bf87f_2242x1288.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dIWJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F087d9551-a494-41bd-a648-57493f9bf87f_2242x1288.heic" width="1456" height="836" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/087d9551-a494-41bd-a648-57493f9bf87f_2242x1288.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:836,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:31547,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/190620055?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F087d9551-a494-41bd-a648-57493f9bf87f_2242x1288.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dIWJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F087d9551-a494-41bd-a648-57493f9bf87f_2242x1288.heic 424w, https://substackcdn.com/image/fetch/$s_!dIWJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F087d9551-a494-41bd-a648-57493f9bf87f_2242x1288.heic 848w, https://substackcdn.com/image/fetch/$s_!dIWJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F087d9551-a494-41bd-a648-57493f9bf87f_2242x1288.heic 1272w, https://substackcdn.com/image/fetch/$s_!dIWJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F087d9551-a494-41bd-a648-57493f9bf87f_2242x1288.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>Every era of data architecture has tried to solve the same tension: semantic precision versus operational flexibility.</p><p>The relational era chose precision. ERDs, primary keys, foreign keys, referential integrity, constraints &#8212; the schema <em>was</em> the semantic contract. <strong><a href="https://en.wikipedia.org/wiki/Bill_Inmon">Bill Inmon&#8217;s Corporate Information Factory</a></strong> formalized this into an enterprise architecture. It worked. It encoded business meaning directly into the physical structure. But it was rigid. I remember interviewing at a company in the pre-Hadoop era and asking what their current priority was. The interviewer told me they were working on implementing a schema change in a day rather than a month. 
That was the state of the art &#8212; a month to add a column, because the semantic contracts were so tightly welded to the physical structure that touching one meant touching everything.</p><p><strong><a href="https://www.databricks.com/discover/data-lakes/history">Hadoop&#8217;s</a></strong> answer was brute force. Sheer machine power, schema-on-read, commodity hardware &#8212; throw everything in and figure it out later. It broke the operational rigidity overnight. And it also broke every semantic contract the relational era had built. We traded meaning for speed and went too far. The data lake became a <strong><a href="https://cacm.acm.org/blogcacm/why-the-data-lake-is-really-a-data-swamp/">data swamp</a></strong> because nobody could remember what anything meant &#8212; the constraints that encoded that meaning were gone.</p><p>The lakehouse tried to find a middle ground. <strong><a href="https://iceberg.apache.org/">Iceberg</a>, <a href="https://delta.io/">Delta</a>, <a href="https://hudi.apache.org/">Hudi</a></strong> &#8212; the flexibility of the lake with some of the structure of the warehouse. Better. But the semantic layer remained an afterthought:</p><blockquote><p><em><strong>catalogs, documentation, and governance overlays that nobody maintained because nobody&#8217;s career depended on them being right.</strong></em></p></blockquote><p>Even recent efforts like Snowflake&#8217;s <strong><a href="https://www.snowflake.com/en/blog/open-semantic-interchanges-specs-finalized/">Open Semantic Interchange</a></strong> initiative acknowledge the gap &#8212; the industry is only now trying to standardize how semantic meaning travels between tools.</p><p>Each swing of the pendulum traded one problem for another. Rigidity for meaninglessness. Meaninglessness for partial structure. What none of them achieved was <em>decoupling</em> &#8212; semantic precision that doesn&#8217;t require physical rigidity. 
Context that travels alongside the data but isn&#8217;t welded to the table structure. Change the schema in seconds. The context updates through the Contextualize pipeline. The meaning stays current without the rigidity.</p><p>That decoupling is what ECL provides. It&#8217;s the first architecture that doesn&#8217;t force you to choose between knowing what your data means and being able to change it.</p><h1>The Graveyard of Good Intentions</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!A88K!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe5acb56-0d07-4548-8b58-cc641124d250_2178x722.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!A88K!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe5acb56-0d07-4548-8b58-cc641124d250_2178x722.heic 424w, https://substackcdn.com/image/fetch/$s_!A88K!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe5acb56-0d07-4548-8b58-cc641124d250_2178x722.heic 848w, https://substackcdn.com/image/fetch/$s_!A88K!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe5acb56-0d07-4548-8b58-cc641124d250_2178x722.heic 1272w, https://substackcdn.com/image/fetch/$s_!A88K!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe5acb56-0d07-4548-8b58-cc641124d250_2178x722.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!A88K!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe5acb56-0d07-4548-8b58-cc641124d250_2178x722.heic" width="1456" height="483" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fe5acb56-0d07-4548-8b58-cc641124d250_2178x722.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:483,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19421,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/190620055?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe5acb56-0d07-4548-8b58-cc641124d250_2178x722.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!A88K!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe5acb56-0d07-4548-8b58-cc641124d250_2178x722.heic 424w, https://substackcdn.com/image/fetch/$s_!A88K!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe5acb56-0d07-4548-8b58-cc641124d250_2178x722.heic 848w, https://substackcdn.com/image/fetch/$s_!A88K!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe5acb56-0d07-4548-8b58-cc641124d250_2178x722.heic 1272w, https://substackcdn.com/image/fetch/$s_!A88K!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe5acb56-0d07-4548-8b58-cc641124d250_2178x722.heic 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>I know what the skeptics are thinking, because I&#8217;ve thought it myself: we&#8217;ve heard this before.</p><p>Bill Inmon literally wrote the book on this in 2007 &#8212; <strong><a href="https://www.goodreads.com/en/book/show/1982171">Business Metadata: Capturing Enterprise Knowledge</a> </strong>&#8212; which covers semantics, ontologies, business rules, and the capture of tacit knowledge. He laid out a complete methodology for capturing it. The methodology was sound. 
The economics weren&#8217;t there yet.</p><p>Business glossaries in the 2000s promised to capture institutional knowledge. They became static documents that nobody updated. Semantic layers in the 2010s promised a unified layer of meaning. They became another piece of middleware to maintain. Data catalogs promised discoverability and governance, but soon <strong><a href="https://www.dataengineeringweekly.com/p/data-catalog-a-broken-promise">proved to be useless</a></strong>. Many became expensive shelfware. Enterprise knowledge graphs <strong><a href="https://www.cutter.com/article/knowledge-graph-implementation-costs-obstacles">promised connected meaning</a></strong>. Most never made it past the proof-of-concept stage.</p><p>Every generation of data practitioners has pointed at the same north star: capture business meaning as a first-class artifact. Every generation has underestimated the organizational gravity that pulls teams back to &#8220;just get the data there, and we&#8217;ll figure out what it means later.&#8221;</p><blockquote><p><em><strong>So what makes this time structurally different? One thing: the consumer changed from forgiving to unforgiving.</strong></em></p></blockquote><p>When the consumer was a human analyst, missing context was inconvenient. The analyst would Slack a colleague, read the dbt code, ask in standup, and check the wiki. Humans are remarkably good at filling semantic gaps through social channels. Bad metadata produced frustrated analysts, not system failures.</p><p>When the consumer is an AI agent, missing context produces systematic errors at scale. The agent doesn&#8217;t Slack anyone. It doesn&#8217;t read tribal knowledge. It sees a column called rev_adj, makes its best inference, and acts &#8212; confidently, consistently, and potentially wrong across every downstream decision. Bad context doesn&#8217;t produce frustration. 
It produces hallucination at an enterprise scale.</p><p>For the first time, the cost of missing context exceeds the cost of maintaining it. That economic inversion is what none of the previous attempts had. Business glossaries failed because humans bore the cost of maintaining them, while the benefit was diffuse. The Context Store succeeds or fails based on whether agents produce reliable results &#8212; and that feedback loop is immediate, measurable, and impossible to ignore.</p><p>The graveyard is real. But the economics changed.</p><h1>What Replaces It</h1><p>ETL asked: Did the data land? ECL asks: Can the data be trusted? I introduced the <strong><a href="https://www.dataengineeringweekly.com/p/data-engineering-after-ai">ECL framework</a></strong> in my earlier article on data engineering after AI.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jOKr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde2f9d7-0220-4897-943c-2673213e39c2_2270x1034.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jOKr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde2f9d7-0220-4897-943c-2673213e39c2_2270x1034.heic 424w, https://substackcdn.com/image/fetch/$s_!jOKr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde2f9d7-0220-4897-943c-2673213e39c2_2270x1034.heic 848w, https://substackcdn.com/image/fetch/$s_!jOKr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde2f9d7-0220-4897-943c-2673213e39c2_2270x1034.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!jOKr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde2f9d7-0220-4897-943c-2673213e39c2_2270x1034.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jOKr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde2f9d7-0220-4897-943c-2673213e39c2_2270x1034.heic" width="1456" height="663" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fde2f9d7-0220-4897-943c-2673213e39c2_2270x1034.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:663,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:21454,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/190620055?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde2f9d7-0220-4897-943c-2673213e39c2_2270x1034.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jOKr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde2f9d7-0220-4897-943c-2673213e39c2_2270x1034.heic 424w, https://substackcdn.com/image/fetch/$s_!jOKr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde2f9d7-0220-4897-943c-2673213e39c2_2270x1034.heic 848w, 
https://substackcdn.com/image/fetch/$s_!jOKr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde2f9d7-0220-4897-943c-2673213e39c2_2270x1034.heic 1272w, https://substackcdn.com/image/fetch/$s_!jOKr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde2f9d7-0220-4897-943c-2673213e39c2_2270x1034.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Extract remains. Data still moves from source systems to analytical environments. 
That work still requires engineering judgment about reliability, latency, and failure modes. AI handles more of the mechanical construction. Humans make the architectural decisions.</p><p>Contextualize is the new center of gravity. A dedicated, agentic pipeline that builds and maintains a living store of semantic context. It isn&#8217;t documentation. It isn&#8217;t a catalog. It&#8217;s an engineering artifact with its own trigger model, validation layer, and storage &#8212; the Context Store.</p><p>The Context Store holds two types of objects. Context objects capture long-lived semantic definitions &#8212; what &#8220;revenue&#8221; means, who validated that definition, when, and at what confidence level. These compound in value over time. Decision objects capture what agents produce when they act on context &#8212; which definitions they used, what they inferred, and what they recommended. These create the audit trail.</p><p>Link connects entities across the data landscape &#8212; and emerging standards like <strong><a href="https://www.anthropic.com/news/model-context-protocol">Model Context Protocol (MCP)</a></strong> are starting to standardize how agents access data without moving it. Not just table joins &#8212; semantic relationships between business entities across systems. A customer in CRM is linked to a user in your product, linked to a session in your support tool. Whether you implement that as a graph, a mapping table, or a markdown file matters less than whether the linkage is validated and the semantic relationship is explicit.</p><p>And because data is inherently social in nature, you don&#8217;t build this all at once. You start with one business flow. One critical table. Early bind where you control the data and can hold producers accountable for meaning. 
Late bind where data comes from outside your accountability boundary &#8212; third-party feeds, undocumented internal systems, legacy data where the person who knew what the fields meant left five years ago. Even one table, well contextualized, starts compounding as you connect it to the next one, and the next one.</p><h1>Long Live the Context Architect</h1><p>The physical warehouse workers who resisted robotics didn&#8217;t save their jobs. They delayed their own transition. Those who moved into robotics coordination, system design, and exception architecture found themselves more valued, more strategic, and more central to the operation than they were when driving forklifts.</p><p>Data engineers who built their identity around moving data from one bucket to another have felt that identity under pressure for a while now. That pressure isn&#8217;t going away. AI will write your Spark jobs. AI will generate your dbt models. <strong><a href="https://www.elitebrains.com/blog/aI-generated-code-statistics-2025">AI will build more pipelines</a></strong> in a year than your team could build in a decade.</p><p>But AI cannot decide what &#8220;revenue&#8221; means for your organization. It cannot negotiate data contracts between producing and consuming teams. It cannot design the appropriate level of context for an agent addressing a specific business problem. It cannot build the organizational agreements that make semantic definitions stick. That work requires institutional knowledge, cross-functional coordination, and architectural judgment. That work is context architecture.</p><p>The data engineer&#8217;s value migrates from pipeline reliability to semantic reliability. From &#8220;the job ran&#8221; to &#8220;the meaning is right.&#8221; From operating the warehouse floor to designing the system that makes robotic operation trustworthy.</p><p>The frontier is genuinely open. Nobody has this figured out yet. 
The practitioners who invest in the architecture of meaning &#8212; not just the mechanics of movement &#8212; will define this discipline for the next decade.</p><div class="pullquote"><p><strong>ETL is dead. Long live the Context Architect.</strong></p></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[Data Engineering Weekly #260]]></title><description><![CDATA[The Weekly Data Engineering Newsletter]]></description><link>https://www.dataengineeringweekly.com/p/data-engineering-weekly-260</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/data-engineering-weekly-260</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Mon, 09 Mar 2026 04:31:18 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AdQk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6f51c1bf-abc9-4cd3-ad69-22cc8e3f1ef2_1080x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/events/deep-dive-building-a-cross-workspace-control-plane-for-databricks?utm_campaign=39250579-26-03-WBNR_Deep_DIVE_DATABRICKS&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_databricks&amp;utm_content=03_08_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!bayI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64ae4f4-c033-4325-b742-4775a4ba9a3c_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!bayI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64ae4f4-c033-4325-b742-4775a4ba9a3c_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!bayI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64ae4f4-c033-4325-b742-4775a4ba9a3c_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!bayI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64ae4f4-c033-4325-b742-4775a4ba9a3c_3840x2160.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bayI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64ae4f4-c033-4325-b742-4775a4ba9a3c_3840x2160.heic" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d64ae4f4-c033-4325-b742-4775a4ba9a3c_3840x2160.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:30259,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/events/deep-dive-building-a-cross-workspace-control-plane-for-databricks?utm_campaign=39250579-26-03-WBNR_Deep_DIVE_DATABRICKS&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_databricks&amp;utm_content=03_08_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/190348672?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64ae4f4-c033-4325-b742-4775a4ba9a3c_3840x2160.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bayI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64ae4f4-c033-4325-b742-4775a4ba9a3c_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!bayI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64ae4f4-c033-4325-b742-4775a4ba9a3c_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!bayI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64ae4f4-c033-4325-b742-4775a4ba9a3c_3840x2160.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!bayI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64ae4f4-c033-4325-b742-4775a4ba9a3c_3840x2160.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h1>Best practices for orchestrating Databricks at scale</h1><p>As Databricks deployments scale, a familiar pattern emerges: multiple workspaces, multiple teams, and no reliable way to manage the dependencies between them.<br><br>In this hands-on deep dive, we'll show you how to build a cross-workspace control plane using Dagster on top of your existing 
Databricks environment. Demo-heavy and practitioner-focused, you'll leave with working patterns you can apply to your own platform the same day.</p><p><strong><a href="https://dagster.io/events/deep-dive-building-a-cross-workspace-control-plane-for-databricks?utm_campaign=39250579-26-03-WBNR_Deep_DIVE_DATABRICKS&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_databricks&amp;utm_content=03_08_26_data_engineering_weekly">Save your spot now</a></strong></p><div><hr></div><h1>underCurrent: <a href="https://current.confluent.io/data-engineers?utm_campaign=tm.devx_cd.underCurrent&amp;utm_source=newsletter&amp;utm_medium=dew">A one-day conference for data engineers and architects</a></h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YgTh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335fca03-45f7-4633-b9bd-be08f47f71c2_2400x1254.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YgTh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335fca03-45f7-4633-b9bd-be08f47f71c2_2400x1254.heic 424w, https://substackcdn.com/image/fetch/$s_!YgTh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335fca03-45f7-4633-b9bd-be08f47f71c2_2400x1254.heic 848w, https://substackcdn.com/image/fetch/$s_!YgTh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335fca03-45f7-4633-b9bd-be08f47f71c2_2400x1254.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!YgTh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335fca03-45f7-4633-b9bd-be08f47f71c2_2400x1254.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YgTh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335fca03-45f7-4633-b9bd-be08f47f71c2_2400x1254.heic" width="1456" height="761" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/335fca03-45f7-4633-b9bd-be08f47f71c2_2400x1254.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:761,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16850,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/190348672?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335fca03-45f7-4633-b9bd-be08f47f71c2_2400x1254.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YgTh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335fca03-45f7-4633-b9bd-be08f47f71c2_2400x1254.heic 424w, https://substackcdn.com/image/fetch/$s_!YgTh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335fca03-45f7-4633-b9bd-be08f47f71c2_2400x1254.heic 848w, 
https://substackcdn.com/image/fetch/$s_!YgTh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335fca03-45f7-4633-b9bd-be08f47f71c2_2400x1254.heic 1272w, https://substackcdn.com/image/fetch/$s_!YgTh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F335fca03-45f7-4633-b9bd-be08f47f71c2_2400x1254.heic 1456w" sizes="100vw"></picture></div></a></figure></div><p>Confluent is hosting a free one-day conference with a catch: there&#8217;s no catch. 
It&#8217;s a single-track event with no sponsors and no product pitches&#8212;just technical talks for data engineers and architects.<br><br>&#127897;&#65039; Speakers include <strong>Joe Reis, Holden Karau, and Max Beauchemin</strong><br>&#128683; No vendors. No sales pitches<br>&#10024; 100% free to attend <br>&#128197; <strong>March 26</strong> <br>&#128205; San Francisco<br>&#127903;&#65039; <strong>Limited to 100 seats</strong> &#8212; <strong><a href="https://current.confluent.io/data-engineers?utm_campaign=tm.devx_cd.underCurrent&amp;utm_source=newsletter&amp;utm_medium=dew">register for free here</a></strong></p><div><hr></div><h1>Vinoth Govindarajan: OpenClaw Architecture</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P3Mc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75673d5b-29a1-476f-91ff-1b822acd8f08_1456x717.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P3Mc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75673d5b-29a1-476f-91ff-1b822acd8f08_1456x717.heic 424w, https://substackcdn.com/image/fetch/$s_!P3Mc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75673d5b-29a1-476f-91ff-1b822acd8f08_1456x717.heic 848w, https://substackcdn.com/image/fetch/$s_!P3Mc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75673d5b-29a1-476f-91ff-1b822acd8f08_1456x717.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!P3Mc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75673d5b-29a1-476f-91ff-1b822acd8f08_1456x717.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P3Mc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75673d5b-29a1-476f-91ff-1b822acd8f08_1456x717.heic" width="1456" height="717" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/75673d5b-29a1-476f-91ff-1b822acd8f08_1456x717.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:717,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12218,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/190348672?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75673d5b-29a1-476f-91ff-1b822acd8f08_1456x717.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P3Mc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75673d5b-29a1-476f-91ff-1b822acd8f08_1456x717.heic 424w, https://substackcdn.com/image/fetch/$s_!P3Mc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75673d5b-29a1-476f-91ff-1b822acd8f08_1456x717.heic 848w, 
https://substackcdn.com/image/fetch/$s_!P3Mc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75673d5b-29a1-476f-91ff-1b822acd8f08_1456x717.heic 1272w, https://substackcdn.com/image/fetch/$s_!P3Mc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75673d5b-29a1-476f-91ff-1b822acd8f08_1456x717.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Production AI agents fail at scale because uncontrolled state mutations corrupt execution and create unpredictable behavior. 
In &#8220;The Agent Stack,&#8221; Vinoth Govindarajan outlines OpenClaw&#8217;s architecture, in which isolated execution contexts and strict invariants prevent state leakage, while sessions enable async pause-resume semantics. The pattern standardizes how teams decouple short-term context from persistent state, ensuring agents reliably rehydrate their mental model and enforce authorization boundaries that gate tool access by user privilege level.</p><p><strong><a href="https://theagentstack.substack.com/p/openclaw-architecture-part-1-control">Part 1</a>, <a href="https://theagentstack.substack.com/p/openclaw-architecture-part-2-concurrency">Part 2</a>, <a href="https://openclawunboxed.com/p/openclaw-architecture-part-3-memory">Part 3.1</a>, <a href="https://theagentstack.substack.com/p/openclaw-architecture-part-3-memory">Part 3.2</a>, <a href="https://theagentstack.substack.com/p/openclaw-architecture-part-4-security">Part 4</a></strong></p><div><hr></div><h1>Pinterest: Unified Context-Intent Embeddings for Scalable Text-to-SQL</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EtBR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9528d97e-84de-44dc-bbd5-17be18a55cf4_1400x655.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EtBR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9528d97e-84de-44dc-bbd5-17be18a55cf4_1400x655.heic 424w, https://substackcdn.com/image/fetch/$s_!EtBR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9528d97e-84de-44dc-bbd5-17be18a55cf4_1400x655.heic 848w, 
https://substackcdn.com/image/fetch/$s_!EtBR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9528d97e-84de-44dc-bbd5-17be18a55cf4_1400x655.heic 1272w, https://substackcdn.com/image/fetch/$s_!EtBR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9528d97e-84de-44dc-bbd5-17be18a55cf4_1400x655.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EtBR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9528d97e-84de-44dc-bbd5-17be18a55cf4_1400x655.heic" width="1400" height="655" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9528d97e-84de-44dc-bbd5-17be18a55cf4_1400x655.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:655,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24113,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/190348672?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9528d97e-84de-44dc-bbd5-17be18a55cf4_1400x655.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EtBR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9528d97e-84de-44dc-bbd5-17be18a55cf4_1400x655.heic 424w, 
https://substackcdn.com/image/fetch/$s_!EtBR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9528d97e-84de-44dc-bbd5-17be18a55cf4_1400x655.heic 848w, https://substackcdn.com/image/fetch/$s_!EtBR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9528d97e-84de-44dc-bbd5-17be18a55cf4_1400x655.heic 1272w, https://substackcdn.com/image/fetch/$s_!EtBR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9528d97e-84de-44dc-bbd5-17be18a55cf4_1400x655.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Navigating sprawling data warehouses forces analysts to choose between slow manual exploration and unreliable keyword-based search. Pinterest Engineering built a production Analytics Agent that embeds historical SQL queries as semantic intent signatures, injecting business glossary terms and extracting structural patterns (join keys, filters, usage signals) to retrieve contextually relevant tables at scale. The system reached 40% internal adoption within two months by standardizing discovery through an asset-first pattern, converting years of institutional SQL knowledge into a searchable, governance-aware library.</p><p><strong><a href="https://medium.com/pinterest-engineering/unified-context-intent-embeddings-for-scalable-text-to-sql-793635e60aac">https://medium.com/pinterest-engineering/unified-context-intent-embeddings-for-scalable-text-to-sql-793635e60aac</a></strong></p><div><hr></div><h1>Francesca Lazzeri: AI evals platforms: A comparative guide for production AI systems</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kftj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a2a07-932c-4b96-8027-479d5bbc9456_753x330.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kftj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a2a07-932c-4b96-8027-479d5bbc9456_753x330.heic 424w, https://substackcdn.com/image/fetch/$s_!kftj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a2a07-932c-4b96-8027-479d5bbc9456_753x330.heic 848w, 
https://substackcdn.com/image/fetch/$s_!kftj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a2a07-932c-4b96-8027-479d5bbc9456_753x330.heic 1272w, https://substackcdn.com/image/fetch/$s_!kftj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a2a07-932c-4b96-8027-479d5bbc9456_753x330.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kftj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a2a07-932c-4b96-8027-479d5bbc9456_753x330.heic" width="753" height="330" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a98a2a07-932c-4b96-8027-479d5bbc9456_753x330.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:330,&quot;width&quot;:753,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12090,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/190348672?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a2a07-932c-4b96-8027-479d5bbc9456_753x330.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kftj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a2a07-932c-4b96-8027-479d5bbc9456_753x330.heic 424w, 
https://substackcdn.com/image/fetch/$s_!kftj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a2a07-932c-4b96-8027-479d5bbc9456_753x330.heic 848w, https://substackcdn.com/image/fetch/$s_!kftj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a2a07-932c-4b96-8027-479d5bbc9456_753x330.heic 1272w, https://substackcdn.com/image/fetch/$s_!kftj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a2a07-932c-4b96-8027-479d5bbc9456_753x330.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Production AI systems fail silently in ways demos never expose, forcing teams to replace manual testing with automated evaluation as the enterprise LLM market scales toward $71.1 billion by 2034. A comparative analysis of six leading eval platforms reveals a consolidation around open standards (OpenTelemetry, OpenInference) and specialized architectures&#8212;Microsoft AI Foundry embeds red-teaming agents into Azure workflows, while Galileo replaces expensive LLM judges with smaller consensus models (Luna) to reduce eval latency. The shift standardizes safety as a structural property of development, enabling teams to catch jailbreaks and data leaks early while choosing platform fit based on stack priorities: simulation-first, research rigor, or ecosystem depth.</p><p><strong><a href="https://medium.com/data-science-at-microsoft/how-do-you-know-your-ai-actually-works-b1a380a07825">https://medium.com/data-science-at-microsoft/how-do-you-know-your-ai-actually-works-b1a380a07825</a></strong></p><div><hr></div><h1>Sponsored: The AI Modernization Guide</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=03_08_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uymw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c0d34d0-9f2d-40a3-94e1-f596c56b03df_2400x1260.heic 424w, 
https://substackcdn.com/image/fetch/$s_!uymw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c0d34d0-9f2d-40a3-94e1-f596c56b03df_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!uymw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c0d34d0-9f2d-40a3-94e1-f596c56b03df_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!uymw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c0d34d0-9f2d-40a3-94e1-f596c56b03df_2400x1260.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uymw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c0d34d0-9f2d-40a3-94e1-f596c56b03df_2400x1260.heic" width="1456" height="764" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4c0d34d0-9f2d-40a3-94e1-f596c56b03df_2400x1260.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:18459,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=03_08_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/190348672?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c0d34d0-9f2d-40a3-94e1-f596c56b03df_2400x1260.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uymw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c0d34d0-9f2d-40a3-94e1-f596c56b03df_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!uymw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c0d34d0-9f2d-40a3-94e1-f596c56b03df_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!uymw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c0d34d0-9f2d-40a3-94e1-f596c56b03df_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!uymw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c0d34d0-9f2d-40a3-94e1-f596c56b03df_2400x1260.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>AI is reshaping how data teams operate. But legacy pipelines, brittle workflows, and fragmented tooling weren&#8217;t designed for this shift.<br><br>Learn how leading teams are future-proofing their infrastructure before AI demands overwhelm it.</p><p><strong><a href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=03_08_26_data_engineering_weekly">Download the free guide</a></strong></p><div><hr></div><h1>Netflix: MediaFM - The Multimodal AI Foundation for Media Understanding at Netflix</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!p3o3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6c8b569-8c07-4d7d-924e-1cb9ccc84975_1400x1172.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!p3o3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6c8b569-8c07-4d7d-924e-1cb9ccc84975_1400x1172.heic 424w, https://substackcdn.com/image/fetch/$s_!p3o3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6c8b569-8c07-4d7d-924e-1cb9ccc84975_1400x1172.heic 848w, 
https://substackcdn.com/image/fetch/$s_!p3o3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6c8b569-8c07-4d7d-924e-1cb9ccc84975_1400x1172.heic 1272w, https://substackcdn.com/image/fetch/$s_!p3o3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6c8b569-8c07-4d7d-924e-1cb9ccc84975_1400x1172.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!p3o3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6c8b569-8c07-4d7d-924e-1cb9ccc84975_1400x1172.heic" width="1400" height="1172" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a6c8b569-8c07-4d7d-924e-1cb9ccc84975_1400x1172.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1172,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16754,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/190348672?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6c8b569-8c07-4d7d-924e-1cb9ccc84975_1400x1172.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!p3o3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6c8b569-8c07-4d7d-924e-1cb9ccc84975_1400x1172.heic 424w, 
https://substackcdn.com/image/fetch/$s_!p3o3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6c8b569-8c07-4d7d-924e-1cb9ccc84975_1400x1172.heic 848w, https://substackcdn.com/image/fetch/$s_!p3o3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6c8b569-8c07-4d7d-924e-1cb9ccc84975_1400x1172.heic 1272w, https://substackcdn.com/image/fetch/$s_!p3o3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6c8b569-8c07-4d7d-924e-1cb9ccc84975_1400x1172.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Understanding content at scale requires machine-readable representations that capture narrative structure, not just visual features, a challenge that intensifies as streaming catalogs exceed tens of thousands of titles. Netflix built MediaFM, a tri-modal transformer that fuses video frames, audio (wav2vec2), and subtitles into shot-level embeddings using Masked Shot Modeling, with a [GLOBAL] token injecting title-level context (synopsis, genre) to ground each segment. The model powers ad placement, clip ranking, content tagging, and cold-start recommendations by contextualizing shots within the narrative sequence, outperforming external benchmarks and enabling machine-readable understanding across Netflix&#8217;s entire catalog.</p><p><strong><a href="https://netflixtechblog.com/mediafm-the-multimodal-ai-foundation-for-media-understanding-at-netflix-e8c28df82e2d">https://netflixtechblog.com/mediafm-the-multimodal-ai-foundation-for-media-understanding-at-netflix-e8c28df82e2d</a></strong></p><div><hr></div><h1>Nabin Debnath: Building a Least-Privilege AI Agent Gateway for Infrastructure Automation with MCP, OPA, and Ephemeral Runners</h1><p>AI agents in infrastructure automation bypass traditional guardrails by making runtime decisions without human validation, risking silent resource destruction or credential exfiltration at scale. The author describes an Agent Gateway that treats agents as untrusted requesters, layering Model Context Protocol (MCP) for tool discovery, Open Policy Agent (OPA) for intent-based authorization, and ephemeral Kubernetes runners for isolated execution. 
The pattern enforces least privilege by mediating all API calls through policy code, validates plan integrity against immutable hashes, and surfaces decision reasoning via OpenTelemetry, standardizing agent governance with SLO targets (100 ms for policy decisions, 5 s for runner startup) that prevent silent bypasses.</p><p><strong><a href="https://www.infoq.com/articles/building-ai-agent-gateway-mcp/">https://www.infoq.com/articles/building-ai-agent-gateway-mcp/</a></strong></p><div><hr></div><h1>Dropbox: Using LLMs to amplify human labeling and improve Dash search relevance</h1><p>Enterprise search ranking requires massive labeled datasets, but traditional human annotation is prohibitively slow and cannot scale to sensitive content across billions of internal documents. Dropbox Dash uses LLMs as labeling force multipliers: it calibrates them against a small human-labeled set, generates millions of relevance judgments offline, and then trains lightweight production models (XGBoost) on those synthetic labels. The pattern standardizes judgment consistency by pairing contextual research tools (for acronyms and ambiguous queries) with programmatic prompt optimization (DSPy), enabling continuous ranking improvements while keeping human oversight as the ground truth rather than replacing it.</p><p><strong><a href="https://dropbox.tech/machine-learning/llm-human-labeling-improving-search-relevance-dropbox-dash">https://dropbox.tech/machine-learning/llm-human-labeling-improving-search-relevance-dropbox-dash</a></strong></p><div><hr></div><h1>Zalando: Why We Ditched Flink Table API Joins: Cutting State by 75% with DataStream Unions</h1><p>Declarative SQL joins in Flink multiply state across operators, forcing teams to choose between snapshot overhead and operational instability, a scaling bottleneck for pipelines enriching millions of real-time product records. 
Zalando replaced chained Table API joins with a custom KeyedProcessFunction that unions all streams into a single keyed DataStream, storing each product&#8217;s enriched state once in RocksDB instead of redundantly across join operators. The shift cut state size by 75% (235GB to 56GB), reduced snapshot time by 77% (11 minutes to 2.5 minutes), and lowered AWS costs by 13%&#8212;demonstrating how imperative control over stream topology recovers efficiency when declarative abstractions misalign with physical execution.</p><p><strong><a href="https://engineering.zalando.com/posts/2026/03/why-we-ditched-flink-table-api-joins-cutting-state.html">https://engineering.zalando.com/posts/2026/03/why-we-ditched-flink-table-api-joins-cutting-state.html</a></strong></p><div><hr></div><h1>Aihua Xu &amp; Andrew Lamb: Variant Type in Apache Parquet for Semi-Structured Data</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4lQG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea671ece-9073-414d-adc3-952731dc5248_1024x633.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4lQG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea671ece-9073-414d-adc3-952731dc5248_1024x633.heic 424w, https://substackcdn.com/image/fetch/$s_!4lQG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea671ece-9073-414d-adc3-952731dc5248_1024x633.heic 848w, https://substackcdn.com/image/fetch/$s_!4lQG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea671ece-9073-414d-adc3-952731dc5248_1024x633.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!4lQG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea671ece-9073-414d-adc3-952731dc5248_1024x633.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4lQG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea671ece-9073-414d-adc3-952731dc5248_1024x633.heic" width="1024" height="633" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ea671ece-9073-414d-adc3-952731dc5248_1024x633.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:633,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:53802,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/190348672?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea671ece-9073-414d-adc3-952731dc5248_1024x633.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4lQG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea671ece-9073-414d-adc3-952731dc5248_1024x633.heic 424w, https://substackcdn.com/image/fetch/$s_!4lQG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea671ece-9073-414d-adc3-952731dc5248_1024x633.heic 848w, 
https://substackcdn.com/image/fetch/$s_!4lQG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea671ece-9073-414d-adc3-952731dc5248_1024x633.heic 1272w, https://substackcdn.com/image/fetch/$s_!4lQG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fea671ece-9073-414d-adc3-952731dc5248_1024x633.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Semi-structured data in columnar formats forces a choice between slow JSON parsing or rigid schemas that block evolution, creating 
friction in pipelines handling heterogeneous records. Apache Parquet&#8217;s new Variant type stores each value as binary-encoded metadata plus value fields, enabling direct nested field access without full-document parsing while preserving native types (timestamps, integers) that JSON loses. Heterogeneous records can coexist in one column, while &#8220;shredding&#8221; extracts frequently accessed fields into strongly typed columns for predicate pushdown and pruning, reducing migration overhead and accelerating adoption across DuckDB, Spark 4.0, and Snowflake.</p><p><strong><a href="https://parquet.apache.org/blog/2026/02/27/variant-type-in-apache-parquet-for-semi-structured-data/">https://parquet.apache.org/blog/2026/02/27/variant-type-in-apache-parquet-for-semi-structured-data/</a></strong></p><div><hr></div><h1>Pranav Mehta: Silent Data Loss in ClickHouse: 3 Reasons Your Distributed Queue Keeps Growing</h1><p>ClickHouse distributed inserts fail silently when coordination-service downtime, execution timeouts, or concurrency limits block the asynchronous flush pipeline, leaving data trapped in on-disk queues while clients receive no error signals. The author identifies three failure modes: <em>Keeper/ZooKeeper downtime forcing ReplicatedMergeTree tables into read-only mode, oversized insert blocks that exceed max_execution_time and stall sequential queue processing, and exhausted user concurrency slots starving background INSERT workers</em>. 
The pattern demands proactive monitoring of DistributedFilesToInsert (alert at 50+ files), debugging via system.distribution_queue.last_exception, and inode-aware filesystem choice (XFS over ext4) to prevent silent data loss and system crashes from queue explosion.</p><p><strong><a href="https://medium.com/@pranavmehta94/silent-data-loss-in-clickhouse-3-reasons-your-distributed-queue-keeps-growing-9bf6b8af88e5">https://medium.com/@pranavmehta94/silent-data-loss-in-clickhouse-3-reasons-your-distributed-queue-keeps-growing-9bf6b8af88e5</a></strong></p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[Data Engineering Weekly #259]]></title><description><![CDATA[The Weekly Data Engineering Newsletter]]></description><link>https://www.dataengineeringweekly.com/p/data-engineering-weekly-259</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/data-engineering-weekly-259</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Mon, 02 Mar 2026 03:57:15 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AdQk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6f51c1bf-abc9-4cd3-ad69-22cc8e3f1ef2_1080x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=03_01_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div 
class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OlRC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06da4e96-6d18-41c1-95b5-bd22999807a9_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!OlRC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06da4e96-6d18-41c1-95b5-bd22999807a9_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!OlRC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06da4e96-6d18-41c1-95b5-bd22999807a9_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!OlRC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06da4e96-6d18-41c1-95b5-bd22999807a9_2400x1260.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OlRC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06da4e96-6d18-41c1-95b5-bd22999807a9_2400x1260.heic" width="1456" height="764" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/06da4e96-6d18-41c1-95b5-bd22999807a9_2400x1260.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24006,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=03_01_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/189610818?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06da4e96-6d18-41c1-95b5-bd22999807a9_2400x1260.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OlRC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06da4e96-6d18-41c1-95b5-bd22999807a9_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!OlRC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06da4e96-6d18-41c1-95b5-bd22999807a9_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!OlRC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06da4e96-6d18-41c1-95b5-bd22999807a9_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!OlRC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06da4e96-6d18-41c1-95b5-bd22999807a9_2400x1260.heic 1456w" 
sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>AI is moving fast. Is your data platform ready?</h1><p>AI is reshaping how data teams operate. 
But legacy pipelines, brittle workflows, and fragmented tooling weren&#8217;t designed for this shift.<br><br>Learn how leading teams are future-proofing their infrastructure before AI demands overwhelm it.</p><p><strong><a href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=03_01_26_data_engineering_weekly">Download the AI Modernization Guide</a></strong></p><div><hr></div><h1>underCurrent: <a href="https://current.confluent.io/data-engineers?utm_campaign=tm.devx_cd.underCurrent&amp;utm_source=newsletter&amp;utm_medium=dew">A one-day conference for data engineers and architects</a></h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2Ad5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2f1aec-0512-4572-8eee-f14bf7b735ee_2400x1254.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2Ad5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2f1aec-0512-4572-8eee-f14bf7b735ee_2400x1254.heic 424w, https://substackcdn.com/image/fetch/$s_!2Ad5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2f1aec-0512-4572-8eee-f14bf7b735ee_2400x1254.heic 848w, https://substackcdn.com/image/fetch/$s_!2Ad5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2f1aec-0512-4572-8eee-f14bf7b735ee_2400x1254.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!2Ad5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2f1aec-0512-4572-8eee-f14bf7b735ee_2400x1254.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2Ad5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2f1aec-0512-4572-8eee-f14bf7b735ee_2400x1254.heic" width="1456" height="761" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fe2f1aec-0512-4572-8eee-f14bf7b735ee_2400x1254.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:761,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16850,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/189610818?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2f1aec-0512-4572-8eee-f14bf7b735ee_2400x1254.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2Ad5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2f1aec-0512-4572-8eee-f14bf7b735ee_2400x1254.heic 424w, https://substackcdn.com/image/fetch/$s_!2Ad5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2f1aec-0512-4572-8eee-f14bf7b735ee_2400x1254.heic 848w, 
https://substackcdn.com/image/fetch/$s_!2Ad5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2f1aec-0512-4572-8eee-f14bf7b735ee_2400x1254.heic 1272w, https://substackcdn.com/image/fetch/$s_!2Ad5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe2f1aec-0512-4572-8eee-f14bf7b735ee_2400x1254.heic 1456w" sizes="100vw"></picture></div></a></figure></div><p>Confluent is hosting a free one-day conference with a catch: there&#8217;s no catch. 
It&#8217;s a single-track event with no sponsors and no product pitches&#8212;just technical talks for data engineers and architects.<br><br>&#127897;&#65039; Speakers include <strong>Joe Reis</strong>, <strong>Holden Karau</strong>, and <strong>Max Beauchemin</strong><br>&#128683; No vendors. No sales pitches<br>&#10024; 100% free to attend <br>&#128205; San Francisco &#128197; March 26 <br>&#127903;&#65039; Limited to 100 seats &#8212; register for free <strong><a href="https://current.confluent.io/data-engineers?utm_campaign=tm.devx_cd.underCurrent&amp;utm_source=newsletter&amp;utm_medium=dew">here</a></strong></p><div><hr></div><h1>Netflix: DataJunction as Netflix&#8217;s answer to the missing piece of the modern data stack</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VaGI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0868d85-4037-430d-95f6-47be9cb1fba8_512x354.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VaGI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0868d85-4037-430d-95f6-47be9cb1fba8_512x354.webp 424w, https://substackcdn.com/image/fetch/$s_!VaGI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0868d85-4037-430d-95f6-47be9cb1fba8_512x354.webp 848w, https://substackcdn.com/image/fetch/$s_!VaGI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0868d85-4037-430d-95f6-47be9cb1fba8_512x354.webp 1272w, 
https://substackcdn.com/image/fetch/$s_!VaGI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0868d85-4037-430d-95f6-47be9cb1fba8_512x354.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VaGI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0868d85-4037-430d-95f6-47be9cb1fba8_512x354.webp" width="512" height="354" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f0868d85-4037-430d-95f6-47be9cb1fba8_512x354.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:354,&quot;width&quot;:512,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9726,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/189610818?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0868d85-4037-430d-95f6-47be9cb1fba8_512x354.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VaGI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0868d85-4037-430d-95f6-47be9cb1fba8_512x354.webp 424w, https://substackcdn.com/image/fetch/$s_!VaGI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0868d85-4037-430d-95f6-47be9cb1fba8_512x354.webp 848w, 
https://substackcdn.com/image/fetch/$s_!VaGI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0868d85-4037-430d-95f6-47be9cb1fba8_512x354.webp 1272w, https://substackcdn.com/image/fetch/$s_!VaGI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0868d85-4037-430d-95f6-47be9cb1fba8_512x354.webp 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Metric inconsistency and definition sprawl across distributed teams create onboarding bottlenecks and fragment analytics workflows. 
Netflix built DataJunction, an open-source semantic layer that decouples metric definitions from compute through a graph-based metadata model and SQL generation engine. This standardizes metrics across the experimentation platform, reducing onboarding from weeks to hours, while enabling expansion across all business verticals and LLM integration for auditable metric lineage.</p><p><strong><a href="https://netflixtechblog.medium.com/datajunction-as-netflixs-answer-to-the-missing-piece-of-the-modern-data-stack-92af926b40a5">https://netflixtechblog.medium.com/datajunction-as-netflixs-answer-to-the-missing-piece-of-the-modern-data-stack-92af926b40a5</a></strong></p><div><hr></div><h1>Benoit Pimpaud: Specs Should Be Equations, Not Essays</h1><p>As AI automates code generation, the engineering bottleneck shifts from writing implementation to defining precise specifications. The author argues that natural language specifications create compounding ambiguity when parsed by LLMs and proposes layered specifications that combine text, diagrams, and mathematical notation as constraint definitions for AI iteration. 
Mathematical specs eliminate interpretation drift, enabling AI agents to generate correct programs by satisfying invariants rather than reconstructing intent from prose.</p><p><strong><a href="https://fromanengineersight.substack.com/p/specs-should-be-equations-not-essays">https://fromanengineersight.substack.com/p/specs-should-be-equations-not-essays</a></strong></p><div><hr></div><h1>Notion: Balancing cost and reliability for Spark on Kubernetes</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8jTD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F209221f3-17f2-4695-93d1-fafd90f239c4_616x316.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8jTD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F209221f3-17f2-4695-93d1-fafd90f239c4_616x316.heic 424w, https://substackcdn.com/image/fetch/$s_!8jTD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F209221f3-17f2-4695-93d1-fafd90f239c4_616x316.heic 848w, https://substackcdn.com/image/fetch/$s_!8jTD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F209221f3-17f2-4695-93d1-fafd90f239c4_616x316.heic 1272w, https://substackcdn.com/image/fetch/$s_!8jTD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F209221f3-17f2-4695-93d1-fafd90f239c4_616x316.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!8jTD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F209221f3-17f2-4695-93d1-fafd90f239c4_616x316.heic" width="616" height="316" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/209221f3-17f2-4695-93d1-fafd90f239c4_616x316.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:316,&quot;width&quot;:616,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9723,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/189610818?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F209221f3-17f2-4695-93d1-fafd90f239c4_616x316.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8jTD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F209221f3-17f2-4695-93d1-fafd90f239c4_616x316.heic 424w, https://substackcdn.com/image/fetch/$s_!8jTD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F209221f3-17f2-4695-93d1-fafd90f239c4_616x316.heic 848w, https://substackcdn.com/image/fetch/$s_!8jTD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F209221f3-17f2-4695-93d1-fafd90f239c4_616x316.heic 1272w, https://substackcdn.com/image/fetch/$s_!8jTD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F209221f3-17f2-4695-93d1-fafd90f239c4_616x316.heic 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Spark clusters on Kubernetes face a fundamental tension between aggressive cost optimization through spot instances and job reliability during capacity interruptions. Notion reduced compute costs by 60&#8211;90% through EKS migration with Karpenter bin-packing, then open-sourced Spot Balancer&#8212;a Kubernetes webhook that enforces stable spot-to-on-demand ratios per job, preventing cascade failures during AWS termination windows. 
Spot Balancer abstracts infrastructure trade-offs into developer-friendly stability tiers, enabling teams to optimize costs without sacrificing job completion rates.</p><p><strong><a href="https://www.notion.com/blog/balancing-cost-and-reliability-for-spark-on-kubernetes">https://www.notion.com/blog/balancing-cost-and-reliability-for-spark-on-kubernetes</a></strong></p><div><hr></div><h1>Sponsored: Building a Cross-Workspace Control Plane for Databricks</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/events/deep-dive-building-a-cross-workspace-control-plane-for-databricks?utm_campaign=39250579-26-03-WBNR_Deep_DIVE_DATABRICKS&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_databricks&amp;utm_content=03_01_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uy7d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f39374-3d44-46e6-a142-145a1b94fd31_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!uy7d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f39374-3d44-46e6-a142-145a1b94fd31_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!uy7d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f39374-3d44-46e6-a142-145a1b94fd31_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!uy7d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f39374-3d44-46e6-a142-145a1b94fd31_3840x2160.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!uy7d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f39374-3d44-46e6-a142-145a1b94fd31_3840x2160.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/04f39374-3d44-46e6-a142-145a1b94fd31_3840x2160.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24982,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/events/deep-dive-building-a-cross-workspace-control-plane-for-databricks?utm_campaign=39250579-26-03-WBNR_Deep_DIVE_DATABRICKS&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_databricks&amp;utm_content=03_01_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/189610818?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f39374-3d44-46e6-a142-145a1b94fd31_3840x2160.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uy7d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f39374-3d44-46e6-a142-145a1b94fd31_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!uy7d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f39374-3d44-46e6-a142-145a1b94fd31_3840x2160.heic 848w, 
https://substackcdn.com/image/fetch/$s_!uy7d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f39374-3d44-46e6-a142-145a1b94fd31_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!uy7d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f39374-3d44-46e6-a142-145a1b94fd31_3840x2160.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>As Databricks deployments scale, a familiar pattern emerges: multiple workspaces, multiple teams, and no reliable way to manage 
the dependencies between them.<br>In this hands-on deep dive, we'll show you how to build a cross-workspace control plane using Dagster on top of your existing Databricks environment. Demo-heavy and practitioner-focused, you'll leave with working patterns you can apply to your own platform the same day.</p><p><strong><a href="https://dagster.io/events/deep-dive-building-a-cross-workspace-control-plane-for-databricks?utm_campaign=39250579-26-03-WBNR_Deep_DIVE_DATABRICKS&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_databricks&amp;utm_content=03_01_26_data_engineering_weekly">Register now</a></strong></p><div><hr></div><h1>Apache Iceberg: Introducing the Apache Iceberg File Format API</h1><p>It is indeed an exciting development for Iceberg to support a pluggable file format API spec. As we increasingly handle unstructured data, this will significantly enhance data management practices through unified governance and compliance. Interestingly, Apache Hudi&#8217;s <strong><a href="https://github.com/apache/hudi/issues/14127">RFC-100</a></strong> is, in fact, the feature request to support the Lance File Format. </p><p><strong><a href="https://iceberg.apache.org/blog/apache-iceberg-file-format-api/">https://iceberg.apache.org/blog/apache-iceberg-file-format-api/</a></strong></p><div><hr></div><h1>Delta Lake: The next evolution of Delta - Catalog-Managed Tables</h1><blockquote><p><em>We went through the full cycle, from exposing the files directly through Hadoop to Snowflake-style cloud data warehouses, to Iceberg-style direct file access, back to catalog-managed tables. </em></p></blockquote><p>Nonetheless, it will be interesting to watch DuckLake-style catalog-managed tables vs object-store-style managed tables. 
</p><p><strong><a href="https://delta.io/blog/2026-02-02-delta-catalog-managed-tables/">https://delta.io/blog/2026-02-02-delta-catalog-managed-tables/</a></strong></p><div><hr></div><h1>Microsoft Fabric: Under the hood: an introduction to the Native Execution Engine for Microsoft Fabric</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ah5O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962b9eef-68ea-4abd-914f-bf0319e0ec6e_496x465.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ah5O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962b9eef-68ea-4abd-914f-bf0319e0ec6e_496x465.heic 424w, https://substackcdn.com/image/fetch/$s_!ah5O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962b9eef-68ea-4abd-914f-bf0319e0ec6e_496x465.heic 848w, https://substackcdn.com/image/fetch/$s_!ah5O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962b9eef-68ea-4abd-914f-bf0319e0ec6e_496x465.heic 1272w, https://substackcdn.com/image/fetch/$s_!ah5O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962b9eef-68ea-4abd-914f-bf0319e0ec6e_496x465.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ah5O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962b9eef-68ea-4abd-914f-bf0319e0ec6e_496x465.heic" width="496" height="465" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/962b9eef-68ea-4abd-914f-bf0319e0ec6e_496x465.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:465,&quot;width&quot;:496,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7782,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/189610818?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962b9eef-68ea-4abd-914f-bf0319e0ec6e_496x465.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ah5O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962b9eef-68ea-4abd-914f-bf0319e0ec6e_496x465.heic 424w, https://substackcdn.com/image/fetch/$s_!ah5O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962b9eef-68ea-4abd-914f-bf0319e0ec6e_496x465.heic 848w, https://substackcdn.com/image/fetch/$s_!ah5O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962b9eef-68ea-4abd-914f-bf0319e0ec6e_496x465.heic 1272w, https://substackcdn.com/image/fetch/$s_!ah5O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F962b9eef-68ea-4abd-914f-bf0319e0ec6e_496x465.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The Apache Gluten project is continually making an impact on the Spark ecosystem, bringing unique optimization and efficiency. Microsoft Fabric writes an under-the-hood story of adopting Apache Gluten in its Fabric platform. 
</p><p><strong><a href="https://blog.fabric.microsoft.com/en-us/blog/under-the-hood-an-introduction-to-the-native-execution-engine-for-microsoft-fabric/">https://blog.fabric.microsoft.com/en-us/blog/under-the-hood-an-introduction-to-the-native-execution-engine-for-microsoft-fabric/</a></strong></p><div><hr></div><h1>Pinterest: Piqama - Pinterest Quota Management Ecosystem</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WV0P!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dbf8e98-63fe-4f9c-b006-22a2778b6ed6_1400x701.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!WV0P!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dbf8e98-63fe-4f9c-b006-22a2778b6ed6_1400x701.heic 424w, https://substackcdn.com/image/fetch/$s_!WV0P!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dbf8e98-63fe-4f9c-b006-22a2778b6ed6_1400x701.heic 848w, https://substackcdn.com/image/fetch/$s_!WV0P!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dbf8e98-63fe-4f9c-b006-22a2778b6ed6_1400x701.heic 1272w, https://substackcdn.com/image/fetch/$s_!WV0P!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dbf8e98-63fe-4f9c-b006-22a2778b6ed6_1400x701.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WV0P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dbf8e98-63fe-4f9c-b006-22a2778b6ed6_1400x701.heic" width="1400" height="701" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7dbf8e98-63fe-4f9c-b006-22a2778b6ed6_1400x701.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:701,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17308,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/189610818?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dbf8e98-63fe-4f9c-b006-22a2778b6ed6_1400x701.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WV0P!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dbf8e98-63fe-4f9c-b006-22a2778b6ed6_1400x701.heic 424w, https://substackcdn.com/image/fetch/$s_!WV0P!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dbf8e98-63fe-4f9c-b006-22a2778b6ed6_1400x701.heic 848w, https://substackcdn.com/image/fetch/$s_!WV0P!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dbf8e98-63fe-4f9c-b006-22a2778b6ed6_1400x701.heic 1272w, https://substackcdn.com/image/fetch/$s_!WV0P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7dbf8e98-63fe-4f9c-b006-22a2778b6ed6_1400x701.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>As companies scale, manual and static quota systems become bottlenecks, forcing engineers to choose between over-provisioning resources and managing brittle enforcement logic. Pinterest developed Piqama, a unified quota platform that dynamically right-sizes limits using historical data stored in Apache Iceberg, then applies custom enforcement strategies across batch schedulers and online services. 
Piqama centralizes resource governance across hardware and service metrics, enabling teams to optimize capacity allocation while linking consumption directly to financial costs.</p><p><strong><a href="https://medium.com/pinterest-engineering/piqama-pinterest-quota-management-ecosystem-dc7881433bf5">https://medium.com/pinterest-engineering/piqama-pinterest-quota-management-ecosystem-dc7881433bf5</a></strong></p><div><hr></div><h1>LinkedIn: Engineering LinkedIn&#8217;s job ingestion system at scale</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ee5n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7676ea75-a146-4fd6-8136-bf59759a6822_1920x792.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ee5n!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7676ea75-a146-4fd6-8136-bf59759a6822_1920x792.heic 424w, https://substackcdn.com/image/fetch/$s_!Ee5n!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7676ea75-a146-4fd6-8136-bf59759a6822_1920x792.heic 848w, https://substackcdn.com/image/fetch/$s_!Ee5n!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7676ea75-a146-4fd6-8136-bf59759a6822_1920x792.heic 1272w, https://substackcdn.com/image/fetch/$s_!Ee5n!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7676ea75-a146-4fd6-8136-bf59759a6822_1920x792.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Ee5n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7676ea75-a146-4fd6-8136-bf59759a6822_1920x792.heic" width="1456" height="601" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7676ea75-a146-4fd6-8136-bf59759a6822_1920x792.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:601,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:13870,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/189610818?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7676ea75-a146-4fd6-8136-bf59759a6822_1920x792.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ee5n!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7676ea75-a146-4fd6-8136-bf59759a6822_1920x792.heic 424w, https://substackcdn.com/image/fetch/$s_!Ee5n!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7676ea75-a146-4fd6-8136-bf59759a6822_1920x792.heic 848w, https://substackcdn.com/image/fetch/$s_!Ee5n!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7676ea75-a146-4fd6-8136-bf59759a6822_1920x792.heic 1272w, https://substackcdn.com/image/fetch/$s_!Ee5n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7676ea75-a146-4fd6-8136-bf59759a6822_1920x792.heic 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Ingestion systems struggle to scale source onboarding&#8212;hard-coded extraction logic creates engineering bottlenecks that slow integration of new data partners. LinkedIn shifted extraction logic from code to configuration files called Sitemaps, enabling AI tools and browser plugins to onboard sources without engineering deployments. At the same time, a transactional state machine enforces precise failure boundaries across parallel mining tasks. 
The configuration-driven approach reduces onboarding time from weeks to hours, allowing LinkedIn to ingest 20TB daily across thousands of global sources. </p><p><strong><a href="https://www.linkedin.com/blog/engineering/infrastructure/engineering-linkedins-job-ingestion-system-at-scale">https://www.linkedin.com/blog/engineering/infrastructure/engineering-linkedins-job-ingestion-system-at-scale</a></strong></p><div><hr></div><h1>Shopify: The generative recommender behind Shopify&#8217;s commerce engine</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jh6b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f4c783-688f-493d-bd0b-e232f8014d34_1965x1196.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jh6b!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f4c783-688f-493d-bd0b-e232f8014d34_1965x1196.heic 424w, https://substackcdn.com/image/fetch/$s_!jh6b!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f4c783-688f-493d-bd0b-e232f8014d34_1965x1196.heic 848w, https://substackcdn.com/image/fetch/$s_!jh6b!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f4c783-688f-493d-bd0b-e232f8014d34_1965x1196.heic 1272w, https://substackcdn.com/image/fetch/$s_!jh6b!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f4c783-688f-493d-bd0b-e232f8014d34_1965x1196.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!jh6b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f4c783-688f-493d-bd0b-e232f8014d34_1965x1196.heic" width="1456" height="886" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f7f4c783-688f-493d-bd0b-e232f8014d34_1965x1196.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:886,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:13795,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/189610818?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f4c783-688f-493d-bd0b-e232f8014d34_1965x1196.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jh6b!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f4c783-688f-493d-bd0b-e232f8014d34_1965x1196.heic 424w, https://substackcdn.com/image/fetch/$s_!jh6b!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f4c783-688f-493d-bd0b-e232f8014d34_1965x1196.heic 848w, https://substackcdn.com/image/fetch/$s_!jh6b!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f4c783-688f-493d-bd0b-e232f8014d34_1965x1196.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!jh6b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7f4c783-688f-493d-bd0b-e232f8014d34_1965x1196.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Recommendation systems traditionally treat purchases as isolated events, missing the temporal and causal structure that shapes buyer journeys across millions of products. 
Shopify transitioned to an autoregressive sequence model that treats commerce journeys as token sequences, implementing RoPE-inspired rotary encoding combined with relative attention bias to capture temporal gaps and seasonality across its catalog. The time-aware attention mechanism drove +0.94% order growth and +0.71% conversion lift while achieving a 7.3x training speedup through optimized CUDA kernels, enabling Shopify to integrate richer context into a unified generative framework.</p><p><strong><a href="https://shopify.engineering/generative-recommendations">https://shopify.engineering/generative-recommendations</a></strong></p><div><hr></div><h1>Alibaba: PostgreSQL Blink-tree Implementation</h1><p>As we increasingly use AI to code, understanding database internals is more critical than ever. Alibaba Cloud engineers break down how PostgreSQL utilizes the <strong><a href="https://pages.cs.wisc.edu/~yxy/cs764-f22/slides/L15.pdf">Blink-tree</a></strong> architecture to achieve massive concurrency. By adding link pointers to sibling nodes and high keys to mark boundaries, PostgreSQL allows searches to proceed without lock-coupling. This enables the system to gracefully handle concurrent page splits&#8212;following links when data exceeds old boundaries&#8212;and significantly outperforms the more rigid <strong><a href="https://kernelmaker.github.io/MySQL-Lock-1">lock-subtree approach</a></strong> used in MySQL&#8217;s InnoDB.</p><p><strong><a href="https://www.alibabacloud.com/blog/postgresql-blink-tree-implementation_602913">https://www.alibabacloud.com/blog/postgresql-blink-tree-implementation_602913</a></strong></p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. 
All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[Data Engineering After AI]]></title><description><![CDATA[Moving Data Was Never the Point. Meaning It Is.]]></description><link>https://www.dataengineeringweekly.com/p/data-engineering-after-ai</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/data-engineering-after-ai</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Tue, 24 Feb 2026 03:03:53 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/2da46ceb-78fd-4718-9ccb-7afb113096ec_1154x486.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A few days back, I ran a LinkedIn poll asking what stays core to software engineering as AI increasingly writes the code. 53% said architecture and trade-offs. 20% said quality and ownership, and 25% said product and problem discovery.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uwq8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e82120a-d802-4411-8a58-914f27f6ef24_948x610.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uwq8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e82120a-d802-4411-8a58-914f27f6ef24_948x610.png 424w, https://substackcdn.com/image/fetch/$s_!uwq8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e82120a-d802-4411-8a58-914f27f6ef24_948x610.png 848w, 
https://substackcdn.com/image/fetch/$s_!uwq8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e82120a-d802-4411-8a58-914f27f6ef24_948x610.png 1272w, https://substackcdn.com/image/fetch/$s_!uwq8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e82120a-d802-4411-8a58-914f27f6ef24_948x610.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uwq8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e82120a-d802-4411-8a58-914f27f6ef24_948x610.png" width="948" height="610" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3e82120a-d802-4411-8a58-914f27f6ef24_948x610.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:610,&quot;width&quot;:948,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:89112,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188977018?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e82120a-d802-4411-8a58-914f27f6ef24_948x610.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uwq8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e82120a-d802-4411-8a58-914f27f6ef24_948x610.png 424w, 
https://substackcdn.com/image/fetch/$s_!uwq8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e82120a-d802-4411-8a58-914f27f6ef24_948x610.png 848w, https://substackcdn.com/image/fetch/$s_!uwq8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e82120a-d802-4411-8a58-914f27f6ef24_948x610.png 1272w, https://substackcdn.com/image/fetch/$s_!uwq8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e82120a-d802-4411-8a58-914f27f6ef24_948x610.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>The poll wasn&#8217;t specifically about data engineering, but the answer it yielded applies directly to us. When AI can generate a pipeline as fluently as a senior engineer, the question isn&#8217;t whether our toolbox is changing &#8212; it clearly is. The question is what kind of thinking has always been too important to automate, and why we let it get buried under the more mechanical work in the first place.</p><p>My answer is that the irreducible work was never about moving data. It was always about meaning. And the framework we&#8217;ve been using &#8212; ETL &#8212; was never really designed to capture meaning.</p><div><hr></div><h1>The ETL Era and Why It&#8217;s Ending</h1><p>Extract, Transform, Load made sense as a job description for a specific historical moment. Source systems were siloed, formats were inconsistent, and somebody had to write the code that moved data from where it lived to where it could be used. 
The data engineer was that somebody.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KwLr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2bc94f-81e7-4ad2-94fe-b09ee17e9668_1228x346.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KwLr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2bc94f-81e7-4ad2-94fe-b09ee17e9668_1228x346.png 424w, https://substackcdn.com/image/fetch/$s_!KwLr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2bc94f-81e7-4ad2-94fe-b09ee17e9668_1228x346.png 848w, https://substackcdn.com/image/fetch/$s_!KwLr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2bc94f-81e7-4ad2-94fe-b09ee17e9668_1228x346.png 1272w, https://substackcdn.com/image/fetch/$s_!KwLr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2bc94f-81e7-4ad2-94fe-b09ee17e9668_1228x346.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KwLr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2bc94f-81e7-4ad2-94fe-b09ee17e9668_1228x346.png" width="1228" height="346" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fa2bc94f-81e7-4ad2-94fe-b09ee17e9668_1228x346.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:346,&quot;width&quot;:1228,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:497725,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188977018?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2bc94f-81e7-4ad2-94fe-b09ee17e9668_1228x346.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KwLr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2bc94f-81e7-4ad2-94fe-b09ee17e9668_1228x346.png 424w, https://substackcdn.com/image/fetch/$s_!KwLr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2bc94f-81e7-4ad2-94fe-b09ee17e9668_1228x346.png 848w, https://substackcdn.com/image/fetch/$s_!KwLr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2bc94f-81e7-4ad2-94fe-b09ee17e9668_1228x346.png 1272w, https://substackcdn.com/image/fetch/$s_!KwLr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa2bc94f-81e7-4ad2-94fe-b09ee17e9668_1228x346.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>But if we&#8217;re honest, the transformation step was always the most brittle part. Teams encoded business rules as SQL logic or Python functions, buried them in pipeline code, version-controlled them alongside infrastructure, but rarely treated them with the same rigor as application code. When the definition of &#8220;active user&#8221; changed &#8212; and it always changed &#8212; someone had to find every place that definition lived and update it, hoping they caught them all.</p><p>AI is now competent at generating this kind of code. Not perfect, but competent enough that the mechanical work of pipeline construction is no longer a meaningful differentiator. If your professional identity is built around being good at writing transformation logic, that identity is under pressure.</p><p>But this isn&#8217;t a story about loss. 
It&#8217;s a story about clarity. The mechanical work was always obscuring the more important work underneath it. AI forcing that reckoning is, in a strange way, a gift.</p><div><hr></div><h1>Introducing ECL &#8212; Extract, Contextualize, Link</h1><p>The framework emerging as a replacement isn&#8217;t a technical architecture so much as a reorientation of purpose. Instead of Extract, Transform, Load, think Extract, Contextualize, Link.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gXAy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F500e7461-2805-49c0-9833-d0b54e04421e_1280x528.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gXAy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F500e7461-2805-49c0-9833-d0b54e04421e_1280x528.png 424w, https://substackcdn.com/image/fetch/$s_!gXAy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F500e7461-2805-49c0-9833-d0b54e04421e_1280x528.png 848w, https://substackcdn.com/image/fetch/$s_!gXAy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F500e7461-2805-49c0-9833-d0b54e04421e_1280x528.png 1272w, https://substackcdn.com/image/fetch/$s_!gXAy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F500e7461-2805-49c0-9833-d0b54e04421e_1280x528.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!gXAy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F500e7461-2805-49c0-9833-d0b54e04421e_1280x528.png" width="1280" height="528" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/500e7461-2805-49c0-9833-d0b54e04421e_1280x528.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:528,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:972606,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188977018?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F500e7461-2805-49c0-9833-d0b54e04421e_1280x528.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gXAy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F500e7461-2805-49c0-9833-d0b54e04421e_1280x528.png 424w, https://substackcdn.com/image/fetch/$s_!gXAy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F500e7461-2805-49c0-9833-d0b54e04421e_1280x528.png 848w, https://substackcdn.com/image/fetch/$s_!gXAy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F500e7461-2805-49c0-9833-d0b54e04421e_1280x528.png 1272w, https://substackcdn.com/image/fetch/$s_!gXAy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F500e7461-2805-49c0-9833-d0b54e04421e_1280x528.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Extract remains. Data still needs to move from source systems to analytical environments, and that work still requires engineering judgment &#8212; about reliability, latency, volume, and failure modes. AI will increasingly handle the mechanical parts, but the architectural decisions about what to extract, when, and how belong to people who understand both the source systems and the downstream consequences.</p><p>Contextualize is where the real shift happens. 
This is the work of giving data semantic meaning &#8212; understanding that &#8220;revenue&#8221; is calculated differently by Finance and Sales, that a timestamp in a clickstream event means something different than a timestamp in a billing record, that a null value in one system represents the absence of information while in another it represents an explicit user choice. AI can draft this work at scale &#8212; inferring field definitions, classifying entities, and mapping relationships across a data landscape that no human team could manually annotate in full. What AI cannot do is be accountable for itself. The judgment of whether an inference is correct, the organizational authority to declare a definition, the decision to formalize a discovered pattern into an enforced contract &#8212; that belongs to humans. Contextualize is where AI inference and human judgment meet, structured by a pipeline built specifically for that purpose.</p><p>Link is about entity relationships across the data landscape &#8212; connecting a customer record in your CRM to a user record in your product database, linking an event in your analytics system to a session in your support tool. As AI generates more of the code that consumes data, the ability to reason about how entities relate across systems becomes more valuable, not less. Linkage is what makes context portable &#8212; what allows the meaning built in one part of the landscape to be grounded in its relationships to the rest.</p><p>The rest of this article discusses how ECL works at the architectural level, not as three abstract concepts, but as three concrete pipelines &#8212; and why you need all of them.</p><div><hr></div><h1>Early Binding &#8212; Contracts as Executable Constraints</h1><p>The first technique is early binding: capturing semantic intent at the point of data production, before the data moves.</p><p>Data contracts are the practical implementation of this idea. 
At their core, contracts are agreements between data producers and their consumers &#8212; specifying schema, data quality expectations, ownership, and the semantic meaning of each field.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g3D-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb651c-b309-4861-ba49-0e142c836729_1234x404.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g3D-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb651c-b309-4861-ba49-0e142c836729_1234x404.png 424w, https://substackcdn.com/image/fetch/$s_!g3D-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb651c-b309-4861-ba49-0e142c836729_1234x404.png 848w, https://substackcdn.com/image/fetch/$s_!g3D-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb651c-b309-4861-ba49-0e142c836729_1234x404.png 1272w, https://substackcdn.com/image/fetch/$s_!g3D-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb651c-b309-4861-ba49-0e142c836729_1234x404.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g3D-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb651c-b309-4861-ba49-0e142c836729_1234x404.png" width="1234" height="404" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ddbb651c-b309-4861-ba49-0e142c836729_1234x404.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:404,&quot;width&quot;:1234,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:752776,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188977018?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb651c-b309-4861-ba49-0e142c836729_1234x404.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g3D-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb651c-b309-4861-ba49-0e142c836729_1234x404.png 424w, https://substackcdn.com/image/fetch/$s_!g3D-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb651c-b309-4861-ba49-0e142c836729_1234x404.png 848w, https://substackcdn.com/image/fetch/$s_!g3D-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb651c-b309-4861-ba49-0e142c836729_1234x404.png 1272w, https://substackcdn.com/image/fetch/$s_!g3D-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb651c-b309-4861-ba49-0e142c836729_1234x404.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>Data Engineering Weekly identified this gap precisely in their piece <em><strong><a href="https://www.dataengineeringweekly.com/p/data-contracts-a-missed-opportunity">Data Contracts: A Missed Opportunity</a></strong></em>. While the data industry was debating what contracts were and drafting governance frameworks to describe them, software engineering had quietly converged on a different organizing principle: treating specifications as executable constraints with real failure semantics. The data industry treated contracts as documentation. Software engineers treated them as interfaces &#8212; things that could break, that had versioning implications, that enforced behavior rather than merely describing it.</p><p>A data contract that lives in a wiki and gets updated when someone remembers is the documentation. 
A data contract that is enforced at the point of production &#8212; that fails a pipeline when a schema changes without notice, that alerts a consumer when quality thresholds are violated, that an AI agent can reason about deterministically &#8212; that is architecture.</p><p>This matters more in an AI-heavy world, not less. When AI agents generate transformation code, bad contracts are amplified at scale. The agent will faithfully implement whatever logic it&#8217;s given; if the contract governing its inputs is ambiguous or unenforced, the errors it produces will be systematic rather than isolated. Early binding is the mechanism by which human intent gets formalized into something AI can actually work with.</p><p>But early binding alone has a fundamental limitation. And understanding that limitation is what makes the Contextualize pipeline necessary.</p><div><hr></div><h1>The Problem Early Binding Alone Can&#8217;t Solve</h1><p>Consider what happens to a well-contracted dataset as it moves through a modern Medallion architecture.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vwwM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ca03c22-9a4d-4a7d-a244-279af04bee7f_1237x321.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vwwM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ca03c22-9a4d-4a7d-a244-279af04bee7f_1237x321.png 424w, https://substackcdn.com/image/fetch/$s_!vwwM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ca03c22-9a4d-4a7d-a244-279af04bee7f_1237x321.png 848w, 
https://substackcdn.com/image/fetch/$s_!vwwM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ca03c22-9a4d-4a7d-a244-279af04bee7f_1237x321.png 1272w, https://substackcdn.com/image/fetch/$s_!vwwM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ca03c22-9a4d-4a7d-a244-279af04bee7f_1237x321.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vwwM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ca03c22-9a4d-4a7d-a244-279af04bee7f_1237x321.png" width="1237" height="321" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1ca03c22-9a4d-4a7d-a244-279af04bee7f_1237x321.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:321,&quot;width&quot;:1237,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:448249,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188977018?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ca03c22-9a4d-4a7d-a244-279af04bee7f_1237x321.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vwwM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ca03c22-9a4d-4a7d-a244-279af04bee7f_1237x321.png 424w, 
https://substackcdn.com/image/fetch/$s_!vwwM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ca03c22-9a4d-4a7d-a244-279af04bee7f_1237x321.png 848w, https://substackcdn.com/image/fetch/$s_!vwwM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ca03c22-9a4d-4a7d-a244-279af04bee7f_1237x321.png 1272w, https://substackcdn.com/image/fetch/$s_!vwwM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ca03c22-9a4d-4a7d-a244-279af04bee7f_1237x321.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>At the Bronze layer, data lands close to its source &#8212; raw, minimally transformed, the contract&#8217;s guarantees still largely intact. Silver applies conformance rules: deduplication, type casting, and light standardization. By the time data reaches Gold, the pipeline has made a series of editorial decisions on the data&#8217;s behalf. Aggregations collapse granular events into metrics. Engineers bake business logic into the shape of the table. The Gold layer is an artifact optimized for a specific set of questions &#8212; the ones that seemed important when the pipeline was built.</p><p>Early binding contracts help at the source, but they can&#8217;t prevent this erosion at every subsequent hop &#8212; especially when those contracts are treated as descriptive rather than executable. If there&#8217;s no enforcement mechanism preventing meaning from drifting across transformations, the telephone game plays out silently in your pipeline. By the time a consumer queries the Gold layer, they&#8217;re working with an artifact whose original intent may be several editorial decisions removed from the contract.</p><p>This is the problem that early binding alone cannot solve. Each transformation layer progressively collapses the context captured at the source. You need a complementary approach&#8212;one that preserves the ability to recover context when it&#8217;s actually needed.</p><div><hr></div><h1>Late Binding &#8212; The Agentic Contextualized Pipeline</h1><p>Traditional late binding deferred the <em>application</em> of business rules to query time. What it didn&#8217;t defer was the <em>definition</em> of those rules &#8212; domain experts still had to specify them upfront, just applied through a semantic layer rather than baked into a physical table. 
In complex domains, that knowledge engineering process was its own bottleneck.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!C5NB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e69cc6d-3401-44c7-b2d7-9be8a00d3380_1300x378.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!C5NB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e69cc6d-3401-44c7-b2d7-9be8a00d3380_1300x378.png 424w, https://substackcdn.com/image/fetch/$s_!C5NB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e69cc6d-3401-44c7-b2d7-9be8a00d3380_1300x378.png 848w, https://substackcdn.com/image/fetch/$s_!C5NB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e69cc6d-3401-44c7-b2d7-9be8a00d3380_1300x378.png 1272w, https://substackcdn.com/image/fetch/$s_!C5NB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e69cc6d-3401-44c7-b2d7-9be8a00d3380_1300x378.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!C5NB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e69cc6d-3401-44c7-b2d7-9be8a00d3380_1300x378.png" width="1300" height="378" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3e69cc6d-3401-44c7-b2d7-9be8a00d3380_1300x378.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:378,&quot;width&quot;:1300,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:733215,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188977018?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e69cc6d-3401-44c7-b2d7-9be8a00d3380_1300x378.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!C5NB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e69cc6d-3401-44c7-b2d7-9be8a00d3380_1300x378.png 424w, https://substackcdn.com/image/fetch/$s_!C5NB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e69cc6d-3401-44c7-b2d7-9be8a00d3380_1300x378.png 848w, https://substackcdn.com/image/fetch/$s_!C5NB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e69cc6d-3401-44c7-b2d7-9be8a00d3380_1300x378.png 1272w, https://substackcdn.com/image/fetch/$s_!C5NB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e69cc6d-3401-44c7-b2d7-9be8a00d3380_1300x378.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>The more forward-looking approach is to defer definition itself &#8212; and hand that work to a dedicated pipeline.</p><p>The Contextualize pipeline is a separate, agentic pipeline that runs alongside your data infrastructure. Its job is singular: build and maintain a living, validated store of semantic context. It isn&#8217;t part of the Extract pipeline. It isn&#8217;t a query-time process. It&#8217;s a first-class engineering artifact with its own triggering model, validation layer, and storage.</p><p>The trigger is event-driven, not scheduled. Every new dataset that lands automatically kicks off the pipeline. For existing datasets, continuous profiling monitors for meaningful changes &#8212; a new column appears, a column is dropped, a data distribution shifts in ways that suggest something changed upstream. 
Any of these events re-triggers the pipeline for the affected entities. Semantic context isn&#8217;t a one-time annotation exercise. It tracks the data as it evolves.</p><p>The pipeline itself is agentic. An AI agent analyzes the incoming data &#8212; schema, sample values, statistical profiles, lineage &#8212; and infers semantic meaning. What does this field represent? What business entity does it belong to? What relationships exist between it and other data in the landscape? It produces structured, versioned context artifacts: inferences about meaning that didn&#8217;t require a domain expert to pre-specify every scenario.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2K_z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dad946a-2f01-4407-ac7b-1ec7acbbf60b_1129x464.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2K_z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dad946a-2f01-4407-ac7b-1ec7acbbf60b_1129x464.png 424w, https://substackcdn.com/image/fetch/$s_!2K_z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dad946a-2f01-4407-ac7b-1ec7acbbf60b_1129x464.png 848w, https://substackcdn.com/image/fetch/$s_!2K_z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dad946a-2f01-4407-ac7b-1ec7acbbf60b_1129x464.png 1272w, https://substackcdn.com/image/fetch/$s_!2K_z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dad946a-2f01-4407-ac7b-1ec7acbbf60b_1129x464.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!2K_z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dad946a-2f01-4407-ac7b-1ec7acbbf60b_1129x464.png" width="1129" height="464" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1dad946a-2f01-4407-ac7b-1ec7acbbf60b_1129x464.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:464,&quot;width&quot;:1129,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:646894,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188977018?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dad946a-2f01-4407-ac7b-1ec7acbbf60b_1129x464.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2K_z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dad946a-2f01-4407-ac7b-1ec7acbbf60b_1129x464.png 424w, https://substackcdn.com/image/fetch/$s_!2K_z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dad946a-2f01-4407-ac7b-1ec7acbbf60b_1129x464.png 848w, https://substackcdn.com/image/fetch/$s_!2K_z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dad946a-2f01-4407-ac7b-1ec7acbbf60b_1129x464.png 1272w, https://substackcdn.com/image/fetch/$s_!2K_z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dad946a-2f01-4407-ac7b-1ec7acbbf60b_1129x464.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Those inferences don&#8217;t automatically commit. They route to a validation layer that works like a labeling workflow &#8212; because structurally, it is one. An LLM-as-Judge validates high-confidence inferences before any human review triggers. Medium-confidence ones surface to domain experts for labeling. The pipeline flags low-confidence or contested inferences for deeper investigation. The humans aren&#8217;t reviewing every artifact; they&#8217;re reviewing the uncertain ones. 
Every labeling automation technique that works in ML pipelines applies here.</p><p>Validated artifacts land in a Context Store &#8212; a dedicated, versioned, queryable store of semantic definitions, entity classifications, and relationship maps. This is the new infrastructure component that ECL requires. Downstream agents don&#8217;t query raw data and infer meaning on the fly. They query the Context Store first, ground their understanding in validated context, and then query the data. The context is stable, reusable, and auditable &#8212; the opposite of ephemeral query-time inference.</p><div><hr></div><h1>Early Binding vs Late Binding &#8212; When to Choose What</h1><p>The decision criterion isn&#8217;t about semantic maturity or how well-understood a domain is. It&#8217;s about where the data comes from relative to your accountability boundary.</p><p>When a dataset originates within a controlled environment &#8212; produced by a team or system within your organization&#8217;s sphere of accountability &#8212; early binding is the right tool. The producer and consumer share an organizational context. Contracts can be negotiated, enforced, and held to. The producing team can be made accountable for the schema they declare and the semantics they commit to. Prescribed context is possible because the relationship that makes it enforceable exists.</p><p>When a dataset originates outside that boundary &#8212; third-party feeds, partner data, public datasets, marketplace sources &#8212; that relationship doesn&#8217;t exist. You cannot hold an external provider to a data contract. The schema can change without notice. The semantics are inferred, not declared. This is where the Contextualize pipeline earns its place. Discovered context is the only kind available.</p><p>But the boundary is not purely organizational. 
Poorly governed internal data &#8212; produced by a team with no accountability to its consumers, with undocumented schemas and inconsistent definitions &#8212; is effectively uncontrolled even if it sits within the same organization. The real test is not position on an org chart. It is accountability. Bind early where accountability exists. Let the Contextualize pipeline discover context where it doesn&#8217;t.</p><p>The feedback loop holds in both directions. Discovered context built up through repeated profiling, inference, and validation can graduate into prescribed context over time. An external dataset that your organization ingests consistently enough to profile, validate, and republish as an internal data product crosses the boundary from uncontrolled to controlled at that point. The Contextualize pipeline is what makes that transition possible &#8212; and makes the resulting contract trustworthy rather than assumed.</p><p>A data environment that treats all data as early-bindable is brittle. It can contract only what it already understands, and it has no mechanism for the uncontrolled data that makes up a growing share of the analytical landscape. A data environment that treats all data as requiring discovery never formalizes what it learns into enforceable guarantees. The architecture that works reads the accountability boundary correctly and applies the right technique on both sides.</p><div><hr></div><h1>Context Propagation &#8212; The Relay, Not the Pipeline</h1><p>With three pipelines now in play, the question becomes: how does context actually travel through the architecture without getting lost?</p><p>The conventional mental model is wrong. Context doesn&#8217;t travel <em>through</em> the data pipeline; if it did, it would be lost at every transformation step, which is precisely the Medallion erosion problem. Context travels <em>alongside</em> the pipeline, as metadata, lineage records, and contract provenance. 
The transformations change the data; the metadata preserves the meaning.</p><p>The relay works like this. Early binding stamps prescribed context at the point of origin &#8212; schema, field-level semantics, producing-team ownership, quality thresholds &#8212; as an executable contract living in metadata, not in column values. Lineage tooling propagates this through Bronze, Silver, and Gold, maintaining a record of the transformations applied and the contract that governed the data at each stage. The Contextualize pipeline reads that lineage as part of its inference process &#8212; understanding not just what a field looks like today, but also the history of how it arrived and the commitments made about it at the source. Validated inferences land in the Context Store, which becomes the relay&#8217;s destination: a durable, queryable record of what the data means, grounded in both the original contract and the accumulated lineage.</p><p>The analogy that makes this concrete is Git. A file can be modified heavily across dozens of commits &#8212; refactored, renamed, moved, rewritten &#8212; but the context of how it got there is not lost, because it lives in the commit history, not in the file itself. The Gold layer is the latest commit. The lineage graph is the git log. The Context Store is the understanding you build by reading that log systematically rather than hoping the current file tells the whole story.</p><p>This reframing &#8212; from pipeline to relay &#8212; changes what data engineers are actually responsible for building. The transformations are increasingly automatable. 
The metadata infrastructure, the lineage graph, the Contextualize pipeline that reads it, the Context Store that accumulates from it &#8212; that is the engineering surface that requires sustained human judgment.</p><div><hr></div><h1>The Context Store as the New Engineering Surface</h1><p>Which brings us to where the most consequential engineering work has migrated.</p><p>The Context Store is where business definitions live &#8212; not as documentation in a wiki, not as logic engineers have baked into a Gold table, but as versioned, validated artifacts that downstream systems can query and trust. This is where the validation workflow resolves the competing interpretations of &#8220;revenue&#8221; from Finance and Sales &#8212; not through organizational politics, but through a confidence-based process that determines which inference earns formalization. It is also where AI consumers find the grounded, stable context they need to act reliably rather than reverting to ad hoc inference.</p><p>This surface distinguishes queryable data from trustworthy data. A table can be perfectly partitioned, indexed, and replicated while being semantically wrong &#8212; built on a definition that drifted from its source contract three transformations ago and was never caught because no Contextualize pipeline was watching. The Context Store is where that failure mode gets closed.</p><p>As AI generates more transformation code and AI agents consume more data at scale, the stakes of this surface rise. An agent operating on a stale or conflicting context artifact produces systematic errors rather than recoverable ones. The engineering work that governs trustworthiness &#8212; designing the trigger model for the Contextualize pipeline, structuring the labeling workflow, deciding what validation confidence threshold earns formalization, and versioning context artifacts as definitions evolve &#8212; requires human judgment at every step.</p><p>Practitioners are still working out the patterns for doing this at scale. 
The tooling is maturing. How organizations govern ownership of the Context Store, adjudicate conflicts between teams, and manage the graduation from discovered to prescribed context are genuinely open questions. This is where the frontier actually is.</p><div><hr></div><h1>The New Data Engineer &#8212; Context Architect</h1><p>Return to the poll. 53% said architecture and trade-offs are what remain irreducibly human. In the data engineering context, ECL is what that looks like in practice.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_X4x!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c872ba5-b9fe-4282-a931-32886719e1d5_1154x486.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_X4x!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c872ba5-b9fe-4282-a931-32886719e1d5_1154x486.png 424w, https://substackcdn.com/image/fetch/$s_!_X4x!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c872ba5-b9fe-4282-a931-32886719e1d5_1154x486.png 848w, https://substackcdn.com/image/fetch/$s_!_X4x!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c872ba5-b9fe-4282-a931-32886719e1d5_1154x486.png 1272w, https://substackcdn.com/image/fetch/$s_!_X4x!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c872ba5-b9fe-4282-a931-32886719e1d5_1154x486.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!_X4x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c872ba5-b9fe-4282-a931-32886719e1d5_1154x486.png" width="1154" height="486" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2c872ba5-b9fe-4282-a931-32886719e1d5_1154x486.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:486,&quot;width&quot;:1154,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:784728,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188977018?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c872ba5-b9fe-4282-a931-32886719e1d5_1154x486.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_X4x!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c872ba5-b9fe-4282-a931-32886719e1d5_1154x486.png 424w, https://substackcdn.com/image/fetch/$s_!_X4x!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c872ba5-b9fe-4282-a931-32886719e1d5_1154x486.png 848w, https://substackcdn.com/image/fetch/$s_!_X4x!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c872ba5-b9fe-4282-a931-32886719e1d5_1154x486.png 1272w, https://substackcdn.com/image/fetch/$s_!_X4x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c872ba5-b9fe-4282-a931-32886719e1d5_1154x486.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The data engineer of the next decade owns the architecture of meaning. They design the contractual foundations at the source&#8212;executable, enforceable, versioned. They build the lineage infrastructure that carries context through every transformation layer without losing it. They design and govern the Contextualize pipeline and the Context Store &#8212; the infrastructure where inferences get built, validated, and formalized into the definitions that everything downstream depends on. 
They understand when to prescribe context upfront and when to let it be discovered, and they build the systems that make both possible.</p><p>But this is not only a technical role. Context erosion is as much an organizational failure as a technical one. Teams don&#8217;t share semantic definitions because no ownership model incentivizes them to do so. Nobody enforces contracts because producing teams have no accountability to the consumers they serve. In this new frame, the data engineer is the person who builds both the technical system and the organizational agreement that holds it together. They sit at the intersection of architecture and coordination &#8212; the two things the poll respondents correctly identified as irreducibly human.</p><p>The title &#8220;Data Engineer&#8221; may need an update. What we are actually describing is a Context Architect &#8212; someone whose primary material is not data movement but data meaning, not pipelines but provenance, not transformation logic but the semantic infrastructure that makes transformation logic trustworthy.</p><div><hr></div><h1>An Open Frontier</h1><p>I want to be honest about what ECL is and what it isn&#8217;t. It is a reorientation &#8212; a way of thinking about what the work actually is, now that AI is handling more of what the work used to look like. It is not a finished methodology. The tooling that links early binding contracts to the Contextualize pipeline and Context Store is still maturing. The organizational patterns for governing who owns the Context Store, how conflicts between teams get adjudicated, and how discovered context earns formalization don&#8217;t yet have established templates. Practitioners are still working out, as they go, the engineering patterns for building contextual pipelines that operate reliably at scale in production environments.</p><p>That&#8217;s precisely what makes this moment worth paying close attention to. The frontier is genuinely open. 
The practitioners who invest in the architectural and organizational work of context &#8212; who treat contracts as executable infrastructure, who build lineage as a first-class engineering concern, who govern the Contextualize pipeline and Context Store as seriously as they once owned the ETL pipeline &#8212; will define the discipline for the decade ahead.</p><p>The 53% who said architecture and trade-offs are irreducibly human were right. We didn&#8217;t yet know which architecture, or which trade-offs.</p><p>Now we do.</p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[Data Engineering Weekly #258]]></title><description><![CDATA[The Weekly Data Engineering Newsletter]]></description><link>https://www.dataengineeringweekly.com/p/data-engineering-weekly-257-19d</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/data-engineering-weekly-257-19d</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Mon, 23 Feb 2026 01:00:43 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AdQk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6f51c1bf-abc9-4cd3-ad69-22cc8e3f1ef2_1080x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=02_22_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source 
type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pcFF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ec3cc0-0f2e-4483-bb91-5847d16f9c12_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!pcFF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ec3cc0-0f2e-4483-bb91-5847d16f9c12_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!pcFF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ec3cc0-0f2e-4483-bb91-5847d16f9c12_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!pcFF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ec3cc0-0f2e-4483-bb91-5847d16f9c12_2400x1260.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pcFF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ec3cc0-0f2e-4483-bb91-5847d16f9c12_2400x1260.heic" width="1456" height="764" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/30ec3cc0-0f2e-4483-bb91-5847d16f9c12_2400x1260.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:28626,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=02_22_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188850941?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ec3cc0-0f2e-4483-bb91-5847d16f9c12_2400x1260.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pcFF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ec3cc0-0f2e-4483-bb91-5847d16f9c12_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!pcFF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ec3cc0-0f2e-4483-bb91-5847d16f9c12_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!pcFF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ec3cc0-0f2e-4483-bb91-5847d16f9c12_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!pcFF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ec3cc0-0f2e-4483-bb91-5847d16f9c12_2400x1260.heic 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h1>AI is moving fast. Is your data platform ready?</h1><p>AI is reshaping how data teams operate. 
But legacy pipelines, brittle workflows, and fragmented tooling weren&#8217;t designed for this shift.<br>Learn how leading teams are future-proofing their infrastructure before AI demands overwhelm it.</p><p><strong><a href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=02_22_26_data_engineering_weekly">Download the AI Modernization Guide</a></strong></p><div><hr></div><h1>Garry Tan: Half the AI Agent Market Is One Category. The Rest Is Wide Open</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Pr4c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd197d5-0d12-4a80-ab7c-50c28896cc3e_1200x718.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Pr4c!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd197d5-0d12-4a80-ab7c-50c28896cc3e_1200x718.heic 424w, https://substackcdn.com/image/fetch/$s_!Pr4c!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd197d5-0d12-4a80-ab7c-50c28896cc3e_1200x718.heic 848w, https://substackcdn.com/image/fetch/$s_!Pr4c!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd197d5-0d12-4a80-ab7c-50c28896cc3e_1200x718.heic 1272w, https://substackcdn.com/image/fetch/$s_!Pr4c!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd197d5-0d12-4a80-ab7c-50c28896cc3e_1200x718.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Pr4c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd197d5-0d12-4a80-ab7c-50c28896cc3e_1200x718.heic" width="1200" height="718" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bdd197d5-0d12-4a80-ab7c-50c28896cc3e_1200x718.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:718,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16453,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188850941?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd197d5-0d12-4a80-ab7c-50c28896cc3e_1200x718.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Pr4c!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd197d5-0d12-4a80-ab7c-50c28896cc3e_1200x718.heic 424w, https://substackcdn.com/image/fetch/$s_!Pr4c!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd197d5-0d12-4a80-ab7c-50c28896cc3e_1200x718.heic 848w, https://substackcdn.com/image/fetch/$s_!Pr4c!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd197d5-0d12-4a80-ab7c-50c28896cc3e_1200x718.heic 1272w, https://substackcdn.com/image/fetch/$s_!Pr4c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd197d5-0d12-4a80-ab7c-50c28896cc3e_1200x718.heic 
1456w" sizes="100vw"></picture></div></a></figure></div><p>AI agents thrive in reinforcement learning (RL) environments with a verifiable target and quick feedback. Software manufacturing is a perfect model for such an environment, but the challenge persists in other categories. It will be an exciting decade for software engineering as we build new infrastructure that we never imagined.
</p><p><strong><a href="https://garryslist.org/posts/half-the-ai-agent-market-is-one-category-the-rest-is-wide-open">https://garryslist.org/posts/half-the-ai-agent-market-is-one-category-the-rest-is-wide-open</a></strong></p><div><hr></div><h1>LangChain: How to Use Memory in Agent Builder</h1><p>Agents fail to improve over time when they treat every conversation as stateless and discard learned preferences or workflows. The article explains how LangChain&#8217;s Agent Builder implements short-term and long-term memory as a filesystem of Markdown files, enabling persistent instructions and reusable skills. Explicit memory updates, modular skill loading, and direct file editing enable agents to reliably evolve behavior without increasing core prompt complexity.</p><p><strong><a href="https://blog.langchain.com/how-to-use-memory-in-agent-builder/">https://blog.langchain.com/how-to-use-memory-in-agent-builder/</a></strong></p><div><hr></div><h1>LinkedIn: Scaling LLM-Based ranking systems with SGLang at LinkedIn</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9eIl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc187884f-7b73-4d47-b2a2-810f3251a56e_1024x571.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9eIl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc187884f-7b73-4d47-b2a2-810f3251a56e_1024x571.heic 424w, https://substackcdn.com/image/fetch/$s_!9eIl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc187884f-7b73-4d47-b2a2-810f3251a56e_1024x571.heic 848w, 
https://substackcdn.com/image/fetch/$s_!9eIl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc187884f-7b73-4d47-b2a2-810f3251a56e_1024x571.heic 1272w, https://substackcdn.com/image/fetch/$s_!9eIl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc187884f-7b73-4d47-b2a2-810f3251a56e_1024x571.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9eIl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc187884f-7b73-4d47-b2a2-810f3251a56e_1024x571.heic" width="1024" height="571" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c187884f-7b73-4d47-b2a2-810f3251a56e_1024x571.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:571,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12896,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188850941?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc187884f-7b73-4d47-b2a2-810f3251a56e_1024x571.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9eIl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc187884f-7b73-4d47-b2a2-810f3251a56e_1024x571.heic 424w, 
https://substackcdn.com/image/fetch/$s_!9eIl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc187884f-7b73-4d47-b2a2-810f3251a56e_1024x571.heic 848w, https://substackcdn.com/image/fetch/$s_!9eIl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc187884f-7b73-4d47-b2a2-810f3251a56e_1024x571.heic 1272w, https://substackcdn.com/image/fetch/$s_!9eIl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc187884f-7b73-4d47-b2a2-810f3251a56e_1024x571.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>LLM-based ranking systems face strict latency and concurrency constraints because they score thousands of items per query, even though they never need to generate text. The article explains how LinkedIn optimized SGLang for prefill-only ranking through batching improvements, scoring-only execution paths, prefix KV reuse, and Python runtime parallelization.</p><p><strong><a href="https://www.linkedin.com/blog/engineering/ai/scaling-llm-based-ranking-systems-with-sglang-at-linkedin">https://www.linkedin.com/blog/engineering/ai/scaling-llm-based-ranking-systems-with-sglang-at-linkedin</a></strong></p><div><hr></div><h1>Sponsored: The Scaling Data Teams Guide</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/how-to-scale-data-teams-ebook?utm_campaign=27879954-25-11-DMND_eBook_Scaling_Data_Teams&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=scaling_data_teams_ebook&amp;utm_content=02_22_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Uaud!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3d41c2-93f6-40dc-ad7d-a310216a1922_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!Uaud!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3d41c2-93f6-40dc-ad7d-a310216a1922_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!Uaud!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3d41c2-93f6-40dc-ad7d-a310216a1922_3840x2160.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!Uaud!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3d41c2-93f6-40dc-ad7d-a310216a1922_3840x2160.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Uaud!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3d41c2-93f6-40dc-ad7d-a310216a1922_3840x2160.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf3d41c2-93f6-40dc-ad7d-a310216a1922_3840x2160.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25368,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/how-to-scale-data-teams-ebook?utm_campaign=27879954-25-11-DMND_eBook_Scaling_Data_Teams&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=scaling_data_teams_ebook&amp;utm_content=02_22_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188850941?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3d41c2-93f6-40dc-ad7d-a310216a1922_3840x2160.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Uaud!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3d41c2-93f6-40dc-ad7d-a310216a1922_3840x2160.heic 424w, 
https://substackcdn.com/image/fetch/$s_!Uaud!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3d41c2-93f6-40dc-ad7d-a310216a1922_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!Uaud!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3d41c2-93f6-40dc-ad7d-a310216a1922_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!Uaud!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf3d41c2-93f6-40dc-ad7d-a310216a1922_3840x2160.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>More datasets. More pipelines. More AI demands. The old way of doing things doesn&#8217;t work at this scale.<br>This free eBook walks through how teams actually scale sustainably with roles, responsibilities, automation, and patterns that work.</p><p><strong><a href="https://dagster.io/how-to-scale-data-teams-ebook?utm_campaign=27879954-25-11-DMND_eBook_Scaling_Data_Teams&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=scaling_data_teams_ebook&amp;utm_content=02_22_26_data_engineering_weekly">Get the guide now</a></strong></p><div><hr></div><h1>Spotify: Our Multi-Agent Architecture for Smarter Advertising</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dXi7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9184364-e8f9-4d15-883a-882b12d212c4_1920x1080.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dXi7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9184364-e8f9-4d15-883a-882b12d212c4_1920x1080.heic 424w, https://substackcdn.com/image/fetch/$s_!dXi7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9184364-e8f9-4d15-883a-882b12d212c4_1920x1080.heic 848w, https://substackcdn.com/image/fetch/$s_!dXi7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9184364-e8f9-4d15-883a-882b12d212c4_1920x1080.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!dXi7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9184364-e8f9-4d15-883a-882b12d212c4_1920x1080.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dXi7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9184364-e8f9-4d15-883a-882b12d212c4_1920x1080.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c9184364-e8f9-4d15-883a-882b12d212c4_1920x1080.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14952,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188850941?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9184364-e8f9-4d15-883a-882b12d212c4_1920x1080.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dXi7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9184364-e8f9-4d15-883a-882b12d212c4_1920x1080.heic 424w, https://substackcdn.com/image/fetch/$s_!dXi7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9184364-e8f9-4d15-883a-882b12d212c4_1920x1080.heic 848w, 
https://substackcdn.com/image/fetch/$s_!dXi7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9184364-e8f9-4d15-883a-882b12d212c4_1920x1080.heic 1272w, https://substackcdn.com/image/fetch/$s_!dXi7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9184364-e8f9-4d15-883a-882b12d212c4_1920x1080.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Fragmented decision logic across buying channels prevented Spotify from translating high-level campaign goals into unified 
execution plans. The article explains how Spotify built Ads AI, a multi-agent orchestration layer with intent routing, specialized resolution agents, and data-grounded media planning using real-time tool integration. The architecture reduced campaign setup time from minutes to seconds, simplified user inputs, and grounded recommendations in historical performance data.</p><p><strong><a href="https://engineering.atspotify.com/2026/2/our-multi-agent-architecture-for-smarter-advertising">https://engineering.atspotify.com/2026/2/our-multi-agent-architecture-for-smarter-advertising</a></strong></p><div><hr></div><h1>Uber: Database Federation: Decentralized and ACL-Compliant Hive&#8482; Databases</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0mK-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F970d155b-ca3d-413e-a7ec-74766d2e798e_1528x836.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0mK-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F970d155b-ca3d-413e-a7ec-74766d2e798e_1528x836.heic 424w, https://substackcdn.com/image/fetch/$s_!0mK-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F970d155b-ca3d-413e-a7ec-74766d2e798e_1528x836.heic 848w, https://substackcdn.com/image/fetch/$s_!0mK-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F970d155b-ca3d-413e-a7ec-74766d2e798e_1528x836.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!0mK-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F970d155b-ca3d-413e-a7ec-74766d2e798e_1528x836.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0mK-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F970d155b-ca3d-413e-a7ec-74766d2e798e_1528x836.heic" width="1456" height="797" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/970d155b-ca3d-413e-a7ec-74766d2e798e_1528x836.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:797,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22398,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188850941?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F970d155b-ca3d-413e-a7ec-74766d2e798e_1528x836.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0mK-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F970d155b-ca3d-413e-a7ec-74766d2e798e_1528x836.heic 424w, https://substackcdn.com/image/fetch/$s_!0mK-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F970d155b-ca3d-413e-a7ec-74766d2e798e_1528x836.heic 848w, 
https://substackcdn.com/image/fetch/$s_!0mK-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F970d155b-ca3d-413e-a7ec-74766d2e798e_1528x836.heic 1272w, https://substackcdn.com/image/fetch/$s_!0mK-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F970d155b-ca3d-413e-a7ec-74766d2e798e_1528x836.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Monolithic Hive warehouses create shared-fate outages, resource contention, and weak governance when thousands of datasets share a 
single database. The article explains how Uber implemented database federation by reorganizing datasets into domain-specific units, updating Hive Metastore pointers to avoid data duplication, and deploying both real-time and batch synchronizers to maintain consistency. The decentralized architecture improves ACL compliance, strengthens resource isolation, and reclaims storage while enabling zero-downtime migration at the petabyte scale.</p><p><strong><a href="https://www.uber.com/en-IN/blog/database-federation/">https://www.uber.com/en-IN/blog/database-federation/</a></strong></p><div><hr></div><h1>Anton Borisov: AutoMQ: Shared Storage Architecture Deep Dive</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!r-r5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b5b1810-6be0-427c-9836-97ade14737eb_1400x635.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!r-r5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b5b1810-6be0-427c-9836-97ade14737eb_1400x635.heic 424w, https://substackcdn.com/image/fetch/$s_!r-r5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b5b1810-6be0-427c-9836-97ade14737eb_1400x635.heic 848w, https://substackcdn.com/image/fetch/$s_!r-r5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b5b1810-6be0-427c-9836-97ade14737eb_1400x635.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!r-r5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b5b1810-6be0-427c-9836-97ade14737eb_1400x635.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!r-r5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b5b1810-6be0-427c-9836-97ade14737eb_1400x635.heic" width="1400" height="635" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b5b1810-6be0-427c-9836-97ade14737eb_1400x635.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:635,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12313,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188850941?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b5b1810-6be0-427c-9836-97ade14737eb_1400x635.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!r-r5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b5b1810-6be0-427c-9836-97ade14737eb_1400x635.heic 424w, https://substackcdn.com/image/fetch/$s_!r-r5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b5b1810-6be0-427c-9836-97ade14737eb_1400x635.heic 848w, 
https://substackcdn.com/image/fetch/$s_!r-r5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b5b1810-6be0-427c-9836-97ade14737eb_1400x635.heic 1272w, https://substackcdn.com/image/fetch/$s_!r-r5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b5b1810-6be0-427c-9836-97ade14737eb_1400x635.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Kafka&#8217;s shared-nothing architecture imposes high replication costs, slow failover, and tight coupling between storage and 
compute. The article explains how AutoMQ replaces local disk replication with S3-backed shared storage, using layered abstractions, WAL batching, metadata-driven ownership, and epoch fencing to enable stateless brokers and zero-copy failover. AutoMQ&#8217;s design eliminates the 3x replication tax and simplifies scaling to &#8220;add compute,&#8221; while accepting higher cold-read latency from object storage.</p><p><strong><a href="https://medium.com/fresha-data-engineering/automq-shared-storage-architecture-deep-dive-043c5226847e">https://medium.com/fresha-data-engineering/automq-shared-storage-architecture-deep-dive-043c5226847e</a></strong></p><div><hr></div><h1>Pinterest: Drastically Reducing Out-of-Memory Errors in Apache Spark at Pinterest</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!B_ZY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F608a2f5a-c8d4-42ca-8dd7-2ce2e6665897_1400x515.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!B_ZY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F608a2f5a-c8d4-42ca-8dd7-2ce2e6665897_1400x515.heic 424w, https://substackcdn.com/image/fetch/$s_!B_ZY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F608a2f5a-c8d4-42ca-8dd7-2ce2e6665897_1400x515.heic 848w, https://substackcdn.com/image/fetch/$s_!B_ZY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F608a2f5a-c8d4-42ca-8dd7-2ce2e6665897_1400x515.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!B_ZY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F608a2f5a-c8d4-42ca-8dd7-2ce2e6665897_1400x515.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!B_ZY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F608a2f5a-c8d4-42ca-8dd7-2ce2e6665897_1400x515.heic" width="1400" height="515" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/608a2f5a-c8d4-42ca-8dd7-2ce2e6665897_1400x515.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:515,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24764,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188850941?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F608a2f5a-c8d4-42ca-8dd7-2ce2e6665897_1400x515.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!B_ZY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F608a2f5a-c8d4-42ca-8dd7-2ce2e6665897_1400x515.heic 424w, https://substackcdn.com/image/fetch/$s_!B_ZY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F608a2f5a-c8d4-42ca-8dd7-2ce2e6665897_1400x515.heic 848w, 
https://substackcdn.com/image/fetch/$s_!B_ZY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F608a2f5a-c8d4-42ca-8dd7-2ce2e6665897_1400x515.heic 1272w, https://substackcdn.com/image/fetch/$s_!B_ZY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F608a2f5a-c8d4-42ca-8dd7-2ce2e6665897_1400x515.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>OOM in Spark jobs is an infamous issue across data processing, creating operational overhead and inefficient cluster utilization. 
The article explains how Auto Memory Retries dynamically adjusts executor resources by retrying failed tasks with higher CPU allocation or larger executors through modified Spark resource profiles. The elastic strategy reduced OOM failures by 96%, lowered infrastructure costs by avoiding over-provisioning, and improved overall pipeline reliability.</p><p><strong><a href="https://medium.com/pinterest-engineering/drastically-reducing-out-of-memory-errors-in-apache-spark-at-pinterest-c55d7dac2257">https://medium.com/pinterest-engineering/drastically-reducing-out-of-memory-errors-in-apache-spark-at-pinterest-c55d7dac2257</a></strong></p><div><hr></div><h1>StarTree: Consistent, Scalable Compaction for Real-Time Upserts in Apache Pinot</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!R5Mt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5861d413-e0ca-42a2-9644-4025a0ceec7b_1301x870.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!R5Mt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5861d413-e0ca-42a2-9644-4025a0ceec7b_1301x870.heic 424w, https://substackcdn.com/image/fetch/$s_!R5Mt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5861d413-e0ca-42a2-9644-4025a0ceec7b_1301x870.heic 848w, https://substackcdn.com/image/fetch/$s_!R5Mt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5861d413-e0ca-42a2-9644-4025a0ceec7b_1301x870.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!R5Mt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5861d413-e0ca-42a2-9644-4025a0ceec7b_1301x870.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!R5Mt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5861d413-e0ca-42a2-9644-4025a0ceec7b_1301x870.heic" width="1301" height="870" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5861d413-e0ca-42a2-9644-4025a0ceec7b_1301x870.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:870,&quot;width&quot;:1301,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14034,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188850941?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5861d413-e0ca-42a2-9644-4025a0ceec7b_1301x870.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!R5Mt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5861d413-e0ca-42a2-9644-4025a0ceec7b_1301x870.heic 424w, https://substackcdn.com/image/fetch/$s_!R5Mt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5861d413-e0ca-42a2-9644-4025a0ceec7b_1301x870.heic 848w, 
https://substackcdn.com/image/fetch/$s_!R5Mt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5861d413-e0ca-42a2-9644-4025a0ceec7b_1301x870.heic 1272w, https://substackcdn.com/image/fetch/$s_!R5Mt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5861d413-e0ca-42a2-9644-4025a0ceec7b_1301x870.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Near-Real-Time upsert is my favorite subject to study, and I've worked with many OLAP engines. 
Apache Pinot always stands out for its flexible indexing and fast upsert capabilities. The article explains how StarTree&#8217;s SegmentRefreshTask compacts segments in the background by merging only valid records and ensuring atomic visibility with bitmap-based consistency controls. The approach reduces storage costs, supports sustained high ingestion rates, and maintains predictable query latency at a billion-key scale.</p><p><strong><a href="https://startree.ai/resources/upserts-compaction-in-apache-pinot-startree/">https://startree.ai/resources/upserts-compaction-in-apache-pinot-startree/</a></strong></p><div><hr></div><h1>Zepto: Debezium at Scale: An Open Source CDC Story from Zepto</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yR5T!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb625a238-f1ba-4b74-8300-0bf051db1d26_1050x285.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yR5T!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb625a238-f1ba-4b74-8300-0bf051db1d26_1050x285.heic 424w, https://substackcdn.com/image/fetch/$s_!yR5T!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb625a238-f1ba-4b74-8300-0bf051db1d26_1050x285.heic 848w, https://substackcdn.com/image/fetch/$s_!yR5T!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb625a238-f1ba-4b74-8300-0bf051db1d26_1050x285.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!yR5T!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb625a238-f1ba-4b74-8300-0bf051db1d26_1050x285.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yR5T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb625a238-f1ba-4b74-8300-0bf051db1d26_1050x285.heic" width="1050" height="285" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b625a238-f1ba-4b74-8300-0bf051db1d26_1050x285.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:285,&quot;width&quot;:1050,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:11155,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188850941?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb625a238-f1ba-4b74-8300-0bf051db1d26_1050x285.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yR5T!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb625a238-f1ba-4b74-8300-0bf051db1d26_1050x285.heic 424w, https://substackcdn.com/image/fetch/$s_!yR5T!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb625a238-f1ba-4b74-8300-0bf051db1d26_1050x285.heic 848w, 
https://substackcdn.com/image/fetch/$s_!yR5T!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb625a238-f1ba-4b74-8300-0bf051db1d26_1050x285.heic 1272w, https://substackcdn.com/image/fetch/$s_!yR5T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb625a238-f1ba-4b74-8300-0bf051db1d26_1050x285.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>High-velocity CDC pipelines can overwhelm downstream databases due to redundant updates and MVCC-induced write amplification. 
The article explains how Zepto optimized Debezium by introducing an in-memory reduction buffer to collapse duplicate updates and a Postgres UNNEST-based batching strategy to reduce parsing overhead. These improvements stabilized CPU and I/O usage, eliminated replication lag during peak traffic, and ensured the database processes only final record states.</p><p><strong><a href="https://blog.zeptonow.com/debezium-at-scale-an-open-source-cdc-story-from-zepto-aa4b12e32bf7">https://blog.zeptonow.com/debezium-at-scale-an-open-source-cdc-story-from-zepto-aa4b12e32bf7</a></strong></p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[Data Engineering Weekly #257]]></title><description><![CDATA[The Weekly Data Engineering Newsletter]]></description><link>https://www.dataengineeringweekly.com/p/data-engineering-weekly-257</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/data-engineering-weekly-257</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Mon, 16 Feb 2026 01:45:25 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AdQk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6f51c1bf-abc9-4cd3-ad69-22cc8e3f1ef2_1080x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://dagster.io/events/dagster-running-dagster-how-we-use-compass-for-ai-analytics?utm_campaign=34764303-26-02-WBNR_Deep%20Dive_Dagster_Running_Dagster_Compass&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_compass&amp;utm_content=02-15-26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!r0G-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7d6f09-30cc-4fac-bf6d-660c295596b2_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!r0G-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7d6f09-30cc-4fac-bf6d-660c295596b2_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!r0G-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7d6f09-30cc-4fac-bf6d-660c295596b2_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!r0G-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7d6f09-30cc-4fac-bf6d-660c295596b2_3840x2160.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!r0G-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7d6f09-30cc-4fac-bf6d-660c295596b2_3840x2160.heic" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5b7d6f09-30cc-4fac-bf6d-660c295596b2_3840x2160.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:24171,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/events/dagster-running-dagster-how-we-use-compass-for-ai-analytics?utm_campaign=34764303-26-02-WBNR_Deep%20Dive_Dagster_Running_Dagster_Compass&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_compass&amp;utm_content=02-15-26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188089137?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7d6f09-30cc-4fac-bf6d-660c295596b2_3840x2160.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!r0G-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7d6f09-30cc-4fac-bf6d-660c295596b2_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!r0G-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7d6f09-30cc-4fac-bf6d-660c295596b2_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!r0G-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7d6f09-30cc-4fac-bf6d-660c295596b2_3840x2160.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!r0G-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b7d6f09-30cc-4fac-bf6d-660c295596b2_3840x2160.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h1>Dagster Running Dagster</h1><p>In this upcoming session, find out how Dagster's data team has increased its capacity, along with best practices for data modeling that work well with AI assistants. 
We'll also demo a real case where our Compass Dagster+ integration identified the root cause of a Postgres-to-Snowflake pipeline that was failing 40-50% of the time.</p><p><strong><a href="https://dagster.io/events/dagster-running-dagster-how-we-use-compass-for-ai-analytics?utm_campaign=34764303-26-02-WBNR_Deep%20Dive_Dagster_Running_Dagster_Compass&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_compass&amp;utm_content=02-15-26_data_engineering_weekly">Register now</a></strong></p><div><hr></div><h1>Ben Lorica: Your agents need runbooks, not bigger context windows</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ym8Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb951298-9b82-472b-a013-b293b10b62d4_1456x871.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ym8Y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb951298-9b82-472b-a013-b293b10b62d4_1456x871.heic 424w, https://substackcdn.com/image/fetch/$s_!Ym8Y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb951298-9b82-472b-a013-b293b10b62d4_1456x871.heic 848w, https://substackcdn.com/image/fetch/$s_!Ym8Y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb951298-9b82-472b-a013-b293b10b62d4_1456x871.heic 1272w, https://substackcdn.com/image/fetch/$s_!Ym8Y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb951298-9b82-472b-a013-b293b10b62d4_1456x871.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Ym8Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb951298-9b82-472b-a013-b293b10b62d4_1456x871.heic" width="1456" height="871" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db951298-9b82-472b-a013-b293b10b62d4_1456x871.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:871,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:19513,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188089137?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb951298-9b82-472b-a013-b293b10b62d4_1456x871.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ym8Y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb951298-9b82-472b-a013-b293b10b62d4_1456x871.heic 424w, https://substackcdn.com/image/fetch/$s_!Ym8Y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb951298-9b82-472b-a013-b293b10b62d4_1456x871.heic 848w, https://substackcdn.com/image/fetch/$s_!Ym8Y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb951298-9b82-472b-a013-b293b10b62d4_1456x871.heic 1272w, https://substackcdn.com/image/fetch/$s_!Ym8Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb951298-9b82-472b-a013-b293b10b62d4_1456x871.heic 
1456w" sizes="100vw"></picture></div></a></figure></div><p>AI agents struggle to scale in operational environments because they rely on large, transient context windows that reset after each task and incur repeated planning costs. 
The article discusses a Context File System (CFS) that separates reasoning from persistent procedural memory, enabling agents to mount task-specific runbooks, reuse indexed tools, and replay proven workflows.</p><p><strong><a href="https://gradientflow.substack.com/p/the-missing-layer-in-todays-agent">https://gradientflow.substack.com/p/the-missing-layer-in-todays-agent</a></strong></p><div><hr></div><h1>Netflix: High-Throughput Graph Abstraction at Netflix - Part I</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!R51d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1537e1d0-b08a-4d57-bf20-2a67ef9bec4e_1400x1016.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!R51d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1537e1d0-b08a-4d57-bf20-2a67ef9bec4e_1400x1016.heic 424w, https://substackcdn.com/image/fetch/$s_!R51d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1537e1d0-b08a-4d57-bf20-2a67ef9bec4e_1400x1016.heic 848w, https://substackcdn.com/image/fetch/$s_!R51d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1537e1d0-b08a-4d57-bf20-2a67ef9bec4e_1400x1016.heic 1272w, https://substackcdn.com/image/fetch/$s_!R51d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1537e1d0-b08a-4d57-bf20-2a67ef9bec4e_1400x1016.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!R51d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1537e1d0-b08a-4d57-bf20-2a67ef9bec4e_1400x1016.heic" width="1400" height="1016" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1537e1d0-b08a-4d57-bf20-2a67ef9bec4e_1400x1016.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1016,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:16009,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188089137?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1537e1d0-b08a-4d57-bf20-2a67ef9bec4e_1400x1016.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!R51d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1537e1d0-b08a-4d57-bf20-2a67ef9bec4e_1400x1016.heic 424w, https://substackcdn.com/image/fetch/$s_!R51d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1537e1d0-b08a-4d57-bf20-2a67ef9bec4e_1400x1016.heic 848w, https://substackcdn.com/image/fetch/$s_!R51d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1537e1d0-b08a-4d57-bf20-2a67ef9bec4e_1400x1016.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!R51d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1537e1d0-b08a-4d57-bf20-2a67ef9bec4e_1400x1016.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>High-throughput OLTP graph workloads demand low-latency traversal, strong typing, and global consistency at scale. The article explains how Netflix built a Graph Abstraction service on existing KV, TimeSeries, and caching infrastructure, using a property graph model with partitioned namespaces, optimized edge indexing, and write- and read-aside caching. 
The architecture processes millions of operations per second with single-digit millisecond latency while maintaining eventual consistency across regions.</p><p><strong><a href="https://netflixtechblog.medium.com/high-throughput-graph-abstraction-at-netflix-part-i-e88063e6f6d5">https://netflixtechblog.medium.com/high-throughput-graph-abstraction-at-netflix-part-i-e88063e6f6d5</a></strong></p><div><hr></div><h1>Reliable Data Engineering: <strong>Data Contracts in Practice - What 50 Production Implementations Actually Look Like</strong></h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!beaW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5affcb2b-ac74-4242-9f5d-654775e34c08_1400x933.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!beaW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5affcb2b-ac74-4242-9f5d-654775e34c08_1400x933.heic 424w, https://substackcdn.com/image/fetch/$s_!beaW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5affcb2b-ac74-4242-9f5d-654775e34c08_1400x933.heic 848w, https://substackcdn.com/image/fetch/$s_!beaW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5affcb2b-ac74-4242-9f5d-654775e34c08_1400x933.heic 1272w, https://substackcdn.com/image/fetch/$s_!beaW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5affcb2b-ac74-4242-9f5d-654775e34c08_1400x933.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!beaW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5affcb2b-ac74-4242-9f5d-654775e34c08_1400x933.heic" width="1400" height="933" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5affcb2b-ac74-4242-9f5d-654775e34c08_1400x933.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:933,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12712,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188089137?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5affcb2b-ac74-4242-9f5d-654775e34c08_1400x933.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!beaW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5affcb2b-ac74-4242-9f5d-654775e34c08_1400x933.heic 424w, https://substackcdn.com/image/fetch/$s_!beaW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5affcb2b-ac74-4242-9f5d-654775e34c08_1400x933.heic 848w, https://substackcdn.com/image/fetch/$s_!beaW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5affcb2b-ac74-4242-9f5d-654775e34c08_1400x933.heic 1272w, https://substackcdn.com/image/fetch/$s_!beaW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5affcb2b-ac74-4242-9f5d-654775e34c08_1400x933.heic 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>I recently wrote about <strong><a href="https://www.dataengineeringweekly.com/p/data-contracts-a-missed-opportunity">Data Contract - a missed opportunity</a></strong>, and both the recent <strong><a href="https://x.com/tayloramurphy/status/2022530907526107465">Twitter conversation</a></strong> and the recent <strong><a href="https://github.com/open-semantic-interchange/OSI">OSI spec from Snowflake</a></strong> show traces of the data contract idea. The author of this article did a solid job summarizing various data contract patterns and their implementations. 
</p><p><strong><a href="https://medium.com/@reliabledataengineering/data-contracts-in-practice-what-50-production-implementations-actually-look-like-f1c953336bf2">https://medium.com/@reliabledataengineering/data-contracts-in-practice-what-50-production-implementations-actually-look-like-f1c953336bf2</a></strong></p><div><hr></div><h1>Sponsored: The AI Modernization Guide</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=02_15_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z5zN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54d12322-1343-4527-ba92-abc82cb3c91c_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!z5zN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54d12322-1343-4527-ba92-abc82cb3c91c_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!z5zN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54d12322-1343-4527-ba92-abc82cb3c91c_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!z5zN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54d12322-1343-4527-ba92-abc82cb3c91c_2400x1260.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z5zN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54d12322-1343-4527-ba92-abc82cb3c91c_2400x1260.heic" 
width="1456" height="764" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/54d12322-1343-4527-ba92-abc82cb3c91c_2400x1260.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15511,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=02_15_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188089137?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54d12322-1343-4527-ba92-abc82cb3c91c_2400x1260.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!z5zN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54d12322-1343-4527-ba92-abc82cb3c91c_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!z5zN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54d12322-1343-4527-ba92-abc82cb3c91c_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!z5zN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54d12322-1343-4527-ba92-abc82cb3c91c_2400x1260.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!z5zN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F54d12322-1343-4527-ba92-abc82cb3c91c_2400x1260.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Learn how to build a data platform that enables AI-driven development, reduces pipeline failures, and cuts complexity.<br><br>- Transform from Big Complexity to AI-ready architecture<br>- Real metrics from organizations achieving 50% cost reductions<br>- Introduction to Components: YAML-first pipelines that AI can build</p><p><strong><a 
href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=02_15_data_engineering_weekly">Get the guide now</a></strong></p><div><hr></div><h1>Netflix: Scaling LLM Post-Training at Netflix</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d20Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13c3e891-d2db-4a1c-ba89-7ea1f632db14_1162x415.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d20Y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13c3e891-d2db-4a1c-ba89-7ea1f632db14_1162x415.heic 424w, https://substackcdn.com/image/fetch/$s_!d20Y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13c3e891-d2db-4a1c-ba89-7ea1f632db14_1162x415.heic 848w, https://substackcdn.com/image/fetch/$s_!d20Y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13c3e891-d2db-4a1c-ba89-7ea1f632db14_1162x415.heic 1272w, https://substackcdn.com/image/fetch/$s_!d20Y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13c3e891-d2db-4a1c-ba89-7ea1f632db14_1162x415.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!d20Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13c3e891-d2db-4a1c-ba89-7ea1f632db14_1162x415.heic" width="1162" height="415" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/13c3e891-d2db-4a1c-ba89-7ea1f632db14_1162x415.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:415,&quot;width&quot;:1162,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:13201,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188089137?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13c3e891-d2db-4a1c-ba89-7ea1f632db14_1162x415.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!d20Y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13c3e891-d2db-4a1c-ba89-7ea1f632db14_1162x415.heic 424w, https://substackcdn.com/image/fetch/$s_!d20Y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13c3e891-d2db-4a1c-ba89-7ea1f632db14_1162x415.heic 848w, https://substackcdn.com/image/fetch/$s_!d20Y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13c3e891-d2db-4a1c-ba89-7ea1f632db14_1162x415.heic 1272w, https://substackcdn.com/image/fetch/$s_!d20Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13c3e891-d2db-4a1c-ba89-7ea1f632db14_1162x415.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Production LLM deployment requires a scalable post-training infrastructure that handles complex data pipelines, distributed GPUs, and evolving fine-tuning strategies. The article explains how Netflix built a unified post-training framework on its ML platform to support efficient SFT and RL workflows, modular data and model abstractions, and tight integration with open-source ecosystems. 
Custom optimizations, such as sequence packing and hybrid RL orchestration, increase token throughput and enable researchers to focus on modeling rather than infrastructure.</p><p><strong><a href="https://netflixtechblog.com/scaling-llm-post-training-at-netflix-0046f8790194">https://netflixtechblog.com/scaling-llm-post-training-at-netflix-0046f8790194</a></strong></p><div><hr></div><h1>Abhishek Goswami: From Prompts to Production: A Playbook for Agentic Development</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rdac!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdead1cb-0e23-4630-bdb4-df7128aa3098_2209x1017.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rdac!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdead1cb-0e23-4630-bdb4-df7128aa3098_2209x1017.heic 424w, https://substackcdn.com/image/fetch/$s_!rdac!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdead1cb-0e23-4630-bdb4-df7128aa3098_2209x1017.heic 848w, https://substackcdn.com/image/fetch/$s_!rdac!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdead1cb-0e23-4630-bdb4-df7128aa3098_2209x1017.heic 1272w, https://substackcdn.com/image/fetch/$s_!rdac!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdead1cb-0e23-4630-bdb4-df7128aa3098_2209x1017.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!rdac!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdead1cb-0e23-4630-bdb4-df7128aa3098_2209x1017.heic" width="1456" height="670" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fdead1cb-0e23-4630-bdb4-df7128aa3098_2209x1017.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:670,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15094,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188089137?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdead1cb-0e23-4630-bdb4-df7128aa3098_2209x1017.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rdac!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdead1cb-0e23-4630-bdb4-df7128aa3098_2209x1017.heic 424w, https://substackcdn.com/image/fetch/$s_!rdac!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdead1cb-0e23-4630-bdb4-df7128aa3098_2209x1017.heic 848w, https://substackcdn.com/image/fetch/$s_!rdac!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdead1cb-0e23-4630-bdb4-df7128aa3098_2209x1017.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!rdac!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffdead1cb-0e23-4630-bdb4-df7128aa3098_2209x1017.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Enterprise AI agents fail in production when teams rely on prompt experimentation without structured lifecycle and behavioral controls. The article introduces an Agentic SDLC that separates deterministic and agentic components, applies reusable orchestration patterns, and enforces versioned prompts, tool manifests, and MCP-based integrations. 
Behavioral testing with golden trajectories, layered validation, and human-in-the-loop oversight enables reliable, scalable deployment of agents beyond prototypes.</p><p><strong><a href="https://www.infoq.com/articles/prompts-to-production-playbook-for-agentic-development/">https://www.infoq.com/articles/prompts-to-production-playbook-for-agentic-development/</a></strong></p><div><hr></div><h1>Zepto: How We Built High-Precision, Low-Latency Semantic Search in Production</h1><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Fh2j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc8d2897-cc06-4efb-80fc-f64941efa245_1050x239.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Fh2j!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc8d2897-cc06-4efb-80fc-f64941efa245_1050x239.heic 424w, https://substackcdn.com/image/fetch/$s_!Fh2j!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc8d2897-cc06-4efb-80fc-f64941efa245_1050x239.heic 848w, https://substackcdn.com/image/fetch/$s_!Fh2j!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc8d2897-cc06-4efb-80fc-f64941efa245_1050x239.heic 1272w, https://substackcdn.com/image/fetch/$s_!Fh2j!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc8d2897-cc06-4efb-80fc-f64941efa245_1050x239.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Fh2j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc8d2897-cc06-4efb-80fc-f64941efa245_1050x239.heic" width="1050" height="239" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc8d2897-cc06-4efb-80fc-f64941efa245_1050x239.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:239,&quot;width&quot;:1050,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7656,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188089137?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc8d2897-cc06-4efb-80fc-f64941efa245_1050x239.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Fh2j!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc8d2897-cc06-4efb-80fc-f64941efa245_1050x239.heic 424w, https://substackcdn.com/image/fetch/$s_!Fh2j!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc8d2897-cc06-4efb-80fc-f64941efa245_1050x239.heic 848w, https://substackcdn.com/image/fetch/$s_!Fh2j!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc8d2897-cc06-4efb-80fc-f64941efa245_1050x239.heic 1272w, https://substackcdn.com/image/fetch/$s_!Fh2j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc8d2897-cc06-4efb-80fc-f64941efa245_1050x239.heic 
1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Keyword-based search fails on short, misspelled, and tail queries that lack lexical overlap with product catalogs. The article explains how Zepto built a dual-encoder semantic retrieval system trained with weak supervision, synthetic data, and InfoNCE loss to learn intent-aware embeddings under strict latency constraints. The approach delivered a 35% uplift on impacted queries, improved downstream ranking quality, and enabled semantic retrieval for both search and ads use cases.</p><p><strong><a href="https://blog.zeptonow.com/how-we-built-high-precision-low-latency-semantic-search-in-production-75a6c61dee25">https://blog.zeptonow.com/how-we-built-high-precision-low-latency-semantic-search-in-production-75a6c61dee25</a></strong></p><div><hr></div><h1>Atlas9: The challenges of soft delete</h1><p>Deleting data is one of the hardest problems in system design: writing data is easy, but removing it cleanly is hard. Soft deletion is the standard approach, yet it brings several complications of its own. The article covers the nuances of soft deletes, architectural patterns, and best practices for designing deletion at scale.</p><p><strong><a href="https://atlas9.dev/blog/soft-delete.html">https://atlas9.dev/blog/soft-delete.html</a></strong></p><div><hr></div><h1>Apache Parquet: Native Geospatial Types in Apache Parquet</h1><p>It is exciting to see Apache Parquet evolve with the addition of more types and indexing support. A data format that lets you choose data types and the indexing patterns associated with them makes data management more efficient. I wish Apache Parquet and lakehouse formats such as Hudi, Iceberg, and Delta Lake offered the flexibility of <strong><a href="https://docs.pinot.apache.org/basics/indexing">Apache Pinot&#8217;s types and indexing patterns. 
</a></strong></p><p><strong><a href="https://parquet.apache.org/blog/2026/02/13/native-geospatial-types-in-apache-parquet/">https://parquet.apache.org/blog/2026/02/13/native-geospatial-types-in-apache-parquet/</a></strong></p><div><hr></div><h1>Dalto Curvelano: Introduction to PostgreSQL Indexes</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HJNU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff489ecc0-cbd5-4bfb-9213-05538bb8e4a6_2408x1234.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HJNU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff489ecc0-cbd5-4bfb-9213-05538bb8e4a6_2408x1234.heic 424w, https://substackcdn.com/image/fetch/$s_!HJNU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff489ecc0-cbd5-4bfb-9213-05538bb8e4a6_2408x1234.heic 848w, https://substackcdn.com/image/fetch/$s_!HJNU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff489ecc0-cbd5-4bfb-9213-05538bb8e4a6_2408x1234.heic 1272w, https://substackcdn.com/image/fetch/$s_!HJNU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff489ecc0-cbd5-4bfb-9213-05538bb8e4a6_2408x1234.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HJNU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff489ecc0-cbd5-4bfb-9213-05538bb8e4a6_2408x1234.heic" width="1456" height="746" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f489ecc0-cbd5-4bfb-9213-05538bb8e4a6_2408x1234.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:746,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:21043,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/188089137?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff489ecc0-cbd5-4bfb-9213-05538bb8e4a6_2408x1234.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HJNU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff489ecc0-cbd5-4bfb-9213-05538bb8e4a6_2408x1234.heic 424w, https://substackcdn.com/image/fetch/$s_!HJNU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff489ecc0-cbd5-4bfb-9213-05538bb8e4a6_2408x1234.heic 848w, https://substackcdn.com/image/fetch/$s_!HJNU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff489ecc0-cbd5-4bfb-9213-05538bb8e4a6_2408x1234.heic 1272w, https://substackcdn.com/image/fetch/$s_!HJNU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff489ecc0-cbd5-4bfb-9213-05538bb8e4a6_2408x1234.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Speaking of types and indexing, it always comes back to the basics and a solid foundation. The article is an excellent overview of the index types PostgreSQL supports and how to use them.</p><p><strong><a href="https://dlt.github.io/blog/posts/introduction-to-postgresql-indexes/">https://dlt.github.io/blog/posts/introduction-to-postgresql-indexes/</a></strong></p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. 
All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[Data Engineering Weekly #256]]></title><description><![CDATA[The Weekly Data Engineering Newsletter]]></description><link>https://www.dataengineeringweekly.com/p/data-engineering-weekly-256</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/data-engineering-weekly-256</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Mon, 09 Feb 2026 04:54:05 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AdQk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6f51c1bf-abc9-4cd3-ad69-22cc8e3f1ef2_1080x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/what-assets-do-best?utm_campaign=37123386-26-02-DMND_Dagster_Childrens_Book&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=childrens_book&amp;utm_content=02-08-26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Lc0r!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e92cf7-314c-4a76-9c5e-f60bdf37821b_1200x630.heic 424w, https://substackcdn.com/image/fetch/$s_!Lc0r!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e92cf7-314c-4a76-9c5e-f60bdf37821b_1200x630.heic 848w, 
https://substackcdn.com/image/fetch/$s_!Lc0r!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e92cf7-314c-4a76-9c5e-f60bdf37821b_1200x630.heic 1272w, https://substackcdn.com/image/fetch/$s_!Lc0r!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e92cf7-314c-4a76-9c5e-f60bdf37821b_1200x630.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Lc0r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e92cf7-314c-4a76-9c5e-f60bdf37821b_1200x630.heic" width="1200" height="630" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c0e92cf7-314c-4a76-9c5e-f60bdf37821b_1200x630.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17350,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/what-assets-do-best?utm_campaign=37123386-26-02-DMND_Dagster_Childrens_Book&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=childrens_book&amp;utm_content=02-08-26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/187353312?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e92cf7-314c-4a76-9c5e-f60bdf37821b_1200x630.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Lc0r!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e92cf7-314c-4a76-9c5e-f60bdf37821b_1200x630.heic 424w, https://substackcdn.com/image/fetch/$s_!Lc0r!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e92cf7-314c-4a76-9c5e-f60bdf37821b_1200x630.heic 848w, https://substackcdn.com/image/fetch/$s_!Lc0r!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e92cf7-314c-4a76-9c5e-f60bdf37821b_1200x630.heic 1272w, https://substackcdn.com/image/fetch/$s_!Lc0r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e92cf7-314c-4a76-9c5e-f60bdf37821b_1200x630.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h1>What assets do best: The Dagster Children's Book</h1><p>We&#8217;re excited to share something a little unexpected from the Dagster team: What assets do best, a children&#8217;s book about data assets! Perfect for kids and data-loving grown-ups alike, you&#8217;ll learn how assets work together, adapt to change, and give teams a complete view of their data.<br><br>Watch the narrated story, find out where you can snag a free book IRL, and print &amp; play puzzles!</p><p><strong><a href="https://dagster.io/what-assets-do-best?utm_campaign=37123386-26-02-DMND_Dagster_Childrens_Book&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=childrens_book&amp;utm_content=02-08-26_data_engineering_weekly">Check out the book &amp; other activities</a></strong></p><div><hr></div><h1>Alexander Shereshevsky: Graph RAG in 2026 - A Practitioner&#8217;s Guide to What Actually Works</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XWy8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5c0303d-8bc9-4ffb-8f0d-4c0c6ab623b1_1400x764.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XWy8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5c0303d-8bc9-4ffb-8f0d-4c0c6ab623b1_1400x764.heic 424w, 
https://substackcdn.com/image/fetch/$s_!XWy8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5c0303d-8bc9-4ffb-8f0d-4c0c6ab623b1_1400x764.heic 848w, https://substackcdn.com/image/fetch/$s_!XWy8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5c0303d-8bc9-4ffb-8f0d-4c0c6ab623b1_1400x764.heic 1272w, https://substackcdn.com/image/fetch/$s_!XWy8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5c0303d-8bc9-4ffb-8f0d-4c0c6ab623b1_1400x764.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XWy8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5c0303d-8bc9-4ffb-8f0d-4c0c6ab623b1_1400x764.heic" width="1400" height="764" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b5c0303d-8bc9-4ffb-8f0d-4c0c6ab623b1_1400x764.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:32473,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/187353312?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5c0303d-8bc9-4ffb-8f0d-4c0c6ab623b1_1400x764.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!XWy8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5c0303d-8bc9-4ffb-8f0d-4c0c6ab623b1_1400x764.heic 424w, https://substackcdn.com/image/fetch/$s_!XWy8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5c0303d-8bc9-4ffb-8f0d-4c0c6ab623b1_1400x764.heic 848w, https://substackcdn.com/image/fetch/$s_!XWy8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5c0303d-8bc9-4ffb-8f0d-4c0c6ab623b1_1400x764.heic 1272w, https://substackcdn.com/image/fetch/$s_!XWy8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5c0303d-8bc9-4ffb-8f0d-4c0c6ab623b1_1400x764.heic 1456w" sizes="100vw"></picture></div></a></figure></div><p>Graph RAG adoption stalled after early hype because high indexing costs and unclear performance trade-offs limited production use. The article explains when Graph RAG outperforms vector search, how teams reduce costs with selective graph construction, and why hybrid vector&#8211;graph architectures deliver the best results.</p><p><strong><a href="https://medium.com/@shereshevsky/graph-rag-in-2026-a-practitioners-guide-to-what-actually-works-dca4962e7517">https://medium.com/@shereshevsky/graph-rag-in-2026-a-practitioners-guide-to-what-actually-works-dca4962e7517</a></strong></p><div><hr></div><h1>Mark Rittman: So, Just How Relevant is Multi-Touch Attribution for Marketers in 2026?</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-U-T!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59159696-3c90-4f51-8429-4c3aaccfcf19_2950x1404.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-U-T!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59159696-3c90-4f51-8429-4c3aaccfcf19_2950x1404.heic 424w, https://substackcdn.com/image/fetch/$s_!-U-T!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59159696-3c90-4f51-8429-4c3aaccfcf19_2950x1404.heic 848w, 
https://substackcdn.com/image/fetch/$s_!-U-T!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59159696-3c90-4f51-8429-4c3aaccfcf19_2950x1404.heic 1272w, https://substackcdn.com/image/fetch/$s_!-U-T!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59159696-3c90-4f51-8429-4c3aaccfcf19_2950x1404.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-U-T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59159696-3c90-4f51-8429-4c3aaccfcf19_2950x1404.heic" width="1456" height="693" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/59159696-3c90-4f51-8429-4c3aaccfcf19_2950x1404.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:693,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:27323,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/187353312?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59159696-3c90-4f51-8429-4c3aaccfcf19_2950x1404.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-U-T!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59159696-3c90-4f51-8429-4c3aaccfcf19_2950x1404.heic 424w, 
https://substackcdn.com/image/fetch/$s_!-U-T!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59159696-3c90-4f51-8429-4c3aaccfcf19_2950x1404.heic 848w, https://substackcdn.com/image/fetch/$s_!-U-T!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59159696-3c90-4f51-8429-4c3aaccfcf19_2950x1404.heic 1272w, https://substackcdn.com/image/fetch/$s_!-U-T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59159696-3c90-4f51-8429-4c3aaccfcf19_2950x1404.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Marketing attribution struggles in 2026 as privacy controls, regulations, and cookie loss remove large portions of user-level data. The article explains how teams adapt by combining deterministic identity for logged-in users, server-side tracking, and triangulation across MTA, MMM, and incrementality testing. Prioritizing authentication, tracking micro-conversions, and owning raw event data enables more reliable attribution in a privacy-first environment.</p><p><strong><a href="https://blog.rittmananalytics.com/how-relevant-is-multi-touch-attribution-for-marke-275a71a36d5e">https://blog.rittmananalytics.com/how-relevant-is-multi-touch-attribution-for-marke-275a71a36d5e</a></strong></p><div><hr></div><h1>Pinterest: Next Generation DB Ingestion at Pinterest</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0PtB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29536bca-c84e-4130-89a6-8c7abd79fe59_1400x621.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0PtB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29536bca-c84e-4130-89a6-8c7abd79fe59_1400x621.heic 424w, https://substackcdn.com/image/fetch/$s_!0PtB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29536bca-c84e-4130-89a6-8c7abd79fe59_1400x621.heic 848w, https://substackcdn.com/image/fetch/$s_!0PtB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29536bca-c84e-4130-89a6-8c7abd79fe59_1400x621.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!0PtB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29536bca-c84e-4130-89a6-8c7abd79fe59_1400x621.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0PtB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29536bca-c84e-4130-89a6-8c7abd79fe59_1400x621.heic" width="1400" height="621" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/29536bca-c84e-4130-89a6-8c7abd79fe59_1400x621.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:621,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7333,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/187353312?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29536bca-c84e-4130-89a6-8c7abd79fe59_1400x621.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0PtB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29536bca-c84e-4130-89a6-8c7abd79fe59_1400x621.heic 424w, https://substackcdn.com/image/fetch/$s_!0PtB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29536bca-c84e-4130-89a6-8c7abd79fe59_1400x621.heic 848w, 
https://substackcdn.com/image/fetch/$s_!0PtB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29536bca-c84e-4130-89a6-8c7abd79fe59_1400x621.heic 1272w, https://substackcdn.com/image/fetch/$s_!0PtB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F29536bca-c84e-4130-89a6-8c7abd79fe59_1400x621.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Legacy batch-based ingestion pipelines created high latency, operational complexity, and compliance gaps across Pinterest&#8217;s 
data ecosystem. The article explains how Pinterest built a unified CDC-based ingestion framework using Kafka, Flink, Spark, and Iceberg to stream database changes and efficiently upsert them into analytical tables, reducing data latency from days to minutes while lowering compute costs and improving reliability at petabyte scale.</p><p><strong><a href="https://medium.com/pinterest-engineering/next-generation-db-ingestion-at-pinterest-66844b7153b7">https://medium.com/pinterest-engineering/next-generation-db-ingestion-at-pinterest-66844b7153b7</a></strong></p><div><hr></div><h1>Sponsored: Dagster Running Dagster</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/events/dagster-running-dagster-how-we-use-compass-for-ai-analytics?utm_campaign=34764303-26-02-WBNR_Deep%20Dive_Dagster_Running_Dagster_Compass&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_compass&amp;utm_content=02-08-26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!avAE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8361337-d96b-4210-8bf7-0b913055da7a_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!avAE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8361337-d96b-4210-8bf7-0b913055da7a_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!avAE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8361337-d96b-4210-8bf7-0b913055da7a_3840x2160.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!avAE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8361337-d96b-4210-8bf7-0b913055da7a_3840x2160.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!avAE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8361337-d96b-4210-8bf7-0b913055da7a_3840x2160.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f8361337-d96b-4210-8bf7-0b913055da7a_3840x2160.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:29444,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/events/dagster-running-dagster-how-we-use-compass-for-ai-analytics?utm_campaign=34764303-26-02-WBNR_Deep%20Dive_Dagster_Running_Dagster_Compass&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_compass&amp;utm_content=02-08-26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/187353312?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8361337-d96b-4210-8bf7-0b913055da7a_3840x2160.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!avAE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8361337-d96b-4210-8bf7-0b913055da7a_3840x2160.heic 424w, 
https://substackcdn.com/image/fetch/$s_!avAE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8361337-d96b-4210-8bf7-0b913055da7a_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!avAE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8361337-d96b-4210-8bf7-0b913055da7a_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!avAE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff8361337-d96b-4210-8bf7-0b913055da7a_3840x2160.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>In this upcoming session, find out how Dagster's data team has increased its capacity, along with best practices for data modeling that work well with AI assistants. We'll also demo a real case where our Compass Dagster+ integration identified the root cause of a Postgres-to-Snowflake pipeline that was failing 40-50% of the time.</p><p><strong><a href="https://dagster.io/events/dagster-running-dagster-how-we-use-compass-for-ai-analytics?utm_campaign=34764303-26-02-WBNR_Deep%20Dive_Dagster_Running_Dagster_Compass&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_compass&amp;utm_content=02-08-26_data_engineering_weekly">Register now</a></strong></p><div><hr></div><h1>Netflix: The Data Canary: How Netflix Validates Catalog Metadata</h1><p>Although this article is not specifically about the data warehouse, it demonstrates how data corruption can occur without a code change and still disrupt the system. The article describes how Netflix built a data canary system that validates new catalog metadata using side-by-side clusters, chaos-based testing, and customer-centric behavioral metrics. 
By detecting corruption within minutes and blocking the release of unsafe data, Netflix applies code-level deployment rigor to high-velocity data pipelines.</p><p><strong><a href="https://netflixtechblog.medium.com/the-data-canary-how-netflix-validates-catalog-metadata-18b699d58e36">https://netflixtechblog.medium.com/the-data-canary-how-netflix-validates-catalog-metadata-18b699d58e36</a></strong></p><div><hr></div><h1>Uber: Introducing uForwarder - The Consumer Proxy for Kafka Async Queuing</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ul1d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6498c7c6-0fba-49cc-ab8a-ed96714d077d_1536x862.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ul1d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6498c7c6-0fba-49cc-ab8a-ed96714d077d_1536x862.heic 424w, https://substackcdn.com/image/fetch/$s_!Ul1d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6498c7c6-0fba-49cc-ab8a-ed96714d077d_1536x862.heic 848w, https://substackcdn.com/image/fetch/$s_!Ul1d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6498c7c6-0fba-49cc-ab8a-ed96714d077d_1536x862.heic 1272w, https://substackcdn.com/image/fetch/$s_!Ul1d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6498c7c6-0fba-49cc-ab8a-ed96714d077d_1536x862.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Ul1d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6498c7c6-0fba-49cc-ab8a-ed96714d077d_1536x862.heic" width="1456" height="817" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6498c7c6-0fba-49cc-ab8a-ed96714d077d_1536x862.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:817,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14301,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/187353312?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6498c7c6-0fba-49cc-ab8a-ed96714d077d_1536x862.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ul1d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6498c7c6-0fba-49cc-ab8a-ed96714d077d_1536x862.heic 424w, https://substackcdn.com/image/fetch/$s_!Ul1d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6498c7c6-0fba-49cc-ab8a-ed96714d077d_1536x862.heic 848w, https://substackcdn.com/image/fetch/$s_!Ul1d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6498c7c6-0fba-49cc-ab8a-ed96714d077d_1536x862.heic 1272w, https://substackcdn.com/image/fetch/$s_!Ul1d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6498c7c6-0fba-49cc-ab8a-ed96714d077d_1536x862.heic 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Kafka consumer services struggle to scale reliably when direct protocol access introduces head-of-line blocking, inefficiency, and operational complexity. The article explains how Uber built uForwarder, a gRPC-based Kafka consumer proxy that resolves head-of-line blocking, improves hardware utilization, isolates traffic, and supports delayed processing. 
By abstracting Kafka internals behind a push-based interface, uForwarder increases reliability and efficiency across thousands of consumer workloads.</p><p><strong><a href="https://www.uber.com/en-IN/blog/introducing-ufowarder/">https://www.uber.com/en-IN/blog/introducing-ufowarder/</a></strong></p><div><hr></div><h1>Pierce Lamb: Agentic Search over Graphs of Long Documents (or LAD-RAG++)</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PUZG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8807a27-20c2-44fc-ac4b-6f7bc5046a16_1400x764.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PUZG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8807a27-20c2-44fc-ac4b-6f7bc5046a16_1400x764.heic 424w, https://substackcdn.com/image/fetch/$s_!PUZG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8807a27-20c2-44fc-ac4b-6f7bc5046a16_1400x764.heic 848w, https://substackcdn.com/image/fetch/$s_!PUZG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8807a27-20c2-44fc-ac4b-6f7bc5046a16_1400x764.heic 1272w, https://substackcdn.com/image/fetch/$s_!PUZG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8807a27-20c2-44fc-ac4b-6f7bc5046a16_1400x764.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PUZG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8807a27-20c2-44fc-ac4b-6f7bc5046a16_1400x764.heic" width="1400" 
height="764" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b8807a27-20c2-44fc-ac4b-6f7bc5046a16_1400x764.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:39712,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/187353312?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8807a27-20c2-44fc-ac4b-6f7bc5046a16_1400x764.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PUZG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8807a27-20c2-44fc-ac4b-6f7bc5046a16_1400x764.heic 424w, https://substackcdn.com/image/fetch/$s_!PUZG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8807a27-20c2-44fc-ac4b-6f7bc5046a16_1400x764.heic 848w, https://substackcdn.com/image/fetch/$s_!PUZG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8807a27-20c2-44fc-ac4b-6f7bc5046a16_1400x764.heic 1272w, https://substackcdn.com/image/fetch/$s_!PUZG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8807a27-20c2-44fc-ac4b-6f7bc5046a16_1400x764.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>For me, possibly one of the best reads this week. Vanilla RAG struggles with long, structured documents because static chunking loses layout, relationships, and cross-page context. The author reviews LAD-RAG++, which constructs layout-aware document graphs and employs agentic retrieval to explore structural and semantic connections dynamically. Engineering improvements in memory control, graph pruning, and cost-efficient processing make graph-based RAG practical for high-recall question answering over dense professional documents. 
</p><p><strong><a href="https://pierce-lamb.medium.com/agentic-search-over-graphs-of-long-documents-or-lad-rag-1264030158e8">https://pierce-lamb.medium.com/agentic-search-over-graphs-of-long-documents-or-lad-rag-1264030158e8</a></strong></p><div><hr></div><h1>Halodoc: Halodoc&#8217;s Layered Data Validation Strategy for Building Trust in the Lakehouse</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kwYE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde70d11-a62d-4983-838e-d318b0da4db3_1677x993.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kwYE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde70d11-a62d-4983-838e-d318b0da4db3_1677x993.heic 424w, https://substackcdn.com/image/fetch/$s_!kwYE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde70d11-a62d-4983-838e-d318b0da4db3_1677x993.heic 848w, https://substackcdn.com/image/fetch/$s_!kwYE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde70d11-a62d-4983-838e-d318b0da4db3_1677x993.heic 1272w, https://substackcdn.com/image/fetch/$s_!kwYE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde70d11-a62d-4983-838e-d318b0da4db3_1677x993.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kwYE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde70d11-a62d-4983-838e-d318b0da4db3_1677x993.heic" width="1456" height="862" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fde70d11-a62d-4983-838e-d318b0da4db3_1677x993.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:862,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:18964,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/187353312?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde70d11-a62d-4983-838e-d318b0da4db3_1677x993.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kwYE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde70d11-a62d-4983-838e-d318b0da4db3_1677x993.heic 424w, https://substackcdn.com/image/fetch/$s_!kwYE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde70d11-a62d-4983-838e-d318b0da4db3_1677x993.heic 848w, https://substackcdn.com/image/fetch/$s_!kwYE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde70d11-a62d-4983-838e-d318b0da4db3_1677x993.heic 1272w, https://substackcdn.com/image/fetch/$s_!kwYE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffde70d11-a62d-4983-838e-d318b0da4db3_1677x993.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Data quality issues in healthcare analytics demand stronger guarantees than generic validation frameworks can provide. The article explains how Halodoc built a custom, configuration-driven validation pipeline with four layers of checks that combine time-bound reconciliation, AI-generated structural tests, and business-rule enforcement across the Lakehouse. 
By integrating LLMs into validation and surfacing failures in real time, the system reduces incidents, increases trust in analytics, and supports reliable clinical decision-making.</p><p><strong><a href="https://blogs.halodoc.io/halodocs-layered-data-validation-strategy/amp/">https://blogs.halodoc.io/halodocs-layered-data-validation-strategy/amp/</a></strong></p><div><hr></div><h1>Booking.com: Beyond Prompt Engineering: How We Used Supervised Fine-Tuning for Travel Recommendations</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!beVr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f334ba-7c0c-4380-8443-7d8d030fe499_1400x830.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!beVr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f334ba-7c0c-4380-8443-7d8d030fe499_1400x830.heic 424w, https://substackcdn.com/image/fetch/$s_!beVr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f334ba-7c0c-4380-8443-7d8d030fe499_1400x830.heic 848w, https://substackcdn.com/image/fetch/$s_!beVr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f334ba-7c0c-4380-8443-7d8d030fe499_1400x830.heic 1272w, https://substackcdn.com/image/fetch/$s_!beVr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f334ba-7c0c-4380-8443-7d8d030fe499_1400x830.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!beVr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f334ba-7c0c-4380-8443-7d8d030fe499_1400x830.heic" width="1400" height="830" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6f334ba-7c0c-4380-8443-7d8d030fe499_1400x830.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:830,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:31481,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/187353312?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f334ba-7c0c-4380-8443-7d8d030fe499_1400x830.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!beVr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f334ba-7c0c-4380-8443-7d8d030fe499_1400x830.heic 424w, https://substackcdn.com/image/fetch/$s_!beVr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f334ba-7c0c-4380-8443-7d8d030fe499_1400x830.heic 848w, https://substackcdn.com/image/fetch/$s_!beVr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f334ba-7c0c-4380-8443-7d8d030fe499_1400x830.heic 1272w, https://substackcdn.com/image/fetch/$s_!beVr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6f334ba-7c0c-4380-8443-7d8d030fe499_1400x830.heic 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Prompt-based LLMs struggle to deliver fast, privacy-safe, and personalized travel recommendations at production scale. 
The article explains how Booking.com used supervised fine-tuning on an open-weight model with parameter-efficient techniques, contextual inputs, and carefully designed labels to combine conversational understanding with behavioral signals.</p><p><strong><a href="https://booking.ai/beyond-prompt-engineering-how-we-used-supervised-fine-tuning-for-travel-recommendations-91e8f4711e4b">https://booking.ai/beyond-prompt-engineering-how-we-used-supervised-fine-tuning-for-travel-recommendations-91e8f4711e4b</a></strong></p><div><hr></div><h1>Pranav Mehta: Clickhouse Internals: A Deep Dive into ClickHouse Distributed Connection Pooling</h1><p>ClickHouse operators may misinterpret connection-retry warnings as leaks when distributed queries encounter transient network errors. The article explains how ClickHouse reuses pooled TCP connections for distributed tables and why idle timeouts in spiky workloads produce harmless &#8220;Broken pipe&#8221; warnings. </p><p><strong><a href="https://medium.com/@pranavmehta94/clickhouse-internals-a-deep-dive-into-clickhouse-distributed-connection-pooling-d9e956b5eb57">https://medium.com/@pranavmehta94/clickhouse-internals-a-deep-dive-into-clickhouse-distributed-connection-pooling-d9e956b5eb57</a></strong></p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. 
All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[Data Engineering Weekly #255]]></title><description><![CDATA[The Weekly Data Engineering Newsletter]]></description><link>https://www.dataengineeringweekly.com/p/data-engineering-weekly-255</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/data-engineering-weekly-255</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Mon, 02 Feb 2026 02:22:26 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AdQk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6f51c1bf-abc9-4cd3-ad69-22cc8e3f1ef2_1080x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/events/dagster-running-dagster-how-we-use-compass-for-ai-analytics?utm_campaign=34764303-26-02-WBNR_Deep%20Dive_Dagster_Running_Dagster_Compass&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_compass&amp;utm_content=02-01-26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!azkE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c6cd40-4b1c-480e-83bf-193e1fd0b1a9_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!azkE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c6cd40-4b1c-480e-83bf-193e1fd0b1a9_3840x2160.heic 848w, 
https://substackcdn.com/image/fetch/$s_!azkE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c6cd40-4b1c-480e-83bf-193e1fd0b1a9_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!azkE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c6cd40-4b1c-480e-83bf-193e1fd0b1a9_3840x2160.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!azkE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c6cd40-4b1c-480e-83bf-193e1fd0b1a9_3840x2160.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/87c6cd40-4b1c-480e-83bf-193e1fd0b1a9_3840x2160.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22021,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/events/dagster-running-dagster-how-we-use-compass-for-ai-analytics?utm_campaign=34764303-26-02-WBNR_Deep%20Dive_Dagster_Running_Dagster_Compass&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_compass&amp;utm_content=02-01-26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/186564828?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c6cd40-4b1c-480e-83bf-193e1fd0b1a9_3840x2160.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!azkE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c6cd40-4b1c-480e-83bf-193e1fd0b1a9_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!azkE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c6cd40-4b1c-480e-83bf-193e1fd0b1a9_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!azkE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c6cd40-4b1c-480e-83bf-193e1fd0b1a9_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!azkE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c6cd40-4b1c-480e-83bf-193e1fd0b1a9_3840x2160.heic 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1>Dagster Running Dagster dives into AI analytics.</h1><p>In this upcoming session, Analytics Lead Anil walks through how Compass has increased the Dagster data team's capacity, shares best practices for data modeling that work well with AI assistants (hint: nested columns and wide tables are your friends), and demos a real case where our Compass Dagster+ integration identified the root cause of a Postgres-to-Snowflake pipeline that was failing 40-50% of the time.</p><p><strong><a href="https://dagster.io/events/dagster-running-dagster-how-we-use-compass-for-ai-analytics?utm_campaign=34764303-26-02-WBNR_Deep%20Dive_Dagster_Running_Dagster_Compass&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_compass&amp;utm_content=02-01-26_data_engineering_weekly">Save your spot now</a>.</strong></p><div><hr></div><h1>OpenAI: Unrolling the Codex agent loop</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pE2N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81c303-3c96-476c-8c27-93c60e1f3c64_1198x716.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pE2N!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81c303-3c96-476c-8c27-93c60e1f3c64_1198x716.heic 424w, 
https://substackcdn.com/image/fetch/$s_!pE2N!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81c303-3c96-476c-8c27-93c60e1f3c64_1198x716.heic 848w, https://substackcdn.com/image/fetch/$s_!pE2N!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81c303-3c96-476c-8c27-93c60e1f3c64_1198x716.heic 1272w, https://substackcdn.com/image/fetch/$s_!pE2N!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81c303-3c96-476c-8c27-93c60e1f3c64_1198x716.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pE2N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81c303-3c96-476c-8c27-93c60e1f3c64_1198x716.heic" width="1198" height="716" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d81c303-3c96-476c-8c27-93c60e1f3c64_1198x716.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:716,&quot;width&quot;:1198,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:11866,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/186564828?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81c303-3c96-476c-8c27-93c60e1f3c64_1198x716.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!pE2N!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81c303-3c96-476c-8c27-93c60e1f3c64_1198x716.heic 424w, https://substackcdn.com/image/fetch/$s_!pE2N!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81c303-3c96-476c-8c27-93c60e1f3c64_1198x716.heic 848w, https://substackcdn.com/image/fetch/$s_!pE2N!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81c303-3c96-476c-8c27-93c60e1f3c64_1198x716.heic 1272w, https://substackcdn.com/image/fetch/$s_!pE2N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d81c303-3c96-476c-8c27-93c60e1f3c64_1198x716.heic 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Explanations of AI agents often obscure how local tools, model inference, and user interaction are orchestrated in practice. The article breaks down the Codex CLI agent loop, detailing how prompts, tool calls, iterative inference, context compaction, and prompt caching work together to execute software tasks efficiently. By combining stateless operation, automatic context management, and flexible tool integration via MCP, Codex achieves secure, performant local agent execution without server-side session retention.</p><p><strong><a href="https://openai.com/index/unrolling-the-codex-agent-loop/">https://openai.com/index/unrolling-the-codex-agent-loop/</a></strong></p><div><hr></div><h1>OpenAI: Inside OpenAI&#8217;s in-house data agent</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bzTg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0caca09a-1888-48f5-ab5a-983a45c4368f_1830x1124.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bzTg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0caca09a-1888-48f5-ab5a-983a45c4368f_1830x1124.heic 424w, https://substackcdn.com/image/fetch/$s_!bzTg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0caca09a-1888-48f5-ab5a-983a45c4368f_1830x1124.heic 848w, 
https://substackcdn.com/image/fetch/$s_!bzTg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0caca09a-1888-48f5-ab5a-983a45c4368f_1830x1124.heic 1272w, https://substackcdn.com/image/fetch/$s_!bzTg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0caca09a-1888-48f5-ab5a-983a45c4368f_1830x1124.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bzTg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0caca09a-1888-48f5-ab5a-983a45c4368f_1830x1124.heic" width="1456" height="894" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0caca09a-1888-48f5-ab5a-983a45c4368f_1830x1124.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:894,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:23903,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/186564828?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0caca09a-1888-48f5-ab5a-983a45c4368f_1830x1124.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bzTg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0caca09a-1888-48f5-ab5a-983a45c4368f_1830x1124.heic 424w, 
https://substackcdn.com/image/fetch/$s_!bzTg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0caca09a-1888-48f5-ab5a-983a45c4368f_1830x1124.heic 848w, https://substackcdn.com/image/fetch/$s_!bzTg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0caca09a-1888-48f5-ab5a-983a45c4368f_1830x1124.heic 1272w, https://substackcdn.com/image/fetch/$s_!bzTg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0caca09a-1888-48f5-ab5a-983a45c4368f_1830x1124.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>OpenAI writes about its internal data agent, which uses a closed-loop, self-correcting process and multiple context layers to translate natural language into reliable queries across hundreds of petabytes of data. By grounding meaning in code, minimizing tool complexity, and enforcing pass-through permissions with continuous evaluation, the system delivers fast, secure, and reliable data access for employees at scale.</p><p><strong><a href="https://openai.com/index/inside-our-in-house-data-agent/">https://openai.com/index/inside-our-in-house-data-agent/</a></strong></p><div><hr></div><h1>Preset: The Semantic Layer Is Back. Here&#8217;s What We&#8217;re Doing About It.</h1><p>The article reads like a pitch for Preset, but what I liked most is its clear analogy for what a semantic layer is and why it has failed so far, even though legacy tools like Business Objects supported it. Overall, I&#8217;m excited about agents as an interface for insights and the renewed interest in the semantic layer. 
</p><p><strong><a href="https://preset.io/blog/semantic-layer-is-back/">https://preset.io/blog/semantic-layer-is-back/</a></strong></p><div><hr></div><h1>Sponsored: How to build a data platform that's ready for AI</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=02_01_26_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2i_W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F682b92fe-6c83-4992-b0e2-9dad75a8be4f_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!2i_W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F682b92fe-6c83-4992-b0e2-9dad75a8be4f_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!2i_W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F682b92fe-6c83-4992-b0e2-9dad75a8be4f_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!2i_W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F682b92fe-6c83-4992-b0e2-9dad75a8be4f_2400x1260.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2i_W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F682b92fe-6c83-4992-b0e2-9dad75a8be4f_2400x1260.heic" width="1456" height="764" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/682b92fe-6c83-4992-b0e2-9dad75a8be4f_2400x1260.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15511,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=02_01_26_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/186564828?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F682b92fe-6c83-4992-b0e2-9dad75a8be4f_2400x1260.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2i_W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F682b92fe-6c83-4992-b0e2-9dad75a8be4f_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!2i_W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F682b92fe-6c83-4992-b0e2-9dad75a8be4f_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!2i_W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F682b92fe-6c83-4992-b0e2-9dad75a8be4f_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!2i_W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F682b92fe-6c83-4992-b0e2-9dad75a8be4f_2400x1260.heic 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Traditional data platforms are becoming the biggest bottleneck when companies experiment with AI. 
Learn how to build a unified control plane that enables AI-driven development, reduces pipeline failures, and cuts complexity.<br><br>- Transform from Big Complexity to AI-ready architecture<br>- Real metrics from organizations achieving 50% cost reductions<br>- Introduction to Components: YAML-first pipelines that AI can build</p><p><strong><a href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=02_01_26_data_engineering_weekly">Get the free guide now</a></strong></p><div><hr></div><h1>LangChain: Context Management for Deep Agents</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!95-w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a27a000-f63e-42fd-8e78-a254b7a9c4e8_1783x1047.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!95-w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a27a000-f63e-42fd-8e78-a254b7a9c4e8_1783x1047.heic 424w, https://substackcdn.com/image/fetch/$s_!95-w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a27a000-f63e-42fd-8e78-a254b7a9c4e8_1783x1047.heic 848w, https://substackcdn.com/image/fetch/$s_!95-w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a27a000-f63e-42fd-8e78-a254b7a9c4e8_1783x1047.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!95-w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a27a000-f63e-42fd-8e78-a254b7a9c4e8_1783x1047.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!95-w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a27a000-f63e-42fd-8e78-a254b7a9c4e8_1783x1047.heic" width="1456" height="855" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3a27a000-f63e-42fd-8e78-a254b7a9c4e8_1783x1047.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:855,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17628,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/186564828?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a27a000-f63e-42fd-8e78-a254b7a9c4e8_1783x1047.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!95-w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a27a000-f63e-42fd-8e78-a254b7a9c4e8_1783x1047.heic 424w, https://substackcdn.com/image/fetch/$s_!95-w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a27a000-f63e-42fd-8e78-a254b7a9c4e8_1783x1047.heic 848w, 
https://substackcdn.com/image/fetch/$s_!95-w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a27a000-f63e-42fd-8e78-a254b7a9c4e8_1783x1047.heic 1272w, https://substackcdn.com/image/fetch/$s_!95-w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a27a000-f63e-42fd-8e78-a254b7a9c4e8_1783x1047.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Large AI agents risk context rot when long-running tasks exceed LLM memory limits, degrading reasoning quality. 
The article explains how the Deep Agents SDK actively manages context using tool input and output offloading, filesystem-backed pointers, and structured summarization to stay within token limits. Targeted evaluations ensure agents can recover critical details from compressed context and maintain task intent over extended workflows.</p><p><strong><a href="https://www.blog.langchain.com/context-management-for-deepagents/">https://www.blog.langchain.com/context-management-for-deepagents/</a></strong></p><div><hr></div><h1>Dropbox: Engineering VP Josh Clemm on how we use knowledge graphs, MCP, and DSPy in Dash</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ioGh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6706b1e-7ecf-4828-877c-cb94457c44c1_1600x900.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ioGh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6706b1e-7ecf-4828-877c-cb94457c44c1_1600x900.heic 424w, https://substackcdn.com/image/fetch/$s_!ioGh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6706b1e-7ecf-4828-877c-cb94457c44c1_1600x900.heic 848w, https://substackcdn.com/image/fetch/$s_!ioGh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6706b1e-7ecf-4828-877c-cb94457c44c1_1600x900.heic 1272w, https://substackcdn.com/image/fetch/$s_!ioGh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6706b1e-7ecf-4828-877c-cb94457c44c1_1600x900.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!ioGh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6706b1e-7ecf-4828-877c-cb94457c44c1_1600x900.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c6706b1e-7ecf-4828-877c-cb94457c44c1_1600x900.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:20332,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/186564828?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6706b1e-7ecf-4828-877c-cb94457c44c1_1600x900.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ioGh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6706b1e-7ecf-4828-877c-cb94457c44c1_1600x900.heic 424w, https://substackcdn.com/image/fetch/$s_!ioGh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6706b1e-7ecf-4828-877c-cb94457c44c1_1600x900.heic 848w, https://substackcdn.com/image/fetch/$s_!ioGh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6706b1e-7ecf-4828-877c-cb94457c44c1_1600x900.heic 1272w, https://substackcdn.com/image/fetch/$s_!ioGh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6706b1e-7ecf-4828-877c-cb94457c44c1_1600x900.heic 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Building a universal search and agentic workspace is difficult because work data spans many tools, formats, and contexts, while LLMs face latency and context limits. 
The article explains how Dropbox Dash uses an index-based retrieval system, a context engine with multimodal processing, knowledge bundles, and MCP-based super tools, combined with LLM-as-a-judge and DSPy-driven prompt optimization.</p><p><strong><a href="https://dropbox.tech/machine-learning/vp-josh-clemm-knowledge-graphs-mcp-and-dspy-dash">https://dropbox.tech/machine-learning/vp-josh-clemm-knowledge-graphs-mcp-and-dspy-dash</a></strong></p><div><hr></div><h1>Whatnot: Lessons learned from scaling data scientists with AI</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!susK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10b0837-4a08-415f-939c-890e10cefc15_1400x402.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!susK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10b0837-4a08-415f-939c-890e10cefc15_1400x402.heic 424w, https://substackcdn.com/image/fetch/$s_!susK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10b0837-4a08-415f-939c-890e10cefc15_1400x402.heic 848w, https://substackcdn.com/image/fetch/$s_!susK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10b0837-4a08-415f-939c-890e10cefc15_1400x402.heic 1272w, https://substackcdn.com/image/fetch/$s_!susK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10b0837-4a08-415f-939c-890e10cefc15_1400x402.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!susK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10b0837-4a08-415f-939c-890e10cefc15_1400x402.heic" width="1400" height="402" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a10b0837-4a08-415f-939c-890e10cefc15_1400x402.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:402,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14588,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/186564828?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10b0837-4a08-415f-939c-890e10cefc15_1400x402.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!susK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10b0837-4a08-415f-939c-890e10cefc15_1400x402.heic 424w, https://substackcdn.com/image/fetch/$s_!susK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10b0837-4a08-415f-939c-890e10cefc15_1400x402.heic 848w, https://substackcdn.com/image/fetch/$s_!susK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10b0837-4a08-415f-939c-890e10cefc15_1400x402.heic 1272w, https://substackcdn.com/image/fetch/$s_!susK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa10b0837-4a08-415f-939c-890e10cefc15_1400x402.heic 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>AI-driven analytics struggle to generate correct queries because raw tables lack explicit business meaning and consistent relationships. The article explains how semantic views encode business logic, table relationships, and approved data scope to give LLMs precise, machine-readable context for SQL generation. 
By standardizing definitions and constraining access to vetted datasets, semantic views improve query accuracy, reduce hallucinations, and make AI-assisted analytics safer and more reliable.</p><p><strong><a href="https://medium.com/whatnot-engineering/lessons-learned-from-scaling-data-scientists-with-ai-e7aa7b3235b4">https://medium.com/whatnot-engineering/lessons-learned-from-scaling-data-scientists-with-ai-e7aa7b3235b4</a></strong></p><div><hr></div><h1>Netflix: Data Bridge: How Netflix simplifies data movement</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CXjk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10cd3f05-b03d-4aac-984e-ea9481358af3_1275x582.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CXjk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10cd3f05-b03d-4aac-984e-ea9481358af3_1275x582.heic 424w, https://substackcdn.com/image/fetch/$s_!CXjk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10cd3f05-b03d-4aac-984e-ea9481358af3_1275x582.heic 848w, https://substackcdn.com/image/fetch/$s_!CXjk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10cd3f05-b03d-4aac-984e-ea9481358af3_1275x582.heic 1272w, https://substackcdn.com/image/fetch/$s_!CXjk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10cd3f05-b03d-4aac-984e-ea9481358af3_1275x582.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!CXjk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10cd3f05-b03d-4aac-984e-ea9481358af3_1275x582.heic" width="1275" height="582" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/10cd3f05-b03d-4aac-984e-ea9481358af3_1275x582.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:582,&quot;width&quot;:1275,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17276,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/186564828?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10cd3f05-b03d-4aac-984e-ea9481358af3_1275x582.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CXjk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10cd3f05-b03d-4aac-984e-ea9481358af3_1275x582.heic 424w, https://substackcdn.com/image/fetch/$s_!CXjk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10cd3f05-b03d-4aac-984e-ea9481358af3_1275x582.heic 848w, https://substackcdn.com/image/fetch/$s_!CXjk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10cd3f05-b03d-4aac-984e-ea9481358af3_1275x582.heic 1272w, https://substackcdn.com/image/fetch/$s_!CXjk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10cd3f05-b03d-4aac-984e-ea9481358af3_1275x582.heic 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Fragmented data movement tooling creates operational overhead, inconsistent governance, and tightly coupled implementations across large data ecosystems. 
The article describes how Netflix built Data Bridge as a unified control plane that separates user intent from execution, centralizes governance, and orchestrates existing data movement systems through standardized interfaces.</p><p><strong><a href="https://netflixtechblog.medium.com/data-bridge-how-netflix-simplifies-data-movement-36d10d91c313">https://netflixtechblog.medium.com/data-bridge-how-netflix-simplifies-data-movement-36d10d91c313</a></strong></p><div><hr></div><h1>LinkedIn: Contextual agent playbooks and tools: How LinkedIn gave AI coding agents organizational context</h1><p>AI coding agents struggle to operate effectively without access to company-specific context, tools, and workflows. The article describes LinkedIn&#8217;s CAPT framework, which uses MCP, executable playbooks, and scalable meta-tools to connect agents to internal systems while controlling context and tool discovery. By packaging CAPT as a zero-friction local service, LinkedIn enables agents to automate debugging, incident response, data analysis, and issue triage, reducing investigation time by up to 70%.</p><p><strong><a href="https://www.linkedin.com/blog/engineering/ai/contextual-agent-playbooks-and-tools-how-linkedin-gave-ai-coding-agents-organizational-context">https://www.linkedin.com/blog/engineering/ai/contextual-agent-playbooks-and-tools-how-linkedin-gave-ai-coding-agents-organizational-context</a></strong></p><div><hr></div><h1>Netflix: The AI Evolution of Graph Search at Netflix: From Structured Queries to Natural Language</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!WjZA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0332407-ffe3-4816-b42f-c3ecc1e51ecf_1323x1560.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!WjZA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0332407-ffe3-4816-b42f-c3ecc1e51ecf_1323x1560.heic 424w, https://substackcdn.com/image/fetch/$s_!WjZA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0332407-ffe3-4816-b42f-c3ecc1e51ecf_1323x1560.heic 848w, https://substackcdn.com/image/fetch/$s_!WjZA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0332407-ffe3-4816-b42f-c3ecc1e51ecf_1323x1560.heic 1272w, https://substackcdn.com/image/fetch/$s_!WjZA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0332407-ffe3-4816-b42f-c3ecc1e51ecf_1323x1560.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!WjZA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0332407-ffe3-4816-b42f-c3ecc1e51ecf_1323x1560.heic" width="1323" height="1560" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f0332407-ffe3-4816-b42f-c3ecc1e51ecf_1323x1560.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1560,&quot;width&quot;:1323,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:26591,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/186564828?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0332407-ffe3-4816-b42f-c3ecc1e51ecf_1323x1560.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!WjZA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0332407-ffe3-4816-b42f-c3ecc1e51ecf_1323x1560.heic 424w, https://substackcdn.com/image/fetch/$s_!WjZA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0332407-ffe3-4816-b42f-c3ecc1e51ecf_1323x1560.heic 848w, https://substackcdn.com/image/fetch/$s_!WjZA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0332407-ffe3-4816-b42f-c3ecc1e51ecf_1323x1560.heic 1272w, https://substackcdn.com/image/fetch/$s_!WjZA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0332407-ffe3-4816-b42f-c3ecc1e51ecf_1323x1560.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Enterprise search systems struggle when users must express complex filters through rigid, technical query languages. The article explains how Netflix evolved Graph Search by using LLMs to translate natural language into validated, schema-aware DSL queries with field-level RAG and AST-based verification. 
By visualizing AI-generated logic and supporting explicit entity selection, the platform lets users query federated data intuitively while maintaining correctness and trust.</p><p><strong><a href="https://netflixtechblog.com/the-ai-evolution-of-graph-search-at-netflix-d416ec5b1151">https://netflixtechblog.com/the-ai-evolution-of-graph-search-at-netflix-d416ec5b1151</a></strong></p><div><hr></div><h1>Modern Data 101: Modeling Semantics: How Data Models and Ontologies Connect to Build Your Semantic Foundations</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kQNr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d0c098a-29dc-4d5c-8b2c-f4e296970e84_1400x1003.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kQNr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d0c098a-29dc-4d5c-8b2c-f4e296970e84_1400x1003.heic 424w, https://substackcdn.com/image/fetch/$s_!kQNr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d0c098a-29dc-4d5c-8b2c-f4e296970e84_1400x1003.heic 848w, https://substackcdn.com/image/fetch/$s_!kQNr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d0c098a-29dc-4d5c-8b2c-f4e296970e84_1400x1003.heic 1272w, https://substackcdn.com/image/fetch/$s_!kQNr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d0c098a-29dc-4d5c-8b2c-f4e296970e84_1400x1003.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!kQNr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d0c098a-29dc-4d5c-8b2c-f4e296970e84_1400x1003.heic" width="1400" height="1003" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5d0c098a-29dc-4d5c-8b2c-f4e296970e84_1400x1003.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1003,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:18258,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/186564828?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d0c098a-29dc-4d5c-8b2c-f4e296970e84_1400x1003.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kQNr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d0c098a-29dc-4d5c-8b2c-f4e296970e84_1400x1003.heic 424w, https://substackcdn.com/image/fetch/$s_!kQNr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d0c098a-29dc-4d5c-8b2c-f4e296970e84_1400x1003.heic 848w, https://substackcdn.com/image/fetch/$s_!kQNr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d0c098a-29dc-4d5c-8b2c-f4e296970e84_1400x1003.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!kQNr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d0c098a-29dc-4d5c-8b2c-f4e296970e84_1400x1003.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>AI-driven systems struggle without explicit semantic context to ground reasoning and reduce hallucinations. The article argues that data modeling and ontologies both capture entities and relationships and should serve as core methods for discovering and structuring organizational knowledge. 
By combining industry standards, conceptual modeling, and AI-assisted enrichment, teams can build a unified semantic foundation that improves both human understanding and AI accuracy.</p><p><strong><a href="https://medium.com/@community_md101/modeling-semantics-how-data-models-and-ontologies-connect-to-build-your-semantic-foundations-3a9a0664e3ff">https://medium.com/@community_md101/modeling-semantics-how-data-models-and-ontologies-connect-to-build-your-semantic-foundations-3a9a0664e3ff</a></strong></p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[The Missing Layer in Your AI Stack: Context, Not Just State]]></title><description><![CDATA[From SQL to Semantics: The Rise of the Context Graph for AI Agents]]></description><link>https://www.dataengineeringweekly.com/p/the-missing-layer-in-your-ai-stack</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/the-missing-layer-in-your-ai-stack</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Sat, 31 Jan 2026 04:13:17 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!3e6i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50511770-69e0-465d-bcba-3f5078763d59_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ip7u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdad6f15-bc0b-4bdc-9369-04c11d67c9a8_1314x812.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source 
type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ip7u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdad6f15-bc0b-4bdc-9369-04c11d67c9a8_1314x812.heic 424w, https://substackcdn.com/image/fetch/$s_!Ip7u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdad6f15-bc0b-4bdc-9369-04c11d67c9a8_1314x812.heic 848w, https://substackcdn.com/image/fetch/$s_!Ip7u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdad6f15-bc0b-4bdc-9369-04c11d67c9a8_1314x812.heic 1272w, https://substackcdn.com/image/fetch/$s_!Ip7u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdad6f15-bc0b-4bdc-9369-04c11d67c9a8_1314x812.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ip7u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdad6f15-bc0b-4bdc-9369-04c11d67c9a8_1314x812.heic" width="1314" height="812" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bdad6f15-bc0b-4bdc-9369-04c11d67c9a8_1314x812.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:812,&quot;width&quot;:1314,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:36462,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/186379044?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdad6f15-bc0b-4bdc-9369-04c11d67c9a8_1314x812.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ip7u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdad6f15-bc0b-4bdc-9369-04c11d67c9a8_1314x812.heic 424w, https://substackcdn.com/image/fetch/$s_!Ip7u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdad6f15-bc0b-4bdc-9369-04c11d67c9a8_1314x812.heic 848w, https://substackcdn.com/image/fetch/$s_!Ip7u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdad6f15-bc0b-4bdc-9369-04c11d67c9a8_1314x812.heic 1272w, https://substackcdn.com/image/fetch/$s_!Ip7u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdad6f15-bc0b-4bdc-9369-04c11d67c9a8_1314x812.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p><em><strong><a href="https://atlan.com/great-data-debate-2026/?utm_source=DEW+&amp;utm_medium=Substack&amp;utm_campaign=DEW_GDD">Join The Great Data Debate</a></strong> to get answers to questions the data &amp; AI industry is so curious about right now:</em></p><ul><li><p><em>Where does context materialize in practice?</em></p></li><li><p><em>Semantic layers, ontologies, context graphs - what should data teams build in 2026?</em></p></li><li><p><em>Who owns context as meaning evolves?</em></p></li><li><p><em>Where should that context live: in the warehouse, inside agents, or in a dedicated context layer?</em></p></li></ul><p><strong><a href="https://atlan.com/great-data-debate-2026/?utm_source=DEW+&amp;utm_medium=Substack&amp;utm_campaign=DEW_GDD">Register Here</a></strong></p><div><hr></div><h1>Why Data Engineers Must 
Think in Graphs, Not Just Tables</h1><p>If you have been following the &#8220;Systems of Record&#8221; debate on tech Twitter, you likely saw the clash between the &#8220;Agents kill SaaS&#8221; camp and the &#8220;Long live the Database&#8221; camp. For data engineers, the reality is more nuanced and far more interesting.</p><p>As we move from dashboards to autonomous agents, we are hitting a wall: knowing the <em>state</em> (what happened) is not the same as knowing the <em>reasoning</em> (why it happened).</p><p>Drawing on recent insights from Foundation Capital, Jamin Ball (Altimeter), OpenAI&#8217;s internal engineering team, and the TrustGraph manifesto, this post explores the emergence of the <strong>Context Graph</strong>, a missing architectural layer that will likely redefine how we build data platforms in the agentic era.</p><div><hr></div><h1>The Problem: State Machines vs. Decision Traces</h1><p>For the past decade, our role as data engineers has been to centralize data in the warehouse (or Lakehouse). We built ETL pipelines to move data from Salesforce, NetSuite, and Zendesk into a &#8220;Single Source of Truth.&#8221;</p><p>However, traditional Systems of Record (SoR) effectively act as &#8220;state machines.&#8221; They record the final outcome: the organization closed a deal, applied a discount, or escalated a ticket. But they fail to capture the <strong>decision traces</strong> that led to that outcome.</p><p>As Foundation Capital notes, the <em>reasoning</em> behind a decision (the Slack threads, the cross-system synthesis, the VP&#8217;s verbal override of a policy) is rarely captured in the database. A CRM might show a &#8220;20% discount,&#8221; but it won&#8217;t tell an AI agent <em>why</em> that exception was granted (e.g., &#8220;Customer represents a strategic entry into the APAC market&#8221;).</p><p>Without these traces, agents fly blind. 
They have the rules (&#8220;Do not give discounts &gt;10%&#8221;), but they lack historical context on when and why those rules were overridden.</p><div><hr></div><h1>The Solution: The Truth Registry and the Context Graph</h1><p>To address this, we observe a bifurcation in the modern data stack, as illustrated by the <strong>Hybrid Agentic Architecture</strong> (see Figure 1 below).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3e6i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50511770-69e0-465d-bcba-3f5078763d59_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3e6i!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50511770-69e0-465d-bcba-3f5078763d59_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!3e6i!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50511770-69e0-465d-bcba-3f5078763d59_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!3e6i!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50511770-69e0-465d-bcba-3f5078763d59_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!3e6i!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50511770-69e0-465d-bcba-3f5078763d59_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3e6i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50511770-69e0-465d-bcba-3f5078763d59_1536x1024.png" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/50511770-69e0-465d-bcba-3f5078763d59_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3e6i!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50511770-69e0-465d-bcba-3f5078763d59_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!3e6i!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50511770-69e0-465d-bcba-3f5078763d59_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!3e6i!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50511770-69e0-465d-bcba-3f5078763d59_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!3e6i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50511770-69e0-465d-bcba-3f5078763d59_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>This architecture consists of two distinct but integrated planes:</p><h2>1. The Warehouse as the &#8220;Truth Registry&#8221;</h2><p>Jamin Ball argues that systems of record aren&#8217;t dying; they are becoming &#8220;boring, rock-solid sources of truth.&#8221; In an agentic world, the warehouse must evolve into a <strong>&#8220;Truth Registry&#8221;</strong> that encodes semantic contracts.</p><p>Agents are fragile. If an agent hallucinates the definition of &#8220;Churn,&#8221; it can automate disastrous decisions. Therefore, we must clean and canonicalize data <em>before</em> the agent sees it. In the architecture above, data flows from <strong>Raw (Variant)</strong> to <strong>Silver (Extracted)</strong> to <strong>Gold (Canonical Model)</strong>.</p><ul><li><p><strong>Engineering Takeaway:</strong> You cannot feed agents raw JSON blobs. Extracting variant columns into typed, named columns in the Silver layer is critical. 
It transforms &#8220;available data&#8221; into &#8220;governed data,&#8221; preventing agents from guessing schemas at runtime.</p></li></ul><h2>2. The Context Graph as the &#8220;Reasoning Layer&#8221;</h2><p>While the warehouse handles facts, the <strong>Context Graph</strong> handles relationships. TrustGraph defines a context graph as a &#8220;triples-representation of data (Subject &#8594; Predicate &#8594; Object) optimized for AI.&#8221;</p><p>Why a graph? Because <strong>structure is information</strong>. When you feed an LLM structured data (such as RDF triples or Cypher graph patterns), the structure itself encodes meaning. This allows agents to traverse relationships that SQL joins struggle to represent, stitching together a user&#8217;s support ticket, their billing status, and their web activity into a single, queryable context.</p><div><hr></div><h1>Case Study: Inside OpenAI&#8217;s Data Agent</h1><p>OpenAI recently reported that standard metadata was insufficient for their internal data agent. They had to build a custom &#8220;Context Layer&#8221; that closely resembles the architecture above.</p><p>Their agent failed when it relied solely on table schemas. To fix this, they added:</p><ol><li><p><strong>Human Annotations:</strong> Curated descriptions of what tables <em>actually</em> mean (e.g., &#8220;This table excludes logged-out users&#8221;).</p></li><li><p><strong>Code Enrichment:</strong> They used &#8220;Codex&#8221; to crawl their own codebase, inferring data lineage not just from metadata, but from the pipelines that produced the data.</p></li></ol><p>This confirms a major trend: <strong>The metadata </strong><em><strong>is</strong></em><strong> the model.</strong> Providing agents with a semantic ontology (machine-readable definitions of terms) is just as important as the data itself.</p><div><hr></div><h1>The &#8220;Front Door&#8221; Is Moving</h1><p>The implications for the industry are massive. 
Historically, if you owned the System of Record (like Salesforce), you owned the &#8220;Front Door&#8221; (the UI).</p><p>But as agents take over workflows, the UI is unbundling from the data. Jamin Ball compares this to the travel industry: <strong>Global Distribution Systems</strong> (GDSs such as Sabre and Amadeus) remained the backend source of truth, but <strong>Online Travel Agencies</strong> (OTAs such as Expedia and Booking) captured the front door, and with it the value.</p><p>In our new stack, the <strong>Agents</strong> become the OTAs. They are the new interface. The Warehouse/Lakehouse becomes the GDS: the invisible, essential infrastructure layer.</p><div><hr></div><h1>What This Means for Data Engineers</h1><ol><li><p><strong>Stop Hoarding State, Start Capturing Traces:</strong> We need to instrument our systems to emit &#8220;decision traces&#8221; on every run. If an agent (or human) makes a decision, record the <em>inputs</em> and the <em>logic</em> used, not just the result.</p></li><li><p><strong>The Rise of the &#8220;Gold&#8221; Layer:</strong> Your dbt models are no longer just for dashboards. They are the safety rails for autonomous agents. Strict typing, &#8220;Gold&#8221; tables, and canonical definitions are non-negotiable.</p></li><li><p><strong>Graph Literacy:</strong> You don&#8217;t need to be a Neo4j expert, but understanding the basics of triples (Subject-Predicate-Object) and ontologies is becoming a core DE skill.</p></li><li><p><strong>Extract Your Semi-Structured/Unstructured Data:</strong> As shown in the architecture diagram, leaving data in unstructured blobs is a liability. Agents need explicit structure to reason safely.</p></li></ol><p>As agents grow more capable, the infrastructure beneath them must evolve. The Context Graph offers a powerful new foundation, not just for smarter agents, but for more transparent, explainable, and aligned systems. 
It&#8217;s time for data teams to build not just pipelines, but reasoning engines.</p><div><hr></div><h1>References</h1><p><strong><a href="https://x.com/KirkMarple/status/2003944353342149021">https://x.com/KirkMarple/status/2003944353342149021</a></strong></p><p><strong><a href="https://x.com/KirkMarple/status/2005443843848856047">https://x.com/KirkMarple/status/2005443843848856047</a></strong></p><p><strong><a href="https://foundationcapital.com/context-graphs-ais-trillion-dollar-opportunity/">https://foundationcapital.com/context-graphs-ais-trillion-dollar-opportunity/</a></strong></p><p><strong><a href="https://trustgraph.ai/news/context-graph-manifesto/">https://trustgraph.ai/news/context-graph-manifesto/</a></strong></p><p><strong><a href="https://openai.com/index/inside-our-in-house-data-agent/">https://openai.com/index/inside-our-in-house-data-agent/</a></strong></p><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:181366171,&quot;url&quot;:&quot;https://cloudedjudgement.substack.com/p/clouded-judgement-121225-long-live&quot;,&quot;publication_id&quot;:56878,&quot;publication_name&quot;:&quot;Clouded Judgement&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!UZpO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3a0e019-4ab6-4db8-b07a-5fe763dd3fab_669x669.png&quot;,&quot;title&quot;:&quot;Clouded Judgement 12.12.25 - Long Live Systems of Record&quot;,&quot;truncated_body_text&quot;:&quot;Every week I&#8217;ll provide updates on the latest trends in cloud software companies. 
Follow along to stay up to date!&quot;,&quot;date&quot;:&quot;2025-12-12T14:03:40.604Z&quot;,&quot;like_count&quot;:203,&quot;comment_count&quot;:12,&quot;bylines&quot;:[{&quot;id&quot;:11803623,&quot;name&quot;:&quot;Jamin Ball&quot;,&quot;handle&quot;:&quot;cloudedjudgement&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/94fea488-4be3-4043-9fdc-62e3018a3163_297x297.jpeg&quot;,&quot;bio&quot;:&quot;Venture Capitalist investing in enterprise software businesses&quot;,&quot;profile_set_up_at&quot;:&quot;2021-09-08T18:11:34.947Z&quot;,&quot;reader_installed_at&quot;:&quot;2024-06-22T02:56:08.332Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:210259,&quot;user_id&quot;:11803623,&quot;publication_id&quot;:56878,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:56878,&quot;name&quot;:&quot;Clouded Judgement&quot;,&quot;subdomain&quot;:&quot;cloudedjudgement&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Weekly data driven analysis of SaaS companies &quot;,&quot;logo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/f3a0e019-4ab6-4db8-b07a-5fe763dd3fab_669x669.png&quot;,&quot;author_id&quot;:11803623,&quot;primary_user_id&quot;:11803623,&quot;theme_var_background_pop&quot;:&quot;#d10000&quot;,&quot;created_at&quot;:&quot;2020-06-16T19:15:55.639Z&quot;,&quot;email_from_name&quot;:&quot;Clouded Judgement by Jamin Ball&quot;,&quot;copyright&quot;:&quot;Jamin 
Ball&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:null,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://cloudedjudgement.substack.com/p/clouded-judgement-121225-long-live?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!UZpO!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3a0e019-4ab6-4db8-b07a-5fe763dd3fab_669x669.png" loading="lazy"><span class="embedded-post-publication-name">Clouded Judgement</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Clouded Judgement 12.12.25 - Long Live Systems of Record</div></div><div class="embedded-post-body">Every week I&#8217;ll provide updates on the latest trends in cloud software companies. 
Follow along to stay up to date&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">4 months ago &#183; 203 likes &#183; 12 comments &#183; Jamin Ball</div></a></div><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:181913265,&quot;url&quot;:&quot;https://cloudedjudgement.substack.com/p/clouded-judgement-121925-the-front&quot;,&quot;publication_id&quot;:56878,&quot;publication_name&quot;:&quot;Clouded Judgement&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!UZpO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3a0e019-4ab6-4db8-b07a-5fe763dd3fab_669x669.png&quot;,&quot;title&quot;:&quot;Clouded Judgement 12.19.25 - The System of Record's Front Door&quot;,&quot;truncated_body_text&quot;:&quot;Every week I&#8217;ll provide updates on the latest trends in cloud software companies. 
Follow along to stay up to date!&quot;,&quot;date&quot;:&quot;2025-12-19T14:04:27.561Z&quot;,&quot;like_count&quot;:61,&quot;comment_count&quot;:10,&quot;bylines&quot;:[{&quot;id&quot;:11803623,&quot;name&quot;:&quot;Jamin Ball&quot;,&quot;handle&quot;:&quot;cloudedjudgement&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/94fea488-4be3-4043-9fdc-62e3018a3163_297x297.jpeg&quot;,&quot;bio&quot;:&quot;Venture Capitalist investing in enterprise software businesses&quot;,&quot;profile_set_up_at&quot;:&quot;2021-09-08T18:11:34.947Z&quot;,&quot;reader_installed_at&quot;:&quot;2024-06-22T02:56:08.332Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:210259,&quot;user_id&quot;:11803623,&quot;publication_id&quot;:56878,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:56878,&quot;name&quot;:&quot;Clouded Judgement&quot;,&quot;subdomain&quot;:&quot;cloudedjudgement&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Weekly data driven analysis of SaaS companies &quot;,&quot;logo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/f3a0e019-4ab6-4db8-b07a-5fe763dd3fab_669x669.png&quot;,&quot;author_id&quot;:11803623,&quot;primary_user_id&quot;:11803623,&quot;theme_var_background_pop&quot;:&quot;#d10000&quot;,&quot;created_at&quot;:&quot;2020-06-16T19:15:55.639Z&quot;,&quot;email_from_name&quot;:&quot;Clouded Judgement by Jamin Ball&quot;,&quot;copyright&quot;:&quot;Jamin 
Ball&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:null,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:null,&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://cloudedjudgement.substack.com/p/clouded-judgement-121925-the-front?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!UZpO!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3a0e019-4ab6-4db8-b07a-5fe763dd3fab_669x669.png" loading="lazy"><span class="embedded-post-publication-name">Clouded Judgement</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Clouded Judgement 12.19.25 - The System of Record's Front Door</div></div><div class="embedded-post-body">Every week I&#8217;ll provide updates on the latest trends in cloud software companies. 
Follow along to stay up to date&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">4 months ago &#183; 61 likes &#183; 10 comments &#183; Jamin Ball</div></a></div>]]></content:encoded></item><item><title><![CDATA[Data Engineering Weekly #254]]></title><description><![CDATA[The Weekly Data Engineering Newsletter]]></description><link>https://www.dataengineeringweekly.com/p/data-engineering-weekly-254</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/data-engineering-weekly-254</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Mon, 26 Jan 2026 04:12:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AdQk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6f51c1bf-abc9-4cd3-ad69-22cc8e3f1ef2_1080x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/events/best-practices-for-llm-dagster-development?utm_campaign=33680422-26-01-WBNR_DEEP_Dive_LLM_Dagster&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_llm_best_practices&amp;utm_content=01_25_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lFOo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d9b37f3-97c3-4a24-a589-1b1e6666710e_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!lFOo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d9b37f3-97c3-4a24-a589-1b1e6666710e_3840x2160.heic 848w, 
https://substackcdn.com/image/fetch/$s_!lFOo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d9b37f3-97c3-4a24-a589-1b1e6666710e_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!lFOo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d9b37f3-97c3-4a24-a589-1b1e6666710e_3840x2160.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lFOo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d9b37f3-97c3-4a24-a589-1b1e6666710e_3840x2160.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d9b37f3-97c3-4a24-a589-1b1e6666710e_3840x2160.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:26077,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/events/best-practices-for-llm-dagster-development?utm_campaign=33680422-26-01-WBNR_DEEP_Dive_LLM_Dagster&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_llm_best_practices&amp;utm_content=01_25_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185800885?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d9b37f3-97c3-4a24-a589-1b1e6666710e_3840x2160.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!lFOo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d9b37f3-97c3-4a24-a589-1b1e6666710e_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!lFOo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d9b37f3-97c3-4a24-a589-1b1e6666710e_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!lFOo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d9b37f3-97c3-4a24-a589-1b1e6666710e_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!lFOo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d9b37f3-97c3-4a24-a589-1b1e6666710e_3840x2160.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>LLMs are transforming software development, but integrating them into real projects can be tricky when models don&#8217;t understand your codebase, pipelines, or conventions.<br><br>Join Dagster on Tuesday, January 27th, for a practical look at data engineering best practices, common pitfalls, and live demos of LLM developments.</p><p><strong><a href="https://dagster.io/events/best-practices-for-llm-dagster-development?utm_campaign=33680422-26-01-WBNR_DEEP_Dive_LLM_Dagster&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_llm_best_practices&amp;utm_content=01_25_data_engineering_weekly">Register now</a></strong></p><div><hr></div><h1><em><strong>Debate Alert: Ontology vs Context Graphs vs Semantic Layers: What Will AI Need in 2026?</strong></em></h1><p><em>What are Context Graphs, and why are builders, VCs, and operators calling them the next $1T opportunity? Join The Great Data Debate to get answers to questions the data &amp; AI industry is so curious about right now:</em></p><ul><li><p><em>Where does context materialize in practice? 
Who owns context as meaning evolves?</em></p></li><li><p><em>Semantic layers, ontologies, context graphs - what should data teams build in 2026?</em></p></li><li><p><em>Where should that context live: in the warehouse, inside agents, or in a dedicated context layer?</em></p></li></ul><p><em>Join <strong>Bob Muglia</strong> (former CEO, Snowflake), <strong>Karthik Ravindran</strong> (GM, Microsoft), <strong>Tony Gentilcore</strong> (Co-founder, Glean), <strong>Prukalpa Sankar</strong> (Co-founder, Atlan), and <strong>Jaya Gupta</strong> (Foundation Capital) for an open discussion on what data teams should actually build next.</em></p><p><em><strong><a href="https://atlan.com/great-data-debate-2026/?utm_source=DEW+&amp;utm_medium=Substack&amp;utm_campaign=DEW_GDD">Register: Feb 5 &#183; Virtual &#183; 11 AM ET</a></strong></em></p><div><hr></div><h1>Mark Rittman: Why We&#8217;ve Tried to Replace Data Analytics Developers Every Decade Since 1974</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!orSS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F741b983c-5936-4259-a3b4-d7ad04eda04e_1360x768.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!orSS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F741b983c-5936-4259-a3b4-d7ad04eda04e_1360x768.heic 424w, https://substackcdn.com/image/fetch/$s_!orSS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F741b983c-5936-4259-a3b4-d7ad04eda04e_1360x768.heic 848w, 
https://substackcdn.com/image/fetch/$s_!orSS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F741b983c-5936-4259-a3b4-d7ad04eda04e_1360x768.heic 1272w, https://substackcdn.com/image/fetch/$s_!orSS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F741b983c-5936-4259-a3b4-d7ad04eda04e_1360x768.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!orSS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F741b983c-5936-4259-a3b4-d7ad04eda04e_1360x768.heic" width="1360" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/741b983c-5936-4259-a3b4-d7ad04eda04e_1360x768.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1360,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14087,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185800885?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F741b983c-5936-4259-a3b4-d7ad04eda04e_1360x768.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!orSS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F741b983c-5936-4259-a3b4-d7ad04eda04e_1360x768.heic 424w, 
https://substackcdn.com/image/fetch/$s_!orSS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F741b983c-5936-4259-a3b4-d7ad04eda04e_1360x768.heic 848w, https://substackcdn.com/image/fetch/$s_!orSS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F741b983c-5936-4259-a3b4-d7ad04eda04e_1360x768.heic 1272w, https://substackcdn.com/image/fetch/$s_!orSS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F741b983c-5936-4259-a3b4-d7ad04eda04e_1360x768.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><blockquote><p>Perhaps the recurring dream of replacing data analytics developers isn&#8217;t a mistake. Perhaps it&#8217;s a necessary optimism that drives tool creation.</p></blockquote><p>Developers are always hungry for the next big abstraction. The article is an excellent reminder that semantic layers didn&#8217;t make metric definition trivial, but they made it more maintainable. Self-service BI didn&#8217;t remove IT from the equation, but it expanded who could explore data.</p><p><strong><a href="https://blog.rittmananalytics.com/why-weve-tried-to-replace-data-analytics-developers-every-decade-since-1974-5c0de5a05088">https://blog.rittmananalytics.com/why-weve-tried-to-replace-data-analytics-developers-every-decade-since-1974-5c0de5a05088</a></strong></p><div><hr></div><h1>Alibaba: AI Trends Reshaping Data Engineering in 2026</h1><p>Alibaba writes that data engineering is evolving from data movement to building intelligent, autonomous systems in which data serves as a continuously learning capability. The field is being reshaped by unified data&#8211;AI platforms, self-healing operations, context engineering for AI agents, real-time and multimodal pipelines, and privacy-first approaches such as synthetic data and federated learning.</p><p><strong><a href="https://www.alibabacloud.com/blog/ai-trends-reshaping-data-engineering-in-2026_602816">https://www.alibabacloud.com/blog/ai-trends-reshaping-data-engineering-in-2026_602816</a></strong></p><div><hr></div><h1>Thoughtworks: The state of data mesh in 2026: From hype to hard-won maturity</h1><p>As with any scientific theory, an idea starts out half-finished and widely debated, then matures over time. Data Mesh and Data Contracts are two such ideas; though many companies have adopted them internally, they still have much room to mature. 
ThoughtWorks writes an excellent article about the current state of data mesh adoption. </p><p><strong><a href="https://www.thoughtworks.com/insights/blog/data-strategy/the-state-of-data-mesh-in-2026-from-hype-to-hard-won-maturity">https://www.thoughtworks.com/insights/blog/data-strategy/the-state-of-data-mesh-in-2026-from-hype-to-hard-won-maturity</a></strong></p><div><hr></div><h1>Sponsored: The AI Modernization Guide</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=01_25_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NwaO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2479aabb-0ed2-4aba-b72f-e6a77a8e19dc_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!NwaO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2479aabb-0ed2-4aba-b72f-e6a77a8e19dc_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!NwaO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2479aabb-0ed2-4aba-b72f-e6a77a8e19dc_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!NwaO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2479aabb-0ed2-4aba-b72f-e6a77a8e19dc_2400x1260.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!NwaO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2479aabb-0ed2-4aba-b72f-e6a77a8e19dc_2400x1260.heic" width="1456" height="764" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2479aabb-0ed2-4aba-b72f-e6a77a8e19dc_2400x1260.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15511,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=01_25_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185800885?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2479aabb-0ed2-4aba-b72f-e6a77a8e19dc_2400x1260.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NwaO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2479aabb-0ed2-4aba-b72f-e6a77a8e19dc_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!NwaO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2479aabb-0ed2-4aba-b72f-e6a77a8e19dc_2400x1260.heic 848w, 
https://substackcdn.com/image/fetch/$s_!NwaO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2479aabb-0ed2-4aba-b72f-e6a77a8e19dc_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!NwaO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2479aabb-0ed2-4aba-b72f-e6a77a8e19dc_2400x1260.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>While 75% of enterprises experiment with AI, traditional data platforms are becoming the biggest bottleneck. 
Learn how to build a unified control plane that enables AI-driven development, reduces pipeline failures, and cuts complexity.</p><p><br>- Transform from Big Complexity to AI-ready architecture<br>- Real metrics from organizations achieving 50% cost reductions<br>- Introduction to Dagster Components: YAML-first pipelines that AI can build</p><p><strong><a href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=01_25_data_engineering_weekly">Get the guide now</a></strong></p><div><hr></div><h1>Anthropic: Demystifying evals for AI agents</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fnpl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8092f781-174a-4c01-be40-bae48c63c054_3840x2161.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fnpl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8092f781-174a-4c01-be40-bae48c63c054_3840x2161.heic 424w, https://substackcdn.com/image/fetch/$s_!fnpl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8092f781-174a-4c01-be40-bae48c63c054_3840x2161.heic 848w, https://substackcdn.com/image/fetch/$s_!fnpl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8092f781-174a-4c01-be40-bae48c63c054_3840x2161.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!fnpl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8092f781-174a-4c01-be40-bae48c63c054_3840x2161.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fnpl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8092f781-174a-4c01-be40-bae48c63c054_3840x2161.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8092f781-174a-4c01-be40-bae48c63c054_3840x2161.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:32650,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185800885?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8092f781-174a-4c01-be40-bae48c63c054_3840x2161.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fnpl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8092f781-174a-4c01-be40-bae48c63c054_3840x2161.heic 424w, https://substackcdn.com/image/fetch/$s_!fnpl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8092f781-174a-4c01-be40-bae48c63c054_3840x2161.heic 848w, 
https://substackcdn.com/image/fetch/$s_!fnpl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8092f781-174a-4c01-be40-bae48c63c054_3840x2161.heic 1272w, https://substackcdn.com/image/fetch/$s_!fnpl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8092f781-174a-4c01-be40-bae48c63c054_3840x2161.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>AI agent development faces evaluation gaps as systems evolve from single-turn prompts to autonomous, multi-step tool use. 
The article defines agent evals around task outcomes rather than transcripts, using repeated trials, hybrid grading methods, and reliability metrics such as pass@k and pass^k to capture non-deterministic behavior. Starting with small, failure-driven task sets and layering automated evals with production monitoring and human review enables teams to iterate faster, adopt new models safely, and maintain consistent agent reliability.</p><p><strong><a href="https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents">https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents</a></strong></p><div><hr></div><h1>Booking.com: AI Agent Evaluation - practical tips at Booking.com</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eKVJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F257d9e6b-6fa3-440c-ab07-6accb2405bf7_1124x889.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eKVJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F257d9e6b-6fa3-440c-ab07-6accb2405bf7_1124x889.heic 424w, https://substackcdn.com/image/fetch/$s_!eKVJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F257d9e6b-6fa3-440c-ab07-6accb2405bf7_1124x889.heic 848w, https://substackcdn.com/image/fetch/$s_!eKVJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F257d9e6b-6fa3-440c-ab07-6accb2405bf7_1124x889.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!eKVJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F257d9e6b-6fa3-440c-ab07-6accb2405bf7_1124x889.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eKVJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F257d9e6b-6fa3-440c-ab07-6accb2405bf7_1124x889.heic" width="1124" height="889" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/257d9e6b-6fa3-440c-ab07-6accb2405bf7_1124x889.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:889,&quot;width&quot;:1124,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:26723,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185800885?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F257d9e6b-6fa3-440c-ab07-6accb2405bf7_1124x889.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eKVJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F257d9e6b-6fa3-440c-ab07-6accb2405bf7_1124x889.heic 424w, https://substackcdn.com/image/fetch/$s_!eKVJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F257d9e6b-6fa3-440c-ab07-6accb2405bf7_1124x889.heic 848w, 
https://substackcdn.com/image/fetch/$s_!eKVJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F257d9e6b-6fa3-440c-ab07-6accb2405bf7_1124x889.heic 1272w, https://substackcdn.com/image/fetch/$s_!eKVJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F257d9e6b-6fa3-440c-ab07-6accb2405bf7_1124x889.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Evaluating autonomous AI agents is challenging because simple prompt-based metrics fail to capture whether user intents are reliably 
fulfilled. The article presents a dual approach combining black-box evaluation focused on task completion using LLM-as-a-judge with glass-box evaluation that inspects tool selection, syntax, and intermediate agent decisions. Benchmarking agents against simpler baselines and measuring consistency across repeated queries helps teams justify added complexity, control costs, and assess production readiness.</p><p><strong><a href="https://booking.ai/ai-agent-evaluation-82e781439d97">https://booking.ai/ai-agent-evaluation-82e781439d97</a></strong></p><div><hr></div><h1>LinkedIn: Reimagining LinkedIn&#8217;s search tech stack</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cDF_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d32c84f-b93a-4142-a357-5ffe41156bcc_1200x744.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cDF_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d32c84f-b93a-4142-a357-5ffe41156bcc_1200x744.heic 424w, https://substackcdn.com/image/fetch/$s_!cDF_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d32c84f-b93a-4142-a357-5ffe41156bcc_1200x744.heic 848w, https://substackcdn.com/image/fetch/$s_!cDF_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d32c84f-b93a-4142-a357-5ffe41156bcc_1200x744.heic 1272w, https://substackcdn.com/image/fetch/$s_!cDF_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d32c84f-b93a-4142-a357-5ffe41156bcc_1200x744.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!cDF_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d32c84f-b93a-4142-a357-5ffe41156bcc_1200x744.heic" width="1200" height="744" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4d32c84f-b93a-4142-a357-5ffe41156bcc_1200x744.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:744,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:18561,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185800885?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d32c84f-b93a-4142-a357-5ffe41156bcc_1200x744.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cDF_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d32c84f-b93a-4142-a357-5ffe41156bcc_1200x744.heic 424w, https://substackcdn.com/image/fetch/$s_!cDF_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d32c84f-b93a-4142-a357-5ffe41156bcc_1200x744.heic 848w, https://substackcdn.com/image/fetch/$s_!cDF_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d32c84f-b93a-4142-a357-5ffe41156bcc_1200x744.heic 1272w, https://substackcdn.com/image/fetch/$s_!cDF_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d32c84f-b93a-4142-a357-5ffe41156bcc_1200x744.heic 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Traditional keyword-based search struggles to understand user intent, handle natural language queries, and bridge vocabulary gaps at scale. The article describes LinkedIn&#8217;s shift to a semantic search stack built on LLM-based query understanding, embedding-based retrieval, and small-language-model ranking, supported by LLM judges, model distillation, and continuous relevance measurement. 
</p><p><strong><a href="https://www.linkedin.com/blog/engineering/search/reimagining-linkedins-search-stack">https://www.linkedin.com/blog/engineering/search/reimagining-linkedins-search-stack</a></strong></p><div><hr></div><h1>Stefan Kecskes: Kafka Dead Letter Queue Triage: Debugging 25,000 Failed Messages</h1><p>The DLQ pattern in event processing is well known and fairly widely adopted. What gets far less attention, and what I&#8217;m delighted to read about here, is what happens after a message first enters a failed state: do you discard it or fix it? The author provides practical tips for analyzing DLQ messages, along with cautions and resiliency practices to apply before rebroadcasting them.</p><p><strong><a href="https://skey.uk/post/kafka-dead-letter-queue-troubleshooting-guide/">https://skey.uk/post/kafka-dead-letter-queue-troubleshooting-guide/</a></strong></p><div><hr></div><h1>Teads: The End of the Dashboard as We Know It: Designing for Insight in the Age of AI</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!M-3h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5c7f9a-2513-424e-bd74-51154a6e7d9c_1200x800.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!M-3h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5c7f9a-2513-424e-bd74-51154a6e7d9c_1200x800.heic 424w, https://substackcdn.com/image/fetch/$s_!M-3h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5c7f9a-2513-424e-bd74-51154a6e7d9c_1200x800.heic 848w, 
https://substackcdn.com/image/fetch/$s_!M-3h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5c7f9a-2513-424e-bd74-51154a6e7d9c_1200x800.heic 1272w, https://substackcdn.com/image/fetch/$s_!M-3h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5c7f9a-2513-424e-bd74-51154a6e7d9c_1200x800.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!M-3h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5c7f9a-2513-424e-bd74-51154a6e7d9c_1200x800.heic" width="1200" height="800" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8c5c7f9a-2513-424e-bd74-51154a6e7d9c_1200x800.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:17429,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185800885?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5c7f9a-2513-424e-bd74-51154a6e7d9c_1200x800.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!M-3h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5c7f9a-2513-424e-bd74-51154a6e7d9c_1200x800.heic 424w, 
https://substackcdn.com/image/fetch/$s_!M-3h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5c7f9a-2513-424e-bd74-51154a6e7d9c_1200x800.heic 848w, https://substackcdn.com/image/fetch/$s_!M-3h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5c7f9a-2513-424e-bd74-51154a6e7d9c_1200x800.heic 1272w, https://substackcdn.com/image/fetch/$s_!M-3h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c5c7f9a-2513-424e-bd74-51154a6e7d9c_1200x800.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Static dashboards fail to support decision-making because they require users to interpret large volumes of data manually. The article argues for AI-driven dashboards that act as proactive, conversational assistants by focusing on scenarios, personalized insights, and transparent recommendations rather than fixed screens. Measuring success by decisions enabled instead of data displayed reframes dashboards as adaptive systems that guide action and improve human&#8211;AI collaboration over time.</p><p><strong><a href="https://medium.com/teads-engineering/the-end-of-the-dashboard-as-we-know-it-designing-for-insight-in-the-age-of-ai-fec16bddf677">https://medium.com/teads-engineering/the-end-of-the-dashboard-as-we-know-it-designing-for-insight-in-the-age-of-ai-fec16bddf677</a></strong></p><div><hr></div><h1>Flipkart: High-Risk, High-Scale: Guaranteeing Ad Budget Precision at 1 Million Events/Second</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!okI5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F911a9812-5600-45ad-95e0-c3b88fb213d3_1024x454.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!okI5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F911a9812-5600-45ad-95e0-c3b88fb213d3_1024x454.heic 424w, https://substackcdn.com/image/fetch/$s_!okI5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F911a9812-5600-45ad-95e0-c3b88fb213d3_1024x454.heic 848w, 
https://substackcdn.com/image/fetch/$s_!okI5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F911a9812-5600-45ad-95e0-c3b88fb213d3_1024x454.heic 1272w, https://substackcdn.com/image/fetch/$s_!okI5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F911a9812-5600-45ad-95e0-c3b88fb213d3_1024x454.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!okI5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F911a9812-5600-45ad-95e0-c3b88fb213d3_1024x454.heic" width="1024" height="454" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/911a9812-5600-45ad-95e0-c3b88fb213d3_1024x454.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:454,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14835,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185800885?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F911a9812-5600-45ad-95e0-c3b88fb213d3_1024x454.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!okI5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F911a9812-5600-45ad-95e0-c3b88fb213d3_1024x454.heic 424w, 
https://substackcdn.com/image/fetch/$s_!okI5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F911a9812-5600-45ad-95e0-c3b88fb213d3_1024x454.heic 848w, https://substackcdn.com/image/fetch/$s_!okI5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F911a9812-5600-45ad-95e0-c3b88fb213d3_1024x454.heic 1272w, https://substackcdn.com/image/fetch/$s_!okI5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F911a9812-5600-45ad-95e0-c3b88fb213d3_1024x454.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Real-time ad budget enforcement at scale is risky: latency can cause advertiser overspend and revenue loss, and delayed mobile events can be miscounted. The article describes Flipkart Ads&#8217; architecture, which separates real-time enforcement from batch settlement using Apache Flink, stateful deduplication with RocksDB, and event-time processing with watermarking. Prioritizing availability over strict consistency enables the system to process nearly one million events per second while ensuring accurate budget capping and final financial reconciliation.</p><p><strong><a href="https://blog.flipkart.tech/high-risk-high-scale-guaranteeing-ad-budget-precision-at-1-million-events-second-cc23977796d7">https://blog.flipkart.tech/high-risk-high-scale-guaranteeing-ad-budget-precision-at-1-million-events-second-cc23977796d7</a></strong></p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. 
All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[Data Contracts: A Missed Opportunity]]></title><description><![CDATA[The Conversation We Should Have Had&#8212;Before Thought Leadership Replaced System Design]]></description><link>https://www.dataengineeringweekly.com/p/data-contracts-a-missed-opportunity</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/data-contracts-a-missed-opportunity</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Tue, 20 Jan 2026 18:31:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!1ja9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe050308d-37ed-4cde-8b4e-ce1dfb8c784e_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Over the last couple of years, the data industry has been having a conversation about data contracts that never quite went where it needed to go.</p><p>There was no shortage of activity around the topic. Definitions were proposed and refined. Conceptual boundaries were drawn and redrawn. Data contracts were compared to APIs, governance frameworks, data mesh primitives, and ideas that teams had already &#8220;sort of&#8221; implemented in practice.</p><blockquote><p><em>The discussion was energetic and well-intentioned, but it tended to stay at the level of classification rather than construction.</em></p></blockquote><p>What was largely absent was sustained engagement with the engineering consequences of taking data contracts seriously. Questions about enforcement, evolution, compatibility, and failure modes appeared only briefly before the conversation moved on. 
The result was an industry consensus that data contracts were &#8220;interesting,&#8221; without a shared understanding of what it would actually mean to build platforms around them.</p><p>In hindsight, this matters&#8212;not because the debate was unproductive, but because of what happened in parallel.</p><div><hr></div><h2><strong>The Shift Happening Elsewhere</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7mSh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dcd1ddb-f6d6-4190-a234-23cc6febd34d_2048x1126.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7mSh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dcd1ddb-f6d6-4190-a234-23cc6febd34d_2048x1126.png 424w, https://substackcdn.com/image/fetch/$s_!7mSh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dcd1ddb-f6d6-4190-a234-23cc6febd34d_2048x1126.png 848w, https://substackcdn.com/image/fetch/$s_!7mSh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dcd1ddb-f6d6-4190-a234-23cc6febd34d_2048x1126.png 1272w, https://substackcdn.com/image/fetch/$s_!7mSh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dcd1ddb-f6d6-4190-a234-23cc6febd34d_2048x1126.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7mSh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dcd1ddb-f6d6-4190-a234-23cc6febd34d_2048x1126.png" width="1456" height="801" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3dcd1ddb-f6d6-4190-a234-23cc6febd34d_2048x1126.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:801,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7mSh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dcd1ddb-f6d6-4190-a234-23cc6febd34d_2048x1126.png 424w, https://substackcdn.com/image/fetch/$s_!7mSh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dcd1ddb-f6d6-4190-a234-23cc6febd34d_2048x1126.png 848w, https://substackcdn.com/image/fetch/$s_!7mSh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dcd1ddb-f6d6-4190-a234-23cc6febd34d_2048x1126.png 1272w, https://substackcdn.com/image/fetch/$s_!7mSh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dcd1ddb-f6d6-4190-a234-23cc6febd34d_2048x1126.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>While the data community was debating what data contracts <em>were</em>, the software engineering world was converging&#8212;quietly and pragmatically&#8212;on a different organizing principle: <strong>specifications as the primary unit of system design</strong>.</p><p>This wasn&#8217;t a philosophical shift so much as an operational one. As systems became more distributed, more automated, and more interdependent, informal agreements stopped scaling. Documentation drifted. Assumptions diverged. Human coordination became the bottleneck.</p><p>The response was not more process, but more precision.</p><p>APIs began with schemas rather than code. Infrastructure moved from scripts to declarative specifications. Compatibility rules were automatically encoded and enforced. In these systems, the specification was no longer an artifact produced alongside the system&#8212;it <em>was</em> the system.</p><p>More recently, AI agents have accelerated this trend. 
Agents do not operate on intent, convention, or context. They operate on explicit, machine-readable, and verifiable data. Where specifications exist, agents can reason deterministically. Where they do not, agents approximate&#8212;and approximation is rarely acceptable in core infrastructure.</p><p>This is where the connection to data contracts becomes unavoidable.</p><div><hr></div><h2><strong>Data Contracts as Specifications, Not Concepts</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1ja9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe050308d-37ed-4cde-8b4e-ce1dfb8c784e_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1ja9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe050308d-37ed-4cde-8b4e-ce1dfb8c784e_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!1ja9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe050308d-37ed-4cde-8b4e-ce1dfb8c784e_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!1ja9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe050308d-37ed-4cde-8b4e-ce1dfb8c784e_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!1ja9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe050308d-37ed-4cde-8b4e-ce1dfb8c784e_1536x1024.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!1ja9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe050308d-37ed-4cde-8b4e-ce1dfb8c784e_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e050308d-37ed-4cde-8b4e-ce1dfb8c784e_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1ja9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe050308d-37ed-4cde-8b4e-ce1dfb8c784e_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!1ja9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe050308d-37ed-4cde-8b4e-ce1dfb8c784e_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!1ja9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe050308d-37ed-4cde-8b4e-ce1dfb8c784e_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!1ja9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe050308d-37ed-4cde-8b4e-ce1dfb8c784e_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Viewed through the lens of spec-driven development, data contracts stop looking like a data-specific innovation and become a familiar pattern applied to a different domain.</p><p>A properly implemented data contract is a specification:</p><ul><li><p>It defines structure, semantics, and invariants.</p></li><li><p>It establishes compatibility guarantees over time.</p></li><li><p>It is versioned, validated, and enforced programmatically.</p></li><li><p>It creates a stable interface between independently evolving systems.</p></li></ul><p>This is exactly what spec-driven development has optimized for in software engineering.</p><p>The difference is not conceptual&#8212;it is operational. Software engineering treated specifications as executable constraints. 
The data industry often treated contracts as descriptive artifacts. As a result, contracts were discussed as governance tools or communication mechanisms, rather than as interfaces with failure semantics.</p><p>That framing limited how far the idea could go.</p><div><hr></div><h2><strong>Where the Two Worlds Diverged</strong></h2><p>Spec-driven systems force early clarity around hard problems.</p><ul><li><p>How does change propagate?</p></li><li><p>What is allowed to evolve independently?</p></li><li><p>What breaks compatibility, and how is that detected?</p></li><li><p>Where does enforcement occur, and what happens when it fails?</p></li></ul><p>In software systems, these questions are answered in code and tooling. In data systems, they were often answered socially. Producer&#8211;consumer agreements existed, but they lived in tickets, meetings, and tribal knowledge rather than in executable form.</p><p>Many teams compensated by building partial solutions: strong schemas, upstream quality checks, and informal SLAs. These patterns worked, but they relied heavily on human intervention. They were resilient, but not legible to machines.</p><p>As long as humans were the primary integrators, this was manageable. As soon as AI agents enter the workflow, it becomes a constraint.</p><div><hr></div><h2><strong>Why Spec-Driven Thinking Changes the Next Phase</strong></h2><p>AI agents make an implicit demand of data platforms: <strong>make your rules explicit</strong>.</p><p>Agents can generate schemas, propose transformations, reason about compatibility, and enforce policy&#8212;but only if the platform exposes contracts in a form they can execute against. Without that, agents revert to inference, which introduces uncertainty precisely where determinism is required.</p><p>This is the practical implication of spec-driven development for data engineering. It&#8217;s not about adopting a new paradigm. 
It&#8217;s about recognizing that the platform already behaves like a system of interfaces&#8212;and formalizing those interfaces accordingly.</p><p>Teams that have already internalized contract discipline will find this transition incremental. Teams that have not will experience it as friction.</p><div><hr></div><h2><strong>What We Should Do Next</strong></h2><p>At this point, the terminology matters less than the mechanics.</p><p>Whether we call them data contracts, data interfaces, or executable schemas, the path forward is the same:</p><ul><li><p>Treat schemas as specifications, not documentation.</p></li><li><p>Encode quality, semantics, and compatibility as executable rules.</p></li><li><p>Enforce contracts at clear system boundaries, preferably early.</p></li><li><p>Version data interfaces with the same rigor as APIs.</p></li><li><p>Make ownership and accountability explicit and machine-readable.</p></li></ul><p>This is not about adding process. It is about making systems legible to other systems.</p><div><hr></div><h2><strong>Closing Thought</strong></h2><p>The original data contracts conversation wasn&#8217;t wrong. It just stopped too early.</p><p>Spec-driven development has shown that explicit, enforceable interfaces are not optional in complex, automated systems. 
Data platforms are now at that same inflection point.</p><p>Data contracts were never the destination.</p><p>They were the missing layer that would have made everything else easier to build.</p><p>The opportunity is still there&#8212;but only if we&#8217;re willing to treat contracts as infrastructure, not ideas.</p><h2><strong>References</strong></h2><p><strong><a href="https://www.thoughtworks.com/insights/podcasts/technology-podcasts/data-contracts-what-why">https://www.thoughtworks.com/insights/podcasts/technology-podcasts/data-contracts-what-why</a></strong></p><p><strong><a href="https://airbyte.com/data-engineering-resources/data-contracts">https://airbyte.com/data-engineering-resources/data-contracts</a></strong></p><p><strong><a href="https://soda.io/blog/what-are-data-contracts">https://soda.io/blog/what-are-data-contracts</a></strong></p><p><strong><a href="https://atlan.com/data-contracts/">https://atlan.com/data-contracts/</a></strong></p><p><strong><a href="https://en.wikipedia.org/wiki/Spec-driven_development">https://en.wikipedia.org/wiki/Spec-driven_development</a></strong></p><p><strong><a href="https://medium.com/software-architecture-in-the-age-of-ai/why-interfaces-and-contracts-are-not-the-same-and-why-that-matters-with-10-examples-408524f6d17c">https://medium.com/software-architecture-in-the-age-of-ai/why-interfaces-and-contracts-are-not-the-same-and-why-that-matters-with-10-examples-408524f6d17c</a></strong></p><p><strong><a href="https://arxiv.org/abs/2507.21056">https://arxiv.org/abs/2507.21056</a></strong></p>]]></content:encoded></item><item><title><![CDATA[Data Engineering Weekly #253]]></title><description><![CDATA[The Weekly Data Engineering Newsletter]]></description><link>https://www.dataengineeringweekly.com/p/data-engineering-weekly-253</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/data-engineering-weekly-253</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Mon, 19 Jan 2026 05:20:35 
GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AdQk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6f51c1bf-abc9-4cd3-ad69-22cc8e3f1ef2_1080x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=01_18_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eded!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ce3b171-dc97-4fbf-90ec-9af339e459bd_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!eded!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ce3b171-dc97-4fbf-90ec-9af339e459bd_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!eded!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ce3b171-dc97-4fbf-90ec-9af339e459bd_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!eded!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ce3b171-dc97-4fbf-90ec-9af339e459bd_2400x1260.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eded!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ce3b171-dc97-4fbf-90ec-9af339e459bd_2400x1260.heic" width="1456" height="764" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ce3b171-dc97-4fbf-90ec-9af339e459bd_2400x1260.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:28626,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=01_18_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185026992?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ce3b171-dc97-4fbf-90ec-9af339e459bd_2400x1260.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eded!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ce3b171-dc97-4fbf-90ec-9af339e459bd_2400x1260.heic 424w, https://substackcdn.com/image/fetch/$s_!eded!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ce3b171-dc97-4fbf-90ec-9af339e459bd_2400x1260.heic 848w, https://substackcdn.com/image/fetch/$s_!eded!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ce3b171-dc97-4fbf-90ec-9af339e459bd_2400x1260.heic 1272w, https://substackcdn.com/image/fetch/$s_!eded!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ce3b171-dc97-4fbf-90ec-9af339e459bd_2400x1260.heic 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h1>Modernize your data platform for the age of AI.</h1><p>While 75% of enterprises experiment with AI, traditional data platforms are becoming the biggest bottleneck. 
Learn how to build a unified control plane that enables AI-driven development, reduces pipeline failures, and cuts complexity.<br><br>- Transform from Big Complexity to AI-ready architecture<br>- Real metrics from organizations achieving 50% cost reductions<br>- Introduction to Dagster Components: YAML-first pipelines that AI can build</p><p><strong><a href="https://dagster.io/ai-modernization-guide?utm_campaign=34120047-26-01-DMND_AI%20Modernization&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=ai_modernization_guide&amp;utm_content=01_18_data_engineering_weekly">Get the guide</a></strong></p><div><hr></div><h1>Lance Martin: Effective Agent Design</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rHCD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c2b5850-64d1-4b82-8c56-dea594363247_1200x571.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rHCD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c2b5850-64d1-4b82-8c56-dea594363247_1200x571.heic 424w, https://substackcdn.com/image/fetch/$s_!rHCD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c2b5850-64d1-4b82-8c56-dea594363247_1200x571.heic 848w, https://substackcdn.com/image/fetch/$s_!rHCD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c2b5850-64d1-4b82-8c56-dea594363247_1200x571.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!rHCD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c2b5850-64d1-4b82-8c56-dea594363247_1200x571.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rHCD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c2b5850-64d1-4b82-8c56-dea594363247_1200x571.heic" width="1200" height="571" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8c2b5850-64d1-4b82-8c56-dea594363247_1200x571.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:571,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:13081,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185026992?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c2b5850-64d1-4b82-8c56-dea594363247_1200x571.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rHCD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c2b5850-64d1-4b82-8c56-dea594363247_1200x571.heic 424w, https://substackcdn.com/image/fetch/$s_!rHCD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c2b5850-64d1-4b82-8c56-dea594363247_1200x571.heic 848w, 
https://substackcdn.com/image/fetch/$s_!rHCD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c2b5850-64d1-4b82-8c56-dea594363247_1200x571.heic 1272w, https://substackcdn.com/image/fetch/$s_!rHCD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c2b5850-64d1-4b82-8c56-dea594363247_1200x571.heic 1456w" sizes="100vw"></picture></div></a></figure></div><p>An effective agent design largely boils down to context management. 
The author proposes design patterns to effectively build an agent, including providing filesystem and shell access to the agents, using a multi-layer action space, and offloading memory to a filesystem rather than keeping everything in the context window.</p><p><strong><a href="https://x.com/RLanceMartin/status/2009683038272401719">https://x.com/RLanceMartin/status/2009683038272401719</a></strong></p><div><hr></div><h1>M&#233;d&#233;ric Hurier: Architecting the AI Agent Platform: A Definitive Guide</h1><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4KVN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb59d3136-ec03-40ba-acf0-995d2177c2fb_8970x1550.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4KVN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb59d3136-ec03-40ba-acf0-995d2177c2fb_8970x1550.heic 424w, https://substackcdn.com/image/fetch/$s_!4KVN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb59d3136-ec03-40ba-acf0-995d2177c2fb_8970x1550.heic 848w, https://substackcdn.com/image/fetch/$s_!4KVN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb59d3136-ec03-40ba-acf0-995d2177c2fb_8970x1550.heic 1272w, https://substackcdn.com/image/fetch/$s_!4KVN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb59d3136-ec03-40ba-acf0-995d2177c2fb_8970x1550.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!4KVN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb59d3136-ec03-40ba-acf0-995d2177c2fb_8970x1550.heic" width="1456" height="252" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b59d3136-ec03-40ba-acf0-995d2177c2fb_8970x1550.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:252,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42843,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185026992?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb59d3136-ec03-40ba-acf0-995d2177c2fb_8970x1550.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4KVN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb59d3136-ec03-40ba-acf0-995d2177c2fb_8970x1550.heic 424w, https://substackcdn.com/image/fetch/$s_!4KVN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb59d3136-ec03-40ba-acf0-995d2177c2fb_8970x1550.heic 848w, https://substackcdn.com/image/fetch/$s_!4KVN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb59d3136-ec03-40ba-acf0-995d2177c2fb_8970x1550.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!4KVN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb59d3136-ec03-40ba-acf0-995d2177c2fb_8970x1550.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The industry is shifting from simple LLMs and RAG (Retrieval-Augmented Generation) to AI agents. The author proposes a 7-layer logical container architecture that organizes an AI agent platform into seven logical levels (Interaction, Development, Core, Foundation, Information, Observability, and Trust) to manage the complexity of building production-grade systems. The structure enforces a separation of concerns, ensuring that user interfaces, execution engines, data management, and security governance are handled independently yet cohesively.</p><p><strong><a href="https://mlops.community/architecting-the-ai-agent-platform-a-definitive-guide/">https://mlops.community/architecting-the-ai-agent-platform-a-definitive-guide/</a></strong></p><div><hr></div><h1>Tidepool: Stop using natural language interfaces</h1><p>The user experience of chatbot-driven enterprise applications is taking center stage in product design. The author argues that pure natural language interfaces are inefficient due to the high latency of LLMs (often taking tens of seconds to respond). 
Instead, the author proposes a hybrid approach in which the LLM dynamically generates structured Graphical User Interfaces (GUIs), such as popups with checkboxes, sliders, and forms, to interact with the user.</p><p><strong><a href="https://tidepool.leaflet.pub/3mcbegnuf2k2i">https://tidepool.leaflet.pub/3mcbegnuf2k2i</a></strong></p><div><hr></div><h1>Sponsored: Best practices for LLM development</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/events/best-practices-for-llm-dagster-development?utm_campaign=33680422-26-01-WBNR_DEEP_Dive_LLM_Dagster&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_llm_best_practices&amp;utm_content=01_18_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cl4f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed55042-919c-421e-90a6-b2f541a8f8a4_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!cl4f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed55042-919c-421e-90a6-b2f541a8f8a4_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!cl4f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed55042-919c-421e-90a6-b2f541a8f8a4_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!cl4f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed55042-919c-421e-90a6-b2f541a8f8a4_3840x2160.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!cl4f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed55042-919c-421e-90a6-b2f541a8f8a4_3840x2160.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8ed55042-919c-421e-90a6-b2f541a8f8a4_3840x2160.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:28114,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/events/best-practices-for-llm-dagster-development?utm_campaign=33680422-26-01-WBNR_DEEP_Dive_LLM_Dagster&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_llm_best_practices&amp;utm_content=01_18_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185026992?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed55042-919c-421e-90a6-b2f541a8f8a4_3840x2160.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cl4f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed55042-919c-421e-90a6-b2f541a8f8a4_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!cl4f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed55042-919c-421e-90a6-b2f541a8f8a4_3840x2160.heic 848w, 
https://substackcdn.com/image/fetch/$s_!cl4f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed55042-919c-421e-90a6-b2f541a8f8a4_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!cl4f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed55042-919c-421e-90a6-b2f541a8f8a4_3840x2160.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>LLMs are transforming software development, but integrating them into real projects can be tricky when models don&#8217;t 
understand your codebase, pipelines, or conventions.<br><br><strong><a href="https://dagster.io/events/best-practices-for-llm-dagster-development?utm_campaign=33680422-26-01-WBNR_DEEP_Dive_LLM_Dagster&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_llm_best_practices&amp;utm_content=01_18_data_engineering_weekly">Join Dagster on January 27th</a></strong> for a practical look at data engineering best practices, common pitfalls, and live demos of LLM developments.</p><p><strong><a href="https://dagster.io/events/best-practices-for-llm-dagster-development?utm_campaign=33680422-26-01-WBNR_DEEP_Dive_LLM_Dagster&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_llm_best_practices&amp;utm_content=01_18_data_engineering_weekly">Save your spot</a></strong></p><div><hr></div><h1>Microsoft: SQL Telemetry &amp; Intelligence &#8211; How we built a Petabyte-scale Data Platform with Fabric</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Hbl4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81949dc-2722-4949-bc2d-5b1ebe2b3f18_4725x2658.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Hbl4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81949dc-2722-4949-bc2d-5b1ebe2b3f18_4725x2658.heic 424w, https://substackcdn.com/image/fetch/$s_!Hbl4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81949dc-2722-4949-bc2d-5b1ebe2b3f18_4725x2658.heic 848w, 
https://substackcdn.com/image/fetch/$s_!Hbl4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81949dc-2722-4949-bc2d-5b1ebe2b3f18_4725x2658.heic 1272w, https://substackcdn.com/image/fetch/$s_!Hbl4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81949dc-2722-4949-bc2d-5b1ebe2b3f18_4725x2658.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Hbl4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81949dc-2722-4949-bc2d-5b1ebe2b3f18_4725x2658.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c81949dc-2722-4949-bc2d-5b1ebe2b3f18_4725x2658.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:56106,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185026992?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81949dc-2722-4949-bc2d-5b1ebe2b3f18_4725x2658.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Hbl4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81949dc-2722-4949-bc2d-5b1ebe2b3f18_4725x2658.heic 424w, 
https://substackcdn.com/image/fetch/$s_!Hbl4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81949dc-2722-4949-bc2d-5b1ebe2b3f18_4725x2658.heic 848w, https://substackcdn.com/image/fetch/$s_!Hbl4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81949dc-2722-4949-bc2d-5b1ebe2b3f18_4725x2658.heic 1272w, https://substackcdn.com/image/fetch/$s_!Hbl4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81949dc-2722-4949-bc2d-5b1ebe2b3f18_4725x2658.heic 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Microsoft writes about how the SQL Telemetry &amp; Intelligence (T&amp;I) team built a 10+ petabyte data lake using Microsoft Fabric, processing real-time data from global SQL Server engines. The focus on CI/CD pipelines, testing optimization, local development, and data quality &amp; observability makes this an interesting systems read.</p><p><strong><a href="https://blog.fabric.microsoft.com/en-us/blog/sql-telemetry-intelligence-how-we-built-a-petabyte-scale-data-platform-with-fabric">https://blog.fabric.microsoft.com/en-us/blog/sql-telemetry-intelligence-how-we-built-a-petabyte-scale-data-platform-with-fabric</a></strong></p><div><hr></div><h1>Vikram Sreekanti &amp; Joseph E. Gonzalez: Data is your only moat</h1><p>The ease of adopting a tool enables data collection, which in turn creates a defensive advantage that is hard for competitors to replicate. The authors make a solid argument that, for enterprise applications, the moat isn&#8217;t just about volume but about specificity. By deeply integrating with a company&#8217;s legacy systems, a product gathers data on exactly how that specific customer works. 
This creates &#8220;stickiness&#8221;&#8212;replacing the tool becomes difficult because a new competitor wouldn&#8217;t have that accumulated knowledge of the company&#8217;s unique workflows.</p><p><strong><a href="https://frontierai.substack.com/p/data-is-your-only-moat">https://frontierai.substack.com/p/data-is-your-only-moat</a></strong></p><div><hr></div><h1>Uber: Apache Hudi&#8482; at Uber: Engineering for Trillion-Record-Scale Data Lake Operations</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LlAO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fcb20a2-d87f-423d-ad39-8947486fab85_960x540.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LlAO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fcb20a2-d87f-423d-ad39-8947486fab85_960x540.heic 424w, https://substackcdn.com/image/fetch/$s_!LlAO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fcb20a2-d87f-423d-ad39-8947486fab85_960x540.heic 848w, https://substackcdn.com/image/fetch/$s_!LlAO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fcb20a2-d87f-423d-ad39-8947486fab85_960x540.heic 1272w, https://substackcdn.com/image/fetch/$s_!LlAO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fcb20a2-d87f-423d-ad39-8947486fab85_960x540.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!LlAO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fcb20a2-d87f-423d-ad39-8947486fab85_960x540.heic" width="960" height="540" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4fcb20a2-d87f-423d-ad39-8947486fab85_960x540.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:540,&quot;width&quot;:960,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12278,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185026992?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fcb20a2-d87f-423d-ad39-8947486fab85_960x540.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LlAO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fcb20a2-d87f-423d-ad39-8947486fab85_960x540.heic 424w, https://substackcdn.com/image/fetch/$s_!LlAO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fcb20a2-d87f-423d-ad39-8947486fab85_960x540.heic 848w, https://substackcdn.com/image/fetch/$s_!LlAO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fcb20a2-d87f-423d-ad39-8947486fab85_960x540.heic 1272w, https://substackcdn.com/image/fetch/$s_!LlAO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4fcb20a2-d87f-423d-ad39-8947486fab85_960x540.heic 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Uber writes about how critical Apache Hudi is to its overall data lake operations, enabling ingestion at trillion-record scale. Uber highlights the addition of record indexes, which enable O(1) record lookups and allow efficient updates on tables with hundreds of billions of rows. Personally, I think this is a pretty cool feature of Apache Hudi. 
</p><p><strong><a href="https://www.uber.com/en-IN/blog/apache-hudi-at-uber/">https://www.uber.com/en-IN/blog/apache-hudi-at-uber/</a></strong></p><div><hr></div><h1>Etsy: How Etsy Uses LLMs to Improve Search Relevance</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lLNb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b3256-d9b0-4bd3-bf3b-3b30f25c4500_720x402.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lLNb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b3256-d9b0-4bd3-bf3b-3b30f25c4500_720x402.heic 424w, https://substackcdn.com/image/fetch/$s_!lLNb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b3256-d9b0-4bd3-bf3b-3b30f25c4500_720x402.heic 848w, https://substackcdn.com/image/fetch/$s_!lLNb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b3256-d9b0-4bd3-bf3b-3b30f25c4500_720x402.heic 1272w, https://substackcdn.com/image/fetch/$s_!lLNb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b3256-d9b0-4bd3-bf3b-3b30f25c4500_720x402.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lLNb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b3256-d9b0-4bd3-bf3b-3b30f25c4500_720x402.heic" width="720" height="402" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c2b3256-d9b0-4bd3-bf3b-3b30f25c4500_720x402.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:402,&quot;width&quot;:720,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:10051,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185026992?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b3256-d9b0-4bd3-bf3b-3b30f25c4500_720x402.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lLNb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b3256-d9b0-4bd3-bf3b-3b30f25c4500_720x402.heic 424w, https://substackcdn.com/image/fetch/$s_!lLNb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b3256-d9b0-4bd3-bf3b-3b30f25c4500_720x402.heic 848w, https://substackcdn.com/image/fetch/$s_!lLNb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b3256-d9b0-4bd3-bf3b-3b30f25c4500_720x402.heic 1272w, https://substackcdn.com/image/fetch/$s_!lLNb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c2b3256-d9b0-4bd3-bf3b-3b30f25c4500_720x402.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Etsy writes about upgrading its search capabilities by using LLMs to focus on semantic relevance, which prioritizes understanding a buyer's true intent over simple click data. Etsy uses high-quality human and LLM annotations to train a lightweight "student" model that runs in real time. 
This model actively filters and ranks search results, successfully increasing the percentage of fully relevant listings shown to shoppers.</p><p><strong><a href="https://www.etsy.com/codeascraft/how-etsy-uses-llms-to-improve-search-relevance">https://www.etsy.com/codeascraft/how-etsy-uses-llms-to-improve-search-relevance</a></strong></p><div><hr></div><h1>AWS: How Slack achieved operational excellence for Spark on Amazon EMR using generative AI</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_jjA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32fefe3e-34cd-4a20-99f0-948768a21011_1370x492.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_jjA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32fefe3e-34cd-4a20-99f0-948768a21011_1370x492.heic 424w, https://substackcdn.com/image/fetch/$s_!_jjA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32fefe3e-34cd-4a20-99f0-948768a21011_1370x492.heic 848w, https://substackcdn.com/image/fetch/$s_!_jjA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32fefe3e-34cd-4a20-99f0-948768a21011_1370x492.heic 1272w, https://substackcdn.com/image/fetch/$s_!_jjA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32fefe3e-34cd-4a20-99f0-948768a21011_1370x492.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!_jjA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32fefe3e-34cd-4a20-99f0-948768a21011_1370x492.heic" width="1370" height="492" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/32fefe3e-34cd-4a20-99f0-948768a21011_1370x492.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:492,&quot;width&quot;:1370,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7476,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185026992?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32fefe3e-34cd-4a20-99f0-948768a21011_1370x492.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_jjA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32fefe3e-34cd-4a20-99f0-948768a21011_1370x492.heic 424w, https://substackcdn.com/image/fetch/$s_!_jjA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32fefe3e-34cd-4a20-99f0-948768a21011_1370x492.heic 848w, https://substackcdn.com/image/fetch/$s_!_jjA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32fefe3e-34cd-4a20-99f0-948768a21011_1370x492.heic 1272w, https://substackcdn.com/image/fetch/$s_!_jjA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32fefe3e-34cd-4a20-99f0-948768a21011_1370x492.heic 
1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Slack writes about reaching operational excellence by replacing manual debugging with a custom monitoring framework that captures over 40 granular metrics from its EMR clusters. Slack exposed this data to generative AI models via Amazon Bedrock and a Model Context Protocol (MCP) server, enabling tools like Claude Code to analyze performance and suggest optimal configurations automatically. 
This automated system reduced compute costs by 30&#8211;50% and slashed developers' time spent tuning jobs by over 90%.</p><p><strong><a href="https://aws.amazon.com/blogs/big-data/how-slack-achieved-operational-excellence-for-spark-on-amazon-emr-using-generative-ai/">https://aws.amazon.com/blogs/big-data/how-slack-achieved-operational-excellence-for-spark-on-amazon-emr-using-generative-ai/</a></strong></p><div><hr></div><h1>Agoda: How Agoda Enhanced the Uptime and Consistency of Financial Metrics</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!t0gz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0ed9f5-1723-4a3e-867d-54eac6ba5f82_1400x813.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!t0gz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0ed9f5-1723-4a3e-867d-54eac6ba5f82_1400x813.heic 424w, https://substackcdn.com/image/fetch/$s_!t0gz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0ed9f5-1723-4a3e-867d-54eac6ba5f82_1400x813.heic 848w, https://substackcdn.com/image/fetch/$s_!t0gz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0ed9f5-1723-4a3e-867d-54eac6ba5f82_1400x813.heic 1272w, https://substackcdn.com/image/fetch/$s_!t0gz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0ed9f5-1723-4a3e-867d-54eac6ba5f82_1400x813.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!t0gz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0ed9f5-1723-4a3e-867d-54eac6ba5f82_1400x813.heic" width="1400" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3c0ed9f5-1723-4a3e-867d-54eac6ba5f82_1400x813.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14271,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/185026992?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0ed9f5-1723-4a3e-867d-54eac6ba5f82_1400x813.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!t0gz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0ed9f5-1723-4a3e-867d-54eac6ba5f82_1400x813.heic 424w, https://substackcdn.com/image/fetch/$s_!t0gz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0ed9f5-1723-4a3e-867d-54eac6ba5f82_1400x813.heic 848w, https://substackcdn.com/image/fetch/$s_!t0gz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0ed9f5-1723-4a3e-867d-54eac6ba5f82_1400x813.heic 1272w, https://substackcdn.com/image/fetch/$s_!t0gz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c0ed9f5-1723-4a3e-867d-54eac6ba5f82_1400x813.heic 
1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Agoda writes about addressing inconsistencies in its financial reporting by consolidating multiple disjointed data pipelines into a single Financial Unified Data Pipeline (FINUDP) built on Apache Spark. Agoda talks about approaches to ensure reliability and accuracy, including automated freshness monitoring, shadow testing for all code changes, and strict data contracts with upstream providers. 
</p><p><strong><a href="https://medium.com/agoda-engineering/how-agoda-enhanced-the-uptime-and-consistency-of-financial-metrics-ef7d54c4e4f0">https://medium.com/agoda-engineering/how-agoda-enhanced-the-uptime-and-consistency-of-financial-metrics-ef7d54c4e4f0</a></strong></p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[Data Engineering Weekly #252]]></title><description><![CDATA[The Weekly Data Engineering Newsletter]]></description><link>https://www.dataengineeringweekly.com/p/data-engineering-weekly-252</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/data-engineering-weekly-252</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Mon, 12 Jan 2026 02:38:04 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AdQk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6f51c1bf-abc9-4cd3-ad69-22cc8e3f1ef2_1080x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/events/best-practices-for-llm-dagster-development?utm_campaign=33680422-26-01-WBNR_DEEP_Dive_LLM_Dagster&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_llm_best_practices&amp;utm_content=01_04_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!aQpJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf32b3a6-8440-4a59-ab3d-c9500ac447d5_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!aQpJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf32b3a6-8440-4a59-ab3d-c9500ac447d5_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!aQpJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf32b3a6-8440-4a59-ab3d-c9500ac447d5_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!aQpJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf32b3a6-8440-4a59-ab3d-c9500ac447d5_3840x2160.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aQpJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf32b3a6-8440-4a59-ab3d-c9500ac447d5_3840x2160.heic" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/df32b3a6-8440-4a59-ab3d-c9500ac447d5_3840x2160.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42230,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/events/best-practices-for-llm-dagster-development?utm_campaign=33680422-26-01-WBNR_DEEP_Dive_LLM_Dagster&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_llm_best_practices&amp;utm_content=01_04_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/183513915?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf32b3a6-8440-4a59-ab3d-c9500ac447d5_3840x2160.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!aQpJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf32b3a6-8440-4a59-ab3d-c9500ac447d5_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!aQpJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf32b3a6-8440-4a59-ab3d-c9500ac447d5_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!aQpJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf32b3a6-8440-4a59-ab3d-c9500ac447d5_3840x2160.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!aQpJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf32b3a6-8440-4a59-ab3d-c9500ac447d5_3840x2160.heic 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h1><strong>Best practices for LLM development</strong></h1><p>LLMs are transforming software development, but integrating them into real projects can be tricky when models don&#8217;t understand your codebase, pipelines, or conventions.<br><br>Join Dagster on January 27th for a practical look at data engineering best practices, common pitfalls, and live 
demos of LLM developments.</p><p><strong><a href="https://dagster.io/events/best-practices-for-llm-dagster-development?utm_campaign=33680422-26-01-WBNR_DEEP_Dive_LLM_Dagster&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=deep_dive_llm_best_practices&amp;utm_content=01_04_data_engineering_weekly">Reserve your spot now.</a></strong></p><div><hr></div><h1>Foundation Capital: AI&#8217;s trillion-dollar opportunity: Context graphs</h1><blockquote><p>Agents are cross-system and action-oriented. The UX of work is separating from the underlying data plane. Agents become the interface, but something still has to be canonical underneath.</p></blockquote><p>This will be a core construct of the next evolution of data engineering: a scalable data infrastructure that gives a unified view of the systems of record, the analytical data, and past decision traces, backed by a system of record that accepts highly concurrent modifications. The promise of agents holds, but I don&#8217;t think our underlying infrastructure is ready for it.
</p><p><strong><a href="https://foundationcapital.com/context-graphs-ais-trillion-dollar-opportunity/">https://foundationcapital.com/context-graphs-ais-trillion-dollar-opportunity/</a></strong></p><div><hr></div><h1>ThoughtWorks: How to build the organizational muscle needed to scale AI beyond PoCs</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!e7mu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7f9197-a6ff-445b-a679-68645500077b_960x540.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!e7mu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7f9197-a6ff-445b-a679-68645500077b_960x540.heic 424w, https://substackcdn.com/image/fetch/$s_!e7mu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7f9197-a6ff-445b-a679-68645500077b_960x540.heic 848w, https://substackcdn.com/image/fetch/$s_!e7mu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7f9197-a6ff-445b-a679-68645500077b_960x540.heic 1272w, https://substackcdn.com/image/fetch/$s_!e7mu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7f9197-a6ff-445b-a679-68645500077b_960x540.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!e7mu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7f9197-a6ff-445b-a679-68645500077b_960x540.heic" width="960" height="540" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1d7f9197-a6ff-445b-a679-68645500077b_960x540.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:540,&quot;width&quot;:960,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:18792,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/184268483?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7f9197-a6ff-445b-a679-68645500077b_960x540.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!e7mu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7f9197-a6ff-445b-a679-68645500077b_960x540.heic 424w, https://substackcdn.com/image/fetch/$s_!e7mu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7f9197-a6ff-445b-a679-68645500077b_960x540.heic 848w, https://substackcdn.com/image/fetch/$s_!e7mu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7f9197-a6ff-445b-a679-68645500077b_960x540.heic 1272w, https://substackcdn.com/image/fetch/$s_!e7mu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7f9197-a6ff-445b-a679-68645500077b_960x540.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Thoughtworks argues that AI initiatives fail to scale beyond pilots because organizations run into compliance hurdles, data silos, and a lack of stakeholder engagement, problems that require building "organizational muscle" rather than buying technology solutions. 
The article recommends a "thin slice" approach that addresses five building blocks simultaneously for a single use case: starting with clear business outcomes instead of technology, building tech platforms incrementally based on concrete needs, creating repeatable MLOps paths to production through cross-functional product teams, and investing in AI literacy and human-collaborative tool design to drive sustained adoption.</p><p><strong><a href="https://www.thoughtworks.com/insights/articles/how-to-build-organizational-muscle-needed-to-scale-AI">https://www.thoughtworks.com/insights/articles/how-to-build-organizational-muscle-needed-to-scale-AI</a></strong></p><div><hr></div><h1>Sharon Campbell-Crow: Multi-Agent Systems: The Architecture Shift from Monolithic LLMs to Collaborative Intelligence</h1><p>Developers are moving away from monolithic LLM &#8220;God Prompts&#8221; toward multi-agent systems because single models suffer from context limits and lack built-in self-critique. Multi-agent systems use specialized, sometimes adversarial agents, improving factual accuracy by up to 23%. 
</p><p>The article describes four architectures&#8212;LangGraph for graph-based control and auditability, AutoGen for event-driven distributed agents, CrewAI for role-based content workflows, and OpenAI Swarm for stateless, high-scale routing&#8212;along with production patterns such as planner&#8211;executor separation, memory streams for relevance, and deferred execution to manage cost and latency.</p><p><strong><a href="https://www.comet.com/site/blog/multi-agent-systems/">https://www.comet.com/site/blog/multi-agent-systems/</a></strong></p><div><hr></div><h1><strong>Sponsored: The Scaling Data Teams Guide</strong></h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://dagster.io/how-to-scale-data-teams-ebook?utm_campaign=27879954-25-11-DMND_eBook_Scaling_Data_Teams&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=scaling_data_teams_ebook&amp;utm_content=01_04_data_engineering_weekly" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GZPK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55247c6c-6af7-429c-b72f-35fdc63d981a_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!GZPK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55247c6c-6af7-429c-b72f-35fdc63d981a_3840x2160.heic 848w, https://substackcdn.com/image/fetch/$s_!GZPK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55247c6c-6af7-429c-b72f-35fdc63d981a_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!GZPK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55247c6c-6af7-429c-b72f-35fdc63d981a_3840x2160.heic 
1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GZPK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55247c6c-6af7-429c-b72f-35fdc63d981a_3840x2160.heic" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/55247c6c-6af7-429c-b72f-35fdc63d981a_3840x2160.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:44723,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:&quot;https://dagster.io/how-to-scale-data-teams-ebook?utm_campaign=27879954-25-11-DMND_eBook_Scaling_Data_Teams&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=scaling_data_teams_ebook&amp;utm_content=01_04_data_engineering_weekly&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/183513915?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55247c6c-6af7-429c-b72f-35fdc63d981a_3840x2160.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!GZPK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55247c6c-6af7-429c-b72f-35fdc63d981a_3840x2160.heic 424w, https://substackcdn.com/image/fetch/$s_!GZPK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55247c6c-6af7-429c-b72f-35fdc63d981a_3840x2160.heic 848w, 
https://substackcdn.com/image/fetch/$s_!GZPK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55247c6c-6af7-429c-b72f-35fdc63d981a_3840x2160.heic 1272w, https://substackcdn.com/image/fetch/$s_!GZPK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55247c6c-6af7-429c-b72f-35fdc63d981a_3840x2160.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Building and scaling a data platform has never been more important or more challenging. 
Whether you&#8217;re just starting to build a data platform or leading a mature data organization, this guide will help you scale your impact, accelerate your team, and prepare for the future of data-driven products.<br><br>Learn how real data teams, from solo practitioners to enterprise-scale organizations, build.</p><p><strong><a href="https://dagster.io/how-to-scale-data-teams-ebook?utm_campaign=27879954-25-11-DMND_eBook_Scaling_Data_Teams&amp;utm_source=email&amp;utm_medium=sponsorship&amp;utm_term=scaling_data_teams_ebook&amp;utm_content=01_04_data_engineering_weekly">Get the guide now</a></strong></p><div><hr></div><h1>Ly: Building a multi-agent pipeline for NL-to-SQL analytics</h1><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hyYM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ee44acf-f5ba-4125-9444-6ba5e25b7ce1_991x245.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hyYM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ee44acf-f5ba-4125-9444-6ba5e25b7ce1_991x245.heic 424w, https://substackcdn.com/image/fetch/$s_!hyYM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ee44acf-f5ba-4125-9444-6ba5e25b7ce1_991x245.heic 848w, https://substackcdn.com/image/fetch/$s_!hyYM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ee44acf-f5ba-4125-9444-6ba5e25b7ce1_991x245.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!hyYM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ee44acf-f5ba-4125-9444-6ba5e25b7ce1_991x245.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hyYM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ee44acf-f5ba-4125-9444-6ba5e25b7ce1_991x245.heic" width="991" height="245" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3ee44acf-f5ba-4125-9444-6ba5e25b7ce1_991x245.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:245,&quot;width&quot;:991,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:15446,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/184268483?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ee44acf-f5ba-4125-9444-6ba5e25b7ce1_991x245.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hyYM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ee44acf-f5ba-4125-9444-6ba5e25b7ce1_991x245.heic 424w, https://substackcdn.com/image/fetch/$s_!hyYM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ee44acf-f5ba-4125-9444-6ba5e25b7ce1_991x245.heic 848w, 
https://substackcdn.com/image/fetch/$s_!hyYM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ee44acf-f5ba-4125-9444-6ba5e25b7ce1_991x245.heic 1272w, https://substackcdn.com/image/fetch/$s_!hyYM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3ee44acf-f5ba-4125-9444-6ba5e25b7ce1_991x245.heic 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>LY Corp writes about migrating from a monolithic MCP-based NL-to-SQL system to a five-agent pipeline after encountering execution coupling, single-point-of-failure debugging, and oversized prompt contexts.</p><p>The new design adopts a Swarm-style orchestration model in which specialized agents handle routing, intent parsing, validation, SQL generation, query execution, and result presentation, using strict JSON interfaces and a tightly scoped context.</p><p>Preprocessed domain-specific data marts with normalized action units further reduce hallucinations by helping agents reliably map intents to the correct tables and columns.</p><p><strong><a href="https://techblog.lycorp.co.jp/en/building-a-multi-agent-pipeline-for-nl-to-sql-analytics">https://techblog.lycorp.co.jp/en/building-a-multi-agent-pipeline-for-nl-to-sql-analytics</a></strong></p><div><hr></div><h1>Vinted: Building a Global, Event-Driven Platform: Our Ongoing Journey</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jcIX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede1a8dc-cef1-4dfe-8d29-5ee998867df8_512x289.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!jcIX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede1a8dc-cef1-4dfe-8d29-5ee998867df8_512x289.heic 424w, https://substackcdn.com/image/fetch/$s_!jcIX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede1a8dc-cef1-4dfe-8d29-5ee998867df8_512x289.heic 848w, https://substackcdn.com/image/fetch/$s_!jcIX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede1a8dc-cef1-4dfe-8d29-5ee998867df8_512x289.heic 1272w, https://substackcdn.com/image/fetch/$s_!jcIX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede1a8dc-cef1-4dfe-8d29-5ee998867df8_512x289.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jcIX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede1a8dc-cef1-4dfe-8d29-5ee998867df8_512x289.heic" width="512" height="289" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ede1a8dc-cef1-4dfe-8d29-5ee998867df8_512x289.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:289,&quot;width&quot;:512,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7310,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/184268483?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede1a8dc-cef1-4dfe-8d29-5ee998867df8_512x289.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jcIX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede1a8dc-cef1-4dfe-8d29-5ee998867df8_512x289.heic 424w, https://substackcdn.com/image/fetch/$s_!jcIX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede1a8dc-cef1-4dfe-8d29-5ee998867df8_512x289.heic 848w, https://substackcdn.com/image/fetch/$s_!jcIX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede1a8dc-cef1-4dfe-8d29-5ee998867df8_512x289.heic 1272w, https://substackcdn.com/image/fetch/$s_!jcIX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fede1a8dc-cef1-4dfe-8d29-5ee998867df8_512x289.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Vinted Engineering describes migrating from a monolithic system handling 150k requests per second to a global, event-driven platform processing over 300k requests per second. The redesign applies Domain-Driven Design across nearly 300 domains, uses Saga-based orchestration to coordinate multi-step workflows, centralizes writes, and replicates read-only projections globally via event streams. Separating the read and write paths lets low-latency features such as feeds and search run close to users, but it requires teams to design for eventual consistency, retries, and out-of-order events rather than assuming immediate consistency.</p><p><strong><a href="https://vinted.engineering/2026/01/09/building-global-event-driven-platform-part-1/">https://vinted.engineering/2026/01/09/building-global-event-driven-platform-part-1/</a></strong></p><p><strong><a href="https://vinted.engineering/2026/01/09/building-global-event-driven-platform-part-2/">https://vinted.engineering/2026/01/09/building-global-event-driven-platform-part-2/</a></strong></p><div><hr></div><h1>Lyft: Lyft&#8217;s Feature Store: Architecture, Optimization, and Evolution</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Cjo2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c123767-fd9e-45b0-9b68-f90969cfb2fc_1400x1054.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!Cjo2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c123767-fd9e-45b0-9b68-f90969cfb2fc_1400x1054.heic 424w, https://substackcdn.com/image/fetch/$s_!Cjo2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c123767-fd9e-45b0-9b68-f90969cfb2fc_1400x1054.heic 848w, https://substackcdn.com/image/fetch/$s_!Cjo2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c123767-fd9e-45b0-9b68-f90969cfb2fc_1400x1054.heic 1272w, https://substackcdn.com/image/fetch/$s_!Cjo2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c123767-fd9e-45b0-9b68-f90969cfb2fc_1400x1054.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Cjo2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c123767-fd9e-45b0-9b68-f90969cfb2fc_1400x1054.heic" width="1400" height="1054" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9c123767-fd9e-45b0-9b68-f90969cfb2fc_1400x1054.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1054,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:54974,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/184268483?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c123767-fd9e-45b0-9b68-f90969cfb2fc_1400x1054.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Cjo2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c123767-fd9e-45b0-9b68-f90969cfb2fc_1400x1054.heic 424w, https://substackcdn.com/image/fetch/$s_!Cjo2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c123767-fd9e-45b0-9b68-f90969cfb2fc_1400x1054.heic 848w, https://substackcdn.com/image/fetch/$s_!Cjo2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c123767-fd9e-45b0-9b68-f90969cfb2fc_1400x1054.heic 1272w, https://substackcdn.com/image/fetch/$s_!Cjo2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c123767-fd9e-45b0-9b68-f90969cfb2fc_1400x1054.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Lyft writes about a centralized Feature Store that maintains consistency between offline training and online inference across batch, streaming, and real-time serving. 
Batch features run on Spark SQL with auto-generated Airflow DAGs and Hive storage, streaming features use Apache Flink with Kafka and Kinesis, and online serving relies on DynamoDB with a Valkey write-through cache for low-latency access.</p><p><strong><a href="https://eng.lyft.com/lyfts-feature-store-architecture-optimization-and-evolution-7835f8962b99">https://eng.lyft.com/lyfts-feature-store-architecture-optimization-and-evolution-7835f8962b99</a></strong></p><div><hr></div><h1>Zeta Global: Zeta&#8217;s Lakehouse Journey: A Composable, Scalable, and Federated Architecture</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UNdA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64b12aa8-f5bd-427f-b59c-b17ef3939984_469x311.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UNdA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64b12aa8-f5bd-427f-b59c-b17ef3939984_469x311.heic 424w, https://substackcdn.com/image/fetch/$s_!UNdA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64b12aa8-f5bd-427f-b59c-b17ef3939984_469x311.heic 848w, https://substackcdn.com/image/fetch/$s_!UNdA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64b12aa8-f5bd-427f-b59c-b17ef3939984_469x311.heic 1272w, https://substackcdn.com/image/fetch/$s_!UNdA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64b12aa8-f5bd-427f-b59c-b17ef3939984_469x311.heic 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!UNdA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64b12aa8-f5bd-427f-b59c-b17ef3939984_469x311.heic" width="469" height="311" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/64b12aa8-f5bd-427f-b59c-b17ef3939984_469x311.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:311,&quot;width&quot;:469,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5631,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/184268483?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64b12aa8-f5bd-427f-b59c-b17ef3939984_469x311.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UNdA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64b12aa8-f5bd-427f-b59c-b17ef3939984_469x311.heic 424w, https://substackcdn.com/image/fetch/$s_!UNdA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64b12aa8-f5bd-427f-b59c-b17ef3939984_469x311.heic 848w, https://substackcdn.com/image/fetch/$s_!UNdA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64b12aa8-f5bd-427f-b59c-b17ef3939984_469x311.heic 1272w, https://substackcdn.com/image/fetch/$s_!UNdA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F64b12aa8-f5bd-427f-b59c-b17ef3939984_469x311.heic 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Zeta Global writes about moving to a composable, federated Lakehouse architecture to integrate a highly heterogeneous data landscape that traditional warehouses could not unify. The platform standardizes on object storage with Apache Iceberg for transactional guarantees. 
It uses AWS S3 Tables with AWS Glue as the control plane, allowing Spark, Snowflake, and Trino to operate on shared datasets.</p><p><strong><a href="https://medium.com/@zeta-decoded/zetas-lakehouse-journey-a-composable-scalable-and-federated-architecture-df0ab5f19c3a">https://medium.com/@zeta-decoded/zetas-lakehouse-journey-a-composable-scalable-and-federated-architecture-df0ab5f19c3a</a></strong></p><div><hr></div><h1>Google: Developer&#8217;s guide to multi-agent patterns in ADK</h1><p>Much like the classic Enterprise Integration Patterns, a set of multi-agent integration patterns is emerging as agent architectures become more widely adopted. Google lists eight patterns for multi-agent systems.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bkQe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27c9548-847d-4245-a9b9-9fc0d8d8d075_2752x1536.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bkQe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27c9548-847d-4245-a9b9-9fc0d8d8d075_2752x1536.heic 424w, https://substackcdn.com/image/fetch/$s_!bkQe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27c9548-847d-4245-a9b9-9fc0d8d8d075_2752x1536.heic 848w, https://substackcdn.com/image/fetch/$s_!bkQe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27c9548-847d-4245-a9b9-9fc0d8d8d075_2752x1536.heic 1272w, 
https://substackcdn.com/image/fetch/$s_!bkQe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27c9548-847d-4245-a9b9-9fc0d8d8d075_2752x1536.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bkQe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27c9548-847d-4245-a9b9-9fc0d8d8d075_2752x1536.heic" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e27c9548-847d-4245-a9b9-9fc0d8d8d075_2752x1536.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:37121,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/184268483?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27c9548-847d-4245-a9b9-9fc0d8d8d075_2752x1536.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bkQe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27c9548-847d-4245-a9b9-9fc0d8d8d075_2752x1536.heic 424w, https://substackcdn.com/image/fetch/$s_!bkQe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27c9548-847d-4245-a9b9-9fc0d8d8d075_2752x1536.heic 848w, 
https://substackcdn.com/image/fetch/$s_!bkQe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27c9548-847d-4245-a9b9-9fc0d8d8d075_2752x1536.heic 1272w, https://substackcdn.com/image/fetch/$s_!bkQe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27c9548-847d-4245-a9b9-9fc0d8d8d075_2752x1536.heic 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ol><li><p>Sequential Pipeline</p></li><li><p>Coordinator/Dispatcher</p></li><li><p>Parallel Fan-Out/Gather 
</p></li><li><p>Hierarchical Decomposition</p></li><li><p>Generator and Critic</p></li><li><p>Iterative Refinement</p></li><li><p>Human-in-the-loop</p></li><li><p>Composite Patterns</p></li></ol><p><strong><a href="https://developers.googleblog.com/developers-guide-to-multi-agent-patterns-in-adk/">https://developers.googleblog.com/developers-guide-to-multi-agent-patterns-in-adk/</a></strong></p><div><hr></div><h1>Ashpreet B: Memory: How Agents Learn</h1><blockquote><p>Most AI agents do not truly learn because they reset after each session.</p></blockquote><p>The author presents a &#8220;GPU-poor continuous learning&#8221; approach in which agents store and retrieve successful patterns from databases rather than retraining models. The demonstration uses the agno library with SQLite for session context, a memory manager for user data, and vector databases with human review to curate high-quality learned memories.</p><p><strong><a href="https://www.ashpreetbedi.com/articles/memory">https://www.ashpreetbedi.com/articles/memory</a></strong></p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. 
All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item><item><title><![CDATA[A Critique of Iceberg REST Catalog: A Classic Case of Why Semantic Spec Fails]]></title><description><![CDATA[How a Semantically Correct API Becomes Operationally Unreliable at Scale]]></description><link>https://www.dataengineeringweekly.com/p/a-critique-of-iceberg-rest-catalog</link><guid isPermaLink="false">https://www.dataengineeringweekly.com/p/a-critique-of-iceberg-rest-catalog</guid><dc:creator><![CDATA[Ananth Packkildurai]]></dc:creator><pubDate>Fri, 09 Jan 2026 05:57:38 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!t5GO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15ba451a-d2ee-45ad-9ada-284d6558ed60_3280x1564.heic" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote><p><em><strong>&#8220;Latency is not just a performance characteristic; it is a fundamental part of correctness.&#8221; </strong></em><strong>&#8212; </strong><em><strong>Designing Data-Intensive Applications</strong></em></p></blockquote><p>In <em><strong><a href="https://dataintensive.net/">Designing Data-Intensive Applications</a></strong></em>, <strong><a href="https://martin.kleppmann.com/">Martin Kleppmann</a></strong> makes a subtle but critical point: the <strong><a href="https://en.wikipedia.org/wiki/CAP_theorem">CAP theorem</a></strong> omits latency, yet in real systems, latency often determines whether a system is usable at all. <strong>A system that is </strong><em><strong>correct but slow</strong></em><strong> is, in practice, incorrect.</strong></p><p>This observation is directly applicable to the <strong><a href="https://iceberg.apache.org/rest-catalog-spec/">Apache Iceberg REST Catalog specification</a></strong>. 
While the specification achieves semantic clarity, it fails to define the operational realities that enable distributed systems to remain predictable at scale. The result is a standard that is formally correct, yet operationally fragile.</p><div><hr></div><h2><strong>Semantic Interoperability Without Predictability</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!t5GO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15ba451a-d2ee-45ad-9ada-284d6558ed60_3280x1564.heic" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!t5GO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15ba451a-d2ee-45ad-9ada-284d6558ed60_3280x1564.heic 424w, https://substackcdn.com/image/fetch/$s_!t5GO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15ba451a-d2ee-45ad-9ada-284d6558ed60_3280x1564.heic 848w, https://substackcdn.com/image/fetch/$s_!t5GO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15ba451a-d2ee-45ad-9ada-284d6558ed60_3280x1564.heic 1272w, https://substackcdn.com/image/fetch/$s_!t5GO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15ba451a-d2ee-45ad-9ada-284d6558ed60_3280x1564.heic 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!t5GO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15ba451a-d2ee-45ad-9ada-284d6558ed60_3280x1564.heic" width="1456" height="694" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/15ba451a-d2ee-45ad-9ada-284d6558ed60_3280x1564.heic&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:694,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:139678,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/heic&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.dataengineeringweekly.com/i/183990563?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15ba451a-d2ee-45ad-9ada-284d6558ed60_3280x1564.heic&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!t5GO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15ba451a-d2ee-45ad-9ada-284d6558ed60_3280x1564.heic 424w, https://substackcdn.com/image/fetch/$s_!t5GO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15ba451a-d2ee-45ad-9ada-284d6558ed60_3280x1564.heic 848w, https://substackcdn.com/image/fetch/$s_!t5GO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15ba451a-d2ee-45ad-9ada-284d6558ed60_3280x1564.heic 1272w, https://substackcdn.com/image/fetch/$s_!t5GO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15ba451a-d2ee-45ad-9ada-284d6558ed60_3280x1564.heic 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>Over the past two years, the Iceberg REST Catalog specification has emerged as the de facto standard for metadata access in the Iceberg ecosystem. Its adoption has also sparked a <strong><a href="https://materializedview.io/p/begun-the-catalog-wars-have">catalog war</a></strong> around the REST spec. The specification promises a universal interface that allows engines such as Trino, Spark, Flink, and StarRocks to interact with Iceberg tables via a common REST abstraction, independent of the underlying catalog implementation.</p><p>At the semantic level, this promise largely holds. The specification rigorously defines metadata structures and operations: tables, schemas, snapshots, and namespaces. A LoadTable or CreateNamespace request looks identical across implementations. 
This semantic interoperability has been critical to Iceberg&#8217;s rapid ecosystem adoption.</p><p>However, semantic interoperability alone is insufficient. The specification defines <em>what</em> metadata operations mean, but it avoids specifying <em>how</em> they must behave under real-world conditions such as concurrency, latency sensitivity, and cross-catalog synchronization.</p><p>This gap&#8212;between semantic interoperability and operational interoperability&#8212;is where systems begin to fail in production.</p><div><hr></div><h2><strong>The Core Problem: No Operational SLA, No Predictability</strong></h2><p>The Iceberg REST Catalog specification is intentionally silent on performance guarantees. There are no latency expectations, no throughput baselines, and no service-level objectives. While this flexibility lowers the barrier to implementation, it creates an ecosystem where:</p><ul><li><p>Two catalogs can both be &#8220;compliant&#8221; yet differ by orders of magnitude in response time.</p></li><li><p>Clients cannot reason about metadata latency during query planning.</p></li><li><p>Synchronization behavior across catalogs becomes unpredictable.</p></li></ul><p>In distributed data systems, <strong>predictability matters more than raw performance</strong>. Without a strict operational SLA&#8212;or at least defined behavioral constraints&#8212;clients are forced into defensive, retry-heavy designs that amplify load and increase tail latency.</p><div><hr></div><h2><strong>The &#8220;List Tables&#8221; Problem: Cross-Catalog Sync Failure</strong></h2><p>The ListTables endpoint (<strong><a href="https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L118">GET /v1/{prefix}/namespaces/{namespace}/tables</a></strong>) is semantically straightforward. It allows clients to enumerate tables within a namespace and supports pagination through pageSize and pageToken.</p><p>The primary issue is not pagination itself. 
The real failure emerges when <strong>the same Iceberg tables are registered in multiple catalogs</strong>, a pattern that is increasingly common in hybrid and multi-platform deployments.</p><h3><strong>A Realistic Scenario</strong></h3><ul><li><p>An Iceberg table is registered in both <strong>Catalog A</strong> and <strong>Catalog B</strong>.</p></li><li><p>Both catalogs point to the same underlying metadata and object storage.</p></li><li><p>One catalog is used by ingestion and streaming workloads.</p></li><li><p>Analytics engines or BI tools use the other.</p></li></ul><h3><strong>The Sync Pathology</strong></h3><p>When a client connects to Catalog B and issues a metadata discovery operation&#8212;such as listing tables or syncing namespace state&#8212;the catalog must:</p><ol><li><p>Enumerate all tables.</p></li><li><p>Resolve metadata pointers.</p></li><li><p>Validate access permissions.</p></li><li><p>Reconcile its state with the underlying storage.</p></li></ol><p>Because the REST specification defines no operational expectations:</p><ul><li><p>There is no SLA for how long this sync should take.</p></li><li><p>There is no distinction between a &#8220;lightweight&#8221; listing and a fully validated listing.</p></li><li><p>There is no mechanism to express intent (e.g., <em>names only</em>, <em>no ACL validation</em>).</p></li></ul><p>As table counts grow into the tens of thousands, synchronization latency grows non-linearly. In practice, sync operations can take minutes&#8212;or fail&#8212;causing engines to stall, time out, or repeatedly retry.</p><p>The result is not merely slow metadata access. It is <strong>system-wide unpredictability</strong>. Query engines cannot determine whether a delay is transient, systemic, or catastrophic.</p><div><hr></div><h2><strong>Latency Is Treated as an Implementation Detail&#8212;But It Is a Contract</strong></h2><p>The REST Catalog specification implicitly treats latency as an implementation concern. 
From a standards perspective, this is understandable. But in data-intensive systems, latency is part of the correctness contract.</p><p>The specification does not define:</p><ul><li><p>Upper bounds on metadata retrieval latency</p></li><li><p>Maximum metadata payload sizes</p></li><li><p>Limits on metadata fan-out operations</p></li><li><p>The number of round trips required to plan a query</p></li></ul><p>As a result, a compliant catalog may require megabytes of JSON metadata and dozens of HTTP calls just to validate a single query plan. Engines appear slow and unstable, even though the root cause lies in an underspecified protocol.</p><p>This is precisely the class of problem Kleppmann warns about: correctness without latency guarantees is operationally meaningless.</p><div><hr></div><h2><strong>Commit Semantics Under Contention: Undefined and Unfair</strong></h2><p>Iceberg relies on optimistic concurrency control. When multiple writers attempt to commit simultaneously, conflicts are expected and resolved through retries.</p><p>The REST specification defines the 409 Conflict response, but stops there. It does not define:</p><ul><li><p>Backoff expectations</p></li><li><p>Retry fairness</p></li><li><p>Starvation prevention</p></li></ul><p>In a multi-engine environment, this creates asymmetric outcomes. A high-frequency streaming writer with aggressive retries can permanently starve batch compaction jobs that follow conservative retry policies. Over time, table health degrades due to file explosion and unbounded metadata growth.</p><p>Once again, the issue is not semantic correctness. It is the absence of operational guarantees.</p><div><hr></div><h2><strong>Caching Without a Freshness Model</strong></h2><p>While HTTP caching is permitted, it is not part of the correctness model. Support for conditional requests, ETags, or freshness validation is optional.</p><p>This forces clients into a pessimistic stance: always re-fetch, always revalidate, always assume staleness. 
The REST protocol degenerates into a chatty, high-latency control plane that negates its own architectural benefits.</p><p>Without a standardized freshness contract, caching becomes a gamble rather than a reliability tool.</p><div><hr></div><h2><strong>Behavioral Conformance Is Missing</strong></h2><p>The Iceberg ecosystem has strong conformance testing for table formats. It lacks an equivalent for catalog behavior.</p><p>Today, &#8220;REST Catalog compliant&#8221; means:</p><ul><li><p>The endpoints exist.</p></li><li><p>The JSON schema is correct.</p></li><li><p>The happy path works.</p></li></ul><p>It does not mean:</p><ul><li><p>Predictable latency under load</p></li><li><p>Stable pagination during concurrent updates</p></li><li><p>Graceful overload signaling</p></li><li><p>Bounded retry amplification</p></li></ul><p>Without behavioral conformance tests, compliance guarantees syntax, not operability.</p><div><hr></div><h2><strong>Underspecification Is Still a Design Decision</strong></h2><p>The absence of operational constraints is not accidental. It reflects a deliberate choice to prioritize adoption and flexibility.</p><p>However, in distributed systems, underspecification pushes complexity downstream. It burdens clients, operators, and platform teams with the need to implement compensating logic. As Iceberg becomes core infrastructure rather than experimental tooling, this trade-off increasingly limits its reliability.</p><p>Semantic agreement without behavioral agreement leads to fragile systems.</p><div><hr></div><h2><strong>Toward Operational Interoperability</strong></h2><p>Operational interoperability does not require rigid SLAs or centralized control. 
It requires acknowledging that <strong>latency, retries, and fairness are part of the interface</strong>.</p><p>Concrete improvements could include:</p><ul><li><p>Defined operational profiles with baseline latency and concurrency expectations</p></li><li><p>Lightweight metadata views to avoid synchronization amplification</p></li><li><p>Standardized retry and backoff semantics for conflict scenarios</p></li><li><p>Explicit freshness and caching contracts</p></li></ul><p>Semantic interoperability enabled Iceberg&#8217;s success. Operational interoperability will determine whether it remains dependable at scale.</p><p>Until then, the Iceberg REST Catalog remains a textbook example of why <strong>semantic specifications alone are not enough</strong>.</p><div><hr></div><p><em>All rights reserved, Dewpeche Private Limited. I have provided links for informational purposes and do not suggest endorsement. All views expressed in this newsletter are my own and do not represent current, former, or future employers&#8217; opinions.</em></p>]]></content:encoded></item></channel></rss>