<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Thinking in Models]]></title><description><![CDATA[Using models, data, and first-principles thinking to understand football, AI, and complex systems.]]></description><link>https://siddhraj.substack.com</link><image><url>https://substackcdn.com/image/fetch/$s_!ZnSg!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd75a1975-5315-4392-a66f-e9b4c7678177_376x376.png</url><title>Thinking in Models</title><link>https://siddhraj.substack.com</link></image><generator>Substack</generator><lastBuildDate>Sat, 04 Apr 2026 16:58:53 GMT</lastBuildDate><atom:link href="https://siddhraj.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Siddhraj Thakor]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[siddhraj@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[siddhraj@substack.com]]></itunes:email><itunes:name><![CDATA[Siddhraj Thakor]]></itunes:name></itunes:owner><itunes:author><![CDATA[Siddhraj Thakor]]></itunes:author><googleplay:owner><![CDATA[siddhraj@substack.com]]></googleplay:owner><googleplay:email><![CDATA[siddhraj@substack.com]]></googleplay:email><googleplay:author><![CDATA[Siddhraj Thakor]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[I Tested This With an Algorithm to See What Four Films Reveal About Taste, Identity, and Performance]]></title><description><![CDATA[What our favorite movies quietly reveal about how we curate identity online]]></description><link>https://siddhraj.substack.com/p/i-tested-this-with-an-algorithm-to</link><guid 
isPermaLink="false">https://siddhraj.substack.com/p/i-tested-this-with-an-algorithm-to</guid><dc:creator><![CDATA[Siddhraj Thakor]]></dc:creator><pubDate>Tue, 03 Feb 2026 09:48:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!lHFX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbe2d087-6134-4ab7-8f12-f31cf1254670_719x403.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lHFX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbe2d087-6134-4ab7-8f12-f31cf1254670_719x403.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lHFX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbe2d087-6134-4ab7-8f12-f31cf1254670_719x403.jpeg 424w, https://substackcdn.com/image/fetch/$s_!lHFX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbe2d087-6134-4ab7-8f12-f31cf1254670_719x403.jpeg 848w, https://substackcdn.com/image/fetch/$s_!lHFX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbe2d087-6134-4ab7-8f12-f31cf1254670_719x403.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!lHFX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbe2d087-6134-4ab7-8f12-f31cf1254670_719x403.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!lHFX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbe2d087-6134-4ab7-8f12-f31cf1254670_719x403.jpeg" width="719" height="403" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dbe2d087-6134-4ab7-8f12-f31cf1254670_719x403.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:403,&quot;width&quot;:719,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!lHFX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbe2d087-6134-4ab7-8f12-f31cf1254670_719x403.jpeg 424w, https://substackcdn.com/image/fetch/$s_!lHFX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbe2d087-6134-4ab7-8f12-f31cf1254670_719x403.jpeg 848w, https://substackcdn.com/image/fetch/$s_!lHFX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbe2d087-6134-4ab7-8f12-f31cf1254670_719x403.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!lHFX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbe2d087-6134-4ab7-8f12-f31cf1254670_719x403.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft 
pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Most people don&#8217;t choose their favorite movies honestly.</p><p>They choose them <strong>strategically</strong>.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://siddhraj.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thinking in Models is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>For the person who might scroll their profile.<br>For the friend who might judge.<br>For the version of themselves they want to signal.</p><p>If you&#8217;ve ever rearranged your &#8220;Top 4&#8221; films even once, you already know this.</p><p>That tiny grid of posters isn&#8217;t just taste, it&#8217;s <strong>presentation</strong>.</p><p>And I wanted to know what it actually reveals.</p><p></p><h3><strong>The Quiet Performance We All Pretend Isn&#8217;t There</strong></h3><p>On platforms like Letterboxd, you&#8217;re asked to pin four favorite films.</p><p>Just four.</p><p>It sounds casual. 
It isn&#8217;t.</p><p>People swap films out because one feels <em>too basic</em>.<br>They keep another because it feels <em>serious</em>.<br>They hesitate before adding something they genuinely love &#8212; because of how it might look.</p><p><strong>Four films end up doing the work of an identity summary.</strong></p><p>And we all pretend that summary is accidental.</p><p>It&#8217;s not.</p><p>So I asked a slightly uncomfortable question:</p><blockquote><p><em>If people already judge us based on four movies&#8230;<br>what exactly are they reacting to?</em></p></blockquote><p></p><h3><strong>Movie Taste Isn&#8217;t Objective But It Isn&#8217;t Random</strong></h3><p>We argue about films as if we&#8217;re debating facts.</p><p>&#8220;This is a better movie.&#8221;<br>&#8220;You just don&#8217;t understand cinema.&#8221;<br>&#8220;That&#8217;s not real film.&#8221;</p><p>But most of these arguments aren&#8217;t about quality.</p><p>They&#8217;re about <strong>orientation</strong>.</p><p>Some people use movies for comfort.<br>Some for challenge.<br>Some for cultural credibility.<br>Some for discovery.</p><p>Taste isn&#8217;t a score, it&#8217;s a <strong>pattern of choices</strong>.</p><p>Patterns can be observed.<br>Patterns can be measured.</p><p>That&#8217;s where the experiment began.</p><p></p><h3><strong>Why I Built the Algorithm (and Why I Didn&#8217;t Trust It)</strong></h3><p>I built a small tool to analyze only one thing:</p><p><strong>A person&#8217;s Top 4 films. Nothing else.</strong></p><p>No watch history.<br>No reviews.<br>No social data.</p><p>Just four titles.</p><p>The idea was simple:</p><ul><li><p>Not to judge quality</p></li><li><p>Not to crown &#8220;good&#8221; taste</p></li><li><p>But to detect <strong>how someone relates to cinema</strong></p></li></ul><p>At first, I didn&#8217;t trust the result.</p><p>Then I ran my own Top 4.</p><p>And a friend looked at the output and said:</p><blockquote><p><em>&#8220;Yeah&#8230; that&#8217;s accurate. 
You curate your taste more than you admit.&#8221;</em></p></blockquote><p>That comment landed harder than the score.</p><p>Because it wasn&#8217;t about movies anymore.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JnIk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d8f59e9-e9cd-4ee2-acf1-60276f35df8f_735x475.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JnIk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d8f59e9-e9cd-4ee2-acf1-60276f35df8f_735x475.jpeg 424w, https://substackcdn.com/image/fetch/$s_!JnIk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d8f59e9-e9cd-4ee2-acf1-60276f35df8f_735x475.jpeg 848w, https://substackcdn.com/image/fetch/$s_!JnIk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d8f59e9-e9cd-4ee2-acf1-60276f35df8f_735x475.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!JnIk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d8f59e9-e9cd-4ee2-acf1-60276f35df8f_735x475.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JnIk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d8f59e9-e9cd-4ee2-acf1-60276f35df8f_735x475.jpeg" width="735" height="475" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3d8f59e9-e9cd-4ee2-acf1-60276f35df8f_735x475.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:475,&quot;width&quot;:735,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!JnIk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d8f59e9-e9cd-4ee2-acf1-60276f35df8f_735x475.jpeg 424w, https://substackcdn.com/image/fetch/$s_!JnIk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d8f59e9-e9cd-4ee2-acf1-60276f35df8f_735x475.jpeg 848w, https://substackcdn.com/image/fetch/$s_!JnIk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d8f59e9-e9cd-4ee2-acf1-60276f35df8f_735x475.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!JnIk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d8f59e9-e9cd-4ee2-acf1-60276f35df8f_735x475.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Patterns don&#8217;t describe who we are; they describe how we choose.</p><p></p><h3><strong>What the Algorithm Looks For (In Plain English)</strong></h3><p>This isn&#8217;t about precision.<br>It&#8217;s about <strong>pattern recognition</strong>.</p><p>The system looks at five signals:</p><ul><li><p>how consistently strong the films are</p></li><li><p>how much genre overlap exists</p></li><li><p>how rare or safe the picks are</p></li><li><p>whether the films span different eras</p></li><li><p>whether there are cinephile markers (auteurs, foreign films, black-and-white, etc.)</p></li></ul><p>Here&#8217;s the limitation, and it matters:</p><blockquote><p><em><strong>Four films can&#8217;t capture everything about you.</strong><br>But they reveal how you choose to be seen.</em></p></blockquote><p>That&#8217;s the trade-off.</p><p>Someone who picks four modern Hollywood dramas is sending a different signal than someone whose Top 4 mixes a classic, a foreign film, an auteur 
work, and something unconventional.</p><p>Same number of films.<br>Different intent.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wsON!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8811c00d-dc18-4aa8-9747-af259dc8b285_736x414.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wsON!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8811c00d-dc18-4aa8-9747-af259dc8b285_736x414.jpeg 424w, https://substackcdn.com/image/fetch/$s_!wsON!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8811c00d-dc18-4aa8-9747-af259dc8b285_736x414.jpeg 848w, https://substackcdn.com/image/fetch/$s_!wsON!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8811c00d-dc18-4aa8-9747-af259dc8b285_736x414.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!wsON!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8811c00d-dc18-4aa8-9747-af259dc8b285_736x414.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wsON!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8811c00d-dc18-4aa8-9747-af259dc8b285_736x414.jpeg" width="736" height="414" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8811c00d-dc18-4aa8-9747-af259dc8b285_736x414.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:414,&quot;width&quot;:736,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!wsON!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8811c00d-dc18-4aa8-9747-af259dc8b285_736x414.jpeg 424w, https://substackcdn.com/image/fetch/$s_!wsON!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8811c00d-dc18-4aa8-9747-af259dc8b285_736x414.jpeg 848w, https://substackcdn.com/image/fetch/$s_!wsON!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8811c00d-dc18-4aa8-9747-af259dc8b285_736x414.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!wsON!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8811c00d-dc18-4aa8-9747-af259dc8b285_736x414.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The question is never whether it&#8217;s good; it&#8217;s whether it&#8217;s good enough.</p><p></p><h3><strong>The Result: A Cinema Level (and Why People React So Strongly)</strong></h3><p>The output is a <strong>Cinema Level</strong> from 1 to 10.</p><p>Not intelligence.<br>Not superiority.<br>Not &#8220;good&#8221; versus &#8220;bad&#8221;.</p><p>Just depth, range, and curiosity as reflected through four choices.</p><p>What surprised me wasn&#8217;t where people landed.</p><p>It was what happened after.</p><p>People screenshotted results.<br>Friends disagreed.<br>Group chats argued.</p><p>Someone always says:</p><blockquote><p><em>&#8220;No way that&#8217;s accurate.&#8221;</em></p></blockquote><p>And someone else replies:</p><blockquote><p><em>&#8220;No&#8230; that&#8217;s exactly you.&#8221;</em></p></blockquote><p>That tension is the real data.</p><p></p><h3><strong>This Stopped Being About Movies</strong></h3><p>At some point, I realized this wasn&#8217;t really about 
film taste.</p><p>It was about <strong>identity compression</strong>.</p><p>We reduce ourselves to:</p><ul><li><p>playlists</p></li><li><p>bios</p></li><li><p>pinned posts</p></li><li><p>top fours</p></li></ul><p>And we hope those fragments explain us honestly.</p><p>Sometimes they do.<br>Sometimes they expose the gap between who we are and who we want to appear to be.</p><p>Four films sit right in that gap.</p><p></p><h3><strong>Try This Properly (Not Casually)</strong></h3><p>If you&#8217;re curious, don&#8217;t do this alone.</p><p>Run your Top 4 through the tool here &#8594; <a href="https://top4theory.vercel.app/">Top4Theory</a></p><p>Screenshot the result.<br>Send it to someone who knows your taste well.</p><p>Then listen to their reaction.</p><p>The most interesting feedback doesn&#8217;t come from the algorithm.</p><p>It comes from the person who says:</p><blockquote><p><em>&#8220;That&#8217;s you.&#8221;</em></p></blockquote><p>Or worse:</p><blockquote><p><em>&#8220;That&#8217;s who you want to be.&#8221;</em></p></blockquote><p></p><h3><strong>A Small Ask (If This Made You Pause)</strong></h3><p>If this made you rethink your Top 4 even slightly,<br><strong>like this post</strong> so more people see it.</p><p>And if you enjoy thoughtful experiments at the intersection of culture, identity, and the internet,<br><strong>subscribe</strong>. I&#8217;m building more things like this. 
(Next up: a Letterboxd + Serializd project, if I get enough time to build it.)</p><p>Because sometimes, four films are enough to tell a story we didn&#8217;t mean to tell.</p><p>If you&#8217;re curious about how it works, the project is open source on <a href="https://github.com/siddhraj1412/top4theory">GitHub</a>.</p><p><em>Thanks for reading.</em></p>]]></content:encoded></item><item><title><![CDATA[The Pythagorean Theorem Can Predict Premier League Tables Better Than You Think (and other mind-blowing football analytics tricks I just discovered)]]></title><description><![CDATA[No, I didn&#8217;t build any models &#8212; I just fell down a rabbit hole and now I can&#8217;t watch matches the same way.]]></description><link>https://siddhraj.substack.com/p/the-pythagorean-theorem-can-predict</link><guid isPermaLink="false">https://siddhraj.substack.com/p/the-pythagorean-theorem-can-predict</guid><dc:creator><![CDATA[Siddhraj Thakor]]></dc:creator><pubDate>Mon, 01 Dec 2025 08:02:56 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/50ca7b60-7007-48e4-8ddb-47d1a1ff0779_736x736.jpeg" length="0" 
type="image/jpeg"/><content:encoded><![CDATA[<p>I&#8217;m not a data scientist. I don&#8217;t code betting models. I just watch way too much football.</p><p>But last week I stumbled across two analytics concepts that are so simple yet so powerful that they&#8217;ve completely changed how I look at the scoreboard. Here they are.</p><h3>1. There&#8217;s a Pythagorean Theorem for Football (and it actually works)</h3><p>You know a&#178; + b&#178; = c&#178; from school, right?</p><p>In sports analytics, there&#8217;s a formula that looks a lot like it:</p><p><strong>Expected Win % = Goals Scored&#178; / (Goals Scored&#178; + Goals Conceded&#178;)</strong></p><p>It was originally created for baseball by Bill James in the 1980s, but someone tried it on football&#8230; and it&#8217;s scarily accurate.</p><p>I saw a table someone posted for the current 2025&#8211;26 Premier League season (14 weeks in as of December 1):</p><p>&#8226; Arsenal are top and exactly where Pythagorean says they should be<br>&#8226; Man City are 5 points ahead of expectation (they always come back, but who knows)<br>&#8226; Manchester United are 9 points behind what their goals suggest (hope they come back soon)<br>&#8226; Chelsea are 6 points short<br>&#8226; Sunderland, surprisingly, are 8 points short</p><p>The correlation between this simple formula and actual points is usually 92&#8211;95% by the end of a season across Europe&#8217;s top 5 leagues. All from just goals scored and conceded.</p><p>Mind. Blown.</p><p>People even have fancier versions using expected goals (xG) instead of actual goals, and it gets even more accurate. Apparently xG Pythagorean is one of the best single-number team-strength metrics out there.</p><p></p><blockquote><p><a href="https://www.kaggle.com/datasets/siddhrajthakor/premier-league-table-202526/">Table on Kaggle</a></p><p><a href="https://www.kaggle.com/code/siddhrajthakor/tutorial">Notebook</a></p></blockquote><h3>2. 
Ordered Logistic Regression: The secret weapon for predicting Win-Draw-Lose</h3><p>The second thing that made me pause Netflix and open 17 tabs:</p><p>Most basic prediction models treat Win / Draw / Lose as three completely separate outcomes. But they&#8217;re not&#8212;they&#8217;re ordered:</p><p>Home win &#8594; Draw &#8594; Away win</p><p>There&#8217;s a statistical model called <strong>ordered logistic regression</strong> (or proportional odds model) that was literally designed for exactly this kind of ordered outcome.</p><p>Instead of learning three different equations, it learns one underlying &#8220;team strength&#8221; score and then two cutoff points that separate win/draw/lose.</p><p>Analysts who use this (especially the really sharp ones on Twitter and in betting companies) say it&#8217;s particularly good at:<br>&#8226; Not underestimating draws like most models do<br>&#8226; Giving cleaner, more realistic probabilities when teams are evenly matched</p><p>I&#8217;ve now seen dozens of side-by-side comparisons where the ordered logistic model beats regular multinomial logistic regression on accuracy and log-loss&#8212;especially in leagues with lots of draws (Ligue 1, Serie A, etc.).</p><h3>So What?</h3><p>I haven&#8217;t built any of this myself (yet). 
I probably never will&#8212;I&#8217;m lazy.</p><p>But just knowing these two things exist has ruined normal football watching for me in the best possible way.</p><p>Now when a team is 12th but their Pythagorean expected points say they should be 6th, I notice.<br>When the commentator says &#8220;this should be a home banker&#8221; but the ordered logit probabilities show 42% home, 31% draw, 27% away&#8230; I smirk a little.</p><p>Football is still random and beautiful and infuriating.<br>But underneath all the chaos, there&#8217;s this weird layer of math that actually works.</p><p>And now I can&#8217;t unsee it.</p><p>If you&#8217;re like me&#8212;a nerdy fan who loves when sports and numbers collide&#8212;go Google &#8220;Pythagorean expectation football&#8221; and &#8220;ordered logistic regression soccer&#8221; right now.</p><p>You&#8217;ll thank me when you&#8217;re annoying your friends at the pub with &#8220;actually, according to the proportional odds model&#8230;&#8221;</p><div><hr></div><p>Hit reply and tell me: which team is the most &#8220;unlucky&#8221; according to Pythagorean right now? I need to know I&#8217;m not the only one obsessed.</p><p>P.S. 
If you want more laid-back football + numbers posts like this (roughly once a week, always free), subscribe below&#8212;zero pressure!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://siddhraj.substack.com/p/the-pythagorean-theorem-can-predict/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://siddhraj.substack.com/p/the-pythagorean-theorem-can-predict/comments"><span>Leave a comment</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://siddhraj.substack.com/p/the-pythagorean-theorem-can-predict?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://siddhraj.substack.com/p/the-pythagorean-theorem-can-predict?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://siddhraj.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://siddhraj.substack.com/subscribe?"><span>Subscribe now</span></a></p><div class="directMessage button" data-attrs="{&quot;userId&quot;:390152009,&quot;userName&quot;:&quot;Siddhraj Thakor&quot;,&quot;canDm&quot;:null,&quot;dmUpgradeOptions&quot;:null,&quot;isEditorNode&quot;:true}" data-component-name="DirectMessageToDOM"></div><div class="community-chat" data-attrs="{&quot;url&quot;:&quot;https://open.substack.com/pub/siddhraj/chat?utm_source=chat_embed&quot;,&quot;subdomain&quot;:&quot;siddhraj&quot;,&quot;pub&quot;:{&quot;id&quot;:6433167,&quot;name&quot;:&quot;Siddhraj&#8217;s 
Substack&quot;,&quot;author_name&quot;:&quot;Siddhraj Thakor&quot;,&quot;author_photo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!pGZ3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc51c79b4-fe58-4770-beab-bb37c97a61bd_736x736.jpeg&quot;}}" data-component-name="CommunityChatRenderPlaceholder"></div><p></p>]]></content:encoded></item><item><title><![CDATA[Join my new subscriber chat]]></title><description><![CDATA[A private space for us to converse and connect]]></description><link>https://siddhraj.substack.com/p/join-my-new-subscriber-chat</link><guid isPermaLink="false">https://siddhraj.substack.com/p/join-my-new-subscriber-chat</guid><dc:creator><![CDATA[Siddhraj Thakor]]></dc:creator><pubDate>Thu, 06 Nov 2025 08:21:15 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!gD3a!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd63a7195-4e72-4238-af8e-a9a6ee468648_414x414.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Today I&#8217;m announcing a brand new addition to my Substack publication: Siddhraj&#8217;s Substack subscriber chat.</p><p>This is a conversation space exclusively for subscribers&#8212;kind of like a group chat or live hangout. 
I&#8217;ll post questions and updates that come my way, and you can jump into the discussion.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://open.substack.com/pub/siddhraj/chat&quot;,&quot;text&quot;:&quot;Join chat&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://open.substack.com/pub/siddhraj/chat"><span>Join chat</span></a></p><div><hr></div><h2>How to get started</h2><ol><li><p><strong>Get the Substack app by clicking <a href="https://substack.com/app/app-store-redirect">this link</a> or the button below.</strong> New chat threads won&#8217;t be sent via email, so turn on push notifications so you don&#8217;t miss the conversation as it happens. You can also access chat <a href="https://open.substack.com/pub/siddhraj/chat">on the web</a>.</p></li></ol><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://substack.com/app/app-store-redirect&quot;,&quot;text&quot;:&quot;Get app&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://substack.com/app/app-store-redirect"><span>Get app</span></a></p><ol start="2"><li><p><strong>Open the app and tap the Chat icon.</strong> It looks like two bubbles in the bottom bar, and you&#8217;ll see a row for my chat inside.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KYZT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0f63c9a-2296-4c96-a2f9-52648999bb00_2000x1000.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!KYZT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0f63c9a-2296-4c96-a2f9-52648999bb00_2000x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KYZT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0f63c9a-2296-4c96-a2f9-52648999bb00_2000x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KYZT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0f63c9a-2296-4c96-a2f9-52648999bb00_2000x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KYZT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0f63c9a-2296-4c96-a2f9-52648999bb00_2000x1000.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KYZT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0f63c9a-2296-4c96-a2f9-52648999bb00_2000x1000.jpeg" width="1456" height="728" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e0f63c9a-2296-4c96-a2f9-52648999bb00_2000x1000.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:728,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:241528,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://kylewarrentest.substack.com/i/114198534?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0f63c9a-2296-4c96-a2f9-52648999bb00_2000x1000.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KYZT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0f63c9a-2296-4c96-a2f9-52648999bb00_2000x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KYZT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0f63c9a-2296-4c96-a2f9-52648999bb00_2000x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KYZT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0f63c9a-2296-4c96-a2f9-52648999bb00_2000x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KYZT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0f63c9a-2296-4c96-a2f9-52648999bb00_2000x1000.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ol start="3"><li><p><strong>That&#8217;s it!</strong> Jump into my thread to say hi, and if you have any issues, check out <a href="https://support.substack.com/hc/en-us/sections/360007461791-Frequently-Asked-Questions">Substack&#8217;s FAQ</a>.</p></li></ol><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://siddhraj.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Siddhraj&#8217;s Substack is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[How I Built My Portfolio with AI (And a Free Checklist to Start Yours!)]]></title><description><![CDATA[From Coding Chaos to a Live Site in Days&#8212;Here&#8217;s How You Can Do It Too]]></description><link>https://siddhraj.substack.com/p/how-i-built-my-portfolio-with-ai</link><guid isPermaLink="false">https://siddhraj.substack.com/p/how-i-built-my-portfolio-with-ai</guid><dc:creator><![CDATA[Siddhraj Thakor]]></dc:creator><pubDate>Wed, 15 Oct 2025 20:06:14 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!9Xg1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff194393c-580f-4339-9f57-df68b0d13fd3_1100x522.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey there, awesome subscribers!</p><p>Siddhraj here, your fellow data nerd from India. If you&#8217;re one of my 7 amazing free subscribers&#8212;huge shoutout to you! &#129782;</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://siddhraj.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Siddhraj&#8217;s Substack is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>You&#8217;re the reason I&#8217;m fired up to share my journey of building my portfolio website (<a href="https://siddhraj-portfolio.vercel.app/">siddhraj-portfolio.vercel.app</a>). I was a mess of nerves trying to showcase my GitHub projects&#8212;like my football analytics dashboard and WhatsApp Chat Analyzer&#8212;until AI swooped in like a superhero. This is the story of how I went from &#8220;I have no idea what I&#8217;m doing&#8221; to a live portfolio that screams <em>me</em>, all without losing my sanity. Stick around for the full tale, a free portfolio checklist, and tips to start your own. If you love it, share it with a friend&#8212;it&#8217;ll help grow our little Substack crew!</p><p></p><h2>The Panic: Where Do I Even Start?</h2><p>A few months back, I realized my resume wasn&#8217;t cutting it. I had cool projects on <a href="https://github.com/siddhraj1412">GitHub</a>&#8212;think ML models and data viz&#8212;but no way to show them off to recruiters or clients. I needed a portfolio website, but the options overwhelmed me. Should I code from scratch? Use a template? What language? I felt like I was picking a Netflix show with 10 seconds left on my trial.</p><p>Here&#8217;s the rundown of tools I considered, so you can skip my trial-and-error:</p><ul><li><p><strong>HTML/CSS/JavaScript</strong>: The OG stack. HTML for structure, CSS for looks, JS for flair. It&#8217;s simple but time-consuming to make it modern. 
My first attempt looked like it belonged in 2005&#8212;not a vibe.</p></li><li><p><strong>React.js</strong>: My eventual pick. It&#8217;s like building with Lego&#8212;components for each section (About, Projects, etc.). Great for dynamic portfolios, but you need some JS chops.</p></li><li><p><strong>Next.js</strong>: React&#8217;s fancier sibling with SEO and speed baked in. Perfect if you want a blog alongside your portfolio (maybe for my football analytics posts?). A bit much for beginners, though.</p></li><li><p><strong>Tailwind CSS or Bootstrap</strong>: CSS frameworks to style without pain. Tailwind&#8217;s utility classes let you customize fast; Bootstrap&#8217;s pre-built components are plug-and-play. I went Tailwind for flexibility.</p></li></ul><p>I started with vanilla JS, cobbling together a page. It worked, but the design? Yawn city. I wanted something that popped, so I turned to templates.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GGcm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F857c98e3-7cf9-4024-835a-804f4c3d6b86_1100x533.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GGcm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F857c98e3-7cf9-4024-835a-804f4c3d6b86_1100x533.webp 424w, https://substackcdn.com/image/fetch/$s_!GGcm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F857c98e3-7cf9-4024-835a-804f4c3d6b86_1100x533.webp 848w, 
https://substackcdn.com/image/fetch/$s_!GGcm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F857c98e3-7cf9-4024-835a-804f4c3d6b86_1100x533.webp 1272w, https://substackcdn.com/image/fetch/$s_!GGcm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F857c98e3-7cf9-4024-835a-804f4c3d6b86_1100x533.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GGcm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F857c98e3-7cf9-4024-835a-804f4c3d6b86_1100x533.webp" width="1100" height="533" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/857c98e3-7cf9-4024-835a-804f4c3d6b86_1100x533.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:533,&quot;width&quot;:1100,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:12464,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://siddhraj.substack.com/i/176267722?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F857c98e3-7cf9-4024-835a-804f4c3d6b86_1100x533.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!GGcm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F857c98e3-7cf9-4024-835a-804f4c3d6b86_1100x533.webp 424w, 
https://substackcdn.com/image/fetch/$s_!GGcm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F857c98e3-7cf9-4024-835a-804f4c3d6b86_1100x533.webp 848w, https://substackcdn.com/image/fetch/$s_!GGcm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F857c98e3-7cf9-4024-835a-804f4c3d6b86_1100x533.webp 1272w, https://substackcdn.com/image/fetch/$s_!GGcm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F857c98e3-7cf9-4024-835a-804f4c3d6b86_1100x533.webp 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><h2>Templates: A Love-Hate Story</h2><p>Templates seemed like the answer: Free, pre-built designs on GitHub you can tweak and deploy. I dug through repos and found some solid ones:</p><ul><li><p><strong>DeveloperFolio by saadpasta</strong>: React-based, with spots for projects, skills, and a blog. Clean and dev-friendly.</p></li><li><p><strong>MasterPortfolio by ashutosh1919</strong>: Uses Material-UI, has dark mode, and feels pro. Easy to tweak via JSON.</p></li><li><p><strong>Gatsby Starters</strong>: For React fans, these are minimal and SEO-ready. Great if you want to write portfolio blogs.</p></li><li><p><strong>GitHub&#8217;s Portfolio Topic</strong>: Search &#8220;portfolio-template&#8221; on GitHub&#8212;hundreds of options, from HTML to Next.js.</p></li></ul><p>But here&#8217;s the drama: Some templates were <em>too</em> perfect, so polished that people would assume I hadn&#8217;t built the site myself. Others didn&#8217;t match my data-heavy style&#8212;too flashy, not enough substance. Customizing them took hours, and I still felt like I was borrowing someone else&#8217;s vibe. I wanted my portfolio to be <em>mine</em>, not a fork of someone else&#8217;s repo.</p><p>Enter YouTube, my late-night savior.</p><p></p><h2>AI Magic: Discovering Tempo.new</h2><p>One restless night, I stumbled on a YouTube video: &#8220;Build a Portfolio with AI in Minutes.&#8221; It was a game-changer. The creator showed WordPress for drag-and-drop ease, but then dropped a bomb: AI tools like <a href="https://www.tempo.new/">Tempo.new</a>. Think of it as a code generator, design tool, and previewer in one. Free tier gives you 30 prompts/month; Pro ($30/month) unlocks unlimited.</p><p>I hopped on Tempo and typed this prompt:</p><blockquote><p>I want to create my own portfolio. 
The given link gives my GitHub which has all my work and other link gives my LinkedIn which gives other information. Give me full step-by-step guide from scratch: <a href="https://github.com/siddhraj1412">https://github.com/siddhraj1412</a> <a href="https://www.linkedin.com/in/siddhraj-thakor">https://www.linkedin.com/in/siddhraj-thakor</a><br>I want to use Vercel for deployment so give codes from scratch. I don&#8217;t want to git clone other person&#8217;s portfolio so give codes by that. Do one thing: just give structure of the portfolio with the doable code where I can add things by my way about the parts. After building and deploying the portfolio and want to build it using npm create vite@latest</p></blockquote><p>Tempo churned out a Vite + React setup: Hero section, Projects linked to my GitHub, Skills from LinkedIn, and a Contact form. It wasn&#8217;t fully filled (smart move&#8212;no fake data), but gave me components to customize. I hit preview, saw a clean layout, and downloaded the code.</p><p></p><h2>From Code to Live Site: The Fun Part</h2><p>Unzipping the files in VS Code, I ran <code>npm install</code> to grab dependencies, then <code>npm run dev</code>&#8212;boom, my portfolio was live on localhost:5173. The design needed work, so I tweaked with Tempo&#8217;s prompts, like &#8220;Add a Tailwind gradient header.&#8221; Free tier capped me at 5-6 prompts, so I turned to other AIs:</p><ul><li><p><strong>ChatGPT</strong>: Quick for CSS fixes, like &#8220;Make this nav responsive.&#8221;</p></li><li><p><strong>Grok (xAI)</strong>: Fun for creative ideas, like &#8220;Suggest a hover effect.&#8221;</p></li><li><p><strong>Claude or Gemini</strong>: Great for deeper code logic.</p></li><li><p><strong>Blackbox AI</strong>: Nailed specific tweaks, like &#8220;Fix this card layout.&#8221;</p></li></ul><p>I edited files in <code>/src/components/</code>&#8212;swapped text in About.jsx, added my football analytics project to Portfolio.jsx. 
For design, I&#8217;d prompt: &#8220;Add Framer Motion animations to this section.&#8221; Pro tip: Be specific with prompts, like &#8220;Use Tailwind for a blue gradient and animate on scroll.&#8221;</p><p>Deployment was a breeze: Pushed to GitHub, connected to Vercel, and my site was live at <a href="https://siddhraj-portfolio.vercel.app/">siddhraj-portfolio.vercel.app</a>. It&#8217;s not perfect, but it&#8217;s 100% Siddhraj.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9Xg1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff194393c-580f-4339-9f57-df68b0d13fd3_1100x522.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9Xg1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff194393c-580f-4339-9f57-df68b0d13fd3_1100x522.webp 424w, https://substackcdn.com/image/fetch/$s_!9Xg1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff194393c-580f-4339-9f57-df68b0d13fd3_1100x522.webp 848w, https://substackcdn.com/image/fetch/$s_!9Xg1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff194393c-580f-4339-9f57-df68b0d13fd3_1100x522.webp 1272w, https://substackcdn.com/image/fetch/$s_!9Xg1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff194393c-580f-4339-9f57-df68b0d13fd3_1100x522.webp 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!9Xg1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff194393c-580f-4339-9f57-df68b0d13fd3_1100x522.webp" width="1100" height="522" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f194393c-580f-4339-9f57-df68b0d13fd3_1100x522.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:522,&quot;width&quot;:1100,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:20112,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://siddhraj.substack.com/i/176267722?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff194393c-580f-4339-9f57-df68b0d13fd3_1100x522.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!9Xg1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff194393c-580f-4339-9f57-df68b0d13fd3_1100x522.webp 424w, https://substackcdn.com/image/fetch/$s_!9Xg1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff194393c-580f-4339-9f57-df68b0d13fd3_1100x522.webp 848w, https://substackcdn.com/image/fetch/$s_!9Xg1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff194393c-580f-4339-9f57-df68b0d13fd3_1100x522.webp 1272w, 
https://substackcdn.com/image/fetch/$s_!9Xg1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff194393c-580f-4339-9f57-df68b0d13fd3_1100x522.webp 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><h2>Your Free Portfolio Checklist</h2><p>To help you start, here&#8217;s the checklist I wish I had:</p><ul><li><p><strong>Set Your Goal</strong>: Job portfolio? Freelance showcase? Mine was to highlight my GitHub projects for tech roles.</p></li><li><p><strong>Pick Your Tools</strong>: Start with React + Tailwind for flexibility. 
Vite&#8217;s a solid way to scaffold the project.</p></li><li><p><strong>Curate Content</strong>: Choose 3&#8211;5 projects (e.g., my <a href="https://github.com/siddhraj1412">WhatsApp Chat Analyzer</a>). Add the stories behind them.</p></li><li><p><strong>Use AI Smartly</strong>: Try Tempo.new for structure, then refine with free AIs like Grok or Claude.</p></li><li><p><strong>Deploy Fast</strong>: Vercel&#8217;s your friend&#8212;free and easy.</p></li><li><p><strong>Test and Tweak</strong>: Check mobile view, get feedback, iterate.</p></li></ul><p>Want this as a PDF? Reply to this email, and I&#8217;ll send it your way!</p><p></p><h2>Why This Matters (And Why You Should Join Me)</h2><p>Building this portfolio taught me that you don&#8217;t need to be a design wizard or coding guru. AI tools level the playing field, but it&#8217;s your story&#8212;your projects, your hustle&#8212;that makes it shine. I&#8217;m sharing this because I want you to skip my stress and build something you&#8217;re proud of.</p><p>Got a portfolio idea? Hit reply and tell me about it&#8212;I read every email. Loved this post? Share it with a friend who&#8217;s stuck like I was, and let&#8217;s grow this Substack fam! Subscribe for free to get weekly tips on AI, coding, and data projects (next up: how I used my portfolio to land freelance gigs). Check out my <a href="https://github.com/siddhraj1412">GitHub</a> or <a href="https://www.linkedin.com/in/siddhraj-thakor">LinkedIn</a> for more. Made with love and a lot of coffee&#8212;let&#8217;s build something epic together! &#10084;&#65039;</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://siddhraj.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Siddhraj&#8217;s Substack is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Can Data Uncover Football’s Next Superstar? A Deep Dive into FM23 Wonderkids ⚽📊]]></title><description><![CDATA[Just A guy doing this work to make himself disciplined.]]></description><link>https://siddhraj.substack.com/p/can-data-uncover-footballs-next-superstar</link><guid isPermaLink="false">https://siddhraj.substack.com/p/can-data-uncover-footballs-next-superstar</guid><dc:creator><![CDATA[Siddhraj Thakor]]></dc:creator><pubDate>Thu, 09 Oct 2025 02:29:14 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!8HXP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9711a5fe-9973-44b9-9e95-58a3fef348d7_933x366.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Why I Did This</h2><p>Football Manager (FM) isn&#8217;t just a game&#8212;it&#8217;s a treasure trove of scouting data masquerading as entertainment. As a lifelong fan of both football and data, I&#8217;ve always wondered: <em>Could FM&#8217;s data help us spot the next Erling Haaland or Kylian Mbapp&#233; before they hit the big stage?</em></p><p>Last week, I rolled up my sleeves and dove into the FM23 dataset to find out. This isn&#8217;t just a data project&#8212;it&#8217;s a football adventure where stats meet scouting, and I&#8217;m taking you along for the ride. Stick with me, and you&#8217;ll see how spreadsheets can uncover the game&#8217;s hidden gems. 
&#128640;</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://siddhraj.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Siddhraj&#8217;s Substack is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2>Chapter 1: The Data &#8212; Building a Football Universe</h2><p>To kick things off, I scraped the FM23 database, a goldmine of player info&#8212;think attributes like dribbling, passing, and pace, plus potential ratings, positions, and more. It&#8217;s every football nerd&#8217;s dream spreadsheet, with over 100,000 players and 50+ attributes per player, from agility to work rate.</p><p>Here&#8217;s the setup:</p><ul><li><p><strong>Data Source</strong>: Scraped from the FM23 database, a realistic simulation of real-world football scouting. (Pro tip: FM&#8217;s attributes are tuned to mirror actual player evals from scouts&#8212;technical, physical, mental all in one!)</p></li><li><p><strong>Preprocessing</strong>: I tackled duplicates, filled in missing values (like incomplete stamina or vision stats), and standardized attributes to ensure consistency. No more wonky ranges throwing off my rankings!</p></li><li><p><strong>Realism</strong>: FM&#8217;s data mirrors real scouting datasets closely, with attributes reflecting what scouts prioritize&#8212;technical skills, physical traits, and mental sharpness. 
It&#8217;s not perfect (hello, game biases), but it&#8217;s damn close for simulation fun.</p></li></ul><p>This dataset felt like stepping into a virtual scouting room. But raw data is messy, and I had to clean it up before the real fun began. &#129529;</p><div><hr></div><h2>Chapter 2: Cleaning the Mess</h2><p>If you&#8217;ve ever worked with data, you know it&#8217;s rarely clean. The FM23 dataset was no exception&#8212;missing attributes, inconsistent ranges (e.g., &#8220;10-15&#8221; for passing), and outliers galore. It was like sorting through a football club&#8217;s transfer records after a chaotic deadline day. &#128517;</p><p>Here&#8217;s what I did:</p><ul><li><p><strong>Handled Missing Data</strong>: Filled in gaps using median values for attributes like stamina or tackling, ensuring no player was unfairly excluded.</p></li><li><p><strong>Standardized Ranges</strong>: Converted attribute ranges to the mean of their endpoints (so &#8220;10-15&#8221; becomes 12.5) for consistency&#8212;simple but game-changing.</p></li><li><p><strong>Exploratory Data Analysis (EDA)</strong>: I dug into transfer values and player demographics. Surprisingly, Colombia topped the charts for youth players, with a staggering number of prospects flooding the dataset. Even more shocking? Midfielders, not attackers, had the highest average transfer values&#8212;a twist I didn&#8217;t see coming. (Who knew the engine room was worth more than the fireworks up top? 
&#129327;) Plots showed clear spikes: Colombia&#8217;s youth pipeline is a goldmine, and midfielders&#8217; versatility jacks up their price tags.</p></li></ul><p>These insights set the stage for the real question: <em>Who are the wonderkids hiding in this data?</em> Let&#8217;s scout &#8217;em out.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8HXP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9711a5fe-9973-44b9-9e95-58a3fef348d7_933x366.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8HXP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9711a5fe-9973-44b9-9e95-58a3fef348d7_933x366.png 424w, https://substackcdn.com/image/fetch/$s_!8HXP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9711a5fe-9973-44b9-9e95-58a3fef348d7_933x366.png 848w, https://substackcdn.com/image/fetch/$s_!8HXP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9711a5fe-9973-44b9-9e95-58a3fef348d7_933x366.png 1272w, https://substackcdn.com/image/fetch/$s_!8HXP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9711a5fe-9973-44b9-9e95-58a3fef348d7_933x366.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8HXP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9711a5fe-9973-44b9-9e95-58a3fef348d7_933x366.png" width="933" height="366" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9711a5fe-9973-44b9-9e95-58a3fef348d7_933x366.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:366,&quot;width&quot;:933,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42372,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://siddhraj.substack.com/i/175602084?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9711a5fe-9973-44b9-9e95-58a3fef348d7_933x366.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!8HXP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9711a5fe-9973-44b9-9e95-58a3fef348d7_933x366.png 424w, https://substackcdn.com/image/fetch/$s_!8HXP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9711a5fe-9973-44b9-9e95-58a3fef348d7_933x366.png 848w, https://substackcdn.com/image/fetch/$s_!8HXP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9711a5fe-9973-44b9-9e95-58a3fef348d7_933x366.png 1272w, https://substackcdn.com/image/fetch/$s_!8HXP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9711a5fe-9973-44b9-9e95-58a3fef348d7_933x366.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Chapter 3: The Scouting Techniques</h2><p>Time to hunt for wonderkids! Using the FM23 dataset, I applied three unique methods to mimic real-world scouting, each uncovering a different flavor of talent. These approaches will make you rethink how you scout in Football Manager&#8212;and maybe even spot real-world stars. Let&#8217;s dive in; the results are worth sticking around for! &#128526;</p><h3>Method 1: Media Description &#8211; &#8220;Wonderkid&#8221;</h3><p>In FM23, the &#8220;wonderkid&#8221; media description is the holy grail. These players are already tipped for greatness&#8212;think Haaland-level hype with price tags to match. I filtered for players labeled &#8220;wonderkid,&#8221; zeroing in on their attributes and market values. These are the elite prospects, but their hefty costs reflect their reputation. 
Curious why they&#8217;re worth the buzz? Check my <a href="https://www.kaggle.com/code/siddhrajthakor/scouting-wonderkids-using-different-methods">Kaggle notebook</a> for the stats that make them shine.</p><h3>Method 2: Media Description &#8211; &#8220;Promising&#8221;</h3><p>Not every star is a household name yet. Players tagged as &#8220;promising&#8221; in FM23 are the hidden gems&#8212;underrated talents with massive potential. I sifted through this group, focusing on high Potential Ability (PA) paired with lower Current Ability (CA) to find players with room to grow. These are the budget-friendly picks, perfect for your FM save or for spotting real-world bargains before they explode. &#128142;</p><h3>Method 3: Young International Stars</h3><p>Nothing screams &#8220;future legend&#8221; like a teenager already earning caps for their national team. I targeted players under 21 who&#8217;ve represented their countries, whether powerhouses like England or other nations like the USA. These young internationals often have attributes that rival seasoned pros, with maturity beyond their years. This method unearthed some wild surprises from unexpected corners of the globe. &#127757;</p><p>The names I found will blow you away&#8212;some are already stars, others are waiting to shine. Keep reading to see who made the cut! (P.S. The notebook&#8217;s bar charts and scatter plots bring these players to life!)</p><h2>Chapter 4: The Results &#8212; Who Are the Wonderkids?</h2><p>Running these three methods on the FM23 dataset was like opening a treasure chest. I filtered for under-21 players with sky-high potential (PA 150+) and cross-checked them against the &#8220;wonderkid&#8221; and &#8220;promising&#8221; labels, plus young international status. The results? A mix of household names and under-the-radar talents that had me double-checking the data. 
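</p><p>As a rough sketch of that cross-check in pandas, the filter could look something like the block below. The column names (<em>Age</em>, <em>PA</em>, <em>Media Description</em>, <em>Caps</em>) and the demo rows are assumptions made up for illustration; they may not match the real export, and this is not the notebook&#8217;s actual code.</p><pre><code>import pandas as pd

# Made-up demo rows; column names are hypothetical and may differ from the real CSV.
players = pd.DataFrame(
    {
        "Name": ["Player A", "Player B", "Player C", "Player D"],
        "Age": [17, 19, 20, 24],
        "PA": [172, 155, 140, 180],
        "Media Description": [
            "Wonderkid",
            "Promising midfielder",
            "Promising striker",
            "Wonderkid",
        ],
        "Caps": [3, 0, 5, 40],
    }
)


def shortlist(df, max_age=21, min_pa=150):
    """Keep players younger than max_age with PA of at least min_pa,
    then count how many of the three scouting methods flag each one."""
    pool = df[df["Age"].lt(max_age)]
    pool = pool[pool["PA"].ge(min_pa)].copy()
    media = pool["Media Description"].str.lower()
    pool["is_wonderkid"] = media.str.contains("wonderkid", regex=False, na=False)
    pool["is_promising"] = media.str.contains("promising", regex=False, na=False)
    pool["is_international"] = pool["Caps"].ge(1)
    flags = ["is_wonderkid", "is_promising", "is_international"]
    pool["methods_matched"] = pool[flags].sum(axis=1)
    return pool.sort_values(["methods_matched", "PA"], ascending=False)


result = shortlist(players)
print(result[["Name", "PA", "methods_matched"]])
</code></pre><p>Counting how many methods flag a player is only one way to combine the three filters; the age cap and PA threshold are parameters so they are easy to change.</p><p>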
Some of these kids are already making waves IRL, while others are the kind you&#8217;d snap up in FM for pennies before they&#8217;re worth millions.</p><p>Check my <a href="https://www.kaggle.com/code/siddhrajthakor/scouting-wonderkids-using-different-methods">Kaggle notebook</a> for the full list and visuals&#8212;bar charts highlight their standout attributes, and CA vs. PA scatter plots show who&#8217;s ready to skyrocket. I was stunned by how some &#8220;promising&#8221; players outshone bigger names in potential, while young internationals brought veteran-level composure to the table. Trust me, you&#8217;ll want to see these names for yourself! &#128562;</p><p>(The data is three years old, so don&#8217;t judge the output.)</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PEAW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6fbd99-3380-4b3c-a9e4-8258e8864722_636x713.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PEAW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6fbd99-3380-4b3c-a9e4-8258e8864722_636x713.png 424w, https://substackcdn.com/image/fetch/$s_!PEAW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6fbd99-3380-4b3c-a9e4-8258e8864722_636x713.png 848w, https://substackcdn.com/image/fetch/$s_!PEAW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6fbd99-3380-4b3c-a9e4-8258e8864722_636x713.png 1272w, 
https://substackcdn.com/image/fetch/$s_!PEAW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6fbd99-3380-4b3c-a9e4-8258e8864722_636x713.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PEAW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6fbd99-3380-4b3c-a9e4-8258e8864722_636x713.png" width="636" height="713" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5e6fbd99-3380-4b3c-a9e4-8258e8864722_636x713.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:713,&quot;width&quot;:636,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:119628,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://siddhraj.substack.com/i/175602084?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6fbd99-3380-4b3c-a9e4-8258e8864722_636x713.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!PEAW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6fbd99-3380-4b3c-a9e4-8258e8864722_636x713.png 424w, https://substackcdn.com/image/fetch/$s_!PEAW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6fbd99-3380-4b3c-a9e4-8258e8864722_636x713.png 848w, 
https://substackcdn.com/image/fetch/$s_!PEAW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6fbd99-3380-4b3c-a9e4-8258e8864722_636x713.png 1272w, https://substackcdn.com/image/fetch/$s_!PEAW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e6fbd99-3380-4b3c-a9e4-8258e8864722_636x713.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Chapter 5: Insights &amp; Surprises</h2><p>The real magic wasn&#8217;t just the names&#8212;it was how each method told a different 
story:</p><ul><li><p><strong>Wonderkid Filter</strong>: Favored players with polished skills and big reputations, like those already lighting up top leagues. Their high market values reflect the hype, but their stats back it up&#8212;perfect for big-budget FM clubs. &#128176;</p></li><li><p><strong>Promising Filter</strong>: Uncovered raw talents from smaller clubs, often with lower transfer fees but massive upside. These are the players you sign early and watch become legends.</p></li><li><p><strong>Young Internationals</strong>: Highlighted teens with freakish maturity, like kids bossing it for their national teams. Some came from unexpected nations, proving talent doesn&#8217;t always need a big stage to shine.</p></li></ul><p>This project showed me scouting isn&#8217;t one-size-fits-all, just like in real life. The Colombia youth boom from my EDA? That&#8217;s a goldmine for bargains. And midfielders topping transfer values? My read is that clubs are paying for their versatility. This is why data matters&#8212;it spots what the eye might miss. &#128293;</p><p>Want to find these gems yourself? My <a href="https://www.kaggle.com/code/siddhrajthakor/scouting-wonderkids-using-different-methods">Kaggle notebook</a> has all the code and visuals to start your own scouting mission.</p><h2>Final Thoughts: Why Football Data Matters</h2><p>This wasn&#8217;t just a data exercise&#8212;it was a love letter to football and the fans who dig deeper. FM&#8217;s dataset is a scouting tool, and analysis is the lens that brings it into focus. Who knows? The next big signing your club makes might just be hiding in this dataset, like Yamal, who&#8217;s already drawing Ballon d&#8217;Or whispers.</p><p>Want to uncover the next superstar yourself? 
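</p><p>One possible starting point, sketched in pandas: rank young &#8220;promising&#8221; players by their room to grow (PA minus CA). The column names and demo rows here are assumptions for illustration, not the real dataset headers or the notebook&#8217;s code.</p><pre><code>import pandas as pd


def rank_by_growth(df, label="promising", max_age=21):
    """Rank players younger than max_age whose media description
    contains label by their growth room (PA minus CA)."""
    young = df[df["Age"].lt(max_age)]
    media = young["Media Description"].str.lower()
    tagged = young[media.str.contains(label.lower(), regex=False, na=False)].copy()
    tagged["growth_room"] = tagged["PA"] - tagged["CA"]
    return tagged.sort_values("growth_room", ascending=False)


# Made-up rows for illustration only; not real players.
demo = pd.DataFrame(
    {
        "Name": ["Player E", "Player F", "Player G", "Player H"],
        "Age": [18, 20, 19, 27],
        "CA": [110, 130, 140, 120],
        "PA": [165, 150, 175, 160],
        "Media Description": [
            "Promising winger",
            "Promising defender",
            "Wonderkid",
            "Promising keeper",
        ],
    }
)

top = rank_by_growth(demo)
print(top[["Name", "CA", "PA", "growth_room"]])
</code></pre><p>Change the label, the age cap, or the sort column to try your own variation.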
</p><p>Dive into my <a href="https://www.kaggle.com/code/siddhrajthakor/scouting-wonderkids-using-different-methods">Kaggle notebook</a> and tweak the methods&#8212;your save could change forever.</p><h2>Outro</h2><p>If you enjoyed this football-meets-data journey, subscribe to my Substack for weekly stories where stats and passion collide. Next up, I&#8217;ll share tips on building a killer data portfolio&#8212;perfect for aspiring analysts.</p><p>Life&#8217;s been hectic, so thanks for sticking with me. Drop a comment or share your own wonderkid picks&#8212;I&#8217;d love to hear them! What&#8217;s your FM23 steal of the century? &#9917;&#128172;</p>]]></content:encoded></item><item><title><![CDATA[How I Turned Football Manager 2023 into a Kaggle Data Goldmine]]></title><description><![CDATA[From FM23 Scouting to Kaggle Gold: My Data Journey]]></description><link>https://siddhraj.substack.com/p/how-i-turned-football-manager-2023</link><guid isPermaLink="false">https://siddhraj.substack.com/p/how-i-turned-football-manager-2023</guid><dc:creator><![CDATA[Siddhraj Thakor]]></dc:creator><pubDate>Wed, 01 Oct 2025 19:51:35 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!EXCX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ec0821-abb4-4e83-b47c-647eae7debff_1000x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div><hr></div><h3>From Game Obsession to Data Triumph: My Journey to Create a Massive FM23 Player Dataset</h3><p>If you&#8217;re a Football Manager fan like me, you know the thrill of scouting the next virtual Messi or building a dream team from obscure leagues. But what if you could take that passion for FM23 and turn it into a data science adventure? That&#8217;s exactly what I did when I stumbled across an FM20 dataset on Kaggle and thought, &#8220;I have FM23&#8202;&#8212;&#8202;why not create my own dataset?&#8221; Spoiler: It wasn&#8217;t easy, but the result is a 70+ attribute, global player dataset now live on Kaggle, ready for you to explore. 
Here&#8217;s my story of trial, error, and eventual triumph&#8202;&#8212;&#8202;plus tips for you to try it yourself!</p><div><hr></div><h3>The Spark: Discovering the FM20 Dataset</h3><p>It all started when I found a Football Manager 2020 dataset on Kaggle. Thousands of players, stats like Pace, Dribbling, and Transfer Value&#8202;&#8212;&#8202;it was a goldmine for analyzing virtual football talent. As a data nerd and FM23 addict, I wondered: <em>Could I create something like this for FM23?</em> With nearly 89,000 players in the game, I saw a chance to build something epic for gamers and data scientists alike. Little did I know, my PC and I were in for a wild ride.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://siddhraj.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Siddhraj&#8217;s Substack! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><h3>The First Attempt: A PC-Breaking Disaster</h3><p>Eager to start, I dove into FM23&#8217;s scouting menu. I went to <em>Scouting &gt; Players in Range</em>, customized my view, and added every column I could think of&#8202;&#8212;&#8202;UID, Name, Position, Pace, Finishing, Determination, Transfer Value, you name it. 
With 89,000 players loaded, I thought, &#8220;This is it!&#8221; Following some AI advice, I tried <em>Ctrl+A</em> to select all, then <em>Ctrl+P</em> to print the table as a webpage, hoping to extract an HTML file and convert it to CSV.</p><p>Big mistake. My PC groaned, FM23 froze, and the game crashed so hard I had to restart my computer. Lesson learned: 89,000 players&#8217; worth of data is <em>not</em> something my laptop could handle in one go. I needed a better plan.</p><div><hr></div><h3>Plan B: Terminal Commands and Third-Party Tools</h3><p>Frustrated but undeterred, I tried a new approach: terminal commands. I figured I could export data via scripts or game files. Nope&#8202;&#8212;&#8202;FM23&#8217;s data structure wasn&#8217;t that friendly, and my command-line skills hit a wall. Next, I tested third-party tools designed for game data extraction. They worked&#8230; sort of. The problem? They didn&#8217;t capture all the columns I wanted, like Media Description or Injury Proneness, which are gold for analytics. 
I was stuck again, and my dream dataset felt further away than ever.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EXCX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ec0821-abb4-4e83-b47c-647eae7debff_1000x600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EXCX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ec0821-abb4-4e83-b47c-647eae7debff_1000x600.png 424w, https://substackcdn.com/image/fetch/$s_!EXCX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ec0821-abb4-4e83-b47c-647eae7debff_1000x600.png 848w, https://substackcdn.com/image/fetch/$s_!EXCX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ec0821-abb4-4e83-b47c-647eae7debff_1000x600.png 1272w, https://substackcdn.com/image/fetch/$s_!EXCX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ec0821-abb4-4e83-b47c-647eae7debff_1000x600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EXCX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ec0821-abb4-4e83-b47c-647eae7debff_1000x600.png" width="1000" height="600" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e4ec0821-abb4-4e83-b47c-647eae7debff_1000x600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:600,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EXCX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ec0821-abb4-4e83-b47c-647eae7debff_1000x600.png 424w, https://substackcdn.com/image/fetch/$s_!EXCX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ec0821-abb4-4e83-b47c-647eae7debff_1000x600.png 848w, https://substackcdn.com/image/fetch/$s_!EXCX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ec0821-abb4-4e83-b47c-647eae7debff_1000x600.png 1272w, https://substackcdn.com/image/fetch/$s_!EXCX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ec0821-abb4-4e83-b47c-647eae7debff_1000x600.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div><hr></div><h3>The Breakthrough: Batch Processing and HTML Conversion</h3><p>After multiple failures, I had an epiphany: break the problem into smaller chunks. Instead of grabbing all 89,000 players at once, I split the data by age groups:</p><ul><li><p>15&#8211;17-year-olds (young prospects)</p></li><li><p>18&#8211;20-year-olds (emerging talents)</p></li><li><p>21&#8211;25-year-olds (prime players)</p></li><li><p>26&#8211;30-year-olds (seasoned pros)</p></li><li><p>31+ years (veterans)</p></li></ul><p>For each group, I customized the scouting view, selected the players, and used <em>Ctrl+P</em> to save the table as an HTML file. This time, my PC didn&#8217;t crash! I ended up with five HTML files, each packed with player stats. Now, I needed to turn them into CSVs.</p><p>Enter ConvertCSV.com, an online tool for converting HTML tables to CSV. I uploaded each HTML file, selected the table I wanted (some had multiple tables), and waited. The website lagged&#8202;&#8212;&#8202;hard. 
My browser froze, and I thought I&#8217;d hit another dead end. But after a few nerve-wracking minutes, it worked! I had five CSV files, each with thousands of players and 70+ attributes.</p><div><hr></div><h3>The Final Step: Merging with Python</h3><p>With five CSVs in hand, I needed to combine them into one unified dataset. Python and pandas came to the rescue. Here&#8217;s the simple code I used (and you can too):</p><p>Code: <a href="https://www.kaggle.com/code/siddhrajthakor/merging-different-csv-to-get-final-dataset">Kaggle notebook for merging the CSVs into the final dataset</a></p><p>This created fm23_players.csv, a massive dataset covering players from every corner of the globe&#8202;&#8212;&#8202;England&#8217;s Premier Division to Argentina&#8217;s lower leagues, youth prospects to grizzled veterans. One problem surfaced along the way: the Position column appeared twice, and one copy was missing information, so I dropped the incomplete one. The final file includes everything from Acceleration to Transfer Value, ready for analysis.</p><div><hr></div><h3>Why This Dataset Is a Game-Changer</h3><p>After all that effort, I uploaded the dataset to Kaggle as <strong><a href="https://www.kaggle.com/datasets/siddhrajthakor/football-manager-2023-dataset">Scout Stars: Football Manager Player Data</a></strong>. Here&#8217;s why it&#8217;s worth checking out:</p><ul><li><p><strong>Massive Scope</strong>: Thousands of players from Europe, South America, Asia, Africa, and beyond.</p></li><li><p><strong>Rich Data</strong>: 70+ attributes, from Pace and Dribbling to Leadership and Injury Proneness.</p></li><li><p><strong>Versatile Uses</strong>: Perfect for scouting virtual stars, building ML models, or creating visualizations.</p></li><li><p><strong>Ready-to-Go</strong>: Cleaned, merged, and UTF-8 encoded for special characters (e.g., Jos&#233; Gonz&#225;lez).</p></li></ul><p>Whether you&#8217;re a Football Manager fan wanting to scout hidden gems or a data scientist 
analyzing player trends, this dataset has something for you.</p><div><hr></div><h3>Lessons Learned and Tips for You</h3><p>My journey wasn&#8217;t smooth, but it taught me a ton. If you want to create your own FM dataset, here&#8217;s what I&#8217;d recommend:</p><ol><li><p><strong>Break It Down</strong>: Don&#8217;t try to export all players at once. Use filters (e.g., age, league) to manage data in chunks.</p></li><li><p><strong>HTML to CSV</strong>: Tools like ConvertCSV.com are lifesavers, but be patient&#8202;&#8212;&#8202;they can lag with large files.</p></li><li><p><strong>Python for Merging</strong>: Use pandas&#8217; pd.concat to combine the CSVs, then remove duplicate players with df.drop_duplicates(subset=['UID']).</p></li><li><p><strong>Test Your PC</strong>: Ensure your computer can handle large exports. Close other apps to avoid crashes.</p></li><li><p><strong>Share on Kaggle</strong>: A great title (like &#8220;Scout Stars&#8221;) and thumbnail (football action + charts) can boost upvotes.</p></li></ol><div><hr></div><h3>Call to Action: Explore the Dataset!</h3><p>I poured hours of trial and error into this dataset, and now it&#8217;s live on Kaggle for you to dive into. Want to find the next virtual wonderkid? Build a model to predict transfer values? Or create a dashboard of global talent? <strong>Download Scout Stars: Football Manager Player Data</strong> and start exploring!</p><ul><li><p><strong>Try This</strong>: Filter for players under 18 with high Technique and Determination to scout future stars.</p></li><li><p><strong>Share Your Work</strong>: Build a notebook, visualization, or scouting report and post it on Kaggle. 
Let&#8217;s make this a football analytics hub!</p></li><li><p><strong>Upvote if You Love It</strong>: If you&#8217;re a Football Manager fan or data geek, show some love with an upvote to spread the word.</p></li></ul><p>&#10024; Feel free to click the &#9829; button on this post so more people can discover it on Substack &#128525; and tell me what you think in the comments!</p>]]></content:encoded></item></channel></rss>